Shared Datasets have been around for quite a while now. In June 2019, Microsoft announced a new feature called Shared and Certified Datasets with the mindset of supporting enterprise-grade BI within the Power BI ecosystem. In essence, the shared dataset feature allows organisations to have a single source of truth across the organisation serving many reports.
A Thin Report is a report that connects to an existing dataset on Power BI Service using the Connect Live connectivity mode. So, we basically have multiple reports connected to a single dataset. Now that we know what a thin report is, let’s see why it is best practice to follow this approach.
Prior to the Shared and Certified Datasets announcement, we used to create separate reports in Power BI Desktop and publish those reports into Power BI Service. This approach had many disadvantages, such as:
- Having many disparate islands of data instead of a single source of truth.
- Consuming more storage on Power BI Service by having repetitive table across many datasets
- Reducing collaboration between data modellers and report creators (contributors) as Power BI Desktop is not a multi-user application.
- The reports were strictly connected to the underlying dataset so it is so hard, if not totally impossible, to decouple a report from a dataset and connect it to a different dataset. This was pretty restrictive for the developers to follow the Dev/Test/Prod approach.
- If we had a fairly large report with many pages, say more than 20 pages, then again, it was almost impossible to break the report down into some smaller and more business-centric reports.
- Putting too much load on the data sources connected to many disparate datasets. The situation gets even worst when we schedule multiple refreshes a day. In some cases the data refresh process put exclusive locks on the the source system that can potentially cause many issues down the road.
- Having many datasets and reports made it harder and more expensive to maintain the solution.
In my previous blog, I explained the different components of a Business Intelligence solution and how they map to the Power BI ecosystem. In that post, I mentioned that the Power BI Service Datasets map to a Semantic Layer in a Business Intelligence solution. So, when we create a Power BI report with Power BI Desktop and publish the report to the Power BI Service, we create a semantic layer with a report connected to it altogether. By creating many disparate reports in Power BI Desktop and publishing them to the Power BI Service, we are indeed creating many semantic layers with many repeated tables on top of our data which does not make much sense.
On the other hand, having some shared datasets with many connected thin reports makes a lot of sense. This approach covers all the disadvantages of the previous development method; in addition, it decreases the confusion for report writers around the datasets they are connecting to, it helps with storage management in Power BI Service, and it is easier to comply with security and privacy concerns.Continue reading “Thin Reports, What Are They, Why Should I Care and How Can I Create Them?”