Unveiling Microsoft Fabric’s Impact on Power BI Developers and Analysts


Microsoft Fabric is a new platform designed to bring together the data and analytics features of Microsoft products like Power BI and Azure Synapse Analytics into a single SaaS product. Its goal is to provide a smooth and consistent experience for both data professionals and business users, covering everything from data entry to gaining insights. A new data platform comes with new keywords and terminologies, so to get more familiar with some new terms in Microsoft Fabric, check out this blog post.

As mentioned in one of my previous posts, Microsoft Fabric is built upon the Power BI platform; therefore, we expect it to provide ease of use, strong collaboration, and wide integration capabilities. As Microsoft Fabric gets more attention in the market, we see more and more organisations investigating the possibilities of migrating their existing data platforms to it. But what does it mean for seasoned Power BI developers? What about Power BI professional users such as data analysts and business analysts? In this post, I endeavour to answer those questions.

I have been blogging predominantly around Microsoft Data Platforms and especially Power BI since 2013. But I have never written about the history of Power BI. I believe it makes sense to touch upon the history of Power BI to better understand the size of its user base and how introducing a new data platform that includes Power BI can affect them. A quick search on the internet provides some interesting facts about it. So let’s take a moment and talk about it.

The history of Power BI

Power BI started as a top-secret project at Microsoft in 2006 by Thierry D’Hers and Amir Netz. They wanted to make a better way to analyse data using Microsoft Excel. They called their project “Gemini” at first.

In 2009, they released PowerPivot, a free extension for Excel that supports in-memory data processing. This made it faster and easier to do calculations and create reports. PowerPivot quickly became popular among Excel users, but it had some limitations. For example, it was hard to share large Excel files with others, and it was not possible to update the data automatically.

In 2015, Microsoft combined PowerPivot with another extension called Power Query, which lets users get data from different sources and clean it up. They also added a cloud service that lets users publish and share their reports online. They called this new product Power BI, which stands for Power Business Intelligence.

In the past few years, Power BI has attracted a lot of attention in the market and has improved considerably to cover more use cases and business requirements, from data transformation, data modelling, and data visualisation to combining all of these with the power of AI and ML to provide predictive and prescriptive analysis.

Who are Power BI Users?

Since its birth, Power BI has become one of the most popular and powerful data analysis and data visualisation tools in the world, used by a wide variety of users. In the past few years, Power BI has generated many new roles in the job market, such as Power BI developer, Power BI consultant, Power BI administrator, Power BI report writer, and whatnot. It has also made life easier for many others, such as data analysts and business analysts. With Power BI, data analysts can efficiently analyse the data and make recommendations based on their findings. Business analysts can use Power BI to focus on more practical changes resulting from their analysis of the data and show their findings to the business much quicker than before. As a result, millions of users interact with Power BI on a daily basis in many ways. So, introducing a new data platform that sort of “Swallows Power BI” may sound daunting to those whose daily job relates to content creation, maintenance, or administering Power BI environments. For many, the fear is real. But should developers and analysts be afraid of Microsoft Fabric? The short answer is “Absolutely not!”. Does it change the way we used to work with Power BI? Well, it depends.

To answer these questions, we first need to know who Power BI users are and how they interact with it.


Integrating Power BI with Azure DevOps (Git), part 2: Local Machine Integration


This is the second part of the series of blog posts showing how to integrate Power BI with Azure DevOps, a cloud platform for software development. The previous post gave a brief history of source control systems, which help developers manage code changes. It also explained what Git is, a fast and flexible distributed source control system, and why it is useful. It introduced the initial configurations required in Azure DevOps and explained how to integrate Power BI (Fabric) Service with Azure DevOps.

This blog post explains how to synchronise an Azure DevOps repository with your local machine to integrate your Power BI Projects with Azure DevOps. Before we start, we need to know what a Power BI Project is and how we can create it.

What is Power BI Project (Developer Mode)

Power BI Project (*.PBIP) is a new file format for Power BI Desktop that was announced in May 2023 and made available for public preview in June 2023. It allows us to save our work as a project, which consists of a folder structure containing individual text files that define the report and dataset artefacts. This enables us to use source control systems, such as Git, to track changes, compare revisions, resolve conflicts, and review changes. It also enables us to use text editors, such as Visual Studio Code, to edit the artefact definitions more productively and programmatically. Additionally, it supports CI/CD (continuous integration and continuous delivery), where we submit changes to a series of quality gates before applying them to the production system.

PBIP files differ from the regular Power BI Desktop files (PBIX), which store the report and dataset artefacts as a single binary file. This made integrating with source control systems, text editors, and CI/CD systems difficult. PBIP aims to overcome these limitations and provide a more developer-friendly experience for Power BI Desktop users.
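
To illustrate why text-based artefacts are friendlier to source control than a single binary file, here is a minimal Python sketch. The two dictionaries are hypothetical stand-ins for two revisions of an artefact definition; they are not the actual PBIP file schema. The sketch only shows the line-by-line comparison that tools such as Git perform on text files.

```python
import difflib
import json

# Hypothetical artefact definitions (NOT the real PBIP schema): two revisions
# of the same model in which one measure expression has been edited.
before = {
    "name": "Sales",
    "measures": [{"name": "Total Sales", "expression": "SUM(Sales[Amount])"}],
}
after = {
    "name": "Sales",
    "measures": [
        {"name": "Total Sales", "expression": "SUMX(Sales, Sales[Quantity] * Sales[Price])"}
    ],
}


def text_diff(old, new):
    """Serialise both revisions as indented text and return a unified diff."""
    old_lines = json.dumps(old, indent=2).splitlines()
    new_lines = json.dumps(new, indent=2).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="before", tofile="after", lineterm=""
        )
    )


changes = text_diff(before, after)
print("\n".join(changes))
```

Because each revision is plain text, the diff pinpoints the single line that changed. With a binary file, a source control system can only tell that the file as a whole is different.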

Since this feature is still in public preview at the time of writing this blog post, we have to enable it from the Power BI Desktop Options and Settings.

Enable Power BI Project (Developer Mode) (Currently in Preview)

As mentioned, we first need to enable the Power BI Project (Developer Mode) feature, introduced in public preview in the June 2023 release of Power BI Desktop. Power BI Project files allow us to save our Power BI files as *.PBIP files, which deconstruct the legacy Power BI report files (*.PBIX) into well-organised folders and files.
With this feature, we can:

  • Edit individual components of our Power BI file, such as data sources, queries, data model, visuals, etc.
  • Use any text editor or IDE to edit our Power BI file
  • Compare and merge changes
  • Collaborate with other developers on the same Power BI file

To enable Power BI Project (Developer Mode), follow these steps in Power BI Desktop:


Incremental Refresh in Power BI, Part 1: Implementation in Power BI Desktop


Incremental refresh, or IR, refers to loading the data incrementally, which has been around in the world of ETL for data warehousing for a long time. Let us discuss incremental refresh (or incremental data loading) in a simple language to better understand how it works.

From a data movement standpoint, there are always two options when we transfer data from location A to location B:

  1. Truncation and load: We transfer the data as a whole from location A to location B. If location B already has some data, we truncate location B entirely and reload all the data from location A to B
  2. Incremental load: We transfer the data as a whole from location A to location B just once, the first time. After that, we only load the data changes from A to B. In this approach, we never truncate B. Instead, we only transfer the data that exists in A but not in B
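
The two options above can be sketched in a few lines of Python. This is only an illustration of the concept, not how Power BI implements data refreshes; the dictionaries `location_a` and `location_b` and both function names are made up for the example, and each row is identified by a simple key.

```python
def truncate_and_load(source, target):
    """Empty the target, then copy every row from the source."""
    target.clear()
    target.update(source)
    return len(source)  # number of rows transferred


def incremental_load(source, target):
    """Copy only the rows that exist in the source but not in the target."""
    new_keys = source.keys() - target.keys()
    for key in new_keys:
        target[key] = source[key]
    return len(new_keys)  # number of rows transferred


location_a = {1: "row 1", 2: "row 2", 3: "row 3"}
location_b = {1: "row 1", 2: "row 2"}  # B already holds part of the data

full_copy = dict(location_b)
rows_full = truncate_and_load(location_a, full_copy)

incr_copy = dict(location_b)
rows_incr = incremental_load(location_a, incr_copy)

print(rows_full, rows_incr)  # 3 1
```

Both approaches end with the same data in location B, but the incremental load moves one row instead of three.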

When we refresh the data in Power BI, we use the first approach, truncation and load, if we have not configured an incremental refresh. In Power BI, the first approach only applies to tables with Import or Dual storage modes. Previously, the incremental load was also available only in tables with either Import or Dual storage modes. But the new announcement from Microsoft about Hybrid Tables greatly affects how the incremental load works. With Hybrid Tables, the incremental load is available on a portion of the table when a specific partition is in DirectQuery mode, while the rest of the partitions are in Import storage mode.

Incremental refresh used to be available only on Premium capacities, but from Feb 2020 onwards, it is also available in Power BI Pro with some limitations. However, Hybrid Tables are currently available on Power BI Premium Capacity and Premium Per User (PPU), not Pro. Let’s hope that Microsoft will change its licensing plan for Hybrid Tables in the future and make them available in Pro.

I will write about Hybrid Tables in a future blog post.

When we successfully configure the incremental refresh policies in Power BI, we always have two ranges of data: the historical range and the incremental range. The historical range includes all data processed in the past, and the incremental range is the current range of data to process. Incremental refresh in Power BI always looks for data changes in the incremental range, not the historical range. Therefore, the incremental refresh will not notice any changes in the historical data. When we talk about data changes, we are referring to rows that are inserted, updated or deleted. However, the incremental refresh treats an updated row as a deletion of the old row followed by an insertion of a new row.
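
The following Python sketch is a simplified model of the two ranges, not the way Power BI Service works internally. The function name, the row layout and the 7-day incremental range are assumptions made for the example. Rows in the historical range are kept as they are, while rows in the incremental range are dropped and reloaded from the source.

```python
from datetime import date, timedelta


def incremental_refresh(source_rows, model_rows, refresh_date, incremental_days):
    """Keep historical rows untouched; reload every row in the incremental range."""
    start = refresh_date - timedelta(days=incremental_days)
    # Historical range: whatever the model already holds is left as it is.
    historical = [row for row in model_rows if row["date"] < start]
    # Incremental range: old rows are dropped and re-read from the source,
    # so an update looks like a delete followed by an insert.
    incremental = [dict(row) for row in source_rows if row["date"] >= start]
    return historical + incremental


model = [
    {"date": date(2023, 1, 10), "amount": 100},
    {"date": date(2023, 6, 28), "amount": 50},
]
source = [
    {"date": date(2023, 1, 10), "amount": 999},  # changed, but in the historical range
    {"date": date(2023, 6, 28), "amount": 55},   # changed, in the incremental range
    {"date": date(2023, 6, 30), "amount": 70},   # new row
]

refreshed = incremental_refresh(source, model, date(2023, 6, 30), incremental_days=7)
amounts = [row["amount"] for row in refreshed]
print(amounts)  # [100, 55, 70]
```

The change to the January row goes unnoticed because it sits in the historical range, while the updated and the new June rows are both picked up.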

Benefits of Incremental Refresh

Configuring incremental refresh is beneficial for large tables with hundreds of millions of rows. The following are some benefits of configuring incremental refresh in Power BI:

  • The data refreshes much faster than when we truncate and load the data as the incremental refresh only refreshes the incremental range
  • The data refresh process is less resource-intensive than refreshing the entire data all the time
  • The data refresh is less expensive and more maintainable than the non-incremental refreshes over large tables
  • The incremental refresh is unavoidable when dealing with massive datasets with billions of rows that do not fit into our data model in Power BI Desktop. Remember, Power BI uses an in-memory data processing engine; therefore, it is improbable that our local machine can handle importing billions of rows of data into memory

Now that we understand the basic concepts of the incremental refresh, let us see how it works in Power BI.

Implementing Incremental Refresh Policies with Power BI Desktop

We can currently configure incremental refresh in Power BI Desktop and in Dataflows contained in a Premium Workspace. This blog post looks at the incremental refresh implementation within Power BI Desktop.

After successfully implementing the incremental refresh policies with the desktop, we publish the model to Power BI Service. The first data refresh takes longer as we transfer all data from the data source(s) to Power BI Service for the first time. After the first load, all future data refreshes will be incremental.

How to Implement Incremental Refresh

Implementing incremental refresh in Power BI is simple. There are two generic parts of the implementation:

  1. Preparing some prerequisites in Power Query and defining incremental policies in the data model
  2. Publishing the model to Power BI Service and refreshing the dataset

Let’s briefly get to some more details to quickly understand how the implementation works.

  • Preparing Prerequisites in Power Query
    • We need to define two parameters with the DateTime data type in Power Query Editor. The names of the two parameters are RangeStart and RangeEnd, which are reserved for defining incremental refresh policies. As you know, Power Query is case-sensitive, so the names of the parameters must be spelled exactly as RangeStart and RangeEnd.
    • The next step is to filter the table by a DateTime column using the RangeStart and RangeEnd parameters so that only the rows whose DateTime value falls between RangeStart and RangeEnd are kept.

Notes

  • The data type of the parameters must be DateTime
  • The data type of the column we use for incremental refresh must be Int64 (integer), Date, or DateTime. Therefore, for scenarios where our table has a smart date key instead of a Date or DateTime column, we have to convert the RangeStart and RangeEnd parameters to Int64
  • When we filter a table using the RangeStart and RangeEnd parameters, Power BI uses the filter on the DateTime column for creating partitions on the table. So it is important to pay attention to the DateTime ranges when filtering the values: only one of the filter conditions may have an “equal to”, either on RangeStart or on RangeEnd, not both
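
To see why only one side of the filter may include “equal to”, consider this small Python sketch. It is an illustration only; in Power BI the filter itself is defined in Power Query. The function names are made up for the example, and the yyyymmdd layout of the smart date key is an assumption, so adjust it if your key uses a different format.

```python
from datetime import datetime


def in_partition(value, range_start, range_end):
    """Correct filter: 'equal to' on RangeStart only (a half-open range)."""
    return range_start <= value < range_end


def in_partition_both_inclusive(value, range_start, range_end):
    """Faulty filter: 'equal to' on both RangeStart and RangeEnd."""
    return range_start <= value <= range_end


def to_date_key(moment):
    """Convert a DateTime value to an integer key, assuming a yyyymmdd layout."""
    return moment.year * 10000 + moment.month * 100 + moment.day


# Two adjacent monthly partitions that share the boundary 2023-02-01 00:00.
january = (datetime(2023, 1, 1), datetime(2023, 2, 1))
february = (datetime(2023, 2, 1), datetime(2023, 3, 1))
boundary_row = datetime(2023, 2, 1)

hits_half_open = sum(in_partition(boundary_row, *p) for p in (january, february))
hits_inclusive = sum(
    in_partition_both_inclusive(boundary_row, *p) for p in (january, february)
)

print(hits_half_open, hits_inclusive)  # 1 2
print(to_date_key(boundary_row))       # 20230201
```

With “equal to” on both sides, the row that sits exactly on the boundary matches two partitions and would be loaded twice. With “equal to” on one side only, it lands in exactly one partition.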

Power BI 101, What Should I Learn?

This is the second part of my new series of Power BI posts named Power BI 101. In the previous post, I briefly discussed what Power BI is. In this post, I look into one of the most confusing parts for those who want to start learning Power BI. Many people jump straight online and look for Power BI training courses, of which there are plenty out there. But which one is the right training course for you? Let’s find out.

What do you want to gain from learning Power BI?

Whether you attend paid training courses or are a self-learner, the above question is one of the most important questions to ask yourself before going to the next steps. The answer to this question dictates the sort of training you must look for. Your answer to the preceding question can be one or none of the following:

  • I am a graduate/student looking at the job market
  • I am a business analyst and I want to know how Power BI can help me with my daily job
  • I am a database developer and I want to learn more about business intelligence and data and analytics space
  • I am a non-Microsoft Business Intelligence developer and I want to start learning more about Microsoft offerings
  • I am a system admin and I have to manage our Power BI tenant
  • I am a data scientist and I want to know how I can use Power BI
  • I am just curious to see what Power BI can do for me

As mentioned, your answer might not be any of the above, but thinking about your reason(s) for learning Power BI can help you find the best way to learn and use Power BI more efficiently. You can spend time and money taking some online courses and get even more confused. You don’t want that, do you?

So, whatever reason(s) you have in mind to learn Power BI, most probably you fall into one of the following user categories:

Think about your goal(s) and what you want to achieve by learning Power BI, then try to identify your user category. For instance, if you are a student thinking of joining an IT company as a data and analytics developer, then your user category is most probably a Power BI Developer or a Contributor.

To help you find out your user category, let’s see what the above user categories mean.

Power BI Developers

Power BI Developers are the beating heart of any Power BI development project. Regardless of the project you will be involved with, you definitely need a certain level of knowledge of the following:

  • Data preparation/ETL processes
  • Data warehousing
  • Data modelling/Star schema
  • Data visualisation

To be a successful Power BI developer, you must learn the following languages in Power BI:

  • Power Query
  • DAX

Depending on the types of projects you will be involved in, you may need to learn the following languages as well:

  • Microsoft Visual Basic (for Paginated Reports)
  • Python
  • R
  • T-SQL
  • PL/SQL

As a Power BI developer, you will write a lot of Power Query and DAX expressions. Most probably you will need to learn T-SQL as well. The following resources can be pretty helpful:
