Slowly Changing Dimension (SCD) in Power BI, Part 1, Introduction to SCD

Slowly changing dimension (SCD) is a data warehousing concept coined by the amazing Ralph Kimball. The SCD concept deals with moving a specific set of data from one state to another. Imagine a human resources (HR) system having an Employee table. As the following image shows, Stephen Jiang is a Sales Manager having ten sales representatives in his team:

SCD in Power BI, Stephen Jiang is the sales manager of a team of 10 sales representatives
Image 1: Stephen Jiang is the sales manager of a team of 10 sales representatives

Today, Stephen Jiang got his promotion to the Vice President of Sales role, so his team has grown in size from 10 to 17. Stephen is the same person, but his role is now changed, as shown in the following image:

SCD in Power BI, Stephen's team after he was promoted to Vice President of Sales
Image 2: Stephen’s team after he was promoted to Vice President of Sales

Another example is when a customer’s address changes in a sales system. Again, the customer is the same, but their address is now different. From a data warehousing standpoint, we have different options to deal with the data depending on the business requirements, leading us to different types of SDCs. It is crucial to note that the data changes in the transactional source systems (in our examples, the HR system or a sales system). We move and transform the data from the transactional systems via ETL (Extract, Transform, and Load) processes and land it in a data warehouse, where the SCD concept kicks in. SCD is about how changes in the source systems reflect the data in the data warehouse. These kinds of changes in the source system do not happen very often hence the term slowly changing. Many SCD types have been developed over the years, which is out of the scope of this post, but for your reference, we cover the first three types as follows.

SCD type zero (SCD 0)

With this type of SCD, we ignore all changes in a dimension. So, when a person’s residential address changes in the source system (an HR system, in our example), we do not change the landing dimension in our data warehouse. In other words, we ignore the changes within the data source. SCD 0 is also referred to as fixed dimensions.

Continue reading “Slowly Changing Dimension (SCD) in Power BI, Part 1, Introduction to SCD”

Combining X Number of Rows in Power Query for Power BI, Excel and Power Query Online

Combining X Number of Rows in Power Query for Power BI, Excel and Power Query Online

A while back, I was working on a project involving getting data from Excel files. The Excel files contain the data in sheets and tables. Getting the data from the tables is easy. However, the data in the sheets have some commentaries on top of the sheet, then the column names and then the data itself. Something like below:

Sample data
Sample data

This approach is pretty consistent across many Excel files. The customer wants to have the commentary in the column names when the data is imported into Power BI. So the final result must look like this:

Sample Data to be loaded into Power BI
Sample Data to be loaded into Power BI

The business requirement though is to combine the first 3 rows of data and promote it as the column name.

The Challenge

Let’s connect the Excel file and look at the data in Power BI Desktop.

Connecting to sample data from Power BI Desktop
Connecting to sample data from Power BI Desktop

As you can see in the preceding image, Power BI, or more precisely, Power Query, sees the data in Table format. After we click the Transform Data button, this is what we get in Power Query Editor:

Connected to sample data from Power Query in Power BI Desktop
Connected to sample data from Power Query in Power BI Desktop

We all know that tables consist of Columns and Rows. The conjunction of a column and a row is a Cell. What we require to do is to concatenate the values of cells from the first three rows. We also have to use a Space character to separate the values of each cell from the others.

Column, rows and cells in a Table in Power BI
Column, rows and cells in a Table

In Power Query, we can get each row of data in as a Record with the following syntax:

Table{RecordIndex}

In the above syntax, the Table can be the results of the previous transformation step, and the RecordIndex starts from 0. So to get the first row of the table in the preceding image, we use the following syntax:

#"Changed Type"{0}

Where the #"Changed Type" is the previous step. Here are the results of running the preceding expression:

Getting the first row of a Table
Getting the first row of a Table

So we can get the second and third rows with similar expressions. The following image shows the entire codes in the Advanced Editor:

Power Query expressions in Advanced Editor in Power BI Desktop
Power Query expressions in Advanced Editor

But how do we concatenate the values of the rows?

Continue reading “Combining X Number of Rows in Power Query for Power BI, Excel and Power Query Online”

Quick Tips: Conditionally Replace Values Based on Other Values in Power Query

Power Query (M) made a lot of data transformation activities much easier and value replacement is one of them. You can easily right click on any desired value in Power Query, either in Excel or Power BI, or other components of Power Platform in general, and simply replace that value with any desired alternative. Replacing values based on certain conditions however, may not seem that easy at first. I’ve seen a lot of Power Query (M) developers adding new columns to accomplish that. But adding a new column is not always a good idea, especially when you can do it in a simple single step in Power Query. In this post I show you a quick and easy way to that can help you handling many different value replacement scenarios.

Imagine you have a table like below and you have a requirement to replace the values column [B] with the values of column [C] if the [A] = [B].

Sample Data in Power BI

One way is to add a new conditional column and with the following logic:

if [B] = [A] then [C] else [B]

Well, it works perfectly fine, but wait, you’re adding a new column right? Wouldn’t it be better to handle the above simple scenario without adding a new column? If your answer is yes then continue reading.

Continue reading “Quick Tips: Conditionally Replace Values Based on Other Values in Power Query”

Quick Tips: How to Filter a Column by another Column from a Different Query in Power Query

Filter a Column by a Column from a Different Query in Power Query

A while ago I was visiting a customer that asked if they can filter a query data by a column from another query in Power BI. And I said of course you can. In this post I explain how that can be achieved in Power Query. The key point is to know how to reference a query and how to reference a column of that query in Power Query. This is useful when you have a lookup table that can be sourced from every supported data source in Power Query and you want to filter the results of another query by relevant column in the lookup query. In that case, you’ll have a sort of dynamic filtering. So, whenever you refresh your model if new records have been changed in or added to the source of the lookup query, your table will automatically include the new values in the filter step in Power Query.

Referencing a Query

It is quite simple, you just need to use the name of the query. If the query name contains special characters like space, then you need to wrap it with number sign and double quotes like #”QUERY_NAME”. So, if I want to reference another query, in a new blank query, then the Power Query (M) scripts would look like below:

let
    Source = Product
in
    Source

Or something like

let
    Source = #"Product Category"
in
    Source

Referencing a Column

Referencing a column is also quite simple. When you reference a column you need to mention the referencing query name, explained above, along with the column name in brackets. So, the format will look like #”QUERY_NAME”[COLUMN_NAME]. The result is a list of values of that particular column.

let
    Source = #"Product Category"[Product Category Name]
in
    Source
Referencing a Column from Another Query in Power Query
Continue reading “Quick Tips: How to Filter a Column by another Column from a Different Query in Power Query”