Without a doubt cloud computing is going to change the future of data analytics and data visualisation very significantly. Microsoft Azure SQL Data Warehouse recently released for public preview. Combining Power BI as a powerful data visualisation tool with Azure SQL Data Warehouse will give the users the ability to see data insights of their data stored in Azure Data Warehouse very easily. In this post I explain how to install Azure SQL Data Warehouse and the the way it works with Power BI. Before going any further I’d like to have a look at the Azure SQL Data Warehouse very briefly.
What Is Azure SQL Data Warehouse?
Based on Microsoft documentation a SQL Data Warehouse is
Azure SQL Data Warehouse supports stored procedures, user-defined functions, indexes and collations. It uses columnstore index technology which significantly improves query performance as well as getting you up to 5 times compression in compare with traditional row based indexing.
I leave it to you learn more about Azure SQL Data Warehouse. But, it is important to keep in mind that there are some features like primary keys and foreign keys that are NOT supported in Azure SQL Data Warehouse which affect the way we use Power BI as a data visualisation tool over Azure SQL Data Warehouse. Without primary keys and foreign keys there is no physical relationships between the tables so Power BI service cannot detect any relationships by itself. There is a workaround for this that we can create some SQL views in Azure side to make it work. This can be an expensive solution. The other way is to load the data warehouse into a Power BI Desktop model which can detect the relationships automatically.
Now you know a bit bout Azure SQL Data Warehouse let’s get back to the subject and talk more about Power BI and Azure SQL Data Warehouse.
First things first. You need to have a Microsoft Azure subscription. If you don’t already have it you can use it for a one month trial here. You’ll also get $250 credit. But, remember that if you succeed the $250 in less than a month then you’ll need to pay for it if you want to use it longer.
Install Azure SQL Data Warehouse
After you get your Azure subscription, login to your account and you should see a dashboard like this
I’m not going to explain the above dashboard as it is out of scope of this article.
- Click New
- Click “Data + Storage” then click “SQL Data Ware House”
- Enter a name for your database
- Select a performance value
Note: Lager Data Warehouse unit values improve your workload performance with more compute resources. Remember, the larger value the more money you pay. So for our experimental sample I select the minimum possible value which is “100” that will cost me 0.70 USD per hour! YES, per hour. 🙂
- Click “Server” to configure required settings
- Click “Create a new server”
- Enter the server name
- Enter “Server admin login”
- Type in a password
- Confirm the password
- Select “Location”. The default is “East US 2”
- So far we configures a new server to host our SQL Data Warehouse
- Click OK
- Now we can see the server name and location appear under the “Server” section
- Click “Select source”
- You can now select a blank database or select a sample database. I select “Sample”.
- As you can see “AdventureWorksDW: is appears under the “Select sample” section
- Click “Resource Group”
- Click “Create a new resource group”
- Type a name for your new resource group then click OK
- Now we configured all needed settings and we’re ready to create a new Azure SQL Data Warehouse service. Click “Create”
- Now you’ll see a new icon on the dashboard showing the SQL Data Warehouse is creating
Now we successfully created a new Azure SQL Data Warehouse.
But, we are not done yet. We still need to configure firewall settings to be able to connect to the data warehouse from an application like Power BI Desktop, Excel etc.
Azure SQL Data Warehouse Service Firewall Settings
- Click “All Resources” fro the dashboard
- Click your Azure SQL Data Warehouse resource from the list
- Click one of the below for configuring firewall
- Click “Add client IP”
- Make sure “Allow access to Azure service” is switched to “ON”
- You should see your IP address added to the list
- Click “Save”
Note: You can add an IP range instead of just one IP address. You just need to enter a “Start IP” and an “End IP” address then what ever devices exist in that IP range will be able to access the Azure SQL Data Warehouse service.
To be able to connect from a client application we need to have the server name. To find the server name you can click on the Azure SQL Data Warehouse from the dashboard.
Then click on the copy icon under “Server Name”.
If you don’t see your instance of Azure SQL Data Warehouse on the dashboard:
- Click “SQL Databases”
- Select the database from the list
- Copy the server name
Connect to Power BI directly from Azure SQL Data Warehouse
As you might already noticed there is a “Open In PowerBI” available on top your instance of Azure SQL Data Warehouse.
- Click on the “Open In PowerBI” button
- It redirects you to Power BI Website, opens “Connect to Azure SQL Data Warehouse” connections and automatically fills server and database names. Click Next.
- Enter the admin user name and password you created previously in Azure SQL Data Warehouse then click “Sign In”:
Now we connected our Azure SQL Data Warehouse to Power BI successfully.
Create a Sample Report on Power BI Website
- Click the new Azure Data Warehouse dataset from Datasets pane from the Power BI Website to create a new report
- Expand “FactInternetSales”
- Select “Sales Amount”
- Expand “DimProductCategory”
- Select “EnglishProductCategoryName”
OOPS! It looks nasty.
Remember that I mentioned before that Azure SQL Data Warehouse does NOT support primary keys and foreign keys. So Power BI web service cannot auto detect any relationships. Therefore, it shows the same Sales Amount for all Product Categories. Sadly, editing relationships is NOT available in Power BI Website. So at this stage, it might not be a good idea to connect Power BI Website to an instance of Azure SQL Data Warehouse directly. I posted an idea to add the ability to create or edit relationships in Power BI Website. If you think like me and would like to see this feature in the future releases of Power BI please vote for the idea.
There is workaround that you can still use the Power BI web service directly connected to your Azure SQL Data Warehouse. You can create SQL views on Azure side and create your visualisations on top of the views.
Direct connection to Azure SQL Data Warehouse makes the dataset to be always up-to-date. To see if this is really the case do the following simple steps:
- Click open menu ellipsis button on the right side of the dataset
- As you can see it says “This dataset connects to a source with direct connect which is always up-to-date. You do not have to schedule a refresh on this dataset.” so we don’t need to do anything as the dataset is always up-to-date.This dataset connects to a source with direct connect which is always up-to-date. You do not have to schedule a refresh on this dataset.
I inserted some data into FactInternetSales and the dataset got updated immediately. It’s really awesome isn’t it?
Let’s continue and see how Azure SQL Data Warehouse works with Power BI Desktop.
Azure SQL Data Warehouse and Power BI Desktop
- Open Power BI Desktop
- Click “Microsoft Azure Data Warehouse” then click “Connect”
- Enter the server name of your Azure SQL Data Warehouse
- Database name is optional
- Click OK
- Click “Database” then enter the user name and password for your Azure SQL Data Warehouse then click “Connect”
- Expand the database then select all then click “Load”
- Click “Relationships” view and scroll right
- As you can see Power BI Desktop detected lots of relationships automatically. But, there are still some missing relationships. For instance no relationships detected between FactInternetSales and DimDate. However, we might not experience the same in a real world project. So I leave it to you to create the missing relationships I the AdventjureWorksDW data model for more experiments.
- Click Report view
- Expand FactInternetSales
- Tick SalesAmount
- Expand DimProductCategory
- Drag and drop EnglishProductCategoryName field into Axis
- Expand DimProductSubCategory
- Drag and drop EnglishProductSubCategoryName field into Axis right below the EnglishProductCategoryName field to add Drill down action to the report
We successfully created a simple report on Power BI Desktop on top of Azure SQL Data Warehouse.
- Click Publish from the ribbon.
- Now jump online and login to your Power BI account
- Find the new report you just published now. Everything looks to be fine as expected.
Note: You need to have a Power BI Pro account to be able to use an Azure SQL Data Warehouse dataset in Power BI service.
In this scenario we connected to the Azure SQL Data Warehouse from Power BI Desktop which means the connection is NOT a direct connection. Therefore, we need to configure “Schedule Refresh” on Power BI website.
- Click on open menu ellipsis button on the right side of your data set
- Click “Schedule Refresh”
- Expand “Data Source Credentials”
- Click “Edit Credentials”
- Select “Basic” from “Authentication Method” drop down
- Enter your valid Azure “Username” and “Password”
- Click “Sign In”
- Expand “Schedule Refresh”
- Switch “Keep your data up-to-date” button to ON
- Do your desired schedule settings then click “Apply”
- Expand “Featured Q&A Questions”. This is a new feature added to Power BI. What ever you type here will be ready to use in Power BI dashboard.
Create a New Dashboard
- Open the report you published from Power BI Desktop
- Click “Pin Visual”
- Click “New Dashboard” from the “Pin to Dashboard” page
- Type a name for the new dashboard then click “Pin”
- Open the new dashboard from “Dashboards” pane
- Click on “Ask a question about the data on this dashboard”
- The first question would be the featured question we added before
- Click on the question and here you go, your pie chart is ready to use
- You can pin it to the dashboard
- On the “Pin to Dashboard” page click “Pin”
- Back to the dashboard and you should see the new pie chart
Now we are done.
Some Facts About Azure SQL Data Warehouse and Power BI
So far we created a useful dashboard in Power BI on top of Azure SQL Data Warehouse. We can make lots of other data visualisations and reports in Power BI Desktop and publish them to powerbi.com. Then we can create lots of other dashboards there.
But, is that really it? What happens when we want to do a real world project? Okay. Let’s have a look at some realities about the current version of Azure SQL Data Warehouse in combine with the current version of Power BI. I know that it is a preview version, but, the following points are valid for the current version:
- As you saw before there is a “Open In PowerBI” button available to directly connect an Azure SQL Data Warehouse to Power BI Website. But, what is the point of having such a feature when Power BI Website does NOT automatically detect relationships? Besides, there is no edit relationships feature available in Power BI website, so at the moment there is no way we can fix the issue with current available features. As I pointed before, we can create some views on Azure side, but, I don’t think it is realistic. I newly heard that supporting auto detect relationships on Power BI website for Azure SQL Data Warehouse is a work in progress, but, until it’s not available we cannot use Power BI website and have the benefit of Direct Connect to Azure SQL Data Warehouse. When a dataset connects to a source with direct connect we don’t have to configure a schedule refresh and the dataset in Power BI side would be always up-to-date.
- As mentioned before Power BI Desktop can automatically detect the relationships of an Azure SQL Data Warehouse. But, how precise it is? When we connected to our Adventure Works sample we saw that Power BI Desktop detected some relationships. But, I found out some problems with creating new relationships or edit the existing relationships. Try the following examples:
- As you might noticed, there is no relationship detected between FactInternetSales and DimDate. Try to define a relationship using OrderDateKey from FactInternetSales and DateKey from DimDate. As you can see, the relationship is inactive. If you want to activate the relationship you’ll get the following message: “There is already an active relationship between the two tables. You can deactivate the existing relationship, and then make this relationship active.” Actually this is absolutely alright. We all know that Power BI model doesn’t support multiple relationships between two tables. But, the problem is that it would be a pain to find out which relationship we need to deactivate first. As I stated in one of my previous posts we can use USERELATIONSHIP function to control the relationships, but, it would be time consuming if want to write DAX codes for each relationship.
- There is a relationship detected between FactSalesQuota and DimDate. Deactivate the relationship. Now try to activate it again and see what happens. Oops! You’re not allowed to do that! Lets try another one. Deactivate the relationship between FactInternetSales and DimProduct. Now try to activate it, the sale thing happens.
- UPDATE: I noticed that Q&A is NOT available for Azure SQL Data Warehouse direct connect. So when you create a new dashboard the Q&A is NOT there.
As a result, despite I believe that the combination of Azure SQL Data Warehouse and Power BI would make a perfect solution for enterprise level projects in near future, I think the current versions are not mature enough to support a real data visualisation project on top of an enterprise-class distributed database.
I would be happy to have your opinions and comments.