OData has been adopted by many software solutions and has been around for many years. Most solutions are using the OData is to serve their transactional processes. But as we know, Power BI is an analytical solution that can fetch hundreds of thousands (or millions) rows of data in a single table. So, obviously, OData is not optimised for that kind of purpose. One of the biggest challenges many Power BI developers face when working with OData connections is performance issues. The performance depends on numerous factors such as the size of tables in the backend database that the OData connection is serving, peak read data volume over periods of time, throttling mechanism to control over-utilisation of resources etc…
So, generally speaking, we do not expect to get a blazing fast data refresh performance over OData connections, that’s why in many cases using OData connections for analytical tools such as Power BI is discouraged. So, what are the solutions or alternatives if we do not use OData connections in Power BI? Well, the best solution is to migrate the data into an intermediary repository, such as Azure SQL Database or Azure Data Lake Store or even a simple Azure Storage Account, then connect from Power BI to that database. We must decide on the intermediary repository depending on the business requirements, technology preferences, costs, desired data latency, future support requirement and expertise etc…
But, what if we do not have any other options for now, and we have to use OData connection in Power BI without blasting the size and costs of the project by moving the data to an intermediary space? And.. let’s face it, many organisations dislike the idea of using an intermediary space for various reasons. The simplest one is that they simply cannot afford the associated costs of using intermediary storage or they do not have the expertise to support the solution in long term.
In this post, I am not discussing the solutions involving any alternatives; instead, I provide some tips and tricks that can improve the performance of your data refreshes over OData connections in Power BI.
The tips in this post will not give you blazing-fast data refresh performance over OData, but they will help you to improve the data refresh performance. So if you take all the actions explained in this post and you still do not get an acceptable performance, then you might need to think about the alternatives and move your data into a central repository.
If you are getting data from a D365 data source, you may want to look at some alternatives to OData connection such as Dataverse (SQL Endpoint), D365 Dataverse (Legacy) or Common Data Services (CDS). But keep in mind, even those connectors have some limitations and might not give you an acceptable data refresh performance. For instance, Dataverse (SQL Endpoint) has 80MB table size limitation. There might be some other reasons for not getting a good performance over those connections such as having extra wide tables. Believe me, I’ve seen some tables with more than 800 columns.
Some suggestions in this post apply to other data sources and are not limited to OData connections only.
Suggestion 1: Measure the data source size
It is always good to have an idea of the size of the data source we are dealing with and OData connection is no different. In fact, the backend tables on OData sources can be wast. I wrote a blog post around that before, so I suggest you use the custom function I wrote to understand the size of the data source. If your data source is large, then the query in that post takes a long time to get the results, but you can filter the tables to get the results quicker.
Suggestion 2: Avoid getting throttled
As mentioned earlier, many solutions have some throttling mechanisms to control the over-utilisation of resources. Sending many API requests may trigger throttling which limits our access to the data for a short period of time. During that period, our calls are redirected to a different URL.
Tip 1: Disabling Parallel Loading of Tables
One of the many reasons that Power BI requests many API calls is loading the data into multiple tables in Parallel. We can disable this setting from Power BI Desktop by following these steps:
- Click the File menu
- Click Options and settings
- Click Options
- Click the Data Load tab from the CURREN FILE section
- Untick the Enable parallel loading of tables option