In this webinar, our experts showcase a variety of demo use cases of how different components of the...
Today’s blogpost is devoted to Power BI, Microsoft’s robust business analytics service. Specifically, we’ll describe the impact scheduled data refreshes can have on report development. Let’s get started!
When initiating a Power BI project – regardless of its size and complexity – we always ask the same kick-off questions. The answers to these three important questions help us gain an understanding of the project requirements. The questions are:
This part of the discussion is from the developer’s perspective, though management and customers are generally interested in the running costs and maintenance expenses associated with running Power BI reports. In this way, licenses often become a big part of the discussion.
End users are always excited about the dynamic and interactive Power BI visualizations for their operations and business decision-making. However, if the performance of reports is poor, the user experience quickly turns sour, leading to failed user adoption.
The keys to Power BI project success are basic:
The challenge is in striking a balance between Low Cost and High Performance. Finding that balance is critical, and yet there is an important and oft-forgotten issue that must be addressed up front: Data Refresh.
Data Refresh is often discussed later in projects, but ideally it is confronted early on, so all parties have a firm grasp of the topic. A quick overview:
Power BI has two distinct data refreshes: Power BI Desktop refresh and Power BI Service refresh. The purpose of the refresh is the same between Power BI Desktop and Power BI Service. However, there is a very important element of the refresh that we must understand: "pbix" file size limitation. This is applied when we publish pbix files from Power BI Desktop to Power BI Service and when we set a scheduled data refresh.
Why is it important? Because the pbix file size limitation could affect the design of the data model, the performance, and the cost – unless it is discussed and assessed accurately in the project discovery phase. The worst-case scenario is having Power BI reports ready for production but not being able to use them because the data refresh could not be completed in Power BI Service due to file size limitations.
Since Power BI Desktop has no pbix file size limitation, it is easy and relatively common to assume Power BI Service doesn’t either. But now we know!
Please note: the file size limitation we are referring to is not file Storage size limitation in Power BI Service Workspaces.
Here, we need to discuss Power BI licenses to find a cost associated to the file size limitation for data refresh. Why? Because the number of users determines the license cost and – at the same time – the pbix file size limitation impacts the cost, as well.
In the discovery phase, we should already know the approximate number of users. Additionally, we should measure initial pbix file size and estimate incremental data volume for the future when selecting licenses. The table below shows an approximate cost per license type and per pbix file size limitation for data refresh.
Note that the pbix file size limitation is not applied to the size of each pbix file but rather is a sum of pbix file size per license. For example, if we have 3 pbix files for a project and the sum of the 3 pbix files is 600 MB (less than 1GB), the Pro license is applicable.
The pbix file size depends on the data volume in the data source, the number of datasets and visuals in Power BI, and more. Let’s look at two examples that illustrate how to choose licenses based on data volume and pbix file size.
1. pbix File Size: 250 MB
Criteria: Pro License Selected
2. pbix File Size: 1.25 GB
Criteria: Premium License P1 Selected
There is one more element to consider when it comes to data refresh as described in the list above: it is a time window to complete refresh. While data is refreshed in Power BI Service, a session is open and the session times out in one hour for Pro license and in 4 hours for Premium license. Therefore, we must consider refresh time duration as well as pbix file size limitation. The refresh time duration is associated to pbix file size and can be improved by data modeling in Power BI, network environment, data source connection, gateway configuration, etc.
The question we must ask is how we can reduce the size of pbix files when designing Power BI reports. We can change data connection from Import mode to DirectQuery mode so that the file size would be reduced dramatically. However, there are some downsides. Performance is affected when dealing with a large set of records – e.g., 12 million records in a dataset. A slow performance appears clearly when filtering and highlighting data in chart visuals with DirectQuery mode. Import mode allows users a great filter experience with 3 - 5 seconds per click. In our project example, we experienced 10 - 30 seconds responses depending on charts with DirectQuery, which decreased a sense of dynamic user experiences. There is an option to create aggregation tables and balance the performance and file size. In our example, it worked well; yet when we needed drilldown data, we had the same performance issue with DirectQuery mode. In order to keep the best performance, import mode is a best option and we created aggregation and transaction tables/views in SQL database and used as data sources. It did not reduce the file size compared to DirectQuery mode, but it retained the performance and great user experiences.
The remaining challenge is that if we choose import mode for best performance, the pbix file size and refresh time both increase as data volume increases in data sources. The pbix file size would reach a threshold of the Power BI license SKUs and increase the running cost over time.
The conclusion is that pbix file size tells us license options (cost) and dictate the way we need to design data sources and data models in Power BI to minimize the file size and optimize report and data refresh performances. The following list is a quick reference to license options and data refresh performance based on pbix file limitations. Hopefully, this limitation will be improved in upcoming Power BI releases.
That’s a lot of information, but hopefully we’ve illustrated how file size, licenses, and refresh times are all interrelated and must be considered when planning a Power BI project.
Happy Power BI-ing!