Eugene Lebedev is a Power BI consultant by background. Before founding Vidi Corp in 2021 he created Power BI reporting for Autodesk. His Power Bi reports were used by VPs of Finance and C-suite of Autodesk. As part of Vidi Corp, Eugene created dashboards for Google, Teleperformance, Delta Airlines and 200+ clients worldwide.
Power BI has become a potent tool for data analysis and visualisation in the realm of business intelligence. Power BI’s ability to use Direct Query to connect directly to data sources is one of its distinguishing characteristics. An in-depth discussion of Power BI’s Direct Query concept, along with its advantages and how it facilitates real-time insights for data-driven decision-making, is provided in this article.
Direct Query is a connection mode in Power BI that allows you to query data directly from the source system in real time. Unlike other connection modes such as Import or Live Connection, where data is imported or cached in Power BI, Direct Query retrieves data dynamically, ensuring that the analysis is based on the most up-to-date information.
The query you make in the Power BI interface is sent directly to the underlying data source, like a relational database or an Analysis Services model, when you use DirectQuery. Instead of using Power BI’s own engine (SQL Profiler), the data source itself retrieves and processes the data. Here’s a high-level overview of how DirectQuery works in Power BI:
Connection | Query Creation | Query Translation | Query Execution | Visualisation | Real-Time Updates |
Connection: You establish a connection to the data source using one of the supported connectors. Power BI supports various data sources, including SQL Server, Oracle, MySQL, and many others.
Query Creation: You create your queries using Power BI’s query editor or the Query Builder. These queries can include filtering, sorting, aggregating, and joining operations, similar to the data transformation operations performed in Power Query.
Query Translation: Power BI translates the queries you create into the appropriate syntax for the underlying data source. The specific translation process depends on the data source and its query language.
Query Execution: The translated query is sent directly to the data source. The data source processes the query and returns the requested data to Power BI.
Visualisation: Power BI receives the data from the data source and visualises it based on the report or dashboard you’ve designed. You can create various visualisations, apply filters, and interact with the data in real time.
Real-Time Updates: As you interact with the visualisations or apply additional filters, Power BI sends additional queries to the data source to retrieve the updated data. This allows you to explore and analyse the data in real time, reflecting any changes made in the data source.
Feature | Import | Query |
Data Storage | Data is stored within Power BI | Data stays in the source system |
Data Refresh | Manual or scheduled refresh required | Real-time or on-demand refresh |
Data Volume | Limited by Power BI capacity | No limitation on data volume |
Data Transformation | Extensive data transformation | Limited data transformation capabilities |
Data Security | Data resides in Power BI service | Data stays in the source system, maintaining security measures of the source system |
Compatibility | Supports a wide range of data sources | Limited to supported DirectQuery data sources |
Query Complexity | Can handle complex queries | Limited ability to handle complex queries |
Custom Calculations | Calculations performed within Power BI | Calculations performed in the source system |
While Direct Query offers significant advantages, it is important to consider some key aspects and limitations:
Performance: Real time querying can impact performance, especially when dealing with large datasets or complex queries. It is crucial to optimise the underlying data source and ensure efficient indexing and query performance for optimal results.
Compatibility with Data Sources: Not all data sources support Direct Query. Power BI has a wide range of supported data sources, but it’s important to verify compatibility before choosing Direct Query as the connection mode.
Data Refresh: When using DirectQuery, a report or dashboard will always display the most recent data from the source when it is opened or refreshed. Additionally, the dashboard tiles may be refreshed as regularly as once every 15 minutes.
Query Folding Limitations: Direct Query relies on query folding, which means that certain transformations performed within Power Query Editor may not be compatible and will be executed in Power BI instead. This can impact performance and data retrieval.
Limited Power BI Features: Direct Query has some limitations when it comes to certain Power BI features, such as aggregations, calculated tables, and some visualisation options. It’s important to understand these limitations and determine if they align with your reporting requirements.
Single Query Rows Limit: The maximum number of rows that can return from a single query to the underlying source is 1 million. Visuals won’t show that many points and this limit typically has no practical repercussions. The restriction can, however, appear in Power BI sends queries that aren’t entirely optimized and demands some intermediate result that goes over the limit. You still can work around this limit if you split your data extract into multiple queries.
Unlike import mode, not all the data sources support Direct Query. Here is a list of some commonly used data sources supported by Power BI for Direct Query:
Azure SQL Database | Azure Synapse Analytics | SQL Server Database |
Oracle Database | Salesforce | SAP HANA |
Spark (Version 0.9 and above) | Snowflake | Amazon Redshift |
Azure Analysis Services | Teradata Database | Google BigQuery |
SAP Business Warehouse | IBM Netezza | Azure HDInsight Spark |
When using DirectQuery mode in Power BI, there are certain DAX functions that are not supported directly and cannot be used in calculations within the report. Here is a list of some commonly used DAX functions that are not supported in DirectQuery mode:
SUMX | AVERAGEX | MAXX |
TOTALMTD | ISBLANK | EXACT |
RANKX | ROW | FORMAT |
ISTEXT | ISNUMBER | ISERROR |
To make the most of Direct Query in Power BI, consider the following best practices:
Optimise Data Sources: Ensure that the underlying data sources are properly optimised for efficient querying and performance. Implement appropriate indexes and query-tuning techniques to enhance data retrieval speed.
Query Folding and Filtering: Leverage query folding by applying filters and transformations directly in the source system whenever possible. This helps reduce the data transferred between Power BI and the source, improving performance.
Aggregations and Summarisation: Explore the use of aggregated tables or materialised views in the data source to enhance performance and query speed, especially for large datasets. Aggregations can help pre-calculate and store summarized data for faster retrieval.
Consider Import or Composite Models: Depending on the reporting requirements, it may be beneficial to use Import or Composite models instead of Direct Query. Import mode allows for caching and pre-processing data, while Composite models offer a combination of Direct Query and Import modes.
This article has highlighted the benefits and limitations of using a Direct Query connectivity mode. Direct Query gives Power BI users flexibility and real-time analysis capabilities. When selecting whether to use Direct Query as connectivity mode, it’s critical to take into account the performance, compatibility, and restrictions of both the data source and Power BI itself.