Power BI ETL: Power Query, Dataflows, and Datamarts

Ethan
CEO, Portable

What is Power BI?

Microsoft Power BI is a data visualization tool often used as an alternative to Tableau for creating dashboards and performing data analysis.

What are the Advantages of using Power BI?

Power BI combines data visualization with data preparation.

Power BI is useful for business users looking for a way to summarize insights from their organization's data in the form of dashboards and data visualizations.

Prior to visualization, however, users must transform data coming from different data sources into a centralized schema. This is called data preparation, and is an essential step in business intelligence workflows.
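
To make "a centralized schema" concrete, here is a minimal Python sketch of the idea (not Power Query itself). The two sources, their field names, and the mappings are all hypothetical and only illustrate renaming differently shaped records into one shared set of columns.

```python
# Hypothetical column mappings: source field name -> centralized field name.
CRM_MAPPING = {"CustomerName": "customer", "Amt": "amount", "Date": "order_date"}
SHOP_MAPPING = {"buyer": "customer", "total": "amount", "created_at": "order_date"}


def to_central_schema(rows, mapping):
    """Rename each row's fields per the mapping, dropping unmapped fields."""
    return [
        {target: row[source] for source, target in mapping.items()}
        for row in rows
    ]


# Made-up sample rows from two differently shaped sources.
crm_rows = [{"CustomerName": "Acme", "Amt": 120.0, "Date": "2023-01-05", "Owner": "kim"}]
shop_rows = [{"buyer": "Globex", "total": 75.5, "created_at": "2023-01-06"}]

# Once both sources share the same columns, they can be stacked into one table.
central = to_central_schema(crm_rows, CRM_MAPPING) + to_central_schema(shop_rows, SHOP_MAPPING)
```

Every row in `central` now has the same three columns, which is the shape a dashboard needs before it can summarize across sources.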

Power BI supports this use case with a native ETL-lite framework called Power Query.

Power Query is a feature within Power BI that provides basic extract and transform capabilities in an easy-to-use UI.

Is Power BI an ETL or ELT tool?

Power BI is not a dedicated ETL tool, but Power BI does come with a built-in user interface to get and transform data called Power Query.

Power Query is a UI within Excel and Power BI that allows for self-service data preparation and data transformation.

What is Power Query used for?

Power Query allows analysts to get data from data sources like Excel and load data into Power BI, Microsoft Azure Data Lake Storage, or other Azure apps and data warehouses.

Power Query can be used in a number of environments, including Power BI Desktop, Power BI Service, and even Microsoft Excel.

Power Query is not the same thing as Excel. It is a separate data preparation tool that Excel users can use to import data into their spreadsheets.

What are the Advantages of Power Query in Power BI?

Power Query saves business users time, especially non-technical users who want to extract, transform, and analyze data from multiple data sources.

The advantages of Power Query in Power BI are:

  1. An easy-to-use UI for extracting data from data sources and transforming data from different schemas into one centralized source of truth.

  2. A large connector library with data integrations for most popular data sources.

  3. Flexibility to aggregate and append data regardless of data volume and data shape.

  4. Tools for building repeatable processes to automate one-off use cases.

  5. Collaboration across environments: Power Query can be used locally on Power BI Desktop and also within a shared workspace via Power BI Service, the cloud-based version of Power BI.

Using Power Query for ETL

Power Query is primarily used to get data (extract) from data sources like Excel, SharePoint, or SQL Server.

Power Query also allows users to aggregate, append, and transform data before it's loaded into Power BI for visualization.
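
The three operations named here (append, transform, aggregate) can be sketched in a few lines of Python. This is an illustration of the concepts only, not of how Power Query runs them, and the monthly sales tables are made-up sample data.

```python
from collections import defaultdict

# Made-up sample tables with identical columns; "sales" arrives as text.
january = [
    {"region": "east", "sales": "100"},
    {"region": "west", "sales": "250"},
]
february = [
    {"region": "east", "sales": "50"},
]

# Append: stack tables that share the same columns.
appended = january + february

# Transform: convert the text "sales" column to a number.
typed = [{**row, "sales": int(row["sales"])} for row in appended]

# Aggregate: total sales per region.
totals = defaultdict(int)
for row in typed:
    totals[row["region"]] += row["sales"]

sales_by_region = dict(totals)
```

In Power Query, each of these stages would typically appear as a separate applied step in the query, built through the UI rather than written by hand.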

You can learn more about Power Query for ETL on Microsoft's blog.

Using M Language in Power Query's Advanced Editor

For advanced data transformation, Power Query's Advanced Editor exposes M, a "mashup" formula language similar to F# that allows users to define custom columns and transformation steps within their schema.
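
As a rough picture of what "defining a custom column" means, here is a Python sketch (Python, not M). The helper `add_column` is hypothetical and only loosely modeled on the idea behind M's `Table.AddColumn`: take a table, a new column name, and a function that computes the value for each row. The order data is made up.

```python
def add_column(rows, name, generator):
    """Return new rows with an extra column computed from each existing row."""
    return [{**row, name: generator(row)} for row in rows]


# Made-up sample table.
orders = [
    {"order_id": 1, "quantity": 3, "unit_price": 9.5},
    {"order_id": 2, "quantity": 1, "unit_price": 20.0},
]

# Custom column: line_total = quantity * unit_price, computed row by row.
with_total = add_column(orders, "line_total", lambda r: r["quantity"] * r["unit_price"])
```

The original `orders` rows are left untouched and a new table is returned, which mirrors the functional style of M, where each step produces a new value rather than modifying the previous one.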

Is Power Query the same as SQL?

Power Query is not the same as SQL. M Language is a scripting language that serves a similar purpose of filtering and defining aggregate columns within Power BI, as you would with SQL, but they are not the same.
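
One way to see the difference in style: SQL describes the whole result in a single declarative statement, while an M query is written as a chain of named steps, each building on the previous one. The Python sketch below contrasts the two approaches on the same made-up data. It uses SQLite for the SQL half and plain Python to mimic the step-by-step half, so it is an analogy rather than real M.

```python
import sqlite3

# Made-up sample data: (region, amount).
rows = [("east", 100), ("west", 250), ("east", 50), ("west", 5)]

# SQL style: one declarative statement filters and aggregates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
sql_result = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales WHERE amount >= 50 GROUP BY region"
    ).fetchall()
)
conn.close()

# Step-by-step style: each named step takes the previous step's output.
source = rows
filtered = [(region, amount) for region, amount in source if amount >= 50]
grouped = {}
for region, amount in filtered:
    grouped[region] = grouped.get(region, 0) + amount
```

Both halves produce the same totals per region. What differs is how the logic is expressed, which is the sense in which M and SQL serve a similar purpose without being the same language.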

Implementing DAX Measures

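If M Language is the advanced way of transforming data that's ETL'd into Power BI, DAX (Data Analysis Expressions) is the advanced way of defining measures and calculated columns, such as sums, averages, and filtered totals, within the data model after data has been ETL'd into Power BI. Reshaping operations like pivot and unpivot belong to Power Query rather than DAX.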

DAX is most similar to an Excel formula and is typically used for lightweight data filtering and column customization for dashboarding and data visualizations.
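
The key idea behind a measure is that the aggregation is defined once and then re-evaluated for whatever slice of data a visual or filter currently shows. The Python sketch below imitates that behavior. It is not DAX, and the function names and sales rows are hypothetical.

```python
# Made-up sample table.
sales = [
    {"region": "east", "year": 2022, "amount": 100},
    {"region": "east", "year": 2023, "amount": 150},
    {"region": "west", "year": 2023, "amount": 250},
]


def total_sales(rows):
    """A 'measure': one aggregation, evaluated over whatever rows are in view."""
    return sum(row["amount"] for row in rows)


def apply_filters(rows, **filters):
    """Keep rows matching every filter, like slicers on a report page."""
    return [r for r in rows if all(r[k] == v for k, v in filters.items())]


# The same measure gives different answers depending on the active filters.
overall = total_sales(sales)
east_only = total_sales(apply_filters(sales, region="east"))
west_2023 = total_sales(apply_filters(sales, region="west", year=2023))
```

This is why one measure can drive many visuals: the definition stays fixed while the filters around it change.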

Using Power BI Dataflows for ETL

When users need to share reusable data across Power BI reports or run ETL on large data volumes, Power BI Dataflows provides more advanced ETL features and can load the results into Azure Data Lake Storage and other Microsoft products.

Similar to Power Query, Power BI Dataflows can get data from a database. On top of this, Power BI Dataflows allows users to create, merge, and refresh tables with incremental data on a scheduled basis.
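
Refreshing a table "with incremental data" generally means loading only what changed since the last run instead of reloading everything. The Python sketch below shows one common way to do that, using a last-modified watermark. It is an assumption-laden illustration, not a description of how Dataflows implements incremental refresh, and the `id` and `modified` fields are hypothetical.

```python
from datetime import datetime


def incremental_refresh(table, source_rows, last_refresh):
    """Upsert only source rows modified after the previous refresh.

    `table` maps a row id to the stored row. Returns the new watermark.
    """
    changed = [r for r in source_rows if r["modified"] > last_refresh]
    for row in changed:
        table[row["id"]] = row  # insert new rows, overwrite updated ones
    return max((r["modified"] for r in changed), default=last_refresh)


# Made-up state: one row already loaded by an earlier refresh.
table = {1: {"id": 1, "amount": 10, "modified": datetime(2023, 1, 1)}}

# Made-up source: one unchanged row, one new row, one updated row.
source = [
    {"id": 1, "amount": 10, "modified": datetime(2023, 1, 1)},
    {"id": 2, "amount": 20, "modified": datetime(2023, 1, 3)},
    {"id": 1, "amount": 15, "modified": datetime(2023, 1, 4)},
]

watermark = incremental_refresh(table, source, last_refresh=datetime(2023, 1, 2))
```

A scheduler would call this on each run and pass the returned watermark into the next one, so each refresh touches only new or changed rows.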

Dataflows also unlock advanced Azure features like Azure Machine Learning and data streaming.

You can learn more about Dataflows on Microsoft's blog.

Power BI Datamarts, should I use them?

Looking for a more powerful solution than Power Query, but with less IT involvement than Power BI Dataflows? Datamarts is a simple in-between solution for ETL'ing data into a managed database without DevOps or IT support.

Datamarts is designed for users who want a no-code environment for use cases with large datasets that do not involve advanced data transformation. Datamarts are not a replacement for data warehouses. Instead of creating a new one-off database for each ETL workflow, users can treat Datamarts as an in-between SaaS offering that sits between Power Query and Dataflows.

What is the difference between Power BI and dedicated ETL tools?

Between Power Query, Power Query Editor, Power BI Dataflows, and Datamarts, business users have many options to ETL data into Power BI or Azure Data Lake Storage.

In addition to the above, Microsoft even provides legacy ETL services like SQL Server Integration Services (SSIS).

Depending on your use case, however, a managed ETL process can be a better option than out-of-the-box Power BI features.

Disadvantages of Power BI for ETL

Performing operations across data types and importing data across large datasets might be a job more suitable for a dedicated ETL service than for Power BI's built-in features.

Below are the top ETL tools for every Power BI use case so you can choose one that works best for your organization.

Best ETL tools for Power BI

1. Portable

Portable is the best ETL tool for teams with long-tail data sources. It has built-in connectors for 300+ hard-to-find data sources and adds more regularly.

Even better, the Portable team develops new data connectors upon request with turnarounds in as little as a few hours. And they maintain those connectors if APIs change or datasets are no longer supported.

Pricing

  • Portable offers a free plan for manual data workflows with no caps on volume, connectors, or destinations.

  • For automated data flows, Portable charges a flat fee of $200/month.

  • For enterprise requirements and SLAs, contact sales.

Key features

  • 300+ built-in connectors for data sources you won't find with most other ETL tools.

  • Development and maintenance of custom connectors at no cost.

  • Premium support is included on all plans.

Disadvantages

  • Portable focuses on long-tail data connectors and doesn't support major enterprise applications like Oracle or Salesforce.

  • No support for data lakes.

  • Only available to users in the U.S.

Who is Portable best suited for?

Portable is best for teams that can't find connectors for one or more data sources and want a solution that just works.

2. Stitch

Stitch is an ETL tool that's part of the Talend ecosystem. It supports data transformations with Python, Java, SQL, or its no-code GUI. Stitch also supports change data capture and data replication.

Pricing

  • Standard plan starting at $100/month for up to 5 million active rows per month, one destination, and 10 sources (limited to "Standard" sources)

  • Advanced plan at $1,250/month for up to 100 million rows and three destinations

  • Premium plan at $2,500/month for up to 1 billion rows and five destinations

  • 14-day free trial available

Key features

  • Support for over 130 data sources.

  • Built-in integrations with Talend suite of data tools.

  • Compatible with scripted and GUI-based data transformations.

  • Automations for monitoring and notifications.

Disadvantages

  • Complex data transformations are not as well supported as on some other platforms.

  • On-premises deployments are not available.

  • Limits on the number of data sources and destinations.

Who is Stitch best suited for?

Stitch is best for teams that rely on popular data sources and want a tool with basic transformation support.

3. Blendo

Blendo is a data integration tool with several automations to speed up the creation of ETL pipelines. It has scripts and predefined data models.

Pricing

  • Free plan limited to three sources

  • Pro plan starts at $750/month and includes transformations

  • Enterprise plans available with custom pricing

Key features

  • Supports 45+ data sources.

  • No-code platform that's ideal for nontechnical teams.

  • Built-in monitoring and alert features.

Disadvantages

  • Not as many data connectors as other ETL tools.

  • Limited data transformation functionality.

  • Teams can't create new data connectors on their own.

Who is Blendo best suited for?

Blendo is best for data teams with a small number of sources and no transformation needs that are looking for an easy-to-use platform.

Power BI ETL: The Bottom Line

Power BI is a handy data visualization tool that comes with a built-in data importing tool called Power Query.

Use a dedicated ETL tool when you are looking for a long-tail connector that is not supported by Power BI's connector library.

You'll get the most out of Power BI when you pair it with a powerful ETL tool.

Most tools focus on major enterprise applications and won't pull in the critical data from your long-tail data sources. Portable does just that and handles the development and maintenance of new connectors, too.

Looking for the best Power BI ETL tool? Get started with Portable.