Tableau is a business intelligence platform primarily used for data visualization and dashboard creation.
In recent years, Tableau has expanded its data extraction capabilities, allowing data analysts to pull data from multiple sources to gather insights and perform analysis.
This post summarizes Tableau's product offerings, including Tableau Prep (Tableau's built-in data extraction tooling), and suggests alternatives for when you need a more powerful ETL (Extract, Transform, Load) solution.
The 5 Tableau product offerings are:
Tableau Public
Tableau Desktop
Tableau Cloud (formerly Tableau Online)
Tableau Server
Tableau Prep
Tableau Public is Tableau's free data visualization offering, geared towards journalists, writers, and anyone who wants to explore data science and data visualization.
Tableau Public limits data exploration to its platform only, and does not allow for data extraction to a CSV or Excel file.
Tableau Desktop is Tableau's flagship offering for analysts and BI professionals. Data analysts can use Tableau Desktop to create visualizations on top of data from a SQL database, an Excel spreadsheet, or a cloud application like Salesforce or Oracle.
Tableau Cloud is the rebranded version of Tableau Online and provides the same capabilities as Tableau Desktop, in the cloud.
It's designed as a SaaS platform with BI tools for data analysts to connect to multiple data sources and create insights for enterprise-level use cases.
Tableau Server lets users share Tableau artifacts across an organization. It works by hosting locally created dashboards and data sources on a Windows or Linux server. Despite the similar name, Tableau Server is not a database like SQL Server.
Tableau Prep is a lightweight but powerful tool that data analysts use to ETL (extract, transform, load) data from multiple data sources into Tableau for visualization.
Tableau Prep handles basic data preparation use cases, such as cleansing and aggregating datasets that come from multiple sources.
Data can be ETL'd into Tableau with Tableau Prep.
Tableau is not a data warehouse. Tableau allows for creating data visualizations on top of data warehouses like Snowflake or Amazon Redshift.
Tableau Prep is Tableau's offering for extracting and transforming data from sources like Excel, Salesforce, and Microsoft SQL Server into Tableau.
Tableau Prep also works with data warehouses like Snowflake, Amazon Redshift, and Google BigQuery.
Using Tableau Prep, analysts can add new data to existing datasets as well as create filters for existing data.
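To make "adding new data" and "creating filters" concrete, here is a minimal sketch of the same two operations in plain Python. It does not use any Tableau Prep API; the sample rows, the union_rows and filter_rows helpers, and the choice to skip duplicate keys are all assumptions made for illustration.

```python
# Hypothetical rows standing in for an existing dataset and a new batch.
existing = [
    {"order_id": 1, "region": "East", "amount": 120.0},
    {"order_id": 2, "region": "West", "amount": 80.0},
]
new_rows = [
    {"order_id": 3, "region": "East", "amount": 45.5},
    {"order_id": 2, "region": "West", "amount": 80.0},  # already present
]


def union_rows(base, extra, key):
    """Append rows from `extra` to `base`, skipping keys already present."""
    seen = {row[key] for row in base}
    merged = list(base)
    for row in extra:
        if row[key] not in seen:
            merged.append(row)
            seen.add(row[key])
    return merged


def filter_rows(rows, predicate):
    """Keep only the rows for which `predicate` returns True."""
    return [row for row in rows if predicate(row)]


combined = union_rows(existing, new_rows, key="order_id")
east_only = filter_rows(combined, lambda row: row["region"] == "East")
print([row["order_id"] for row in east_only])
```

In a visual tool these would be a union step followed by a filter step; the code only shows the logic those steps express.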
Tableau Prep is powerful. Data analysts can complement their data preparation workflows with Python scripts and SQL.
However, Tableau Prep is also low-code, meaning data analysts can write SQL to manipulate and filter data coming from a database, but are not required to.
Data transformation can be done via the UI in Tableau. Finally, Tableau's large community forum, with plenty of tutorials and FAQs, makes it a great option for beginners and power users alike.
Tableau Prep is included in the Tableau Creator plan, which costs $70 per user per month, billed annually, for a total of $840 per user per year.
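For budgeting, the per-seat arithmetic scales linearly with team size. The sketch below uses only the price quoted above; the annual_cost helper is a hypothetical name, and it ignores taxes, discounts, and any other license types a team might mix in.

```python
MONTHLY_PRICE_PER_USER = 70  # Creator plan price quoted above, in USD
MONTHS_PER_YEAR = 12


def annual_cost(users: int) -> int:
    """Return the yearly cost in USD for `users` Creator seats."""
    if users < 0:
        raise ValueError("users must be non-negative")
    return MONTHLY_PRICE_PER_USER * MONTHS_PER_YEAR * users


print(annual_cost(1))  # one seat
print(annual_cost(5))  # a five-person analytics team
```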
Tableau Prep supports a number of connectors to popular sources, but there are some limitations to what data analysts can do.
Tableau does not have connectors for long-tail data sources. For example, users cannot ETL data from Google Analytics in Tableau Prep or extract data from arbitrary APIs.
Data analysts often need data refreshed on a regular schedule. Tableau does not allow for a regularly occurring ETL process, which makes it a poor fit for ETL automation.
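To show what a dedicated pipeline does in this situation, here is a minimal extract-transform-load sketch using only the Python standard library. Everything in it is an assumption for illustration: the JSON string stands in for a response from a hypothetical API, the field names are invented, and the output is written to an in-memory buffer rather than a real file or warehouse.

```python
import csv
import io
import json

# Stand-in for the body of an HTTP response from a hypothetical API.
# A real job would fetch this over the network instead.
SAMPLE_RESPONSE = (
    '[{"id": 1, "email": " Ana@Example.com ", "plan": "pro"},'
    ' {"id": 2, "email": "bo@example.com", "plan": null}]'
)


def extract(raw: str) -> list:
    """Parse the raw API payload into a list of records."""
    return json.loads(raw)


def transform(records: list) -> list:
    """Normalize emails and replace a missing plan with 'free'."""
    return [
        {
            "id": record["id"],
            "email": record["email"].strip().lower(),
            "plan": record["plan"] or "free",
        }
        for record in records
    ]


def load(rows: list, out) -> int:
    """Write rows as CSV to a file-like object and return the row count."""
    writer = csv.DictWriter(out, fieldnames=["id", "email", "plan"])
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


def run_once() -> str:
    """Run one full extract-transform-load pass and return the CSV text."""
    buffer = io.StringIO()
    load(transform(extract(SAMPLE_RESPONSE)), buffer)
    return buffer.getvalue()


csv_text = run_once()
print(csv_text)
```

Running a script like this on a schedule (for example from cron or a workflow orchestrator) is what turns a one-off extraction into an automated refresh. Managed ETL tools package the same loop with connectors, retries, and monitoring.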
When extracting data, Tableau's maximum file size is 1 GB, and data sampling is limited to 1 million rows at a time.
Users report that Tableau is slow on data flows with millions of rows and that it sometimes crashes when running complex workflows.
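A common way to work within a row limit is to process data in fixed-size batches rather than all at once. The sketch below is a generic Python pattern, not a Tableau feature; batched_rows is a hypothetical helper, and the demo uses a tiny batch size so the output is easy to read.

```python
from itertools import islice

MAX_ROWS_PER_BATCH = 1_000_000  # the row limit quoted above


def batched_rows(rows, batch_size):
    """Yield successive lists of at most `batch_size` rows from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Small demo: 10 rows in batches of 4. A real job would pass
# MAX_ROWS_PER_BATCH as the batch size instead.
demo_sizes = [len(batch) for batch in batched_rows(range(10), 4)]
print(demo_sizes)
```

Because the helper consumes an iterator lazily, only one batch is held in memory at a time.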
Portable
Stitch
Fivetran
Blendo
Airbyte
Hevo Data
Integrate.io
Talend
SQL Server Integration Services (SSIS)
AWS Glue
Oracle Data Integrator
Informatica
Alteryx
Portable is the best ETL tool for teams with long-tail data sources. It has built-in connectors for 300+ hard-to-find data sources and adds more regularly.
The Portable team develops new data connectors upon request with turnarounds in as little as a few hours. And they maintain those connectors if APIs change or datasets are no longer supported.
Portable offers a free plan for manual data workflows with no caps on volume, connectors, or destinations.
For automated data flows, Portable charges a flat fee of $200/month.
For enterprise requirements and SLAs, contact sales.
300+ built-in connectors for data sources you won't find with most other ETL tools.
Development and maintenance of custom connectors at no cost.
Premium support is included on all plans.
Portable focuses on long-tail data connectors and doesn't support major enterprise applications like Oracle or Salesforce.
No support for data lakes.
Only available to users in the U.S.
Portable is best for teams that can't find connectors for one or more data sources and want a solution that just works.
Stitch is an ETL tool that's part of the Talend ecosystem. It supports data transformations with Python, Java, SQL, or its no-code GUI. Stitch also supports change data capture and data replication.
Standard plan starting at $100/month for up to 5 million active rows per month, one destination, and 10 sources (limited to "Standard" sources)
Advanced plan at $1,250/month for up to 100 million rows and three destinations
Premium plan at $2,500/month for up to 1 billion rows and five destinations
14-day free trial available
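Tiered pricing like this is easier to compare once it is written down as data. The following sketch encodes only the row and destination limits listed above and picks the cheapest tier that fits. The Plan type and cheapest_plan function are hypothetical names, the Standard tier is modeled at its starting price, and source limits are ignored.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    name: str
    monthly_price: int      # USD per month
    max_rows: int           # rows per month
    max_destinations: int


# Figures taken from the plan list above.
PLANS = [
    Plan("Standard", 100, 5_000_000, 1),
    Plan("Advanced", 1_250, 100_000_000, 3),
    Plan("Premium", 2_500, 1_000_000_000, 5),
]


def cheapest_plan(rows_per_month: int, destinations: int) -> Optional[Plan]:
    """Return the lowest-priced plan that covers the workload, or None."""
    candidates = [
        plan
        for plan in PLANS
        if rows_per_month <= plan.max_rows
        and destinations <= plan.max_destinations
    ]
    return min(candidates, key=lambda plan: plan.monthly_price, default=None)


print(cheapest_plan(3_000_000, 1))
print(cheapest_plan(3_000_000, 2))
```

Note how a second destination moves a small workload from the $100 tier to the $1,250 tier, which is why destination limits matter as much as row counts.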
Support for over 130 data sources.
Built-in integrations with Talend suite of data tools.
Compatible with scripted and GUI-based data transformations.
Automations for monitoring and notifications.
Complex data transformations are not as well supported as on some other platforms.
On-premise deployments not available.
Limits on the number of data sources and destinations.
Stitch is best for teams using widely used data sources and looking for a tool with basic transformation support.
Fivetran is a popular ETL tool with 160+ supported data sources.
It can load data to MySQL and PostgreSQL databases hosted locally and on Amazon RDS, Amazon Aurora, Google Cloud, and Microsoft Azure.
Standard select: Est. $60/month (limited to 1 user and 500k monthly active rows)
Starter: Est. $120/month (limited to 10 users)
Standard: Est. $180/month
Enterprise: Est. $240/month
Business critical: Contact sales
14-day free trial available
Native warehouse transformations that work well even with complex data.
Support for change data capture for data replication jobs.
Real-time or near real-time data synchronization.
Higher-priced tool than many competitors.
Consumption-based pricing models can be hard to predict month-to-month.
Only supports ELT workloads, not ETL.
Fivetran is best for large businesses looking for a solution that supports the most popular enterprise platforms.
Blendo is a data integration tool with several automations to speed up the creation of ETL pipelines. It has scripts and predefined data models.
Free plan limited to three sources
Pro plan starts at $750/month and includes transformations
Enterprise plans available with custom pricing
Supports 45+ data sources.
No-code platform that's ideal for nontechnical teams.
Built-in monitoring and alert features.
Not as many data connectors as other ETL tools.
Limited data transformation functionality.
Teams can't create new data connectors on their own.
Blendo is best for data teams with a small number of sources and no transformation needs who want an easy-to-use platform.
Airbyte is an open-source ETL framework. You can deploy Airbyte's open-source version yourself or use its paid cloud plan.
Open source: Free to use since you host the software yourself
Cloud: $2.50/credit (one million rows = 6 credits; 1 GB = 4 credits)
Cloud high volume: Custom pricing (for 5,000+ credits)
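Credit-based pricing can be estimated with a few lines of arithmetic. This sketch uses only the rates quoted above. The estimate_monthly_cost function is a hypothetical helper, and it assumes that row-based and volume-based usage are simply added together, which may not match how a real invoice is calculated.

```python
from decimal import Decimal

PRICE_PER_CREDIT = Decimal("2.50")   # USD, from the Cloud plan above
CREDITS_PER_MILLION_ROWS = 6
CREDITS_PER_GB = 4


def estimate_monthly_cost(rows: int, gigabytes: int) -> Decimal:
    """Estimate the monthly cost in USD for a given usage level."""
    row_credits = Decimal(rows) / Decimal(1_000_000) * CREDITS_PER_MILLION_ROWS
    volume_credits = Decimal(gigabytes) * CREDITS_PER_GB
    return (row_credits + volume_credits) * PRICE_PER_CREDIT


# Example: 2 million rows plus 10 GB in one month.
print(estimate_monthly_cost(2_000_000, 10))
```

Because the cost tracks usage, running the estimate against a few months of historical volume gives a rough sense of how much the bill could swing.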
Support for 170+ data connectors (not all connectors available on cloud plan).
Large open-source community.
Warehouse-native data transformations.
Consumption-based pricing model, which can be hard to predict from one month to the next.
Cloud plan is missing some data integrations.
Airbyte is best for teams with the technical ability to develop and maintain any additional connectors using the Airbyte CDK.
Integrate.io is a no-code platform that supports 200+ data sources. It has pre-built templates to speed up creating new data flows.
Hevo is a no-code ETL tool that supports 150+ data sources and ETL, ELT, and Reverse ETL workflows. It supports real-time data loading, replications, and transformations.
Talend Open Studio is a free-to-download tool compatible with MySQL and other RDBMSs like Microsoft SQL Server. It is distributed under the open-source Apache license. The Talend ecosystem also includes Stitch.
AWS Glue is a serverless integration service that allows for ETL workflows between Amazon services like S3 and Redshift.
SSIS from Microsoft is an ETL tool that works well with data in SQL Server.
Oracle Data Integrator is an ELT ("Extract, Load, Transform") platform: it loads data into the target first and transforms it there, rather than following the ETL ("Extract, Transform, Load") order. Like other ETL providers, Oracle provides a visual interface for creating data flows from multiple sources.
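The difference between ELT and ETL is the order of the last two steps. In ELT, raw data lands in the target database first and is then transformed with SQL inside that database. The sketch below illustrates that order with Python's built-in sqlite3 module; the table names and sample rows are invented for the example, and it says nothing about how Oracle Data Integrator is implemented.

```python
import sqlite3

# Hypothetical raw export: inconsistent region casing, amounts as text.
raw_orders = [
    (1, " east ", "120.50"),
    (2, "WEST", "80"),
    (3, "East", "45.5"),
]

conn = sqlite3.connect(":memory:")

# Load: land the raw, untransformed values in a staging table first.
conn.execute(
    "CREATE TABLE raw_orders (order_id INTEGER, region TEXT, amount TEXT)"
)
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_orders)

# Transform: clean and aggregate with SQL inside the database itself.
conn.execute(
    """
    CREATE TABLE region_totals AS
    SELECT lower(trim(region)) AS region,
           SUM(CAST(amount AS REAL)) AS total_amount
    FROM raw_orders
    GROUP BY lower(trim(region))
    """
)

totals = conn.execute(
    "SELECT region, total_amount FROM region_totals ORDER BY region"
).fetchall()
print(totals)
conn.close()
```

An ETL pipeline would instead clean the region and amount values in application code before inserting them, so the database would only ever see the finished rows.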
Informatica is a cloud-native enterprise ETL solution that charges using consumption-based pricing.
Tableau is an industry leader in data visualization and business intelligence. Tableau Prep bundles data preparation and lightweight ETL with its core use case of data analytics.
While Tableau Prep has powerful pre-built features, data analysts should consider a dedicated ETL tool when working with long-tail data sources or with complex workflows over datasets containing millions of rows, as is common in machine learning and data science.
Most tools focus on major enterprise applications and won't pull in the critical data from your long-tail data sources. Portable does just that and handles the development and maintenance of new connectors, too.
Looking for the best Tableau ETL tool? Get started with Portable.