For most of the history of ELT/ETL tools, open-source options have stayed on the sidelines.
But that all changed with Airbyte in 2020. In just a few years, Airbyte has become one of the most talked-about ELT tools in the industry.
Data ingestion and transformation are becoming more mission-critical every year. Open-source choices, and Airbyte in particular, have some advantages but also come with downsides.
And Airbyte is far from the only option on the market. Today, we'll look at some of its most compelling competitors so you can make the best choice for your business.
Because it's open source, you can run Airbyte on your own servers at no cost. The open-source nature also makes it easier to add your own connectors than it might be on closed-source platforms (however, you need to be a developer to do so).
But despite its early appeal, Airbyte has some flaws.
The cloud offering is nascent, with only 120+ connectors available, most of which are also available from a number of other vendors.
And like many cloud-hosted ETL offerings, pricing is usage-based and can be confusing and very hard to predict from one month to the next.
According to Airbyte's G2 profile, this is what real users are saying:
What do you like best about Airbyte?
Airbyte allowed us to copy millions of rows from a SQL Server to Snowflake with no cost and very little overhead.
The reliability and the ability to build custom connectors. The fact that you can choose self-hosted or cloud-hosted versions.
What do you dislike about Airbyte?
Occasionally had to reset Docker & Airbyte in order for the engine to continue working.
Originally, there was no checkpointing when extracting large sources, but this has now been added as a feature.
What problems is Airbyte solving and how is that benefiting you?
Pushing data to the cloud with little or no cost incurred. When we tried doing the same via a different method, the cost was in the thousands. As a non-profit, we have to watch expenses at every turn.
It's one of the best ELT systems for integrating the less commonly used third-party data sources - it is not enough for us to integrate 50% of our data sources: we need them all.
Open source: Free to use since you host the software yourself
Cloud: $2.50/credit (one million rows = 6 credits; 1 GB = 4 credits)
Cloud high volume: Custom pricing (for 5,000+ credits)
300+ connectors available in the open-source offering
Open-source solution licensed under the Elastic License 2.0
Extensive destination support, with 20+ data warehouses and data lakes available including Snowflake, Google BigQuery, Microsoft Azure Blob Storage, AWS Datalake, and more
Warehouse-native data transformations
Cloud service lacks support for many connectors that are included in the open-source offering (120+ total)
You need to be a developer to write a custom connector
For long-tail connectors, you are on the hook to maintain the connector yourself
Pricing credits can be hard to predict, leading to unexpected costs each month
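To see why credit-based bills are hard to predict, it can help to work the arithmetic. The sketch below uses only the Airbyte Cloud rates quoted above ($2.50 per credit, 6 credits per million rows, 4 credits per GB). The function name, and the assumption that row-based and volume-based usage simply add up with fractional credits billed proportionally, are ours for illustration and are not Airbyte's published billing logic.

```python
# Rates as quoted in this article; check current pricing before relying on them.
PRICE_PER_CREDIT_USD = 2.50
CREDITS_PER_MILLION_ROWS = 6
CREDITS_PER_GB = 4


def estimate_airbyte_cloud_cost(rows_synced: int, gb_synced: float) -> float:
    """Rough monthly estimate in USD.

    Assumption (not from Airbyte): row and volume charges are additive
    and partial credits are billed proportionally.
    """
    row_credits = (rows_synced / 1_000_000) * CREDITS_PER_MILLION_ROWS
    volume_credits = gb_synced * CREDITS_PER_GB
    return round((row_credits + volume_credits) * PRICE_PER_CREDIT_USD, 2)


# 10 million rows plus 50 GB: 60 + 200 = 260 credits
print(estimate_airbyte_cloud_cost(10_000_000, 50))  # 650.0
# The same sources in a busier month with twice the usage
print(estimate_airbyte_cloud_cost(20_000_000, 100))  # 1300.0
```

Because the bill scales directly with rows and gigabytes, any month-to-month swing in source activity shows up as the same swing in cost.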
Portable is a platform designed to provide the long-tail connectors Airbyte is slow to support.
Portable currently has a library of 500+ connectors and creates bespoke connectors on request in as little as a few hours.
Portable also maintains these connectors as APIs change, maintaining data quality and saving engineering teams from ongoing work.
Manual: Unlimited data sources, destinations, and volumes at no charge when used with manual sync
Scheduled data flow: $200/data flow for unlimited sources, destinations, and volumes
Custom: Enterprise pricing for teams that need custom plans
300+ data source connectors
Major cloud data warehouses supported
No limits for sources, destinations, and volumes---even on basic plan
Development and maintenance of new data sources included
White-glove customer support
Portable has 500+ pre-built connectors, more than Airbyte's cloud offering, and all these connectors are available on all plans.
Portable builds and maintains new long-tail connectors upon request at no additional cost.
Portable offers premium support for all customers, without extra charges for better help.
300+ connectors, more than most competitors and focused around the long-tail data sources most tools don't support
Per-source pricing is clear and easy to predict month-to-month
Full-featured free plan lets you test and use all connectors at no charge for as long as you want, with manual sync
Bespoke connectors built and maintained at no cost
Focused on long-tail connectors and doesn't support enterprise data sources like Salesforce or QuickBooks
Doesn't focus on databases as connectors
Portable is only available for companies based in the U.S.
Portable is best for companies that need one or more hard-to-find data connectors but don't have the resources to build and maintain those connectors themselves.
Meltano is probably the closest direct competitor to Airbyte. It's one of the newest entrants to the ELT world, having spun off of GitLab in 2021. Meltano is an open-source platform like Airbyte, but unlike Airbyte, it integrates with the Singer protocol.
At the time of writing, Meltano does not offer paid plans---only its open-source platform which is free to run. But a cloud-based SaaS solution is in the works with a projected launch date in 2023.
300+ data sources
Native integration with Singer open-source data integration format
SDK for custom connectors
Version control and testing built-in
Engineer-first design and features
Meltano integrates with 300+ data sources. The platform has access to more connectors because of its integration with Singer, a key difference in the philosophies behind Meltano and Airbyte.
Meltano's SDK integrates with Singer for building your own custom connectors. Most likely, your team would be responsible for development and maintenance.
Meltano is not as established as Airbyte and has a small-but-growing community. Meltano may offer support in the future as part of its cloud plan.
Free, open-source platform
Features like Git version control, staging and dev environments, and end-to-end CI testing are built in
Integration with Singer, making it easier to build flexible custom connectors
Meltano is built for engineers and doesn't offer no-code or low-code options
Currently no managed SaaS plan, only self-hosted
Smaller user community and no official support
Singer connectors can break without warning, and Meltano does not guarantee connector maintenance
Meltano is best for data engineering teams with a high level of technical expertise.
Fivetran was one of the first ETL tools in the industry, founded back in 2012. It was originally a data analysis and visualization platform (its name is a play on the programming language Fortran), but pivoted to ETL and has become one of the major players in the industry.
Unlike Airbyte, Fivetran is closed-source. And its main focus is on building connectors for widely used platforms and data sources.
Standard select: Est. $60/month (limited to 1 user and 500k monthly active rows)
Starter: Est. $120/month (limited to 10 users)
Standard: Est. $180/month
Enterprise: Est. $240/month
Business critical: Contact sales
14-day free trial available
Robust support for the biggest, most popular enterprise applications
Native warehouse transformation that's especially effective with complex data models
Data replication using change data capture (CDC)
Fivetran offers 160+ data source connectors. Most of these are for the biggest enterprise platforms.
You can request custom data connectors through Fivetran, but development is slow. You can also create your own custom connectors, but you won't have access to its REST API on the Starter plan.
Fivetran offers 24/7 global support on all of its plans. More expensive plans have access to SLAs.
One of the most established data integration platforms that's trusted by some of the world's biggest companies
Integrations with the largest enterprise sources and destinations
Real-time or near real-time data synchronization
Higher prices than most other tools in the space
Complex pricing model that can be difficult to understand and predict
Limited support for long-tail data connectors, and a low probability they'll be developed in-house
Fivetran only supports ELT workloads, so businesses with ETL workloads will need a different tool
Fivetran is best for companies that use the most widely available data sources and have a larger budget for robust, reliable data integration.
CData is a suite of data tools for universal connectivity. It includes ELT, ETL, visualization, data connectivity, cloud connectivity, and others.
Standard: $49/month (one data source)
Professional: $99/month (limited to five data sources)
Enterprise: $199/month (limited to ten data sources; custom pricing for 10+ sources)
Other tools in suite available separately
250+ connectors, which CData calls "drivers"
Includes CData suite of tools like Arc, Sync, and Connect Cloud
Optimized speed for data sources
Allows access to NoSQL data types as though they were SQL
Support for most types of workloads, including ETL, ELT, and Reverse ETL
Available as a cloud SaaS product or on-premises
CData offers 250+ built-in drivers (data connectors).
If you need to connect to a data source that's not natively supported by CData, you can use the Universal API Driver to develop and maintain your own connector.
CData has well-reviewed company support. All plans get email assistance, and Enterprise customers with more than five users have access to Premium Support with phone help and 24-hour response times.
Straightforward, easy-to-understand pricing for data ingestion
Built-in connectors for a wide range of popular data sources
Tool integrations for almost every aspect of the data pipeline
Tools aren't bundled, so pricing can be hard to calculate and quickly get expensive for a complete pipeline
Reviews say documentation can be hard to understand and use
Support community is small compared to many competitors, especially Airbyte
Data apps have limited support for custom fields
CData is a good fit for teams looking for a complete set of SQL pipeline tools.
Stitch is an ETL tool that focuses on business intelligence. It has a unique position in the market, straddling both closed- and open-source technologies. The platform itself is closed-source and owned by Talend.
But Stitch integrates with---and created---the Singer open-source data protocol. In other words, it's a closed-source platform based on open-source components.
Standard: Starts at $100/month (limited to 5 million monthly active rows, 1 destination, and 10 sources)
Advanced: $1,250/month (limited to 100 million rows and 3 destinations)
Premium: $2,500/month (limited to 1 billion rows and 5 destinations)
14-day free trial
130+ data sources through Singer
Automation tools like monitoring and alerts
Support for most popular warehouses and data lakes
Built-in transformation capabilities with a UI or Java, SQL, or Python
Stitch comes with 130+ data sources powered by Singer. While all Singer connectors are open-source, some are only available on the Advanced and Premium plans. (One of the downsides of a hybrid open/closed-source platform.)
Stitch integrates seamlessly with Singer for creating new connectors. Singer uses interchangeable "taps" and "targets" for data sources and destinations, respectively. So a newly developed tap works with all existing targets, and vice versa. This is an improvement, but your team will need to develop and maintain any new connectors.
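The tap-and-target idea can be sketched in a few lines. In Singer, a tap writes JSON messages (one per line) to standard output and a target reads them from standard input, using message types such as SCHEMA, RECORD, and STATE. The example below is a toy illustration only: the `users` stream, its fields, and the `run_tap` and `run_target` functions are hypothetical, and an in-memory buffer stands in for the shell pipe that would normally connect the two programs.

```python
import io
import json


def run_tap(out) -> None:
    """Hypothetical tap: emits Singer-style messages as JSON lines."""
    messages = [
        {
            "type": "SCHEMA",
            "stream": "users",
            "schema": {
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                }
            },
            "key_properties": ["id"],
        },
        {"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "Ada"}},
        {"type": "RECORD", "stream": "users", "record": {"id": 2, "name": "Grace"}},
        # STATE lets the next run resume where this one stopped.
        {"type": "STATE", "value": {"users": 2}},
    ]
    for message in messages:
        out.write(json.dumps(message) + "\n")


def run_target(inp):
    """Hypothetical target: groups records by stream, keeps the last STATE."""
    rows = {}
    state = None
    for line in inp:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message["type"] == "RECORD":
            rows.setdefault(message["stream"], []).append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
        # SCHEMA messages are ignored here; a real target would use them
        # to create or validate destination tables.
    return rows, state


# An in-memory buffer plays the role of the pipe between tap and target.
pipe = io.StringIO()
run_tap(pipe)
pipe.seek(0)
rows, state = run_target(pipe)
print(rows)
print(state)
```

Because the two sides only agree on the message format, any tap can in principle be paired with any target, which is the interchangeability described above.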
Stitch offers basic email and chat support for all customers during business hours. Certain customers with higher-paying plans are eligible for phone support and a dedicated support rep.
Robust transformations with Singer, especially with JSON data types
Fast and easy to set up, in as little as a few minutes
Integrations with other Talend data tools
Cost-effective solution for the needs of most small businesses
More affordable pricing than some competitors, but row-based model gets expensive and can be hard to predict
Customer support is limited on the more basic plans
Singer connectors aren't maintained by Stitch and can break unexpectedly - that's why Portable might be a better choice than Stitch.
Stitch is designed for teams expecting to use common data sources, or that are familiar with building and maintaining new connectors on Singer.
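As a rough illustration of how Stitch's tiered, row-based pricing behaves, the sketch below picks the cheapest plan that fits a given workload using only the limits quoted above. The function name is hypothetical, Standard is treated as a flat $100 even though the article says it only "starts at" that price, and the Advanced and Premium plans are assumed to have no source limit because none is stated.

```python
from typing import Optional, Tuple

# (plan, monthly price in USD, max monthly rows, max destinations, max sources)
# Limits as quoted in this article. None means no limit is stated, which is
# treated as unlimited purely for illustration.
STITCH_PLANS = [
    ("Standard", 100, 5_000_000, 1, 10),
    ("Advanced", 1250, 100_000_000, 3, None),
    ("Premium", 2500, 1_000_000_000, 5, None),
]


def cheapest_stitch_plan(
    monthly_rows: int, destinations: int, sources: int
) -> Optional[Tuple[str, int]]:
    """Return (plan, price) for the first plan that fits, or None if none does."""
    for name, price, max_rows, max_destinations, max_sources in STITCH_PLANS:
        fits_sources = max_sources is None or sources <= max_sources
        if (
            monthly_rows <= max_rows
            and destinations <= max_destinations
            and fits_sources
        ):
            return name, price
    return None


print(cheapest_stitch_plan(3_000_000, 1, 5))  # ('Standard', 100)
print(cheapest_stitch_plan(3_000_000, 2, 5))  # ('Advanced', 1250)
```

In this sketch, adding a second destination moves the same 3 million rows from the $100 tier to the $1,250 tier, which is the kind of jump that makes tiered pricing hard to budget for.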
Segment is a platform for customer data. This focus on customer data means Segment has a few unique features, like merging personas and segmentation based on the customer journey. The company was acquired by Twilio in 2020.
Free: Capped at 1,000 monthly visitors and 2 data sources
Team: Starts at $120/month for 10,000 monthly visitors
Business: Custom pricing
300+ data sources
Data governance tools and debugging
Customer segments, user profiles, and pixel embedding
Functionality to export to marketing tools and CRM platforms
Segment has 300+ connectors which are available on all plans.
Segment has a built-in Functions tool for developing custom connections. Your team will be responsible for any development or maintenance of custom connectors.
Every plan has Standard support. Segment offers four additional levels of support as paid add-ons, available to teams with plans totalling $60,000 or more.
Simple, easy-to-use user interface
Free plan and affordable plans for teams with smaller data needs
Unique features for customer segmentation that aren't available from other competitors, including Airbyte
Pricing is based on the number of users tracked, which can become expensive, and anonymous users can be charged for several times
Some connectors don't leverage all data contained in APIs, meaning you'll need to develop a custom connector to extract additional data from "supported" data sources
Heavily focused around sales and marketing and not as applicable for other use cases
Segment is best for data teams that only deal with customer and marketing data.
Precog is a cloud ETL tool with a completely different approach. Instead of manually integrating every data source, Precog uses artificial intelligence to parse and apply schema to new data types.
Custom pricing based on data source.
10,000+ data sources ready to use out of the box
100+ destinations supported
Available both as a cloud SaaS product and on-premise
No-code platform
Precog offers 10,000+ connectors, more than any other Airbyte competitor. These aren't hand-coded but rather developed using Precog's AI engine.
You can create new connectors using Precog's AI in a no-code environment. You can also request a connection from Precog, and they'll deliver within 48 hours.
Precog offers customer support based on the plan you choose and your team's unique needs.
Thousands more connectors than Airbyte or any competitor
Transparent pricing based on data sources
Unique no-code, AI-based approach to data integration
SaaS and on-premise both available
Recent newcomer to the ELT space without much community (especially when compared with Airbyte's 12,000+ member community)
No-code focus can result in limitations for technical teams
Pricing can be expensive for teams that need a large number of connectors
Precog is best for teams who need extremely long-tail data connectors and want a no-code environment.
Hevo is a data pipeline tool with built-in transformation capabilities. It's a no-code platform and offers ETL, ELT, and Reverse ETL workflows.
Free: Capped at one million events (50+ data sources available)
Starter: Starts at $239/month
Business: Custom quote
Hevo Activate is a separate tool for Reverse ETL:
Free: Limited to 4 data warehouses and 3 SaaS targets
Start: $199/month (limited to 4 data warehouses and 5 SaaS targets)
Flexible pricing
150+ data connectors (free plan only has access to 50+)
Data migration offered in real-time
Python data transformations built-in
Round-the-clock live support
Hevo's free plan includes 50+ data connectors. Paid plans include 150+.
You can build a custom connector using Hevo's REST API. Alternatively, you can connect with long-tail data sources using its webhooks source.
Hevo has 24/7 customer support for all plans. Starter includes live chat support, and Business includes a dedicated account manager.
Broad support for ETL, ELT, and Reverse ETL pipelines
No-code platform that doesn't require programming knowledge
Generous free plan for businesses with popular data sources
Limited support for long-tail connectors, especially on the free plan
Usage-based pricing can be hard to predict month-to-month
Migrations between tools can require manual mapping
Hevo is best suited for teams using popular data sources that need a no-code tool.
Rivery is a closed-source SaaS ELT data orchestration platform that has both no-code and code options. You can create Rivery ELT scripts, called "rivers," using a GUI or with Python.
Starter: $0.75 per credit
Professional: $1.20 per credit
Enterprise: Custom pricing
One credit = one API pipeline execution, 100 MB of data replication, or one logic and transformation execution
14-day free trial
200+ data sources
15+ supported data management destinations
24/7 customer support
Fully managed ELT, Reverse ETL, and transformations
Starter kits that include prebuilt, editable rivers already set up
Rivery currently has 200+ pre-built connectors, ready to use on all plans.
You can create a new data source using Rivery's Custom API feature. The Rivery team will also build custom data connectors upon request.
Rivery is known in the industry for its support, scoring 9.8 on G2 (compared to the industry average of 8.5). The level of support depends on your plan.
Comprehensive API support for development teams
Starter kits and no-code GUI make it easy to set up new data pipelines
Recognized for excellent customer support
Credit-based pricing is complex, changes month-to-month, and can be hard to predict
GUI can become complex for more intricate data sets and connections
Error messages can be vague and make it hard to pinpoint the problem
Rivery is best for teams looking for a flexible code/no-code option who want to get set up quickly.
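Rivery's credit model combines three kinds of usage, so a quick calculation makes it easier to reason about. The sketch below uses the rates quoted above ($0.75 or $1.20 per credit; one credit per API pipeline execution, per 100 MB replicated, or per logic and transformation execution). The function name is hypothetical, and billing partial 100 MB blocks proportionally is our assumption, not something Rivery's pricing page is known to state.

```python
# Per-credit rates as quoted in this article.
RIVERY_RATE_USD = {"starter": 0.75, "professional": 1.20}
MB_PER_CREDIT = 100


def estimate_rivery_cost(
    plan: str, api_executions: int, replicated_mb: float, logic_executions: int
) -> float:
    """Rough monthly estimate in USD.

    Assumption (ours): partial 100 MB blocks are billed proportionally.
    """
    credits = api_executions + replicated_mb / MB_PER_CREDIT + logic_executions
    return round(credits * RIVERY_RATE_USD[plan], 2)


# 300 API runs + 50,000 MB replicated + 100 transformations = 900 credits
print(estimate_rivery_cost("starter", 300, 50_000, 100))  # 675.0
print(estimate_rivery_cost("professional", 300, 50_000, 100))  # 1080.0
```

Each of the three inputs can move independently from month to month, which is one reason credit-based bills are difficult to forecast.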
Matillion is a closed-source data workflow platform that serves larger enterprises. While most competitors are either open-source (like Airbyte) or closed-source SaaS (like Fivetran), Matillion offers an on-premise solution that you deploy on your own.
Free: Capped at one million rows/month
Basic: $2.00/credit
Advanced: $2.50/credit
Enterprise: $2.70/credit
110+ data source connectors
Cloud-based and on-premise versions available
GUI-based data transformations in the cloud
Pricing model based on consumption
Integration with Matillion Data Loader and Matillion ETL
Matillion comes with 110+ data connectors. They aren't identical across the two products, and some connectors in ETL aren't available in Data Loader.
At just 110+ connectors, Matillion is focused on the largest, most popular data sources. There's a GUI for basic API integrations, but maintenance is the user's responsibility.
Because Matillion is an all-in-one data loading and transformation platform, it can require less support than building a data stack from several products. Matillion offers standard support for all plans. Enterprise plans have the option to add Mission Critical Support.
Powerful built-in data transformations
On-premise deployments available
Easier data governance with all-in-one tool
GUI-based transformations can take a while to learn
Limited number of prebuilt data connectors---the lowest number amongst Airbyte competitors
Integration with native Matillion tools can make it hard to integrate with other systems
Matillion is best for large businesses looking for an on-premise platform to handle both data ingestion and transformations.
A company with growing data needs an intuitive, effective tool to move and transform that data and convert it into actionable insights. But it can be hard to find an ETL tool that works best with your business.
Airbyte has been one of the fastest-growing platforms in recent years, with big promises for its open-source model. A free data tool is appealing, but the open-source offering is tailored to engineers who want to build and maintain connectors themselves.
Airbyte's cloud option has fewer connectors and a pricing model that can be confusing and hard to predict. There are other options that offer faster connectivity and more transparent pricing.