There is an explosion of new companies, new products, and new features that will drive innovation and success. But with more tools comes more complexity.
This complexity has led to the rapid emergence of the Modern Data Stack and demand for data integration tools that centralize data from disparate applications and create sanity from the chaos of applications sprawled across the organization.
But where do you get started when evaluating ELT tools? Which tools should you evaluate? How should you evaluate them? And what is the fastest way to get up to speed on your options?
We'll provide an overview of ELT, explain how you can evaluate solutions, and cover the top 5 ELT tools to consider for your modern data stack in 2022.
Let's start from the top.
ELT tools sync raw data from applications into a data warehouse or data lake to power data analytics, process automation, and product development.
At the highest level:
Definition of ELT: Extract, Load, Transform (ELT) tools offer no-code connectors that sync data from systems across the enterprise into a data warehouse or data lake. In ELT workflows, data transformation (typically via SQL) takes place once data is landed in the target system.
Value Proposition of ELT: ELT tools improve strategic decision-making (business intelligence), automate manual tasks, and help product teams to build data products. In some scenarios, ELT tools can be used during one-off data migrations.
Users of ELT Tools: Data engineers and data analysts are in charge of data management and enterprise data infrastructure, including ELT data flows, cloud data warehouses, and the extract, transform, and load process more generally.
Common ELT Data Sources: ELT tools pull data from APIs, databases, warehouses, event sources, webhooks, and unstructured data sources like files. The most common sources for ingestion are product databases, CRM systems, ERP platforms, and HR applications.
What is the Difference Between ETL And ELT: Historically, data sets needed to be transformed and aggregated before they could be replicated into a downstream data store. With the advent of cloud warehouses, ELT tools can extract data and load it directly in its raw form. ELT tools simply replicate data as-is instead of transforming it in flight, and transformation happens afterwards inside the warehouse. Nowadays, most data teams use the ELT architecture instead of the legacy ETL architecture in their Modern Data Stack.
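To make the ordering concrete, here is a minimal sketch of an ELT flow in Python. It is an illustration under stated assumptions, not how any vendor on this list implements its pipelines: the in-memory SQLite database stands in for a cloud warehouse, and the records, table names (`raw_deals`, `won_revenue`), and function names are all hypothetical.

```python
import sqlite3

# Hypothetical raw records, standing in for rows pulled from a CRM API.
# Every value is kept as text, exactly as the source delivered it.
SOURCE_RECORDS = [
    ("1", "Acme", "120.50", "won"),
    ("2", "Globex", "80.00", "lost"),
    ("3", "Acme", "40.25", "won"),
]


def extract():
    """Extract: return the source records untouched."""
    return list(SOURCE_RECORDS)


def load(conn, records):
    """Load: land the records as-is in a raw table in the target system."""
    conn.execute(
        "CREATE TABLE raw_deals (id TEXT, account TEXT, amount TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_deals VALUES (?, ?, ?, ?)", records)


def transform(conn):
    """Transform: model the landed data with SQL inside the target system."""
    conn.execute(
        """
        CREATE TABLE won_revenue AS
        SELECT account, SUM(CAST(amount AS REAL)) AS revenue
        FROM raw_deals
        WHERE status = 'won'
        GROUP BY account
        """
    )


def run_elt():
    """Run the steps in ELT order; return the modeled rows and raw row count."""
    conn = sqlite3.connect(":memory:")  # assumption: stand-in for a warehouse
    load(conn, extract())
    transform(conn)
    rows = conn.execute(
        "SELECT account, revenue FROM won_revenue ORDER BY account"
    ).fetchall()
    raw_count = conn.execute("SELECT COUNT(*) FROM raw_deals").fetchone()[0]
    conn.close()
    return rows, raw_count


if __name__ == "__main__":
    print(run_elt())
```

In an ETL flow, the filtering and aggregation would run before the load step, so only the modeled table would reach the warehouse. Here the raw table survives alongside the modeled one, which means the transformation can be changed and rerun later without re-extracting from the source.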
Let's dig into the top 5 ELT tools on the market today.
The top 5 ELT tools are: Fivetran, Stitch, Airbyte, Matillion, and Portable.
If you are ready to invest in an ELT solution, you need a starting point for evaluation. Below, we've outlined some of the key pros and cons of the top 5 ELT platforms on the market today.
Fivetran is the most established ELT tool on the market today. Founded in 2012, the company was one of the early players as the market shifted from ETL to ELT, and it provides a robust and reliable solution for core ELT connectors.
Fivetran is known to provide reliable cloud-based pipelines for the largest databases and business applications (Oracle, Salesforce, etc.), connecting these data sources to the common data warehouses and data lakes.
In many scenarios, data teams that have access to budget (it's not cheap) will use Fivetran to stand up their modern data stack with core connectors to the largest applications within the enterprise. As needs expand, and long-tail business applications become important, it's common for data teams to augment Fivetran with additional ELT capabilities.
Stitch played a similar role to Fivetran in the shift from ETL to ELT. In 2018, Stitch was acquired by Talend.
This has led to changes in the team and a divergence in the support model between Stitch-supported and community-supported connectors.
From a technical perspective, Stitch pioneered the open-source model for modern ELT with an open-source ETL framework called Singer.
Stitch provided the ability for community members to build and maintain their own connectors with commonly used languages like Python. This community has developed, but in recent years, it has seen less investment than other open-source communities in the space.
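For a sense of what a community-built connector looks like, Singer extractors (called "taps") write JSON messages, one per line, to standard output, using the message types SCHEMA, RECORD, and STATE. The sketch below builds those messages with only the Python standard library. It is a simplified illustration, not a complete tap: the `users` stream, its fields, the `last_id` bookmark, and the function names are hypothetical, and real taps typically also handle configuration and stream discovery.

```python
import json
import sys


def build_messages(rows, stream="users"):
    """Build Singer-style SCHEMA, RECORD, and STATE messages for one stream."""
    # SCHEMA describes the records that follow, as a JSON Schema document.
    schema = {
        "type": "SCHEMA",
        "stream": stream,
        "schema": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
            },
        },
        "key_properties": ["id"],
    }
    # One RECORD message per source row, passed through unchanged.
    records = [{"type": "RECORD", "stream": stream, "record": row} for row in rows]
    # Hypothetical bookmark: remember the highest id seen so far.
    last_id = max((row["id"] for row in rows), default=None)
    state = {"type": "STATE", "value": {stream: {"last_id": last_id}}}
    return [schema, *records, state]


def emit(messages, out=sys.stdout):
    """Write each message as one line of JSON."""
    for message in messages:
        out.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    # Hypothetical rows, standing in for data fetched from a source API.
    emit(build_messages([{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]))
```

A loader (a "target" in Singer terms) reads these lines from standard input and writes the records to the destination, so a tap and a target are typically connected with a shell pipe.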
Stitch is a cost-effective solution for small data teams that don't want to spend much money on an ELT solution, but want a no-code vendor to provide core ELT connectors. As a tradeoff, when things go wrong, data teams end up working with the community to address issues.
Airbyte is a recent addition to the ELT landscape, and the company has raised massive capital very quickly.
From a technical perspective, the Airbyte open-source framework is not too dissimilar from the Singer framework developed by Stitch.
For teams that want to deploy their own infrastructure, build their own connectors, and work with open-source code directly, Airbyte is the best-capitalized solution on the market. The connector catalog is on par with Singer, but support levels and investment are on the upswing while the Singer open-source ecosystem sees less investment.
Airbyte recently released a cloud solution, which is new and competes on the common connectors you'll find from Fivetran, Stitch, and other core ELT solutions.
Founded in 2011, Matillion has been solving data integration problems for large enterprises for over a decade. In addition to native transformations, one of the most distinctive aspects of Matillion is that the entire solution can be deployed on-premises or in a cloud environment (even though the technology is not open source).
The enterprise flexibility, built-in drag-and-drop transformation capabilities, and deployment model can make Matillion less approachable than the other tools on this list, but they also make it a strong fit for large enterprise use cases and data modeling.
Portable is focused on long-tail ETL connectors. As data teams aim to connect more and more source data from applications to their warehouse in near real-time, they need to constantly search for a partner that can provide these bespoke connectors in a user-friendly manner.
Built from the realization that every ELT company was building the same 150 connectors, Portable has focused on building a cloud platform on which new custom ETL connectors can be built on-demand for clients in hours or days.
So, even in scenarios where you are using a data integration platform like Fivetran, Stitch, Airbyte, or Matillion, Portable is the perfect solution to provide a no-code experience to pull data from bespoke business applications quickly. It's extremely simple to get started.
Even though Portable is the most recent addition to the ELT landscape on this list, with more than 300 connectors it has more cloud-hosted, no-code connectors than every other company on this list.