Custom ETL: How To Build Winning Data Pipelines (No Code)

CEO, Portable

Analytics teams commonly need data from bespoke applications delivered to their data warehouse. Custom ETL solutions are the ideal answer.

Whether your warehouse is Snowflake, Redshift, BigQuery, or Postgres, we've got you covered in this handy guide.

Building custom ETL pipelines

There are three scenarios where it makes sense to build a custom ETL pipeline.

1. Analytics - You're building a dashboard. To do so, you need data from a bespoke system.

2. Automation - You're automating a manual task. You need data from a long-tail system.

3. Data Products - You're trying to generate revenue. You need to offer a no-code connector to a particular tech platform.

In each case the value is clear, but building the dashboard, automating the workflow, or powering the data product depends on data from a long-tail system.

When do you need custom ETL?

You need custom ETL pipelines in scenarios where the top data integration tools won't build or support connectors. The most common scenarios are:

1. Vertical-specific ETL connectors

2. Business-specific integrations

3. Connectors to nascent APIs

Vertical specific ETL connectors

  • Specific industries are powered by a fragmented ecosystem of long-tail applications. If you work in these verticals, it's likely that you will need custom ETL pipelines to centralize data into your Modern Data Stack.

  • For data teams in eCommerce, real estate, cannabis, finance, and similar verticals, long-tail connectors are critical to unlocking a 360-degree view of your organization.

  • While most ETL companies support CRM systems and databases that are used across verticals, niche industries tend to gravitate towards nuanced technologies purpose-built for the vertical.

Business specific integrations

  • HR teams need insights. Security teams need insights. Marketing teams need insights. In each business unit, teams might have their own analytics function, or they might rely on a centralized data team for analytics to answer business questions.

  • When you want to do a deep dive into the talent acquisition and retention pipeline, you need data from HR-specific applications. These connectors are difficult to find (or might not exist yet) and fit squarely into the long tail, where a custom ETL pipeline is necessary.

Connectors to nascent APIs

  • Even the largest companies build new APIs. In many scenarios, these aren't picked up by ETL vendors in a timely fashion.

  • Vendors might be waiting to see further adoption and maturity before investing time in developing and maintaining the connector.

  • Or they might have the integration in their backlog, but it's going to take a while to build the connector.

  • That's great, but if you need insights now, you need a solution.

Look for an "off the shelf" connector first

  • Before you evaluate custom ETL solutions, double check you can't find an off-the-shelf connector from your current ETL vendor.

  • If they don't have the connector you need, you need to find another solution.

What are your options? Buy or Build?

6 paths forward for Custom ETL

  1. Write code from scratch.
  2. Use an open source framework.
  3. Hire a data consultant.
  4. Use serverless infrastructure.
  5. Try Airflow.
  6. Hire Portable.

Option 1) Write code and manage infrastructure (from scratch)

  • You can always develop custom ETL connectors from scratch. Is it worthwhile? Very rarely.

  • In most scenarios, someone else has already written a framework, created a scaffold, or is willing to take on the development work so you don't have to.
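
To make "from scratch" concrete, here is a minimal sketch of the extract, transform, and load steps you would own. It is an illustration under stated assumptions, not a production connector: SQLite stands in for your warehouse, the source is assumed to return a JSON array of records with an `id` field, and the names `extract`, `transform`, `load`, `run_pipeline`, and `raw_records` are hypothetical.

```python
import json
import sqlite3
from typing import Callable, Dict, List, Tuple


def extract(fetch: Callable[[], str]) -> List[Dict]:
    """Call the source and parse its response (assumed: a JSON array of records)."""
    return json.loads(fetch())


def transform(records: List[Dict]) -> List[Tuple[str, str, str]]:
    """Flatten each record to (id, name, raw JSON payload)."""
    return [
        (str(r["id"]), r.get("name", ""), json.dumps(r, sort_keys=True))
        for r in records
    ]


def load(conn: sqlite3.Connection, rows: List[Tuple[str, str, str]]) -> int:
    """Upsert rows into a hypothetical raw_records table; return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records "
        "(id TEXT PRIMARY KEY, name TEXT, payload TEXT)"
    )
    # INSERT OR REPLACE keeps reruns idempotent: the same id overwrites itself.
    conn.executemany(
        "INSERT OR REPLACE INTO raw_records (id, name, payload) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM raw_records").fetchone()[0]


def run_pipeline(fetch: Callable[[], str], conn: sqlite3.Connection) -> int:
    """Run extract -> transform -> load once."""
    return load(conn, transform(extract(fetch)))
```

Even this toy version leaves out authentication, scheduling, schema changes, and alerting, which is where most of the real effort tends to go.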

Option 2) Use an open source framework

  • Open source frameworks are great at providing structure when you want to build your own ETL connector.

  • They are particularly useful for companies in regulated industries like Healthcare and Financial Services, where you have to write code in-house (vs. using a cloud-hosted solution), but you'd like to have a starting point.

Option 3) Hire a data consultant

  • Data consultants are great. Not only can they help you build custom ETL connectors, but they can also help create data models, develop dashboards, and assist in architecting your data stack.

  • The problem with using a consultant for custom ETL connectors is that they are expensive and ephemeral.

  • This means that if they move on, or you stop paying them, you need to rebuild your custom ETL connector or find someone else to maintain it.

Option 4) Use serverless infrastructure like cloud functions

  • Serverless infrastructure does not solve the entire problem. It only removes the need to manage servers.

  • Serverless technology is simple, but you also need secure authentication, monitoring, alerting, retry logic, pagination logic, and more.

  • This can help if you absolutely need to build in-house, but you don't want to manage infrastructure.
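
As a rough illustration of the retry and pagination logic mentioned above, here is a small sketch you could run inside a cloud function. It assumes a cursor-based API whose pages look like `{"records": [...], "next_cursor": ...}`; that page shape, along with the names `TransientError`, `with_retries`, and `fetch_all`, is hypothetical and not taken from any specific vendor.

```python
import time
from typing import Callable, Dict, Iterator, Optional


class TransientError(Exception):
    """Hypothetical marker for a retryable failure (for example a timeout or rate limit)."""


def with_retries(
    call: Callable[[], Dict],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Run call(), retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to monitoring/alerting
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
    raise ValueError("max_attempts must be at least 1")


def fetch_all(
    fetch_page: Callable[[Optional[str]], Dict], **retry_kwargs
) -> Iterator[Dict]:
    """Yield every record by following next_cursor until the API stops returning one."""
    cursor: Optional[str] = None
    while True:
        page = with_retries(lambda: fetch_page(cursor), **retry_kwargs)
        yield from page["records"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break
```

In practice `fetch_page` would wrap an authenticated HTTP request and translate timeouts or rate-limit responses into `TransientError`. Passing `sleep` as a parameter keeps the backoff easy to test without waiting.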

Option 5) Try Airflow

  • Airflow provides structure to an otherwise unstructured problem.

  • Scheduling, orchestration, and stringing together requests, responses, and downstream actions can be handled with Airflow, but you still need actual integration logic.

  • Like serverless infrastructure, Airflow is only one piece of the puzzle.

Option 6) Hire Portable

  • Portable specializes in building custom ETL connectors for clients.

  • If you need a random, bespoke, long-tail system connected to your warehouse, just reach out.

  • We build API-to-warehouse connectors on demand, and can turn around production-grade, SaaS-hosted ETL / ELT connectors in a matter of hours or days.

  • We handle development, monitoring, maintenance, alerting, troubleshooting, and support. If something goes wrong, we're on call, so you can sleep well.

How do you get started with Custom ETL?

Contact us with the name of the system you need. We ship new ETL connectors lightning fast!

The slides below outline how simple it is to get started with custom ETL.