Portable vs. AWS Glue: 2023 Guide for Data Integration

Ethan
CEO, Portable

Portable or AWS Glue: which data integration tool do you need?

Cloud agnostic data sources

  • Data teams need to be able to access data from all systems, not just AWS specific platforms.

  • Glue is built specifically to connect to managed AWS products as sources.

  • The platform can use JDBC to connect to cloud-agnostic databases, but its file stores and other sources are AWS-specific.

  • This ties data teams to a particular ecosystem, instead of offering the flexibility and extensibility of the open data ecosystem.

  • Portable is cloud agnostic, supports data sources across cloud providers, and offers a rapidly expanding catalog of API connectors that sync data from third-party software as a service (SaaS) tools to data warehouses.

Warehouse flexibility

  • Data warehouse technology is rapidly evolving. Scalability is increasing, latency is decreasing, and solutions are becoming more economical.

  • Data teams should not be limited to certain warehouses and data lake solutions because of the cloud provider they use. They should be confident that they can migrate technologies efficiently over time.

  • Portable is warehouse agnostic, and currently supports deliveries to Redshift, BigQuery, Snowflake and Postgres data warehouses.

  • Data teams can be confident they can migrate their tech stack to a new warehouse with Portable. It is as simple as setting up a parallel delivery to the new warehouse, finalizing the migration, and sunsetting legacy data flows.
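
The migration pattern described above (parallel delivery, finalize, sunset) can be sketched in a few lines. This is only an illustration, not Portable's API: the `Pipeline` class, its methods, and the in-memory lists standing in for warehouses are all hypothetical names introduced here, and the backfill step is an assumption about what "finalizing the migration" would involve.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Destination = Callable[[List[Record]], None]


class Pipeline:
    """Hypothetical fan-out pipeline: every sync is delivered to all destinations."""

    def __init__(self) -> None:
        self.destinations: Dict[str, Destination] = {}

    def add_destination(self, name: str, deliver: Destination) -> None:
        self.destinations[name] = deliver

    def remove_destination(self, name: str) -> None:
        del self.destinations[name]

    def sync(self, records: List[Record]) -> None:
        for deliver in self.destinations.values():
            deliver(list(records))  # copy so destinations never share a list


def run_migration_demo() -> Tuple[List[Record], List[Record]]:
    legacy: List[Record] = []  # stand-in for the existing warehouse
    new: List[Record] = []     # stand-in for the warehouse being adopted

    pipeline = Pipeline()
    pipeline.add_destination("legacy", legacy.extend)
    pipeline.sync([{"id": 1}])

    # Step 1: set up a parallel delivery to the new warehouse.
    pipeline.add_destination("new", new.extend)
    # Step 2 (assumed): backfill history so the new warehouse is complete.
    new.extend(dict(row) for row in legacy)
    pipeline.sync([{"id": 2}])

    # Step 3: sunset the legacy data flow once the new warehouse is verified.
    pipeline.remove_destination("legacy")
    pipeline.sync([{"id": 3}])
    return legacy, new


if __name__ == "__main__":
    legacy_rows, new_rows = run_migration_demo()
    print("legacy:", legacy_rows)
    print("new:", new_rows)
```

While both deliveries run side by side, the two warehouses can be compared row for row before the legacy flow is switched off.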

API connectors

  • Portable specializes in extracting data from API-based SaaS tools. Glue focuses on cataloguing semi-structured and structured data from event sources, file stores, and databases, not SaaS tools.

  • When analyzing data, generating insights, and automating workflows, data teams should have a holistic view of a company's operations, not just a subset.

  • Data from event sources, file stores, and databases is critical, but it should be available alongside data from SaaS solutions like customer relationship management (CRM) tools, enterprise resource planning (ERP) tools, and email service providers (ESPs).

Simple setup

  • It should be simple to start the flow of data from business applications into a data warehouse.

  • While Glue can be set up directly inside the AWS console, getting to the point of analyzing data is complicated.

  • Users must decide whether to catalog data, whether to use a drag-and-drop graphical interface, and how best to monitor their workflows. Portable, on the other hand, has invested heavily in usability and offers no-code connectors.

  • To connect a system to a warehouse, users simply authenticate with the source, configure the warehouse, decide where the data should show up, and leave it to Portable to create schemas, monitor workflows, and send alerts if authentication credentials expire.

Decoupled transformation

  • The data ecosystem is shifting from the extract, transform, load paradigm (ETL) to the extract, load, transform (ELT) paradigm.

  • The main rationale is that loading all available raw data into a warehouse before transformation allows faster syncing, and it allows deep specialization: some solutions focus on data replication (E + L) and separate solutions focus on transformation (T).

  • Portable is a pure-play data replication tool that is doubling down on this new architecture, because transformation should not be coupled with the replication pipeline.
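
To make the ELT split concrete, here is a minimal sketch that loads raw records untouched and only then transforms them with SQL inside the "warehouse". It uses Python's built-in sqlite3 module as a stand-in for a real warehouse. The table names, the sample records, and the `extract`, `load_raw`, and `transform` functions are assumptions made for illustration and are not taken from Portable or Glue.

```python
import sqlite3
from typing import Dict, List, Tuple

Record = Dict[str, object]


def extract() -> List[Record]:
    # Hypothetical stand-in for records pulled from a SaaS API.
    return [
        {"id": 1, "email": "A@Example.com", "amount_cents": 1250},
        {"id": 2, "email": "b@example.com", "amount_cents": 300},
    ]


def load_raw(conn: sqlite3.Connection, records: List[Record]) -> None:
    # E + L: land the data exactly as received, with no cleanup.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(id INTEGER, email TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO raw_orders (id, email, amount_cents) "
        "VALUES (:id, :email, :amount_cents)",
        records,
    )
    conn.commit()


def transform(conn: sqlite3.Connection) -> List[Tuple]:
    # T: a separate step, run later in SQL against the raw table.
    conn.execute("DROP TABLE IF EXISTS orders")
    conn.execute(
        "CREATE TABLE orders AS "
        "SELECT id, lower(email) AS email, amount_cents / 100.0 AS amount "
        "FROM raw_orders"
    )
    conn.commit()
    return conn.execute(
        "SELECT id, email, amount FROM orders ORDER BY id"
    ).fetchall()


def run_elt_demo() -> List[Tuple]:
    conn = sqlite3.connect(":memory:")
    try:
        load_raw(conn, extract())
        return transform(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_elt_demo():
        print(row)
```

Because the raw table is kept, the transformation can be changed and rerun without extracting the data from the source again.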