FDA Data Integration with BigQuery

With Portable, integrate FDA data with your BigQuery warehouse in minutes. Access your U.S. Food and Drug Administration (FDA) data from BigQuery without having to manage cumbersome ETL scripts.

The Two Paths to Connect FDA to Google BigQuery

There are two ways to sync data from FDA into your data warehouse for analytics.

Method 1: Manually Developing a Custom Data Pipeline Yourself

Write code from scratch or use an open-source framework to build an integration between FDA and BigQuery.

Method 2: Automating the ETL Process with a No-Code Solution

Leverage a pre-built connector from a cloud-hosted solution like Portable.

How to Create Value with FDA Data

Teams connect FDA to their data warehouse to build dashboards and generate value for their business. Let’s dig into the capabilities FDA exposes via its API, outline insights you can build with the data, and summarize the most common analytics environments that teams use to process their FDA data.

Extract: What Data Can You Extract from the FDA API?

FDA is the U.S. Food and Drug Administration. Its openFDA service is used for pulling data published by the agency.

To help clients power downstream analytics, FDA offers an application programming interface (API) for clients to extract data on business entities. Here are a few example entities you can extract from the API:

  • Animal & Veterinary
  • Drug
  • Device
  • Food
  • Tobacco
  • Other

You can visit the FDA API Documentation to explore the entire catalog of available API resources and the complete schema definition for each.

As you think about the data you will need for analytics, don’t forget that Portable offers no-code integrations to other similar applications.

Regardless of the SaaS solution you use, it’s important to find a data source with robust data available for analytics.

Load: Which Destinations Are Best for Your FDA ETL Pipeline?

To turn raw data from FDA into dashboards, most companies centralize information into a data warehouse or data lake. For Portable clients, the most common ETL pipelines are:

  1. FDA to Snowflake Integration
  2. FDA to Google BigQuery Integration
  3. FDA to Amazon Redshift Integration
  4. FDA to PostgreSQL Integration

Once you have a destination to load the data, it’s common to combine FDA data with information from other enterprise applications like Jira, Mailchimp, HubSpot, Zendesk, and Klaviyo.

From there, you can build cross-functional dashboards in a visualization tool like Power BI, Tableau, Looker, or Retool.

Develop: Which Dashboards Should You Build with FDA Data?

Now that you have identified the data you want to extract, the next step is to plan out the dashboards you can build with the data.

As a process, you want to consume raw data, overlay SQL logic, and build a dashboard to either 1) increase revenue or 2) decrease costs.

Replicating FDA data into your cloud data warehouse can unlock a wide array of opportunities to power analytics, automate workflows, and develop products. The use cases are endless.

Now that we have a clear sense of the insights we can create, let’s compare the process of developing a custom FDA integration with the benefits of using a no-code ETL solution like Portable.

Method 1: Building a Custom FDA ETL Pipeline

To build your own FDA integration, there are three steps:

  1. Navigate the FDA API documentation
  2. Make your first API request
  3. Turn an API request into a complete data pipeline

Let’s walk through the process in more detail.

How to Interpret FDA’s API Documentation

When reading API documentation, there are a handful of key concepts to consider.


There are many common authentication mechanisms: OAuth 2.0 (Auth Code and Client Credentials), API keys, JWTs, personal access tokens, basic authentication, and others. For FDA, it’s important to identify the authentication mechanism and how best to incorporate the necessary credentials into your API requests.

openFDA authenticates with an API key, and the key is free of charge. Requests made without a key are still served, but they fall under a much lower daily limit than requests made with one. Your use of the API may be subject to certain limitations on access, calls, or use. These limitations are designed to manage load on the system, promote equitable access, and prevent abuse.
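As a minimal sketch of how a key might be attached to a request, the snippet below assumes the key travels as an api_key query parameter and is read from an environment variable. The variable name OPENFDA_API_KEY and the helper build_url are illustrative choices, not part of the openFDA API.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://api.fda.gov"  # openFDA host


def build_url(path, params=None, api_key=None):
    """Return a full openFDA URL, appending api_key only when one is supplied."""
    query = dict(params or {})
    if api_key:
        query["api_key"] = api_key
    url = f"{BASE_URL}/{path.lstrip('/')}"
    return f"{url}?{urlencode(query)}" if query else url


# OPENFDA_API_KEY is a hypothetical variable name; keep real keys out of source control.
key = os.environ.get("OPENFDA_API_KEY")
print(build_url("drug/event.json", {"limit": 1}, api_key=key))
```

Reading the key from the environment keeps it out of your repository, and leaving it unset still produces a valid (keyless) URL.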


It’s important to identify the FDA API endpoints you want to use for analytics. Most APIs offer a combination of GET, POST, PUT, and DELETE request methods; however, for analytics, GET requests are typically the most useful. At times, POST requests can be used to extract data as well.

For FDA, the drug endpoint is a great place to get started.

Request Parameters

For each API endpoint you would like to use for analytics, you need to understand the method (GET, POST, PUT, or DELETE) and the URL, but there are other considerations to take into account as well. You should look out for pagination mechanics, query parameters, and parameters that are added to the request path.

openFDA is designed primarily for real-time queries. Using combinations of the skip/limit parameters, you can page through a result set that has up to 26,000 hits. This limit is in place to protect openFDA infrastructure and is sufficient in most cases. If you need to navigate through a result set that exceeds 26,000 search matches, skip/limit paging alone will not reach the remaining records, so check the paging section of the openFDA documentation for the alternatives.
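To make the paging arithmetic concrete, here is a small sketch that plans the skip/limit pairs for a result set. It assumes a page size cap of 1,000 (the largest limit value noted in the parameter list) and treats 26,000 as the deepest reachable hit. The function name page_offsets is hypothetical, and total_hits is whatever total your first response reports.

```python
MAX_LIMIT = 1000     # largest allowed value for the limit parameter
MAX_WINDOW = 26_000  # deepest hit reachable with skip/limit paging


def page_offsets(total_hits, limit=MAX_LIMIT, window=MAX_WINDOW):
    """Yield (skip, limit) pairs covering the reachable part of a result set."""
    reachable = min(total_hits, window)
    for skip in range(0, reachable, limit):
        # The final page may be shorter than a full page.
        yield skip, min(limit, reachable - skip)


for skip, limit in page_offsets(2500):
    print(f"skip={skip}&limit={limit}")
```

A search with more than 26,000 matches still yields only 26 pages here, which is a useful signal that the query needs to be narrowed or paged another way.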

The API supports only five types of query parameters. The basic building block of queries is the search parameter. Use it to filter requests to the API by looking in specific fields for matches. Each endpoint has its own unique fields that can be searched.

search: What to search for, in which fields. If you don’t specify a field to search, the API will search in every field.

sort: Sort the results of the search by the specified field in ascending or descending order by using the :asc or :desc modifier.

count: Count the number of unique values of a certain field, for all the records that matched the search parameter. By default, the API returns the 1000 most frequent values.

limit: Return up to this number of records that match the search parameter. Currently, the largest allowed value for the limit parameter is 1000.

skip: Skip this number of records that match the search parameter, then return the matching records that follow.
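The five parameters above can be combined into a query string along these lines. This is a sketch only: build_query is a hypothetical helper, and the field names in the example (patient.drug.medicinalproduct, receivedate) are illustrative, so confirm them against the field reference for the endpoint you pick.

```python
from urllib.parse import urlencode


def build_query(search=None, sort=None, count=None, limit=None, skip=None):
    """Encode the five openFDA query parameters, dropping any left unset."""
    if limit is not None and not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    params = {"search": search, "sort": sort, "count": count,
              "limit": limit, "skip": skip}
    # urlencode percent-encodes the ":" used in search and sort expressions.
    return urlencode({k: v for k, v in params.items() if v is not None})


print(build_query(search="patient.drug.medicinalproduct:aspirin",
                  sort="receivedate:desc", limit=5))
```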

How Do You Call the FDA API? (Tutorial)

  1. Follow the instructions above to read the FDA API documentation
  2. Identify and collect your credentials for authentication
  3. Pick the API resource you want to pull data from
  4. Configure the necessary parameters, method, and URL to make your first request (e.g. with curl or Postman)
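  5. Add your credentials and make your first API call. Here is an example request using curl against one of the drug endpoints (with a placeholder instead of real credentials):
     curl "https://api.fda.gov/drug/event.json?api_key=YOUR_API_KEY&limit=1"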
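As a rough sketch of that first request in Python (standard library only), the code below targets the drug adverse event endpoint and assumes the response body carries its records under a results key. The helper names are hypothetical, and YOUR_API_KEY in the usage note is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "https://api.fda.gov/drug/event.json"  # one of the drug endpoints


def request_url(params):
    """Build the request URL for the chosen endpoint."""
    return f"{ENDPOINT}?{urlencode(params)}"


def fetch_page(params):
    """GET one page and return the decoded JSON body as a dict (network call)."""
    with urlopen(request_url(params), timeout=30) as response:
        return json.load(response)


def extract_records(body):
    """Return the list of records from a response body, or [] if none."""
    return body.get("results", [])
```

A first call would then look like extract_records(fetch_page({"api_key": "YOUR_API_KEY", "limit": 1})), which mirrors what you would send with curl or Postman.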

How Do You Maintain a Custom FDA to BigQuery ETL Pipeline?

Making a call to the FDA API is just the beginning of maintaining a complete custom ETL pipeline.

Here is a getting-started guide to building a production-grade pipeline for FDA:

  • For each API endpoint, define schemas (which fields exist and the type for each)
  • Process the API response and parse the data (typically parsing JSON or XML)
  • Handle and replicate nested objects and custom fields
  • Identify which FDA fields are primary keys and which keys are required vs. optional
  • Version control your changes in a git-based workflow (using GitHub, GitLab, etc.)
  • Handle code dependencies in your toolchain and the upgrades that come with each
  • Monitor the health of the upstream API and, when things go wrong, troubleshoot via the status page, reach out to support, and open tickets
  • Handle error codes (HTTP error codes like 400s, 500s, etc.)
  • Manage and respect rate limits imposed by the server
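To illustrate the parsing and nested-object items in the list above, here is one possible approach: flatten each record into a single level and write the result as newline-delimited JSON, a format BigQuery load jobs accept. The naming scheme (parent_child column names, lists stored as JSON strings) is a design assumption for this sketch, not something the FDA API or Portable prescribes.

```python
import json


def flatten(record, parent="", sep="_"):
    """Flatten nested dicts into one level; lists are kept as JSON strings."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        elif isinstance(value, list):
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def to_ndjson(records):
    """Serialize records as newline-delimited JSON, one flattened row per line."""
    return "\n".join(json.dumps(flatten(r), sort_keys=True) for r in records)


sample = [{"id": "1", "patient": {"sex": "2", "drug": [{"name": "x"}]}}]
print(to_ndjson(sample))
```

Keeping lists as JSON strings avoids guessing at a repeated-field schema up front. You can always unnest them later in SQL once you know which arrays matter.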

We won’t go into detail on all of the items above, but rate limits are a great example of the complexity found in a production-grade data pipeline.

Here are openFDA's standard limits:

With no API key: 240 requests per minute, per IP address. 1,000 requests per day, per IP address.

With an API key: 240 requests per minute, per key. 120,000 requests per day, per key.
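One simple way to stay under the per-minute limit is to space requests evenly. The sketch below assumes a single-threaded client and uses the 240 requests per minute figure from above. The Throttle class is a hypothetical helper, and the clock and sleep arguments exist only so it can be exercised without real waiting.

```python
import time


class Throttle:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute=240, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle()
print(throttle.min_interval)  # seconds between requests at 240 per minute
```

Call throttle.wait() before each request. Note that per-minute pacing alone is not enough: at a sustained 240 requests per minute, a 120,000-request daily allowance is used up in 500 minutes, so a production pipeline also needs a daily counter.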

If you don’t respect rate limits, and if you can’t handle server responses (like 429 errors with a Retry-After header), your pipeline can break, and analytics can become out-of-date.
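Handling a 429 response might look like the sketch below. It assumes the Retry-After header, when present, holds a number of seconds, and falls back to exponential backoff otherwise (including when the header carries a date instead). The names call_with_retry and retry_delay are hypothetical, and request stands for any zero-argument function that performs the HTTP call with urllib.

```python
import time
from urllib.error import HTTPError


def retry_delay(retry_after, attempt):
    """Seconds to wait: the Retry-After value if numeric, else 2 ** attempt."""
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)
    return float(2 ** attempt)


def call_with_retry(request, max_attempts=5, sleep=time.sleep):
    """Call request(); on HTTP 429, wait and retry up to max_attempts times."""
    for attempt in range(max_attempts):
        try:
            return request()
        except HTTPError as err:
            # Re-raise anything that is not a rate-limit error, or the last failure.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            header = err.headers.get("Retry-After") if err.headers else None
            sleep(retry_delay(header, attempt))
```

Other error codes are deliberately re-raised here so that a 400 (a bad query) fails fast instead of being retried. You may want a separate, bounded retry for 500-level errors.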

What Are the Drawbacks of Building the FDA ETL Pipeline Yourself?

You can probably tell at this point that there is a lot of work that goes into building and maintaining an ETL pipeline from FDA to your data warehouse.

If you want less development work, faster insights, and no ongoing responsibilities, you should consider a cloud-hosted ETL solution.

Let’s walk through the setup process for a no-code ETL solution and its benefits.

Method 2: Using a No-Code FDA ETL Solution

No-code ETL solutions are simple. Vendors specialize in building and maintaining data pipelines on your behalf. Instead of starting from scratch for each integration, companies like Portable create connector templates that can be leveraged by hundreds or thousands of clients.

Step-By-Step Tutorial for Configuring Your FDA ETL Pipeline

Off-the-shelf ETL tools offer a no-code setup process. Here are the instructions to connect FDA to your cloud data warehouse with Portable.

  1. Create an account (no credit card required)
  2. Add a source: search for and select FDA
  3. Authenticate with FDA using the instructions in the Portable console
  4. Select BigQuery and authenticate
  5. Set up a flow connecting FDA to your analytics environment
  6. Run your flow to replicate data from FDA to your warehouse
  7. Use the dropdown to set your data flow to run on a cadence

What Are the Benefits of Using Portable for FDA ETL?

No-Code Simplicity

Start moving FDA data in minutes. Save yourself the headaches of reading API documentation, writing code, and worrying about maintenance. Leave the hassle to us.

Easy to Understand Pricing

With predictable, fixed-cost pricing per data flow, you know exactly how much your FDA integration will cost every month.

Fast Development Speeds

Access lightning-fast connector development. Portable can build new integrations on-demand in hours or days.

Hands-On Support

APIs change. Schemas evolve. FDA will have maintenance issues and errors. With Portable, we will do everything in our power to make your life easier.

Unlimited Data Volumes

You can move as much data from FDA to Google BigQuery as you want without worrying about usage credits or overages. Instead of analyzing your ETL costs, you should be analyzing your data.

Free to Get Started

Sign up and get started for free. You don’t need a credit card to manually trigger a data sync, so you can try all of our connectors before paying a dime.

Stop waiting for your data. Start using Portable today.

Pioneer insights and streamline operations with data from all your business applications.

Get Started