Amazon Redshift is one of the world's most widely used data warehouses.
But before you decide to use Redshift, you'll need to learn if it works for your budget.
And unfortunately, Redshift pricing is anything but simple.
Just getting a basic estimate of what your data usage will cost requires answering a half-dozen questions.
So today, we're going to simplify all the fine print and break down just how much, exactly, Redshift charges.
Redshift is a fully managed, petabyte-scale, cloud-based data warehouse that's a part of Amazon Web Services (AWS).
It's a scalable and cost-effective platform built for high-performance columnar data warehousing in SQL. Redshift accepts structured and semi-structured data formats and virtually any amount of data, from a few gigabytes to several petabytes.
For an idea of Redshift's big data capabilities, Nasdaq uploads 70 billion stock trading records to Redshift at market close every day, then completes analysis queries on all 70 billion records before the market opens the next morning.
But scalability isn't its only strength. Redshift also stands apart due to its deep integration with other tools in the AWS suite, including Amazon Simple Storage Service (S3).
Redshift is more expensive than Amazon Relational Database Service (RDS), but it offers a more robust feature set and better scalability.
Compared with other data warehouses, Redshift is not expensive at all. It tends to be one of the most affordable data warehouse options, with prices typically lower than other cloud-based warehouses like Google BigQuery or Snowflake.
But affordable pricing doesn't mean Redshift costs are simple. In fact, Amazon Redshift pricing is notoriously difficult to understand. Let's break it down.
One of the reasons why Amazon Redshift prices seem so complex is that there's just so much to choose from.
Asking how much it costs to use Redshift is like asking how much it costs to eat at a restaurant---it depends on what you order. And Redshift has a very broad menu to choose from.
Amazon Redshift divides its computing resources into units called nodes, which can be grouped together in a cluster. There are separate node types for different use cases, and each cluster can only contain one type of node.
For data sets under one terabyte, use DC2 nodes. DC stands for "dense compute," and these nodes use fast solid-state drives (SSDs) for data storage. DC2 nodes come in different sizes, measured in virtual CPUs (vCPUs).
Users in a US region can expect to pay around $0.25-0.35/hour for the lowest-power, 2-vCPU dc2.large node and around $5-6/hour for the highest-power, 32-vCPU dc2.8xlarge node.
For data sets over one terabyte, use RA3 nodes. These use a unique split that stores data for fast access in SSD storage while keeping the rest in Amazon S3. The RA3 node essentially separates compute and storage architectures, similar to the default setup on Snowflake and BigQuery.
Users in a US region can expect to pay around $1.09-1.20/hour for the lowest-power, 4-vCPU ra3.xlplus node and up to around $13-14.50/hour for the highest-power 48-vCPU ra3.16xlarge node.
Amazon launched RA3 nodes in 2019 to replace dense storage nodes (DS2), which used hard disk drives (HDD). This DS2 type of node is still available, though outdated, and doesn't offer the same features as the new RA3 node.
The RA3 node offers elastic resizing, which lets you scale your cluster size up and down instead of reallocating the workload across several clusters.
The pricing model depends on the size of your Amazon Redshift cluster---the number of nodes and type of nodes you're using---and the region you're in.
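To make that concrete, here is a rough sketch of the arithmetic: hourly node rate times number of nodes times hours running. The rates below are simply the low ends of the ranges quoted above, and the 730-hour month and the function name are assumptions for illustration, not official AWS figures.

```python
# Assumed average number of hours in a month (8,760 hours / 12).
HOURS_PER_MONTH = 730

# Illustrative on-demand hourly rates in USD, taken from the low end of the
# ranges quoted in this article. Real rates vary by region and change over time.
ASSUMED_HOURLY_RATES = {
    "dc2.large": 0.25,
    "dc2.8xlarge": 5.00,
    "ra3.xlplus": 1.09,
    "ra3.16xlarge": 13.00,
}


def monthly_cluster_cost(node_type: str, node_count: int,
                         hours: float = HOURS_PER_MONTH) -> float:
    """Estimate on-demand compute cost for a cluster of identical nodes."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    # A cluster contains only one node type, so one rate applies to every node.
    return ASSUMED_HOURLY_RATES[node_type] * node_count * hours


if __name__ == "__main__":
    # Two ra3.xlplus nodes running all month: roughly $1,591.
    print(f"${monthly_cluster_cost('ra3.xlplus', 2):,.2f}")
```

This covers compute only. Storage, data transfer, and the add-on features covered below are billed on top.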
Redshift Managed Storage is available only with RA3 nodes and is billed monthly at a per-gigabyte rate. Managed Storage does not include backup storage, which is charged separately.
While the exact price depends on your region, most US users can expect to pay around $0.025 per gigabyte per month.
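As a quick back-of-the-envelope sketch, the storage bill is just gigabytes stored times the monthly rate. The $0.025 default is the approximate figure quoted above, and the function name is made up for this example.

```python
def monthly_storage_cost(gigabytes: float,
                         rate_per_gb_month: float = 0.025) -> float:
    """Estimate the Redshift Managed Storage charge for one month.

    The default rate is the approximate US figure quoted in this article;
    check the rate for your own region before relying on it.
    """
    if gigabytes < 0:
        raise ValueError("gigabytes cannot be negative")
    return gigabytes * rate_per_gb_month


if __name__ == "__main__":
    # 2,000 GB of managed storage comes to about $50 per month.
    print(f"${monthly_storage_cost(2000):,.2f}")
```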
Keep in mind that for large datasets that only need infrequent queries, it may be more affordable to store data long-term in S3 and use Redshift Spectrum to scan it when necessary.
What if you want to use Redshift but don't want to worry about managing Redshift clusters and compute nodes yourself?
Enter Redshift Serverless.
This service handles all capacity behind the scenes, scaling up or down as you need. And instead of charging based on the nodes you need, it bills based on Redshift Processing Units (RPUs).
RPUs measure compute capacity, and usage is billed in RPU-hours on a per-second basis with a 60-second minimum. The price depends on your region, but customers in the US can expect to pay from $0.35 to $0.50 per RPU-hour.
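Here is a minimal sketch of how per-second billing with a 60-second minimum plays out. It assumes the quoted hourly price applies to each RPU, and the RPU count, rate, and function name are placeholders rather than values from AWS.

```python
MINIMUM_BILLED_SECONDS = 60  # the 60-second minimum described above


def serverless_cost(runtime_seconds: float, rpus: int,
                    rate_per_rpu_hour: float = 0.36) -> float:
    """Estimate the charge for one burst of Redshift Serverless activity.

    rpus and rate_per_rpu_hour are assumptions for illustration; the rate
    should sit somewhere in the $0.35-0.50 range quoted for US regions.
    """
    if runtime_seconds <= 0:
        return 0.0  # nothing ran, so nothing is billed
    billed_seconds = max(runtime_seconds, MINIMUM_BILLED_SECONDS)
    rpu_hours = rpus * billed_seconds / 3600
    return rpu_hours * rate_per_rpu_hour


if __name__ == "__main__":
    # A 30-second query is billed as if it ran for the full 60 seconds.
    print(f"${serverless_cost(30, rpus=8):.4f}")
    # A full hour at 8 RPUs costs 8 RPU-hours.
    print(f"${serverless_cost(3600, rpus=8):.2f}")
```

The practical takeaway: lots of very short, scattered queries can cost more than you'd expect, since each burst is rounded up to a full minute.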
Redshift Serverless also includes Concurrency Scaling and Redshift Spectrum features for free.
Finally, you'll pay additional fees for transferring data in and out of Amazon Redshift.
The only exception is data you move between AWS services within the same AWS region. So if you need to connect Redshift to a third-party data source, you'll be charged for inputting that data. But if you're moving data that's already stored in Amazon S3 in the same region, moving it into Redshift will be free.
Redshift Concurrency Scaling is a feature that lets you scale your processing power. It essentially adds additional cluster capacity at peak times, ensuring faster query processing without charging you extra when you don't need it.
Concurrency Scaling credits accumulate for free with each day of Redshift cluster usage. You'll earn credit for one Concurrency Scaling hour per 24-hour day, and you can bank up to 30 hours at once. The credits don't expire as long as you don't cancel the cluster.
Scaling beyond the credits you've accumulated comes at an additional cost billed per second at on-demand rates. For customers using Redshift Serverless, Concurrency Scaling happens automatically and is included in the plan.
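The credit mechanics are easier to follow with a small simulation. This sketch assumes one credit hour is granted at the start of each day, credits cap at 30 hours, and anything beyond the banked credits is billed per second at the cluster's on-demand hourly rate. The function name and inputs are hypothetical.

```python
from typing import Iterable

SECONDS_PER_HOUR = 3600


def concurrency_scaling_charge(daily_usage_seconds: Iterable[float],
                               on_demand_hourly_rate: float,
                               max_credit_hours: float = 30.0) -> float:
    """Estimate Concurrency Scaling charges over a run of consecutive days.

    daily_usage_seconds holds the seconds of Concurrency Scaling used each day.
    on_demand_hourly_rate is the assumed on-demand rate for the whole cluster.
    """
    credit_cap = max_credit_hours * SECONDS_PER_HOUR
    credit_seconds = 0.0
    billable_seconds = 0.0
    for used in daily_usage_seconds:
        # Simplification: each day's free hour is granted before usage is counted.
        credit_seconds = min(credit_seconds + SECONDS_PER_HOUR, credit_cap)
        covered = min(used, credit_seconds)
        credit_seconds -= covered
        billable_seconds += used - covered
    return billable_seconds / SECONDS_PER_HOUR * on_demand_hourly_rate


if __name__ == "__main__":
    # Three quiet days bank credits, then a 5-hour burst on day four:
    # 4 hours are covered by credits and 1 hour is billed.
    usage = [0, 0, 0, 5 * SECONDS_PER_HOUR]
    print(f"${concurrency_scaling_charge(usage, on_demand_hourly_rate=2.18):.2f}")
```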
While Redshift has low storage prices compared to other data warehouses, it's still much more expensive than the nearly infinite storage capacity of Amazon S3.
Redshift Spectrum gives you the best of both worlds. You can import data from your data sources, store it in an S3 data lake for a fraction of the price, then process that data on Redshift using Spectrum.
And Redshift Spectrum pricing is simple: around $5.00 per terabyte of data scanned, depending on your region. There's a 10-megabyte minimum per query, and the amount scanned is rounded up to the next megabyte.
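For a feel of what a single query costs, here is a simplified sketch. It applies the $5.00-per-terabyte rate and the 10-megabyte minimum, ignores any rounding of fractional amounts, and assumes binary units (1 TB = 1024^4 bytes). The function name is made up for the example.

```python
BYTES_PER_MB = 1024 ** 2
BYTES_PER_TB = 1024 ** 4
MINIMUM_BILLED_BYTES = 10 * BYTES_PER_MB  # the 10-megabyte minimum per query


def spectrum_query_cost(bytes_scanned: int, rate_per_tb: float = 5.0) -> float:
    """Estimate the Redshift Spectrum charge for one query.

    rate_per_tb defaults to the approximate figure quoted in this article.
    Rounding of fractional amounts is deliberately left out of this sketch.
    """
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned cannot be negative")
    billed_bytes = max(bytes_scanned, MINIMUM_BILLED_BYTES)
    return billed_bytes / BYTES_PER_TB * rate_per_tb


if __name__ == "__main__":
    # Scanning a full terabyte costs the headline rate.
    print(f"${spectrum_query_cost(BYTES_PER_TB):.2f}")
    # A tiny 1 MB scan is still billed as 10 MB.
    print(f"${spectrum_query_cost(BYTES_PER_MB):.6f}")
```

Because you pay per byte scanned, anything that shrinks the scan, such as columnar file formats or partitioned data, tends to lower the bill.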
Spectrum is included for free with Redshift Serverless.
Redshift ML lets you create, train, and apply machine learning models using standard SQL within Redshift warehouses.
Pricing for Redshift ML is based on the total number of cells in your training data, where the cell count is the number of rows multiplied by the number of columns.
$20 per million cells for the first 10 million cells
$15 per million cells for the next 90 million cells
$7 per million cells thereafter
There is also a minimal charge---typically less than $1/month---for S3 storage generated for these created cells.
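The tiers above work like tax brackets: each rate applies only to the cells that fall inside its band. Here is a small sketch of that calculation. It assumes the tiers apply cumulatively to your total cell count, and it leaves out the small S3 storage charge.

```python
# (cells in tier, USD per million cells); None marks the open-ended last tier.
ML_TIERS = [
    (10_000_000, 20.0),   # first 10 million cells
    (90_000_000, 15.0),   # next 90 million cells
    (None, 7.0),          # everything after that
]


def redshift_ml_cost(cells: int) -> float:
    """Estimate the Redshift ML charge for a given number of cells."""
    if cells < 0:
        raise ValueError("cells cannot be negative")
    remaining = cells
    cost = 0.0
    for tier_size, price_per_million in ML_TIERS:
        if remaining <= 0:
            break
        in_tier = remaining if tier_size is None else min(remaining, tier_size)
        cost += in_tier / 1_000_000 * price_per_million
        remaining -= in_tier
    return cost


if __name__ == "__main__":
    # 150 million cells: $200 + $1,350 + $350 = $1,900.
    print(f"${redshift_ml_cost(150_000_000):,.2f}")
```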
Redshift offers free, automatic snapshots of your data that are stored for up to 35 days. Manual snapshots, taken using the console, command-line interface, or Redshift API, incur separate charges.
Manual snapshots taken from RA3 clusters are stored as S3 data at standard rates. For DC and DS clusters, backups are stored on the cluster itself and not charged unless they exceed the provisioned cluster size, in which case they're billed at the standard S3 storage rate.
Now that we've seen the prices Amazon Redshift charges for computing power, storage, and additional features, let's look at the plans it offers. You'll need to select a few different options to see your final cost.
You'll need to select the physical region where Amazon Redshift will be hosted. Current-generation RA3 nodes are available in these US regions:
US East (N. Virginia)
US East (Ohio)
US West (N. California)
US West (Oregon)
AWS GovCloud East
AWS GovCloud West
There are slight variations in price based on region, but pricing isn't the most important factor.
Instead, make sure you're choosing a region that meets any regulatory guidelines (for example, GDPR or governmental compliance) and one that's closest to the end users to reduce latency.
Amazon Redshift has two pricing methods and three different payment methods for committed pricing.
On-Demand pricing doesn't require any commitment, making it extremely flexible. You only pay for what you use, though the On-Demand rate is always the highest.
Reserved Instance pricing gives steep discounts for enterprises that can commit to a certain amount of processing power in advance.
If you choose Reserved Instance, you have three options regarding upfront costs:
No upfront, pay monthly has the lowest discount but lets you pay over the course of a year.
Partial upfront offers better discounts and splits your total payment between an upfront payment and monthly installments for up to three years.
All upfront offers the biggest discounts, which can reach 60% or more for three-year commitments paid in advance.
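To see whether a commitment is worth it, you can compare the two totals side by side. This sketch assumes a flat percentage discount off the on-demand rate and that a reserved node is paid for around the clock whether you use it or not. Only the roughly 60% three-year figure comes from the text above; any other discount you plug in is your own assumption.

```python
HOURS_PER_YEAR = 8760


def reserved_vs_on_demand(on_demand_hourly: float, discount: float,
                          years: int) -> tuple[float, float, float]:
    """Return (on-demand total, reserved total, savings) for one node.

    discount is a fraction such as 0.60 for 60% off the on-demand rate.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be a fraction between 0 and 1")
    on_demand_total = on_demand_hourly * HOURS_PER_YEAR * years
    reserved_total = on_demand_total * (1 - discount)
    return on_demand_total, reserved_total, on_demand_total - reserved_total


def break_even_utilization(discount: float) -> float:
    """Share of hours a node must run for the reservation to pay off.

    On-demand bills only for hours used, while a reservation covers every
    hour, so the two cost the same when utilization equals 1 - discount.
    """
    return 1 - discount


if __name__ == "__main__":
    on_demand, reserved, saved = reserved_vs_on_demand(1.09, 0.60, years=3)
    print(f"on-demand ${on_demand:,.2f}, reserved ${reserved:,.2f}, saved ${saved:,.2f}")
    print(f"break-even utilization: {break_even_utilization(0.60):.0%}")
```

In other words, at a 60% discount a node only needs to be running about 40% of the time before the reservation comes out ahead.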
Amazon Redshift isn't free to use forever, but if you're new to the platform you can take advantage of its generous free trial.
You'll get two months for free with 750 hours per month.
The free trial ends after two months or once you exceed the 750-hour cap, whichever comes first.
While Redshift is known for its affordable pricing, there are still ways to reduce your total cost.
You can set monthly, weekly, and daily caps on Concurrency Scaling. This can help keep costs down and prevent you from running past the free credit hours you've accumulated.
Redshift includes a feature to monitor usage for each query. You may find that some queries consume more processing power than expected, and you can then modify them or run them less often.
If you're doing infrequent scans on large amounts of data, one of the best ways to save money is to move that data to an S3 bucket and use Spectrum for the few scans you need.
Redshift is the market leader in data warehousing for good reason. It offers a leading product with a huge feature set, robust integration with AWS, and competitive pricing.
But it can be difficult to know just what you'll pay until you see your first invoice. Understanding AWS Redshift pricing is the first step to choosing the right warehouse tool for your organization.
Your data warehouse is only one part of the equation. You'll also need a way to extract data from your current apps, transform it, and load it into Redshift. Portable helps you manage this ETL process so your team can focus on insights, not overhead.
Looking for the best Redshift ETL tool? Get started with Portable.