PostgreSQL is one of the most popular open-source database applications.
But how much does it cost?
There's no easy answer. Whether you decide to self-host or use a fully managed platform, Postgres can be the most cost-effective or expensive option, depending on how you run it.
But there are some estimates on PostgreSQL pricing that can help you decide if Postgres is right for you.
PostgreSQL, commonly called Postgres, is a high-performance open-source database platform. It's been around since 1986, making it one of the oldest data platforms in existence. Postgres is a row-oriented relational database, which makes it well suited to write-heavy workloads.
Postgres has stayed a favorite with DevOps teams for decades because of its power and flexibility. Plus, some unique PostgreSQL features help it stand apart from other open-source database platforms like MySQL, MariaDB, or MongoDB and closed-source competitors like Oracle or Microsoft SQL Server.
Postgres offers several types of indexing, including B-tree, generalized search tree, hash indexes, and generalized inverted indexes.
Postgres supports four procedural languages out-of-the-box: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python.
Multiversion concurrency control (MVCC) lets separate users interact with the database at the same time. Each query works from a consistent snapshot of the data, so reads don't block writes and writes don't block reads.
Postgres includes replication capabilities that help keep backup copies to prevent data loss, and uses load balancing to let several computers serve the same data.
Postgres handles a broader range of data types than many other databases, including JSON for NoSQL-style workloads.
There are a lot of reasons to choose Postgres. And depending on your budget, it can be even more attractive.
If you're going to host Postgres yourself, pricing starts at $0 and can go up to nearly any amount you're willing to spend.
Postgres is an open-source database engine that is completely free to use, even for commercial purposes, under the PostgreSQL license.
There is no startup cost, no ongoing payment as with a SaaS product, and no cost to upgrade software with patches or new releases.
Chances are, hardware is the biggest segment of your total cost for running Postgres.
This can be essentially free if you use hardware you already own. Postgres works on nearly all operating systems, including Linux, Microsoft Windows, and macOS.
But for most use cases, you'll want to deploy Postgres on new hardware. There are a few options.
The easiest, cheapest, and least scalable option is deploying Postgres on a computer you already use.
It's the lowest cost and essentially free, making it a good option if you're testing out Postgres for the first time. But it's unsuitable for any large-scale data operations.
Running Postgres on a local server is the most typical setup for on-premise deployments.
Costs for a server range from buying all new hardware for a dedicated server setup to adding Postgres alongside existing applications on a server you already have.
Using a separate server can manage larger data loads but will struggle if there are too many other applications that need to share resources.
In this setup, you'll rent a server hosted at a data center.
In many ways this setup is similar to a local dedicated server, except that instead of being on-premise, it's at another facility that manages most of the maintenance and upkeep.
The last option is similar to renting a server at a data center, but instead of a dedicated server you get an allocation of space. This is usually more affordable, though capacity can be limited.
In addition to the hardware, you'll also need personnel to handle maintenance, security, updates, and more.
A single administrator could manage very simple deployments without sensitive data. But larger operations, or ones with strict data requirements like those in financial services, require a team of specialists to ensure the hardware and software are working at peak performance.
The increasing complexity for bigger and bigger local deployments is what makes managed platforms so attractive.
Instead of deploying Postgres yourself, you can use a managed platform that handles all the overhead for you. The most popular belong to the major cloud providers: Google Cloud, Amazon Web Services, and Microsoft Azure. Other common options include DigitalOcean, Heroku, and IBM.
The main advantages of a fully managed database platform are lower overhead cost, scalability, and reliability.
The cloud platform automates overhead like provisioning and storage capacity management, plus security and maintenance of the hardware, operating system, and database management.
You don't need to install security patches, worry about data integrity or compliance, or fix a server crash. If you need to scale performance or memory up or down you can do it in a few clicks, instead of manually upgrading or downgrading hardware.
Plus, cloud infrastructure isn't limited to your physical location. Instead of being constrained by an on-premise server, you can access your data from anywhere in the world with a web application or API.
Here's the estimated Postgres pricing for two major cloud platforms: Google Cloud Platform and Amazon Web Services.
Google Cloud SQL for PostgreSQL is a popular cloud platform option. It integrates easily with other applications in the Google Cloud Platform (GCP) like BigQuery, Google Kubernetes Engine, and Google's world-class machine learning engines.
New Google Cloud SQL customers get $300 in credit.
All Google Cloud pricing depends on the region. Google's US regions include:
Los Angeles (us-west2)
Salt Lake City (us-west3)
Las Vegas (us-west4)
South Carolina (us-east1)
Northern Virginia (us-east4)
CPU and memory for a dedicated core are billed hourly or monthly. You can choose up to 96 virtual CPUs (vCPUs) and 624 gigabytes of memory.
Expect to pay $0.04-$0.05/hour or $30-$36/month for each vCPU, and $0.007-$0.009/hour or $5-$6/month for each gigabyte of memory.
You can also choose High Availability (HA) options, which provide additional data redundancy in case of an outage. High Availability pricing is approximately double the standard vCPU and memory prices.
And for steeper discounts, which can reach 50% or more, you can select one- or three-year commitments.
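To see how these rates combine, here is a minimal sketch of a monthly compute estimate. It uses the approximate hourly ranges quoted above and assumes an average month of 730 hours; the function name, its parameters, and the 4 vCPU / 16 GB instance size are illustrative, not part of any Google Cloud API.

```python
HOURS_PER_MONTH = 730  # assumption: average month length (8,760 hours / 12)


def monthly_compute_cost(vcpus, memory_gb, vcpu_hourly, memory_gb_hourly,
                         high_availability=False, commitment_discount=0.0):
    """Estimate monthly dedicated-core compute cost in USD from hourly rates."""
    hourly = vcpus * vcpu_hourly + memory_gb * memory_gb_hourly
    if high_availability:
        hourly *= 2  # HA is approximately double the standard price
    # commitment_discount is a fraction, e.g. 0.5 for a 50% discount
    return hourly * HOURS_PER_MONTH * (1 - commitment_discount)


# Hypothetical instance: 4 vCPUs and 16 GB of memory, at the low and
# high ends of the quoted price ranges.
low = monthly_compute_cost(4, 16, vcpu_hourly=0.04, memory_gb_hourly=0.007)
high = monthly_compute_cost(4, 16, vcpu_hourly=0.05, memory_gb_hourly=0.009)
print(f"Standard: ${low:.2f} to ${high:.2f} per month")

ha_high = monthly_compute_cost(4, 16, 0.05, 0.009, high_availability=True)
print(f"High Availability (high end): ${ha_high:.2f} per month")
```

With these inputs the standard estimate lands at roughly $199 to $251 per month, and the High Availability option doubles it. Actual rates vary by region, so treat the output as a ballpark.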
Data storage in your PostgreSQL database is charged separately from your compute needs.
As with CPU pricing, storage prices depend on your region and have a High Availability option that's approximately twice as much.
For solid-state drive (SSD) storage, expect to pay between $0.17 and $0.20 per gigabyte per month.
For hard disk drive (HDD) storage, expect to pay between $0.09 and $0.11 per gigabyte per month.
And for backup storage, expect to pay between $0.08 and $0.10 per gigabyte per month.
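As a rough illustration, the storage ranges above can be turned into a monthly estimate like this. The rate table simply copies the approximate figures from the text, and the function name and the 500 GB example are assumptions made for the sketch.

```python
# Approximate per-gigabyte monthly rates in USD, as (low, high) ranges.
STORAGE_RATES = {
    "ssd": (0.17, 0.20),
    "hdd": (0.09, 0.11),
    "backup": (0.08, 0.10),
}


def monthly_storage_cost(gigabytes, kind, high_availability=False):
    """Return a (low, high) monthly cost estimate in USD for one storage type."""
    low_rate, high_rate = STORAGE_RATES[kind]
    multiplier = 2 if high_availability else 1  # HA is roughly twice as much
    return gigabytes * low_rate * multiplier, gigabytes * high_rate * multiplier


# Hypothetical database: 500 GB on SSD plus 500 GB of backups.
data_low, data_high = monthly_storage_cost(500, "ssd")
backup_low, backup_high = monthly_storage_cost(500, "backup")
print(f"SSD data: ${data_low:.2f} to ${data_high:.2f}")
print(f"Backups:  ${backup_low:.2f} to ${backup_high:.2f}")
print(f"Total:    ${data_low + backup_low:.2f} to ${data_high + backup_high:.2f}")
```

For this example the total comes to about $125 to $150 per month, on top of compute.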
Networking charges for data ingress (importing) are free, while egress (export) charges depend on the destination platform.
Instance pricing is the billing method for shared core processing. It depends on the type of core and your region, but ranges from around $0.01-$0.08 per hour, or $8-$60 per month.
One of the most popular options is Amazon Relational Database Service (RDS), part of Amazon Web Services (AWS).
As with most AWS services, you get extremely granular configuration control. There are well over one thousand choices available for your Postgres setup, so we'll just highlight the most important variables here.
AWS has a generous free tier for one year, with 750 hours per month for single-AZ micro instances and 20 GB storage.
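A quick sanity check shows what 750 hours per month means in practice. This sketch assumes the allowance is pooled across all of your eligible instances and that each instance runs around the clock; the function name is made up for illustration.

```python
FREE_TIER_HOURS = 750  # monthly allowance quoted above


def billable_hours(instances, days_in_month=31, free_hours=FREE_TIER_HOURS):
    """Hours left to pay for after the free allowance is used up.

    Assumes every instance runs 24 hours a day and that the free hours
    are shared across all instances.
    """
    hours_used = instances * days_in_month * 24
    return max(0, hours_used - free_hours)


print(billable_hours(1))  # 0: one instance uses 744 hours, under the allowance
print(billable_hours(2))  # 738: two instances use 1,488 hours in total
```

In other words, the allowance is enough to keep one micro instance running all month, but a second always-on instance would be billed for most of its hours.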
Like other cloud platforms, AWS pricing will depend on your region. Here are the US-based AWS regions:
US East (Northern Virginia)
US East (Ohio)
US West (Los Angeles), an AWS Local Zone rather than a full region
US West (Northern California)
US West (Oregon)
And if you need compliance with government-level security, you can choose between AWS GovCloud East and AWS GovCloud West.
Your compute instances can be billed two ways: On-Demand or Reserved Instance.
On-Demand charges based on time with no upfront commitment. It's billed by the second with a 10-minute minimum charge. You'll select your Availability Zone (AZ) and DB Instance type.
A Single-AZ deployment runs in one Availability Zone. Multi-AZ deployments maintain instances in two Availability Zones (one standby) or three Availability Zones (two readable standbys) within the same region, and charge roughly 2x and 3x the Single-AZ rates, respectively.
You can also choose from several dozen Instance sizes, each with varying price points. The lowest-priced DB Instance is the db.t4g.micro at around $0.02/hour, while the most expensive is the db.m6i.32xlarge at around $11.40/hour.
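Here is a minimal sketch of how On-Demand billing works out, based on the per-second billing, 10-minute minimum, and approximate Multi-AZ multipliers described above. The function name and the deployment labels are illustrative assumptions, and the rates are the rough figures from the text, not live AWS prices.

```python
MIN_BILLED_SECONDS = 600  # 10-minute minimum charge

# Approximate price multipliers relative to a Single-AZ deployment.
DEPLOYMENT_MULTIPLIER = {
    "single-az": 1,
    "multi-az-one-standby": 2,
    "multi-az-two-standbys": 3,
}


def on_demand_charge(hourly_rate, seconds_run, deployment="single-az"):
    """Estimate the On-Demand charge in USD for one run of a DB instance."""
    billed_seconds = max(seconds_run, MIN_BILLED_SECONDS)
    multiplier = DEPLOYMENT_MULTIPLIER[deployment]
    return hourly_rate * multiplier * billed_seconds / 3600


# A db.t4g.micro (about $0.02/hour) that runs for only 3 minutes is
# still billed for the 10-minute minimum.
print(f"${on_demand_charge(0.02, 180):.4f}")

# A db.m6i.32xlarge (about $11.40/hour) running for a 730-hour month.
print(f"${on_demand_charge(11.40, 730 * 3600):,.2f}")

# The same micro instance for one hour with one Multi-AZ standby.
print(f"${on_demand_charge(0.02, 3600, 'multi-az-one-standby'):.2f}")
```

The gap between the smallest and largest instance is several hundredfold, which is why instance size is usually the first variable to pin down when estimating an RDS bill.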
You can also choose memory-optimized Instances, which are better suited to memory-intensive workloads.
Reserved Instance offers much better pricing with one-year or three-year terms. You can pay with nothing upfront, partial upfront, or all upfront for additional savings.
AWS storage charges also vary by region and come in Single-AZ and Multi-AZ options, with Multi-AZ costing approximately double the Single-AZ price.
There are several options for data storage, the most basic being General Purpose SSD (gp2) which ranges in price from $0.12-0.14 per gigabyte per month.
Data ingress is free, but egress has various prices depending on the destination.
No matter how you've set up your Postgres deployment, there are a few tips to optimize your processing needs (which correlate with cost, if you're using a managed platform).
Use COPY instead of INSERT. The COPY command is optimized for larger datasets and requires less overhead than INSERT.
Run ANALYZE after updating. Once you've added or edited a lot of information in a table, run ANALYZE to give the planner better statistics. This ensures more efficient queries later on.
Disable Autocommit. When performing multiple INSERTs, you'll save processing power by running one COMMIT at the end, rather than using the autocommit feature.
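The three tips above can be combined into a single bulk-load routine. This is a minimal sketch, assuming `conn` is a psycopg2 connection (its `autocommit` attribute and the cursor's `copy_expert` method come from that driver); the `bulk_load` function and the table and column names in the usage example are hypothetical.

```python
import csv
import io


def bulk_load(conn, table, columns, rows):
    """Load rows with COPY, refresh planner statistics, and commit once.

    Assumes `conn` is a psycopg2 connection. `table` and `columns` are
    interpolated directly into SQL, so they must come from trusted code,
    never from user input.
    """
    # Serialize the rows to an in-memory CSV that COPY can read.
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    buffer.seek(0)

    column_list = ", ".join(columns)
    copy_sql = f"COPY {table} ({column_list}) FROM STDIN WITH (FORMAT csv)"

    conn.autocommit = False  # keep the whole load in one transaction
    with conn.cursor() as cur:
        cur.copy_expert(copy_sql, buffer)  # one COPY instead of many INSERTs
        cur.execute(f"ANALYZE {table}")    # give the planner fresh statistics
    conn.commit()                          # a single COMMIT at the end
```

A call might look like `bulk_load(conn, "events", ["id", "name"], [(1, "signup"), (2, "login")])`. If you have to stay with INSERT statements, the same idea of turning autocommit off and committing once at the end still applies.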
Postgres is a popular open-source database application, and has been for over 30 years. It provides enterprise-level features for a deployment cost that's as close to free as possible.
But depending on how you implement PostgreSQL in your organization, there will be additional costs beyond the open-source software. Understanding these costs for self-managed and cloud-based deployments can help you decide if Postgres is the right fit.
Even if PostgreSQL is the best data warehouse option for your needs, you'll still need a tool to extract, transform, and load data from your apps into your database.
Portable is an ETL tool that can help you aggregate this data without spending time managing pipelines. Your data team can focus on analysis instead of managing ETL processes.
Looking for the best PostgreSQL ETL tool? Get started with Portable.