Best Databases

Last updated: October 18 2024

Table of contents

A lot of the world’s digital infrastructure relies on transactional, non-distributed, relational databases.

There are many types of databases, but this type is so common that whenever someone says “database” without specifying which kind, they are probably referring to this kind.

What are they?

A relational database is a piece of software that allows you to create and maintain data in tables. Tables store data in columns and rows and look a lot like a spreadsheet at first glance. They also allow you to enforce complex constraints and logic to ensure data consistency.

There are two main types of relational databases, OLTP and OLAP. OLTP stands for “online transaction processing” and OLAP stands for “online analytical processing”. Transactional relational databases, as you might have guessed from the name, fall into the OLTP camp.

The basic difference between these two kinds of systems is that OLTP databases write their data row-wise and OLAP databases write their data column wise. Databases that write their data in a row-wise fashion can update small amounts of data very quickly, whereas columnar databases are optimized for long running queries on huge amounts of data.

Most databases that support an organization’s operations are OLTP. Every time you buy something in a store, use a subway pass, read a blog post on a dynamic website, or even make a note on your phone, it’s likely that you’re interacting with an OLTP database. These kind of databases employ concepts like primary keys, foreign keys, indexes, constraints and ACID, to ensure that every action taken on the database is recorded in a safe, clean, fast, and durable manner.

Which one should you use?

There are two main categories of database systems:

I prefer open-source standalone systems and cloud systems that offer compatibility with one of those open-source systems. This is maintain some plausible chance of switching providers if a cloud service ever gets too expensive, changes their terms of service in a hostile way, or even goes out of business.

Standalone systems

PostgreSQL

My personal favorite relational transactional database is PostgreSQL and a lot of other people love it too. I would recommend it as your default choice whenever starting a new project, unless you have some specific use-case that PostgreSQL isn’t optimal for.

In fact, so many people love PostgreSQL that there is a (somewhat humorous) movement called PostgreSQL for everything.

If PostgreSQL can’t do something out-of-the-box, there’s a good chance that there is an extension for that use case.

SQL Server

If SQL Server was free, it would be my favorite database. But it’s not. In fact, it’s really expensive.

On of my favorite SQL Server features is its ability to double as a respectable OLAP database by creating columnstore indexes. Moving data from an OLTP database to an OLAP database can be a pain, and can also get quite expensive, so being able to just leave it right there in the same database can be a huge win.

SQL Server also has a fantastic ecosystem of 1st party tools around it like SSMS, SSIS, SQL Server Profiler, SSDT, SSRS, SSAS, and SQL Server Agent. PostgreSQL has a great ecosystem around it as well, but it’s hard to deny that Microsoft has the edge in terms of first party tooling.

MySQL and MariaDB

You will have to consider MySQL or its popular alternative MariaDB if you’re planning to use the popular CMS Wordpress to host a website. Also, I’ve read an article from Uber’s engineering blog claiming that MySQL has performance benefits over PostgreSQL in certain situations. For what it’s worth, in my own personal tests, PostgreSQL outperforms MySQL and MariaDB in standard use cases.

Cloud native systems

Amazon Aurora

Aurora is a cloud-native database from AWS. It is PostgreSQL and MySQL compatible.

For me, the nice thing about Aurora is that it makes it relatively easy to achieve high availability and scale your throughput by spinning up read replicas.

AlloyDB

AlloyDB is a PostgreSQL compatible database from Google that claims to be 4X faster than plain PostgreSQL. I would like to validate that claim, and will do so in the near future.

For what it’s worth, Google Cloud Platform’s database team is one of the best around. I doubt it’s actually 4X faster, but would be willing to bet it is very fast.

Azure SQL Database

I wish Microsoft would pick less boring, more searchable names for their database products. Owell.

Azure SQL Database is a cloud native, managed, SQL Server compatible database from Microsoft Azure. For all intents and purposes, you should consider it to be a very well run SQL Server cluster.

Like I said above, SQL Server would actually be my favorite database, if it was free. Since this is basically the very best version of SQL Server you can get, it might be the best cloud native database out there (for most use cases). You will pay a high price for that privilege, though.

When should you use a distributed database?

Sometimes an organization’s scale reaches a point where a single computer cannot possibly comply with the performance or reliability requirements of a given workload. At this point, many company’s choose to migrate from a traditional relational database to a distributed transactional relational database. “Sharding” simply means that multiple computers cooperate to handle the workload. However, it is not advisable to start with such a system unless you’re absolutely certain you’ll need it in the near future.