Blog

All of my content is listed on this page.

Performance Benchmarks

PostgreSQL 17 Performance Benchmark

PostgreSQL 17 claims many performance improvements. In this article, I benchmark it against PostgreSQL 16.

PostgreSQL VS MySQL Performance Comparison

PostgreSQL is significantly faster than MySQL in my most recent performance benchmarks using Reserva, a custom database benchmarking tool.

PostgreSQL VS MariaDB Performance Comparison

In this article, I conduct original research into the performance characteristics of MariaDB and PostgreSQL.

MariaDB VS MySQL Performance Comparison

MariaDB is significantly faster than MySQL, according to my custom benchmarking tool. I would prefer MariaDB if I had the choice.

Fastest Open-Source Databases

PostgreSQL and MariaDB are the fastest mainstream open-source databases. MySQL has some catching up to do.

GCP AlloyDB VS AWS Aurora Performance Comparison

GCP AlloyDB is 12% to 13% faster than AWS Aurora at a similar price point. Aurora has some nice durability features, though.

Best Data Systems

Best Databases

An explanation of the most important database types and my opinion on which ones are worth your time and money.

Best NewSQL Databases

An explanation of what NewSQL databases are, when you might need one, and my opinion on which ones are worth your time and money.

Best Data Warehouses

An explanation of what a data warehouse is and my opinion on which ones are worth your time and money.

Best Data Lakehouses

An explanation of what a data lakehouse is and my opinion on which ones are worth your time and money.

Reviews

Yugabyte Review

Yugabyte is an open-source, distributed, PostgreSQL-compatible database. It is an excellent option for when you have outgrown PostgreSQL (or some other PostgreSQL-compatible database) and need exceptionally high throughput or fault tolerance.

Trino and Presto Review

Presto and its popular fork Trino are versatile tools for creating a data lakehouse. Their primary use case is running analytical queries on large amounts of data stored in other systems.

Azure Synapse Review

Azure’s data warehouse used to be called Azure SQL Data Warehouse, but Microsoft rolled that product into a larger collection of products called Azure Synapse Analytics.

SQL Server Review

SQL Server is an excellent relational database from Microsoft. I default to using PostgreSQL because it is also very good… and free! But if money were no object, I’d default to using SQL Server for most use cases.

Spark Review

Spark is a popular and versatile tool for data movement and processing. It even does a pretty good job acting as a data warehouse using a pattern called a data lakehouse.

Spanner Review

Spanner is a distributed, PostgreSQL-compatible database from Google. It might be the most technically impressive database out there. If you have outgrown PostgreSQL (or some other PostgreSQL-compatible database) and are already on GCP, adopting Spanner should be a no-brainer.

Snowflake Review

Snowflake is my personal favorite data warehouse. It was one of the first to offer a scalable service that separates storage from compute. This architecture is outlined in a whitepaper that I recommend every newbie data engineer read.

Redshift Review

Redshift is a data warehouse from Amazon. It's a fork of PostgreSQL that writes data in a column-oriented format, making it much faster at analytical queries.

PostgreSQL Review

PostgreSQL is my default choice whenever I need a relational database. In fact, it is one of my favorite pieces of technology, ever.

MySQL Review

MySQL is a battle-tested database that will be forever tied to the most popular blogging tool of all time - WordPress.

MariaDB Review

MariaDB is a fork of MySQL. In my opinion, MariaDB is a better piece of technology. However, MySQL has a larger user base and better managed service offerings.

CockroachDB Review

CockroachDB is an open-source, distributed, PostgreSQL-compatible database. It is an excellent option for when you have outgrown PostgreSQL (or some other PostgreSQL-compatible database) and need exceptionally high throughput or fault tolerance.

Citus Review

Citus is an open-source PostgreSQL extension that lets PostgreSQL shard and distribute data across multiple nodes, and also store data in a columnar format. Microsoft acquired Citus Data in 2019, but Citus remains open-source.

BigQuery Review

BigQuery is a data warehouse from Google that allows users to store and perform analytics on structured and semi-structured data. In some situations, I prefer it over my overall #1 data warehouse, Snowflake.

Amazon Aurora Review

Aurora is a transactional database from Amazon. Its research white paper details the innovative, cloud-native storage engine that distinguishes it from its peers. The thing I find exceptional about it is how durable data is once written to the database’s storage layer - it's written 6 times!

AlloyDB Review

AlloyDB is a cloud-native, PostgreSQL-compatible database from Google. I recommend using it if you need a non-distributed relational database and are already a GCP user.

Database Comparisons

Should You Use Snowflake Or Databricks?

Snowflake and Databricks are two well-known providers of analytical data tools. Because of this, they’re frequently compared to each other. Fortunately, the choice between their product offerings is usually straightforward once you’ve defined your use case. In fact, their products complement each other well and shouldn’t really be considered competitors.

Explanations

lil-bit And speedy-1 Testing Environment

A description of an Intel mini PC testing environment I frequently use to run Reserva database benchmarks.

Azure Database Services Explained

Azure has several managed database services that are subtly different. This article explains how they are different, and when to use each one.

Should You Use Snowflake Or Databricks?

Snowflake and Databricks are two well-known providers of analytical data tools. Because of this, they’re frequently compared to each other. Fortunately, the choice between their product offerings is usually straightforward once you’ve defined your use case. In fact, their products complement each other well and shouldn’t really be considered competitors.

Columnar VS Row Storage In Databases

Columnar storage is optimal for analytical queries, while row storage is optimal for transactional queries. This article explores the difference.
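
To give a rough feel for the difference, here is a minimal Python sketch. The `rows` data, the `get_order` and `total_amount` helpers, and the in-memory lists are all hypothetical stand-ins for on-disk storage. They are not taken from the article or from any real database engine.

```python
# Hypothetical toy data: three orders as (id, customer, amount).
rows = [
    (1, "alice", 30.0),
    (2, "bob", 12.5),
    (3, "alice", 7.5),
]

def get_order(order_id):
    """Row layout: all fields of one record sit together,
    so a transactional lookup returns the whole order at once."""
    return next(r for r in rows if r[0] == order_id)

# Columnar layout: all values of one column sit together.
columns = {
    "id": [r[0] for r in rows],
    "customer": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}

def total_amount():
    """An analytical aggregate only has to read the 'amount' column,
    not the id or customer values."""
    return sum(columns["amount"])
```

In this sketch, `get_order(2)` reads one complete record, while `total_amount()` scans a single column. That is roughly why each layout suits its workload.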