Blog

Yugabyte Review

Yugabyte is an open-source, distributed, PostgreSQL-compatible database. It is an excellent option when you have outgrown PostgreSQL (or some other PostgreSQL-compatible database) and need exceptionally high throughput or fault tolerance.

Trino and Presto Review

Presto and its popular fork Trino are versatile tools for creating a data lakehouse. Their primary use case is running analytical queries on large amounts of data stored in other systems.

Azure Synapse Review

Azure’s data warehouse used to be called Azure SQL Data Warehouse, but Microsoft rolled that product into a larger collection of products called Azure Synapse Analytics.

SQL Server Review

SQL Server is an excellent relational database from Microsoft. I default to using PostgreSQL because it is also very good… and free! But if money were no object, I’d default to using SQL Server for most use cases.

Spark Review

Spark is a popular and versatile tool for data movement and processing. It even does a pretty good job acting as a data warehouse using a pattern called a data lakehouse.

Spanner Review

Spanner is a distributed, PostgreSQL-compatible database from Google. It might be the most technically impressive database out there. If you have outgrown PostgreSQL (or some other PostgreSQL-compatible database) and are already on GCP, adopting Spanner should be a no-brainer.

Snowflake Review

Snowflake is my personal favorite data warehouse. It was one of the first to offer a scalable service that separates storage from compute. This architecture is outlined in a whitepaper that I recommend every new data engineer read.

Redshift Review

Redshift is a data warehouse from Amazon. It’s a fork of PostgreSQL that writes data in a column-oriented format, making it much faster at analytical queries.

PostgreSQL Review

PostgreSQL is my default choice whenever I need a relational database. In fact, it is one of my favorite pieces of technology, ever.

MySQL Review

MySQL is a battle-tested database that will be forever tied to the most popular blogging tool of all time: WordPress.

MongoDB Review

MongoDB is an open-source document database. In my opinion, it is the best one.

Best NewSQL Databases

An explanation of what NewSQL databases are, when you might need one, and my opinion on which ones are worth your time and money.

Best Data Warehouses

An explanation of what a data warehouse is and opinions on which ones are worth your time and money.

Best Data Lakehouses

An explanation of what a data lakehouse is and my opinion on which ones are worth your time and money.

Azure Database Services Explained

Azure has several managed database services that are subtly different. This article explains how they are different, and when to use each one.

MariaDB Review

MariaDB is a fork of MySQL. In my opinion, MariaDB is a better piece of technology. However, MySQL has a larger user base and better managed service offerings.

CockroachDB Review

CockroachDB is an open-source, distributed, PostgreSQL-compatible database. It is an excellent option when you have outgrown PostgreSQL (or some other PostgreSQL-compatible database) and need exceptionally high throughput or fault tolerance.

Citus Review

Citus is an open-source PostgreSQL extension that lets PostgreSQL shard and distribute data across multiple nodes and store data in a columnar format. Microsoft acquired Citus Data in 2019, but the extension remains open-source.

BigQuery Review

BigQuery is a data warehouse from Google that allows users to store and perform analytics on structured and semi-structured data. In some situations, I prefer it over my overall #1 data warehouse, Snowflake.

Best Transactional Relational Databases

An explanation of the most important database type and my opinion on which ones are worth your time and money.

Amazon Aurora Review

Aurora is a transactional database from Amazon. Its research white paper details the innovative, cloud-native storage engine that distinguishes it from its peers. The thing I find exceptional about it is how durable data is once written to the database’s storage layer: it’s written six times!

AlloyDB Review

AlloyDB is a cloud-native, PostgreSQL-compatible database from Google. I recommend using it if you need a non-distributed relational database and are already a GCP user.

Should You Use Snowflake Or Databricks?

Snowflake and Databricks are two well-known providers of analytical data tools. Because of this, they’re frequently compared to each other. Fortunately, the choice between their product offerings is usually straightforward once you’ve defined your use case. In fact, their products complement each other well and shouldn’t really be considered competitors.

PostgreSQL VS MariaDB Performance Comparison

In this article, I conduct original research into the performance characteristics of MariaDB and PostgreSQL, using HammerDB.

Columnar VS Row Storage In Databases

Columnar storage is optimal for analytical queries, while row storage is optimal for transactional queries. This article explores the difference.