Data System Reviews

This site categorizes, explains, and reviews the systems that software engineers use to process and store data.

Transactional relational databases

Transactional relational databases store data in structures called tables that look a lot like spreadsheets. They are among the most popular and important data systems in the world.
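To make "transactional" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, its columns, and the transfer helper are all made up for illustration; the point is only that a group of changes to a table either all apply or none do.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one small, spreadsheet-like table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
    )
    conn.executemany(
        "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
        [("alice", 100), ("bob", 50)],
    )
    conn.commit()
    return conn


def balances(conn: sqlite3.Connection) -> dict:
    """Return {owner: balance} for every row in the table."""
    return dict(conn.execute("SELECT owner, balance FROM accounts"))


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Move money between two rows; both updates apply or neither does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE owner = ?", (amount, src)
        )
        (remaining,) = conn.execute(
            "SELECT balance FROM accounts WHERE owner = ?", (src,)
        ).fetchone()
        if remaining < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE owner = ?", (amount, dst)
        )
```

If a transfer fails partway through, the rollback leaves both balances exactly as they were.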

Modern nondistributed databases are incredible pieces of engineering and will support most use cases. However, certain applications may require features or performance that are only available through a distributed database.

Best transactional relational databases

Traditional transactional relational databases can usually handle a few hundred gigabytes to a few terabytes of data.

Best NewSQL databases

Consider using a NewSQL database over a traditional transactional relational database if:

Analytical relational databases

Analytical relational databases, frequently called data warehouses, are similar to transactional databases, but are optimized for analytics.

Best data warehouses

Data warehouses behave a lot like normal databases, but store data in columns instead of rows.
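A toy illustration of the row versus column distinction, in plain Python. The order data and the to_columns helper are invented for this example, and real warehouses add compression and on-disk formats on top of this idea.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "east", "amount": 20},
    {"order_id": 2, "region": "west", "amount": 35},
    {"order_id": 3, "region": "east", "amount": 15},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: each column is stored together.
columns = to_columns(rows)

# An analytical query like "total amount" only has to touch one column.
total_amount = sum(columns["amount"])
```

Summing one column reads a single list instead of walking every field of every row, which is why this layout tends to suit analytics.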

Best data lakehouses

Data lakehouses approximate the functionality of a data warehouse by reading and writing columnar data in an object storage platform or distributed file system.

Time series databases

Time series databases make certain design tradeoffs to efficiently work with time series data.
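One operation these systems are commonly asked to do is downsampling: rolling many timestamped points up into fixed time buckets. The sketch below does it in plain Python; the (timestamp, value) tuple format and the downsample function are assumptions for illustration, not any particular product's API.

```python
from collections import defaultdict


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns {bucket_start_timestamp: mean_value}, ordered by bucket start.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }
```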

NoSQL databases

A wide variety of systems allow you to store, modify, and retrieve non-relational data. Collectively, they are called NoSQL databases.

Document databases

If you look inside a document database, the data looks like a bunch of JSON rows. Document databases can be super scalable in a way that traditional databases struggle to achieve, but make a lot of important tradeoffs to do so.

Key value caches

These databases store key-value information in memory for ultra fast retrieval. They are typically used to create “caches” to avoid sending your database too many queries.
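The usual way to use such a cache is sometimes called "cache-aside": check the cache first, and only query the database on a miss. Here is a minimal sketch in which a plain dict stands in for the in-memory store; the CacheAside class and the load_from_database callback are hypothetical names for this example.

```python
class CacheAside:
    """Check an in-memory cache before falling back to a slower database."""

    def __init__(self, load_from_database):
        self._load = load_from_database  # assumed: a function key -> value
        self._cache = {}                 # stands in for the in-memory store
        self.database_hits = 0           # counts how often we hit the database

    def get(self, key):
        if key in self._cache:
            return self._cache[key]      # cache hit: no database query
        self.database_hits += 1          # cache miss: query the database once
        value = self._load(key)
        self._cache[key] = value
        return value
```

A real cache would also need expiry and invalidation when the underlying data changes, which this sketch leaves out.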

Wide column stores

These kinds of databases excel at storing “sparse” data, which means they can efficiently deal with lots of “null” values.
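A rough way to picture sparse storage: keep only the cells that actually have a value, so a missing column costs nothing. The SparseTable class below is a toy built on nested dicts, not how any specific wide column store is implemented.

```python
class SparseTable:
    """Store only the cells that exist; absent columns take no space."""

    def __init__(self):
        self._rows = {}  # row_key -> {column_name: value}

    def put(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column):
        # A missing cell reads back as None without ever being stored.
        return self._rows.get(row_key, {}).get(column)

    def stored_cells(self):
        return sum(len(columns) for columns in self._rows.values())
```

With three rows that each use a different column, a dense table would hold nine cells, six of them null. This one holds three.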

Data movement

These tools help you move data from one data system to another.

Stream processing

Stream processing allows you to ingest, process, and then act on data in real time, as it arrives.
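The core idea can be sketched with a Python generator that updates its result one event at a time instead of waiting for the whole dataset. The event shape (a dict with "user" and "amount") and the running_totals function are assumptions for illustration.

```python
from typing import Iterable, Iterator


def running_totals(events: Iterable[dict]) -> Iterator[tuple]:
    """Yield an updated per-user total as each event arrives."""
    totals = {}
    for event in events:
        user = event["user"]
        totals[user] = totals.get(user, 0) + event["amount"]
        # Emit a result immediately rather than at the end of a batch.
        yield user, totals[user]
```

Real stream processors add things this sketch ignores, such as windowing, fault tolerance, and running across many machines.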

Extract, load, and transform (ELT)

These tools move a bunch of data at a time, frequently at predetermined intervals. Think “at midnight, every night, copy this database table to a data warehouse.”

Message queues

A key building block of stream processing and many other systems, these tools accept and deliver messages in a reliable, ordered, and predictable manner.
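Python's standard library queue module gives a feel for the accept-and-deliver-in-order behavior inside a single process. The run_demo function and the None "stop" signal are conventions made up for this sketch; production message queues work across machines and persist messages.

```python
import queue
import threading


def run_demo(messages):
    """Send messages through a queue to a consumer thread, preserving order."""
    q = queue.Queue()  # first in, first out
    received = []

    def consumer():
        while True:
            message = q.get()
            if message is None:   # assumed convention: None means "stop"
                q.task_done()
                break
            received.append(message)
            q.task_done()         # acknowledge that the message was handled

    worker = threading.Thread(target=consumer)
    worker.start()
    for message in messages:
        q.put(message)
    q.put(None)
    q.join()       # wait until every message has been acknowledged
    worker.join()
    return received
```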

Dashboards

These tools help you discover and visualize data.

Business intelligence dashboards

These dashboards visualize data, mostly business related data from SQL queries, but not always.

Metrics dashboards

Dashboards optimized for showing how your computer systems are doing.

Special databases

Databases for special use cases.

Vector databases

Vector databases are optimized for reading and writing high dimensional data and are most frequently used to power AI applications by storing embedding data.
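At its simplest, the query a vector database answers is "which stored vectors are most similar to this one?" The sketch below answers it by brute force with cosine similarity. The function names are hypothetical, and real systems generally rely on specialized indexes rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=1):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]
```

Here the vectors have two dimensions to keep things readable; embeddings typically have hundreds or thousands.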

Graph databases

Sometimes we want to map how things relate to each other. Graph databases do just that, and power things like “people you may know” suggestions on social media services.
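A "people you may know" query boils down to walking two hops through a friendship graph. Here is a small sketch using a dict of sets as the graph; the people_you_may_know function and the ranking by mutual friend count are illustrative choices, not how any particular service does it.

```python
def people_you_may_know(graph, person):
    """Suggest friends-of-friends, ranked by number of mutual friends."""
    friends = graph.get(person, set())
    mutual_counts = {}
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the person themselves and anyone they already know.
            if candidate != person and candidate not in friends:
                mutual_counts[candidate] = mutual_counts.get(candidate, 0) + 1
    # Most mutual friends first; ties broken alphabetically.
    return sorted(mutual_counts, key=lambda name: (-mutual_counts[name], name))
```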

Geospatial databases

These databases allow you to store and query geospatial data. Services like Google Maps and real estate websites make heavy use of these kinds of systems to map the real world.
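A typical geospatial query is "what is within N kilometers of this point?" The sketch below answers it with the haversine great-circle formula and a linear scan. The function names and the mean Earth radius of 6371 km are assumptions of this example, and real geospatial databases use spatial indexes instead of checking every place.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_phi = math.radians(lat2 - lat1)
    delta_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(delta_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(places, lat, lon, radius_km):
    """Return names of places no farther than radius_km from (lat, lon)."""
    return [
        name
        for name, (place_lat, place_lon) in places.items()
        if haversine_km(lat, lon, place_lat, place_lon) <= radius_km
    ]
```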

Full-text search systems allow you to upload and quickly search large amounts of textual data.
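Fast text search is commonly built on an inverted index: a map from each word to the documents that contain it. This is a bare-bones sketch with hypothetical build_index and search functions; real search systems also handle ranking, stemming, and typos.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Looking up a word is a single dictionary access, so a search does not have to scan every document.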

Other

Other tools that didn’t fit another category.

Object storage

These cloud-based systems allow you to upload a file to a certain path, then retrieve it later. They are tightly integrated with other cloud services and have become the dominant way to store unstructured data like images, audio files, text, and even static websites (like this one).

Orchestration

A central program that tells other systems what to do, and when.
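One piece of deciding "what to do, when" is working out a valid order for tasks that depend on each other. Python's standard library graphlib module can sketch that part; the task names below are invented, and real orchestrators add scheduling, retries, and monitoring on top.

```python
from graphlib import TopologicalSorter


def run_order(dependencies):
    """Return task names in an order that respects their dependencies.

    dependencies maps each task to the set of tasks that must finish first.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical pipeline: each task waits on the one before it.
pipeline = {
    "load": {"extract"},
    "transform": {"load"},
    "report": {"transform"},
}
```

Calling run_order(pipeline) puts "extract" first and "report" last, because every task appears only after the tasks it depends on.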

Database backups

Programs that help you keep your database backups organized.