Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

The YouTube video How to Choose The Right Database has great advice on how to choose the right database.

Types of Databases

Advantages of Relational Databases

Advantages of Non-relational Databases

Storage Format

Columnar storage is well suited to analytical operations, which typically scan a few columns across many rows.
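As a rough illustration of why column-oriented layouts favor analytics, the sketch below stores the same made-up records in a row layout and in a column layout, then sums one field. The data and function names are hypothetical; real engines work on disk pages and compressed blocks, not Python lists.

```python
# Row-oriented layout: all fields of a record are stored together.
rows = [
    {"id": 1, "city": "Paris", "amount": 10.0},
    {"id": 2, "city": "Tokyo", "amount": 25.5},
    {"id": 3, "city": "Paris", "amount": 4.5},
]

# Column-oriented layout: all values of a field are stored together.
columns = {
    "id": [1, 2, 3],
    "city": ["Paris", "Tokyo", "Paris"],
    "amount": [10.0, 25.5, 4.5],
}


def total_amount_row_store(rows):
    # Visits every record, even though only one field is needed.
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    # Reads only the one column that the aggregate needs.
    return sum(columns["amount"])


row_total = total_amount_row_store(rows)
col_total = total_amount_column_store(columns)
```

Both functions return the same total; the difference is how much data each one has to touch to get there.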

Comparison of Databases

| Name | Language | Open source/Free | PACELC | Advantages | Disadvantages | Comment |
|---|---|---|---|---|---|---|
| SQLite | SQL | Open source | -/ELC | | | the most popular embedded database |
| DuckDB | SQL | Open source | -/ELC | | | embedded OLAP; a good rising alternative to SQLite |
| PostgreSQL | SQL | Open source | PC/EC | | | a better alternative to MySQL |
| MySQL [1] | SQL | Open source | PC/EC | | | the most popular open-source RDBMS |
| ClickHouse [2] | SQL | Open source | | OLAP for big data | | has very good performance |
| TiDB [3] | SQL | Open source | | OLAP for big data | | good performance, supports integration with Spark |
| Cassandra [1] | CQL (Cassandra Query Language) | Open source | PA/EL | real-time | no join | |
| HBase [1] | | Open source | PC/EC | real-time | no join | |
| Redis [4] | DSL (hashmap API-like) | Open source | | distributed in-memory cache for real-time applications | queries or joins | |
| neo4j [5] | Cypher (Graph Query Language) | Open source | | graph applications | | the most popular graph database |
| Elasticsearch [6] | DSL, SQL | Open source | | out-of-the-box search engine for large documents | | designed as a search engine but also popularly used as a database |
| TDengine [7] | SQL | Open source | | IoT | | IoT, good performance |
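To make "embedded database" concrete, here is a minimal sketch using Python's standard `sqlite3` module: the database runs inside the application process, with no server to install or connect to. The `events` table and its rows are invented for illustration.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM; a file path would persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO events (kind, value) VALUES (?, ?)",
    [("click", 1.0), ("view", 2.5), ("click", 3.0)],
)

# A small aggregate query, executed in-process.
totals = conn.execute(
    "SELECT kind, SUM(value) FROM events GROUP BY kind ORDER BY kind"
).fetchall()
conn.close()
```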

[2] ClickHouse is an open-source column-oriented database management system that allows generating analytical data reports in real time.

yugabyte-db

scylladb

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

MongoDB

MongoDB is a document-oriented, disk-based database optimized for operational simplicity, schema-free design and very large data volumes.

Distributed In-memory Cache

A distributed in-memory cache is essentially a distributed key-value store. You can think of it as a hashmap over the network.
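The "hashmap over the network" idea can be sketched as a client that routes each key to one of several nodes. In this toy version every node is a plain in-process dict standing in for a remote cache server, and the `ShardedCache` class and its methods are hypothetical names, not the API of Redis or memcached.

```python
import hashlib


class ShardedCache:
    """Toy client that routes each key to one of several nodes.

    Each node is a dict standing in for a remote cache server; a real
    client would send the get/set request over the network instead.
    """

    def __init__(self, num_nodes):
        self.nodes = [{} for _ in range(num_nodes)]

    def _node_for(self, key):
        # Use a stable hash so every client maps a key to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def set(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)


cache = ShardedCache(num_nodes=3)
cache.set("user:42", "Alice")
cache.set("user:43", "Bob")
name = cache.get("user:42")
```

Simple modulo routing like this remaps most keys when the number of nodes changes, which is why production systems often use consistent hashing or fixed hash slots instead.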

Redis is the most popular in-memory cache and is implemented in C. memcached is another, less popular in-memory cache, also implemented in C. pelikan is Twitter's unified cache backend, implemented in C and Rust.

References