Ben Chuanlong Du's Blog

It is never too late to learn.

Useful Rust Crates for Database

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

https://github.com/neondatabase/neon Neon is a serverless open-source alternative to AWS Aurora Postgres. It separates storage and compute and substitutes the PostgreSQL storage layer by redistributing data across a cluster of nodes.

sqlx

SQLx is an async, pure Rust SQL crate featuring compile-time checked queries without a DSL. It supports PostgreSQL, MySQL, and SQLite.

gluesql

GlueSQL is an open source SQL database engine written entirely in Rust, with a pure functional execution layer, easily swappable storage, and WebAssembly support.

surrealdb

SurrealDB is an end-to-end cloud native database for web, mobile, serverless, Jamstack, backend, and traditional applications. SurrealDB reduces the development time of modern applications by simplifying your database and API stack, removing the need for most server-side components, and allowing you to build secure, performant apps quicker and cheaper. SurrealDB acts as both a database and a modern, real-time, collaborative API backend layer. SurrealDB can run as a single server or in a highly-available, highly-scalable distributed mode, with support for SQL querying from client devices, GraphQL, ACID transactions, WebSocket connections, structured and unstructured data, graph querying, full-text indexing, geospatial querying, and row-by-row permissions-based access.

https://github.com/quickwit-oss/quickwit

https://github.com/cberner/redb

https://github.com/quickwit-oss/tantivy

rusqlite

rusqlite is an ergonomic wrapper for SQLite.

sled

sled is an embedded key-value database written in Rust.

indradb

indradb is a graph database written in Rust.

oxigraph

Oxigraph is a graph database implementing the SPARQL standard. Its goal is to provide a compliant, safe, and fast graph database based on the RocksDB and Sled key-value stores. It is written in Rust. It also provides a set of utility functions for reading, writing, and processing RDF files. Oxigraph is in heavy development and SPARQL query evaluation has not been optimized yet.
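To make the idea of querying RDF data a bit more concrete, here is a minimal sketch of triple pattern matching, the basic building block that SPARQL queries are composed of. It uses only the Rust standard library and is not Oxigraph's API: the names `Triple`, `Pattern`, and `match_pattern` are hypothetical, and a real store would use indexes rather than a linear scan.

```rust
/// An RDF-style statement: subject, predicate, object.
/// (Hypothetical type for illustration; not part of Oxigraph.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Triple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

impl Triple {
    pub fn new(subject: &str, predicate: &str, object: &str) -> Self {
        Triple {
            subject: subject.to_string(),
            predicate: predicate.to_string(),
            object: object.to_string(),
        }
    }
}

/// A triple pattern. A `None` field plays the role of a SPARQL variable
/// and matches any value in that position.
#[derive(Debug, Clone, Copy, Default)]
pub struct Pattern<'a> {
    pub subject: Option<&'a str>,
    pub predicate: Option<&'a str>,
    pub object: Option<&'a str>,
}

/// Returns every triple that agrees with all fixed positions of the pattern.
/// This is a linear scan over the slice, kept simple on purpose.
pub fn match_pattern<'t>(triples: &'t [Triple], pattern: Pattern<'_>) -> Vec<&'t Triple> {
    triples
        .iter()
        .filter(|t| {
            pattern.subject.map_or(true, |s| t.subject == s)
                && pattern.predicate.map_or(true, |p| t.predicate == p)
                && pattern.object.map_or(true, |o| t.object == o)
        })
        .collect()
}
```

A pattern with `subject: Some("alice")` and `predicate: Some("knows")` corresponds roughly to the SPARQL pattern `:alice :knows ?x`; the default pattern (all `None`) matches every triple.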

skytable

Skytable is a free and open-source NoSQL database that aims to provide flexible data modeling at scale. Simplicity, performance and flexibility are the guiding design principles.

bolt-rs

bolt-rs aims to provide a comprehensive set of libraries that allow for interaction with graph database servers that support the Bolt protocol, namely, Neo4j. This set of libraries allows interacting with servers supporting versions 1 through 4.1 of the protocol, which includes Neo4j 3.1 through 4.2.

neo4rs

Neo4rs is a Neo4j driver for Rust implemented using the Bolt specification. This driver is compatible with Neo4j 4.x versions.

qdrant

Qdrant is a vector similarity search engine and vector database. It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. Its extended filtering support makes it useful for all sorts of neural-network or semantic-based matching, faceted search, and other applications.
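The notion of "points" (vectors with a payload) searched under a filter can be sketched in a few lines of standard-library Rust. This is a brute-force illustration only, not Qdrant's client API or its indexing strategy; `Point`, `cosine_similarity`, and `search` are hypothetical names, and the payload is simplified to string key-value pairs.

```rust
use std::collections::HashMap;

/// A stored point: an id, a vector, and a key-value payload.
/// (Hypothetical type for illustration; not Qdrant's data model.)
#[derive(Debug, Clone)]
pub struct Point {
    pub id: u64,
    pub vector: Vec<f32>,
    pub payload: HashMap<String, String>,
}

/// Cosine similarity of two vectors, assumed to have the same length.
/// Returns 0.0 if either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Brute-force filtered search: keep points whose payload contains the
/// `(key, value)` pair given in `filter` (if any), score the rest against
/// `query`, and return the `top_k` best matches, highest score first.
pub fn search<'a>(
    points: &'a [Point],
    query: &[f32],
    filter: Option<(&str, &str)>,
    top_k: usize,
) -> Vec<(&'a Point, f32)> {
    let mut scored: Vec<(&Point, f32)> = points
        .iter()
        .filter(|p| match filter {
            Some((key, value)) => p.payload.get(key).map(String::as_str) == Some(value),
            None => true,
        })
        .map(|p| (p, cosine_similarity(&p.vector, query)))
        .collect();
    // Sort by score, descending; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}
```

Calling `search(&points, &query, Some(("city", "London")), 10)` would return at most ten points whose payload has `city = London`, ordered by similarity to `query`. A production engine avoids the linear scan with an approximate nearest-neighbor index.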

OLAP Specific Databases

seafowl

Seafowl is an analytical database for modern data-driven Web applications. Its CDN and HTTP cache-friendly query execution API lets you deliver data to your visualizations, dashboards and notebooks by running SQL straight from the user's browser.

Time Series Databases

ceresdb

CeresDB is a high-performance, distributed, cloud native time-series database.

cnosdb

CnosDB is an open source distributed time series database with high performance, high compression ratio, and high usability.

influxdb_iox

InfluxDB IOx (short for Iron Oxide, pronounced InfluxDB "eye-ox") is the future core of InfluxDB, an open source time series database. The name is in homage to Rust, the language this project is written in. It is built using Apache Arrow and DataFusion, among other things.

Storage

datenlord

DatenLord is a next-generation cloud-native distributed storage platform, which aims to meet the performance-critical storage needs of next-generation cloud-native applications, such as microservices, serverless, and AI. On one hand, DatenLord is designed to be a cloud-native storage system, which is itself distributed, fault-tolerant, and capable of graceful upgrades. These cloud-native features make DatenLord easy to use and easy to maintain. On the other hand, DatenLord is designed as an application-oriented storage system, in that it is optimized for many performance-critical scenarios, such as databases, AI and machine learning, and big data. Meanwhile, DatenLord provides a high-performance storage service for containers, which facilitates stateful applications running on top of Kubernetes (K8S). The high performance of DatenLord is achieved by leveraging recent advances in hardware and software, such as NVMe, non-volatile memory, asynchronous programming, and native Linux asynchronous IO support.

Log Storage

openobserve

https://github.com/openobserve/openobserve OpenObserve is a cloud native observability platform built specifically for logs, metrics, traces and analytics, designed to work at petabyte scale. It advertises itself as 10x easier, with 140x lower storage cost, high performance, and petabyte scale: an Elasticsearch/Splunk/Datadog alternative for logs, metrics, and traces.

parseable

Parseable is a lightweight, cloud native log observability engine. It can use either a local drive or S3 (and compatible stores) for backend data storage. Parseable is written in Rust and uses Apache Arrow and Parquet as underlying data structures. Additionally, it uses a simple, index-free mechanism to organize and query data allowing low latency, and high throughput ingestion and query.

zincobserve

ZincObserve is a cloud native observability platform built specifically for logs, metrics, traces and analytics, designed to work at petabyte scale. It is simple and easy to operate, unlike Elasticsearch, which requires understanding and tuning a couple dozen knobs; you can get ZincObserve up and running in under 2 minutes. It is a drop-in replacement for Elasticsearch if you are just ingesting data using APIs and searching using Kibana. (Kibana is neither supported nor required with ZincObserve, which provides its own UI that does not require a separate installation, unlike Kibana.)

minio-rs

The MinIO Rust SDK is a Simple Storage Service (aka S3) client for performing bucket and object operations against any Amazon S3 compatible object storage service.

Cache

https://github.com/06chaynes/http-cache A caching middleware that follows HTTP caching rules, thanks to http-cache-semantics. By default, it uses cacache as the backend cache manager.
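As a rough illustration of what "following HTTP caching rules" involves, the sketch below checks whether a stored response is still fresh based on its `Cache-Control` header. It is a deliberately simplified, standard-library-only example and not the API of http-cache or http-cache-semantics: the function names are hypothetical, and it ignores `Expires`, heuristic freshness, `s-maxage`, and request directives.

```rust
/// Extracts the `max-age=N` value (in seconds) from a Cache-Control
/// header value, if present and well-formed.
pub fn max_age(cache_control: &str) -> Option<u64> {
    cache_control.split(',').map(str::trim).find_map(|directive| {
        let (name, value) = directive.split_once('=')?;
        if name.trim().eq_ignore_ascii_case("max-age") {
            value.trim().trim_matches('"').parse::<u64>().ok()
        } else {
            None
        }
    })
}

/// Returns true if the header value contains the given bare directive
/// (for example `no-store`), compared case-insensitively.
pub fn has_directive(cache_control: &str, name: &str) -> bool {
    cache_control
        .split(',')
        .any(|directive| directive.trim().eq_ignore_ascii_case(name))
}

/// Decides whether a stored response may be reused without revalidation.
/// `age_secs` is how long ago the response was generated.
/// Simplification: a response without `max-age` is treated as stale.
pub fn is_fresh(cache_control: &str, age_secs: u64) -> bool {
    if has_directive(cache_control, "no-store") || has_directive(cache_control, "no-cache") {
        return false;
    }
    match max_age(cache_control) {
        // Fresh while the freshness lifetime exceeds the current age.
        Some(lifetime) => age_secs < lifetime,
        None => false,
    }
}
```

For example, a response carrying `Cache-Control: public, max-age=60` is considered fresh by `is_fresh` at an age of 30 seconds and stale at 60 seconds or more.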

Data Layer

https://github.com/vectordotdev/vector Vector is a high-performance, end-to-end (agent & aggregator) observability data pipeline that puts you in control of your observability data. Collect, transform, and route all your logs, metrics, and traces to any vendors you want today and any other vendors you may want tomorrow. Vector enables dramatic cost reduction, novel data enrichment, and data security where you need it, not where it is most convenient for your vendors. Additionally, it is open source and up to 10x faster than every alternative in the space.

https://github.com/grafbase/grafbase

Dozer

Dozer makes it easy to build low-latency data APIs (gRPC and REST) from any data source. Data is transformed on the fly using Dozer's reactive SQL engine and stored in a high-performance cache to offer the best possible experience. Dozer is useful for quickly building data products.

OpenDAL

Apache OpenDAL lets you access data freely, painlessly, and efficiently.

cube.js

cube.js is the semantic layer for building data applications. It helps data engineers and application developers access data from modern data stores, organize it into consistent definitions, and deliver it to every application.

Database Clients for Rust

Please refer to Database Clients for Rust for details.

Metrics and Monitoring

https://github.com/frolicorg/frolic Frolic is an open source project (written in Rust) to build customer facing dashboards 10x faster. You can directly connect your database to the project and use ready made APIs to query data and create customer facing dashboards. You can also use frolic-react for your UI along with frolic to create full stack dashboards much faster.

Command-line Tools

https://github.com/timvw/qv A simple CLI to quickly view your data. Powered by DataFusion.

Big Data

Rust for Big Data and Parallel Processing Applications

datafuse

fluvio

DataBend

kamu-cli

datafusion

ballista

Polars
