Lineage DB is an educational MVCC database written in Rust. See Current functionality and Current limitations below.
Inspired by: High-Performance Concurrency Control Mechanisms for Main-Memory Databases & Tikhu
- Go to releases, find a binary that matches your OS / Architecture
- Extract the binary from the release
- macOS only: allow the binary to run even though it comes from an unidentified developer
- Run ./lineagedb. By default the database is stored in data/, a directory alongside the binary
- Open http://0.0.0.0:9000/graphiql
- Examples of GraphQL queries / mutations can be found at the end of this document
An optional CLI is provided for various configuration options
📀 Lineagedb GraphQL Server, provides a simple GraphQL interface for interacting with the database
Usage: lineagedb [OPTIONS]
Options:
-p, --port <PORT>
Port the graphql server will run on [default: 9000]
-a, --address <ADDRESS>
Address the graphql server will run on [default: 0.0.0.0]
--log-http
Whether to log out GraphQL HTTP requests
--http-workers <HTTP_WORKERS>
[default: 2]
--storage <STORAGE>
Which storage mechanism to use [default: file] [possible values: file, dynamo, postgres, s3]
--data <DATA>
When using file storage, location of the database. Reads / writes to this directory. Note: Does not support shell paths, e.g. ~ [default: data]
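The defaults listed in the help text can be summarised as a small Rust type. This is only a sketch: `ServerOptions`, `Storage`, `parse_storage` and `graphiql_url` are hypothetical names chosen for illustration and are not taken from the Lineage DB source, and `log_http` defaulting to off is an assumption based on it being a plain flag.

```rust
use std::path::PathBuf;

/// Hypothetical mirror of the `--storage` values listed in the help text.
#[derive(Debug, Clone, PartialEq)]
pub enum Storage {
    File,
    Dynamo,
    Postgres,
    S3,
}

/// Hypothetical mirror of the documented CLI options (not the project's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct ServerOptions {
    pub port: u16,
    pub address: String,
    pub log_http: bool,
    pub http_workers: usize,
    pub storage: Storage,
    pub data: PathBuf,
}

impl Default for ServerOptions {
    /// Defaults copied from the help output above.
    fn default() -> Self {
        Self {
            port: 9000,
            address: "0.0.0.0".to_string(),
            log_http: false, // assumption: a bare flag is off unless passed
            http_workers: 2,
            storage: Storage::File,
            // Kept verbatim: Rust's `PathBuf` does not expand `~`, which is
            // consistent with the note that shell paths are not supported.
            data: PathBuf::from("data"),
        }
    }
}

impl ServerOptions {
    /// URL of the GraphiQL page for these options.
    pub fn graphiql_url(&self) -> String {
        format!("http://{}:{}/graphiql", self.address, self.port)
    }
}

/// Maps a `--storage` value to the enum; unknown values yield `None`.
pub fn parse_storage(value: &str) -> Option<Storage> {
    match value {
        "file" => Some(Storage::File),
        "dynamo" => Some(Storage::Dynamo),
        "postgres" => Some(Storage::Postgres),
        "s3" => Some(Storage::S3),
        _ => None,
    }
}
```

With the defaults, `ServerOptions::default().graphiql_url()` yields the same address given in the quick start steps.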
Running with cargo
# Default options
cargo run
# Pass in command line arguments
cargo run -- --help
Debugging
# Prints out logs from the database crate (skips printing any GraphQL)
RUST_LOG=lineagedb cargo run
# Prints a backtrace when the program panics
RUST_BACKTRACE=1 cargo run
# Prints out logs from tests, note requires updating the test annotation to #[test_log::test]
RUST_LOG=debug cargo test -p database with_storage_file -- --nocapture
Other binaries
cargo run --package tcp-server --bin lineagedb-tcp-server
Tested on an M1 Mac.
| Threads | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Read | 640k | 1100k | 1400k | 1700k |
| Write | 280k | 400k | 150k | 100k |
Test notes:
- Metrics are reported in transactions per second
- A transaction has a single statement
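To make the unit concrete, here is a rough sketch of how a "transactions per second" figure with single-statement transactions could be measured across threads. It does not use Lineage DB: a `RwLock<HashMap>` stands in for the database, and `transactions_per_second` and `run_read_benchmark` are hypothetical helpers, so the numbers it produces are not comparable to the table.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::{Duration, Instant};

/// Converts a transaction count and elapsed wall time into transactions per second.
pub fn transactions_per_second(transactions: u64, elapsed: Duration) -> f64 {
    transactions as f64 / elapsed.as_secs_f64()
}

/// Runs `per_thread` single-statement reads on each of `threads` threads
/// against a shared in-memory map (a stand-in for the database).
/// Returns the number of completed reads and the elapsed wall time.
pub fn run_read_benchmark(threads: usize, per_thread: u64) -> (u64, Duration) {
    let store: Arc<RwLock<HashMap<u64, String>>> = Arc::new(RwLock::new(
        (0..1000u64).map(|id| (id, format!("person-{id}"))).collect(),
    ));

    let start = Instant::now();
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let store = Arc::clone(&store);
            thread::spawn(move || {
                let mut completed = 0u64;
                for i in 0..per_thread {
                    // One "transaction" = one read statement.
                    let guard = store.read().unwrap();
                    if guard.get(&(i % 1000)).is_some() {
                        completed += 1;
                    }
                }
                completed
            })
        })
        .collect();

    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    (total, start.elapsed())
}
```

Calling `run_read_benchmark(4, 1_000_000)` and passing the result to `transactions_per_second` gives one data point per thread count, which is the shape of the table above.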
Testing / Benchmarking
# Quick functional unit tests
cargo test --all
# Running performance unit tests
# Notes:
# 1. Running these tests one after another will yield different results to
# running them individually. I suspect this could be because the OS is cleaning up allocated memory.
# 2. These tests will yield different results based on whether the laptop is charging or not
cargo test --package database "database::database::tests::bulk" -- --nocapture --ignored --test-threads=1
# Using the benchmarking tool https://bheisler.github.io/criterion.rs/book/user_guide/command_line_options.html#baselines
cargo bench --all
cargo bench -- --save-baseline no-fsync # Saves the baseline to compare to another branch
Current functionality
- Supports ACID transactions
- Utilizes a WAL for performant writes / supports trimming the WAL
- Time travel; query the database at any given transaction id (* assuming the previous transactions are untrimmed)
- For any given item can look at all revisions (* assuming the previous transactions are untrimmed)
- Supports basic querying
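The time travel and revision features, and their dependence on untrimmed history, can be illustrated with a small version chain. This is a minimal sketch under stated assumptions, not the database's actual implementation: `VersionedStore`, `put`, `get_at`, `revisions` and `trim_before` are hypothetical names, and each version stores a full copy of the item.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Person {
    pub full_name: String,
    pub email: Option<String>,
}

/// One revision of an item, tagged with the transaction id that wrote it.
#[derive(Debug, Clone)]
struct Version {
    tx_id: u64,
    data: Person,
}

/// Hypothetical in-memory store keeping every version of every item.
#[derive(Default)]
pub struct VersionedStore {
    next_tx_id: u64,
    // Versions per item id, ordered by ascending transaction id.
    items: HashMap<String, Vec<Version>>,
}

impl VersionedStore {
    /// Writes a new version of `id` and returns the transaction id used.
    pub fn put(&mut self, id: &str, data: Person) -> u64 {
        self.next_tx_id += 1;
        let tx_id = self.next_tx_id;
        self.items
            .entry(id.to_string())
            .or_default()
            .push(Version { tx_id, data });
        tx_id
    }

    /// Time travel: the newest version written at or before `tx_id`.
    pub fn get_at(&self, id: &str, tx_id: u64) -> Option<&Person> {
        self.items
            .get(id)?
            .iter()
            .rev()
            .find(|v| v.tx_id <= tx_id)
            .map(|v| &v.data)
    }

    /// All retained revisions of an item, oldest first.
    pub fn revisions(&self, id: &str) -> Vec<&Person> {
        self.items
            .get(id)
            .map(|versions| versions.iter().map(|v| &v.data).collect())
            .unwrap_or_default()
    }

    /// Drops versions superseded at or before `min_tx_id`; reads at earlier
    /// transaction ids can no longer be answered afterwards.
    pub fn trim_before(&mut self, min_tx_id: u64) {
        for versions in self.items.values_mut() {
            let keep_from = versions
                .iter()
                .rposition(|v| v.tx_id <= min_tx_id)
                .unwrap_or(0);
            versions.drain(..keep_from);
        }
    }
}
```

After `trim_before`, `get_at` returns `None` for transaction ids older than the oldest retained version, which is the caveat marked with `*` in the list above.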
Current limitations:
- Does not support session-based transactions; statements in a transaction must be sent all at once
- Does not support DDL statements; at the moment the system is limited to a single entity (Person)
- The working dataset must fit entirely within memory; there is no storage pool / disk paging
- Does not have an SQL frontend
- Has limited querying capabilities: just AND, no OR, IN, etc.
- No version compression; for each new version we make a clean copy of all of the previous version's data
- No uniqueness constraints or index-based querying
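As an illustration of what "just AND" querying without indexes amounts to, the sketch below filters a list of people by requiring every field set in the query to match, using a full scan. `PersonQuery`, `matches` and `list` are hypothetical names for this example and are not the project's API.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Person {
    pub id: String,
    pub full_name: String,
    pub email: Option<String>,
}

/// Hypothetical query: every field that is `Some` must match (logical AND).
/// There is no way to express OR or IN with this shape.
#[derive(Debug, Default)]
pub struct PersonQuery {
    pub full_name: Option<String>,
    pub email: Option<String>,
}

impl PersonQuery {
    /// True when the person satisfies all of the populated conditions.
    pub fn matches(&self, person: &Person) -> bool {
        let name_ok = self
            .full_name
            .as_ref()
            .map_or(true, |name| *name == person.full_name);
        let email_ok = self
            .email
            .as_ref()
            .map_or(true, |email| person.email.as_ref() == Some(email));
        name_ok && email_ok
    }
}

/// Full scan over all people; without indexes every item is checked.
pub fn list<'a>(people: &'a [Person], query: &PersonQuery) -> Vec<&'a Person> {
    people.iter().filter(|p| query.matches(p)).collect()
}
```

An empty query matches everything, which corresponds to calling `listHuman` without a `query` argument in the GraphQL examples.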
# Create
mutation writeHuman {
createHuman(newHuman: { fullName: "Frank Walker" }) {
id
fullName
email
}
}
# Create bulk
mutation createHumans ($newHumans: [NewHuman!]!) {
createHumans(newHumans: $newHumans) {
id
fullName
email
}
}
{
"newHumans": [
{ "fullName": "test1", "email": "[email protected]" },
{ "fullName": "test2", "email": null }
]
}
# Update
mutation updateHuman {
updateHuman(id: "53db1e6f-4b90-4d3d-8871-b24288bf9192", updateHuman: { email: "[email protected]"}) {
id
fullName
email
}
}
# Use ID in mutation response to get the human
query queryHuman {
human (id: "bf5567e4-1d4e-4451-aeb3-449cdd2970be") {
id
fullName
email
}
}
# List
query listHuman {
listHuman {
id
fullName
email
}
}
query listHumanWithQuery {
listHuman(query: { fullName: "test1" }) {
id
fullName
email
}
}
mutation dbSnapshot {
snapshot
}
mutation dbReset {
reset
}