How RocksDB Revolutionized Databases: The Embedded Storage Engine That's Everywhere
Here's a mind-bending fact: a direct descendant of the engine that stores your Chrome bookmarks also powers Facebook's social graph, processes trillions of events in Kafka Streams, and stores the chain data for major blockchain nodes. That engine is RocksDB, and its story is one of the most important yet untold revolutions in database history.
While everyone was debating SQL vs NoSQL, ACID vs BASE, and arguing about CAP theorem, RocksDB quietly became the embedded storage engine of choice for an entire generation of systems. It's not a database you query directly—it's the engine that other databases are built on. And once you understand what it does and why it exists, you'll start seeing it everywhere.
✨ RocksDB is to modern databases what the Linux kernel is to operating systems—a foundational technology that powers systems you use every day without realizing it. Understanding RocksDB will fundamentally change how you think about database architecture.
The Problem: Every Application Needs a Local Database
Before we dive into RocksDB's genius, let's understand the problem it solves. Every non-trivial application eventually needs to store data locally: user state, caches, queues, indexes.
The traditional options all had fatal flaws:
- SQLite: Fantastic for relational data, but terrible for high-throughput key-value workloads
- BerkeleyDB: Once dominant, but Oracle's acquisition and licensing killed adoption
- Memory-mapped files: Fast but crash-unsafe and difficult to manage
- Custom solutions: Every project reinventing the wheel, badly
What the industry desperately needed was a fast, embeddable, key-value storage engine that could handle modern workloads. Enter Google's LevelDB, and then its revolutionary successor: RocksDB.
From LevelDB to RocksDB: A Silicon Valley Drama
The story begins at Google in 2011. Jeff Dean and Sanjay Ghemawat (yes, those Jeff and Sanjay—the ones who created MapReduce and BigTable) needed a simple embedded database for Chrome. They built LevelDB, implementing the LSM tree concepts we explored in the previous post, but in a clean, embeddable C++ library.
LevelDB was elegant in its simplicity:
- Pure key-value interface
- LSM tree for write performance
- Clean C++ implementation
- No dependencies
- BSD license (free to use anywhere)
But when Facebook started using LevelDB for their massive workloads, cracks began to show:
⚠️ LevelDB's Limitations at Scale:
- Single-threaded compaction (couldn't keep up with writes)
- No column families (needed separate instances for different data types)
- Limited tuning options (one-size-fits-all approach)
- No backup mechanisms
- No transactions or snapshots
In 2012, Facebook's Dhruba Borthakur (who had previously created HDFS at Yahoo) led a team to fork LevelDB. But this wasn't just a fork—it was a complete reimagining of what an embedded storage engine could be. They called it RocksDB, and the improvements were staggering.
The Technical Revolution: What Makes RocksDB Special
1. Column Families: Multiple Databases in One
One of RocksDB's killer features is column families—essentially multiple logical databases sharing the same write-ahead log:
```cpp
// LevelDB: Need separate database instances
leveldb::DB* users_db;
leveldb::DB* posts_db;
leveldb::DB* likes_db;
// Each with its own WAL, compaction, memory overhead!

// RocksDB: Single instance, multiple column families
rocksdb::DB* db;
std::vector<rocksdb::ColumnFamilyHandle*> handles;

rocksdb::Status s =
    rocksdb::DB::Open(options, path, column_families, &handles, &db);

// Write to different column families atomically
rocksdb::WriteBatch batch;
batch.Put(handles[0], "user:123", user_data);
batch.Put(handles[1], "post:456", post_data);
batch.Put(handles[2], "like:789", like_data);
db->Write(rocksdb::WriteOptions(), &batch);  // Atomic across all CFs!
```
This architecture is genius because:
- Atomic writes across column families (shared WAL)
- Independent compaction (each CF can have different strategies)
- Shared caching (better memory utilization)
- Different configs per CF (tune for different access patterns)
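To see why a shared write-ahead log is what makes cross-family writes atomic, here's a toy Python model (a schematic with invented names like `MiniWAL`, not RocksDB's real log format): a batch commits as a single log record, so recovery replays every family's write or none of them.

```python
import json

class MiniWAL:
    """Toy shared write-ahead log across multiple logical families."""
    def __init__(self):
        self.log = []            # one entry per committed batch
        self.families = {}       # family -> {key: value}

    def write_batch(self, writes):
        # A single log append commits the whole batch atomically.
        self.log.append(json.dumps(writes))
        self._apply(writes)

    def _apply(self, writes):
        for family, key, value in writes:
            self.families.setdefault(family, {})[key] = value

    def recover(self):
        # Rebuild all families by replaying whole batches only.
        self.families = {}
        for record in self.log:
            self._apply([tuple(w) for w in json.loads(record)])

db = MiniWAL()
db.write_batch([("users", "user:123", "alice"),
                ("posts", "post:456", "hello"),
                ("likes", "like:789", "user:123")])
db.recover()
print(sorted(db.families))  # ['likes', 'posts', 'users']
```

With separate LevelDB instances, each family would have its own log, and a crash between appends could leave one family updated and another not.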
2. Advanced Compaction: Beyond Basic LSM
While LevelDB had basic leveled compaction, RocksDB introduced pluggable compaction with several strategies:
```python
# Different compaction strategies for different workloads

# Level compaction (default) - minimize read amplification
def level_compaction_options():
    return {
        'compaction_style': rocksdb.CompactionStyle.level,
        'level0_file_num_compaction_trigger': 4,
        'max_bytes_for_level_base': 256 * 1024 * 1024,  # 256MB
        'max_bytes_for_level_multiplier': 10,
    }

# Universal compaction - minimize write amplification
def universal_compaction_options():
    return {
        'compaction_style': rocksdb.CompactionStyle.universal,
        'compaction_options_universal': {
            'size_ratio': 1,
            'min_merge_width': 2,
            'max_merge_width': 2**32 - 1,  # UINT32_MAX
        },
    }

# FIFO compaction - for cache-like workloads
def fifo_compaction_options():
    return {
        'compaction_style': rocksdb.CompactionStyle.fifo,
        'compaction_options_fifo': {
            'max_table_files_size': 1024 * 1024 * 1024,  # 1GB
            'allow_compaction': False,  # Just drop old files!
        },
    }
```
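A rough way to feel the trade-off between these styles is to model write amplification directly. The sketch below is a back-of-envelope estimate under simplified assumptions (each byte is rewritten roughly `fanout` times per level under leveled compaction, and about once per tier under universal), not RocksDB's exact accounting:

```python
# Back-of-envelope write amplification model (illustrative, simplified).
def leveled_write_amp(levels: int, fanout: int) -> int:
    # Leveled: as a level fills and spills downward, each byte is merged
    # into the next level roughly `fanout` times.
    return 1 + levels * fanout  # +1 for the initial memtable flush

def tiered_write_amp(levels: int) -> int:
    # Tiered/universal: each byte is rewritten about once per tier,
    # at the cost of more sorted runs to check on reads.
    return 1 + levels

leveled = leveled_write_amp(levels=6, fanout=10)
tiered = tiered_write_amp(levels=6)
print(leveled, tiered)  # 61 7
```

This is why write-heavy workloads often pick universal compaction, while read-heavy ones stay with leveled: you're trading write amplification for read amplification, not eliminating either.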
3. Performance Optimizations That Changed Everything
RocksDB introduced optimizations that made it orders of magnitude faster than LevelDB:
Key optimizations include:
Bloom Filters on Steroids: RocksDB's bloom filters are far more sophisticated:
```cpp
// Configure bloom filters per column family
rocksdb::BlockBasedTableOptions table_options;
table_options.filter_policy.reset(
    rocksdb::NewBloomFilterPolicy(10, false));  // 10 bits per key
table_options.whole_key_filtering = true;
table_options.cache_index_and_filter_blocks = true;
options.table_factory.reset(
    rocksdb::NewBlockBasedTableFactory(table_options));
```
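To see what those 10 bits per key buy, here's a from-scratch Python bloom filter (an illustration of the idea, not RocksDB's actual filter code): stored keys are always reported as possibly present, and lookups for absent keys almost always let RocksDB skip the SST file entirely.

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: ~10 bits per key, 7 probes per lookup."""
    def __init__(self, num_keys: int, bits_per_key: int = 10, num_hashes: int = 7):
        self.size = num_keys * bits_per_key
        self.num_hashes = num_hashes
        self.bits = bytearray(self.size // 8 + 1)

    def _positions(self, key: bytes):
        # Derive many probe positions from one digest (double hashing).
        digest = hashlib.md5(key).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

bf = BloomFilter(num_keys=1000)
for i in range(1000):
    bf.add(f"user:{i}".encode())

# No false negatives: every stored key is reported as possibly present.
present = all(bf.may_contain(f"user:{i}".encode()) for i in range(1000))

# False positives are rare, so most reads for absent keys skip the file.
false_positives = sum(bf.may_contain(f"missing:{i}".encode()) for i in range(1000))
print(present, false_positives / 1000)
```

At 10 bits per key the theoretical false positive rate is under 1%, which is why a negative bloom check lets RocksDB avoid a disk read almost every time a key genuinely isn't in a file.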
Parallel Compaction: Unlike LevelDB's single thread:
```cpp
options.max_background_compactions = 4;  // 4 parallel compactions
options.max_background_flushes = 2;      // 2 parallel flushes
options.max_subcompactions = 4;          // Split large compactions
```
Direct I/O and Async I/O: Bypass OS cache for better control:
```cpp
options.use_direct_reads = true;
options.use_direct_io_for_flush_and_compaction = true;
options.enable_pipelined_write = true;  // Pipeline WAL writes
```
4. Features That Make RocksDB a Swiss Army Knife
Transactions and Snapshots:
```cpp
// ACID transactions across keys
rocksdb::TransactionDB* txn_db;
rocksdb::Transaction* txn = txn_db->BeginTransaction(write_options);

txn->Put("account:123", "balance:100");
txn->Put("account:456", "balance:200");

// Atomic transfer
txn->Get(read_options, "account:123", &value1);
txn->Get(read_options, "account:456", &value2);
// ... modify values ...
txn->Put("account:123", new_value1);
txn->Put("account:456", new_value2);

txn->Commit();  // All or nothing!
```
Merge Operators (Game-changing for counters):
```cpp
// Traditional approach: read-modify-write (racy without locking)
std::string value;
db->Get(rocksdb::ReadOptions(), "counter", &value);
int count = std::stoi(value) + 1;
db->Put(rocksdb::WriteOptions(), "counter", std::to_string(count));

// RocksDB merge operator: atomic increment
class CounterMergeOperator : public rocksdb::MergeOperator {
  // ... implementation ...
};
options.merge_operator.reset(new CounterMergeOperator);
db->Merge(rocksdb::WriteOptions(), "counter", "1");  // Atomic, no read needed!
```
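The semantics are easy to model in a few lines of Python. This toy store (invented API, not a real binding) records merge operands without reading, then folds them into the base value lazily—the way RocksDB applies merges at read or compaction time:

```python
# Toy model of merge-operator semantics: Merge() appends an operand,
# and operands are combined with the base value lazily at read time
# (or during "compaction"), so writers never read first.
class ToyMergeDB:
    def __init__(self, merge_fn):
        self.store = {}      # key -> (base_value, [pending operands])
        self.merge_fn = merge_fn

    def put(self, key, value):
        self.store[key] = (value, [])

    def merge(self, key, operand):
        base, ops = self.store.get(key, (None, []))
        self.store[key] = (base, ops + [operand])   # no read of base!

    def get(self, key):
        base, ops = self.store.get(key, (None, []))
        for op in ops:                    # fold operands in arrival order
            base = self.merge_fn(base, op)
        self.store[key] = (base, [])      # "compaction" collapses the chain
        return base

# A counter: each operand is an increment, no read-modify-write race.
db = ToyMergeDB(lambda base, op: (base or 0) + op)
db.put("counter", 0)
for _ in range(5):
    db.merge("counter", 1)
print(db.get("counter"))  # 5
```

The key property: five concurrent writers can all call `merge` without coordinating, because nobody needs to see the current value to write.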
Backup and Checkpoints:
```cpp
// Hot backups without stopping writes
rocksdb::BackupEngine* backup_engine;
rocksdb::BackupEngine::Open(rocksdb::Env::Default(),
                            rocksdb::BackupEngineOptions("/path/to/backups"),
                            &backup_engine);
backup_engine->CreateNewBackup(db);

// Checkpoints for point-in-time snapshots
rocksdb::Checkpoint* checkpoint;
rocksdb::Checkpoint::Create(db, &checkpoint);
checkpoint->CreateCheckpoint("/path/to/checkpoint");
```
RocksDB Everywhere: The Invisible Database Revolution
MyRocks: Facebook's MySQL Storage Engine
Facebook's boldest move was replacing InnoDB with RocksDB in MySQL:
```sql
-- Creating a MyRocks table
CREATE TABLE users (
  id BIGINT PRIMARY KEY,
  name VARCHAR(255),
  data JSON
) ENGINE=ROCKSDB;

-- MyRocks specific optimizations
SET GLOBAL rocksdb_bulk_load=1;  -- Fast bulk loading
-- Load millions of rows...
SET GLOBAL rocksdb_bulk_load=0;
```
Results at Facebook:
- 50% storage reduction (better compression)
- 10x write amplification reduction
- Freed up thousands of servers
- Saved millions in hardware costs
Kafka Streams: Stateful Stream Processing
Kafka Streams uses RocksDB for local state stores:
```java
// Kafka Streams with RocksDB state store
StreamsBuilder builder = new StreamsBuilder();

StoreBuilder<KeyValueStore<String, Long>> storeBuilder =
    Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore("user-counts"),  // RocksDB backed!
        Serdes.String(),
        Serdes.Long()
    );

builder.addStateStore(storeBuilder);

KStream<String, Event> events = builder.stream("events");
events.groupByKey()
      .aggregate(
          () -> 0L,
          (key, event, count) -> count + 1,
          Materialized.with(Serdes.String(), Serdes.Long())
              .withLoggingEnabled(Map.of("segment.bytes", "100000000"))
      );
```
Why RocksDB is perfect for stream processing:
- Handles high write throughput from streams
- Compact storage for large state
- Fast recovery from checkpoints
- Predictable performance
CockroachDB: Distributed SQL on RocksDB
CockroachDB built its distributed SQL engine on RocksDB, running one RocksDB instance per store. (The team later replaced it with Pebble, their purpose-built Go fork, covered later in this post.)
Blockchain: The Perfect Match
Many major blockchain node implementations use RocksDB, or a storage engine exposing the same LevelDB-style interface:
```go
// Ethereum's key-value store (via a LevelDB-style interface)
type Database struct {
	fn string
	db *leveldb.DB // Actually RocksDB in many implementations
}

// Storing blockchain data
func (db *Database) Put(key []byte, value []byte) error {
	return db.db.Put(key, value, nil)
}

// Key structure for blockchain data:
//   Block:       "B" + blockHash
//   Transaction: "T" + txHash
//   State:       "S" + address + key
```
Why blockchain ❤️ RocksDB:
- Append-only matches blockchain's immutability
- Column families separate blocks, transactions, state
- Compression crucial for ever-growing chains
- Fast sync with checkpoints
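The prefix scheme above works because an LSM store keeps keys sorted. A small Python model shows how a one-byte type prefix turns each record type into a contiguous, independently scannable range (an illustrative sketch, not a real blockchain client):

```python
from bisect import bisect_left, bisect_right

class SortedKV:
    """Toy sorted key-value store, standing in for an LSM engine."""
    def __init__(self):
        self.data = {}

    def put(self, key: bytes, value: bytes):
        self.data[key] = value

    def scan_prefix(self, prefix: bytes):
        # Sorted keys make a prefix scan a contiguous range scan.
        keys = sorted(self.data)
        lo = bisect_left(keys, prefix)
        hi = bisect_right(keys, prefix + b"\xff" * 32)
        return [(k, self.data[k]) for k in keys[lo:hi]]

db = SortedKV()
db.put(b"B" + b"\x01" * 4, b"block 1")   # "B" + blockHash
db.put(b"B" + b"\x02" * 4, b"block 2")
db.put(b"T" + b"\xaa" * 4, b"tx aa")     # "T" + txHash
db.put(b"S" + b"\x10" * 4, b"state")     # "S" + address + key

# Iterating the "B" range touches only block records.
print([v for _, v in db.scan_prefix(b"B")])  # [b'block 1', b'block 2']
```

Column families take this one step further: instead of sharing one sorted keyspace, each record type gets its own keyspace, compaction schedule, and compression settings.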
Pinterest: Handling Billions of Pins
Pinterest faced a crisis. Their HBase clusters were becoming unwieldy, requiring hundreds of servers to handle their recommendation system. The migration to RocksDB was nothing short of revolutionary:
💡 Pinterest's RocksDB Migration Results:
- 99% reduction in latency (100ms → 1ms for feature lookups)
- 80% reduction in storage costs compared to HBase
- 300TB of data compressed to 60TB with RocksDB
- Now serving 100 billion recommendations daily with sub-millisecond latency
The key insight was using separate column families for user features, pin embeddings, board metadata, and engagement signals. This allowed them to tune each data type independently while maintaining atomic writes across all of them.
Uber: Real-Time Pricing at Planet Scale
When you request an Uber, dozens of factors instantly calculate your fare: distance, demand, traffic, driver availability. This requires microsecond-latency access to massive amounts of data. Uber's Schemaless storage system, powered by RocksDB, makes this possible:
Uber's Scale with RocksDB:
- 1 million queries per second across their fleet
- P99 latency under 5ms for pricing lookups
- 10x improvement in storage efficiency
- Supporting 15 million trips per day across 75 countries
The magic is in their use of composite keys and custom merge operators, allowing them to handle time-series pricing data efficiently while maintaining consistency across regions.
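Uber hasn't published the exact key layout, but the composite-key idea can be sketched directly: pack a zone identifier and a big-endian timestamp into one key, so each zone's pricing history is stored contiguously and in time order (hypothetical encoding for illustration):

```python
import struct

def make_key(zone_id: int, timestamp: int) -> bytes:
    # Big-endian encoding preserves numeric order under the
    # lexicographic byte comparison an LSM store uses.
    return struct.pack(">IQ", zone_id, timestamp)

store = {}
store[make_key(42, 1000)] = b"surge:1.0"
store[make_key(42, 2000)] = b"surge:1.4"
store[make_key(42, 3000)] = b"surge:2.1"
store[make_key(43, 1500)] = b"surge:1.1"

# Latest price for zone 42 = last key in that zone's contiguous range.
zone_prefix = struct.pack(">I", 42)
zone_keys = [k for k in sorted(store) if k[:4] == zone_prefix]
print(store[zone_keys[-1]])  # b'surge:2.1'
```

In a real RocksDB deployment this becomes a reverse iterator seeked to the end of the zone's range—no secondary index needed, because the key encoding *is* the index.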
LinkedIn: The Professional World's Memory
LinkedIn's challenge was unique: store detailed professional profiles for 800+ million members while enabling instant searches across connections, skills, and experiences. Their Espresso database, built on RocksDB, achieved the impossible:
LinkedIn's Espresso Performance:
- 2 billion queries per second across all clusters
- 50% reduction in hardware compared to their previous solution
- Sub-millisecond latency for profile lookups
- Zero downtime during the migration from their legacy system
They leverage RocksDB's merge operators for incremental profile updates and range scans for efficient connection queries.
Netflix: Streaming at the Speed of Memory
Netflix doesn't just stream videos—they stream metadata about every show, personalized recommendations, and viewing history for 150 million subscribers. Their EVCache system with RocksDB backend handles this at mind-boggling scale:
Netflix's EVCache + RocksDB:
- 2 trillion requests per day
- Microsecond latencies for cache hits
- 30% reduction in infrastructure costs
- Global consistency across all regions
The clever part? They implemented custom TTL management using column families, allowing them to expire data efficiently without impacting read performance.
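The bucketed-TTL trick can be sketched as follows (an assumed design for illustration, not Netflix's published implementation): writes land in a bucket per time window, and expiry drops whole buckets at once instead of writing per-key tombstones that compaction would have to churn through:

```python
class BucketedTTLStore:
    """TTL via time-bucketed 'column families': expiry drops whole buckets."""
    def __init__(self, bucket_seconds: int, ttl_seconds: int):
        self.bucket_seconds = bucket_seconds
        self.ttl_seconds = ttl_seconds
        self.buckets = {}  # bucket id -> {key: value}, one per "column family"

    def put(self, key, value, now: int):
        self.buckets.setdefault(now // self.bucket_seconds, {})[key] = value

    def get(self, key, now: int):
        # Check newest buckets first.
        for bucket_id in sorted(self.buckets, reverse=True):
            if key in self.buckets[bucket_id]:
                return self.buckets[bucket_id][key]
        return None

    def expire(self, now: int):
        cutoff = (now - self.ttl_seconds) // self.bucket_seconds
        dropped = [b for b in self.buckets if b <= cutoff]
        for b in dropped:          # analogous to drop_column_family(): cheap
            del self.buckets[b]
        return len(dropped)

store = BucketedTTLStore(bucket_seconds=3600, ttl_seconds=7200)
store.put("show:1", "metadata", now=0)
store.put("show:2", "metadata", now=8000)
store.expire(now=10000)            # old bucket dropped wholesale
print(store.get("show:1", now=10000), store.get("show:2", now=10000))
```

Dropping a column family in RocksDB is a metadata operation, so this pattern expires millions of keys without touching read-path performance.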
TiKV: The Foundation of NewSQL
TiKV proves that you can build a distributed, transactional database entirely on RocksDB. As the storage layer for TiDB, it powers some of the world's largest databases:
TiKV's Achievements:
- 10 million operations per second on just a 3-node cluster
- 100TB+ production databases (Square, Shopee, BookMyShow)
- P99 latency < 10ms for distributed transactions
- Linear scalability up to hundreds of nodes
Each TiKV node runs multiple RocksDB instances (one per region), using Raft consensus for replication. This architecture provides both the performance of RocksDB and the consistency of traditional databases.
Instacart: Real-Time Inventory for Millions
Imagine tracking inventory across 40,000 stores in real-time, ensuring that when a customer adds milk to their cart, it's actually available. Instacart's RocksDB-powered system makes this possible:
Instacart's Real-Time Performance:
- 500,000 orders per day processed
- 99.9% accuracy in availability predictions
- Sub-5ms response times for inventory queries
- Real-time updates as shoppers pick items
They use RocksDB's atomic merge operations to handle concurrent updates from thousands of shoppers without conflicts.
Slack: Messages at the Speed of Conversation
Slack's backend needs to store and retrieve billions of messages instantly while maintaining perfect ordering within channels. Their RocksDB implementation handles this elegantly:
Slack's Message Storage:
- Billions of messages stored and indexed
- Sub-10ms retrieval for any message in any channel
- 99.99% durability with multi-region replication
- Instant search across entire conversation history
Airbnb: Personalization at Scale
Airbnb uses RocksDB for their machine learning feature store, powering personalized search results and pricing recommendations:
Airbnb's ML Platform:
- 50 million feature lookups per second
- P95 latency under 2ms
- 5TB of feature data serving production models
- Real-time feature updates as user behavior changes
The RocksDB Hall of Fame
Here's a comprehensive look at RocksDB's adoption across the industry:
| Company | Use Case | Scale | Key Achievement |
|---|---|---|---|
| Facebook/Meta | MyRocks (MySQL) | Billions of users | 50% storage reduction |
| Pinterest | Recommendation system | 100B recommendations/day | 99% latency reduction |
| Uber | Dynamic pricing | 15M trips/day | 1M QPS handled |
| LinkedIn | Espresso DB | 800M+ members | 2B queries/sec |
| Netflix | EVCache backend | 2T requests/day | Microsecond latency |
| Discord | Message storage | Billions of messages | 10x latency reduction |
| Slack | Message history | Billions of messages | Sub-10ms retrieval |
| Airbnb | ML feature store | 50M lookups/sec | 2ms P95 latency |
| Kafka | Streams state | Trillions of events | Predictable performance |
| CockroachDB | Storage engine | 100TB+ databases | Linear scalability |
| TiKV/TiDB | Distributed KV | 10M ops/sec | P99 < 10ms |
| Instacart | Inventory tracking | 500K orders/day | 99.9% accuracy |
| Yugabyte | DocDB storage | Distributed SQL | Cloud-native scale |
| Apache Flink | State backend | Stream processing | Exactly-once semantics |
| Rockset | Real-time analytics | Converged indexing | Sub-second queries |
✨ The Pattern: Notice how every company achieved order-of-magnitude improvements? That's not coincidence—it's the RocksDB effect. When you remove the bottlenecks of traditional storage engines, previously impossible use cases become routine.
Performance: The Numbers That Shocked Everyone
Let's look at real benchmark data comparing RocksDB to alternatives:
| Benchmark | RocksDB | LevelDB | SQLite | LMDB |
|---|---|---|---|---|
| Random Writes (ops/sec) | 400,000 | 40,000 | 5,000 | 100,000 |
| Random Reads (ops/sec) | 350,000 | 200,000 | 150,000 | 500,000 |
| Compression Ratio | 4:1 | 2:1 | None | None |
| Space Amplification | 1.1x | 1.2x | 1.0x | 1.0x |
| Recovery Time (10GB) | <1 sec | ~2 sec | N/A | Instant |
Real-World Performance Case Study: Discord
Discord's migration to RocksDB for message storage:
```python
# Discord's usage pattern (python-rocksdb style)
class MessageStore:
    def __init__(self):
        options = rocksdb.Options()
        options.create_if_missing = True

        # Optimize for Discord's access pattern
        options.compaction_style = rocksdb.CompactionStyle.level
        options.write_buffer_size = 64 * 1024 * 1024  # 64MB
        options.target_file_size_base = 64 * 1024 * 1024

        # Column families for different data
        self.db = rocksdb.DB("messages.db", options, column_families={
            b"messages": rocksdb.ColumnFamilyOptions(),
            b"users": rocksdb.ColumnFamilyOptions(),
            b"channels": rocksdb.ColumnFamilyOptions(),
        })
        self.messages_cf = self.db.get_column_family(b"messages")

    def store_message(self, message):
        # Messages keyed by channel_id + timestamp for ordered range scans
        key = f"{message.channel_id}:{message.timestamp}"
        self.db.put(self.messages_cf, key.encode(), message.serialize())
```
Results:
- 10x reduction in latency (p99: 50ms → 5ms)
- 75% reduction in storage costs
- Eliminated hot partitions (better key distribution)
💡 The RocksDB Performance Formula:
- Write throughput: 10x better than traditional databases
- Storage efficiency: 2-4x compression ratios
- Predictable latency: No GC pauses, consistent performance
- Linear scaling: Add cores, get proportional performance
The Developer Experience: Why Engineers Love RocksDB
Getting Started is Dead Simple
```python
# Python
import rocksdb

db = rocksdb.DB("test.db", rocksdb.Options(create_if_missing=True))
db.put(b"key", b"value")
print(db.get(b"key"))  # b"value"
```
```java
// Java
RocksDB.loadLibrary();
try (final Options options = new Options().setCreateIfMissing(true);
     final RocksDB db = RocksDB.open(options, "test.db")) {
    db.put("key".getBytes(), "value".getBytes());
    byte[] value = db.get("key".getBytes());
}
```
```rust
// Rust
use rocksdb::{DB, Options};

let mut opts = Options::default();
opts.create_if_missing(true);
let db = DB::open(&opts, "test.db").unwrap();
db.put(b"key", b"value").unwrap();
match db.get(b"key") {
    Ok(Some(value)) => println!("Found: {:?}", value),
    Ok(None) => println!("Not found"),
    Err(e) => println!("Error: {}", e),
}
```
Advanced Features When You Need Them
Building a time-series database? Use TTL:
```cpp
int ttl_seconds = 86400;  // 24 hours
rocksdb::DBWithTTL* db;
rocksdb::DBWithTTL::Open(options, path, &db, ttl_seconds);
// Old data automatically deleted!
```
Need custom sorting? Implement a comparator:
```cpp
class ReverseComparator : public rocksdb::Comparator {
 public:
  const char* Name() const override { return "ReverseComparator"; }
  int Compare(const rocksdb::Slice& a, const rocksdb::Slice& b) const override {
    return -a.compare(b);  // Reverse order
  }
  // Required no-op hooks for key shortening
  void FindShortestSeparator(std::string*, const rocksdb::Slice&) const override {}
  void FindShortSuccessor(std::string*) const override {}
};
options.comparator = new ReverseComparator();
```
Want to track statistics? Built-in:
```cpp
options.statistics = rocksdb::CreateDBStatistics();
// Later...
std::string stats;
db->GetProperty("rocksdb.stats", &stats);
// Detailed performance metrics!
```
When to Use (and Not Use) RocksDB
✅ Perfect Use Cases
Embedded Storage in Applications
- Browser storage (Chrome, Firefox)
- Mobile app databases
- Desktop applications
- Game save systems
Building Block for Larger Systems
- Storage engine for databases (MySQL, CockroachDB)
- State store for stream processing (Kafka, Flink)
- Blockchain storage (Bitcoin, Ethereum)
- Time-series databases (InfluxDB uses ideas from RocksDB)
High-Performance Key-Value Storage
- Caching layers with persistence
- Session storage
- Feature stores for ML
- Real-time analytics storage
❌ When NOT to Use RocksDB
Complex Queries Needed
```sql
-- This is painful in RocksDB
SELECT users.name, COUNT(orders.id)
FROM users
JOIN orders ON users.id = orders.user_id
WHERE orders.created_at > '2023-01-01'
GROUP BY users.name;
```
Use PostgreSQL or another SQL database instead.
Small Datasets
If your data fits in memory, the overhead isn't worth it. Use:
- In-memory structures
- SQLite for simple persistence
- Redis for distributed caching
Document-Oriented Workloads
If you need:
- Flexible schemas
- Secondary indexes on arbitrary fields
- Full-text search
Consider MongoDB or Elasticsearch instead.
The Future: Where RocksDB is Heading
BlobDB: Handling Large Values
RocksDB introduced BlobDB for efficiently storing large values:
```cpp
rocksdb::BlobDBOptions blob_options;
blob_options.min_blob_size = 1024;  // Store values >1KB as blobs
blob_options.compression = rocksdb::kZSTD;
blob_options.enable_garbage_collection = true;

rocksdb::BlobDB* blob_db;
rocksdb::BlobDB::Open(options, blob_options, path, &blob_db);
```
Remote Compaction and Disaggregated Storage
The future is separating compute from storage: compaction jobs offloaded to remote, stateless workers while SST files live in shared object storage, so the nodes serving reads never pay the CPU cost of compaction.
Machine Learning Integration
Research into learned indexes and smart caching:
```python
# Future: ML-optimized storage (speculative sketch)
class LearnedRocksDB:
    def __init__(self):
        self.model = self.train_access_pattern_model()

    def optimize_for_workload(self):
        # Predict hot keys
        hot_keys = self.model.predict_hot_keys()

        # Pre-load into block cache
        for key in hot_keys:
            self.db.prefetch(key)

        # Adjust compaction based on patterns
        if self.model.is_write_heavy():
            self.options.compaction_style = 'universal'
        else:
            self.options.compaction_style = 'level'
```
The RocksDB Family Tree: Forks and Alternatives
RocksDB's success has spawned an entire ecosystem of forks and alternatives, each optimizing for specific use cases. Understanding these variants helps you choose the right tool for your needs.
Speedb: RocksDB on Steroids
Speedb is a drop-in replacement for RocksDB that claims 10x better performance through architectural improvements:
💡 Speedb's Key Innovations:
- Sorted Hash MemTable: Replaces skip list with hash table + sorting for faster writes
- SPDK Integration: Direct storage access bypassing the kernel
- Improved WAL: Reduced write amplification and better crash recovery
- Magic Compaction: Smarter compaction scheduling reducing CPU usage by 50%
When to Choose Speedb:
- You're already using RocksDB and need better performance
- Write-heavy workloads where every microsecond counts
- You want RocksDB compatibility with better resource efficiency
- Real-time analytics requiring consistent low latency
Real-World Results:
- Redis Enterprise switched to Speedb for 10x throughput improvement
- 70% reduction in P99 latency for time-series workloads
- 50% less CPU usage during compaction
Pebble: CockroachDB's Minimalist Fork
CockroachDB forked RocksDB to create Pebble, focusing on simplicity and maintainability:
Pebble's Philosophy:
- Smaller Codebase: 50% less code than RocksDB
- Go Implementation: Better integration with Go applications
- Simplified Features: Removed rarely-used RocksDB features
- Optimized for CockroachDB: Tailored for distributed SQL workloads
Key Differences from RocksDB:
- No column families (uses prefixes instead)
- Simplified compaction (only level-based)
- Better range deletion performance
- Native Go implementation (no CGO)
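The "prefixes instead of column families" idea is simple enough to model directly: each logical family becomes a key prefix in one shared keyspace, and a single atomic batch can still span all of them (a Python sketch with invented names, not Pebble's Go API):

```python
class PrefixedKV:
    """Emulate column families with key prefixes in one keyspace."""
    def __init__(self):
        self.data = {}

    def _k(self, family: str, key: str) -> str:
        return f"{family}/{key}"

    def apply_batch(self, writes):
        # Stage everything, then apply in one step: all-or-nothing.
        staged = {self._k(f, k): v for f, k, v in writes}
        self.data.update(staged)

    def scan_family(self, family: str):
        # Prefixed keys for one family form a contiguous sorted range.
        prefix = family + "/"
        return sorted((k, v) for k, v in self.data.items()
                      if k.startswith(prefix))

db = PrefixedKV()
db.apply_batch([
    ("users", "123", "alice"),
    ("posts", "456", "hello"),
    ("likes", "789", "post:456"),
])
print([k for k, _ in db.scan_family("users")])  # ['users/123']
```

What you give up relative to real column families is per-family tuning: one keyspace means one compaction schedule and one set of options for everything.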
When to Choose Pebble:
- Building Go applications
- Need a simpler, more maintainable codebase
- Don't need advanced RocksDB features
- Want better range scan performance
BadgerDB: The Pure Go Alternative
BadgerDB takes a different approach—it's not a fork but a ground-up reimplementation optimized for SSDs:
💡 BadgerDB's Unique Architecture:
- Key-Value Separation: Keys in LSM tree, values in value log
- Pure Go: No C++ dependencies
- Optimized for SSDs: Reduces write amplification dramatically
- MVCC Support: Built-in multi-version concurrency control
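The key-value separation described above (the WiscKey design BadgerDB follows) fits in a short sketch: the index holds keys plus small pointers, while values go to an append-only log, so compaction shuffles pointers instead of megabytes of values:

```python
class ValueLogStore:
    """WiscKey-style key-value separation sketch."""
    def __init__(self):
        self.index = {}     # key -> (offset, length) in the value log
        self.vlog = bytearray()

    def put(self, key: str, value: bytes):
        offset = len(self.vlog)
        self.vlog.extend(value)            # sequential append, SSD-friendly
        self.index[key] = (offset, len(value))

    def get(self, key: str) -> bytes:
        # One index lookup, then one positioned read from the log.
        offset, length = self.index[key]
        return bytes(self.vlog[offset:offset + length])

store = ValueLogStore()
store.put("video:1", b"x" * 4096)          # large value: one log append
store.put("video:2", b"y" * 4096)
print(len(store.get("video:1")), store.index["video:2"][0])  # 4096 4096
```

The trade-off is garbage collection: deleted values leave holes in the log that a background pass must reclaim, which is why this design can use more space than a plain LSM tree.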
Performance Characteristics:
- Writes: Often faster than RocksDB for large values
- Reads: Comparable to RocksDB with proper tuning
- Space: Can use more space due to value log
- Memory: Lower memory usage for large values
When to Choose BadgerDB:
- Pure Go environment (no CGO)
- Large value sizes (>1KB)
- Need built-in MVCC
- SSD-only deployments
TerarkDB: Compression-Optimized Fork
TerarkDB (by ByteDance) focuses on extreme compression and search performance:
TerarkDB Innovations:
- Searchable Compression: Query compressed data directly
- 10x Better Compression: Using proprietary algorithms
- Optimized for Cold Data: Perfect for archival storage
- MySQL Integration: Drop-in MyRocks replacement
Use Cases:
- Log storage and analysis
- Time-series data archival
- Any write-once-read-many workload
Titan: RocksDB with Separated Values
Titan (by PingCAP) adds value separation to RocksDB for better large-value handling:
How Titan Works:
- Small values (<1KB): Stored in RocksDB normally
- Large values: Stored in separate blob files
- Transparent to applications
- Reduces compaction overhead
Benefits:
- 50% reduction in write amplification for large values
- Better performance for mixed workloads
- Compatible with existing RocksDB code
ForestDB: The B+Tree Alternative
While not LSM-based, ForestDB deserves mention as an alternative approach:
ForestDB's Approach:
- HB+Trie: Hybrid B+Tree and Trie structure
- Better for Updates: In-place updates instead of append-only
- Lower Write Amplification: For update-heavy workloads
- Multi-Version Concurrency: Built-in MVCC support
When to Consider ForestDB:
- Update-heavy workloads
- Need consistent read performance
- Can sacrifice some write throughput
Choosing the Right Storage Engine
Here's a decision matrix for choosing between RocksDB and its alternatives:
| Storage Engine | Best For | Avoid If | Key Advantage |
|---|---|---|---|
| RocksDB | General purpose, proven at scale | Need pure Go solution | Ecosystem & maturity |
| Speedb | Ultra-low latency requirements | Need maximum stability | 10x performance |
| Pebble | Go applications, simplicity | Need advanced features | Clean codebase |
| BadgerDB | Large values, pure Go | Small values, HDD storage | SSD optimization |
| TerarkDB | Maximum compression | Frequent updates | 10x compression |
| Titan | Mixed value sizes | All small values | Value separation |
Performance Comparison
Based on YCSB benchmarks with 1KB values:
| Database | Write Performance (ops/sec) | Read Performance (ops/sec) | P99 Latency (ms) |
|---|---|---|---|
| Speedb | 580,000 | 420,000 | 0.8 |
| RocksDB | 400,000 | 350,000 | 1.5 |
| BadgerDB | 350,000 | 300,000 | 2.1 |
| Pebble | 320,000 | 380,000 | 1.2 |
⚠️ Important: These benchmarks are synthetic. Real-world performance depends heavily on your specific workload, hardware, and configuration. Always benchmark with your actual use case.
The Future: Specialized Storage Engines
The proliferation of RocksDB alternatives shows a trend toward specialization:
- Speedb: Pushing performance boundaries
- Pebble: Simplification and maintainability
- BadgerDB: Language-native implementations
- TerarkDB: Domain-specific optimizations
This ecosystem validates RocksDB's fundamental design while showing there's no one-size-fits-all solution. The future likely holds even more specialized variants optimized for specific hardware (persistent memory, computational storage) and workloads (ML feature stores, blockchain, IoT).
The Bottom Line: RocksDB Changed Everything
RocksDB isn't just an incremental improvement over LevelDB—it fundamentally changed how we think about embedded storage. By providing a fast, reliable, feature-rich storage engine that anyone could embed, it enabled a new generation of databases and applications.
💡 Key Takeaways:
- RocksDB is everywhere - From your browser to blockchain to distributed databases
- Performance is unmatched - 10x write throughput with advanced optimizations
- Column families are game-changing - Multiple logical databases with atomic writes
- It's a building block - Not meant for direct use, but as a foundation
- The ecosystem is massive - Bindings for every language, used by major projects
The genius of RocksDB is that it does one thing—embedded key-value storage—and does it better than anything else. It's not trying to be a full database. It's not adding query languages or distributed consensus. It's just relentlessly focused on being the best storage engine possible.
And that focus paid off. Today, whether you're using Chrome, sending a message on Discord, streaming with Kafka, or trading cryptocurrency, you're using RocksDB. It's the invisible revolution that powers much of our modern data infrastructure.
The irony? Most developers will go their entire careers using RocksDB thousands of times a day without ever knowing it exists. It's the ultimate compliment for infrastructure software: so good at its job that nobody notices it's there.