Why Walrus?
Write-ahead logs are fundamental infrastructure. Every database, message queue, and distributed system relies on one. So why build another?
The Problem
Most WAL implementations make one of two choices:
Option 1: Embedded in a larger system
- RocksDB’s WAL
- PostgreSQL’s WAL
- Kafka’s commit log
These are tightly coupled to their parent systems, hard to extract, and carry assumptions that don’t generalize.
Option 2: Heavy distributed frameworks
- Requires complex cluster setup
- Consensus overhead (Raft/Paxos) even for single-node use
- Operational complexity
The gap: A fast, reusable, standalone WAL for building systems—not operating them.
Walrus fills that gap.
Design Principles
1. Single-Node First, Distributed Later
Walrus is production-ready for single-node workloads today. Distributed features (replication, consensus) are in development but don’t compromise the core.
Why? Most systems start on one node. Scaling should be a config change, not a rewrite.
2. Zero Bullshit
- No external dependencies (except rkyv for serialization)
- No async runtime (no Tokio tax)
- No hidden allocations in hot paths
- No “magic”—just memory, syscalls, and atomics
Result: Predictable performance. Every microsecond is accounted for.
3. Performance as a Feature
Walrus achieves 1M ops/sec and 1 GB/s on consumer hardware because:
- Spin locks eliminate syscalls in allocation
- io_uring batches operations to single syscalls (Linux)
- Tail reads provide immediate consistency without write amplification
- Lock-free reads via memory-mapped files
Philosophy: If it’s not measurably faster, it’s not worth the complexity.
4. Fail Gracefully
Corruption happens. Disks fail. Processes crash.
Walrus logs errors, skips corrupted entries, and continues. Availability > correctness for transient failures.
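As an illustration of the skip-and-continue policy, here is a minimal sketch (not Walrus's internal code; the entry framing and checksum are hypothetical stand-ins):

```rust
// Hypothetical framing for illustration: each entry carries a checksum.
struct RawEntry {
    checksum: u32,
    data: Vec<u8>,
}

// Placeholder checksum for the sketch; a real log would use a proper CRC.
fn checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, b| acc.wrapping_mul(31).wrapping_add(*b as u32))
}

// Skip-and-continue: log corrupt entries and keep going instead of halting.
fn recover(raw: Vec<RawEntry>) -> Vec<Vec<u8>> {
    let mut valid = Vec::new();
    for (i, entry) in raw.into_iter().enumerate() {
        if checksum(&entry.data) == entry.checksum {
            valid.push(entry.data);
        } else {
            eprintln!("skipping corrupted entry at index {i}");
        }
    }
    valid
}
```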
What Makes Walrus Different?
Topic Isolation
Most WALs are monolithic (one log for everything). Walrus gives each topic independent chains with isolated checkpointing.
Benefits:
- Delete consumed data per-topic (not global GC)
- Tune durability per-workload (critical topics fsync, logs don’t)
- No head-of-line blocking (slow reader on topic A doesn’t stall topic B)
Example:
```rust
// High-durability for transactions
let wal_txn = Walrus::with_consistency_and_schedule_for_key(
    "transactions",
    ReadConsistency::StrictlyAtOnce,
    FsyncSchedule::SyncEach,
)?;

// Low-latency for metrics
let wal_metrics = Walrus::with_consistency_and_schedule_for_key(
    "metrics",
    ReadConsistency::AtLeastOnce { persist_every: 10_000 },
    FsyncSchedule::NoFsync,
)?;
```
Coordination-Free Deletion
Traditional WALs require consensus to delete files:
- Kafka: broker coordination
- RocksDB: compaction with reference counting
Walrus uses four atomic counters per file:
```rust
struct FileState {
    locked_blocks: u16,        // Writers holding blocks
    checkpointed_blocks: u16,  // Readers finished blocks
    total_blocks: u16,         // Blocks allocated
    is_fully_allocated: bool,  // File full
}
```
Deletion condition:
```rust
is_fully_allocated
    && locked_blocks == 0
    && total_blocks > 0
    && checkpointed_blocks >= total_blocks
```
Result: Files delete themselves when safe, without any coordination protocol.
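To make the check concrete, here is a minimal sketch of the same condition expressed over atomic fields (illustrative only, not Walrus's exact internals; field names mirror the struct above):

```rust
use std::sync::atomic::{AtomicBool, AtomicU16, Ordering};

// Illustrative sketch: the deletion check over atomic counters.
struct AtomicFileState {
    locked_blocks: AtomicU16,
    checkpointed_blocks: AtomicU16,
    total_blocks: AtomicU16,
    is_fully_allocated: AtomicBool,
}

impl AtomicFileState {
    /// Safe to delete once the file can gain no new blocks, no writer still
    /// holds a block, and readers have checkpointed everything allocated.
    fn can_delete(&self) -> bool {
        let total = self.total_blocks.load(Ordering::Acquire);
        self.is_fully_allocated.load(Ordering::Acquire)
            && self.locked_blocks.load(Ordering::Acquire) == 0
            && total > 0
            && self.checkpointed_blocks.load(Ordering::Acquire) >= total
    }
}
```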
Tail Reading
Readers can consume from the active writer’s block without forcing rotation.
Traditional approach:
- Writer fills block (10 MB)
- Writer seals block
- Reader can now read
Problem: Low-throughput topics wait forever.
Walrus approach:
```rust
// Reader snapshots writer state (atomic)
let (block_id, offset) = writer.snapshot();

// Read up to current offset (lock-free)
for entry in read_range(last_offset..offset) {
    process(entry);
}
```
Result: Read latency drops from “when block fills” (seconds to hours) to sub-millisecond.
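From the caller's side, a freshly appended entry on a quiet topic is readable right away. A sketch using the public API shown in the examples below, assuming a `wal` handle is already in scope:

```rust
// Sketch: tail reading from the caller's perspective (assumes `wal` exists).
wal.append_for_topic("audit", b"rare event")?;

// No need to wait for the 10 MB block to fill and seal: the entry we just
// wrote is visible to the very next read.
if let Some(entry) = wal.read_next("audit", true)? {
    assert_eq!(&entry.data[..], b"rare event");
}
```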
Performance Comparison
vs RocksDB WAL
| Metric | RocksDB WAL | Walrus |
|---|---|---|
| Write throughput (8 threads) | ~400k ops/sec | ~1M ops/sec |
| Allocation overhead | malloc per entry | Spin lock (200 ns) |
| Read path | Deserialize + copy | Zero-copy mmap |
| Deletion | Compaction (STW pauses) | Atomic counters (lockless) |
vs Kafka
| Metric | Kafka | Walrus |
|---|---|---|
| Single-node setup | Requires ZooKeeper/KRaft | Single binary |
| Latency (P99) | 5-20 ms | <1 ms |
| Topic isolation | Yes (partitions) | Yes (chains) |
| Batch writes | Yes | Yes (io_uring) |
Note: Kafka is distributed by default; Walrus is single-node (distributed features are in development).
vs PostgreSQL WAL
| Metric | PostgreSQL WAL | Walrus |
|---|---|---|
| Designed for | Single database | Generic log |
| Reusability | Embedded | Standalone |
| Checkpointing | Global checkpoints | Per-topic |
| Performance | Database-optimized | Log-optimized |
Use Cases
1. Event Sourcing
```rust
// Append events as they occur
for event in event_stream {
    wal.append_for_topic("user-events", &serialize(event))?;
}

// Rebuild state from log
let mut state = State::new();
while let Some(entry) = wal.read_next("user-events", true)? {
    state.apply(deserialize(&entry.data)?);
}
```
2. Message Queue
```rust
// Producers append
wal.append_for_topic("jobs", &job_payload)?;

// Consumers read
while let Some(entry) = wal.read_next("jobs", true)? {
    process_job(&entry.data)?;
}
```
3. Database Write-Ahead Log
```rust
// Before mutating in-memory state
wal.append_for_topic("commits", &transaction_log)?;

// Fsync ensures durability
wal.flush()?;

// Apply to in-memory store
btree.apply(transaction);
```
4. Replication Source
```rust
// Leader writes
wal.append_for_topic("replicated", data)?;

// Follower reads
while let Some(entry) = wal.read_next("replicated", true)? {
    send_to_follower(entry)?;
}
```
5. Analytics Ingestion
```rust
// High-throughput append (1M ops/sec)
wal.batch_append_for_topic("events", &batch)?;

// Batch reads for processing
let entries = wal.batch_read_for_topic("events", 10_485_760, true)?; // 10 MB
process_batch(entries);
```
When NOT to Use Walrus
Be honest about the trade-offs:
You Need Distributed Consensus Today
Walrus v0.1.0 is single-node. Replication is in development.
Alternatives: Kafka, etcd, or building on a Raft library like tikv/raft-rs.
You Need Cross-Platform Guarantees
Walrus’s fastest path (io_uring batching) is Linux-only. Works on macOS/Windows but slower.
Alternatives: RocksDB (more portable, lower performance).
You Can’t Afford Any Data Loss
Even with FsyncSchedule::SyncEach, kernel crashes can lose writes.
Requirement: For zero data loss, you need replicated quorum writes.
You Need Built-In Compression
Walrus stores data as-is. Compress before appending.
```rust
let compressed = zstd::encode_all(data.as_slice(), 3)?;
wal.append_for_topic("compressed", &compressed)?;
```
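The read side mirrors this. A sketch of reading back and decompressing, assuming the same `wal` handle and the zstd crate as above:

```rust
// Decompress on the way out (mirrors the compression step above).
if let Some(entry) = wal.read_next("compressed", true)? {
    let original = zstd::decode_all(&entry.data[..])?;
    // `original` holds the bytes that were passed to encode_all.
}
```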
You Want SQL-Like Queries
Walrus is an append-only log. No indexes, no SQL.
Alternatives: Build indexing on top, or use a database.
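If you do build indexing on top, it can be as simple as maintaining a map while replaying a topic. A sketch, where `extract_key` is a hypothetical function that pulls a key out of your own payload format:

```rust
use std::collections::HashMap;

// Sketch: a lookaside index maintained while replaying the log.
// `extract_key` is hypothetical; payloads carry keys in whatever format you chose.
let mut index: HashMap<String, Vec<u8>> = HashMap::new();
while let Some(entry) = wal.read_next("user-events", true)? {
    let key = extract_key(&entry.data);
    index.insert(key, entry.data);
}

// Point lookups now go through the map instead of scanning the log.
let value = index.get("user-42");
```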
The Roadmap
Current (v0.1.0):
- ✅ Single-node WAL with topic isolation
- ✅ Configurable consistency (StrictlyAtOnce, AtLeastOnce)
- ✅ io_uring batching (Linux)
- ✅ Coordination-free deletion
- ✅ Tail reading optimization
In Development:
- 🚧 Raft-based cluster consensus
- 🚧 Quorum writes (leader + followers)
- 🚧 Hierarchical consensus (sub-cluster leases)
- 🚧 Lock-free inter-node queues
Future:
- Compression plugins
- Encryption at rest
- Multi-region replication
- Observability (metrics, tracing)
Getting Started
Ready to try Walrus?
Install:
```toml
[dependencies]
walrus-rust = "0.1.0"
```
Hello World:
```rust
use walrus_rust::Walrus;

let wal = Walrus::new()?;
wal.append_for_topic("events", b"hello walrus")?;

if let Some(entry) = wal.read_next("events", true)? {
    println!("{:?}", String::from_utf8_lossy(&entry.data));
}
```
Learn more:
- Getting Started - installation and basic usage
- Architecture - how it works
- Internals - deep dive into optimizations
- Benchmarks - performance testing
Philosophy: Build Your Own
Walrus isn’t trying to be Kafka or RocksDB. It’s a building block for creating distributed systems.
Our bet: The next generation of databases, queues, and stream processors will be built from reusable, composable primitives—not monolithic frameworks.
Walrus is one such primitive: a WAL you can understand, modify, and build on.
If that resonates with you, check out the code. It’s 1000 lines of Rust you can actually read.