Use case
We’re running Stalwart 0.16.5 with the RocksDB DataStore on a local SSD PVC in Kubernetes. Mail HA + DR is built around being able to back up the DataStore from a sidecar container while the main Stalwart process keeps serving SMTP/IMAP. The intended path is:
# in a CronJob sidecar, same pod as the live Stalwart:
stalwart --config /etc/stalwart/config.json -e /tmp/export-2026-05-13.lz4
This currently fails:
⚠️ Startup failed: Failed to open database:
Error { message: "IO error: While lock file:
/var/lib/stalwart/data/LOCK: Resource temporarily unavailable" }
The live Stalwart holds an exclusive lock on the LOCK file (RocksDB DB::Open() primary mode), so a second process trying to open the same DB as primary fails immediately, as the error above shows. That’s correct RocksDB behaviour, but it means -e is unusable for hot backup of a live instance, which forces operators onto file-level workarounds (restic / tar / borg of the raw data dir) that lose the store-agnostic, app-level-consistent export Stalwart’s -e was designed for.
Proposal
Add a flag — --secondary or --read-only — that opens the DataStore via RocksDB’s DB::OpenAsSecondary() instead of DB::Open():
// rust-rocksdb binding:
DB::open_as_secondary(&opts, primary_path, secondary_path)
RocksDB secondary mode:
- Does NOT acquire the LOCK on the primary’s data dir
- Reads SST files + WAL of the primary
- TryCatchUpWithPrimary(), called explicitly by the application, replays the WAL tail into the secondary’s in-memory state
- Is the upstream-recommended way to do hot reads / backups against a live primary
This is exactly the case -e is missing today.
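To make the catch-up semantics concrete, here is a toy model in plain Rust. It is not RocksDB: `ToyPrimary`, `ToySecondary`, and `try_catch_up` are hypothetical names invented for illustration, and the "WAL" is just an in-memory vector. A real secondary also loads existing state when it opens, which this model skips. It only shows that a secondary's view advances when catch-up is called, not on its own.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a primary's write-ahead log: an append-only list of puts.
#[derive(Default)]
pub struct ToyPrimary {
    wal: Vec<(String, String)>,
}

impl ToyPrimary {
    /// Appends a put to the log, as a live primary would on every write.
    pub fn put(&mut self, key: &str, value: &str) {
        self.wal.push((key.to_string(), value.to_string()));
    }
}

/// Toy secondary: its own in-memory view plus how far into the log it has replayed.
#[derive(Default)]
pub struct ToySecondary {
    view: BTreeMap<String, String>,
    replayed: usize,
}

impl ToySecondary {
    /// Replays log entries written since the last call; returns how many were applied.
    pub fn try_catch_up(&mut self, primary: &ToyPrimary) -> usize {
        let tail = &primary.wal[self.replayed..];
        for (key, value) in tail {
            self.view.insert(key.clone(), value.clone());
        }
        self.replayed = primary.wal.len();
        tail.len()
    }

    /// Reads from the secondary's own view, which may lag behind the primary.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.view.get(key).map(String::as_str)
    }
}
```

In this model a write to the primary stays invisible to the secondary until `try_catch_up` runs. An export taken through a secondary would likewise reflect the state as of the last catch-up.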
What would be needed
- New flag on the export command (e.g. --secondary).
- When set, the DataStore opener picks the secondary path:
  - For RocksDB: OpenAsSecondary(primary_path, secondary_path) where secondary_path is a writable tmpdir for the secondary’s own metadata.
  - For other stores (FoundationDB, PG, etc.): no-op or use the store’s read-only equivalent if any.
- After open, optionally call TryCatchUpWithPrimary() so the export reflects state as of the call rather than the (possibly stale) WAL position.
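The steps above could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not Stalwart’s actual code: `StoreKind`, `OpenPlan`, `has_secondary_flag`, and `plan_open` are hypothetical names, and the rust-rocksdb calls appear only in comments so the block compiles with the standard library alone.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical backend selector; Stalwart's real configuration types differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StoreKind {
    RocksDb,
    FoundationDb,
    Postgres,
}

/// How the export command should open the DataStore (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OpenPlan {
    /// Open the store exactly as today. For RocksDB this takes the LOCK file.
    Default,
    /// Open RocksDB as a secondary instance. With rust-rocksdb this would be
    /// `DB::open_as_secondary(&opts, primary_dir, scratch_dir)`, followed by
    /// `db.try_catch_up_with_primary()` when `catch_up` is set.
    Secondary {
        primary_dir: PathBuf,
        scratch_dir: PathBuf,
        catch_up: bool,
    },
}

/// Returns true if the proposed `--secondary` flag is present in the arguments.
pub fn has_secondary_flag<I, S>(args: I) -> bool
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    args.into_iter().any(|arg| arg.as_ref() == "--secondary")
}

/// Chooses the open mode for an export run.
pub fn plan_open(
    kind: StoreKind,
    secondary: bool,
    data_dir: &Path,
    scratch_dir: &Path,
) -> OpenPlan {
    match (kind, secondary) {
        (StoreKind::RocksDb, true) => OpenPlan::Secondary {
            primary_dir: data_dir.to_path_buf(),
            scratch_dir: scratch_dir.to_path_buf(),
            // Assumption: always catch up once so the export is as fresh as possible.
            catch_up: true,
        },
        // Assumption: other stores treat the flag as a no-op, and RocksDB
        // without the flag keeps today's behaviour.
        _ => OpenPlan::Default,
    }
}
```

Keeping the decision in one function means the flag changes nothing for existing callers: without `--secondary`, every store opens as it does today.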
Happy to put up a PR if you’d accept this — wanted to file the design first to check alignment.
Workarounds we’ve considered
- RocksDB Checkpoint sidecar: call rocksdb::Checkpoint::CreateCheckpoint("/tmp/cp") from a small Rust/C++ tool, then run stalwart -e against an alt-config pointing at /tmp/cp. Works without upstream change, but requires shipping the rocksdb-checkpoint binary in our sidecar image plus maintaining a duplicate config.
- Stop Stalwart briefly: not viable for mail (breaks SMTP/IMAP).
- File-level restic backup of the raw data dir: what we run today. It works but loses app-level consistency (relies on RocksDB WAL replay on restore) and isn’t store-agnostic (tied to the on-disk RocksDB format, breaks if we ever switch DataStore).
Native -e --secondary would be much cleaner than any of these.
Thanks for considering!