I noticed that these events keep appearing when I browse to ‘Directory’ → ‘Accounts’ in the WebUI. On this page I also get a red banner stating: Invalid response code 500 Internal Server Error
As a result, I’m also unable to see any account configuration there.
As far as I can tell, all mail services still work. I can still receive and send email.
I’m asking for help on how to troubleshoot this kind of database error. I have no idea where to start.
I am running Stalwart v0.15.5 on Debian 13.
It is quite a simple setup: 1 server configured for 1 domain, with only a handful of accounts and aliases.
Thanks for any guidance on how to troubleshoot this.
I understand that topics in this category are triaged by a bot first but a human reply will follow up. If I’d prefer a human-only reply, I’ll add the no-ai tag to my topic.
The "Corruption: Merge operator failed" error means RocksDB’s internal merge operation on counter/quota column families is returning corrupt data. This is a RocksDB-level database corruption, not a Stalwart configuration issue. Here is how to approach it:
Immediate steps:
Stop Stalwart before doing anything else. The docs explicitly warn that accessing the raw data store while the server is running risks further data inconsistency.
Back up your data directory first. For RocksDB, the entire data directory (default /var/lib/stalwart/) is your database. Copy it somewhere safe before attempting any repair:
cp -a /var/lib/stalwart/ /var/lib/stalwart-backup/
Try RocksDB’s own repair tool. The ldb tool (part of the rocksdb-tools or librocksdb-dev package on Debian) can attempt a repair:
ldb repair --db=/var/lib/stalwart/
This is RocksDB’s built-in recovery mechanism; it discards unrecoverable data but tries to salvage the rest.
If repair does not work, migrate or restore. If you have a recent backup, restore from it. Alternatively, you can export what is still readable using the Stalwart export/import migration utility (server must be stopped), then re-import into a fresh store. See the migration docs.
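The backup step above can be sketched as a small shell helper. This is only a sketch: `backup_data_dir` is a hypothetical function name, not a Stalwart tool, and the systemd unit name and destination path in the commented usage lines are assumptions to adjust for your install. Unlike a plain `cp -a`, it writes to a fresh timestamped directory and refuses to overwrite an earlier backup.

```shell
#!/bin/sh
# Hypothetical helper (not part of Stalwart): copy a data directory
# into a new timestamped directory under a backup root.
backup_data_dir() {
    src=${1%/}          # strip one trailing slash, if present
    dest_root=$2
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    dest="$dest_root/$(basename "$src")-$(date +%Y%m%d-%H%M%S)"
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing backup: $dest" >&2
        return 1
    fi
    mkdir -p "$dest_root" || return 1
    # -a preserves ownership, permissions and timestamps
    cp -a "$src" "$dest" || return 1
    # print the backup location so the caller can record it
    echo "$dest"
}

# Example usage (run as root; the unit name "stalwart" is an assumption):
#   systemctl stop stalwart
#   backup_data_dir /var/lib/stalwart /var/backups
```

Running `ldb repair` only after this copy exists means a failed repair can always be rolled back to the pre-repair state.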
What caused it? The merge operator failure typically happens when a value in one of the counter column families (SUBSPACE_COUNTER, SUBSPACE_QUOTA, or SUBSPACE_IN_MEMORY_COUNTER) cannot be deserialized as a valid i64. This can happen due to an unclean shutdown, disk I/O error, or filesystem corruption. Check dmesg and your disk health (smartctl -a /dev/sdX) to rule out underlying hardware issues.
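The hardware checks mentioned above could look roughly like this. `scan_kernel_log` is a hypothetical helper, and the list of patterns is an assumption: it covers a few common kernel storage error messages and is not exhaustive. The function reads log text on standard input, so it works with `dmesg` output as well as saved logs.

```shell
#!/bin/sh
# Hypothetical helper: print kernel log lines that hint at storage
# trouble. Exit status is non-zero when nothing matches.
scan_kernel_log() {
    grep -iE 'I/O error|EXT4-fs error|XFS.*(corrupt|error)|medium error|blk_update_request'
}

# Example usage (run as root; replace /dev/sdX with the real device):
#   dmesg | scan_kernel_log || echo "no storage errors in kernel log"
#   smartctl -H /dev/sdX     # overall SMART health verdict
#   smartctl -a /dev/sdX     # full SMART attributes and error log
```

A clean result here does not prove the disk is healthy, but any hit is a strong reason to check the hardware before trusting a repaired or restored database.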
Since your mail flow is still working, the corruption appears limited to the directory/account metadata layer rather than the message store itself.
This is an automated reply from the Stalwart Help Bot. Other community members may follow up if this answer is incomplete or wrong.
The ldb repair --db=/var/lib/stalwart/ command did something, but it did not resolve the issue.
I don’t have an export of the database. However, I do have a backup of the whole data folder, including all .sst files.
I assume replacing the .sst files will result in data loss from the moment of the backup until now. As this is my personal mail server and not mission-critical, I am willing to accept some data loss, as there isn’t much data yet. But the question arises: is this possible, and is it the only way to go in my situation?
I decided to replace all the .sst files with the ones from my backup, which means I went 1 day back in time. It worked, with some small data loss. It is what it is. I am still curious, though, about how this corruption happened.
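For anyone repeating this rollback, here is a rough sketch of one way to do it. It is not the exact procedure used above: it swaps in the whole backed-up data directory instead of individual .sst files, on the assumption that keeping RocksDB's metadata files and .sst files from the same point in time is the safer combination. `restore_data_dir` is a hypothetical helper, and the paths and unit name in the usage comments are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: move the current (corrupted) data directory
# aside, then copy the backup into its place.
restore_data_dir() {
    live=${1%/}
    backup=${2%/}
    if [ ! -d "$backup" ]; then
        echo "backup not found: $backup" >&2
        return 1
    fi
    if [ -e "$live" ]; then
        # keep the corrupted copy for later inspection
        aside="$live.corrupt-$(date +%Y%m%d-%H%M%S)"
        mv "$live" "$aside" || return 1
        echo "kept corrupted copy at $aside" >&2
    fi
    # -a preserves ownership, permissions and timestamps
    cp -a "$backup" "$live"
}

# Example usage (run as root; unit name and backup path are assumptions):
#   systemctl stop stalwart
#   restore_data_dir /var/lib/stalwart /path/to/backup/stalwart
#   systemctl start stalwart
```

Keeping the corrupted directory around also leaves something to inspect later when trying to work out how the corruption happened.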