Hi,
Stalwart 0.16.8 is installed in a VM with RocksDB as the storage backend.
I see a lot of warnings like this:
2026-06-12T10:19:55Z WARN Training sample not found (spam.train-sample-not-found) reason = "Blob not found", accountId = 204, blobId = "5acfa5a6757e964ea08c50da01b30511ad9f7287a33b5e4cc580c65b6656d715"
Only two accountId values appear, but every blobId is unique (no blobId repeats).
I couldn’t find out what accountId or blobId represent, so I can’t look for more information. Even at the “Trace” log level I don’t see any more detail about this.
Any help is appreciated.
Thanks,
Andrei
Here is what is happening and what those fields mean:
The warning means the spam classifier training task has a reference to a training sample (an email blob), but the underlying blob is no longer present in the blob store. This happens when the blob was deleted (e.g. the email was expunged) between the time the sample was queued and the time training ran. It is a non-fatal condition; the classifier simply skips that sample.
Field meanings:
accountId - The numeric internal ID of the account (Account object) whose interaction (e.g. marking a message as spam/ham) originally generated the training sample. The two IDs you see likely correspond to your two mail-user accounts.
blobId - The BLAKE3 content-addressed hash of the raw email message that was supposed to be used as the training sample. Each logged entry is unique because each is a different message that was queued for training but is now missing from the blob store.
Why does this happen repeatedly? Each training cycle picks up newly queued samples. If users regularly mark messages as spam/ham but those messages are later deleted before the training interval fires, the blobs disappear and the warning appears. The blob reference in the SpamTrainingSample record outlives the actual stored message.
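To check how many distinct samples are affected per account, the warnings can be tallied from the log. The following is only a rough Python sketch: it assumes the field layout of the single log line quoted in this thread, and the `summarize` helper, the second account number (205), and the short blobIds in the sample data are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the warning quoted in this thread. The layout is assumed from that
# one sample line and may differ between Stalwart versions or log formats.
# Both straight and typographic quotes around the blobId are accepted.
LINE_RE = re.compile(
    r'spam\.train-sample-not-found.*?'
    r'accountId\s*=\s*(\d+).*?'
    r'blobId\s*=\s*["“]?([0-9a-f]+)["”]?'
)

def summarize(lines):
    """Return {accountId: set of blobIds} for every matching warning line."""
    by_account = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            by_account[int(match.group(1))].add(match.group(2))
    return dict(by_account)

# First line is the one from this thread; the rest are invented test data.
sample = [
    '2026-06-12T10:19:55Z WARN Training sample not found '
    '(spam.train-sample-not-found) reason = “Blob not found”, accountId = 204, '
    'blobId = “5acfa5a6757e964ea08c50da01b30511ad9f7287a33b5e4cc580c65b6656d715”',
    '2026-06-12T10:20:01Z WARN Training sample not found '
    '(spam.train-sample-not-found) reason = "Blob not found", accountId = 204, '
    'blobId = "deadbeef"',
    '2026-06-12T10:20:03Z WARN Training sample not found '
    '(spam.train-sample-not-found) reason = "Blob not found", accountId = 205, '
    'blobId = "0badc0de"',
    '2026-06-12T10:20:05Z INFO Some unrelated event',
]

for account_id, blob_ids in sorted(summarize(sample).items()):
    print(f"accountId {account_id}: {len(blob_ids)} distinct missing blob(s)")
```

In practice you would pass an open log file to `summarize` instead of the `sample` list; a file object iterates line by line, so nothing else needs to change.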
What you can do:
- You can inspect (and clean up) dangling SpamTrainingSample records via the WebUI under Account > Spam Samples, or via the JMAP API using x:SpamTrainingSample/query and x:SpamTrainingSample/set (destroy). See the SpamTrainingSample reference for the API details.
- If the warnings bother you but you do not want to act on them, you can suppress them by setting the spam.train-sample-not-found event to a lower log level in the WebUI under Settings > Telemetry.
These warnings do not indicate data corruption; they are informational notices that orphaned training references were skipped.
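For the JMAP route mentioned above, the request bodies might look roughly like the sketch below. It only builds the JSON and sends nothing. The envelope (`using`, `methodCalls`) and the `accountId`, `ids` and `destroy` arguments follow the generic JMAP conventions from RFC 8620; whether x:SpamTrainingSample accepts exactly these arguments, and which capability URN Stalwart expects in `using`, are assumptions to check against the SpamTrainingSample reference. The helper names and the placeholder values are hypothetical.

```python
import json

def build_query_request(account_id, using):
    """Build a JMAP request body that lists training sample ids (sketch)."""
    return {
        "using": list(using),
        "methodCalls": [
            # Generic JMAP /query call; any filter arguments are omitted
            # because this thread does not document them.
            ["x:SpamTrainingSample/query", {"accountId": account_id}, "q0"],
        ],
    }

def build_destroy_request(account_id, sample_ids, using):
    """Build a JMAP request body that destroys the given sample ids (sketch)."""
    return {
        "using": list(using),
        "methodCalls": [
            # Only explicitly listed ids are destroyed, so valid samples are
            # not removed by accident.
            ["x:SpamTrainingSample/set",
             {"accountId": account_id, "destroy": list(sample_ids)},
             "s0"],
        ],
    }

# Placeholder values: substitute the real JMAP account id, and add whatever
# extension capability the server advertises in its JMAP session object.
capabilities = ["urn:ietf:params:jmap:core"]
query_body = build_query_request("<account-id>", capabilities)
destroy_body = build_destroy_request("<account-id>", ["<sample-id>"], capabilities)

print(json.dumps(query_body, indent=2))
print(json.dumps(destroy_body, indent=2))
```

Keeping the query and the destroy as separate requests is deliberate: the query returns every sample, not just the dangling ones, so the ids should be reviewed before anything is destroyed.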
This is an automated reply from the Stalwart Help Bot. Other community members may follow up if this answer is incomplete or wrong.
That one is harmless. During a classifier training run, samples reference a message by its content hash, and if the message was deleted, moved or expired between being queued for training and the training pass actually running, the blob is gone and Stalwart logs “training sample not found” and skips it.
It’s a documented warn-level event, not an error. The accountId is the mailbox the sample came from and the blobId is the missing message.
Seeing it on a couple of accounts with distinct blobIds is the normal shape: auto-learn enqueued some messages that were then removed. Nothing breaks; the trainer just skips them. If the volume bothers you, review the auto-learn settings so it isn’t learning from messages that get deleted right after.
I noticed that the log also uses this accountId in base 10.
How can it be linked to the mailbox?
In stalwart-cli the mailbox id is shown in what looks like base 16, and it is not the same number.
They’re the same number in different encodings, which is why they don’t look alike. The accountId in that spam log line is Stalwart’s internal account number printed as a plain base10 integer; the CLI and JMAP show account and mailbox ids base32-encoded, so the strings differ even though the value is the same.
To find the account, look it up by that internal number in the account list (web UI) or via stalwart-cli. Note it’s an account, not a specific mailbox, and the blobId is the content hash of the missing message, not a folder. As before, the warning is harmless: it just means a message queued for training was removed before the trainer ran.
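To illustrate the encoding point, here is a small Python sketch showing how one integer produces different-looking strings depending on the base and alphabet. The `encode` and `decode` helpers are written for this example, and the base32 alphabet used is the standard RFC 4648 one, which is an assumption: this thread does not say which alphabet Stalwart uses, so the strings it prints will not necessarily match what the CLI shows.

```python
import string

HEX = "0123456789abcdef"
# Standard RFC 4648 base32 alphabet, used only as an example alphabet.
RFC4648_BASE32 = string.ascii_uppercase + "234567"

def encode(number, alphabet):
    """Write a non-negative integer as positional digits from `alphabet`."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))

def decode(text, alphabet):
    """Inverse of encode: turn the digit string back into an integer."""
    base = len(alphabet)
    value = 0
    for char in text:
        value = value * base + alphabet.index(char)
    return value

account_id = 204  # the accountId from the log line in this thread
print(encode(account_id, HEX))             # cc
print(encode(account_id, RFC4648_BASE32))  # GM
print(decode("GM", RFC4648_BASE32))        # 204
```

Once the real alphabet is known, decoding the id shown by the CLI with it should give back the base 10 number from the log, which is the quickest way to confirm that two ids refer to the same account.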