50% of ham classified as spam

Issue Description

I just installed Stalwart and 50% of my emails are classified as spam and end up in the spam folder, not the inbox. 90% of the mail in my spam folder is not spam; only 10% is actually spam.

That is a disastrously bad ratio of false positives.

Expected Behavior

All emails are considered ham unless the spam filter has good reason to believe that they are spam.

Training happens with use.

Actual Behavior

The spam filter seems to be completely random. 50/50 is the scientific definition of randomness.

Reproduction Steps

  1. Forward mail from MX (Postfix) to Stalwart
  2. Set up IP address of MX server in Stalwart. Settings → MTA → Inbound → SPF verification and MAIL FROM (both the same), add condition:
    IF remote_ip == '1.2.3.4' THEN relaxed
  3. Get mail

Stalwart Version

v0.16.x

Installation Method

Docker

Database Backend

RocksDB

Blob Storage

RocksDB

Search Engine

Internal

Directory Backend

Internal

Additional Context

I haven’t found a setting to tell Stalwart the MX host, but just in case I did not correctly configure the MX host:
a) There should be a clear way in the admin UI to configure the MX server
b) Even if misconfigured, it shouldn’t be considering 50% of my mail as spam.

Bottom line:
Be conservative before classifying something as spam. Spam is only mail where you have > 99% confidence that it is in fact spam. That’s clearly not the case here for me.

If you have an MX or a host that does not validate SPF/DKIM, you can create a rule in the spam filter (remote_ip == …), assign a tag, for instance IP_GOOD, and then set a score of -500 for when IP_GOOD is present (under scores in the WebUI). This sets the score so low that all emails from this host are considered ham. However, if you want Stalwart to filter spam, you can fine-tune the rules; we would need to see the spam headers of your email to help you.

The spam classifier needs by default at least 100 ham and 100 spam samples to start working. Until then you rely on the DNSBLs, rules and other heuristics. Manually classifying ham/spam by moving messages to the right folder counts as a training sample.

The spam classifier needs by default at least 100 ham and 100 spam samples to start working

That’s fine and normal.

However, until it’s trained reliably, it should not classify emails as spam. Rather, it should presume that everything is ham, and mark something as spam only once it’s confident. As you said, in the training phase it cannot be confident, so it should not mark anything as spam until it’s trained.

In other words: false positives are much worse than false negatives. Not being trained means you need to accept false negatives, but not false positives.
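The policy I’m asking for can be sketched in a few lines. This is only an illustration of the proposal, not how Stalwart works (per the reply above, rules and DNSBLs determine the score until the classifier is trained). The names `ClassifierState`, `is_trained` and `verdict` are made up for this sketch; the 100-sample minimum comes from the reply above and the 0.99 threshold from my “> 99% confidence” suggestion.

```python
from dataclasses import dataclass

MIN_SAMPLES = 100       # per class, as stated in the reply above
SPAM_CONFIDENCE = 0.99  # hypothetical threshold, taken from the "> 99%" suggestion


@dataclass
class ClassifierState:
    """Hypothetical record of how many samples the classifier has seen."""
    ham_samples: int
    spam_samples: int


def is_trained(state: ClassifierState) -> bool:
    """True once both classes have reached the minimum sample count."""
    return state.ham_samples >= MIN_SAMPLES and state.spam_samples >= MIN_SAMPLES


def verdict(state: ClassifierState, spam_probability: float) -> str:
    """Deliver to the inbox unless the classifier is trained AND very confident."""
    if not is_trained(state):
        return "ham"  # untrained: accept false negatives, never false positives
    return "spam" if spam_probability > SPAM_CONFIDENCE else "ham"


print(verdict(ClassifierState(10, 10), 0.999))    # untrained classifier
print(verdict(ClassifierState(400, 500), 0.999))  # trained and confident
print(verdict(ClassifierState(400, 500), 0.90))   # trained but not confident
```

The point of the sketch is the asymmetry: every branch that involves doubt ends in "ham".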

I came here to ask for help on this as well. I’ve written custom scripts and have trained with a corpus of ~400 ham and ~500 spam, have added many domains to Trusted Domains, always unflag false positives by moving them to the inbox, and still >50% of my inbound mail gets flagged as spam (pretty much universally seeing PROB_SPAM_HIGH (8.00) in my X-Spam-Result headers). Am I missing something obvious here?

The statistical spam classifier is not activated until the 100/100 samples are collected. Until then the score is determined by DNSBL, rules and other tools, check the spam status headers for details.

Set the log level to trace and check the spam classifier training logs. Make sure that you don’t have unbalanced classes. If you still have the samples, you can trigger a model reset from Tasks and then check your logs once training is complete.

Here’s what I see for an email sent by Amazon (“We delivered your package” mails):

X-Spam-Result: DMARC_POLICY_ALLOW (-0.50),
	DKIM_ALLOW (-0.20),
	ARC_NA (0.00),
	DBL_BLOCKED_OPENRESOLVER (0.00),
	DKIM_SIGNED (0.00),
	DMARC_POLICY_ALLOW_WITH_FAILURES (0.00),
	DNSWL_BLOCKED (0.00),
	DWL_DNSWL_BLOCKED (0.00),
	FROM_HAS_DN (0.00),
	FROM_NEQ_ENV_FROM (0.00),
	HAS_EXTERNAL_IMG (0.00),
	HAS_LINK_TO_IMG (0.00),
	HTML_SHORT_2 (0.00),
	PREVIOUSLY_DELIVERED (0.00),
	RBL_SENDERSCORE_REPUT_BLOCKED (0.00),
	RBL_SPAMHAUS_BLOCKED_OPENRESOLVER (0.00),
	RCPT_COUNT_ONE (0.00),
	RCVD_COUNT_ONE (0.00),
	RCVD_TLS_ALL (0.00),
	RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER (0.00),
	SOURCE_ASN_24940 (0.00),
	TO_DN_NONE (0.00),
	URIBL_BLOCKED (0.00),
	FORGED_SENDER (0.30),
	URI_COUNT_ODD (0.50),
	PARTS_DIFFER (1.00),
	SPF_FAIL (1.00),
	FORGED_RECIPIENTS (2.00),
	SPAM_FLAG (5.00)
X-Spam-Score: spam, score=9.10

Same thing for an email sent by Apple:

X-Spam-Result: DMARC_POLICY_ALLOW (-0.50),
	DKIM_ALLOW (-0.20),
	ARC_NA (0.00),
	DBL_BLOCKED_OPENRESOLVER (0.00),
	DKIM_SIGNED (0.00),
	DMARC_POLICY_ALLOW_WITH_FAILURES (0.00),
	DNSWL_BLOCKED (0.00),
	DWL_DNSWL_BLOCKED (0.00),
	FROM_EQ_ENV_FROM (0.00),
	FROM_HAS_DN (0.00),
	HAS_EXTERNAL_IMG (0.00),
	HTML_SHORT_1 (0.00),
	MID_RHS_MATCH_ENV_FROM (0.00),
	PREVIOUSLY_DELIVERED (0.00),
	RBL_SENDERSCORE_REPUT_BLOCKED (0.00),
	RBL_SPAMHAUS_BLOCKED_OPENRESOLVER (0.00),
	RCPT_COUNT_ONE (0.00),
	RCVD_COUNT_ONE (0.00),
	RCVD_TLS_ALL (0.00),
	RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER (0.00),
	SOURCE_ASN_24940 (0.00),
	SPF_SOFTFAIL (0.00),
	TO_DN_NONE (0.00),
	URIBL_BLOCKED (0.00),
	CTE_CASE (0.50),
	MID_RHS_MATCH_FROM (1.00),
	FORGED_RECIPIENTS (2.00),
	MIME_MA_MISSING_TEXT (2.00),
	SPAM_FLAG (5.00)

It seems the reason is `SPAM_FLAG` with score 5. What is this? (Given that all rules are about flagging spam, the rule name is not helpful.)

The `FORGED_SENDER` result suggests that my Stalwart doesn’t recognize the Postfix MX in front of it as a trusted inbound relay, but I haven’t found a way to configure that. However, that rule only scores 0.3 and is not the reason for the problems.

`FORGED_RECIPIENTS` (score 2.0) could be the same cause. My Postfix forwards mail from my primary mail address to the email address configured in Stalwart. Not sure how I’m supposed to configure this.

`URI_COUNT_ODD` is 0.5, even though that is a legitimate Amazon mail.

Lastly, there are lots of positive signals, but they are not counted positively into the spam score.
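To see at a glance which rules actually move the score, the header can be parsed with a short script. This is a minimal sketch that assumes the `RULE_NAME (score)` layout shown in the headers above; the function names are made up, and the sample value is abbreviated to the non-zero entries of the Amazon mail plus one zero-score entry.

```python
import re

# Matches entries such as "SPF_FAIL (1.00)" or "DKIM_ALLOW (-0.20)".
ENTRY = re.compile(r"([A-Z0-9_]+)\s*\((-?\d+(?:\.\d+)?)\)")


def parse_spam_result(header_value: str) -> list[tuple[str, float]]:
    """Return (rule, score) pairs found in an X-Spam-Result header value."""
    return [(name, float(score)) for name, score in ENTRY.findall(header_value)]


def contributing_rules(header_value: str) -> list[tuple[str, float]]:
    """Rules with a non-zero score, largest contribution first."""
    rules = [(n, s) for n, s in parse_spam_result(header_value) if s != 0.0]
    return sorted(rules, key=lambda rule: rule[1], reverse=True)


# Abbreviated from the Amazon example above.
amazon = """DMARC_POLICY_ALLOW (-0.50), DKIM_ALLOW (-0.20), ARC_NA (0.00),
 FORGED_SENDER (0.30), URI_COUNT_ODD (0.50), PARTS_DIFFER (1.00),
 SPF_FAIL (1.00), FORGED_RECIPIENTS (2.00), SPAM_FLAG (5.00)"""

rules = contributing_rules(amazon)
total = round(sum(score for _, score in rules), 2)

for name, score in rules:
    print(f"{name:20} {score:+.2f}")
print("total:", total)
```

For the Amazon mail the non-zero entries add up to 9.10, which matches the `X-Spam-Score` line, and `SPAM_FLAG` alone accounts for more than half of it.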

Mauro, thanks for your answer. My previous post was additional data, because you asked. Arguably, you need that information to understand what’s going on. Thanks for looking into this.

My point was a different one, mostly about the training phase: the spam filter should either be reliable (98+% accuracy), or suspend itself and not filter until it is sufficiently trained.

If in doubt, it should deliver to the inbox. Right now it behaves as if it were statistically random and delivers half of the mail to the spam folder. That should not happen, even on a fresh server or with a fresh user.

The spam classification is always deterministic (unless there is a DNSBL network timeout which affects the score); there is no random component.
In your case, something is adding spam result headers to your emails which Stalwart (and also RSpamd) consider an attempt to influence the spam classification outcome and assign a high score to it. Either remove this header or set the SPAM_FLAG score to 0.0.
Also, in general, make sure you are not sending outbound emails with X-Spam-Result or X-Spam-Score headers as you risk your emails being classified as spam by Stalwart, RSpamd and possibly SpamAssassin too.
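As an illustration of the "remove this header" option, here is a minimal Python sketch that drops spam-verdict headers from a raw message before it is relayed. It assumes that every header whose name starts with `X-Spam-` should go (the thread only names `X-Spam-Result` and `X-Spam-Score`), and `strip_spam_headers` is a made-up helper name; where to hook this into your relay is left open.

```python
from email import message_from_string


def strip_spam_headers(raw_message: str) -> str:
    """Return the message with all X-Spam-* headers removed.

    Assumption: any header starting with "X-Spam-" is an upstream
    verdict header and is safe to drop before relaying.
    """
    msg = message_from_string(raw_message)
    # Collect the names first so we do not delete while iterating.
    spam_headers = {name for name in msg.keys() if name.lower().startswith("x-spam-")}
    for name in spam_headers:
        del msg[name]  # removes every occurrence of that header
    return msg.as_string()


raw = (
    "From: sender@example.com\n"
    "X-Spam-Flag: YES\n"
    "X-Spam-Score: 9.10\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
cleaned = strip_spam_headers(raw)
print(cleaned)
```

The alternative mentioned above, setting the SPAM_FLAG score to 0.0, avoids touching the message at all.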

something is adding spam result headers to your emails which Stalwart (and also RSpamd) consider an attempt to influence the spam classification outcome and assign a high score to it.

Interesting. Thank you! That would explain it. That “something” is the Postfix inbound MX server that I mentioned in the “Reproduction Steps”.

I have now disabled the `SPAM_FLAG` rule in Settings | Spam | Rules. I could not find the other rules like `FORGED_RECIPIENTS`.

What would be good is a single setting in Stalwart that tells it that it is not the inbound MX server for the domain, but that there is another MX server in front of it. I had actually been looking for a single setting in “MTA → Inbound” where I can set the IP address of that server.

It would also be prudent to ask during the onboarding setup dialogs whether Stalwart alone is responsible for the domain, or whether there is an inbound MX, an outbound MX, or another mail store with mailboxes of the same domain running in parallel with it. There are already some individual settings that help with this, but they are very hard to find.

I would think that almost all larger deployments start in such a mixed setup, in some form or other, so it’s a good question to a) ask during setup and b) have central toggles for in the server settings, which I can easily change later without having to hunt down countless individual settings across all modules. I understand that the individual settings are important for custom cases, but what I mentioned are standard cases that have an impact in many places, as we can see here.

Is there such a setting in Stalwart already, or should I file an enhancement suggestion?

I reset the Bayes filter and trained with ~400 ham and ~500 spam. PROB_SPAM_HIGH and PROB_SPAM_MEDIUM are still getting set on non-spam email without fail. The only way I seem to be able to avoid false positives is to set both of those spam scores to 0 in the config. Is that to be expected after this level of training? I’m not sure what’s leading those to get set during analysis.