Tokio-rt-worker divide by zero panics

Issue Description

I discovered this by accident while testing Vandelay: changing the units of the Stalwart HTTP rate limit (Network > Services > HTTP > Security > Rate Limit > Authenticated) to Milliseconds causes Tokio “Divide by Zero” panics. The failure appears to be completely unrecoverable, as none of the WebUI, CLI, or recovery mode is usable once it occurs.
Restoring or recreating the data store is the only solution I found.

The error message is:
thread 'tokio-rt-worker' (8) panicked at crates/store/src/dispatch/lookup.rs:304:27:
attempt to divide by zero

Expected Behavior

There should be no divide-by-zero panics, and the WebUI, CLI, and recovery mode should work.

Actual Behavior

thread 'tokio-rt-worker' (8) panicked at crates/store/src/dispatch/lookup.rs:304:27:
attempt to divide by zero

Relevant Log Output

  1. From a running Stalwart configuration (please back up before you try this), navigate to Network > Services > HTTP > Security > Rate Limit > Authenticated, change the units to Milliseconds, and save.
  2. Restart Stalwart. The WebUI and CLI will no longer be available, and the panics will be written to the (Docker) log.

Stalwart Version

v0.16.x

Installation Method

Docker

Database Backend

PostgreSQL

Blob Storage

S3-compatible

Search Engine

Meilisearch

Directory Backend

Internal

Let me look at the relevant source code to understand this bug.

The panic is on line 304: let range_start = now / rate.period.as_secs();. When the rate limit period is set to milliseconds (e.g. “1 millisecond”), rate.period.as_secs() returns 0 (since it’s less than one second), causing an integer divide-by-zero panic. This crash then renders the HTTP listener unresponsive, making the WebUI, CLI, and recovery mode all inaccessible.

The panic is a confirmed bug. Here is what is happening:

Root cause: Line 304 of crates/store/src/dispatch/lookup.rs computes the rate-limit time bucket with:

let range_start = now / rate.period.as_secs();

When the rate-limit period is shorter than one second (for example, 1 millisecond), Duration::as_secs() truncates it to 0, and dividing by zero panics the Tokio worker thread. Because the HTTP API itself goes through the rate-limit path, every subsequent request also panics, making the WebUI, CLI, and recovery mode all unreachable.
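The truncation described above can be reproduced with the standard library alone. The following is a minimal sketch, not Stalwart code: the helper names divisor_secs and would_divide_by_zero are hypothetical and only mirror the shape of the failing expression.

```rust
use std::time::Duration;

/// Hypothetical helper: the whole-second divisor that an expression such as
/// `now / rate.period.as_secs()` would use for a given period.
fn divisor_secs(period: Duration) -> u64 {
    // `Duration::as_secs` returns whole seconds only and drops the
    // fractional part, so any period below one second becomes 0.
    period.as_secs()
}

/// Hypothetical helper: true when an integer division by this period's
/// whole-second value would panic with "attempt to divide by zero".
fn would_divide_by_zero(period: Duration) -> bool {
    divisor_secs(period) == 0
}
```

Under these assumptions, a period of 1 ms or 999 ms yields a divisor of 0, while 1500 ms yields 1, so only periods below one full second hit the panic.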

Immediate workaround: As you found, restoring or recreating the data store is the only confirmed recovery path once the bad value is persisted. Before doing that again, you could try editing the stored setting directly (e.g. in PostgreSQL) to reset the rate-limit period to a sane value (seconds, not milliseconds); this has not been verified here.

To avoid it in the future: Do not set the HTTP Authenticated Rate Limit period unit to Milliseconds. Use Seconds (or larger). Any sub-second period will produce the same divide-by-zero crash.

This is a genuine bug – the code should guard against a zero-valued period (e.g. with a checked_div or by validating at configuration-save time that the period is at least 1 second). The issue is clearly reproducible and worth flagging; a human maintainer will follow up on this thread.
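As an illustration of the two guards suggested above, here is a minimal sketch using only the Rust standard library. It is not the actual Stalwart fix: the function names validate_period and range_start, the String error type, and the choice to reject sub-second periods outright are all assumptions made for this example.

```rust
use std::time::Duration;

/// Hypothetical save-time check: reject any period whose whole-second
/// value is zero, so a bad setting is never persisted.
fn validate_period(period: Duration) -> Result<Duration, String> {
    if period.as_secs() == 0 {
        Err(format!(
            "rate-limit period must be at least 1 second, got {:?}",
            period
        ))
    } else {
        Ok(period)
    }
}

/// Hypothetical lookup-time guard: `u64::checked_div` returns `None`
/// for a zero divisor instead of panicking the worker thread.
/// `now` stands in for a timestamp in whole seconds.
fn range_start(now: u64, period: Duration) -> Option<u64> {
    now.checked_div(period.as_secs())
}
```

With both guards in place, a sub-second period would be refused when the setting is saved, and a bad value that is already stored would surface as a None that the caller can handle rather than as a panic.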

This is an automated reply from the Stalwart Help Bot. Other community members may follow up if this answer is incomplete or wrong.

This has been fixed on the main branch. The fix will be included in v0.16.8.