Stalwart unable to recover from transient OIDC errors during startup

Issue Description

Context: Sometimes when restarting my server, Stalwart comes up before the OIDC provider does.

During startup, if Stalwart cannot reach <issuer>/.well-known/openid-configuration, it falls back to the internal OAuth directory and makes no further attempts to fetch the external directory’s configuration. This leaves all users unable to log in, except via App Password.

Expected Behavior

If the external OIDC provider is unreachable at startup, Stalwart should preferably retry whenever a new login attempt is made or, failing that, retry periodically.
This is the behaviour of all other OIDC client applications I’m using.
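
To illustrate the retry-on-login behaviour I have in mind, here is a minimal sketch in Python. It is not Stalwart's code or API: the `LazyDiscovery` class, `fetch_discovery` helper, and `min_retry_interval` parameter are hypothetical names I made up for the example. The idea is to cache the discovery document once it has been fetched, and otherwise retry on the next login attempt, rate-limited so a burst of logins does not hammer the provider.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_discovery(issuer: str, timeout: float = 5.0) -> dict:
    """Fetch <issuer>/.well-known/openid-configuration and parse it as JSON."""
    url = issuer.rstrip("/") + "/.well-known/openid-configuration"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


class LazyDiscovery:
    """Caches a discovery document and retries failed fetches on demand.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(
        self,
        fetch: Callable[[], dict],
        min_retry_interval: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._min_retry_interval = min_retry_interval
        self._clock = clock
        self._config: Optional[dict] = None
        self._last_attempt: Optional[float] = None

    def get(self) -> Optional[dict]:
        """Return the cached document, retrying the fetch if it is still missing."""
        if self._config is not None:
            return self._config
        now = self._clock()
        # Skip the retry if the previous attempt was too recent.
        if (
            self._last_attempt is not None
            and now - self._last_attempt < self._min_retry_interval
        ):
            return None
        self._last_attempt = now
        try:
            self._config = self._fetch()
        except (OSError, ValueError):
            # Network failure or invalid JSON: stay unconfigured and
            # let the next eligible call try again.
            return None
        return self._config
```

A login handler would call something like `LazyDiscovery(lambda: fetch_discovery("https://auth.example.com")).get()` on each attempt and reject the login with a clear "provider unreachable" error while it returns `None`.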

Actual Behavior

If the OIDC directory is unreachable:

  • The errors below are logged during startup.
  • Stalwart behaves as though the default OAuth directory is configured.
  • No attempts are made to reconnect to the OIDC provider.
  • Login errors do not mention the earlier failure.
  • The Web UI shows a generic username/password form (which none of the configured users can use, since they have no password).

Reproduction Steps

  1. Configure an external OIDC directory with an issuer that doesn’t work, like https://auth.example.com
  2. Set that directory as the default
  3. Restart Stalwart
  4. See the logs below
  5. Observe that no retries are made

Relevant Log Output

During startup:

ERROR Configuration build error (registry.build-error) source = "Directory", id = <unsure if this should be anonymised>, reason = "Network error: Discovery fetch failed: error sending request for url (https://<issuer>/.well-known/openid-configuration)"
ERROR Configuration build error (registry.build-error) source = "Authentication", id = <unsure if this should be anonymised>, reason = "Default directory with ID xyzxyzxyz not found"

During login attempt with OIDC ID token:

ERROR Authentication error (auth.error) listenerId = "imaps", localPort = 993, remoteIp = 10.0.0.1, remotePort = 39702, reason = "Failed to decode token. If you are using an external OIDC provider, make sure it is configured as the default directory under the Authentication object.", causedBy = "crates/common/src/auth/oauth/token.rs:125", details = "<valid access token from oidc provider>", remoteIp = 10.0.0.1, id = "A0001"

Stalwart Version

v0.16.x

Installation Method

Docker

Database Backend

RocksDB

Blob Storage

RocksDB

Search Engine

Internal

Directory Backend

OIDC

Unfortunately, this is a won’t-fix. Stalwart needs to be able to access the OIDC discovery document during startup. Lazy loading is possible, but it would slow things down, and Stalwart needs to validate all settings before enabling a store or directory. When restarting the server, make sure that Stalwart starts only after the OIDC provider and any external databases are available.
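
One way to enforce that ordering in a container setup is to block startup until the discovery document responds. The following is only a sketch under assumptions, not something shipped with Stalwart: the `probe` and `wait_for_discovery` names, the attempt count, and the delay are all hypothetical and would need adjusting to the deployment.

```python
import time
import urllib.request
from typing import Callable


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any network or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False


def wait_for_discovery(
    issuer: str,
    attempts: int = 30,
    delay: float = 2.0,
    probe_fn: Callable[[str], bool] = probe,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the issuer's discovery document until it is reachable or attempts run out."""
    url = issuer.rstrip("/") + "/.well-known/openid-configuration"
    for attempt in range(attempts):
        if probe_fn(url):
            return True
        # Pause between attempts, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A wrapper entrypoint could call `wait_for_discovery("https://auth.example.com")` and start Stalwart only when it returns `True`, exiting non-zero otherwise so the container runtime restarts it.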