Migration script for 0.16 drops umlaut domains

Issue Description

The migration script drops umlaut domains. Umlauts are not covered by the label regex in resources/scripts/migrate_v016.py (stalwartlabs/stalwart on GitHub, main branch).

Or was support for umlaut domains dropped in 0.16?

Expected Behavior

Don’t drop the Umlaut domains.

Actual Behavior

dropping domain ‘Ümlaut.de’: not a valid v0.16 hostname (must have at least two labels, valid characters)

Stalwart Version

v0.16.x

Installation Method

Docker

Database Backend

RocksDB

Blob Storage

RocksDB

Search Engine

Internal

Directory Backend

Internal

I have reviewed the documentation and FAQ and confirm that my issue is NOT addressed there.

on

I have searched this support forum (open and closed topics) and confirm this is not a duplicate.

on

I understand that topics in this category are triaged by a bot first but a human reply will follow up. If I’d prefer a human-only reply, I’ll add the no-ai tag to my topic.

on

Let me look into the migration script and Stalwart’s handling of internationalized domain names.

This is a confirmed bug in the migration script. The issue is clearly visible in the source. The _LABEL_RE regex on line 419 is:

_LABEL_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")

This only allows lowercase ASCII letters, digits, and hyphens, so any internationalized domain name (IDN) containing non-ASCII characters (such as the umlauts ü, ä, ö) fails the is_valid_domain_name() check and is dropped with the warning you saw.
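To make the failure concrete, here is a minimal sketch that applies the same regex to each label of a name. The helper all_labels_match() is hypothetical and only approximates what is_valid_domain_name() might do; the real function in the script may contain further checks.

```python
import re

# Same pattern as _LABEL_RE quoted above.
_LABEL_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def all_labels_match(domain: str) -> bool:
    """Hypothetical stand-in: at least two labels, each matching _LABEL_RE."""
    labels = domain.lower().split(".")
    return len(labels) >= 2 and all(
        _LABEL_RE.match(label) is not None for label in labels
    )


for name in ("example.org", "näh-podcast.de", "xn--nh-podcast-q5a.de"):
    print(name, all_labels_match(name))
# example.org True
# näh-podcast.de False
# xn--nh-podcast-q5a.de True
```

The Unicode form is rejected because "ä" is outside the ASCII character class, while the Punycode form of the same name passes.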

The root cause is that internationalized domain names like Ümlaut.de need to be either:

  1. Stored in their Unicode (U-label) form and validated with a Unicode-aware regex, or
  2. Converted to Punycode (ACE/A-label) form first (e.g. xn--mlaut-jva.de) before being passed to the validator.

Workaround until the script is fixed: Before running the migration, manually convert your umlaut/IDN domain names to their Punycode equivalents in your v0.15 data (e.g. rename Ümlaut.de to xn--mlaut-jva.de). You can use Python’s built-in encoder: "ümlaut.de".encode("idna").decode("ascii") gives xn--mlaut-jva.de. After migration, you can rename them back in v0.16’s admin UI if Unicode display names are supported.
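As a rough sketch of the second option (convert to the A-label form, then validate), the following uses Python’s built-in idna codec. The function names to_ascii() and is_valid_after_idna() are made up for illustration and are not taken from migrate_v016.py; the actual fix in the script may look different.

```python
import re
from typing import Optional

# Same pattern as _LABEL_RE quoted above.
_LABEL_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def to_ascii(domain: str) -> Optional[str]:
    """Return the Punycode (A-label) form, or None if encoding fails."""
    try:
        return domain.strip().lower().encode("idna").decode("ascii")
    except UnicodeError:
        # Raised for empty or over-long labels and other invalid input.
        return None


def is_valid_after_idna(domain: str) -> bool:
    """Hypothetical validator: encode first, then apply the ASCII regex."""
    ascii_domain = to_ascii(domain)
    if not ascii_domain:
        return False
    labels = ascii_domain.split(".")
    return len(labels) >= 2 and all(
        _LABEL_RE.match(label) is not None for label in labels
    )


print(to_ascii("näh-podcast.de"))             # xn--nh-podcast-q5a.de
print(is_valid_after_idna("näh-podcast.de"))  # True
```

Note that the built-in codec follows the older IDNA 2003 rules; for plain umlaut domains this should not matter, but it is an assumption worth checking for less common scripts.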

This is a script bug, not a change in Stalwart’s domain support policy. A human maintainer will need to patch _LABEL_RE (or add Punycode conversion before validation) in resources/scripts/migrate_v016.py.

This is an automated reply from the Stalwart Help Bot. Other community members may follow up if this answer is incomplete or wrong. If you found it useful, mention @helpbot in any reply to ask follow-up questions and the bot will respond again.

To add to this: it is only about parsing the index label, so:

```
"acme.näh-podcast.de.domains.0": "mail.xn--nh-podcast-q5a.de",
```

is still correctly handled in my view. The encoding of the domain is right; it is only about the index label.

This has been fixed in the latest migration script.

Nice, it works now. Thanks for the fast fix.