Recommended idempotent pattern for apply plans given non-nullable singleton references?

I’m trying to write stalwart-cli apply plan(s) that I can run on every NixOS rebuild while converging to the same state independent of existing data. I keep hitting a chain of constraints I can’t see a way through, and I’d like to know what the intended pattern is.

The constraints, as I understand them:

  1. SystemSettings.defaultDomainId is a non-nullable Id. Patching it to null is rejected with invalidPatch | Invalid path for Id, expected a string value. A clean install defaults it to #domain-p333333333333, but it doesn’t seem possible to restore that value with the CLI.
  2. A Domain referenced by SystemSettings.defaultDomainId cannot be destroyed (objectIsLinked).
  3. The filter on destroy operations appears to support only AND operators (unsupportedFilter: Only AND is supported in filters). This rules out both workarounds I tried: destroying every Domain except the canonical one, and creating a temporary bootstrap.invalid domain to fill defaultDomainId in the meantime.
  4. create is not idempotent: running the same create Domain twice fails on the second run.

What’s the intended pattern here?

If helpful, here is what I have attempted:

stalwart-cli apply --stdin <<'EOF'
{"@type": "create", "object": "Domain", "value": {"bootstrap-invalid": {"name": "bootstrap.invalid"}}}
{"@type": "update", "object": "SystemSettings", "value": {"defaultDomainId": "#bootstrap-invalid"}}
EOF

stalwart-cli apply --stdin <<'EOF'
{"@type": "destroy", "object": "Domain", "value": {"operator": "NOT", "conditions": [{"name": "bootstrap.invalid"}]}}
{"@type": "destroy", "object": "Account"}
{"@type": "create", "object": "Domain", "value": {"example": {"name": "example.com"}}}
{"@type": "update", "object": "SystemSettings", "value": {"defaultDomainId": "#example"}}
EOF

stalwart-cli apply --stdin <<'EOF'
{"@type": "destroy", "object": "Domain", "value": {"name": "bootstrap.invalid"}}
EOF

stalwart-cli apply --stdin <<'EOF'
{"@type": "update", "object": "SystemSettings", "value": {"defaultHostname": "mail.example.com"}}
EOF

NB I was also having issues with the .invalid TLD, though this message seemed to imply it should now be accepted.

All this testing was done with:

  • Stalwart 0.16.5
  • stalwart-cli 1.0.6

The docs actually contain the exact pattern needed. Looking at the annotated example in the apply.md docs, there’s a key insight that directly addresses the defaultDomainId bootstrap problem: you can update SystemSettings.defaultDomainId using a #ref to a domain you’re creating in the same plan, which means you never need the NOT-filter workaround at all.

Here is the intended pattern:

// Destroy pass (written parents-first, reversed at runtime to children-first)
{"@type":"destroy","object":"Domain","value":{"name":"example.com"}}

// Create / update pass (parents-first)
{"@type":"create","object":"Domain","value":{"dom-primary":{"name":"example.com"}}}
{"@type":"update","object":"SystemSettings","value":{"defaultDomainId":"#dom-primary","defaultHostname":"mail.example.com"}}

The trick that resolves your constraint chain:

  • Constraint 4 (create not idempotent): Pair every create with a leading destroy filtered to the same domain name. On the first run the destroy is a no-op (nothing to delete); on subsequent runs it tears down the old instance before recreating it.
  • Constraints 1 & 2 (defaultDomainId non-nullable / domain can’t be destroyed while referenced): You do not need to null out defaultDomainId first. Point it at the new domain via #dom-primary in the same plan’s update pass. The destroy pass runs first (removing the old example.com if it exists), then the create pass recreates it and simultaneously updates SystemSettings to point at the new id. There is no window where defaultDomainId is null.
  • Constraint 3 (NOT filter unsupported): Avoid needing it entirely by using per-domain named filters ({"name":"example.com"}) rather than “destroy everything except X”.

So the bootstrap.invalid workaround is unnecessary. The reason it was failing is that the NOT operator isn’t supported in destroy filters, as you found. The intended design is: destroy the specific domains you own by name, recreate them, and update SystemSettings in the same pass using a #ref.

For a clean NixOS rebuild plan the shape would be:

{"@type":"destroy","object":"Domain","value":{"name":"example.com"}}
{"@type":"create","object":"Domain","value":{"dom-a":{"name":"example.com"}}}
{"@type":"update","object":"SystemSettings","value":{"defaultDomainId":"#dom-a","defaultHostname":"mail.example.com"}}

The docs also note that the stalwart-cli snapshot command emits plans in exactly this idempotent destroy-then-recreate shape, so running a snapshot on a working server is a useful way to get a reference plan structure.

Reference: Declarative bulk operations (apply.md) - “Idempotent re-runs” and “Annotated example” sections.

This is an automated reply from the Stalwart Help Bot. Other community members may follow up if this answer is incomplete or wrong. If you found it useful, mention @helpbot in any reply to ask follow-up questions and the bot will respond again.

@helpbot I did in fact see that part of the docs, however see the following output which demonstrates why it doesn’t work:

The first invocation is fine:

❯ stalwart-cli apply --stdin <<'EOF'
{"@type":"destroy","object":"Domain","value":{"name":"example.com"}}
{"@type":"create","object":"Domain","value":{"dom-a":{"name":"example.com"}}}
{"@type":"update","object":"SystemSettings","value":{"defaultDomainId":"#dom-a","defaultHostname":"mail.example.com"}}
EOF
Plan: 1 destroy, 1 update, 1 create (1 objects)
✓ destroyed Domain (0)
✓ created Domain (1)
✓ updated SystemSettings (1)
Done: 0 destroyed, 1 updated, 1 created (0 failed)

However the second is not:

❯ stalwart-cli apply --stdin <<'EOF'
{"@type":"destroy","object":"Domain","value":{"name":"example.com"}}
{"@type":"create","object":"Domain","value":{"dom-a":{"name":"example.com"}}}
{"@type":"update","object":"SystemSettings","value":{"defaultDomainId":"#dom-a","defaultHostname":"mail.example.com"}}
EOF
Plan: 1 destroy, 1 update, 1 create (1 objects)
✗ destroy Domain: Domain: destroy failed for id b: error: objectIsLinked |   Object id:   Domain#b |   Linked by:   SystemSettings#singleton
Done: 0 destroyed, 0 updated, 0 created (1 failed)
error: Domain: destroy failed for id b: error: objectIsLinked |   Object id:   Domain#b |   Linked by:   SystemSettings#singleton

When a domain is created with DKIM management enabled, a DKIM key is created automatically shortly after that. To make your plan idempotent you need to either delete the associated DKIM keys before deleting the domain or create the domain with automated DKIM management disabled.

Thanks for the response, but I think this is a different issue. The linker named in my error is SystemSettings#singleton, not a DKIM key:

✗ destroy Domain: Domain: destroy failed for id b: error: objectIsLinked |   Object id:   Domain#b |   Linked by:   SystemSettings#singleton

The problem is that SystemSettings.defaultDomainId was set to #dom-a by the update in the first run’s pass 2. On the second run, that reference is still live when pass 1 (destroy) starts, so the Domain destroy fails, and pass 2, which would repoint defaultDomainId to the newly created Domain, never gets a chance to run because the plan halts on the first error.
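
To make the ordering problem concrete, here is a toy Python model of what I believe is happening. It is not stalwart-cli code: the class, the method names and the generated ids are all made up for illustration, and the only behaviour it encodes is what the logs above show (destroys run first, a linked Domain cannot be destroyed, and the plan halts on the first error).

```python
# Toy model of the two-pass apply behaviour observed in this thread.
# Everything here is hypothetical; it only mirrors the logged behaviour.

class ObjectIsLinked(Exception):
    """Raised when a Domain is still referenced by SystemSettings."""


class ToyServer:
    def __init__(self):
        self.domains = {}                          # id -> domain name
        self.default_domain_id = "p333333333333"   # placeholder seen on a clean install
        self._counter = 0

    def destroy_domain_by_name(self, name):
        """Destroy every Domain with this name; refuse if one is linked."""
        destroyed = 0
        for dom_id, dom_name in list(self.domains.items()):
            if dom_name != name:
                continue
            if self.default_domain_id == dom_id:
                raise ObjectIsLinked(
                    f"Domain#{dom_id} linked by SystemSettings#singleton"
                )
            del self.domains[dom_id]
            destroyed += 1
        return destroyed

    def create_domain(self, name):
        dom_id = f"d{self._counter}"
        self._counter += 1
        self.domains[dom_id] = name
        return dom_id


def apply_plan(server, name):
    # Pass 1: destroys. An exception here aborts the whole plan.
    server.destroy_domain_by_name(name)
    # Pass 2: creates and updates; "#dom-a" would resolve to new_id.
    new_id = server.create_domain(name)
    server.default_domain_id = new_id
    return new_id


server = ToyServer()
first = apply_plan(server, "example.com")      # succeeds, nothing to destroy
try:
    apply_plan(server, "example.com")
    second = "ok"
except ObjectIsLinked as exc:
    second = f"failed: {exc}"                  # pass 2 is never reached

print(first)
print(second)
```

In this model the second call fails in pass 1 for the same reason as the real run: the only step that could release the reference sits in pass 2, behind the destroy that needs it released.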


As a sanity check I ran:

❯ sudo rm -rf /var/lib/stalwart/db
❯ sudo systemctl restart stalwart

Initial snapshot:

❯ STALWART_URL="http://localhost:8080" STALWART_USER='recovery-admin' STALWART_PASSWORD="12345678" stalwart-cli snapshot Directory DnsServer AcmeProvider Role Tenant Domain Certificate SystemSettings Authentication
snapshot: 20 creates, 2 singletons
  fetching Certificate...
    0 fetched
    Certificate: 0
  fetching Tenant...
    0 fetched
    Tenant: 0
  fetching Role...
    0 fetched
    Role: 0
  fetching AcmeProvider...
    0 fetched
    AcmeProvider: 0
  fetching DnsServer...
    0 fetched
    DnsServer (GoogleCloudDns): 0
    DnsServer (Route53): 0
    DnsServer (Spaceship): 0
    DnsServer (Dnsimple): 0
    DnsServer (Porkbun): 0
    DnsServer (Bunny): 0
    DnsServer (Ovh): 0
    DnsServer (DeSEC): 0
    DnsServer (DigitalOcean): 0
    DnsServer (Cloudflare): 0
    DnsServer (Sig0): 0
    DnsServer (Tsig): 0
  fetching Directory...
    0 fetched
    Directory (Oidc): 0
    Directory (Sql): 0
    Directory (Ldap): 0
  fetching Domain...
    0 fetched
    Domain: 0
  fetching singleton SystemSettings...
  fetching singleton Authentication...
{"@type":"destroy","object":"Certificate"}
{"@type":"destroy","object":"Tenant"}
{"@type":"destroy","object":"Role"}
{"@type":"destroy","object":"AcmeProvider"}
{"@type":"destroy","object":"DnsServer"}
{"@type":"destroy","object":"Directory"}
{"@type":"destroy","object":"Domain"}
{"@type":"update","object":"SystemSettings","value":{"defaultCertificateId":null,"defaultHostname":"","threadPoolSize":null,"proxyTrustedNetworks":{},"services":{"caldav":{"hostname":null,"cleartext":false},"carddav":{"hostname":null,"cleartext":false},"imap":{"hostname":null,"cleartext":false},"jmap":{"hostname":null,"cleartext":false},"managesieve":{"hostname":null,"cleartext":false},"pop3":{"hostname":null,"cleartext":false},"smtp":{"hostname":null,"cleartext":false},"webdav":{"hostname":null,"cleartext":false}},"mailExchangers":{"0":{"hostname":null,"priority":10}},"maxConnections":8192,"defaultDomainId":"#domain-p333333333333","providerInfo":{}}}
{"@type":"update","object":"Authentication","value":{"maxApiKeys":5,"defaultTenantRoleIds":{},"directoryId":null,"passwordMinLength":8,"passwordMaxLength":128,"passwordDefaultExpiry":null,"defaultUserRoleIds":{},"defaultGroupRoleIds":{},"passwordHashAlgorithm":"argon2id","maxAppPasswords":5,"passwordMinStrength":"three","defaultAdminRoleIds":{}}}

I then tried creating a domain with manual DKIM management:

❯ STALWART_URL="http://localhost:8080" STALWART_USER='recovery-admin' STALWART_PASSWORD="12345678" stalwart-cli apply \
  --stdin <<'EOF'
{"@type":"destroy","object":"Domain","value":{"name":"example.com"}}
{"@type":"create","object":"Domain","value":{"dom-a":{"name":"example.com","dkimManagement":{"@type":"Manual"}}}}
{"@type":"update","object":"SystemSettings","value":{"defaultDomainId":"#dom-a","defaultHostname":"mail.example.com"}}
EOF
Plan: 1 destroy, 1 update, 1 create (1 objects)
✓ destroyed Domain (0)
✓ created Domain (1)
✓ updated SystemSettings (1)
Done: 0 destroyed, 1 updated, 1 created (0 failed)

Snapshot after first attempt:

❯ STALWART_URL="http://localhost:8080" STALWART_USER='recovery-admin' STALWART_PASSWORD="12345678" stalwart-cli snapshot Domain SystemSettings Tenant AcmeProvider DnsServer Directory Certificate Role
snapshot: 20 creates, 1 singletons
  fetching Certificate...
    0 fetched
    Certificate: 0
  fetching Tenant...
    0 fetched
    Tenant: 0
  fetching Role...
    0 fetched
    Role: 0
  fetching Directory...
    0 fetched
    Directory (Oidc): 0
    Directory (Sql): 0
    Directory (Ldap): 0
  fetching DnsServer...
    0 fetched
    DnsServer (GoogleCloudDns): 0
    DnsServer (Route53): 0
    DnsServer (Spaceship): 0
    DnsServer (Dnsimple): 0
    DnsServer (Porkbun): 0
    DnsServer (Bunny): 0
    DnsServer (Ovh): 0
    DnsServer (DeSEC): 0
    DnsServer (DigitalOcean): 0
    DnsServer (Cloudflare): 0
    DnsServer (Sig0): 0
    DnsServer (Tsig): 0
  fetching AcmeProvider...
    0 fetched
    AcmeProvider: 0
  fetching Domain...
    1 fetched
    Domain: 1
  fetching singleton SystemSettings...
{"@type":"destroy","object":"Certificate"}
{"@type":"destroy","object":"Tenant"}
{"@type":"destroy","object":"Role"}
{"@type":"destroy","object":"Directory"}
{"@type":"destroy","object":"DnsServer"}
{"@type":"destroy","object":"AcmeProvider"}
{"@type":"destroy","object":"Domain"}
{"@type":"create","object":"Domain","value":{"domain-b":{"isEnabled":true,"dkimManagement":{"@type":"Manual"},"certificateManagement":{"@type":"Manual"},"directoryId":null,"allowRelaying":false,"description":null,"aliases":{},"dnsManagement":{"@type":"Manual"},"memberTenantId":null,"name":"example.com","reportAddressUri":"mailto:postmaster","subAddressing":{"@type":"Enabled"},"catchAllAddress":null,"logo":null}}}
{"@type":"update","object":"SystemSettings","value":{"threadPoolSize":null,"proxyTrustedNetworks":{},"providerInfo":{},"mailExchangers":{"0":{"priority":10,"hostname":null}},"defaultDomainId":"#domain-b","defaultHostname":"mail.example.com","maxConnections":8192,"defaultCertificateId":null,"services":{"caldav":{"cleartext":false,"hostname":null},"carddav":{"cleartext":false,"hostname":null},"imap":{"cleartext":false,"hostname":null},"jmap":{"cleartext":false,"hostname":null},"managesieve":{"cleartext":false,"hostname":null},"pop3":{"cleartext":false,"hostname":null},"smtp":{"cleartext":false,"hostname":null},"webdav":{"cleartext":false,"hostname":null}}}}
snapshot complete

The idempotency failure shows up when running the same command again:

❯ STALWART_URL="http://localhost:8080" STALWART_USER='recovery-admin' STALWART_PASSWORD="12345678" stalwart-cli apply \
  --stdin <<'EOF'
{"@type":"destroy","object":"Domain","value":{"name":"example.com"}}
{"@type":"create","object":"Domain","value":{"dom-a":{"name":"example.com","dkimManagement":{"@type":"Manual"}}}}
{"@type":"update","object":"SystemSettings","value":{"defaultDomainId":"#dom-a","defaultHostname":"mail.example.com"}}
EOF
Plan: 1 destroy, 1 update, 1 create (1 objects)
✗ destroy Domain: Domain: destroy failed for id b: error: objectIsLinked |   Object id:   Domain#b |   Linked by:   SystemSettings#singleton
Done: 0 destroyed, 0 updated, 0 created (1 failed)
error: Domain: destroy failed for id b: error: objectIsLinked |   Object id:   Domain#b |   Linked by:   SystemSettings#singleton

Even trying to replay the output of the snapshot commands fails:

❯ STALWART_URL="http://localhost:8080" STALWART_USER='recovery-admin' STALWART_PASSWORD="12345678" stalwart-cli apply \
  --stdin <<'EOF'
{"@type":"destroy","object":"Certificate"}
{"@type":"destroy","object":"Tenant"}
{"@type":"destroy","object":"Role"}
{"@type":"destroy","object":"Directory"}
{"@type":"destroy","object":"DnsServer"}
{"@type":"destroy","object":"AcmeProvider"}
{"@type":"destroy","object":"Domain"}
{"@type":"create","object":"Domain","value":{"domain-b":{"isEnabled":true,"dkimManagement":{"@type":"Manual"},"certificateManagement":{"@type":"Manual"},"directoryId":null,"allowRelaying":false,"description":null,"aliases":{},"dnsManagement":{"@type":"Manual"},"memberTenantId":null,"name":"example.com","reportAddressUri":"mailto:postmaster","subAddressing":{"@type":"Enabled"},"catchAllAddress":null,"logo":null}}}
{"@type":"update","object":"SystemSettings","value":{"threadPoolSize":null,"proxyTrustedNetworks":{},"providerInfo":{},"mailExchangers":{"0":{"priority":10,"hostname":null}},"defaultDomainId":"#domain-b","defaultHostname":"mail.example.com","maxConnections":8192,"defaultCertificateId":null,"services":{"caldav":{"cleartext":false,"hostname":null},"carddav":{"cleartext":false,"hostname":null},"imap":{"cleartext":false,"hostname":null},"jmap":{"cleartext":false,"hostname":null},"managesieve":{"cleartext":false,"hostname":null},"pop3":{"cleartext":false,"hostname":null},"smtp":{"cleartext":false,"hostname":null},"webdav":{"cleartext":false,"hostname":null}}}}
EOF
Plan: 7 destroy, 1 update, 1 create (1 objects)
✗ destroy Domain: Domain: destroy failed for id b: error: objectIsLinked |   Object id:   Domain#b |   Linked by:   SystemSettings#singleton
Done: 0 destroyed, 0 updated, 0 created (1 failed)
error: Domain: destroy failed for id b: error: objectIsLinked |   Object id:   Domain#b |   Linked by:   SystemSettings#singleton

Hi @stalwart, I am facing the same issue with idempotency. In 0.15 one could easily apply the same config any number of times. With the new configuration paradigm, one has to go through a lot of contortions to reach idempotency. Would it be possible to look into this in a future release, as it is preventing migration?

This change would require a schema change (making the default domain field nullable) which we want to avoid if possible. Have you tried changing your plan so the new domain is created and then the default domain property is updated with the new domain?
Also, in general, how often do you need to apply the same schema? Ideally the default domain should be created during the initial deployment only.
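
For repeated runs, one possible reading of this suggestion is a create-if-missing wrapper: take a snapshot, and emit the create plus the defaultDomainId update only when the domain is absent. The Python sketch below is an assumption-heavy illustration, not a documented workflow. The function names are hypothetical, and it assumes the snapshot output contains the NDJSON lines shown earlier in the thread, mixed with progress lines that do not start with "{". It never destroys anything, and it does not repoint defaultDomainId when the domain already exists, so it only covers the simple case.

```python
import json


def existing_domain_names(snapshot_text):
    """Collect Domain names from the create lines of a snapshot."""
    names = set()
    for line in snapshot_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip progress lines such as "fetching Domain..."
        op = json.loads(line)
        if op.get("@type") == "create" and op.get("object") == "Domain":
            for value in op.get("value", {}).values():
                names.add(value.get("name"))
    return names


def build_plan(snapshot_text, domain, hostname):
    """Return NDJSON plan lines that skip the create when the domain exists."""
    settings = {"defaultHostname": hostname}
    plan = []
    if domain not in existing_domain_names(snapshot_text):
        plan.append({
            "@type": "create",
            "object": "Domain",
            "value": {"dom-a": {"name": domain}},
        })
        # Only reference the new object in the run that creates it.
        settings["defaultDomainId"] = "#dom-a"
    plan.append({"@type": "update", "object": "SystemSettings", "value": settings})
    return [json.dumps(op, separators=(",", ":")) for op in plan]


fresh = '  fetching Domain...\n{"@type":"destroy","object":"Domain"}\n'
populated = fresh + (
    '{"@type":"create","object":"Domain",'
    '"value":{"domain-b":{"name":"example.com"}}}\n'
)

for line in build_plan(fresh, "example.com", "mail.example.com"):
    print(line)
for line in build_plan(populated, "example.com", "mail.example.com"):
    print(line)
```

The resulting lines could then be piped to stalwart-cli apply --stdin. Because the plan contains no destroy, the objectIsLinked error cannot occur, but stale domains are not cleaned up either.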