Clustering - Web interface access via AWS NLB - Works at first, then resets all connections

@helpbot Having the NLB answer with TLS/ACM appears to work. However, the security group has to be open to the entire Internet to allow connections to the nodes on 8080.

This might not be in your wheelhouse, but can you suggest a way for the NLB to accept the connection and pass it to the Stalwart nodes without leaving the security group wide open, so that the nodes cannot be reached directly by bypassing the NLB with a direct URL?

My guess would be this is only resolvable by using iptables on the nodes themselves.

This is an AWS networking question rather than a Stalwart configuration question, but it has a clean answer:

Use the NLB’s security group (or, if the NLB has no security group, its private IP addresses) as the source in your EC2 instance security group rule, rather than 0.0.0.0/0. This restricts port 8080 to traffic that arrives through the NLB.

Specifically:

  1. NLB security group as source (if your NLB has a security group; one can be associated only when the NLB is created): Add an inbound rule on your EC2 instance security group allowing TCP port 8080 from the NLB’s security group ID. Traffic that does not pass through the NLB then cannot reach the instances on that port.

  2. If your NLB does not have a security group (it was created without one): Restrict port 8080 inbound to the NLB’s private IP addresses (the ENI IPs in each AZ, visible in EC2 > Network Interfaces filtered by the NLB) and use those IPs as the source CIDRs. This only works when client IP preservation is disabled on the target group, because otherwise packets arrive with the original client address as their source. You already know these IPs because they’re the ones you’re adding to Stalwart’s trusted proxy networks.

  3. For the IMAP/SMTP/other ports: These can similarly be locked to the NLB source rather than the internet directly.

The iptables approach also works but the security group approach is simpler to manage in AWS and doesn’t require OS-level configuration on each node.
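As a rough illustration of option 2, the sketch below turns a list of NLB private IPs into the `IpPermissions` structure that boto3's `authorize_security_group_ingress` accepts. The helper name `nlb_ip_ingress` and the example addresses are hypothetical; nothing here comes from the Stalwart docs.

```python
import ipaddress


def nlb_ip_ingress(nlb_private_ips, port):
    """Build an IpPermissions list allowing `port` only from the NLB's ENI IPs.

    Hypothetical helper: the result is meant to be passed as
    IpPermissions=... to boto3's ec2.authorize_security_group_ingress.
    """
    ranges = []
    for ip in nlb_private_ips:
        addr = ipaddress.ip_address(ip)
        # Guard against accidentally pasting a public address into the rule.
        if addr.version != 4 or not addr.is_private:
            raise ValueError(f"{ip} is not a private IPv4 address")
        ranges.append({"CidrIp": f"{addr}/32", "Description": "NLB ENI"})
    return [{
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": ranges,
    }]


# Example addresses are placeholders, one ENI per availability zone.
permissions = nlb_ip_ingress(["10.0.1.15", "10.0.2.27"], 8080)
```

The ENI addresses can change if the NLB is recreated or its subnets change, so a rule built this way needs to be kept in sync with the load balancer.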

This is squarely an AWS security group / VPC design topic – the Stalwart docs don’t cover it. A community member or AWS-experienced human may be able to give more specific guidance for your exact VPC setup.

This is an automated reply from the Stalwart Help Bot. Other community members may follow up if this answer is incomplete or wrong. If you found it useful, mention @helpbot in any reply to ask follow-up questions and the bot will respond again.

@helpbot My NLB uses the same security group as the nodes, which would explain the undesired behaviour. I will split them so that I can use the NLB’s SG as the source in the nodes’ SG.

Correct — splitting the NLB and the EC2 instances into separate security groups is the right move. Once they have distinct security groups, you can:

  1. Create a dedicated security group for the NLB (e.g. sg-nlb), open inbound 443 (and any other public-facing ports) from 0.0.0.0/0 on it.
  2. On the EC2 instance security group (e.g. sg-stalwart), allow inbound TCP 8080 (and your IMAP/SMTP ports) from sg-nlb as the source — not from the internet.

This means direct access to port 8080 (or any other backend port) from outside the NLB is blocked at the AWS network layer, with no iptables rules needed on the nodes themselves. The Stalwart docs don’t cover this, but once you have it wired up the configuration on the Stalwart side (trusted proxy networks, AllowedIPs, /healthz/live health check path) should be exactly as discussed in this thread.
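The two-group layout above could be scripted roughly as follows. This is a sketch, not a tested deployment script: the function names are hypothetical, the group IDs are the placeholder names used in this thread, and `ec2` is assumed to be a boto3 EC2 client (`boto3.client("ec2")`), whose `authorize_security_group_ingress` call takes `GroupId` and `IpPermissions`.

```python
def public_ingress(port):
    """Rule for the NLB group: allow `port` from anywhere."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }


def sg_source_ingress(port, source_group_id):
    """Rule for the node group: allow `port` only from another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


def apply_rules(ec2, nlb_sg, node_sg, public_ports, backend_ports):
    """Open the public ports on the NLB group and the backend ports on the
    node group, with the NLB group as the only allowed source."""
    ec2.authorize_security_group_ingress(
        GroupId=nlb_sg,
        IpPermissions=[public_ingress(p) for p in public_ports],
    )
    ec2.authorize_security_group_ingress(
        GroupId=node_sg,
        IpPermissions=[sg_source_ingress(p, nlb_sg) for p in backend_ports],
    )
```

With real group IDs in place of the placeholders, a call might look like `apply_rules(boto3.client("ec2"), "sg-nlb", "sg-stalwart", [443], [8080])`.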

@helpbot Now that the HTTPS side of things should be resolved, aren’t we going to run into the same problem with connection re-use when it comes to SMTPS, IMAPS and POP3S, as we will be answering TLS and passing through as TLS? Do I need to create non-TLS listeners on the nodes instead? Would this impact the received mail status in any way if we downgrade to non-TLS?

Yes, you will run into the same NLB connection-reuse problem with SMTPS (port 465), IMAPS (port 993), and POP3S (port 995) in TCP passthrough mode – the root cause is identical: the NLB pools backend TCP connections and reuses them across different clients’ TLS sessions, which breaks Stalwart’s per-connection TLS state.

The solution is the same pattern: terminate TLS at the NLB, forward plain protocol to the Stalwart nodes.

For mail protocols, this means:

  • IMAPS (993): NLB terminates TLS on 993 with ACM cert, forwards plain IMAP to the node on port 143. Create a plain IMAP listener on Stalwart (protocol: imap, bind: [::]:143, useTls: false).
  • POP3S (995): NLB terminates TLS on 995, forwards plain POP3 to port 110 on the node. Plain POP3 listener (protocol: pop3, bind: [::]:110, useTls: false).
  • SMTPS (465): NLB terminates TLS on 465, forwards plain SMTP to port 25 or a dedicated plain SMTP port on the node.
  • Submission (587 with STARTTLS): More complex – STARTTLS is an in-band protocol upgrade, not a separate TLS-wrapped port. NLB cannot strip STARTTLS. The safest approach is to route 587 as TCP passthrough (one connection per session, no reuse issue since STARTTLS is always a fresh connection per client) or to skip 587 and rely on SMTPS only via the NLB.

Impact on received mail / TLS status:

The Received header Stalwart adds to inbound messages includes TLS information. The Proxy Protocol v2 carries a TLS status bit from the proxy to the backend – so when the NLB terminates TLS and sends Proxy Protocol v2 headers, Stalwart can correctly mark the connection as having arrived over TLS in the Received header (from the docs: “whether the incoming connection was protected by TLS is a policy input on several code paths” and the Proxy Protocol “carries both the original client IP and the original TLS bit”). So as long as Proxy Protocol v2 is enabled on the NLB target group and the NLB source IPs are in trusted proxy networks, the Received header and TLS status will correctly reflect the client-facing TLS – no downgrade in the headers.
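To make the "TLS bit" concrete, here is a minimal sketch of how a Proxy Protocol v2 header encodes the client address and an SSL flag, following the public PROXY protocol specification. It only handles TCP over IPv4, the function name `parse_proxy_v2` is hypothetical, and whether a particular NLB configuration actually emits the SSL TLV is an assumption that should be checked against a captured header.

```python
import ipaddress
import struct

PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"  # fixed 12-byte v2 signature
PP2_TYPE_SSL = 0x20    # TLV type carrying TLS information
PP2_CLIENT_SSL = 0x01  # flag: client connected over SSL/TLS


def parse_proxy_v2(data):
    """Parse a Proxy Protocol v2 header (TCP over IPv4 only)."""
    if len(data) < 16 or data[:12] != PP2_SIGNATURE:
        raise ValueError("not a Proxy Protocol v2 header")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd != 0x21:  # version 2, PROXY command
        raise ValueError("unsupported version or command")
    if fam_proto != 0x11:  # AF_INET + STREAM
        raise ValueError("this sketch only handles TCP over IPv4")
    payload = data[16:16 + length]
    if length < 12 or len(payload) < length:
        raise ValueError("truncated header")

    src, _dst, src_port, _dst_port = struct.unpack("!4s4sHH", payload[:12])

    # Walk the optional TLVs that follow the address block.
    tls = False
    pos = 12
    while pos + 3 <= len(payload):
        tlv_type, tlv_len = struct.unpack("!BH", payload[pos:pos + 3])
        value = payload[pos + 3:pos + 3 + tlv_len]
        if tlv_type == PP2_TYPE_SSL and value and value[0] & PP2_CLIENT_SSL:
            tls = True
        pos += 3 + tlv_len

    return {
        "client_ip": str(ipaddress.IPv4Address(src)),
        "client_port": src_port,
        "tls": tls,
    }
```

A backend that trusts the proxy reads this header before any protocol data, which is why the proxy's source addresses have to be listed as trusted: otherwise any client could forge the address and the TLS flag.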

Plain text between the NLB and your Stalwart nodes is acceptable as long as that traffic stays within your VPC (private subnet, locked down by security group as discussed), which it will be once you split the security groups.

@helpbot

If this was an option on the NLB, I wouldn’t have any problems keeping the TLS all the way through!

The NLB’s connection reuse behavior is the fundamental constraint here – it applies to all TCP passthrough listeners, not just HTTPS. The NLB maintains a pool of backend TCP connections and reuses them across multiple client sessions, which is incompatible with per-connection TLS state regardless of protocol.

For STARTTLS on port 587, the situation is actually different from implicit TLS ports: each SMTP session is a fresh connection initiated by the sending client, and STARTTLS upgrade happens once per connection. There is no NLB-side connection reuse problem in the same way as IMAPS/HTTPS, because SMTP clients don’t share backend connections. However, the NLB still cannot “pass through” STARTTLS in any meaningful way – you have two practical options:

  1. TCP passthrough on 587 – works fine for STARTTLS because each client opens a fresh TCP connection, negotiates STARTTLS once, and closes it. The NLB won’t reuse that connection for another client’s session mid-stream. Enable Proxy Protocol v2 and add the NLB IPs to trusted proxy networks as normal.
  2. Drop 587 entirely and route all client submission through SMTPS (465) with TLS termination at the NLB, as described above.

For inbound SMTP on port 25 (from other mail servers), TCP passthrough is also the right approach – external mail servers each open their own fresh connection and the NLB connection pooling behavior doesn’t cause the same reuse problem as it does for long-lived browser or IMAP sessions.

The summary for your full stack:

| Port | NLB mode | Backend listener |
|------|----------|------------------|
| 443 (HTTPS/admin/JMAP) | TLS termination (ACM) | Plain HTTP :8080 |
| 993 (IMAPS) | TLS termination (ACM) | Plain IMAP :143 |
| 995 (POP3S) | TLS termination (ACM) | Plain POP3 :110 |
| 465 (SMTPS) | TLS termination (ACM) | Plain SMTP :2525 (new listener) |
| 587 (Submission/STARTTLS) | TCP passthrough | SMTP :587 with STARTTLS |
| 25 (Inbound SMTP) | TCP passthrough | SMTP :25 |
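The same mapping can be kept as data, which makes it easy to derive the backend ports that the node security group has to allow from the NLB. This is only a sketch: the `Listener` class and the helper names are hypothetical, and the port numbers are the ones from the summary above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listener:
    public_port: int   # port clients connect to on the NLB
    nlb_mode: str      # "tls" = terminated at the NLB, "tcp" = passthrough
    backend_port: int  # port the Stalwart node listens on


LISTENERS = [
    Listener(443, "tls", 8080),
    Listener(993, "tls", 143),
    Listener(995, "tls", 110),
    Listener(465, "tls", 2525),
    Listener(587, "tcp", 587),
    Listener(25, "tcp", 25),
]


def backend_ports(listeners):
    """Ports the node security group must accept from the NLB."""
    return sorted({l.backend_port for l in listeners})


def plaintext_backend_ports(listeners):
    """Backend ports that carry unencrypted traffic inside the VPC."""
    return sorted(l.backend_port for l in listeners if l.nlb_mode == "tls")
```

The second helper lists the ports where traffic between the NLB and the nodes is unencrypted, which are the ones that most need to stay locked to the NLB as their only source.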

@helpbot

Specific reason why creating a 2nd listener and not utilizing the existing 25?

The only reason port 2525 was suggested in that table was to avoid a conflict: your existing port 25 listener is already used for TCP passthrough (inbound SMTP from the internet), so routing the NLB’s SMTPS-terminated traffic to it would mix inbound relay traffic with authenticated client submissions on the same listener.

The cleaner approach is to create a new plain SMTP listener on a dedicated port (2525 is a common convention, but any unused port works) and configure it specifically for authenticated submission – with appropriate requireAuth, SASL, and rate-limit settings that match your existing port 465/587 submission listeners, minus the TLS. The NLB then terminates TLS on port 465 and forwards to this new plain-SMTP submission port.

If your deployment keeps inbound SMTP (port 25) and client submission (port 465/587) as logically separate flows with different authentication and policy requirements (which is standard), keeping them on separate backend listeners is the right approach. If you only use the NLB for client-facing ports and port 25 is handled separately (e.g. direct TCP passthrough for inbound mail), you could reuse port 25 – but that conflates two different traffic flows on one listener, which makes policy enforcement harder.

The Listeners and MTA configuration docs cover setting up separate listeners per protocol role.
