I ran into the same issue and did some analysis with Claude. Could someone help by opening a pull request for this? Thanks.
Bug: matches() in expressions produces wrong regex due to double-escaping in parse_string
Summary
The parse_string function in crates/common/src/expr/tokenizer.rs incorrectly handles backslash escape sequences, causing the matches() function to compile the wrong regex pattern.
The backslash character is pushed to the output buffer before determining whether it starts an escape sequence, resulting in double backslashes in the compiled regex.
Root Cause
File: crates/common/src/expr/tokenizer.rs, function parse_string() (line ~266)
Current code:
fn parse_string(&mut self, stop_ch: u8) -> Result<CompactString, String> {
    let mut buf = Vec::with_capacity(16);
    let mut last_ch = 0;
    let mut found_end = false;
    for &ch in self.iter.by_ref() {
        if last_ch != b'\\' {
            if ch != stop_ch {
                buf.push(ch); // BUG: pushes '\' to buf before knowing it's an escape
            } else {
                found_end = true;
                break;
            }
        } else {
            match ch {
                b'n' => buf.push(b'\n'),
                b'r' => buf.push(b'\r'),
                b't' => buf.push(b'\t'),
                _ => buf.push(ch), // pushes the escaped char (second char after '\')
            }
        }
        last_ch = ch;
    }
    // ...
}
Trace for input \\. (backslash backslash dot)
| Step | ch | last_ch | Branch | Action | buf |
|------|----|---------|--------|--------|-----|
| 1 | \ | 0 | normal | push \ | [\] |
| 2 | \ | \ | escape | push \ | [\\] |
| 3 | . | \ | escape (BUG!) | push . | [\\.] |
Result: 3 bytes \\. passed to Regex::new()
Regex interprets: \\ = literal backslash, . = any character
Expected: 2 bytes \. → Regex::new interprets as literal dot
The bug in step 3: after \\, last_ch is still \ (set at end of step 2), so the . is treated as another escaped character. The root problem is step 1: the first \ should NOT be pushed, because it’s the start of an escape sequence.
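To check the trace without building the whole project, the quoted loop can be copied into a standalone function. This is only a sketch: the name parse_string_old, the byte-slice input, and the Option return type are assumptions made for illustration (the tail of the real function is elided above), but the loop body is the one quoted under "Current code".

```rust
/// Standalone copy of the quoted loop. Reads from a byte slice instead of
/// the tokenizer's iterator (assumption for illustration only).
fn parse_string_old(input: &[u8], stop_ch: u8) -> Option<Vec<u8>> {
    let mut buf = Vec::with_capacity(16);
    let mut last_ch = 0u8;
    let mut found_end = false;

    for &ch in input {
        if last_ch != b'\\' {
            if ch != stop_ch {
                buf.push(ch); // also pushes the backslash that opens an escape
            } else {
                found_end = true;
                break;
            }
        } else {
            match ch {
                b'n' => buf.push(b'\n'),
                b'r' => buf.push(b'\r'),
                b't' => buf.push(b'\t'),
                _ => buf.push(ch),
            }
        }
        last_ch = ch;
    }

    // Assumed tail: report an unterminated string as None.
    if found_end { Some(buf) } else { None }
}

fn main() {
    // Expression text \\.' (the opening quote is assumed to be consumed already).
    let out = parse_string_old(br"\\.'", b'\'').unwrap();
    assert_eq!(out, br"\\.".to_vec()); // three bytes, as in the trace

    // Expression text \n' keeps the backslash and then appends a newline.
    let out = parse_string_old(br"\n'", b'\'').unwrap();
    assert_eq!(out, b"\\\n".to_vec());
}
```

The second assertion shows that \n is affected as well: the backslash stays in the buffer in front of the newline byte.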
Why starts_with() works but matches() doesn’t
starts_with(rcpt, "prefix.") passes "prefix." through parse_string with no backslashes, so no escaping issue occurs. It correctly returns "prefix." and the comparison works.
matches('^prefix\\..+$', rcpt) passes ^prefix\\..+$ through parse_string, producing regex ^prefix\\..+$ (literal backslash + any char) instead of ^prefix\..+$ (literal dot). The regex fails to match prefix.foo because there’s no backslash in the input.
Fix
Replace the loop in parse_string with a small state machine that tracks whether the previous byte opened an escape sequence:
fn parse_string(&mut self, stop_ch: u8) -> Result<CompactString, String> {
    let mut buf = Vec::with_capacity(16);
    let mut escape = false;
    let mut found_end = false;
    for &ch in self.iter.by_ref() {
        if escape {
            match ch {
                b'n' => buf.push(b'\n'),
                b'r' => buf.push(b'\r'),
                b't' => buf.push(b'\t'),
                _ => buf.push(ch), // handles \\, \', \", and unknown escapes
            }
            escape = false;
        } else if ch == b'\\' {
            escape = true; // don't push '\'; wait for next char
        } else if ch == stop_ch {
            found_end = true;
            break;
        } else {
            buf.push(ch);
        }
    }
    if found_end {
        CompactString::from_utf8(buf).map_err(|_| "Invalid UTF-8".into())
    } else {
        Err("Unterminated string".to_string())
    }
}
Trace of fix for \\.
| Step | ch | escape | Action | buf |
|------|----|--------|--------|-----|
| 1 | \ | false | set escape=true | [] |
| 2 | \ | true | push \, escape=false | [\] |
| 3 | . | false | push . | [\.] |
Result: 2 bytes \. → Regex::new interprets as literal dot ✓
Trace for single backslash \.
| Step | ch | escape | Action | buf |
|------|----|--------|--------|-----|
| 1 | \ | false | set escape=true | [] |
| 2 | . | true | push ., escape=false | [.] |
Result: 1 byte . → Regex interprets as “any char”
This means users must write \\. in expression strings to get \. in the regex. That matches how \\ is handled in ordinary (non-raw) string literals in Java, Python, and JavaScript, where it also produces a single backslash.
Breaking Change Note
Users who currently write matches('\\.', rcpt) get the regex \\. (a literal backslash followed by any character), which is the behavior reported here. After the fix, the same expression compiles to \. (a literal dot), as intended.
Users who write matches('\.', rcpt) currently get \. in the regex (literal dot) due to the bug. After the fix, they’ll get . (any char) — still matches but is more permissive. These users should update to matches('\\.', rcpt).
The fix aligns string escape handling with standard language conventions.
Affected Components
- Custom sub-addressing with matches() (the reported issue)
- ALL expression string literals that use backslash escapes
- RCPT rewrite rules using matches()
- Any other expression context using regex patterns
Test Cases
#[test]
fn test_parse_string_escapes() {
    // assert_parse_string(input, expected) is an assumed helper, not shown here.
    // Each backslash is doubled once more by the Rust string literals below.

    // Expression text \\ → output \ (one backslash)
    assert_parse_string("\\\\", "\\");
    // Expression text \\. → output \. (regex: literal dot)
    assert_parse_string("\\\\.", "\\.");
    // Expression text \n → output is a newline byte
    assert_parse_string("\\n", "\n");
    // No escapes
    assert_parse_string("hello", "hello");
    // Mixed: expression text ^prefix\\..+$ → output ^prefix\..+$
    assert_parse_string("^prefix\\\\..+$", "^prefix\\..+$");
}
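The assert_parse_string helper used in these tests is not defined in this report. Below is a minimal sketch of how it could look, assuming a free-standing copy of the proposed loop that reads from a byte slice and returns a std String instead of CompactString. Both parse_string_fixed and the helper's behavior (appending a closing single quote before parsing) are hypothetical; the real test would go through the tokenizer. Raw string literals are used so the expression text appears exactly as a user would type it.

```rust
/// Free-standing version of the proposed loop (hypothetical signature;
/// the real method reads from the tokenizer's iterator).
fn parse_string_fixed(input: &[u8], stop_ch: u8) -> Result<String, String> {
    let mut buf = Vec::with_capacity(16);
    let mut escape = false;

    for &ch in input {
        if escape {
            // The byte after a backslash: translate n/r/t, keep anything else.
            buf.push(match ch {
                b'n' => b'\n',
                b'r' => b'\r',
                b't' => b'\t',
                other => other,
            });
            escape = false;
        } else if ch == b'\\' {
            escape = true; // do not push the backslash itself
        } else if ch == stop_ch {
            return String::from_utf8(buf).map_err(|_| "Invalid UTF-8".to_string());
        } else {
            buf.push(ch);
        }
    }
    Err("Unterminated string".to_string())
}

/// Hypothetical helper: appends the closing quote, parses, and compares.
fn assert_parse_string(input: &str, expected: &str) {
    let mut quoted = input.as_bytes().to_vec();
    quoted.push(b'\'');
    assert_eq!(parse_string_fixed(&quoted, b'\''), Ok(expected.to_string()));
}

fn main() {
    assert_parse_string(r"\\", r"\"); // escaped backslash
    assert_parse_string(r"\\.", r"\."); // regex: literal dot
    assert_parse_string(r"\.", "."); // single backslash is dropped
    assert_parse_string(r"\n", "\n"); // newline escape
    assert_parse_string(r"it\'s", "it's"); // escaped quote does not end the string
    assert_parse_string(r"^prefix\\..+$", r"^prefix\..+$");
}
```

The escaped-quote case is worth keeping in a real test suite, since it checks that a backslash before the stop character does not terminate the string.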
Workaround (for current v0.16.x without fix)
Use starts_with() or contains() instead of matches():
if_then(starts_with(rcpt, "prefix."), "bar", rcpt)
---
crates/common/src/expr/tokenizer.rs | 29 ++++++++++++-----------
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/crates/common/src/expr/tokenizer.rs b/crates/common/src/expr/tokenizer.rs
--- a/crates/common/src/expr/tokenizer.rs
+++ b/crates/common/src/expr/tokenizer.rs
@@ -266,31 +266,32 @@
fn parse_string(&mut self, stop_ch: u8) -> Result<CompactString, String> {
let mut buf = Vec::with_capacity(16);
- let mut last_ch = 0;
+ let mut escape = false;
let mut found_end = false;
for &ch in self.iter.by_ref() {
- if last_ch != b'\\' {
- if ch != stop_ch {
- buf.push(ch);
- } else {
- found_end = true;
- break;
- }
- } else {
+ if escape {
match ch {
b'n' => {
buf.push(b'\n');
}
b'r' => {
buf.push(b'\r');
}
b't' => {
buf.push(b'\t');
}
_ => {
buf.push(ch);
}
}
+ escape = false;
+ } else if ch == b'\\' {
+ escape = true;
+ } else if ch == stop_ch {
+ found_end = true;
+ break;
+ } else {
+ buf.push(ch);
}
-
- last_ch = ch;
}
if found_end {