When a multi-digit backreference like \12 is written but fewer than 12 capturing groups exist, std.regex silently drops the trailing digit — it is matched as neither a backreference digit, a literal, nor an octal escape, and no error is raised.
import std.regex, std.stdio;

void main()
{
    // only one group exists, so `\12` cannot be group 12
    writeln(matchFirst("aa2", regex(`(a)\12`)).hit); // prints "aa", not "aa2"
}
The pattern (a)\12 is effectively parsed as (a)\1: the 2 disappears, so the pattern also matches "aa" with no trailing 2. With 12 or more groups, \12 correctly refers to group 12; only the fallback path is broken.
I think the cause is this loop in std/regex/internal/parser.d:
//perl's disambiguation rule i.e.
//get next digit only if there is such group number
popFront();
while (nref < maxBackref && !empty && std.ascii.isDigit(front))
{
    nref = nref * 10 + front - '0';
    popFront(); // the extra digit is consumed here
}
if (nref >= maxBackref)
    nref /= 10; // number reverted, but cursor already past the digit
The loop forms nref = 12 and consumes the 2; when group 12 doesn't exist, nref /= 10 reverts the number, but the cursor has already moved past the digit, so it is lost. This already contradicts the comment's own rule — "get next digit only if there is such group number".
The rule is also labelled "perl's disambiguation rule", but that isn't how Perl resolves \12: in Perl, an out-of-range multi-digit reference is an octal escape, so \12 with fewer than 12 groups would be the character with octal code 12 (a newline).
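To make the cursor problem easier to see in isolation, here is a minimal standalone sketch of the loop, separate from std.regex. The names `BackrefParse`, `consumeThenCheck` and `checkThenConsume` are hypothetical and exist only for this illustration. It works on a plain string of digits instead of the parser's range, and it assumes `maxBackref` is the number of capturing groups plus one (group 0 included). The second function is just one possible way to follow the comment's rule, not a proposed patch for Phobos.

```d
import std.ascii : isDigit;

/// Outcome of reading the digits that follow the backslash.
/// (Hypothetical helper type, not part of std.regex.)
struct BackrefParse
{
    uint group;      // group number the digits resolved to
    size_t consumed; // number of digit characters taken from the input
}

/// Same shape as the quoted loop: the digit is consumed before the range check.
/// Assumes `digits` is non-empty and starts with '1'..'9'.
BackrefParse consumeThenCheck(string digits, uint maxBackref)
{
    size_t i = 0;
    uint nref = digits[i++] - '0';
    while (nref < maxBackref && i < digits.length && isDigit(digits[i]))
    {
        nref = nref * 10 + (digits[i] - '0');
        ++i; // cursor advances even if nref is now out of range
    }
    if (nref >= maxBackref)
        nref /= 10; // the number is reverted, the cursor is not
    return BackrefParse(nref, i);
}

/// Hypothetical alternative: build the candidate first and
/// consume the digit only if the candidate is a valid group number.
BackrefParse checkThenConsume(string digits, uint maxBackref)
{
    size_t i = 0;
    uint nref = digits[i++] - '0';
    while (i < digits.length && isDigit(digits[i]))
    {
        immutable candidate = nref * 10 + (digits[i] - '0');
        if (candidate >= maxBackref)
            break; // leave the digit in the input for the caller
        nref = candidate;
        ++i;
    }
    return BackrefParse(nref, i);
}

unittest
{
    // One capturing group, so maxBackref is assumed to be 2.
    assert(consumeThenCheck("12", 2) == BackrefParse(1, 2)); // the 2 is swallowed
    assert(checkThenConsume("12", 2) == BackrefParse(1, 1)); // the 2 stays in the input
    // Twelve capturing groups: both versions resolve group 12.
    assert(consumeThenCheck("12", 13) == BackrefParse(12, 2));
    assert(checkThenConsume("12", 13) == BackrefParse(12, 2));
}
```

Saved under any file name, the checks can be run with `dmd -unittest -main -run <file>.d`. With one group, the first version reports two consumed characters for group 1, which is the lost digit described above. What should happen to the leftover 2 (literal, octal escape, or error) is a separate decision that this sketch leaves to the caller.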