pgsql: Make regexp engine's backref-related compilation state more bull - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Make regexp engine's backref-related compilation state more bull
Date
Msg-id E1mCYXI-0005Se-W7@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Make regexp engine's backref-related compilation state more bulletproof.

Up to now, we remembered the definition of a capturing parenthesis
subexpression by storing a pointer to the associated subRE node.
That was okay before, because that subRE didn't get modified anymore
while parsing the rest of the regexp.  However, in the wake of
commit ea1268f63, that's no longer true: the outer invocation of
parseqatom() feels free to scribble on that subRE.  This seems to
work anyway, because the states we jam into the child atom in the
"prepare a general-purpose state skeleton" stanza aren't really
semantically different from the original endpoints of the child atom.
But that would be mighty easy to break, and it's definitely not how
things worked before.

Between this and the issue fixed in the prior commit, it seems best
to get rid of this dependence on subRE nodes entirely.  We don't need
the whole child subRE for future backrefs, only its starting and ending
NFA states; so let's just store pointers to those.

Also, in the corner case where we make an extra subRE to handle
immediately-nested capturing parentheses, it seems like it'd be smart
to have the extra subRE have the same begin/end states as the original
child subRE does (s/s2 not lp/rp).  I think that linking it from lp to
rp might actually be semantically wrong, though since Spencer's original
code did it that way, I'm not totally certain.  Using s/s2 is certainly
not wrong, in any case.

Per report from Mark Dilger.  Back-patch to v14 where the problematic
patches came in.

Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/cb76fbd7ec87e44b3c53165d68dc2747f7e26a9a

Modified Files
--------------
src/backend/regex/regcomp.c | 57 ++++++++++++++++++++++++++++-----------------
1 file changed, 35 insertions(+), 22 deletions(-)


pgsql-committers by date:

Previous
From: Andres Freund
Date:
Subject: Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o
Next
From: Tom Lane
Date:
Subject: pgsql: Fix use-after-free issue in regexp engine.