Re: Uninterruptable regexp_replace in 9.3.1 ? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Uninterruptable regexp_replace in 9.3.1 ?
Date
Msg-id 646.1393031856@sss.pgh.pa.us
In response to Re: Uninterruptable regexp_replace in 9.3.1 ?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Uninterruptable regexp_replace in 9.3.1 ?  (Sandro Santilli <strk@keybit.net>)
List pgsql-hackers
I wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> So I'd like to confirm that this issue doesn't affect 9.1.

> It doesn't.  I suspect it has something to do with 173e29aa5 or one
> of the nearby commits in backend/regex/.

Indeed, git bisect fingers that commit as introducing the problem.

What seems to be happening is that citerdissect() is trying some
combinatorially large number of ways to split the input string into
substrings that can satisfy the argument of the outer "+" iterator.
It keeps failing on the substring starting with the first '$', and
then vainly looking for other splits that dodge the problem.
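
To give a feel for how fast that blows up, here is a toy sketch in Python.
It is not the citerdissect() code and does not use the regex from the
report; count_split_attempts and no_dollar are made-up names, and the
"piece may not contain '$'" rule is only an assumed stand-in for "this
substring cannot satisfy the iterator's argument".  A naive backtracking
splitter that has no way of learning that the tail can never match ends
up trying every split of the prefix:

```python
def count_split_attempts(text, piece_ok):
    """Try to cut text into consecutive non-empty pieces that all pass
    piece_ok, backtracking naively.  Returns (matched, attempts), where
    attempts is the number of candidate pieces that were tested."""
    attempts = 0

    def dissect(start):
        nonlocal attempts
        if start == len(text):
            return True  # consumed the whole string: a valid split exists
        for end in range(start + 1, len(text) + 1):
            attempts += 1
            # Accept this piece only if the remainder can be split too.
            if piece_ok(text[start:end]) and dissect(end):
                return True
        return False  # no split starting at this position works

    matched = dissect(0)
    return matched, attempts


def no_dollar(piece):
    # Hypothetical stand-in: any piece containing '$' is unmatchable.
    return "$" not in piece


if __name__ == "__main__":
    for n in (5, 10, 15):
        matched, attempts = count_split_attempts("a" * n + "$", no_dollar)
        print(n, matched, attempts)
```

With n leading characters ahead of the unmatchable tail, this toy tests
2^(n+1) - 1 candidate pieces before giving up, so each extra input
character doubles the work.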

I'm not entirely sure how come the previous coding didn't fall into
the same problem.  It's quite possible Henry Spencer is smarter than
I am, but there was certainly nothing there before that was obviously
avoiding this hazard.

Worthy of note is that I think pre-9.2 is actually giving the wrong
answer: it claims the whole string matches the regex, which it does
not if I'm reading the regex right.  This may be related to the
problem that commit 173e29aa5 was trying to fix, i.e. failure to
enforce backref matches in some cases.  So one possible theory is
that by failing to notice that it *didn't* have a valid match,
the old code accidentally failed to go down the rabbit hole of trying
zillions of other ways to match.
        regards, tom lane


