Re: uninterruptable regexp_replace in 9.2 and 9.3 - Mailing list pgsql-bugs

From Pedro Gimeno
Subject Re: uninterruptable regexp_replace in 9.2 and 9.3
Msg-id 5310925B.8010809@personal.formauri.es
In response to uninterruptable regexp_replace in 9.2 and 9.3  (Sandro Santilli <strk@keybit.net>)
Responses Re: uninterruptable regexp_replace in 9.2 and 9.3  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Sandro Santilli wrote, On 2014-02-28 12:28:
> Starting with commit 173e29aa5 the regexp_replace function
> became uninterruptable during operations, and takes a lot
> of time and RAM to process some patterns.
>
> Full story in this thread:
> http://www.postgresql.org/message-id/646.1393031856@sss.pgh.pa.us
>
> I'm hoping a mail to pgsql-bugs would assign this issue a ticket
> number for me to follow. Let me know if there's a better way.
> Thank you !

This may be relevant:
https://gist.github.com/johnbartholomew/8379265

I've added these lines:

printf 'Testing psql:\n'
time psql -c "SELECT regexp_matches('$pattern','$input');"

The results on my machine (with PostgreSQL 9.1.9) are:

Pattern:
"a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaaaa"
Input: "aaaaaaaaaaaaaaaaaaaaaaaaaaa"
Testing grep:
aaaaaaaaaaaaaaaaaaaaaaaaaaa

real    0m0.003s
user    0m0.000s
sys    0m0.000s
Testing perl:
match

real    0m6.103s
user    0m6.100s
sys    0m0.000s
Testing python:
<_sre.SRE_Match object at 0xb70e2250>

real    0m10.207s
user    0m10.141s
sys    0m0.060s
Testing psql:
        regexp_matches
-------------------------------
 {aaaaaaaaaaaaaaaaaaaaaaaaaaa}
(1 row)


real    0m0.039s
user    0m0.024s
sys    0m0.004s

Interestingly, as noted in the comments, PHP reports an error that the
backtracking limit was reached.

I'm adding this note in case it helps anyone get a broader picture of
how other implementations handle this problem.
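
For anyone who wants to try other sizes, here is a minimal sketch of how such a test case can be generated. It assumes a POSIX shell and a grep that supports -E; the variable names (n, input, optional, pattern) are mine and not taken from the gist. The psql step only runs if psql is on the PATH, and it passes the input first because regexp_matches is documented as regexp_matches(string, pattern).

```shell
#!/bin/sh
# Build the pathological case for a given n:
#   pattern = n copies of "a?" followed by n copies of "a"
#   input   = n copies of "a"
# n is a parameter of this sketch; it defaults to 27.
n=${1:-27}

input=""
optional=""
i=0
while [ "$i" -lt "$n" ]; do
  input="${input}a"
  optional="${optional}a?"
  i=$((i + 1))
done
pattern="${optional}${input}"

printf 'Pattern: "%s"\n' "$pattern"
printf 'Input: "%s"\n' "$input"

printf 'Testing grep:\n'
# -x requires the whole line to match the pattern.
printf '%s\n' "$input" | grep -E -x "$pattern"

# Optional step: skipped when psql is not installed. A failed connection
# is ignored so the rest of the script still completes.
if command -v psql >/dev/null 2>&1; then
  printf 'Testing psql:\n'
  psql -w -c "SELECT regexp_matches('$input','$pattern');" || true
fi
```

Running it with a smaller argument (for example 20) before going up makes it easier to see how quickly the backtracking implementations slow down as n grows.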
