Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present) - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)
Msg-id 32324.1438702786@sss.pgh.pa.us
In response to Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-bugs
I wrote:
> As David says, these examples appear to be following what's stated in
> http://www.postgresql.org/docs/9.4/static/functions-matching.html#POSIX-MATCHING-RULES
> The Spencer regex engine we use has a notion of greediness or
> non-greediness of the entire regex, and further that that takes precedence
> for determining the overall match length over greediness of individual
> subexpressions.  That behavior might be inconvenient for this particular
> use-case, but that doesn't make it a bug.

BTW, perhaps it would be worth adding an example to that section that
shows how to control this behavior.  The trick is obvious once you've seen
it, but not so much otherwise: you add something to the start of the regex
that establishes the overall greediness you want, but can never actually
match any characters.  "\0*" (to force greedy) or "\0*?" (to force
non-greedy) will work fine in Postgres use-cases, since there can never be
a NUL character in the data.

regression=# select regexp_matches('abc01234xyz', '(.*)(\d+)(.*)');
 regexp_matches
-----------------
 {abc0123,4,xyz}
(1 row)

regression=# select regexp_matches('abc01234xyz', '(.*?)(\d+)(.*)');
 regexp_matches
----------------
 {abc,0,""}
(1 row)

regression=# select regexp_matches('abc01234xyz', '\0*(.*?)(\d+)(.*)');
 regexp_matches
-----------------
 {abc,01234,xyz}
(1 row)
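
The same trick in the other direction, using "\0*?", can be sketched as
follows.  This query is not part of the session above and was not run
here, so no output is shown; the pattern is an illustrative assumption
based on the matching rules linked earlier.  The leading "\0*?" should make
the regex as a whole non-greedy, so the overall match should be the
shortest one available even though every capturing subexpression is greedy.

```sql
-- Hypothetical companion to the examples above (not run in this session).
-- "\0*?" can never match a character, since the data cannot contain NUL,
-- but as the first quantified atom it marks the whole regex as non-greedy.
select regexp_matches('abc01234xyz', '\0*?(.*)(\d+)(.*)');
```

If the rules apply as documented, the overall match should stop at 'abc0',
and the greedy subexpressions then divide up only that much text.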


            regards, tom lane
