Thread: BUG #8605: Regular expression lazy quantification issue

BUG #8605: Regular expression lazy quantification issue

From: atoriwork@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      8605
Logged by:          Atori
Email address:      atoriwork@gmail.com
PostgreSQL version: 9.2.4
Operating system:   Debian 4.7.2-5, 64-bit
Description:

Lazy quantifiers don't work after an "or" block in a regexp mask
('(a)|(b)'):
example:
string: 'CsssQsDpppppQsDpppQ'
mask: '((a)|(C.+?Q))s(D.+?Q)'


select regexp_replace('CsssQsDpppppQsDpppQ', '((C.+?Q))s(D.+?Q)', '#foo#');
result: "#foo#sDpppQ"


select regexp_replace('CsssQsDpppppQsDpppQ', '((a)|(C.+?Q))s(D.+?Q)',
'#foo#');
result: "#foo#"
expected result: "#foo#sDpppQ"

Re: BUG #8605: Regular expression lazy quantification issue

From: Tom Lane
Date:
atoriwork@gmail.com writes:
> Lazy quantifiers don't work after an "or" block in a regexp mask
> ('(a)|(b)'):

This isn't a bug, it's documented behavior.  See
http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-MATCHING-RULES
specifically the bit saying that an RE containing a | operator is always greedy.
The non-greedy operators within it are constrained to match as little
as possible, but that happens after determining the overall match, which
will be greedy.

I realize that this might not be the behavior you'd like, but we're
unlikely to change it.

            regards, tom lane
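
For readers who need the result the report expected, here is a minimal
sketch based on the matching rules page linked above, which describes
wrapping an RE in (?:...){1,1}? as a way to force it to be non-greedy.
The queries reuse the string and mask from the report. They have not been
run against 9.2.4 here, so treat them as something to verify rather than a
confirmed fix.

```sql
-- Inspect how the original mask divides the string among its groups.
-- The alternation makes the RE as a whole greedy; the lazy .+? parts
-- only decide how that overall match is split up.
select regexp_matches('CsssQsDpppppQsDpppQ',
                      '((a)|(C.+?Q))s(D.+?Q)');

-- Wrap the whole mask in a non-capturing group with a non-greedy
-- {1,1}? quantifier, so the RE as a whole prefers the shortest match
-- even though it still contains the | operator.
select regexp_replace('CsssQsDpppppQsDpppQ',
                      '(?:((a)|(C.+?Q))s(D.+?Q)){1,1}?',
                      '#foo#');
```

Under the documented rules the second query should pick the shortest
overall match, which is what the report was after. The capturing groups
inside the wrapper keep their numbering, because (?:...) does not capture.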