Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present) - Mailing list pgsql-bugs

From David G. Johnston
Subject Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)
Date
Msg-id CAKFQuwaXQvXx9OEDNXAA5eXdbP0CKBnWow+L0aoh1vH2J6LrtQ@mail.gmail.com
In response to BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)  (christian_maechler@hotmail.com)
Responses Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)  (Christian Mächler<christian_maechler@hotmail.com>)
List pgsql-bugs
On Monday, August 3, 2015, <christian_maechler@hotmail.com> wrote:

> The following bug has been logged on the website:
>
> Bug reference:      13538
> Logged by:          Chris Mächler
> Email address:      christian_maechler@hotmail.com
> PostgreSQL version: 9.3.0
> Operating system:   ?
> Description:
>
> Here is an example to verify and reproduce the error (extract a number and
> the things before and after it with 3 groups):
>
>
> '(.*)([+-]?[0-9]*\.[0-9]+)(.*)'
>
> Using regexp_matches this will produce an undesirable result (only one
> digit in group 2), but everything behaves correctly, the third group matches
> until the end.
>
> '(.*?)([+-]?[0-9]*\.[0-9]+)(.*)'
>
> If we change the first group to non-greedy to fix this, then the bug
> appears: the third group becomes non-greedy too (it shouldn't!) and
> therefore it is always empty instead of matching until the end of the line.
> Also the first group is empty (should match from start!), it should find a
> match at start position, whether it is non-greedy or not and not look ahead
> if the non-greedy group can be reduced if starting to match at the next
> index. Both are wrong behaviors.
>
> (the workaround is anchoring, but the behavior of the regex is still wrong)
>
> link: http://sqlfiddle.com/#!15/f0f14/14
>
>
>
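Reading the documentation, this seems to be working as intended. Under the
matching rules linked below, a branch takes on the greediness of the first
quantified atom in it that has a greediness attribute, so making the first
group non-greedy makes the expression as a whole non-greedy: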

http://www.postgresql.org/docs/9.3/static/functions-matching.html#POSIX-MATCHING-RULES

On what are you basing your concept of correctness?  Specifically, what
language implementation do you consider "right"?

The Tcl implementation used by PostgreSQL has some differences compared to
Java and Perl, the two I am most familiar with.
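
For comparison, here is a minimal sketch of how a Perl-style backtracking
engine treats the two patterns from the report, where each quantifier is
greedy or non-greedy on its own. It uses Python's re module as a stand-in for
Perl/Java semantics (my assumption, not something from the report), and the
sample string and the helper name `groups` are made up for illustration. It
says nothing about what PostgreSQL returns; it only shows the behavior the
report appears to expect.

```python
import re

# The two patterns from the bug report, unchanged.
GREEDY = r'(.*)([+-]?[0-9]*\.[0-9]+)(.*)'
LAZY = r'(.*?)([+-]?[0-9]*\.[0-9]+)(.*)'


def groups(pattern, text):
    """Return the captured groups of the first match, or None if no match."""
    m = re.search(pattern, text)
    return m.groups() if m else None


# Hypothetical sample input, not taken from the report's sqlfiddle.
sample = "abc -12.345 xyz"

# Greedy first group swallows the sign and integer digits.
print(groups(GREEDY, sample))  # ('abc -12', '.345', ' xyz')

# Non-greedy first group stops early; the third group stays greedy.
print(groups(LAZY, sample))    # ('abc ', '-12.345', ' xyz')
```

In this style of engine the `?` on the first group affects only that group,
which is the per-quantifier behavior the linked matching rules do not promise.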

David J.
