Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present) - Mailing list pgsql-bugs

From David G. Johnston
Subject Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)
Date
Msg-id CAKFQuwZkht_CVypBxE_sV9=Dt9P-i4G2f1ipczp2iN0860ukYQ@mail.gmail.com
In response to BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)  (christian_maechler@hotmail.com)
List pgsql-bugs
On Tue, Aug 4, 2015 at 8:39 AM, Christian Mächler <christian_maechler@hotmail.com> wrote:

> You say it is okay that a greedy group suddenly becomes non-greedy if
> ANOTHER group is made non-greedy?
>
> I've chosen a simple example, but I'm pretty sure I could construct
> several use-cases which can be solved easily if the regex behaves like in
> java, javaScript, perl etc.  but not with how it is done here. It's clearly
> not a feature. Already simple things like ending a match with any amount of
> numbers will become difficult if non-greedy groups are present, e.g.
> instead of ...([0-9]+) you will have to write ...([0-9]+)(?![0-9])  makes
> things easier...
>
> Seriously I didn't want to start a debate whether this is right or wrong,
> because I honestly can't understand how anyone could defend the behavior
> mentioned in the first sentence of this message. As I said, I just wanted
> to point out that there is a bug to help improve, but if you prefer it like
> this it is fine with me, I just think then you probably haven't used regex
> that much.
>
>
I use RegEx quite a bit, and while I agree with the sentiment, it would
be nearly impossible to replace the existing implementation.  Fortunately,
PostgreSQL is quite extensible, so if you need a more flexible RegEx
implementation you can write a function in pl/perl or pl/v8 and use their
implementations.
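
To make the difference concrete, here is a minimal sketch of the per-quantifier
greediness that the Perl-style engines apply. It uses Python's re module purely
as a stand-in for such an engine (my choice for illustration, not something
from the report); the subject string and patterns are the ones used in the
"Regular Expression Matching Rules" example in the PostgreSQL documentation,
and the helper name first_group is hypothetical.

```python
import re

SUBJECT = "XY1234Z"


def first_group(pattern: str, subject: str = SUBJECT) -> str:
    """Return capture group 1 of the first match (hypothetical helper)."""
    match = re.search(pattern, subject)
    if match is None:
        raise ValueError(f"pattern {pattern!r} did not match {subject!r}")
    return match.group(1)


# Greedy prefix: the digit group takes as many digits as {1,3} allows.
greedy_prefix = first_group(r"Y*([0-9]{1,3})")

# Non-greedy prefix: in a backtracking engine only Y*? is lazy; the digit
# group keeps its own greediness and still takes three digits.
lazy_prefix = first_group(r"Y*?([0-9]{1,3})")

# The lookahead workaround from the quoted message: the digit run may only
# end where no further digit follows.
with_lookahead = first_group(r"Y*?([0-9]+)(?![0-9])")

print(greedy_prefix, lazy_prefix, with_lookahead)
```

In this engine both of the first two patterns capture 123. The PostgreSQL
documentation gives 1 as the result of the second pattern with SUBSTRING,
because there the leading non-greedy quantifier makes the whole expression
non-greedy. That is the behavior being objected to above.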

About the only thing that would make sense would be to get the TCL
implementation to accept the "possessive" modifier (+ in Java:
https://docs.oracle.com/javase/tutorial/essential/regex/quant.html) and let
it override the "overall greediness" aspect of the matching region while
leaving the unadorned case to use the overall aspect.

There is probably more to consider than the brief thought I've given it
here, though I think the point has been made.  Right or wrong, it is a
design choice that was made long ago and has been in use for many years.
It works well enough, and enough options/hacks exist, that finding someone
who wants to dedicate resources to improving the situation is likely to be
difficult.

David J.
