Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace - Mailing list pgsql-hackers

From Gilles Darold
Subject Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace
Date
Msg-id b0e7ffa2-e776-2b6b-d0b1-bf3289439790@darold.net
Whole thread Raw
In response to Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Le 30/07/2021 à 23:38, Tom Lane a écrit :
Gilles Darold <gilles@darold.net> writes:
Le 26/07/2021 à 21:56, Tom Lane a écrit :
I'm inclined to just drop the regexp_replace additions.  I don't think
that the extra parameters Oracle provides here are especially useful.
They're definitely not useful enough to justify creating compatibility
hazards for.
I would not say that being able to replace the Nth occurrence of a
pattern matching is not useful but i agree that this is not a common
case with replacement. Both Oracle [1] and IBM DB2 [2] propose this form
and I have though that we can not have compatibility issues because of
the different data type at the 4th parameter.
Well, here's an example of the potential issues:

[...]


Thanks for pointing me this case, I did not think that the prepared statement could lead to this confusion.


Anyway, maybe we can just
rename the function even if I would prefer that regexp_replace() be
extended. For example:
    regexp_replace(source, pattern, replacement [, flags ]);
    regexp_substitute(source, pattern, replacement [, position ] [,
occurrence ] [, flags ]);
Hmm.  Of course the entire selling point of this patch seems to be
bug-compatibility with Oracle, so using different names is largely
defeating the point :-(

Maybe we should just hold our noses and do it.  The point that
you'd get a recognizable failure if the wrong function were chosen
reassures me a little bit.  We've seen a lot of cases where this
sort of ambiguity results in the system just silently doing something
different from what you expected, and I was afraid that that could
happen here.


I join a new version of the patch that include a check of the option parameter in the basic form of regexp_replace() and return an error in ambiguous cases.


PREPARE rr AS SELECT regexp_replace('healthy, wealthy, and wise','(\w+)thy', '\1ish', $1);
EXECUTE rr(1);
ERROR:  ambiguous use of the option parameter in regex_replace(), value: 1
HINT:  you might set the occurrence parameter to force the use of the extended form of regex_replace()


This is done by checking if the option parameter value is an integer and throw the error in this case. I don't think of anything better.


Best regards,

-- 
Gilles Darold
Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: slab allocator performance issues
Next
From: Gilles Darold
Date:
Subject: Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace