Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace |
Date | |
Msg-id | 1124806.1627681104@sss.pgh.pa.us Whole thread Raw |
In response to | Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace (Gilles Darold <gilles@darold.net>) |
Responses |
Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace
Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace |
List | pgsql-hackers |
Gilles Darold <gilles@darold.net> writes: > Le 26/07/2021 à 21:56, Tom Lane a écrit : >> I'm inclined to just drop the regexp_replace additions. I don't think >> that the extra parameters Oracle provides here are especially useful. >> They're definitely not useful enough to justify creating compatibility >> hazards for. > I would not say that being able to replace the Nth occurrence of a > pattern matching is not useful but i agree that this is not a common > case with replacement. Both Oracle [1] and IBM DB2 [2] propose this form > and I have though that we can not have compatibility issues because of > the different data type at the 4th parameter. Well, here's an example of the potential issues: regression=# create function rr(text,text,text,text) returns text regression-# language sql as $$select 'text'$$; CREATE FUNCTION regression=# create function rr(text,text,text,int4) returns text language sql as $$select 'int4'$$; CREATE FUNCTION regression=# select rr('a','b','c','d'); rr ------ text (1 row) regression=# select rr('a','b','c',42); rr ------ int4 (1 row) So far so good, but: regression=# prepare rr as select rr('a','b','c',$1); PREPARE regression=# execute rr(12); rr ------ text (1 row) So somebody trying to use the 4-parameter Oracle form from, say, JDBC would get bit if they were sloppy about specifying parameter types. The one saving grace is that digits aren't valid regexp flags, so the outcome would be something like regression=# select regexp_replace('a','b','c','12'); ERROR: invalid regular expression option: "1" which'd be less difficult to debug than silent misbehavior. Conversely, if you thought you were passing flags but it somehow got interpreted as a start position, that would fail too: regression=# prepare rri as select rr('a','b','c', $1::int); PREPARE regression=# execute rri('gi'); ERROR: invalid input syntax for type integer: "gi" LINE 1: execute rri('gi'); ^ Still, I bet a lot that we'd see periodic bug reports complaining that it doesn't work. > Anyway, maybe we can just > rename the function even if I would prefer that regexp_replace() be > extended. For example: > regexp_replace(source, pattern, replacement [, flags ]); > regexp_substitute(source, pattern, replacement [, position ] [, > occurrence ] [, flags ]); Hmm. Of course the entire selling point of this patch seems to be bug-compatibility with Oracle, so using different names is largely defeating the point :-( Maybe we should just hold our noses and do it. The point that you'd get a recognizable failure if the wrong function were chosen reassures me a little bit. We've seen a lot of cases where this sort of ambiguity results in the system just silently doing something different from what you expected, and I was afraid that that could happen here. regards, tom lane
pgsql-hackers by date: