Thread: doc regexp_replace replacement string \n does not explained properly

hi.

https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP
<<<start_quote
The regexp_replace function provides substitution of new text for
substrings that match POSIX regular expression patterns. It has the
syntax regexp_replace(source, pattern, replacement [, start [, N ]] [,
flags ]). (Notice that N cannot be specified unless start is, but
flags can be given in any case.) The source string is returned
unchanged if there is no match to the pattern. If there is a match,
the source string is returned with the replacement string substituted
for the matching substring. The replacement string can contain \n,
where n is 1 through 9, to indicate that the source substring matching
the n'th parenthesized subexpression of the pattern should be
inserted, and it can contain \& to indicate that the substring
matching the entire pattern should be inserted.
<<<end_quote
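
For reference, two other behaviours described in the quoted paragraph can be checked directly. This is a small sketch, assuming standard_conforming_strings is on (the default), so the backslash in the replacement literal reaches regexp_replace unchanged:

```sql
-- \& inserts the entire matched substring ('bar', then 'baz')
SELECT regexp_replace('foobarbaz', 'b(..)', 'X\&Y', 'g');  -- fooXbarYXbazY

-- No match for the pattern: the source string is returned unchanged
SELECT regexp_replace('foobarbaz', 'q(..)', 'X\1Y', 'g');  -- foobarbaz
```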

<<
The replacement string can contain \n, where n is 1 through 9,
to indicate that the source substring matching the n'th parenthesized
subexpression of the pattern should be inserted
<<
I think it explains examples like:
SELECT regexp_replace('foobarbaz', 'b(..)', 'X\1Y', 'g');

but it does not seem to explain cases like:
SELECT regexp_replace('foobarbaz', 'b(..)', 'X\2Y', 'g');
?


I think it means that in 'b(..)', the parenthesized subexpression (..) is
number 1 and the whole expression counts as parenthesized subexpression n+1,
so it is equivalent to
SELECT regexp_replace('foobarbaz', 'b..', 'XY', 'g');



Re: doc regexp_replace replacement string \n does not explained properly

From: "David G. Johnston"
On Monday, May 20, 2024, jian he <jian.universality@gmail.com> wrote:
hi.

https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP


 If there is a match,
the source string is returned with the replacement string substituted
for the matching substring.


This happens regardless of the presence of parentheses.



 The replacement string can contain \n,
where n is 1 through 9, to indicate that the source substring matching
the n'th parenthesized subexpression of the pattern should be
inserted, and it can contain \& to indicate that the substring
matching the entire pattern should be inserted.

Then, if the replacement text contains “\n” expressions, those are replaced with the text captured by the corresponding parenthesized group.


<<
I think it explains examples like:
SELECT regexp_replace('foobarbaz', 'b(..)', 'X\1Y', 'g');

With the 'g' (global) flag, there are two matches to process:

foobarbaz
fooX\1YX\1Y
fooXarYXazY
 

but it does not seem to explain cases like:
SELECT regexp_replace('foobarbaz', 'b(..)', 'X\2Y', 'g');


foobarbaz
fooX\2YX\2Y
fooX{empty string, no second capture group}YX{empty}Y
fooXYXY

The docs are correct, though it would probably be worth stating explicitly that a reference to a missing capture group is replaced with an empty string rather than raising an error.
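
To make the missing-group case concrete, the two queries below can be run side by side. This is only a sketch of the behaviour walked through above, again assuming standard_conforming_strings is on (the default):

```sql
-- The pattern has a single capture group, so \2 refers to a group that
-- does not exist; it expands to the empty string and no error is raised.
SELECT regexp_replace('foobarbaz', 'b(..)', 'X\2Y', 'g');  -- fooXYXY

-- Same result as a replacement string with no back-reference at all
SELECT regexp_replace('foobarbaz', 'b..', 'XY', 'g');      -- fooXYXY
```

The results agree because \2 contributes nothing to the replacement, not because the whole match is counted as an extra group; the whole match is referenced with \&.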

David J.