Re: replace inside regexp_replace - Mailing list pgsql-general

From Oliver Kohll
Subject Re: replace inside regexp_replace
Date
Msg-id CAMS=m5LwznPRV66+Ouh4emxxaLObNnsmwbVKLp_Yp+HiV3HGXA@mail.gmail.com
Whole thread Raw
In response to Re: replace inside regexp_replace  (Francisco Olarte <folarte@peoplecall.com>)
List pgsql-general
On Mon, 21 Jun 2021 at 15:09, Francisco Olarte <folarte@peoplecall.com> wrote:
Oliver:

On Mon, Jun 21, 2021 at 3:27 PM Oliver Kohll <oliver@agilechilli.com> wrote:
...
> My attempt to do that is the regex
> select regexp_replace(
> 'here is [[my text]] to replace and [[some more]]',
> E'\\[\\[(.*?)\\]\\]',
> replace(E'\\1', ' ', '_'),
> 'g'
> );
> which results in
> 'here is my text to replace and some more'

> It half works, i.e. it removes the brackets but doesn't seem to process the inner replace. It's as if the select were just
> select regexp_replace(
> 'here is [[my text]] to replace and [[some more]]',
> E'\\[\\[(.*?)\\]\\]',
> E'\\1',
> 'g'
> );

> I've a feeling I'm missing something fundamental, any idea what?

You are assuming replace will magically work in a way it does not. The
inner replace is evaluated first:

> select replace(E'\\1', ' ', '_');
 replace
---------
 \1

and it's result is passed as 3rd argument to the outer replace, so
both select are equivalent.

What you want to do can be done in some languages passing a closure,
or a function, to their replace function, or with special forms ( like
the e modifier in perl s/// ), but I'm not sure it can be done.

On languages with basic regex support, like I think SQL is, you
normally have to either split the string in match/no match or do a
multiple match ( match something like (.*?)\[\[(.*?)\]\]  with two
captures ) and loop in the result aplying your second replacement (
which is what perl does behind the scenes, and other languages do )

In perl you can do it with something like:

$ perl -pe 's{\[\[(.*?)\]\]}{ $1=~s/ /_/gr}eg'
here is [[my text]] to replace and [[some more]]',
here is my_text to replace and some_more',

But note the magic e there.

In python you can use the function form:

re.sub(pattern, repl, string, count=0, flags=0)

Return the string obtained by replacing ......repl can be a string or
a function; if it is a string,....
If repl is a function, it is called for every non-overlapping
occurrence of pattern. The function takes a single match object
argument, and returns the replacement string.

An so on on other languages, but in sql

regexp_replace ( string text, pattern text, replacement text [, flags
text ] ) → text

The replacement is a plain text ( and AFAIK you cannot use functions
as values in sql ).

You could probably define your function doing that if you have any PL
installed in your DB.

Francisco Olarte.

Right, thanks, I have a better understanding now. The calling app is written in Java so I will write a routine there to do it instead.

Cheers
Oliver

pgsql-general by date:

Previous
From: Vijaykumar Jain
Date:
Subject: Re: Postgres PANIC when it could not open file in pg_logical/snapshots directory
Next
From: Vijaykumar Jain
Date:
Subject: Re: second CTE kills perf