Re: replace inside regexp_replace - Mailing list pgsql-general

From Francisco Olarte
Subject Re: replace inside regexp_replace
Date
Msg-id CA+bJJbyEAWtGGUB8KPMxZqRzgZBwk80u4t66D-Hpt4cpn3cCUQ@mail.gmail.com
Whole thread Raw
In response to replace inside regexp_replace  (Oliver Kohll <oliver@agilechilli.com>)
Responses Re: replace inside regexp_replace  (Oliver Kohll <oliver@agilechilli.com>)
List pgsql-general
Oliver:

On Mon, Jun 21, 2021 at 3:27 PM Oliver Kohll <oliver@agilechilli.com> wrote:
...
> My attempt to do that is the regex
> select regexp_replace(
> 'here is [[my text]] to replace and [[some more]]',
> E'\\[\\[(.*?)\\]\\]',
> replace(E'\\1', ' ', '_'),
> 'g'
> );
> which results in
> 'here is my text to replace and some more'

> It half works, i.e. it removes the brackets but doesn't seem to process the inner replace. It's as if the select were
just
> select regexp_replace(
> 'here is [[my text]] to replace and [[some more]]',
> E'\\[\\[(.*?)\\]\\]',
> E'\\1',
> 'g'
> );

> I've a feeling I'm missing something fundamental, any idea what?

You are assuming replace will magically work in a way it does not. The
inner replace is evaluated first:

> select replace(E'\\1', ' ', '_');
 replace
---------
 \1

and it's result is passed as 3rd argument to the outer replace, so
both select are equivalent.

What you want to do can be done in some languages passing a closure,
or a function, to their replace function, or with special forms ( like
the e modifier in perl s/// ), but I'm not sure it can be done.

On languages with basic regex support, like I think SQL is, you
normally have to either split the string in match/no match or do a
multiple match ( match something like (.*?)\[\[(.*?)\]\]  with two
captures ) and loop in the result aplying your second replacement (
which is what perl does behind the scenes, and other languages do )

In perl you can do it with something like:

$ perl -pe 's{\[\[(.*?)\]\]}{ $1=~s/ /_/gr}eg'
here is [[my text]] to replace and [[some more]]',
here is my_text to replace and some_more',

But note the magic e there.

In python you can use the function form:

re.sub(pattern, repl, string, count=0, flags=0)

Return the string obtained by replacing ......repl can be a string or
a function; if it is a string,....
If repl is a function, it is called for every non-overlapping
occurrence of pattern. The function takes a single match object
argument, and returns the replacement string.

An so on on other languages, but in sql

regexp_replace ( string text, pattern text, replacement text [, flags
text ] ) → text

The replacement is a plain text ( and AFAIK you cannot use functions
as values in sql ).

You could probably define your function doing that if you have any PL
installed in your DB.

Francisco Olarte.



pgsql-general by date:

Previous
From: Thorsten Schöning
Date:
Subject: How to hash a large amount of data within Postgres?
Next
From: Laurenz Albe
Date:
Subject: Re: user privileges