Re: regexp_replace double quote - Mailing list pgsql-general

From hubert depesz lubaczewski
Subject Re: regexp_replace double quote
Msg-id 20160815133649.3mruwuwbqsce3yml@depesz.com
In response to regexp_replace double quote  (Михаил <m.nasedkin@gmail.com>)
Responses Re: regexp_replace double quote  (Михаил <m.nasedkin@gmail.com>)
List pgsql-general
On Mon, Aug 15, 2016 at 06:27:06PM +0500, Михаил wrote:
> I need to escape double quotes only:
> test=# select regexp_replace('"""{Performer,"Boomwacker ""a"" Recording""}"""', '([^"])"{2}([^"])', '\1\"\2', 'g');
>                  regexp_replace
> -------------------------------------------------
>  """{Performer,"Boomwacker \"a"" Recording\"}"""
>
> This is unexpected result.
>
> But when added one symbol to ""a"" the result is right:
> test=# select regexp_replace('"""{Performer,"Boomwacker ""a1"" Recording""}"""', '([^"])"{2}([^"])', '\1\"\2', 'g');
>                   regexp_replace
> --------------------------------------------------
>  """{Performer,"Boomwacker \"a1\" Recording\"}"""

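This happens because, when the first "" is matched, the "a" that follows it
is captured as \2. That "a" is then already consumed by the first match, so
it can't serve as the leading [^"] of a match for the second "". With "a1",
the "1" is still free to start the second match, which is why that case works.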
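
To make the "used" character visible, here is a small sketch that lists what each match actually covers. It uses Python's re module rather than PostgreSQL, on the assumption that both engines scan left to right for non-overlapping matches; the variable names are only for illustration.

```python
import re

# Same input and pattern as in the original query.
text = '"""{Performer,"Boomwacker ""a"" Recording""}"""'
consuming = re.compile(r'([^"])"{2}([^"])')

# Each match swallows the character after the "" as group 2.
matches = [m.group(0) for m in consuming.finditer(text)]
print(matches)  # [' ""a', 'g""}']

# In the replacement template, \\ is a literal backslash.
result = consuming.sub(r'\1\\"\2', text)
print(result)  # """{Performer,"Boomwacker \"a"" Recording\"}"""
```

The first match is ' ""a', so the search resumes after the "a" and the second "" no longer has a non-quote character in front of it.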

Using a lookahead for the trailing character solves the problem:
$ select regexp_replace('"""{Performer,"Boomwacker ""a"" Recording""}"""', '([^"])"{2}(?=[^"])', '\1\"', 'g');
                 regexp_replace
-------------------------------------------------
 """{Performer,"Boomwacker \"a\" Recording\"}"""
(1 row)

This works because the part inside (?=...) is only checked, not consumed, so
the same character is still available to start the next match.
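
The same check can be sketched with the lookahead pattern, again in Python's re module as a stand-in for PostgreSQL (an assumption: both support (?=...) with the same meaning here). The names are illustrative only.

```python
import re

text = '"""{Performer,"Boomwacker ""a"" Recording""}"""'
# The trailing [^"] is now a lookahead: it is tested but not consumed.
lookahead = re.compile(r'([^"])"{2}(?=[^"])')

# Matches now end right after the "", leaving the next character free.
spans = [m.group(0) for m in lookahead.finditer(text)]
print(spans)  # [' ""', 'a""', 'g""']

# There is no group 2 any more, so the replacement only needs \1.
fixed = lookahead.sub(r'\1\\"', text)
print(fixed)  # """{Performer,"Boomwacker \"a\" Recording\"}"""
```

Here the "a" is left over after the first match and becomes the leading character of the second one.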

Not sure if I'm clear, but hopefully you'll understand what I'm trying to
explain :)

Best regards,

depesz
