Thread: regexp_replace double quote

regexp_replace double quote

From
Михаил
Date:
Hi!

I need to escape double quotes only:
test=# select regexp_replace('"""{Performer,"Boomwacker ""a"" Recording""}"""', '([^"])"{2}([^"])', '\1\"\2', 'g');
                 regexp_replace
-------------------------------------------------
 """{Performer,"Boomwacker \"a"" Recording\"}"""

This is an unexpected result.

But when I add one more character to ""a"", the result is correct:
test=# select regexp_replace('"""{Performer,"Boomwacker ""a1"" Recording""}"""', '([^"])"{2}([^"])', '\1\"\2', 'g');
                  regexp_replace
--------------------------------------------------
 """{Performer,"Boomwacker \"a1\" Recording\"}"""


I tested on these versions:
 PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc (Gentoo 4.8.3 p1.1, pie-0.5.9) 4.8.3, 64-bit

And:
 PostgreSQL 9.5.3 on x86_64-apple-darwin, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00), 64-bit

And:
 PostgreSQL 9.4.7 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

What's wrong?

--
---
Regards,
Mikhail


Re: regexp_replace double quote

From
hubert depesz lubaczewski
Date:
On Mon, Aug 15, 2016 at 06:27:06PM +0500, Михаил wrote:
> I need to escape double quotes only:
> test=# select regexp_replace('"""{Performer,"Boomwacker ""a"" Recording""}"""', '([^"])"{2}([^"])', '\1\"\2', 'g');
>                  regexp_replace
> -------------------------------------------------
>  """{Performer,"Boomwacker \"a"" Recording\"}"""
>
> This is an unexpected result.
>
> But when I add one more character to ""a"", the result is correct:
> test=# select regexp_replace('"""{Performer,"Boomwacker ""a1"" Recording""}"""', '([^"])"{2}([^"])', '\1\"\2', 'g');
>                   regexp_replace
> --------------------------------------------------
>  """{Performer,"Boomwacker \"a1\" Recording\"}"""

This is because when the first "" is matched, the "a" that follows it is
captured as \2. It is therefore already "used" (consumed), and can't serve
as the leading [^"] of a match for the second "".

You can solve the problem with a lookahead, for example:
$ select regexp_replace('"""{Performer,"Boomwacker ""a"" Recording""}"""', '([^"])"{2}(?=[^"])', '\1\"', 'g');
                 regexp_replace
-------------------------------------------------
 """{Performer,"Boomwacker \"a\" Recording\"}"""
(1 row)

This works because the part inside (?=...) is not "used" (consumed), so it
remains available for the next match.
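
To make the "used" character visible, here is a small sketch that lists the
matches themselves instead of replacing them. The shortened input 'x""a""y'
is an assumption chosen for illustration and is not the original data;
regexp_matches with the 'g' flag returns one row per match, and each row
holds the captured groups.

```sql
-- Consuming pattern: the character after "" is captured as group 2,
-- so the scan resumes after it and the second "" has no left neighbour.
select regexp_matches('x""a""y', '([^"])"{2}([^"])', 'g');
-- expected: one row, {x,a}

-- Lookahead pattern: the character after "" is only checked, not consumed,
-- so it can act as the left neighbour of the next match.
select regexp_matches('x""a""y', '([^"])"{2}(?=[^"])', 'g');
-- expected: two rows, {x} and {a}
```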

Not sure if I'm clear, but hopefully you'll understand what I'm trying to
explain :)

Best regards,

depesz



Re: regexp_replace double quote

From
"David G. Johnston"
Date:
On Mon, Aug 15, 2016 at 9:27 AM, Михаил <m.nasedkin@gmail.com> wrote:
> Hi!
>
> I need to escape double quotes only:
> test=# select regexp_replace('"""{Performer,"Boomwacker ""a"" Recording""}"""', '([^"])"{2}([^"])', '\1\"\2', 'g');
>                  regexp_replace
> -------------------------------------------------
>  """{Performer,"Boomwacker \"a"" Recording\"}"""

What is the goal you are trying to accomplish? It's possible to do what you ask, but it should be done only if no other solution is feasible.

> This is an unexpected result.
>
> But when I add one more character to ""a"", the result is correct:
> test=# select regexp_replace('"""{Performer,"Boomwacker ""a1"" Recording""}"""', '([^"])"{2}([^"])', '\1\"\2', 'g');
>                   regexp_replace
> --------------------------------------------------
>  """{Performer,"Boomwacker \"a1\" Recording\"}"""


<([^"])"{2}([^"])> on < ""a""> consumes < ""a>, leaving <"">, which doesn't match your pattern since there is nothing before the double quotes to satisfy the [^"].
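
The same consumption explains why the ""a1"" variant happened to work. As a
sketch on a shortened, made-up input ('x""a1""y' is an illustration, not the
original data), listing the matches shows that the first match consumes only
the a, which leaves the 1 free to satisfy the leading [^"] of the second
match:

```sql
-- With a two-character word, each "" gets its own pair of neighbours:
-- match 1 captures x and a, match 2 captures 1 and y.
select regexp_matches('x""a1""y', '([^"])"{2}([^"])', 'g');
-- expected: two rows, {x,a} and {1,y}
```

With a one-character word there is only one candidate neighbour between the
two "" pairs, so the pattern without look-ahead can match just one of them.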

See depesz's simultaneous post for the solution using look-ahead.

David J.

Re: regexp_replace double quote

From
Михаил
Date:
Thank you!

2016-08-15 18:36 GMT+05:00, hubert depesz lubaczewski <depesz@depesz.com>:
> This is because when the first "" is matched, the "a" that follows it is
> captured as \2. It is therefore already "used" (consumed), and can't serve
> as the leading [^"] of a match for the second "".
>
> You can solve the problem with a lookahead, for example:
> $ select regexp_replace('"""{Performer,"Boomwacker ""a"" Recording""}"""',
> '([^"])"{2}(?=[^"])', '\1\"', 'g');
>                  regexp_replace
> -------------------------------------------------
>  """{Performer,"Boomwacker \"a\" Recording\"}"""
> (1 row)
>
> This works because the part inside (?=...) is not "used" (consumed), so it
> remains available for the next match.
>
> Not sure if I'm clear, but hopefully you'll understand what I'm trying to
> explain :)
>
> Best regards,
>
> depesz
>
>


--
---
Regards,

Mikhail