Thread: BUG #5039: 'i' flag i in regexp_replace ignored for polish letters

BUG #5039: 'i' flag i in regexp_replace ignored for polish letters

From
"Kamil Roman"
Date:
The following bug has been logged online:

Bug reference:      5039
Logged by:          Kamil Roman
Email address:      kamil.lech.roman@gmail.com
PostgreSQL version: 8.3.7
Operating system:   Windows XP
Description:        'i' flag i in regexp_replace ignored for polish letters
Details:

select  regexp_replace('LUBŻKOĄŚĆĘŁŃÓ','[ośżźćęąłńó]',
'_','ig');

returns 'LUBŻK_ĄŚĆĘŁŃÓ' and it should return LUB_K_______

Re: BUG #5039: 'i' flag i in regexp_replace ignored for polish letters

From
Robert Haas
Date:
On Sat, Sep 5, 2009 at 5:42 AM, Kamil Roman <kamil.lech.roman@gmail.com> wrote:
>
> The following bug has been logged online:
>
> Bug reference:      5039
> Logged by:          Kamil Roman
> Email address:      kamil.lech.roman@gmail.com
> PostgreSQL version: 8.3.7
> Operating system:   Windows XP
> Description:        'i' flag i in regexp_replace ignored for polish letters
> Details:
>
> select  regexp_replace('LUBŻKOĄŚĆĘŁŃÓ','[ośżźćęąłńó]',
> '_','ig');
>
> returns 'LUBŻK_ĄŚĆĘŁŃÓ' and it should return LUB_K_______

I haven't seen a response to this.   Anyone think this might be a bug?

...Robert

Re: BUG #5039: 'i' flag i in regexp_replace ignored for polish letters

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Sep 5, 2009 at 5:42 AM, Kamil Roman <kamil.lech.roman@gmail.com> wrote:
>> Description:        'i' flag i in regexp_replace ignored for polish letters

> I haven't seen a response to this.   Anyone think this might be a bug?

If he's using a multibyte character set (UTF8 most likely) there is
pretty much 0 hope of it working.  The existing TODO entry for this
links to
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00433.php

            regards, tom lane

Re: BUG #5039: 'i' flag i in regexp_replace ignored for polish letters

From
Kamil Roman
Date:
Hello,
Yes, I have been using UTF-8. Shouldn't this behaviour at least be
documented in the PostgreSQL documentation? I am aware that it is a bug, but
if it is not likely to be fixed soon, IMHO it should be documented somehow.

Regards,
Kamil Roman

2009/10/22 Tom Lane <tgl@sss.pgh.pa.us>

> Robert Haas <robertmhaas@gmail.com> writes:
> > On Sat, Sep 5, 2009 at 5:42 AM, Kamil Roman <kamil.lech.roman@gmail.com> wrote:
> >> Description:        'i' flag i in regexp_replace ignored for polish letters
>
> > I haven't seen a response to this.   Anyone think this might be a bug?
>
> If he's using a multibyte character set (UTF8 most likely) there is
> pretty much 0 hope of it working.  The existing TODO entry for this
> links to
> http://archives.postgresql.org/pgsql-hackers/2008-12/msg00433.php
>
>                        regards, tom lane
>
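
For comparison, the case-insensitive result the report asks for can be reproduced outside the database. The following is a minimal sketch in Python 3, not PostgreSQL code. It assumes Python's `re` module, whose `IGNORECASE` flag applies Unicode case matching to `str` patterns, and the helper name `replace_ignoring_case` is hypothetical. It only illustrates the expected behaviour on the string from the report and says nothing about how the server's regex engine handles multibyte encodings.

```python
import re


def replace_ignoring_case(text: str, letters: str, replacement: str = "_") -> str:
    """Replace each character of `text` that matches one of `letters`,
    ignoring case, with `replacement`.

    Hypothetical helper for illustration only; it mirrors the intent of
    regexp_replace(text, '[letters]', replacement, 'ig') from the report.
    """
    # Build a bracket expression from the given letters; re.escape guards
    # against characters that are special inside a character class.
    pattern = "[" + re.escape(letters) + "]"
    # For str patterns, IGNORECASE uses Unicode case rules, so 'ż' matches 'Ż'.
    return re.sub(pattern, replacement, text, flags=re.IGNORECASE)


subject = "LUBŻKOĄŚĆĘŁŃÓ"
result = replace_ignoring_case(subject, "ośżźćęąłńó")
print(result)
```

Each matched letter is replaced by a single underscore, so the output has the same number of characters as the input; only `L`, `U`, `B` and `K` survive.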