Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases
Msg-id 1342035182-sup-2265@alvh.no-ip.org
In response to Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases  (Alex Hunsaker <badalex@gmail.com>)
Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
Excerpts from Alvaro Herrera's message of Tue Jul 10 16:23:57 -0400 2012:
> Excerpts from Kyotaro HORIGUCHI's message of Tue Jul 03 04:59:38 -0400 2012:
> > Hello, Here is regression test runs on pg's also built with
> > cygwin-gcc and VC++.
> >
> > The patches attached following,
> >
> > - plperl_sql_ascii-4.patch         : fix for pl/perl utf8 vs sql_ascii
> > - plperl_sql_ascii_regress-1.patch : regression test for this patch.
> >                                      I added some tests on encoding to this.
> >
> > I will mark this patch as 'ready for committer' after this.
>
> I have pushed these changes to HEAD, 9.2 and 9.1.  Instead of the games
> with plperl_lc_*.out being copied around, I just used the ASCII version
> as plperl_lc_1.out and the UTF8 one as plperl_lc.out.

... and this story hasn't ended yet, because one of the new tests is
failing.  See here:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=magpie&dt=2012-07-11%2010%3A00%3A04

The interesting part of the diff is:

***************
*** 34,41 ****
    return ($str ne $match ? $code."DIFFER" : $code."ab\x{5ddd}cd");
  $$ LANGUAGE plperl;
  SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea,'escape')
!           encode
! --------------------------
!  NotUTF8:ab\345\267\235cd
! (1 row)
!
--- 34,38 ----
    return ($str ne $match ? $code."DIFFER" : $code."ab\x{5ddd}cd");
  $$ LANGUAGE plperl;
  SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea,'escape')
! ERROR:  character with byte sequence 0xe5 0xb7 0x9d in encoding "UTF8" has no equivalent in encoding "LATIN1"
! CONTEXT:  PL/Perl function "perl_utf_inout"
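
To illustrate why the result cannot be converted, here is a small standalone sketch in Python (not PL/Perl, and not part of the patch or the regression test). It only shows that the character the function returns, U+5DDD, encodes to the bytes 0xe5 0xb7 0x9d in UTF-8 and has no LATIN1 representation. The helper name representable is made up for this illustration, and Python's "latin-1" codec is assumed to stand in for the server's LATIN1 encoding.

```python
# Standalone illustration; Python's codecs stand in for the server's
# encoding conversion, which is an assumption of this sketch.

# The character the test function returns between "ab" and "cd".
ch = "\u5ddd"

# Its UTF-8 form is the byte sequence named in the ERROR line.
utf8_bytes = ch.encode("utf-8")


def representable(text, encoding):
    """Return True if text can be encoded in the given encoding.

    Hypothetical helper for this example only.
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Octal-escaped rendering of the bytes, in the style of the expected
# output shown in the first hunk of the diff.
escaped = "".join("\\%03o" % b for b in utf8_bytes)

if __name__ == "__main__":
    print(utf8_bytes.hex(" "))
    print(escaped)
    print(representable(ch, "latin-1"))
```

If the buildfarm member's database is in LATIN1, as the error message suggests, the expected output can never be produced there, whichever alternate expected file is used.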


I am not sure what we can do here other than remove this function and
query from the test.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

