Thread: BUG #5010: perl iconv function returns ? character
The following bug has been logged online:

Bug reference:      5010
Logged by:          Lampa
Email address:      lampacz@gmail.com
PostgreSQL version: 8.4.0
Operating system:   Debian testing/unstable
Description:        perl iconv function returns ? character
Details:

See the difference (an example is the best explanation):

psql -U postgres -p 5433
psql (8.4.0, server 8.3.7)
WARNING: psql version 8.4, server version 8.3.
         Some psql features might not work.
Type "help" for help.

postgres=# select my_ascii2('Bockaničová');
  my_ascii2
-------------
 Bockanicova
(1 row)

psql -U postgres -p 5432
psql (8.4.0)
Type "help" for help.

postgres=# select my_ascii2('Bockaničová');
  my_ascii2
-------------
 Bockani?ov?
(1 row)

The function my_ascii2 is defined as:

    CREATE FUNCTION my_ascii2(text) RETURNS text AS $$
        use strict;
        use Text::Iconv;
        my $conv = Text::Iconv->new("UTF8", "ASCII//TRANSLIT");
        return $conv->convert($_[0]);
    $$ LANGUAGE plperlu;

The 8.3.x version works perfectly; 8.4.0 shows the problem. In more
complicated queries (joins, conditions) that use my_ascii2, an incorrect
number of rows is returned.
On Tue, Aug 25, 2009 at 8:15 AM, Lampa <lampacz@gmail.com> wrote:
>
> The following bug has been logged online:
>
> Bug reference:      5010
> Logged by:          Lampa
> Email address:      lampacz@gmail.com
> PostgreSQL version: 8.4.0
> Operating system:   Debian testing/unstable
> Description:        perl iconv function returns ? character
> Details:
>
> See the difference (example is the best explanation):
>
> psql -U postgres -p 5433
> psql (8.4.0, server 8.3.7)
> WARNING: psql version 8.4, server version 8.3.
>          Some psql features might not work.
> Type "help" for help.
>
> postgres=# select my_ascii2('Bockaničová');
>   my_ascii2
> -------------
>  Bockanicova
> (1 row)
>
> psql -U postgres -p 5432
> psql (8.4.0)
> Type "help" for help.
>
> postgres=# select my_ascii2('Bockaničová');
>   my_ascii2
> -------------
>  Bockani?ov?
> (1 row)
>
>
> function my_ascii2 is defined:
> CREATE FUNCTION my_ascii2(text) RETURNS text AS $$ use strict; use
> Text::Iconv; my $conv = Text::Iconv->new("UTF8", "ASCII//TRANSLIT"); return
> $conv->convert($_[0]); $$ LANGUAGE plperlu;
>
> 8.3.x version works perfectly, 8.4.0 problem

I can't reproduce this on 8.4.0 or CVS HEAD. I think that whatever
problem you have here is not a PostgreSQL bug.

> in more complicated queries (joins, conditions) after my_ascii2 function
> query are returned incorect count of rows

This may be related to whatever your other problem is.

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Aug 25, 2009 at 8:15 AM, Lampa <lampacz@gmail.com> wrote:
>> function my_ascii2 is defined:
>> CREATE FUNCTION my_ascii2(text) RETURNS text AS $$ use strict; use
>> Text::Iconv; my $conv = Text::Iconv->new("UTF8", "ASCII//TRANSLIT"); return
>> $conv->convert($_[0]); $$ LANGUAGE plperlu;
>>
>> 8.3.x version works perfectly, 8.4.0 problem

> I can't reproduce this on 8.4.0 or CVS HEAD. I think that whatever
> problem you have here is not a PostgreSQL bug.

I suspect that function will only work as desired in a database with
UTF8 server_encoding. Maybe the problem is the 8.4 database is set up
with some other encoding?

regards, tom lane
The cluster was created with the cs_CZ.UTF-8 collation.

                               List of databases
   Name    |  Owner   | Encoding |  Collation  |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | cs_CZ.UTF-8 | cs_CZ.UTF-8 |
 template0 | postgres | UTF8     | cs_CZ.UTF-8 | cs_CZ.UTF-8 | =c/postgres
                                                             : postgres=CTc/postgres
 template1 | postgres | UTF8     | cs_CZ.UTF-8 | cs_CZ.UTF-8 | =c/postgres
                                                             : postgres=CTc/postgres
(3 rows)

2009/9/6 Tom Lane <tgl@sss.pgh.pa.us>:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Aug 25, 2009 at 8:15 AM, Lampa <lampacz@gmail.com> wrote:
>>> function my_ascii2 is defined:
>>> CREATE FUNCTION my_ascii2(text) RETURNS text AS $$ use strict; use
>>> Text::Iconv; my $conv = Text::Iconv->new("UTF8", "ASCII//TRANSLIT"); return
>>> $conv->convert($_[0]); $$ LANGUAGE plperlu;
>>>
>>> 8.3.x version works perfectly, 8.4.0 problem
>
>> I can't reproduce this on 8.4.0 or CVS HEAD. I think that whatever
>> problem you have here is not a PostgreSQL bug.
>
> I suspect that function will only work as desired in a database with
> UTF8 server_encoding. Maybe the problem is the 8.4 database is set up
> with some other encoding?
>
> regards, tom lane
>

--
Lampa
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Aug 25, 2009 at 8:15 AM, Lampa <lampacz@gmail.com> wrote:
>> function my_ascii2 is defined:
>> CREATE FUNCTION my_ascii2(text) RETURNS text AS $$ use strict; use
>> Text::Iconv; my $conv = Text::Iconv->new("UTF8", "ASCII//TRANSLIT"); return
>> $conv->convert($_[0]); $$ LANGUAGE plperlu;
>>
>> 8.3.x version works perfectly, 8.4.0 problem

> I can't reproduce this on 8.4.0 or CVS HEAD. I think that whatever
> problem you have here is not a PostgreSQL bug.

Hmm ... I can reproduce the problem on Fedora 11. Given a UTF8-encoded
database (I don't think locale matters), 8.3.7 works as described, but
8.3.8 fails as described, as do 8.4.1 and HEAD. Given that the only
difference in plperl.c between 8.3.7 and 8.3.8 is the addition of the
PERL_SYS_INIT3 call, I have to suppose that that's screwing up
Text::Iconv somehow.

I'd bet a small amount of money that this is somehow related to the
UTF8-specific code in plperl_safe_init(), which always struck me as
unexplained hocus-pocus. Since the test function is plperlu,
plperl_safe_init() obviously can't be directly to blame; but I'm
thinking that what it's really doing is papering over some missed
initialization issue that affects plperlu functions too.

regards, tom lane
I wrote:
> Hmm ... I can reproduce the problem on Fedora 11. Given a UTF8-encoded
> database (I don't think locale matters), 8.3.7 works as described, but
> 8.3.8 fails as described, as do 8.4.1 and HEAD. Given that the only
> difference in plperl.c between 8.3.7 and 8.3.8 is the addition of the
> PERL_SYS_INIT3 call, I have to suppose that that's screwing up
> Text::Iconv somehow.

Huh ... belay that. Diking out the PERL_SYS_INIT3 call doesn't make the
problem go away. What I was actually comparing was the current Fedora 11
8.3.7 RPMs with 8.3.8 built from source. I would have said that the RPMs
are not built in any way significantly different from a straight
configure-and-build-from-source, but it appears that something in the
RPM build options makes this work. Investigating ...

(Whether this has anything to do with the OP's problem on Debian remains
to be determined, but it's definitely busted on Fedora.)

regards, tom lane
On Sun, 2009-09-06 at 12:52 -0400, Tom Lane wrote:
> I would have said that the RPMs are
> not built in any way significantly different from a straight
> configure-and-build-from-source, but it appears that something in
> the RPM build options makes this work. Investigating ...

Could it be because of the perl-Text-Iconv package?

--
Devrim GÜNDÜZ, RHCE
Command Prompt - http://www.CommandPrompt.com
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org
Devrim GÜNDÜZ <devrim@gunduz.org> writes:
> On Sun, 2009-09-06 at 12:52 -0400, Tom Lane wrote:
>> I would have said that the RPMs are
>> not built in any way significantly different from a straight
>> configure-and-build-from-source, but it appears that something in
>> the RPM build options makes this work. Investigating ...

> Could it be because of perl-Text-Iconv package?

Well, you have to install that before you can test the problem at all,
but the working and non-working cases are using the same Text::Iconv
code.

I think I just figured it out though. I had dismissed locale as not
being the critical difference, but that was foolish (and I paid for it
with an hour of wasted effort). My RPM installation is working because
it defaults to en_US locale, and my source installation is not working
because it uses C locale. If I switch to either en_US or cs_CZ locale
then Text::Iconv gives the expected result.

I now believe that the OP's actual problem is related to this:
http://archives.postgresql.org/pgsql-committers/2009-07/msg00098.php
He's probably ending up in C locale internally. If so it'll be fixed
in 8.4.1.

The only observation not accounted for is Robert's statement that he
couldn't reproduce it in 8.4.0, but I think the behavior with the bug
is dependent on the postmaster's starting environment, so it would be
easy to fail to duplicate someone else's result.

regards, tom lane
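
The locale dependence described in the last message can be checked
outside PostgreSQL. The following is a minimal sketch, assuming a
glibc-based system with the iconv command-line tool and an installed
en_US.UTF-8 locale (both are assumptions, not details from the thread);
the helper name translit is hypothetical. It runs the same UTF-8 to
ASCII//TRANSLIT conversion that my_ascii2 requests, once per locale, so
the two results can be compared.

```shell
#!/bin/sh
# Hypothetical helper: convert UTF-8 text to ASCII with transliteration,
# running iconv under the locale given as the first argument.
#   $1 = locale name (e.g. C or en_US.UTF-8)
#   $2 = UTF-8 input text
translit() {
    printf '%s' "$2" | LC_ALL="$1" iconv -f UTF-8 -t ASCII//TRANSLIT
}

# Same input as in the bug report, converted under two locales.
# en_US.UTF-8 is an assumption; substitute any UTF-8 locale that is
# actually installed (see `locale -a`).
for loc in C en_US.UTF-8; do
    printf '%s: ' "$loc"
    translit "$loc" 'Bockaničová'
    printf '\n'
done
```

If the diagnosis in this thread applies, the C-locale run would be
expected to substitute ? for the accented letters while the UTF-8
locale run transliterates them. The exact output depends on the libc
in use and on which locales are installed.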