Thread: Re: Postgresql 8.1: plperl code works with LATIN1, fail
> In an 8.1.6 UTF-8 database this example returns false; in 8.2.1 it
> returns true. See the following commit message and the related bug
> report regarding PL/Perl and UTF-8:
>
> http://archives.postgresql.org/pgsql-committers/2006-10/msg00277.php
> http://archives.postgresql.org/pgsql-bugs/2006-10/msg00077.php
>
> If you can't upgrade to 8.2 then you might be able to work around
> the problem by creating the function as plperlu and adding 'use utf8;'.
> --
> Michael Fuhr

Hello Michael!

As far as I know, 'use utf8;' normally just tells Perl that the source code is written in UTF-8 and nothing more. For converting data to and from UTF-8, the Encode module is usually used. Or is this different for plperlu?

Greetings,
Matthias
On Mon, Jan 29, 2007 at 01:34:47PM +0100, Matthias.Pitzl@izb.de wrote:
> > If you can't upgrade to 8.2 then you might be able to work around
> > the problem by creating the function as plperlu and adding 'use utf8;'.
>
> As far as I know, 'use utf8;' normally just tells Perl that the source code
> is written in UTF-8 and nothing more.

The string literals in the PL/Perl function body are UTF-8 but Perl isn't treating them as such. Isn't "use utf8" or "use encoding 'utf8'" the way to tell Perl to do so? The perluniintro manual page says this:

    Only one case remains where an explicit "use utf8" is needed: if your
    Perl script itself is encoded in UTF-8, you can use UTF-8 in your
    identifier names, and in string and regular expression literals, by
    saying "use utf8".

Isn't that the situation here? The PL/Perl function body is a string encoded in the database's encoding, which in this case is UTF-8.

> For converting data to and from UTF-8, the Encode module is usually used.
> Or is this different for plperlu?

Isn't the Encode module used for doing explicit conversions? I think the goal is not to have to do so, i.e., to have PL/Perl treat string literals as UTF-8 if the database encoding is UTF-8. PostgreSQL 8.2 does so but earlier versions don't.

--
Michael Fuhr
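To illustrate the distinction discussed in the messages above, between 'use utf8' (which changes how Perl reads literals in the source) and the Encode module (which converts data explicitly), here is a minimal standalone Perl sketch. It is not PL/Perl code and does not reproduce the function from this thread. It assumes the script file itself is saved as UTF-8, and the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: this file is saved as UTF-8, so "ä" occupies two bytes on disk.

# Without "use utf8", Perl reads the literal as two separate bytes.
my $as_bytes = do { no utf8; "ä" };

# With "use utf8" (lexically scoped to this block), the same literal
# is read as a single character.
my $as_chars = do { use utf8; "ä" };

# Encode does the explicit conversion from UTF-8 bytes to characters.
my $decoded = decode('UTF-8', $as_bytes);

printf "without use utf8: length %d\n", length($as_bytes);
printf "with use utf8:    length %d\n", length($as_chars);
printf "after decode():   length %d\n", length($decoded);
printf "decoded equals utf8 literal: %s\n",
    ($decoded eq $as_chars) ? "yes" : "no";
```

Under the stated assumption, the literal read without the pragma has length 2, while both the 'use utf8' literal and the explicitly decoded string have length 1 and compare equal. In other words, 'use utf8' affects only how the source text is interpreted, and Encode is the tool for converting data at run time.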
Re: Postgresql 8.1: plperl code works with LATIN1, fail
From: merlyn@stonehenge.com (Randal L. Schwartz)
>>>>> "Michael" == Michael Fuhr <mike@fuhr.org> writes:

Michael> Isn't that the situation here? The PL/Perl function body is a
Michael> string encoded in the database's encoding, which in this case is
Michael> UTF-8.

If that's always the case, then the embedded Perl interpreter should be started in that mode, perhaps by adding "-Mutf8" to the arg list of the embedded interpreter.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!