On Fri, Jan 26, 2007 at 06:17:03PM +0100, Philippe Lang wrote:
> I've got plperl code that works just fine when the database is
> encoded using LATIN1, but fails as soon as I switch to UTF8.
>
> I've been testing PG 8.1.4 under Linux, and PG 8.1.6 under FreeBSD,
> both behave exactly the same.
[...]
> ERROR: error from Perl function: invalid input syntax for integer: "" at line 54.

The function has several integer output parameters and in some cases
the code sets the output value to '' (empty string). A couple of
those cases (larg_maconnerie, haut_maconnerie) involve comparisons
against strings with non-ASCII characters -- if you add an elog()
statement in each of those places you'll probably see that at least
one of them is being reached unexpectedly.
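
A minimal sketch of that kind of elog() check follows. The function
name debug_compare and the literal 'é' are made up for illustration and
are not taken from the original function; the idea is only to log how
Perl sees the argument whenever the comparison fails.

```sql
-- Hypothetical example: debug_compare and the 'é' literal are illustrative only.
CREATE FUNCTION debug_compare(text) RETURNS integer AS $$
    my ($value) = @_;
    if ($value eq 'é') {
        return 1;
    }
    # The comparison failed: log the length and the numeric code of
    # every character Perl sees in the argument.
    my $codes = join(' ', map { ord($_) } split(//, $value));
    elog(NOTICE, "no match: length " . length($value) . ", codes $codes");
    return 0;
$$ LANGUAGE plperl;

SELECT debug_compare('é');
```

If the NOTICE appears for input that ought to match, the branch is
being reached unexpectedly, and the logged codes show whether the
argument arrived as one character or as several bytes.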

Aside from the fact that an empty string isn't a valid integer, I
think the problem can be reduced to the following example:

CREATE FUNCTION test(text) RETURNS boolean AS $$
return ($_[0] eq 'ä') ? 't' : 'f';
$$ LANGUAGE plperl IMMUTABLE STRICT;

SELECT test('ä');
 test
------
 f
(1 row)

In an 8.1.6 UTF-8 database this example returns false; in 8.2.1 it
returns true. See the following commit message and the related bug
report regarding PL/Perl and UTF-8:

http://archives.postgresql.org/pgsql-committers/2006-10/msg00277.php
http://archives.postgresql.org/pgsql-bugs/2006-10/msg00077.php
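
One plausible way such a comparison can fail, offered as an assumption
and not as something the messages above confirm: if one side of the
"eq" is the raw UTF-8 byte sequence and the other is a decoded
character string, Perl treats them as different strings. The standalone
script below (plain Perl, outside the database) shows that effect.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 encoding of 'ä' as two raw bytes.
my $bytes = "\xc3\xa4";
# The same text decoded into a single character, U+00E4.
my $chars = decode('UTF-8', $bytes);

print "bytes length: ", length($bytes), "\n";   # prints 2
print "chars length: ", length($chars), "\n";   # prints 1
# The byte string is compared as two separate characters, so this is false.
print "eq: ", ($bytes eq $chars ? "true" : "false"), "\n";   # prints false
```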

If you can't upgrade to 8.2 then you might be able to work around
the problem by creating the function as plperlu and adding 'use utf8;'.
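
A minimal sketch of that workaround, applied to the test example and
untested on 8.1. The name test_u is hypothetical, chosen only to avoid
clashing with the function above. Note that creating a plperlu function
requires superuser privileges.

```sql
-- Hypothetical name test_u; same body as test(), but untrusted PL/Perl
-- with the utf8 pragma enabled for the function source.
CREATE FUNCTION test_u(text) RETURNS boolean AS $$
    use utf8;
    return ($_[0] eq 'ä') ? 't' : 'f';
$$ LANGUAGE plperlu IMMUTABLE STRICT;

SELECT test_u('ä');
```

If this still returns false, the workaround doesn't apply to your
setup and upgrading is the safer route.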
--
Michael Fuhr