lei@aswsyst.cz writes:
> I have UTF-8 database and using UTF-8 text in plpgsql and plperl functions.
> Everything works ok, only exceptions from plperl functions have bad encoding
> (maybe double encoded).
> Compare output of these 2 functions:
> create or replace function perl_test() returns text
> language plperl as $$
> return "Český text ěščřžýáíé";
> $$;
> create or replace function perl_test_err() returns text
> language plperl as $$
> elog(ERROR, "Česká chyba ěščřžýáíé");
> $$;
I traced through this example to the extent of finding that:
* The string passed to elog() in do_util_elog() seems correctly
encoded.
* So does the string passed to croak() after elog does its longjmp,
which is unsurprising since elog.c doesn't really do anything to it.
* Back in plperl_call_perl_func, we use sv2cstr(ERRSV) to get the
error string. sv2cstr calls "SvPVutf8(sv, len)", and that is what
is giving back the bogusly-encoded data.
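For illustration: the "double encoded" symptom would be consistent with the
UTF-8 bytes of the message being re-encoded as if each byte were a separate
Latin-1 character, which is what Perl's UTF-8 upgrade does to a string that
is not flagged as UTF-8. A minimal standalone C sketch of that transformation
follows. latin1_to_utf8 is a hypothetical helper written for this example; it
is not code from plperl.c, and it only imitates the suspected behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of plperl.c): re-encode a NUL-terminated
 * byte string as UTF-8, treating every input byte as one Latin-1 character.
 * If the input is already UTF-8, the result is "double encoded".
 *
 * "out" must have room for 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, not counting the terminating NUL.
 */
size_t
latin1_to_utf8(const char *in, char *out)
{
    size_t      n = 0;

    assert(in != NULL && out != NULL);

    for (const unsigned char *p = (const unsigned char *) in; *p; p++)
    {
        if (*p < 0x80)
        {
            /* ASCII bytes are unchanged */
            out[n++] = (char) *p;
        }
        else
        {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence */
            out[n++] = (char) (0xC0 | (*p >> 6));
            out[n++] = (char) (0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

Feeding it the two UTF-8 bytes of "Č" (0xC4 0x8C) yields the four bytes
0xC3 0x84 0xC2 0x8C, that is, "Ä" followed by a control character, which
matches the kind of garbage seen in the error message.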
I suspect the root problem is that instead of baldly doing
	croak("%s", edata->message);
in do_util_elog(), we need to do something to inform Perl what encoding
the message string is in. This is beyond my Perl-fu, however.
regards, tom lane