Re: BUG #13638: Exception texts from plperl has bad encoding - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #13638: Exception texts from plperl has bad encoding
Msg-id 18105.1443208708@sss.pgh.pa.us
In response to BUG #13638: Exception texts from plperl has bad encoding  (lei@aswsyst.cz)
Responses Re: BUG #13638: Exception texts from plperl has bad encoding  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
lei@aswsyst.cz writes:
> I have UTF-8 database and using UTF-8 text in plpgsql and plperl functions.
> Everything works ok, only exceptions from plperl functions have bad encoding
> (maybe double encoded).

> Compare output of these 2 functions:

> create or replace function perl_test() returns text
> language plperl as $$
>   return "Český text ěščřžýáíé";
> $$;

> create or replace function perl_test_err() returns text
> language plperl as $$
>   elog(ERROR, "Česká chyba ěščřžýáíé");
> $$;

I traced through this example to the extent of finding that:

* The string passed to elog() in do_util_elog() seems correctly
encoded.

* So does the string passed to croak() after elog does its longjmp,
which is unsurprising since elog.c doesn't really do anything to it.

* Back in plperl_call_perl_func, we use sv2cstr(ERRSV) to get the
error string.  sv2cstr calls "SvPVutf8(sv, len)", and that is what
is giving back the bogusly-encoded data.

I suspect the root problem is that instead of baldly doing

        croak("%s", edata->message);

in do_util_elog(), we need to do something to inform Perl what encoding
the message string is in.  This is beyond my Perl-fu, however.
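
A small standalone sketch of how the "maybe double encoded" symptom could arise, assuming the cause is that Perl sees the croak() message as a byte string without its UTF8 flag set. In that case SvPVutf8 would be expected to upgrade the string by treating each byte as a Latin-1 character and encoding it as UTF-8 again. The function latin1_to_utf8 below is a hypothetical illustration of that byte-wise upgrade; it is not part of Perl or PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: re-encode a byte string as UTF-8 as though every
 * input byte were a Latin-1 code point.  Feeding it bytes that are
 * already UTF-8 yields the "double encoded" result.
 *
 * Writes a NUL-terminated result into out and returns its length in
 * bytes, or (size_t) -1 if out_cap is too small.
 */
size_t
latin1_to_utf8(const unsigned char *in, size_t in_len,
               unsigned char *out, size_t out_cap)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (out_cap == 0)
        return (size_t) -1;

    for (size_t i = 0; i < in_len; i++)
    {
        unsigned char b = in[i];

        if (b < 0x80)
        {
            /* ASCII passes through unchanged; keep room for the NUL */
            if (o + 1 >= out_cap)
                return (size_t) -1;
            out[o++] = b;
        }
        else
        {
            /* 0x80..0xFF become a two-byte UTF-8 sequence */
            if (o + 2 >= out_cap)
                return (size_t) -1;
            out[o++] = (unsigned char) (0xC0 | (b >> 6));
            out[o++] = (unsigned char) (0x80 | (b & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

For example, "Č" is the two bytes C4 8C in UTF-8; passing those through the function gives the four bytes C3 84 C2 8C, which a UTF-8 client would display as two unrelated characters instead of one. If this is what happens, the fix would be to hand Perl the message with its UTF8 flag set rather than as raw bytes.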

            regards, tom lane
