Thread: utf-8 flag always off in plperl function arguments

utf-8 flag always off in plperl function arguments

From
David Kamholz
Date:
Hello:

Since 5.6 or so, perl has stored an internal flag on every string to
mark whether it's UTF-8 or not. For data of unknown encoding, such as
data read from files, the default is latin1, but it can be changed with
use encoding 'utf8'. Now, I have a postgresql database in charset
UNICODE. So, postgres knows the data is UTF-8. However, when passing
arguments to plperl functions, no matter what the charset, postgres
ALWAYS sets the UTF-8 flag to off. This means that the only way to
handle the string properly in perl, when it matters that perl knows
it's UTF-8, is to use utf8::upgrade -- on every argument, in every
function, every time. This is rather kludgy, considering there already
exists a way to fix it by calling the libperl API properly. It would be
nice if it could be fixed in 8.0 final (the behavior is exactly the same
in 8.0 beta and 7.4.6).
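
For illustration, here is a minimal sketch of the per-argument workaround
described above, written as standalone Perl rather than an actual plperl
function. It assumes the argument arrives as raw UTF-8 octets with the
flag off. The helper name decode_arg is hypothetical. Note that the
sketch uses utf8::decode rather than utf8::upgrade: upgrade only changes
the internal representation and would still treat each octet as its own
character, whereas decode interprets the octets as UTF-8.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PL/Perl): turn an argument that
# arrives as raw UTF-8 octets (flag off) into a character string.
sub decode_arg {
    my ($arg) = @_;    # copy, so the caller's value is left untouched
    # utf8::decode works in place; it returns false and leaves the
    # value unchanged if the octets are not valid UTF-8.
    utf8::decode($arg);
    return $arg;
}

# "e with acute accent" as two UTF-8 octets, standing in for what a
# function would receive from a UNICODE database (an assumption).
my $raw  = "\xC3\xA9";
my $text = decode_arg($raw);

printf "raw:     flag=%d length=%d\n",
    (utf8::is_utf8($raw)  ? 1 : 0), length $raw;
printf "decoded: flag=%d length=%d\n",
    (utf8::is_utf8($text) ? 1 : 0), length $text;
```

The raw value is seen as two characters with the flag off; the decoded
copy is one character with the flag on. Having to repeat this call for
every argument of every function is the kludge in question.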

Regards,
Dave

Re: utf-8 flag always off in plperl function arguments

From
Tom Lane
Date:
David Kamholz <davekam@pobox.com> writes:
> This is rather kludgy, considering there already
> exists a way to fix it by calling the libperl API properly.

If you know how to do it, how about offering a patch?

            regards, tom lane