Re: plperlu problem with utf8 [REVIEW] - Mailing list pgsql-hackers

From Andy Colson
Subject Re: plperlu problem with utf8 [REVIEW]
Date
Msg-id 4D33BFA4.8060505@squeakycode.net
In response to Re: plperlu problem with utf8 [REVIEW]  (Alex Hunsaker <badalex@gmail.com>)
List pgsql-hackers
On 01/16/2011 07:14 PM, Alex Hunsaker wrote:
> On Sat, Jan 15, 2011 at 14:20, Andy Colson<andy@squeakycode.net>  wrote:
>>
>> This is a review of  "plperl encoding issues"
>>
>> https://commitfest.postgresql.org/action/patch_view?id=452
>
> Thanks for taking the time to review!
>
> [...]
>>
>> The Patch:
>> ==========
>> Applies clean to git head as of January 15 2011.  PG built with
>> --enable-cassert and --enable-debug seems to run fine with no errors.
>>
>> I don't think regression tests cover plperl, so understandable there are no
>> tests in the patch.
>
> FWIW there are plperl tests, you can do 'make installcheck' from the
> plperl dir or installcheck-world from the top.  However I did not add
> any as AFAIK there is not a way to handle multiple locales with them
> (at least for the automated case).

Oh, cool.  I'd kinda thought 'make check' was the one to run.  I'll have to check out 'make check' vs 'make
installcheck'.


>> There are no manual updates in the patch either, and I think there should be.
>> I think it should be made clear
>> that data (varchar, text, etc., but not bytea) will be passed to perl as
>> UTF-8, regardless of database encoding.
>
> I don't disagree, but I don't see where to put it either.  Maybe it's
> only release note material?
>

I think this page:
http://www.postgresql.org/docs/current/static/plperl-funcs.html

Right after:
"Arguments and results are handled as in any other Perl subroutine: arguments are passed in @_, and a result value is
returned with return or as the last expression evaluated in the function."

Add:

Arguments will be converted from the database's encoding to UTF-8 for use inside plperl, and then converted from UTF-8
back to the database encoding upon return.


OR, that same sentence could be added to the next page:

http://www.postgresql.org/docs/current/static/plperl-data.html


However, this patch brings back DWIM to plperl.  It should just work without having to worry about it.  I'd be OK
either way.
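
For anyone who wants to see what the proposed sentence means in practice, here is a rough sketch of the round trip in plain Perl, outside PostgreSQL.  It is not the patch's code: the round_trip helper, the LATIN1 database and the "reverse" function body are all made up for illustration.  It only uses the core Encode module to mimic "decode on the way in, encode on the way out".

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of plperl): mimics the conversions
# that would happen around a plperl function call.
sub round_trip {
    my ($db_bytes, $db_encoding) = @_;

    # On the way in: bytes in the database encoding become a Perl
    # character string (stored internally by Perl as UTF-8).
    my $chars = decode($db_encoding, $db_bytes);

    # Stand-in for the function body: reverse works per character,
    # so a non-ASCII character is not split into pieces.
    my $result = scalar reverse $chars;

    # On return: the character string goes back to the database encoding.
    return encode($db_encoding, $result);
}

# Assumption: a LATIN1 database, where "cafe" with an accented e
# is stored as the four bytes 63 61 66 E9.
my $out = round_trip("caf\xE9", 'ISO-8859-1');

# Print the returned bytes in hex: E9 66 61 63
print join(' ', map { sprintf '%02X', ord } split //, $out), "\n";
```

The point is that the function body never sees the database encoding; it only deals with characters, which is the DWIM behaviour mentioned above.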
 

>>     Also that "use utf8;" is always loaded and in use.
>
> Sorry, I probably mis-worded that in my original description. It's that
> we always do the 'utf8fix' for plperl. Not that utf8 is loaded and in
> use. This fix basically makes sure the unicode database and associated
> modules are loaded. This is needed because perl will try to
> dynamically load these when you need them. As we restrict 'require' in
> the plperl case, things that depended on that would fail. Previously
> we only did the utf8fix when we were a PG_UTF8 database.  I don't
> really think it's worth documenting, it's more a bug fix than anything
> else.
>

Agreed.

-Andy

