Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding. - Mailing list pgsql-hackers

From Amit Khandekar
Subject Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
Msg-id CACoZds1SE7=+8iuoS2BntHwwX+nhcq57pH1qF5sN+-eqn5Omhg@mail.gmail.com
In response to Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.  (Alex Hunsaker <badalex@gmail.com>)
Responses Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
List pgsql-hackers
On 12 February 2011 14:48, Alex Hunsaker <badalex@gmail.com> wrote:
> On Sun, Feb 6, 2011 at 15:31, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Force strings passed to and from plperl to be in UTF8 encoding.
>>
>> Strings are converted to UTF8 on the way into perl and to the
>> database encoding on the way back. This avoids a number of
>> observed anomalies, and ensures Perl a consistent view of the
>> world.
>
> So I noticed a problem while playing with this in my discussion with
> David Wheeler. pg_do_encoding_conversion() does nothing when the src
> encoding == the dest encoding. That means on a UTF-8 database we fail
> to make sure our strings are valid utf8.
>
> An easy way to see this is to embed a null in the middle of a string:
> => create or replace function zerob() returns text as $$ return
> "abcd\0efg"; $$ language plperl;
> => SELECT zerob();
> abcd
>
> Also, it seems bogus to do any encoding conversion when we are
> SQL_ASCII, and it's really trivial to fix.
>
> With the attached:
> - when we are on a utf8 database make sure to verify our output string
> in sv2cstr (we assume database strings coming in are already valid)
>
> - Do no string conversion when we are SQL_ASCII in or out
>
> - add plperl_helpers.h as a dep to plperl.o in our makefile
>
> - remove some redundant calls to pg_verify_mbstr()
>
> - as utf_e2u only has one caller, don't pstrdup(); instead have the
> caller check (saves some cycles and memory)
>

Is there a plan to commit a fix for this issue? I am still seeing it on
the PG 9.1 STABLE branch. Attached is a small patch that targets only the
specific issue in the test case described above:

create or replace function zerob() returns text as $$ return
"abcd\0efg"; $$ language plperl;
SELECT zerob();

The patch does the perl data validation in the function utf_u2e() itself.
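
To illustrate why the test case comes back as "abcd", here is a minimal
standalone C sketch. It is not the attached patch and does not use any
PostgreSQL or Perl API; the helper names has_embedded_nul() and
visible_length() are hypothetical and exist only for this example. The
assumption is that the Perl side hands over a byte buffer with an explicit
length, while the C side later treats it as a NUL-terminated string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code).
 * A Perl scalar carries an explicit byte length, so it may contain a NUL
 * byte. Returns true if any of the first len bytes of buf is NUL.
 */
bool
has_embedded_nul(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}

/*
 * Hypothetical helper (not PostgreSQL code).
 * Returns how many bytes a consumer would see if it treated buf as a
 * NUL-terminated C string: everything up to the first NUL, or all len
 * bytes if there is none.
 */
size_t
visible_length(const char *buf, size_t len)
{
    const char *nul;

    assert(buf != NULL);
    nul = memchr(buf, '\0', len);
    return nul != NULL ? (size_t) (nul - buf) : len;
}
```

For the 8-byte value returned by zerob(), has_embedded_nul() is true and
visible_length() is 4, which matches the truncated output. A validation
step placed before the string is handed to C-string code could raise an
error in that situation instead of silently dropping "efg".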

