Re: [COMMITTERS] pgsql: Prevent the injection of invalidly encoded strings by PL/Python - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: [COMMITTERS] pgsql: Prevent the injection of invalidly encoded strings by PL/Python
Msg-id 1269299444.14588.23.camel@vanquo.pezone.net
List pgsql-hackers
On fre, 2010-03-19 at 11:50 -0400, Andrew Dunstan wrote:
> Peter Eisentraut wrote:
> > Log Message:
> > -----------
> > Prevent the injection of invalidly encoded strings by PL/Python into PostgreSQL
> > with a few strategically placed pg_verifymbstr calls.

> Awesome. Do we need to fix pltcl too?

Short answer: yes.

I had never used Tcl before just now, and the documentation is sketchy,
but it looks like Tcl's behavior is mixed in this area.

Escapes such as "\xd0" are apparently converted to Unicode code points
rather than raw bytes when the appropriate OS locale is set, so that case
is safe.  It does not work in some locale/charset setups, however, such
as EUC_JP.  To adapt Hannu's original example:

CREATE TABLE utf_test (
    id serial PRIMARY KEY,
    data character varying
);

CREATE OR REPLACE FUNCTION invalid_utf_seq() RETURNS character varying AS
$BODY$
return "\xd0";
$BODY$
LANGUAGE 'pltclu' VOLATILE STRICT;

insert into utf_test(data) values(invalid_utf_seq());

-- This works in UTF8 and LATIN1 with the right locales, but ...

select invalid_utf_seq();
ERROR:  22021: invalid byte sequence for encoding "EUC_JP": 0xc390
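
The bytes in that error message, 0xc390, are the UTF-8 encoding of U+00D0, the code point that "\xd0" names. That suggests, though it is only an inference from the output, that the UTF-8 form of the string reaches the server without being converted to the database encoding. The sketch below checks the byte-level facts in plain Python, outside PostgreSQL and Tcl. The helper is_valid is a hypothetical name, and it assumes Python's built-in euc_jp codec rejects the same sequence the server rejects; it is not how pg_verifymbstr is implemented.

```python
# Illustration only: plain Python, not PL/Tcl or PostgreSQL code.
# is_valid() is a hypothetical helper; it relies on Python's codecs,
# which are assumed here to agree with the server for these bytes.

def is_valid(data: bytes, encoding: str) -> bool:
    """Return True if data decodes cleanly in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

# "\xd0" interpreted as a Unicode code point (U+00D0), then stored as UTF-8.
utf8_bytes = "\u00d0".encode("utf-8")
print(utf8_bytes.hex())  # c390, the bytes reported in the error

# The same two bytes checked against different target encodings.
print("utf-8 :", is_valid(utf8_bytes, "utf-8"))
print("euc_jp:", is_valid(utf8_bytes, "euc_jp"))

# The raw single byte 0xd0 checked as UTF-8 (a lead byte with no continuation).
print("raw 0xd0 as utf-8:", is_valid(b"\xd0", "utf-8"))
```

If the inference holds, the string is well-formed as UTF-8 but not as EUC_JP, which would explain why the same function succeeds in a UTF8 database and fails in an EUC_JP one.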


