Re: invalidly encoded strings - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: invalidly encoded strings
Date
Msg-id 20070911.145019.26510762.t-ishii@sraoss.co.jp
In response to Re: invalidly encoded strings  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: invalidly encoded strings
Re: invalidly encoded strings
List pgsql-hackers

> On Tue, 2007-09-11 at 12:29 +0900, Tatsuo Ishii wrote:
> > Please show me concrete examples how I could introduce a vulnerability
> > using this kind of convert() usage.
> 
> Try the sequence below. Then, try to dump and then reload the database.
> When you try to reload it, you will get an error:
> 
> ERROR:  invalid byte sequence for encoding "UTF8": 0xbd
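
The 0xbd in the error above can be reproduced outside the database. The following is a minimal sketch, assuming Python's built-in "euc_jp" and "utf-8" codecs behave like the server-side conversion for this one character; the helper name is_valid_utf8 is hypothetical and not from the thread. It shows that the EUC-JP bytes for the test character start with 0xbd and are not a valid UTF-8 sequence, which is why a reload into a UTF8 database would reject them.

```python
# Illustration only: uses Python codecs as a stand-in for the
# server-side utf8_to_euc_jp conversion (an assumption).

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

text = "初"
utf8_bytes = text.encode("utf-8")    # what a UTF8 database expects to store
eucjp_bytes = text.encode("euc_jp")  # what convert(... using utf8_to_euc_jp) yields

# Print each byte string in hex with its UTF-8 validity.
print(utf8_bytes.hex(), is_valid_utf8(utf8_bytes))
print(eucjp_bytes.hex(), is_valid_utf8(eucjp_bytes))
```

The first byte of the EUC-JP form, 0xbd, falls in the UTF-8 continuation-byte range, so it cannot begin a UTF-8 character.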

I know this could be a problem (like chr() with an invalid byte pattern).
What I really want to know is whether a read-only query such as this:

SELECT * FROM japanese_table ORDER BY convert(japanese_text using utf8_to_euc_jp);

could be a problem (I assume we use the C locale).
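
To make the intent of such a query concrete, here is a rough sketch of why one might sort on the converted value. It assumes that comparison under the C locale amounts to a bytewise comparison, and it uses Python's "euc_jp" codec as a stand-in for utf8_to_euc_jp; the sample strings are hypothetical. It only illustrates that the two encodings give different orderings, not whether the query is safe.

```python
# Illustration only: models C-locale ordering as plain byte comparison
# (an assumption) and uses made-up sample data.

words = ["亜", "一"]

# Order obtained by comparing the UTF-8 bytes directly.
by_utf8 = sorted(words, key=lambda s: s.encode("utf-8"))

# Order obtained by comparing the EUC-JP bytes, as the ORDER BY
# convert(...) expression would under the stated assumption.
by_eucjp = sorted(words, key=lambda s: s.encode("euc_jp"))

print(by_utf8)
print(by_eucjp)
```

In this sketch the converted bytes serve only as a sort key and are never stored, which is the distinction the question above is drawing.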
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Regards,
>     Jeff Davis
> 
> test=> select version();
>                                                          version
> --------------------------------------------------------------------------------------------------------------------------
>  PostgreSQL 8.3devel on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.3 20070601 (prerelease) (Debian 4.1.2-12)
> (1 row)
> 
> test=> show lc_collate;
>  lc_collate  
> -------------
>  en_US.UTF-8
> (1 row)
> 
> test=> create table encoding_test(t text);
> CREATE TABLE
> test=> insert into encoding_test values('初');
> INSERT 0 1
> test=> insert into encoding_test values(convert('初' using utf8_to_euc_jp));
> INSERT 0 1
> 
> 

