Re: invalidly encoded strings - Mailing list pgsql-hackers

From	Andrew Dunstan
Subject	Re: invalidly encoded strings
Date	September 10, 2007 23:56:05
Msg-id	46E603BD.70707@dunslane.net
In response to	Re: invalidly encoded strings (Jeff Davis <pgsql@j-davis.com>)
List	pgsql-hackers


Jeff Davis wrote:
> On Tue, 2007-09-11 at 11:27 +0900, Tatsuo Ishii wrote:
>   
>>> BTW, it strikes me that there is another hole that we need to plug in
>>> this area, and that's the convert() function.  Being able to create
>>> a value of type text that is not in the database encoding is simply
>>> broken.  Perhaps we could make it work on bytea instead (providing
>>> a cast from text to bytea but not vice versa), or maybe we should just
>>> forbid the whole thing if the database encoding isn't SQL_ASCII.
>>>       
>> Please don't do that. It will break a useful use case of convert().
>>
>> A user has a database encoded in UTF-8. He has English, French,
>> Chinese and Japanese data in tables. To sort a table in the order
>> of its language, he would do something like this:
>>
>> SELECT * FROM japanese_table ORDER BY convert(japanese_text using utf8_to_euc_jp);
>>
>> Without convert(), he will get the data in what looks like random
>> order. This is because Kanji characters are not in a linguistically
>> useful order in UTF-8, while they are reasonably ordered in EUC_JP.
>>     
>
> Isn't the collation a locale issue, not an encoding issue? Is there a
> ja_JP.UTF-8 that defines the proper order?
>
>
>   

That won't help you much if you have the whole collection of languages mentioned above in one database.
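
As a rough illustration of the ordering point quoted above, here is a minimal sketch in Python. It is not code from the thread: the three sample Kanji and the helper name sort_by_encoding are made up for the example, and it assumes a plain byte-wise comparison (roughly what you get under the C locale) rather than a locale-aware one.

```python
# Sample strings are illustrative only; they do not come from the thread.
words = ["一", "亜", "愛"]


def sort_by_encoding(strings, encoding):
    """Sort strings by the raw bytes of their representation in `encoding`."""
    return sorted(strings, key=lambda s: s.encode(encoding))


# Byte-wise order of the UTF-8 form follows Unicode code point order.
utf8_order = sort_by_encoding(words, "utf-8")

# Byte-wise order of the EUC_JP form follows the JIS X 0208 layout.
euc_order = sort_by_encoding(words, "euc_jp")

print("UTF-8 byte order: ", utf8_order)
print("EUC_JP byte order:", euc_order)
```

With these three characters the UTF-8 sort gives 一, 亜, 愛 (code point order), while the EUC_JP sort gives 亜, 愛, 一, which follows the reading-based layout of the first-level Kanji in JIS X 0208. That is the effect the ORDER BY convert(...) trick relies on; it says nothing about how English, French or Chinese rows in the same database should sort.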

cheers

andrew
