Re: UTF8 national character data type support WIP patch and list of open issues. - Mailing list pgsql-hackers

From	MauMau
Subject	Re: UTF8 national character data type support WIP patch and list of open issues.
Date	September 25, 2013 11:41:18
Msg-id	39433F3837CB4CAE90753D4B1F514615@maumau
In response to	Re: UTF8 national character data type support WIP patch and list of open issues. (Peter Eisentraut <peter_e@gmx.net>)
List	pgsql-hackers

From: "Peter Eisentraut" <peter_e@gmx.net>
> On Tue, 2013-09-24 at 21:04 +0900, MauMau wrote:
>> "4. I guess some users really want to continue to use ShiftJIS or EUC_JP for
>> database encoding, and use NCHAR for a limited set of columns to store
>> international text in Unicode:
>> - to avoid code conversion between the server and the client for performance
>> - because ShiftJIS and EUC_JP require less amount of storage (2 bytes for
>> most Kanji) than UTF-8 (3 bytes)
>> This use case is described in chapter 6 of "Oracle Database Globalization
>> Support Guide"."
>
> But your proposal wouldn't address the first point, because data would
> have to go client -> server -> NCHAR.
>
> The second point is valid, but it's going to be an awful amount of work
> for that limited result.

I (or, Oracle's use case) meant the following, for example:

initdb -E EUC_JP
CREATE DATABASE mydb ENCODING EUC_JP NATIONAL ENCODING UTF-8;
CREATE TABLE mytable (
    col1 char(10),   -- EUC_JP text
    col2 Nchar(10)   -- UTF-8 text
);
client encoding = EUC_JP

That is,

1. Currently, the user is only handling Japanese text.  To avoid unnecessary 
conversion, he uses EUC_JP for both client and server.
2. He needs to store a limited amount of international (non-Japanese) 
text in a few columns for a new feature of the system.  Because the 
international text is limited, he is willing to pay the cost of code 
conversion and more bytes per character for those columns only, rather 
than sacrificing performance and storage for most of his text.
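
To make the storage and conversion points concrete, here is a small Python sketch. It is only an illustration and not part of the patch or of the proposed NATIONAL ENCODING syntax. The sample strings and the helper name can_encode are my own assumptions. It compares the encoded size of the same Kanji text in each encoding, and shows that some international text cannot be represented in EUC_JP at all, which is why it would need a Unicode column.

```python
# Illustration only: compare encoded sizes and check representability.
# The sample strings and the helper can_encode() are hypothetical.

def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

japanese = "日本語"   # three Kanji characters
korean = "한국어"     # Hangul, not part of the EUC_JP repertoire

# Bytes needed to store the same Kanji text in each encoding.
sizes = {enc: len(japanese.encode(enc))
         for enc in ("euc_jp", "shift_jis", "utf-8")}
print(sizes)

# International text that a national (UTF-8) column would have to hold.
print(can_encode(korean, "euc_jp"))
print(can_encode(korean, "utf-8"))
```

With these sample strings, the Kanji text takes 2 bytes per character in EUC_JP and Shift_JIS and 3 bytes per character in UTF-8, which matches the figures quoted above.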

Regards
MauMau
