Re: Proposal - Support for National Characters functionality - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Proposal - Support for National Characters functionality
Date
Msg-id 20130705.140213.799971806521596931.t-ishii@sraoss.co.jp
In response to Proposal - Support for National Characters functionality  ("Arulappan, Arul Shaji" <arul@fast.au.fujitsu.com>)
List pgsql-hackers
Arul Shaji,

NCHAR support has been on our TODO list for some time, and I would
like to welcome efforts to implement it. However, I have a few
questions:

> This is a proposal to implement functionalities for the handling of
> National Characters. 
> 
> [Introduction]
> 
> The aim of this proposal is to eventually have a way to represent
> 'National Characters' in a uniform way, even in non-UTF8 encoded
> databases. Many of our customers in the Asian region are now, as
> part of their platform modernization, moving away from mainframes
> where they have used National Characters representation in COBOL and
> other databases. Having stronger support for national characters
> representation will also make it easier for these customers to look at
> PostgreSQL more favourably when migrating from other well known RDBMSs
> who all have varying degrees of NCHAR/NVARCHAR support.
> 
> [Specifications]
> 
> Broadly speaking, the national characters implementation ideally will
> include the following 
> - Support for NCHAR/NVARCHAR data types
> - Representing NCHAR and NVARCHAR columns in UTF-8 encoding in non-UTF8
> databases

I think this is not trivial work, because we do not have a framework
that allows mixed encodings in a database. I'm interested in how you
are going to solve the problem.

> - Support for UTF16 column encoding and representing NCHAR and NVARCHAR
> columns in UTF16 encoding in all databases.

Why do you need UTF-16 as the database encoding? UTF-8 is already
supported, and as far as I know any character that can be represented
in UTF-16 can also be represented in UTF-8.
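
To illustrate the point, here is a minimal sketch (not part of the
proposal, and not PostgreSQL code) that round-trips a few strings
between UTF-16 and UTF-8 using Python's built-in codecs. The helper
names utf16_to_utf8 and utf8_to_utf16 are made up for this example, and
it assumes well-formed UTF-16 input (no unpaired surrogates):

```python
# Round-trip sample strings between UTF-16 and UTF-8 to check that
# nothing is lost. Helper names are hypothetical, for illustration only.

def utf16_to_utf8(data: bytes) -> bytes:
    """Decode little-endian UTF-16 (no BOM) and re-encode as UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


def utf8_to_utf16(data: bytes) -> bytes:
    """Decode UTF-8 and re-encode as little-endian UTF-16 (no BOM)."""
    return data.decode("utf-8").encode("utf-16-le")


# ASCII, Japanese (BMP), and U+20BB7, which lies outside the BMP and
# therefore needs a surrogate pair in UTF-16.
samples = ["abc", "日本語", "\U00020BB7"]

for text in samples:
    as_utf16 = text.encode("utf-16-le")
    as_utf8 = utf16_to_utf8(as_utf16)
    # Converting back must reproduce the original UTF-16 bytes exactly.
    assert utf8_to_utf16(as_utf8) == as_utf16
    print(f"{text!r}: {len(as_utf16)} bytes in UTF-16, {len(as_utf8)} bytes in UTF-8")
```

Both encodings cover the same set of Unicode code points, so the
difference is only in storage size per character, not in which
characters can be stored.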

> - Support for NATIONAL_CHARACTER_SET GUC variable that will determine
> the encoding that will be used in NCHAR/NVARCHAR columns.

You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's
encoding is fixed to UTF-8?

> The above points are at the moment a 'wishlist' only. Our aim is to
> tackle them one-by-one as we progress. I will send a detailed proposal
> later with more technical details.
> 
> The main aim at the moment is to get some feedback on the above to know
> if this feature is something that would benefit PostgreSQL in general,
> and if users maintaining DBs in non-English speaking regions will find
> this beneficial.
> 
> Rgds,
> Arul Shaji
> 
> 
> 
> P.S.: It has been quite some time since I sent correspondence to this
> list. Our mail server adds a standard legal disclaimer to all outgoing
> mails, which I know that this list is not a huge fan of. I used to have
> an exemption for the mails I send to this list. If the disclaimer
> appears, apologies in advance. I will rectify that on the next one.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


