Re: Proposal - Support for National Characters functionality - Mailing list pgsql-hackers

From Arulappan, Arul Shaji
Subject Re: Proposal - Support for National Characters functionality
Date
Msg-id 3AFB102B67FAEE48874E0607386DF4210CDE8A0D@SYDExchTmp.au.fjanz.com
In response to Re: Proposal - Support for National Characters functionality  (Tatsuo Ishii <ishii@postgresql.org>)
Responses Re: Proposal - Support for National Characters functionality
List pgsql-hackers
Ishii san,

Thank you for your positive and early response.

> -----Original Message-----
> From: Tatsuo Ishii [mailto:ishii@postgresql.org]
> Sent: Friday, 5 July 2013 3:02 PM
> To: Arulappan, Arul Shaji
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal - Support for National Characters
> functionality
>
> Arul Shaji,
>
> NCHAR support has been on our TODO list for some time and I would like
> to welcome efforts trying to implement it. However I have a few
> questions:
>
> > This is a proposal to implement functionalities for the handling of
> > National Characters.
> >
> > [Introduction]
> >
> > The aim of this proposal is to eventually have a way to represent
> > 'National Characters' in a uniform way, even in non-UTF8 encoded
> > databases. Many of our customers in the Asian region are now, as
> > part of their platform modernization, moving away from mainframes
> > where they have used National Characters representation in COBOL and
> > other databases. Having stronger support for national characters
> > representation will also make it easier for these customers to look
> > at PostgreSQL more favourably when migrating from other well known
> > RDBMSs, which all have varying degrees of NCHAR/NVARCHAR support.
> >
> > [Specifications]
> >
> > Broadly speaking, the national characters implementation ideally
> > will include the following
> > - Support for NCHAR/NVARCHAR data types
> > - Representing NCHAR and NVARCHAR columns in UTF-8 encoding in
> > non-UTF8 databases
>
> I think this is not trivial work because we do not have a framework to
> allow mixed encodings in a database. I'm interested in how you are
> going to solve the problem.
>

I would be lying if I said I have the design already specced out. I will
be working on this in the coming weeks and hope to design a working
solution in consultation with the community.

> > - Support for UTF16 column encoding and representing NCHAR and
> > NVARCHAR columns in UTF16 encoding in all databases.
>
> Why do you need UTF-16 as the database encoding? UTF-8 is already
> supported, and any UTF-16 character can be represented in UTF-8 as far
> as I know.
>

Yes, that's correct. However, there are advantages in using UTF-16
encoding for those characters that are always going to take at least
two bytes to represent.
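
As a rough illustration of the size difference (a sketch only, not part
of any proposed implementation), the encoded lengths can be compared
with a few lines of Python. The helper name encoded_sizes and the sample
strings are assumptions made up for this example.

```python
# Compare encoded sizes of sample strings in UTF-8 and UTF-16.
# encoded_sizes and the samples are hypothetical, for illustration only.

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, without a BOM."""
    utf8 = len(text.encode("utf-8"))
    # "utf-16-le" avoids the 2-byte byte-order mark that "utf-16" prepends.
    utf16 = len(text.encode("utf-16-le"))
    return utf8, utf16

samples = {
    "ascii": "abc",
    "japanese": "\u65e5\u672c\u8a9e",      # three BMP kanji
    "supplementary": "\U00020BB7",         # one character outside the BMP
}

for name, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{name}: utf-8={utf8} bytes, utf-16={utf16} bytes")
```

BMP kanji take three bytes each in UTF-8 but two in UTF-16, ASCII text
doubles in size under UTF-16, and characters outside the BMP take four
bytes in both.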

Having said that, my intention is to use UTF-8 for NCHAR as well.
Supporting UTF-16 will be even more complicated as it is not supported
natively on some Linux platforms. I only included it as an option.

> > - Support for NATIONAL_CHARACTER_SET GUC variable that will
> > determine the encoding that will be used in NCHAR/NVARCHAR columns.
>
> You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's
> encoding is fixed to UTF-8?
>

If we are only going to support UTF-8 for NCHAR, then obviously we don't
need the GUC variable.

Rgds,
Arul Shaji



> > The above points are at the moment a 'wishlist' only. Our aim is to
> > tackle them one-by-one as we progress. I will send a detailed
> > proposal later with more technical details.
> >
> > The main aim at the moment is to get some feedback on the above to
> > know if this feature is something that would benefit PostgreSQL in
> > general, and if users maintaining DBs in non-English speaking
> > regions will find this beneficial.
> >
> > Rgds,
> > Arul Shaji
> >
> >
> >
> > P.S.: It has been quite some time since I sent correspondence to
> > this list. Our mail server adds a standard legal disclaimer to all
> > outgoing mails, which I know that this list is not a huge fan of. I
> > used to have an exemption for the mails I send to this list. If the
> > disclaimer appears, apologies in advance. I will rectify that on the
> > next one.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp




