Re: UTF8 national character data type support WIP patch and list of open issues. - Mailing list pgsql-hackers

From MauMau
Subject Re: UTF8 national character data type support WIP patch and list of open issues.
Date
Msg-id F98A2B6D87404A7D8134F6C50BAACAD5@maumau
In response to Re: UTF8 national character data type support WIP patch and list of open issues.  (Valentine Gogichashvili <valgog@gmail.com>)
Responses Re: UTF8 national character data type support WIP patch and list of open issues.
List pgsql-hackers
From: "Valentine Gogichashvili" <valgog@gmail.com>
> the whole NCHAR type appeared as a hack for systems that did not have it
> from the beginning. It would not be needed if all text had been magically
> stored in Unicode or UTF from the beginning, and the idea of a character
> were the same as the idea of a rune rather than a byte.

I guess so, too.


> PostgreSQL has very powerful capabilities for storing any kind of
> encoding. So maybe it makes sense to add ENCODING as another column
> property, the same way COLLATION was added?

Some other people in this community have suggested that.  And the SQL 
standard suggests the same -- specifying a character set for each column: 
CHAR(n) CHARACTER SET cs.
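For illustration, the SQL-standard per-column form would look roughly like this (a sketch only; PostgreSQL does not currently accept a CHARACTER SET clause on column definitions, and the character set names here are assumptions):

```sql
-- SQL-standard per-column character set syntax (illustrative only;
-- not accepted by PostgreSQL today):
CREATE TABLE t (
    name    CHAR(10)     CHARACTER SET latin1,
    comment VARCHAR(200) CHARACTER SET utf8
);
```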


> Text operations should work automatically, as in memory all strings will
> be converted to the database encoding.
>
> This approach would also open up the possibility of implementing custom
> ENCODINGs for column data storage, like snappy compression or even BSON,
> gobs, or protobufs for much more compact storage.

Thanks for your idea; it sounds interesting, although I don't yet 
understand it well.
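The in-memory conversion Valentine describes can be seen today with PostgreSQL's existing encoding-conversion functions; a minimal sketch (assuming a UTF8 server encoding):

```sql
-- Round-trip a string through LATIN1 bytes and back, illustrating how
-- text held in another encoding is converted to the database encoding:
SELECT convert_from(convert_to('café', 'LATIN1'), 'LATIN1');
-- returns the string 'café' in the server encoding
```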

Regards
MauMau



