Re: encoding names - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: encoding names
Msg-id Pine.LNX.4.30.0108151654030.677-100000@peter.localdomain
In response to encoding names  (Karel Zak <zakkr@zf.jcu.cz>)
List pgsql-hackers
Karel Zak writes:

>  some time ago I discussed with Tatsuo and Thomas support for
> synonyms of encoding names (for example, allowing "ISO-8859-1"
> as an encoding name) and the use of binary search for looking up
> encoding names.

Funny, I was thinking the same thing last night...

A couple of other things I was thinking about in the encoding area:

If you want to have codeset synonyms, you should also implement the
normalization of codeset names, defined as such:
 1. Remove all characters besides digits and letters.
 2. Fold letters to lowercase.
 3. If the result contains only digits, prepend the string "iso".
[from the glibc documentation]

This allows ISO_8859-1 and iso88591 to be treated the same.
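
The three normalization steps above can be sketched in a few lines. This is only an illustration in Python, not the glibc code and not a proposed PostgreSQL implementation; the function name normalize_codeset is made up for the example, and "letters" is assumed to mean ASCII letters.

```python
def normalize_codeset(name: str) -> str:
    """Normalize a codeset name using the three steps listed above.

    The function name is hypothetical; only ASCII letters and digits
    are treated as significant (an assumption of this sketch).
    """
    # Step 1: remove everything that is not an ASCII letter or digit.
    kept = "".join(ch for ch in name if ch.isascii() and ch.isalnum())
    # Step 2: fold letters to lowercase.
    folded = kept.lower()
    # Step 3: if only digits remain, prepend "iso".
    if folded.isdigit():
        return "iso" + folded
    return folded
```

With this, "ISO_8859-1", "iso88591" and the bare "8859-1" all normalize to the same key, which is what makes the synonym lookup cheap.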

Here's a good resource of official character set names and aliases:

http://www.iana.org/assignments/character-sets

Also, we ought to have support for the ISO_8859-15 character set, or
people will spread the word that PostgreSQL is not ready for the Euro.

Then I figured that if the client is configured with a locale, the
client's encoding should be determined from it automatically.  Not sure
if this is portably possible, but it would be very nice to have.
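
As a rough illustration of what "determine the encoding from the locale" could look like, here is a Python sketch. On systems that provide it, the codeset can be queried with nl_langinfo(CODESET) after the locale has been set from the environment; the fallback branch covers platforms without that call. The function name is made up, and the returned name would still have to be mapped to an encoding name the server knows.

```python
import locale

def client_encoding_from_locale():
    """Best-effort guess of the client's codeset from its locale.

    Hypothetical helper for illustration only.
    """
    try:
        # Adopt the LC_CTYPE setting from the environment (LANG, LC_ALL, ...).
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unsupported locale setting: keep whatever locale is in effect.
        pass
    # nl_langinfo(CODESET) is not available on every platform.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        return locale.nl_langinfo(locale.CODESET)
    return locale.getpreferredencoding(False)
```

The availability check is exactly the portability problem mentioned above: the query exists on many Unix systems but cannot be assumed everywhere.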

Finally, as I've mentioned before I'd like to try out the iconv interface.
Might become an option in 7.2 even.

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter


