Thread: What encoding to use for English, French, Spanish

What encoding to use for English, French, Spanish

From
novnov
Date:
My project is currently SQL_ASCII encoded. I will need to accommodate both
French and Spanish in addition to English. I don't anticipate needing Far
East languages. From reading here on the forums, I came up with Latin9 as
perhaps adequate. But others recommend Unicode even for relatively simple
needs like my own.

I'd appreciate any advice on this topic. Is Unicode the most versatile
choice? What are the downsides of Unicode?

If Far East languages do become a requirement, is Unicode the way to go?

--
View this message in context:
http://www.nabble.com/What-encoding-to-use-for-English%2C-French%2C-Spanish-tf4622283.html#a13200459
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: What encoding to use for English, French, Spanish

From
Peter Eisentraut
Date:
novnov wrote:
> My project is currently SQL_ASCII encoded. I will need to accommodate
> both French and Spanish in addition to English. I don't anticipate
> needing Far East languages. Reading here on the forums I come up with
> Latin9 as perhaps adequate. But others recommend unicode for
> relatively simple needs like my own.

LATIN9 or UTF-8 are the appropriate choices for your project.  The
choice between these is mostly a matter of taste, unless there are
additional requirements in the project.  Nowadays, many operating
systems configure themselves to use Unicode by default, and so there is
probably no reason to use a more restricted character set.
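
As a rough illustration of that trade-off, here is a small sketch in Python
rather than in PostgreSQL itself. The sample strings and the helper name
fits_latin9 are made up for this example. It shows that Latin9 covers French
and Spanish text at one byte per character, that UTF-8 spends two bytes on
each accented letter, and that text outside Latin9's repertoire can only be
represented in UTF-8:

```python
# Python's "iso8859_15" codec is ISO 8859-15, commonly called Latin9.
french = "cœur à 5 €"
spanish = "¿Dónde está el niño?"
japanese = "日本語"


def fits_latin9(text):
    """Return True if every character of text exists in Latin9."""
    try:
        text.encode("iso8859_15")
        return True
    except UnicodeEncodeError:
        return False


# Latin9 is a single-byte encoding: one byte per character.
latin9_size = len(spanish.encode("iso8859_15"))

# UTF-8 uses one byte for ASCII and two for each accented Latin letter.
utf8_size = len(spanish.encode("utf-8"))

print("French fits Latin9:  ", fits_latin9(french))
print("Spanish fits Latin9: ", fits_latin9(spanish))
print("Japanese fits Latin9:", fits_latin9(japanese))
print("Spanish sample bytes, Latin9 vs UTF-8:", latin9_size, utf8_size)
```

For mostly Western European text the size overhead of UTF-8 is small, since
only the accented characters grow.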

Note that some versions of PostgreSQL have various degrees of trouble
with UTF-8 support.  Be sure to use the latest version.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: What encoding to use for English, French, Spanish

From
Alvaro Herrera
Date:
Peter Eisentraut wrote:
> novnov wrote:
> > My project is currently SQL_ASCII encoded. I will need to accommodate
> > both French and Spanish in addition to English. I don't anticipate
> > needing Far East languages. Reading here on the forums I come up with
> > Latin9 as perhaps adequate. But others recommend unicode for
> > relatively simple needs like my own.
>
> LATIN9 or UTF-8 are the appropriate choices for your project.  The
> choice between these is mostly a matter of taste, unless there are
> additional requirements in the project.

I used to think that there was no practical difference between using
LATIN9 or UTF8, but experience (not my own, but that of people on the
pgsql-es-ayuda list) has taught me otherwise.  When people start mixing
environments, it is quite common that they get the client_encoding wrong
in some cases.  In those cases, having an encoding able to tell a valid
string from an invalid one is really helpful -- thus using UTF8 as the
server encoding is the way to go.

Latin9 is _capable_ of storing your data, yes, but if you fail to set
client_encoding then it is also capable of storing something you don't
really want to store.  I'd stay away from it.
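
The difference can be sketched in Python, used here only as a stand-in for
the server's input validation and not as PostgreSQL's actual code. The
sample text and the helper name is_valid are assumptions made for this
illustration. Every byte value is a valid Latin9 character, so a
Latin9 check cannot notice UTF-8 bytes sent by a misconfigured client,
whereas a UTF-8 check rejects typical Latin9 bytes:

```python
text = "señor, français"

# What two differently configured clients would actually send.
utf8_bytes = text.encode("utf-8")
latin9_bytes = text.encode("iso8859_15")


def is_valid(data, encoding):
    """Return True if data is a well-formed byte sequence in encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# A Latin9 check accepts UTF-8 bytes without complaint, so the
# mis-decoded text would be stored silently.
accepted_by_latin9 = is_valid(utf8_bytes, "iso8859_15")
stored = utf8_bytes.decode("iso8859_15")

# A UTF-8 check refuses Latin9 bytes: a lone accented-letter byte is
# not a well-formed UTF-8 sequence, so the mistake surfaces at once.
accepted_by_utf8 = is_valid(latin9_bytes, "utf-8")

print("Latin9 check accepts UTF-8 input:", accepted_by_latin9)
print("Text that would be stored:", stored)
print("UTF-8 check accepts Latin9 input:", accepted_by_utf8)
```

The UTF-8 check is not a guarantee, since pure ASCII input is valid in both
encodings, but any accented character sent in the wrong encoding is very
likely to be caught.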

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Romantics are beings who die of a longing for life"