Re: Mixing different LC_COLLATE and database encodings - Mailing list pgsql-general

From	Peter Eisentraut
Subject	Re: Mixing different LC_COLLATE and database encodings
Date	February 19, 2006 01:26:19
Msg-id	200602190326.17637.peter_e@gmx.net
In response to	Re: Mixing different LC_COLLATE and database encodings (Bill Moseley <moseley@hank.org>)
List	pgsql-general

Bill Moseley wrote:
> What's a bad idea?  Having a lc_collate on the cluster that doesn't
> support the encodings in the databases?

Exactly.

> Again, not sure what "it" is, but I do find it confusing when the
> cluster can have only one lc_collate, but the databases on that
> cluster can have more than one encoding.

It is confusing, so don't do it.

> That's why I was asking
> how postgresql handles (possibly) different encodings.

It doesn't.

> Are you saying that if a database is encoded as utf8 then the cluster
> should be initiated with something like en_US.utf8?  And then all
> databases on that cluster should be encoded the same?

Yes.

> I thought the locale defines the order of the characters, but not the
> encoding of those characters.

In theory, they are independent concepts.  But in practice, the C
library gets a bunch of bytes from the application (in this case, the
PostgreSQL server) and is asked to sort them.  So it needs to know what
these bytes are supposed to mean.  By design of the POSIX locale
facilities, the C library is told that by way of the locale.  It would
be much simpler for everyone if there were a function strcmp(string1,
string2, collation, encoding), but there isn't.
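
To illustrate the point, here is a minimal sketch using Python's locale
module, which wraps the C library's locale-aware comparison.  The helper
name compare_in_locale and the locale name en_US.UTF-8 are assumptions
for illustration only; that locale may not be installed on a given
system, in which case the helper returns None.  Note that the comparison
itself takes only the two strings: collation order and encoding both
come from the single locale name set beforehand.

```python
import locale


def compare_in_locale(locale_name, a, b):
    """Compare a and b under the given LC_COLLATE setting.

    Returns a negative, zero, or positive number like strcoll(),
    or None if the C library does not know the locale.
    (Hypothetical helper, not part of PostgreSQL or the C library.)
    """
    # Calling setlocale() with no value only queries the current setting.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        # One name selects both the sort order and the byte encoding.
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # No collation or encoding argument here: both come from the locale.
        return locale.strcoll(a, b)
    finally:
        # Restore the previous process-wide setting.
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    for name in ("C", "en_US.UTF-8"):
        print(name, compare_in_locale(name, "B", "a"))
```

In the "C" locale the comparison is by plain code value, so "B" sorts
before "a".  Under a language-specific locale the result typically
differs, but the exact ordering depends on the platform's locale data.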

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
