Re: server/db encoding (mix) issues - Mailing list pgsql-admin

From Jan-Peter.Seifert@gmx.de
Subject Re: server/db encoding (mix) issues
Date
Msg-id 20080908085758.36590@gmx.net
In response to Re: server/db encoding (mix) issues  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: server/db encoding (mix) issues
List pgsql-admin
Hello Peter,

thank you very much for your quick reply.

> Date: Thu, 04 Sep 2008 16:46:33 +0300
> From: Peter Eisentraut <peter_e@gmx.net>
> To: Jan-Peter Seifert <Jan-Peter.Seifert@gmx.de>
> CC: Postgres <pgsql-admin@postgresql.org>
> Subject: Re: [ADMIN] server/db encoding (mix) issues

> Jan-Peter Seifert wrote:
> > we have a mix of older software still using LATIN1 as db encoding and
> > the psqlODBC-drivers (ANSI) and newer software using UTF8 as db encoding. As
> > running two server instances would use up more resources(?) than just one
> > we'd like to have all dbs in one cluster. Which cons against this solution
> > are there? Which operating system locale should be used then? C locale is
> > recommended in the docs - also because of better performance. However, the
> > language of the software is not English but German - so shouldn't there be
> > problems with sorting German Umlauts etc. correctly etc.? Which encoding
> > should the server have - UTF8/Unicode or LATIN1? BTW which is the correct
> > locale for LATIN1 and German (de_DE (my guess) or de_DE@euro (which seems to be
> > for LATIN9)). Using SQL_ASCII doesn't seem to be a wise choice. Are there
> > no problems when connecting with psqlODBC-ANSI drivers if the server
> > encoding is UTF8/Unicode? I'd be happy if you could enlighten me a bit.
>
> Set your locale to de_DE.utf8 and use UTF8 as server encoding.

Well - I did set up two instances of 8.3.3 on an Ubuntu 7.10 system last week, each under a different user account. I
set the locale for each account in its .bashrc ("export LANG=de_DE" and "export LANG=de_DE.UTF-8" respectively). After
that I ran initdb ("initdb --encoding='LATIN1' -W -A md5 -D $PGDATA" and "initdb --encoding='UTF8' -W -A md5 -D
$PGDATA"(?)). I'm not sure whether I specified the server encoding for the UTF8 instance, though. Did I do something
wrong?
However, when I try to create a UTF-8 db in the LATIN1 server or a LATIN1 db in the UTF-8 server, I get an error saying
that the db encoding does not match the server locale and that the LC_CTYPE locale requires the encoding of the server.
Before that I thought it just failed because there is no locale named LATIN1 on Windows. Were those additional encoding
checks added in v8.3.3, or had they already been put in place with v8.3.1?
This makes me wonder whether there are any problems with migrating the LATIN1 databases to UTF8 while still using the
psqlODBC ANSI drivers to connect the non-Unicode-capable applications. A quick test worked, but ...
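
To make the concern concrete, here is a minimal Python sketch. It is not PostgreSQL or psqlODBC code: Python's own codecs merely stand in for the conversion between a UTF8 database and a client that expects LATIN1, and the helper name to_latin1_client is hypothetical. It suggests that text consisting only of LATIN1 characters survives the round trip, while a character outside LATIN1 (the euro sign here) cannot be delivered to such a client.

```python
# Illustration only: Python codecs stand in for the conversion between
# a UTF8 database and a client that expects LATIN1 bytes.

def to_latin1_client(text: str) -> bytes:
    """Hypothetical helper: encode text the way a LATIN1 client would receive it.

    Raises UnicodeEncodeError for characters that LATIN1 cannot represent.
    """
    return text.encode("latin-1")


german = "Grüße aus Köln"

# The same text stored in UTF-8 and delivered in LATIN1.
utf8_bytes = german.encode("utf-8")
latin1_bytes = to_latin1_client(german)

# Umlauts and ß take two bytes in UTF-8 but one byte in LATIN1.
print(len(utf8_bytes), len(latin1_bytes))

# Text made only of LATIN1 characters survives the round trip unchanged.
roundtrip = latin1_bytes.decode("latin-1")
print(roundtrip == german)

# A character outside LATIN1 cannot be converted for the client.
try:
    to_latin1_client("Preis: 5 €")
    euro_ok = True
except UnicodeEncodeError:
    euro_ok = False
print(euro_ok)
```

So a quick test with plain German text may well pass, while data entered later by the Unicode-capable applications could still contain characters that a LATIN1 client cannot receive.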

> I would be interested to know where the documentation "recommends" using
> the C locale.  That would certainly not be reasonable for many uses.

It isn't really recommended:
http://www.postgresql.org/docs/8.3/static/release-8-3.html
But the consequences could perhaps be pointed out more clearly.

http://www.postgresql.org/docs/8.3/interactive/locale.html
"The drawback of using locales other than C or POSIX in PostgreSQL is its performance impact. It slows character
handling and prevents ordinary indexes from being used by LIKE. For this reason use locales only if you actually need
them."
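
The other side of that trade-off is the sort order. The following Python sketch is only an approximation, assuming that the C locale compares strings byte by byte and that the data is stored as UTF-8; the word list and the function name c_locale_order are made up for illustration.

```python
# Illustration only: approximate C-locale ordering by comparing raw bytes.
# Assumes the strings are stored as UTF-8.

words = ["Zebra", "Äpfel", "Apfel", "Öl", "Ofen"]


def c_locale_order(items):
    """Hypothetical helper: sort by the UTF-8 byte sequence of each string."""
    return sorted(items, key=lambda s: s.encode("utf-8"))


byte_sorted = c_locale_order(words)
print(byte_sorted)
# prints ['Apfel', 'Ofen', 'Zebra', 'Äpfel', 'Öl']
```

The words starting with an umlaut end up after "Zebra", which is not what a German user expects from an alphabetical list. That is the kind of consequence I would like to see spelled out next to the performance remark.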

Thank you very much,

Peter