Hello all,
as already known, 8.3 now enforces a match between DB encoding and
server locale [1]. I agree that this is a Good Thing=E2=84=A2, but it causes
automatic upgrades from previous versions to 8.3 to fail with
something like
pg_restore: [archiver (db)] could not execute query: ERROR: encoding LATIN=
1 does not match server's locale it_IT.UTF-8
DETAIL: The server's LC_CTYPE setting requires encoding UTF8.
Command was:=20
CREATE DATABASE lixpergroupware WITH TEMPLATE =3D template0 ENCODING =3D 'L=
ATIN1';
There were some proposals [3][4] to have the server run under locale C
or POSIX, but IMHO this is rather an aggravation than a viable
solution (locales are a good thing), so I rather don't do that.
My gut feeling is that the right approach would be to create all
target (8.3) databases with a correct encoding that matches the server
locale and have the character data from the pg_dump converted on the
fly (with iconv, or with pg_dump itself), so that people get from a
broken 8.1/8.2 setup to a good 8.3 setup painlessly.
I'm seeking some input on (1) whether above approach is the correct
one, and (2) suggestions how to implement it properly.
My current pg_upgradecluster uses pg_dumpall to copy the schema, and a
per-db pg_dump to copy the DB table contents. Will calling pg_dump
with --encoding=3D<encoding of 8.3 server's locale> always DTRT and is
it meant to solve this problem? The common case is --encoding=3DUTF-8,
but of course in theory someone might also have it the other way
around, so the upgrade should fail if it encounters an UTF-8 character
which cannot be encoded into an. e. g. LATIN1 character.
Thank you all for any suggestion,
Martin
[1] http://www.nabble.com/BUG--3924:-Create-Database-with-another-encoding-=
as-the-encoding-from-postgres-td15246051.html
[2] http://bugs.debian.org/472930
[3] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=3D472930#10
[4] https://bugs.launchpad.net/207779
--=20
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)