Recommended approach for upgrading DBs with nonmatching encodings - Mailing list pgsql-bugs

From	Martin Pitt
Subject	Recommended approach for upgrading DBs with nonmatching encodings
Date	March 30, 2008 15:40:51
Msg-id	20080330184050.GM6486@piware.de
Responses	Re: Recommended approach for upgrading DBs with nonmatching encodings
List	pgsql-bugs

Hello all,

As is already known, 8.3 now enforces a match between the DB encoding and
the server locale [1]. I agree that this is a Good Thing™, but it causes
automatic upgrades from previous versions to 8.3 to fail with
something like

pg_restore: [archiver (db)] could not execute query: ERROR:  encoding LATIN1 does not match server's locale it_IT.UTF-8
DETAIL:  The server's LC_CTYPE setting requires encoding UTF8.
    Command was:
CREATE DATABASE lixpergroupware WITH TEMPLATE = template0 ENCODING = 'LATIN1';

There were some proposals [3][4] to have the server run under locale C
or POSIX, but IMHO this is more of an aggravation than a viable
solution (locales are a good thing), so I would rather not do that.

My gut feeling is that the right approach would be to create all
target (8.3) databases with a correct encoding that matches the server
locale and have the character data from the pg_dump converted on the
fly (with iconv, or with pg_dump itself), so that people get from a
broken 8.1/8.2 setup to a good 8.3 setup painlessly.
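
To make the iconv-style conversion mentioned above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name `transcode_stream` and its parameters are hypothetical, the input is assumed to be a plain-text dump, and LATIN1 to UTF-8 is just the common case. It re-encodes a byte stream chunk by chunk and raises an error on data that does not fit either encoding.

```python
import codecs
import io


def transcode_stream(src, dst, source_encoding="latin-1",
                     target_encoding="utf-8", chunk_size=65536):
    """Re-encode bytes from src into dst, similar in spirit to iconv.

    Hypothetical helper for illustration. Incremental codecs are used so
    that a multi-byte character split across two chunks is still decoded
    correctly. Invalid or unrepresentable data raises UnicodeError.
    """
    decoder = codecs.getincrementaldecoder(source_encoding)(errors="strict")
    encoder = codecs.getincrementalencoder(target_encoding)(errors="strict")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(encoder.encode(decoder.decode(chunk)))
    # Flush both codecs; a truncated multi-byte sequence is reported here.
    dst.write(encoder.encode(decoder.decode(b"", final=True), final=True))


if __name__ == "__main__":
    latin1_dump = io.BytesIO("INSERT INTO t VALUES ('caffè');\n".encode("latin-1"))
    utf8_dump = io.BytesIO()
    transcode_stream(latin1_dump, utf8_dump)
    print(utf8_dump.getvalue().decode("utf-8"), end="")
```

A conversion like this only touches the character data. A plain-text dump also carries a `SET client_encoding` line naming the original encoding, which would presumably have to be adjusted to match.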

I'm seeking some input on (1) whether the above approach is the correct
one, and (2) suggestions on how to implement it properly.

My current pg_upgradecluster uses pg_dumpall to copy the schema, and a
per-db pg_dump to copy the DB table contents. Will calling pg_dump
with --encoding=<encoding of 8.3 server's locale> always DTRT, and is
it meant to solve this problem? The common case is --encoding=UTF-8,
but of course in theory someone might also have it the other way
around, so the upgrade should fail if it encounters a UTF-8 character
which cannot be encoded into, e.g., LATIN1.
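
The failure condition described above can be sketched in Python as follows. This is a hedged illustration of the desired behaviour, not a statement about what pg_dump does internally; the helper name `first_unencodable` is hypothetical. It reports the first character of a string that has no representation in the target encoding, which is the point at which an upgrade should abort rather than silently lose data.

```python
def first_unencodable(text, target_encoding="latin-1"):
    """Return (offset, character) of the first character that cannot be
    represented in target_encoding, or None if the whole text fits.

    Hypothetical helper for illustration only.
    """
    try:
        text.encode(target_encoding, errors="strict")
    except UnicodeEncodeError as exc:
        # exc.start is the index of the first offending character.
        return exc.start, text[exc.start]
    return None


if __name__ == "__main__":
    for sample in ("caffè", "price: 5 €"):
        problem = first_unencodable(sample, "latin-1")
        if problem is None:
            print(f"{sample!r}: fits into LATIN1")
        else:
            offset, char = problem
            print(f"{sample!r}: {char!r} at offset {offset} has no LATIN1 equivalent")
```

The accented letter in the first sample exists in LATIN1, while the euro sign in the second does not, so only the second sample would make the upgrade fail.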

Thank you all for any suggestions,

Martin

[1] http://www.nabble.com/BUG--3924:-Create-Database-with-another-encoding-as-the-encoding-from-postgres-td15246051.html
[2] http://bugs.debian.org/472930
[3] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=472930#10
[4] https://bugs.launchpad.net/207779

-- 
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)
