pg_upgrade, locale and encoding - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject pg_upgrade, locale and encoding
Msg-id 5433E208.5070203@vmware.com
List pgsql-hackers
While looking at bug #11431, I noticed that pg_upgrade still seems to
think that encoding and locale are cluster-wide properties. We got
per-database locale support in 8.4, and encoding has been per-database
much longer than that.

pg_upgrade checks the encoding and locale of template0 in both clusters,
and throws an error if they don't match. But it doesn't check the locale
or encoding of the postgres or template1 databases. That leads to problems
if e.g. the postgres database was dropped and recreated with a different
encoding or locale in the old cluster. We will merrily upgrade it, but
strings in the database will be incorrectly encoded.

I propose the attached patch, for git master. It's more complicated in
back-branches, as they still support upgrading from pre-8.4 clusters. We
haven't heard any complaints from the field on this, so I don't think
it's worth trying to back-patch this.

This slightly changes the way the locale comparison works. First, it
ignores the encoding suffix of the locale name. It's of course important
that the databases have a compatible encoding, but pg_database has a
separate field for encoding, and that's now compared directly. Secondly,
it tries to canonicalize the names, by calling setlocale(). That seems
like a good idea, in response to bug #11431
(http://www.postgresql.org/message-id/5424090E.9060700@vmware.com).
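To illustrate the comparison described above, here is a minimal sketch in
plain ISO C. It is not the attached patch: the function names
(locale_base_len, locale_names_match, canonical_locale_name,
locales_equivalent) are hypothetical, and the case-insensitive match and the
fallback to the raw name when setlocale() rejects it are assumptions made
for the example. Only setlocale() itself is a real library call.

```c
#include <ctype.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Length of a locale name up to, but not including, the last '.'.
 * Simplification: anything after the dot (encoding, modifier) is ignored. */
static size_t
locale_base_len(const char *name)
{
    const char *dot = strrchr(name, '.');

    return dot ? (size_t) (dot - name) : strlen(name);
}

/* Compare two locale names case-insensitively, ignoring the encoding suffix. */
static bool
locale_names_match(const char *a, const char *b)
{
    size_t      lena = locale_base_len(a);
    size_t      lenb = locale_base_len(b);

    if (lena != lenb)
        return false;
    for (size_t i = 0; i < lena; i++)
    {
        if (tolower((unsigned char) a[i]) != tolower((unsigned char) b[i]))
            return false;
    }
    return true;
}

/* Copy a string into freshly allocated memory; returns NULL on failure. */
static char *
copy_string(const char *s)
{
    char       *copy = malloc(strlen(s) + 1);

    if (copy)
        strcpy(copy, s);
    return copy;
}

/*
 * Ask setlocale() for its own spelling of a locale name, then restore the
 * previous setting. If the name is not accepted, fall back to the name as
 * given (an assumption of this sketch). Caller frees the result.
 */
static char *
canonical_locale_name(int category, const char *name)
{
    const char *current = setlocale(category, NULL);
    char       *saved = copy_string(current ? current : "C");
    const char *canon = setlocale(category, name);
    char       *result = copy_string(canon ? canon : name);

    if (saved)
    {
        setlocale(category, saved);
        free(saved);
    }
    return result;
}

/* Canonicalize both names, then compare them without the encoding suffix. */
static bool
locales_equivalent(int category, const char *a, const char *b)
{
    char       *ca = canonical_locale_name(category, a);
    char       *cb = canonical_locale_name(category, b);
    bool        equal = (ca && cb) ? locale_names_match(ca, cb) : false;

    free(ca);
    free(cb);
    return equal;
}
```

With this sketch, locale_names_match("en_US.UTF-8", "en_US.utf8") is true,
because the encoding is meant to be checked separately against the encoding
field in pg_database, while "en_US.UTF-8" and "de_DE.UTF-8" do not match.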

- Heikki
