Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale. - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Date
Msg-id 5424090E.9060700@vmware.com
In response to Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
List pgsql-bugs
On 09/22/2014 06:15 PM, Heikki Linnakangas wrote:
> Patch for that attached. pg_upgrade canonicalizes locale names by
> passing them through setlocale(), before comparing them, so it should
> still work. I'm a bit wary of back-patching, though. I think this would
> work with existing clusters (as far as they work currently, with the
> non-ASCII characters stored in pg_database), but would need some more
> testing to be confident.

This seems the best way to fix this in master, but there's a problem if
we backpatch this. If existing databases in the cluster already have
"Norwegian (Bokmål)" as the locale, and you update the binaries and try
to create a new database:

postgres=# create database foodb;
ERROR:  new collation (norwegian-bokmal_Norway.1252) is incompatible with the collation of the template database (Norwegian (Bokmål)_Norway.1252)
HINT:  Use the same collation as in the template database, or use template0 as template.

That's straightforward to fix; instead of doing a straight strcmp() to
check if the locales are the same, canonicalize them by calling
check_locale first. Attached patch does that.
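
For illustration only, here is a minimal Python sketch of the "canonicalize, then compare" idea. It is not the attached patch and not PostgreSQL code: the function names canonical_locale and same_locale are hypothetical, and setlocale() stands in for whatever canonicalization check_locale performs. What setlocale() returns for a given name is platform-specific, so the canonicalizer can be swapped out.

```python
import locale


def canonical_locale(category, name):
    """Return the name setlocale() reports for `name`, or None if it is rejected.

    Hypothetical helper: the process locale is saved first and restored afterwards.
    """
    saved = locale.setlocale(category)  # query the current setting
    try:
        return locale.setlocale(category, name)
    except locale.Error:
        return None
    finally:
        locale.setlocale(category, saved)


def same_locale(a, b, category=locale.LC_COLLATE, canonicalize=canonical_locale):
    """Compare two locale names by canonical form instead of by raw string."""
    if a == b:
        return True
    ca = canonicalize(category, a)
    cb = canonicalize(category, b)
    # Names that cannot be canonicalized are never treated as equal.
    return ca is not None and ca == cb


# Assumed alias table, for demonstration only; a real system would ask the
# C library (or check_locale) rather than a dictionary.
ALIASES = {
    "Norwegian (Bokmål)_Norway.1252": "norwegian-bokmal_Norway.1252",
    "norwegian-bokmal_Norway.1252": "norwegian-bokmal_Norway.1252",
}


def fake_canonicalize(category, name):
    return ALIASES.get(name)
```

With the assumed alias table, same_locale("Norwegian (Bokmål)_Norway.1252", "norwegian-bokmal_Norway.1252", canonicalize=fake_canonicalize) treats the two spellings as the same locale, where a plain string comparison would reject them.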

After this, it's a bit strange that newly created databases use
"norwegian-bokmal" as the locale, while old ones use "Norwegian (Bokmål)":

foodb=# select datname, encoding, datcollate from pg_database;
  datname  | encoding |           datcollate
-----------+----------+--------------------------------
 template1 |       24 | Norwegian (Bokmål)_Norway.1252
 template0 |       24 | Norwegian (Bokmål)_Norway.1252
 postgres  |       24 | Norwegian (Bokmål)_Norway.1252
 foodb     |       24 | norwegian-bokmal_Norway.1252
 utf8db    |        6 | norwegian-bokmal_Norway.1252
(5 rows)

But we know those non-ASCII characters are problematic, so I think this
is an improvement even in old clusters. At least you won't get any more
of them. You could also UPDATE pg_database manually to fix that in an
existing cluster.

One more problem: pg_upgrade doesn't canonicalize locale names either,
so you get:

> lc_collate cluster values do not match:  old "Norwegian (BokmÕl)_Norway.1252", new "norwegian-bokmal_Norway.1252"
>
> Failure, exiting

Bruce: do you think it would be OK to canonicalize the locale names
before comparing? pg_upgrade already has a function to canonicalize, but
it's only used when upgrading from a pre-9.2 server; locale names on
newer versions are assumed to be already in canonical form.

Alternatively, we could not bother with changing pg_upgrade or CREATE
DATABASE, and instead instruct Bokmål users to do the manual UPDATE of
pg_database in the release notes. That might be the most robust
solution, if there are more cases where we compare locales that I've missed.

- Heikki

