Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale. - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Date
Msg-id 5437D4CD.8020909@vmware.com
In response to Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.  (Alon <asimantov@tableausoftware.com>)
Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-bugs
On 09/25/2014 03:22 PM, Heikki Linnakangas wrote:
> Alternatively, we could not bother with changing pg_upgrade or CREATE
> DATABASE, and instead instruct Bokmål users to do the manual UPDATE of
> pg_database in the release notes. That might be the most robust
> solution, if there are more cases where we compare locales that I've missed.

I just realized that pg_upgrade doesn't currently do the
canonicalization when upgrading from a pre-8.4 server. So AFAICS,
upgrading from 8.3 to 9.2-9.4 will fail, if the locale name in the old
cluster doesn't happen to match the canonical form.

I committed the pg_upgrade patch discussed on the other thread to
master. For the 9.2-9.4 back-branches, I think we should apply the
attached
(0001-Make-the-locale-comparison-in-pg_upgrade-more-lenien.patch). This
fixes the problem with 8.3 -> 9.2 upgrades, and prepares pg_upgrade for
the "Norwegian (Bokmål)" -> "norwegian-bokmal" mapping.
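
To illustrate what a "more lenient" locale comparison could look like, here is a minimal Python sketch. It is not the code from the attached patch: the function names are hypothetical, and the two rules it applies (ignore case, ignore the encoding part after the ".") are assumptions chosen for illustration.

```python
def strip_codeset(name: str) -> str:
    """Return the locale name without its '.codeset' suffix, if any."""
    base, _sep, _codeset = name.partition(".")
    return base


def locales_equivalent(old: str, new: str) -> bool:
    """Hypothetical lenient comparison of two locale names.

    Assumption: two names are treated as the same locale when they match
    case-insensitively once the encoding part has been dropped.
    """
    if old == new:
        return True
    return strip_codeset(old).lower() == strip_codeset(new).lower()
```

With rules like these, "en_US.UTF-8" and "en_US.utf8" would compare as equal, while "en_US.UTF-8" and "nb_NO.UTF-8" would still be rejected.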


Now, for the mapping. I thought about it some more, and I think we
really need to provide instructions to fix existing clusters, as part of
the release notes. Even if we do the change to CREATE DATABASE that I
included in my earlier patch, you'll continue to have problems with
different encodings. And initdb writes the values in postgresql.conf
too, for lc_messages et al. Those really should be fixed too.

So I don't think there's much point in the CREATE DATABASE part. Let's
backpatch the attached patch
(0001-Work-around-Windows-locale-name-with-non-ASCII-chara.patch) and
add the attached instructions to the release notes.
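
As a rough picture of the mapping idea, the Python sketch below rewrites a locale name before it would be handed to setlocale(). The alias table and the function name are hypothetical, and replacing only the leading language part is an assumption; the real patch may do this differently, and the sketch does not check that the result is a name Windows accepts.

```python
# Hypothetical alias table: non-ASCII Windows locale name -> ASCII-only alias.
LOCALE_ALIASES = {
    "Norwegian (Bokmål)": "norwegian-bokmal",
}


def map_locale_name(name: str) -> str:
    """Replace a known non-ASCII prefix with its ASCII alias, keeping the rest."""
    for needle, alias in LOCALE_ALIASES.items():
        if name.startswith(needle):
            return alias + name[len(needle):]
    # Names without a known alias are passed through unchanged.
    return name
```

For example, map_locale_name("Norwegian (Bokmål)_Norway.1252") gives "norwegian-bokmal_Norway.1252", and any other name comes back as-is.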

The UPDATE statements in the instructions are specially crafted to avoid
"invalid byte sequence" errors, if they're run from a database that uses
a different encoding than the one used in the existing rows. I tried
using regexp_replace at first, but it tries to decode all the characters,
and fails. I'm not 100% sure this would work with every encoding, but at
least it works with a mixture of LATIN1 and UTF-8, which are the most
likely encodings to be used with Norwegian.
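
The encoding problem can be reproduced outside the database. The Python sketch below is only an illustration, not the SQL from the attached instructions: it shows that the LATIN1 form of the name is not valid UTF-8, and that a rewrite which only touches the ASCII parts works for both forms. The helper names and the "_Norway.1252" suffix are assumptions for the example.

```python
name = "Norwegian (Bokmål)_Norway.1252"
latin1_bytes = name.encode("latin-1")  # 'å' is the single byte 0xE5
utf8_bytes = name.encode("utf-8")      # 'å' is the two bytes 0xC3 0xA5


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if the byte string is valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def rewrite_ascii_only(raw: bytes) -> bytes:
    """Rebuild the name from its ASCII parts, never decoding the 'å'."""
    prefix = b"Norwegian (Bokm"
    if not raw.startswith(prefix):
        return raw
    # Keep everything after the closing parenthesis, e.g. b"_Norway.1252".
    tail = raw[raw.index(b")") + 1:]
    return b"norwegian-bokmal" + tail
```

Decoding latin1_bytes as UTF-8 fails, which is the same kind of failure as the "invalid byte sequence" error, whereas rewrite_ascii_only() returns the same ASCII-only result for both latin1_bytes and utf8_bytes.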

These three things together, the pg_upgrade patch, the setlocale()
mapping patch and the release note changes, resolve the issue.


- Heikki
