Thread: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

asimantov@tableausoftware.com

Date:

16 September 2014, 21:00:49

The following bug has been logged on the website:

Bug reference:      11431
Logged by:          Alon Siman-Tov
Email address:      asimantov@tableausoftware.com
PostgreSQL version: 9.3.5
Operating system:   Windows 7 , Windows 2008
Description:

I am failing to backup/restore a Norwegian-Bokmål database on Windows,
whether the cluster was initialized with English US or Norwegian Bokmål
locale.

The database created within this cluster, has the following locale
settings:
                LC_COLLATE = ''
                LC_CTYPE = ''

I have backed-up the database using the pg_dump command:
                pg_dump.exe -h localhost -p 5432 -U pgadmin -F c -Z 0 -f
"C:/test/bokmaldb.pg_dump" bokmaldb

Then try restoring the pg_dump file using:
                pg_restore.exe -h localhost -p 5432 -U pgadmin -C -d
postgres -j 4 "C:/test/bokmaldb.pg_dump"

Fails with this error:
                pg_restore: [archiver (db)] Error while PROCESSING TOC:
                pg_restore: [archiver (db)] Error from TOC entry 1928; 1262
16384 DATABASE bokmaldb pgadmin
                pg_restore: [archiver (db)] could not execute query: ERROR:
invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29
    Command was: CREATE DATABASE bokmaldb WITH TEMPLATE = template0 ENCODING
= 'UTF8' LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE...

I have tested and failed using the following Windows postgres versions:
                9.3.5
                9.1.14
                9.0.18

Please investigate why Norwegian-Bokmål database cannot be backed-up and
restored.

Set of commands I ran:
                initdb.exe -D  c:/test/pgsql/data -E UTF8
--locale="Norwegian (Bokmål)_Norway.1252" -A trust -U pgadmin
                pg_ctl -D "c:/test/pgsql/data" -l
c:/test/bokmalLocalenblocalePG.log start
                psql.exe -h localhost -p 5432 -d postgres -U pgadmin -c
"CREATE DATABASE bokmaldb;"
                pg_dump.exe -h localhost -p 5432 -U pgadmin -F c -Z 0 -f
"C:/test/bokmaldb.pg_dump" bokmaldb
                psql.exe -h localhost -p 5432 -d postgres -U pgadmin -c
"DROP DATABASE bokmaldb;"
                pg_restore.exe -h localhost -p 5432 -U pgadmin -C -d
postgres -j 4 "C:/test/bokmaldb.pg_dump"

Output of running restore:

                C:\Program Files\PostgreSQL\9.3\bin>pg_restore.exe -h
localhost -p 5432 -U pgadmin -C -d postgres -j 4 "C:/test/bokmaldb.pg_dump
                "
                pg_restore: [archiver (db)] Error while PROCESSING TOC:
                pg_restore: [archiver (db)] Error from TOC entry 1928; 1262
16384 DATABASE bokmaldb pgadmin
                pg_restore: [archiver (db)] could not execute query: ERROR:
invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29
                                Command was: CREATE DATABASE bokmaldb WITH
TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 'Norwegian
(Bokmål)_Norway.1252' LC_CTYPE...
                pg_restore: [archiver (db)] could not execute query: ERROR:
database "bokmaldb" does not exist
                                Command was: ALTER DATABASE bokmaldb OWNER
TO pgadmin;


                pg_restore: [archiver (db)] could not reconnect to database:
FATAL:  database "bokmaldb" does not exist

NOTE: running the following set of commands, that restore database with
Swedish locale works fine:
initdb.exe -D  c:/test/pgsql/data -E UTF8 --locale="Swedish_SWeden.1252" -A
trust -U pgadmin
pg_ctl -D "c:/test/pgsql/data" -l c:/test/swdb.log start
psql.exe -h localhost -p 5432 -d postgres -U pgadmin -c "CREATE DATABASE
swdb;"
pg_dump.exe -h localhost -p 5432 -U pgadmin -F c -Z 0 -f
"C:/test/swdb.pg_dump" swdb
psql.exe -h localhost -p 5432 -d postgres -U pgadmin -c "DROP DATABASE
swdb;"
pg_restore.exe -h localhost -p 5432 -U pgadmin -C -d postgres -j 4
"C:/test/swdb.pg_dump"

** The database is restored with following settings:

CREATE DATABASE swdb
  WITH OWNER = pgadmin
       ENCODING = 'UTF8'
       TABLESPACE = pg_default
       LC_COLLATE = 'Swedish_Sweden.1252'
       LC_CTYPE = 'Swedish_Sweden.1252'
       CONNECTION LIMIT = -1;

Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Alon

Date:

20 September 2014, 13:34:50

The pg_dump file contains this command:
CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
(Bokmål)_Norway.1252';

The UTF16 encoding for ål) [a-ring l parenthesis] is
00e5 006c 0029

In UTF8 this set of characters encoded as:
c3 a5 6c 29

The a-ring is converted to two bytes while the others are one.

Based on the ERROR:
invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29

It appears the set of characters is getting passed as:
e5 6c 29

In UTF8, e5 is always the start of a three byte character,possibly
pg_restore, ceratedb or else, tries to read these bytes as a single
character.
However, 6c and 29 can only be single byte characters, they can't be the
next two bytes in a three byte character.  Hence the failure.
Seems like in the code, the 00xe5 is converted to e5 instead of 'c3 a5' when
passing the LC_COLLATE and LC_CTYPE values.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/BUG-11431-Failing-to-backup-and-restore-a-Windows-postgres-database-with-Norwegian-Bokm-l-locale-tp5819260p5819745.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Noah Misch

Date:

21 September 2014, 05:19:01

On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
> The pg_dump file contains this command:
> CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
> LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
> (Bokmål)_Norway.1252';
>
> The UTF16 encoding for ål) [a-ring l parenthesis] is
> 00e5 006c 0029
>
> In UTF8 this set of characters encoded as:
> c3 a5 6c 29
>
> The a-ring is converted to two bytes while the others are one.
>
> Based on the ERROR:
> invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29
>
> It appears the set of characters is getting passed as:
> e5 6c 29
>
> In UTF8, e5 is always the start of a three byte character,possibly
> pg_restore, ceratedb or else, tries to read these bytes as a single
> character.
> However, 6c and 29 can only be single byte characters, they can't be the
> next two bytes in a three byte character.  Hence the failure.
> Seems like in the code, the 00xe5 is converted to e5 instead of 'c3 a5' when
> passing the LC_COLLATE and LC_CTYPE values.

In WIN1252, "e5 6c 29" is "ål)".  We're likely failing to set client_encoding
at some essential point in the process.

Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

John R Pierce

Date:

21 September 2014, 05:31:52

On 9/20/2014 10:18 PM, Noah Misch wrote:
> On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
>> >The pg_dump file contains this command:
>> >CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
>> >LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
>> >(Bokmål)_Norway.1252';...
> In WIN1252, "e5 6c 29" is "ål)".  We're likely failing to set client_encoding
> at some essential point in the process.

is it a problem that ENCODING=utf8 is being used with the collation and
char typing of Norwegian windows-1252 ?


--
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Re: Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Tom Lane

Date:

21 September 2014, 16:13:34

Noah Misch <noah@leadboat.com> writes:
> On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
>> The pg_dump file contains this command:
>> CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
>> LC_COLLATE = 'Norwegian (Bokm�l)_Norway.1252' LC_CTYPE = 'Norwegian
>> (Bokm�l)_Norway.1252';

> In WIN1252, "e5 6c 29" is "�l)".  We're likely failing to set client_encoding
> at some essential point in the process.

The level of stupidity needed to use non-ASCII characters in a locale name
is breathtaking.  What were Microsoft thinking?  How are we supposed to
guess what encoding applies when setting the encoding?

            regards, tom lane

Re: Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Noah Misch

Date:

21 September 2014, 18:04:25

On Sun, Sep 21, 2014 at 12:13:25PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
> >> The pg_dump file contains this command:
> >> CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
> >> LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
> >> (Bokmål)_Norway.1252';
>
> > In WIN1252, "e5 6c 29" is "ål)".  We're likely failing to set client_encoding
> > at some essential point in the process.
>
> The level of stupidity needed to use non-ASCII characters in a locale name
> is breathtaking.  What were Microsoft thinking?

Windows Vista did deprecate that locale name style in favor of "nb-NO".
setlocale(LC_x, "") still returns the old style, though.  You need PostgreSQL
built with VS2012 or later to use "nb-NO" style; see IsoLocaleName().

> How are we supposed to
> guess what encoding applies when setting the encoding?

Windows functions with "char *" arguments where encoding matters typically
expect the argument to be in the Windows ANSI code page.

Re: Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Alon

Date:

22 September 2014, 03:35:04

Yes, the Encoding might be the source of the problem, once initialized the
cluster with WINDOWS1252 as following:

initdb.exe -D  c:/test/pgsql/data -E WINDOWS1252
--locale="Norwegian (Bokmål)_Norway.1252" -A trust -U pgadmin
                pg_ctl -D "c:/test/pgsql/data" -l
c:/test/bokmalLocalenblocalePG.log start

The cluster created with these settings:
       ENCODING = 'WIN1252'
       TABLESPACE = pg_default
       LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252'
       LC_CTYPE = 'Norwegian (Bokmål)_Norway.1252'

Then pg_backup will create a proper pg_dump file, that can be restored into
database with a Norwegian (Bokmål)_Norway.1252 locale.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/BUG-11431-Failing-to-backup-and-restore-a-Windows-postgres-database-with-Norwegian-Bokm-l-locale-tp5819260p5819836.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

RE: Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Alon

Date:

22 September 2014, 03:35:18

Yes, the Encoding might be the source of the problem, once initialized the cluster with WINDOWS1252 as following:

initdb.exe -D c:/test/pgsql/data -E WINDOWS1252

--locale="Norwegian (Bokmål)_Norway.1252" -A trust -U pgadmin

pg_ctl -D "c:/test/pgsql/data" -l

c:/test/bokmalLocalenblocalePG.log start

The cluster created with these settings:

ENCODING = 'WIN1252'

TABLESPACE = pg_default

LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252'

LC_CTYPE = 'Norwegian (Bokmål)_Norway.1252'

Then pg_backup will create a proper pg_dump file, that can be restored into database with a Norwegian (Bokmål)_Norway.1252 locale.

Thank you,

Alon

From: John R Pierce [via PostgreSQL] [mailto:ml-node+[hidden email]]
Sent: Saturday, September 20, 2014 10:32 PM
To: Alon Siman Tov
Subject: Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

On 9/20/2014 10:18 PM, Noah Misch wrote:
> On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
>> >The pg_dump file contains this command:
>> >CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
>> >LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
>> >(Bokmål)_Norway.1252';...
> In WIN1252, "e5 6c 29" is "ål)". We're likely failing to set client_encoding
> at some essential point in the process.

is it a problem that ENCODING=utf8 is being used with the collation and
char typing of Norwegian windows-1252 ?

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-bugs mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

If you reply to this email, your message will be added to the discussion below:

http://postgresql.1045698.n5.nabble.com/BUG-11431-Failing-to-backup-and-restore-a-Windows-postgres-database-with-Norwegian-Bokm-l-locale-tp5819260p5819819.html

To unsubscribe from BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale., click here.
NAML

View this message in context: RE: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Heikki Linnakangas

Date:

22 September 2014, 15:15:57

On 09/21/2014 09:04 PM, Noah Misch wrote:
> On Sun, Sep 21, 2014 at 12:13:25PM -0400, Tom Lane wrote:
>> Noah Misch <noah@leadboat.com> writes:
>>> On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
>>>> The pg_dump file contains this command:
>>>> CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
>>>> LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
>>>> (Bokmål)_Norway.1252';
>>
>>> In WIN1252, "e5 6c 29" is "ål)".  We're likely failing to set client_encoding
>>> at some essential point in the process.
>>
>> The level of stupidity needed to use non-ASCII characters in a locale name
>> is breathtaking.  What were Microsoft thinking?
>
> Windows Vista did deprecate that locale name style in favor of "nb-NO".
> setlocale(LC_x, "") still returns the old style, though.  You need PostgreSQL
> built with VS2012 or later to use "nb-NO" style; see IsoLocaleName().

Older versions support "norwegian-bokmal" as an alias, so we can use
that. Unfortunately, the user can't use the alias as a work-around, as
we always use the canonical string returned by setlocale(), regardless
of what alias the user used. But we could use it as a work-around in our
code.

We have a similar mapping for a few country names that have dots in the
name, "Hong Kong S.A.R." , "U.A.E.", and "Macau S.A.R.". This is
slightly different, though. With the dots, the problem is that
setlocale() doesn't accept the country name as argument. With Bokmål,
the problem is that we don't store the locale name in the catalogs
correctly, and hence don't pass correctly to setlocale(). (even if we
stored it correctly when a single encoding is used throughout the
cluster, things will go wrong as soon as you create a database that uses
a non-default encoding).

I think we should map "Norwegian (Bokmål)" to "norwegian-bokmal", so
that the mapping is applied to the return value of setlocale(). So when
initdb does setlocale(NULL, "") to get the locale, we'll return
"norwegian-bokmal", and that gets stored in the catalogs and
postgresql.conf.

Patch for that attached. pg_upgrade canonicalizes locale names by
passing them through setlocale(), before comparing them, so it should
still work. I'm a bit wary of back-patching, though. I think this would
work with existing clusters (as far as they work currently, with the
non-ASCII characters stored in pg_database), but would need some more
testing to be confident.

- Heikki

Attachment

map-bokmal-1.patch

Re: Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Tom Lane

Date:

22 September 2014, 16:10:56

Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> I think we should map "Norwegian (Bokm�l)" to "norwegian-bokmal", so
> that the mapping is applied to the return value of setlocale(). So when
> initdb does setlocale(NULL, "") to get the locale, we'll return
> "norwegian-bokmal", and that gets stored in the catalogs and
> postgresql.conf.

That seems like a reasonable hack if there's only the one case ...
how sure are we about that?

            regards, tom lane

Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Heikki Linnakangas

Date:

22 September 2014, 16:38:39

On 09/22/2014 07:10 PM, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
>> I think we should map "Norwegian (Bokmål)" to "norwegian-bokmal", so
>> that the mapping is applied to the return value of setlocale(). So when
>> initdb does setlocale(NULL, "") to get the locale, we'll return
>> "norwegian-bokmal", and that gets stored in the catalogs and
>> postgresql.conf.
>
> That seems like a reasonable hack if there's only the one case ...
> how sure are we about that?

That's the only one I can see in MSDN documentation, and in the dropdown
box in Windows 7. New languages and countries pop up every now and then,
but overall the lists are fairly stable. And we could live with a few of
these if we have to.

- Heikki

Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Heikki Linnakangas

Date:

25 September 2014, 12:22:52

On 09/22/2014 06:15 PM, Heikki Linnakangas wrote:
> Patch for that attached. pg_upgrade canonicalizes locale names by
> passing them through setlocale(), before comparing them, so it should
> still work. I'm a bit wary of back-patching, though. I think this would
> work with existing clusters (as far as they work currently, with the
> non-ASCII characters stored in pg_database), but would need some more
> testing to be confident.

This seems the best way to fix this in master, but there's a problem if
we backpatch this. If existing databases in the cluster already have
"Norwegian (Bokmål)" as the locale, and you update the binaries and try
to create a new database:

postgres=# create database foodb;
ERROR:  new collation (norwegian-bokmal_Norway.1252) is incompatible
with the co llation of the template database (Norwegian
(Bokmål)_Norway.1252)
HINT:  Use the same collation as in the template database, or use
template0 as t emplate.

That's straightforward to fix; instead of doing a straight strcmp() to
check if the locales are the same, canonicalize them by calling
check_locale first. Attached patch does that.

After this, it's a bit strange that newly created databases use
"norwegian-bokmal" as the locale, while old ones use "Norwegian (Bokmål)":

foodb=# select datname, encoding, datcollate from pg_database;
   datname  | encoding |           datcollate
-----------+----------+--------------------------------
  template1 |       24 | Norwegian (Bokmål)_Norway.1252
  template0 |       24 | Norwegian (Bokmål)_Norway.1252
  postgres  |       24 | Norwegian (Bokmål)_Norway.1252
  foodb     |       24 | norwegian-bokmal_Norway.1252
  utf8db    |        6 | norwegian-bokmal_Norway.1252
(5 rows)

But we know those non-ASCII characters are problematic, so I think this
is an improvement even in old clusters. At least you won't get any more
of them. You could also UPDATE pg_database manually to fix that in an
existing cluster.

One more problem: pg_upgrade doesn't canonicalize locale names either,
so you get:

> lc_collate cluster values do not match:  old "Norwegian (BokmÕl)_Norway.1252", n
> ew "norwegian-bokmal_Norway.1252"
>
> Failure, exiting

Bruce: do you think it would be OK to canonicalize the locale names
before comparing? pg_upgrade already has a function to canonicalize, but
it's only used when upgrading from a pre-9.2 server; locale names on
newer versions are assumed to be already in canonical form.

Alternatively, we could not bother with changing pg_upgrade or CREATE
DATABASE, and instead instruct Bokmål users to do the manual UPDATE of
pg_database in the release notes. That might be the most robust
solution, if there are more cases where we compare locales that I've missed.

- Heikki

Attachment

map-bokmal-2.patch

Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Heikki Linnakangas

Date:

10 October 2014, 12:45:19

On 09/25/2014 03:22 PM, Heikki Linnakangas wrote:
> Alternatively, we could not bother with changing pg_upgrade or CREATE
> DATABASE, and instead instruct Bokmål users to do the manual UPDATE of
> pg_database in the release notes. That might be the most robust
> solution, if there are more cases where we compare locales that I've missed.

I just realized that pg_upgrade doesn't currently do the
canonicalization when upgrading from a pre-8.4 server. So AFAICS,
upgrading from 8.3 to 9.2-9.4 will fail, if the locale name in the old
cluster doesn't happen to match the canonical form.

I committed the pg_upgrade patch discussed on the other thread to
master. For the 9.2->9.4 back-branches, I think we should do the
attached
(0001-Make-the-locale-comparison-in-pg_upgrade-more-lenien.patch). This
fixes the problem with 8.3 -> 9.2 upgrades, and prepares pg_upgrade for
the "Norwegian (Bokmål)" -> "norwegian-bokmal" mapping.

Now, for the mapping. I thought about it some more, and I think we
really need to provide instructions to fix existing clusters, as part of
the release notes. Even if we do the change to CREATE DATABASE that I
included in my earlier patch, you'll continue to have problems with
different encodings. And initdb writes the values in postgresql.conf
too, for lc_messages et all. Those really should be fixed too.

So I don't think there's much point in the CREATE DATABASE part. Let's
backpatch the attached patch
(0001-Work-around-Windows-locale-name-with-non-ASCII-chara.patch) and
add the attached instructions to the release notes.

The UPDATE statements in the instructions are specially crafted to avoid
"invalid byte sequence" errors, if they're run from a database that uses
a different encoding than the one used in the existing rows. I tried
using regexp_replace at first, but tries to decode all the characters,
and fails. I'm not 100% this would work with every encoding, but at
least it works with a mixture of LATIN1 and UTF-8, which are the most
likely encodings to be used with Norwegian.

These three things together, the pg_upgrade patch, the setlocale()
mapping patch and the release note changes, resolves the issue.

- Heikki

Attachment

Re: Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Alon

Date:

10 October 2014, 18:32:59

Looks like great progress has been made to address this issue. Any idea which
release the patch might go in?




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/BUG-11431-Failing-to-backup-and-restore-a-Windows-postgres-database-with-Norwegian-Bokm-l-locale-tp5819260p5822615.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Michael Paquier

Date:

11 October 2014, 12:57:23

On Sat, Oct 11, 2014 at 3:32 AM, Alon <asimantov@tableausoftware.com> wrote:
> Looks like great progress has been made to address this issue. Any idea which
> release the patch might go in?
I imagine that it will be in for 9.3.6, the next minor release of hte
9.3 series.
Regards,
--
Michael

Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Heikki Linnakangas

Date:

24 October 2014, 18:16:11

On 10/10/2014 03:45 PM, Heikki Linnakangas wrote:
> On 09/25/2014 03:22 PM, Heikki Linnakangas wrote:
>> Alternatively, we could not bother with changing pg_upgrade or CREATE
>> DATABASE, and instead instruct Bokmål users to do the manual UPDATE of
>> pg_database in the release notes. That might be the most robust
>> solution, if there are more cases where we compare locales that I've missed.
>
> I just realized that pg_upgrade doesn't currently do the
> canonicalization when upgrading from a pre-8.4 server. So AFAICS,
> upgrading from 8.3 to 9.2-9.4 will fail, if the locale name in the old
> cluster doesn't happen to match the canonical form.
>
> I committed the pg_upgrade patch discussed on the other thread to
> master. For the 9.2->9.4 back-branches, I think we should do the
> attached
> (0001-Make-the-locale-comparison-in-pg_upgrade-more-lenien.patch). This
> fixes the problem with 8.3 -> 9.2 upgrades, and prepares pg_upgrade for
> the "Norwegian (Bokmål)" -> "norwegian-bokmal" mapping.
>
>
> Now, for the mapping. I thought about it some more, and I think we
> really need to provide instructions to fix existing clusters, as part of
> the release notes. Even if we do the change to CREATE DATABASE that I
> included in my earlier patch, you'll continue to have problems with
> different encodings. And initdb writes the values in postgresql.conf
> too, for lc_messages et all. Those really should be fixed too.
>
> So I don't think there's much point in the CREATE DATABASE part. Let's
> backpatch the attached patch
> (0001-Work-around-Windows-locale-name-with-non-ASCII-chara.patch) and
> add the attached instructions to the release notes.
>
> The UPDATE statements in the instructions are specially crafted to avoid
> "invalid byte sequence" errors, if they're run from a database that uses
> a different encoding than the one used in the existing rows. I tried
> using regexp_replace at first, but tries to decode all the characters,
> and fails. I'm not 100% this would work with every encoding, but at
> least it works with a mixture of LATIN1 and UTF-8, which are the most
> likely encodings to be used with Norwegian.
>
> These three things together, the pg_upgrade patch, the setlocale()
> mapping patch and the release note changes, resolves the issue.

Ok, I've committed these patches now.

The release note entry needs to be written, with the instructions on
updating pg_database. We don't have anywhere to stash release note
entries for minor releases, until we actually write the release notes,
do we?

- Heikki

Re: Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Tom Lane

Date:

24 October 2014, 18:52:40

Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> The release note entry needs to be written, with the instructions on
> updating pg_database. We don't have anywhere to stash release note
> entries for minor releases, until we actually write the release notes,
> do we?

The normal procedure is that you put sufficient information into the
commit message for the release notes to be written from that.  If you
didn't, tut tut ... but you could add an SGML comment at the front of
release-9.4.sgml where the next section will get added.

            regards, tom lane

Re: Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Alon

Date:

08 May 2015, 21:06:19

Any time estimate in which release/revision the fix might be?



--
View this message in context:
http://postgresql.nabble.com/BUG-11431-Failing-to-backup-and-restore-a-Windows-postgres-database-with-Norwegian-Bokm-l-locale-tp5819260p5848514.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: Re: Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Tom Lane

Date:

08 May 2015, 21:09:06

Alon <asimantov@tableausoftware.com> writes:
> Any time estimate in which release/revision the fix might be?

The last ones, no?  Or are you claiming this was not fixed in 9.4.1 et al?

http://www.postgresql.org/docs/9.4/static/release-9-4-1.html

            regards, tom lane

Re: Re: Re: [BUGS] Re: [BUGS] Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.

From

Alon

Date:

08 May 2015, 21:22:38

"The 9.4.1, 9.3.6, and 9.2.10 minor releases contain changes to the way
Postgres handles the Norwegian (Bokmål) locale on Windows. "

So based on the link, 9.3.6 should contain the fix. Right?
If so, I will test it and inform.



--
View this message in context:
http://postgresql.nabble.com/BUG-11431-Failing-to-backup-and-restore-a-Windows-postgres-database-with-Norwegian-Bokm-l-locale-tp5819260p5848517.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.