Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale. - Mailing list pgsql-bugs

From Noah Misch
Subject Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Date
Msg-id 20140921051846.GA1565935@tornado.leadboat.com
In response to Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.  (Alon <asimantov@tableausoftware.com>)
Responses Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Re: Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
List pgsql-bugs
On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
> The pg_dump file contains this command:
> CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
> LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
> (Bokmål)_Norway.1252';
>
> The UTF16 encoding for ål) [a-ring l parenthesis] is
> 00e5 006c 0029
>
> In UTF8 this set of characters is encoded as:
> c3 a5 6c 29
>
> The a-ring is converted to two bytes while the others are one.
>
> Based on the ERROR:
> invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29
>
> It appears the set of characters is getting passed as:
> e5 6c 29
>
> In UTF8, e5 is always the start of a three byte character, so possibly
> pg_restore, createdb or something else tries to read these bytes as a single
> character.
> However, 6c and 29 can only be single byte characters; they can't be the
> next two bytes in a three byte character.  Hence the failure.
> It seems that, in the code, 0x00e5 is converted to e5 instead of 'c3 a5' when
> passing the LC_COLLATE and LC_CTYPE values.

In WIN1252, "e5 6c 29" is "ål)".  We're likely failing to set client_encoding
at some essential point in the process.
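
For anyone who wants to check the byte values quoted above, here is a minimal Python sketch. It only illustrates the encodings involved; it does not reproduce the pg_dump/pg_restore code path, and the variable names are mine. It assumes Python 3.8 or later for the separator arguments to bytes.hex().

```python
# The three characters from the locale name: a-ring, l, closing parenthesis.
text = "ål)"

# UTF-16 code units (big-endian, no BOM), grouped two bytes at a time.
utf16_hex = text.encode("utf-16-be").hex(" ", 2)

# UTF-8: the a-ring needs two bytes, the other characters one each.
utf8_hex = text.encode("utf-8").hex(" ")

# WIN1252 (Python codec "cp1252"): one byte per character.
win1252_bytes = text.encode("cp1252")
win1252_hex = win1252_bytes.hex(" ")

print(utf16_hex)    # 00e5 006c 0029
print(utf8_hex)     # c3 a5 6c 29
print(win1252_hex)  # e5 6c 29

# Interpreting the WIN1252 bytes as UTF-8 fails: e5 announces a three-byte
# sequence, but 6c and 29 are not continuation bytes.
try:
    win1252_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

print(decodes_as_utf8)                         # False
print(win1252_bytes.decode("cp1252") == text)  # True
```

The bytes in the error message match the WIN1252 form exactly, which is consistent with the string being sent in WIN1252 while the server expects UTF8.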


