Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale. - Mailing list pgsql-bugs

From Alon
Subject Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Date
Msg-id 1411164953375-5819745.post@n5.nabble.com
In response to BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.  (asimantov@tableausoftware.com)
Responses Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
List pgsql-bugs
The pg_dump file contains this command:
CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252'
LC_CTYPE = 'Norwegian (Bokmål)_Norway.1252';

The UTF16 encoding for ål) [a-ring l parenthesis] is
00e5 006c 0029

In UTF8 this set of characters is encoded as:
c3 a5 6c 29

The a-ring is encoded as two bytes, while the other characters take one byte each.
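
For anyone who wants to double-check the byte values above, here is a small Python sketch (Python 3.8 or later is assumed for the two-argument form of bytes.hex; the variable names are only illustrative). It encodes the three characters both ways and prints the bytes in hex:

```python
# The three characters in question: a-ring, l, closing parenthesis.
text = "ål)"

# Big-endian UTF-16 without a byte-order mark, to match the code units listed above.
utf16 = text.encode("utf-16-be")
# The same characters in UTF-8.
utf8 = text.encode("utf-8")

print(utf16.hex(" ", 2))  # 00e5 006c 0029
print(utf8.hex(" "))      # c3 a5 6c 29
```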

Based on the ERROR:
invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29

It appears the set of characters is getting passed as:
e5 6c 29

In UTF8, e5 is always the start of a three-byte character. Possibly
pg_restore, createdb, or something else tries to read these bytes as a single
character.
However, 6c and 29 can only be single-byte characters; they can't be the
next two bytes in a three-byte character.  Hence the failure.
It seems that somewhere in the code, 0x00e5 is converted to e5 instead of 'c3 a5' when
passing the LC_COLLATE and LC_CTYPE values.
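
The failure can be reproduced outside PostgreSQL with a short Python sketch. This only illustrates the byte-level reasoning; it is not a claim about where in pg_dump, pg_restore, or createdb the conversion goes wrong. The guess that the three bytes are a single-byte (Windows-1252) rendering of the locale name is an assumption, though it is consistent with the .1252 suffix in the locale:

```python
# The bytes reported in the error message.
raw = bytes.fromhex("e56c29")

# Strict UTF-8 decoding rejects them: e5 announces a three-byte
# character, but 6c and 29 are not continuation bytes.
try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc
print("UTF-8 decode failed:", error is not None)

# Assumption: the bytes are the Windows-1252 form of the same text.
as_cp1252 = raw.decode("cp1252")
print(as_cp1252)

# Re-encoding that text as UTF-8 gives the bytes the server expected.
print(as_cp1252.encode("utf-8").hex(" "))
```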





