Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows - Mailing list pgsql-admin

From Meetesh Karia
Subject Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows
Date
Msg-id 47D8FB49.6000406@gmail.com
Whole thread Raw
In response to Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows  (Robert Treat <xzilla@users.sourceforge.net>)
Responses Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows
Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows
List pgsql-admin
Thanks for your response Robert.

Unfortunately I don't think that will work (unless I'm misunderstanding something).

I can't export using a client encoding of WIN1250 because I have true UNICODE chars which can't be represented with it (I've tried and I get the error below).  Therefore, I have a sql dump with UNICODE chars and setting the client encoding to WIN1250 when I run the sql dump script would toast the characters.

Additionally, here's what I get when I run your test below (my server encoding is UTF-8):

ltefull=# create table x (r varchar(255) unique);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "x_r_key" for table "x"
CREATE TABLE
ltefull=#
ltefull=# set client_encoding=WIN1250;
SET
ltefull=# insert into x (r) values ('Daniel Brühl');
INSERT 0 1
ltefull=#
ltefull=# insert into x (r) values ('Daniel Bruehl');
ERROR:  duplicate key value violates unique constraint "x_r_key"

Meetesh

Robert Treat wrote:
On Wednesday 12 March 2008 09:37, Meetesh Karia wrote: 
One quick addition to this:

The column I'm creating this unique index on is a varchar(255) and the
command I was running was:

create unique index foo_name on foo (name);

If I use the following, it now works:

create unique index foo_name on foo (cast(name as bytea));

Thoughts?

Meetesh

Meetesh Karia wrote:   
Hi all,

I'm trying to migrate from 8.0.14 on Windows (Vista Home Premium) to
8.3.0 and I've been trying to solve what appears to be an encoding
problem.  My old db was in the UNICODE encoding.  I know that this
isn't supported on 8.0.x, but it was a restore of a db from a Linux
environment and postgres didn't appear to have any problems with it.

My 8.3 server and client encodings are UTF8 and I used pg_dumpall (I
tried the 8.0 and 8.3 versions) to dump the db.  However, when I tried
to restore the db, I got an error during index creation which wouldn't
let me create a unique index on a column that had all unique values
(it had the index in 8.0 and a group by having query with no indexes
on the table confirms uniqueness).  The thing that this column does
have however is values like:

'Bruehl'
'Brühl'

I created a blank table with the unique index on it and inserted rows
one at a time until I confirmed that it was the above values that were
causing a problem.  Running the following query shows the difference
in the hex encoded values (I changed my client encoding to WIN1250 to
get the below to show up correctly):

select name, encode(decode(name, 'escape'), 'hex') from ...
    name      |           encode
---------------+----------------------------Daniel Brühl  | 44616e69656c204272c3bc686cDaniel Bruehl | 44616e69656c2042727565686c
(2 rows)

I've also tried exporting using an encoding of WIN1250 but I get
errors like this:

pg_dump: Error message from server: ERROR:  character 0xc383 of
encoding "UNICODE" has no equivalent in "WIN1250"

Anyone have any thoughts or suggestions?  Why would the index creation
fail?  Is there a workaround?
     
I'm not convinced your problem isn't solved by proper setting of 
client_encoding for both input and output:

pagila=# create table x (r varchar(255) unique);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "x_r_key" for 
table "x "
CREATE TABLE
pagila=# set client_encoding=WIN1250;
SET
pagila=# insert into x (r) values ('Daniel Brühl');
INSERT 0 1
pagila=# insert into x (r) values ('Daniel Bruehl');
INSERT 0 1
pagila=# select * from x;      r
---------------Daniel BrühlDaniel Bruehl
(2 rows)
 

pgsql-admin by date:

Previous
From: Pietro
Date:
Subject: LAN connection to server only after restart server
Next
From: Julio Leyva
Date:
Subject: Re: LAN connection to server only after restart server