Re: Chars problem restoring to ps 8.4 (utf8) a dumped db from ps 8.1 (latin9) - Mailing list pgsql-general

From Tom Lane
Subject Re: Chars problem restoring to ps 8.4 (utf8) a dumped db from ps 8.1 (latin9)
Date
Msg-id 20971.1439476765@sss.pgh.pa.us
In response to Re: Chars problem restoring to ps 8.4 (utf8) a dumped db from ps 8.1 (latin9)  ("Martín Marqués" <martin.marques@gmail.com>)
List pgsql-general
"Martín Marqués" <martin.marques@gmail.com> writes:
> On 12/08/15 at 11:12, Tom Lane wrote:
>> It does not seem likely to me that this would work at all.  You're taking
>> a dump file that is full of LATIN9 data and simply asserting that it's
>> UTF8 data.  That doesn't make it so.  If it seemed to work, maybe that's
>> because your editor changed the encoding?  Not to be relied on, for sure.

> Well, IIRC a LATIN9 encoding char which is interpreted as UTF8 will get
> inserted with no error on a UTF8 server (although the final data will be
> bogus).

I'd believe the other way around: if you tell the database that you're
using LATIN9, but what you send is really UTF8, it will not reject it
because the individual bytes are perfectly valid LATIN9 characters and
there are no cross-byte checks to make in LATIN9.  But it seems highly
unlikely that LATIN9-encoded data would get past the UTF8 validity
checker with any consistency.
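The asymmetry described above can be sketched in a few lines of Python. This is only an illustration: Python's codecs stand in for the server's encoding checks (an assumption, not PostgreSQL's actual code), the sample string is made up, and "iso8859_15" is Python's name for LATIN9.

```python
# Sample text with non-ASCII characters (chosen for illustration).
text = "Martín Marqués"

utf8_bytes = text.encode("utf-8")
latin9_bytes = text.encode("iso8859_15")  # Python's codec name for LATIN9

# Case 1: UTF8 bytes labeled as LATIN9.
# Every single byte maps to some character, so decoding cannot fail;
# the result is simply the wrong text (one character per byte).
mislabeled = utf8_bytes.decode("iso8859_15")
print(repr(mislabeled))

# Case 2: LATIN9 bytes labeled as UTF8.
# Here 0xED (LATIN9 "í") looks like a multibyte lead byte, but the next
# byte is a plain ASCII letter, so the sequence is invalid UTF-8.
try:
    latin9_bytes.decode("utf-8")
    rejected = False
except UnicodeDecodeError as exc:
    rejected = True
    print("rejected:", exc.reason)
```

Pure-ASCII data passes both ways, and an occasional LATIN9 byte pair can happen to form a valid UTF-8 sequence, which is why the failure is likely but not guaranteed for every row.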

It's possible that the problem is one of mislabeling, i.e. the database
was claimed to use LATIN9 but what was actually sent was always UTF8.
If that was *always* the case then the OP's fix of changing the label
in the dump file was actually the right thing to do.  But we haven't
been given enough information to be sure of that --- and if that's
what was happening, then some client software fixes would be in order
anyway, because the client code was using the wrong client_encoding.
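One rough way to gather that missing information is to test whether the dumped bytes are already valid UTF-8. The helper below is a hypothetical sketch, not part of any PostgreSQL tool; the name looks_like_utf8 is invented for illustration, and it assumes Python 3.7 or later for bytes.isascii().

```python
def looks_like_utf8(data: bytes) -> bool:
    """Heuristic: True if data contains non-ASCII bytes and decodes as UTF-8."""
    if data.isascii():
        # Pure ASCII is identical in LATIN9 and UTF8, so it proves nothing.
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Illustrative samples: the same character in each encoding.
sample_utf8 = "é".encode("utf-8")         # two bytes, 0xC3 0xA9
sample_latin9 = "é".encode("iso8859_15")  # one byte, 0xE9

print(looks_like_utf8(sample_utf8))
print(looks_like_utf8(sample_latin9))
```

To apply it to a dump, one could read the file in binary mode and run the check line by line. If every non-ASCII line passes, the mislabeling theory becomes plausible; a mix of passing and failing lines suggests the data was not *always* UTF8. It remains a heuristic, since LATIN9 bytes can occasionally form valid UTF-8 by coincidence.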

            regards, tom lane

