Thierry Hauchard wrote:
> When restoring from backup (created from 8.4 database with PG_Dump
> 9.4.4), the log shows errors about UTF like :
> 2015-07-07 17:03:35 CEST ERREUR: séquence d'octets invalide pour
> l'encodage « UTF8 » : 0xf4 0x6c 0x65 0x20
[...]
> UPDATE test_table SET str_field = '\\' WHERE id = 75160909
> -> ERROR: invalid byte sequence for encoding "UTF8": 0xee 0x6e 0x65
>
These byte sequences seem to come from LATIN1-encoded
error messages from the backend, translated into French.
0xf4 0x6c 0x65 is "ôle", which could come from "rôle" ("role" in
English), a fragment of a message that occurs routinely when
restoring a dump granting permissions to roles that don't exist in
the target cluster.
0xee 0x6e 0x65 is "îne" as in "chaîne", which is "string" in French. It's
plausible that the above update, with standard_conforming_strings
set to off, produces the translated version of:
"nonstandard use of \\' in a string literal"
which is:
"utilisation non standard de \\' dans une chaîne littérale"
where, unsurprisingly, the first non-US-ASCII sequence is "îne".
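
To double-check this reading of the bytes, here is a minimal Python
sketch. It is only an illustration, not anything taken from the backend:
the helper decode_as and the variable names are made up for the example,
and it assumes the message text was encoded as cp1252 (which agrees with
LATIN1 for these characters). It decodes the two reported sequences, then
encodes the French message and looks at where a UTF-8 decoder gives up.

```python
def decode_as(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Byte sequences copied from the two errors in the report.
role_bytes = bytes([0xF4, 0x6C, 0x65, 0x20])
chain_bytes = bytes([0xEE, 0x6E, 0x65])

latin1_role = decode_as(role_bytes, "latin-1")    # readable as LATIN1
latin1_chain = decode_as(chain_bytes, "latin-1")
utf8_role = decode_as(role_bytes, "utf-8")        # None: not valid UTF-8
utf8_chain = decode_as(chain_bytes, "utf-8")

# The French message as quoted above, encoded under the assumed cp1252.
message = r"utilisation non standard de \\' dans une chaîne littérale"
raw = message.encode("cp1252")

bad = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # Take three bytes from the failure point, to compare with the
    # three bytes shown in the reported error.
    bad = raw[exc.start:exc.start + 3]

print(repr(latin1_role), repr(latin1_chain), utf8_role, utf8_chain)
print(" ".join(f"0x{b:02x}" for b in bad))
```

If the assumption holds, the bytes at the failure point should be the
same ones as in the error raised by the UPDATE.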
See how lc_messages is configured in postgresql.conf.
Presumably it's French_France.1252?
If you can live with English messages, set it to C; otherwise
someone more knowledgeable in Windows might suggest a
proper explanation and fix.
Personally, I don't understand in the first place how UTF-8
is handled with '*.1252' locales, as cp1252 seems
incompatible with UTF-8 by definition.
Best regards,
--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org