latin1db=# SELECT version();
                                      version
-----------------------------------------------------------------------------------
 PostgreSQL 8.3.7 on i686-pc-linux-gnu, compiled by GCC gcc (Debian 4.3.3-5) 4.3.3
(1 row)

latin1db=# SELECT name, setting FROM pg_settings where name like 'lc%' OR name like '%encoding';
      name       | setting
-----------------+---------
 client_encoding | utf8
 lc_collate      | C
 lc_ctype        | C
 lc_messages     | es_ES
 lc_monetary     | C
 lc_numeric      | C
 lc_time         | C
 server_encoding | LATIN1
(8 rows)

latin1db=# SELECT * FROM foo;
ERROR: no existe la relación «foo»

The accented characters are garbled. When I try the same in a UTF8
database in the same cluster, it works:

utf8db=# SELECT name, setting FROM pg_settings where name like 'lc%' OR name like '%encoding';
      name       | setting
-----------------+---------
 client_encoding | UTF8
 lc_collate      | C
 lc_ctype        | C
 lc_messages     | es_ES
 lc_monetary     | C
 lc_numeric      | C
 lc_time         | C
 server_encoding | UTF8
(8 rows)

utf8db=# SELECT * FROM foo;
ERROR: no existe la relación «foo»

What is happening is that gettext() returns the message in the encoding
determined by LC_CTYPE, while we expect it to return it in the database
encoding. Starting with PG 8.3 we enforce that the encoding specified in
LC_CTYPE matches the database encoding, but not for the C locale, which
is what this cluster uses.
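
To see which codeset a given LC_CTYPE implies, and therefore what
gettext() will produce when nothing else is configured, a small C helper
can ask nl_langinfo(CODESET). This is only an illustrative sketch; the
helper name codeset_implied_by_ctype is made up for this example and is
not PostgreSQL code.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Switch LC_CTYPE to locale_name and report the codeset that locale
 * implies. Returns NULL if setlocale() rejects the locale name.
 * (Hypothetical helper, for illustration only.)
 */
static const char *
codeset_implied_by_ctype(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return NULL;

    /* nl_langinfo(CODESET) names the character encoding of LC_CTYPE */
    return nl_langinfo(CODESET);
}
```

On glibc, passing "C" typically reports ANSI_X3.4-1968, i.e. plain ASCII,
which cannot represent the accented letters or the «» quotes in the
Spanish message.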

In CVS HEAD, we call bind_textdomain_codeset() in SetDatabaseEncoding(),
which fixes that, but we only do it on Windows. In earlier versions we
called it on all platforms, but only for UTF-8. It seems that we should
call bind_textdomain_codeset() on all platforms and for all encodings.
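
A minimal sketch of the call in question, assuming a GNU-style libintl.
bind_messages_to_codeset is a hypothetical wrapper written for this
example; it is not the actual SetDatabaseEncoding() code.

```c
#include <libintl.h>
#include <stddef.h>

/*
 * Ask gettext to return translations for `domain` in `codeset`,
 * independent of what LC_CTYPE says. Returns 0 on success, -1 on failure.
 * (Hypothetical wrapper, for illustration only.)
 */
static int
bind_messages_to_codeset(const char *domain, const char *codeset)
{
    /* On failure bind_textdomain_codeset() returns NULL */
    const char *bound = bind_textdomain_codeset(domain, codeset);

    return (bound != NULL) ? 0 : -1;
}
```

For instance, calling bind_messages_to_codeset("postgres", "ISO-8859-1")
before the first gettext() lookup would request LATIN1-encoded messages
even with LC_CTYPE=C. The domain name "postgres" and the codeset spelling
are assumptions here; the spelling is exactly the part that varies
between platforms.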

However, there seems to be a reason why we only do it on Windows in CVS
HEAD: we need a mapping from our encoding ID to the OS codeset name, and
the OS codeset names vary.
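
One possible shape for such a mapping is a small lookup table, sketched
below. The enum, the table contents and the function names are all
hypothetical; the codeset spellings shown are ones glibc accepts, and
other platforms may well need different ones.

```c
#include <libintl.h>
#include <stddef.h>

/* Hypothetical encoding IDs for this sketch; not PostgreSQL's own enum. */
typedef enum
{
    SKETCH_SQL_ASCII,
    SKETCH_UTF8,
    SKETCH_LATIN1,
    SKETCH_LATIN2
} SketchEncoding;

typedef struct
{
    SketchEncoding enc;
    const char *codeset;    /* name handed to bind_textdomain_codeset() */
} CodesetEntry;

/* Assumed spellings; a real table would need per-platform variants. */
static const CodesetEntry codeset_table[] = {
    {SKETCH_UTF8, "UTF-8"},
    {SKETCH_LATIN1, "ISO-8859-1"},
    {SKETCH_LATIN2, "ISO-8859-2"},
};

/* Return the codeset name for enc, or NULL if the table has no entry. */
static const char *
codeset_for_encoding(SketchEncoding enc)
{
    size_t n = sizeof(codeset_table) / sizeof(codeset_table[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (codeset_table[i].enc == enc)
            return codeset_table[i].codeset;
    }
    return NULL;
}

/*
 * Bind domain to the codeset matching enc.
 * Returns 1 if bound, 0 if enc has no mapping (gettext's default is left
 * in place), -1 if bind_textdomain_codeset() failed.
 */
static int
bind_domain_for_encoding(const char *domain, SketchEncoding enc)
{
    const char *codeset = codeset_for_encoding(enc);

    if (codeset == NULL)
        return 0;
    return (bind_textdomain_codeset(domain, codeset) != NULL) ? 1 : -1;
}
```

Leaving unmapped encodings (SQL_ASCII in this sketch) untouched keeps the
current behaviour for them instead of guessing a codeset name.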

How can we make this more robust?

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com