Tom Lane wrote:
> CREATE TABLE public."myÉclass" (
> f1 text
> );
>
> If we start to case-fold É, then the only way to access this table will
> be by double-quoting its name, which the application probably is not
> expecting (else it would have double-quoted in the original CREATE TABLE).

This problem already exists when migrating from a single-byte database
to a multi-byte database, since downcase_identifier() does apply
tolower() to non-ASCII characters in single-byte encodings.

db9=# show server_encoding ;
 server_encoding
-----------------
 LATIN9
(1 row)

db9=# create table MYÉCLASS (f1 text);
CREATE TABLE

db9=# \d
          List of relations
 Schema |   Name   | Type  |  Owner
--------+----------+-------+----------
 public | myéclass | table | postgres
(1 row)

db9=# select * from MYÉCLASS;
 f1
----
(0 rows)

pg_dump will dump this as:

CREATE TABLE public."myéclass" (
    f1 text
);

So far so good. But after importing this into a UTF-8 database,
the same "select * from MYÉCLASS" that used to work now fails:

u8=# show server_encoding ;
 server_encoding
-----------------
 UTF8
(1 row)

u8=# select * from MYÉCLASS;
ERROR: relation "myÉclass" does not exist

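As a rough model of the difference between the two sessions above, here
is a small Python sketch. It is not the C code of downcase_identifier():
the function name fold_identifier and the SINGLE_BYTE set are made up for
illustration, and Python's Unicode lower() stands in for the
locale-dependent tolower(), which is an assumption.

```python
# Sketch only: a Python model of the folding rule, not PostgreSQL's C code.
# Assumption: str.lower() stands in for the locale-dependent tolower().

# Hypothetical set of single-byte encodings, named by Python codec.
SINGLE_BYTE = {"iso8859-1", "iso8859-9", "iso8859-15"}


def fold_identifier(ident: str, encoding: str) -> str:
    """Fold an unquoted identifier the way the two sessions above behave."""
    single_byte = encoding in SINGLE_BYTE
    out = bytearray()
    for byte in ident.encode(encoding):
        if 0x41 <= byte <= 0x5A:
            # ASCII A-Z is folded in every encoding.
            byte += 0x20
        elif single_byte and byte >= 0x80:
            # High-bit byte in a single-byte encoding: one byte is one
            # character, so it can be lowercased on its own.
            lowered = bytes([byte]).decode(encoding).lower()
            try:
                candidate = lowered.encode(encoding)
            except UnicodeEncodeError:
                candidate = b""
            # Keep the original byte unless the result is a single byte.
            if len(candidate) == 1:
                byte = candidate[0]
        # In a multi-byte encoding, high-bit bytes are left untouched.
        out.append(byte)
    return out.decode(encoding)


latin9_name = fold_identifier("MYÉCLASS", "iso8859-15")  # "myéclass"
utf8_name = fold_identifier("MYÉCLASS", "utf-8")         # "myÉclass"
```

In this model the name stored by the LATIN9 server is "myéclass", while
the UTF-8 server looks up "myÉclass", which is the mismatch reported in
the error message.
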
The compromise mentioned in downcase_identifier() to justify this
inconsistency is not very convincing, because the case-folding issues
caused by linguistic differences exist in both single-byte and
multi-byte encodings. For instance, if it's fine to trust the locale
to downcase 'İ' in a LATIN5 db, it should be okay in a UTF-8 db too.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite