pgsql: Fix collation handling for foreign keys - Mailing list pgsql-committers
From | Peter Eisentraut |
---|---|
Subject | pgsql: Fix collation handling for foreign keys |
Date | |
Msg-id | E1tBwso-001wX1-3B@gemulon.postgresql.org Whole thread Raw |
List | pgsql-committers |
Fix collation handling for foreign keys Allowing foreign keys where the referenced and the referencing columns have collations with different notions of equality is problematic. This can only happen when using nondeterministic collations, for example, if the referencing column is case-insensitive and the referenced column is not, or vice versa. It does not happen if both collations are deterministic. To show one example: CREATE COLLATION case_insensitive (provider = icu, deterministic = false, locale = 'und-u-ks-level2'); CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY); CREATE TABLE fktable (x text COLLATE case_insensitive REFERENCES pktable ON UPDATE CASCADE ON DELETE CASCADE); INSERT INTO pktable VALUES ('A'), ('a'); INSERT INTO fktable VALUES ('A'); BEGIN; DELETE FROM pktable WHERE x = 'a'; TABLE fktable; ROLLBACK; BEGIN; DELETE FROM pktable WHERE x = 'A'; TABLE fktable; ROLLBACK; Both of these DELETE statements delete the one row from fktable. So this means that one row from fktable references two rows in pktable, which should not happen. (That's why a primary key or unique constraint is required on pktable.) When nondeterministic collations were implemented, the SQL standard available to yours truly said that referential integrity checks should be performed with the collation of the referenced column, and so that's how we implemented it. But this turned out to be a mistake in the SQL standard, for the same reasons as above, that was later (SQL:2016) fixed to require both collations to be the same. So that's what we are aiming for here. We don't have to be quite so strict. We can allow different collations if they are both deterministic. This is also good for backward compatibility. So the new rule is that the collations either have to be the same or both deterministic. Or in other words, if one of them is nondeterministic, then both have to be the same. Users upgrading from before that have affected setups will need to make changes to their schemas (i.e., change one or both collations in affected foreign-key relationships) before the upgrade will succeed. Some of the nice test cases for the previous situation in collate.icu.utf8.sql are now obsolete. They are changed to just check the error checking of the new rule. Note that collate.sql already contained a test for foreign keys with different deterministic collations. A bunch of code in ri_triggers.c that added a COLLATE clause to enforce the referenced column's collation can be removed, because both columns now have to have the same notion of equality, so it doesn't matter which one to use. Reported-by: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/78d824e0-b21e-480d-a252-e4b84bc2c24b@illuminatedcomputing.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/9321d2fdf808e93095c4f86565dc8c8178675841 Modified Files -------------- doc/src/sgml/ref/create_table.sgml | 7 ++ src/backend/commands/tablecmds.c | 84 +++++++++++++++++---- src/backend/utils/adt/ri_triggers.c | 57 +++++--------- src/test/regress/expected/collate.icu.utf8.out | 100 ++----------------------- src/test/regress/sql/collate.icu.utf8.sql | 35 +-------- 5 files changed, 105 insertions(+), 178 deletions(-)
pgsql-committers by date: