Re: altering a column's collation leaves an invalid foreign key - Mailing list pgsql-hackers

From jian he
Subject Re: altering a column's collation leaves an invalid foreign key
Date
Msg-id CACJufxE=f=m21fx5rmr=w7oySw+aXq=JWLKJK10tq5AJoRjCVg@mail.gmail.com
Whole thread Raw
In response to Re: altering a column's collation leaves an invalid foreign key  (jian he <jian.universality@gmail.com>)
List pgsql-hackers
On Fri, Oct 25, 2024 at 2:27 PM jian he <jian.universality@gmail.com> wrote:
>

>  /*
>   * ri_GenerateQualCollation --- add a COLLATE spec to a WHERE clause
>   *
> - * At present, we intentionally do not use this function for RI queries that
> - * compare a variable to a $n parameter.  Since parameter symbols always have
> - * default collation, the effect will be to use the variable's collation.
> - * Now that is only strictly correct when testing the referenced column, since
> - * the SQL standard specifies that RI comparisons should use the referenced
> - * column's collation.  However, so long as all collations have the same
> - * notion of equality (which they do, because texteq reduces to bitwise
> - * equality), there's no visible semantic impact from using the referencing
> - * column's collation when testing it, and this is a good thing to do because
> - * it lets us use a normal index on the referencing column.  However, we do
> - * have to use this function when directly comparing the referencing and
> - * referenced columns, if they are of different collations; else the parser
> - * will fail to resolve the collation to use.
> + * We only have to use this function when directly comparing the referencing
> + * and referenced columns, if they are of different collations; else the
> + * parser will fail to resolve the collation to use.  We don't need to use
> + * this function for RI queries that compare a variable to a $n parameter.
> + * Since parameter symbols always have default collation, the effect will be
> + * to use the variable's collation.
> + *
> + * Note that we require that the collations of the referencing and the
> + * referenced column have the some notion of equality: Either they have to
> + * both be deterministic or else they both have to be the same.
>   */


drop table if exists pktable, fktable;
CREATE TABLE pktable (x text COLLATE "POSIX" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE "C" REFERENCES pktable on update
cascade on delete cascade);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('A');

update pktable set x = 'a' collate "C" where x = 'A' collate "POSIX";

the cascade update fktable query string is:
UPDATE ONLY "public"."fktable" SET "x" = $1 WHERE $2 OPERATOR(pg_catalog.=) "x"
ideally it should be
UPDATE ONLY "public"."fktable" SET "x" = $1 collate "C" WHERE $2
collate "POSIX" OPERATOR(pg_catalog.=) "x"

as we already mentioned in several places: PK-FK tie either they have to
both be deterministic or else they both have to be the same collation
oid.
so the reduction to
UPDATE ONLY "public"."fktable" SET "x" = $1 WHERE $2 OPERATOR(pg_catalog.=) "x"
is safe.
but you look at SPI_execute_snapshot, _SPI_convert_params. then we can see
the collation metadata is not keeped.

> + We don't need to use
> + * this function for RI queries that compare a variable to a $n parameter.
> + * Since parameter symbols always have default collation, the effect will be
> + * to use the variable's collation.

so I think a better description is
> + We don't need to use
> + * this function for RI queries that compare a variable to a $n parameter.
> + * Since parameter symbols don't have collation information, the effect will be
> + * to use the variable's collation.

you can see related discovery in
https://www.postgresql.org/message-id/CACJufxEtPBWAk7nEn69ww2LKi9w1i4dLwd5gnjD1DQ2vaYoi2g%40mail.gmail.com



pgsql-hackers by date:

Previous
From: Shubham Khanna
Date:
Subject: Re: Pgoutput not capturing the generated columns
Next
From: Alvaro Herrera
Date:
Subject: Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails