Re: Collation versioning - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Collation versioning
Date
Msg-id CAOBaU_bVxcvKHF64utKWZ=F927_z-_GP0DF7Uf+i9Uhi5jo6-g@mail.gmail.com
Whole thread Raw
In response to Re: Collation versioning  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Collation versioning  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Sun, Nov 10, 2019 at 10:08 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> Some more thoughts:
>
> 1.  If you create an index on an expression that includes a COLLATE or
> a partial index that has one in the WHERE clause, you get bogus
> warnings:
>
> postgres=# create table t (v text);
> CREATE TABLE
> postgres=# create index on t(v) where v > 'hello' collate "en_NZ";
> WARNING:  index "t_v_idx3" depends on collation "en_NZ" version "",
> but the current version is "2.28"
> DETAIL:  The index may be corrupted due to changes in sort order.
> HINT:  REINDEX to avoid the risk of corruption.
> CREATE INDEX
>
> postgres=# create index on t((case when v < 'x' collate "en_NZ" then
> 'foo' else 'bar' end));
> WARNING:  index "t_case_idx" depends on collation "en_NZ" version "",
> but the current version is "2.28"
> DETAIL:  The index may be corrupted due to changes in sort order.
> HINT:  REINDEX to avoid the risk of corruption.
> CREATE INDEX
>
> That's because the 0003 patch only calls recordDependencyOnVersion()
> for simple attribute references.  When
> recordDependencyOnSingleRelExpr() is called by index_create() to
> analyse ii_Expressions and ii_Predicate, it's going to have to be
> smart enough to detect collation references and record the versions.
> There is also some more code that ignores pinned collations hiding in
> there.
>
> This leads to the difficult question of how you recognise a real
> dependency on a collation's version in an expression.  I have some
> vague ideas but haven't seriously looked into it yet.  (The same
> question comes up for check constraint -> collation dependencies.)

Oh good point.  A simple and exhaustive way to deal with that would be
to teach recordMultipleDependencies() to override isObjectPinned() and
retrieve the collation version if the referenced object is a collation
and it's neither C or POSIX collation.   It means that we could also
remove the extra "version" argument and get rid of
recordDependencyOnVersion to simply call recordMultipleDependencies in
create_index for direct column references having a collation.

Would it be ok to add this kind of knowledge in
recordMultipleDependencies() or is it too hacky?

> 2.  If you create a composite type with a text attribute (with or
> without an explicit collation), and then create an index on a column
> of that type, we don't record the dependency.
>
> postgres=# create type my_type as (x text collate "en_NZ");
> CREATE TYPE
> postgres=# create table t (v my_type);
> CREATE TABLE
> postgres=# create index on t(v);
> CREATE INDEX
> postgres=# select * from pg_depend where refobjversion != '';
>  classid | objid | objsubid | refclassid | refobjid | refobjsubid |
> refobjversion | deptype
> ---------+-------+----------+------------+----------+-------------+---------------+---------
> (0 rows)
>
> I think create_index() will need to perform recursive analysis on
> composite types to look for text attributes, when they appear as
> simple attributes, and then add direct dependencies index -> collation
> to capture the versions.  Then we might need to do the same for
> composite types hiding inside ii_Expressions and ii_Predicate (once we
> figure out what that really means).

Isn't that actually a bug?  For instance such an index will have a 0
indcollation in pg_index, and according to pg_index documentation:

" this contains the OID of the collation to use for the index, or zero
if the column is not of a collatable data type."

You can't use a COLLATE expression on such data type, but it still has
a collation used.

> 3.  Test t/002_pg_dump.pl in src/bin/pg_upgrade fails.

Apparently neither "make check" nor "make world" run this test :(
This was broken due collversion support in pg_dump, I have fixed it
locally.

> 4.  In the warning message we should show get_collation_name() instead
> of the OID.

+1, I also fixed it locally.



pgsql-hackers by date:

Previous
From: Luis Carril
Date:
Subject: Re: Option to dump foreign data in pg_dump
Next
From: Alvaro Herrera
Date:
Subject: Re: Option to dump foreign data in pg_dump