Re: Collation versioning - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Collation versioning
Date
Msg-id CAOBaU_bF-mhb5--HOzgFebcke3J0vYGOAaeb24bipeYQrOyw0g@mail.gmail.com
Whole thread Raw
In response to Re: Collation versioning  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Tue, Nov 12, 2019 at 10:16 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Wed, Nov 13, 2019 at 3:27 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > On Sun, Nov 10, 2019 at 10:08 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> > > That's because the 0003 patch only calls recordDependencyOnVersion()
> > > for simple attribute references.  When
> > > recordDependencyOnSingleRelExpr() is called by index_create() to
> > > analyse ii_Expressions and ii_Predicate, it's going to have to be
> > > smart enough to detect collation references and record the versions.
> > > There is also some more code that ignores pinned collations hiding in
> > > there.
[...]

Indeed.  Now, using a composite type in an expression index, I can see that eg.

CREATE TYPE mytype AS (fr text COLLATE "fr-x-icu", en text COLLATE "en-x-icu");
CREATE TABLE test1(id integer, myval mytype);
CREATE INDEX ON sometable (somecol) WHERE (mytype).fr_name = 'meh'

does create a dependency on fr-x-icu collation, because collations are
checked for FieldSelect nodes (which indeed ignores default
collation), but eg.

CREATE INDEX idx2 ON test1(id) WHERE myval = ('foo', 'bar');

won't, so I confirm that recordDependencyOnSingleRelExpr() isn't
bullet proof either for finding collation dependencies.


> > > I think create_index() will need to perform recursive analysis on
> > > composite types to look for text attributes, when they appear as
> > > simple attributes, and then add direct dependencies index -> collation
> > > to capture the versions.  Then we might need to do the same for
> > > composite types hiding inside ii_Expressions and ii_Predicate (once we
> > > figure out what that really means).

I did write some code that can extract all collations that are used by
a datatype, which seems to work as intended over many combinations of
composite / array / domain types used in index simple attributes.  I'm
not sure if I should change find_expr_references_walker (called by
recordDependencyOnExpr) to also track those new dependencies in
ii_Expression and ii_Predicate, as it'll also add unneeded
dependencies for other callers.  And if I should add version detection
there too or have recordMultipleDependencies() automatically take care
of this.

> > A simple and exhaustive way to deal with that would be
> > to teach recordMultipleDependencies() to override isObjectPinned() and
> > retrieve the collation version if the referenced object is a collation
> > and it's neither C or POSIX collation
> >
> That doesn't seem like the right place; that's a raw data insertion
> function, though... I guess it does already have enough brains to skip
> pinned objects.  Hmm.

Another point is that unless we also do an additional check to see
what relkind is referencing the collation, it'll record a version
string for types and other objects.

> > Isn't that actually a bug?  For instance such an index will have a 0
> > indcollation in pg_index, and according to pg_index documentation:
> >
> > " this contains the OID of the collation to use for the index, or zero
> > if the column is not of a collatable data type."
> >
> > You can't use a COLLATE expression on such data type, but it still has
> > a collation used.
>
> I don't think it's a bug.  The information is available, but you have
> to follow the graph to get it.  Considering that the composite type
> could be something like CREATE TYPE my_type AS (fr_name text COLLATE
> "fr_CA", en_name text COLLATE "en_CA"), there is no single collation
> you could put into pg_index.indcollation anyway.  As for pg_depend,
> it's currently enough for the index to depend on the type, and the
> type to depend on the collation, because the purpose of dependencies
> is to control dropping and dumping order, but for our new purpose we
> also need to create direct dependencies index -> "fr_CA", index ->
> "en_CA" so we can record the current versions.

Oh right, I didn't think about that.

> > > 3.  Test t/002_pg_dump.pl in src/bin/pg_upgrade fails.
> >
> > Apparently neither "make check" nor "make world" run this test :(
> > This was broken due collversion support in pg_dump, I have fixed it
> > locally.
>
> make check-world

Thanks!



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [PATCH] Fix possible underflow in expression (maxoff - 1)
Next
From: Ranier Vilela
Date:
Subject: RE: [PATCH] Fix possible underflow in expression (maxoff - 1)