Re: Bogus collation version recording in recordMultipleDependencies - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bogus collation version recording in recordMultipleDependencies
Date
Msg-id 4166274.1618595149@sss.pgh.pa.us
Whole thread Raw
In response to Re: Bogus collation version recording in recordMultipleDependencies  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bogus collation version recording in recordMultipleDependencies  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Oh, I bet it's "C.utf8", because I can reproduce the failure with that.
> This crystallizes a nagging feeling I'd had that you were misdescribing
> the collate.icu.utf8 test as not being run under --no-locale.  Actually,
> it's only skipped if the encoding isn't UTF8, not the same thing.
> I think we need to remove the default-collation cases from that test too.

Hmm ... this is more subtle than it seemed.

I tried to figure out where the default-collation dependencies were coming
from, and it's quite non-obvious, at least for some of them.  Observe:

u8de=# create table t1 (f1 text collate "fr_FR");
CREATE TABLE
u8de=# create index on t1(f1) where f1 > 'foo';
CREATE INDEX
u8de=# SELECT objid::regclass, refobjid::regcollation, refobjversion
FROM pg_depend d
LEFT JOIN pg_class c ON c.oid = d.objid
WHERE refclassid = 'pg_collation'::regclass
AND coalesce(relkind, 'i') = 'i'
AND relname LIKE 't1_%';
   objid   | refobjid  | refobjversion 
-----------+-----------+---------------
 t1_f1_idx | "fr_FR"   | 2.28
 t1_f1_idx | "fr_FR"   | 2.28
 t1_f1_idx | "default" | 2.28
(3 rows)

(The "default" item doesn't show up if default collation is C,
which is what's causing the buildfarm instability.)

Now, it certainly looks like that index definition ought to only
have fr_FR dependencies.  I dug into it and discovered that the
reason we're coming up with a dependency on "default" is that
the WHERE clause looks like

             {OPEXPR 
             :opno 666 
             :opfuncid 742 
             :opresulttype 16 
             :opretset false 
             :opcollid 0 
             :inputcollid 14484 
             :args (
                {VAR 
                :varno 1 
                :varattno 1 
                :vartype 25 
                :vartypmod -1 
                :varcollid 14484 
                :varlevelsup 0 
                :varnosyn 1 
                :varattnosyn 1 
                :location 23
                }
                {CONST 
                :consttype 25 
                :consttypmod -1 
                :constcollid 100 
                :constlen -1 
                :constbyval false 
                :constisnull false 
                :location 28 
                :constvalue 7 [ 28 0 0 0 102 111 111 ]
                }
             )
             :location 26
             }

So sure enough, the comparison operator's inputcollid is
fr_FR, but the 'foo' constant has constcollid = "default".
That will have exactly zero impact on the semantics of the
expression, but dependency.c doesn't realize that and
reports it as a dependency anyway.

I feel like this is telling us that there's a fundamental
misunderstanding in find_expr_references_walker about which
collation dependencies to report.  It's reporting all the
leaf-node collations, and ignoring the ones that actually
count semantically, that is the inputcollid fields of
function and operator nodes.

Not sure what's the best thing to do here.  Redesigning
this post-feature-freeze doesn't seem terribly appetizing,
but on the other hand, this index collation recording
feature has put a premium on not overstating the collation
dependencies of an expression.  We don't want to tell users
that an index is broken when it isn't really.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bogus collation version recording in recordMultipleDependencies
Next
From: Andres Freund
Date:
Subject: Re: Forget close an open relation in ReorderBufferProcessTXN()