Re: Bogus collation version recording in recordMultipleDependencies - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bogus collation version recording in recordMultipleDependencies
Date
Msg-id 428255.1618759782@sss.pgh.pa.us
Whole thread Raw
In response to Re: Bogus collation version recording in recordMultipleDependencies  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Bogus collation version recording in recordMultipleDependencies  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Sat, Apr 17, 2021 at 10:01:53AM +1200, Thomas Munro wrote:
>> It seems to me that there are two things that would be needed to
>> salvage this for PG14: (1) deciding that we're unlikely to come up
>> with a better idea than using pg_depend for this (following the
>> argument that it'd only create duplication to have a parallel
>> dedicated catalog), (2) fixing any remaining flaws in the dependency
>> analyser code.  I'll look into the details some more on Monday.

> So IIUC the issue here is that the code could previously record useless
> collation version dependencies in somes cases, ...
> The new situation is now that the code can record too few version dependencies
> leading to false negative detection, which is way more problematic.

I'm not sure that an error in this direction is all that much more
problematic than the other direction.  If it's okay to claim that
indexes need to be rebuilt when they don't really, then we could just
drop this entire overcomplicated infrastructure and report that all
indexes need to be rebuilt after any collation version change.

But in any case you're oversimplifying tremendously.  The previous code is
just as capable of errors of omission, because it was inquiring into the
wrong composite types, ie those of leaf expression nodes.  The ones we'd
need to look at are the immediate inputs of record_eq and siblings.  Here
are a couple of examples where the leaf types are unhelpful:

... where row(a,b,c)::composite_type < row(d,e,f)::composite_type;
... where function_returning_composite(...) < function_returning_composite(...);

And even if we do this, we're not entirely in the clear in an abstract
sense, because this only covers cases in which an immediate input is
of a known named composite type.  Cases dealing in anonymous RECORD
types simply can't be resolved statically.  It might be that that
can't occur in the specific situation of CREATE INDEX expressions,
but I'm not 100% sure of it.  The apparent counterexample of

... where row(a,b) < row(a,c)

isn't one because we parse that as RowCompareExpr not an application
of record_lt.

> We agreed that having possible false positive dependencies was acceptable for
> the initial implementation and that we will improve it in later versions, as
> otherwise the alternative is to reindex everything without getting any warning,
> which clearly isn't better anyway.

[ shrug... ] You have both false positives and false negatives in the
thing as it stood before f24b15699.  I'm not convinced that it's possible
to completely avoid either issue via static analysis.  I'm inclined to
think that false negatives around record_eq-like functions are not such a
problem for real index definitions, and we'd be better off with fewer
false positives.  But it's all judgment calls.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: 2 questions about volatile attribute of pg_proc.
Next
From: Tom Lane
Date:
Subject: Re: 2 questions about volatile attribute of pg_proc.