Re: Bogus collation version recording in recordMultipleDependencies - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Bogus collation version recording in recordMultipleDependencies
Date
Msg-id 20210418112333.5cbcamht2b7grfnm@nol
Whole thread Raw
In response to Re: Bogus collation version recording in recordMultipleDependencies  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Bogus collation version recording in recordMultipleDependencies
Re: Bogus collation version recording in recordMultipleDependencies
List pgsql-hackers
On Sat, Apr 17, 2021 at 10:01:53AM +1200, Thomas Munro wrote:
> On Sat, Apr 17, 2021 at 8:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Per the changes in collate.icu.utf8.out, this gets rid of
> > a lot of imaginary collation dependencies, but it also gets
> > rid of some arguably-real ones.  In particular, calls of
> > record_eq and its siblings will be considered not to have
> > any collation dependencies, although we know that internally
> > those will look up per-column collations of their input types.
> > We could imagine special-casing record_eq etc here, but that
> > sure seems like a hack.
> 
> [...]
> 
> > So I wonder if, rather than continuing to pursue this right now,
> > we shouldn't revert 257836a75 and try again later with a new design
> > that doesn't try to commandeer the existing dependency infrastructure.
> > We might have a better idea about what to do on Windows by the time
> > that's done, too.
> 
> It seems to me that there are two things that would be needed to
> salvage this for PG14: (1) deciding that we're unlikely to come up
> with a better idea than using pg_depend for this (following the
> argument that it'd only create duplication to have a parallel
> dedicated catalog), (2) fixing any remaining flaws in the dependency
> analyser code.  I'll look into the details some more on Monday.

So IIUC the issue here is that the code could previously record useless
collation version dependencies in somes cases, which could lead to false
positive possible corruption messages (and of course additional bloat on
pg_depend).  False positive messages can't be avoided anyway, as a collation
version update may not corrupt the actually indexed set of data, especially for
glibc.  But with the infrastructure as-is advanced user can look into the new
version changes and choose to ignore changes for a specific set of collation,
which is way easier to do with the recorded dependencies.

The new situation is now that the code can record too few version dependencies
leading to false negative detection, which is way more problematic.

This was previously discussed around [1].  Quoting Thomas:

> To state more explicitly what's happening here, we're searching the
> expression trees for subexpresions that have a collation as part of
> their static type.  We don't know which functions or operators are
> actually affected by the collation, though.  For example, if an
> expression says "x IS NOT NULL" and x happens to be a subexpression of
> a type with a particular collation, we don't now that this
> expression's value can't possibly be affected by the collation version
> changing.  So, the system will nag you to rebuild an index just
> because you mentioned it, even though the index can't be corrupted.
> To do better than that, I suppose we'd need declarations in the
> catalog to say which functions/operators are collation sensitive.

We agreed that having possible false positive dependencies was acceptable for
the initial implementation and that we will improve it in later versions, as
otherwise the alternative is to reindex everything without getting any warning,
which clearly isn't better anyway.

FTR was had the same agreement to not handle specific AMs that don't care about
collation (like hash or bloom) in [2], even though I provided a patch to handle
that case ([3]) which was dropped later on ([4]).

Properly and correctly handling collation version dependency in expressions is
a hard problem and will definitely require additional fields in pg_proc, so we
clearly can't add that in pg14.  So yes we have to decide whether we want to
keep the feature in pg14 with the known limitations (and in that case probably
revert f24b15699, possibly improving documentation on the possibility of false
positive) or revert it entirely.

Unsurprisingly, I think that the feature as-is is already a significant
improvement, which can be easily improved, so my vote is to keep it in pg14.
And just to be clear I'm volunteering to work on the expression problem and all
other related improvements for the next version, whether the current feature is
reverted or not.

[1]: https://www.postgresql.org/message-id/CA%2BhUKGK8CwBcTcXWL2kUjpHT%2B6t2hEFCzkcZ-Z7xXbz%3DC4NLCQ%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/13b0c950-80f9-4c10-7e0f-f59feac56a98%402ndquadrant.com
[3]: https://www.postgresql.org/message-id/20200908144507.GA57691%40nol
[4]:
https://www.postgresql.org/message-id/CA%2BhUKGKHj4aYmmwKZdZjkD%3DCWRmn%3De6UsS7S%2Bu6oLrrp0orgsw%40mail.gmail.com



pgsql-hackers by date:

Previous
From: Vladimír Houba ml.
Date:
Subject: Re: feature request ctid cast / sql exception
Next
From: Sven Klemm
Date:
Subject: Fix dropped object handling in pg_event_trigger_ddl_commands