Re: Collation versioning - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Collation versioning
Msg-id CA+hUKGKa_ZKbWVLwH_O0MvLQMHRjTpHOR-9Y_7XCm-OpXLVkcQ@mail.gmail.com
In response to Re: Collation versioning  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Collation versioning  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-hackers
On Fri, Nov 8, 2019 at 2:37 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Nov 08, 2019 at 02:23:54PM +1300, Thomas Munro wrote:
> > Right, so this is basically a policy decision: do we assume that all
> > pre-13 indexes that depend on collations are potentially corrupted, or
> > assume that they are not?  The "correct" thing to do would be to
> > assume they are potentially corrupted and complain until the user
> > reindexes, but I think the pragmatic thing to do would be to assume
> > that they're not and just let them adopt the current versions, even
> > though it's a lie.  I lean towards the pragmatic choice; we're trying
> > to catch future problems, not give the entire user base a load of
> > extra work to do on their next pg_upgrade for mostly theoretical
> > reasons.  (That said, given the new glibc versioning, we'll
> > effectively be giving most of our user base a load of extra work to do
> > on their next OS upgrade and that'll be a characteristic of PostgreSQL
> > going forward, once the versioning-for-default-provider patch goes
> > in.)  Any other opinions?
>
> Matching an incorrect collation version on an index which physically
> uses something else does not strike me as a good idea because
> you may hide corruptions, and you would actually lose the reason why
> the corruption happened (did the version bump up from an incorrect
> one?  Or what?).  Could it be possible to mark any existing indexes
> with an unknown version or something like that?  This way, we could
> just let the user decide what needs to be reindexed or not, and we
> need to offer an option to update the collation version from unknown
> to the latest one available.

Fair point.

So we have three proposals:

1.  Assume that pre-13 indexes that depend on collations are
potentially corrupted and complain until they are reindexed.  This
could be done by having pg_upgrade run ALTER INDEX ... DEPENDS ON
COLLATION "fr_FR" VERSION '' (empty string, or some other default
value that we don't think is going to coincide with a real version).
2.  Assume that pre-13 indexes are not corrupted.  In the target 13
database, the index will be created in the catalogs with the
provider's current version.
3.  We don't know if pre-13 indexes are corrupted or not, and we'll
record that with a special value just as in proposal #1, except that
we could show a different hint for that special version value.  It
would tell you that you can either REINDEX, or run ALTER INDEX ...
DEPENDS ON COLLATION "fr_FR" VERSION '34.0' if you believe the index
to have been created with the current collation version on an older
release of PostgreSQL that didn't track versions.
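
To make the difference between the three proposals concrete, here is a
minimal sketch in Python (not PostgreSQL code). It models which version
string pg_upgrade would record for a pre-13 index and what the server
would then report when it compares that against the provider's current
version. Everything in it is an assumption made for illustration: the
names Policy, UNKNOWN_VERSION, version_after_upgrade and check_index
are hypothetical, the message wording is invented, and the ALTER INDEX
... DEPENDS ON COLLATION syntax is only the syntax proposed in this
thread.

```python
from enum import Enum
from typing import Optional

# Hypothetical sentinel for "version not tracked"; the proposals suggest an
# empty string or some other value that cannot coincide with a real version.
UNKNOWN_VERSION = ""


class Policy(Enum):
    ASSUME_CORRUPTED = 1  # proposal 1: complain until reindexed
    ASSUME_OK = 2         # proposal 2: adopt the provider's current version
    UNKNOWN = 3           # proposal 3: sentinel plus a more helpful hint


def version_after_upgrade(policy: Policy, current_version: str) -> str:
    """Version recorded for a pre-13 index in the target cluster."""
    if policy is Policy.ASSUME_OK:
        return current_version
    return UNKNOWN_VERSION


def check_index(policy: Policy, recorded: str, current: str) -> Optional[str]:
    """Return a message if the recorded version differs from the current one."""
    if recorded == current:
        return None
    if policy is Policy.UNKNOWN and recorded == UNKNOWN_VERSION:
        # Proposal 3: the user may either rebuild or vouch for the index.
        return (
            "collation version unknown: REINDEX, or run ALTER INDEX ... "
            f"DEPENDS ON COLLATION ... VERSION '{current}' if the index was "
            "built with the current collation version"
        )
    # Proposal 1 right after pg_upgrade, or any real version mismatch later.
    return (
        f"index depends on collation version '{recorded}' but the current "
        f"version is '{current}': REINDEX"
    )


if __name__ == "__main__":
    current = "34.0"
    for policy in Policy:
        recorded = version_after_upgrade(policy, current)
        print(policy.name, "->", check_index(policy, recorded, current))
```

In this model, proposal 2 stays silent immediately after pg_upgrade and
only complains once the provider's version changes later, while
proposals 1 and 3 both complain right away and differ only in what the
message allows the user to do.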


