Re: Collation versioning - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Collation versioning
Date
Msg-id 20200319191247.GA15412@nol
Whole thread Raw
In response to Re: Collation versioning  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Collation versioning
List pgsql-hackers
On Thu, Mar 19, 2020 at 12:31:54PM +0900, Michael Paquier wrote:
> On Wed, Mar 18, 2020 at 04:35:43PM +0100, Julien Rouhaud wrote:
> > On Wed, Mar 18, 2020 at 09:56:40AM +0100, Julien Rouhaud wrote:
> > AFAICT it was only missing a call to index_update_collation_versions() in
> > ReindexRelationConcurrently.  I added regression tests to make sure that
> > REINDEX, REINDEX [INDEX|TABLE] CONCURRENTLY and VACUUM FULL are doing what's
> > expected.
> 
> If you add a call to index_update_collation_versions(), the old and
> invalid index will use the same refobjversion as the new index, which
> is the latest collation version of the system, no?  If the operation
> is interrupted before the invalid index is dropped, then we would keep
> a confusing value for refobjversion, because the old invalid index
> does not rely on the new collation version, but on the old one.
> Hence, it seems to me that it would be correct to have the old invalid
> index either use an empty version string to say "we don't know"
> because the index is invalid anyway, or keep a reference to the old
> collation version intact.  I think that the latter is much more useful
> for debugging issues when upgrading a subset of indexes if the
> operation is interrupted for a reason or another.

Indeed, I confused the _ccold and _ccnew indexes.  So, the root cause is phase
4, more precisely the dependency swap in index_concurrently_swap.

A possible fix would be to teach changeDependenciesOf() to preserve the
dependency version.  It'd be quite bit costly as this would mean an extra index
search for each dependency row found.  We could probably skip the lookup if the
row have a NULL recorded version, as version should either be null or non null
for both objects.

I'm wondering if that's a good time to make changeDependenciesOf and
changeDependenciesOn private, and instead expose a swapDependencies(classid,
obj1, obj2) that would call both, as preserving the version doesn't really
makes sense outside a switch.  It's als oa good way to ensure that no CCI is
performed in the middle.

> > Given discussion in nearby threads, I obviously can't add tests for failed
> > REINDEX CONCURRENTLY, so here's what's happening with a manual repro:
> > 
> > =# UPDATE pg_depend SET refobjversion = 'meh' WHERE refobjversion = '153.97';
> > UPDATE 1
> 
> Updates to catalogs are not an existing practice in the core
> regression tests, so patches had better not do that. :p

I already heavily relied on that in the previous version of the patchset.  The
only possible alternative would be to switch to TAP tests, and constantly
restart the instance in binary upgrade mode to be able to call
binary_upgrade_set_index_coll_version.  I'd prefer to avoid that if that's
possible, as it'll make the test way more complex and quite unreadable.

> > =# REINDEX TABLE CONCURRENTLY t1 ;
> > LOCATION:  ReindexRelationConcurrently, indexcmds.c:2839
> > ^CCancel request sent
> > ERROR:  57014: canceling statement due to user request
> > LOCATION:  ProcessInterrupts, postgres.c:3171
> 
> I guess that you used a second session here beginning a transaction
> before REINDEX CONCURRENTLY ran here so as it would stop after
> swapping dependencies, right?

Yes, sorry for eluding that.  I'm using a SELECT FOR UPDATE, same scenario as
the recent issue with TOAST tables with REINDEX CONCURRENTLY.



pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: Additional improvements to extended statistics
Next
From: Chapman Flack
Date:
Subject: Re: GSoC applicant proposal, Uday PB