Re: REINDEX INDEX results in a crash for an index of pg_class since9.6 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: REINDEX INDEX results in a crash for an index of pg_class since9.6
Date
Msg-id 20190501170603.hx6r3ztyvi3rlgq6@alap3.anarazel.de
Whole thread Raw
In response to Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: REINDEX INDEX results in a crash for an index of pg_class since9.6  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2019-05-01 12:20:22 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I see the bug. Turns out we need to figure out another way to solve the
> > assertion triggered by doing catalog updates within
> > RelationSetNewRelfilenode() - we can't just move the
> > SetReindexProcessing() before it.  When CCA is enabled, the
> > CommandCounterIncrement() near the tail of RelationSetNewRelfilenode()
> > triggers a rebuild of the catalog entries - but without the
> > SetReindexProcessing() those scans will try to use the index currently
> > being rebuilt.
> 
> Yeah.  I think what this demonstrates is that REINDEX INDEX has to have
> RelationSetIndexList logic similar to what REINDEX TABLE has, to control
> which indexes get updated when while we're rebuilding an index of
> pg_class.  In hindsight that seems glaringly obvious ... I wonder how we
> missed that when we built that infrastructure for REINDEX TABLE?

I'm not sure this is the right short-term answer. Why isn't it, for now,
sufficient to do what I suggested with RelationSetNewRelfilenode() not
doing the CommandCounterIncrement(), and reindex_index() then doing the
SetReindexProcessing() before a CommandCounterIncrement()? That's like
~10 line code change, and a few more with comments.

There is the danger that the current and above approach basically relies
on there not to be any non-inplace updates during reindex.  But at the
moment code does take care to use inplace updates
(cf. index_update_stats()).

It's not clear to me whether the approach of using
RelationSetIndexList() in reindex_index() would be meaningfully more
robust against non-inplace updates during reindex either - ISTM we'd
just as well skip the necessary index insertions if we hid the index
being rebuilt. Skipping to-be-rebuilt indexes works for
reindex_relation() because they're going to be rebuilt subsequently (and
thus the missing index rows don't matter) - but it'd not work for
reindexing a single index, because it'll not get the result at a later
stage.


> I'm pretty sure that infrastructure is my fault, so I'll take a
> whack at fixing this.
> 
> Did you figure out why this doesn't also happen in HEAD?

It does for me now, at least when just doing a reindex in isolation (CCA
tests would have taken too long last night). I'm not sure why I wasn't
previously able to trigger it and markhor hasn't run yet on master.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Next
From: Robert Haas
Date:
Subject: Re: hyrax vs. RelationBuildPartitionDesc