Re: Parallel index creation does not properly cleanup after error - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Parallel index creation does not properly cleanup after error
Date
Msg-id CAH2-WzmGuTdu58BgrEcbH8u1Y=r+u9iqFY5GHaQ=t3iZcOZLDQ@mail.gmail.com
Whole thread Raw
In response to Parallel index creation does not properly cleanup after error  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Parallel index creation does not properly cleanup after error  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On Sun, Mar 11, 2018 at 3:22 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> Due to the failure during the index build, it appears that the
> PG_TRY/PG_CATCH block in reindex_relation() causes the reindex_index()
> to abort and jump out to the catch block. Here there's a call to
> ResetReindexPending(), which complains as we're still left in parallel
> mode from the aborted _bt_begin_parallel() call which has called
> EnterParallelMode(), but not managed to make it all the way to
> _bt_end_parallel() (called from btbuild()), where ExitParallelMode()
> is normally called.
>
> Subsequent attempts to refresh the materialized view result in an
> Assert failure in list_member_oid()

Thanks for the report.

> I've not debugged that, but I assume it's because
> pendingReindexedIndexes is left as a non-empty list but has had its
> memory context obliterated due to the previous query having ended.

It's not really related to memory lifetime, so much as a corruption of
the state that tracks reindexed indexes within a backend. This is of
course due to that "cannot modify reindex state during a parallel
operation" error you saw.

> The comment in the following fragment is not well honored by the
> ResetReindexPending() since it does not clear the list if there's an
> error.

> A perhaps simple fix would be just to have ResetReindexPending() only
> reset the list to NIL again and not try to raise any error.

I noticed a very similar bug in ResetReindexProcessing() just before
parallel CREATE INDEX was committed. The fix there was simply not
throwing a "can't happen" error. I agree that the same fix should be
used here. It's not worth enforcing !IsInParallelMode() in the reset
functions; just enforcing !IsInParallelMode() in the set functions is
sufficient. Attached patch does this.

-- 
Peter Geoghegan

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Using JIT for VACUUM, COPY, ANALYZE
Next
From: Andres Freund
Date:
Subject: Re: Using JIT for VACUUM, COPY, ANALYZE