Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation - Mailing list pgsql-hackers

From Noah Misch
Subject Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation
Date
Msg-id 20200907093255.GA3609623@rfd.leadboat.com
In response to Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation
List pgsql-hackers
On Mon, Sep 07, 2020 at 05:40:36PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 07 Sep 2020 13:45:28 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> > The cause is that the worker received the pending-sync entry correctly
> > but never created a relcache entry for the relation using
> > RelationBuildDesc, so rd_firstRelfilenodeSubid is not set
> > correctly.
> > 
> > I'm investigating it.
> 
> The relcache is loaded from a file with stale content at parallel worker
> startup. The relcache entry is corrected by invalidation when a lock is
> taken, but pending syncs are not considered.
> 
> Since parallel workers don't access the files, we can safely ignore
> the assertion, but I want to set the rd_firstRelfilenodeSubid flag at
> invalidation, as in the attached PoC patch.

> [patch: When RelationInitPhysicalAddr() handles a mapped relation, re-fill
> rd_firstRelfilenodeSubid from RelFileNodeSkippingWAL(), like
> RelationBuildDesc() would do.]

As a PoC, this looks promising.  Thanks.  Would you add a test case such that
the following demonstrates the bug in the absence of your PoC?

  printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
  make check TEMP_CONFIG=/tmp/minimal.conf

Please have the test try both a nailed-and-mapped relation and a "nailed, but
not mapped" relation.  I am fairly confident that your PoC fixes the former
case, but the latter may need additional code.
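
For anyone reproducing this, the printf above writes one setting per line. Below is a minimal sketch that writes the same three settings to a scratch file and prints them back before handing the file to TEMP_CONFIG. The mktemp path and the `conf` and `line_count` variable names are illustrative assumptions, not part of the original commands. Note that max_wal_senders = 0 is needed because the server refuses to start with wal_level = minimal while WAL senders are enabled.

```shell
# Sketch only: write the three settings to a scratch file instead of
# /tmp/minimal.conf, so nothing existing is overwritten.
conf=$(mktemp)

# '%s\n' is applied once per argument, giving one setting per line.
printf '%s\n%s\n%s\n' \
  'log_statement = all' \
  'wal_level = minimal' \
  'max_wal_senders = 0' >"$conf"

# Show what "make check TEMP_CONFIG=..." would pick up.
cat "$conf"

# Count the lines; strip the padding some wc implementations add.
line_count=$(wc -l <"$conf" | tr -d ' ')
```

The resulting file can then be passed as `make check TEMP_CONFIG="$conf"` in the same way as /tmp/minimal.conf.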


