At Fri, 10 Apr 2020 12:32:31 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in
> On Fri, 10 Apr 2020 at 04:05, Peter Geoghegan <pg@bowt.ie> wrote:
> >
> > On Wed, Apr 8, 2020 at 10:56 PM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > > Here is the reproducer:
> >
> > What version of Postgres did you notice the actual customer issue on?
> > I ask because I wonder if the work on B-Tree indexes in Postgres 12
> > affects the precise behavior you get here with real world workloads.
> > It probably makes _bt_killitems() more effective with some workloads,
> > which naturally increases the likelihood of having multiple FPI issued
> > in the manner that you describe. OTOH, it might make it less likely
> > with low cardinality indexes, since large groups of garbage duplicate
> > tuples tend to get concentrated on just a few leaf pages.
> >
> > > The inner test in the comment "found the item" never tests the item
> > > for being dead. So maybe we can add !ItemIdIsDead(iid) to that
> > > condition. But there still is a race condition of recording multiple
> > > FPIs can happen. Maybe a better solution is to change the lock to
> > > exclusive, at least when wal_log_hints = on, so that only one process
> > > can run this code -- the reduction in concurrency might be won back by
> > > the fact that we don't wal-log the page multiple times.
> >
> > I like the idea of checking !ItemIdIsDead(iid) as a further condition
> > of killing the item -- there is clearly no point in doing work to kill
> > an item that is already dead. I don't like the idea of using an
> > exclusive buffer lock (even if it's just with wal_log_hints = on),
> > though.
> >
>
> Okay. I think only adding the check would also help with reducing the
> likelihood. How about the changes for the current HEAD I've attached?
FWIW, looks good to me.
> Related to this behavior on btree indexes, this can happen even on
> heaps during searching heap tuples. To reduce the likelihood of that
> more generally I wonder if we can acquire a lock on buffer descriptor
> right before XLogSaveBufferForHint() and set a flag to the buffer
> descriptor that indicates that we're about to log FPI for hint bit so
> that concurrent process can be aware of that.
Makes sense if the lock were acquired just before the "BM_DIRTY |
BM_JUST_DIRTIED) check. Could we use double-checking, as similar to
the patch for ItemIdIsDead()?
> if ((pg_atomic_read_u32(&bufHdr->state) & (BM_DIRTY | BM_JUST_DIRTIED)) !=
> (BM_DIRTY | BM_JUST_DIRTIED))
> {
...
> * essential that CreateCheckpoint waits for virtual transactions
> * rather than full transactionids.
> */
> /* blah, blah */
> buf_state = LockBufHdr(bufHdr);
>
> if (buf_state & (BM_ | BM_JUST) != (..))
> {
> MyProc->delayChkpt = delayChkpt = true;
> lsn = XLogSaveBufferForHint(buffer, buffer_std);
> }
> }
> else
> buf_state = LockBuffer(bufHdr);
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center