Re: Concurrency bug in amcheck - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Concurrency bug in amcheck
Date
Msg-id CAH2-Wzk3K6o0EbieaGFHmqfYD-af86gm6yp1KHyL4WdOduMyWg@mail.gmail.com
Whole thread Raw
In response to Re: Concurrency bug in amcheck  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: Concurrency bug in amcheck  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Mon, Apr 27, 2020 at 4:17 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> > Assuming it doesn't seem we actually need any items on deleted pages,
> > I can propose to delete them on primary as well.  That would make
> > contents of primary and standby more consistent.  What do you think?
>
> So, my proposal is following.  Backpatch my fix upthread to 11.  In
> master additionally make _bt_unlink_halfdead_page() remove page items
> on primary as well.  Do you have any objections?

What I meant was that we might as well review the behavior of
_bt_unlink_halfdead_page() here, since we have to change it anyway.
But I think you're right: No matter what happens or doesn't happen to
_bt_unlink_halfdead_page(), the fact is that deleted pages may or may
not have a single remaining item (which was originally the "top
parent" item from the page at offset number P_HIKEY), now and forever.
We have to conservatively assume that it could be either state, now
and forever. That means that we definitely have to give up on the
check, per your patch. So, +1 for backpatching that back to 11.

(BTW, I don't think that this is a concurrency issue, except in the
sense that a test case that recreates the false positive is sensitive
to concurrency -- I imagine you agree with this.)

I like your idea of making the primary consistent with the REDO
routine on the master branch only. I wonder if that will make it
possible to change btree_mask() so that wal_consistency_checking can
check deleted pages as well. The contents of a deleted page's special
area do matter, and yet we don't currently verify that it matches (we
use mask_page_content() within btree_mask() for deleted pages, which
seems inappropriately broad). In particular, the left and right
sibling links should be consistent with the primary on a deleted page.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: James Coleman
Date:
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Next
From: Amit Kapila
Date:
Subject: Re: WAL usage calculation patch