Re: Concurrency bug in amcheck - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Concurrency bug in amcheck
Date
Msg-id CAPpHfdv2qLS7ApVKTzshnrZi5qULpPp2a1f4PWkEZMUCiVf5Ow@mail.gmail.com
Whole thread Raw
In response to Re: Concurrency bug in amcheck  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Concurrency bug in amcheck
List pgsql-hackers
On Tue, Apr 28, 2020 at 4:05 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, Apr 27, 2020 at 4:17 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > > Assuming it doesn't seem we actually need any items on deleted pages,
> > > I can propose to delete them on primary as well.  That would make
> > > contents of primary and standby more consistent.  What do you think?
> >
> > So, my proposal is following.  Backpatch my fix upthread to 11.  In
> > master additionally make _bt_unlink_halfdead_page() remove page items
> > on primary as well.  Do you have any objections?
>
> What I meant was that we might as well review the behavior of
> _bt_unlink_halfdead_page() here, since we have to change it anyway.
> But I think you're right: No matter what happens or doesn't happen to
> _bt_unlink_halfdead_page(), the fact is that deleted pages may or may
> not have a single remaining item (which was originally the "top
> parent" item from the page at offset number P_HIKEY), now and forever.
> We have to conservatively assume that it could be either state, now
> and forever. That means that we definitely have to give up on the
> check, per your patch. So, +1 for backpatching that back to 11.

Thank you.  I've worked a bit on comments and commit message.  I would
appreciate you review.

> (BTW, I don't think that this is a concurrency issue, except in the
> sense that a test case that recreates the false positive is sensitive
> to concurrency -- I imagine you agree with this.)

Yes, I agree it's related to concurrency, but not exactly concurrency issue.

> I like your idea of making the primary consistent with the REDO
> routine on the master branch only. I wonder if that will make it
> possible to change btree_mask() so that wal_consistency_checking can
> check deleted pages as well. The contents of a deleted page's special
> area do matter, and yet we don't currently verify that it matches (we
> use mask_page_content() within btree_mask() for deleted pages, which
> seems inappropriately broad). In particular, the left and right
> sibling links should be consistent with the primary on a deleted page.

Thank you.  2nd patch is proposed for master and makes btree page
unlink remove all the items from the page being deleted.


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: Add "-Wimplicit-fallthrough" to default flags (was Re: pgsql:Support FETCH FIRST WITH TIES)
Next
From: Ashutosh Bapat
Date:
Subject: Re: making update/delete of inheritance trees scale better