Thread: Concurrency bug in amcheck

Concurrency bug in amcheck

From
Alexander Korotkov
Date:
Hi!

I found concurrency bug in amcheck running on replica.  When
btree_xlog_unlink_page() replays changes to replica, deleted page is
left with no items.  But if amcheck steps on such deleted page
palloc_btree_page() expects it would have items.

(lldb_on_primary) b btbulkdelete

primary=# drop table test;
primary=# create table test as (select random() x from
generate_series(1,1000000) i);
primary=# create index test_x_idx on test(x);
primary=# delete from test;
primary=# vacuum test;

(lldb_on_replica) b bt_check_level_from_leftmost

replica=# select bt_index_check('test_x_idx');

# skip to internal level
(lldb_on_replica) c
(lldb_on_replica) b palloc_btree_page
# skip to non-leftmost page
(lldb_on_replica) c
(lldb_on_replica) c
# concurrently delete btree pages
(lldb_on_primary) c
# continue with pages
(lldb_on_replica) c

Finally replica gets error.
ERROR:  internal block 289 in index "test_x_idx" lacks high key and/or
at least one downlink

Proposed fix is attached.  Spotted by Konstantin Knizhnik,
reproduction case and fix from me.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: Concurrency bug in amcheck

From
Alexander Korotkov
Date:
On Tue, Apr 21, 2020 at 12:54 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> I found concurrency bug in amcheck running on replica.  When
> btree_xlog_unlink_page() replays changes to replica, deleted page is
> left with no items.  But if amcheck steps on such deleted page
> palloc_btree_page() expects it would have items.

I forgot to mention that I've reproduced it on master.  Quick check
shows bug should exist since 11.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Concurrency bug in amcheck

From
Andres Freund
Date:
Hi,

Peter, Just thought you might want to see this one...

On 2020-04-21 15:31:13 +0300, Alexander Korotkov wrote:
> On Tue, Apr 21, 2020 at 12:54 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > I found concurrency bug in amcheck running on replica.  When
> > btree_xlog_unlink_page() replays changes to replica, deleted page is
> > left with no items.  But if amcheck steps on such deleted page
> > palloc_btree_page() expects it would have items.
> 
> I forgot to mention that I've reproduced it on master.  Quick check
> shows bug should exist since 11.

- Andres



Re: Concurrency bug in amcheck

From
Peter Geoghegan
Date:
On Tue, Apr 21, 2020 at 2:54 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Proposed fix is attached.  Spotted by Konstantin Knizhnik,
> reproduction case and fix from me.

I wonder if we should fix btree_xlog_unlink_page() instead of amcheck.
We already know that its failure to be totally consistent with the
primary causes problems for backwards scans -- this problem can be
fixed at the same time:

https://postgr.es/m/CANtu0ohkR-evAWbpzJu54V8eCOtqjJyYp3PQ_SGoBTRGXWhWRw@mail.gmail.com

We'd probably still use your patch for the backbranches if we went this way.

What do you think?

-- 
Peter Geoghegan



Re: Concurrency bug in amcheck

From
Alexander Korotkov
Date:
On Wed, Apr 22, 2020 at 7:47 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Apr 21, 2020 at 2:54 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > Proposed fix is attached.  Spotted by Konstantin Knizhnik,
> > reproduction case and fix from me.
>
> I wonder if we should fix btree_xlog_unlink_page() instead of amcheck.
> We already know that its failure to be totally consistent with the
> primary causes problems for backwards scans -- this problem can be
> fixed at the same time:
>
> https://postgr.es/m/CANtu0ohkR-evAWbpzJu54V8eCOtqjJyYp3PQ_SGoBTRGXWhWRw@mail.gmail.com
>
> We'd probably still use your patch for the backbranches if we went this way.
>
> What do you think?

I've skip through the thread.  It seems to be quite independent issue
from this one.  This issue is related to the fact that we leave some
items on deleted pages on primary, and on the same time we have no
items on deleted pages on standby.  This inconsistency cause amcheck
pass normally on primary, but fail on standby.  BackwardScan on
standby issue seems to be related solely on locking protocol and
btpo_prev, btpo_next pointers.  It wasn't mention on that thread that
we might need hikeys on deleted pages.

Assuming it doesn't seem we actually need any items on deleted pages,
I can propose to delete them on primary as well.  That would make
contents of primary and standby more consistent.  What do you think?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Concurrency bug in amcheck

From
Alexander Korotkov
Date:
On Mon, Apr 27, 2020 at 11:51 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Apr 22, 2020 at 7:47 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > On Tue, Apr 21, 2020 at 2:54 AM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > > Proposed fix is attached.  Spotted by Konstantin Knizhnik,
> > > reproduction case and fix from me.
> >
> > I wonder if we should fix btree_xlog_unlink_page() instead of amcheck.
> > We already know that its failure to be totally consistent with the
> > primary causes problems for backwards scans -- this problem can be
> > fixed at the same time:
> >
> > https://postgr.es/m/CANtu0ohkR-evAWbpzJu54V8eCOtqjJyYp3PQ_SGoBTRGXWhWRw@mail.gmail.com
> >
> > We'd probably still use your patch for the backbranches if we went this way.
> >
> > What do you think?
>
> I've skip through the thread.  It seems to be quite independent issue
> from this one.  This issue is related to the fact that we leave some
> items on deleted pages on primary, and on the same time we have no
> items on deleted pages on standby.  This inconsistency cause amcheck
> pass normally on primary, but fail on standby.  BackwardScan on
> standby issue seems to be related solely on locking protocol and
> btpo_prev, btpo_next pointers.  It wasn't mention on that thread that
> we might need hikeys on deleted pages.
>
> Assuming it doesn't seem we actually need any items on deleted pages,
> I can propose to delete them on primary as well.  That would make
> contents of primary and standby more consistent.  What do you think?

So, my proposal is following.  Backpatch my fix upthread to 11.  In
master additionally make _bt_unlink_halfdead_page() remove page items
on primary as well.  Do you have any objections?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Concurrency bug in amcheck

From
Peter Geoghegan
Date:
On Mon, Apr 27, 2020 at 4:17 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> > Assuming it doesn't seem we actually need any items on deleted pages,
> > I can propose to delete them on primary as well.  That would make
> > contents of primary and standby more consistent.  What do you think?
>
> So, my proposal is following.  Backpatch my fix upthread to 11.  In
> master additionally make _bt_unlink_halfdead_page() remove page items
> on primary as well.  Do you have any objections?

What I meant was that we might as well review the behavior of
_bt_unlink_halfdead_page() here, since we have to change it anyway.
But I think you're right: No matter what happens or doesn't happen to
_bt_unlink_halfdead_page(), the fact is that deleted pages may or may
not have a single remaining item (which was originally the "top
parent" item from the page at offset number P_HIKEY), now and forever.
We have to conservatively assume that it could be either state, now
and forever. That means that we definitely have to give up on the
check, per your patch. So, +1 for backpatching that back to 11.

(BTW, I don't think that this is a concurrency issue, except in the
sense that a test case that recreates the false positive is sensitive
to concurrency -- I imagine you agree with this.)

I like your idea of making the primary consistent with the REDO
routine on the master branch only. I wonder if that will make it
possible to change btree_mask() so that wal_consistency_checking can
check deleted pages as well. The contents of a deleted page's special
area do matter, and yet we don't currently verify that it matches (we
use mask_page_content() within btree_mask() for deleted pages, which
seems inappropriately broad). In particular, the left and right
sibling links should be consistent with the primary on a deleted page.

-- 
Peter Geoghegan



Re: Concurrency bug in amcheck

From
Alexander Korotkov
Date:
On Tue, Apr 28, 2020 at 4:05 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, Apr 27, 2020 at 4:17 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > > Assuming it doesn't seem we actually need any items on deleted pages,
> > > I can propose to delete them on primary as well.  That would make
> > > contents of primary and standby more consistent.  What do you think?
> >
> > So, my proposal is following.  Backpatch my fix upthread to 11.  In
> > master additionally make _bt_unlink_halfdead_page() remove page items
> > on primary as well.  Do you have any objections?
>
> What I meant was that we might as well review the behavior of
> _bt_unlink_halfdead_page() here, since we have to change it anyway.
> But I think you're right: No matter what happens or doesn't happen to
> _bt_unlink_halfdead_page(), the fact is that deleted pages may or may
> not have a single remaining item (which was originally the "top
> parent" item from the page at offset number P_HIKEY), now and forever.
> We have to conservatively assume that it could be either state, now
> and forever. That means that we definitely have to give up on the
> check, per your patch. So, +1 for backpatching that back to 11.

Thank you.  I've worked a bit on comments and commit message.  I would
appreciate you review.

> (BTW, I don't think that this is a concurrency issue, except in the
> sense that a test case that recreates the false positive is sensitive
> to concurrency -- I imagine you agree with this.)

Yes, I agree it's related to concurrency, but not exactly concurrency issue.

> I like your idea of making the primary consistent with the REDO
> routine on the master branch only. I wonder if that will make it
> possible to change btree_mask() so that wal_consistency_checking can
> check deleted pages as well. The contents of a deleted page's special
> area do matter, and yet we don't currently verify that it matches (we
> use mask_page_content() within btree_mask() for deleted pages, which
> seems inappropriately broad). In particular, the left and right
> sibling links should be consistent with the primary on a deleted page.

Thank you.  2nd patch is proposed for master and makes btree page
unlink remove all the items from the page being deleted.


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: Concurrency bug in amcheck

From
Peter Geoghegan
Date:
On Mon, May 11, 2020 at 5:56 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Thank you.  I've worked a bit on comments and commit message.  I would
> appreciate you review.

This looks good to me.

> > I like your idea of making the primary consistent with the REDO
> > routine on the master branch only. I wonder if that will make it
> > possible to change btree_mask() so that wal_consistency_checking can
> > check deleted pages as well. The contents of a deleted page's special
> > area do matter, and yet we don't currently verify that it matches (we
> > use mask_page_content() within btree_mask() for deleted pages, which
> > seems inappropriately broad). In particular, the left and right
> > sibling links should be consistent with the primary on a deleted page.
>
> Thank you.  2nd patch is proposed for master and makes btree page
> unlink remove all the items from the page being deleted.

This looks good, but can we do the
wal_consistency_checking/btree_mask() improvement, too?

There is no reason why it can't work with fully deleted pages. It
already works with half-dead pages. It would be nice to be able to
test this patch in that way, and it would be nice to have it in
general.

-- 
Peter Geoghegan



Re: Concurrency bug in amcheck

From
Peter Geoghegan
Date:
On Wed, May 13, 2020 at 4:06 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, May 11, 2020 at 5:56 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > Thank you.  2nd patch is proposed for master and makes btree page
> > unlink remove all the items from the page being deleted.
>
> This looks good, but can we do the
> wal_consistency_checking/btree_mask() improvement, too?

You never got around to committing the second patch (or the
wal_consistency_checking stuff). Are you planning on picking it up
again?

I'm currently working on this bug fix from Michail Nikolaev:

https://postgr.es/m/CANtu0ohkR-evAWbpzJu54V8eCOtqjJyYp3PQ_SGoBTRGXWhWRw@mail.gmail.com

It would be nice if you could commit your second patch at around the
same time. It's related IMV.

Thanks
-- 
Peter Geoghegan



Re: Concurrency bug in amcheck

From
Alexander Korotkov
Date:
Hi, Peter!

On Sat, Aug 1, 2020 at 3:23 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, May 13, 2020 at 4:06 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, May 11, 2020 at 5:56 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > Thank you.  2nd patch is proposed for master and makes btree page
> > unlink remove all the items from the page being deleted.
>
> This looks good, but can we do the
> wal_consistency_checking/btree_mask() improvement, too?

You never got around to committing the second patch (or the
wal_consistency_checking stuff). Are you planning on picking it up
again?

Thank you for your reminder.   Revised patch is attached.  Now, the contents of deleted btree pages isn't masked.  I've checked that installcheck passes with wal_consistency_checking='Btree'.  I'm going to push this if no objections.

------
Regards,
Alexander Korotkov 
Attachment

Re: Concurrency bug in amcheck

From
Peter Geoghegan
Date:
Hi Alexander,

On Tue, Aug 4, 2020 at 7:27 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> Thank you for your reminder.   Revised patch is attached.  Now, the contents of deleted btree pages isn't masked.
I'vechecked that installcheck passes with wal_consistency_checking='Btree'.  I'm going to push this if no objections.
 

This looks good to me. One small thing, though: maybe the comments
should not say anything about the REDO routine -- that seems like a
case of "the tail wagging the dog" to me. Perhaps say something like:

"Remove the last pivot tuple on the page.  This keeps things simple
for WAL consistency checking."

(Just a suggestion.)

Thanks!
-- 
Peter Geoghegan



Re: Concurrency bug in amcheck

From
Alexander Korotkov
Date:
On Wed, Aug 5, 2020 at 1:58 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Aug 4, 2020 at 7:27 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > Thank you for your reminder.   Revised patch is attached.  Now, the contents of deleted btree pages isn't masked.
I'vechecked that installcheck passes with wal_consistency_checking='Btree'.  I'm going to push this if no objections.
 
>
> This looks good to me. One small thing, though: maybe the comments
> should not say anything about the REDO routine -- that seems like a
> case of "the tail wagging the dog" to me. Perhaps say something like:
>
> "Remove the last pivot tuple on the page.  This keeps things simple
> for WAL consistency checking."

Pushed.  Comment is changed as you suggested.  But I've replaced "last
pivot tuple" with "remaining tuples", because the page can also have a
high key, which is also a tuple.

------
Regards,
Alexander Korotkov



Re: Concurrency bug in amcheck

From
Peter Geoghegan
Date:
On Tue, Aug 4, 2020 at 4:18 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> Pushed.  Comment is changed as you suggested.  But I've replaced "last
> pivot tuple" with "remaining tuples", because the page can also have a
> high key, which is also a tuple.

You're right, of course.

Thanks again
-- 
Peter Geoghegan