Re: Amcheck: do rightlink verification with lock coupling - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Amcheck: do rightlink verification with lock coupling
Date
Msg-id CAH2-Wz=Mz6-588AKMw5p0CpRq5a5SWMsC0QM+BOnLCjjvpqV9Q@mail.gmail.com
Whole thread Raw
In response to Re: Amcheck: do rightlink verification with lock coupling  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: Amcheck: do rightlink verification with lock coupling
List pgsql-hackers
On Mon, Jan 13, 2020 at 8:47 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> > 11 янв. 2020 г., в 7:49, Peter Geoghegan <pg@bowt.ie> написал(а):
> > I'm curious why Andrey's corruption problems were not detected by the
> > cross-page amcheck test, though. We compare the first non-pivot tuple
> > on the right sibling leaf page with the last one on the target page,
> > towards the end of bt_target_page_check() -- isn't that almost as good
> > as what you have here in practice? I probably would have added
> > something like this myself earlier, if I had reason to think that
> > verification would be a lot more effective that way.
>
> We were dealing with corruption caused by lost page update. Consider two pages:
> A->B
> If A is split into A` and C we get:
> A`->C->B
> But if update of A is lost we still have
> A->B, but B backward pointers points to C.
> B's smallest key is bigger than hikey of A`, but this do not violate
> cross-pages invariant.
>
> Page updates may be lost due to bug in backup software with incremental
> backups, bug in storage layer of Aurora-style system, bug in page cache, incorrect
> fsync error handling, bug in ssd firmware etc. And our data checksums do not
> detect this kind of corruption. BTW I think that it would be better if our
> checksums were not stored on a page itseft, they could detect this kind of faults.

I find this argument convincing. I'll try to get this committed soon.

While you could have used bt_index_parent_check() or heapallindexed to
detect the issue, those two options are a lot more expensive (plus the
former option won't work on a standby). Relaxing the principle that
says that we shouldn't hold multiple buffer locks concurrently doesn't
seem like that much to ask for to get such a clear benefit.

I think that this is safe, but page deletion/half-dead pages need more
thought. In general, the target page could have become "ignorable"
when rechecked.

> We were able to timely detect all those problems on primaries in our testing
> environment. But much later found out that some standbys were corrupted,
> the problem appeared only when they were promoted.
> Also, in nearby thread Grygory Rylov (g0djan) is trying to enable one more
> invariant in standby checks.

I looked at that thread just now, but Grygory didn't say why this true
root check was particularly important, so I can't see much upside.
Plus that seems riskier than what you have in mind here.

Does it have something to do with the true root looking like a deleted
page? The details matter.

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Setting min/max TLS protocol in clientside libpq
Next
From: Michael Paquier
Date:
Subject: Re: pgindent && weirdness