Thread: Do not check unlogged indexes on standby

Do not check unlogged indexes on standby

From
Andrey Borodin
Date:
Hi hackers!

Currently, if we check indexes on standby we often get

man-psbpshn0skhsxynd/xiva_xtable_testing_01 R # select bt_index_check('xiva_loadtest.pk_uid');
ERROR:  58P01: could not open file "base/16453/125407": No such file or directory

I think that we should print warning and that's it. Amcheck should not give false positives.

Or, maybe, there are some design considerations that I miss?


BTW I really want to enable rightlink-leftlink invariant validation on standby..

Thanks!

Best regards, Andrey Borodin.


Attachment

Re: Do not check unlogged indexes on standby

From
Peter Geoghegan
Date:
On Mon, Aug 12, 2019 at 2:58 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> Currently, if we check indexes on standby we often get
>
> man-psbpshn0skhsxynd/xiva_xtable_testing_01 R # select bt_index_check('xiva_loadtest.pk_uid');
> ERROR:  58P01: could not open file "base/16453/125407": No such file or directory
>
> I think that we should print warning and that's it. Amcheck should not give false positives.

I agree -- amcheck should just skip over unlogged tables during
recovery, since there is simply nothing to check.

I pushed your patch to all branches that have amcheck just now, so now
we skip over unlogged relations when in recovery, though I made some
revisions.

Your patch didn't handle temp tables/indexes that were created in the
first session correctly -- we must be careful about the distinction
between unlogged tables, and tables that don't require WAL logging
(the later includes temp tables). Also, I thought that it was a good
idea to actively test for the presence of a main fork when we don't
skip (i.e. when the system isn't in recovery and the B-Tree indexes
isn't unlogged) -- we now give a clean report of corruption when that
happens, rather than letting an ambiguous "can't happen" error get
raised by low-level code. This might be possible with system catalog
corruption, for example. Finally, I thought that the WARNING was a bit
strong -- a NOTICE is more appropriate.

Thanks!

--
Peter Geoghegan



Re: Do not check unlogged indexes on standby

From
Peter Geoghegan
Date:
On Mon, Aug 12, 2019 at 2:58 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> BTW I really want to enable rightlink-leftlink invariant validation on standby..

That seems very hard. My hope was that bt_check_index() can detect the
same problem a different way. The bt_right_page_check_scankey()
cross-page check (which needs nothing more than an AccessShareLock)
will often detect such problems, because the page image itself will be
totally wrong in some way.

I'm guessing that you have direct experience with that *not* being
good enough, though. Can you share further details? I suppose that
bt_right_page_check_scankey() helps with transposed pages, but doesn't
help so much when you have WAL-level inconsistencies.

-- 
Peter Geoghegan



Re: Do not check unlogged indexes on standby

From
Andrey Borodin
Date:

> 13 авг. 2019 г., в 3:23, Peter Geoghegan <pg@bowt.ie> написал(а):
>
> I pushed your patch to all branches that have amcheck just now, so now
> we skip over unlogged relations when in recovery, though I made some
> revisions.
Oh, cool, thanks!

> Your patch didn't handle temp tables/indexes that were created in the
> first session correctly -- we must be careful about the distinction
> between unlogged tables, and tables that don't require WAL logging
> (the later includes temp tables). Also, I thought that it was a good
> idea to actively test for the presence of a main fork when we don't
> skip (i.e. when the system isn't in recovery and the B-Tree indexes
> isn't unlogged) -- we now give a clean report of corruption when that
> happens, rather than letting an ambiguous "can't happen" error get
> raised by low-level code. This might be possible with system catalog
> corruption, for example. Finally, I thought that the WARNING was a bit
> strong -- a NOTICE is more appropriate.
+1

> 13 авг. 2019 г., в 3:36, Peter Geoghegan <pg@bowt.ie> написал(а):
>
> On Mon, Aug 12, 2019 at 2:58 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>> BTW I really want to enable rightlink-leftlink invariant validation on standby..
>
> That seems very hard. My hope was that bt_check_index() can detect the
> same problem a different way. The bt_right_page_check_scankey()
> cross-page check (which needs nothing more than an AccessShareLock)
> will often detect such problems, because the page image itself will be
> totally wrong in some way.
>
> I'm guessing that you have direct experience with that *not* being
> good enough, though. Can you share further details? I suppose that
> bt_right_page_check_scankey() helps with transposed pages, but doesn't
> help so much when you have WAL-level inconsistencies.

We have a bunch of internal testing HA clusters that suffered from corruption conditions.
We fixed everything that can be detected with parent-check on primaries or usual check on standbys.
(page updates were lost both on primary and during WAL replay)
But from time to time when clusters switch primary from one availability zone to another we observe
"right sibling's left-link doesn't match: block 32709 links to 37022 instead of expected 40953 in index"

We are going to search for these clusters with this [0] tolerating possible fraction of false positives, we have them
anyway.
But I think I could put some effort into making corruption-detection tooling better.
I think if we observe links discrepancy, we can acquire lock of left and right pages and recheck.


Best regards, Andrey Borodin.

[0] https://github.com/x4m/amcheck/commit/894d8bafb3c9a26bbc168ea5f4f33bcd1fc9f495


Re: Do not check unlogged indexes on standby

From
Peter Geoghegan
Date:
On Tue, Aug 13, 2019 at 5:17 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> We have a bunch of internal testing HA clusters that suffered from corruption conditions.
> We fixed everything that can be detected with parent-check on primaries or usual check on standbys.
> (page updates were lost both on primary and during WAL replay)
> But from time to time when clusters switch primary from one availability zone to another we observe
> "right sibling's left-link doesn't match: block 32709 links to 37022 instead of expected 40953 in index"

That sounds like an issue caused by a failure to replay all available
WAL, where only one page happened to get written out by a checkpoint
before a crash. It's something like that. That wouldn't be caught by
the cross-page bt_index_check() check that we do already.

> We are going to search for these clusters with this [0] tolerating possible fraction of false positives, we have them
anyway.
> But I think I could put some effort into making corruption-detection tooling better.
> I think if we observe links discrepancy, we can acquire lock of left and right pages and recheck.

That's one possibility. When I first designed amcheck it was important
to be conservative, so I invented a general rule about never acquiring
multiple buffer locks at once. I still think that that was the correct
decision for the bt_downlink_check() check (the main extra
bt_index_parent_check() check), but I think that you're right about
retrying to verify the sibling links when bt_index_check() is called
from SQL.

nbtree will often "couple" buffer locks on the leaf level; it will
acquire a lock on a leaf page, and not release that lock until it has
also acquired a lock on the right sibling page (I'm mostly thinking of
_bt_stepright()). I am in favor of a patch that makes amcheck perform
sibling link verification within bt_index_check(), by retrying while
pessimistically coupling buffer locks. (Though I think that that
should just happen on the leaf level. We should not try to be too
clever about ignorable/half-dead/deleted pages, to be conservative.)

-- 
Peter Geoghegan



Re: Do not check unlogged indexes on standby

From
Andrey Borodin
Date:

> 13 авг. 2019 г., в 20:30, Peter Geoghegan <pg@bowt.ie> написал(а):
>
> That's one possibility. When I first designed amcheck it was important
> to be conservative, so I invented a general rule about never acquiring
> multiple buffer locks at once. I still think that that was the correct
> decision for the bt_downlink_check() check (the main extra
> bt_index_parent_check() check), but I think that you're right about
> retrying to verify the sibling links when bt_index_check() is called
> from SQL.
>
> nbtree will often "couple" buffer locks on the leaf level; it will
> acquire a lock on a leaf page, and not release that lock until it has
> also acquired a lock on the right sibling page (I'm mostly thinking of
> _bt_stepright()). I am in favor of a patch that makes amcheck perform
> sibling link verification within bt_index_check(), by retrying while
> pessimistically coupling buffer locks. (Though I think that that
> should just happen on the leaf level. We should not try to be too
> clever about ignorable/half-dead/deleted pages, to be conservative.)

PFA V1 of this check retry.

Best regards, Andrey Borodin.

Attachment

Re: Do not check unlogged indexes on standby

From
Alvaro Herrera
Date:
On 2019-Aug-15, Andrey Borodin wrote:

> PFA V1 of this check retry.

CFbot complains that this doesn't apply; can you please rebase?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Do not check unlogged indexes on standby

From
Peter Geoghegan
Date:
The patch has been committed already. 

Peter Geoghegan
(Sent from my phone)

Re: Do not check unlogged indexes on standby

From
Peter Geoghegan
Date:
On Wed, Sep 11, 2019 at 7:10 PM Peter Geoghegan <pg@bowt.ie> wrote:
> The patch has been committed already.

Oh, wait. It hasn't. Andrey didn't create a new thread for his largely
independent patch, so I incorrectly assumed he created a CF entry for
his original bugfix.


-- 
Peter Geoghegan



Re: Do not check unlogged indexes on standby

From
Alvaro Herrera
Date:
On 2019-Sep-11, Peter Geoghegan wrote:

> On Wed, Sep 11, 2019 at 7:10 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > The patch has been committed already.
> 
> Oh, wait. It hasn't. Andrey didn't create a new thread for his largely
> independent patch, so I incorrectly assumed he created a CF entry for
> his original bugfix.

So, I'm confused.  There appear to be two bugfix patches in this thread,
with no relationship between them, and as far as I can tell only one of
them has been addressed.  What was applied (6754fe65a4c6) is
significantly different from what Andrey submitted.  Is that correct?
If so, we still have an open bug, right?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Do not check unlogged indexes on standby

From
Peter Geoghegan
Date:
On Wed, Feb 5, 2020 at 1:27 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> So, I'm confused.  There appear to be two bugfix patches in this thread,
> with no relationship between them, and as far as I can tell only one of
> them has been addressed.  What was applied (6754fe65a4c6) is
> significantly different from what Andrey submitted.  Is that correct?
> If so, we still have an open bug, right?

No. We had two separate patches on this thread:

1. A bugfix patch to make amcheck not do the wrong thing with unlogged
indexes when operating on a standby.

2. An unrelated feature/enhancement that would allow amcheck to detect
more types of corruption with only an AccessShareLock on the relation.

The first item was dealt with way back in August, without controversy
-- my commit 6754fe65 was more or less Andrey's bugfix.

The second item genereated another thread a little after this thread.
Everything was handled on this other thread. Ultimately, I rejected
the enhancement on the grounds that it wasn't safe on standbys in the
face of concurrent splits and deletions [1].

I believe that all of the items discussed on this thread have been
resolved. Did I miss a CF entry or something?

[1] https://postgr.es/m/CAH2-Wzmb_QOmHX=uWjCFV4Gf1810kz-yVzK6RA=VS41EFcKh=g@mail.gmail.com
-- 
Peter Geoghegan



Re: Do not check unlogged indexes on standby

From
Alvaro Herrera
Date:
On 2020-Feb-05, Peter Geoghegan wrote:

> The second item genereated another thread a little after this thread.
> Everything was handled on this other thread. Ultimately, I rejected
> the enhancement on the grounds that it wasn't safe on standbys in the
> face of concurrent splits and deletions [1].
> 
> I believe that all of the items discussed on this thread have been
> resolved. Did I miss a CF entry or something?

Nah.  I just had one of the messages flagged in my inbox, and I wasn't
sure what had happened since the other thread was not referenced in this
one.  I wasn't looking at any CF entries.

Thanks for the explanation,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services