Thread: BUG #16582: Logical index corruption leading to apparent index scan infinite loop
BUG #16582: Logical index corruption leading to apparent index scan infinite loop
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 16582 Logged by: James Lucas Email address: jlucasdba@gmail.com PostgreSQL version: 11.5 Operating system: Centos 7 Description: I have an 11.5 database with a declaratively partitioned table. The table has around 40 partitions. We began observing a number of long running sessions (over a week), all executing the same select query against this table. Attempting to kill the sessions with pg_terminate_backend has no effect. I pulled several stack traces against the backend processes executing this query and traced the issue back to the _bt_moveright function in src/backend/access/nbtree/nbtsearch.c. This function appears to be implemented as an infinite loop, which breaks when it finds a page that matches what it's looking for. So my working theory is that I have a corrupt index where that condition is never properly satisfied, resulting in an endless loop. I found a thread in the mailing list archive about a similar issue in one of the other functions in this file (https://www.postgresql.org/message-id/flat/CAM3SWZT7dCes%3DuOA3NAHYBA1kth%3Db4pXkszNLMPVtNAAYUp_wg%40mail.gmail.com#e9bf163da8f50f0f2845b59250da6ab5). The _bt_moveright function does not have CHECK_FOR_INTERRUPTS in its loop, which would explain why the sessions don't respond to pg_terminate_backend. The table has two indexes, so I decided to scan both indexes on all partitions with the bt_index_check function from the amcheck extension. I identified one partition where both indexes throw the following result: ERROR: cross page item order invariant violated for index "xxxxx" DETAIL: Last item on page tid(xx,xx) page lsn=xxxxxxxxxx Since both indexes appear to have the same issue, it suggests to me something about this table/data may be triggering a bug. I have not yet tried reindexing. I did pg_dump/pg_restore the entire table (all partitions) to another system (also 11.5), and the indexes built there test okay. Other possibly relevant information: The table is range partitioned on an insert timestamp (timestamp(0) without timezone). The two indexes involved are both composite indexes, over columns of types (numeric, numeric, numeric) and (numeric, timestamp(6) without time zone). The partition key is not covered by either index. There is also no primary key. I realize it may not be possible to remote diagnose the root cause of the index issues, but figured it was worth reporting. My next steps would probably be to try reindexing the affected partition. If that doesn't work, it seems like either a vacuum full or pg_dump/pg_restore would be the next step. Any other suggestions appreciated. Thanks, James
Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop
From
Peter Geoghegan
Date:
On Fri, Aug 14, 2020 at 2:03 PM PG Bug reporting form <noreply@postgresql.org> wrote: > The table has two indexes, so I decided to scan both indexes on all > partitions with the bt_index_check function from the amcheck extension. I > identified one partition where both indexes throw the following result: > ERROR: cross page item order invariant violated for index "xxxxx" > DETAIL: Last item on page tid(xx,xx) page lsn=xxxxxxxxxx This sounds very much like an index with sibling pages that are in the wrong order relative to each other. That's totally consistent with what you describe with _bt_moveright() -- circular sibling links can cause it to just keep going. It's possible that you'll get a better error with bt_index_parent_check(), which might be worth trying. But it probably won't give you any additional information. Note that there is DEBUG1 and DEBUG2 output from amcheck, which might give you a few more details. You can "set client_min_messages = 'debug2'" in an interactive session that runs bt_index_check() to see some additional context. Again, this is unlikely to make all that much difference. -- Peter Geoghegan
Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop
From
James Lucas
Date:
Hi Peter, I re-ran with DEBUG2 messages enabled. Got a bunch of output, but the last few lines are like this for each index: DEBUG: level 965868789 leftmost page of index "xxxxx" was found deleted or half dead DETAIL: Deleted page found when building scankey from right sibling. DEBUG: level 966240004 leftmost page of index "xxxxx" was found deleted or half dead DETAIL: Deleted page found when building scankey from right sibling. ERROR: cross page item order invariant violated for index "xxxxx" DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx DEBUG: level 967745369 leftmost page of index "xxxxx" was found deleted or half dead DETAIL: Deleted page found when building scankey from right sibling. DEBUG: level 967746918 leftmost page of index "xxxxx" was found deleted or half dead DETAIL: Deleted page found when building scankey from right sibling. ERROR: cross page item order invariant violated for index "xxxxx" DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx Not sure if pageinspect might be able to tell anything else useful? I'd like to find the root cause of the corruption if possible, so this doesn't happen in other databases. Also wanted to see if it might be a good idea to add a CHECK_FOR_INTERRUPTS call to _bt_moveright() so if this does happen again, at least the session would be killable. I don't have enough background in the code to know where it's safe to add, or I'd submit a patch. Thanks, James On Fri, Aug 14, 2020 at 4:33 PM Peter Geoghegan <pg@bowt.ie> wrote: > > On Fri, Aug 14, 2020 at 2:03 PM PG Bug reporting form > <noreply@postgresql.org> wrote: > > The table has two indexes, so I decided to scan both indexes on all > > partitions with the bt_index_check function from the amcheck extension. I > > identified one partition where both indexes throw the following result: > > ERROR: cross page item order invariant violated for index "xxxxx" > > DETAIL: Last item on page tid(xx,xx) page lsn=xxxxxxxxxx > > This sounds very much like an index with sibling pages that are in the > wrong order relative to each other. That's totally consistent with > what you describe with _bt_moveright() -- circular sibling links can > cause it to just keep going. > > It's possible that you'll get a better error with > bt_index_parent_check(), which might be worth trying. But it probably > won't give you any additional information. > > Note that there is DEBUG1 and DEBUG2 output from amcheck, which might > give you a few more details. You can "set client_min_messages = > 'debug2'" in an interactive session that runs bt_index_check() to see > some additional context. Again, this is unlikely to make all that much > difference. > > -- > Peter Geoghegan
Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop
From
James Lucas
Date:
Forgot to say, I don't think I can run bt_index_parent_check() right now due to the broader locks required. I will try to get a run in if I get an opportunity. Thanks, James On Mon, Aug 17, 2020 at 10:51 AM James Lucas <jlucasdba@gmail.com> wrote: > > Hi Peter, > > I re-ran with DEBUG2 messages enabled. Got a bunch of output, but the > last few lines are like this for each index: > > DEBUG: level 965868789 leftmost page of index "xxxxx" was found > deleted or half dead > DETAIL: Deleted page found when building scankey from right sibling. > DEBUG: level 966240004 leftmost page of index "xxxxx" was found > deleted or half dead > DETAIL: Deleted page found when building scankey from right sibling. > ERROR: cross page item order invariant violated for index "xxxxx" > DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx > > DEBUG: level 967745369 leftmost page of index "xxxxx" was found > deleted or half dead > DETAIL: Deleted page found when building scankey from right sibling. > DEBUG: level 967746918 leftmost page of index "xxxxx" was found > deleted or half dead > DETAIL: Deleted page found when building scankey from right sibling. > ERROR: cross page item order invariant violated for index "xxxxx" > DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx > > > Not sure if pageinspect might be able to tell anything else useful? > I'd like to find the root cause of the corruption if possible, so this > doesn't happen in other databases. > > Also wanted to see if it might be a good idea to add a > CHECK_FOR_INTERRUPTS call to _bt_moveright() so if this does happen > again, at least the session would be killable. I don't have enough > background in the code to know where it's safe to add, or I'd submit a > patch. > > Thanks, > James > > On Fri, Aug 14, 2020 at 4:33 PM Peter Geoghegan <pg@bowt.ie> wrote: > > > > On Fri, Aug 14, 2020 at 2:03 PM PG Bug reporting form > > <noreply@postgresql.org> wrote: > > > The table has two indexes, so I decided to scan both indexes on all > > > partitions with the bt_index_check function from the amcheck extension. I > > > identified one partition where both indexes throw the following result: > > > ERROR: cross page item order invariant violated for index "xxxxx" > > > DETAIL: Last item on page tid(xx,xx) page lsn=xxxxxxxxxx > > > > This sounds very much like an index with sibling pages that are in the > > wrong order relative to each other. That's totally consistent with > > what you describe with _bt_moveright() -- circular sibling links can > > cause it to just keep going. > > > > It's possible that you'll get a better error with > > bt_index_parent_check(), which might be worth trying. But it probably > > won't give you any additional information. > > > > Note that there is DEBUG1 and DEBUG2 output from amcheck, which might > > give you a few more details. You can "set client_min_messages = > > 'debug2'" in an interactive session that runs bt_index_check() to see > > some additional context. Again, this is unlikely to make all that much > > difference. > > > > -- > > Peter Geoghegan