Re: BTP_DELETED leaf still in tree - Mailing list pgsql-hackers
From | Daniel Wood |
---|---|
Subject | Re: BTP_DELETED leaf still in tree |
Date | |
Msg-id | 1430933023.970983.1570751086396@connect.xfinity.com Whole thread Raw |
In response to | Re: BTP_DELETED leaf still in tree (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: BTP_DELETED leaf still in tree
|
List | pgsql-hackers |
> On October 10, 2019 at 1:18 PM Peter Geoghegan <pg@bowt.ie> wrote: > > > On Thu, Oct 10, 2019 at 12:48 PM Daniel Wood <hexexpert@comcast.net> wrote: > > Update query stuck in a loop. Looping in _bt_moveright(). > > You didn't say which PostgreSQL versions were involved, and if the > database was ever upgraded using pg_upgrade. Those details could > matter. PG_VERSION says 10. I suspect we are running 10.9. I have no idea if pg_upgrade was ever done. > > ExecInsertIndexTuples->btinsert->_bt_doinsert->_bt_search->_bt_moveright > > > > Mid Tree Node downlink path taken by _bt_search points to a BTP_DELETED Leaf. > > This should hardly ever happen -- it is barely possible for an index > scan to land on a BTP_DELETED leaf page (or a half-dead page) when > following a downlink in its parent. Recall that nbtree uses Lehman & > Yao's design, so _bt_search() does not "couple" buffer locks on the > way down. It would probably be impossible to observe this happening > without carefully setting breakpoints in multiple sessions. > > If this happens reliably for you, which it sounds like, then you can > already assume that the index is corrupt. > > > btpo_next is also DELETED but not in the tree. > > > > btpo_next->btpo_next is NOT deleted but in the mid tree as a lesser key value. > > > > Thus creating an endless loop in moveright. > > Offhand, these other details sound normal. The side links are still > needed in fully deleted (BTP_DELETED) pages. And, moving right and > finding lesser key values (not greater key values) is normal with > deleted pages, since page deletion makes the keyspace move right, not > left (moving the keyspace left is how the source Lanin & Shasha paper > does it, though). > > Actually, I take it back -- the looping part is not normal. The > btpo_next->btpo_next page has no business linking back to the > original/first deleted page you mentioned. That's just odd. btpo_next->btpo_next does NOT link directly back to the 1st deleted page. It simply links to some in-use page which is 50or so leaf pages back in the tree. Eventually we do reach the two deleted pages again. Only the first one is in the 'tree'. > Can you provide me with a dump of the page images? The easiest way of > getting a page dump is described here: Customer data. Looks like meaningless customer data (5 digit key values). But too much paperwork. :-) The hard part for me to understand isn't just why the DELETED leaf node is still referenced in the mid tree node. It is that the step which sets BTP_DELETED should have also linked its leaf and right siblings together. But this hasn'tbeen done. Could the page have already have been dirty, but because of "target != leafblkno", we didn't stamp a new LSN on it. Couldthis allow us to write the DELETED dirty page without the XLOG_BTREE_MARK_PAGE_HALFDEAD and XLOG_BTREE_UNLINK_PAGE beingflushed? Of course, I don't understand the "target != leafblkno". In any case, thanks. > https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#contrib.2Fpageinspect_page_dump > > If I had to guess, I'd guess that this was due to a generic storage problem. > > -- > Peter Geoghegan
pgsql-hackers by date: