Re: Index use during Hot Standby - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Index use during Hot Standby
Date
Msg-id 48FCC9DD.9090307@enterprisedb.com
In response to Re: Index use during Hot Standby  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Index use during Hot Standby
List pgsql-hackers
Simon Riggs wrote:
> OK, I think I've found a problem.
> 
> In _bt_insertonpg(), if we split we do _bt_split() then do
> _bt_insert_parent(), which then does _bt_insertonpg() recursively.
> 
> _bt_split() writes a WAL record but continues holding write locks.
> btree_xlog_split() reads WAL record and does *not* continue to hold
> write locks. So recovery locking differs from Lehman & Yao requirements
> at that point.

Hmm. I don't have Lehman & Yao's paper at hand, but I fail to see what 
would go wrong.

Recovery of a split works like this:

1. Reconstruct new right sibling from scratch. Keep locked
2. Update old page (= new left sibling). Keep locked
3. Release locks on both pages.
4. Update the left-link of the page to the right of the new right sibling.
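
To make the ordering concrete, here is a minimal toy sketch of the four steps in Python. It is not the actual btree_xlog_split() code: the `Page` class, the `replay_split` function, and the boolean `locked` flag are all hypothetical stand-ins for buffers and buffer locks, and block numbers are plain dictionary keys.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Page:
    keys: List[int]
    left: Optional[int] = None    # block number of the left sibling
    right: Optional[int] = None   # block number of the right sibling
    locked: bool = False          # stand-in for a buffer lock


def replay_split(pages: Dict[int, Page], old: int, new_right: int,
                 left_keys: List[int], right_keys: List[int]) -> List[str]:
    """Toy replay of a split record, following the four steps in order."""
    trace = []
    old_right = pages[old].right

    # 1. Reconstruct the new right sibling from scratch; keep it locked.
    pages[new_right] = Page(keys=list(right_keys), left=old,
                            right=old_right, locked=True)
    trace.append("1: built right sibling, locked")

    # 2. Update the old page (now the left sibling); keep it locked.
    pages[old].locked = True
    pages[old].keys = list(left_keys)
    pages[old].right = new_right
    trace.append("2: updated left sibling, locked")

    # 3. Release locks on both pages.
    pages[old].locked = False
    pages[new_right].locked = False
    trace.append("3: released both locks")

    # 4. Fix the left-link of the page to the right of the new right sibling.
    if old_right is not None:
        pages[old_right].left = new_right
    trace.append("4: fixed left-link of far-right page")
    return trace


# Page 1 splits into pages 1 and 3; page 2 is the old right neighbour.
pages = {1: Page([10, 20, 30, 40], right=2), 2: Page([50, 60], left=1)}
trace = replay_split(pages, old=1, new_right=3,
                     left_keys=[10, 20], right_keys=[30, 40])
```

In this model the only window in which the links disagree is between steps 3 and 4, when page 2's left-link still names page 1.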

Searchers descending the tree work just fine without the pointer in the
parent page to the new right sibling, just slower, because they will
always land on the left sibling and might have to move right from there.
Searchers moving from left to right work fine; they will see either the
old page, or both the new left and right sibling. Searchers moving right
to left will likewise work: they will see either the old page, or the new
right page and then the new left page; or, between steps 3 and 4, they
will move to the left page, see that its right-link doesn't point to the
page they came from, and move right to the new right sibling.
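
The right-to-left case between steps 3 and 4 can be illustrated with a small toy model in Python. This is only a sketch under simplifying assumptions: pages are plain dictionaries, `step_left` is a hypothetical helper, and page deletion and locking are ignored entirely.

```python
def step_left(pages, cur):
    """Return the true left sibling of `cur`, tolerating a stale left-link.

    Follow the left-link, then move right until reaching a page whose
    right-link points back at the page we came from.
    """
    cand = pages[cur]["left"]
    while cand is not None and pages[cand]["right"] != cur:
        cand = pages[cand]["right"]
    return cand


# State between steps 3 and 4: page 1 has been split into 1 and 3,
# but page 2's left-link still points at page 1.
pages = {
    1: {"left": None, "right": 3},
    3: {"left": 1, "right": 2},
    2: {"left": 1, "right": None},   # stale left-link
}
before_step4 = step_left(pages, 2)

# After step 4 the left-link is correct and no move-right is needed.
pages[2]["left"] = 3
after_step4 = step_left(pages, 2)
```

In both states the scan leaving page 2 arrives at page 3, the new right sibling, so no page of the split is skipped.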

All that works just like during normal operation, so I don't actually 
understand why L&Y requires that you keep the split pages locked until 
you've locked the parent. Maybe it's needed to handle concurrent inserts 
or splits, but there can't be any during WAL replay.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

