Inadequate thought about buffer locking during hot standby replay - Mailing list pgsql-hackers

From	Tom Lane
Subject	Inadequate thought about buffer locking during hot standby replay
Date	November 10, 2012 02:24:31
Msg-id	21405.1352503465@sss.pgh.pa.us
Responses	Re: Inadequate thought about buffer locking during hot standby replay (4 replies)
List	pgsql-hackers

During normal running, operations such as btree page splits are
extremely careful about the order in which they acquire and release
buffer locks whenever a single operation modifies multiple pages.

During WAL replay, that all goes out the window.  Even if an individual
WAL-record replay function does things in the right order for "standard"
cases, RestoreBkpBlocks has no idea what it's doing.  So if one or more
of the referenced pages gets treated as a full-page image, we are left
with no guarantee whatsoever about what order the pages are restored in.
That never mattered when the code was originally designed, but it sure
matters during Hot Standby when other queries might be able to see the
intermediate states.

I can't prove that this is the cause of bug #7648, but it's fairly easy
to see that it could explain the symptom.  You only need to assume that
the page-being-split had been handled as a full-page image, and that the
new right-hand page had gotten allocated by extending the relation.
Then there will be an interval just after RestoreBkpBlocks does its
thing where the updated left-hand sibling is in the index and is not
locked in any way, but its right-link points off the end of the index.
If a few indexscans come along before the replay process gets to
continue, you'd get exactly the reported errors.
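
The interval described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: pages are plain dicts, the index is a list, and every name here (restore_backup_blocks, follow_right_link, replay_split_naive) is made up for illustration. It assumes the left-hand page arrives as a full-page image and the right-hand page is created later by extending the relation.

```python
# Toy model only: pages are dicts and the index is a list of pages.
# None of these names come from the PostgreSQL source tree.

def restore_backup_blocks(index, images):
    """Apply all full-page images up front; each page is overwritten
    and then left unlocked, in the spirit of a bulk restore."""
    for blkno, image in images.items():
        index[blkno] = dict(image)

def follow_right_link(index, blkno):
    """What a concurrent index scan does when it moves right."""
    right = index[blkno]["right"]
    if right is not None and right >= len(index):
        raise LookupError(
            f"right-link {right} is past the end of the index ({len(index)} blocks)"
        )
    return right

def replay_split_naive(index, left_blkno, left_image, right_image, scan):
    # Step 1: the left page's full-page image is restored first.
    restore_backup_blocks(index, {left_blkno: left_image})
    # Window: the left page is visible and unlocked, but the page its
    # right-link names does not exist yet. A scan arrives here.
    observed = scan(index)
    # Step 2: per-record replay extends the relation with the new right page.
    index.append(dict(right_image))
    return observed

def scan(index):
    """A hypothetical index scan that tries to step right from block 0."""
    try:
        return follow_right_link(index, 0)
    except LookupError as exc:
        return f"error: {exc}"

# One page holding four keys, about to be split into blocks 0 and 1.
index = [{"keys": [1, 2, 3, 4], "right": None}]
observed = replay_split_naive(
    index,
    left_blkno=0,
    left_image={"keys": [1, 2], "right": 1},
    right_image={"keys": [3, 4], "right": None},
    scan=scan,
)
after = follow_right_link(index, 0)
```

In this model the scan that runs inside the window gets an error string in observed, while the same step taken once replay has finished (after) succeeds and lands on block 1.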

I'm inclined to think that we need to fix this by getting rid of
RestoreBkpBlocks per se, and instead having the per-WAL-record restore
routines dictate when each full-page image is restored (and whether or
not to release the buffer lock immediately).  That's not going to be a
small change unfortunately :-(
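
A rough sketch of that idea, again as a toy model rather than PostgreSQL code: the class ToyIndex, the method restore_backup_block, its keep_lock parameter and the lock set are all hypothetical stand-ins. The ordering shown (right page first, both locks held until both pages are in place) is one possible choice for a split, assumed here for illustration.

```python
# Toy model only; ToyIndex, restore_backup_block and the lock set are
# hypothetical stand-ins, not PostgreSQL functions.

class ToyIndex:
    def __init__(self, pages):
        self.pages = list(pages)
        self.locked = set()  # block numbers exclusively locked by replay

    def restore_backup_block(self, blkno, image, keep_lock):
        """Restore one full-page image; the caller decides whether the
        lock is released immediately or kept for later."""
        while blkno >= len(self.pages):
            self.pages.append(None)  # extend the relation as needed
        self.locked.add(blkno)
        self.pages[blkno] = dict(image)
        if not keep_lock:
            self.locked.discard(blkno)

    def unlock(self, blkno):
        self.locked.discard(blkno)

    def scan_step(self, blkno):
        """One step of a concurrent scan: wait on a locked page,
        otherwise try to move right."""
        if blkno in self.locked:
            return "blocked"
        right = self.pages[blkno]["right"]
        if right is not None and (
            right >= len(self.pages) or self.pages[right] is None
        ):
            return "error"
        return right

def replay_split_ordered(index, left_blkno, left_image,
                         right_blkno, right_image, probe):
    """Per-record replay routine: it picks the restore order and keeps
    both locks until the pair of pages is consistent."""
    index.restore_backup_block(right_blkno, right_image, keep_lock=True)
    index.restore_backup_block(left_blkno, left_image, keep_lock=True)
    seen = probe(index)  # a scan arriving in the middle of replay
    index.unlock(right_blkno)
    index.unlock(left_blkno)
    return seen

idx = ToyIndex([{"keys": [1, 2, 3, 4], "right": None}])
seen = replay_split_ordered(
    idx,
    left_blkno=0, left_image={"keys": [1, 2], "right": 1},
    right_blkno=1, right_image={"keys": [3, 4], "right": None},
    probe=lambda i: i.scan_step(0),
)
final = idx.scan_step(0)
```

Here the scan that arrives mid-replay waits on the lock (seen is "blocked") instead of following a dangling right-link, and once both locks are dropped the same step reaches block 1.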
        regards, tom lane
