Re: Inadequate thought about buffer locking during hot standby replay - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Inadequate thought about buffer locking during hot standby replay
Date
Msg-id 3269.1352750361@sss.pgh.pa.us
In response to Re: Inadequate thought about buffer locking during hot standby replay  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 11 November 2012 23:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Practically all WAL record types that touch multiple pages have some
>> bug of this type.  In addition to btree_xlog_split, I found that
>> heap_xlog_update, ginRedoDeletePage, spgRedoAddLeaf, spgRedoMoveLeafs,
>> spgRedoAddNode, spgRedoSplitTuple, and spgRedoPickSplit fail to hold
>> locks as required to make their updates safe for concurrent queries.
>> (I'm not totally sure about ginRedoDeletePage, but the original action
>> definitely locks the pages simultaneously, and it's not clear that it's
>> safe not to.)  Most of these are okay in cases without any full-page
>> images, but could fail if the wrong subset of the pages-to-be-touched
>> were processed by RestoreBkpBlocks.  Some had bugs even without that :-(

> Hmm, not good. Thanks for spotting.

> Do these changes do anything to actions that occur across multiple
> records? I assume not and think those are OK, agreed?

Right, we were and still are assuming that any state that exists between
WAL records is consistent and safe to expose to hot-standby queries.
The important thing here is that these WAL replay functions were failing
to ensure that their own actions appear atomic to onlookers.  This is
basically a hangover from pre-hot-standby coding conventions, when no
such concern existed.
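
To illustrate the atomicity point, here is a minimal single-threaded sketch
in C. It is a toy model, not PostgreSQL code: ToyPage,
onlooker_sees_torn_state, replay_one_page_at_a_time and
replay_holding_both_locks are all hypothetical names invented for this
example, and the "lock" is just a flag. The onlooker check is called at
each point where a concurrent hot-standby query could run, and a locked
page is assumed to be invisible to it because the query would block.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer (hypothetical, not a PostgreSQL type). */
typedef struct
{
    bool locked;    /* true while replay holds an exclusive lock */
    int  version;   /* bumped when the WAL record's change is applied */
} ToyPage;

/*
 * What a concurrent query could observe at this instant.  A locked page
 * cannot be read (the query would block), so only two unlocked pages
 * with mismatched versions count as an inconsistent view.
 */
static bool
onlooker_sees_torn_state(const ToyPage *a, const ToyPage *b)
{
    return !a->locked && !b->locked && a->version != b->version;
}

/* Pre-hot-standby style: lock, modify and release each page in turn. */
static bool
replay_one_page_at_a_time(ToyPage *a, ToyPage *b)
{
    bool torn = false;

    assert(!a->locked && !b->locked);

    a->locked = true;
    a->version++;
    a->locked = false;
    torn |= onlooker_sees_torn_state(a, b);   /* window between pages */

    b->locked = true;
    b->version++;
    b->locked = false;
    torn |= onlooker_sees_torn_state(a, b);

    return torn;
}

/* Hot-standby-safe style: hold both locks until both pages are updated. */
static bool
replay_holding_both_locks(ToyPage *a, ToyPage *b)
{
    bool torn = false;

    assert(!a->locked && !b->locked);

    a->locked = true;
    a->version++;
    torn |= onlooker_sees_torn_state(a, b);   /* a is locked: nothing visible */

    b->locked = true;
    b->version++;
    torn |= onlooker_sees_torn_state(a, b);

    a->locked = false;
    torn |= onlooker_sees_torn_state(a, b);   /* b is still locked here */

    b->locked = false;
    torn |= onlooker_sees_torn_state(a, b);   /* both pages already updated */

    return torn;
}
```

In this model the first function reports a torn state, because the
onlooker can run after the first page is released and before the second
is touched. The second function never does. Both leave the two pages at
the same version once the record has been fully replayed, which matches
the assumption that state between WAL records is consistent.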
        regards, tom lane


