Simon Riggs <simon@2ndQuadrant.com> writes:
> On 11 November 2012 23:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Practically all WAL record types that touch multiple pages have some
>> bug of this type. In addition to btree_xlog_split, I found that
>> heap_xlog_update, ginRedoDeletePage, spgRedoAddLeaf, spgRedoMoveLeafs,
>> spgRedoAddNode, spgRedoSplitTuple, and spgRedoPickSplit fail to hold
>> locks as required to make their updates safe for concurrent queries.
>> (I'm not totally sure about ginRedoDeletePage, but the original action
>> definitely locks the pages simultaneously, and it's not clear that it's
>> safe not to.) Most of these are okay in cases without any full-page
>> images, but could fail if the wrong subset of the pages-to-be-touched
>> were processed by RestoreBkpBlocks. Some had bugs even without that :-(
> Hmm, not good. Thanks for spotting.
> Do these changes do anything to actions that occur across multiple
> records? I assume not and think those are OK, agreed?
Right, we were and still are assuming that any state that exists between
WAL records is consistent and safe to expose to hot-standby queries.
The important thing here is that these WAL replay functions were failing
to ensure that their own actions appear atomic to onlookers. This is
basically a hangover from pre-hot-standby coding conventions, when no such
concern existed.
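To make the atomicity point concrete, here is a rough single-threaded sketch,
not taken from the PostgreSQL source. ToyPage, observe, replay_move_unsafe and
replay_move_atomic are all hypothetical names invented for illustration, and
the boolean "locked" flag only stands in for a real exclusive buffer lock. It
models a replay action that moves one item between two pages and reports what
a standby query would see if it ran in the middle of the action.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer page (hypothetical, not PostgreSQL's Buffer). */
typedef struct
{
    bool locked;    /* true while the replay process holds an exclusive lock */
    int  nitems;    /* number of items currently on the page */
} ToyPage;

/*
 * What a concurrent standby query would see if it ran right now.
 * Returns -1 if it would have to wait for a lock, otherwise the total
 * number of items visible across both pages.
 */
static int
observe(const ToyPage *a, const ToyPage *b)
{
    if (a->locked || b->locked)
        return -1;
    return a->nitems + b->nitems;
}

/*
 * Unsafe pattern: each page is locked, changed, and released on its own.
 * The return value is what an onlooker sees between the two steps.
 */
static int
replay_move_unsafe(ToyPage *src, ToyPage *dst)
{
    int seen;

    src->locked = true;
    src->nitems--;              /* item removed from the source page */
    src->locked = false;

    seen = observe(src, dst);   /* window: the item is on neither page */

    dst->locked = true;
    dst->nitems++;              /* item added to the destination page */
    dst->locked = false;

    return seen;
}

/*
 * Safe pattern: both locks are taken before any change and released only
 * after both pages are updated, so no intermediate state is exposed.
 */
static int
replay_move_atomic(ToyPage *src, ToyPage *dst)
{
    int seen;

    src->locked = true;
    dst->locked = true;

    src->nitems--;
    seen = observe(src, dst);   /* onlooker would block, sees nothing */
    dst->nitems++;

    src->locked = false;
    dst->locked = false;

    return seen;
}
```

Starting from pages holding 3 and 1 items, the unsafe variant lets the
onlooker count 3 items instead of 4 mid-replay, while the atomic variant
makes it wait (-1); both end with 4 items visible once replay of the record
is finished.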
regards, tom lane