On 2013-05-28 21:36:17 -0400, Robert Haas wrote:
> On Tue, May 28, 2013 at 1:58 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Guessing around I looked and noticed the following problematic pattern:
> > 1) A: wants to do an update, doesn't have enough freespace
> > 2) A: extends the relation on the filesystem level (RelationGetBufferForTuple)
> > 3) A: does PageInit (RelationGetBufferForTuple)
> > 4) A: aborts, e.g. due to a serialization failure (heap_update)
> >
> > At this point the page is initialized in memory, but not wal logged. It
> > isn't pinned or locked either.
> >
> > 5) B: vacuum finds that page and it's empty. So it marks it all
> > visible. But since the page wasn't written out (we haven't even marked
> > it dirty in 3.) the standby doesn't know that and reports the page as
> > being uninitialized.
> >
> > ISTM the best backbranchable fix for this is to teach lazy_scan_heap to
> > log an FPI for the heap page via visibilitymap_set in that rather
> > limited case.
> >
> > Happy to provide a patch unless somebody has a better idea?
>
> Good catch. However, I would suggest using log_newpage() before
> visibilitymap_set() rather than trying to stick extra logic into
> visibilitymap_set(). I think that will be cleaner and simpler.
Thought about that, but given that 9.3's visibilitymap_set already will
already FPI heap pages I concluded it wouldn't really be an improvement
since it's only one ||log_heap_page or so there. Not sure what's
better. Will write the patch and see how it goes.
Greetings,
Andres Freund
-- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services