Thread: Bug in visibility hint bit

Bug in visibility hint bit

From
Jeff Janes
Date:
There seems to be a bug in the visibility map in 8.4.0, introduced to
cvs on 2008-12-03.  It results in tuples being called visible that
shouldn't be.


In heap_update function from heapam.c:
       /*        * Note: we mustn't clear PD_ALL_VISIBLE flags before writing the WAL        * record, because
log_heap_updatelooks at those flags to set the        * corresponding flags in the WAL record.        */
 

So the full_page_write of the block sent to WAL has the wrong
PD_ALL_VISIBLE.  It needs to be fixed during WAL replay after a crash.But it is not.

In heap_xlog_update:
       if (record->xl_info & XLR_BKP_BLOCK_1)       {               if (samepage)                       return;
               /* backup
 
block covered both changes */               goto newt;       }

The goto newt causes it to skip the code that would have called
PageClearAllVisible.

I don't feel particularly competent to propose a patch for this.  It
seems to me that
log_heap_update should be sent the correct block in the first place,
and some other
method should be used to communicate between heap_update and log_heap_update
if communication is necessary.  But really, I don't think such
communication should be necessary, and the xlrec.all_visible_cleared
and xlrec.new_all_visible_cleared fields are unneeded.  Just assume
they are true.  It seems like the worst thing that can happen is that
we call PageClearAllVisible when it is already cleared, which is
hardly harmful (the blocks that have redo applied to them are already
dirty, so a spurious clear doesn't cause unneeded IO)


Jeff


Re: Bug in visibility hint bit

From
Jeff Janes
Date:
On Mon, Aug 24, 2009 at 6:23 PM, Jeff Janes<<a href="mailto:jeff.janes@gmail.com">jeff.janes@gmail.com</a>>
wrote:<br/>> There seems to be a bug in the visibility map in 8.4.0, introduced to<br />> cvs on 2008-12-03.  It
resultsin tuples being called visible that<br /> > shouldn't be.<br /><br />Well, never mind.  It took me a few days
totrack down the bug and in the mean time I didn't want to rsync the CVS repository and lose my own local against it. 
Soonce I'm done I rsync and see that Tom already patched it yesterday.<br /><br />Cheers,<br /><br />Jeff<br /> 

Re: Bug in visibility hint bit

From
Alvaro Herrera
Date:
Jeff Janes escribió:
> On Mon, Aug 24, 2009 at 6:23 PM, Jeff Janes<jeff.janes@gmail.com> wrote:
> > There seems to be a bug in the visibility map in 8.4.0, introduced to
> > cvs on 2008-12-03.  It results in tuples being called visible that
> > shouldn't be.
> 
> Well, never mind.  It took me a few days to track down the bug and in the
> mean time I didn't want to rsync the CVS repository and lose my own local
> against it.  So once I'm done I rsync and see that Tom already patched it
> yesterday.

Congratulations on finding it independently.  We welcome your eyes on
our code ;-)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Bug in visibility hint bit

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> ... But really, I don't think such
> communication should be necessary, and the xlrec.all_visible_cleared
> and xlrec.new_all_visible_cleared fields are unneeded.  Just assume
> they are true.  It seems like the worst thing that can happen is that
> we call PageClearAllVisible when it is already cleared, which is
> hardly harmful (the blocks that have redo applied to them are already
> dirty, so a spurious clear doesn't cause unneeded IO)

Just to respond to that --- I spent awhile yesterday thinking the same
thing.  But the value of those flags is to tell the WAL replay functions
whether they need to go and clear the corresponding bits in the
visibility map.  Making them do that unconditionally for every
insert/update/delete would surely be a pretty big hit to the speed of
WAL replay, which already leaves a lot to be desired :-(
        regards, tom lane