Thread: Bug in visibility hint bit
There seems to be a bug in the visibility map in 8.4.0, introduced to cvs on 2008-12-03. It results in tuples being called visible that shouldn't be. In heap_update function from heapam.c: /* * Note: we mustn't clear PD_ALL_VISIBLE flags before writing the WAL * record, because log_heap_updatelooks at those flags to set the * corresponding flags in the WAL record. */ So the full_page_write of the block sent to WAL has the wrong PD_ALL_VISIBLE. It needs to be fixed during WAL replay after a crash.But it is not. In heap_xlog_update: if (record->xl_info & XLR_BKP_BLOCK_1) { if (samepage) return; /* backup block covered both changes */ goto newt; } The goto newt causes it to skip the code that would have called PageClearAllVisible. I don't feel particularly competent to propose a patch for this. It seems to me that log_heap_update should be sent the correct block in the first place, and some other method should be used to communicate between heap_update and log_heap_update if communication is necessary. But really, I don't think such communication should be necessary, and the xlrec.all_visible_cleared and xlrec.new_all_visible_cleared fields are unneeded. Just assume they are true. It seems like the worst thing that can happen is that we call PageClearAllVisible when it is already cleared, which is hardly harmful (the blocks that have redo applied to them are already dirty, so a spurious clear doesn't cause unneeded IO) Jeff
On Mon, Aug 24, 2009 at 6:23 PM, Jeff Janes<<a href="mailto:jeff.janes@gmail.com">jeff.janes@gmail.com</a>> wrote:<br/>> There seems to be a bug in the visibility map in 8.4.0, introduced to<br />> cvs on 2008-12-03. It resultsin tuples being called visible that<br /> > shouldn't be.<br /><br />Well, never mind. It took me a few days totrack down the bug and in the mean time I didn't want to rsync the CVS repository and lose my own local against it. Soonce I'm done I rsync and see that Tom already patched it yesterday.<br /><br />Cheers,<br /><br />Jeff<br />
Jeff Janes escribió: > On Mon, Aug 24, 2009 at 6:23 PM, Jeff Janes<jeff.janes@gmail.com> wrote: > > There seems to be a bug in the visibility map in 8.4.0, introduced to > > cvs on 2008-12-03. It results in tuples being called visible that > > shouldn't be. > > Well, never mind. It took me a few days to track down the bug and in the > mean time I didn't want to rsync the CVS repository and lose my own local > against it. So once I'm done I rsync and see that Tom already patched it > yesterday. Congratulations on finding it independently. We welcome your eyes on our code ;-) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Jeff Janes <jeff.janes@gmail.com> writes: > ... But really, I don't think such > communication should be necessary, and the xlrec.all_visible_cleared > and xlrec.new_all_visible_cleared fields are unneeded. Just assume > they are true. It seems like the worst thing that can happen is that > we call PageClearAllVisible when it is already cleared, which is > hardly harmful (the blocks that have redo applied to them are already > dirty, so a spurious clear doesn't cause unneeded IO) Just to respond to that --- I spent awhile yesterday thinking the same thing. But the value of those flags is to tell the WAL replay functions whether they need to go and clear the corresponding bits in the visibility map. Making them do that unconditionally for every insert/update/delete would surely be a pretty big hit to the speed of WAL replay, which already leaves a lot to be desired :-( regards, tom lane