Re: PageIsAllVisible()'s trustworthiness in Hot Standby - Mailing list pgsql-hackers
From | Pavan Deolasee |
---|---|
Subject | Re: PageIsAllVisible()'s trustworthiness in Hot Standby |
Date | |
Msg-id | CABOikdNbwdJcgd70VJC6oKwQ0T-foRPe0+R0fmd2P5mOHMHtig@mail.gmail.com Whole thread Raw |
In response to | Re: PageIsAllVisible()'s trustworthiness in Hot Standby (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: PageIsAllVisible()'s trustworthiness in Hot Standby
|
List | pgsql-hackers |
On Tue, Dec 4, 2012 at 8:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Dec 4, 2012 at 9:00 AM, Andres Freund <andres@2ndquadrant.com> wrote:Yeah, there certainly isn't that. Now you could perhaps make an
> Youre right, it currently seems to be possible, there's no LSN interlock
> prohibiting this as far as I can see.
argument that no operation that can propagate a set bit from master to
standby can arrive until after the standby's xmin horizon is
sufficiently current, but that does feel a little fragile to me...
even if it's true now, new WAL record types might break it, for
example.
Another piece of code that caught my attention is this (sorry for digging up topics that probably have been discussed to death before and I might not have paid attention to then):
Whenever we clear PD_ALL_VISIBLE flag, we do that while holding the heap page lock and also WAL log that action (in fact, piggy-back with insert/update/delete). The only exception that I saw is in lazy_scan_heap()
916 else if (PageIsAllVisible(page) && has_dead_tuples)
917 {
918 elog(WARNING, "page containing dead tuples is marked as all-visible in relation \"%s\" page %u",
919 relname, blkno);
920 PageClearAllVisible(page);
921 MarkBufferDirty(buf);
922 visibilitymap_clear(onerel, blkno, vmbuffer);
923 }
Obviously, this is not a case that we expect to occur. But if it does occur for whatever reason, then even though we will correct the page flag on master, this would never be populated to the standby. So the standby may continue to have that tiny bit set on the page, at least until another insert/update/delete happens on the page. Not sure if there is anything to worry about, but looks a bit strange. I looked at the archive but can't see any report of the warning, so may be we are safe after all.
Another side issue: The way we set visibility map bits in the VACUUM, we do that in the first phase of the vacuum. Now the very fact that we have selected the page for vacuuming, it will have one or more dead tuples. As soon as we find a dead tuple in the page, we decide not to mark the page all-visible and rightly so. But this page would not get marked all-visible even if we remove all dead tuples in the second phase. Why don't we push that work to the second phase or at least retry that again ? Of course, we can't just do it that way without scanning the page again because new tuples may get added or updated/deleted in the meanwhile. But that can be addressed with some tricks - like tracking LSN of every such (which has only dead and live tuples, but not recently dead or insert-in-progress or delete-in-progress) and then comparing that LSN with the page LSN during the second phase. If an update had happened in between, we will know that and we won't deal with the visibility bits.
Another idea would be to track this intermediate state in the page header itself, something like PD_ALL_VISIBLE_OR_DEAD. If an intermediate update/delete/insert comes in, it will clear the bit. If we see the bit set during the second phase of the vacuum, we will know that it contains only dead and live tuples and the dead ones are being removed now. So we can mark the page PD_ALL_VISIBLE.
Thanks,
Pavan
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
pgsql-hackers by date: