Re: show Heap Fetches in EXPLAIN for index-only scans - Mailing list pgsql-hackers

From Noah Misch
Subject Re: show Heap Fetches in EXPLAIN for index-only scans
Date
Msg-id 20120202235128.GC21463@tornado.leadboat.com
Whole thread Raw
In response to Re: show Heap Fetches in EXPLAIN for index-only scans  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Jan 25, 2012 at 08:42:05PM -0500, Robert Haas wrote:
> Only the first pass of vacuum knows how to mark pages all-visible.
> After the update, the first pass of the first vacuum sees a dead tuple
> on the old page and truncates it to a dead line pointer.  When it
> comes to the new page, it observes that the page is now all-visible
> and marks it so.  It then does index vacuuming and returns to the
> first page, marking the dead line pointer unused.  But during this
> second visit to the old page, there's no possibility of marking the
> page as all-visible, because the code doesn't know how to do that.
> The next vacuum's first pass, however, can do so, because there are no
> longer any dead tuples on the page.
> 
> We could fix this by modifying lazy_vacuum_page(): since we have to
> dirty the buffer anyway, we could recheck whether all the remaining
> tuples on the page are now all-visible, and if so set the visibility
> map bit.  This is probably desirable even apart from index-only scans,
> because it will frequently save the next vacuum the cost of reading,
> dirtying, and writing extra pages.  There will be some incremental CPU
> cost, but that seems likely to be more than repaid by the I/O savings.
> 
> Thoughts?  Should we do this at all?  If so, should we squeeze it into
> 9.2 or leave it for 9.3?

Sounds like a good idea.  It has bothered me that two consecutive VACUUMs of a
table, with no intervening activity, can dirty a page twice.  Making that less
frequent is a good thing.  I'd hold the change for 9.3, but that's probably an
unusually-conservative viewpoint.


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: JSON output functions.
Next
From: Joachim Wieland
Date:
Subject: Re: patch for parallel pg_dump