Re: [PERFORM] encouraging index-only scans - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [PERFORM] encouraging index-only scans
Date
Msg-id 20140211171213.GE2289@momjian.us
Whole thread Raw
In response to Re: [PERFORM] encouraging index-only scans  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [PERFORM] encouraging index-only scans  (Andres Freund <andres@2ndquadrant.com>)
Re: [PERFORM] encouraging index-only scans  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Tue, Feb 11, 2014 at 11:28:36AM -0500, Robert Haas wrote:
> A sequential scan will set hint bits and will prune the page, but
> pruning the page doesn't ever mark it all-visible; that logic is
> entirely in vacuum.  If that could be made cheap enough to be
> negligible, it might well be worth doing in heap_page_prune().  I
> think there might be a way to do that, but it's a bit tricky because
> the pruning logic iterates over the page in a somewhat complex way,
> not just a straightforward scan of all the item pointers the way the
> existing logic doesn't.  It would be pretty cool if we could just use
> a bit out of the heap-prune xlog record to indicate whether the
> all-visible bit should be set; then we'd gain the benefit of marking
> things all-visible much more often without needing vacuum.
> 
> That doesn't help insert-only tables much, though, because those won't
> require pruning.  We set hint bits (which dirties the page) but
> currently don't write WAL.  We'd have to change that to set the
> all-visible bit when scanning such a table, and that would be
> expensive.  :-(

Yes, that pretty much sums it up.  We introduced index-only scans in 9.2
(2012) but they still seem to be not usable for insert-only workloads
two years later.  Based on current progress, it doesn't look like this
will be corrected until 9.5 (2015).  I am kind of confused why this has
not generated more urgency.

I guess my question is what approach do we want to take to fixing this? 
If we are doing pruning, aren't we emitting WAL?  You are right that for
an insert-only workload, we aren't going to prune, but if pruning WAL
overhead is acceptable for a sequential scan, isn't index-only
page-all-visible WAL overhead acceptable?

Do we want to track the number of inserts in statistics and trigger an
auto-vacuum after a specified number of inserts?  The problem there is
that we really don't need to do any index cleanup, which is what vacuum
typically does --- we just want to scan the table and set the
all-visible bits, so that approach seems non-optimal.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Next
From: Tom Lane
Date:
Subject: Re: narwhal and PGDLLIMPORT