Re: the big picture for index-only scans - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: the big picture for index-only scans
Date
Msg-id BANLkTinWNELtBZy3DU8TGHogmP-ozPpw4Q@mail.gmail.com
Whole thread Raw
In response to the big picture for index-only scans  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: the big picture for index-only scans  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, May 9, 2011 at 10:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> So, what do we need in order to find our way to index-only scans?
>
> 1. The visibility map needs to be crash-safe.  The basic idea of
> index-only scans is that, instead of checking the heap to find out
> whether each tuple is visible, we first check the visibility map.  If
> the visibility map bit is set, then we know all tuples on the page are
> visible to all transactions, and therefore the tuple of interest is
> visible to our transaction.  Assuming that a significant number of
> visibility map bits are set, this should enable us to avoid a fair
> amount of I/O, especially on large tables, because the visibility map
> is roughly 8000 times smaller than the heap, and therefore far more
> practical to keep in cache.

hm, what are the implications for tuple hint bits, short and long
term?  I'm particularly interested if you think any hint bit i/o
mitigation strategies are worth pursuing.

> 2. Crash safe visibility map vs. pg_upgrade.  Even if we make the
> visibility map crash-safe in 9.2, people are going to want to use
> pg_upgrade to migrate from older versions, bringing their
> possibly-not-quite-correct visibility map forks along with them.  How
> should we handle that?  We could (2A) arrange to have pg_upgrade nuke
> all visibility forks when upgrading from a release where the
> visibility map is not crash-safe to one where it is;

+1 on 2A.

> 3. Statistics.  I believe that in order to accurately estimate the
> cost of an index-only scan, we're going to need to know the fraction
> of tuples that are on pages whose visibility map bits are set.

It would be helpful to know the performance benefit of index only
scans before knowing how much benefit to attribute here.  Maybe a
system wide kludge would for starters anyway, like assuming 60% of
pages can be vis checked from the VM, or a single GUC, Then again,
maybe not.

merlin


pgsql-hackers by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: crash-safe visibility map, take five
Next
From: Noah Misch
Date:
Subject: Re: XML with invalid chars