Re: [HACKERS] Display number of heap accesses for index scans - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] Display number of heap accesses for index scans
Date
Msg-id 20171105210412.GA5006@marmot
Whole thread Raw
In response to [HACKERS] Display number of heap accesses for index scans  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] Display number of heap accesses for index scans  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> wrote:
>The number of index lookups that failed to return anything can be a
>critical performance factor in OLTP workloads.  Therefore it seems like
>it'd be a good idea to extend the explain analyze output to include that
>information.

I certainly agree.

I've sometimes wondered if we could do better here by exploiting unique
indexes' uniqueness property. The general idea is that we could avoid
visiting duplicates in the heap when we've determined that they cannot
be visible to our MVCC snapshot - we already encountered the visible
version for that value, so this must be true. This is somewhat like what
we do with HOT chains, except it occurs across index tuples with the
same value rather than across heap tuples within a HOT chain.

The big difficulty is LP_DEAD bit setting within B-Tree pages; if no one
is visiting the heap, no one can opportunistically mark the bit to
indicate that nobody needs to visit the heap tuple pointed to by the
index tuple (because the heap tuple is dead to everyone). Now, there
isn't an immediate problem with that, because the main point of both
optimizations is to avoid heap visits. But there is a problem when there
needs to be a page split and there is a lack of LP_DEAD bits set from
which to reclaim space to defer the page split (in nbtree, we can
reclaim the space needed for our an index tuple that needs to be
inserted without actually spliting, in the end). We cannot undermine the
secondary goal of the LP_DEAD/kill_prior_tuple optimization, which is
reclaiming space early within B-Tree pages.

I can imagine a way of addressing this problem, though it is very
invasive - versioning distinct index tuple versions in the index with
ordinal version numbers. I wonder if anyone else has thought about doing
something like this, and has a better idea, though. I think that we
could be cleverer about unique indexes in a number of different ways
[1]. Reducing heap accesses is a big one.

[1]
https://wiki.postgresql.org/wiki/Key_normalization#Making_all_items_in_the_index_unique_by_treating_heap_TID_as_an_implicit_last_attribute
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] WIP: long transactions on hot standby feedback replica/ proof of concept
Next
From: "Jim Van Fleet"
Date:
Subject: Re: [HACKERS] [POC] Faster processing at Gather node