Thread: [HACKERS] Display number of heap accesses for index scans

[HACKERS] Display number of heap accesses for index scans

From
Andres Freund
Date:
Hi,

right now it's hard to figure out whether a plain indexscan returns
matches that then get eliminated due to visibility checks in the
heap. For both index only scans (via "Heap Fetches") and bitmapscans
(via row mismatches between bitmap heap and index scans) one can gather
that to some degree from explain analyze.

The number of index lookups that failed to return anything can be a
critical performance factor in OLTP workloads.  Therefore it seems like
it'd be a good idea to extend the explain analyze output to include that
information.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Display number of heap accesses for index scans

From
Peter Geoghegan
Date:
Andres Freund <andres@anarazel.de> wrote:
>The number of index lookups that failed to return anything can be a
>critical performance factor in OLTP workloads.  Therefore it seems like
>it'd be a good idea to extend the explain analyze output to include that
>information.

I certainly agree.

I've sometimes wondered if we could do better here by exploiting unique
indexes' uniqueness property. The general idea is that we could avoid
visiting duplicates in the heap when we've determined that they cannot
be visible to our MVCC snapshot - we already encountered the visible
version for that value, so this must be true. This is somewhat like what
we do with HOT chains, except it occurs across index tuples with the
same value rather than across heap tuples within a HOT chain.

The big difficulty is LP_DEAD bit setting within B-Tree pages; if no one
is visiting the heap, no one can opportunistically mark the bit to
indicate that nobody needs to visit the heap tuple pointed to by the
index tuple (because the heap tuple is dead to everyone). Now, there
isn't an immediate problem with that, because the main point of both
optimizations is to avoid heap visits. But there is a problem when there
needs to be a page split and there is a lack of LP_DEAD bits set from
which to reclaim space to defer the page split (in nbtree, we can
reclaim the space needed for our an index tuple that needs to be
inserted without actually spliting, in the end). We cannot undermine the
secondary goal of the LP_DEAD/kill_prior_tuple optimization, which is
reclaiming space early within B-Tree pages.

I can imagine a way of addressing this problem, though it is very
invasive - versioning distinct index tuple versions in the index with
ordinal version numbers. I wonder if anyone else has thought about doing
something like this, and has a better idea, though. I think that we
could be cleverer about unique indexes in a number of different ways
[1]. Reducing heap accesses is a big one.

[1]
https://wiki.postgresql.org/wiki/Key_normalization#Making_all_items_in_the_index_unique_by_treating_heap_TID_as_an_implicit_last_attribute
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Display number of heap accesses for index scans

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> Andres Freund <andres@anarazel.de> wrote:
>> The number of index lookups that failed to return anything can be a
>> critical performance factor in OLTP workloads.  Therefore it seems like
>> it'd be a good idea to extend the explain analyze output to include that
>> information.

> I certainly agree.

Doesn't the EXPLAIN (BUFFERS) output already address this?
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Display number of heap accesses for index scans

From
Alexander Korotkov
Date:
On Mon, Nov 6, 2017 at 6:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Geoghegan <pg@bowt.ie> writes:
> Andres Freund <andres@anarazel.de> wrote:
>> The number of index lookups that failed to return anything can be a
>> critical performance factor in OLTP workloads.  Therefore it seems like
>> it'd be a good idea to extend the explain analyze output to include that
>> information.

> I certainly agree.

Doesn't the EXPLAIN (BUFFERS) output already address this?

In plain index scan EXPLAIN (ANALYZE, BUFFERS) doesn't distinguish buffers accessed in index and buffers accessed in heap

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company