Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2) - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
Date
Msg-id CAFj8pRD-WPgDy0o6CYwbAqQLsU=Z2OKOK5u48LrfCVx2rH5D+A@mail.gmail.com
Whole thread Raw
In response to btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
List pgsql-hackers
Hi

út 31. 10. 2023 v 22:12 odesílatel Matthias van de Meent <boekewurm+postgres@gmail.com> napsal:
Hi,

Currently, nbtree code compares each and every column of an index
tuple during the binary search on the index page. With large indexes
that have many duplicate prefix column values (e.g. an index on (bool,
bool, uuid) ) that means a lot of wasted time getting to the right
column.

The attached patch improves on that by doing per-page dynamic prefix
truncation: If we know that on both the right and left side there are
index tuples where the first two attributes are equal to the scan key,
we skip comparing those attributes at the current index tuple and
start with comparing attribute 3, saving two attribute compares. We
gain performance whenever comparing prefixing attributes is expensive
and when there are many tuples with a shared prefix - in unique
indexes this doesn't gain much, but we also don't lose much in this
case.

This patch was originally suggested at [0], but it was mentioned that
they could be pulled out into it's own thread. Earlier, the
performance gains were not clearly there for just this patch, but
after further benchmarking this patch stands on its own for
performance: it sees no obvious degradation of performance, while
gaining 0-5% across various normal indexes on the cc-complete sample
dataset, with the current worst-case index shape getting a 60%+
improved performance on INSERTs in the tests at [0].

+1

This can be nice functionality. I had a customer with a very slow index scan - the main problem was a long common prefix like prg010203xxxx.

Regards

Pavel


Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

PS. Best served with the downlink right separator/HIKEY optimization
(separate patch to be submitted soon), and specialization over at [0].

[0] https://www.postgresql.org/message-id/CAEze2WiqOONRQTUT1p_ZV19nyMA69UU2s0e2dp+jSBM=j8snuw@mail.gmail.com

pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Commitfest manager November 2023
Next
From: Jeevan Chalke
Date:
Subject: Re: More new SQL/JSON item methods