Re: Improving btree performance through specializing by key shape, take 2 - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Improving btree performance through specializing by key shape, take 2
Date
Msg-id CAH2-WznFZN95V4pqDommzDEf7VwM+J2g-Y1XwK5kBMnB=20p3g@mail.gmail.com
Whole thread Raw
In response to Re: Improving btree performance through specializing by key shape, take 2  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Improving btree performance through specializing by key shape, take 2
List pgsql-hackers
On Sun, Apr 10, 2022 at 4:08 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> I'll send an updated patchset soon (I'm planning on moving around when
> what is changed/added); but before that the attached incremental patch
> should help. FYI, the patchset has been tested on commit 05023a23, and
> compiles (with unused function warnings) after applying the attached
> patch.

I can get it to work now, with your supplemental patch.

> Queries that I expect to be faster are situations where the index does
> front-to-back attribute accesses in a tight loop and repeated index
> lookups; such as in index builds, data loads, JOINs, or IN () and =
> ANY () operations; and then specifically for indexes with only a
> single key attribute, or indexes where we can determine based on the
> index attributes' types that nocache_index_getattr will be called at
> least once for a full _bt_compare call (i.e. att->attcacheoff cannot
> be set for at least one key attribute).

I did some quick testing of the patch series -- pretty much just
reusing my old test suite from the Postgres 12 and 13 nbtree work.
This showed that there is a consistent improvement in some cases. It
also failed to demonstrate any performance regressions. That's
definitely a good start.

I saw about a 4% reduction in runtime for the same UK land registry
test that you yourself have run in the past for the same patch series
[1]. I suspect that there just aren't that many ways to get that kind
of speed up with this test case, except perhaps by further compressing
the on-disk representation used by nbtree. My guess is that the patch
reduces the runtime for this particular test case to a level that's
significantly closer to the limit for this particular piece of
silicon. Which is not to be sniffed at.

Admittedly these test cases were chosen purely because they were
convenient. They were originally designed to test space utilization,
which isn't affected either way here. I like writing reproducible test
cases for indexing stuff, and think that it could work well here too
(even though you're not optimizing space utilization at all). A real
test suite that targets a deliberately chosen cross section of "index
shapes" might work very well.

> In the previous iteration, I discerned that there are different
> "shapes" of indexes, some of which currently have significant overhead
> in the existing btree infrastructure. Especially indexes with multiple
> key attributes can see significant overhead while their attributes are
> being extracted, which (for a significant part) can be attributed to
> the O(n) behaviour of nocache_index_getattr. This O(n) overhead is
> currently avoided only by indexes with only a single key attribute and
> by indexes in which all key attributes have a fixed size (att->attlen
> > 0).

Good summary.

> The types of btree keys I discerned were:
> CREATE INDEX ON tst ...
> ... (single_key_attribute)
> ... (varlen, other, attributes, ...)
> ... (fixed_size, also_fixed, ...)
> ... (sometimes_null, other, attributes, ...)
>
> For single-key-attribute btrees, the performance benefits in the patch
> are achieved by reducing branching in the attribute extraction: There
> are no other key attributes to worry about, so much of the code
> dealing with looping over attributes can inline values, and thus
> reduce the amount of code generated in the hot path.

I agree that it might well be useful to bucket indexes into several
different "index shape archetypes" like this. Roughly the same
approach worked well for me in the past. This scheme might turn out to
be reductive, but even then it could still be very useful (all models
are wrong, some are useful, now as ever).

> For btrees with multiple key attributes, benefits are achieved if some
> attributes are of variable length (e.g. text):
> On master, if your index looks like CREATE INDEX ON tst (varlen,
> fixed, fixed), for the latter attributes the code will always hit the
> slow path of nocache_index_getattr. This introduces a significant
> overhead; as that function wants to re-establish that the requested
> attribute's offset is indeed not cached and not cacheable, and
> calculates the requested attributes' offset in the tuple from
> effectively zero.

Right. So this particular index shape seems like something that we
treat in a rather naive way currently.

Can you demonstrate that with a custom test case? (The result I cited
before was from a '(varlen,varlen,varlen)' index, which is important,
but less relevant.)

[1] https://www.postgresql.org/message-id/flat/CAEze2Whwvr8aYcBf0BeBuPy8mJGtwxGvQYA9OGR5eLFh6Q_ZvA%40mail.gmail.com
-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Commitfest Closed
Next
From: Justin Pryzby
Date:
Subject: typos