Re: Improving btree performance through specializing by key shape, take 2 - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Improving btree performance through specializing by key shape, take 2
Date
Msg-id CAEze2WiivoByXEHgp1CyL+i3_2znHvXcF0sar8wLwhMekxmo_g@mail.gmail.com
Whole thread Raw
In response to Re: Improving btree performance through specializing by key shape, take 2  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Improving btree performance through specializing by key shape, take 2
List pgsql-hackers
On Mon, 11 Apr 2022 at 03:11, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Sun, Apr 10, 2022 at 4:08 PM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > I'll send an updated patchset soon (I'm planning on moving around when
> > what is changed/added); but before that the attached incremental patch
> > should help. FYI, the patchset has been tested on commit 05023a23, and
> > compiles (with unused function warnings) after applying the attached
> > patch.
>
> I can get it to work now, with your supplemental patch.

Great. Attached is the updated patchset, with as main changes:

- Rebased on top of 5bb2b6ab
- All patches should compile when built on top of each preceding patch.
- Reordered the patches to a more logical order, and cleaned up the
content of each patch
- Updated code so that GCC doesn't warn about unused code.
- Add a patch for dynamic prefix truncation at page level; ref thread at [1].
- Fixed issues in the specialization macros that resulted in issues
with DPT above.

Still to-do:

- Validate performance and share the numbers for the same test indexes
in [1]. I'm planning on doing that next Monday.
- Decide whether / how to keep the NBTS_ENABLED flag. The current
#define in nbtree.h is a bad example of a compile-time configuration,
that should be changed (even if we only want to be able to disable
specialization at compile-time, it should be moved).

Maybe:

- More tests: PG already extensively tests the btree code while it is
running the test suite - btree is the main index AM - but more tests
might be needed to test the validity of the specialized code.

> > Queries that I expect to be faster are situations where the index does
> > front-to-back attribute accesses in a tight loop and repeated index
> > lookups; such as in index builds, data loads, JOINs, or IN () and =
> > ANY () operations; and then specifically for indexes with only a
> > single key attribute, or indexes where we can determine based on the
> > index attributes' types that nocache_index_getattr will be called at
> > least once for a full _bt_compare call (i.e. att->attcacheoff cannot
> > be set for at least one key attribute).
>
> I did some quick testing of the patch series -- pretty much just
> reusing my old test suite from the Postgres 12 and 13 nbtree work.
> This showed that there is a consistent improvement in some cases. It
> also failed to demonstrate any performance regressions. That's
> definitely a good start.
>
> I saw about a 4% reduction in runtime for the same UK land registry
> test that you yourself have run in the past for the same patch series
> [1].

That's good to know. The updated patches (as attached) have dynamic
prefix truncation from the patch series in [1] added too, which should
improve the performance by a few more percentage points in that
specific test case.

> I suspect that there just aren't that many ways to get that kind
> of speed up with this test case, except perhaps by further compressing
> the on-disk representation used by nbtree. My guess is that the patch
> reduces the runtime for this particular test case to a level that's
> significantly closer to the limit for this particular piece of
> silicon. Which is not to be sniffed at.
>
> Admittedly these test cases were chosen purely because they were
> convenient. They were originally designed to test space utilization,
> which isn't affected either way here. I like writing reproducible test
> cases for indexing stuff, and think that it could work well here too
> (even though you're not optimizing space utilization at all). A real
> test suite that targets a deliberately chosen cross section of "index
> shapes" might work very well.

I'm not sure what you're refering to. Is the set of indexes I used in
[1] an example of what you mean by "test suite of index shapes"?

> > In the previous iteration, I discerned that there are different
> > "shapes" of indexes, some of which currently have significant overhead
> > in the existing btree infrastructure. Especially indexes with multiple
> > key attributes can see significant overhead while their attributes are
> > being extracted, which (for a significant part) can be attributed to
> > the O(n) behaviour of nocache_index_getattr. This O(n) overhead is
> > currently avoided only by indexes with only a single key attribute and
> > by indexes in which all key attributes have a fixed size (att->attlen
> > > 0).
>
> Good summary.
>
> > The types of btree keys I discerned were:
> > CREATE INDEX ON tst ...
> > ... (single_key_attribute)
> > ... (varlen, other, attributes, ...)
> > ... (fixed_size, also_fixed, ...)
> > ... (sometimes_null, other, attributes, ...)
> >
> > For single-key-attribute btrees, the performance benefits in the patch
> > are achieved by reducing branching in the attribute extraction: There
> > are no other key attributes to worry about, so much of the code
> > dealing with looping over attributes can inline values, and thus
> > reduce the amount of code generated in the hot path.
>
> I agree that it might well be useful to bucket indexes into several
> different "index shape archetypes" like this. Roughly the same
> approach worked well for me in the past. This scheme might turn out to
> be reductive, but even then it could still be very useful (all models
> are wrong, some are useful, now as ever).
>
> > For btrees with multiple key attributes, benefits are achieved if some
> > attributes are of variable length (e.g. text):
> > On master, if your index looks like CREATE INDEX ON tst (varlen,
> > fixed, fixed), for the latter attributes the code will always hit the
> > slow path of nocache_index_getattr. This introduces a significant
> > overhead; as that function wants to re-establish that the requested
> > attribute's offset is indeed not cached and not cacheable, and
> > calculates the requested attributes' offset in the tuple from
> > effectively zero.
>
> Right. So this particular index shape seems like something that we
> treat in a rather naive way currently.

But really every index shape is treated naively, except the cacheable
index shapes. The main reason we haven't cared about it much is that
you don't often see btrees with many key attributes, and when it's
slow that is explained away 'because it is a big index and a wide
index key' but still saves orders of magnitude over a table scan, so
people generally don't complain about it. A notable exception was
80b9e9c4, where a customer complained about index scans being faster
than index only scans.

> Can you demonstrate that with a custom test case? (The result I cited
> before was from a '(varlen,varlen,varlen)' index, which is important,
> but less relevant.)

Anything that has a variable length in any attribute other than the
last; so that includes (varlen, int) and also (int, int, varlen, int,
int, int, int).
The catalogs currently seem to include only one such index:
pg_proc_proname_args_nsp_index is an index on pg_proc (name (const),
oidvector (varlen), oid (const)).

- Matthias

[1]
https://www.postgresql.org/message-id/flat/CAEze2WhyBT2bKZRdj_U0KS2Sbewa1XoO_BzgpzLC09sa5LUROg%40mail.gmail.com#fe3369c4e202a7ed468e47bf5420f530

Attachment

pgsql-hackers by date:

Previous
From: Sahil Harpal
Date:
Subject: GSOC-2022 | Improve pgarchives proposal review
Next
From: Tom Lane
Date:
Subject: Re: Crash in new pgstats code