Re: btree: downlink right separator/HIKEY optimization - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: btree: downlink right separator/HIKEY optimization
Date
Msg-id CAEze2WgF+2yW1f+POrYLCWpTfOEgb28zEQX6tFPUPOziJ2tYmA@mail.gmail.com
Whole thread Raw
In response to Re: btree: downlink right separator/HIKEY optimization  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Fri, 8 Mar 2024 at 20:14, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Feb 22, 2024 at 10:42 AM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > I forgot to address this in the previous patch, so here's v3 which
> > fixes the issue warning.
>
> What benchmarking have you done here?

I have benchmarked this atop various versions of master when it was
part of the btree specialization patchset, where it showed a 2-9%
increase in btree insert performance over the previous patch in the
patchset on the various index types in that set.
More recently, on an unlogged pgbench with foreign keys enabled (-s400
-j4 -c8) I can't find any obvious regressions (it gains 0-0.7% on
master across 5-minute runs), while being 4.5% faster on inserting
data on a table with an excessively bad index shape (single index of
10 columns of empty strings with the non-default "nl-BE-x-icu"
collation followed by 1 random uuid column, inserted from a 10M row
dataset. Extrapolation indicates this could indeed get over 7%
improvement when the index shape is 31 nondefault -collated nonnull
text columns and a single random ID index column).

> Have you tried just reordering things in _bt_search() instead? If we
> delay the check until after the binary search, then the result of the
> binary search is usually proof enough that we cannot possibly need to
> move right. That has the advantage of not requiring that we copy
> anything to the stack.

I've not tried that, because it would makes page-level prefix
truncation more expensive by ~50%: With this patch, we need only 2
full tuple _bt_compares per page before we can establish a prefix,
while without this patch (i.e. if we did a binsrch-first approach)
we'd need 3 on average (assuming linearly randomly distributed
accesses). Because full-key compares can be arbitrarily more expensive
than normal attribute compares, I'd rather not have that 50% overhead.

> > On Fri, Mar 8, 2024 at 2:14 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > What benchmarking have you done here?
> I think that the memcmp() test is subtly wrong:

Good catch, it's been fixed in the attached version, using a new function.

Kind regards,

Matthias van de Meent.

Attachment

pgsql-hackers by date:

Previous
From: "Andrey M. Borodin"
Date:
Subject: Re: UUID v7
Next
From: Heikki Linnakangas
Date:
Subject: Re: Confine vacuum skip logic to lazy_scan_skip