Re: Improving btree performance through specializing by key shape, take 2 - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Improving btree performance through specializing by key shape, take 2
Date
Msg-id CAH2-WznD9a=-JThYHESad1p1X385TuWAoQR8zWAiZ2pJjLkfkA@mail.gmail.com
Whole thread Raw
In response to Re: Improving btree performance through specializing by key shape, take 2  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Improving btree performance through specializing by key shape, take 2
List pgsql-hackers
On Tue, Sep 19, 2023 at 6:28 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> > To be clear, page deletion does what I described here (it does an
> > in-place update of the downlink to the deleted page, so the same pivot
> > tuple now points to its right sibling, which is our page of concern),
> > in addition to fully removing the original pivot tuple whose downlink
> > originally pointed to our page of concern. This is why page deletion
> > makes the key space "move to the right", very much like a page split
> > would.
>
> I am still aware of this issue, and I think we've discussed it in
> detail earlier. I think it does not really impact this patchset. Sure,
> I can't use dynamic prefix compression to its full potential, but I
> still do get serious performance benefits:

Then why have you linked whatever the first patch does with the high
key to dynamic prefix compression in the first place? Your commit
message makes it sound like it's a way to get around the race
condition that affects dynamic prefix compression, but as far as I can
tell it has nothing whatsoever to do with that race condition.

Questions:

1. Why shouldn't the high key thing be treated as an unrelated piece of work?

I guess it's possible that it really should be structured that way,
but even then it's your responsibility to make it clear why that is.
As things stand, this presentation is very confusing.

2. Separately, why should dynamic prefix compression be tied to the
specialization work? I also see no principled reason why it should be
tied to the other two things.

I didn't mind this sort of structure so much back when this work was
very clearly exploratory -- I've certainly structured work in this
area that way myself, in the past. But if you want this patch set to
ever go beyond being an exploratory patch set, something has to
change. I don't have time to do a comprehensive (or even a fairly
cursory) analysis of which parts of the patch are helping, and which
are marginal or even add no value.

> > You'd have
> > to compare the lower bound separator key from the parent (which might
> > itself be the page-level low key for the parent) to the page low key.
> > That's not a serious suggestion; I'm just pointing out that you need
> > to be able to compare like with like for a canary condition like this
> > one, and AFAICT there is no lightweight practical way of doing that
> > that is 100% robust.
>
> True, if we had consistent LOWKEYs on pages, that'd make this job much
> easier: the prefix could indeed be carried over in full. But that's
> not currently the case for the nbtree code, and this is the next best
> thing, as it also has the benefit of working with all currently
> supported physical formats of btree indexes.

I went over the low key thing again because I had to struggle to
understand what your high key optimization had to do with dynamic
prefix compression. I'm still struggling. I think that your commit
message very much led me astray. Quoting it here:

"""
Although this limits the overall applicability of the
performance improvement, it still allows for a nice performance
improvement in most cases where initial columns have many
duplicate values and a compare function that is not cheap.

As an exception to the above rule, most of the time a pages'
highkey is equal to the right seperator on the parent page due to
how btree splits are done. By storing this right seperator from
the parent page and then validating that the highkey of the child
page contains the exact same data, we can restore the right prefix
bound without having to call the relatively expensive _bt_compare.
"""

You're directly tying the high key optimization to the dynamic prefix
compression optimization. But why?

I have long understood that you gave up on the idea of keeping the
bounds across levels of the tree (which does make sense to me), but
yesterday the issue became totally muddled by this high key business.
That's why I rehashed the earlier discussion, which I had previously
understood to be settled.

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: dikkop seems unhappy because of openssl stuff (FreeBSD 14-BETA1)
Next
From: "Tristan Partin"
Date:
Subject: Re: Extensible storage manager API - SMGR hook Redux