Home > mailing lists

Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2) - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
Date	August 13, 2024 18:39:10
Msg-id	CAH2-Wzm5M+Djb8dJnGdiP8dwN7fZDNR-PSwNekGvKxa9-a_9cQ@mail.gmail.com Whole thread
In response to	Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2) (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses	Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
List	pgsql-hackers

Tree view

On Tue, Aug 6, 2024 at 5:42 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> Attached is version 16 now.

I ran this with my old land registry benchmark, used for validating
the space utilization impact of nbtree deduplication (among other
things). This isn't obviously the best benchmark for this sort of
thing, but I seem to recall that you'd used it yourself at some point.
To validate work in this area, likely including this patch. So I
decided to start there.

To be clear, this test involves bulk loading of an unlogged table (the
land registry table). The following composite index is created on the
table before we insert any rows, so most of the cycles here are in
index maintenance including _bt_search descents:

CREATE INDEX composite ON land2 USING btree (county COLLATE "C", city
COLLATE "C", locality COLLATE "C");

I wasn't able to see much of an improvement with this patch applied.
It went from  ~00:51.598 to ~00:51.053. That's a little disappointing,
given that this is supposed to be a sympathetic case for the patch.
Can you suggest something else? (Granted, I understand that this patch
has some complicated relationship with other patches of yours, which I
don't understand currently.)

I'm a bit worried about side-effects for this assertion:

@@ -485,7 +489,7 @@ _bt_check_unique(Relation rel, BTInsertState
insertstate, Relation heapRel,
                Assert(insertstate->bounds_valid);
                Assert(insertstate->low >= P_FIRSTDATAKEY(opaque));
                Assert(insertstate->low <= insertstate->stricthigh);
-               Assert(_bt_compare(rel, itup_key, page, offset) < 0);
+               Assert(_bt_compare(rel, itup_key, page, offset, &sprefix) < 0);
                break;
            }

More generally, it's not entirely clear how the code in
_bt_check_unique is supposed to work with the patch. Why should it be
safe to do what you're doing with the prefix there? It's not like
we're doing a binary search here -- it's more like a linear search.

> - Move from 1-indexed AttributeNumber to 0-indexed ints for prefixes,
> and use "prefix" as naming scheme, rather than cmpcol. A lack of
> prefix, previously indicated with a cmpcol value of 1, is now a prefix
> value of 0.

Found a likely-related bug in the changes you made to amcheck, which I
was able to fix myself like so:

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index c7dc6725a..15be61777 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -3187,7 +3187,7 @@ bt_rootdescend(BtreeCheckState *state, IndexTuple itup)
         insertstate.buf = lbuf;

         /* Get matching tuple on leaf page */
-        offnum = _bt_binsrch_insert(state->rel, &insertstate, 1);
+        offnum = _bt_binsrch_insert(state->rel, &insertstate, 0);
         /* Compare first >= matching item on leaf page, if any */

--
Peter Geoghegan

pgsql-hackers by date:

From: Etsuro Fujita
Date: 13 August 2024, 18:35:33
Subject: Re: Cross-version Compatibility of postgres_fdw

From: Heikki Linnakangas
Date: 13 August 2024, 18:39:11
Subject: Re: collect_corrupt_items_vacuum.patch

Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2) - Mailing list pgsql-hackers

Previous

Next