Re: Iterating on IndexTuple attributes and nbtree page-level dynamic prefix truncation - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Iterating on IndexTuple attributes and nbtree page-level dynamic prefix truncation |
Date | |
Msg-id | CAH2-WzkZY5QUnKfjTyVjUcwUrDOA3yedjn1YOfcPYj_e75kTeA@mail.gmail.com Whole thread Raw |
In response to | Iterating on IndexTuple attributes and nbtree page-level dynamic prefix truncation (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: Iterating on IndexTuple attributes and nbtree page-level dynamic prefix truncation
|
List | pgsql-hackers |
On Thu, Apr 15, 2021 at 11:06 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > I've noticed there are a lot of places in the btree index > infrastructure (and also some other index AMs) that effectively > iterate over the attributes of the index tuple, but won't use > index_deform_tuple for reasons. However, this implies that they must > repeatedly call index_getattr, which in the worst case is O(n) for the > n-th attribute, slowing down extraction of multi-column indexes > significantly. As such, I've added some API that allows for iteration > (ish) over index attributes. Interesting approach. I think that in an ideal world we would have a tuple format with attribute lengths/offsets right in the header. But it's too late for that, so other approaches seem worth considering. > Also attached is 0002, which dynamically truncates attribute prefixes > of tuples whilst _binsrch-ing through a nbtree page. It greatly uses > the improved performance of 0001; they work very well together. The > problems that Peter (cc-ed) mentions in [0] only result in invalid > search bounds when traversing the tree, but on the page level valid > bounds can be constructed. > > This is patchset 1 of a series of patches I'm starting for eventually > adding static prefix truncation into nbtree infrastructure in > PostgreSQL. I've put up a wiki page [1] with my current research and > thoughts on that topic. The idea of making _bt_truncate() produce new leaf page high keys based on the lastleft tuple rather than the firstright tuple (i.e. +inf truncated attribute values rather than the current -inf) seems like a non-starter. As you point out in "1.) Suffix-truncation; -INF in high keys" on the Postgres wiki page, the current approach truncates firstright (not lastleft), making the left page's new high key contain what you call a 'foreign' value. But I see that as a big advantage of the current approach. Consider, for example, the nbtree high key "continuescan" optimization added by commit 29b64d1d. The fact that leaf page high keys are generated in this way kind of allows us to "peak" on the page to the immediate right before actually visiting it -- possibly without ever visiting it (which is where the benefit of that particular optimization lies). _bt_check_unique() uses a similar trick. After the Postgres 12 work, _bt_check_unique() will only visit a second page in the extreme case where we cannot possibly fit all of the relevant version duplicates on even one whole leaf page (i.e. practically never). There is also cleverness inside _bt_compare() to make sure that we handle the boundary cases perfectly while descending the tree. You might also consider how the nbtsplitloc.c code works with duplicates, and how that would be affected by +inf truncated attributes. The leaf-page-packing performed in the SPLIT_SINGLE_VALUE case only goes ahead when the existing high key confirms that this must be the rightmost page. Now, I guess that you could still do something like that if we switched to +inf semantics. But, the fact that the new right page will have a 'foreign' value in the SPLIT_SINGLE_VALUE-split case is also of benefit -- it is practically empty right after the split (since the original/left page is packed full), and we want this empty space to be eligible to either take more duplicates, or to take values that may happen to fit between the highly duplicated value and the original foreign high key value. We want that flexibility, I think. I also find -inf much more natural. If in the future we teach nbtree to truncate "inside" text attributes (say text columns), you'd pretty much be doing the same thing at the level of characters rather than whole columns. The -inf semantics are like strcmp() semantics. If you're going to pursue full prefix compression anyway, maybe you should use a low key on the leaf level in cases where the optimization is in use. This creates complexity during page deletion, because the low key in the subtree to the right of the deletion target subtree may need to be updated. Perhaps you can find a way to make that work that isn't too complicated. > I've run some tests with regards to performance on my laptop; which > tests nbtree index traversal. The test is based on a recent UK land > registry sales prices dataset (25744780 rows), being copied from one > table into an unlogged table with disabled autovacuum, with one index > as specified by the result. Master @ 99964c4a, patched is with both > 0001 and 0002. The results are averages over 3 runs, with plain > configure, compiled by gcc (Debian 6.3.0-18+deb9u1). You should probably account for index size here. I have lots of my own tests for space utilization, using data from a variety of sources. -- Peter Geoghegan
pgsql-hackers by date: