Re: Improving btree performance through specializing by key shape, take 2 - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Improving btree performance through specializing by key shape, take 2 |
Date | |
Msg-id | CALDaNm1dpnVGZtsdkXPB4Nbgbja9rifYPqQhmsaaY3YYELSy5w@mail.gmail.com Whole thread Raw |
In response to | Re: Improving btree performance through specializing by key shape, take 2 (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: Improving btree performance through specializing by key shape, take 2
|
List | pgsql-hackers |
On Mon, 30 Oct 2023 at 21:50, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > On Thu, 26 Oct 2023 at 00:36, Peter Geoghegan <pg@bowt.ie> wrote: > > Most of the work with something like > > this is the analysis of the trade-offs, not writing code. There are > > all kinds of trade-offs that you could make with something like this, > > and the prospect of doing that myself is kind of daunting. Ideally > > you'd have made a significant start on that at this point. > > I believe I'd already made most trade-offs clear earlier in the > threads, along with rationales for the changes in behaviour. But here > goes again: > > _bt_compare currently uses index_getattr() on each attribute of the > key. index_getattr() is O(n) for the n-th attribute if the index tuple > has any null or non-attcacheoff attributes in front of the current > one. Thus, _bt_compare costs O(n^2) work with n=the number of > attributes, which can cost several % of performance, but is very very > bad in extreme cases, and doesO(n) calls to opclass-supplied compare > operations. > > To solve most of the O(n) compare operations, we can optimize > _bt_compare to only compare "interesting" attributes, i.e. we can > apply "dynamic prefix truncation". This is implemented by patch 0001. > This is further enhanced with 0002, where we skip the compare > operations if the HIKEY is the same as the right separator of the > downlink we followed (due to our page split code, this case is > extremely likely). > > However, the above only changes the attribute indexing code in > _bt_compare to O(n) for at most about 76% of the index tuples on the > page (1 - (2 / log2(max_tuples_per_page))), while the other on average > 20+% of the compare operations still have to deal with the O(n^2) > total complexity of index_getattr. > To fix this O(n^2) issue (the issue this thread was originally created > for) the approach I implemented originally is to not use index_getattr > but an "attribute iterator" that incrementally extracts the next > attribute, while keeping track of the current offset into the tuple, > so each next attribute would be O(1). That is implemented in the last > patches of the patchset. > > This attribute iterator approach has an issue: It doesn't perform very > well for indexes that make full use of attcacheoff. The bookkeeping > for attribute iteration proved to be much more expensive than just > reading attcacheoff from memory. This is why the latter patches > (patchset 14 0003+) adapt the btree code to generate different paths > for different "shapes" of key index attributes, to allow the current > attcacheoff code to keep its performance, but to get higher > performance for indexes where the attcacheoff optimization can not be > applied. In passing, it also specializes the code for single-attribute > indexes, so that they don't have to manage branching code, increasing > their performance, too. > > TLDR: > The specialization in 0003+ is applied because index_getattr is good > when attcacheoff applies, but very bad when it doesn't. Attribute > iteration is worse than index_getattr when attcacheoff applies, but is > significantly better when attcacheoff does not work. By specializing > we get the best of both worlds. > > The 0001 and 0002 optimizations were added later to further remove > unneeded calls to the btree attribute compare functions, thus further > reducing the total time spent in _bt_compare. > > Anyway. > > PFA v14 of the patchset. v13's 0001 is now split in two, containing > prefix truncation in 0001, and 0002 containing the downlink's right > separator/HIKEY optimization. > > Performance numbers (data attached): > 0001 has significant gains in multi-column indexes with shared > prefixes, where the prefix columns are expensive to compare, but > otherwise doesn't have much impact. > 0002 further improves performance across the board, but again mostly > for indexes with expensive compare operations. > 0007 sees performance improvements almost across the board, with only > the 'ul' and 'tnt' indexes getting some worse results than master (but > still better average results), > > All patches applied, per-index average performance improvements on 15 > runs range from 3% to 290% across the board for INSERT benchmarks, and > -2.83 to 370% for REINDEX. > > Configured with autoconf: config.log: > > It was created by PostgreSQL configure 17devel, which was > > generated by GNU Autoconf 2.69. Invocation command line was > > > > $ ./configure --enable-tap-tests --enable-depend --with-lz4 --with-zstd COPT=-ggdb -O3 --prefix=/home/matthias/projects/postgresql/pg_install--no-create --no-recursion > > Benchmark was done on 1m random rows of the pp-complete dataset, as > found on UK Gov's S3 bucket [0]: using a parallel and threaded > downloader is preferred because the throughput is measured in kBps per > client. > > I'll do a few runs on the full dataset of 29M rows soon too, but > master's performance is so bad for the 'worstcase' index that I can't > finish its runs fast enough; benchmarking it takes hours per > iteration. CFBot shows that the patch does not apply anymore as in [1]: === Applying patches on top of PostgreSQL commit ID 55627ba2d334ce98e1f5916354c46472d414bda6 === === applying patch ./v14-0001-btree-Implement-dynamic-prefix-compression.patch ... Hunk #7 succeeded at 3169 with fuzz 2 (offset 75 lines). Hunk #8 succeeded at 3180 (offset 75 lines). Hunk #9 FAILED at 3157. Hunk #10 FAILED at 3180. Hunk #11 FAILED at 3218. 3 out of 11 hunks FAILED -- saving rejects to file contrib/amcheck/verify_nbtree.c.rej Please post an updated version for the same. [1] - http://cfbot.cputube.org/patch_46_3672.log Regards, Vignesh
pgsql-hackers by date: