Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Date
Msg-id 20210719234915.5kw6k2k5jojvamc5@alap3.anarazel.de
Whole thread Raw
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [PoC] Improve dead tuple storage for lazy vacuum
Re: [PoC] Improve dead tuple storage for lazy vacuum
List pgsql-hackers
Hi,

On 2021-07-19 15:20:54 +0900, Masahiko Sawada wrote:
> BTW is the implementation of the radix tree approach available
> somewhere? If so I'd like to experiment with that too.
>
> >
> > I have toyed with implementing adaptively large radix nodes like
> > proposed in https://db.in.tum.de/~leis/papers/ART.pdf - but haven't
> > gotten it quite working.
>
> That seems promising approach.

I've since implemented some, but not all of the ideas of that paper
(adaptive node sizes, but not the tree compression pieces).

E.g. for

select prepare(
1000000, -- max block
20, -- # of dead tuples per page
10, -- dead tuples interval within a page
1 -- page inteval
);
        attach  size    shuffled    ordered
array    69 ms  120 MB  84.87 s          8.66 s
intset  173 ms   65 MB  68.82 s         11.75 s
rtbm    201 ms   67 MB  11.54 s          1.35 s
tbm     232 ms  100 MB   8.33 s          1.26 s
vtbm    162 ms   58 MB  10.01 s          1.22 s
radix    88 ms   42 MB  11.49 s          1.67 s

and for
select prepare(
1000000, -- max block
10, -- # of dead tuples per page
1, -- dead tuples interval within a page
1 -- page inteval
);

        attach  size    shuffled    ordered
array    24 ms   60MB   3.74s            1.02 s
intset   97 ms   49MB   3.14s            0.75 s
rtbm    138 ms   36MB   0.41s            0.14 s
tbm     198 ms  101MB   0.41s            0.14 s
vtbm    118 ms   27MB   0.39s            0.12 s
radix    33 ms   10MB   0.28s            0.10 s

(this is an almost unfairly good case for radix)

Running out of time to format the results of the other testcases before
I have to run, unfortunately. radix uses 42MB both in test case 3 and
4.


The radix tree code isn't good right now. A ridiculous amount of
duplication etc. The naming clearly shows its origins from a buffer
mapping radix tree...


Currently in a bunch of the cases 20% of the time is spent in
radix_reaped(). If I move that into radix.c and for bfm_lookup() to be
inlined, I get reduced overhead. rbtm for example essentially already
does that, because it does splitting of ItemPointer in rtbm.c.


I've attached my current patches against your tree.

Greetings,

Andres Freund

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: O_DIRECT on macOS
Next
From: Peter Geoghegan
Date:
Subject: Re: Transactions and indexes