Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Date
Msg-id CAD21AoATJQM4PuQ46NvQAAuxUcATmb7udFyummFHPN-QhUcL5g@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [PoC] Improve dead tuple storage for lazy vacuum  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Re: [PoC] Improve dead tuple storage for lazy vacuum  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Jul 26, 2021 at 11:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I'll experiment with the proposed ideas including this idea in more
> scenarios and share the results tomorrow.
>

I've done some benchmarks for proposed data structures. In this trial,
I've done with the scenario where dead tuples are concentrated on a
particular range of table blocks (test 5-8), in addition to the
scenarios I've done in the previous trial. Also, I've done benchmarks
of each scenario while increasing table size. In the first test, the
maximum block number of the table is 1,000,000 (i.g., 8GB table) and
in the second test, it's 10,000,000 (80GB table). We can see how
performance and memory consumption changes with a large-scale table.
Here are the results:

* Test 1
select prepare(
1000000, -- max block
10, -- # of dead tuples per page
1, -- dead tuples interval within  a page
1, -- # of consecutive pages having dead tuples
20 -- page interval
);

  name  |  attach   | attach | shuffled |  size_x10  | attach_x10| shuffled_x10
--------+-----------+--------+----------+------------+-----------+-------------
 array  | 57.23 MB  |  0.040 |   98.613 | 572.21 MB  |     0.387 |    1521.981
 intset | 46.88 MB  |  0.114 |   75.944 | 468.67 MB  |     0.961 |     997.760
 radix  | 40.26 MB  |  0.102 |   18.427 | 336.64 MB  |     0.797 |     266.146
 rtbm   | 64.02 MB  |  0.234 |   22.443 | 512.02 MB  |     2.230 |     275.143
 svtm   | 27.28 MB  |  0.060 |   13.568 | 274.07 MB  |     0.476 |     211.073
 tbm    | 96.01 MB  |  0.273 |   10.347 | 768.01 MB  |     2.882 |     128.103

* Test 2
select prepare(
1000000, -- max block
10, -- # of dead tuples per page
1, -- dead tuples interval within  a page
1, -- # of consecutive pages having dead tuples
1 -- page interval
);

  name  |  attach   | attach | shuffled |  size_x10  | attach_x10| shuffled_x10
--------+-----------+--------+----------+------------+-----------+-------------
 array  | 57.23 MB  |  0.041 |    4.757 | 572.21 MB  |     0.344 |      71.228
 intset | 46.88 MB  |  0.127 |    3.762 | 468.67 MB  |     1.093 |      49.573
 radix  | 9.95 MB   |  0.048 |    0.679 | 82.57 MB   |     0.371 |      16.211
 rtbm   | 34.02 MB  |  0.179 |    0.534 | 288.02 MB  |     2.092 |       8.693
 svtm   | 5.78 MB   |  0.043 |    0.239 | 54.60 MB   |     0.342 |       7.759
 tbm    | 96.01 MB  |  0.274 |    0.521 | 768.01 MB  |     2.685 |       6.360

* Test 3
select prepare(
1000000, -- max block
2, -- # of dead tuples per page
100, -- dead tuples interval within  a page
1, -- # of consecutive pages having dead tuples
1 -- page interval
);

  name  |  attach   | attach | shuffled |  size_x10  | attach_x10| shuffled_x10
--------+-----------+--------+----------+------------+-----------+-------------
 array  | 11.45 MB  |  0.009 |   57.698 | 114.45 MB  |     0.076 |    1045.639
 intset | 15.63 MB  |  0.031 |   46.083 | 156.23 MB  |     0.243 |     848.525
 radix  | 40.26 MB  |  0.063 |   13.755 | 336.64 MB  |     0.501 |     223.413
 rtbm   | 36.02 MB  |  0.123 |   11.527 | 320.02 MB  |     1.843 |     180.977
 svtm   | 9.28 MB   |  0.053 |    9.631 | 92.59 MB   |     0.438 |     212.626
 tbm    | 96.01 MB  |  0.228 |   10.381 | 768.01 MB  |     2.258 |     126.630

* Test 4
select prepare(
1000000, -- max block
100, -- # of dead tuples per page
1, -- dead tuples interval within  a page
1, -- # of consecutive pages having dead tuples
1 -- page interval
);

  name  |  attach   | attach | shuffled |  size_x10  | attach_x10| shuffled_x10
--------+-----------+--------+----------+------------+-----------+-------------
 array  | 572.21 MB |  0.367 |   78.047 | 5722.05 MB |     3.942 |    1154.776
 intset | 93.74 MB  |  0.777 |   45.146 | 937.34 MB  |     7.716 |     643.708
 radix  | 40.26 MB  |  0.203 |    9.015 | 336.64 MB  |     1.775 |     133.294
 rtbm   | 36.02 MB  |  0.369 |    5.639 | 320.02 MB  |     3.823 |      88.832
 svtm   | 7.28 MB   |  0.294 |    3.891 | 73.60 MB   |     2.690 |     103.744
 tbm    | 96.01 MB  |  0.534 |    5.223 | 768.01 MB  |     5.679 |      60.632


* Test 5
select prepare(
1000000, -- max block
150, -- # of dead tuples per page
1, -- dead tuples interval within  a page
10000, -- # of consecutive pages having dead tuples
20000 -- page interval
);

There are 10000 consecutive pages that have 150 dead tuples at every
20000 pages.

  name  |  attach   | attach | shuffled |  size_x10  | attach_x10| shuffled_x10
--------+-----------+--------+----------+------------+-----------+-------------
 array  | 429.16 MB |  0.274 |   75.664 | 4291.54 MB |     3.067 |    1259.501
 intset | 46.88 MB  |  0.559 |   36.449 | 468.67 MB  |     4.565 |     517.445
 radix  | 20.26 MB  |  0.166 |    8.466 | 196.90 MB  |     1.273 |     166.587
 rtbm   | 18.02 MB  |  0.242 |    8.491 | 160.02 MB  |     2.407 |     171.725
 svtm   | 3.66 MB   |  0.243 |    3.635 | 37.10 MB   |     2.022 |      86.165
 tbm    | 48.01 MB  |  0.344 |    9.763 | 384.01 MB  |     3.327 |     151.824

* Test 6
select prepare(
1000000, -- max block
10, -- # of dead tuples per page
1, -- dead tuples interval within  a page
10000, -- # of consecutive pages having dead tuples
20000 -- page interval
);

There are 10000 consecutive pages that have 10 dead tuples at every 20000 pages.

  name  |  attach   | attach | shuffled |  size_x10  | attach_x10| shuffled_x10
--------+-----------+--------+----------+------------+-----------+-------------
 array  | 28.62 MB  |  0.022 |    2.791 | 286.11 MB  |     0.170 |      46.920
 intset | 23.45 MB  |  0.061 |    2.156 | 234.34 MB  |     0.501 |      32.577
 radix  | 5.04 MB   |  0.026 |    0.433 | 48.57 MB   |     0.191 |      11.060
 rtbm   | 17.02 MB  |  0.074 |    0.533 | 144.02 MB  |     0.954 |      11.502
 svtm   | 3.16 MB   |  0.023 |    0.206 | 27.60 MB   |     0.175 |       4.886
 tbm    | 48.01 MB  |  0.132 |    0.656 | 384.01 MB  |     1.284 |      10.231

* Test 7
select prepare(
1000000, -- max block
150, -- # of dead tuples per page
1, -- dead tuples interval within  a page
1000, -- # of consecutive pages having dead tuples
999000 -- page interval
);

There are pages that have 150 dead tuples at first 1000 blocks and
last 1000 blocks.

  name  |  attach   | attach | shuffled |  size_x10  | attach_x10| shuffled_x10
--------+-----------+--------+----------+------------+-----------+-------------
 array  | 1.72 MB   |  0.002 |    7.507 | 17.17 MB   |     0.011 |      76.510
 intset | 0.20 MB   |  0.003 |    6.742 | 1.89 MB    |     0.022 |      52.122
 radix  | 0.20 MB   |  0.001 |    1.023 | 1.07 MB    |     0.007 |      12.023
 rtbm   | 0.15 MB   |  0.001 |    2.637 | 0.65 MB    |     0.009 |      34.528
 svtm   | 0.52 MB   |  0.002 |    0.721 | 0.61 MB    |     0.010 |       6.434
 tbm    | 0.20 MB   |  0.002 |    2.733 | 1.51 MB    |     0.015 |      38.538

* Test 8
select prepare(
1000000, -- max block
100, -- # of dead tuples per page
1, -- dead tuples interval within  a page
50, -- # of consecutive pages having dead tuples
100 -- page interval
);

There are 50 consecutive pages that have 100 dead tuples at every 100 pages.

  name  |  attach   | attach | shuffled |  size_x10  | attach_x10| shuffled_x10
--------+-----------+--------+----------+------------+-----------+-------------
 array  | 286.11 MB |  0.184 |   67.233 | 2861.03 MB |     1.743 |     979.070
 intset | 46.88 MB  |  0.389 |   35.176 | 468.67 MB  |     3.698 |     505.322
 radix  | 21.82 MB  |  0.116 |    6.160 | 186.86 MB  |     0.891 |     117.730
 rtbm   | 18.02 MB  |  0.182 |    5.909 | 160.02 MB  |     1.870 |     112.550
 svtm   | 4.28 MB   |  0.152 |    3.213 | 37.60 MB   |     1.383 |      79.073
 tbm    | 48.01 MB  |  0.265 |    6.673 | 384.01 MB  |     2.586 |     101.327

Overall, 'svtm' is faster and consumes less memory. 'radix' tree also
has good performance and memory usage.

From these results, svtm is the best data structure among proposed
ideas for dead tuple storage used during lazy vacuum in terms of
performance and memory usage. I think it can support iteration by
extracting the offset of dead tuples for each block while iterating
chunks.

Apart from performance and memory usage points of view, we also need
to consider the reusability of the code. When I started this thread, I
thought the best data structure would be the one optimized for
vacuum's dead tuple storage. However, if we can use a data structure
that can also be used in general, we can use it also for other
purposes. Moreover, if it's too optimized for the current TID system
(32 bits block number, 16 bits offset number, maximum block/offset
number, etc.) it may become a blocker for future changes.

In that sense, radix tree also seems good since it can also be used in
gist vacuum as a replacement for intset, or a replacement for hash
table for shared buffer as discussed before. Are there any other use
cases? On the other hand, I’m concerned that radix tree would be an
over-engineering in terms of vacuum's dead tuples storage since the
dead tuple storage is static data and requires only lookup operation,
so if we want to use radix tree as dead tuple storage, I'd like to see
further use cases.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: "Bossart, Nathan"
Date:
Subject: Re: log_checkpoint's "WAL file(s) added" is misleading to the point of uselessness
Next
From: Dilip Kumar
Date:
Subject: Re: row filtering for logical replication