Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date | |
Msg-id | CAD21AoB8RY5aM_+PzWPEb5M=MZ9J4aVUH+xCj_D8uUx4JWcZDA@mail.gmail.com Whole thread Raw |
In response to | Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <john.naylor@enterprisedb.com>) |
Responses |
Re: [PoC] Improve dead tuple storage for lazy vacuum
|
List | pgsql-hackers |
Hi, On Mon, Aug 14, 2023 at 8:05 PM John Naylor <john.naylor@enterprisedb.com> wrote: > > On Thu, Jul 13, 2023 at 3:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > 0007, 0008, 0010, and 0011 are straightforward and agree to merge them. Thank you for updating the patch! > > [Part 1 - clear the deck of earlier performance work etc] > > Thanks for taking a look! I've merged 0007 and 0008. The others need a performance test to justify them -- an eyeball checkis not enough. I've now made the time to do that. > > ==== sparse loads > > v38 0001-0006 (still using node3 for this test only): > > select avg(load_ms) from generate_series(1,100) x(x), lateral (select * from bench_load_random_int(100 * 1000 * (1+x-x)))a; > avg > --------------------- > 27.1000000000000000 > > select avg(load_ms) from generate_series(1,30) x(x), lateral (select * from bench_load_random_int(500 * 1000 * (1+x-x)))a; > avg > ---------------------- > 165.6333333333333333 > > v38-0007-Optimize-RT_EXTEND_DOWN.patch > > select avg(load_ms) from generate_series(1,100) x(x), lateral (select * from bench_load_random_int(100 * 1000 * (1+x-x)))a; > avg > --------------------- > 25.0900000000000000 > > select avg(load_ms) from generate_series(1,30) x(x), lateral (select * from bench_load_random_int(500 * 1000 * (1+x-x)))a; > avg > ---------------------- > 157.3666666666666667 > > That seems worth doing. > > v38-0008-Use-4-children-for-node-4-also-attempt-portable-.patch > > This combines two things because I messed up a rebase: Use fanout of 4, and try some macros for shmem sizes, both 32- and64-bit. Looking at this much, I no longer have a goal to have a separate set of size-classes for non-SIMD platforms, becausethat would cause global maintenance problems -- it's probably better to reduce worst-case search time where necessary.That would be much more localized. > > > I have some questions on 0009 patch: > > > According to the comment, this optimization is for only gcc? > > No, not at all. That tells me the comment is misleading. > > > I think this change reduces > > readability and maintainability. > > Well, that much is obvious. What is not obvious is how much it gains us over the alternatives. I do have a simpler idea,though... > > ==== load mostly node4 > > select * from bench_search_random_nodes(250*1000, '0xFFFFFF'); > n4 = 42626, n16 = 21492, n32 = 0, n64 = 0, n256 = 257 > mem_allocated | load_ms | search_ms > ---------------+---------+----------- > 7352384 | 25 | 0 > > v38-0009-TEMP-take-out-search-time-from-bench.patch > > This is just to allow LATERAL queries for better measurements. > > select avg(load_ms) from generate_series(1,100) x(x), lateral (select * from bench_search_random_nodes(250*1000 * (1+x-x),'0xFFFFFF')) a; > > avg > --------------------- > 24.8333333333333333 0007, 0008, and 0009 look good to me. > > v38-0010-Try-a-simpler-way-to-avoid-memmove.patch > > This slightly rewrites the standard loop so that gcc doesn't turn it into a memmove(). Unlike the patch you didn't like,this *is* gcc-specific. (needs a comment, which I forgot) > > avg > --------------------- > 21.9600000000000000 > > So, that's not a trivial difference. I wasn't a big fan of Andres' __asm("") workaround, but that may be just my ignoranceabout it. We need something like either of the two. > > v38-0011-Optimize-add_child_4-take-2.patch > avg > --------------------- > 21.3500000000000000 > > This is possibly faster than v38-0010, but looking like not worth the complexity, assuming the other way avoids the buggoing forward. I prefer 0010 but is it worth testing with other compilers such as clang? > > > According to the bugzilla ticket > > referred to in the comment, it's realized as a bug in the community, > > so once the gcc bug fixes, we might no longer need this trick, no? > > No comment in two years... > > v38-0013-Use-constant-for-initial-copy-of-chunks-and-chil.patch > > This is the same as v37-0011. I wasn't quite satisfied with it since it still has two memcpy() calls, but it actually seemsto regress: > > avg > --------------------- > 22.0900000000000000 > > v38-0012-Use-branch-free-coding-to-skip-new-element-index.patch > > This patch uses a single loop for the copy. > > avg > --------------------- > 21.0300000000000000 > > Within noise level of v38-0011, but it's small and simple, so I like it, at least for small arrays. Agreed. > > v38-0014-node48-Remove-need-for-RIGHTMOST_ONE-in-radix-tr.patch > v38-0015-node48-Remove-dead-code-by-using-loop-local-var.patch > > Just small cleanups. > > v38-0016-Use-memcpy-for-children-when-growing-into-node48.patch > > Makes sense, but untested. Agreed. BTW cfbot reported that some regression tests failed due to OOM. I've attached the patch to fix it. > > =============== > [Part 2] > > Per off-list discussion with Masahiko, it makes sense to take some of the ideas I've used locally on tidbitmap, and startincorporating them into earlier vacuum work to get that out the door faster. With that in mind... > > v38-0017-Make-tidstore-more-similar-to-tidbitmap.patch > > This uses a simplified PagetableEntry (unimaginatively called BlocktableEntry just to avoid confusion), to be replacedwith the real thing at a later date. This is still fixed size, to be replaced with a varlen type. That's more readable. > > Looking at the tidstore tests again after some months, I'm not particularly pleased with the amount of code required forhow little it seems to be testing, nor the output when something fails. (I wonder how hard it would be to have SQL functionsthat add blocks/offsets to the tid store, and emit tuples of tids found in the store.) It would not be hard to have such SQL functions. I'll try it. > > I'm also concerned about the number of places that have to know if the store is using shared memory or not. Something tothink about later. > > v38-0018-Consolidate-inserting-updating-values.patch > > This is something I coded up to get to an API more similar to one in simplehash, as used in tidbitmap.c. It seem worthdoing on its own to reduce code duplication, and also simplifies coding of varlen types and "runtime-embeddable values". Agreed. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
pgsql-hackers by date: