Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From John Naylor
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Date
Msg-id CAFBsxsFMuyUd48SXy1VMFvpVBHh6B6300_UfTSiPckKJtcwWMA@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
> [working on templating]

In the end, I decided to base my effort on v8, and not v12 (based on one of my less-well-thought-out ideas). The latter was a good experiment, but it did not lead to an increase in readability as I had hoped. The attached v17 is still rough, but it's in good enough shape to evaluate a mostly-complete templating implementation.

Part of what I didn't like about v8 was distinctions like "node" vs "nodep", which hinder readability. I've used "allocnode" for some cases where it makes sense, which is translated to "newnode" for the local pointer. Some places I just gave up and used "nodep" for parameters like in v8, just to get it done. We can revisit naming later.

Not done yet:

- get_handle() is not implemented
- rt_attach is defined but unused
- grow_node_kind() was hackishly removed, but could be turned into a macro (or function that writes to 2 pointers)
- node_update_inner() is back, now that we can share a template with "search". Seems easier to read, and I suspect this is easier for the compiler.
- the value type should really be a template macro, but is still hard-coded to uint64
- I think it's okay if the key is hard coded for PG16: If some use case needs more than uint64, we could consider "single-value leaves" with varlen keys as a template option.
- benchmark tests not updated

v13-0007 had some changes to the regression tests, but I haven't included those. The tests from v13-0003 do pass, both locally and shared. I quickly hacked together changing shared/local tests by hand (need to recompile), but it would be good for maintainability if tests could run once each with local and shmem, but use the same "expected" test output.

Also, I didn't look to see if there were any changes in v14/15 that didn't have to do with precise memory accounting.

At this point, Masahiko, I'd appreciate your feedback on whether this is an improvement at all (or at least a good base for improvement), especially for integrating with the TID store. I think there are some advantages to the template approach. One possible disadvantage is needing separate functions for each local and shared memory.

If we go this route, I do think the TID store should invoke the template as static functions. I'm not quite comfortable with a global function that may not fit well with future use cases.

One review point I'll mention: Somehow I didn't notice there is no use for the "chunk" field in the rt_node type -- it's only set to zero and copied when growing. What is the purpose? Removing it would allow the smallest node to take up only 32 bytes with a fanout of 3, by eliminating padding.

Also, v17-0005 has an optimization/simplification for growing into node125 (my version needs an assertion or fallback, but works well now), found by another reading of Andres' prototype There is a lot of good engineering there, we should try to preserve it.

--
John Naylor
EDB: http://www.enterprisedb.com
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Next
From: David Rowley
Date:
Subject: Re: An oversight in ExecInitAgg for grouping sets