Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date | |
Msg-id | CAD21AoD-anVc6aozi+pi2Zn0CJA3bPOS3i7oABpZjALPbmwtbg@mail.gmail.com Whole thread Raw |
In response to | Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <johncnaylorls@gmail.com>) |
Responses |
Re: [PoC] Improve dead tuple storage for lazy vacuum
|
List | pgsql-hackers |
On Mon, Mar 4, 2024 at 8:48 PM John Naylor <johncnaylorls@gmail.com> wrote: > > On Mon, Mar 4, 2024 at 1:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Sun, Mar 3, 2024 at 2:43 PM John Naylor <johncnaylorls@gmail.com> wrote: > > > > Right, I should have said "reset". Resetting a context will delete > > > it's children as well, and seems like it should work to reset the tree > > > context, and we don't have to know whether that context actually > > > contains leaves at all. That should allow copying "tree context" to > > > "leaf context" in the case where we have no special context for > > > leaves. > > > > Resetting the tree->context seems to work. But I think we should note > > for callers that the dsa_area passed to RT_CREATE should be created in > > a different context than the context passed to RT_CREATE because > > otherwise RT_FREE() will also free the dsa_area. For example, the > > following code in test_radixtree.c will no longer work: > > > > dsa = dsa_create(tranche_id); > > radixtree = rt_create(CurrentMemoryContext, dsa, tranche_id); > > : > > rt_free(radixtree); > > dsa_detach(dsa); // dsa is already freed. > > > > So I think that a practical usage of the radix tree will be that the > > caller creates a memory context for a radix tree and passes it to > > RT_CREATE(). > > That sounds workable to me. > > > I've attached an update patch set: > > > > - 0008 updates RT_FREE_RECURSE(). > > Thanks! > > > - 0009 patch is an updated version of cleanup radix tree memory handling. > > Looks pretty good, as does the rest. I'm going through again, > squashing and making tiny adjustments to the template. The only thing > not done is changing the test with many values to resemble the perf > test more. > > I wrote: > > > Secondly, I thought about my recent work to skip checking if we first > > > need to create a root node, and that has a harmless (for vacuum at > > > least) but slightly untidy behavior: When RT_SET is first called, and > > > the key is bigger than 255, new nodes will go on top of the root node. > > > These have chunk '0'. If all subsequent keys are big enough, the > > > orginal root node will stay empty. If all keys are deleted, there will > > > be a chain of empty nodes remaining. Again, I believe this is > > > harmless, but to make tidy, it should easy to teach RT_EXTEND_UP to > > > call out to RT_EXTEND_DOWN if it finds the tree is empty. I can work > > > on this, but likely not today. > > > > This turns out to be a lot trickier than it looked, so it seems best > > to allow a trivial amount of waste, as long as it's documented > > somewhere. It also wouldn't be terrible to re-add those branches, > > since they're highly predictable. > > I put a little more work into this, and got it working, just needs a > small amount of finicky coding. I'll share tomorrow. > > I have a question about RT_FREE_RECURSE: > > + check_stack_depth(); > + CHECK_FOR_INTERRUPTS(); > > I'm not sure why these are here: The first seems overly paranoid, > although harmless, but the second is probably a bad idea. Why should > the user be able to to interrupt the freeing of memory? Good catch. We should not check the interruption there. > Also, I'm not quite happy that RT_ITER has a copy of a pointer to the > tree, leading to coding like "iter->tree->ctl->root". I *think* it > would be easier to read if the tree was a parameter to these iteration > functions. That would require an API change, so the tests/tidstore > would have some churn. I can do that, but before trying I wanted to > see what you think -- is there some reason to keep the current way? I considered both usages, there are two reasons for the current style. I'm concerned that if we pass both the tree and RT_ITER to iteration functions, the caller could mistakenly pass a different tree than the one that was specified to create the RT_ITER. And the second reason is just to make it consistent with other data structures such as dynahash.c and dshash.c, but I now realized that in simplehash.h we pass both the hash table and the iterator. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: