Re: Combine Prune and Freeze records emitted by vacuum - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Combine Prune and Freeze records emitted by vacuum |
Date | |
Msg-id | CAAKRu_bEo54H+sKXiCb4zsG32vXU7QoO+kGYTAoFyvYZzXxGiA@mail.gmail.com Whole thread Raw |
In response to | Re: Combine Prune and Freeze records emitted by vacuum (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: Combine Prune and Freeze records emitted by vacuum
|
List | pgsql-hackers |
Thanks so much for the review! On Wed, Mar 6, 2024 at 7:59 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > On 25/01/2024 00:49, Melanie Plageman wrote: > > > The attached patch set is broken up into many separate commits for > > ease of review. Each patch does a single thing which can be explained > > plainly in the commit message. Every commit passes tests and works on > > its own. > > About this very first change: > > > --- a/src/backend/access/heap/vacuumlazy.c > > +++ b/src/backend/access/heap/vacuumlazy.c > > @@ -1526,8 +1526,7 @@ lazy_scan_prune(LVRelState *vacrel, > > * that everyone sees it as committed? > > */ > > xmin = HeapTupleHeaderGetXmin(htup); > > - if (!TransactionIdPrecedes(xmin, > > - vacrel->cutoffs.OldestXmin)) > > + if (!GlobalVisTestIsRemovableXid(vacrel->vistest, xmin)) > > { > > all_visible = false; > > break; > > Does GlobalVisTestIsRemovableXid() handle FrozenTransactionId correctly? Okay, so I thought a lot about this, and I don't understand how GlobalVisTestIsRemovableXid() would not handle FrozenTransactionId correctly. vacrel->cutoffs.OldestXmin is computed initially from GetOldestNonRemovableTransactionId() which uses ComputeXidHorizons(). GlobalVisState is updated by ComputeXidHorizons() (when it is updated). I do see that the comment above GlobalVisTestIsRemovableXid() says * It is crucial that this only gets called for xids from a source that * protects against xid wraparounds (e.g. from a table and thus protected by * relfrozenxid). and then in * Convert 32 bit argument to FullTransactionId. We can do so safely * because we know the xid has to, at the very least, be between * [oldestXid, nextXid), i.e. within 2 billion of xid. I'm not sure what oldestXid is here. It's true that I don't see GlobalVisTestIsRemovableXid() being called anywhere else with an xmin as an input. I think that hints that it is not safe with FrozenTransactionId. But I don't see what could go wrong. Maybe it has something to do with converting it to a FullTransactionId? FullTransactionIdFromU64(U64FromFullTransactionId(rel) + (int32) (xid - rel_xid)); Sorry, I couldn't quite figure it out :( > I read through all the patches in order, and aside from the above they > all look correct to me. Some comments on the set as whole: > > I don't think we need XLOG_HEAP2_FREEZE_PAGE as a separate record type > anymore. By removing that, you also get rid of the freeze-only codepath > near the end of heap_page_prune(), and the > heap_freeze_execute_prepared() function. That makes sense to me. > The XLOG_HEAP2_FREEZE_PAGE record is a little smaller than > XLOG_HEAP2_PRUNE. But we could optimize the XLOG_HEAP2_PRUNE format for > the case that there's no pruning, just freezing. The record format > (xl_heap_prune) looks pretty complex as it is, I think it could be made > both more compact and more clear with some refactoring. I'm happy to change up xl_heap_prune format. In its current form, according to pahole, it has no holes and just 3 bytes of padding at the end. One way we could make it smaller is by moving the isCatalogRel member to directly after snapshotConflictHorizon and then adding a flags field and defining flags to indicate whether or not other members exist at all. We could set bits for HAS_FREEZE_PLANS, HAS_REDIRECTED, HAS_UNUSED, HAS_DEAD. Then I would remove the non-optional uint16 nredirected, nunused, nplans, ndead and just put the number of redirected/unused/etc at the beginning of the arrays containing the offsets. Then I could write various macros for accessing them. That would make it smaller, but it definitely wouldn't make it less complex (IMO). > FreezeMultiXactId still takes a separate 'cutoffs' arg, but it could use > pagefrz->cutoffs now. Yep, I forgot to add a commit to do this. Thanks! > HeapPageFreeze has two "trackers", for the "freeze" and "no freeze" > cases. heap_page_prune() needs to track both, until it decides whether > to freeze or not. But it doesn't make much sense that the caller > (lazy_scan_prune()) has to initialize both, and has to choose which of > the values to use after the call depending on whether heap_page_prune() > froze or not. The two trackers should be just heap_page_prune()'s > internal business. > > HeapPageFreeze is a bit confusing in general, as it's both an input and > an output to heap_page_prune(). Not sure what exactly to do there, but I > feel that we should make heap_page_prune()'s interface more clear. > Perhaps move the output fields to PruneResult. Great point. Perhaps I just add NewRelfrozenXid and NewRelminMxid to PruneResult (and call it PruneFreezeResult) and then make VacuumCutoffs an optional argument to heap_page_prune() (used by vacuum and not on-access pruning). Then I eliminate HeapPageFreeze as a parameter altogether. > Let's rename heap_page_prune() to heap_page_prune_and_freeze(), as > freezing is now an integral part of the function. And mention it in the > function comment, too. Agreed. Will do in the next version. I want to get some consensus on what to do with xl_heap_prune before going on my rebase journey with these 15 patches. > In heap_prune_chain: > > > * Tuple visibility information is provided in presult->htsv. > > Not this patch's fault directly, but it's not immediate clear what "is > provided" means here. Does the caller provide it, or does the function > set it, ie. is it an input or output argument? Looking at the code, it's > an input, but now it looks a bit weird that an input argument is called > 'presult'. So, htsv is a member of PruneResult on master because heap_page_prune() populates PruneResult->htsv for use in lazy_scan_prune(). heap_prune_chain() doesn't have access to PruneResult on master. Once I move PruneResult to being populated both by heap_page_prune() and heap_prune_chain(), it gets more confusing. htsv is always populated in heap_page_prune(), but it is not until later patches in the set that I stop accessing it in lazy_scan_prune(). Once I do so, I move htsv from PruneResult into PruneState -- which fixes the heap_prune_chain() confusion. So, only intermediate commits in the set have PruneResult->htsv used in heap_prune_chain(). The end state is that heap_prune_chain() accesses PruneState->htsv. However, I can document how it is used more clearly in the function comment in the intermediate commits. Or, I can simply leave htsv as a separate input argument to heap_prune_chain() in the intermediate commits. - Melanie
pgsql-hackers by date: