Re: Combine Prune and Freeze records emitted by vacuum - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Combine Prune and Freeze records emitted by vacuum |
Date | |
Msg-id | CAAKRu_aiOvy47jU2AjSp0vk2zL1S3ih8YaGJ5-6swH4VpXGKfA@mail.gmail.com Whole thread Raw |
In response to | Re: Combine Prune and Freeze records emitted by vacuum (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: Combine Prune and Freeze records emitted by vacuum
|
List | pgsql-hackers |
On Thu, Mar 28, 2024 at 4:49 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > On 28/03/2024 03:53, Melanie Plageman wrote: > > On Thu, Mar 28, 2024 at 01:04:04AM +0200, Heikki Linnakangas wrote: > >> One change with this is that live_tuples and many of the other fields are > >> now again updated, even if the caller doesn't need them. It was hard to skip > >> them in a way that would save any cycles, with the other refactorings. > > > > I am worried we are writing checks that are going to have to come out of > > SELECT queries' bank accounts, but I'll do some benchmarking when we're > > all done with major refactoring. > > Sounds good, thanks. Actually, after having looked at it again, there really are only a few more increments of various counters, the memory consumed by revisit, and the additional setting of offsets in marked. I think a few carefully constructed queries which do on-access pruning could test the impact of this (as opposed to a bigger benchmarking endeavor). I also wonder if there would be any actual impact of marking the various record_*() routines inline. > >> @@ -728,10 +832,12 @@ htsv_get_valid_status(int status) > >> * DEAD, our visibility test is just too coarse to detect it. > >> * > >> * In general, pruning must never leave behind a DEAD tuple that still has > >> - * tuple storage. VACUUM isn't prepared to deal with that case. That's why > >> + * tuple storage. VACUUM isn't prepared to deal with that case (FIXME: it no longer cares, right?). > >> + * That's why > >> * VACUUM prunes the same heap page a second time (without dropping its lock > >> * in the interim) when it sees a newly DEAD tuple that we initially saw as > >> - * in-progress. Retrying pruning like this can only happen when an inserting > >> + * in-progress (FIXME: Really? Does it still do that?). > > > > Yea, I think all that is no longer true. I missed this comment back > > then. > > Committed a patch to remove it. > > >> @@ -981,7 +1069,18 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, > >> * RECENTLY_DEAD tuples just in case there's a DEAD one after them; > >> * but we can't advance past anything else. We have to make sure that > >> * we don't miss any DEAD tuples, since DEAD tuples that still have > >> - * tuple storage after pruning will confuse VACUUM. > >> + * tuple storage after pruning will confuse VACUUM (FIXME: not anymore > >> + * I think?). > > > > Meaning, it won't confuse vacuum anymore or there won't be DEAD tuples > > with storage after pruning anymore? > > I meant that it won't confuse VACUUM anymore. lazy_scan_prune() doesn't > loop through the items on the page checking their visibility anymore. > > Hmm, one confusion remains though: In the 2nd phase of vacuum, we remove > all the dead line pointers that we have now removed from the indexes. > When we do that, we assume them all to be dead line pointers, without > storage, rather than normal tuples that happen to be HEAPTUPLE_DEAD. So > it's important that if pruning would leave behind HEAPTUPLE_DEAD tuples, > they are not included in 'deadoffsets'. It seems like master only adds items it is marking LP_DEAD to deadoffsets. And things marked LP_DEAD have lp_len set to 0. > In any case, let's just make sure that pruning doesn't leave > HEAPTUPLE_DEAD tuples. There's no reason it should. Maybe worth adding an assert to static void heap_prune_record_unchanged_lp_dead(ItemId itemid, PruneState *prstate, OffsetNumber offnum) { ... Assert(!ItemIdHasStorage(itemid)); prstate->deadoffsets[prstate->lpdead_items++] = offnum; } By the way, I wasn't quite sure about the purpose of heap_prune_record_unchanged_lp_dead(). It is called in heap_prune_chain() in a place where we didn't add things to deadoffsets before, no? /* * Likewise, a dead line pointer can't be part of the chain. (We * already eliminated the case of dead root tuple outside this * function.) */ if (ItemIdIsDead(lp)) { /* * If the caller set PRUNE_DO_MARK_UNUSED_NOW, we can set dead * line pointers LP_UNUSED now. */ if (unlikely(prstate->actions & PRUNE_DO_MARK_UNUSED_NOW)) heap_prune_record_unused(prstate, offnum, false); else heap_prune_record_unchanged_lp_dead(lp, prstate, offnum); break; } > >> @@ -1224,10 +1327,21 @@ heap_prune_record_live_or_recently_dead(Page page, PruneState *prstate, OffsetNu > >> * ensure the math works out. The assumption that the transaction will > >> * commit and update the counters after we report is a bit shaky; but it > >> * is what acquire_sample_rows() does, so we do the same to be consistent. > >> + * > >> + * HEAPTUPLE_DEAD are handled by the other heap_prune_record_*() > >> + * subroutines. They don't count dead items like acquire_sample_rows() > >> + * does, because we assume that all dead items will become LP_UNUSED > >> + * before VACUUM finishes. This difference is only superficial. VACUUM > >> + * effectively agrees with ANALYZE about DEAD items, in the end. VACUUM > >> + * won't remember LP_DEAD items, but only because they're not supposed to > >> + * be left behind when it is done. (Cases where we bypass index vacuuming > >> + * will violate this optimistic assumption, but the overall impact of that > >> + * should be negligible.) FIXME: I don't understand that last sentence in > >> + * parens. It was copied from elsewhere. > > > > If we bypass index vacuuming, there will be LP_DEAD items left behind > > when we are done because we didn't do index vacuuming and then reaping > > of those dead items. All of these comments are kind of a copypasta, > > though. > > Ah, gotcha, makes sense now. I didn't remember that we sometimes by pass > index vacuuming if there are very few dead items. I thought this talked > about the case that there are no indexes, but that case is OK. These comments could use another pass. I had added some extra (probably redundant) content when I thought I was refactoring it a certain way and then changed my mind. Attached is a diff with some ideas I had for a bit of code simplification. Are you working on cleaning this patch up or should I pick it up? I wonder if it makes sense to move some of the changes to heap_prune_chain() with setting things in marked/revisited to the start of the patch set (to be committed first). - Melanie
Attachment
pgsql-hackers by date: