Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations |
Date | |
Msg-id | CAH2-Wzn6bGJGfOy3zSTJicKLw99PHJeSOQBOViKjSCinaxUKDQ@mail.gmail.com |
In response to | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations |
List | pgsql-hackers |
On Sun, Feb 20, 2022 at 12:27 PM Peter Geoghegan <pg@bowt.ie> wrote:
> You've given me a lot of high quality feedback on all of this, which
> I'll work through soon. It's hard to get the balance right here, but
> it's made much easier by this kind of feedback.

Attached is v9. Lots of changes. Highlights:

* Much improved 0001 ("loosen coupling" dynamic relfrozenxid tracking patch). Some of the improvements are due to recent feedback from Robert.

* Much improved 0002 ("Make page-level characteristics drive freezing" patch). Whole new approach to the implementation, though the same algorithm as before.

* No more FSM patch -- that was totally separate work, that I shouldn't have attached to this project.

* There are 2 new patches (these are now 0003 and 0004), both of which are concerned with allowing non-aggressive VACUUM to consistently advance relfrozenxid. I think that 0003 makes sense on general principle, but I'm much less sure about 0004. These aren't too important.

While working on the new approach to freezing taken by v9-0002, I had some insight about the issues that Robert raised around 0001, too. I wasn't expecting that to happen.

0002 makes page-level freezing a first class thing. heap_prepare_freeze_tuple now has some (limited) knowledge of how this works. heap_prepare_freeze_tuple's cutoff_xid argument is now always the VACUUM caller's OldestXmin (not its FreezeLimit, as before). We still have to pass FreezeLimit to heap_prepare_freeze_tuple, which helps us to respect FreezeLimit as a backstop, and so now it's passed via the new backstop_cutoff_xid argument instead. Whenever we opt to "freeze a page", the new page-level algorithm *always* uses the most recent possible XID and MXID values (OldestXmin and OldestMxact) to decide what XIDs/MXIDs need to be replaced. That might sound like it'd be too much, but it only applies to those pages that we actually decide to freeze (since page-level characteristics drive everything now).
FreezeLimit is only one way of triggering that now (and one of the least interesting and rarest).

0002 also adds an alternative set of relfrozenxid/relminmxid tracker variables, to make the "don't freeze the page" path within lazy_scan_prune simpler (if you don't want to freeze the page, then use the set of tracker variables that go with that choice, which heap_prepare_freeze_tuple knows about and helps with).

With page-level freezing, lazy_scan_prune wants to make a decision about the page as a whole, at the last minute, after all heap_prepare_freeze_tuple calls have already been made. So I think that heap_prepare_freeze_tuple needs to know about that aspect of lazy_scan_prune's behavior.

When we *don't* want to freeze the page, we more or less need everything related to freezing inside lazy_scan_prune to behave like lazy_scan_noprune, which never freezes the page (that's mostly the point of lazy_scan_noprune). And that's almost what we actually do -- heap_prepare_freeze_tuple now outsources maintenance of this alternative set of "don't freeze the page" relfrozenxid/relminmxid tracker variables to its sibling function, heap_tuple_needs_freeze. That is the same function that lazy_scan_noprune itself actually calls.

Now back to Robert's feedback on 0001, which had very complicated comments in the last version. This approach seems to make the "being versus becoming" or "going to freeze versus not going to freeze" distinctions much clearer. This is less true if you assume that 0002 won't be committed but 0001 will be. Even if that happens with Postgres 15, I have to imagine that adding something like 0002 must be the real goal, long term. Without 0002, the value from 0001 is far more limited. You need both together to get the virtuous cycle I've described.
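To make the two sets of tracker variables a bit more concrete, here is a toy sketch in C. It is not code from the patch: ToyXid, ToyCutoffs and toy_scan_page are hypothetical names, XIDs are plain integers compared with < (no wraparound handling), MultiXacts are ignored, and the want_freeze flag is only a stand-in for whatever page-level criteria 0002 actually applies. The point is just that both candidate relfrozenxid values are maintained while the tuples are examined, and the page-level decision made at the end picks which one to keep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for TransactionId: plain integers, no wraparound handling. */
typedef uint32_t ToyXid;

typedef struct ToyCutoffs
{
    ToyXid oldest_xmin;   /* cutoff used whenever the page does get frozen */
    ToyXid freeze_limit;  /* backstop: any older XID forces freezing */
} ToyCutoffs;

/*
 * Examine the XIDs on one page and decide, for the page as a whole, whether
 * to freeze it.  *new_relfrozenxid is the running "oldest XID that will
 * remain in the table" tracker; it is updated to match the choice made.
 * Returns true if the page was frozen.
 */
static bool
toy_scan_page(const ToyXid *xids, int nxids, const ToyCutoffs *cutoffs,
              bool want_freeze, ToyXid *new_relfrozenxid)
{
    ToyXid if_frozen = *new_relfrozenxid;      /* tracker for "freeze the page" */
    ToyXid if_not_frozen = *new_relfrozenxid;  /* tracker for "don't freeze" */
    bool   backstop = false;

    assert(nxids >= 0);

    for (int i = 0; i < nxids; i++)
    {
        ToyXid xid = xids[i];

        /* If the page is left alone, every XID on it remains. */
        if (xid < if_not_frozen)
            if_not_frozen = xid;

        /* If the page is frozen, only XIDs >= oldest_xmin remain. */
        if (xid >= cutoffs->oldest_xmin && xid < if_frozen)
            if_frozen = xid;

        /* An XID older than freeze_limit makes freezing mandatory. */
        if (xid < cutoffs->freeze_limit)
            backstop = true;
    }

    /* The decision is made last, for the page as a whole. */
    bool freeze = want_freeze || backstop;

    *new_relfrozenxid = freeze ? if_frozen : if_not_frozen;
    return freeze;
}
```

With oldest_xmin = 1000, freeze_limit = 500 and a page holding XIDs 700, 950 and 1200, declining to freeze drags the tracker back to 700, while freezing leaves it at 1000. A page holding XID 400 is frozen regardless of want_freeze, because the backstop applies.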
The approach with always using OldestXmin as cutoff_xid and OldestMxact as our cutoff_multi makes a lot of sense to me, in part because I think that it might well cut down on the tendency of VACUUM to allocate new MultiXacts in order to be able to freeze old ones. AFAICT the only reason that heap_prepare_freeze_tuple does that is because it has no flexibility on FreezeLimit and MultiXactCutoff. These are derived from vacuum_freeze_min_age and vacuum_multixact_freeze_min_age, respectively, and so they're two independent though fairly meaningless cutoffs. On the other hand, OldestXmin and OldestMxact are not independent in the same way. We get both of them at the same time and the same place, in vacuum_set_xid_limits. OldestMxact really is very close to OldestXmin -- only the units differ.

It seems that heap_prepare_freeze_tuple allocates new MXIDs (when freezing old ones) in large part so it can NOT freeze XIDs that it would have been useful (and much cheaper) to remove anyway. On HEAD, FreezeMultiXactId() doesn't get passed down the VACUUM operation's OldestXmin at all (it actually just gets FreezeLimit passed as its cutoff_xid argument). It cannot possibly recognize any of this for itself.

Does that theory about MultiXacts sound plausible? I'm not claiming that the patch makes it impossible that FreezeMultiXactId() will have to allocate a new MultiXact to freeze during VACUUM -- the freeze-the-dead isolation tests already show that that's not true. I just think that page-level freezing based on page characteristics with OldestXmin and OldestMxact (not FreezeLimit and MultiXactCutoff) cutoffs might make it a lot less likely in practice. OldestXmin and OldestMxact map to the same wall clock time, more or less -- that seems like it might be an important distinction, independent of everything else.

Thanks
--
Peter Geoghegan