Thread: effective_multixact_freeze_max_age issue
The effective_multixact_freeze_max_age mechanism added by commit 53bb309d2d forces aggressive VACUUMs to take place earlier, as protection against wraparound of the MultiXact member space SLRU. There was also a follow-up bugfix several years later -- commit 6bda2af039 -- which made sure that the MXID-wise cutoff used to determine which MXIDs to freeze in vacuumlazy.c could never exceed oldestMxact (VACUUM generally cannot freeze MultiXacts that are still seen as running by somebody according to oldestMxact).

I would like to talk about making the effective_multixact_freeze_max_age stuff more robust, particularly in the presence of a long-held snapshot that holds things up even as SLRU space for MultiXact members dwindles. I have to admit that I always found this part of vacuum_set_xid_limits() confusing.

I suspect that the problem has something to do with how we calculate vacuumlazy.c's multiXactCutoff (as well as FreezeLimit): vacuum_set_xid_limits() just subtracts a freezemin value from GetOldestMultiXactId(). This is confusingly similar (though different in important ways) to the handling for other related cutoffs that happens nearby. In particular, we start from ReadNextMultiXactId() (not from GetOldestMultiXactId()) for the cutoff that determines if the VACUUM is going to be aggressive.

I think that this can be fixed -- see the attached patch. Of course, it wouldn't be safe to allow vacuum_set_xid_limits() to hand off a multiXactCutoff to vacuumlazy.c that is (for whatever reason) less than GetOldestMultiXactId()/oldestMxact (the bug fixed by 6bda2af039 involved just such a scenario). But that doesn't seem like much of a problem to me. We can just handle it directly, as needed.
The attached patch handles it as follows:

    /* Compute multiXactCutoff, being careful to generate a valid value */
    *multiXactCutoff = nextMXID - mxid_freezemin;
    if (*multiXactCutoff < FirstMultiXactId)
        *multiXactCutoff = FirstMultiXactId;
    /* multiXactCutoff must always be <= oldestMxact */
    if (MultiXactIdPrecedes(*oldestMxact, *multiXactCutoff))
        *multiXactCutoff = *oldestMxact;

That is, we only need to make sure that the "multiXactCutoff must always be <= oldestMxact" invariant holds once, by checking for it directly. The same thing happens with OldestXmin/FreezeLimit. That seems like a simpler foundation. It's also a lot more logical. Why should the cutoff for freezing be held back by a long running transaction, except to the extent that it is strictly necessary to do so to avoid wrong answers (wrong answers seen by the long running transaction)?

This allows us to simplify the code that issues a WARNING about oldestMxact/OldestXmin inside vacuum_set_xid_limits(). Why not actually test oldestMxact/OldestXmin directly, without worrying about the limits (multiXactCutoff/FreezeLimit)? That also seems more logical; there is more to be concerned about than freezing being blocked when OldestXmin gets very old. Though we still rely on the autovacuum_freeze_max_age GUC to represent "a wildly unreasonable number of XIDs for OldestXmin to be held back by", just because that's still convenient.

--
Peter Geoghegan
Attachment
On Tue, Aug 2, 2022 at 4:12 PM Peter Geoghegan <pg@bowt.ie> wrote:
> That is, we only need to make sure that the "multiXactCutoff must
> always be <= oldestMxact" invariant holds once, by checking for it
> directly. The same thing happens with OldestXmin/FreezeLimit. That
> seems like a simpler foundation. It's also a lot more logical. Why
> should the cutoff for freezing be held back by a long running
> transaction, except to the extent that it is strictly necessary to do
> so to avoid wrong answers (wrong answers seen by the long running
> transaction)?

Anybody have any input on this? I'm hoping that this can be committed soon.

ISTM that the way that we currently derive FreezeLimit (by starting with OldestXmin rather than starting with the same ReadNextTransactionId() value that gets used for the aggressiveness cutoffs) is just an accident of history. The "Routine vacuuming" docs already describe this behavior in terms that sound closer to the behavior with the patch than the actual current behavior:

"When VACUUM scans every page in the table that is not already all-frozen, it should set age(relfrozenxid) to a value just a little more than the vacuum_freeze_min_age setting that was used (more by the number of transactions started since the VACUUM started)"

Besides, why should there be an idiosyncratic definition of "age" that is only used with vacuum_freeze_min_age/vacuum_multixact_freeze_min_age? Why would anyone want to do less freezing in the presence of a long running transaction? It simply makes no sense (unless we're forced to do so to preserve basic guarantees needed for correctness, such as the "FreezeLimit <= OldestXmin" invariant).

--
Peter Geoghegan
On Sun, Aug 28, 2022 at 11:38:09AM -0700, Peter Geoghegan wrote:
> On Tue, Aug 2, 2022 at 4:12 PM Peter Geoghegan <pg@bowt.ie> wrote:
>> That is, we only need to make sure that the "multiXactCutoff must
>> always be <= oldestMxact" invariant holds once, by checking for it
>> directly. The same thing happens with OldestXmin/FreezeLimit. That
>> seems like a simpler foundation. It's also a lot more logical. Why
>> should the cutoff for freezing be held back by a long running
>> transaction, except to the extent that it is strictly necessary to do
>> so to avoid wrong answers (wrong answers seen by the long running
>> transaction)?
>
> Anybody have any input on this? I'm hoping that this can be committed soon.
>
> ISTM that the way that we currently derive FreezeLimit (by starting
> with OldestXmin rather than starting with the same
> ReadNextTransactionId() value that gets used for the aggressiveness
> cutoffs) is just an accident of history. The "Routine vacuuming" docs
> already describe this behavior in terms that sound closer to the
> behavior with the patch than the actual current behavior:
>
> "When VACUUM scans every page in the table that is not already
> all-frozen, it should set age(relfrozenxid) to a value just a little
> more than the vacuum_freeze_min_age setting that was used (more by the
> number of transactions started since the VACUUM started)"

The idea seems sound to me, and IMO your patch simplifies things nicely, which might be reason enough to proceed with it. However, I'm struggling to understand when this change would help much in practice. IIUC it will cause vacuums to freeze a bit more, but outside of extreme cases (maybe when vacuum_freeze_min_age is set very high and there are long-running transactions), ISTM it might not have tremendously much impact. Is the intent to create some sort of long-term behavior change for autovacuum, or is this mostly aimed towards consistency among the cutoff calculations?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Sun, 28 Aug 2022 at 20:38, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Tue, Aug 2, 2022 at 4:12 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > That is, we only need to make sure that the "multiXactCutoff must
> > always be <= oldestMxact" invariant holds once, by checking for it
> > directly. The same thing happens with OldestXmin/FreezeLimit. That
> > seems like a simpler foundation. It's also a lot more logical. Why
> > should the cutoff for freezing be held back by a long running
> > transaction, except to the extent that it is strictly necessary to do
> > so to avoid wrong answers (wrong answers seen by the long running
> > transaction)?
>
> Anybody have any input on this? I'm hoping that this can be committed soon.

Apart from the message that this behaviour is changing, I'd prefer some more description in the commit message as to why this needs changing.

Then, on to the patch itself:

> + * XXX We don't do push back oldestMxact here, which is not ideal

Do you intend to commit this marker, or is this leftover from the development process?

> + if (*multiXactCutoff < FirstMultiXactId)
[...]
> + if (safeOldestMxact < FirstMultiXactId)
[...]
> + if (aggressiveMXIDCutoff < FirstMultiXactId)

I prefer !TransactionId/MultiXactIdIsValid() over '< First [MultiXact/Transaction]Id', even though it is the same in functionality, because it clarifies the problem we're trying to solve. I understand that the use of < is pre-existing, but since we're touching this code shouldn't we try to get this new code up to current standards?

Kind regards,

Matthias van de Meent
On Sun, Aug 28, 2022 at 4:14 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> The idea seems sound to me, and IMO your patch simplifies things nicely,
> which might be reason enough to proceed with it.

It is primarily a case of making things simpler. Why would it ever make sense to interpret age differently in the presence of a long running transaction, though only for the FreezeLimit/MultiXactCutoff cutoff calculation? And not for the closely related freeze_table_age/multixact_freeze_table_age calculation? It's hard to imagine that that was ever a deliberate choice.

vacuum_set_xid_limits() didn't contain the logic for determining if its caller's VACUUM should be an aggressive VACUUM until quite recently. Postgres 15 commit efa4a9462a put the logic for determining aggressiveness right next to the logic for determining FreezeLimit, which made the inconsistency much more noticeable. It is easy to believe that this was really just an oversight, all along.

> However, I'm struggling
> to understand when this change would help much in practice. IIUC it will
> cause vacuums to freeze a bit more, but outside of extreme cases (maybe
> when vacuum_freeze_min_age is set very high and there are long-running
> transactions), ISTM it might not have tremendously much impact. Is the
> intent to create some sort of long-term behavior change for autovacuum, or
> is this mostly aimed towards consistency among the cutoff calculations?

I agree that this will have only a negligible impact on the majority (perhaps even the vast majority) of applications. The primary justification for this patch (simplification) seems sufficient, all things considered.

Even still, it's possible that it will help in extreme cases -- cases with pathological performance issues, particularly those involving MultiXacts.
We already set FreezeLimit to the most aggressive possible value of OldestXmin when OldestXmin has itself already crossed a quasi-arbitrary XID-age threshold of autovacuum_freeze_max_age XIDs (i.e. when OldestXmin < safeLimit), with analogous rules for MultiXactCutoff/OldestMxact. Consequently, the way that we set the cutoffs for freezing in pathological cases where (say) there is a leaked replication slot will see a sharp discontinuity in how FreezeLimit is set, within and across tables. And for what?

Initially, these pathological cases will end up using exactly the same FreezeLimit for every VACUUM against every table (assuming that we're using a system-wide freeze_min_age setting) -- every VACUUM operation will use a FreezeLimit of `OldestXmin - vacuum_freeze_min_age`. At a certain point that'll suddenly flip -- now every VACUUM operation will use a FreezeLimit of `OldestXmin`. OTOH with the patch they'd all have a FreezeLimit that is tied to the time that each VACUUM started -- which is exactly the FreezeLimit behavior that we'd get if there was no leaked replication slot (at least until OldestXmin finally attains an age of vacuum_freeze_min_age, when it finally becomes unavoidable, even with the patch).

There is something to be said for preserving the "natural diversity" of the relfrozenxid values among tables, too. The FreezeLimit we use is (at least for now) almost always going to be very close to (if not exactly) the same value as the NewFrozenXid value that we set relfrozenxid to at the end of VACUUM (at least in larger tables). Without the patch, a once-off problem with a leaked replication slot can accidentally result in lasting problems where all of the largest tables get their antiwraparound autovacuums at exactly the same time. The current behavior increases the risk that we'll accidentally "synchronize" the relfrozenxid values for large tables that had an antiwraparound vacuum during the time when OldestXmin was held back.

Needlessly using the same FreezeLimit across each VACUUM operation risks disrupting the natural ebb and flow of things. It's hard to say how much of a problem that is in the real world. But why take any chances?

--
Peter Geoghegan
On Mon, Aug 29, 2022 at 10:25:50AM -0700, Peter Geoghegan wrote:
> On Sun, Aug 28, 2022 at 4:14 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> The idea seems sound to me, and IMO your patch simplifies things nicely,
>> which might be reason enough to proceed with it.
>
> It is primarily a case of making things simpler. Why would it ever
> make sense to interpret age differently in the presence of a long
> running transaction, though only for the FreezeLimit/MultiXactCutoff
> cutoff calculation? And not for the closely related
> freeze_table_age/multixact_freeze_table_age calculation? It's hard to
> imagine that that was ever a deliberate choice.
>
> vacuum_set_xid_limits() didn't contain the logic for determining if
> its caller's VACUUM should be an aggressive VACUUM until quite
> recently. Postgres 15 commit efa4a9462a put the logic for determining
> aggressiveness right next to the logic for determining FreezeLimit,
> which made the inconsistency much more noticeable. It is easy to
> believe that this was really just an oversight, all along.

Agreed.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Mon, Aug 29, 2022 at 2:20 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> Apart from the message that this behaviour is changing, I'd prefer
> some more description in the commit message as to why this needs
> changing.

I usually only write a full commit message before posting a patch when it's a full patch series, where it can be helpful to be very explicit about how the parts fit together. The single line commit message is just a placeholder -- I'll definitely write a better one before commit.

> Then, on to the patch itself:
>
> > + * XXX We don't do push back oldestMxact here, which is not ideal
>
> Do you intend to commit this marker, or is this leftover from the
> development process?

Ordinarily I would never commit an XXX comment, and probably wouldn't even leave one in early revisions of patches that I post to the list. This is a special case, though -- it involves the "snapshot too old" feature, which has many similar XXX/FIXME/TODO comments. I think I might leave it like that when committing.

The background here is that the snapshot too old code still has lots of problems -- there is a FIXME comment that gives an overview of this in TransactionIdLimitedForOldSnapshots(). We're going to have to live with the fact that that feature isn't in good shape for the foreseeable future. I can only really work around it.

> > + if (*multiXactCutoff < FirstMultiXactId)
> [...]
> > + if (safeOldestMxact < FirstMultiXactId)
> [...]
> > + if (aggressiveMXIDCutoff < FirstMultiXactId)
>
> I prefer !TransactionId/MultiXactIdIsValid() over '< First
> [MultiXact/Transaction]Id', even though it is the same in
> functionality, because it clarifies the problem we're trying to solve.
> I understand that the use of < is pre-existing, but since we're
> touching this code shouldn't we try to get this new code up to current
> standards?

I agree in principle, but there are already 40+ other places that use the same idiom in places like multixact.c.
Perhaps you can propose a patch to change all of them at once, together?

--
Peter Geoghegan
On Mon, Aug 29, 2022 at 3:40 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Agreed.

Attached is v2, which cleans up the structure of vacuum_set_xid_limits() a bit more. The overall idea was to improve the overall flow/readability of the function by moving the WARNINGs into their own "code stanza", just after the logic for establishing freeze cutoffs and just before the logic for deciding on aggressiveness. That is now the more logical approach (group the stanzas by functionality), since we can't sensibly group the code based on whether it deals with XIDs or with Multis anymore (not since the function was taught to deal with the question of whether caller's VACUUM will be aggressive).

Going to push this in the next day or so.

I also removed some local variables that seemed to make the function a lot harder to follow in v2. Consider code like this:

-       freezemin = freeze_min_age;
-       if (freezemin < 0)
-               freezemin = vacuum_freeze_min_age;
-       freezemin = Min(freezemin, autovacuum_freeze_max_age / 2);
-       Assert(freezemin >= 0);
+       if (freeze_min_age < 0)
+               freeze_min_age = vacuum_freeze_min_age;
+       freeze_min_age = Min(freeze_min_age, autovacuum_freeze_max_age / 2);
+       Assert(freeze_min_age >= 0);

Why have this freezemin temp variable? Why not just use the freeze_min_age function parameter directly instead? That is a better representation of what's going on at the conceptual level. We now assign vacuum_freeze_min_age to the freeze_min_age arg (not to the freezemin variable) when our VACUUM caller passes us a value of -1 for that arg. -1 effectively means "whatever the value of the vacuum_freeze_min_age GUC is", which is clearer without the superfluous freezemin variable.

--
Peter Geoghegan
Attachment
On Tue, Aug 30, 2022 at 05:24:17PM -0700, Peter Geoghegan wrote:
> Attached is v2, which cleans up the structure of
> vacuum_set_xid_limits() a bit more. The overall idea was to improve
> the overall flow/readability of the function by moving the WARNINGs
> into their own "code stanza", just after the logic for establishing
> freeze cutoffs and just before the logic for deciding on
> aggressiveness. That is now the more logical approach (group the
> stanzas by functionality), since we can't sensibly group the code
> based on whether it deals with XIDs or with Multis anymore (not since
> the function was taught to deal with the question of whether caller's
> VACUUM will be aggressive).
>
> Going to push this in the next day or so.

LGTM

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Aug 30, 2022 at 8:56 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> LGTM

Pushed, thanks.

--
Peter Geoghegan
Hello!

On 31.08.2022 21:38, Peter Geoghegan wrote:
> On Tue, Aug 30, 2022 at 8:56 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> LGTM
>
> Pushed, thanks.

In this commit https://github.com/postgres/postgres/commit/c3ffa731a5f99c4361203015ce2219d209fea94c there are checks that oldestXmin and oldestMxact haven't fallen too far into the past. But the corresponding error messages also say something different, about the 'cutoff for freezing tuples', i.e. about checks for other variables: freezeLimit and multiXactCutoff. See:

https://github.com/postgres/postgres/commit/c3ffa731a5f99c4361203015ce2219d209fea94c?diff=split#diff-795a3938e3bed9884d426bd010670fe505580732df7d7012fad9edeb9df54badR1075

and

https://github.com/postgres/postgres/commit/c3ffa731a5f99c4361203015ce2219d209fea94c?diff=split#diff-795a3938e3bed9884d426bd010670fe505580732df7d7012fad9edeb9df54badR1080

It's interesting that prior to this commit, checks were made for the freeze limits, but the error messages were talking about oldestXmin and oldestMxact. Could you clarify this moment please? Would be very grateful.

As a variant, maybe split these checks for the freeze cutoffs and the oldest xmins for clarity? The attached patch tries to do this.

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment
On Tue, Oct 18, 2022 at 3:43 AM Anton A. Melnikov <aamelnikov@inbox.ru> wrote:
> It's interesting that prior to this commit, checks were made for the
> freeze limits, but the error messages were talking about oldestXmin
> and oldestMxact.

The problem really is that oldestXmin and oldestMxact are held back, though. While that can ultimately result in older FreezeLimit and MultiXactCutoff cutoffs in vacuumlazy.c, that's just one problem. Usually not the worst problem.

The term "removable cutoff" is how VACUUM VERBOSE reports OldestXmin. I think that it's good to use the same terminology here.

> Could you clarify this moment please? Would be very grateful.

While this WARNING is triggered when a threshold controlled by autovacuum_freeze_max_age is crossed, it's not just a problem with freezing. It's convenient to use autovacuum_freeze_max_age to represent "a wildly excessive number of XIDs for OldestXmin to be held back by, no matter what". In practice it is usually already a big problem when OldestXmin is held back by far fewer XIDs than that, but it's hard to reason about when exactly we need to consider that a problem. However, we can at least be 100% sure of real problems when OldestXmin age reaches autovacuum_freeze_max_age. There is no longer any doubt that we need to warn the user, since antiwraparound autovacuum cannot work as designed at that point. But the WARNING is nevertheless not primarily (or not exclusively) about not being able to freeze. It's also about not being able to remove bloat.

Freezing can be thought of as roughly the opposite process to removing dead tuples deleted by now committed transactions. OldestXmin is the cutoff both for removing dead tuples (which we want to get rid of immediately), and freezing live all-visible tuples (which we want to keep long term). FreezeLimit is usually 50 million XIDs before OldestXmin (the freeze_min_age default), but that's just how we implement lazy freezing, which is just an optimization.
> As a variant, maybe split these checks for the freeze cutoffs and the
> oldest xmins for clarity? The attached patch tries to do this.

I don't think that this is an improvement. For one thing the FreezeLimit cutoff will have been held back (relative to nextXID-wise table age) by more than the freeze_min_age setting for a long time before this WARNING is hit -- so we're not going to show the WARNING in most cases where freeze_min_age has been completely ignored (it must be ignored in extreme cases because FreezeLimit must always be <= OldestXmin). Also, the proposed new WARNING is only seen when we're bound to also see the existing OldestXmin WARNING already. Why have 2 WARNINGs about exactly the same problem?

--
Peter Geoghegan
Hello!

On 18.10.2022 20:56, Peter Geoghegan wrote:
> The term "removable cutoff" is how VACUUM VERBOSE reports OldestXmin.
> I think that it's good to use the same terminology here.

Thanks for the explanation! It was exactly this term that confused me at first. Sure, the same terminology makes everything easier to understand.

>> Could you clarify this moment please? Would be very grateful.
>
> While this WARNING is triggered when a threshold controlled by
> autovacuum_freeze_max_age is crossed, it's not just a problem with
> freezing. It's convenient to use autovacuum_freeze_max_age to
> represent "a wildly excessive number of XIDs for OldestXmin to be held
> back by, no matter what". In practice it is usually already a big
> problem when OldestXmin is held back by far fewer XIDs than that, but
> it's hard to reason about when exactly we need to consider that a
> problem. However, we can at least be 100% sure of real problems when
> OldestXmin age reaches autovacuum_freeze_max_age. There is no longer
> any doubt that we need to warn the user, since antiwraparound
> autovacuum cannot work as designed at that point. But the WARNING is
> nevertheless not primarily (or not exclusively) about not being able
> to freeze. It's also about not being able to remove bloat.
>
> Freezing can be thought of as roughly the opposite process to removing
> dead tuples deleted by now committed transactions. OldestXmin is the
> cutoff both for removing dead tuples (which we want to get rid of
> immediately), and freezing live all-visible tuples (which we want to
> keep long term). FreezeLimit is usually 50 million XIDs before
> OldestXmin (the freeze_min_age default), but that's just how we
> implement lazy freezing, which is just an optimization.

That's clear. Thanks a lot!

>> As a variant, maybe split these checks for the freeze cutoffs and the
>> oldest xmins for clarity? The attached patch tries to do this.
>
> I don't think that this is an improvement.
> For one thing the FreezeLimit cutoff will have been held back (relative
> to nextXID-wise table age) by more than the freeze_min_age setting for
> a long time before this WARNING is hit -- so we're not going to show
> the WARNING in most cases where freeze_min_age has been completely
> ignored (it must be ignored in extreme cases because FreezeLimit must
> always be <= OldestXmin). Also, the proposed new WARNING is only seen
> when we're bound to also see the existing OldestXmin WARNING already.
> Why have 2 WARNINGs about exactly the same problem?

I didn't understand this moment. If the FreezeLimit is always older than OldestXmin or equal to it, according to:

> FreezeLimit is usually 50 million XIDs before
> OldestXmin (the freeze_min_age default),

can't there be a situation like this?

                  ____________________________
                 |  autovacuum_freeze_max_age |
 past <__|_______|____________|_______________|__> future
    FreezeLimit safeOldestXmin oldestXmin  nextXID
         |____________________________________|
                   freeze_min_age

In that case the existing OldestXmin WARNING will not fire while the new one will. As the FreezeLimit is only an optimization, it's likely not a warning but a notice message before the OldestXmin WARNING and possible real problems in the future. Maybe it could be useful in such a form?

With best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Mon, Oct 24, 2022 at 1:18 AM Anton A. Melnikov <aamelnikov@inbox.ru> wrote:
> > Also, the proposed new WARNING is only seen when we're
> > bound to also see the existing OldestXmin WARNING already. Why have 2
> > WARNINGs about exactly the same problem?
>
> I didn't understand this moment.
>
> If the FreezeLimit is always older than OldestXmin or equal to it according to:
>
> > FreezeLimit is usually 50 million XIDs before
> > OldestXmin (the freeze_min_age default),
>
> can't there be a situation like this?

I don't understand what you mean. FreezeLimit *isn't* always exactly 50 million XIDs before OldestXmin -- not anymore. In fact that's the main benefit of this work (commit c3ffa731). That detail has changed (and changed for the better). Though it will only be noticeable to users when an old snapshot holds back OldestXmin by a significant amount.

It is true that we must always respect the classic "FreezeLimit <= OldestXmin" invariant. So naturally vacuum_set_xid_limits() continues to make sure that the invariant holds in all cases, if necessary by holding back FreezeLimit:

+       /* freezeLimit must always be <= oldestXmin */
+       if (TransactionIdPrecedes(*oldestXmin, *freezeLimit))
+               *freezeLimit = *oldestXmin;

When we *don't* have to do this (very typical when vacuum_freeze_min_age is set to its default of 50 million), then FreezeLimit won't be affected by old snapshots.

Overall, FreezeLimit must either be:

1. *Exactly* freeze_min_age XIDs before nextXID (note that it is nextXID instead of OldestXmin here, as of commit c3ffa731).

or:

2. *Exactly* OldestXmin.

FreezeLimit must always be either exactly 1 or exactly 2, regardless of anything else (like long running transactions/snapshots). Importantly, we still never interpret freeze_min_age as more than "autovacuum_freeze_max_age / 2" when determining FreezeLimit. While the safeOldestXmin cutoff is "nextXID - autovacuum_freeze_max_age".
Before commit c3ffa731, FreezeLimit would sometimes be interpreted as exactly OldestXmin -- it would be set to OldestXmin directly when the WARNING was given. But now we get smoother behavior, without any big discontinuities in how FreezeLimit is set over time when OldestXmin is held back generally.

--
Peter Geoghegan
On Mon, Oct 24, 2022 at 7:56 AM Peter Geoghegan <pg@bowt.ie> wrote:
> I don't understand what you mean. FreezeLimit *isn't* always exactly
> 50 million XIDs before OldestXmin -- not anymore. In fact that's the
> main benefit of this work (commit c3ffa731). That detail has changed
> (and changed for the better). Though it will only be noticeable to
> users when an old snapshot holds back OldestXmin by a significant
> amount.

I meant that the new behavior will only have a noticeable impact when OldestXmin is significantly earlier than nextXID. Most of the time there won't be any old snapshots, which means that there will only be a negligible difference between OldestXmin and nextXID when things are operating normally (OldestXmin will still probably be a tiny bit earlier than nextXID, but not enough to matter). And so most of the time the difference between the old approach and the new approach will be completely negligible; 50 million XIDs is usually a huge number (it is usually far far larger than the difference between OldestXmin and nextXID).

BTW, I have some sympathy for the argument that the WARNINGs that we have here may not be enough -- we only warn when the situation is already extremely serious. I just don't see any reason why that problem should be treated as a regression caused by commit c3ffa731. The WARNINGs may be inadequate, but that isn't new.

--
Peter Geoghegan
Hi, Peter!

Sorry! For some time I forgot about this thread, and forgot to thank you for your answer. Thereby it is now clear to me that this patch allows the autovacuum to win some time between OldestXmin and nextXID that could not be used before. I think it may be especially useful for high-load applications.

Digging deeper, I found some inconsistency between the current docs and the real behavior, and would like to bring this to your attention. Now the docs say that an aggressive vacuum scan will occur for any table whose multixact-age is greater than autovacuum_multixact_freeze_max_age. But really vacuum_get_cutoffs() will return true when multixact-age is greater than or equal to autovacuum_multixact_freeze_max_age, if relminmxid is not equal to one. If relminmxid is equal to one, then vacuum_get_cutoffs() returns true even when multixact-age is less by one than autovacuum_multixact_freeze_max_age. For instance, if relminmxid = 1 and multixact_freeze_table_age = 100, vacuum will start to be aggressive from the age of 99 (when ReadNextMultiXactId() = 100).

The attached patch attempts to fix this and tries to use semantics like in the docs. A similar fix was made for common xacts too. An additional check for relminxid allows disabling aggressive scan entirely if it is invalid. But I'm not sure if such a check is needed here. Please take it into account.

With kind regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company