Re: Eagerly scan all-visible pages to amortize aggressive vacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Eagerly scan all-visible pages to amortize aggressive vacuum |
Date | |
Msg-id | CAD21AoBTWQ6AiaJ2GZMfPvDg86_2_hP1H7pXctggH3UoBrgeBg@mail.gmail.com Whole thread Raw |
In response to | Re: Eagerly scan all-visible pages to amortize aggressive vacuum (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On Mon, Jan 27, 2025 at 9:45 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Fri, Jan 24, 2025 at 3:43 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Fri, Jan 24, 2025 at 3:02 PM Melanie Plageman > > <melanieplageman@gmail.com> wrote: > > > I think you're right. I would go with a percentage. I don't see many > > > other GUCs that are percents. What would you call it? Perhaps > > > vacuum_eager_scan_fail_threshold? The % of the blocks in the table > > > vacuum may scan and fail to freeze. > > > > > > There is an issue I see with making it a percentage. The current > > > default vacuum_eager_scan_max_fails is 128 out of 4096. That means you > > > are allowed to scan about 3% of the blocks in the table even if you > > > fail to freeze every one. I don't think there are very many reasonable > > > values above 256, personally. So, it might be weird to add a > > > percentage GUC value with only very low acceptable values. Part of > > > this is that we are not making the success cap configurable, so that > > > means that you might have lots of extra I/O if you are both failing > > > and succeeding. Someone configuring this GUC might think this is > > > controlling the amount of extra I/O they are incurring. > > > > Why would it be unreasonable to set this value to 25% or even 100%? I > > grant that it doesn't sound like the most prudent possible value, but > > if I'm willing to fritter away my VACUUM resources to have the best > > possible chance of eagerly freezing stuff, isn't that up to me? I > > think setting this value to 100% would be WAY less damaging than > > work_mem='1TB' or autovacuum_naptime='7d', both of which are allowed. > > In fact, setting this to 100% could theoretically have no negative > > consequences at all, if it so happens that no freeze failures occur. > > Couldn't it even be a win, if freeze failures are common but > > minimizing the impact of aggressive vacuum is of overwhelming > > importance? > > > > Looking at ConfigureNamesReal[], some existing words we use to talk > > about real-valued GUCs include: cost, fraction, factor, bias, > > multiplier, target, rate. Of those, I like "fraction" and "rate" best > > here, because they imply a range of 0-1 rather than anything broader. > > vacuum_max_eager_freeze_failure_rate? Kinda long... > > attached v11 uses a fraction with this name. It follows the > conventions and I find it descriptive. > > Changing the configuration to a fraction does mean that quite a few of > the comments are different in this version. I think the description of > the guc in > config.sgml could use some review in particular. I tried to make it > clear that the percentage is not the maximum number of blocks that > could be eager scanned, because you may also eagerly scan blocks and > succeed. > > Other note: I noticed AutoVacOpts that are floating point numbers (like > vacuum_cost_delay) are float8s but their associated GUCs (like > autovacuum_vacuum_cost_delay) are doubles. These are going to be equivalent, > but I wanted to note the inconsistency in case I was missing something. > > > > """ > > > <command>VACUUM</command> typically scans pages that have been > > > modified since the last vacuum. However, some all-visible but not > > > all-frozen pages are eagerly scanned to try and freeze them. Note that > > > the <structfield>relfrozenxid</structfield> can only be advanced when > > > every page of the table that might contain unfrozen XIDs is scanned. > > > """ > > > > "to try and freeze them" seems a little awkward to me. The rest seems fine. > > > > "in the hope that we can freeze them"? > > > > That still doesn't seem great to me. Maybe it's worth expending more > > text here, not sure. > > Given this and Treat's input from a different mail: > > > Yeah, this feels like a net negative. As I see it, we're trying to > > connect three not-obviously related ideas for users who are trying to > > understand how this system works especially with regards to > > relfrozenxid advanced (based on the section of the docs we are in), > > loosely > > 1. you've probably heard vacuum mainly scans modified pages for cleanup > > 2. you might notice it also scans non-modified pages, because > > sometimes it wants to freeze stuff > > 3. you might think, if we are freezing stuff, why doesn't relfrozenxid advance? > > > > So that middle bit is trying to act as glue that pulls this all > > together. I thought the previous version was closer, with Haas's > > feedback I might go with something more like this: > > I've mostly used his suggested wording in attached v11. > > > > I rewrote the comment using some of your input and trying to clarify > > > my initial point about opportunistic vs guaranteed freezing. > > > > > > """ > > > If FreezeLimit has not advanced past the relfrozenxid or > > > MultiXactCutoff has not advanced past relminmxid, we are unlikely to > > > be able to freeze anything new. > > > Tuples with XIDs older than OldestXmin or MXIDs older than OldestMxact > > > are technically freezable, but we won't freeze them unless some other > > > criteria for opportunistic freezing is met. > > > > It's not super clear to me what "some other criteria" means here, but > > maybe it's fine. > > Given this thought and Treat's feedback below > > > Yeah, I also think the "some other criteria" part seems poor, although > > tbh this all feels like a net negative to the previous wording or > > Haas's suggested change, like you're describing what the code below it > > does rather than what the intentions of the code are. For example, I > > like the "freeze horizon" language vs explicit FreezeLimit since the > > code could change even if the goal doesn't. But in lue of going back > > to that wording (modulo your 1 line explanation in your email), if I > > were to attempt to split the difference: > > > > We only want to > > enable eager scanning if we are likely to be able to freeze some of > > the pages in the relation, which is unlikely if FreezeLimit has not > > advanced past > > relfrozenxid or if MultiXactCutoff has not advanced passed > > relminmxid. Granted, there may be pages we didn't try to freeze > > before, or some previously blocking XID greater than FreezeLimit may > > have now ended (allowing for freezing), but as a heuristic we wait > > until the > > FreezeLimit advances to increase our chances of successful freezing. > > I've updated the proposed text to mostly be in line with what he suggested. > > > > ended, rendering additional tuples freezable. > > > As a heuristic, however, it makes sense to wait until the FreezeLimit > > > or MultiXactCutoff has advanced before eager scanning. > > > """ > > > > > > > + ereport(INFO, > > > > + (errmsg("Vacuum successfully froze %u eager scanned blocks of > > > > \"%s.%s.%s\". Now disabling eager scanning.", > > > > > > > > I predict that if Tom sees this, you will get complaints about both > > > > the wording of the message, which pretty clearly does not conform to > > > > style guidelines for a primary error message, and also about the use > > > > of the INFO level. Allow me to suggest DEBUG1 or DEBUG2 and "disabling > > > > eager scanning after freezing %u eagerly scanned blocks". > > > > > > I've used your wording. Just for future reference, are the style > > > guidelines I was violating the capitalization and punctuation? Are > > > these documented somewhere? Also, what is a primary error message? > > > INFO level? Or ones that use ereport and are translated? I looked at > > > other messages and saw that they don't capitalize the first word or > > > use punctuation, so I assume that those were problems. > > > > Yes. Primary error messages, i.e. errmsg(), are not capitalized and > > punctuated, unlike errdetail() and errhint() messages, which are. > > > > See https://www.postgresql.org/docs/current/error-style-guide.html > > > > INFO level is used for VERY few things. I can't tell you off the top > > of my head when it's appropriate, but I think the answer is "almost > > never". > > I have changed it to DEBUG2. I started it as INFO because the other vacuum > messages about whether or not the vacuum is an aggressive vacuum were at INFO > level.I don't know what a user might prefer. Treat said this downthread: > > > Maybe, but one of the areas that INFO is used for is providing > > additional details in VACUUM VERBOSE output, and this seems like it > > would be pretty useful information to have if you are trying to > > discern changes in i/o rate during a vacuum, or trying to tune the > > failure rate setting, or several other related fields (including > > automated capture by tools like pganalyze), so I believe INFO is the > > right choice for this. > > I'm happy to go either way. I don't want users mad about verbosity, but if they > think it's helpful, then that seems good. > Thank you for updating the patch. I was reviewing the v10 patch and had some comments. I believe these comments are still valid for v11, but please ignore them if outdated. + if (TransactionIdIsNormal(vacrel->cutoffs.relfrozenxid) && + TransactionIdPrecedesOrEquals(vacrel->cutoffs.relfrozenxid, + vacrel->cutoffs.FreezeLimit)) + oldest_unfrozen_before_cutoff = true; + + if (!oldest_unfrozen_before_cutoff && + MultiXactIdIsValid(vacrel->cutoffs.relminmxid) && + MultiXactIdPrecedesOrEquals(vacrel->cutoffs.relminmxid, + vacrel->cutoffs.MultiXactCutoff)) + oldest_unfrozen_before_cutoff = true; Given that our freeze check strictly checks if an xid is older than the cutoff (exclusive bound), I think we should check if the relfrozenxid and relminmxid strictly precede the cutoff values. --- if (*was_eager_scanned) vacrel->eager_scanned_pages++; How about incrementing this counter near the place where incrementing scanned_pages (i.e., at the beginning of the loop of heap_vac_scan_next_block())? It would make it easy to understand the difference between eager_scanned_pages and scanned_pages. --- + * No vm_page_frozen output parameter (like what is passed to + * lazy_scan_prune()) is passed here because empty pages are always frozen and + * thus could never be eagerly scanned. The last part "thus could never be eagerly scanned" confused me a bit; IIUC we count all pages that are scanned because of this new eager scan feature as "eagerly scanned pages". We increment eager_scanned_pages counter even if the page is either new or empty. This fact seems to contradict the sentence "empty pages could never be eagerly scanned". Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: