Thread: Re: Trigger more frequent autovacuums of heavy insert tables
On Tue, Oct 22, 2024 at 03:12:53PM -0400, Melanie Plageman wrote:
> By considering only the unfrozen portion of the table when calculating
> the vacuum insert threshold, we can trigger vacuums more proactively
> on insert-heavy tables. This changes the definition of
> insert_scale_factor to a percentage of "active" table size. The
> attached patch does this.

I think this is a creative idea. My first reaction is to question whether
it makes sense to have two strategies for this sort of thing:
autovacuum_vacuum_max_threshold for updates/deletes and this for inserts.
Perhaps we don't want to more aggressively clean up bloat (except for the
very largest tables via the hard cap), but we do want to more aggressively
mark newly-inserted tuples frozen. I'm curious what you think.

> I've estimated the unfrozen percentage of the table by adding a new
> field to pg_class, relallfrozen, which is updated in the same places
> as relallvisible.

Wouldn't relallvisible be sufficient here? We'll skip all-visible pages
unless this is an anti-wraparound vacuum, at which point I would think the
insert threshold goes out the window.

> More frequent vacuums means each vacuum scans fewer pages, but, more
> interestingly, the first vacuum after a checkpoint is much more
> efficient. With the patch, the first vacuum after a checkpoint emits
> half as many FPIs. You can see that only 18 pages were newly dirtied.
> So, with the patch, the pages being vacuumed are usually still in
> shared buffers and still dirty.

Are you aware of any scenarios where your proposed strategy might make
things worse? From your test results, it sounds like these vacuums ought
to usually be relatively efficient, so sending insert-only tables to the
front of the line is normally okay, but maybe that's not always true.

--
nathan
On Fri, Feb 7, 2025 at 12:37 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Tue, Oct 22, 2024 at 03:12:53PM -0400, Melanie Plageman wrote:
> > By considering only the unfrozen portion of the table when calculating
> > the vacuum insert threshold, we can trigger vacuums more proactively
> > on insert-heavy tables. This changes the definition of
> > insert_scale_factor to a percentage of "active" table size. The
> > attached patch does this.
>
> I think this is a creative idea.

Indeed. I can't take much credit for it -- Andres suggested this
direction during an off-list conversation where I was complaining about
how difficult it was to benchmark my vacuum eager scanning patch set [1]
because normal vacuums were so rarely triggered for insert-only tables
after the first aggressive vacuum.

> My first reaction is to question whether
> it makes sense to have two strategies for this sort of thing:
> autovacuum_vacuum_max_threshold for updates/deletes and this for inserts.
> Perhaps we don't want to more aggressively clean up bloat (except for the
> very largest tables via the hard cap), but we do want to more aggressively
> mark newly-inserted tuples frozen. I'm curious what you think.

The goal with insert-only tables is to set the whole page frozen in the
VM. So, the number of pages is more important than the total number of
tuples inserted. Whereas, with updates/deletes, it seems like the total
amount of garbage (# tuples) needing cleaning is more important.

My intuition (maybe wrong) is that it is more common to have a bunch of
pages with a single (or few) updates/deletes than it is to have a bunch
of pages with a single insert. This patch is mostly meant to trigger
vacuums sooner on large insert-only or bulk loaded tables. Though, it is
more common to have a cluster of hot pages than uniformly distributed
updates and deletes...
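To make the mechanism concrete, here is a rough sketch of the trigger arithmetic being discussed. The function name, default values, and exact formula are illustrative stand-ins, not the patch's actual C code: the idea is just that the insert threshold is scaled by the unfrozen ("active") fraction of the table rather than by its whole size.

```python
def insert_vacuum_triggered(inserts_since_vacuum, reltuples, relpages,
                            relallfrozen, base_threshold=1000,
                            insert_scale_factor=0.2):
    """Sketch: scale the insert threshold by the unfrozen ("active")
    fraction of the table instead of by its whole size."""
    pcnt_unfrozen = 1.0
    if relpages > 0:
        pcnt_unfrozen = 1.0 - (relallfrozen / relpages)
    threshold = base_threshold + insert_scale_factor * reltuples * pcnt_unfrozen
    return inserts_since_vacuum > threshold

# Mostly-frozen 1M-tuple table: the "active" size is small, so a modest
# batch of new inserts is enough to trigger a vacuum.
print(insert_vacuum_triggered(60_000, 1_000_000, 10_000, relallfrozen=9_500))  # True
# Same table with nothing frozen: the threshold stays proportional to the
# full table size, so the same batch does not trigger one.
print(insert_vacuum_triggered(60_000, 1_000_000, 10_000, relallfrozen=0))      # False
```

With the old definition (scale factor times total table size), a large bulk-loaded table would essentially never hit the insert threshold again once loaded; under this definition, freezing pages shrinks the effective table size and vacuums keep firing on new data.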
> > I've estimated the unfrozen percentage of the table by adding a new
> > field to pg_class, relallfrozen, which is updated in the same places
> > as relallvisible.
>
> Wouldn't relallvisible be sufficient here? We'll skip all-visible pages
> unless this is an anti-wraparound vacuum, at which point I would think the
> insert threshold goes out the window.

It's a great question. There are a couple reasons why I don't think so.

I think this might lead to triggering vacuums too often for insert-mostly
tables. For those tables, the pages that are not all-visible will largely
be just those with data that is new since the last vacuum. And if we
trigger vacuums based off of the % not all-visible, we might decrease the
number of cases where we are able to vacuum inserted data and freeze it
the first time it is vacuumed -- thereby increasing the total amount of
work.

As for your point about us skipping all-visible pages except in
anti-wraparound vacuums -- that's not totally true. Autovacuums triggered
by the insert or update/delete thresholds and not by
autovacuum_freeze_max_age can also be aggressive (that's based on
vacuum_freeze_table_age). Aggressive vacuums scan all-visible pages. And
we actually want to trigger more normal aggressive (non-anti-wrap)
vacuums because anti-wraparound vacuums are not canceled by conflicting
lock requests (like those needed by DDL) -- see
PROC_VACUUM_FOR_WRAPAROUND in ProcSleep().

We also scan a surprising number of all-visible pages in practice due to
SKIP_PAGES_THRESHOLD. I was pretty taken aback while testing [1] how many
all-visible pages we scan due to this optimization. And, I'm planning on
merging [1] in the next few days, so this will also increase the number
of all-visible pages scanned during normal vacuums.

> > More frequent vacuums means each vacuum scans fewer pages, but, more
> > interestingly, the first vacuum after a checkpoint is much more
> > efficient. With the patch, the first vacuum after a checkpoint emits
> > half as many FPIs. You can see that only 18 pages were newly dirtied.
> > So, with the patch, the pages being vacuumed are usually still in
> > shared buffers and still dirty.
>
> Are you aware of any scenarios where your proposed strategy might make
> things worse? From your test results, it sounds like these vacuums ought
> to usually be relatively efficient, so sending insert-only tables to the
> front of the line is normally okay, but maybe that's not always true.

So, of course they aren't exactly at the front of the line since we
autovacuum based on the order in pg_class. But, I suppose if you spend a
bunch of time vacuuming an insert-mostly table you previously would have
skipped instead of some other table -- that is effectively prioritizing
the insert-mostly tables.

For insert-only/mostly tables, what you are ideally doing is vacuuming
more frequently and handling a small number of pages each vacuum of the
relation, so it has a low performance impact. I suppose if you only have
a few autovacuum workers and an equal number of massive insert-only
tables, you could end up starving other actively updated tables of
vacuum resources. But, those insert-only tables would have to be
vacuumed eventually -- and I imagine that the impact of a massive
aggressive vacuum of all of the data in those tables would be more
disruptive than some extra bloat in your other tables. I'd be interested
if other people with more field experience can imagine starvation
scenarios that would be much worse with this patch. What kinds of
starvation scenarios do you normally see?

In terms of specific, dramatic differences in behavior (since this
wouldn't be hidden behind a guc) people might be surprised by how soon
tables start being vacuumed after a huge COPY FREEZE.

- Melanie
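The over-triggering concern above can be illustrated with toy numbers (all names and values here are hypothetical). Right after a vacuum of an insert-only table, essentially every page is all-visible but perhaps only some are frozen, so a threshold scaled by the not-all-visible fraction collapses to the base value while one scaled by the unfrozen fraction does not:

```python
def insert_threshold(base, scale_factor, reltuples, active_fraction):
    # Generic form: scale the insert threshold by whatever fraction of
    # the table we consider "active".
    return base + scale_factor * reltuples * active_fraction

relpages, reltuples = 100_000, 10_000_000
relallvisible = 100_000  # right after a vacuum: everything is all-visible...
relallfrozen = 50_000    # ...but only half of the pages got frozen

by_visible = insert_threshold(1000, 0.2, reltuples, 1 - relallvisible / relpages)
by_frozen = insert_threshold(1000, 0.2, reltuples, 1 - relallfrozen / relpages)
print(by_visible)  # 1000.0: the next trickle of inserts triggers a vacuum
print(by_frozen)   # 1001000.0: waits for a batch large enough to be worth freezing
```

The point is not the specific constants but the shape: the relallvisible-based metric resets to almost nothing after every vacuum of an insert-only table, so vacuums would fire on tiny batches of new rows that are too young to freeze.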
On Fri, Feb 7, 2025 at 3:38 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Feb 07, 2025 at 02:21:07PM -0500, Melanie Plageman wrote:
> > On Fri, Feb 7, 2025 at 12:37 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >>
> >> Wouldn't relallvisible be sufficient here? We'll skip all-visible pages
> >> unless this is an anti-wraparound vacuum, at which point I would think the
> >> insert threshold goes out the window.
> >
> > It's a great question. There are a couple reasons why I don't think so.
> >
> > I think this might lead to triggering vacuums too often for
> > insert-mostly tables. For those tables, the pages that are not
> > all-visible will largely be just those with data that is new since the
> > last vacuum. And if we trigger vacuums based off of the % not
> > all-visible, we might decrease the number of cases where we are able
> > to vacuum inserted data and freeze it the first time it is vacuumed --
> > thereby increasing the total amount of work.
>
> Rephrasing to make sure I understand correctly: you're saying that using
> all-frozen would trigger less frequent insert vacuums, which would give us
> a better chance of freezing more than more frequent insert vacuums
> triggered via all-visible? My suspicion is that the difference would tend
> to be quite subtle in practice, but I have no concrete evidence to back
> that up.

You understood me correctly.

As for relallfrozen, one of the justifications for adding it to pg_class
is actually for the visibility it would provide. We have no way of
knowing how many all-visible but not all-frozen pages there are on
users' systems without pg_visibility. If users had this information,
they could potentially tune their freeze-related settings more
aggressively. Regularly reading the whole visibility map with
pg_visibility_map_summary() is pretty hard to justify on most production
systems. But querying pg_class every 10 minutes or something is much
more reasonable.

- Melanie
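As a sketch of the monitoring use case described above: once relallfrozen sits next to relallvisible in pg_class, the interesting "all-visible but not yet all-frozen" backlog is just a subtraction of two estimates. The helper name and sample numbers below are invented for illustration:

```python
def est_visible_not_frozen(relpages, relallvisible, relallfrozen):
    """Estimate pages that are all-visible but not yet all-frozen.
    Both inputs are pg_class estimates, so clamp at zero rather than
    trusting them to be mutually consistent."""
    return max(relallvisible - relallfrozen, 0)

# e.g. sampled from pg_class every few minutes and graphed over time,
# instead of scanning the whole visibility map with pg_visibility
backlog = est_visible_not_frozen(relpages=100_000,
                                 relallvisible=90_000,
                                 relallfrozen=40_000)
print(backlog)  # 50000 pages an eventual aggressive vacuum must still freeze
```

A steadily growing backlog would be the signal that freeze-related settings could be tuned more aggressively.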
On Fri, Feb 07, 2025 at 03:57:49PM -0500, Melanie Plageman wrote:
> As for relallfrozen, one of the justifications for adding it to
> pg_class is actually for the visibility it would provide. We have no
> way of knowing how many all-visible but not all-frozen pages there are
> on users' systems without pg_visibility. If users had this
> information, they could potentially tune their freeze-related settings
> more aggressively. Regularly reading the whole visibility map with
> pg_visibility_map_summary() is pretty hard to justify on most
> production systems. But querying pg_class every 10 minutes or
> something is much more reasonable.

If we need it anyway, then I have no objections to using a
freeze-related metric for a freeze-related feature. Okay, I'll actually
look at the patches next...

--
nathan
On Fri, Feb 07, 2025 at 03:05:09PM -0600, Nathan Bossart wrote:
> Okay, I'll actually look at the patches next...

Ugh, it's already been 10 days since I said that. A couple of thoughts
on 0001:

I'm not sure I understand the reason for capping relallfrozen to
relallvisible. From upthread, I gather this is mostly to deal with
manual statistics manipulation, but my first reaction is that we should
just let those values be bogus. Is there something that fundamentally
requires relallfrozen to be <= relallvisible? These are only estimates,
so I don't think it would be that surprising for them to defy this
expectation.

Should we allow manipulating relallfrozen like we do relallvisible? My
assumption is that would even be required for the ongoing statistics
import/export work.

Upthread, you mentioned that you weren't seeing relallfrozen in
pg_class_d.h. I checked on my machine and see it there as expected. Are
you still missing it?

--
nathan
On Mon, Feb 17, 2025 at 11:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Feb 07, 2025 at 03:05:09PM -0600, Nathan Bossart wrote:
> > Okay, I'll actually look at the patches next...

Thanks for taking a look!

> I'm not sure I understand the reason for capping relallfrozen to
> relallvisible. From upthread, I gather this is mostly to deal with manual
> statistics manipulation, but my first reaction is that we should just let
> those values be bogus. Is there something that fundamentally requires
> relallfrozen to be <= relallvisible? These are only estimates, so I don't
> think it would be that surprising for them to defy this expectation.

I wasn't quite sure what to do here. I see your perspective: for
example, relallvisible can't possibly be more than relpages, but we
don't do any validation of that. My rationale wasn't exactly principled,
so I'll change it to not cap relallfrozen.

This makes me think I should also not cap relallfrozen when using it in
relation_needs_vacanalyze(). There I cap it to relallvisible and
relallvisible is capped to relpages. One of the ideas behind letting
people modify these stats in pg_class is that they can change a single
field to see what the effect on their system is, right?

> Should we allow manipulating relallfrozen like we do relallvisible? My
> assumption is that would even be required for the ongoing statistics
> import/export work.

Why would it be required for the statistics import/export work?

> Upthread, you mentioned that you weren't seeing relallfrozen in
> pg_class_d.h. I checked on my machine and see it there as expected. Are
> you still missing it?

I see it now. No idea what was happening.

- Melanie
On Wed, Feb 19, 2025 at 04:36:05PM -0500, Melanie Plageman wrote:
> This makes me think I should also not cap relallfrozen when using it
> in relation_needs_vacanalyze(). There I cap it to relallvisible and
> relallvisible is capped to relpages. One of the ideas behind letting
> people modify these stats in pg_class is that they can change a single
> field to see what the effect on their system is, right?

Right. Capping these values to reflect reality seems like it could make
that more difficult.

>> Should we allow manipulating relallfrozen like we do relallvisible? My
>> assumption is that would even be required for the ongoing statistics
>> import/export work.
>
> Why would it be required for the statistics import/export work?

It's probably not strictly required, but my naive expectation would be
that we'd handle relallfrozen just like relallvisible, which appears to
be dumped in the latest stats import/export patch. Is there any reason
we shouldn't do the same for relallfrozen?

--
nathan
On Wed, Feb 19, 2025 at 4:59 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Feb 19, 2025 at 04:36:05PM -0500, Melanie Plageman wrote:
> > This makes me think I should also not cap relallfrozen when using it
> > in relation_needs_vacanalyze(). There I cap it to relallvisible and
> > relallvisible is capped to relpages. One of the ideas behind letting
> > people modify these stats in pg_class is that they can change a single
> > field to see what the effect on their system is, right?
>
> Right. Capping these values to reflect reality seems like it could make
> that more difficult.

Attached v7 doesn't cap the result for manual stats updating done with
relation_statistics_update(). I did, however, keep the cap for the
places where vacuum/analyze/create index update the stats. There the
number for relallfrozen is coming directly from visibilitymap_count(),
so it should be correct. I could perhaps add an assert instead, but I
didn't think that really made sense. An assert is meant to help the
developer, and what could the developer do about the visibility map
being corrupted?

> >> Should we allow manipulating relallfrozen like we do relallvisible? My
> >> assumption is that would even be required for the ongoing statistics
> >> import/export work.
> >
> > Why would it be required for the statistics import/export work?
>
> It's probably not strictly required, but my naive expectation would be that
> we'd handle relallfrozen just like relallvisible, which appears to be
> dumped in the latest stats import/export patch. Is there any reason we
> shouldn't do the same for relallfrozen?

Nope, I don't think so, but I also don't know how people are envisioning
using a manually updated relallvisible.

- Melanie
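The capping kept in the vacuum/analyze paths amounts to enforcing the expected ordering of the three estimates. A sketch of the invariant being discussed (not the actual C code):

```python
def clamp_vm_estimates(relpages, relallvisible, relallfrozen):
    """Enforce the expected ordering
    relallfrozen <= relallvisible <= relpages
    before storing the estimates in pg_class."""
    relallvisible = min(relallvisible, relpages)
    relallfrozen = min(relallfrozen, relallvisible)
    return relallvisible, relallfrozen

print(clamp_vm_estimates(100, 120, 130))  # (100, 100): both values capped
print(clamp_vm_estimates(100, 80, 90))    # (80, 80): frozen capped to visible
```

Whether to silently enforce this ordering, assert it, or warn when it is violated is exactly the question debated in the rest of the thread.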
On Thu, Feb 20, 2025 at 07:35:32PM -0500, Melanie Plageman wrote:
> Attached v7 doesn't cap the result for manual stats updating done with
> relation_statistics_update().

Unfortunately, this already needs a rebase...

> I did, however, keep the cap for the
> places where vacuum/analyze/create index update the stats. There the
> number for relallfrozen is coming directly from visibilitymap_count(),
> so it should be correct. I could perhaps add an assert instead, but I
> didn't think that really made sense. An assert is meant to help the
> developer and what could the developer do about the visibility map
> being corrupted.

This might just be personal preference, but I think this is exactly the
sort of thing an assertion is meant for. If we expect the value to
always be correct, and it's not, then our assumptions were wrong or
someone has broken something. In both of these cases, I as a developer
would really like to know so that I don't introduce a latent bug. If we
want Postgres to gracefully handle or detect visibility map corruption,
then maybe we should do both or PANIC.

I do see that heap_vacuum_rel() already caps relallvisible to relpages,
but it's not clear to me whether that's something that we expect to
regularly happen in practice or what the adverse effects might be. So
perhaps I'm misunderstanding the scope and severity of bogus results
from visibilitymap_count(). Commit e6858e6, which added this code,
doesn't say anything about safety, but commit 3d351d9 changed the
comment in question to its current wording. After a very quick skim of
the latter's thread [0], I don't see any discussion about this point,
either.

>> >> Should we allow manipulating relallfrozen like we do relallvisible? My
>> >> assumption is that would even be required for the ongoing statistics
>> >> import/export work.
>> >
>> > Why would it be required for the statistics import/export work?
>>
>> It's probably not strictly required, but my naive expectation would be that
>> we'd handle relallfrozen just like relallvisible, which appears to be
>> dumped in the latest stats import/export patch. Is there any reason we
>> shouldn't do the same for relallfrozen?
>
> Nope I don't think so, but I also don't know about how people are
> envisioning using a manually updated relallvisible.

That does seem unlikely. I'd expect it to be more useful for porting
statistics over during pg_upgrade.

[0] https://postgr.es/m/flat/F02298E0-6EF4-49A1-BCB6-C484794D9ACC%40thebuild.com

--
nathan
On Mon, Feb 24, 2025 at 4:53 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Thu, Feb 20, 2025 at 07:35:32PM -0500, Melanie Plageman wrote:
> > Attached v7 doesn't cap the result for manual stats updating done with
> > relation_statistics_update().
>
> Unfortunately, this already needs a rebase...

Thanks! Attached v8 does just that.

I've also taken a pass at updating the stats import tests to include
relallfrozen. I'm not totally sure how I feel about the results. I erred
on the side of putting relallfrozen wherever relallvisible was. But,
that means relallfrozen is included in test cases where it doesn't seem
important to the test case. But, that was true with relallvisible too.

Anyway, I talked to Corey off-list last week. Maybe he will have a
chance to weigh in on the test cases. Also, I had forgotten to include
relallfrozen in pg_clear_relation_stats(). I've fixed that in v8, but
now I'm worried there are some other places I may have missed.

> > I did, however, keep the cap for the
> > places where vacuum/analyze/create index update the stats. There the
> > number for relallfrozen is coming directly from visibilitymap_count(),
> > so it should be correct. I could perhaps add an assert instead, but I
> > didn't think that really made sense. An assert is meant to help the
> > developer and what could the developer do about the visibility map
> > being corrupted.
>
> This might just be personal preference, but I think this is exactly the
> sort of thing an assertion is meant for. If we expect the value to always
> be correct, and it's not, then our assumptions were wrong or someone has
> broken something. In both of these cases, I as a developer would really
> like to know so that I don't introduce a latent bug. If we want Postgres
> to gracefully handle or detect visibility map corruption, then maybe we
> should do both or PANIC.

I'm on the fence about adding a PANIC. We do PANIC in other places where
we notice corruption (like PageAddItemExtended()). But, in most of the
cases, it seems like we are PANICing because there isn't a reasonable
way to accomplish the intended task. In this case, we probably can't
trust the visibility map counts for that page, but the pg_class columns
are just estimates, so just capping relallfrozen might be good enough.

I will note that in the other place where we may notice corruption in
the VM, in lazy_scan_prune(), we do visibilitymap_clear() and print a
WARNING -- as opposed to PANICing. Perhaps that's because there is no
need to panic, since we are already fixing the problem with
visibilitymap_clear().

An assert would only help if the developer did something while
developing that corrupted the visibility map. It doesn't help keep bogus
values out of pg_class if a user's visibility map got corrupted in some
way. But, maybe that isn't needed.

> I do see that heap_vacuum_rel() already caps relallvisible to relpages, but
> it's not clear to me whether that's something that we expect to regularly
> happen in practice or what the adverse effects might be. So perhaps I'm
> misunderstanding the scope and severity of bogus results from
> visibilitymap_count(). Commit e6858e6, which added this code, doesn't say
> anything about safety, but commit 3d351d9 changed the comment in question
> to its current wording. After a very quick skim of the latter's thread
> [0], I don't see any discussion about this point, either.

Thanks for doing this archaeology.

I am hesitant to keep the current cap on relallvisible in
heap_vacuum_rel() but then not include an equivalent cap for
relallfrozen. And I think it would be even more confusing for
relallvisible to be capped but relallfrozen has an assert instead. None
of the other locations where relallvisible is updated
(do_analyze_rel(), index_update_stats()) do this capping of
relallvisible. It seems like we should make it consistent. Perhaps we
should just remove it from heap_vacuum_rel(). Then add an assert in all
these places to at least protect against development mistakes.

- Melanie
Hi Melanie
> relallvisible. It seems like we should make it consistent. Perhaps we
> should just remove it from heap_vacuum_rel(). Then add an assert in
> all these places to at least protect against development mistakes.
I think there's some objection to that.
Thanks
On Mon, Feb 24, 2025 at 10:30 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
>
> Hi Melanie
> > relallvisible. It seems like we should make it consistent. Perhaps we
> > should just remove it from heap_vacuum_rel(). Then add an assert in
> > all these places to at least protect against development mistakes.
> I think there's some objection to that.

Could you elaborate a bit?

There were new merge conflicts, so v9 is attached.

- Melanie
On Mon, Feb 24, 2025 at 6:35 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
> I'm on the fence about adding a PANIC. We do PANIC in other places
> where we notice corruption (like PageAddItemExtended()). But, in most
> of the cases, it seems like we are PANICing because there isn't a
> reasonable way to accomplish the intended task. In this case, we
> probably can't trust the visibility map counts for that page, but the
> pg_class columns are just estimates, so just capping relallfrozen
> might be good enough.

+1 for not unnecessarily inflating the log level. I agree with the
principle you articulate here: a PANIC is reasonable when there's no
sane way to continue, but for other corruption events, a WARNING is
better. Note that one really bad consequence of using ERROR or any
higher level in VACUUM is that VACUUM doesn't ever complete. We just
keep retrying and dying. Even if it's only an ERROR, now we're no longer
controlling bloat or preventing wraparound. That's extremely painful for
users, so we don't want to end up there if we have a reasonable
alternative.

Also, I think that it's usually a bad idea to use an Assert to test for
data corruption scenarios. The problem is that programmers are more or
less entitled to assume that an assertion will always hold true, barring
bugs, and just ignore completely what the code might do if the assertion
turns out to be false. But data does get corrupted on disk, so you
SHOULD think about what the code is going to do in such scenarios.
Depending on the details, you might want to WARNING or ERROR or PANIC or
try to repair it or try to just tolerate it being wrong or something
else -- but you should be thinking about what the code is actually going
to do, not just going "eh, well, that can't happen".

--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Feb 25, 2025 at 10:39 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Feb 24, 2025 at 6:35 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I'm on the fence about adding a PANIC. We do PANIC in other places
> > where we notice corruption (like PageAddItemExtended()). But, in most
> > of the cases, it seems like we are PANICing because there isn't a
> > reasonable way to accomplish the intended task. In this case, we
> > probably can't trust the visibility map counts for that page, but the
> > pg_class columns are just estimates, so just capping relallfrozen
> > might be good enough.
>
> +1 for not unnecessarily inflating the log level. I agree with the
> principle you articulate here: a PANIC is reasonable when there's no
> sane way to continue, but for other corruption events, a WARNING is
> better. Note that one really bad consequence of using ERROR or any
> higher level in VACUUM is that VACUUM doesn't ever complete. We just
> keep retrying and dying. Even if it's only an ERROR, now we're no
> longer controlling bloat or preventing wraparound. That's extremely
> painful for users, so we don't want to end up there if we have a
> reasonable alternative.

That makes sense. I think I'll add a WARNING if we don't cap the value.

> Also, I think that it's usually a bad idea to use an Assert to test
> for data corruption scenarios. The problem is that programmers are
> more or less entitled to assume that an assertion will always hold
> true, barring bugs, and just ignore completely what the code might do
> if the assertion turns out to be false. But data does get corrupted on
> disk, so you SHOULD think about what the code is going to do in such
> scenarios. Depending on the details, you might want to WARNING or
> ERROR or PANIC or try to repair it or try to just tolerate it being
> wrong or something else -- but you should be thinking about what the
> code is actually going to do, not just going "eh, well, that can't
> happen".

I agree with this. I don't see what the assertion would catch in this
case. Even as I try to think of a scenario where this would help the
developer, if you write code that corrupts the visibility map, I don't
think waiting for an assert at the end of a vacuum will be your first or
best signal.

This does, however, leave open the original question of whether or not
to cap the proposed relallfrozen to the value of relallvisible when
updating stats at the end of vacuum. The current code in
heap_vacuum_rel() caps relallvisible to relpages, so capping
relallfrozen to relallvisible would follow that pattern. However, the
other places relallvisible is updated do no such capping
(do_analyze_rel(), index_update_stats()). It doesn't seem like there is
a good reason to do it one place and not the others. So, I suggest
either removing all the caps and adding a WARNING or capping the value
in all places.

Because users can now manually update these values in pg_class, there
wouldn't be a way to detect the difference between a bogus relallfrozen
value due to VM corruption and a bogus value due to manual statistics
intervention. This led me to think that a WARNING and no cap would be
more effective for heap_vacuum_rel().

- Melanie
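The "WARNING and no cap" alternative floated above could look roughly like this; a Python stand-in for the C logic, with invented names, using Python's warning machinery in place of ereport(WARNING):

```python
import warnings

def vm_counts_for_pg_class(relpages, relallvisible, relallfrozen):
    """Store the raw visibilitymap_count() results unmodified, but warn
    when they violate the expected ordering, since that suggests either
    visibility map corruption or manual statistics intervention."""
    if not (relallfrozen <= relallvisible <= relpages):
        warnings.warn("visibility map counts out of order; "
                      "possible visibility map corruption")
    return relallvisible, relallfrozen

# Inconsistent counts are preserved rather than silently capped:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    vals = vm_counts_for_pg_class(100, 90, 95)
print(vals)         # (90, 95)
print(len(caught))  # 1 warning emitted
```

The design trade-off mirrors the thread: capping hides the inconsistency and keeps the invariant, while warn-and-store surfaces it without ever aborting the vacuum.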
On Tue, Feb 25, 2025 at 10:38:48AM -0500, Robert Haas wrote: > On Mon, Feb 24, 2025 at 6:35 PM Melanie Plageman > <melanieplageman@gmail.com> wrote: >> I'm on the fence about adding a PANIC. We do PANIC in other places >> where we notice corruption (like PageAddItemExtended()). But, in most >> of the cases, it seems like we are PANICing because there isn't a >> reasonable way to accomplish the intended task. In this case, we >> probably can't trust the visibility map counts for that page, but the >> pg_class columns are just estimates, so just capping relallfrozen >> might be good enough. > > +1 for not unnecessarily inflating the log level. I agree with the > principle you articulate here: a PANIC is reasonable when there's no > sane way to continue, but for other corruption events, a WARNING is > better. Note that one really bad consequence of using ERROR or any > higher level in VACUUM is that VACUUM doesn't ever complete. We just > keep retrying and dying. Even if it's only an ERROR, now we're no > longer controlling bloat or preventing wraparound. That's extremely > painful for users, so we don't want to end up there if we have a > reasonable alternative. > > Also, I think that it's usually a bad idea to use an Assert to test > for data corruption scenarios. The problem is that programmers are > more or less entitled to assume that an assertion will always hold > true, barring bugs, and just ignore completely what the code might do > if the assertion turns out to be false. But data does get corrupted on > disk, so you SHOULD think about what the code is going to do in such > scenarios. Depending on the details, you might want to WARNING or > ERROR or PANIC or try to repair it or try to just tolerate it being > wrong or something else -- but you should be thinking about what the > code is actually going to do, not just going "eh, well, that can't > happen". 
In this case, it sounds to me like we want production builds to tolerate visibility map corruption (at least, to some extent), but we don't really expect to see it regularly. If that's true, then my inclination would be to add assertions to make sure nobody has fundamentally broken the code, but to at most either warn or restrict the values for non-assert builds. TBH my gut feeling is that, outside of assert-enabled builds, we should just let the values be bogus if they are indeed just estimates that can be wildly inaccurate anyway. I see little value in trying to gently corral the value in certain cases, lest we want to mask bugs and give credence to bad assumptions about the possible values. I think the argument for PANIC-ing in this case is that visibility map corruption seems likely to be the tip of the iceberg. If we see that one file is corrupted, I would expect a DBA to be extremely wary of trusting the other files on the system, too. (I will freely admit that I might be missing situations where race conditions or the like cause us to retrieve seemingly invalid values for relpages and relallvisible, so please pardon my ignorance as I dive deeper into this code.) -- nathan
On Tue, Feb 25, 2025 at 11:02:40AM -0500, Melanie Plageman wrote: > This does however leave me with the question of how to handle the > original question of whether or not to cap the proposed relallfrozen > to the value of relallvisible when updating stats at the end of > vacuum. The current code in heap_vacuum_rel() caps relallvisible to > relpages, so capping relallfrozen to relallvisible would follow that > pattern. However, the other places relallvisible is updated do no such > capping (do_analyze_rel(), index_update_stats()). It doesn't seem like > there is a good reason to do it one place and not the others. So, I > suggest either removing all the caps and adding a WARNING or capping > the value in all places. Because users can now manually update these > values in pg_class, there wouldn't be a way to detect the difference > between a bogus relallfrozen value due to VM corruption or a bogus > value due to manual statistics intervention. This led me to think that > a WARNING and no cap would be more effective for heap_vacuum_rel(). I'm currently leaning towards the "remove all caps" idea. But I'm not sure I totally understand the need for a WARNING. What do we expect users to do with that information? If it's intended to alert them of possible corruption, then IMHO a WARNING may be too gentle. I guess we could warn and suggest a way to fix the value with the new statistics manipulation functions if it's not that big of a deal, but at that point we might as well just cap it on our own again. If we don't really expect users to have to do anything about it, then isn't it just adding unnecessary log noise? -- nathan
On Tue, Feb 25, 2025 at 11:03 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
Because users can now manually update these values in pg_class, there wouldn't be a way to detect the difference
between a bogus relallfrozen value due to VM corruption or a bogus value due to manual statistics intervention.
Er..you had me until this. If manual monkeying of the system catalogs leads to a "bogus" error that resembles a real one, then sow the wind, and reap the whirlwind. I don't think that should be a consideration here.
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
On Tue, Feb 25, 2025 at 11:32 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > On Tue, Feb 25, 2025 at 11:02:40AM -0500, Melanie Plageman wrote: > > This does however leave me with the question of how to handle the > > original question of whether or not to cap the proposed relallfrozen > > to the value of relallvisible when updating stats at the end of > > vacuum. The current code in heap_vacuum_rel() caps relallvisible to > > relpages, so capping relallfrozen to relallvisible would follow that > > pattern. However, the other places relallvisible is updated do no such > > capping (do_analyze_rel(), index_update_stats()). It doesn't seem like > > there is a good reason to do it one place and not the others. So, I > > suggest either removing all the caps and adding a WARNING or capping > > the value in all places. Because users can now manually update these > > values in pg_class, there wouldn't be a way to detect the difference > > between a bogus relallfrozen value due to VM corruption or a bogus > > value due to manual statistics intervention. This led me to think that > > a WARNING and no cap would be more effective for heap_vacuum_rel(). > > I'm currently leaning towards the "remove all caps" idea. But I'm not sure > I totally understand the need for a WARNING. What do we expect users to do > with that information? If it's intended to alert them of possible > corruption, then IMHO a WARNING may be too gentle. I guess we could warn > and suggest a way to fix the value with the new statistics manipulation > functions if it's not that big of a deal, but at that point we might as > well just cap it on our own again. If we don't really expect users to have > to do anything about it, then isn't it just adding unnecessary log noise? > If the end user is manipulating numbers to test some theory, I think it's valuable feedback that they have probably bent the system too far, because we are now seeing numbers that don't make sense. 
If they aren't mucking with the system, then it's valuable feedback that they may have an underlying system problem that could be about to get worse. Robert Treat https://xzilla.net
On Tue, Feb 25, 2025 at 12:36:40PM -0500, Robert Treat wrote: > On Tue, Feb 25, 2025 at 11:32 AM Nathan Bossart > <nathandbossart@gmail.com> wrote: >> I'm currently leaning towards the "remove all caps" idea. But I'm not sure >> I totally understand the need for a WARNING. What do we expect users to do >> with that information? If it's intended to alert them of possible >> corruption, then IMHO a WARNING may be too gentle. I guess we could warn >> and suggest a way to fix the value with the new statistics manipulation >> functions if it's not that big of a deal, but at that point we might as >> well just cap it on our own again. If we don't really expect users to have >> to do anything about it, then isn't it just adding unnecessary log noise? > > If the end user is manipulating numbers to test some theory, I think > it's valuable feedback that they have probably bent the system too > far, because we are now seeing numbers that don't make sense. If we do that in other cases, then that seems reasonable. But it feels weird to me to carve out just this one inaccuracy. For example, do we warn if someone sets reltuples too high for relpages, or if they set relpages too low for reltuples? I'm not seeing why the relallfrozen <= relallvisible <= relpages case is especially important to uphold. If we were consistent about enforcing these kinds of invariants everywhere, then I think I would be more on board with capping the values and/or ERROR-ing when folks tried to set bogus ones. But if the only outcome of bogus values is that you might get weird plans or autovacuum might prioritize the table differently, then I'm not sure I see the point. You can accomplish both of those things with totally valid values already. > If they > aren't mucking with the system, then it's valuable feedback that they > may have an underlying system problem that could be about to get > worse. 
Maybe a WARNING is as much as we can realistically do in this case, but I think it could be easily missed in the server logs. I dunno, it just feels wrong to me to deal with potential corruption by gently notifying the user and proceeding normally. I guess that's what we do already elsewhere, though (e.g., for relfrozenxid/relminmxid in vac_update_relstats()). -- nathan
On Tue, Feb 25, 2025 at 11:03 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > This does however leave me with the question of how to handle the > original question of whether or not to cap the proposed relallfrozen > to the value of relallvisible when updating stats at the end of > vacuum. The current code in heap_vacuum_rel() caps relallvisible to > relpages, so capping relallfrozen to relallvisible would follow that > pattern. However, the other places relallvisible is updated do no such > capping (do_analyze_rel(), index_update_stats()). It doesn't seem like > there is a good reason to do it one place and not the others. So, I > suggest either removing all the caps and adding a WARNING or capping > the value in all places. Because users can now manually update these > values in pg_class, there wouldn't be a way to detect the difference > between a bogus relallfrozen value due to VM corruption or a bogus > value due to manual statistics intervention. This led me to think that > a WARNING and no cap would be more effective for heap_vacuum_rel(). I mean, does it really make any difference one way or the other? Given that users could manually update the catalog, we have to be able to tolerate bad data in the catalogs without the world ending. If that code has to exist anyway, then it's not mandatory to cap. On the other hand, there's no great virtue in refusing to correct data that we know to be wrong. Unless there is some other consideration which makes one way better than the other, this feels like author's choice. -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, Feb 25, 2025 at 01:52:28PM -0500, Robert Haas wrote: > Given that users could manually update the catalog, we have to be able > to tolerate bad data in the catalogs without the world ending. If that > code has to exist anyway, then it's not mandatory to cap. On the other > hand, there's no great virtue in refusing to correct data that we know > to be wrong. Unless there is some other consideration which makes one > way better than the other, this feels like author's choice. Maybe the most conservative choice is to simply follow the example of surrounding code. If it's careful to cap relallvisible to relpages, also have it cap relallfrozen. If not, don't. *shrug* In any case, I don't want to hold up this patch on this relatively minor point. This seems like something we could pretty easily change in the future if needed. -- nathan
On Tue, Feb 25, 2025 at 11:36 AM Greg Sabino Mullane <htamfids@gmail.com> wrote: > > On Tue, Feb 25, 2025 at 11:03 AM Melanie Plageman <melanieplageman@gmail.com> wrote: >> >> Because users can now manually update these values in pg_class, there wouldn't be a way to detect the difference >> between a bogus relallfrozen value due to VM corruption or a bogus value due to manual statistics intervention. > > Er..you had me until this. If manual monkeying of the system catalogs leads to a "bogus" error that resembles a real one, then sow the wind, and reap the whirlwind. I don't think that should be a consideration here. Oh, the WARNING would only show up in a case of actual VM corruption. The WARNING proposed is after calling visibilitymap_count() before updating pg_class, if the value we get from the VM for relallfrozen exceeds relallvisible. If you manually change pg_class relallfrozen/relallvisible to bogus values, you wouldn't get a warning. I meant there wouldn't be a way to detect the difference when viewing pg_class between bogus values because of a corrupt VM and bogus values because of manual updates to pg_class. - Melanie
On Tue, Feb 25, 2025 at 1:52 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Feb 25, 2025 at 11:03 AM Melanie Plageman > <melanieplageman@gmail.com> wrote: > > This does however leave me with the question of how to handle the > > original question of whether or not to cap the proposed relallfrozen > > to the value of relallvisible when updating stats at the end of > > vacuum. The current code in heap_vacuum_rel() caps relallvisible to > > relpages, so capping relallfrozen to relallvisible would follow that > > pattern. However, the other places relallvisible is updated do no such > > capping (do_analyze_rel(), index_update_stats()). It doesn't seem like > > there is a good reason to do it one place and not the others. So, I > > suggest either removing all the caps and adding a WARNING or capping > > the value in all places. Because users can now manually update these > > values in pg_class, there wouldn't be a way to detect the difference > > between a bogus relallfrozen value due to VM corruption or a bogus > > value due to manual statistics intervention. This led me to think that > > a WARNING and no cap would be more effective for heap_vacuum_rel(). > > I mean, does it really make any difference one way or the other? > > Given that users could manually update the catalog, we have to be able > to tolerate bad data in the catalogs without the world ending. If that > code has to exist anyway, then it's not mandatory to cap. On the other > hand, there's no great virtue in refusing to correct data that we know > to be wrong. Unless there is some other consideration which makes one > way better than the other, this feels like author's choice. I realized that whether or not we add a WARNING is an independent question from whether or not we cap these values. In these instances, we happen to have just read the whole VM and so we can tell you if it is broken in a particular way. 
If I want to write a patch to warn users of visibility map corruption after calling visibilitymap_count(), I could do that and it might be a good idea, but it should probably be a separate commit anyway. - Melanie
On Tue, Feb 25, 2025 at 3:05 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Tue, Feb 25, 2025 at 01:52:28PM -0500, Robert Haas wrote: > > Given that users could manually update the catalog, we have to be able > > to tolerate bad data in the catalogs without the world ending. If that > > code has to exist anyway, then it's not mandatory to cap. On the other > > hand, there's no great virtue in refusing to correct data that we know > > to be wrong. Unless there is some other consideration which makes one > > way better than the other, this feels like author's choice. > > Maybe the most conservative choice is to simply follow the example of > surrounding code. If it's careful to cap relallvisible to relpages, also > have it cap relallfrozen. If not, don't. *shrug* Agreed. I've done this in attached v10. I handle relallfrozen values > relpages in the second patch in the set when using the relallfrozen value, so I think we are all good. > In any case, I don't want to hold up this patch on this relatively minor > point. This seems like something we could pretty easily change in the > future if needed. Yes, so one thing you haven't said yet is if you are +1 on going forward with these patches in general. As for the code, I'm not 100% convinced I've got all the stats import/export bits perfect (those are changing under my feet right now anyway). - Melanie
Attachment
On Tue, Feb 25, 2025 at 05:19:30PM -0500, Melanie Plageman wrote: > Yes, so one thing you haven't said yet is if you are +1 on going > forward with these patches in general. Sorry, yes, I'm +1 in general. It conceptually makes sense to me that we should disregard frozen pages when deciding whether to do an insert vacuum, and it's hard to argue with the results in your original post. I also am not overly concerned about worker starvation. While this patch does give higher priority to insert-only/mostly tables, it's also reducing the amount of resources required to vacuum them, anyway. -- nathan
On Tue, Feb 25, 2025 at 4:29 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Tue, Feb 25, 2025 at 11:36 AM Greg Sabino Mullane <htamfids@gmail.com> wrote: > > > > On Tue, Feb 25, 2025 at 11:03 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > >> > >> Because users can now manually update these values in pg_class, there wouldn't be a way to detect the difference > >> between a bogus relallfrozen value due to VM corruption or a bogus value due to manual statistics intervention. > > > > Er..you had me until this. If manual monkeying of the system catalogs leads to a "bogus" error that resembles a real one, then sow the wind, and reap the whirlwind. I don't think that should be a consideration here. > > Oh, the WARNING would only show up in a case of actual VM corruption. > The WARNING proposed is after calling visibilitymap_count() before > updating pg_class, if the value we get from the VM for relallfrozen > exceeds relallvisible. If you manually change pg_class > relallfrozen/relallvisible to bogus values, you wouldn't get a > warning. I meant there wouldn't be a way to detect the difference when > viewing pg_class between bogus values because of a corrupt VM and > bogus values because of manual updates to pg_class. > It strikes me as a bit odd to have this extra wording in the pg_class documentation: + Every all-frozen page must also be marked + all-visible in the visibility map, so + <structfield>relallfrozen</structfield> should be less than or equal to + <structfield>relallvisible</structfield>. However, if either field is + updated manually or if the visibility map is corrupted, it is possible + for <structfield>relallfrozen</structfield> to exceed + <structfield>relallvisible</structfield>. For example, we don't document that relallvisible should never exceed relpages, and we aren't normally in the habit of documenting weird behavior that might happen if people go updating the system catalogs. 
Maybe it's just me, but when I read this earlier, I thought there might be some intended use case for updating the catalog manually that you had in mind and so the comments were warranted (and indeed, it's part of why I thought the warning would be useful for users). But upon reading the thread more and another pass through your updated patches, this doesn't seem to be the case, and I wonder if this language might be more encouraging of people updating catalogs than we would typically be. Robert Treat https://xzilla.net
On Wed, Feb 26, 2025 at 11:17:06AM -0500, Robert Treat wrote: > It strikes me as a bit odd to have this extra wording in the pg_class > documentation: > > + Every all-frozen page must also be marked > + all-visible in the visibility map, so > + <structfield>relallfrozen</structfield> should be less than or equal to > + <structfield>relallvisible</structfield>. However, if either field is > + updated manually or if the visibility map is corrupted, it is possible > + for <structfield>relallfrozen</structfield> to exceed > + <structfield>relallvisible</structfield>. > > For example, we don't document that relallvisible should never exceed > relpages, and we aren't normally in the habit of documenting weird > behavior that might happen if people go updating the system catalogs. > Maybe it's just me, but when I read this earlier, I thought there > might be some intended use case for updating the catalog manually that > you had in mind and so the comments were warranted (and indeed, it's > part of why I thought the warning would be useful for users). But upon > reading the thread more and another pass through your updated patches, > this doesn't seem to be the case, and I wonder if this language might > be more encouraging of people updating catalogs than we would > typically be. +1. If we did want to add more information about the ordinary expectations of relallfrozen and friends here, I'd suggest doing so in a separate patch. IMHO the usual "This is only an estimate..." wording is sufficient for the introduction of relallfrozen. -- nathan
On Wed, Feb 26, 2025 at 12:25 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Wed, Feb 26, 2025 at 11:17:06AM -0500, Robert Treat wrote: > > It strikes me as a bit odd to have this extra wording in the pg_class > > documentation: > > > > + Every all-frozen page must also be marked > > + all-visible in the visibility map, so > > + <structfield>relallfrozen</structfield> should be less than or equal to > > + <structfield>relallvisible</structfield>. However, if either field is > > + updated manually or if the visibility map is corrupted, it is possible > > + for <structfield>relallfrozen</structfield> to exceed > > + <structfield>relallvisible</structfield>. > > > > For example, we don't document that rellallvisible should never exceed > > relpages, and we aren't normally in the habit of documenting weird > > behavior that might happen if people go updating the system catalogs. > > Maybe it's just me, but when I read this earlier, I thought there > > might be some intended use case for updating the catalog manually that > > you had in mind and so the comments were warranted (and indeed, it's > > part of why I thought the warning would be useful for users). But upon > > reading the thread more and another pass through your updated patches, > > this doesn't seem to be the case, and I wonder if this language might > > be more encouraging of people updating catalogs than we would > > typically be. > > +1. If we did want to add more information about the ordinary expectations > of relallfrozen and friends here, I'd suggest doing so in a separate patch. > IMHO the usual "This is only an estimate..." wording is sufficient for the > introduction of relallfrozen. Makes sense. Thanks Robert and Nathan. Attached v11 changes the docs wording and is rebased. - Melanie
Attachment
On Wed, Feb 26, 2025 at 04:48:20PM -0500, Melanie Plageman wrote: > Makes sense. Thanks Robert and Nathan. Attached v11 changes the docs > wording and is rebased. 0001 LGTM. > <para> > - Specifies a fraction of the table size to add to > - <varname>autovacuum_vacuum_insert_threshold</varname> > - when deciding whether to trigger a <command>VACUUM</command>. > - The default is <literal>0.2</literal> (20% of table size). > - This parameter can only be set in the <filename>postgresql.conf</filename> > - file or on the server command line; > - but the setting can be overridden for individual tables by > - changing table storage parameters. > + Specifies a fraction of the active (unfrozen) table size to add to > + <varname>autovacuum_vacuum_insert_threshold</varname> > + when deciding whether to trigger a <command>VACUUM</command>. > + The default is <literal>0.2</literal> (20% of active table size). > + This parameter can only be set in the <filename>postgresql.conf</filename> > + file or on the server command line; > + but the setting can be overridden for individual tables by > + changing table storage parameters. > </para> nitpick: There might be an unintentional indentation change here. I'm wondering about the use of the word "active," too. While it's qualified by the "(unfrozen)" after it, I'm worried it might not be descriptive enough. For example, I might consider a frozen page that's in the buffer cache and is being read by queries to be "active." And it doesn't seem clear to me that it's referring to unfrozen pages and not unfrozen tuples. Perhaps we should say something like "a fraction of the unfrozen pages in the table to add...". > + /* > + * If we have data for relallfrozen, calculate the unfrozen percentage > + * of the table to modify insert scale factor. This helps us decide > + * whether or not to vacuum an insert-heavy table based on the number > + * of inserts to the "active" part of the table. 
> + */ > + if (relpages > 0 && relallfrozen > 0) So, if we don't have this data, we just use reltuples, which is the existing behavior and should trigger vacuums less aggressively than if we _did_ have the data. That seems like the correct choice to me. > + /* > + * It could be the stats were updated manually and relallfrozen > > + * relpages. Clamp relallfrozen to relpages to avoid nonsensical > + * calculations. > + */ > + relallfrozen = Min(relallfrozen, relpages); > + pcnt_unfrozen = 1 - ((float4) relallfrozen / relpages); Makes sense. -- nathan
Hi
> + * It could be the stats were updated manually and relallfrozen >
> + * relpages. Clamp relallfrozen to relpages to avoid nonsensical
> + * calculations.
> + */
> + relallfrozen = Min(relallfrozen, relpages);
> + pcnt_unfrozen = 1 - ((float4) relallfrozen / relpages);
> + }
> +
Based on the comments, the pcnt_unfrozen value could potentially be 0, which would indicate that everything is frozen. Therefore, is it necessary to handle the case where the value is 0?
On Sat, Mar 1, 2025 at 1:54 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Feb 26, 2025 at 04:48:20PM -0500, Melanie Plageman wrote:
> Makes sense. Thanks Robert and Nathan. Attached v11 changes the docs
> wording and is rebased.
0001 LGTM.
> <para>
> - Specifies a fraction of the table size to add to
> - <varname>autovacuum_vacuum_insert_threshold</varname>
> - when deciding whether to trigger a <command>VACUUM</command>.
> - The default is <literal>0.2</literal> (20% of table size).
> - This parameter can only be set in the <filename>postgresql.conf</filename>
> - file or on the server command line;
> - but the setting can be overridden for individual tables by
> - changing table storage parameters.
> + Specifies a fraction of the active (unfrozen) table size to add to
> + <varname>autovacuum_vacuum_insert_threshold</varname>
> + when deciding whether to trigger a <command>VACUUM</command>.
> + The default is <literal>0.2</literal> (20% of active table size).
> + This parameter can only be set in the <filename>postgresql.conf</filename>
> + file or on the server command line;
> + but the setting can be overridden for individual tables by
> + changing table storage parameters.
> </para>
nitpick: There might be an unintentional indentation change here.
I'm wondering about the use of the word "active," too. While it's
qualified by the "(unfrozen)" after it, I'm worried it might not be
descriptive enough. For example, I might consider a frozen page that's in
the buffer cache and is being read by queries to be "active." And it
doesn't seem clear to me that it's referring to unfrozen pages and not
unfrozen tuples. Perhaps we should say something like "a fraction of the
unfrozen pages in the table to add...".
> + /*
> + * If we have data for relallfrozen, calculate the unfrozen percentage
> + * of the table to modify insert scale factor. This helps us decide
> + * whether or not to vacuum an insert-heavy table based on the number
> + * of inserts to the "active" part of the table.
> + */
> + if (relpages > 0 && relallfrozen > 0)
So, if we don't have this data, we just use reltuples, which is the
existing behavior and should trigger vacuums less aggressively than if we
_did_ have the data. That seems like the correct choice to me.
> + /*
> + * It could be the stats were updated manually and relallfrozen >
> + * relpages. Clamp relallfrozen to relpages to avoid nonsensical
> + * calculations.
> + */
> + relallfrozen = Min(relallfrozen, relpages);
> + pcnt_unfrozen = 1 - ((float4) relallfrozen / relpages);
Makes sense.
--
nathan
On Sat, Mar 01, 2025 at 08:57:52AM +0800, wenhui qiu wrote: > Based on the comments, the pcnt_unfrozen value could potentially be 0, > which would indicate that everything is frozen. Therefore, is it necessary > to handle the case where the value is 0.? How so? If it's 0, then the insert threshold calculation would produce autovacuum_vacuum_insert_threshold, just like for an empty table. That seems like the intended behavior to me. -- nathan
On Fri, Feb 28, 2025 at 12:54 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Wed, Feb 26, 2025 at 04:48:20PM -0500, Melanie Plageman wrote: > > Makes sense. Thanks Robert and Nathan. Attached v11 changes the docs > > wording and is rebased. > > 0001 LGTM. Cool. Corey checked over the stats import tests off-list and +1'd them, so I went ahead and pushed this. > > <para> > > - Specifies a fraction of the table size to add to <--snip--> > > </para> > > nitpick: There might be an unintentional indentation change here. Yep, fixed it in attached v12, thanks. > I'm wondering about the use of the word "active," too. While it's > qualified by the "(unfrozen)" after it, I'm worried it might not be > descriptive enough. For example, I might consider a frozen page that's in > the buffer cache and is being read by queries to be "active." And it > doesn't seem clear to me that it's referring to unfrozen pages and not > unfrozen tuples. Perhaps we should say something like "a fraction of the > unfrozen pages in the table to add...". Done. Thanks for the suggestion. I noticed the docs wording is kind of different than that in postgresql.conf.sample. The docs wording mentions that the scale factor gets added to the threshold and postgresql.conf.sample does not (in master as well). I just wanted to make sure my updates to postgresql.conf.sample sound accurate. > > + /* > > + * If we have data for relallfrozen, calculate the unfrozen percentage > > + * of the table to modify insert scale factor. This helps us decide > > + * whether or not to vacuum an insert-heavy table based on the number > > + * of inserts to the "active" part of the table. > > + */ > > + if (relpages > 0 && relallfrozen > 0) > > So, if we don't have this data, we just use reltuples, which is the > existing behavior and should trigger vacuums less aggressively than if we > _did_ have the data. That seems like the correct choice to me. Yep. - Melanie
Attachment
On Mon, Mar 03, 2025 at 12:18:37PM -0500, Melanie Plageman wrote: > I noticed the docs wording is kind of different than that in > postgresql.conf.sample. The docs wording mentions that the scale > factor gets added to the threshold and postgresql.conf.sample does not > (in master as well). I just wanted to make sure my updates to > postgresql.conf.sample sound accurate. > -#autovacuum_vacuum_insert_scale_factor = 0.2 # fraction of inserts over table > - # size before insert vacuum > +#autovacuum_vacuum_insert_scale_factor = 0.2 # fraction of unfrozen pages > + # inserted to before insert vacuum Hm. So for autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor, we have "fraction of table size before vacuum" "fraction of table size before analyze" And currently, for autovacuum_vacuum_insert_scale_factor, we have "fraction of inserts over table size before insert vacuum" I think the key change here is that we are replacing "table size" with "unfrozen pages," which would give us "fraction of inserts over unfrozen pages before insert vacuum" However, I find that unclear, if for no other reason than I'm not sure what "over" means in this context. A reader might think that refers to the fraction (i.e., inserts / unfrozen pages), or they might think it means "on" or "above." IMHO your proposed wording is much clearer. The only part that feels a bit awkward to me is "to before". My mind wants the "to" to be followed with a verb, so I stumbled the first time I read it and had to read it again. Perhaps we could just say "fraction of unfrozen pages before insert vacuum" That more closely matches the other scale factor parameters. It's admittedly quite terse, but so are the vast majority of other descriptions in the sample file. I don't think those are intended to serve as substitutes for the "real" documentation. I would make the same argument for leaving out the base threshold. 
While it is important information, these extremely concise descriptions don't really have enough room for the nitty-gritty details. -- nathan
On Mon, Mar 3, 2025 at 1:11 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > "fraction of unfrozen pages before insert vacuum" > > That more closely matches the other scale factor parameters. It's > admittedly quite terse, but so are the vast majority of other descriptions > in the sample file. I don't think those are intended to serve as > substitutes for the "real" documentation. cool, I've changed it to this, pushed, and marked the commitfest entry as committed. Thanks so much for your attention and review! - Melanie