Re: Berserk Autovacuum (let's save next Mandrill) - Mailing list pgsql-hackers
From | Justin Pryzby |
---|---|
Subject | Re: Berserk Autovacuum (let's save next Mandrill) |
Date | |
Msg-id | 20200313184442.GS29065@telsasoft.com Whole thread Raw |
In response to | Re: Berserk Autovacuum (let's save next Mandrill) (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: Berserk Autovacuum (let's save next Mandrill)
Re: Berserk Autovacuum (let's save next Mandrill) |
List | pgsql-hackers |
On Tue, Mar 10, 2020 at 01:53:42PM +1300, David Rowley wrote: > 2. Perhaps the documentation in maintenance.sgml should mention that > the table will be vacuumed with the equivalent of having > vacuum_freeze_min_age = 0, instead of: > > "Such a vacuum will aggressively freeze tuples." > > aggressive is the wrong word here. We call it an aggressive vacuum if > we disable page skipping, not for setting the vacuum_freeze_min_age to > 0. Possible it would be better to run VACUUM *without* freeze_min_age=0 ? (I get confused and have to spend 20min re-reading the vacuum GUC docs every time I deal with this stuff, so maybe I'm off). As I understand, the initial motivation of this patch was to avoid disruptive anti-wraparound vacuums on insert-only table. But if vacuum were triggered at all, it would freeze the oldest tuples, which is all that's needed; especially since fd31cd2651 "Don't vacuum all-frozen pages.", those pages would never need to be vacuumed again. Recently written tuples wouldn't be frozen, which is ok, they're handled next time. Another motivation of the patch is to allow indexonly scan, for which the planner looks at pages' "relallvisible" fraction (and at execution if a page isn't allvisible, visits the heap). Again, that happens if vacuum were run at all. Again, some pages won't be marked allvisible, which is fine, they're handled next time. I think freeze_min_age=0 could negatively affect people who have insert-mostly tables (I'm not concerned, but that includes us). If they consistently hit the autovacuum insert threshold before the cleanup threshold for updated/deleted tuples, any updated/deleted tuples would be frozen, which would be wasteful: |One disadvantage of decreasing vacuum_freeze_min_age is that it might cause |VACUUM to do useless work: freezing a row version is a waste of time if the row |is modified soon thereafter (causing it to acquire a new XID). So the setting |should be large enough that rows are not frozen until they are unlikely to |change any more. So my question is if autovacuum triggered by insert threshold should trigger VACUUM with the same settings as a vacuum due to deleted tuples. I realize the DBA could just configure the thresholds so they'd hit vacuum for cleaning dead tuples, so my suggestion maybe just improves the case with the default settings. It's possible to set the reloption autovacuum_freeze_min_age, which I think supports the idea of running a vacuum normally and letting it (and the DBA) decide what do with existing logic. Also, there was a discussion about index cleanup with the conclusion that it was safer not to skip it, since otherwise indexes might bloat. I think that's right, since vacuum for cleanup is triggered by the number of dead heap tuples. To skip index cleanup, I think you'd want a metric for n_dead_since_index_cleanup. (Or maybe analyze could track dead index tuples and trigger vacuum of each index separately). Having now played with the patch, I'll suggest that 10000000 is too high a threshold. If autovacuum runs without FREEZE, I don't see why it couldn't be much lower (100000?) or use (0.2 * n_ins + 50) like the other autovacuum GUC. -- Justin
pgsql-hackers by date: