Thread: Allow to specify (auto-)vacuum cost limits relative to the database/cluster size?
Allow to specify (auto-)vacuum cost limits relative to the database/cluster size?
From
Andres Freund
Date:
Hi,

right now the defaults for autovacuum cost limiting are so low that they regularly cause problems for our users. It's not exactly obvious that any installation above a couple gigabytes definitely needs to change autovacuum_vacuum_cost_delay & autovacuum_vacuum_cost_limit/vacuum_cost_limit. Especially anti-wraparound/full table vacuums basically take forever with the default settings.

On the other hand we don't want a database of a couple hundred megabytes to be vacuumed as fast as possible and trash the poor tiny system. So we can't just massively increase the limits by default; although I believe some default adjustment would be appropriate anyway.

I wonder if it makes sense to compute the delays / limits in relation to either cluster or relation size. If you have a 10 TB table, you obviously don't want to scan it at a few megabytes a second, which the default settings will do for you. With that in mind we could just go for something like the autovacuum_*_scale_factor settings. But e.g. for partitioned workloads with hundreds of tables in the couple gigabyte range that'd not work that well.

Somehow computing the speed in relation to the cluster/database size is probably possible, but I wonder how we can do so without constantly re-computing something relatively expensive?

Thoughts?

- Andres
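As a rough sketch of what "limits relative to size" could mean, one might derive the cost limit from the database size with the stock default as a floor. The 20-per-GB factor below is invented purely for illustration; nothing like it is proposed in the message above.

-- Purely illustrative: scale the cost limit with database size, keeping the
-- stock default (200) as a floor.  The 20-per-GB factor is made up.
SELECT greatest(
           200,
           (pg_database_size(current_database()) / (1024 * 1024 * 1024) * 20)::int
       ) AS hypothetical_vacuum_cost_limit;

Note that pg_database_size() walks the database's files on disk, which is itself the kind of relatively expensive recomputation the question is about; an in-server implementation would presumably rely on cached statistics instead.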
Re: Allow to specify (auto-)vacuum cost limits relative to the database/cluster size?
From
Jim Nasby
Date:
On 1/12/16 6:42 AM, Andres Freund wrote:
> Somehow computing the speed in relation to the cluster/database size is
> probably possible, but I wonder how we can do so without constantly
> re-computing something relatively expensive?

ISTM relpages would probably be good enough for this, if we take the extra step of getting the actual relation size when relpages is 0.

I'm not sure a straight scale factor is the way to go though... it seems that might be problematic? I think we'd at least want a minimum default value; you certainly don't want even a small system running vacuum at 1kB/s...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
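A minimal SQL sketch of that relpages-with-fallback estimate, just to make the idea concrete; an in-server implementation would read the same pg_class fields directly rather than run a query.

-- Use the cheap relpages estimate, falling back to the real size only when
-- relpages is still 0 (e.g. a table that has never been vacuumed/analyzed).
SELECT c.oid::regclass AS relation,
       CASE WHEN c.relpages = 0
            THEN pg_relation_size(c.oid) / current_setting('block_size')::int
            ELSE c.relpages::bigint
       END AS estimated_pages
FROM pg_class c
WHERE c.relkind = 'r';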
Re: Allow to specify (auto-)vacuum cost limits relative to the database/cluster size?
From
Robert Haas
Date:
On Tue, Jan 12, 2016 at 6:12 PM, Andres Freund <andres@anarazel.de> wrote:
> right now the defaults for autovacuum cost limiting are so low that they
> regularly cause problems for our users. It's not exactly obvious that
> any installation above a couple gigabytes definitely needs to change
> autovacuum_vacuum_cost_delay &
> autovacuum_vacuum_cost_limit/vacuum_cost_limit. Especially
> anti-wraparound/full table vacuums basically take forever with the
> default settings.
>
> On the other hand we don't want a database of a couple hundred megabytes
> to be vacuumed as fast as possible and trash the poor tiny system. So we
> can't just massively increase the limits by default; although I believe
> some default adjustment would be appropriate anyway.
>
> I wonder if it makes sense to compute the delays / limits in relation to
> either cluster or relation size. If you have a 10 TB table, you
> obviously don't want to scan it at a few megabytes a second, which the
> default settings will do for you. With that in mind we could just go for
> something like the autovacuum_*_scale_factor settings. But e.g. for
> partitioned workloads with hundreds of tables in the couple gigabyte
> range that'd not work that well.
>
> Somehow computing the speed in relation to the cluster/database size is
> probably possible, but I wonder how we can do so without constantly
> re-computing something relatively expensive?
>
> Thoughts?

Thanks for bringing this up. I fully agree we should try to do something about this. This comes up quite regularly in EnterpriseDB support discussions, and I'm sure lots of other people have problems with it too.

It seems to me that what we really want to do is try to finish vacuuming the table before we again need to vacuum the table. For the sake of simplicity, just consider the anti-wraparound case for a second. If it takes three days to vacuum the table and we consume 200 million XIDs in two days, we are pretty clearly not vacuuming fast enough.

I think we should do something similar to what we do for checkpoints. We estimate when the table will next need vacuuming based on the rate of XID advancement and the rate at which dead tuples are being created. We can also estimate what percentage of the relation we've vacuumed and derive some estimate of when we'll be done - perhaps assuming only one index pass, for the sake of simplicity. If we're behind, we should vacuum faster to try to catch up. We could even try to include some fudge factor in the calculation - e.g. if the time until the next vacuum is estimated to be 30 hours from the start of the current vacuum, we try to make the current vacuum finish in no more than 75% * 30 hours = 22.5 hours.

I think this is better than your proposal to scale it just based on the size of the relation, because it may be fine for the vacuum to run slowly if we're creating very few dead tuples and consuming very few XIDs. IME, there's one very specific scenario where the wheels come off, and that's when the table doesn't get fully vacuumed before it's due to be vacuumed again. Of course, anything we did here wouldn't be perfect - it would all be based on estimates - but I bet we could make things a lot better.

There's an even more global version of this problem, which is that you could have a situation where, when any given table gets vacuumed, it runs quickly enough to finish before that table gets vacuumed again, but there are lots of large tables, so overall we don't make enough progress. It would be nice to fix that, too, but even something simple that ignored that more global problem would help a lot of people.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
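To make the arithmetic of that pacing idea concrete, here is a back-of-the-envelope version as a throwaway query. All of the inputs are invented (a 10 GB heap, a quarter of it already scanned, the next vacuum predicted 30 hours after this one started, 6 hours already elapsed, and the 75% fudge factor from the example above); a real implementation would track these estimates inside vacuum itself.

-- Hypothetical numbers only: pages remaining, divided by the time left under
-- a deadline shrunk by the 75% fudge factor, gives the rate needed to finish.
SELECT round(
           (est.total_pages - est.scanned_pages)
           / (0.75 * est.secs_until_next_vacuum - est.elapsed_secs),
           1) AS required_pages_per_second
FROM (VALUES (1310720, 327680, 30 * 3600.0, 6 * 3600.0))
       AS est(total_pages, scanned_pages, secs_until_next_vacuum, elapsed_secs);

If the rate actually being achieved falls below that figure, the cost delay/limit would be relaxed to catch up; if it comfortably exceeds it, vacuum could stay throttled.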
Re: Allow to specify (auto-)vacuum cost limits relative to the database/cluster size?
From
Joe Conway
Date:
On 02/23/2016 10:23 PM, Robert Haas wrote:
> On Tue, Jan 12, 2016 at 6:12 PM, Andres Freund <andres@anarazel.de> wrote:
>> right now the defaults for autovacuum cost limiting are so low that they
>> regularly cause problems for our users. It's not exactly obvious that
>> any installation above a couple gigabytes definitely needs to change
>> autovacuum_vacuum_cost_delay &
>> autovacuum_vacuum_cost_limit/vacuum_cost_limit. Especially
>> anti-wraparound/full table vacuums basically take forever with the
>> default settings.
>>
>> On the other hand we don't want a database of a couple hundred megabytes
>> to be vacuumed as fast as possible and trash the poor tiny system. So we
>> can't just massively increase the limits by default; although I believe
>> some default adjustment would be appropriate anyway.
>>
>> I wonder if it makes sense to compute the delays / limits in relation to
>> either cluster or relation size. If you have a 10 TB table, you
>> obviously don't want to scan it at a few megabytes a second, which the
>> default settings will do for you. With that in mind we could just go for
>> something like the autovacuum_*_scale_factor settings. But e.g. for
>> partitioned workloads with hundreds of tables in the couple gigabyte
>> range that'd not work that well.
>>
>> Somehow computing the speed in relation to the cluster/database size is
>> probably possible, but I wonder how we can do so without constantly
>> re-computing something relatively expensive?
>>
>> Thoughts?
>
> Thanks for bringing this up. I fully agree we should try to do
> something about this. This comes up quite regularly in EnterpriseDB
> support discussions, and I'm sure lots of other people have problems
> with it too. It seems to me that what we really want to do is try to
> finish vacuuming the table before we again need to vacuum the table.
> For the sake of simplicity, just consider the anti-wraparound case for
> a second. If it takes three days to vacuum the table and we consume
> 200 million XIDs in two days, we are pretty clearly not vacuuming fast
> enough.

In my experience it is almost always best to run autovacuum very often and very aggressively. That generally means tuning scaling factor and thresholds as well, such that there are never more than say 50-100k dead rows. Then running vacuum with no delays or limits runs quite fast and is generally not noticeable/impactful.

However that strategy does not work well for vacuums which run long, such as an anti-wraparound vacuum. So in my opinion we need to think about this as at least two distinct cases requiring different solutions.

> I think we should do something similar to what we do for checkpoints.
> We estimate when the table will next need vacuuming based on the rate
> of XID advancement and the rate at which dead tuples are being
> created. We can also estimate what percentage of the relation we've
> vacuumed and derive some estimate of when we'll be done - perhaps
> assuming only one index pass, for the sake of simplicity. If we're
> behind, we should vacuum faster to try to catch up. We could even try
> to include some fudge factor in the calculation - e.g. if the time
> until the next vacuum is estimated to be 30 hours from the start of
> the current vacuum, we try to make the current vacuum finish in no
> more than 75% * 30 hours = 22.5 hours.

This seems reasonable for the anti-wraparound case.
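A hypothetical per-table illustration of the "very often and very aggressively" strategy described above; the table name and every value are invented for the example and are not recommendations from the thread.

-- Trigger autovacuum on an absolute dead-row count rather than a fraction of
-- the table, and let it run unthrottled once it starts.
ALTER TABLE busy_table SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold    = 50000,
    autovacuum_vacuum_cost_delay   = 0
);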
> I think this is better than your proposal to scale it just based on
> the size of the relation, because it may be fine for the vacuum to run
> slowly if we're creating very few dead tuples and consuming very few
> XIDs. IME, there's one very specific scenario where the wheels come
> off, and that's when the table doesn't get fully vacuumed before it's
> due to be vacuumed again. Of course, anything we did here wouldn't be
> perfect - it would all be based on estimates - but I bet we could make
> things a lot better. There's an even more global version of this
> problem, which is that you could have a situation where, when any given
> table gets vacuumed, it runs quickly enough to finish before that table
> gets vacuumed again, but there are lots of large tables, so overall we
> don't make enough progress. It would be nice to fix that, too, but even
> something simple that ignored that more global problem would help a
> lot of people.

This brings up a third scenario I've seen, which is lots (think thousands) of individual tables needing vacuum almost constantly. In that case autovacuum_naptime and autovacuum_max_workers also need tuning, or you never get to all of them.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
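The knobs for that third scenario are ordinary GUCs; a hypothetical example of raising them follows, with values that are illustrative only (and autovacuum_max_workers only takes effect after a restart).

-- Example only: more workers and a shorter nap so thousands of small, busy
-- tables all get visited.
ALTER SYSTEM SET autovacuum_max_workers = 10;  -- requires a server restart
ALTER SYSTEM SET autovacuum_naptime = '10s';   -- picked up on config reload
SELECT pg_reload_conf();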
Re: Allow to specify (auto-)vacuum cost limits relative to the database/cluster size?
From
Alvaro Herrera
Date:
Joe Conway wrote:

> In my experience it is almost always best to run autovacuum very often
> and very aggressively. That generally means tuning scaling factor and
> thresholds as well, such that there are never more than say 50-100k dead
> rows. Then running vacuum with no delays or limits runs quite fast and is
> generally not noticeable/impactful.
>
> However that strategy does not work well for vacuums which run long,
> such as an anti-wraparound vacuum. So in my opinion we need to think
> about this as at least two distinct cases requiring different solutions.

With the freeze map there is no need for anti-wraparound vacuums to be terribly costly, since they don't need to scan the whole table each time. That patch probably changes things a lot in this area.

--
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Allow to specify (auto-)vacuum cost limits relative to the database/cluster size?
From
Joe Conway
Date:
On 02/24/2016 08:54 AM, Alvaro Herrera wrote:
> Joe Conway wrote:
>
>> In my experience it is almost always best to run autovacuum very often
>> and very aggressively. That generally means tuning scaling factor and
>> thresholds as well, such that there are never more than say 50-100k dead
>> rows. Then running vacuum with no delays or limits runs quite fast and is
>> generally not noticeable/impactful.
>>
>> However that strategy does not work well for vacuums which run long,
>> such as an anti-wraparound vacuum. So in my opinion we need to think
>> about this as at least two distinct cases requiring different solutions.
>
> With the freeze map there is no need for anti-wraparound vacuums to be
> terribly costly, since they don't need to scan the whole table each
> time. That patch probably changes things a lot in this area.

Yes, I had forgotten about that. It would be a huge help.

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development