Re: Should we increase the default vacuum_cost_limit? - Mailing list pgsql-hackers
From | David Rowley
---|---
Subject | Re: Should we increase the default vacuum_cost_limit?
Date |
Msg-id | CAKJS1f8s5Cxat65_-uE97oUpoonO3p_2rFMf-cKA7kpAaS1-OA@mail.gmail.com
In response to | Re: Should we increase the default vacuum_cost_limit? (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: Should we increase the default vacuum_cost_limit?
 | Re: Should we increase the default vacuum_cost_limit?
List | pgsql-hackers
On Sat, 9 Mar 2019 at 16:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I propose therefore that instead of increasing vacuum_cost_limit,
> what we ought to be doing is reducing vacuum_cost_delay by a similar
> factor.  And, to provide some daylight for people to reduce it even
> more, we ought to arrange for it to be specifiable in microseconds
> not milliseconds.  There's no GUC_UNIT_US right now, but it's time.
> (Perhaps we should also look into using other delay APIs, such as
> nanosleep(2), where available.)

It does seem like a genuine concern that the current behaviour is too much all or nothing. It's no good being on a high-speed train if it stops at every platform. I agree that vacuum_cost_delay might not be granular enough.

However, if we're going to change vacuum_cost_delay into microseconds, then I'm a little concerned that it'll silently break existing code that sets it. Scripts that do manual off-peak vacuums are pretty common out in the wild.

In an ideal world we'd just redesign the vacuum throttling to have MB/s limits for hit/read/dirty, and possibly also a WAL write rate. I'm not sure exactly how they'd cooperate, but we could likely minimise gettimeofday() calls by sampling the time it took to process N pages, and if those N pages didn't take the time we wanted them to take, we could set N = Max(N * ($target_gettimeofday_sample_rate / $timetaken), 1). For example, if N was 2000 and it just took us 1 second to do 2000 pages, but we want to sleep every millisecond, then we'd do N *= (0.001 / 1), so on the next run we only do 2 pages before checking how long we should sleep for. If we happened to process those 2 pages in 0.5 milliseconds, then N would become 4, and so on. We'd just need to hard-code $target_gettimeofday_sample_rate; probably 1 millisecond would be about right. We'd also need to guess the first value of N, but if we guess a low value it'll be quick to correct itself after the first batch of pages (a rough sketch of this is below).

If anyone thinks that idea has any potential, then maybe it's better to leave the new vacuum_cost_limit default in place and consider redesigning this for PG13, as such a change is too late for PG12.

It may also be possible to make this a vacuum rate limit in %. Say, 10% would just sleep for 10x as long as it took to process the last set of pages. The problem with this is that if the server was under heavy load then auto-vacuum might crawl along, but that might be the exact opposite of what's required, as it might be crawling due to inadequate vacuuming.

> I don't have any particular objection to kicking up the maximum
> value of vacuum_cost_limit by 10X or so, if anyone's hot to do that.
> But that's not where we ought to be focusing our concern.  And there
> really is a good reason, not just nannyism, not to make that
> setting huge --- it's just the wrong thing to do, as compared to
> reducing vacuum_cost_delay.

My vote is to 10x the maximum for vacuum_cost_limit and consider changing how it all works in PG13. If nothing happens before this time next year then we can consider making vacuum_cost_delay a microseconds GUC.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
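
For illustration, here is a rough, self-contained sketch of the batch-size sampling idea described above. All names in it (vacuum_with_sampling, pages_per_sample, target_sample_secs, process_one_page) are made up for this sketch; it is not actual PostgreSQL code and does not use any existing GUCs.

/*
 * Rough sketch only: adapt the number of pages processed between clock
 * checks so that each batch takes about target_sample_secs.  Nothing here
 * is real PostgreSQL code; process_one_page() is a stand-in for the
 * per-page vacuum work.
 */
#include <sys/time.h>

extern void process_one_page(void);    /* stand-in for the real per-page work */

static double
elapsed_secs(const struct timeval *a, const struct timeval *b)
{
    return (b->tv_sec - a->tv_sec) + (b->tv_usec - a->tv_usec) / 1e6;
}

void
vacuum_with_sampling(long total_pages)
{
    const double target_sample_secs = 0.001;    /* check the clock about every 1 ms */
    long        pages_per_sample = 2000;        /* initial guess; self-corrects after one batch */
    long        done = 0;

    while (done < total_pages)
    {
        struct timeval  start,
                        end;
        long        batch = pages_per_sample;
        double      taken;

        if (batch > total_pages - done)
            batch = total_pages - done;

        gettimeofday(&start, NULL);
        for (long i = 0; i < batch; i++)
            process_one_page();
        gettimeofday(&end, NULL);
        done += batch;

        taken = elapsed_secs(&start, &end);

        /*
         * Scale the batch so the next one should take roughly
         * target_sample_secs, never dropping below one page.  e.g. 2000
         * pages in 1 second with a 1 ms target gives a next batch of 2.
         */
        if (taken > 0)
        {
            pages_per_sample = (long) (pages_per_sample *
                                       (target_sample_secs / taken));
            if (pages_per_sample < 1)
                pages_per_sample = 1;
        }

        /* ... decide how long to sleep here, based on the cost accumulated ... */
    }
}

The initial guess of 2000 pages matches the example above: if the first batch takes 1 second against a 1 ms target, the next batch drops to 2 pages, and from there the batch size tracks whatever keeps the clock checks roughly 1 ms apart.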
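
Similarly, a minimal sketch of the percentage-style rate limit, assuming a hypothetical vacuum_rate_percent setting: for 10%, sleep ten times as long as the last set of pages took to process, as described above.

/*
 * Rough sketch only: vacuum_rate_percent and throttle_sleep are
 * illustrative names, not existing GUCs or functions.
 */
#include <time.h>

void
throttle_sleep(double work_secs, double vacuum_rate_percent)
{
    /* 10% => sleep 10x as long as the work took, per the description above */
    double          sleep_secs = work_secs * (100.0 / vacuum_rate_percent);
    struct timespec ts;

    ts.tv_sec = (time_t) sleep_secs;
    ts.tv_nsec = (long) ((sleep_secs - (double) ts.tv_sec) * 1e9);
    nanosleep(&ts, NULL);           /* nanosleep(2), as mentioned in the quoted proposal */
}

A strict 10% duty cycle would actually mean sleeping about 9x the work time rather than 10x; the sketch simply follows the "10x as long" wording above, and how the percentage is defined would be a detail of any real design.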