Re: Vacuum rate limit in KBps - Mailing list pgsql-hackers
From | Jeff Janes |
---|---|
Subject | Re: Vacuum rate limit in KBps |
Date | |
Msg-id | CAMkU=1zCnqkp1oZqhXR-YQ+HUkTApW-=zR1Y4psbcUOFt7Y5mg@mail.gmail.com |
In response to | Vacuum rate limit in KBps (Greg Smith <greg@2ndQuadrant.com>) |
List | pgsql-hackers |
On Sun, Jan 15, 2012 at 12:24 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> So far the reaction I've gotten from my recent submission to make autovacuum
> log its read/write in MB/s has been rather positive. I've been surprised at
> the unprecedented (to me at least) amount of backporting onto big production
> systems it's gotten. There is a whole lot of pent up frustration among
> larger installs over not having good visibility into how changing cost-based
> vacuum parameters turns into real-world units.
>
> That got me thinking: if MB/s is what everyone wants to monitor, can we
> provide a UI to set these parameters that way too? The attached patch is a
> bit rough still, but it does that. The key was recognizing that the cost
> delay plus cost limit can be converted into an upper limit on cost units per
> second, presuming the writes themselves are free. If you then also assume
> the worst case--that everything will end up dirty--by throwing in the block
> size, too, you compute a maximum rate in MB/s. That represents the fastest
> you can possibly write.

Since this is mostly a usability patch, I was looking at the documentation, trying to pretend I'm an end user who hasn't seen the sausage being made.

I think the doc changes are too conservative. "When using cost-based vacuuming" should be something like "When using rate-limited vacuuming".

What does the "cost" mean in the variable "vacuum_cost_rate_limit"? It seems to be a genuflection to the past--the past we think is too confusing. Since this is the primary knob the end user is expected to use, the fact that we use these "cost things" to implement rate-limited vacuuming is a detail that should not be reflected in the variable name, so "vacuum_rate_limit" seems better. Leave the cost stuff to the advanced users who want to read beyond the primary knob.

Whether I want to rate-limit the vacuum at all should be determined by vacuum_rate_limit instead of by setting vacuum_cost_delay. vacuum_rate_limit=0 should mean unlimited. I think it is pretty intuitive that, in cases where a literal 0 makes no sense, 0 really means infinity, and that convention is used in other places.

I think it is confusing to have more variables than there are degrees of freedom. If we want one knob which is largely writes but mixes in some reads and simple page visits as well, then I think vacuum_cost_page_dirty should go away (effectively be fixed at 1.0), and vacuum_cost_page_miss would default to 0.5 and vacuum_cost_page_hit to 0.05.

Also, in the current patch, in addition to the overflow at high rate limits, there is a rounding-to-zero at low rate limits that leads to floating-point exceptions.

PST:LOG: cost limit=0 based on rate limit=10 KB/s delay=20 dirty cost=20
PST:STATEMENT: VACUUM VERBOSE t;
PST:ERROR: floating-point exception
PST:DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
PST:STATEMENT: VACUUM VERBOSE t;

Cheers,

Jeff
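
For readers following the arithmetic, the conversion described in the quoted text and the rounding-to-zero case from the log can be sketched in a few lines of Python. This is only an illustration, not the patch's code: the function names are hypothetical, an 8192-byte block size is assumed, and the truncating inverse formula is a guess at how a cost limit of 0 could arise from the logged inputs (rate limit=10 KB/s, delay=20, dirty cost=20). The `safe_cost_limit` helper shows one possible guard, combining the "0 means unlimited" convention proposed above with an assumed clamp to a minimum of 1.

```python
BLOCK_SIZE = 8192  # bytes per page; assumed default block size


def max_write_rate_kb(cost_limit, cost_delay_ms, page_dirty_cost,
                      block_size=BLOCK_SIZE):
    """Worst-case write rate in KB/s: every page visited ends up dirty."""
    pages_per_round = cost_limit / page_dirty_cost  # pages before each sleep
    rounds_per_sec = 1000.0 / cost_delay_ms         # sleeps per second
    return pages_per_round * rounds_per_sec * block_size / 1024.0


def cost_limit_from_rate(rate_kb, cost_delay_ms, page_dirty_cost,
                         block_size=BLOCK_SIZE):
    """Inverse conversion, truncated to an integer (assumed behaviour)."""
    pages_per_sec = rate_kb * 1024.0 / block_size
    cost_per_sec = pages_per_sec * page_dirty_cost
    return int(cost_per_sec * cost_delay_ms / 1000.0)  # truncation happens here


def safe_cost_limit(rate_kb, cost_delay_ms, page_dirty_cost,
                    block_size=BLOCK_SIZE):
    """Hypothetical guard: 0 means unlimited (None), otherwise never below 1."""
    if rate_kb == 0:
        return None
    return max(1, cost_limit_from_rate(rate_kb, cost_delay_ms,
                                       page_dirty_cost, block_size))


if __name__ == "__main__":
    # Forward direction: cost limit 200, delay 20 ms, dirty cost 20.
    print(max_write_rate_kb(200, 20, 20))
    # Inputs from the log above: the exact value is 0.5, which truncates to 0.
    print(cost_limit_from_rate(10, 20, 20))
    # With the guard, the same inputs yield a usable limit.
    print(safe_cost_limit(10, 20, 20))
```

A cost limit of 0 is what makes any later division by the limit blow up, which is consistent with the floating-point exception reported in the log. Whether clamping, rejecting the setting, or widening the arithmetic is the right fix is left open here.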