On 5/24/13 8:21 AM, Robert Haas wrote:
> On Thu, May 23, 2013 at 7:27 PM, Greg Smith <greg@2ndquadrant.com> wrote:
>> I'm working on a new project here that I wanted to announce, just to keep
>> from duplicating effort in this area. I've started to add a cost limit
>> delay for regular statements. The idea is that you set a new
>> statement_cost_delay setting before running something, and it will restrict
>> total resources the same way autovacuum does. I'll be happy with it when
>> it's good enough to throttle I/O on SELECT and CREATE INDEX CONCURRENTLY.
> Cool. We have an outstanding customer request for this type of
> functionality, although in that case I think the desire is more along
> the lines of being able to throttle writes rather than reads.
>
> But I wonder if we wouldn't be better off coming up with a little more
> user-friendly API. Instead of exposing a cost delay, a cost limit,
> and various charges, perhaps we should just provide limits measured in
> KB/s, like dirty_rate_limit = <amount of data you can dirty per
> second, in kB> and read_rate_limit = <amount of data you can read into
> shared buffers per second, in kB>. This is less powerful than what we
> currently offer for autovacuum, which allows you to come up with a
> "blended" measure of when vacuum has done too much work, but I don't
> have a lot of confidence that it's better in practice.
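
For concreteness, I assume enforcing a limit like that would end up looking
a lot like the existing vacuum cost delay loop, just denominated in bytes
instead of cost points. Roughly something like this (only a sketch --
read_rate_limit, bytes_read_in_window and charge_read_bytes are all made-up
names, nothing like this exists in the tree today):

    /* Charge each physical page read against a per-second byte budget. */
    static int64 bytes_read_in_window = 0;

    static void
    charge_read_bytes(int nbytes)
    {
        bytes_read_in_window += nbytes;

        /* Once a full second's allowance is used up, stop and wait. */
        if (bytes_read_in_window >= (int64) read_rate_limit * 1024)
        {
            pg_usleep(1000000L);    /* crude: read_rate_limit is in kB/s */
            bytes_read_in_window = 0;
        }
    }
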
Doesn't that hit the old issue of not knowing if a read came from FS cache
or disk? I realize that the current cost_delay mechanism suffers from that
too, but since the API is lower level, that restriction is much more
apparent. Instead of KB/s, could we look at how much time one process is
spending waiting on IO vs the rest of the cluster? Is it reasonable for us
to measure IO wait time for every request, at least on the most popular
OSes?
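
The simplest thing I can think of is just timing the reads ourselves,
something like the sketch below (again, timed_read and backend_io_wait are
invented names; a real version would presumably hook into the existing fd.c
read path rather than wrap read() directly):

    #include <unistd.h>
    #include "portability/instr_time.h"

    /* Accumulated time this backend has spent blocked on reads. */
    static instr_time backend_io_wait;

    static ssize_t
    timed_read(int fd, void *buf, size_t len)
    {
        instr_time  start, elapsed;
        ssize_t     rc;

        INSTR_TIME_SET_CURRENT(start);
        rc = read(fd, buf, len);
        INSTR_TIME_SET_CURRENT(elapsed);

        INSTR_TIME_SUBTRACT(elapsed, start);    /* elapsed = end - start */
        INSTR_TIME_ADD(backend_io_wait, elapsed);

        return rc;
    }

That only tells us how long this backend waited, though; comparing against
the rest of the cluster would still require aggregating those per-backend
numbers somewhere shared.
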
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net