Re: Cost limited statements RFC - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Cost limited statements RFC
Date:
Msg-id: CA+TgmoZVY=zsBbY8ERem=VG_XKOYQLFhhDD5XqjZ9JFr5GUcLA@mail.gmail.com
In response to: Re: Cost limited statements RFC (Jeff Janes <jeff.janes@gmail.com>)
Responses: Re: Cost limited statements RFC
List: pgsql-hackers
On Sat, Jun 8, 2013 at 4:43 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I don't know what two independent settings would look like. Say you keep
> two independent counters, where each can trigger a sleep, and the
> triggering of that sleep clears only its own counter. Now you still have
> a limit on the linear combination, it is just that the summation has
> moved to a different location. You have two independent streams of
> sleeps, but they add up to the same amount of sleeping as a single
> stream based on a summed counter.
>
> Or if one sleep clears both counters (the one that triggered it and the
> other one), I don't think that that is what I would call independent
> either. Or at least not if it has no memory. The intuitive meaning of
> independent would require that it keep track of which of the two
> counters was "controlling" over the last few seconds. Am I overthinking
> this?

Yep. Suppose the user has a read limit of 64 MB/s and a dirty limit of
4 MB/s. That means that, each second, we can read 8192 buffers and dirty
512 buffers. If we sleep for 20 ms (1/50th of a second), that "covers" 163
buffer reads and 10 buffer writes, so we just reduce the accumulated
counters by those amounts (minimum zero); see the sketch after this
message.

> Also, in all the anecdotes I've been hearing about autovacuum causing
> problems from too much IO, in which people can identify the specific
> problem, it has always been the write pressure, not the read, that
> caused the problem. Should the default be to have the read limit be
> inactive and rely on the dirty-limit to do the throttling?

The main time I think you're going to hit the read limit is during
anti-wraparound vacuums. That problem may be gone in 9.4, if Heikki writes
that patch we were discussing just recently. But at the moment, we'll do
periodic rescans of relations that are already all-frozen, and that's
potentially expensive. So I'm not particularly skeptical about the need
to throttle reads. I suspect many people don't need it, but there are
probably some who do, at least for anti-wraparound cases - especially on
EC2, where the limit on I/O is often the GigE card.

What I *am* skeptical about is the notion that people need the precise
value of the write limit to depend on how many of the pages read are
being found in shared_buffers versus not. That's essentially what the
present system is accomplishing - at a great cost in user-visible
complexity.

Basically, I think that anti-wraparound vacuums may need either read
throttling or write throttling depending on whether the data is already
frozen, and regular vacuums probably only need write-throttling. But I
have neither any firsthand experience nor any empirical reason to presume
that the write limit needs to be lower when the read-rate is high.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
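Below is a minimal sketch of the accounting described in the message: two
independent counters, one fixed-length nap triggered when either counter
runs over, and both counters credited with whatever I/O that nap "covers".
It is not from the thread and not PostgreSQL's actual vacuum-cost code;
the names, constants, and the nap_milliseconds() helper are assumptions.

```c
/*
 * Illustrative sketch only: separate read and dirty counters, a fixed
 * 20 ms nap when either exceeds what one nap pays for, and both counters
 * reduced by the nap's budget afterwards (floored at zero).
 */
#include <math.h>
#include <stdbool.h>

#define BLCKSZ              8192    /* bytes per buffer (assumed 8 kB) */
#define READ_LIMIT_MBPS     64.0    /* user-set read limit */
#define DIRTY_LIMIT_MBPS    4.0     /* user-set dirty limit */
#define NAP_MS              20.0    /* fixed sleep length */

/*
 * Buffers one nap covers at each limit:
 *   64 MB/s = 8192 buffers/s -> ~163 buffer reads per 20 ms nap
 *    4 MB/s =  512 buffers/s -> ~10 buffer writes per 20 ms nap
 */
static const double read_per_nap =
    READ_LIMIT_MBPS * 1024.0 * 1024.0 / BLCKSZ * (NAP_MS / 1000.0);
static const double dirty_per_nap =
    DIRTY_LIMIT_MBPS * 1024.0 * 1024.0 / BLCKSZ * (NAP_MS / 1000.0);

static double read_balance = 0.0;   /* buffer reads since last nap */
static double dirty_balance = 0.0;  /* buffers dirtied since last nap */

extern void nap_milliseconds(double ms);    /* assumed sleep primitive */

/*
 * Account for one buffer access: 'was_read' means the buffer had to be
 * read in from disk, 'dirtied' means we dirtied it.
 */
void
cost_account(bool was_read, bool dirtied)
{
    if (was_read)
        read_balance += 1.0;
    if (dirtied)
        dirty_balance += 1.0;

    /*
     * If either counter has used up what one nap pays for, sleep once,
     * then reduce *both* counters by the I/O that nap covered, with a
     * floor of zero.
     */
    if (read_balance >= read_per_nap || dirty_balance >= dirty_per_nap)
    {
        nap_milliseconds(NAP_MS);
        read_balance = fmax(0.0, read_balance - read_per_nap);
        dirty_balance = fmax(0.0, dirty_balance - dirty_per_nap);
    }
}
```

The point of crediting both counters after a single nap is that the two
limits stay independent rather than being folded into one linear
combination: a workload that only reads ends up sleeping at the read
rate, one that only dirties sleeps at the dirty rate, and whichever
resource is being consumed faster is the one that controls the sleeping.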