Thread: Foreground vacuum and buffer access strategy
If I invoke vacuum manually and do so with VacuumCostDelay == 0, I have basically declared my intentions to get this pain over with as fast as possible even if it might interfere with other processes.

Under that condition, shouldn't it use BAS_BULKWRITE rather than BAS_VACUUM? The smaller ring size leads to a lot of synchronous WAL flushes which I think can slow the vacuum down a lot.

Cheers,

Jeff
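To make the ring-size effect concrete, here is a minimal toy model, not PostgreSQL code. It assumes every page vacuum touches is dirtied and emits one WAL record, that nothing else flushes WAL in the background, and that the two rings hold 32 and 2048 pages (256 kB and 16 MB at an 8 kB block size, which are assumptions about the BAS_VACUUM and BAS_BULKWRITE rings rather than values stated in this thread). The function name and constants are hypothetical.

```python
def count_sync_wal_flushes(table_pages: int, ring_size: int) -> int:
    """Count forced WAL flushes while a vacuum-like scan cycles through a ring.

    Toy model: each page is dirtied and gets the next WAL position (LSN).
    Before a ring slot can be reused, the dirty page it holds must be written
    out, and WAL must be flushed up to that page's LSN first.
    """
    ring_lsn = [0] * ring_size  # LSN of the dirty page in each slot (0 = empty)
    insert_lsn = 0              # last WAL position generated so far
    flushed_lsn = 0             # WAL is durable up to this position
    sync_flushes = 0

    for page in range(table_pages):
        slot = page % ring_size
        if ring_lsn[slot] > flushed_lsn:
            # The old page's WAL is not yet durable, so the scan has to
            # flush WAL synchronously before it can evict the page.
            flushed_lsn = insert_lsn
            sync_flushes += 1
        insert_lsn += 1
        ring_lsn[slot] = insert_lsn

    return sync_flushes


# Assumed sizes, in 8 kB pages (hypothetical constants for illustration).
VACUUM_RING_PAGES = 32       # 256 kB ring
BULKWRITE_RING_PAGES = 2048  # 16 MB ring
TABLE_PAGES = 131072         # a 1 GB table

if __name__ == "__main__":
    for name, size in (("small ring", VACUUM_RING_PAGES),
                       ("large ring", BULKWRITE_RING_PAGES)):
        print(name, count_sync_wal_flushes(TABLE_PAGES, size))
```

In this model the scan is forced to flush roughly once per trip around the ring, so the 1 GB table costs 4095 synchronous flushes with the small ring and 63 with the large one. A real system will differ, since the WAL writer and other backends also flush WAL and not every page is dirtied.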
On Fri, May 25, 2012 at 4:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> If I invoke vacuum manually and do so with VacuumCostDelay == 0, I
> have basically declared my intentions to get this pain over with as
> fast as possible even if it might interfere with other processes.
>
> Under that condition, shouldn't it use BAS_BULKWRITE rather than
> BAS_VACUUM? The smaller ring size leads to a lot of synchronous WAL
> flushes which I think can slow the vacuum down a lot.

Of course, an autovacuum of a really big table could run too slowly, too, even though it's not a foreground task.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Reviving a very old thread, because I've run into the issue again.

On Tue, May 29, 2012 at 11:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 25, 2012 at 4:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> If I invoke vacuum manually and do so with VacuumCostDelay == 0, I
>> have basically declared my intentions to get this pain over with as
>> fast as possible even if it might interfere with other processes.
>>
>> Under that condition, shouldn't it use BAS_BULKWRITE rather than
>> BAS_VACUUM? The smaller ring size leads to a lot of synchronous WAL
>> flushes which I think can slow the vacuum down a lot.
>
> Of course, an autovacuum of a really big table could run too slowly,
> too, even though it's not a foreground task.

True. But almost by definition, an autovacuum is not trying to run inside a maintenance window.

Would it be reasonable to upgrade the ring buffer size whenever VacuumCostDelay is zero, regardless of whether it is a manual or an auto vac? One thing I worry about is that many people may have changed autovacuum_vacuum_cost_delay from 20 directly to 0 or -1, and the accidental throttling on WAL syncs might be the only thing preventing their system from falling over each time autovac of a large table kicks in.

Cheers,

Jeff
On Mon, Aug 12, 2013 at 11:47 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Reviving a very old thread, because I've run into the issue again.
> On Tue, May 29, 2012 at 11:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, May 25, 2012 at 4:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>> If I invoke vacuum manually and do so with VacuumCostDelay == 0, I
>>> have basically declared my intentions to get this pain over with as
>>> fast as possible even if it might interfere with other processes.
>>>
>>> Under that condition, shouldn't it use BAS_BULKWRITE rather than
>>> BAS_VACUUM? The smaller ring size leads to a lot of synchronous WAL
>>> flushes which I think can slow the vacuum down a lot.
>>
>> Of course, an autovacuum of a really big table could run too slowly,
>> too, even though it's not a foreground task.
>
> True. But almost by definition, an autovacuum is not trying to run
> inside a maintenance window.
>
> Would it be reasonable to upgrade the ring buffer size whenever
> VacuumCostDelay is zero, regardless of whether it is a manual or an
> auto vac? One thing I worry about is that many people may have
> changed autovacuum_vacuum_cost_delay from 20 directly to 0 or -1, and
> the accidental throttling on WAL syncs might be the only thing
> preventing their system from falling over each time autovac of a large
> table kicks in.

I'm not sure what the right thing to do here is, but I definitely agree there's a problem. There are definitely cases where people want or indeed need to vacuum as fast as possible, and using a small ring buffer is not the way to do that. Now, tying that to VacuumCostDelay doesn't seem right, because setting that to 0 shouldn't suddenly change the behavior in other ways, as well. In general, the approach we've taken so far has been to try to hide the ring-buffer behavior from users and not make it tunable, but I'm not sure we can really get away with that in this case.

Increasing the ring-buffer size has system-wide performance implications which could be very good (less bloat) or very bad (I/O starvation of concurrent activity). I don't think the system knows enough to guess which one will be better in any particular case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Aug 13, 2013 at 3:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I'm not sure what the right thing to do here is, but I definitely
> agree there's a problem. There are definitely cases where people want
> or indeed need to vacuum as fast as possible, and using a small ring
> buffer is not the way to do that.

I'm not convinced using a ring buffer is necessarily that bad even if you want to vacuum as fast as possible. The reason we use a small ring buffer is to avoid poisoning the entire cache with vacuum pages, not to throttle the speed of vacuum by introducing synchronous WAL flushes.

I think we should increase the size of the ring buffer if we hit a synchronous WAL buffer flush and there is less than some amount of WAL pending. That amount is the relevant thing people might want to limit to avoid slowing down other transaction commits. The walwriter might even provide a relevant knob already for how much WAL should be the maximum pending.

--
greg
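The rule proposed above (grow the ring when a synchronous WAL flush is hit and little WAL is pending) could be sketched roughly as follows. This is only an illustration of the decision, not an existing PostgreSQL interface: the function name, every parameter, the doubling step, and the upper bound are hypothetical choices that the message does not specify.

```python
def next_ring_size(ring_pages: int,
                   hit_sync_wal_flush: bool,
                   pending_wal_bytes: int,
                   pending_wal_limit_bytes: int,
                   max_ring_pages: int) -> int:
    """Return the ring size to use after reusing one ring buffer.

    The ring grows only when the scan was just forced into a synchronous WAL
    flush AND the amount of unflushed WAL is below the configured limit.
    Doubling and the cap are assumptions made for this sketch.
    """
    if hit_sync_wal_flush and pending_wal_bytes < pending_wal_limit_bytes:
        return min(ring_pages * 2, max_ring_pages)
    return ring_pages


if __name__ == "__main__":
    ring = 32                # starting ring size in pages (assumed)
    limit = 16 * 1024 ** 2   # hypothetical cap on pending WAL: 16 MB
    # (hit a synchronous flush?, bytes of WAL pending at that moment)
    observations = [(True, 1 * 1024 ** 2),
                    (True, 64 * 1024 ** 2),
                    (False, 0),
                    (True, 2 * 1024 ** 2)]
    for hit, pending in observations:
        ring = next_ring_size(ring, hit, pending, limit, max_ring_pages=2048)
        print(ring)
```

Note that the decision depends on a single instantaneous reading of pending WAL, so the value sampled at the moment of the flush determines how the rest of the scan behaves.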
On Wed, Aug 14, 2013 at 1:41 AM, Greg Stark <stark@mit.edu> wrote:
> On Tue, Aug 13, 2013 at 3:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> I'm not sure what the right thing to do here is, but I definitely
>> agree there's a problem. There are definitely cases where people want
>> or indeed need to vacuum as fast as possible, and using a small ring
>> buffer is not the way to do that.
>
> I'm not convinced using a ring buffer is necessarily that bad even if
> you want to vacuum as fast as possible. The reason we use a small ring
> buffer is to avoid poisoning the entire cache with vacuum pages, not
> to throttle the speed of vacuum by introducing synchronous wal
> flushes.
>
> I think we should increase the size of the ring buffer if we hit a
> synchronous wal buffer flush and there is less than some amount of wal
> pending.

It would be better if the decision to increase the ring buffer also considered other activity; otherwise it can lead to more I/O due to buffer replacements by backends. I am not sure there is currently any way to check that, but if we maintain buffers on the free list, then it can be used to gauge the current activity (if there are enough buffers on the free list, the ring size can be increased, as that indicates the system is relatively less busy).

> That amount is the relevant thing people might want to limit
> to avoid slowing down other transaction commits. The walwriter might
> even provide a relevant knob already for how much wal should be the
> maximum pending.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Aug 13, 2013 at 4:11 PM, Greg Stark <stark@mit.edu> wrote:
> I'm not convinced using a ring buffer is necessarily that bad even if
> you want to vacuum as fast as possible. The reason we use a small ring
> buffer is to avoid poisoning the entire cache with vacuum pages, not
> to throttle the speed of vacuum by introducing synchronous wal
> flushes.

Right, but the DBA, being God, is entitled to override that. A regular user should not be able to change system policy here, but if a superuser wants to do it, who are we to say no?

> I think we should increase the size of the ring buffer if we hit a
> synchronous wal buffer flush and there is less than some amount of wal
> pending. That amount is the relevant thing people might want to limit
> to avoid slowing down other transaction commits. The walwriter might
> even provide a relevant knob already for how much wal should be the
> maximum pending.

I doubt that would work out; the amount of WAL pending is going to change extremely rapidly. You can't increase the size of the ring buffer for a vacuum that might run for another 24 hours on the basis of an instantaneous measurement whose value might be completely different a few milliseconds later or earlier. Also, if there is a lot of WAL pending, then the system is likely I/O saturated, which might be exactly the wrong time to allow more cache poisoning.

Auto-tuning is nice when we can do it, but you can't auto-guess-what-the-human-wants.

...Robert