Re: cost based vacuum (parallel) - Mailing list pgsql-hackers
From: Masahiko Sawada
Subject: Re: cost based vacuum (parallel)
Msg-id: CA+fd4k4T2udSkcDWKix1s18bKMVworsRXm0ZAujtQ7tJk0XAUg@mail.gmail.com
In response to: Re: cost based vacuum (parallel) (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Fri, 15 Nov 2019 at 11:54, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 13, 2019 at 10:02 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > I've done some tests while changing shared buffer size, delays and
> > number of workers. The overall results have a similar tendency to the
> > results shared by Dilip and look reasonable to me.
>
> Thanks, Sawada-san, for repeating the tests. I can see from yours,
> Dilip's and Mahendra's testing that the delay is distributed depending
> on the I/O done by a particular worker, and that the total I/O is also
> as expected in various kinds of scenarios. So, I think this is the
> better approach. Do you agree, or do you think we should still
> investigate the other approach further as well?
>
> I would like to summarize this approach. The basic idea for parallel
> vacuum is to allow the parallel workers and the master backend to have
> a shared view of the vacuum cost related parameters (mainly
> VacuumCostBalance) and allow each worker to update it and then, based
> on that, decide whether it needs to sleep. With this basic idea, we
> found that in some cases the throttling is not accurate, as explained
> with an example in my email above [1] and in the tests performed by
> Dilip and others in the following emails (in short, the workers doing
> more I/O can be throttled less). Then, as discussed in a later email
> [2], we tried a way to avoid putting to sleep the workers that have
> done less or no I/O compared to other workers. This ensured that
> workers doing more I/O get throttled more. The idea is to allow a
> worker to sleep only if it has performed I/O above a certain threshold
> and the overall balance is more than the cost_limit set by the system.
> We then let the worker sleep proportionally to the work done by it and
> reduce VacuumSharedCostBalance by the amount consumed by the current
> worker. This scheme leads to the desired throttling of different
> workers based on the work done by each individual worker.
>
> We have tested this idea with various kinds of workloads, varying
> shared buffer size, delays and the number of workers. We have also
> tried different numbers of indexes and workers. In all the tests, we
> found that each worker is throttled proportionally to the I/O it does.

Thank you for summarizing! I agree with this approach.

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
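The throttling scheme summarized in the quoted text can be sketched roughly as below. This is a hypothetical illustration, not the actual PostgreSQL patch: the function name `worker_check_delay`, the threshold `min_local_io`, and all concrete numbers are invented for the example; only the shared balance and cost limit correspond to `VacuumSharedCostBalance` and the system's `cost_limit` from the thread.

```c
#include <assert.h>

/*
 * Hypothetical sketch of the shared-balance throttling described above.
 * shared_balance stands in for VacuumSharedCostBalance; the threshold
 * and function names are invented for illustration.
 */
static double shared_balance = 0.0;      /* shared across all workers */
static const double cost_limit = 200.0;  /* stands in for the system's cost_limit */
static const double min_local_io = 50.0; /* hypothetical per-worker I/O threshold */

/*
 * Called by a worker after it accrues 'added_cost' of I/O cost.
 * Returns the time to sleep in milliseconds (0 means no sleep).
 */
static double
worker_check_delay(double *local_balance, double added_cost,
                   double base_delay_ms)
{
    *local_balance += added_cost;
    shared_balance += added_cost;

    /*
     * Sleep only if this worker has itself done I/O above the threshold
     * AND the overall shared balance has exceeded the cost limit; this
     * keeps workers that did little or no I/O from being throttled.
     */
    if (*local_balance < min_local_io || shared_balance < cost_limit)
        return 0.0;

    /*
     * Sleep proportionally to this worker's share of the accumulated
     * balance, then deduct the amount this worker consumed from the
     * shared balance so other workers are charged only for their part.
     */
    double share = *local_balance / shared_balance;
    double sleep_ms = base_delay_ms * share * (shared_balance / cost_limit);

    shared_balance -= *local_balance;
    *local_balance = 0.0;
    return sleep_ms;
}
```

Tracing through the numbers shows the property the tests above report: a worker that has accumulated more cost sleeps longer, while a worker below its local threshold is not put to sleep at all.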