From Masahiko Sawada
Subject Re: cost based vacuum (parallel)
Msg-id CA+fd4k4T2udSkcDWKix1s18bKMVworsRXm0ZAujtQ7tJk0XAUg@mail.gmail.com
In response to Re: cost based vacuum (parallel)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, 15 Nov 2019 at 11:54, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 13, 2019 at 10:02 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > I've done some tests while changing shared buffer size, delays and
> > number of workers. The overall results show a similar tendency to the
> > results shared by Dilip and look reasonable to me.
> >
>
> Thanks, Sawada-san, for repeating the tests.  I can see from yours,
> Dilip and Mahendra's testing that the delay is distributed depending
> upon the I/O done by a particular worker and the total I/O is also as
> expected in various kinds of scenarios.  So, I think this is a better
> approach.  Do you agree, or do you think we should still investigate
> another approach as well?
>
> I would like to summarize this approach.  The basic idea for parallel
> vacuum is to allow the parallel workers and master backend to have a
> shared view of vacuum cost related parameters (mainly
> VacuumCostBalance) and allow each worker to update it and then based
> on that decide whether it needs to sleep.  With this basic idea, we
> found that in some cases the throttling is not accurate as explained
> with an example in my email above [1] and then the tests performed by
> Dilip and others in the following emails (In short, the workers doing
> more I/O can be throttled less).  Then as discussed in an email later
> [2], we tried a way to avoid putting to sleep workers that have done
> less or no I/O compared to other workers.  This ensured that
> workers that are doing more I/O get throttled more.  The idea is to
> allow any worker to sleep only if it has performed the I/O above a
> certain threshold and the overall balance is more than the cost_limit
> set by the system.  Then we will allow the worker to sleep for a time
> proportional to the work it has done and reduce
> VacuumSharedCostBalance by the amount consumed by the current
> worker.  This scheme leads to the desired throttling by different
> workers based on the work done by the individual worker.
>
> We have tested this idea with various kinds of workloads, such as by
> varying shared buffer size, delays and number of workers.  We have
> also tried a different number of indexes and workers.  In all
> the tests, we found that the workers are throttled proportional to the
> I/O being done by a particular worker.

Thank you for summarizing!

I agree with this approach.
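
For readers following along, the scheme summarized above can be
sketched roughly as follows.  This is a minimal, single-threaded
Python illustration under stated assumptions, not the patch's code:
SharedState, Worker, compute_delay_ms and the threshold_fraction value
of 0.5 are hypothetical names and numbers chosen for the example (the
summary only says "a certain threshold"), and a real multi-process
implementation would need to update the shared balance atomically.

```python
from dataclasses import dataclass


@dataclass
class SharedState:
    """Hypothetical state visible to the leader and all workers."""
    cost_balance: int = 0      # stands in for VacuumSharedCostBalance
    active_workers: int = 1    # number of processes sharing the balance


@dataclass
class Worker:
    """Hypothetical per-worker bookkeeping."""
    local_balance: int = 0     # cost incurred since this worker last slept


def compute_delay_ms(shared, worker, new_cost, cost_limit, cost_delay_ms,
                     threshold_fraction=0.5):
    """Return how long (in ms) this worker should sleep after new_cost.

    threshold_fraction is an assumed value, not taken from the email.
    """
    # Every worker adds its I/O cost to both the shared and local balance.
    shared.cost_balance += new_cost
    worker.local_balance += new_cost

    # Sleep only if the overall balance reached the limit AND this worker
    # itself did more than its threshold share of the work.
    threshold = threshold_fraction * cost_limit / shared.active_workers
    if shared.cost_balance < cost_limit or worker.local_balance <= threshold:
        return 0.0

    # Sleep in proportion to this worker's own work, then give back the
    # amount it consumed from the shared balance.
    delay = cost_delay_ms * worker.local_balance / cost_limit
    shared.cost_balance -= worker.local_balance
    worker.local_balance = 0
    return delay


# Two workers: one doing most of the I/O, one doing very little.
shared = SharedState(active_workers=2)
heavy, light = Worker(), Worker()
d1 = compute_delay_ms(shared, heavy, 180, cost_limit=200, cost_delay_ms=20.0)
d2 = compute_delay_ms(shared, light, 30, cost_limit=200, cost_delay_ms=20.0)
d3 = compute_delay_ms(shared, heavy, 10, cost_limit=200, cost_delay_ms=20.0)
print(d1, d2, d3)  # prints: 0.0 0.0 19.0
```

In this run the light worker pushes the shared balance over the limit
but does not sleep, because its own balance is below the threshold.
The heavy worker sleeps on its next call, for a time proportional to
the cost it accumulated itself.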

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


