
From: Masahiko Sawada
Subject: Re: [HACKERS] Block level parallel vacuum
Date:
Msg-id: CA+fd4k7d=ga3z2gw9_43nXhvDSQ-30xwPfBcNXBR_be-EWZXNQ@mail.gmail.com
In response to: Re: [HACKERS] Block level parallel vacuum (Dilip Kumar <dilipbalaut@gmail.com>)
Responses: Re: [HACKERS] Block level parallel vacuum (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers
On Wed, 4 Dec 2019 at 04:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Dec 4, 2019 at 9:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 4, 2019 at 1:58 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Tue, 3 Dec 2019 at 11:55, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > In your code, I think that if two workers enter the
> > > compute_parallel_delay function at the same time, they both add their
> > > local balance to VacuumSharedCostBalance and both workers sleep,
> > > because both observed values reach VacuumCostLimit.
> > >
> >
> > True, but isn't that more appropriate, because the local cost of any
> > worker should ideally be added to the shared cost as soon as it is
> > incurred?  I mean that we are not adding any cost to the shared balance
> > without actually incurring it.  Then we also consider the individual
> > worker's local balance and sleep according to that local balance.
>
> I also think it is better to add the balance to the shared balance at
> the earliest opportunity.  Just consider the case where there are 5
> workers, each with an I/O balance of 20, and VacuumCostLimit is 50.
> Their combined balance is actually 100 (double the VacuumCostLimit),
> but if we don't add it immediately then none of the workers will sleep
> and it may carry over to the next cycle, which is not very good.  OTOH,
> if we add the 20 immediately and then check the shared balance, all the
> workers might go to sleep if their local balances have reached the
> limit, but each will only sleep in proportion to its own local balance.
> So IMHO, adding the current balance to the shared balance early is
> closer to the model we are trying to implement, i.e. shared cost
> accounting.
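
Just to confirm that I'm reading the quoted model the same way, here is
a rough standalone sketch of that 5-worker example. This is not the
patch code; LIMIT and shared merely stand in for VacuumCostLimit and
VacuumSharedCostBalance, and the accounting is only my reading of the
idea: with a purely local check nobody ever sleeps, whereas publishing
the cost to the shared balance trips the limit and makes workers sleep
in proportion to their local balances.

/*
 * Rough sketch of the quoted example (not the patch code): 5 workers,
 * cost limit 50, and every worker has already accumulated a local
 * balance of 20.
 */
#include <stdio.h>

#define NWORKERS 5
#define LIMIT    50                 /* stands in for VacuumCostLimit */

int
main(void)
{
    int     local[NWORKERS] = {20, 20, 20, 20, 20};
    int     shared = 0;             /* stands in for VacuumSharedCostBalance */

    for (int i = 0; i < NWORKERS; i++)
    {
        /* Local-only accounting: 20 never reaches 50, so nothing is printed. */
        if (local[i] >= LIMIT)
            printf("local-only: worker %d would sleep\n", i);

        /* Shared accounting: publish the cost as soon as it is incurred. */
        shared += local[i];
        if (shared >= LIMIT)
        {
            /* sleep time is proportional to the worker's local balance */
            printf("shared: worker %d would sleep (shared=%d, local=%d)\n",
                   i, shared, local[i]);
            shared -= local[i];
            local[i] = 0;
        }
    }
    return 0;
}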

I agree that the balance should be added as soon as the cost is
incurred. But the problem I'm concerned about is this: suppose we have 4
workers, the cost limit is 100, and the shared balance is now 95. Two
workers, whose local balances (VacuumCostBalanceLocal) are 40, consume
I/O, add 10 to their local balances, and enter the
compute_parallel_delay function at the same time. One worker adds 10 to
the shared balance (VacuumSharedCostBalance) and the other worker also
adds 10 to the shared balance. The first worker then subtracts its local
balance from the shared balance and sleeps, because the shared cost is
now 115 (> the cost limit) and its local balance is 50 (> 0.5*(100/4)).
The other worker does the same for the same reason. On the other hand,
if the two workers do this serially, only one worker sleeps and the
other doesn't, because the shared cost will only be 65 when the later
worker reaches the check. At first glance this looks like a concurrency
problem, but is that the expected behaviour?
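
To make the two orderings concrete, here is a minimal standalone sketch
of the scenario above. Again, this is not the patch code: add_cost() and
maybe_sleep() are just illustrative names, and the 0.5 * (limit /
nworkers) threshold and the subtraction of the local balance are my
reading of the logic under discussion.

/*
 * Minimal sketch of the scenario above (not the patch code): 4 workers,
 * cost limit 100, shared balance 95; two workers whose local balance is
 * 40 each incur 10 more and enter the delay logic.
 */
#include <stdio.h>

#define NWORKERS   4
#define COST_LIMIT 100

static int  shared_balance;     /* stands in for VacuumSharedCostBalance */

/* First half of the delay logic: publish the newly incurred cost and
 * return the shared balance as this worker observes it at that moment. */
static int
add_cost(int *local_balance, int incurred)
{
    shared_balance += incurred;
    *local_balance += incurred;
    return shared_balance;
}

/* Second half: decide whether this worker sleeps, using the shared
 * balance it observed when it added its cost. */
static int
maybe_sleep(int *local_balance, int observed)
{
    if (observed >= COST_LIMIT &&
        *local_balance > 0.5 * ((double) COST_LIMIT / NWORKERS))
    {
        shared_balance -= *local_balance;
        *local_balance = 0;
        return 1;               /* would sleep, based on its local balance */
    }
    return 0;
}

int
main(void)
{
    int     w1, w2, ob1, ob2, s1, s2;

    /* Concurrent entry: both workers publish their cost before either checks. */
    shared_balance = 95;
    w1 = w2 = 40;
    ob1 = add_cost(&w1, 10);
    ob2 = add_cost(&w2, 10);
    s1 = maybe_sleep(&w1, ob1);
    s2 = maybe_sleep(&w2, ob2);
    printf("concurrent: w1 sleeps=%d, w2 sleeps=%d\n", s1, s2);

    /* Serial entry: worker 1 finishes everything before worker 2 enters. */
    shared_balance = 95;
    w1 = w2 = 40;
    ob1 = add_cost(&w1, 10);
    s1 = maybe_sleep(&w1, ob1);
    ob2 = add_cost(&w2, 10);
    s2 = maybe_sleep(&w2, ob2);
    printf("serial:     w1 sleeps=%d, w2 sleeps=%d\n", s1, s2);

    return 0;
}

With the two adds interleaved, both workers observe a balance above the
limit, both sleep, and both give their local balance back (the shared
balance drops to 15). Done serially, only the first worker sleeps and
the second sees a balance of 65.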

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


