Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI
---|---
Subject | Re: [HACKERS] Block level parallel vacuum
Date |
Msg-id | 20190319.191449.04094806.horiguchi.kyotaro@lab.ntt.co.jp
In response to | Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses | Re: [HACKERS] Block level parallel vacuum
List | pgsql-hackers
At Tue, 19 Mar 2019 19:01:06 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoA3PpkcNNzcQmiNgFL3DudhdLRWoTvQE6=kRagFLjUiBg@mail.gmail.com>
> On Tue, Mar 19, 2019 at 4:59 PM Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >
> > At Tue, 19 Mar 2019 13:31:04 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoD4ivrYqg5tau460zEEcgR0t9cV-UagjJ997OfvP3gsNQ@mail.gmail.com>
> > > > For indexes=4,8,16, the cases with parallel_degree=4,8,16 behave
> > > > almost the same. I suspect that the indexes are too small, all
> > > > the index pages were in memory, and the CPU was saturated. Maybe you
> > > > had four cores, and parallel workers beyond that number had no
> > > > effect. Other normal backends would have been able to do almost
> > > > nothing meanwhile. Usually the number of parallel workers is
> > > > determined so that IO capacity is filled up, but under such a
> > > > situation this feature intermittently saturates CPU capacity.
> > > >
> > >
> > > I'm sorry I didn't make it clear enough. If the parallel degree is
> > > higher than 'the number of indexes - 1', redundant workers are not
> > > launched. So for indexes=4, 8, 16 the number of actually launched
> > > parallel workers is up to 3, 7, 15 respectively. That's why the result
> > > shows almost the same execution time in the cases where nindexes <=
> > > parallel_degree.
> >
> > In the 16 indexes case, the performance saturated at 4 workers,
> > which contradicts your explanation.
>
> Because the machine I used has 4 cores, the performance doesn't get
> improved even if more than 4 parallel workers are launched.

That is what I mentioned in the cited phrases. Sorry for the perhaps
hard-to-read phrasing.
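To make the worker-count behaviour discussed above easier to follow, here is a minimal standalone sketch. It is not code from the patch; the function name `sketch_nworkers` and its signature are invented for illustration. It only models the rule Sawada describes: the leader also vacuums an index, so at most `nindexes - 1` workers are launched, and the requested parallel degree is a further upper bound.

```c
/*
 * Illustrative sketch only -- not taken from the patch.  Models the rule
 * described above: workers are capped at (nindexes - 1) because the leader
 * also vacuums one index, and the requested parallel degree is a further
 * upper bound on top of that.
 */
#include <stdio.h>

static int
sketch_nworkers(int nindexes, int parallel_degree)
{
	int		nworkers = nindexes - 1;	/* leader handles one index itself */

	if (nworkers > parallel_degree)
		nworkers = parallel_degree;		/* don't exceed the requested degree */
	if (nworkers < 0)
		nworkers = 0;					/* a single index needs no workers */
	return nworkers;
}

int
main(void)
{
	int		nindexes[] = {4, 8, 16};
	int		degrees[] = {4, 8, 16};

	for (int i = 0; i < 3; i++)
		for (int j = 0; j < 3; j++)
			printf("indexes=%2d degree=%2d -> workers=%2d\n",
				   nindexes[i], degrees[j],
				   sketch_nworkers(nindexes[i], degrees[j]));
	return 0;
}
```

With indexes=4 this yields 3 workers for any degree of 4 or more, which matches the flat timings for that case; for the larger index counts the 4-core machine itself becomes the limit, as noted above.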
> > > I'll share the performance test results for larger tables and indexes.
> > >
> > > > I'm not sure, but what if we do index vacuum in a one-tuple-by-one
> > > > manner? That is, heap vacuum passes dead tuples one by one (or
> > > > buffering a few tuples) to workers, and workers process them not by
> > > > bulkdelete but by just tuple_delete (we don't have one). That could
> > > > avoid the sleep time of heap-scan while index bulkdelete runs.
> > > >
> > >
> > > Just to be clear, in parallel lazy vacuum all parallel vacuum
> > > processes including the leader process do index vacuuming; no one
> > > sleeps during index vacuuming. The leader process does the heap
> > > scan and launches parallel workers before index vacuuming. Each
> > > process exclusively processes indexes one by one.
> >
> > The leader doesn't continue the heap scan while index vacuuming is
> > running. And the index-page scan seems to eat up CPU easily. If
> > index vacuum could run simultaneously with the next heap scan
> > phase, we could make the index scan finish at almost the same time as
> > the next round of heap scan. It would reduce the (possible) CPU
> > contention. But this requires twice as much shared memory as the
> > current implementation.
>
> Yeah, I've considered something like a pipelining approach, where one
> process continues to queue the dead tuples and another process fetches
> and processes them during index vacuuming, but the current version of
> the patch employs the most simple approach as the first step.
> Once we have the retail index deletion approach we might be able to use
> it for parallel vacuum.

Ok, I understood the direction.

...

> > > Sorry, I couldn't get your comment. You meant to move nprocessed to
> > > LVParallelState?
> >
> > Exactly. I meant letting lvshared point to private memory, but
> > it might introduce confusion.
>
> Hmm, I'm not sure it would be a good idea. It would introduce
> confusion as you mentioned. And since 'nprocessed' has to be
> pg_atomic_uint32 in parallel mode, we will end up with having
> another branch.

Ok. Agreed. Thank you for the patience.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center