From: Masahiko Sawada
Subject: Re: [HACKERS] Block level parallel vacuum
Msg-id: CAD21AoA_6dARPG_5xfQE=1wTay2zUw0QcQZqVx9kOFHP_GYBbQ@mail.gmail.com
In response to: Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <sawada.mshk@gmail.com>)

On Tue, Oct 29, 2019 at 4:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Oct 28, 2019 at 2:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > I am thinking if we can write the patch for both the approaches (a.
> > > > > > > compute shared costs and try to delay based on that, b. try to divide
> > > > > > > the I/O cost among workers as described in the email above[1]) and do
> > > > > > > some tests to see the behavior of throttling, that might help us in
> > > > > > > deciding what is the best strategy to solve this problem, if any.
> > > > > > > What do you think?
> > > > > >
> > > > > > I agree with this idea.  I can come up with a POC patch for approach
> > > > > > (b).  Meanwhile, if someone is interested to quickly hack with the
> > > > > > approach (a) then we can do some testing and compare.  Sawada-san,
> > > > > > by any chance will you be interested to write POC with approach (a)?
> > > > > > Otherwise, I will try to write it after finishing the first one
> > > > > > (approach b).
> > > > > >
> > > > > I have come up with the POC for approach (a).
> >
> > > > Can we compute the overall throttling (sleep time) in the operation
> > > > separately for heap and index, then divide the index's sleep_time by
> > > > the number of workers and add it to the heap's sleep time?  Then it
> > > > will be a bit easier to compare the data between the parallel and
> > > > non-parallel cases.
> > I have come up with a patch to compute the total delay during the
> > vacuum.  The idea for computing the total cost delay is:
> >
> > Total cost delay = Total delay of heap scan + Total delay of
> > index vacuuming / number of workers.  The patch for this is attached.
> >
> > I have prepared this patch on top of the latest parallel vacuum
> > patch [1].  I have also rebased the patch for approach (b), which
> > divides the vacuum cost limit, and done some testing to measure the
> > I/O throttling.  The attached patches 0001-POC-compute-total-cost-delay
> > and 0002-POC-divide-vacuum-cost-limit can be applied on top of
> > v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch.  I haven't
> > rebased on top of v31-0006, because v31-0006 implements the I/O
> > throttling with one approach and 0002-POC-divide-vacuum-cost-limit
> > does the same with another approach.  However,
> > 0001-POC-compute-total-cost-delay can be applied on top of v31-0006 as
> > well (with just a 1-2 line conflict).
> >
> > Testing:  I have performed two tests, one with same-size indexes and
> > the second with different-size indexes, and measured the total I/O
> > delay with the attached patch.
> >
> > Setup:
> > VacuumCostDelay=10ms
> > VacuumCostLimit=2000
> >
> > Test1 (Same size index):
> > create table test(a int, b varchar, c varchar);
> > create index idx1 on test(a);
> > create index idx2 on test(b);
> > create index idx3 on test(c);
> > insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
> > generate_series(1,500000) as i;
> > delete from test where a < 200000;
> >
> >                Vacuum (Head)    Parallel Vacuum    Vacuum Cost Divide Patch
> > Total Delay    1784 ms          1398 ms            1938 ms
> >
> >
> > Test2 (varying numbers of dead tuples per index):
> > create table test(a int, b varchar, c varchar);
> > create index idx1 on test(a);
> > create index idx2 on test(b) where a > 100000;
> > create index idx3 on test(c) where a > 150000;
> >
> > insert into test select i, repeat('a',30)||i, repeat('a',20)||i from
> > generate_series(1,500000) as i;
> > delete from test where a < 200000;
> >
> >                Vacuum (Head)    Parallel Vacuum    Vacuum Cost Divide Patch
> > Total Delay    1438 ms          1029 ms            1529 ms
> >
> >
> > Conclusion:
> > 1. The tests show that the total I/O delay is significantly less with
> > the parallel vacuum.
> > 2. With the vacuum-cost-divide patch the problem is solved, but the
> > delay is a bit more compared to the non-parallel version.  The reason
> > could be the problem discussed at [2], but it needs further
> > investigation.
> >
> > Next, I will test with the v31-0006 (shared vacuum cost) patch.  I
> > will also try to test different types of indexes.
> >
>
> Thank you for testing!
>
> I realized that the v31-0006 patch doesn't work correctly, so I've
> attached an updated version that also incorporates some comments I
> got so far. Sorry for the inconvenience. I'll apply your 0001 patch
> and also test the total delay time.
>

FWIW, I'd like to share the results of the total-delay-time evaluation
of approach (a) (shared cost balance). I used the same workloads that
Dilip shared and set vacuum_cost_delay to 10. The results of the two
test cases are:

* Test1
normal    : 12656 ms (hit 50594, miss 5700, dirty 7258, total 63552)
2 workers : 17149 ms (hit 47673, miss 8647, dirty 9157, total 65477)
1 worker  : 19498 ms (hit 45954, miss 10340, dirty 10517, total 66811)

* Test2
normal    : 1530 ms (hit 30645, miss 2, dirty 3, total 30650)
2 workers : 1538 ms (hit 30645, miss 2, dirty 3, total 30650)
1 worker  : 1538 ms (hit 30645, miss 2, dirty 3, total 30650)

'hit', 'miss' and 'dirty' are the total numbers of buffer hits, buffer
misses and dirty buffers flushed, respectively. 'total' is the sum of
these three values.
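
As a quick sanity check of these numbers: assuming the default cost
parameters (vacuum_cost_page_hit = 1, vacuum_cost_page_miss = 10,
vacuum_cost_page_dirty = 20) and the default vacuum_cost_limit = 200
(an assumption on my part, since only vacuum_cost_delay was set
explicitly), the normal case of Test1 accumulates about

    50594 * 1 + 5700 * 10 + 7258 * 20 = 252754 cost units
    252754 / 200 * 10 ms              = ~12637 ms of delay

which is close to the measured 12656 ms, so the measured delay times
are consistent with the cost model.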

In this evaluation I expected the parallel vacuum cases to be delayed
about as much as the normal vacuum case, because the total number of
pages to vacuum is the same and each worker decides whether to sleep
based on the shared cost balance value. The Test1 results above show a
big difference in the total delay time among these cases (the normal
vacuum case is the shortest), but the cause is that the parallel
vacuum had to flush more dirty pages. Indeed, after increasing
shared_buffers I got the expected results:

* Test1 (after increasing shared_buffers)
normal    : 2807 ms (hit 56295, miss 2, dirty 3, total 56300)
2 workers : 2840 ms (hit 56295, miss 2, dirty 3, total 56300)
1 worker  : 2841 ms (hit 56295, miss 2, dirty 3, total 56300)
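
To make the mechanism concrete, here is a minimal sketch of the
shared-cost-balance idea. This is illustrative only, not the actual
patch: the function name and the shared_balance argument (imagined as
an atomic counter in DSM shared memory) are made up for the example.

/*
 * Approach (a) sketch: leader and workers add their locally
 * accumulated page costs to one shared balance, and each backend
 * sleeps once that shared balance exceeds the limit, mirroring what
 * vacuum_delay_point() does with the backend-local balance.
 */
void
parallel_vacuum_delay_point(volatile pg_atomic_uint32 *shared_balance)
{
    uint32      balance;

    /* publish this backend's locally accumulated cost */
    balance = pg_atomic_add_fetch_u32(shared_balance,
                                      (uint32) VacuumCostBalance);
    VacuumCostBalance = 0;

    if (balance >= (uint32) VacuumCostLimit)
    {
        double      msec = VacuumCostDelay * balance / VacuumCostLimit;

        if (msec > VacuumCostDelay * 4)
            msec = VacuumCostDelay * 4; /* cap, as in vacuum_delay_point() */

        pg_usleep((long) (msec * 1000));

        /* subtract what we slept for; costs added meanwhile remain */
        pg_atomic_fetch_sub_u32(shared_balance, (int32) balance);
    }
}

Since the sleep decision is driven by the sum of all backends' costs,
the total throttling should match the non-parallel case no matter how
the work is divided among the workers.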

I updated the patch shared by Dilip [1] that computes the total cost
delay so that it also collects the numbers of buffer hits, misses and
dirty-buffer flushes, and have attached it. It can be applied on top
of my latest patch set [2].
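
For comparison, approach (b) from Dilip's mail quoted above boils down
to something like the sketch below (again only an illustration with an
invented function name; nworkers stands for the number of backends
doing index vacuuming):

/*
 * Approach (b) sketch: each worker keeps the unmodified, backend-local
 * vacuum_delay_point() logic but runs it against an equal share of the
 * configured cost limit, so the combined I/O rate of all workers stays
 * close to that of a non-parallel vacuum.
 */
static void
set_worker_cost_limit(int nworkers)
{
    VacuumCostLimit = Max(VacuumCostLimit / nworkers, 1);
}

One possible downside of dividing the limit this way is that a worker
with little to do cannot give its unused share back to the others, so
the busy workers get throttled harder than a single-process vacuum
would be; that might be one explanation for the slightly higher total
delay in the vacuum-cost-divide numbers above.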

[1] https://www.postgresql.org/message-id/CAFiTN-thU-z8f04jO7xGMu5yUUpTpsBTvBrFW6EhRf-jGvEz%3Dg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD%2BXOh_Q%40mail.gmail.com

Regards,

--
Masahiko Sawada

