Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAA4eK1KkD-CFuYhy0pi+NSR6FUzWYO0gyt2ux7T39uL_Pc3-0w@mail.gmail.com
In response to Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Oct 25, 2019 at 7:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 24, 2019 at 8:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Oct 24, 2019 at 3:21 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > I have come up with the POC for approach (a).
> > >
> > > The idea is
> > > 1) Before launching the workers, divide the current VacuumCostBalance
> > > among workers so that workers start accumulating the balance from that
> > > point.
> > > 2) Also, divide the VacuumCostLimit among the workers.
> > > 3) Once the workers are done with the index vacuum, send the
> > > remaining balance back to the leader.
> > > 4) The leader will sum all the balances and add that to its current
> > > VacuumCostBalance.  And start accumulating its balance from this
> > > point.
> > >
> > > I was trying to test how the vacuum I/O limit behaves, but I could
> > > not find an easy way to test that, so I just put a tracepoint in the
> > > code and checked at what point we apply the delay.
> > > I also printed the cost balance at various points to see after how
> > > much I/O accumulation we hit the delay.  Please feel free to
> > > suggest a better way to test this.
> > >
> > > I have printed these logs for the parallel vacuum patch (v30) vs. v30 +
> > > the patch for dividing the I/O limit (attached to this mail).
> > >
> > > Note: Patch and the test results are attached.
> > >
> >
> > Thank you!
> >
> > For approach (a), the basic idea I've come up with is that we have a
> > shared balance value in DSM, and each worker, including the leader
> > process, adds its local balance value to it in vacuum_delay_point;
> > the workers then sleep based on the shared value. I'll submit that
> > patch with other updates.
> >
>
> I think it would be better if we can prepare the I/O balance patches
> on top of the main patch and evaluate both approaches.  We can test
> both and integrate whichever turns out to be better.
>
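
The two proposals quoted above can be modelled roughly as follows. This is a toy Python model, not PostgreSQL code. The names WorkerBudget, split_among_workers, run_worker, merge_into_leader and SharedBalance are made up for illustration. The even split is only one possible reading of "divide", and resetting the shared balance to zero after a sleep is an assumption that the thread does not spell out. Sleeps are counted rather than performed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WorkerBudget:
    cost_balance: int  # balance the worker starts accumulating from
    cost_limit: int    # the worker's share of the cost limit


def split_among_workers(balance: int, limit: int, nworkers: int) -> List[WorkerBudget]:
    """Steps 1 and 2: divide the leader's balance and limit among workers."""
    if nworkers <= 0:
        raise ValueError("nworkers must be positive")
    base, extra = divmod(balance, nworkers)
    share = max(1, limit // nworkers)  # assumed: plain even split of the limit
    # The remainder of the balance goes to the first workers so none is lost.
    return [WorkerBudget(base + (1 if i < extra else 0), share)
            for i in range(nworkers)]


def run_worker(budget: WorkerBudget, page_costs: List[int]) -> Tuple[int, int]:
    """Accumulate cost locally; count a sleep each time the local limit is hit.

    Returns (number_of_sleeps, remaining_balance).
    """
    balance = budget.cost_balance
    sleeps = 0
    for cost in page_costs:
        balance += cost
        if balance >= budget.cost_limit:
            sleeps += 1
            balance = 0  # balance restarts from zero after a delay
    return sleeps, balance


def merge_into_leader(leader_balance: int, remaining: List[int]) -> int:
    """Steps 3 and 4: the leader adds the workers' leftover balances to its own."""
    return leader_balance + sum(remaining)


class SharedBalance:
    """Alternative: one balance shared by all processes (kept in DSM in the proposal)."""

    def __init__(self, cost_limit: int) -> None:
        self.cost_limit = cost_limit
        self.balance = 0

    def delay_point(self, local_balance: int) -> bool:
        """Add a process's local balance; return True if the caller should sleep."""
        self.balance += local_balance
        if self.balance >= self.cost_limit:
            self.balance = 0  # assumption: reset after a sleep
            return True
        return False
```

In the first model each worker throttles against its own slice of the limit, so a worker with little to do leaves part of the budget unused. In the second, all processes draw on a single budget. That difference is what the even and uneven index workloads are meant to expose.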

One more point on testing both approaches.  I think we should first
come up with a way to measure how much throttling vacuum does, either
as I described in an earlier email [1] or in some other way.  I think
Dilip is planning to give it a try, and once we have that we can
evaluate both patches.  Some of the tests I have in mind are:
a. All indexes have an equal amount of deleted data.
b. Indexes have an uneven amount of deleted data.
c. A mix of index types (btree, gin, gist, hash, etc.) on a table.

Feel free to add more tests.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BPeiFLdTuwrE6CvbNdx80E-O%3DZxCuWB2maREKFD-RaCA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
