Re: cost based vacuum (parallel) - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: cost based vacuum (parallel)
Msg-id: CAA4eK1LyNGQ3-26WphEOQdPswE8r+r04kwZOcEHXAgwp4FEmmA@mail.gmail.com
In response to: Re: cost based vacuum (parallel)  (Andres Freund <andres@anarazel.de>)
Responses: Re: cost based vacuum (parallel)  (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers

On Mon, Nov 4, 2019 at 11:58 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-11-04 12:59:02 -0500, Jeff Janes wrote:
> > On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > For parallel vacuum [1], we were discussing what is the best way to
> > > divide the cost among parallel workers but we didn't get many inputs
> > > apart from people who are very actively involved in patch development.
> > > I feel that we need some more inputs before we finalize anything, so
> > > starting a new thread.
> > >
> >
> > Maybe I just don't have experience in the type of system that parallel
> > vacuum is needed for, but if there is any meaningful IO throttling which is
> > active, then what is the point of doing the vacuum in parallel in the first
> > place?
>
> I am wondering the same - but to be fair, it's pretty easy to run into
> cases where VACUUM is CPU bound. E.g. because most pages are in
> shared_buffers, and compared to the size of the indexes the number of tids
> that need to be pruned is fairly small (also [1]). That means a lot of
> pages need to be scanned, without a whole lot of IO going on. The
> problem with that is just that the defaults for vacuum throttling will
> also apply here, I've never seen anybody tune vacuum_cost_page_hit = 0,
> vacuum_cost_page_dirty=0 or such (in contrast, the latter is the highest
> cost currently).  Nor do we reduce the cost of vacuum_cost_page_dirty
> for unlogged tables.
>
> So while it doesn't seem unreasonable to want to use cost limiting to
> protect against vacuum unexpectedly causing too much, especially read,
> IO, I'm doubtful it has current practical relevance.
>

IIUC, you mean to say that parallel vacuum is of little practical use
when I/O throttling is enabled for the operation, is that right?
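
To make the throttling under discussion concrete, here is a rough sketch
of how the cost-based delay caps the page rate. It is only an
illustrative model, not PostgreSQL code: the function name is
hypothetical, the parameters merely mirror the GUCs vacuum_cost_limit
(default 200), vacuum_cost_page_hit (default 1) and
vacuum_cost_page_dirty (default 20), and the 2 ms delay is an assumed
value (plain VACUUM defaults to vacuum_cost_delay = 0, i.e. no
throttling). Time spent doing the actual work is ignored, so the figures
are upper bounds.

```python
# Illustrative model only: parameter names mirror PostgreSQL GUCs, but the
# function and its simplifications are assumptions, not PostgreSQL code.

def max_pages_per_second(cost_per_page, cost_limit=200, cost_delay_ms=2.0):
    """Upper bound on pages processed per second under cost-based delay.

    Vacuum accumulates a cost for every page it touches and sleeps for
    cost_delay_ms each time the accumulated cost reaches cost_limit.
    Work time between sleeps is ignored here, so real rates are lower.
    """
    if cost_delay_ms <= 0 or cost_per_page <= 0:
        # A zero delay disables throttling; a zero page cost never
        # accumulates toward the limit.
        return float("inf")
    pages_per_sleep = cost_limit / cost_per_page
    sleeps_per_second = 1000.0 / cost_delay_ms
    return pages_per_sleep * sleeps_per_second


# Every page already in shared_buffers (vacuum_cost_page_hit = 1).
hit_rate = max_pages_per_second(cost_per_page=1)

# Every page dirtied by vacuum (vacuum_cost_page_dirty = 20).
dirty_rate = max_pages_per_second(cost_per_page=20)

print(f"all hits:  at most {hit_rate:.0f} pages/s")
print(f"all dirty: at most {dirty_rate:.0f} pages/s")
```

Under these assumptions even a fully cached, CPU-bound vacuum is capped
by the cost limit, which is the case described in the quoted text above.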


> I'm wondering how much of the benefit of parallel vacuum really is just
> to work around vacuum ringbuffers often massively hurting performance
> (see e.g. [2]).
>

Yeah, it is a good thing to check, but if anything, I think a parallel
vacuum will further improve the performance with larger ring buffers,
as larger ring buffers will make the vacuum more CPU bound.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
