Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date	July 3, 2014 09:29:40
Msg-id	CAA4eK1KyPf0BNpTTmWhVSoQkaGgO1LrNjJHsChJFS=AaJqKdvg@mail.gmail.com
In response to	Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] (Alvaro Herrera <alvherre@2ndquadrant.com>)
List	pgsql-hackers

On Wed, Jul 2, 2014 at 11:45 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Jeff Janes wrote:
>
> > I would only envision using the parallel feature for vacuumdb after a
> > pg_upgrade or some other major maintenance window (that is the only
> > time I ever envision using vacuumdb at all). I don't think autovacuum
> > can be expected to handle such situations well, as it is designed to
> > be a smooth background process.
>
> That's a fair point. One thing that would be pretty neat but I don't
> think I would get anyone to implement it, is having the user control the
> autovacuum launcher in some way. For instance "please vacuum this set
> of tables as quickly as possible", and it would launch as many workers
> as are configured. It would take months to get a UI settled for this,
> however.

This sounds like a better way to have multiple workers vacuuming
tables. Since vacuum already has some infrastructure (vacuum workers)
for performing tasks in parallel, why not leverage that instead of
inventing a new one, even if we assume that the duplicate code can be
reduced?

> > I don't know how to calibrate the number of lines that is worthwhile.
> > If you write in C and need to have cross-platform compatibility and
> > robust error handling, it seems to take hundreds of lines to do much
> > of anything. The code duplication is a problem, but I don't think
> > just raw line count is, especially since it has already been written.
>
> Well, there are (at least) two types of duplicate code: first you have
> these common routines such as pgpipe that are duplicates for no good
> reason. Just move them to src/port or something and it's all good. But
> the OP said there is code that cannot be shared even though it's very
> similar in both incarnations. That means we cannot (or it's difficult
> to) just have one copy, which means as they fix bugs in one copy we need
> to update the other.

I briefly checked the duplicated code in both versions, and I think we
might be able to reduce it significantly by making common functions
and using AH where it is passed (as an example, I checked the function
ParallelBackupStart(), which is more than 100 lines). If you see code
duplication as the major reason you don't prefer this patch, then I
think that can be ameliorated, or at least it is worth a try.

However, I think it might be better to achieve this in the way you
suggested, using the autovacuum launcher.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
