Re: Parallel Vacuum - Mailing list pgsql-performance

From Dimitri
Subject Re: Parallel Vacuum
Date
Msg-id 200703221655.03068.dimitrik.fr@gmail.com
In response to Re: Parallel Vacuum  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Parallel Vacuum  (Alvaro Herrera <alvherre@commandprompt.com>)
Re: Parallel Vacuum  (Michael Stone <mstone+postgres@mathom.us>)
List pgsql-performance
On Thursday 22 March 2007 16:12, Alvaro Herrera wrote:
> Dimitri wrote:
> > On Thursday 22 March 2007 14:52, Alvaro Herrera wrote:
> > > Dimitri wrote:
> > > > Folks,
> > > >
> > > > are there any constraints/problems/etc. with running several vacuum
> > > > processes in parallel while each one is vacuuming a different table?
> > >
> > > No, no problem.  Keep in mind that if one of them takes a very long
> > > time, the others will not be able to remove dead tuples that were
> > > killed while the long vacuum was running -- unless you are in 8.2.
> >
> > Yes, I'm using the latest 8.2.3 version. So, will they *really* be processing
> > in parallel, or will they block each other step by step?
>
> They won't block.

Wow! Excellent! :)
So, in this case, why not add a 'parallel' option directly to the
'vacuumdb' command?

In my case I have several CPUs on the server and a quite powerful storage box
which is not really kept busy by a single vacuum. So my idea is quite simple:
speed up vacuum with parallel execution. Just an algorithm:

--------------------------------------------------------------------------
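PLL=parallel_degree
for each (tab_size, tabname, dbname) in
    select tab_size, tabname, dbname from ... order by tab_size desc;
do
  # start one vacuum per table in the background; log stdout and stderr
  vacuumdb -d $dbname -t $tabname > /tmp/vac.$dbname.$tabname.log 2>&1 &
  # throttle: wait while PLL or more vacuumdb processes are running
  while [ $(pgrep vacuumdb | wc -l) -ge $PLL ]
  do
    sleep 1
  done
done
wait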
--------------------------------------------------------------------------

The biggest tables are vacuumed first, then the smaller ones in turn.
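
The loop above could be written as a runnable shell function roughly like
this. It is only a minimal sketch under some assumptions, not a tested tool:
the input format (one "tab_size dbname tabname" triple per line on stdin), the
function name parallel_vacuum, the log directory argument and the VACUUM_CMD
override are all hypothetical. It also replaces the pgrep polling with
"xargs -P", which GNU and BSD xargs provide but POSIX does not require.

```shell
#!/bin/sh
# Sketch: run one vacuumdb per table, at most $1 at a time, biggest first.
# stdin: lines of "tab_size dbname tabname" (hypothetical format).
# Assumes database and table names contain no whitespace, quotes or backslashes.
# VACUUM_CMD is a hypothetical override for dry runs; it must be exported.

parallel_vacuum() {
    pll=$1                  # parallel degree
    log_dir=${2:-/tmp}      # where the per-table logs go

    # Largest tables first, then drop the size column.
    sort -k1,1nr |
    while read -r _size dbname tabname; do
        echo "$dbname $tabname"
    done |
    # xargs keeps up to $pll jobs running and returns once all have finished.
    # In the inner script: $0 = log dir, $1 = dbname, $2 = tabname.
    xargs -n 2 -P "$pll" sh -c '
        ${VACUUM_CMD:-vacuumdb} -d "$1" -t "$2" > "$0/vac.$1.$2.log" 2>&1
    ' "$log_dir"
}
```

It would be fed with the output of the size query, for example
"... | parallel_vacuum 4 /tmp", where the query itself is left out here just
as in the algorithm above. Since xargs does the throttling, no sleep loop or
final "wait" is needed.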

But of course it would be much cooler to have something like:

   vacuumdb -a -P parallel_degree

What do you think? ;)

Rgds,
-Dimitri
