Re: Parallel Vacuum - Mailing list pgsql-performance

From Dimitri
Subject Re: Parallel Vacuum
Date
Msg-id 200703231637.33211.dimitrik.fr@gmail.com
Whole thread Raw
In response to Re: Parallel Vacuum  (Michael Stone <mstone+postgres@mathom.us>)
Responses Re: Parallel Vacuum
List pgsql-performance
On Thursday 22 March 2007 19:46, Michael Stone wrote:
> On Thu, Mar 22, 2007 at 07:24:38PM +0100, Dimitri wrote:
> >you're right until you're using a single disk :)
> >Now, imagine you have more disks
>
> I do have more disks. I maximize the I/O performance by dedicating
> different sets of disks to different tables. YMMV. I do suggest watching
> your I/O rates and wallclock time if you try this to see if your
> aggregate is actually substantially faster than the single case. (I
> assume that you haven't yet gotten far enough to actually do performance
> testing.) You may also want to look into tuning your sequential I/O
> performance.
>
> Mike Stone

Mike, specially for you :)

Parallel Vacuum Test
======================

- Database 'db_OBJ'
   PgSQL 8.2.3
   tables: object1, object2, ... object8 (all the same)
   volume: 10.000.000 rows in each table, 22GB in total

- Script Mono Vacuum
  $ cat vac_mono.sh
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object1
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object2
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object3
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object4
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object5
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object6
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object7
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object8
  $

- Script Parallel Vacuum
  $ cat vac_pll.sh
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object1 &
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object2 &
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object3 &
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object4 &
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object5 &
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object6 &
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object7 &
/usr/local/pgsqlFF/bin/vacuumdb -p 5464 -d db_OBJ -t object8 &
wait
  $


Test 1: Cold Clean database (already previously vacuumed)
=========================================================
 Scenario:
  - stop database
  - flush FS cache (umount/mount)
  - start database
  - execute vacuum script

$ time sh vac_mono.sh
real    4m24.23s
user    0m0.00s
sys     0m0.01s

$ time sh vac_pll.sh
real    1m9.36s
user    0m0.00s
sys     0m0.01s


Test 2: Hot Dirty database (modified and not vacuumed)
======================================================
 Scenario:
  - stop database
  - flush FS cache (umount/mount)
  - start database
  - execute 200.000 updates against each from 8 object' tables
  - execute vacuum script

$ time sh vac_mono.sh
real    9m36.90s
user    0m0.00s
sys     0m0.01s

$ time sh vac_pll.sh
real    2m10.41s
user    0m0.00s
sys     0m0.02s


Speed-up x4 is obtained just because single vacuum process reaching max
80MB/sec in throughput, while with 8 parallel vacuum processes I'm jumping to
360MB/sec... And speakink about Sequential I/O: while you're doing read -
file system may again prefetch incoming data in way once you reclaim next
read - your data will be already in FS cache. However, file system
cannot 'pre-write' data for you - so having more concurrent writers helps a
lot! (Of course in case you have a storage configured to keep concurrent
I/O :))

Well, why all this staff?...
Let's imagine once you need more performance, and you buy 10x times more
performant storage box, will you still able to kill it with a single-process
I/O activity? No... :)  To scale well you need to be able to split your work
in several task executed in parallel. And personally, I'm very happy we can
do it with vacuum now - the one of the most critical part of PostgreSQL...

Best regards!
-Dimitri

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: linux - server configuration for small database
Next
From: "Joshua D. Drake"
Date:
Subject: Re: linux - server configuration for small database