Re: PostgreSQL 8.4 performance tuning questions - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: PostgreSQL 8.4 performance tuning questions
Date
Msg-id 4A718F3C02000025000290F8@gw.wicourts.gov
Whole thread Raw
In response to Re: PostgreSQL 8.4 performance tuning questions  (Matthew Wakeling <matthew@flymine.org>)
Responses Re: PostgreSQL 8.4 performance tuning questions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Matthew Wakeling <matthew@flymine.org> wrote:

> tests on servers over here have indicated that running four "CREATE
> INDEX" statements at the time runs four times as fast, assuming the
> table fits in maintenance_work_mem.

I'm benchmarking a patch to the parallel restore, and just out of
curiosity I've been comparing the multi-job approach, with various
numbers of jobs, to a restore within a single database transaction;
and I'm seeing (on serious production-quality servers) the parallel
restore run in 55% to 75% of the time of a restore running off the
same dump file using the -1 switch.  The 16 processor machine got the
best results, running with anywhere from 12 to 20 jobs.  The 2
processor machine got the lesser benefit, running with 2 to 4 jobs.
(The exact number of jobs really didn't make a difference big enough
to emerge from the noise.)

I've got 431 user tables with 578 indexes in a database which, freshly
restored, is 70GB.  (That's 91GB with the fragmentation and reasonable
dead space we have in production.)  Real production data; nothing
synthetic.

Since the dump to custom format ran longer than the full pg_dump
piped directly to psql would have taken, the overall time to use this
technique is clearly longer for our databases on our hardware.  I'm
sure there are cases where people don't have the option to pipe things
through, or that there may sometime be a big enough savings in the
multiple jobs to pay off, even without overlapping the dump and
restore, and with the necessity to write and read the data an extra
time; but there are clearly situations where the piped approach is
faster.

We may want to try to characterize the conditions under which each is
a win, so we can better target our advice....

-Kevin

pgsql-performance by date:

Previous
From: Matthew Wakeling
Date:
Subject: Re: PostgreSQL 8.4 performance tuning questions
Next
From: Tom Lane
Date:
Subject: Re: PostgreSQL 8.4 performance tuning questions