pg_dump performance - Mailing list pgsql-performance

From Jared Mauch
Subject pg_dump performance
Date
Msg-id 20071226202230.GA76524@puck.nether.net
List pgsql-performance
    I've been looking at the performance of pg_dump off and on
over the past week, trying to get it to run a bit faster, and
I'm looking for tips on this.

    Doing a pg_dump on my 34311239-row table (1h of data, btw)
results in a wallclock time of 187.9 seconds, or ~182k rows/sec.
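
For reference, the rate quoted above can be checked with a quick sketch in Python. It uses only the row count and wallclock time given here; the function name is just an illustrative choice.

```python
# Figures taken from the post: table size and pg_dump wallclock time.
ROWS = 34_311_239
DUMP_SECONDS = 187.9


def rows_per_second(rows: int, seconds: float) -> float:
    """Average throughput over the whole dump."""
    return rows / seconds


rate = rows_per_second(ROWS, DUMP_SECONDS)
print(f"{rate:,.0f} rows/sec")  # a little over 182k rows/sec
```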

    I've got my insert (COPY) performance around 100k rows/sec and
was hoping to get the reads to be much faster.  The analysis I'm
doing runs much faster from a pg_dump than from a few queries,
for numerous reasons.  (If you care, I can enumerate them
to you privately, but the result is that pg_dump is the best way to
handle the multiple bits of analysis that are needed; please trust me.)

    What I'm seeing:

    pg_dump is utilizing about 13% of the cpu and the
corresponding postgres backend is at 100% cpu time.
(multi-core, multi-cpu, lotsa ram, super-fast disk).

    I don't appear to be I/O bound, so I'm interested in whether
there is a way I could tweak the backend performance or
offload some of the activity to another process.

    pg 8.3 (beta) with the following changes from the defaults:

checkpoint_segments = 300       # in logfile segments, min 1, 16MB each
effective_cache_size = 512MB    # typically 8KB each
wal_buffers = 128MB             # min 4, 8KB each
shared_buffers = 128MB          # min 16, at least max_connections*2, 8KB each
work_mem = 512MB                # min 64, size in KB


    Unrelated but associated data: the table has one index on it.
That's not relevant for pg_dump, but I'm interested in getting better
concurrent index creation (utilize those cpus better without slowing down
my row/sec perf), but that's another topic entirely.

    Any tips on getting pg_dump (actually the backend) to perform
much closer to 500k rows/sec or more?  This would also aid me when I
upgrade pg versions and need to dump/restore with minimal downtime (as
the data never stops coming.. whee).
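
To put the 500k rows/sec goal in terms of wallclock time, here is a rough sketch in Python using only the numbers in this post. It assumes the dump rate stays constant for the whole table, which may not hold in practice.

```python
# Figures taken from the post.
ROWS = 34_311_239          # rows in the table
MEASURED_SECONDS = 187.9   # current pg_dump wallclock time
TARGET_RATE = 500_000      # desired rows/sec

# Time the same dump would take at the target rate (constant rate assumed).
target_seconds = ROWS / TARGET_RATE

# How much faster the backend would have to go than it does today.
speedup = MEASURED_SECONDS / target_seconds

print(f"target dump time: {target_seconds:.1f} s")
print(f"required speedup: {speedup:.2f}x")
```

In other words, the goal is roughly a 2.7x improvement, which would bring the dump from a bit over three minutes to a bit over one.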

    Thanks!

    - Jared

--
Jared Mauch  | pgp key available via finger from jared@puck.nether.net
clue++;      | http://puck.nether.net/~jared/  My statements are only mine.
