I've been looking at pg_dump performance off and on over the past
week, trying to see if I can get it to run a bit faster, and I'm
looking for tips on this.
Doing a pg_dump of my 34,311,239-row table (1 hour of data, btw)
takes 187.9 seconds of wall-clock time, or ~182k rows/sec.
I've got my insert (COPY) performance to around 100k rows/sec and
was hoping to get reads much faster than that. The analysis I'm
doing is much faster against a pg_dump than against a handful of
queries, for numerous reasons. (If you care, I can enumerate them
privately, but the upshot is that pg_dump is the best way to handle
the multiple bits of analysis that are needed; please trust me.)
What I'm seeing:
pg_dump itself is using about 13% of a CPU, and the corresponding
postgres backend is pegged at 100% CPU
(multi-core, multi-CPU, lots of RAM, super-fast disk).
I don't appear to be I/O bound, so I'm interested in whether
there's a way to tweak the backend's performance or
offload some of the activity to another process.
PostgreSQL 8.3 (beta) with the following changes from the defaults:
checkpoint_segments = 300      # in logfile segments, min 1, 16MB each
effective_cache_size = 512MB
wal_buffers = 128MB
shared_buffers = 128MB
work_mem = 512MB
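(In case anyone wants to double-check what the backend is actually
running with, the settings are easy to confirm from psql; my_db
below is a placeholder:)

    # values the running server reports; note that shared_buffers and
    # wal_buffers only change after a full restart, not a reload
    psql my_db -c "SHOW shared_buffers;"
    psql my_db -c "SHOW work_mem;"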
Unrelated but associated data: the table has one index on it.
Not relevant for pg_dump, but I'm also interested in better concurrent
index creation (using those CPUs better without slowing down my
rows/sec insert performance), though that's another topic entirely..
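(What I have in mind is along the lines of CREATE INDEX CONCURRENTLY,
which at least avoids blocking the inserts while the index builds,
though it doesn't spread the build across CPUs; names below are made up:)

    # build the index without taking a write-blocking lock;
    # the build itself is slower and still single-CPU
    psql my_db -c "CREATE INDEX CONCURRENTLY my_table_ts_idx ON my_table (ts);"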
Any tips on getting pg_dump (actually the backend) to perform
much closer to 500k rows/sec or more? This would also aid me when I
upgrade pg versions and need to dump/restore with minimal downtime
(as the data never stops coming.. whee).
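(For the upgrade case, what I picture is piping a dump straight from
the old cluster into the new one rather than staging a dump file;
a rough sketch, host and database names invented:)

    # dump from the old server and restore directly into the new one,
    # avoiding an intermediate dump file on disk
    pg_dump -h oldhost my_db | psql -h newhost my_db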
Thanks!
- Jared
--
Jared Mauch | pgp key available via finger from jared@puck.nether.net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.