pg_dump far too slow - Mailing list pgsql-performance

From David Newall
Subject pg_dump far too slow
Date
Msg-id 4B9C97E1.40509@davidnewall.com
Whole thread Raw
Responses Re: pg_dump far too slow
Re: pg_dump far too slow
Re: pg_dump far too slow
Re: pg_dump far too slow
List pgsql-performance
Evening all,

Maiden post to this list.  I've a performance problem for which I'm uncharacteristically in need of good advice.

I have a read-mostly database using 51GB on an ext3 filesystem on a server running Ubuntu 9.04  and PG 8.3.  Forty hours ago I started a plain-format dump, compressed with -Z9, and it is still running, having produced 32GB of an expected 40 - 45GB of compressed output.  CPU load is 100% on the core executing pg_dump, and negligible on all others cores.  The system is read-mostly, and largely idle.  The exact invocation was:

    nohup time pg_dump -f database.dmp -Z9 database

I presumed pg_dump was CPU-bound because of gzip compression, but a test I ran makes that seem unlikely: Copying the database files to a USB hard drive (cp -r /var/lib/postgresql/8.3/main /mnt) took 25 minutes; and gzip-compressing the first first 500MB of the dumpfile (dd if=database.dmp bs=64k count=16000 | time gzip -9 > dd.gz) took one minute and 15 seconds; to gzip the complete 51GB set of files should take no more than 90 minutes.

The database is unremarkable except for one table, the biggest, which contains a bytea column, and which pg_dump has been outputting for at least 39 hours.  That table has 276,292 rows, in which the bytea for 140,695 contains PDFs totalling 32,791MB, and the bytea for the remaining 135,597 rows contains PostScript totalling 602MB.  I think I've never done a full vacuum; only ever auto-vacuum; however I did copy the table to new, deleted the old, and renamed, which I expect is effectively equivalent for it; which is described by the following schema:

             Table "database.bigtable"   Column    |       Type        |     Modifiers    
--------------+-------------------+--------------------headerid     | integer           | not nullmember       | numeric(10,0)     | not nullpostcode     | character varying |bsp          | character varying |details      | bytea             | not nullmembertypeid | integer           | not null default 0
Indexes:   "bigtable_pkey" PRIMARY KEY, btree (headerid, member)   "bigtable_member" btree (member)
Foreign-key constraints:   "bigtable_headerid_fkey" FOREIGN KEY (headerid) REFERENCES header(headerid)

The following describes the application environment:
  • PostgreSQL 8.3.8 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Ubuntu 4.3.3-5ubuntu4) 4.3.3
  • pg_dump (PostgreSQL) 8.3.8
  • Ubuntu 9.04
  • Linux server 2.6.28-13-server #45-Ubuntu SMP Tue Jun 30 20:51:10 UTC 2009 i686 GNU/Linux
  • Intel(R) Xeon(R) CPU E5430  @ 2.66GHz (4 core)
  • RAM 2GB
  • 2 SATA, 7200rpm disks with hardware  RAID-1 (IBM ServeRAID)

My question is, what's going on?

Thanks,

David

pgsql-performance by date:

Previous
From: Dave Crooke
Date:
Subject: Re: Deleting bytea, autovacuum, and 8.2/8.4 differences
Next
From: "fkater@googlemail.com"
Date:
Subject: Re: Deleting bytea, autovacuum, and 8.2/8.4 differences