Re: pg_dump and pg_restore - Mailing list pgsql-performance

From Robert Haas
Subject Re: pg_dump and pg_restore
Msg-id AANLkTinqHM6bikBGh8dEFsEay86RgfQBdYuCWDa9MkbX@mail.gmail.com
In response to pg_dump and pg_restore  (Jayadevan M <Jayadevan.Maymala@ibsplc.com>)
List pgsql-performance
On Mon, May 17, 2010 at 1:04 AM, Jayadevan M
<Jayadevan.Maymala@ibsplc.com> wrote:
> Hello all,
> I was testing how much time a pg_dump backup would take to get restored.
> Initially, I tried it with psql (on a backup taken with pg_dumpall). It took
> me about one hour. I felt that I should target for a recovery time of 15
> minutes to half an hour. So I went through the blogs/documentation etc and
> switched to pg_dump and pg_restore. I tested only the database with the
> maximum volume of data (about 1.5 GB). With
> pg_restore -U postgres -v -d PROFICIENT --clean -Fc proficient.dmp
> it took about 45 minutes. I tried it with
> pg_restore -U postgres -j8 -v -d PROFICIENT --clean -Fc proficient.dmp
> Not much improvement there either. Have I missed something or 1.5 GB data on
> a machine with the following configuration will take about 45 minutes? There
> is nothing else running on the machine consuming memory or CPU. Out of 300
> odd tables, about 10 tables have millions of records, rest are all having a
> few thousand records at most.
>
> Here are the specs  ( a pc class  machine)-
>
> PostgreSQL 8.4.3 on i686-pc-linux-gnu
> CentOS release 5.2
> Intel(R) Pentium(R) D CPU 2.80GHz
> 2 GB RAM
> Storage is local disk.
>
> Postgresql parameters (what I felt are relevant) -
> max_connections = 100
> shared_buffers = 64MB
> work_mem = 16MB
> maintenance_work_mem = 16MB
> synchronous_commit on

I would suggest raising shared_buffers to perhaps 512MB and cranking
up checkpoint_segments to 10 or more.  Also, your email doesn't give
too much information about how many CPUs you have and what kind of
disk subsystem you are using (RAID?  how many disks?) so it's hard to
say if -j8 is reasonable.  That might be too high.
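
As a rough sketch, the two settings mentioned above would look like this in
postgresql.conf on an 8.4-era server. The values are only the starting points
suggested in this message, not tuned figures, and on a 2 GB machine a larger
shared_buffers may also require raising the kernel's shared memory limit
(an assumption about this particular system).

```
# postgresql.conf (illustrative starting points, not tuned values)
shared_buffers = 512MB        # takes effect only after a server restart
checkpoint_segments = 10      # "10 or more"; a configuration reload is enough
```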

Another thing I would recommend is that during the restore you use
tools like top and iostat to monitor the system.  You'll want to check
things like whether all the CPUs are in use, and how the disk activity
compares to the maximum you can generate using some other method
(perhaps dd).
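
One way to get the dd baseline mentioned above is sketched below, assuming GNU
dd. TESTFILE and the 64 MB size are illustrative choices, not part of the
original advice. For a meaningful number, point TESTFILE at the volume that
holds the database and write considerably more data. While the restore runs,
top (press 1 for per-CPU figures) and iostat -x 5 (from the sysstat package)
in separate terminals show CPU and disk utilization for comparison.

```shell
#!/bin/sh
# Rough sequential-write baseline with dd (assumes GNU coreutils).
# TESTFILE is a hypothetical scratch path; override it so that it lives on
# the same disk as the PostgreSQL data directory.
TESTFILE="${TESTFILE:-$(mktemp)}"

# conv=fdatasync makes dd flush to disk before reporting, so the figure
# reflects the disk rather than the page cache. The last line of dd's
# output is the throughput summary.
dd if=/dev/zero of="$TESTFILE" bs=1M count=64 conv=fdatasync 2>&1 | tail -n 1

# Record how much was written, then clean up the scratch file.
BYTES=$(( $(wc -c < "$TESTFILE") ))
rm -f "$TESTFILE"
echo "wrote $BYTES bytes"
```

If iostat shows the disk near this throughput during the restore, the restore
is probably I/O-bound and more parallel jobs are unlikely to help.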

One thing I've noticed (to my chagrin) is that if pg_restore is given
a set of options that are incompatible with parallel restore, it just
does a single-threaded restore.  The options you've specified look
right to me, but, again, examining exactly what is going on during the
restore should tell you if there's a problem in this area.
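
One possible way to check whether the restore is really running in parallel is
to count the server connections to the target database while pg_restore is
working. The helper below is a hypothetical sketch, not something from the
original message. It assumes psql can connect as the given user, and the count
includes the psql session that runs the query.

```shell
#!/bin/sh
# count_backends is a hypothetical helper: it prints the number of server
# connections to one database, using the pg_stat_activity system view.
# $1 = database name, $2 = user (defaults to postgres).
count_backends() {
    db="$1"
    user="${2:-postgres}"
    # -A: unaligned output, -t: tuples only, so just the number is printed.
    psql -U "$user" -d "$db" -At \
        -c "SELECT count(*) FROM pg_stat_activity WHERE datname = '$db'"
}
```

Running count_backends PROFICIENT a few times during a -j8 restore should show
several connections if the parallel mode is active; a steady count of two (one
restore connection plus the query itself) would suggest a single-threaded
restore.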

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
