Re: Speed/Performance with pg_restore - Mailing list pgsql-general

From Richard Huxton
Subject Re: Speed/Performance with pg_restore
Msg-id 200306090852.52557.dev@archonet.com
In response to Speed/Performance with pg_restore  (Matthew Rudolph <mrudolph@zetec.com>)
List pgsql-general
On Friday 06 Jun 2003 7:13 pm, Matthew Rudolph wrote:
> I am currently unable to search the archive so I will ask even though I
> am sure something similar has been asked before.
>
> I am wondering why a pg_restore is so slow. I have a db that was dumped
> with columns and inserts, which I know is slower than using the copy
> statements, but why are inserts so slow and is there anything I can
> do to make it faster? Is there any documentation that talks about this?

Inserts are slow because (by default) each one runs in its own transaction, and
you probably still have all your indexes, constraints, triggers etc. in place.
That can mean a lot of checking and a lot of disk writes before each row is
committed. There's plenty on this in the mailing-list archives (if they're up
again). INSERTs are usually grouped into transaction batches of 1,000 to
10,000.
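As a sketch of what that batching looks like in a plain-SQL dump (the table
and column names here are made up for illustration):

```sql
BEGIN;
INSERT INTO widgets (id, name) VALUES (1, 'alpha');
INSERT INTO widgets (id, name) VALUES (2, 'beta');
-- ... up to a few thousand rows per batch ...
COMMIT;
```

With the rows wrapped like this, the server does one commit (and the disk
flush that goes with it) per batch instead of one per row, which is where
most of the speedup comes from.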

> I have a db that as plain text is 6.2MB, not much, but when I restore it
> (using psql for plain text) it takes over 6 min.
> That same db created with pg_dump -DFc and then run through pg_restore
> takes over 12 minutes.

If dumping as columns+inserts is slow, then DON'T DO IT. Try "pg_dump -Fc" and
then pg_restore. If that's still slow, then there's something worth looking at.

> My machine is an AMD xp2000 with 500MB ram running win2000 and
> postgresql 7.3.2, obviously through cygwin.
>
> I just can't imagine a multi gigabyte database being able to be restored
> via pg_restore.

A multi-gig restore on your hardware is not going to be pretty. We're talking
hours of downtime, not minutes.

> The reason I think I need to use insert statements is for longevity. The
> customers may need to go on a witch hunt years down the road and my
> thinking was that if this was just SQL, it will always work.

Ah - I see what you're getting at. Dump the database twice - once as inserts
and once in custom format. Use custom for restores, but keep inserts around
for reference.
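Concretely, that two-dump approach might look like this (assuming a database
named "mydb"; note that -D is the 7.3-era short option for --column-inserts):

```shell
# Custom-format dump: compressed, fast to restore via pg_restore
pg_dump -Fc mydb > mydb.dump

# Plain-SQL dump with per-column INSERT statements: slow to restore,
# but readable by any SQL tool years from now
pg_dump -D mydb > mydb.sql

# When you actually need to restore, use the custom-format dump
pg_restore -d mydb mydb.dump
```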

To be honest, except in the case of historical legal action/audits I can't
imagine what use a years-old database dump would be.

--
  Richard Huxton
