Re: [GENERAL] pg_restore taking 4 hours! - Mailing list pgsql-performance

From Riccardo G. Facchini
Subject Re: [GENERAL] pg_restore taking 4 hours!
Date
Msg-id 20041201151917.17368.qmail@web13921.mail.yahoo.com
Whole thread Raw
In response to Re: pg_restore taking 4 hours!  (Shridhar Daithankar <ghodechhap@ghodechhap.net>)
List pgsql-performance
--- Shridhar Daithankar <__> wrote:

> On Wednesday 01 Dec 2004 4:46 pm, Rodrigo Carvalhaes wrote:
> > I need to find a solution for this because I am convincing
> customers
> > that are using SQL Server, DB2 and Oracle to change to PostgreSQL
> but
> > this customers have databases of 5GB!!! I am thinking that even
> with a
> > better server, the restore will take 2 days!
> >
> > My data:
> > Conectiva Linux 10 , Kernel 2.6.8
> > PostgreSQL 7.4.6.
> >
> > postgresql.conf modified parameters (the other parameters are the
> default)
> > tcpip_socket = true
> > max_connections = 30
> > shared_buffers = 30000
> > sort_mem = 4096
> > vacuum_mem = 8192
> > max_fsm_pages = 20000
> > max_fsm_relations = 1000
>
> Can you try bumping sort mem lot higher(basically whatever the
> machine can
> afford) so that index creation is faster?
>
> Just try setting sort mem for the restore session and see if it
> helps..
>
>  Shridhar
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 8: explain analyze is your friend
>

Yes, indexing is probably the issue.

You can always ask them to report how long does it take to restore
their M$-SQL, DB2 and Oracle from a scripting dump.

I've been restoring DB2 for a looong time (on different architectures)
and the main problem comes from indexing.

As an index is basically a dynamic structure that is created on the
physical data (the data stored on the table), what is normally saved is
the index DEFINITION, not the index itself, so this is recreated at
restore time.

Some DB2 architectures (and M$-SQL, and Oracle, and Sybase, and others.
others) may have a backup tool that is capable of saving the index
data, but is almost never used, as the index space itself can grow well
over the data size.

I'll give one example: we have one DB2 on iSeries that runs around the
70Gb of Data and Indexes. We do a full backup that occupies only 45Gb
of Data and we do that in a little more than 1 hour because we only
save the index definitions.

We know for sure that this full backup takes something between 5 and 7
hours because of the reindexing. I had this written down in the Restore
Procedure Manual, so the user can't complain (they know that the
procedure will eventually restore the data and the full functionality).

So, make sure that your client knows of their restore times.

One small trick that can help you:
FIRST restore the tables.
THEN restore the foreingn keys, the constraints and the triggers and
procedures.
LAST restore the indexes and views.
LATEST restore the security.

This way, if you have complicated views and indexes with a lot of info,
the procedure <<<may>>> be shorter.

regards,

R.

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Off-list Re: Alternatives to Dell?
Next
From: Steven Rosenstein
Date:
Subject: Re: Alternatives to Dell?