Thread: dump_all/restore times?

dump_all/restore times?

From
nolan@celery.tssi.com
Date:
I'm testing a database against the 7.4 CVS, so I did a pg_dumpall
and a restore on another system.

The database has around 130 user tables which take up a bit under 15GB of
disk space.

The dump file was about 7GB.  (That seems about right based on past
experience, nobody ever said relational databases saved on disk space.)

My question has to do with the time it is taking to do the restore on the
2nd system.

I started the restore at 6PM last night, 18 hours later I'm hoping it is
getting close to being done, as it has been building indexes for the past
hour or so.

The test system is quite a bit slower than the system I'm cloning, it is on
a 133Mhz P2 vs a 2.2 Ghz Athlon, but it has a fairly fast 40 GB IDE disk
drive.

Is the amount of time it is taking within reasonable limits?  (That will
become a factor in deciding when and how to upgrade the production database,
which is on yet another system.)
--
Mike Nolan

Re: dump_all/restore times?

From
Joe Conway
Date:
nolan@celery.tssi.com wrote:
> The dump file was about 7GB.  (That seems about right based on past
<snip>
> I started the restore at 6PM last night, 18 hours later I'm hoping it is
> getting close to being done, as it has been building indexes for the past
> hour or so.
>
> The test system is quite a bit slower than the system I'm cloning, it is on
> a 133Mhz P2 vs a 2.2 Ghz Athlon, but it has a fairly fast 40 GB IDE disk
> drive.
>
> Is the amount of time it is taking within reasonable limits?

A couple of years ago I loaded a database of ~7GB on a dual ppro 200 MHz
server (with SCSI drives). I think it took about 24 hours. Based on that
experience, I'd say your 18 hours on a 133 Mhz machine is not bad.

The exact same data, later loaded onto a dual 1.4GHz pentium III server
took about an hour.

HTH,

Joe


Re: dump_all/restore times?

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> nolan@celery.tssi.com wrote:
>> Is the amount of time it is taking within reasonable limits?

> A couple of years ago I loaded a database of ~7GB on a dual ppro 200 MHz
> server (with SCSI drives). I think it took about 24 hours. Based on that
> experience, I'd say your 18 hours on a 133 Mhz machine is not bad.

It's a good idea to bump up sort_mem as high as you can before running
the restore, so that the index builds run as fast as possible.  Assuming
that the new server isn't doing anything else, you could realistically
set sort_mem to maybe a quarter or a third of physical RAM for this
purpose.  (Don't forget to knock it back down afterwards...)

I have a sneaking suspicion that creation of foreign key constraints may
be unreasonably inefficient during a restore, too.  Have not had a
chance to check up on it though.  Next time you run such a restore,
could you turn on log_statement and log_timestamp (or log_duration if
you have it) so we can see which steps take the most time?

            regards, tom lane

Re: dump_all/restore times?

From
nolan@celery.tssi.com
Date:
> I have a sneaking suspicion that creation of foreign key constraints may
> be unreasonably inefficient during a restore, too.  Have not had a
> chance to check up on it though.  Next time you run such a restore,
> could you turn on log_statement and log_timestamp (or log_duration if
> you have it) so we can see which steps take the most time?

No foreign keys, but a lot of fairly large tables (by my usual standards),
several of which are multiply indexed.

It didn't seem to get into the indexing part until the last 2-3 hours,
I think most of that time was just loading data.

I can run the load again, this system was set up so I could develop more
familiarity with FreeBSD and to have a place to run CVS, the data
is just there to have something to test 7.4 with.

I was also planning to set up 7.3.3 on it (probably with a subset of the
data) so that I could do some comparative timings during beta testing.
I will try the restore under 7.3.3 to see if it performs much differently,
though after reading Joe's note I suspect that's probably just how long
it takes on this box.
--
Mike Nolan



Re: dump_all/restore times?

From
Vivek Khera
Date:
>>>>> "n" == nolan  <nolan@celery.tssi.com> writes:

n> It didn't seem to get into the indexing part until the last 2-3 hours,
n> I think most of that time was just loading data.

n> I can run the load again, this system was set up so I could develop more
n> familiarity with FreeBSD and to have a place to run CVS, the data
n> is just there to have something to test 7.4 with.

Last dump/restore I did when moving from 7.1 to 7.2 took something
like 8 hours, but that was many GB ago ;-)

For my data, the load takes about 2-3 hours, and the rest of the time
indexing.  I don't see how FK's really will affect restore since they
are not installed until after the data is fully loaded.

I'll have to keep in mind the sort_mem setting when I do this again...


--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: dump_all/restore times?

From
Vivek Khera
Date:
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:

TL> It's a good idea to bump up sort_mem as high as you can before running
TL> the restore, so that the index builds run as fast as possible.  Assuming

Wouldn't it make a cool option to pg_restore to be able to set
sort_mem for the current connection?

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: dump_all/restore times?

From
Kaarel
Date:
>TL> It's a good idea to bump up sort_mem as high as you can before running
>TL> the restore, so that the index builds run as fast as possible.  Assuming
>
>Wouldn't it make a cool option to pg_restore to be able to set
>sort_mem for the current connection?
>
>
>
I was thinking the very same thing. If it is a must to always bump up
sort_mem before running restore why can't the restore utility do it?
Take a portion of the currently available memory or ask a user
interactively etc.

Kaarel


Re: dump_all/restore times?

From
nolan@celery.tssi.com
Date:
> I was thinking the very same thing. If it is a must to always bump up
> sort_mem before running restore why can't the restore utility do it?
> Take a portion of the currently available memory or ask a user
> interactively etc.

This would need to be a command line option, because there may be systems
on which bumping up sort_mem should not be done, for reasons unrelated to
performance during the restore, and the restore may not always be done
interactively.
--
Mike Nolan

Re: dump_all/restore times?

From
Kaarel
Date:
<blockquote type="cite"
 cite="mid20030723075229.22738.qmail@celery.tssi.com">

    I was thinking the very same thing. If it is a must to always bump up
sort_mem before running restore why can't the restore utility do it?
Take a portion of the currently available memory or ask a user
interactively etc.



This would need to be a command line option, because there may be systems
on which bumping up sort_mem should not be done, for reasons unrelated to
performance during the restore, and the restore may not always be done
interactively.



If this can not be done then at least the sort_mem thing should be
mentioned in Chapter 9. Backup and Restore in the PostgreSQL
Administrator's Guide.
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#sortmem
too only slightly mentiones data dumps as opposed to "before running
the restore" suggested by Tom.

Kaarel

Re: dump_all/restore times?

From
Kaarel
Date:
>>I was thinking the very same thing. If it is a must to always bump up
>>sort_mem before running restore why can't the restore utility do it?
>>Take a portion of the currently available memory or ask a user
>>interactively etc.
>>
>>
>
>This would need to be a command line option, because there may be systems
>on which bumping up sort_mem should not be done, for reasons unrelated to
>performance during the restore, and the restore may not always be done
>interactively.
>
>
If this can not be done then at least the sort_mem thing should be
mentioned in Chapter 9. Backup and Restore in the PostgreSQL
Administrator's Guide.
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#sortmem too
only slightly mentiones data dumps as opposed to "before running the
restore" suggested by Tom.

Kaarel


Re: dump_all/restore times?

From
Bruce Momjian
Date:
Here is an applied patch that suggests increasing sort_mem during
restore.  I though of putting it near the database restore section or in
the manual, but is seemed more centralized to put it near the actual
parameter.


---------------------------------------------------------------------------

Kaarel wrote:
>
> >>I was thinking the very same thing. If it is a must to always bump up
> >>sort_mem before running restore why can't the restore utility do it?
> >>Take a portion of the currently available memory or ask a user
> >>interactively etc.
> >>
> >>
> >
> >This would need to be a command line option, because there may be systems
> >on which bumping up sort_mem should not be done, for reasons unrelated to
> >performance during the restore, and the restore may not always be done
> >interactively.
> >
> >
> If this can not be done then at least the sort_mem thing should be
> mentioned in Chapter 9. Backup and Restore in the PostgreSQL
> Administrator's Guide.
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#sortmem too
> only slightly mentiones data dumps as opposed to "before running the
> restore" suggested by Tom.
>
> Kaarel
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/runtime.sgml,v
retrieving revision 1.194
diff -c -c -r1.194 runtime.sgml
*** doc/src/sgml/runtime.sgml    22 Jul 2003 20:29:13 -0000    1.194
--- doc/src/sgml/runtime.sgml    23 Jul 2003 20:13:53 -0000
***************
*** 887,893 ****
          times the value of <varname>SORT_MEM</varname>. Sort operations are used
          by <literal>ORDER BY</>, merge joins, and <command>CREATE INDEX</>.
          Hash tables are used in hash joins, hash-based aggregation, and
!         hash-based processing of <literal>IN</> subqueries.
         </para>
        </listitem>
       </varlistentry>
--- 887,895 ----
          times the value of <varname>SORT_MEM</varname>. Sort operations are used
          by <literal>ORDER BY</>, merge joins, and <command>CREATE INDEX</>.
          Hash tables are used in hash joins, hash-based aggregation, and
!         hash-based processing of <literal>IN</> subqueries.  Because
!         <command>CREATE INDEX</> is used when restoring a database, it might
!         be good to temporary increase this value during a restore.
         </para>
        </listitem>
       </varlistentry>

Re: dump_all/restore times?

From
Vivek Khera
Date:
>>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:

BM> Here is an applied patch that suggests increasing sort_mem during
BM> restore.  I though of putting it near the database restore section or in
BM> the manual, but is seemed more centralized to put it near the actual
BM> parameter.

But nobody will find it there...

Re: dump_all/restore times?

From
Bruce Momjian
Date:
Vivek Khera wrote:
> >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
>
> BM> Here is an applied patch that suggests increasing sort_mem during
> BM> restore.  I though of putting it near the database restore section or in
> BM> the manual, but is seemed more centralized to put it near the actual
> BM> parameter.
>
> But nobody will find it there...

OK, then where should I put it?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: dump_all/restore times?

From
Vivek Khera
Date:
>>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:

BM> Vivek Khera wrote:
>> >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
>>
BM> Here is an applied patch that suggests increasing sort_mem during
BM> restore.  I though of putting it near the database restore section or in
BM> the manual, but is seemed more centralized to put it near the actual
BM> parameter.
>>
>> But nobody will find it there...

BM> OK, then where should I put it?


In the section describing how to do a restore.  I'd be reading that as
opposed to the section that describes tuning the server.

Re: dump_all/restore times?

From
Bruce Momjian
Date:
I have added a mention of sort_mem in the restore docs --- patch
attached.

---------------------------------------------------------------------------

Vivek Khera wrote:
> >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
>
> BM> Vivek Khera wrote:
> >> >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >>
> BM> Here is an applied patch that suggests increasing sort_mem during
> BM> restore.  I though of putting it near the database restore section or in
> BM> the manual, but is seemed more centralized to put it near the actual
> BM> parameter.
> >>
> >> But nobody will find it there...
>
> BM> OK, then where should I put it?
>
>
> In the section describing how to do a restore.  I'd be reading that as
> opposed to the section that describes tuning the server.
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/backup.sgml,v
retrieving revision 2.27
diff -c -c -r2.27 backup.sgml
*** doc/src/sgml/backup.sgml    1 Aug 2003 01:01:52 -0000    2.27
--- doc/src/sgml/backup.sgml    17 Aug 2003 03:11:24 -0000
***************
*** 153,158 ****
--- 153,162 ----
      </para>
     </important>

+    <tip>
+     Restore performance can be improved by increasing <literal>SORT_MEM</>
+     (see <xref linkend="runtime-config-resource-memory">).
+    </tip>
    </sect2>

    <sect2 id="backup-dump-all">