Thread: dump_all/restore times?
I'm testing a database against the 7.4 CVS, so I did a pg_dumpall and a restore on another system. The database has around 130 user tables which take up a bit under 15GB of disk space. The dump file was about 7GB. (That seems about right based on past experience, nobody ever said relational databases saved on disk space.) My question has to do with the time it is taking to do the restore on the 2nd system. I started the restore at 6PM last night, 18 hours later I'm hoping it is getting close to being done, as it has been building indexes for the past hour or so. The test system is quite a bit slower than the system I'm cloning, it is on a 133Mhz P2 vs a 2.2 Ghz Athlon, but it has a fairly fast 40 GB IDE disk drive. Is the amount of time it is taking within reasonable limits? (That will become a factor in deciding when and how to upgrade the production database, which is on yet another system.) -- Mike Nolan
nolan@celery.tssi.com wrote: > The dump file was about 7GB. (That seems about right based on past <snip> > I started the restore at 6PM last night, 18 hours later I'm hoping it is > getting close to being done, as it has been building indexes for the past > hour or so. > > The test system is quite a bit slower than the system I'm cloning, it is on > a 133Mhz P2 vs a 2.2 Ghz Athlon, but it has a fairly fast 40 GB IDE disk > drive. > > Is the amount of time it is taking within reasonable limits? A couple of years ago I loaded a database of ~7GB on a dual ppro 200 MHz server (with SCSI drives). I think it took about 24 hours. Based on that experience, I'd say your 18 hours on a 133 Mhz machine is not bad. The exact same data, later loaded onto a dual 1.4GHz pentium III server took about an hour. HTH, Joe
Joe Conway <mail@joeconway.com> writes: > nolan@celery.tssi.com wrote: >> Is the amount of time it is taking within reasonable limits? > A couple of years ago I loaded a database of ~7GB on a dual ppro 200 MHz > server (with SCSI drives). I think it took about 24 hours. Based on that > experience, I'd say your 18 hours on a 133 Mhz machine is not bad. It's a good idea to bump up sort_mem as high as you can before running the restore, so that the index builds run as fast as possible. Assuming that the new server isn't doing anything else, you could realistically set sort_mem to maybe a quarter or a third of physical RAM for this purpose. (Don't forget to knock it back down afterwards...) I have a sneaking suspicion that creation of foreign key constraints may be unreasonably inefficient during a restore, too. Have not had a chance to check up on it though. Next time you run such a restore, could you turn on log_statement and log_timestamp (or log_duration if you have it) so we can see which steps take the most time? regards, tom lane
> I have a sneaking suspicion that creation of foreign key constraints may > be unreasonably inefficient during a restore, too. Have not had a > chance to check up on it though. Next time you run such a restore, > could you turn on log_statement and log_timestamp (or log_duration if > you have it) so we can see which steps take the most time? No foreign keys, but a lot of fairly large tables (by my usual standards), several of which are multiply indexed. It didn't seem to get into the indexing part until the last 2-3 hours, I think most of that time was just loading data. I can run the load again, this system was set up so I could develop more familiarity with FreeBSD and to have a place to run CVS, the data is just there to have something to test 7.4 with. I was also planning to set up 7.3.3 on it (probably with a subset of the data) so that I could do some comparative timings during beta testing. I will try the restore under 7.3.3 to see if it performs much differently, though after reading Joe's note I suspect that's probably just how long it takes on this box. -- Mike Nolan
>>>>> "n" == nolan <nolan@celery.tssi.com> writes: n> It didn't seem to get into the indexing part until the last 2-3 hours, n> I think most of that time was just loading data. n> I can run the load again, this system was set up so I could develop more n> familiarity with FreeBSD and to have a place to run CVS, the data n> is just there to have something to test 7.4 with. Last dump/restore I did when moving from 7.1 to 7.2 took something like 8 hours, but that was many GB ago ;-) For my data, the load takes about 2-3 hours, and the rest of the time indexing. I don't see how FK's really will affect restore since they are not installed until after the data is fully loaded. I'll have to keep in mind the sort_mem setting when I do this again... -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D. Khera Communications, Inc. Internet: khera@kciLink.com Rockville, MD +1-240-453-8497 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: TL> It's a good idea to bump up sort_mem as high as you can before running TL> the restore, so that the index builds run as fast as possible. Assuming Wouldn't it make a cool option to pg_restore to be able to set sort_mem for the current connection? -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D. Khera Communications, Inc. Internet: khera@kciLink.com Rockville, MD +1-240-453-8497 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
>TL> It's a good idea to bump up sort_mem as high as you can before running >TL> the restore, so that the index builds run as fast as possible. Assuming > >Wouldn't it make a cool option to pg_restore to be able to set >sort_mem for the current connection? > > > I was thinking the very same thing. If it is a must to always bump up sort_mem before running restore why can't the restore utility do it? Take a portion of the currently available memory or ask a user interactively etc. Kaarel
> I was thinking the very same thing. If it is a must to always bump up > sort_mem before running restore why can't the restore utility do it? > Take a portion of the currently available memory or ask a user > interactively etc. This would need to be a command line option, because there may be systems on which bumping up sort_mem should not be done, for reasons unrelated to performance during the restore, and the restore may not always be done interactively. -- Mike Nolan
<blockquote type="cite" cite="mid20030723075229.22738.qmail@celery.tssi.com"> I was thinking the very same thing. If it is a must to always bump up sort_mem before running restore why can't the restore utility do it? Take a portion of the currently available memory or ask a user interactively etc. This would need to be a command line option, because there may be systems on which bumping up sort_mem should not be done, for reasons unrelated to performance during the restore, and the restore may not always be done interactively. If this can not be done then at least the sort_mem thing should be mentioned in Chapter 9. Backup and Restore in the PostgreSQL Administrator's Guide. http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#sortmem too only slightly mentiones data dumps as opposed to "before running the restore" suggested by Tom. Kaarel
>>I was thinking the very same thing. If it is a must to always bump up >>sort_mem before running restore why can't the restore utility do it? >>Take a portion of the currently available memory or ask a user >>interactively etc. >> >> > >This would need to be a command line option, because there may be systems >on which bumping up sort_mem should not be done, for reasons unrelated to >performance during the restore, and the restore may not always be done >interactively. > > If this can not be done then at least the sort_mem thing should be mentioned in Chapter 9. Backup and Restore in the PostgreSQL Administrator's Guide. http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#sortmem too only slightly mentiones data dumps as opposed to "before running the restore" suggested by Tom. Kaarel
Here is an applied patch that suggests increasing sort_mem during restore. I though of putting it near the database restore section or in the manual, but is seemed more centralized to put it near the actual parameter. --------------------------------------------------------------------------- Kaarel wrote: > > >>I was thinking the very same thing. If it is a must to always bump up > >>sort_mem before running restore why can't the restore utility do it? > >>Take a portion of the currently available memory or ask a user > >>interactively etc. > >> > >> > > > >This would need to be a command line option, because there may be systems > >on which bumping up sort_mem should not be done, for reasons unrelated to > >performance during the restore, and the restore may not always be done > >interactively. > > > > > If this can not be done then at least the sort_mem thing should be > mentioned in Chapter 9. Backup and Restore in the PostgreSQL > Administrator's Guide. > http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#sortmem too > only slightly mentiones data dumps as opposed to "before running the > restore" suggested by Tom. > > Kaarel > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 Index: doc/src/sgml/runtime.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/runtime.sgml,v retrieving revision 1.194 diff -c -c -r1.194 runtime.sgml *** doc/src/sgml/runtime.sgml 22 Jul 2003 20:29:13 -0000 1.194 --- doc/src/sgml/runtime.sgml 23 Jul 2003 20:13:53 -0000 *************** *** 887,893 **** times the value of <varname>SORT_MEM</varname>. Sort operations are used by <literal>ORDER BY</>, merge joins, and <command>CREATE INDEX</>. Hash tables are used in hash joins, hash-based aggregation, and ! hash-based processing of <literal>IN</> subqueries. </para> </listitem> </varlistentry> --- 887,895 ---- times the value of <varname>SORT_MEM</varname>. Sort operations are used by <literal>ORDER BY</>, merge joins, and <command>CREATE INDEX</>. Hash tables are used in hash joins, hash-based aggregation, and ! hash-based processing of <literal>IN</> subqueries. Because ! <command>CREATE INDEX</> is used when restoring a database, it might ! be good to temporary increase this value during a restore. </para> </listitem> </varlistentry>
>>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes: BM> Here is an applied patch that suggests increasing sort_mem during BM> restore. I though of putting it near the database restore section or in BM> the manual, but is seemed more centralized to put it near the actual BM> parameter. But nobody will find it there...
Vivek Khera wrote: > >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes: > > BM> Here is an applied patch that suggests increasing sort_mem during > BM> restore. I though of putting it near the database restore section or in > BM> the manual, but is seemed more centralized to put it near the actual > BM> parameter. > > But nobody will find it there... OK, then where should I put it? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
>>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes: BM> Vivek Khera wrote: >> >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes: >> BM> Here is an applied patch that suggests increasing sort_mem during BM> restore. I though of putting it near the database restore section or in BM> the manual, but is seemed more centralized to put it near the actual BM> parameter. >> >> But nobody will find it there... BM> OK, then where should I put it? In the section describing how to do a restore. I'd be reading that as opposed to the section that describes tuning the server.
I have added a mention of sort_mem in the restore docs --- patch attached. --------------------------------------------------------------------------- Vivek Khera wrote: > >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes: > > BM> Vivek Khera wrote: > >> >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes: > >> > BM> Here is an applied patch that suggests increasing sort_mem during > BM> restore. I though of putting it near the database restore section or in > BM> the manual, but is seemed more centralized to put it near the actual > BM> parameter. > >> > >> But nobody will find it there... > > BM> OK, then where should I put it? > > > In the section describing how to do a restore. I'd be reading that as > opposed to the section that describes tuning the server. > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 Index: doc/src/sgml/backup.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/backup.sgml,v retrieving revision 2.27 diff -c -c -r2.27 backup.sgml *** doc/src/sgml/backup.sgml 1 Aug 2003 01:01:52 -0000 2.27 --- doc/src/sgml/backup.sgml 17 Aug 2003 03:11:24 -0000 *************** *** 153,158 **** --- 153,162 ---- </para> </important> + <tip> + Restore performance can be improved by increasing <literal>SORT_MEM</> + (see <xref linkend="runtime-config-resource-memory">). + </tip> </sect2> <sect2 id="backup-dump-all">