Thread: Suggestions needed about how to dump/restore a database
Hi all,

I've got a production DB that is bigger than 2GB, and dumping it takes more than 12 hours. I have a new server to replace this old one, and I have to restore the DB's dump onto it. The problem is I can't afford to have the server out of business for so long, so I need your advice on how you'd do this dump/restore. Most of the data sits in two tables (statistics data), so I was thinking of dumping/restoring everything except these two tables, and once the new server is running again I'd dump/restore that data. The problem is I don't know exactly how to do this.

Any suggestion?

Thanks
--
Arnau
On Tue, 19 Dec 2006, Arnau wrote:

> I've got a production DB that is bigger than 2GB, and dumping it takes
> more than 12 hours. [...] The problem is I can't afford to have the
> server out of business for so long.

Arnau,

2GB and it takes 12 hours? What sort of server is this running on? Does your postgresql.conf have all default values, perhaps? I routinely dump DBs that are 4-8GB in size and it takes about 10-15 minutes.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
Hi Jeff,

> 2GB and it takes 12 hours? What sort of server is this running on?
> Does your postgresql.conf have all default values, perhaps? I routinely
> dump DBs that are 4-8GB in size and it takes about 10-15 minutes.

It's a dual Xeon with 4 GB of RAM and a RAID 5. It probably has the default values. Any suggestion about which parameters I should change to speed it up?

--
Arnau
On Tue, 19 Dec 2006, Arnau wrote:

> It's a dual Xeon with 4 GB of RAM and a RAID 5. It probably has the
> default values. Any suggestion about which parameters I should change
> to speed it up?

Have a look at:

http://www.powerpostgresql.com/PerfList

and

http://www.powerpostgresql.com/Downloads/annotated_conf_80.html

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
On 12/19/06, Arnau <arnaulist@andromeiberica.com> wrote:

> I've got a production DB that is bigger than 2GB, and dumping it takes
> more than 12 hours. [...] Most of the data sits in two tables
> (statistics data), so I was thinking of dumping/restoring everything
> except these two tables, and once the server is running again I'd
> dump/restore that data.

That's strange, we dump 20+GB of data in 2 hours or so.

If the two biggest tables are *not critical* for application availability, you can dump their data out separately into two SQL files and restore it later. Once that data is dumped out, you can drop the two tables from the production DB before dumping the rest, and see how long that takes.

pg_dump -t schema.table -h hostname -U user <dbname>

can dump out a specific schema.table. (The exact options are version dependent; which version are you using, by the way?)

It is always desirable to know the root cause of why pg_dump is taking so long on your machine, but in the worst case you could take the approach you suggested.
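For example, a rough sketch of that workflow (table names, hosts and files are just placeholders; older pg_dump versions take only one -t switch per run, so repeat the command per table):

  # dump only the two statistics tables from the old server
  pg_dump -t stats_table1 -h oldhost -U dbuser mydb > stats_table1.sql
  pg_dump -t stats_table2 -h oldhost -U dbuser mydb > stats_table2.sql

  # later, once the new server is in business, load them back
  psql -h newhost -U dbuser mydb < stats_table1.sql
  psql -h newhost -U dbuser mydb < stats_table2.sql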
One other option is to shut the database down completely and then copy the file system over to the new server. I have done this when I needed to move a very large database to a new server. I can copy 500GB in a couple of hours, whereas restoring my large database backups would take 10+ hours. Just make sure you are keeping postgres at the same version level.
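A minimal sketch of what I mean (the data directory path and hostname are placeholders; use whatever your install actually uses, and the new box must run the same PostgreSQL version):

  # on the old server, with the database shut down
  pg_ctl -D /var/lib/pgsql/data stop
  rsync -a /var/lib/pgsql/data/ newserver:/var/lib/pgsql/data/

  # then on the new server
  pg_ctl -D /var/lib/pgsql/data start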
HTH,
Chris
Chris Hoover wrote:

> One other option is to shut the database down completely and then copy
> the file system over to the new server. [...] Just make sure you are
> keeping postgres at the same version level.

How many tables have you got in your database?

If you only have a few tables you can dump them one at a time:

pg_dump -t <tablename> ...

Olivier
Hi,

12h is a lot. I use the COPY statement with the binary option; it is faster and the output takes less space. Try pg_dump with schema only, and export your 2 big tables with 2 single COPY statements. That way I can export my DB in less than 3 minutes without downtime (pg_dump produces a 9GB file, the binary dump 4.4GB).

Thomas

Arnau wrote:

> I've got a production DB that is bigger than 2GB, and dumping it takes
> more than 12 hours. [...]
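A rough sketch of that approach, assuming the two big tables are called stats_a and stats_b (names and paths are placeholders; COPY to and from a server-side file runs as the backend and needs superuser rights, and the rest of the data still needs a regular dump):

  # on the old server: schema only, plus the two big tables as binary files
  pg_dump --schema-only mydb > schema.sql
  psql mydb -c "COPY stats_a TO '/tmp/stats_a.bin' WITH BINARY"
  psql mydb -c "COPY stats_b TO '/tmp/stats_b.bin' WITH BINARY"

  # on the new server: recreate the schema, then load the binary files back
  psql mydb -f schema.sql
  psql mydb -c "COPY stats_a FROM '/tmp/stats_a.bin' WITH BINARY"
  psql mydb -c "COPY stats_b FROM '/tmp/stats_b.bin' WITH BINARY"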
On Wed, 2006-12-20 at 10:37 +0100, Olivier Boissard wrote:

> How many tables have you got in your database?
>
> If you only have a few tables you can dump them one at a time

This approach can get you into serious trouble. Say you have two tables (a and b) that reference one another: you dump table a at time t1 and table b at time t2. If in between t1 and t2 you delete a tuple from a and its referenced tuple from b, your dump is garbage.

--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
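One way around that, sketched below: take the per-table exports inside a single serializable transaction, so every COPY sees the same snapshot of the database (table names and paths are placeholders; COPY to a server-side file needs superuser rights):

  BEGIN;
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  -- both exports now come from the same snapshot
  COPY a TO '/tmp/a.bin' WITH BINARY;
  COPY b TO '/tmp/b.bin' WITH BINARY;
  COMMIT;

A single pg_dump run of the whole database avoids the problem too, since pg_dump dumps everything from within one transaction.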
Hi all,

> I've got a production DB that is bigger than 2GB, and dumping it takes
> more than 12 hours. [...]

Jeff's answer made me check the configuration parameters. That machine had the default ones, so I tweaked them a bit and now I get a dump in about 2 hours. To dump the DB I'm using the following command:

/usr/bin/pg_dump -o -b -Fc $db > $backup_file

And as a result I get a file of 2.2GB. The improvement has been quite big, but it is still very far from the 10-15 minutes Jeff reports or Thomas' 3 minutes.

The version I'm running is 7.4.2 and the postgresql.conf parameters are the following:

# - Memory -

shared_buffers = 10000          # min 16, at least max_connections*2, 8KB each
sort_mem = 10240                # min 64, size in KB
vacuum_mem = 81920              # min 1024, size in KB

# - Free Space Map -

max_fsm_pages = 40000           # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 2000        # min 100, ~50 bytes each

Any suggestions to reduce the dump time even more?

Thank you very much.
--
Arnau
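For the restore side of that custom-format (-Fc) dump, something along these lines should work on the new server (the database name is a placeholder; the target database has to exist first):

  # create an empty database, then restore into it; -v just prints progress
  createdb mydb
  pg_restore -d mydb -v $backup_file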
On Thu, 21 Dec 2006, Arnau wrote:

> And as a result I get a file of 2.2GB. The improvement has been quite
> big, but it is still very far from the 10-15 minutes Jeff reports or
> Thomas' 3 minutes.

Just FYI, if you were reporting the DB size by the size of the compressed dump file, that's not exactly accurate. If your dump file is 2.2GB, I'd guess your actual DB size is tremendously larger, maybe more like 10GB. What does the output of 'du -sh /var/lib/pgsql/data' show (substitute wherever your PGDATA directory is actually located)? That also assumes you only have one database on the system. A better way is to install the dbsize contrib module and use it to determine the DB size.

> The version I'm running is 7.4.2 and the postgresql.conf parameters
> are the following:

You should at least upgrade to 7.4.14. The version you are running has some potential data loss bugs in it.

For some reason I can't find this email thread in the archives, so if you've answered this already, please forgive me: what does the disk subsystem look like?

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954
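A quick sketch of both checks (the PGDATA path and database name are examples, and I'm going from memory on the contrib dbsize function name, so treat that part as a hint rather than gospel):

  # on-disk size of the whole cluster
  du -sh /var/lib/pgsql/data

  # with the contrib dbsize module installed, something like:
  psql mydb -c "SELECT database_size('mydb');"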
Arnau wrote:

> Jeff's answer made me check the configuration parameters. That machine
> had the default ones, so I tweaked them a bit and now I get a dump in
> about 2 hours. [...]
>
> Any suggestions to reduce the dump time even more?

To reduce the dump/restore time I suggest you use a unix pipe:

pg_dump $PSOPTIONS | psql $database

I always do it like this, as it's a lot faster.

Olivier
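For moving between the two servers directly, a sketch of that idea with explicit hosts (hostnames and the database name are placeholders; plain-text pg_dump output is what you want here, since the custom -Fc format cannot be piped into psql):

  # stream the dump from the old server straight into the new one,
  # without writing an intermediate file to disk
  createdb -h newhost mydb
  pg_dump -h oldhost mydb | psql -h newhost mydb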