Thread: How to back up a PostgreSQL database of 80 GByte?
Hello,

I am running PostgreSQL under Debian GNU/Linux 3.0r0 on an Ultra-ATA-133 RAID-1 with three 120 GByte IBM drives. Now I have the problem of backing the stuff up.

I was thinking of exporting all tables separately to text files, gzipping them and backing those up. There will be around 830.000 million tables!

How can I do an effective backup so that I can still get information out of it if all else fails? Text files are perfect in general, because you can read them with an editor or browse them with mc ;-))

Any suggestions are welcome... Oh yes, I would like to back up to my CD-R55S, and I would like to know what the compression ratio of a gzip'ed database dump is.

Thanks in advance
Michelle
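A rough sketch of the per-table export described above, assuming a database name of "mydb" and a /backup path (both placeholders); with hundreds of thousands of tables a loop like this would be painfully slow, which is part of what the replies below address:

    #!/bin/sh
    # dump each user table to its own gzip'ed text file
    # (the NOT LIKE filter skips the pg_* system catalogs)
    for t in $(psql -t -c "SELECT tablename FROM pg_tables WHERE tablename NOT LIKE 'pg\\_%'" mydb); do
        pg_dump -t "$t" mydb | gzip -9 > "/backup/$t.sql.gz"
    done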
On Thu, 19 Dec 2002, Michelle Konzack wrote:

> I am running PostgreSQL under Debian GNU/Linux 3.0r0 on an Ultra-
> ATA-133 RAID-1 with three 120 GByte IBM drives.
>
> Now I have the problem of backing the stuff up.
>
> I was thinking of exporting all tables separately to text files,
> gzipping them and backing those up. There will be around 830.000
> million tables!
>
> How can I do an effective backup so that I can still get information
> out of it if all else fails?

I would put a large hard drive or two into caddies you can remove from your machine (get spare backplanes in case the building burns down or something and you need to restore) and back up onto those. Anything else is gonna be slow OR expensive. Fast and cheap backup is just not a reality for 120 gigs of data.

Going to CDR is gonna need about 150 CDRs or so, and a lot of patience. You might be able to get some kind of jukebox with four CDR burners or something in it cheap (the NSM Mercuries are VERY solid and reliable, but not cheap) to automate things.

I'd also look at some of the larger tape drives out now. I think Exabyte makes a packet-driven tape drive (the VXA drive) that can hold 80 gigs uncompressed for less than $1,000 US. Not bad. If your data compresses reasonably well, it should all fit on one tape. Note that the packet-based VXA drives aren't nearly as fast as the regular Mammoth drives, but they offer superior recoverability and survivability in the case of accidents. I've seen the VXA drives running before and they are quite nice units.

If you have to burn your data to a disc of some kind, look into a DVD writer, as it'll likely pay for itself just on the lower cost of your time and the media needed to supply your backup needs (150 CDRWs versus 20 DVDRWs).
> I was thinking of exporting all tables separately to text files,
> gzipping them and backing those up. There will be around 830.000
> million tables!

Ignoring the more important hardware issue, you should consider using bzip2 instead of gzip, which can save you a lot of disk space at the cost of a slightly longer time to compress. The built-in checksums and the bzip2recover program are nice things to have as well.

I am not sure how many tables you actually have, but saving them all to separate files may not be the most efficient way: consider having pg_dump dump more than one table, or consider tarring many tables into one file. Midnight Commander can also read tar files.

Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200212210817
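A minimal sketch of the two alternatives Greg mentions; the database name "mydb" and the paths are made up for illustration:

    # one bzip2-compressed dump of the whole database instead of a file per table
    pg_dump mydb | bzip2 -9 > /backup/mydb.sql.bz2

    # or: keep the per-table text files but roll them into a single tar archive,
    # which Midnight Commander can still browse
    tar -cf - /backup/tables | bzip2 -9 > /backup/tables.tar.bz2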
----- Original Message -----
From: "Michelle Konzack" <linux.mailinglists@freenet.de>
To: <pgsql-general@postgresql.org>
Sent: Friday, December 20, 2002 12:55 AM
Subject: [GENERAL] How to backup a postgreSQL of 80 GByte ?

> I am running PostgreSQL under Debian GNU/Linux 3.0r0 on an Ultra-
> ATA-133 RAID-1 with three 120 GByte IBM drives.
>
> Now I have the problem of backing the stuff up.
>
> I was thinking of exporting all tables separately to text files,
> gzipping them and backing those up. There will be around 830.000
> million tables!

Call me suspicious, but 830 THOUSAND millions of tables in 80 GByte is a little too much to believe (less than 1 bit per table on average)... even if you also want to dump views, for example, which, given the size of the db, I can't imagine (managing such a large db you can't be that lame ;) ). Is 80 GB the binary DB size, the dump size, or the compressed dump size? Is it really 830 billion tables, or is it 830 thousand/million? Just out of curiosity...

As for the hardware, I agree with Scott regarding tapes. The typical thing to back up/archive to. As for its speed -- well, let's hope you won't have to restore from it ;)

As for compression, bzip2 is really much better on text files. Also don't forget the "-9" flag (though I think this should be the default). Another surprisingly good compression method is using plain zip (!) twice (!). I only know one that's even better -- it's zip.rar :) Try to find the most effective zip and rar programs, though. I used a quite old zip and the newest WinRAR (but there is also a rar 3.10 for Linux, for example). These methods have the inconvenience of awkward decompressing and browsing, which, given the raw data size, is really a bad thing. Make a test on an 80 MB piece of the dump first to see if it's worth it for you at all -- IIRC it can be more than 10% better than bzip2.

G.
--
while (!asleep()) sheep++;
Oops... and hello,

Am 11:39 2002-12-23 +0100 hat SZUCS Gábor geschrieben:

> Call me suspicious, but 830 THOUSAND millions of tables in 80 GByte is a

8-O ...a little BIG error... they are ONLY 830.000 tables! :-P

> As for the hardware, I agree with Scott regarding tapes. The typical thing
> to back up/archive to. As for its speed -- well, let's hope you won't have
> to restore from it ;)

Currently I am using software RAID-1 with two hard drives... Maybe I will switch to RAID-5 with 3 drives.

> As for compression, bzip2 is really much better on text files. Also don't
> forget the "-9" flag (though I think this should be the default).

But HOW do you back up 80 GBytes???

OK, I have three drives and the first drive is a normal 120 GByte HD, on which I have the OS and an around 120 GByte TEMP partition!!!

If I try to 'gzip' or 'bzip2' the 80 GBytes, how many days will it need??? It is a production system and will go online in Paris at amen.fr next year. I think 'bzip2' is no solution... I have never tried it. - I think it is megalomania!

OK, I use the HP SureStore 12000e with 16 GByte tapes...

Thanks
Michelle
On Mon, 23 Dec 2002, Michelle Konzack wrote:

> But HOW do you back up 80 GBytes???
>
> OK, I have three drives and the first drive is a normal 120 GByte HD,
> on which I have the OS and an around 120 GByte TEMP partition!!!
>
> If I try to 'gzip' or 'bzip2' the 80 GBytes, how many days will it need???

If you have a VXA2 drive (80 gig native, 160 gig compressed) then you don't do anything of the sort. Those drives achieve a respectable amount of compression on database backups, and pre-compressing your data means the compression in the drive hasn't got much left to work with.

So, just dump it straight to tape:

    mt density 0x00
    mt rewind
    pg_dump -options here database_name >/dev/nst0

> OK, I use the HP SureStore 12000e with 16 GByte tapes...

That's way too small. You can get a VXA2 drive for <$1000 and while not quite as fast as the monster Mammoth from Exabyte, it is 1/3 the cost.

Plus, with its packet writing scheme, you almost can't get an error on restore. Great tape drives; I'm glad Exabyte bought them out so they didn't disappear due to being underfunded/undermarketed.
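For completeness, a hedged sketch of reading such a dump back off the tape; the device name and block size below are assumptions, not tested values:

    mt -f /dev/nst0 rewind
    # read the first tape file back and feed the plain-SQL dump to psql;
    # dd's bs should be at least the block size the dump was written with
    dd if=/dev/nst0 bs=32k | psql database_name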
On Friday 03 January 2003 9:57 am, scott.marlowe wrote:

> If you have a VXA2 drive (80 gig native, 160 gig compressed) then you
> don't do anything of the sort. Those drives achieve a respectable amount
> of compression on database backups, and pre-compressing your data means
> the compression in the drive hasn't got much left to work with.
>
> So, just dump it straight to tape:
>
>     mt density 0x00
>     mt rewind
>     pg_dump -options here database_name >/dev/nst0
>
> That's way too small. You can get a VXA2 drive for <$1000 and while not
> quite as fast as the monster Mammoth from Exabyte, it is 1/3 the cost.

I'll second the vote for the VXA2 - nice little drive.

As to compression, this is a big case of "your mileage may vary". I did a test dump of a 1.8 million record table (routing output to wc to get file sizes, which I determined is only a tiny bit slower than to /dev/null and which also correlates nicely with a couple of tests to the VXA drive):

    As full SQL inserts (-d flag): 7:15, size 484 MB
    Same, with gzip:               9:08, size 51.8 MB
    As default COPY (no -d flag):  3:23, size 333.8 MB
    Same, with gzip:               4:21, size 47.8 MB

So, assuming the margarita grande at lunch hasn't killed my ability to do math, the highest data rate (COPY, no compression) is ~1.7 MB/second, which is well under the ~6 MB/second native speed of the VXA2. So if speed is paramount then don't compress, but if you need to fit lots on a tape then do (compression is from 7:1 to 9:1 in these tests, or 3.5 to 4.5 times better than the best hardware compression on the tape drive, and would allow you to fit over 1/2 terabyte on a tape).

Using the -9 flag with gzip gave me a very modest improvement in compression (down to 48.8 from 51.8 MB, or 9.9:1 instead of 9:1), but the gzip CPU cost even for this relatively small dump tripled, to 3 minutes from 1 minute.

So it looks like you will need much faster hardware than the desktop I used for these tests (850 MHz, single IDE...) or more data than will fit on a tape before compression will be of any use (unless you want to fit that 80 GB onto a single 16 GB SureStore tape - the compression overhead is not that terrible).
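Roughly how measurements like these can be reproduced; the table and database names are placeholders, and note that in pg_dump of this era the -d flag means "dump data as INSERT statements", not the database name:

    time pg_dump -d -t big_table mydb | wc -c      # INSERT-style dump, byte count only
    time pg_dump -t big_table mydb | wc -c         # default COPY-style dump
    time pg_dump -t big_table mydb | gzip | wc -c  # COPY-style dump, gzip'ed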
Note: you may want to experiment with piping your backups through a program like "buffer" (http://www.microwerks.net/~hugo/download/buffer-1.19.tgz) or one of the many similar buffering programs - your backup speed may not improve at all, or it may increase by a factor of 2-3. You will have to test to tell.

There is nothing here that looks terribly undoable - even using my non-server-grade desktop I could dump this database to one tape in 13 hours.

Cheers,
Steve
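A sketch of what such a buffered pipeline to tape might look like; the buffer options and sizes are guesses to be tuned, not recommendations:

    mt density 0x00
    mt rewind
    # buffer smooths out the bursty pg_dump output so the tape drive can stream
    pg_dump database_name | buffer -m 8m -s 64k > /dev/nst0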
Hello Steve and Scott,

Am 14:13 2003-01-03 -0800 hat Steve Crawford geschrieben:
> On Friday 03 January 2003 9:57 am, scott.marlowe wrote:
>> On Mon, 23 Dec 2002, Michelle Konzack wrote:

Because the server will be in Paris (at amen.fr, OC3) while I am in Strasbourg (UUnet, 2 MBit), it is not possible to make such backups!!! I was hoping to make incremental backups of new or changed tables, so I need another solution.

I have the O'Reilly book "Practical PostgreSQL" and there are some lines about master-slave systems and a cluster of proxy, master and (some) slaves.

Hmmm, if I have my SQL master in Paris and get only 200-400 new/changed tables (100 kByte each) plus the updates of the bigger (10-20 MByte) reference tables a day, I can use an SQL slave in Strasbourg in write mode only... (it gets the copy of all data changed in Paris). My problem is that I do not know how to do it...

Currently I have only some 'HP Vectra XA' machines for testing, but I think I can install such a test system... The server for Paris is an Athlon 1400 with 1 GByte of memory.

Does anyone have a HOWTO or useful links on how to set up a master-slave system or a cluster containing the SQL proxy, SQL master, and 2-3 SQL slaves?

Thanks in advance
Michelle
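Since full dumps will not fit over a 2 MBit link, here is one hedged sketch of the per-table incremental idea, assuming the application can tell which tables changed each day; the changed_tables.txt list, the host name and the paths are invented for illustration, and this is shipped dumps rather than real replication:

    # dump only the tables listed in changed_tables.txt and ship them to the remote site
    mkdir -p incr
    while read t; do
        pg_dump -t "$t" mydb | gzip > "incr/$t.sql.gz"
    done < changed_tables.txt
    rsync -av incr/ strasbourg-host:/var/backups/mydb/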