Re: Backup advice - Mailing list pgsql-general
From | Shaun Thomas |
---|---|
Subject | Re: Backup advice |
Date | |
Msg-id | 5162E4E0.4010809@optionshouse.com Whole thread Raw |
In response to | Backup advice (Johann Spies <johann.spies@gmail.com>) |
Responses |
Re: Backup advice
|
List | pgsql-general |
On 04/08/2013 08:14 AM, Johann Spies wrote: > Size: might be about 200Gb > The server uses a Tivoli backup client with daily backup > At the moment There are pg_dumps for each database on the server on a > daily, weekly and monthly basis. All this gets backed up to the Tivoli > server. Ok. So far, so good. > I have read about using pg_basebackup in an article from Shaun > Thomas' booklet on Packt Publishers**(I will probably buy the > booklet)*. *That seems to be a possible solution. Ok, with pg_basebackup, you'll get a binary backup of the actual data files involved in your database cluster. This will, in almost every case, be larger than pg_dump, and take about the same amount of time to produce. You also won't be able to get the described method you're using in your Tivoli software, since pg_basebackup works on the entire install instead of each individual database. One benefit, as Birta pointed out, is that you could use this backup as a base, and apply WAL / transaction logs instead, and those are generally smaller if your database doesn't see a lot of daily overhead. Unfortunately if you do a weekly base, and need to recover far into the week, that can be a time-consuming process. Then again, so can restoring a pg_dump of a 200GB cluster, thanks to the index creation points. One thing you might want to consider, is that 9.3 will presumably have parallel pg_dump to complement parallel pg_restore. This would greatly reduce the amount of time a full dump->restore cycle requires. Aside from that, you're already likely using the smallest backup you can have. Backing up a 600GB database takes about 120GB for us using tar, which is close to what pg_basebackup would give you in terms of size. A compressed pg_dump of the same data is around 50GB. Still, we use the binaries, because we need uptime more than size, and they are much faster to restore. We have daily binary backups going back over four years, because the backups are probably the most important thing in the company. Lose those under bad circumstances, and we'd literally be out of business. If size really is that important, you might want to check what kind of compression is going on with Tivoli. If you can produce a pg_dump and compress it with bzip2, or a utility such as lrzip that makes use of better compression algorithms like LZMA, you may be able to back up much smaller files than your current process. I'll warn you, though. Even parallel compression methods like lbzip2 are much slower than something like parallel gzip (pigz). You'll get a 20-30% smaller file at the cost of a 4-8x slower backup process. Since your data is likely to grow from its current size of 200GB, it's something to consider. Anyone else? I'm so used to using basic utilities, I know I haven't covered things like deduplication backup solutions. It may be that Tivoli isn't right for this, but I'm not familiar with that software. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-676-8870 sthomas@optionshouse.com ______________________________________________ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
pgsql-general by date: