Thread: PG backup performance
Hello PG list,

I'm looking for tips and advice to improve PG backup performance. I'm
presently running pg_dumpall with the compressed option on a RAID 0 array
(the rough shape of the job is sketched below), writing approximately 14 GB
in 45 minutes, and I'm backing up an approximately 200 GB database cluster
daily.

How can I improve this performance with the present hardware and PG version
8.3.6? Can I run parallel backups in PG?

Thank you
Isabella

--
-----------------------------------------------------------
Isabella A. Ghiurea                 Isabella.Ghiurea@nrc-cnrc.gc.ca
Canadian Astronomy Data Centre | http://www.nrc-cnrc.gc.ca/eng/services/hia/data-centre.html
National Research Council of Canada, Herzberg Institute of Astrophysics
5071 West Saanich Road, Victoria BC V9E 2E7, Canada
Phone: 250 363-3446  Fax: 250 363-0045
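For concreteness, the nightly job described above presumably looks something
like the following. This is a sketch only: the destination path, the user,
and the use of gzip for the compression step are assumptions, since the
actual command was not posted.

  #!/bin/sh
  # Assumed shape of the nightly job described above; the real script was
  # not posted.  pg_dumpall writes plain SQL, so compression here is done
  # by piping through gzip.  The path and user are placeholders.
  BACKUP_DIR=/backup/pgdump
  pg_dumpall -U postgres | gzip > "$BACKUP_DIR/cluster-$(date +%Y%m%d).sql.gz"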
On 05/31/2010 11:05 AM, Isabella Ghiurea wrote:
> Hello PG list,
> I'm looking for tips and advice to improve PG backup performance. I'm
> presently running pg_dumpall with the compressed option on a RAID 0
> array, writing approximately 14 GB in 45 minutes, and I'm backing up an
> approximately 200 GB database cluster daily.
> How can I improve this performance with the present hardware and PG
> version 8.3.6? Can I run parallel backups in PG?
> Thank you
> Isabella

Short answer: yes, you can.
Long answer: we need more info.

We need to know what the slow part is.  Are you CPU bound or IO bound?
(A quick way to check is sketched below.)  Are you backing up over a
network?  How many tables?  (Well, big tables... how many really big
tables.)  How many cores/CPUs do you have?

I bet all 200 GB hardly changes; are you sure you need to back it all up
over and over again?  Have you thought of replication?

-Andy
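One simple way to answer the CPU-bound vs. IO-bound question is to watch
the machine while the dump is running. This is a generic check with stock
tools (top, and iostat from the sysstat package), not anything specific to
this thread:

  # Run these in a second terminal while the nightly dump is going:
  top           # pg_dump/gzip pinned near 100% of one core -> CPU bound
  iostat -x 5   # backup array %util near 100% with CPU mostly idle -> IO bound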
Hi Andy,
Thank you.  Please see my answers below.

Andy Colson wrote:
>
> On 05/31/2010 11:05 AM, Isabella Ghiurea wrote:
> > Hello PG list,
> > I'm looking for tips and advice to improve PG backup performance. I'm
> > presently running pg_dumpall with the compressed option on a RAID 0
> > array, writing approximately 14 GB in 45 minutes, and I'm backing up
> > an approximately 200 GB database cluster daily.
> > How can I improve this performance with the present hardware and PG
> > version 8.3.6? Can I run parallel backups in PG?
> > Thank you
> > Isabella
>
> Short answer: yes, you can.
> Long answer: we need more info.
>
> We need to know what the slow part is.
>
> Are you CPU bound
CPU.
>
> or IO bound?
> Are you backing up over a network?
No, to local disks.
>
> How many tables?  (Well, big tables... how many really big tables.)
Around 20 big tables, all of them in one separate schema, on a separate
tablespace from the rest of the schemas.  I have already started backing up
individual schemas (roughly as sketched below).
My big concern is that IF I ever have to recover from backups, it will take
at least twice as long: around 1-2 days, I expect.
>
> How many cores/CPUs do you have?
It's a server with 4 quad-core CPUs.  How can I make PG use multithreading?
>
> I bet all 200 GB hardly changes; are you sure you need to back it all
> up over and over again?  Have you thought of replication?
Yes, but I'm waiting for a PG version with more robust built-in replication,
i.e. PG 9.1.

Isabella
--
-----------------------------------------------------------
Isabella A. Ghiurea                 Isabella.Ghiurea@nrc-cnrc.gc.ca
Canadian Astronomy Data Centre | http://www.nrc-cnrc.gc.ca/eng/services/hia/data-centre.html
National Research Council of Canada, Herzberg Institute of Astrophysics
5071 West Saanich Road, Victoria BC V9E 2E7, Canada
Phone: 250 363-3446  Fax: 250 363-0045
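A per-schema dump of the kind mentioned above would look roughly like the
following. The database name "mydb" and schema name "bigschema" are made up
for illustration; only the -n/-N switches are the point here.

  # Dump only one schema (-n) of a database, compressed through gzip:
  pg_dump -U postgres -n bigschema mydb | gzip > bigschema.sql.gz

  # Dump everything in that database except the big schema (-N excludes it):
  pg_dump -U postgres -N bigschema mydb | gzip > rest.sql.gz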
On 05/31/2010 02:45 PM, Isabella Ghiurea wrote:
>
> Hi Andy,
> Thank you.  Please see my answers below.
> Andy Colson wrote:
>>
>> On 05/31/2010 11:05 AM, Isabella Ghiurea wrote:
>> > Hello PG list,
>> > I'm looking for tips and advice to improve PG backup performance. I'm
>> > presently running pg_dumpall with the compressed option on a RAID 0
>> > array, writing approximately 14 GB in 45 minutes, and I'm backing up
>> > an approximately 200 GB database cluster daily.
>> > How can I improve this performance with the present hardware and PG
>> > version 8.3.6? Can I run parallel backups in PG?
>> > Thank you
>> > Isabella
>>
>> Short answer: yes, you can.
>> Long answer: we need more info.
>>
>> We need to know what the slow part is.
>>
>> Are you CPU bound
> CPU.
>>
>> or IO bound?
>> Are you backing up over a network?
> No, to local disks.
>>
>> How many tables?  (Well, big tables... how many really big tables.)
> Around 20 big tables, all of them in one separate schema, on a separate
> tablespace from the rest of the schemas.  I have already started backing
> up individual schemas.
> My big concern is that IF I ever have to recover from backups, it will
> take at least twice as long: around 1-2 days, I expect.
>>
>> How many cores/CPUs do you have?
> It's a server with 4 quad-core CPUs.  How can I make PG use
> multithreading?
>>
>> I bet all 200 GB hardly changes; are you sure you need to back it all
>> up over and over again?  Have you thought of replication?
> Yes, but I'm waiting for a PG version with more robust built-in
> replication, i.e. PG 9.1.
> Isabella
>

Cool... my second astronomer :-)

You are CPU bound, because pg_dump will use one CPU, and if you have
pg_dump do the compression it uses that same CPU.  But if you use a pipe
(pg_dump | gzip), then each gets a CPU of its own.

And instead of using pg_dumpall, which dumps one db at a time, use pg_dump
and run several in parallel.  Try something like:

  pg_dump big1 | gzip > big1.sql &
  pg_dump big2 | gzip > big2.sql

This will dump two databases at a time, using 4 CPUs (two pg_dumps plus two
gzips).  (A slightly fuller sketch is below.)

Having a separate backup file for each big table will also make restore
faster, because you can restore them all at the same time, whereas
pg_dumpall restores one at a time.

The compression is probably going to take the most time... if your data is
not very compressible, then you'll get a big speed boost by dropping the
compression.

-Andy
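Spelling that suggestion out as a complete, untested script: the database
names big1 through big4, the output directory, and the use of gzip are all
placeholders, since the real names were not posted.

  #!/bin/sh
  # Dump several databases in parallel, one pg_dump | gzip pipeline per
  # database.  Names and paths below are hypothetical.
  OUT=/backup/pgdump

  for db in big1 big2 big3 big4; do
      pg_dump -U postgres "$db" | gzip > "$OUT/$db.sql.gz" &
  done
  wait   # block until every background dump has finished

  # Globals (roles, tablespace definitions) are not covered by pg_dump:
  pg_dumpall -U postgres --globals-only > "$OUT/globals.sql"

Restores can be parallelized the same way, for example by running
gunzip -c big1.sql.gz | psql big1 and gunzip -c big2.sql.gz | psql big2 in
separate sessions at the same time.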
On 05/31/2010 02:45 PM, Isabella Ghiurea wrote:
>
>> > I'm looking for tips and advice to improve PG backup performance,

Yep, I thought I recalled a conversation like this before; this might have
some interesting info:

http://archives.postgresql.org/pgsql-performance/2010-03/msg00132.php

-Andy