Thread: pg_dump / pg_dumpall / memory issues
Hi,

We have a nightly backup going on our db server. We use pg_dumpall, which, when it's done, generates a text dump of around 9 gigs.

Most nights, the backup runs at a load of around 2.5 -- against a normal load of around 2 -- (this on a dual 2.4GHz Xeon machine with 6GB RAM). Our schema has a huge table (about 5 million tuples) which gets queried about 30 times per second. These queries fetch one record at a time, pretty evenly throughout this large table, so I would imagine this table dominates the shared RAM (currently set at 320MB).

As you can imagine, at times the backup process (or in fact any large query that dominates the cache) tends to spike the load pretty severely. At some point we experimented with more shared memory, but that actually decreased overall performance, as was discussed here earlier.

What can we do to alleviate this problem? It's going to be difficult to not query the large table at any given time (24/7 service and all).

Are there any strategies we can take with pg_dump/pg_dumpall? My dump command is:

  pg_dumpall -c > /tmp/backupfile.sql

Help!!!

-- Ericson Smith <eric@did-it.com>
Ericson Smith <eric@did-it.com> writes:
> As you can imagine, at times the backup process (or in fact any large
> query that dominates the cache) tends to spike the load pretty
> severely. At some point we experimented with more shared memory, but
> that actually decreased overall performance, as was discussed here
> earlier.
>
> What can we do to alleviate this problem?

There was a great deal of discussion back in July 2000 about throttling the rate at which pg_dump pulls down data. You might check the archives, and also look in pg_dump.c for the comments about it therein. No one seemed to be able to come up with a great answer that time around, but perhaps you can find a way that works.

			regards, tom lane
On 9 Apr 2003, Ericson Smith wrote:
> Hi,
>
> We have a nightly backup going on our db server. We use pg_dumpall,
> which, when it's done, generates a text dump of around 9 gigs.
>
> Most nights, the backup runs at a load of around 2.5 -- against a normal
> load of around 2 -- (this on a dual 2.4GHz Xeon machine with 6GB RAM).
> Our schema has a huge table (about 5 million tuples) which gets queried
> about 30 times per second. These queries fetch one record at a time,
> pretty evenly throughout this large table, so I would imagine this table
> dominates the shared RAM (currently set at 320MB).
>
> As you can imagine, at times the backup process (or in fact any large
> query that dominates the cache) tends to spike the load pretty
> severely. At some point we experimented with more shared memory, but
> that actually decreased overall performance, as was discussed here
> earlier.
>
> What can we do to alleviate this problem? It's going to be difficult to
> not query the large table at any given time (24/7 service and all).
>
> Are there any strategies we can take with pg_dump/pg_dumpall? My
> dump command is:
>
>   pg_dumpall -c > /tmp/backupfile.sql

What version of pgsql are you running? I've gotten MUCH better backup/restore performance on the 7.3 series than I did with 7.2. I can back up 1 gig of data in about 10 minutes across a 100BaseTX network with a pipe like so:

pg_dump -h hostname databasename | psql databasename

It took something like 30 minutes to an hour to do this in 7.2.x.

(My box is a dual PIII 750 with 1.5 gig RAM, and a 10KRPM UW-SCSI drive for the database, separate from the system.)
Oh, as a followup on that last message I sent off: I ran the backup by hand, which basically runs a PHP script on a PostgreSQL 7.3 machine to back up a PostgreSQL 7.2 machine. Since 7.3's pg_dumpall isn't so good at talking to the 7.2 machine, I had to write my own; that's OK, there were other things to do as well -- it's a backup script after all.

Anyway, the backup script gets a list of all the databases on the 7.2 server, initializes an empty cluster with 'initdb --locale=C', then promptly fires off line after line like this:

createdb postgres; pg_dump -O -h mainbox postgres | psql postgres

In fact it fires it off 62 times for our system.

With the backup script running, the load factor on the mainbox was about 1.09, while the %CPU for the postmaster doing the backup was 50%. When I ran 'pgbench -c 4 -t 1000000' to provide some PTL (parallel thrash load :-), the usage of the postmasters running the pgbench was about 24%, while the postmaster running the backup was about 35 to 40%. Note that this was a dual PIII-750, so the totals can add up to 200% in RH Linux.

The responsiveness of the main box is about the same during just the backup, but the pgbench was a killer; with or without the backup, that slows the machine down a lot more for me.

So I'm wondering if the simple solution might be to either use a slower box / network connection / throttled port on the backup box, or just back up into another database, since the copies into the other machine probably slow things down enough to render less of a load on the server being backed up.

Both servers in this test are identical, except the mainline box is still running 7.2.4 while the backup/test box is running 7.3.2. Both have dual 750MHz CPUs and 1.5 gig RAM, with 10KRPM U-SCSI drives -- one for the system, one for PostgreSQL.
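For reference, the guts of that script boil down to something like this in shell (the template-database filter and the exact psql flags are my guess at the logic, not taken from the actual PHP; it obviously needs a reachable server on both ends):

```shell
#!/bin/sh
# Sketch: pull each database off the 7.2 box ("mainbox") and restore
# it straight into the local 7.3 cluster, one pipe per database.
SRC=mainbox

# -A (unaligned) and -t (tuples only) give one bare name per line.
# Skipping template databases is an assumption, not from the script.
for db in $(psql -h "$SRC" -At -c \
    "SELECT datname FROM pg_database WHERE datname NOT LIKE 'template%'" template1)
do
    createdb "$db"
    # -O omits ownership commands, as in the one-liner above.
    pg_dump -O -h "$SRC" "$db" | psql "$db"
done
```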
We're running the 7.3 series. We are getting better backup performance than 7.2 indeed. Looking at pg_dump.c -- it seems to use the COPY command, so perhaps throttling in the code may not be the best solution. I did see the notes about this that Tom mentioned.

> So I'm wondering if the simple solution might be to either use a slower
> box / network connection / throttled port on the backup box, or just
> back up into another database, since the copies into the other machine
> probably slow things down enough to render less of a load on the server
> being backed up.

Hmmm... that might be an interesting solution. We do have a slower standby DB that would be excellent for that purpose. It would be an added incentive too, because the standby DB would be hot after backup. I'm gonna give this a shot and report back. I guess we can do a pg_dumpall from the standby DB as soon as the main DB has finished backing up too!

Regards
- Ericson Smith
eric@did-it.com

scott.marlowe wrote:
> What version of pgsql are you running? I've gotten MUCH better backup/restore
> performance on the 7.3 series than I did with 7.2. I can back up 1 gig
> of data in about 10 minutes across a 100BaseTX network with a pipe like
> so:
>
> pg_dump -h hostname databasename | psql databasename
>
> It took something like 30 minutes to an hour to do this in 7.2.x.
>
> (My box is a dual PIII 750 with 1.5 gig RAM, and a 10KRPM UW-SCSI drive for
> the database, separate from the system.)
On Thursday 10 April 2003 11:14, you wrote:
> > So I'm wondering if the simple solution might be to either use a slower
> > box / network connection / throttled port on the backup box, or just
> > back up into another database, since the copies into the other machine
> > probably slow things down enough to render less of a load on the server
> > being backed up.
>
> Hmmm... that might be an interesting solution. We do have a slower
> standby DB that would be excellent for that purpose. It would be an
> added incentive too, because the standby DB would be hot after backup.
> I'm gonna give this a shot and report back. I guess we can do a
> pg_dumpall from the standby DB as soon as the main DB has finished
> backing up too!

If you have an OS that ties the nice value of a process to its I/O priority, you can lower the priority of pg_dump so that it runs very slowly, so to speak.

I believe FreeBSD does this. Not sure if Linux does as well.

HTH

Shridhar
At 01:44 AM 4/10/2003 -0400, Ericson Smith wrote:
> Hmmm... that might be an interesting solution. We do have a slower
> standby DB that would be excellent for that purpose. It would be an
> added incentive too, because the standby DB would be hot after backup.
> I'm gonna give this a shot and report back. I guess we can do a
> pg_dumpall from the standby DB as soon as the main DB has finished
> backing up too!

With tee you can load into a staging db (only switch it to standby if the dump is OK) and write to a file at the same time. It could be better if the backup file is on a different disk/storage hardware.

I normally pipe a pg_dump through gzip to a file -- need to justify the >>1GHz CPUs ;).

Link.
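Something like this, with tee splitting the dump stream two ways (this uses bash process substitution; the host, database, and path names are placeholders, not from anyone's actual setup):

```shell
#!/bin/bash
# One pg_dump feeds two consumers via tee: gzip writes the archive
# file (ideally on a different disk) while psql loads the staging DB.
# "mainbox", "stagingbox", "mydb", and the path are all made up.
pg_dump -h mainbox mydb \
  | tee >(gzip > /backups/mydb.sql.gz) \
  | psql -h stagingbox mydb
```

Only promote the staging copy to standby after checking that both psql and gzip finished cleanly; if either leg of the pipe failed, you have neither a good file nor a good database.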
Shridhar has a good point about using nice. I know Red Hat supports nice because I have installed a totally background program that uses "nice 19" as part of its command line.

I just successfully tried this on Red Hat 7.3:

nice -n 15 pg_dumpall >testdump.sql

It seems to work. However, my SQL server is so lightly loaded I cannot verify that it actually *helps*.

-Kyle

Shridhar Daithankar wrote:
> If you have an OS that ties the nice value of a process to its I/O
> priority, you can lower the priority of pg_dump so that it runs very
> slowly, so to speak.
>
> I believe FreeBSD does this. Not sure if Linux does as well.

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
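One quick sanity check that the nice level actually applies to the child process (note this only confirms the CPU priority was set; it says nothing about I/O scheduling):

```shell
# Start a shell under nice -n 15 and ask ps for its own nice level.
# Should print 15 (ps may pad it with leading spaces).
nice -n 15 sh -c 'ps -o ni= -p $$'
```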
If you have a standby database server and you are looking to get speedy backups, you will want to look at:

http://marc.theaimsgroup.com/?l=postgresql-admin&w=2&r=1&s=lvm+snapshots&q=b

The thread discusses filesystem-level backup of the database cluster off an [LVM/EVMS/NetApp] filer snapshot, with implementation details for Linux LVM snapshots.

Cheers, Murthy

-----Original Message-----
From: Ericson Smith [mailto:eric@did-it.com]
Sent: Thursday, April 10, 2003 01:45
To: Postgresql General
Subject: Re: [GENERAL] pg_dump / pg_dumpall / memory issues

We're running the 7.3 series. We are getting better backup performance than 7.2 indeed. Looking at pg_dump.c -- it seems to use the COPY command, so perhaps throttling in the code may not be the best solution. I did see the notes about this that Tom mentioned.

> So I'm wondering if the simple solution might be to either use a slower
> box / network connection / throttled port on the backup box, or just
> back up into another database, since the copies into the other machine
> probably slow things down enough to render less of a load on the server
> being backed up.

Hmmm... that might be an interesting solution. We do have a slower standby DB that would be excellent for that purpose. It would be an added incentive too, because the standby DB would be hot after backup. I'm gonna give this a shot and report back. I guess we can do a pg_dumpall from the standby DB as soon as the main DB has finished backing up too!

Regards
- Ericson Smith
eric@did-it.com
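For the record, the LVM variant from that thread boils down to roughly the following (the volume group, logical volume, and mount point names are hypothetical; note the snapshot is only crash-consistent -- equivalent to the server having lost power at that instant -- so a cluster restored from it will run WAL recovery on startup):

```shell
#!/bin/sh
# Hypothetical names; assumes the entire $PGDATA lives on /dev/vg0/pgdata.
# Create a copy-on-write snapshot while the server keeps running.
lvcreate --snapshot --size 1G --name pgsnap /dev/vg0/pgdata
mount /dev/vg0/pgsnap /mnt/pgsnap

# Archive the frozen copy; the live cluster is untouched meanwhile.
tar czf /backups/pgdata.tar.gz -C /mnt/pgsnap .

umount /mnt/pgsnap
lvremove -f /dev/vg0/pgsnap
```

The --size here is the copy-on-write buffer: it only needs to hold the blocks that change on the origin volume while the snapshot exists, so it can be much smaller than the data itself.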
Kyle <kyle@ccidomain.com> writes:
> Shridhar has a good point about using nice. I know Red Hat supports nice
> because I have installed a totally background program that uses "nice
> 19" as part of its command line.

He's talking about honoring 'nice' for I/O scheduling, which is in addition to the standard modification of CPU priority. Linux doesn't currently do this in 2.4 -- it was discussed for 2.5, but I don't know if it has gone in yet.

-Doug
If it does, then the first PC-based multimedia system will finally be born. Actually, the reverse of nice is needed for that: "I need X amount of MIPS and Y amount of I/O to run" should be a valid API call. THEN multimedia apps can be what they're supposed to be.

Doug McNaught wrote:
> Kyle <kyle@ccidomain.com> writes:
>
> > Shridhar has a good point about using nice. I know Red Hat supports nice
> > because I have installed a totally background program that uses "nice
> > 19" as part of its command line.
>
> He's talking about honoring 'nice' for I/O scheduling, which is
> in addition to the standard modification of CPU priority. Linux
> doesn't currently do this in 2.4 -- it was discussed for 2.5, but I
> don't know if it has gone in yet.
>
> -Doug