Thread: autovacuum
I have a large database used with our mail filter. The pg_dumpall results in about 3GB with this being the only database in the system besides templates and postgres. I do a vacuum every night after backup and it takes about an hour, is this normal for this size db? I also have autovacuum enabled and when it is running during the day, our mail queues will tend to fill up with slow response to the server. Should I have autovacuum on even if I am vacuuming the db every night? Let me know if you need more specifics. Just trying to get some feedback on if my vacuum is taking too long or if both are necessary...thanks for the help! -- Robert
In response to Robert Fitzpatrick <lists@webtent.net>: > I have a large database used with our mail filter. The pg_dumpall > results in about 3GB with this being the only database in the system > besides templates and postgres. > > I do a vacuum every night after backup and it takes about an hour, is > this normal for this size db? "normal" is relative. If it's taking an hour to vacuum 3G, I would say that either your hardware is undersized/badly configured, or you're not vacuuming often enough. > I also have autovacuum enabled and when it > is running during the day, our mail queues will tend to fill up with > slow response to the server. Should I have autovacuum on even if I am > vacuuming the db every night? I'm not aware of any problems with autovaccum and scheduled vacuum working together. That doesn't mean you're vacuuming often enough, however. Switch your nightly vacuum to vacuum verbose and capture the output to see how much work it has to do. Are your fsm settings high enough? > Let me know if you need more specifics. Just trying to get some feedback > on if my vacuum is taking too long or if both are necessary...thanks for > the help! How much RAM does the system have? What's your shared_buffer settings? What's your maintenance_work_mem set to? -- Bill Moran http://www.potentialtech.com
On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote: > In response to Robert Fitzpatrick <lists@webtent.net>: > > > I have a large database used with our mail filter. The pg_dumpall > > results in about 3GB with this being the only database in the system > > besides templates and postgres. > > > > I do a vacuum every night after backup and it takes about an hour, is > > this normal for this size db? > > "normal" is relative. If it's taking an hour to vacuum 3G, I would say > that either your hardware is undersized/badly configured, or you're > not vacuuming often enough. It is a dual P4 processor supermicro server with 2GB of RAM, so I will need to go over the configuration then? I didn't think it should take so long... > That doesn't mean you're vacuuming often enough, however. Switch your > nightly vacuum to vacuum verbose and capture the output to see how much > work it has to do. Are your fsm settings high enough? > > > Let me know if you need more specifics. Just trying to get some feedback > > on if my vacuum is taking too long or if both are necessary...thanks for > > the help! > > How much RAM does the system have? What's your shared_buffer settings? > What's your maintenance_work_mem set to? Yes, this is the first time I've had to do any tuning to pgsql, so I most likely need help in this area. This is 8.2.4 on a FreeBSD 6.2 server...here are those settings currently below. I also had to tweak BSD loader.conf to allow the changes to work... max_connections = 250 max_fsm_pages = 204800 shared_buffers = 128MB effective_cache_size = 256MB work_mem = 64MB maintenance_work_mem = 256MB mx1# cat /etc/loader.conf kern.ipc.semmni=256 kern.ipc.semmns=512 kern.ipc.semmnu=256 mx1# cat /etc/sysctl.conf # tuning for PostgreSQL kern.ipc.shm_use_phys=1 kern.ipc.shmmax=1073741824 kern.ipc.shmall=262144 kern.ipc.semmsl=512 kern.ipc.semmap=256 If I don't have it listed above, then it is default settings for anything else. Thanks for the help! > -- Robert
In response to Robert Fitzpatrick <lists@webtent.net>: > On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote: > > That doesn't mean you're vacuuming often enough, however. Switch your > > nightly vacuum to vacuum verbose and capture the output to see how much > > work it has to do. Are your fsm settings high enough? <snip> > INFO: free space map contains 30078 pages in 41 relations > DETAIL: A total of 30304 page slots are in use (including overhead). > 30304 page slots are required to track all free space. > Current limits are: 204800 page slots, 1000 relations, using 1265 kB. This was what I was most concerned about. If your FSM settings are too low, vacuum won't be able to fully do its job. But it looks like you're OK here. -- Bill Moran http://www.potentialtech.com
In response to Robert Fitzpatrick <lists@webtent.net>: > On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote: > > In response to Robert Fitzpatrick <lists@webtent.net>: > > > > > I have a large database used with our mail filter. The pg_dumpall > > > results in about 3GB with this being the only database in the system > > > besides templates and postgres. > > > > > > I do a vacuum every night after backup and it takes about an hour, is > > > this normal for this size db? > > > > "normal" is relative. If it's taking an hour to vacuum 3G, I would say > > that either your hardware is undersized/badly configured, or you're > > not vacuuming often enough. > > It is a dual P4 processor supermicro server with 2GB of RAM, so I will > need to go over the configuration then? I didn't think it should take so > long... Why does everyone leave of the IO subsystem? It's almost as if many people don't realize that disks exist ... With 2G of RAM, and a DB that's about 3G, then there's at least a G of database data _not_ in memory at any time. As a result, disk speed is important, and _could_ be part of your problem. You're not using RAID 5 are you? > > > Let me know if you need more specifics. Just trying to get some feedback > > > on if my vacuum is taking too long or if both are necessary...thanks for > > > the help! > > > > How much RAM does the system have? What's your shared_buffer settings? > > What's your maintenance_work_mem set to? > > Yes, this is the first time I've had to do any tuning to pgsql, so I > most likely need help in this area. This is 8.2.4 on a FreeBSD 6.2 > server...here are those settings currently below. I also had to tweak > BSD loader.conf to allow the changes to work... > > max_connections = 250 > max_fsm_pages = 204800 > shared_buffers = 128MB Unless this machine runs programs other than PostgreSQL, raise this to about 650MB. You might get better performance from even higher values. The rule of thumb is allocate 1/4 - 1/3 of the available RAM to shared_buffers ... subtract the RAM that other programs are using first. > effective_cache_size = 256MB More like 1300MB (again, unless this machine is doing other things) > work_mem = 64MB > maintenance_work_mem = 256MB > > mx1# cat /etc/loader.conf > kern.ipc.semmni=256 > kern.ipc.semmns=512 > kern.ipc.semmnu=256 > mx1# cat /etc/sysctl.conf > # tuning for PostgreSQL > kern.ipc.shm_use_phys=1 > kern.ipc.shmmax=1073741824 > kern.ipc.shmall=262144 > kern.ipc.semmsl=512 > kern.ipc.semmap=256 > > If I don't have it listed above, then it is default settings for > anything else. Watch the system during vacuum to see if it's blocking on IO or CPU. systat, vmstat, iostat, and top (use 'm' to switch views) are all good utilities to check on this. Another possibility is that autovac isn't configured correctly. Watch your PostgreSQL logs to see if it's running at all. If it is, turn up the logging level until it tells you which tables it's vacuuming. You may have to tweak the thresholds to make it more aggressive. -- Bill Moran http://www.potentialtech.com
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Bill Moran wrote: > In response to Robert Fitzpatrick <lists@webtent.net>: > >> On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote: >>> In response to Robert Fitzpatrick <lists@webtent.net>: >>> >>>> I have a large database used with our mail filter. The pg_dumpall >>>> results in about 3GB with this being the only database in the system >>>> besides templates and postgres. >>>> >>>> I do a vacuum every night after backup and it takes about an hour, is >>>> this normal for this size db? >>> "normal" is relative. If it's taking an hour to vacuum 3G, I would say >>> that either your hardware is undersized/badly configured, or you're >>> not vacuuming often enough. >> It is a dual P4 processor supermicro server with 2GB of RAM, so I will >> need to go over the configuration then? I didn't think it should take so >> long... > > Why does everyone leave of the IO subsystem? It's almost as if many > people don't realize that disks exist ... I have disks? - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFG8txpATb/zqfZUUQRAlgRAKCUhKoTaTFjYy5fdRktrCuAfWTdlwCgo1wF Hrg1Dwk+TYxTbWeQEKW2LYY= =JDqc -----END PGP SIGNATURE-----
On Thu, 2007-09-20 at 16:38 -0400, Bill Moran wrote: > In response to Robert Fitzpatrick <lists@webtent.net>: > Why does everyone leave of the IO subsystem? It's almost as if many > people don't realize that disks exist ... > > With 2G of RAM, and a DB that's about 3G, then there's at least a G of > database data _not_ in memory at any time. As a result, disk speed is > important, and _could_ be part of your problem. You're not using RAID > 5 are you? Yes, using RAID 5, not good? RAID 5 with hot fix total of 4 drives. All SATA 80GB drives giving me little under 300GB to work with. Also, my nightly backup does a pg_dump of the one database and vacuums only that database as there are no other except template#'s. Then it does a pg_dumpall. Now, I noticed that we have the -dD flags on pg_dumpall, not sure why, I took them off. But the strange thing I am finding is while my one database using a 'pg_dump -F c' only comes out at 930MB while the pg_dumpall results in 3GB, is that due to the use of INSERTS by using -dD? > > max_connections = 250 > > max_fsm_pages = 204800 > > shared_buffers = 128MB > > Unless this machine runs programs other than PostgreSQL, raise this to > about 650MB. You might get better performance from even higher values. > The rule of thumb is allocate 1/4 - 1/3 of the available RAM to > shared_buffers ... subtract the RAM that other programs are using first. Yes, it runs a few other things like Postfix+amavisd-maia+SA+clamAV, but low priority MX so it gets little unless the primary is not responding. Other than that, I use it to run the web GUI (php) for this amavisd-maia mail server where users can view spam/ham caches. Can I determine the amount of memory everything else is running by stopping postgres and look in top to see what is being used? Thanks for the other pointers...! -- Robert
On Sep 20, 2007, at 3:47 PM, Joshua D. Drake wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Bill Moran wrote: >> In response to Robert Fitzpatrick <lists@webtent.net>: >> >>> On Thu, 2007-09-20 at 13:22 -0400, Bill Moran wrote: >>>> In response to Robert Fitzpatrick <lists@webtent.net>: >>>> >>>>> I have a large database used with our mail filter. The pg_dumpall >>>>> results in about 3GB with this being the only database in the >>>>> system >>>>> besides templates and postgres. >>>>> >>>>> I do a vacuum every night after backup and it takes about an >>>>> hour, is >>>>> this normal for this size db? >>>> "normal" is relative. If it's taking an hour to vacuum 3G, I >>>> would say >>>> that either your hardware is undersized/badly configured, or you're >>>> not vacuuming often enough. >>> It is a dual P4 processor supermicro server with 2GB of RAM, so I >>> will >>> need to go over the configuration then? I didn't think it should >>> take so >>> long... >> >> Why does everyone leave of the IO subsystem? It's almost as if many >> people don't realize that disks exist ... > > I have disks? You ARE have disks ;) Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
On 9/20/07, Robert Fitzpatrick <lists@webtent.net> wrote: > On Thu, 2007-09-20 at 16:38 -0400, Bill Moran wrote: > > In response to Robert Fitzpatrick <lists@webtent.net>: > > Why does everyone leave of the IO subsystem? It's almost as if many > > people don't realize that disks exist ... > > > > With 2G of RAM, and a DB that's about 3G, then there's at least a G of > > database data _not_ in memory at any time. As a result, disk speed is > > important, and _could_ be part of your problem. You're not using RAID > > 5 are you? > > Yes, using RAID 5, not good? RAID 5 with hot fix total of 4 drives. All > SATA 80GB drives giving me little under 300GB to work with. RAID5 optimizes for space, not performance or reliability. It gets faster but less reliable as it gets bigger. If you can afford the space RAID-10 is generally preferred. Note however that it is far more important for most general purpose servers to have a RAID controller that is both fast by nature (i.e. not $50.00) and has battery backed cache with write thru turned on. RAID5 on a fast controller with battery backed cache is ok. But I've seen software RAID-10 outrun it for certain loads...
"Scott Marlowe" <scott.marlowe@gmail.com> writes: > RAID5 optimizes for space, not performance or reliability. It gets > faster but less reliable as it gets bigger. If you can afford the > space RAID-10 is generally preferred. RAID5 can be faster for DSS style work loads. If you're writing data to the raid in large contiguous chunks then it you get higher bandwidth than RAID1+0. The problem with RAID5 is that if you're writing random access chunks then it's even slower than not having a raid at all. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On Thu, Sep 20, 2007 at 04:33:25PM -0500, Scott Marlowe wrote: > On 9/20/07, Robert Fitzpatrick <lists@webtent.net> wrote: > > On Thu, 2007-09-20 at 16:38 -0400, Bill Moran wrote: > > > In response to Robert Fitzpatrick <lists@webtent.net>: > > > Why does everyone leave of the IO subsystem? It's almost as if many > > > people don't realize that disks exist ... > > > > > > With 2G of RAM, and a DB that's about 3G, then there's at least a G of > > > database data _not_ in memory at any time. As a result, disk speed is > > > important, and _could_ be part of your problem. You're not using RAID > > > 5 are you? > > > > Yes, using RAID 5, not good? RAID 5 with hot fix total of 4 drives. All > > SATA 80GB drives giving me little under 300GB to work with. > > RAID5 optimizes for space, not performance or reliability. It gets > faster but less reliable as it gets bigger. If you can afford the > space RAID-10 is generally preferred. > > Note however that it is far more important for most general purpose > servers to have a RAID controller that is both fast by nature (i.e. > not $50.00) and has battery backed cache with write thru turned on. Surely you mean with write thru turned *off*... Or write-back turned on. But write thru turned on will make your battery unnecessary... //Magnus
On 9/21/07, Magnus Hagander <magnus@hagander.net> wrote: > On Thu, Sep 20, 2007 at 04:33:25PM -0500, Scott Marlowe wrote: > > On 9/20/07, Robert Fitzpatrick <lists@webtent.net> wrote: > > > On Thu, 2007-09-20 at 16:38 -0400, Bill Moran wrote: > > > > In response to Robert Fitzpatrick <lists@webtent.net>: > > > > Why does everyone leave of the IO subsystem? It's almost as if many > > > > people don't realize that disks exist ... > > > > > > > > With 2G of RAM, and a DB that's about 3G, then there's at least a G of > > > > database data _not_ in memory at any time. As a result, disk speed is > > > > important, and _could_ be part of your problem. You're not using RAID > > > > 5 are you? > > > > > > Yes, using RAID 5, not good? RAID 5 with hot fix total of 4 drives. All > > > SATA 80GB drives giving me little under 300GB to work with. > > > > RAID5 optimizes for space, not performance or reliability. It gets > > faster but less reliable as it gets bigger. If you can afford the > > space RAID-10 is generally preferred. > > > > Note however that it is far more important for most general purpose > > servers to have a RAID controller that is both fast by nature (i.e. > > not $50.00) and has battery backed cache with write thru turned on. > > Surely you mean with write thru turned *off*... Or write-back turned on. > But write thru turned on will make your battery unnecessary... Yeah, I meant write back turned on...