Thread: More Vacuum questions...
Hi List;

I've recently started cleaning up some postgres db's which, previous to my recent arrival, had no DBA to care for them. I quickly figured out that there were several tables which were grossly full of dead space. One table in particular had 75G worth of dead pages (or the equivalent in overall dead rows).

So I rebuilt these several tables via this process:

1) BEGIN;
2) LOCK TABLE table_old (the current table)
3) CREATE TABLE table_new (...) (copy of table_old above, without the indexes)
4) INSERT INTO table_new SELECT * FROM table_old;
5) DROP TABLE table_old;
6) ALTER TABLE table_new RENAME TO table_old;
7) CREATE INDEX (re-create all the original table's indexes)
8) COMMIT;

For the biggest table mentioned above, this did in fact reduce the table's overall size by about 69G.

After the table rebuild, as an interim measure (since I'm still tuning, and I need to go through a full test/qa/prod lifecycle to get anything rolled onto the production servers), I added this table to pg_autovacuum with enabled = 'f' and set up a daily cron job to vacuum the table during off hours. This was due primarily to the fact that the vacuum of this table was severely impacting day-to-day processing. I've since upped maintenance_work_mem to 300,000 and in general the vacuums no longer impact day-to-day processing - with the exception of this big table.

I let the cron vacuum run for 14 days. In those 14 days the time it takes to vacuum the table grew from 1.2 hours directly after the rebuild to > 8 hours last night. It's difficult to try and vacuum this table during the day as it seems to begin blocking all the other queries against the database after some time.

I plan to rebuild the table again and see if I can get away with vacuuming it more often, even during the day. Also, I'm considering a weekly cron job each Sunday (minimal processing happens on the weekends) to rebuild the table.

Just curious if anyone has any thoughts on an automated rebuild scenario? Or, better yet, on managing the vacuum of this table more efficiently? Maybe it's worth upping maintenance_work_mem sky-high for this table (via a session-specific SET of maintenance_work_mem) and running a vacuum every 3 hours or so.

Also, does Postgres allocate maintenance_work_mem from the overall shared_buffers space available (I think not)?

Is there some method / guideline I could use to determine the memory needs on a table-by-table basis for the vacuum process? If so, I suspect I could use this as a guide for setting a session-specific maintenance_work_mem via cron to vacuum these problem tables on a specified schedule.

Thanks in advance...

/Kevin
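Spelled out, that rebuild amounts to a transaction along these lines. The table, column, and index names here are placeholders, not the actual schema from the thread; each original index would be re-created in the same way:

  BEGIN;
  -- Take an exclusive lock so nothing reads or writes the table mid-swap.
  LOCK TABLE big_table IN ACCESS EXCLUSIVE MODE;
  -- LIKE copies the column definitions but not the indexes.
  CREATE TABLE big_table_new (LIKE big_table);
  INSERT INTO big_table_new SELECT * FROM big_table;
  DROP TABLE big_table;
  ALTER TABLE big_table_new RENAME TO big_table;
  -- Re-create each of the original indexes here.
  CREATE INDEX big_table_created_idx ON big_table (created_at);
  COMMIT;

One caveat with this approach: grants, triggers, and view or foreign-key dependencies on the old table don't carry over and have to be re-created by hand, and DROP TABLE will refuse to run if other tables reference this one via foreign keys.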
Kevin Kempter wrote:
> It's difficult to try and vacuum this table during the day as it seems to
> begin blocking all the other queries against the database after some time.

Vacuum can generate so much I/O that it overwhelms all other transactions, but it shouldn't block other queries otherwise. You can use the vacuum cost delay options to throttle vacuum so that it runs slower, but doesn't disrupt other operations so much.

> I plan to rebuild the table again and see if I can get away with vacuuming it
> more often, even during the day. Also, I'm considering a weekly cron job each
> Sunday (minimal processing happens on the weekends) to rebuild the table.
>
> Just curious if anyone has any thoughts on an automated rebuild scenario? Or,
> better yet, on managing the vacuum of this table more efficiently?

CLUSTER is a handy way to rebuild tables.

> Maybe it's worth upping maintenance_work_mem sky-high for this table (via a
> session-specific SET of maintenance_work_mem) and running a vacuum every 3
> hours or so.

You only need enough maintenance_work_mem to hold pointers to all dead tuples in the table. Using more than that won't help.

> Also, does Postgres allocate maintenance_work_mem from the
> overall shared_buffers space available (I think not)?

No.

> Is there some method / guideline I could use to determine the memory needs on
> a table-by-table basis for the vacuum process? If so, I suspect I could use
> this as a guide for setting a session-specific maintenance_work_mem via cron
> to vacuum these problem tables on a specified schedule.

You need 6 bytes per dead tuple in the table to avoid scanning the indexes more than once. If you vacuum regularly, you shouldn't need more than a few hundred MB. One way to check is to run VACUUM VERBOSE, which will tell you how many passes it used. If it used more than one, increase maintenance_work_mem.

I would suggest using autovacuum after all. If it seems to be disrupting other activity too much, increase autovacuum_vacuum_cost_delay. Or decrease it if it can't keep up with the updates.

BTW, you didn't mention which version of PostgreSQL you're using. There have been some performance enhancements to VACUUM in 8.2, as well as autovacuum changes. You might consider upgrading if you're not on 8.2 already.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
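To make the cost-delay and sizing suggestions concrete, a manual vacuum session could look like the sketch below. The numbers are illustrative starting points to experiment with, not recommendations, and big_table is a placeholder name:

  -- Throttle this session's vacuum: after accumulating vacuum_cost_limit
  -- "cost units" of work, vacuum sleeps for vacuum_cost_delay milliseconds.
  SET vacuum_cost_delay = 20;    -- default is 0, i.e. no throttling
  SET vacuum_cost_limit = 200;   -- the default cost budget between naps

  -- The rule of thumb above: ~6 bytes per dead tuple, so ~300MB of
  -- maintenance_work_mem covers on the order of 50 million dead rows.
  SET maintenance_work_mem = 300000;  -- value is in kB

  VACUUM VERBOSE big_table;

If the VERBOSE output shows the table's indexes being scanned more than once, the dead-tuple list didn't fit and maintenance_work_mem should go up - or the vacuums should simply run more often.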
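And the CLUSTER route, shown with the pre-8.3 syntax since the thread predates 8.3 (again, the index and table names are placeholders):

  -- Rewrites the table in index order, reclaiming all dead space, and
  -- rebuilds every index - replacing the manual eight-step dance above.
  -- It holds an ACCESS EXCLUSIVE lock for the duration, so it belongs in
  -- the same off-hours window as the rebuild.
  CLUSTER big_table_created_idx ON big_table;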
On Tue, Sep 11, 2007 at 10:24:58AM -0600, Kevin Kempter wrote:
> I let the cron vacuum run for 14 days. In those 14 days the time it takes to
> vacuum the table grew from 1.2 hours directly after the rebuild to > 8 hours
> last night.

Sounds to me like daily isn't enough, and that your FSM is too small.

--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
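On these pre-8.4 releases (the free space map became self-managing in 8.4), a database-wide VACUUM VERBOSE reports near the end how many FSM page slots are needed versus the configured limit. If the needed figure exceeds max_fsm_pages, vacuum's reclaimed space is forgotten and the table bloats no matter how often it's vacuumed. The fix goes in postgresql.conf and takes effect after a restart; the values below are illustrative, to be sized from what VACUUM VERBOSE reports:

  # postgresql.conf
  max_fsm_pages = 2000000    # must exceed the "page slots needed" figure
                             # that VACUUM VERBOSE reports for the cluster
  max_fsm_relations = 1000   # tables/indexes the FSM can track at once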