Thread: db size and VACUUM ANALYZE
Hello,

The db in the application I maintain but didn't write (it obviously
makes use of PG, v 8.3) has been systematically growing in size from
about 600M to 1.6G.

At the same time, the performance of the app has degraded
significantly (several times over).

So I've done VACUUM ANALYZE on the entire db. Nothing. The db did not
decrease in size, and the performance stayed the same.

So I backed it up using pg_dump, deleted the database, and recreated
it from the backup.

The size of the db on disk went down to 600M, and performance
recovered to the original level.

Why is that so? I thought that VACUUM ANALYZE does everything that is
needed to optimize disk usage?

Regards,
mk
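For illustration, the dump-and-recreate cycle described above might
look roughly like this; the database name, user, and file path are
made-up placeholders, not details from this thread:

    # Dump the database to a plain-SQL file (names/paths hypothetical)
    pg_dump -U postgres mydb > mydb.sql

    # Drop and recreate the database, then restore from the dump
    dropdb -U postgres mydb
    createdb -U postgres mydb
    psql -U postgres -d mydb -f mydb.sql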
On Fri, 2010-02-12 at 18:09 +0100, Marcin Krol wrote:
> Hello,
>
> The db in the application I maintain but didn't write (it obviously
> makes use of PG, v 8.3) has been systematically growing in size from
> about 600M to 1.6G.
>
> At the same time, the performance of the app has degraded
> significantly (several times over).

This is usually an indication that you are not vacuuming frequently
enough.

> So I've done VACUUM ANALYZE on the entire db. Nothing. The db did not
> decrease in size, and the performance stayed the same.

Regular VACUUM will not decrease the size of the database. It will
just mark the dead tuples as available for re-use. VACUUM FULL will
remove the dead tuples from the database and make it smaller, but it's
a bad idea to get in the habit of doing this. It's better to tune
autovacuum and avoid the issue altogether.

> So I backed it up using pg_dump, deleted the database, and recreated
> it from the backup.
>
> The size of the db on disk went down to 600M, and performance
> recovered to the original level.

Reloading the database eliminates the dead tuples, which is why the
database is smaller.

> Why is that so? I thought that VACUUM ANALYZE does everything that is
> needed to optimize disk usage?

It may not do everything, but the idea is to VACUUM your tables often
enough that they don't grow out of control.

Are you running autovacuum? It should take care of this for you. You
may need to make it more aggressive than the default though.

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.
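As a sketch of the distinction Brad draws: in 8.3 the stats views show
how much of each table is dead tuples (assuming the stats collector is
running), and the two kinds of vacuum can be run by hand. The table
name "mytable" is a hypothetical placeholder:

    -- Live vs. dead tuples per table (requires the stats collector)
    SELECT relname, n_live_tup, n_dead_tup
      FROM pg_stat_user_tables
     ORDER BY n_dead_tup DESC;

    -- Regular VACUUM: marks dead tuples reusable, doesn't shrink files
    VACUUM ANALYZE mytable;

    -- VACUUM FULL: compacts the table and returns space to the OS,
    -- but takes an exclusive lock -- not something to do routinely
    VACUUM FULL mytable;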
Brad Nicholson wrote:

First of all, I don't really care about 1G of disk space; the main
problem was why the performance degraded so much.

> Are you running autovacuum?

Apparently not. I have turned it on in the conf and restarted PG; I'll
see how that works.

> It should take care of this for you. You
> may need to make it more aggressive than the default though.

Hmm, what do you mean by "more aggressive"? I haven't seen anything in
the parameters that would suggest whether it is more or less likely to
recover dead tuples:

#log_autovacuum_min_duration = -1    # -1 disables, 0 logs all actions and
                                     # their durations, > 0 logs only
                                     # actions running at least that time.
#autovacuum_max_workers = 3          # max number of autovacuum subprocesses
#autovacuum_naptime = 1min           # time between autovacuum runs
#autovacuum_vacuum_threshold = 50    # min number of row updates before
                                     # vacuum
#autovacuum_analyze_threshold = 50   # min number of row updates before
                                     # analyze
#autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum
                                        # (change requires restart)
#autovacuum_vacuum_cost_delay = 20   # default vacuum cost delay for
                                     # autovacuum, -1 means use
                                     # vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1   # default vacuum cost limit for
                                     # autovacuum, -1 means use
                                     # vacuum_cost_limit

I don't see anything in here that would suggest an equivalent of
VACUUM FULL.

Regards,
mk
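For illustration, enabling autovacuum in 8.3 amounts to something like
the following in postgresql.conf; the lowered scale factors are only a
guess at what "more aggressive" could mean, not values recommended in
this thread:

    autovacuum = on           # enable the autovacuum daemon
    track_counts = on         # stats collection, required by autovacuum

    # Illustrative "more aggressive" settings: vacuum after 10% of a
    # table has changed instead of the 20% default
    autovacuum_vacuum_scale_factor = 0.1
    autovacuum_analyze_scale_factor = 0.05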
On Fri, 2010-02-12 at 18:41 +0100, Marcin Krol wrote:
> Brad Nicholson wrote:
>
> First of all, I don't really care about 1G of disk space; the main
> problem was why the performance degraded so much.

Because your data is now spread across a whole lot more blocks on
disk. It takes more work to find the data than it used to.

> > Are you running autovacuum?
>
> Apparently not. I have turned it on in the conf and restarted PG;
> I'll see how that works.
>
> > It should take care of this for you. You
> > may need to make it more aggressive than the default though.
>
> Hmm, what do you mean by "more aggressive"? I haven't seen anything
> in the parameters that would suggest whether it is more or less
> likely to recover dead tuples:

I would start by turning autovacuum on and running it with the
defaults. This will most likely make this problem go away. If you find
that you still have problems, try lowering
autovacuum_vacuum_scale_factor a bit.

> [autovacuum settings from postgresql.conf snipped]
>
> I don't see anything in here that would suggest an equivalent of
> VACUUM FULL.

Autovacuum does not run VACUUM FULL, it runs a regular VACUUM.

I recommend reading this page in the documentation:
http://www.postgresql.org/docs/8.3/interactive/routine-vacuuming.html

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.
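A rough way to see the block-level spread Brad describes, and whether
autovacuum is actually visiting tables, is via the 8.3 catalogs; this
is a sketch and assumes the stats collector is on:

    -- Pages on disk vs. estimated live rows (as of the last ANALYZE);
    -- pages growing while rows stay flat is the bloat pattern
    SELECT relname, relpages, reltuples
      FROM pg_class
     WHERE relkind = 'r'
     ORDER BY relpages DESC
     LIMIT 10;

    -- When each table was last vacuumed, manually or by autovacuum
    SELECT relname, last_vacuum, last_autovacuum
      FROM pg_stat_user_tables;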
On Fri, 2010-02-12 at 13:46 -0500, Brad Nicholson wrote:
> On Fri, 2010-02-12 at 18:41 +0100, Marcin Krol wrote:
> > Brad Nicholson wrote:
> >
> > First of all, I don't really care about 1G of disk space; the main
> > problem was why the performance degraded so much.
>
> Because your data is now spread across a whole lot more blocks on
> disk. It takes more work to find the data than it used to.

Oh, I didn't realize the DB was so small. How big is your shared
buffer pool? It very well might be that at 600MB, the entire DB fits
into the buffer pool. As the DB grows, that is no longer the case, and
the speed of your disk now comes into play.

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.
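For illustration, both the buffer pool setting and the on-disk size
Brad is comparing can be checked from psql (pg_database_size and
pg_size_pretty both exist in 8.3):

    -- Current shared buffer pool setting
    SHOW shared_buffers;

    -- On-disk size of the current database
    SELECT pg_size_pretty(pg_database_size(current_database()));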