Thread: Some Autovacuum Questions
Hi! Some of our clients databases are performing less good after a while. We are using autovacuum to vacuuming and analyzing the tables. After some analyzes by my own it looks like that the tables or table indexes are not analyzed or vacuumed fully or correctly. A count(*) query takes multiple times longer on this databases as on a fresh dumped/restored version of the same database on the same machine. During the query it looks like that postgres scans all the time the harddisk and is reading a lot more data in then from the fresh restored database. This showed me the output of the vmstat-commands 'bi'-attribute (blocks reading in) and the up to 99,9% wait-state of the cpu. My 1. question is, if the known bugfixes for autovacuum after release 8.1.4 addressing my depicted issues? We are still using 8.1.4 because a database upgrade for us and our product is a hefty step wich involves a lot of customer databases. But if it could help we consider to upgrade to 8.1.11 or 8.3. What would u suggest? My 2. questions is, if i possible configured something improper? For this i will give a brief overview of our database. The database stores mainly historical data for a reporting application. This data will be consolidated per day at frequent intervals. For this the data of one day will be removed from the historical tables and will be newly calculated out of some tables with raw data. Depending on the daytime and the amount of data it takes normaly up to 2 minutes to summarize an compress the data of one day. After one minute break it starts again. We talk about up to 3000 records per day out of up to 3 million records (and growing) in the whole historical table. Can autovacuum handle that much changing data with this configuration? From default configuration differing settings: vacuum_cost_delay = 200 vacuum_cost_page_hit = 6 vacuum_cost_limit = 100 autovacuum_naptime = 60 Strange enough a manual analyze and vacuum makes the measured count(*) query less performant? See attached vacuum log. My 3. question is, if it possible to read the vaccuming or analyzing state of a given table manually? Are there any indicatores in statistic tables, wich the autovacuum demaon is using too wich can show me the progress of a running autovacuum? We are using a customized debian Linux on Pentium 4 2,8 GHz Thanks for any help! regards, thomas
Attachment
Thomas Chille wrote: > Hi! > > Some of our clients databases are performing less good after a while. > We are using autovacuum to vacuuming and analyzing the tables. > > After some analyzes by my own it looks like that the tables or table > indexes are not analyzed or vacuumed fully or correctly. > You might have some index bloat.... > A count(*) query takes multiple times longer on this databases as on a > fresh dumped/restored version of the same database on the same > machine. > Are you using a where clause with your count? If not, the PostgreSQL must do a table scan to count the rows. If you have a lot of churn prior to running vaccum, that might increase your table size significantly and thus account for the additional time. You should find that it drops back to the dump/restore time if you do a vacuum full (or cluster), but that has other performance implications. You could also tune AVD to be a bit more aggressive, but that might have other performance-related implications for you. > During the query it looks like that postgres scans all the time the > harddisk and is reading a lot more data in then from the fresh > restored database. This showed me the output of the vmstat-commands > 'bi'-attribute (blocks reading in) and the up to 99,9% wait-state of > the cpu. > > My 1. question is, > if the known bugfixes for autovacuum after release 8.1.4 addressing my > depicted issues? > > We are still using 8.1.4 because a database upgrade for us and our > product is a hefty step wich involves a lot of customer databases. But > if it could help we consider to upgrade to 8.1.11 or 8.3. What would u > suggest? > I don't think so...though HOT in 8.3 might help a bit (its a feature, not a bug fix though), but based on what you say your application does below, I'm not sure that you would see a benefit in HOT anyways. > My 2. questions is, > if i possible configured something improper? > Well, you might be better off using the row estimates in pg_class, rather than doing a count() to get them. pg_class values are estimates, and will be somewhat accurate provided you did a recent analyze... > For this i will give a brief overview of our database. > > The database stores mainly historical data for a reporting > application. This data will be consolidated per day at frequent > intervals. For this the data of one day will be removed from the > historical tables and will be newly calculated out of some tables with > raw data. Depending on the daytime and the amount of data it takes > normaly up to 2 minutes to summarize an compress the data of one day. > After one minute break it starts again. > > We talk about up to 3000 records per day out of up to 3 million > records (and growing) in the whole historical table. > Can autovacuum handle that much changing data with this configuration? > Sure. though you might be able to improve performance by using PostgreSQL inheritance and putting the static data (that doesn't change much) in a larger parition that you don't need to vacuum as frequently. > From default configuration differing settings: > > vacuum_cost_delay = 200 > vacuum_cost_page_hit = 6 > vacuum_cost_limit = 100 > autovacuum_naptime = 60 > > Strange enough a manual analyze and vacuum makes the measured count(*) > query less performant? See attached vacuum log. > I'm sure if you did the manual vacuum at the same interval as the auto-vacuum you would see it be similarly performant. > My 3. question is, > if it possible to read the vaccuming or analyzing state of a given > table manually? Are there any indicatores in statistic tables, wich > the autovacuum demaon is using too wich can show me the progress of a > running autovacuum? > Sure, you can look at the table level row i/o stats. If you add the number of rows deleted with the number of rows updated, you'll get a count of the number of dead tuples (assuming you reset stats and didn't do a vacuum ). AVD uses these values to determine when it needs to vacuum and analyze... > We are using a customized debian Linux on Pentium 4 2,8 GHz > > > Thanks for any help! > > regards, thomas > > ------------------------------------------------------------------------ > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster > -- Chander Ganesan Open Technology Group, Inc. One Copley Parkway, Suite 210 Morrisville, NC 27560 919-463-0999/877-258-8987 http://www.otg-nc.com
Thomas Chille wrote: > My 1. question is, > if the known bugfixes for autovacuum after release 8.1.4 addressing my > depicted issues? Not directly, but keep reading. > We are still using 8.1.4 because a database upgrade for us and our > product is a hefty step wich involves a lot of customer databases. But > if it could help we consider to upgrade to 8.1.11 or 8.3. What would u > suggest? You will be really happy when your database stops working because of a bug in autovacuum which was fixed in 8.1.6. I should add that perhaps autovacuum is not processing your databases at all because it is stuck on processing template0 due to that bug -- this could cause dead rows to accumulate. I'm not sure why you say it's a hefty step to upgrade to 8.1.11. Perhaps you're not up to speed on our upgrade procedures. On the other hand, perhaps this bug hasn't bit you yet. But I can promise, if you don't upgrade it will. > My 2. questions is, > if i possible configured something improper? If you didn't tune autovacuum at all, yeah, perhaps it would be good to adjust the settings somewhat. > My 3. question is, > if it possible to read the vaccuming or analyzing state of a given > table manually? Are there any indicatores in statistic tables, wich > the autovacuum demaon is using too wich can show me the progress of a > running autovacuum? Yes, they are kept in pg_stat_all_tables. (Not all autovacuum numbers are displayed however). -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Feb 12, 2008 at 04:13:33PM +0100, Thomas Chille wrote: > We are still using 8.1.4 because a database upgrade for us and our > product is a hefty step wich involves a lot of customer databases. But > if it could help we consider to upgrade to 8.1.11 or 8.3. What would u > suggest? Obviously 8.1.11, given the upgrade should only take moments... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Those who make peaceful revolution impossible will make violent revolution inevitable. > -- John F Kennedy
Attachment
On Feb 12, 2008, at 9:13 AM, Thomas Chille wrote: > vacuum_cost_delay = 200 > vacuum_cost_page_hit = 6 > vacuum_cost_limit = 100 Vacuum is going to take forever with those settings. I strongly suggest you set them back to default. If you need to throttle vacuum, try setting cost_delay to between 10 and 20. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828