Thread: Some Autovacuum Questions

Some Autovacuum Questions

From
"Thomas Chille"
Date:
Hi!

Some of our clients databases are performing less good after a while.
We are using autovacuum to vacuuming and analyzing the tables.

After some analyzes by my own it looks like that the tables or table
indexes are not analyzed or vacuumed fully or correctly.

A count(*) query takes multiple times longer on this databases as on a
fresh dumped/restored version of the same database on the same
machine.

During the query it looks like that postgres scans all the time the
harddisk and is reading a lot more data in then from the fresh
restored database. This showed me the output of the vmstat-commands
'bi'-attribute (blocks reading in) and the up to 99,9% wait-state of
the cpu.

My 1. question is,
if the known bugfixes for autovacuum after release 8.1.4 addressing my
depicted issues?

We are still using 8.1.4 because a database upgrade for us and our
product is a hefty step wich involves a lot of customer databases. But
if it could help we consider to upgrade to 8.1.11 or 8.3. What would u
suggest?

My 2. questions is,
if i possible configured something improper?

For this i will give a brief overview of our database.

The database stores mainly historical data for a reporting
application. This data will be consolidated per day at frequent
intervals. For this the data of one day will be removed from the
historical tables and will be newly calculated out of some tables with
raw data.  Depending on the daytime and the amount of data it takes
normaly up to 2 minutes to summarize an compress the data of one day.
After one minute break it starts again.

We talk about up to 3000 records per day out of up to 3 million
records (and growing) in the whole historical table.

Can autovacuum handle that much changing data with this configuration?

From default configuration differing settings:

vacuum_cost_delay = 200
vacuum_cost_page_hit = 6
vacuum_cost_limit = 100
autovacuum_naptime = 60

Strange enough a manual analyze and vacuum makes the measured count(*)
query less performant? See attached vacuum log.

My 3. question is,
if it possible to read the vaccuming or analyzing state of a given
table manually? Are there any indicatores in statistic tables, wich
the autovacuum demaon is using too wich can show me the progress of a
running autovacuum?

We are using a customized debian Linux on Pentium 4 2,8 GHz


Thanks for any help!

regards, thomas

Attachment

Re: Some Autovacuum Questions

From
Chander Ganesan
Date:
Thomas Chille wrote:
> Hi!
>
> Some of our clients databases are performing less good after a while.
> We are using autovacuum to vacuuming and analyzing the tables.
>
> After some analyzes by my own it looks like that the tables or table
> indexes are not analyzed or vacuumed fully or correctly.
>
You might have some index bloat....
> A count(*) query takes multiple times longer on this databases as on a
> fresh dumped/restored version of the same database on the same
> machine.
>
Are you using a where clause with your count?  If not, the PostgreSQL
must do a table scan to count the rows.  If you have a lot of churn
prior to running vaccum, that might increase your table size
significantly and thus account for the additional time.  You should find
that it drops back to the dump/restore time if you do a vacuum full (or
cluster), but that has other performance implications.  You could also
tune AVD to be a bit more aggressive, but that might have other
performance-related implications for you.
> During the query it looks like that postgres scans all the time the
> harddisk and is reading a lot more data in then from the fresh
> restored database. This showed me the output of the vmstat-commands
> 'bi'-attribute (blocks reading in) and the up to 99,9% wait-state of
> the cpu.
>
> My 1. question is,
> if the known bugfixes for autovacuum after release 8.1.4 addressing my
> depicted issues?
>
> We are still using 8.1.4 because a database upgrade for us and our
> product is a hefty step wich involves a lot of customer databases. But
> if it could help we consider to upgrade to 8.1.11 or 8.3. What would u
> suggest?
>
I don't think so...though HOT in 8.3 might help a bit (its a feature,
not a bug fix though), but based on what you say your application does
below, I'm not sure that you would see a benefit in HOT anyways.
> My 2. questions is,
> if i possible configured something improper?
>
Well, you might be better off using the row estimates in pg_class,
rather than doing a count() to get them.  pg_class values are estimates,
and will be somewhat accurate provided you did a recent analyze...
> For this i will give a brief overview of our database.
>
> The database stores mainly historical data for a reporting
> application. This data will be consolidated per day at frequent
> intervals. For this the data of one day will be removed from the
> historical tables and will be newly calculated out of some tables with
> raw data.  Depending on the daytime and the amount of data it takes
> normaly up to 2 minutes to summarize an compress the data of one day.
> After one minute break it starts again.
>
> We talk about up to 3000 records per day out of up to 3 million
> records (and growing) in the whole historical table.
> Can autovacuum handle that much changing data with this configuration?
>
Sure.  though you might be able to improve performance by using
PostgreSQL inheritance and putting the static data (that doesn't change
much) in a larger parition that you don't need to vacuum as frequently.
> From default configuration differing settings:
>
> vacuum_cost_delay = 200
> vacuum_cost_page_hit = 6
> vacuum_cost_limit = 100
> autovacuum_naptime = 60
>
> Strange enough a manual analyze and vacuum makes the measured count(*)
> query less performant? See attached vacuum log.
>
I'm sure if you did the manual vacuum at the same interval as the
auto-vacuum you would see it be similarly performant.
> My 3. question is,
> if it possible to read the vaccuming or analyzing state of a given
> table manually? Are there any indicatores in statistic tables, wich
> the autovacuum demaon is using too wich can show me the progress of a
> running autovacuum?
>
Sure, you can look at the table level row i/o stats.  If you add the
number of rows deleted with the number of rows updated, you'll get  a
count of the number of dead tuples (assuming you reset stats and didn't
do a vacuum ).  AVD uses these values to determine when it needs to
vacuum and analyze...
> We are using a customized debian Linux on Pentium 4 2,8 GHz
>
>
> Thanks for any help!
>
> regards, thomas
>
> ------------------------------------------------------------------------
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>


--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
919-463-0999/877-258-8987
http://www.otg-nc.com


Re: Some Autovacuum Questions

From
Alvaro Herrera
Date:
Thomas Chille wrote:

> My 1. question is,
> if the known bugfixes for autovacuum after release 8.1.4 addressing my
> depicted issues?

Not directly, but keep reading.

> We are still using 8.1.4 because a database upgrade for us and our
> product is a hefty step wich involves a lot of customer databases. But
> if it could help we consider to upgrade to 8.1.11 or 8.3. What would u
> suggest?

You will be really happy when your database stops working because of a
bug in autovacuum which was fixed in 8.1.6.  I should add that perhaps
autovacuum is not processing your databases at all because it is stuck
on processing template0 due to that bug -- this could cause dead rows to
accumulate.

I'm not sure why you say it's a hefty step to upgrade to 8.1.11.
Perhaps you're not up to speed on our upgrade procedures.

On the other hand, perhaps this bug hasn't bit you yet.  But I can
promise, if you don't upgrade it will.

> My 2. questions is,
> if i possible configured something improper?

If you didn't tune autovacuum at all, yeah, perhaps it would be good to
adjust the settings somewhat.


> My 3. question is,
> if it possible to read the vaccuming or analyzing state of a given
> table manually? Are there any indicatores in statistic tables, wich
> the autovacuum demaon is using too wich can show me the progress of a
> running autovacuum?

Yes, they are kept in pg_stat_all_tables.  (Not all autovacuum numbers
are displayed however).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Some Autovacuum Questions

From
Martijn van Oosterhout
Date:
On Tue, Feb 12, 2008 at 04:13:33PM +0100, Thomas Chille wrote:
> We are still using 8.1.4 because a database upgrade for us and our
> product is a hefty step wich involves a lot of customer databases. But
> if it could help we consider to upgrade to 8.1.11 or 8.3. What would u
> suggest?

Obviously 8.1.11, given the upgrade should only take moments...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Attachment

Re: Some Autovacuum Questions

From
Decibel!
Date:
On Feb 12, 2008, at 9:13 AM, Thomas Chille wrote:
> vacuum_cost_delay = 200
> vacuum_cost_page_hit = 6
> vacuum_cost_limit = 100


Vacuum is going to take forever with those settings. I strongly
suggest you set them back to default. If you need to throttle vacuum,
try setting cost_delay to between 10 and 20.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Attachment