Re: Some Autovacuum Questions - Mailing list pgsql-general

From Chander Ganesan
Subject Re: Some Autovacuum Questions
Msg-id 47B1BD85.5040209@otg-nc.com
In response to Some Autovacuum Questions  ("Thomas Chille" <thomas@chille.de>)
List pgsql-general
Thomas Chille wrote:
> Hi!
>
> Some of our clients databases are performing less good after a while.
> We are using autovacuum to vacuuming and analyzing the tables.
>
> After some analyzes by my own it looks like that the tables or table
> indexes are not analyzed or vacuumed fully or correctly.
>
You might have some index bloat....
> A count(*) query takes multiple times longer on this databases as on a
> fresh dumped/restored version of the same database on the same
> machine.
>
Are you using a WHERE clause with your count?  If not, PostgreSQL must
scan the whole table to count the rows.  If you have a lot of churn
between vacuum runs, dead rows can increase the table size
significantly and thus account for the additional time.  You should find
that the count drops back to the dump/restore time after a VACUUM FULL
(or CLUSTER), but both take an exclusive lock on the table while they
run.  You could also tune autovacuum (AVD) to be a bit more aggressive,
but that comes at the cost of extra I/O load while it runs.
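As a rough way to see what "more aggressive" would mean, the sketch
below estimates an upper bound on how many pages a cost-throttled vacuum
can touch per second.  It assumes a simplified model of cost-based
vacuum delay (add a cost for every page, sleep for vacuum_cost_delay
milliseconds each time vacuum_cost_limit is reached), 8 kB pages, and
that the time spent on the real work is negligible.  The function name
is made up for illustration; the example values are the settings quoted
further down in this thread.

```python
def max_vacuum_pages_per_second(cost_limit, cost_delay_ms, cost_per_page):
    """Upper bound on pages per second for a cost-throttled vacuum.

    Assumed model: vacuum adds cost_per_page for every page it touches and
    sleeps cost_delay_ms each time the running total reaches cost_limit.
    Time spent on the real work is ignored, so actual rates are lower.
    """
    if cost_delay_ms <= 0:
        # A delay of 0 turns throttling off, so there is no cap to compute.
        return float("inf")
    pages_per_round = cost_limit / cost_per_page
    rounds_per_second = 1000.0 / cost_delay_ms
    return pages_per_round * rounds_per_second


# Settings quoted in this thread: limit 100, delay 200 ms, page hit cost 6.
pages = max_vacuum_pages_per_second(100, 200, 6)
kb_per_second = pages * 8  # assuming 8 kB pages
print(f"at most {pages:.0f} pages/s, about {kb_per_second:.0f} kB/s")
```

In this model, halving the delay or doubling the limit doubles the
ceiling, so either change makes vacuum finish sooner at the price of
more concurrent I/O.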
> During the query it looks like that postgres scans all the time the
> harddisk and is reading a lot more data in then from the fresh
> restored database. This showed me the output of the vmstat-commands
> 'bi'-attribute (blocks reading in) and the up to 99,9% wait-state of
> the cpu.
>
> My 1. question is,
> if the known bugfixes for autovacuum after release 8.1.4 addressing my
> depicted issues?
>
> We are still using 8.1.4 because a database upgrade for us and our
> product is a hefty step wich involves a lot of customer databases. But
> if it could help we consider to upgrade to 8.1.11 or 8.3. What would u
> suggest?
>
I don't think so.  HOT in 8.3 might help a bit (it's a new feature, not
a bug fix), but based on what you say your application does below, I'm
not sure you would see a benefit from HOT anyway.
> My 2. questions is,
> if i possible configured something improper?
>
Well, you might be better off using the row estimates in pg_class
(the reltuples column) rather than doing a count(*) to get them.  The
pg_class values are estimates, and will be reasonably accurate provided
the table was analyzed recently.
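A minimal sketch of reading that estimate from application code is
shown below.  It assumes a DB-API 2.0 cursor connected to PostgreSQL and
the %s placeholder style used by drivers such as psycopg2; the function
name is hypothetical, and the lookup ignores schemas, so it only suits
table names that are unique in the database.

```python
def estimated_row_count(cursor, table_name):
    """Return the planner's row estimate for table_name, or None if absent.

    cursor is any DB-API 2.0 cursor connected to PostgreSQL.  The %s
    placeholder style is an assumption that matches drivers such as psycopg2.
    """
    cursor.execute(
        "SELECT reltuples FROM pg_class WHERE relname = %s AND relkind = 'r'",
        (table_name,),
    )
    row = cursor.fetchone()
    # reltuples is stored as a float and is only refreshed by VACUUM/ANALYZE.
    return None if row is None else int(row[0])
```

Unlike count(*), this reads a single catalog row, so its cost does not
grow with the size or bloat of the table.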
> For this i will give a brief overview of our database.
>
> The database stores mainly historical data for a reporting
> application. This data will be consolidated per day at frequent
> intervals. For this the data of one day will be removed from the
> historical tables and will be newly calculated out of some tables with
> raw data.  Depending on the daytime and the amount of data it takes
> normaly up to 2 minutes to summarize an compress the data of one day.
> After one minute break it starts again.
>
> We talk about up to 3000 records per day out of up to 3 million
> records (and growing) in the whole historical table.
> Can autovacuum handle that much changing data with this configuration?
>
Sure, though you might be able to improve performance by using
PostgreSQL inheritance and putting the static data (that doesn't change
much) in a larger partition that you don't need to vacuum as frequently.
> From default configuration differing settings:
>
> vacuum_cost_delay = 200
> vacuum_cost_page_hit = 6
> vacuum_cost_limit = 100
> autovacuum_naptime = 60
>
> Strange enough a manual analyze and vacuum makes the measured count(*)
> query less performant? See attached vacuum log.
>
I'm sure that if you ran the manual vacuum at the same interval as
autovacuum, you would see similar performance.
> My 3. question is,
> if it possible to read the vaccuming or analyzing state of a given
> table manually? Are there any indicatores in statistic tables, wich
> the autovacuum demaon is using too wich can show me the progress of a
> running autovacuum?
>
Sure, you can look at the table-level row statistics (for example
n_tup_upd and n_tup_del in pg_stat_user_tables).  If you add the number
of rows deleted to the number of rows updated, you'll get a count of the
number of dead tuples (assuming you reset the stats and haven't vacuumed
since).  AVD uses these values to determine when it needs to vacuum and
analyze.
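The arithmetic can be sketched as follows.  This is a simplified model,
not the daemon's actual code: the function names are made up, and the
default threshold (1000) and scale factor (0.4) are assumed to be the
8.1 defaults for autovacuum_vacuum_threshold and
autovacuum_vacuum_scale_factor, so check postgresql.conf for the values
really in use.

```python
def estimated_dead_tuples(n_tup_upd, n_tup_del):
    """Dead-tuple estimate: each update or delete leaves one dead row version.

    Only meaningful if the counters were reset at the last vacuum.
    """
    return n_tup_upd + n_tup_del


def vacuum_threshold(reltuples, base_threshold=1000, scale_factor=0.4):
    """Dead tuples needed before autovacuum vacuums the table.

    The defaults are assumed 8.1 values, not read from any server.
    """
    return base_threshold + scale_factor * reltuples


def needs_vacuum(n_tup_upd, n_tup_del, reltuples, **settings):
    """True once the dead-tuple estimate exceeds the vacuum threshold."""
    dead = estimated_dead_tuples(n_tup_upd, n_tup_del)
    return dead > vacuum_threshold(reltuples, **settings)


# Figures from this thread: about 3 million rows, roughly 3000 deleted per run.
print(vacuum_threshold(3_000_000))
print(needs_vacuum(n_tup_upd=0, n_tup_del=3000, reltuples=3_000_000))
```

Under those assumed defaults a 3 million row table would need well over
a million dead tuples before a vacuum is triggered, which is one reason
a lower scale factor may be worth trying on large tables.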
> We are using a customized debian Linux on Pentium 4 2,8 GHz
>
>
> Thanks for any help!
>
> regards, thomas
>
> ------------------------------------------------------------------------
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>


--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
919-463-0999/877-258-8987
http://www.otg-nc.com

