Sidetracking pg_autovacuum - Mailing list pgsql-admin

From Jeff Boes
Subject Sidetracking pg_autovacuum
Date
Msg-id 428B8A27.7050402@nexcerpt.com
Whole thread Raw
Responses Re: Sidetracking pg_autovacuum
List pgsql-admin
(Environment: Pg 7.4.1 on Linux; yes, I know it's past time to upgrade.)

(Originally, I started to post this as a question about how
pg_class.reltuples could get so far out of whack compared to the actual
table size. After posting it, I found that (a) I had the mailing list
address wrong [d'oh!], and (b) I think I solved my problem. In essence,
by doing a VACUUM in a script after a sizeable DELETE, apparently I took
the affected table out of the hands of pg_autovacuum so that it never,
ever did anything with the table. Including ANALYZE. Thus, reltuples
never got updated.)

I started tracking the COUNT(*) (actual row count) versus the value in
pg_class.reltuples for a number of our larger tables. Some of the tables
see a lot of INSERTs and DELETEs over the course of a day; as much as
1/12th of the data will be deleted overnight, and new data inserted over
the course of the day. I have pg_autovacuum running, and I also do
regular VACUUMs, ANALYZEs, and even some CLUSTERs on these tables.

[N.B.: In fact, I started doing a VACUUM explicitly after the big
nightly DELETE.]

One table in particular started to act "funny", which got me looking at
the innards of it, and I found that on our production system (which has
significantly higher throughput), the ratio of (pg_class.reltuples /
count(*)) would climb to 2.0 or higher; that seemed to indicate that
pg_autovacuum wasn't doing its job, at least not when I expected it to,
so I started doing a VACUUM after the DELETEs overnight, and that made
*that* problem go away.

However, now the test system (with the lower throughput) is behaving
oddly. The ratio has *fallen* to about 0.16, meaning there are *six
times as many* actual rows in the table compared to what the optimizer
thinks. I did a hand-analyze on the table, and the situation didn't
change. Right now, row count is about 182,000 and reltuples is under 29,000.

What would cause an otherwise well-behaved table to start doing this? Is
this just a "dead spot" in the ANALYZE command? (By which I mean:
ANALYZE randomly sampling rows, but my data is not terribly random, so
it gets fooled?)

[And here's the remaining question in my puzzled mind: ANALYZE would not
change the reltuples value, but VACUUM FULL ANALYZE did. Er-wha?]


--
Jeff Boes                                         Vox 269-226-9550 x24
Director of Software Development                  Fax 269-349-9076

Exfacto!         Exceptional Online Content     http://www.exfacto.com
Nexcerpt         ...Extend Your Expertise...    http://www.nexcerpt.com



pgsql-admin by date:

Previous
From: Pallav Kalva
Date:
Subject: Re: SQL logging (log_min_duration_statement) not working
Next
From: Tom Lane
Date:
Subject: Re: Sidetracking pg_autovacuum