Thread: Sidetracking pg_autovacuum

Sidetracking pg_autovacuum

From
Jeff Boes
Date:
(Environment: Pg 7.4.1 on Linux; yes, I know it's past time to upgrade.)

(Originally, I started to post this as a question about how
pg_class.reltuples could get so far out of whack compared to the actual
table size. After posting it, I found that (a) I had the mailing list
address wrong [d'oh!], and (b) I think I solved my problem. In essence,
by doing a VACUUM in a script after a sizeable DELETE, apparently I took
the affected table out of the hands of pg_autovacuum so that it never,
ever did anything with the table. Including ANALYZE. Thus, reltuples
never got updated.)

I started tracking the COUNT(*) (actual row count) versus the value in
pg_class.reltuples for a number of our larger tables. Some of the tables
see a lot of INSERTs and DELETEs over the course of a day; as much as
1/12th of the data will be deleted overnight, and new data inserted over
the course of the day. I have pg_autovacuum running, and I also do
regular VACUUMs, ANALYZEs, and even some CLUSTERs on these tables.

[N.B.: In fact, I started doing a VACUUM explicitly after the big
nightly DELETE.]

One table in particular started to act "funny", which got me looking at
the innards of it, and I found that on our production system (which has
significantly higher throughput), the ratio of (pg_class.reltuples /
count(*)) would climb to 2.0 or higher; that seemed to indicate that
pg_autovacuum wasn't doing its job, at least not when I expected it to,
so I started doing a VACUUM after the DELETEs overnight, and that made
*that* problem go away.

However, now the test system (with the lower throughput) is behaving
oddly. The ratio has *fallen* to about 0.16, meaning there are *six
times as many* actual rows in the table compared to what the optimizer
thinks. I did a hand-analyze on the table, and the situation didn't
change. Right now, row count is about 182,000 and reltuples is under 29,000.

What would cause an otherwise well-behaved table to start doing this? Is
this just a "dead spot" in the ANALYZE command? (By which I mean:
ANALYZE randomly sampling rows, but my data is not terribly random, so
it gets fooled?)

[And here's the remaining question in my puzzled mind: ANALYZE would not
change the reltuples value, but VACUUM FULL ANALYZE did. Er-wha?]


--
Jeff Boes                                         Vox 269-226-9550 x24
Director of Software Development                  Fax 269-349-9076

Exfacto!         Exceptional Online Content     http://www.exfacto.com
Nexcerpt         ...Extend Your Expertise...    http://www.nexcerpt.com



Re: Sidetracking pg_autovacuum

From
Tom Lane
Date:
Jeff Boes <jboes@nexcerpt.com> writes:
> What would cause an otherwise well-behaved table to start doing this? Is
> this just a "dead spot" in the ANALYZE command? (By which I mean:
> ANALYZE randomly sampling rows, but my data is not terribly random, so
> it gets fooled?)

7.4's ANALYZE is indeed easily fooled by nonuniform tuple density.
8.0 uses a different sampling algorithm that we hope will do better.

> [And here's the remaining question in my puzzled mind: ANALYZE would not
> change the reltuples value, but VACUUM FULL ANALYZE did. Er-wha?]

Any variant of VACUUM records the exact number of tuples that it found
in its full table scan in reltuples.  It's only a standalone ANALYZE
that has to approximate.

The case where you saw reltuples > count(*) probably came from the fact
that VACUUM counts physical tuples --- that is, the difference reflects
dead-but-not-yet-removable tuples.

            regards, tom lane

Re: Sidetracking pg_autovacuum

From
Chris Browne
Date:
jboes@nexcerpt.com (Jeff Boes) writes:
> (Environment: Pg 7.4.1 on Linux; yes, I know it's past time to
> upgrade.)

If you were running 7.2, I'd be pestering you ;-).

We didn't get rid of our last 7.1 system until last December, and I
think the backend is probably still running, albeit with nothing
hooking up to it :-).

> (Originally, I started to post this as a question about how
> pg_class.reltuples could get so far out of whack compared to the
> actual table size. After posting it, I found that (a) I had the
> mailing list address wrong [d'oh!], and (b) I think I solved my
> problem. In essence, by doing a VACUUM in a script after a sizeable
> DELETE, apparently I took the affected table out of the hands of
> pg_autovacuum so that it never, ever did anything with the
> table. Including ANALYZE. Thus, reltuples never got updated.)

Oops.  Change that to a VACUUM ANALYZE (in your script), and things
should work out better.

> I started tracking the COUNT(*) (actual row count) versus the value
> in pg_class.reltuples for a number of our larger tables. Some of the
> tables see a lot of INSERTs and DELETEs over the course of a day; as
> much as 1/12th of the data will be deleted overnight, and new data
> inserted over the course of the day. I have pg_autovacuum running,
> and I also do regular VACUUMs, ANALYZEs, and even some CLUSTERs on
> these tables.
>
> [N.B.: In fact, I started doing a VACUUM explicitly after the big
> nightly DELETE.]

That's not a bad idea, all in all.

> What would cause an otherwise well-behaved table to start doing
> this? Is this just a "dead spot" in the ANALYZE command? (By which I
> mean: ANALYZE randomly sampling rows, but my data is not terribly
> random, so it gets fooled?)

Yes, that doubtless would be it.  There's a new sampling scheme in
version 8 that may do better.

> [And here's the remaining question in my puzzled mind: ANALYZE would
> not change the reltuples value, but VACUUM FULL ANALYZE
> did. Er-wha?]

VACUUM ANALYZE would also change the reltuples value...
--
(format nil "~S@~S" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78.  "I will not tell my Legions of Terror
"And he must  be taken alive!" The command will be:  ``And try to take
him alive if it is reasonably practical.''"
<http://www.eviloverlord.com/>