Thread: Planner estimates and VACUUM/autovacuum
Hi,
I am not sure if it is bug or not but I found some strange behaviour. Maybe it is the same as described on http://www.postgresql.org/message-id/14616.1244317490@sss.pgh.pa.us ?). If yes – I’m sorry for the trouble, but I think that my example is more obvious.
Tested on PostgreSQL 9.2.4 and 9.2.6.
Console 1:
BEGIN;
DECLARE a CURSOR FOR SELECT * FROM tab;
--- Keep cursor open for disallow full vacuum of tab
Console 2:
SELECT count(*) FROM tab;
---- Result: 3588;
select reltuples from pg_class where relname='table’;
--- Result: 3588
UPDATE tab SET id=id;
UPDATE tab SET id=id;
UPDATE tab SET id=id;
VACUUM ANALYZE tab;
select reltuples from pg_class where relname='table’;
--- Result: 3588
Now wait few seconds J
select reltuples from pg_class where relname='table’;
--- Result: 12560
VACUUM ANALYZE tab;
select reltuples from pg_class where relname='table’;
--- Result: 3588
There is 3588 live records and 12560 live+dead records in table.
That is strange for me. VACUUM updates pg_class.reltuples differently (only live roiws count) than autovacuum (live and dead rows). Why?
Also in planning:
explain SELECT id FROM tab;
QUERY PLAN
----------------------------------------------------------------------
Seq Scan on tab (cost=0.00..1074.60 rows=12560 width=4)
Estimation is done with the use of current pg_class.reltuples value. This value includes dead rows count after autovacuum so estimation is bad, especially in more complex planner tree, for example:
Explain SELECT a.id FROM tab AS a JOIN tab AS b USING (id);
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..6410.70 rows=12560 width=4)
-> Seq Scan on tab a (cost=0.00..1074.60 rows=12560 width=8)
-> Index Only Scan using tab_pkey on tab b (cost=0.00..0.41 rows=1 width=4)
Index Cond: (id = a.id)
PostgreSQL estimates 12560 records in query result. This is wrong estimation if dead tuples are removed during seq scan or index scan (I suppose that it is).
I don’t think that AUTOVACUUM and VACUUM ANALYZE should behave differently L
--------------------------------------------------------------------------
Artur Zajac
=?iso-8859-2?Q?Artur_Zaj=B1c?= <azajac@ang.com.pl> writes: > That is strange for me. VACUUM updates pg_class.reltuples differently (only > live roiws count) than autovacuum (live and dead rows). Why? I don't have time to poke into this in detail right now, but I think that more likely the issue is that ANALYZE might have a different rule than VACUUM. autovacuum will fire those operations independently, which doesn't match what you're doing by hand. Which is not to say that we shouldn't think about fixing that, but that it's important to understand the problem clearly first. regards, tom lane