Re: Select max(foo) and select count(*) optimization - Mailing list pgsql-performance

From Tom Lane
Subject Re: Select max(foo) and select count(*) optimization
Date
Msg-id 2698.1073431195@sss.pgh.pa.us
Whole thread Raw
In response to Re: Select max(foo) and select count(*) optimization  ("D'Arcy J.M. Cain" <darcy@druid.net>)
List pgsql-performance
"D'Arcy J.M. Cain" <darcy@druid.net> writes:
> In any case, if I have to vacuum a 20,000,000 row table to get an accurate
> count then I may as well run count(*) on it.
> (*): Actually I only analyze but I understand that that should be sufficient.

ANALYZE without VACUUM will deliver a not-very-accurate estimate, since it
only looks at a sample of the table's pages and doesn't grovel through
every one.  Any of the VACUUM variants, on the other hand, will set
pg_class.reltuples reasonably accurately (as the number of rows actually
seen and left undeleted by the VACUUM pass).

There are pathological cases where ANALYZE's estimate of the overall row
count can be horribly bad --- mainly, when the early pages of the table
are empty or nearly so, but there are well-filled pages out near the
end.  I have a TODO item to try to make ANALYZE less prone to getting
fooled that way...

            regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: PgAdmin startup query VERY slow
Next
From: "Eric Jain"
Date:
Subject: Index creation