Re: count(*) slow on large tables - Mailing list pgsql-performance

From Christopher Browne
Subject Re: count(*) slow on large tables
Date
Msg-id m3vfr7f4z1.fsf@wolfe.cbbrowne.com
Whole thread Raw
In response to count(*) slow on large tables  (Dror Matalon <dror@zapatec.com>)
Responses Re: count(*) slow on large tables  (Dror Matalon <dror@zapatec.com>)
Re: count(*) slow on large tables  (Jeff <threshar@torgo.978.org>)
List pgsql-performance
The world rejoiced as dror@zapatec.com (Dror Matalon) wrote:
> I don't have an opinion on how hard it would be to implement the
> tracking in the indexes, but "select count(*) from some table" is, in my
> experience, a query that people tend to run quite often.
> One of the databases that I've used, I believe it was Informix, had that
> info cached so that it always new how many rows there were in any
> table. It was quite useful.

I can't imagine why the raw number of tuples in a relation would be
expected to necessarily be terribly useful.

I'm involved with managing Internet domains, and it's only when people
are being pretty clueless that anyone imagines that "select count(*)
from domains;" would be of any use to anyone.  There are enough "test
domains" and "inactive domains" and other such things that the raw
number of "things in the table" aren't really of much use.

- I _do_ care how many pages a table occupies, to some extent, as that
determines whether it will fit in my disk space or not, but that's not
COUNT(*).

- I might care about auditing the exact numbers of records in order to
be assured that a data conversion process was done correctly.  But in
that case, I want to do something a whole *lot* more detailed than
mere COUNT(*).

I'm playing "devil's advocate" here, to some extent.  But
realistically, there is good reason to be skeptical of the merits of
using SELECT COUNT(*) FROM TABLE for much of anything.

Furthermore, the relation that you query mightn't be a physical
"table."  It might be a more virtual VIEW, and if that's the case,
bets are even MORE off.  If you go with the common dictum of "good
design" that users don't directly access tables, but go through VIEWs,
users may have no way to get at SELECT COUNT(*) FROM TABLE.
--
output = reverse("ac.notelrac.teneerf" "@" "454aa")
http://www.ntlug.org/~cbbrowne/finances.html
Rules  of  the  Evil  Overlord  #74.   "When  I  create  a  multimedia
presentation of my plan designed  so that my five-year-old advisor can
easily  understand the  details, I  will not  label the  disk "Project
Overlord" and leave it lying on top of my desk."
<http://www.eviloverlord.com/>

pgsql-performance by date:

Previous
From: "CN"
Date:
Subject: Is This My Speed Limit?
Next
From: Dror Matalon
Date:
Subject: Re: count(*) slow on large tables