Re: count(*) and bad design was: Experiences with extensibility - Mailing list pgsql-general

From Ivan Sergio Borgonovo
Subject Re: count(*) and bad design was: Experiences with extensibility
Date
Msg-id 20080109120409.38c49072@webthatworks.it
Whole thread Raw
In response to Re: count(*) and bad design was: Experiences with extensibility  ("Dann Corbit" <DCorbit@connx.com>)
Responses Re: count(*) and bad design was: Experiences with extensibility  ("Harald Armin Massa" <haraldarminmassa@gmail.com>)
List pgsql-general
On Wed, 9 Jan 2008 01:39:34 -0800
"Dann Corbit" <DCorbit@connx.com> wrote:

> > On Wed, 09 Jan 2008 00:06:45 -0800
> > "Joshua D. Drake" <jd@commandprompt.com> wrote:

> > > Granted there are scenarios where others are FASTER (SELECT
> > > COUNT(*)) but I find that if you are doing those items, you
> > > normally have a weird design anyway.

> > > Sincerely,

> > Sincerely, would you make an example of such a bad design?

> A program that estimates cardinality by doing SELECT COUNT(*) is a
> bad design.  Assuming you have the wherewithal to vacuum your
> tables (or have autovacuum enabled) a query against the system
> tables will be a much better estimate of cardinality.
>
> Now (some may argue) what if we want an _EXACT_ value for
> COUNT(*)?  We had better ask ourselves (in that circumstance) "Am I
> willing to lock the entire table and scan it?" because that is what
> will be necessary to get a truly exact value.  Otherwise, you can
> get totals that are wildly off-base if someone is doing a bulk
> import or deleting a large number of records.

Please forgive my naiveness in this field but what does it mean an
"exact count" and what other DB means with "an exact count" and how
other DB deal with it?

How "count" is defined in the SQL standard?

Is there a real situation then where you really need the "exact"
count?

Am I right saying that:

select count(*) from ...
-- here count may already be different

and that:
select for update count(*)

could be very expensive? Or what would it mean to do a
select for update count(*) ...?


> I think this should be a FAQ because it is a (F)requently (A)sked
> (Q)uestion.

After reading your email I think the real problem is not how to do
otherwise but understand what count(*) really mean and when and if it
is really useful and when it can be avoided.

I'd write in the FAQ something in the line of:

- What count(*) really does?
- When it can be avoided?
- When it can't be avoided?

In my everyday use of count(*), after your email I can hardly spot a
place where I need an exact count.
But to better understand and being convinced that using count(*) is
bad design I think last question could help a lot.

How does count(*) with a where clause perform generally on postgresql
compared to other DB?
I'd expect it perform as good or better than other DB since now the
bottleneck should be how efficiently it can filter records... but
still a count(*) with a where clause will incur in the same problem
of what "exact" means.

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


pgsql-general by date:

Previous
From: "Harald Armin Massa"
Date:
Subject: Re: data transfer/migrate from win to linux
Next
From: Ashish Karalkar
Date:
Subject: Insert waiting for update?