Re: Improving count(*) - Mailing list pgsql-hackers

From Gavin Sherry
Subject Re: Improving count(*)
Date
Msg-id Pine.LNX.4.58.0511181149070.9614@linuxworld.com.au
Whole thread Raw
In response to Re: Improving count(*)  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Improving count(*)  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Fri, 18 Nov 2005, Simon Riggs wrote:

> >From here, another proposal. We have a GUC called count_uses_estimate
> that is set to off by default. If set to true, then a count(*) will use
> the planner logic to estimate number of rows in the table and return
> that as the answer, rather than actually count the row. Unless analyze
> statistics are not available, in which case it does the real count.

I'm finishing off a tablesample patch a grad student on #postgresql was
working on.

template1=# select count(*)*100 from a tablesample system(1) repeatable
(2);?column?
---------- 8371100
(1 row)

Time: 6366.757 ms
template1=# select count(*)*50 from a tablesample system(2) repeatable
(11);?column?
---------- 8453550
(1 row)

Time: 10521.871 ms
template1=# select count(*)*10 from a tablesample system(10) repeatable
(3);?column?
---------- 8314350
(1 row)

Time: 28744.498 ms
template1=# select count(*) from a; count
---------8388608
(1 row)

Time: 33897.857 ms


Seems like a better solution. I can finish the patch pretty soon. I need
to contact the original author, who has disappeared, but I'll send it over
to you.

Gavin


pgsql-hackers by date:

Previous
From: Joe Conway
Date:
Subject: Re: Some array semantics issues
Next
From: mark@mark.mielke.cc
Date:
Subject: Re: Improving count(*)