Thread: count() in 9.2

count() in 9.2

From

Steve Horn

Date:

17 October 2012, 15:00:48

One of the reasons that my team could not take advantage of PostgreSQL was due to the poor performance of count(*) aggregate function.

I visited the "Slow Counting" page on the wiki (http://wiki.postgresql.org/wiki/Slow_Counting) and it states:

"Note that the following article only applies to versions of PostgreSQL prior to 9.2. Index-only scans are now implemented."

So does this mean that count() is no longer slow? If not, how much faster is it? And what does it mean when it says "Index-only scans are now implemented."?

Thanks for your help!

--
Steve Horn

Re: count() in 9.2

From

Simon Riggs

Date:

17 October 2012, 15:42:59

On 17 October 2012 16:00, Steve Horn <steve@stevehorn.cc> wrote:
> One of the reasons that my team could not take advantage of PostgreSQL was
> due to the poor performance of count(*) aggregate function.
>
> I visited the "Slow Counting" page on the wiki
> (http://wiki.postgresql.org/wiki/Slow_Counting) and it states:
> "Note that the following article only applies to versions of PostgreSQL
> prior to 9.2. Index-only scans are now implemented."
>
> So does this mean that count() is no longer slow? If not, how much faster is
> it? And what does it mean when it says "Index-only scans are now
> implemented."?

I think someone's been drinking the cool-aid...

YMMV but its possible in some cases to get 2-3 times previous
performance on larger, relatively static tables. Not much effect on
smaller, regularly updated tables. Given that most people doing
count() run them either with a WHERE clause or on tables that change
enough you need to re-execute the query, you might not see much in
real usage.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: count() in 9.2

From

Steve Horn

Date:

17 October 2012, 16:22:25

Yeah I should have mentioned...adding a where clause is always my use case when using count().

So to clarify my question: does anyone have any experience with count() (with a WHERE clause) in Postgres 9.2?

On Oct 17, 2012 11:42 AM, "Simon Riggs" <simon@2ndquadrant.com> wrote:

On 17 October 2012 16:00, Steve Horn <steve@stevehorn.cc> wrote:
> One of the reasons that my team could not take advantage of PostgreSQL was
> due to the poor performance of count(*) aggregate function.
>
> I visited the "Slow Counting" page on the wiki
> (http://wiki.postgresql.org/wiki/Slow_Counting) and it states:
> "Note that the following article only applies to versions of PostgreSQL
> prior to 9.2. Index-only scans are now implemented."
>
> So does this mean that count() is no longer slow? If not, how much faster is
> it? And what does it mean when it says "Index-only scans are now
> implemented."?

I think someone's been drinking the cool-aid...

YMMV but its possible in some cases to get 2-3 times previous
performance on larger, relatively static tables. Not much effect on
smaller, regularly updated tables. Given that most people doing
count() run them either with a WHERE clause or on tables that change
enough you need to re-execute the query, you might not see much in
real usage.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: count() in 9.2

From

Simon Riggs

Date:

17 October 2012, 16:57:34

On 17 October 2012 17:22, Steve Horn <steve@stevehorn.cc> wrote:
> Yeah I should have mentioned...adding a where clause is always my use case
> when using count().
>
> So to clarify my question: does anyone have any experience with count()
> (with a WHERE clause) in Postgres 9.2?

If the WHERE clause looks only at columns in one index it might be
faster, otherwise same.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: count() in 9.2

From

Thomas Kellerer

Date:

17 October 2012, 17:48:46

Steve Horn wrote on 17.10.2012 17:00:
> One of the reasons that my team could not take advantage of
> PostgreSQL was due to the poor performance of count(*) aggregate
> function.

I wonder what kind of application makes a slow count(*) on a table a show stopper.

I have been developing DB centric applications for over 20 years now and that never has been any issue.

And which DBMS are you currently using?
I don't know any transactional DBMS that will do count all the rows in a table *really* fast...

Re: count() in 9.2

From

Simon Riggs

Date:

17 October 2012, 18:38:17

On 17 October 2012 18:48, Thomas Kellerer <spam_eater@gmx.net> wrote:
> Steve Horn wrote on 17.10.2012 17:00:
>
>> One of the reasons that my team could not take advantage of
>> PostgreSQL was due to the poor performance of count(*) aggregate
>> function.
>
> I wonder what kind of application makes a slow count(*) on a table a show
> stopper.

That's a much better answer than my original reply.

There is only one answer, in any DBMS: Prepare special datastructures
that allow you to do that quickly, which is also an option in
Postgres.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: count() in 9.2

From

Steve Horn

Date:

17 October 2012, 18:42:07

If our application could not do a count quickly there is no reason to show up for work. Our application provides counts of lists to customers who then use that information to purchase records of data from our lists. There are on average 80 columns that the user can apply "WHERE, AND, or OR" to narrow their count (list).

We are using Microsoft Sql Server currently. It provides counts in an acceptable amount of time with 160 million+ rowcount tables with dozens of AND clauses applied.

On Wed, Oct 17, 2012 at 1:48 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Steve Horn wrote on 17.10.2012 17:00:

One of the reasons that my team could not take advantage of
PostgreSQL was due to the poor performance of count(*) aggregate
function.

I wonder what kind of application makes a slow count(*) on a table a show stopper.

I have been developing DB centric applications for over 20 years now and that never has been any issue.

And which DBMS are you currently using?
I don't know any transactional DBMS that will do count all the rows in a table *really* fast...

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

--
Steve Horn

Re: count() in 9.2

From

Thomas Kellerer

Date:

17 October 2012, 18:56:34

Steve Horn wrote on 17.10.2012 20:41:
> If our application could not do a count quickly there is no reason to
> show up for work. Our application provides counts of lists to
> customers who then use that information to purchase records of data
> from our lists. There are on average 80 columns that the user can
> apply "WHERE, AND, or OR" to narrow their count (list).
>
> We are using Microsoft Sql Server currently. It provides counts in an
> acceptable amount of time with 160 million+ rowcount tables with
> dozens of AND clauses applied.
>

Now that is something completely different than what the wiki page regarding "Slow counting" talks about.

That page is *only* about counting *all* rows in a table, so a "select count(*) from foobar" without any where
condition.
I'm pretty sure SQL Server won't be that much faster calculating the total row count of a table than PostgreSQL.

When you talk about "dozens of AND clauses" then this is a completely different problem.
It can be fast in PostgreSQL as well provided you do have proper indexing applied.


If you have a problem with that, I'm sure this list can help you tune the query according to your needs if you show us
thequery that is too slow. You might want to read this page: http://wiki.postgresql.org/wiki/SlowQueryQuestions to make
sureyou include all the needed information when asking such a question. 


Thomas

Re: count() in 9.2

From

Steve Horn

Date:

17 October 2012, 19:03:25

Thanks for the replies. I think I need to re-evaluate based on what you are saying. From earlier tests I was finding that Postgres was slow to count records with any WHERE clause. Maybe my perception was incorrect because my samples were not controlled.

Thanks again!

On Wed, Oct 17, 2012 at 2:56 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Steve Horn wrote on 17.10.2012 20:41:

If our application could not do a count quickly there is no reason to
show up for work. Our application provides counts of lists to
customers who then use that information to purchase records of data
from our lists. There are on average 80 columns that the user can
apply "WHERE, AND, or OR" to narrow their count (list).

We are using Microsoft Sql Server currently. It provides counts in an
acceptable amount of time with 160 million+ rowcount tables with
dozens of AND clauses applied.

Now that is something completely different than what the wiki page regarding "Slow counting" talks about.

That page is *only* about counting *all* rows in a table, so a "select count(*) from foobar" without any where condition.
I'm pretty sure SQL Server won't be that much faster calculating the total row count of a table than PostgreSQL.

When you talk about "dozens of AND clauses" then this is a completely different problem.
It can be fast in PostgreSQL as well provided you do have proper indexing applied.

If you have a problem with that, I'm sure this list can help you tune the query according to your needs if you show us the query that is too slow. You might want to read this page: http://wiki.postgresql.org/wiki/SlowQueryQuestions to make sure you include all the needed information when asking such a question.

Thomas

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

--
Steve Horn

Re: count() in 9.2

From

Frank Lanitz

Date:

18 October 2012, 10:37:47

Am 2012-10-17 21:03, schrieb Steve Horn:
> Thanks for the replies. I think I need to re-evaluate based on what
> you are
> saying. From earlier tests I was finding that Postgres was slow to
> count
> records with any WHERE clause. Maybe my perception was incorrect
> because my
> samples were not controlled.

I really think its the first step: Define a good set of data and a
couple of
typical queries using the data to compare systems. Also keep care to
bring up
valid indexing and -- I assuming you already did that -- a good layout
of data.
E.g. splitting tables if you have 0 either 1 conditions.

Cheers,
Frank