Thread: Slow count(*)

Slow count(*)

From

"Abraham, Danny"

Date:

02 January 2008, 14:29:37

Hi,

We are looking for a patch that will help us  count using the indexes.

Our product is about 20 times slower on Postgres compared to MS SQL
Server.

Any ideas?

Danny Abraham
BMC Software
CTM&D Business Unit
972-52-4286-513
danny_abraham@bmc.com

Re: Slow count(*)

From

"Pavel Stehule"

Date:

02 January 2008, 14:50:40

On 02/01/2008, Abraham, Danny <danny_abraham@bmc.com> wrote:
> Hi,
>
> We are looking for a patch that will help us  count using the indexes.
>
> Our product is about 20 times slower on Postgres compared to MS SQL
> Server.
>
> Any ideas?

There isn't any similar patch and will not be.

Use materialized views or similar techniques.

Are you sure, so all your problems are only in SELECT COUNT(*)?

Check, please, all slow queries.

Regards
Pavel Stehule
>
> Danny Abraham
> BMC Software
> CTM&D Business Unit
> 972-52-4286-513
> danny_abraham@bmc.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

Re: Slow count(*)

From

"A. Kretschmer"

Date:

02 January 2008, 14:51:01

am  Wed, dem 02.01.2008, um  9:29:24 -0600 mailte Abraham, Danny folgendes:
> Hi,
> 
> We are looking for a patch that will help us  count using the indexes.
> 
> Our product is about 20 times slower on Postgres compared to MS SQL
> Server.
> 
> Any ideas?

Please show us your SQL and the execution plan (EXPLAIN or, better,
EXPLAIN ANALYSE) and read our FAQ.


Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: Slow count(*)

From

Andrew Sullivan

Date:

02 January 2008, 15:26:01

On Wed, Jan 02, 2008 at 09:29:24AM -0600, Abraham, Danny wrote:
> We are looking for a patch that will help us  count using the indexes.

Is this for 
SELECT count(*) FROM table;
or 
SELECT count(1) FROM table WHERE. . .
The latter _will_ use an index, if the index is correct, the statistics are
right, and the index selectivity is worth the cost of reading the index. 
The former will not use an index at all, because the answer depends on
visibility, and you can't know that without reading the table.  If you're
counting how many rows are in the table (for, for instance, display
purposes), you probably need to do something else.

> Our product is about 20 times slower on Postgres compared to MS SQL
> Server.
> 
> Any ideas?

Not without the queries, the EXPLAIN ANALYZE plans, and some information
about the database.

A

Re: Slow count(*)

From

Simon Riggs

Date:

02 January 2008, 17:12:33

On Wed, 2008-01-02 at 09:29 -0600, Abraham, Danny wrote:

> Our product is about 20 times slower on Postgres compared to MS SQL
> Server.

If you want to have a cross-platform product then you must consider how
to access multiple systems both accurately and quickly. Not much point
catering for the different SQL dialects and then ignoring the
performance differences. All products are not the same; you will find
many advantages with Postgres.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com

Re: Slow count(*)

From

"Kevin Grittner"

Date:

03 January 2008, 14:16:16

>>> On Wed, Jan 2, 2008 at  9:29 AM, in message
<BE67D1149BBD5746984545ED91F702E04DCC7A@hou-ex-02.adprod.bmc.com>, "Abraham,
Danny" <danny_abraham@bmc.com> wrote:

> We are looking for a patch that will help us  count using the indexes.
As others have mentioned, that's not currently possible for a count
of all rows in a table, because there can be many versions of a row
under PostgreSQL's MVCC techniques, and the row must currently be
visited to determine whether it is visible in the context of your
database transaction.
> Our product is about 20 times slower on Postgres compared to MS SQL
> Server.
>
> Any ideas?
Again, it is best to show a particular example of a problem, because
you might be making a bad assumption about the cause of your slowness.
If you don't understand MVCC and the need for maintenance, you might
have table bloat which could be the issue.  Also, always give the
exact version of PostgreSQL, the OS, and a description of the
hardware.
If you really are doing proper maintenance, and you don't need exact
counts, you might be able to use the approximation stored in the
system tables:
cc=> \timing
Timing is on.
cc=> select count(*) from "Party";count
--------135093
(1 row)

Time: 48.626 ms
cc=> select reltuples from pg_class where relname = 'Party';reltuples
-----------   135091
(1 row)

Time: 9.799 ms
-Kevin

Re: Slow count(*)

From

Brian Hurt

Date:

03 January 2008, 14:37:39

Kevin Grittner wrote:

> 
>If you really are doing proper maintenance, and you don't need exact
>counts, you might be able to use the approximation stored in the
>system tables:
>  
>

Also, if you're using count(*) as an existance test (common in Mysql 
code), it's better to use exists instead.  Using a table in my system, I 
see:

> proddb=> explain analyze select count(*) from instrument_listings 
> where update_date is null and delist_date is null;
>                                                             QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=18385.45..18385.46 rows=1 width=0) (actual 
> time=897.799..897.801 rows=1 loops=1)
>    ->  Seq Scan on instrument_listings  (cost=0.00..17973.43 
> rows=164807 width=0) (actual time=0.018..634.197 rows=146122 loops=1)
>          Filter: ((update_date IS NULL) AND (delist_date IS NULL))
>  Total runtime: 897.846 ms
> (4 rows)
>
> Time: 898.478 ms
> proddb=> explain analyze select true where exists(select 1 from 
> instrument_listings where update_date is null and delist_date is null 
> limit 1);
>                                                              QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------
>  Result  (cost=0.11..0.12 rows=1 width=0) (actual time=0.026..0.028 
> rows=1 loops=1)
>    One-Time Filter: $0
>    InitPlan
>      ->  Limit  (cost=0.00..0.11 rows=1 width=0) (actual 
> time=0.022..0.022 rows=1 loops=1)
>            ->  Seq Scan on instrument_listings  (cost=0.00..17973.43 
> rows=164807 width=0) (actual time=0.017..0.017 rows=1 loops=1)
>                  Filter: ((update_date IS NULL) AND (delist_date IS NULL))
>  Total runtime: 0.063 ms
> (7 rows)
>
> Time: 0.768 ms
> proddb=>


The exists version is over 1000x faster (and the only reason it's not 
more studly is that I'm working on the table as we speak, so it's all in 
memory).

As a general rule in postgres, don't do count(*) unless you really mean it.

Brian