Re: Slow count(*) - Mailing list pgsql-hackers

From Brian Hurt
Subject Re: Slow count(*)
Date
Msg-id 477D013B.2010909@janestcapital.com
Whole thread Raw
In response to Re: Slow count(*)  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
Kevin Grittner wrote:

> 
>If you really are doing proper maintenance, and you don't need exact
>counts, you might be able to use the approximation stored in the
>system tables:
>  
>

Also, if you're using count(*) as an existance test (common in Mysql 
code), it's better to use exists instead.  Using a table in my system, I 
see:

> proddb=> explain analyze select count(*) from instrument_listings 
> where update_date is null and delist_date is null;
>                                                             QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=18385.45..18385.46 rows=1 width=0) (actual 
> time=897.799..897.801 rows=1 loops=1)
>    ->  Seq Scan on instrument_listings  (cost=0.00..17973.43 
> rows=164807 width=0) (actual time=0.018..634.197 rows=146122 loops=1)
>          Filter: ((update_date IS NULL) AND (delist_date IS NULL))
>  Total runtime: 897.846 ms
> (4 rows)
>
> Time: 898.478 ms
> proddb=> explain analyze select true where exists(select 1 from 
> instrument_listings where update_date is null and delist_date is null 
> limit 1);
>                                                              QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------
>  Result  (cost=0.11..0.12 rows=1 width=0) (actual time=0.026..0.028 
> rows=1 loops=1)
>    One-Time Filter: $0
>    InitPlan
>      ->  Limit  (cost=0.00..0.11 rows=1 width=0) (actual 
> time=0.022..0.022 rows=1 loops=1)
>            ->  Seq Scan on instrument_listings  (cost=0.00..17973.43 
> rows=164807 width=0) (actual time=0.017..0.017 rows=1 loops=1)
>                  Filter: ((update_date IS NULL) AND (delist_date IS NULL))
>  Total runtime: 0.063 ms
> (7 rows)
>
> Time: 0.768 ms
> proddb=>


The exists version is over 1000x faster (and the only reason it's not 
more studly is that I'm working on the table as we speak, so it's all in 
memory).

As a general rule in postgres, don't do count(*) unless you really mean it.

Brian



pgsql-hackers by date:

Previous
From: "Usama Dar"
Date:
Subject: Re: Autovacuum & Table List Ordering
Next
From: Peter Eisentraut
Date:
Subject: Re: Selectivity estimation for equality and range queries