Thread: count is ten times faster

count is ten times faster

From
"Sabin Coanda"
Date:
Hi there,

I have a simple aggregate query: SELECT count("PK_ID") AS "b1" FROM "tbA"
WHERE "PK_ID" > "f1"( 'c1' ), which has the following execution plan:
"Aggregate  (cost=2156915.42..2156915.43 rows=1 width=4)"
"  ->  Seq Scan on "tbA"  (cost=0.00..2137634.36 rows=7712423 width=4)"
"        Filter: ("PK_ID" > "f1"('c1'::character varying))"

I tried to get the same result with the following query:
SELECT (
     SELECT count("PK_ID") AS "b1" FROM "tbA" ) -
     (
          SELECT count("PK_ID") AS "b1"
          FROM "tbA"
          WHERE "PK_ID" <= "f1"( 'c1' )
     )
with the execution plan:
"Result  (cost=248952.95..248952.96 rows=1 width=0)"
"  InitPlan"
"    ->  Aggregate  (cost=184772.11..184772.12 rows=1 width=4)"
"          ->  Seq Scan on "tbA"  (cost=0.00..165243.49 rows=7811449
width=4)"
"    ->  Aggregate  (cost=64180.81..64180.82 rows=1 width=4)"
"          ->  Index Scan using "tbA_pkey" on "tbA"  (cost=0.25..63933.24
rows=99026 width=4)"
"                Index Cond: ("PK_ID" <= "f1"('c1'::character varying))"

How do you explain the cost is about ten times lower in the 2nd query than
the first ?

TIA,
Sabin



Re: count is ten times faster

From
"Pierre C"
Date:
> How do you explain the cost is about ten times lower in the 2nd query
> than the first ?

Function call cost ?

Can you EXPLAIN ANALYZE ?

Re: count is ten times faster

From
"Kevin Grittner"
Date:
"Sabin Coanda" <coanda@hotpop.com> wrote:

> How do you explain the cost is about ten times lower in the 2nd
> query than the first ?

To elaborate on Pierre's answer:

In the first query, you scan the entire table and execute the "f1"
function on each row.  In the second query you pass the entire table
just counting visible tuples and then run the "f1" function once,
and use the resulting value to scan an index on which it expects to
find one row.

It estimates the cost of running the "f1" function 7.7 million times
as being roughly ten times the cost of scanning the table.  Now,
this is all just estimates; if they don't really reflect the
relative cost of *running* the two queries, you might want to adjust
costs factors -- perhaps the estimated cost of the "f1" function.

-Kevin