Re: Yet Another COUNT(*)...WHERE...question - Mailing list pgsql-general

From Rainer Bauer
Subject Re: Yet Another COUNT(*)...WHERE...question
Date
Msg-id man8c3tdchp1umuf8vu5kvuujrr1vfvmkj@4ax.com
Whole thread Raw
In response to Re: Yet Another COUNT(*)...WHERE...question  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Yet Another COUNT(*)...WHERE...question
List pgsql-general
Gregory Stark wrote:

>"Rainer Bauer" <usenet@munnin.com> writes:
>
>> Anyway, what Phoenix is trying to say is that 2 queries are required: One to
>> get the total count and one to get the tuples for the current page. I reckon
>> it would help, if the query returning the result set could also report the
>> total no. of tuples found. Somthing like
>> SELECT COUNT(*), *  FROM <table> WHERE <cond> OFFSET <o> LIMIT <l>
>>
>> Or is there a way to do that?
>
>Well anything like the above would just report l as the count.

True, but what about this:

SELECT (SELECT COUNT(*) FROM <table> WHERE <cond>), *  FROM <table> WHERE <cond> OFFSET <o> LIMIT <l>

I just tested this on a query and its about 5-10% faster than issuing both commands separately (caching effects?). I
wonderwhether there would be any chance that Postgres could detect that the "count" select and the "data" select result
inthe same query plan? 

In my example (which is included below) the hash join is executed twice.

>The only way to do it in Postgres currently is to create a temporary table.
>Then you can populate it once, then select the count from the temporary table
>in one query and the required page from it in the second query.
>
>But temporary tables in Postgres are not really designed for this. In
>particular they count as DDL so you have to grant privileges to create tables
>to the application and it has to create and delete entries in pg_class for
>every use.

Well I don't think popuplating a temporary table with possible millions of rows is faster than executing the query
twice.Remember that a performance problem only occurs if there are a lot of tuples returned. 

Rainer

======================================================================

This is the count query:
SELECT COUNT(*) FROM "tblItem" AS i INNER JOIN
(SELECT "intItemID" FROM "tblItem2Category" WHERE "intCategoryID"=88869805) AS vrtChild ON
i."intItemIDCnt"=vrtChild."intItemID"
WHERE ("intTimeEnd" < 1187273177)

This is the select analyse output:
"Aggregate  (cost=356098.46..356098.47 rows=1 width=0) (actual time=29411.570..29411.570 rows=1 loops=1)"
"  ->  Hash Join  (cost=177545.23..350137.60 rows=2384343 width=0) (actual time=17382.286..28864.851 rows=2383740
loops=1)"
"        Hash Cond: ("tblItem2Category"."intItemID" = i."intItemIDCnt")"
"        ->  Bitmap Heap Scan on "tblItem2Category"  (cost=41560.03..134660.50 rows=2561397 width=4) (actual
time=1984.006..11048.762rows=2513204 loops=1)" 
"              Recheck Cond: ("intCategoryID" = 88869805)"
"              ->  Bitmap Index Scan on ccitem2categorycategoryidix  (cost=0.00..40919.69 rows=2561397 width=0) (actual
time=1980.614..1980.614rows=2513204 loops=1)" 
"                    Index Cond: ("intCategoryID" = 88869805)"
"        ->  Hash  (cost=95316.41..95316.41 rows=2339583 width=4) (actual time=15024.827..15024.827 rows=2383832
loops=1)"
"              ->  Seq Scan on "tblItem" i  (cost=0.00..95316.41 rows=2339583 width=4) (actual time=8.634..13763.878
rows=2383832loops=1)" 
"                    Filter: ("intTimeEnd" < 1187273177)"
"Total runtime: 29411.668 ms"

======================================================================

This is the data query:
SELECT i."intItemIDCnt" FROM "tblItem" AS i INNER JOIN
(SELECT "intItemID" FROM "tblItem2Category" WHERE "intCategoryID"=88869805) AS vrtChild ON
i."intItemIDCnt"=vrtChild."intItemID"
WHERE ("intTimeEnd" < 1187273177)
ORDER BY "intFlagListingFeatured" DESC, "intTimeEnd", "intItemIDCnt" OFFSET 500 LIMIT 50

This is the select analyse output:
"Limit  (cost=733011.30..733011.42 rows=50 width=12) (actual time=37852.007..37852.058 rows=50 loops=1)"
"  ->  Sort  (cost=733010.05..738970.91 rows=2384343 width=12) (actual time=37851.581..37851.947 rows=550 loops=1)"
"        Sort Key: i."intFlagListingFeatured", i."intTimeEnd", i."intItemIDCnt""
"        ->  Hash Join  (cost=179830.23..354707.60 rows=2384343 width=12) (actual time=17091.753..29040.425
rows=2383740loops=1)" 
"              Hash Cond: ("tblItem2Category"."intItemID" = i."intItemIDCnt")"
"              ->  Bitmap Heap Scan on "tblItem2Category"  (cost=41560.03..134660.50 rows=2561397 width=4) (actual
time=1976.599..10970.394rows=2513204 loops=1)" 
"                    Recheck Cond: ("intCategoryID" = 88869805)"
"                    ->  Bitmap Index Scan on ccitem2categorycategoryidix  (cost=0.00..40919.69 rows=2561397 width=0)
(actualtime=1973.160..1973.160 rows=2513204 loops=1)" 
"                          Index Cond: ("intCategoryID" = 88869805)"
"              ->  Hash  (cost=95316.41..95316.41 rows=2339583 width=12) (actual time=14758.256..14758.256 rows=2383832
loops=1)"
"                    ->  Seq Scan on "tblItem" i  (cost=0.00..95316.41 rows=2339583 width=12) (actual
time=8.592..13373.179rows=2383832 loops=1)" 
"                          Filter: ("intTimeEnd" < 1187273177)"
"Total runtime: 38247.533 ms"

======================================================================

This is the combined count/data query:
SELECT (SELECT COUNT(*) FROM "tblItem" AS i INNER JOIN (SELECT "intItemID" FROM "tblItem2Category" WHERE
"intCategoryID"=88869805)AS vrtChild ON i."intItemIDCnt"=vrtChild."intItemID" WHERE ("intTimeEnd" < 1187273177)),
i."intItemIDCnt"FROM "tblItem" AS i INNER JOIN (SELECT "intItemID" FROM "tblItem2Category" WHERE
"intCategoryID"=88869805)AS vrtChild ON i."intItemIDCnt"=vrtChild."intItemID" WHERE ("intTimeEnd" < 1187273177) ORDER
BY"intFlagListingFeatured" DESC, "intTimeEnd", "intItemIDCnt" OFFSET 500 LIMIT 50 

"Limit  (cost=1089109.77..1089109.90 rows=50 width=12) (actual time=62547.673..62547.727 rows=50 loops=1)"
"  InitPlan"
"    ->  Aggregate  (cost=356098.46..356098.47 rows=1 width=0) (actual time=16385.927..16385.927 rows=1 loops=1)"
"          ->  Hash Join  (cost=177545.23..350137.60 rows=2384343 width=0) (actual time=4320.749..15843.759
rows=2383740loops=1)" 
"                Hash Cond: ("tblItem2Category"."intItemID" = i."intItemIDCnt")"
"                ->  Bitmap Heap Scan on "tblItem2Category"  (cost=41560.03..134660.50 rows=2561397 width=4) (actual
time=293.025..9211.673rows=2513204 loops=1)" 
"                      Recheck Cond: ("intCategoryID" = 88869805)"
"                      ->  Bitmap Index Scan on ccitem2categorycategoryidix  (cost=0.00..40919.69 rows=2561397 width=0)
(actualtime=289.602..289.602 rows=2513204 loops=1)" 
"                            Index Cond: ("intCategoryID" = 88869805)"
"                ->  Hash  (cost=95316.41..95316.41 rows=2339583 width=4) (actual time=3461.292..3461.292 rows=2383832
loops=1)"
"                      ->  Seq Scan on "tblItem" i  (cost=0.00..95316.41 rows=2339583 width=4) (actual
time=3.658..1692.979rows=2383832 loops=1)" 
"                            Filter: ("intTimeEnd" < 1187273177)"
"  ->  Sort  (cost=733010.05..738970.91 rows=2384343 width=12) (actual time=62547.242..62547.616 rows=550 loops=1)"
"        Sort Key: i."intFlagListingFeatured", i."intTimeEnd", i."intItemIDCnt""
"        ->  Hash Join  (cost=179830.23..354707.60 rows=2384343 width=12) (actual time=38625.024..51768.452
rows=2383740loops=1)" 
"              Hash Cond: ("tblItem2Category"."intItemID" = i."intItemIDCnt")"
"              ->  Bitmap Heap Scan on "tblItem2Category"  (cost=41560.03..134660.50 rows=2561397 width=4) (actual
time=1968.052..2893.092rows=2513204 loops=1)" 
"                    Recheck Cond: ("intCategoryID" = 88869805)"
"                    ->  Bitmap Index Scan on ccitem2categorycategoryidix  (cost=0.00..40919.69 rows=2561397 width=0)
(actualtime=1964.642..1964.642 rows=2513204 loops=1)" 
"                          Index Cond: ("intCategoryID" = 88869805)"
"              ->  Hash  (cost=95316.41..95316.41 rows=2339583 width=12) (actual time=19915.284..19915.284 rows=2383832
loops=1)"
"                    ->  Seq Scan on "tblItem" i  (cost=0.00..95316.41 rows=2339583 width=12) (actual
time=8.622..18369.696rows=2383832 loops=1)" 
"                          Filter: ("intTimeEnd" < 1187273177)"
"Total runtime: 63042.165 ms"

======================================================================

pgsql-general by date:

Previous
From: Rainer Bauer
Date:
Subject: Re: Yet Another COUNT(*)...WHERE...question
Next
From: Rainer Bauer
Date:
Subject: Re: Yet Another COUNT(*)...WHERE...question