Slow GROUP BY query - Mailing list pgsql-sql

From Stuart Brooks
Subject Slow GROUP BY query
Date
Msg-id 479F3B93.9070907@cat.co.za
Whole thread Raw
List pgsql-sql
I have a very simple table set:

Transactions:transaction_key PRIMARY KEYclient TEXTtime TIMESTAMP

LineItemstransaction_key INTamount INT

A query to print the contents of transactions with a sum of the line 
item amounts provides a very suboptimal result. The problem seems to be 
the GROUP BY clause as it doesn't use the primary index. Rewriting the 
query to only group on the transaction_key and returning the max of the 
other transaction fields results in a query of <1ms. (see queries below)

Can anyone shed any light here, I would have expected the queries to 
take roughly the same time?

Out of interest, since we are grouping by transaction_key which is 
unique, surely the other Transaction fields in the group by could be 
ignored by the planner?

ThanksStuart

(running postgresql 8.2.5 on NetBSD 3)

>> Slow query

EXPLAIN 
SELECT t.transaction_key,t.cashier,t.time,SUM(l.amount) 
FROM Transactions t JOIN LineItems l USING (transaction_key) 
GROUP BY t.transaction_key,t.cashier,t.time 
ORDER BY t.transaction_key;                                       QUERY PLAN
-------------------------------------------------------------------------------------------Sort  (cost=449.16..454.16
rows=2000width=32)  Sort Key: t.transaction_key  ->  HashAggregate  (cost=314.50..339.50 rows=2000 width=32)        ->
HashJoin  (cost=66.00..262.07 rows=5243 width=32)              Hash Cond: (l.transaction_key = t.transaction_key)
      ->  Seq Scan on lineitems l  (cost=0.00..117.43 rows=5243 width=16)              ->  Hash  (cost=41.00..41.00
rows=2000width=24)                    ->  Seq Scan on transactions t  (cost=0.00..41.00 rows=2000 width=24)
 
(8 rows)


>> Fast query

EXPLAIN 
SELECT t.transaction_key,MAX(t.cashier),MAX(t.time),SUM(l.amount) 
FROM Transactions t JOIN LineItems l USING (transaction_key) 
GROUP BY t.transaction_key 
ORDER BY t.transaction_key;
                                                  QUERY PLAN

-----------------------------------------------------------------------------------------------------------------GroupAggregate
(cost=0.00..459.11 rows=2000 width=32)  ->  Merge Join  (cost=0.00..371.68 rows=5243 width=32)        Merge Cond:
(t.transaction_key= l.transaction_key)        ->  Index Scan using transactions_pkey on transactions t
(cost=0.00..86.25rows=2000 width=24)        ->  Index Scan using lineitems_transaction_index on lineitems l
(cost=0.00..214.90rows=5243 width=16)
 
(5 rows)





pgsql-sql by date:

Previous
From: Andreas Joseph Krogh
Date:
Subject: Re: Slow Query problem
Next
From: "Dawid Kuroczko"
Date:
Subject: JOINing SET returning function.