I have a very simple table set:
Transactions:transaction_key PRIMARY KEYclient TEXTtime TIMESTAMP
LineItemstransaction_key INTamount INT
A query to print the contents of transactions with a sum of the line
item amounts provides a very suboptimal result. The problem seems to be
the GROUP BY clause as it doesn't use the primary index. Rewriting the
query to only group on the transaction_key and returning the max of the
other transaction fields results in a query of <1ms. (see queries below)
Can anyone shed any light here, I would have expected the queries to
take roughly the same time?
Out of interest, since we are grouping by transaction_key which is
unique, surely the other Transaction fields in the group by could be
ignored by the planner?
ThanksStuart
(running postgresql 8.2.5 on NetBSD 3)
>> Slow query
EXPLAIN
SELECT t.transaction_key,t.cashier,t.time,SUM(l.amount)
FROM Transactions t JOIN LineItems l USING (transaction_key)
GROUP BY t.transaction_key,t.cashier,t.time
ORDER BY t.transaction_key; QUERY PLAN
-------------------------------------------------------------------------------------------Sort (cost=449.16..454.16
rows=2000width=32) Sort Key: t.transaction_key -> HashAggregate (cost=314.50..339.50 rows=2000 width=32) ->
HashJoin (cost=66.00..262.07 rows=5243 width=32) Hash Cond: (l.transaction_key = t.transaction_key)
-> Seq Scan on lineitems l (cost=0.00..117.43 rows=5243 width=16) -> Hash (cost=41.00..41.00
rows=2000width=24) -> Seq Scan on transactions t (cost=0.00..41.00 rows=2000 width=24)
(8 rows)
>> Fast query
EXPLAIN
SELECT t.transaction_key,MAX(t.cashier),MAX(t.time),SUM(l.amount)
FROM Transactions t JOIN LineItems l USING (transaction_key)
GROUP BY t.transaction_key
ORDER BY t.transaction_key;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------GroupAggregate
(cost=0.00..459.11 rows=2000 width=32) -> Merge Join (cost=0.00..371.68 rows=5243 width=32) Merge Cond:
(t.transaction_key= l.transaction_key) -> Index Scan using transactions_pkey on transactions t
(cost=0.00..86.25rows=2000 width=24) -> Index Scan using lineitems_transaction_index on lineitems l
(cost=0.00..214.90rows=5243 width=16)
(5 rows)