This is on PostgreSQL 8.3.1 (I also tried 8.3.3).
It looks like the Sort node is producing more rows than its input: the
HashAggregate below it returns 10001 rows (about 10k), but the Sort
reports 10000000 (10M).
Am I just misinterpreting this output? Even the planner estimates the
same row count (rows=2641) for both the HashAggregate and the Sort, so
it expects their outputs to be identical.
Regards,
Jeff Davis
=> explain analyze select
-> a, b, c_max
-> from
-> (select a, max(c) as c_max from t group by a) dummy1
-> natural join
-> (select a, b from t) dummy2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Merge Join (cost=199211.12..660979.37 rows=9998773 width=12) (actual time=8887.540..27866.804 rows=10000000 loops=1)
   Merge Cond: (public.t.a = public.t.a)
   -> Index Scan using t_a_idx on t (cost=0.00..286789.72 rows=9998773 width=8) (actual time=19.784..5676.407 rows=10000000 loops=1)
   -> Sort (cost=199211.12..199217.72 rows=2641 width=8) (actual time=8867.749..11692.015 rows=10000000 loops=1)
        Sort Key: public.t.a
        Sort Method: quicksort Memory: 647kB
        -> HashAggregate (cost=199001.60..199034.61 rows=2641 width=8) (actual time=8854.848..8859.306 rows=10001 loops=1)
             -> Seq Scan on t (cost=0.00..149007.73 rows=9998773 width=8) (actual time=0.007..3325.292 rows=10000000 loops=1)
Total runtime: 30355.218 ms
(9 rows)