From what I've read on the net, these should be very similar,
and should generate equivalent plans, in such cases:
SELECT DISTINCT x FROM mytable
SELECT x FROM mytable GROUP BY x
However, in my case (postgresql-server-8.1.18-2.el5_4.1),
they generated different results with quite different
execution times (73ms vs 40ms for DISTINCT and GROUP BY
respectively):
tts_server_db=# EXPLAIN ANALYZE select userdata from tagrecord where clientRmaInId = 'CPC-RMA-00110' group by userdata;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=775.68..775.69 rows=1 width=146) (actual time=40.058..40.058 rows=0 loops=1)
-> Bitmap Heap Scan on tagrecord (cost=4.00..774.96 rows=286 width=146) (actual time=40.055..40.055 rows=0
loops=1)
Recheck Cond: ((clientrmainid)::text = 'CPC-RMA-00110'::text)
-> Bitmap Index Scan on idx_tagdata_clientrmainid (cost=0.00..4.00 rows=286 width=0) (actual
time=40.050..40.050rows=0 loops=1)
Index Cond: ((clientrmainid)::text = 'CPC-RMA-00110'::text)
Total runtime: 40.121 ms
tts_server_db=# EXPLAIN ANALYZE select distinct userdata from tagrecord where clientRmaInId = 'CPC-RMA-00109';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=786.63..788.06 rows=1 width=146) (actual time=73.018..73.018 rows=0 loops=1)
-> Sort (cost=786.63..787.34 rows=286 width=146) (actual time=73.016..73.016 rows=0 loops=1)
Sort Key: userdata
-> Bitmap Heap Scan on tagrecord (cost=4.00..774.96 rows=286 width=146) (actual time=72.940..72.940 rows=0
loops=1)
Recheck Cond: ((clientrmainid)::text = 'CPC-RMA-00109'::text)
-> Bitmap Index Scan on idx_tagdata_clientrmainid (cost=0.00..4.00 rows=286 width=0) (actual
time=72.936..72.936rows=0 loops=1)
Index Cond: ((clientrmainid)::text = 'CPC-RMA-00109'::text)
Total runtime: 73.144 ms
What gives?
--
Dimi Paun <dimi@lattica.com>
Lattica, Inc.