DISTINCT vs. GROUP BY

From: Dimi Paun
Subject: DISTINCT vs. GROUP BY
Date: ,
Msg-id: 1265751976.2513.34.camel@localhost
(view: Whole thread, Raw)
Responses: Re: DISTINCT vs. GROUP BY  (Thom Brown)
Re: DISTINCT vs. GROUP BY  (Tom Lane)
List: pgsql-performance

Tree view

DISTINCT vs. GROUP BY  (Dimi Paun, )
 Re: DISTINCT vs. GROUP BY  (Thom Brown, )
 Re: DISTINCT vs. GROUP BY  (Tom Lane, )
  Re: DISTINCT vs. GROUP BY  (Dimi Paun, )

From what I've read on the net, these should be very similar,
and should generate equivalent plans, in such cases:

SELECT DISTINCT x FROM mytable
SELECT x FROM mytable GROUP BY x

However, in my case (postgresql-server-8.1.18-2.el5_4.1),
they generated different results with quite different
execution times (73ms vs 40ms for DISTINCT and GROUP BY
respectively):

tts_server_db=# EXPLAIN ANALYZE select userdata from tagrecord where clientRmaInId = 'CPC-RMA-00110' group by userdata;
                                                                 QUERY PLAN
                    

--------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=775.68..775.69 rows=1 width=146) (actual time=40.058..40.058 rows=0 loops=1)
   ->  Bitmap Heap Scan on tagrecord  (cost=4.00..774.96 rows=286 width=146) (actual time=40.055..40.055 rows=0
loops=1)
         Recheck Cond: ((clientrmainid)::text = 'CPC-RMA-00110'::text)
         ->  Bitmap Index Scan on idx_tagdata_clientrmainid  (cost=0.00..4.00 rows=286 width=0) (actual
time=40.050..40.050rows=0 loops=1) 
               Index Cond: ((clientrmainid)::text = 'CPC-RMA-00110'::text)
 Total runtime: 40.121 ms

tts_server_db=# EXPLAIN ANALYZE select distinct userdata from tagrecord where clientRmaInId = 'CPC-RMA-00109';
                                                                    QUERY PLAN
                          

--------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=786.63..788.06 rows=1 width=146) (actual time=73.018..73.018 rows=0 loops=1)
   ->  Sort  (cost=786.63..787.34 rows=286 width=146) (actual time=73.016..73.016 rows=0 loops=1)
         Sort Key: userdata
         ->  Bitmap Heap Scan on tagrecord  (cost=4.00..774.96 rows=286 width=146) (actual time=72.940..72.940 rows=0
loops=1)
               Recheck Cond: ((clientrmainid)::text = 'CPC-RMA-00109'::text)
               ->  Bitmap Index Scan on idx_tagdata_clientrmainid  (cost=0.00..4.00 rows=286 width=0) (actual
time=72.936..72.936rows=0 loops=1) 
                     Index Cond: ((clientrmainid)::text = 'CPC-RMA-00109'::text)
 Total runtime: 73.144 ms

What gives?

--
Dimi Paun <>
Lattica, Inc.



pgsql-performance by date:

From: Jayadevan M
Date:
Subject: PostgreSQL - case studies
From: Bryce Nesbitt
Date:
Subject: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?