Re: DISTINCT vs. GROUP BY

From: Dimi Paun
Subject: Re: DISTINCT vs. GROUP BY
Date: ,
Msg-id: 1265766227.2513.41.camel@localhost
(view: Whole thread, Raw)
In response to: Re: DISTINCT vs. GROUP BY  (Tom Lane)
List: pgsql-performance

Tree view

DISTINCT vs. GROUP BY  (Dimi Paun, )
 Re: DISTINCT vs. GROUP BY  (Thom Brown, )
 Re: DISTINCT vs. GROUP BY  (Tom Lane, )
  Re: DISTINCT vs. GROUP BY  (Dimi Paun, )

On Tue, 2010-02-09 at 17:38 -0500, Tom Lane wrote:
> The results certainly ought to be the same (although perhaps not with
> the same ordering) --- if they aren't, please provide a reproducible
> test case.

The results are the same, this is not a problem.

> As for efficiency, though, 8.1 didn't understand how to use hash
> aggregation for DISTINCT.  Less-obsolete versions do know how to do
> that.

Indeed, this seem to be the issue:

tts_server_db=# EXPLAIN ANALYZE select userdata from tagrecord where clientRmaInId = 'CPC-RMA-00110' group by userdata;
                                                                   QUERY PLAN
                        

------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=253.34..253.50 rows=16 width=15) (actual time=0.094..0.094 rows=0 loops=1)
   ->  Index Scan using idx_tagdata_clientrmainid on tagrecord  (cost=0.00..252.85 rows=195 width=15) (actual
time=0.091..0.091rows=0 loops=1) 
         Index Cond: ((clientrmainid)::text = 'CPC-RMA-00110'::text)
 Total runtime: 0.146 ms
(4 rows)

tts_server_db=# EXPLAIN ANALYZE select distinct userdata from tagrecord where clientRmaInId = 'CPC-RMA-00110';
                                                                      QUERY PLAN
                              

------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=260.27..261.25 rows=16 width=15) (actual time=0.115..0.115 rows=0 loops=1)
   ->  Sort  (cost=260.27..260.76 rows=195 width=15) (actual time=0.113..0.113 rows=0 loops=1)
         Sort Key: userdata
         ->  Index Scan using idx_tagdata_clientrmainid on tagrecord  (cost=0.00..252.85 rows=195 width=15) (actual
time=0.105..0.105rows=0 loops=1) 
               Index Cond: ((clientrmainid)::text = 'CPC-RMA-00110'::text)
 Total runtime: 0.151 ms
(6 rows)

For now we are stuck with 8.1, so the easiest fix for us is to use GROUP BY.
Since this is fixed in later versions, I guess there's not much to see here... :)

Thanks for the quick reply!

--
Dimi Paun <>
Lattica, Inc.



pgsql-performance by date:

From: Scott Marlowe
Date:
Subject: Re: Linux I/O tuning: CFQ vs. deadline
From: Bryce Nesbitt
Date:
Subject: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?