Re: queries with DISTINCT / GROUP BY giving different plans - Mailing list pgsql-performance

From Tom Lane
Subject Re: queries with DISTINCT / GROUP BY giving different plans
Date
Msg-id 11586.1376505314@sss.pgh.pa.us
In response to queries with DISTINCT / GROUP BY giving different plans  ("Tomas Vondra" <tv@fuzzy.cz>)
Responses Re: queries with DISTINCT / GROUP BY giving different plans
Re: queries with DISTINCT / GROUP BY giving different plans
List pgsql-performance
"Tomas Vondra" <tv@fuzzy.cz> writes:
> I've run into a strange plan difference on 9.1.9 - the first query does
> "DISTINCT" by doing a GROUP BY on the columns (both INT). ...
> Now, this takes ~45 seconds to execute, but after rewriting the query to
> use the regular DISTINCT it suddenly switches to HashAggregate with ~1/3
> the cost (although it produces the same output, AFAIK), and it executes in
> ~15 seconds.

[ scratches head... ]  I guess you're running into some corner case where
choose_hashed_grouping and choose_hashed_distinct make different choices.
It's going to be tough to debug without a test case though.  I couldn't
reproduce the behavior in a few tries here.

> BTW I can't test this on 9.2 or 9.3 easily, as this is our production
> environment and I can't just export the data. I've tried to simulate this
> but so far no luck.

I suppose they won't let you step through those two functions with a
debugger either ...

            regards, tom lane

