Re: DISTINCT/Optimizer question - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: DISTINCT/Optimizer question
Msg-id 20060707211845.GH7485@svana.org
In response to DISTINCT/Optimizer question  ("Beth Jen" <raelys@gmail.com>)
Responses Re: DISTINCT/Optimizer question  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
On Fri, Jul 07, 2006 at 01:25:53PM -0400, Beth Jen wrote:
> Right now, the distinct clause adds its targets to the sort clause list when
> it is parsed. This causes an automatic insertion of the sort node into the
> query plan before the application of the unique node. The hash-based
> implementation however is meant to bypass the need to sort. I could just
> remove this action, but the optimizer should only consider using the

<snip>

My layman's opinion is that this needs a new, specific "distinct
clause" which looks a lot like a sort clause but isn't one. Then, in
the planner, this clause would be converted either to your new node
type or to the traditional sort node.
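
To make that concrete, here is a rough sketch of the planner-side
decision. Every name in it (DistinctClause, DistinctKey,
choose_distinct_strategy, the cost arguments) is made up for
illustration and is not an existing PostgreSQL identifier. The rules
for when to hash are my assumption, not something taken from the
source tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: how the planner ends up implementing DISTINCT. */
typedef enum DistinctStrategy {
    DISTINCT_BY_SORT,   /* traditional Sort + Unique */
    DISTINCT_BY_HASH    /* the proposed hash-based node */
} DistinctStrategy;

/* Hypothetical: one DISTINCT key, kept apart from the sort clause list. */
typedef struct DistinctKey {
    int  target_index;  /* target-list entry to de-duplicate on */
    bool hashable;      /* assumed: the key's type has a hash function */
} DistinctKey;

typedef struct DistinctClause {
    const DistinctKey *keys;
    size_t             nkeys;
} DistinctClause;

/*
 * Pick a strategy for the clause. The cost numbers are supplied by the
 * caller; how they would be estimated is outside this sketch.
 */
DistinctStrategy
choose_distinct_strategy(const DistinctClause *clause, bool input_presorted,
                         double sort_cost, double hash_cost)
{
    assert(clause != NULL);

    /* A key that cannot be hashed forces the sort-based path. */
    for (size_t i = 0; i < clause->nkeys; i++)
        if (!clause->keys[i].hashable)
            return DISTINCT_BY_SORT;

    /* Input already ordered on the keys: Unique needs no extra sort. */
    if (input_presorted)
        return DISTINCT_BY_SORT;

    /* Otherwise take whichever path is estimated to be cheaper. */
    return (hash_cost < sort_cost) ? DISTINCT_BY_HASH : DISTINCT_BY_SORT;
}
```

The point is only that the parser would record the DISTINCT keys
without committing to a sort, leaving the choice to the planner.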

> What are your suggestions for going about this? Are these approaches
> feasible without a significant restructuring of the code? Are there any
> other approaches I should consider?

I think it should be possible without too many changes, since much
would be shared. For example, you could have the distinct node look
exactly like the sort node, so they could share code. Or perhaps just
use a flag to distinguish them. I admit I haven't looked carefully
though...
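
The flag variant might look something like the following. Again, the
struct and function names (KeyClause, needs_ordering,
sort_is_optional) are hypothetical and only meant to show the shape of
the idea, assuming one entry type is shared between ORDER BY and
DISTINCT keys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one clause entry type shared by ORDER BY and DISTINCT. */
typedef struct KeyClause {
    int  target_index;    /* target-list entry the key refers to */
    bool needs_ordering;  /* true for ORDER BY keys, false for DISTINCT-only */
} KeyClause;

/* Count the keys that actually require ordered output. */
size_t
count_ordering_keys(const KeyClause *keys, size_t nkeys)
{
    size_t n = 0;

    assert(keys != NULL || nkeys == 0);
    for (size_t i = 0; i < nkeys; i++)
        if (keys[i].needs_ordering)
            n++;
    return n;
}

/* The sort can be skipped only if no key demands ordered output. */
bool
sort_is_optional(const KeyClause *keys, size_t nkeys)
{
    return count_ordering_keys(keys, nkeys) == 0;
}
```

With this layout the existing list-handling code would stay shared, and
only the planner would need to look at the flag.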

Have you considered how your code interacts with DISTINCT ON ()?
Perhaps a clue lies there...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
