Re: GROUP BY DISTINCT - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: GROUP BY DISTINCT
Date
Msg-id 7137947a-87be-d04a-bf91-f42e47520daa@enterprisedb.com
Whole thread Raw
Responses Re: GROUP BY DISTINCT
List pgsql-hackers
On 3/16/21 3:52 PM, Tomas Vondra wrote:
> 
> 
> On 3/16/21 9:21 AM, Vik Fearing wrote:
>> On 3/13/21 12:33 AM, Tomas Vondra wrote:
>>> Hi Vik,
>>>
>>> The patch seems quite ready, I have just two comments.
>>
>> Thanks for taking a look.
>>
>>> 1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
>>> documentation? Now the index points just to the SELECT DISTINCT part.
>>
>> Good idea; I never think about the index.
>>
>>> 2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
>>> in order to stash it in the group lists is rather ugly, IMHO. It forces
>>> all the places handling the list to be aware of this (there are not
>>> many, but still ...). And there are no other places doing (bool) intVal
>>> so it's not like there's a precedent for this.
>>
>> There is kind of a precedent for it, I was copying off of TriggerEvents
>> and func_alias_clause.
>>
> 
> I see. I was looking for "(bool) intVal" but you're right TriggerEvents
> code does something similar.
> 
>>> I think the clean solution is to make group_clause produce a struct with
>>> two fields, and just use that. Not sure how invasive that will be
>>> outside gram.y, though.
>>
>> I didn't want to create a whole new parse node for it, but Andrew Gierth
>> pointed me towards SelectLimit so I did it like that and I agree it is
>> much cleaner.
>>
> 
> I agree, that's much cleaner.
> 
>>> Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
>>> wonder if we can come up with some clearer names, describing the context
>>> of those types.
>>
>> I turned this into an enum for ALL/DISTINCT/default and the caller can
>> choose what it wants to do with default.  I think that's a lot cleaner,
>> too.  Maybe DISTINCT ON should be changed to fit in that?  I left it
>> alone for now.
>>
> 
> I think DISTINCT ON is a different kind of animal, because that is a
> list of expressions, not just a simple enum state.
> 
>> I also snuck in something that all of us overlooked which is outputting
>> the DISTINCT in ruleutils.c.  I didn't add a test for it but that would
>> have been an unfortunate bug.
>>
> 
> Oh!
> 
>> New patch attached, rebased on 15639d5e8f.
>>
> 
> Thanks. At this point it seems fine to me, no further comments.
> 

Pushed. Thanks for the patch.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Key management with tests
Next
From: Alvaro Herrera
Date:
Subject: Re: Key management with tests