Thread: Re: GROUP BY DISTINCT

Re: GROUP BY DISTINCT

From
Tomas Vondra
Date:
On 3/16/21 3:52 PM, Tomas Vondra wrote:
> 
> 
> On 3/16/21 9:21 AM, Vik Fearing wrote:
>> On 3/13/21 12:33 AM, Tomas Vondra wrote:
>>> Hi Vik,
>>>
>>> The patch seems quite ready, I have just two comments.
>>
>> Thanks for taking a look.
>>
>>> 1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
>>> documentation? Now the index points just to the SELECT DISTINCT part.
>>
>> Good idea; I never think about the index.
>>
>>> 2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
>>> in order to stash it in the group lists is rather ugly, IMHO. It forces
>>> all the places handling the list to be aware of this (there are not
>>> many, but still ...). And there are no other places doing (bool) intVal
>>> so it's not like there's a precedent for this.
>>
>> There is kind of a precedent for it, I was copying off of TriggerEvents
>> and func_alias_clause.
>>
> 
> I see. I was looking for "(bool) intVal" but you're right TriggerEvents
> code does something similar.
> 
>>> I think the clean solution is to make group_clause produce a struct with
>>> two fields, and just use that. Not sure how invasive that will be
>>> outside gram.y, though.
>>
>> I didn't want to create a whole new parse node for it, but Andrew Gierth
>> pointed me towards SelectLimit so I did it like that and I agree it is
>> much cleaner.
>>
> 
> I agree, that's much cleaner.
> 
>>> Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
>>> wonder if we can come up with some clearer names, describing the context
>>> of those types.
>>
>> I turned this into an enum for ALL/DISTINCT/default and the caller can
>> choose what it wants to do with default.  I think that's a lot cleaner,
>> too.  Maybe DISTINCT ON should be changed to fit in that?  I left it
>> alone for now.
>>
> 
> I think DISTINCT ON is a different kind of animal, because that is a
> list of expressions, not just a simple enum state.
> 
>> I also snuck in something that all of us overlooked which is outputting
>> the DISTINCT in ruleutils.c.  I didn't add a test for it but that would
>> have been an unfortunate bug.
>>
> 
> Oh!
> 
>> New patch attached, rebased on 15639d5e8f.
>>
> 
> Thanks. At this point it seems fine to me, no further comments.
> 

Pushed. Thanks for the patch.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GROUP BY DISTINCT

From
Tomas Vondra
Date:

On 3/18/21 6:25 PM, Tomas Vondra wrote:
> On 3/16/21 3:52 PM, Tomas Vondra wrote:
>>
>>
>> On 3/16/21 9:21 AM, Vik Fearing wrote:
>>> On 3/13/21 12:33 AM, Tomas Vondra wrote:
>>>> Hi Vik,
>>>>
>>>> The patch seems quite ready, I have just two comments.
>>>
>>> Thanks for taking a look.
>>>
>>>> 1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
>>>> documentation? Now the index points just to the SELECT DISTINCT part.
>>>
>>> Good idea; I never think about the index.
>>>
>>>> 2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
>>>> in order to stash it in the group lists is rather ugly, IMHO. It forces
>>>> all the places handling the list to be aware of this (there are not
>>>> many, but still ...). And there are no other places doing (bool) intVal
>>>> so it's not like there's a precedent for this.
>>>
>>> There is kind of a precedent for it, I was copying off of TriggerEvents
>>> and func_alias_clause.
>>>
>>
>> I see. I was looking for "(bool) intVal" but you're right TriggerEvents
>> code does something similar.
>>
>>>> I think the clean solution is to make group_clause produce a struct with
>>>> two fields, and just use that. Not sure how invasive that will be
>>>> outside gram.y, though.
>>>
>>> I didn't want to create a whole new parse node for it, but Andrew Gierth
>>> pointed me towards SelectLimit so I did it like that and I agree it is
>>> much cleaner.
>>>
>>
>> I agree, that's much cleaner.
>>
>>>> Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
>>>> wonder if we can come up with some clearer names, describing the context
>>>> of those types.
>>>
>>> I turned this into an enum for ALL/DISTINCT/default and the caller can
>>> choose what it wants to do with default.  I think that's a lot cleaner,
>>> too.  Maybe DISTINCT ON should be changed to fit in that?  I left it
>>> alone for now.
>>>
>>
>> I think DISTINCT ON is a different kind of animal, because that is a
>> list of expressions, not just a simple enum state.
>>
>>> I also snuck in something that all of us overlooked which is outputting
>>> the DISTINCT in ruleutils.c.  I didn't add a test for it but that would
>>> have been an unfortunate bug.
>>>
>>
>> Oh!
>>
>>> New patch attached, rebased on 15639d5e8f.
>>>
>>
>> Thanks. At this point it seems fine to me, no further comments.
>>
> 
> Pushed. Thanks for the patch.
> 

Hmmm, this seems to fail on lapwing with this error:

parse_agg.c: In function 'expand_grouping_sets':
parse_agg.c:1851:23: error: value computed is not used
[-Werror=unused-value]
cc1: all warnings being treated as errors

That line is this:

    foreach_delete_current(result, cell);

and I don't see how any of the values close by could be unused ...

The only possibility I can think of is some sort of issue in the old-ish
gcc release (4.7.2).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GROUP BY DISTINCT

From
Thomas Munro
Date:
On Fri, Mar 19, 2021 at 8:27 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Hmmm, this seems to fail on lapwing with this error:
>
> parse_agg.c: In function 'expand_grouping_sets':
> parse_agg.c:1851:23: error: value computed is not used
> [-Werror=unused-value]
> cc1: all warnings being treated as errors
>
> That line is this:
>
>     foreach_delete_current(result, cell);
>
> and I don't see how any of the values close by could be unused ...
>
> The only possibility I can think of is some sort of issue in the old-ish
> gcc release (4.7.2).

No sure what's going on there, but data points: I tried a 32 bit build
here (that's the other special thing about lapwing) and didn't see the
warning.  GCC 10.  Also curculio (gcc 4.2) and snapper (gcc 4.7) are
also showing this warning, but they don't have -Werror so they don't
fail.  sidewinder (gcc 4.8) is not showing the warning.



Re: GROUP BY DISTINCT

From
Tomas Vondra
Date:

On 3/18/21 10:02 PM, Thomas Munro wrote:
> On Fri, Mar 19, 2021 at 8:27 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> Hmmm, this seems to fail on lapwing with this error:
>>
>> parse_agg.c: In function 'expand_grouping_sets':
>> parse_agg.c:1851:23: error: value computed is not used
>> [-Werror=unused-value]
>> cc1: all warnings being treated as errors
>>
>> That line is this:
>>
>>     foreach_delete_current(result, cell);
>>
>> and I don't see how any of the values close by could be unused ...
>>
>> The only possibility I can think of is some sort of issue in the old-ish
>> gcc release (4.7.2).
> 
> No sure what's going on there, but data points: I tried a 32 bit build
> here (that's the other special thing about lapwing) and didn't see the
> warning.  GCC 10.  Also curculio (gcc 4.2) and snapper (gcc 4.7) are
> also showing this warning, but they don't have -Werror so they don't
> fail.  sidewinder (gcc 4.8) is not showing the warning.
> 

Thanks for the info. So it's likely related to older gcc releases. The
question is how to tweak the code to get rid of this ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GROUP BY DISTINCT

From
Thomas Munro
Date:
On Fri, Mar 19, 2021 at 10:14 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> >> The only possibility I can think of is some sort of issue in the old-ish
> >> gcc release (4.7.2).
> >
> > No sure what's going on there, but data points: I tried a 32 bit build
> > here (that's the other special thing about lapwing) and didn't see the
> > warning.  GCC 10.  Also curculio (gcc 4.2) and snapper (gcc 4.7) are
> > also showing this warning, but they don't have -Werror so they don't
> > fail.  sidewinder (gcc 4.8) is not showing the warning.
> >
>
> Thanks for the info. So it's likely related to older gcc releases. The
> question is how to tweak the code to get rid of this ...

It's frustrating to have to do press-ups to fix a problem because a
zombie Debian 7 system is running with -Werror (though it's always
possible that it's telling us something interesting...).  Anyway, I
think someone with a GCC < 4.8 compiler would have to investigate.  I
was hoping to help, but none of my systems have one in easy-to-install
format...



Re: GROUP BY DISTINCT

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Fri, Mar 19, 2021 at 10:14 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> Thanks for the info. So it's likely related to older gcc releases. The
>> question is how to tweak the code to get rid of this ...

> It's frustrating to have to do press-ups to fix a problem because a
> zombie Debian 7 system is running with -Werror (though it's always
> possible that it's telling us something interesting...).  Anyway, I
> think someone with a GCC < 4.8 compiler would have to investigate.  I
> was hoping to help, but none of my systems have one in easy-to-install
> format...

Hmm ... prairiedog isn't showing the warning, but maybe gaur will.
I can take a look if nobody else is stepping up.

            regards, tom lane



Re: GROUP BY DISTINCT

From
Tom Lane
Date:
I wrote:
> Hmm ... prairiedog isn't showing the warning, but maybe gaur will.

Bingo:

parse_agg.c: In function 'expand_grouping_sets':
parse_agg.c:1851:5: warning: value computed is not used

This is gcc 4.5, but hopefully whatever shuts it up will also work on 4.7.
I'll work on figuring that out.

            regards, tom lane



Re: GROUP BY DISTINCT

From
Tom Lane
Date:
I wrote:
> This is gcc 4.5, but hopefully whatever shuts it up will also work on 4.7.
> I'll work on figuring that out.

Actually, the problem is pretty obvious after comparing this use
of foreach_delete_current() to every other one.  I'm not sure why
the compiler warnings are phrased just as they are, but the fix
I just pushed does make 4.5 happy.

            regards, tom lane



Re: GROUP BY DISTINCT

From
Tomas Vondra
Date:
On 3/19/21 12:26 AM, Tom Lane wrote:
> I wrote:
>> This is gcc 4.5, but hopefully whatever shuts it up will also work on 4.7.
>> I'll work on figuring that out.
> 
> Actually, the problem is pretty obvious after comparing this use
> of foreach_delete_current() to every other one.  I'm not sure why
> the compiler warnings are phrased just as they are, but the fix
> I just pushed does make 4.5 happy.
> 

Thanks! Yeah, that looks obvious. Funny the older compilers noticed
this, not the new fancy ones.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GROUP BY DISTINCT

From
Vik Fearing
Date:
On 3/19/21 12:52 AM, Tomas Vondra wrote:
> 
> On 3/19/21 12:26 AM, Tom Lane wrote:
>> I wrote:
>>> This is gcc 4.5, but hopefully whatever shuts it up will also work on 4.7.
>>> I'll work on figuring that out.
>>
>> Actually, the problem is pretty obvious after comparing this use
>> of foreach_delete_current() to every other one.  I'm not sure why
>> the compiler warnings are phrased just as they are, but the fix
>> I just pushed does make 4.5 happy.
>>
> 
> Thanks! Yeah, that looks obvious. Funny the older compilers noticed
> this, not the new fancy ones.

+1

I'm glad the buildfarm is so diverse.
-- 
Vik Fearing