David Rowley <dgrowleyml@gmail.com> writes:
> For the master version, I think it's safe just to get rid of
> PlannerInfo.num_groupby_pathkeys now. I only added that so I could
> strip off the ORDER BY / DISTINCT aggregate PathKeys from the group by
> pathkeys before passing to the functions that rearranged the GROUP BY
> clause.

I was kind of unhappy with that data structure too, but from the
other direction: I didn't like that you were folding aggregate-derived
pathkeys into root->group_pathkeys in the first place. That seems like
a kluge that might work all right for the moment but will cause problems
down the road. (Despite the issues with the patch at hand, I don't
think it's unreasonable to suppose that somebody will have a more
successful go at optimizing GROUP BY sorting later.)

If we keep the
data structure like this, I think we absolutely need num_groupby_pathkeys,
or some other way of recording which pathkeys came from what source.
One way to manage that would be to insist that the length of
root->group_clauses should indicate the number of associated grouping
pathkeys. Right now they might not be the same because we might discover
some of the pathkeys to be redundant --- but if we do, ISTM that the
corresponding GROUP BY clauses are also redundant and could get dropped.
That ties into the stuff I was worried about in [1], though. I'll keep
this in mind when I get back to messing with that.
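
To make the bookkeeping concrete, here is a minimal sketch in C of what
"recording which pathkeys came from what source" could look like. It is
not code from the patch: SketchPlannerInfo and both helper functions are
hypothetical stand-ins, not the planner's real PlannerInfo or any existing
function. It assumes the GROUP BY-derived pathkeys come first in the list
and the aggregate-derived ones are appended after them, so that a single
count is enough to tell the two apart.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_PATHKEYS 8

/*
 * Hypothetical, simplified stand-in for the planner state discussed above.
 * Pathkeys are represented as plain strings purely for illustration.
 */
typedef struct SketchPlannerInfo
{
    const char *group_pathkeys[SKETCH_MAX_PATHKEYS];
    int         total_pathkeys;         /* entries used in group_pathkeys */
    int         num_groupby_pathkeys;   /* leading entries from GROUP BY */
} SketchPlannerInfo;

/* Check the invariant this sketch relies on: 0 <= groupby count <= total. */
static void
sketch_check(const SketchPlannerInfo *root)
{
    assert(root->total_pathkeys >= 0);
    assert(root->total_pathkeys <= SKETCH_MAX_PATHKEYS);
    assert(root->num_groupby_pathkeys >= 0);
    assert(root->num_groupby_pathkeys <= root->total_pathkeys);
}

/* True if the pathkey at position idx came from the GROUP BY clause. */
static bool
sketch_pathkey_is_from_groupby(const SketchPlannerInfo *root, int idx)
{
    sketch_check(root);
    assert(idx >= 0 && idx < root->total_pathkeys);
    return idx < root->num_groupby_pathkeys;
}

/* Number of trailing pathkeys that were derived from aggregates. */
static int
sketch_num_aggregate_pathkeys(const SketchPlannerInfo *root)
{
    sketch_check(root);
    return root->total_pathkeys - root->num_groupby_pathkeys;
}
```

With three pathkeys of which the first two came from GROUP BY, positions 0
and 1 report as GROUP BY-derived and one trailing pathkey is left over as
aggregate-derived. Under the alternative suggested above, the count would
not be stored separately but taken from the length of the GROUP BY clause
list, which only works if redundant clauses are dropped along with their
pathkeys.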
			regards, tom lane

[1] https://www.postgresql.org/message-id/flat/1657885.1657647073%40sss.pgh.pa.us