This patch is aimed at being smarter about cases where we have redundant GROUP BY entries, for example
SELECT ... WHERE a.x = b.y GROUP BY a.x, b.y;
It's clearly not necessary to perform grouping using both columns. Grouping by either one alone would produce the same results, assuming compatible equality semantics. I'm not sure how often such cases arise in the wild; but we have about ten of them in our regression tests, which makes me think it's worth the trouble to de-duplicate as long as it doesn't cost too much. And it doesn't, because PathKey construction already detects exactly this sort of redundancy. We need only do something with the knowledge.
While we are here, I wonder if we can do the same trick for distinctClause, to cope with cases like
select distinct a.x, b.y from a, b where a.x = b.y;
And there is case from regression test 'select_distinct.sql' that can benefit from this optimization.
-- -- Check mentioning same column more than once --
EXPLAIN (VERBOSE, COSTS OFF) SELECT count(*) FROM (SELECT DISTINCT two, four, two FROM tenk1) ss;