Re: Removing redundant grouping columns - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Removing redundant grouping columns
Date
Msg-id CAMbWs48G84f9RNg+Ueq-0pytCg-YHMP95t+hQnZ8RaL-_GLBMg@mail.gmail.com
Whole thread Raw
In response to Removing redundant grouping columns  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Removing redundant grouping columns
List pgsql-hackers

On Wed, Dec 28, 2022 at 6:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
This patch is aimed at being smarter about cases where we have
redundant GROUP BY entries, for example

SELECT ... WHERE a.x = b.y GROUP BY a.x, b.y;

It's clearly not necessary to perform grouping using both columns.
Grouping by either one alone would produce the same results,
assuming compatible equality semantics.  I'm not sure how often
such cases arise in the wild; but we have about ten of them in our
regression tests, which makes me think it's worth the trouble to
de-duplicate as long as it doesn't cost too much.  And it doesn't,
because PathKey construction already detects exactly this sort of
redundancy.  We need only do something with the knowledge.
 
While we are here, I wonder if we can do the same trick for
distinctClause, to cope with cases like

    select distinct a.x, b.y from a, b where a.x = b.y;

And there is case from regression test 'select_distinct.sql' that can
benefit from this optimization.

    --
    -- Check mentioning same column more than once
    --

    EXPLAIN (VERBOSE, COSTS OFF)
    SELECT count(*) FROM
      (SELECT DISTINCT two, four, two FROM tenk1) ss;

Thanks
Richard

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Getting rid of SQLValueFunction
Next
From: adherent postgres
Date:
Subject: Re: Add 64-bit XIDs into PostgreSQL 15