Thread: reorder GROUP BY list

reorder GROUP BY list

From
Neil Conway
Date:
Per recent discussion on -hackers, we should sometimes try to reorder
the columns of the grouping clause to avoid redundant sorts. The
optimizer is not currently capable of doing this, so this patch
implements a simple hack in the analysis phase (transformGroupClause):
if any subset of the GROUP BY clause matches a prefix of the ORDER BY
list, that prefix is moved to the front of the GROUP BY clause. This
shouldn't change the semantics of the query, and allows a redundant sort
to be avoided for queries like "GROUP BY a, b ORDER BY b".

One question about the implementation: to avoid redundant and
potentially expensive calls to findTargetlistEntry(), I constructed a
temporary list of TLEs. I think that's probably a good tradeoff, but
suggestions for improvement are welcome. Also, I released the list via
list_free() when finished with it -- is that a waste of cycles?

Barring any objections, I'll apply this to HEAD tomorrow.

-Neil


Attachment

Re: reorder GROUP BY list

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> Per recent discussion on -hackers, we should sometimes try to reorder
> the columns of the grouping clause to avoid redundant sorts. The
> optimizer is not currently capable of doing this, so this patch
> implements a simple hack in the analysis phase (transformGroupClause):
> if any subset of the GROUP BY clause matches a prefix of the ORDER BY
> list, that prefix is moved to the front of the GROUP BY clause.

I find this patch a tad inelegant.  It seems like a confusing way to
think about the problem, and it also means that the routine is
effectively matching the sort clause and group clause twice --- once
in the code you added and once in the existing code.

The idea that I had about how the revised routine should work is
basically:

1. For each ORDER BY item:
    1a. Find a matching GROUP BY item.
        (Break out of loop if no match.)
    1b. Add it to the output list with ORDER BY item's ordering op.
    1c. Remove the matched item from the GROUP BY list.
2. For each remaining GROUP BY item:
    2a. Add it to output list using default ordering op.

This might require a bit of code duplication from the two "add to output
list" steps, but I think it'd be a lot clearer.  You could probably
avoid most of the code duplication by making a preliminary list of
TargetEntrys (findTargetlistEntry and the UNKNOWN->TEXT hack) before
applying the above steps.  Or make a subroutine.

            regards, tom lane

Re: reorder GROUP BY list

From
Neil Conway
Date:
On Sat, 2006-03-04 at 21:21 -0500, Tom Lane wrote:
> 1. For each ORDER BY item:
>     1a. Find a matching GROUP BY item.
>         (Break out of loop if no match.)
>     1b. Add it to the output list with ORDER BY item's ordering op.
>     1c. Remove the matched item from the GROUP BY list.
> 2. For each remaining GROUP BY item:
>     2a. Add it to output list using default ordering op.

Okay, attached is a revised patch that implements this. Barring any
objections I'll apply it tomorrow.

-Neil


Attachment

Re: reorder GROUP BY list

From
Neil Conway
Date:
On Sun, 2006-03-05 at 00:04 -0500, Neil Conway wrote:
> Okay, attached is a revised patch that implements this.

Applied.

-Neil