Re: Add proper planner support for ORDER BY / DISTINCT aggregates - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Date
Msg-id CAMbWs4_hAK6+0Gk=vZX+ikSFBx=6981iFLGkgLObDkvvGpjogg@mail.gmail.com
In response to Re: Add proper planner support for ORDER BY / DISTINCT aggregates  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Add proper planner support for ORDER BY / DISTINCT aggregates
List pgsql-hackers

On Tue, Jul 26, 2022 at 7:38 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, 22 Jul 2022 at 21:33, Richard Guo <guofenglinux@gmail.com> wrote:
> > I can see this problem with
> > the query below:
> >
> >     select max(b order by b), max(a order by a) from t group by a;
> >
> > When processing the first aggregate, we compose the 'currpathkeys' as
> > {a, b} and mark this aggregate in 'aggindexes'. When it comes to the
> > second aggregate, we compose its pathkeys as {a} and decide that it is
> > not stronger than 'currpathkeys'. So the second aggregate is not
> > recorded in 'aggindexes'. As a result, we fail to mark aggpresorted for
> > the second aggregate.
>
> Yeah, you're right. I have a missing check to see if currpathkeys are
> better than the pathkeys for the current aggregate. In your example
> case we'd have still processed the 2nd aggregate the old way instead
> of realising we could take the new pre-sorted path for faster
> processing.
>
> I've adjusted that in the attached to make it properly include the
> case where currpathkeys are better.

Thanks. I've verified that the problem is solved in the v8 patch.
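
To spell out the check we discussed, here is a rough sketch in Python rather than the patch's C code. The names is_prefix and choose_presorted are made up for illustration, and pathkeys are simplified to plain tuples of column names; the real code works on PathKey lists and may differ in detail. An aggregate counts as presorted when its pathkeys are at least as strong as the current ones, or (the case that was missing) when the current pathkeys already cover its own:

```python
def is_prefix(shorter, longer):
    """True if 'shorter' is a leading prefix of 'longer' (both are tuples)."""
    return len(shorter) <= len(longer) and tuple(longer[:len(shorter)]) == tuple(shorter)


def choose_presorted(agg_pathkeys):
    """Simplified illustration, not the patch's implementation.

    agg_pathkeys: one tuple of column names per aggregate
    (group keys followed by the aggregate's ORDER BY keys).
    Returns (chosen pathkeys, set of aggregate indexes that can be presorted).
    """
    currpathkeys = ()
    aggindexes = set()
    for i, keys in enumerate(agg_pathkeys):
        if is_prefix(currpathkeys, keys):
            # This aggregate's pathkeys are at least as strong: adopt them.
            # Earlier members stay valid because they are prefixes of these.
            currpathkeys = tuple(keys)
            aggindexes.add(i)
        elif is_prefix(keys, currpathkeys):
            # currpathkeys are stronger and already provide this ordering.
            aggindexes.add(i)
        # Otherwise the orderings are incompatible; this aggregate is skipped.
    return currpathkeys, aggindexes


# max(b order by b), max(a order by a) ... group by a
result = choose_presorted([("a", "b"), ("a",)])
```

For the query from my earlier mail, both aggregates end up in the set, with {a, b} as the chosen pathkeys.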

Also, I'm wondering whether we could take the ordering provided by
existing indexes into consideration when determining the pathkeys. For
the query below, given the index on t(a, c), choosing {a, c} instead of
{a, b} would let us avoid the Incremental Sort node in the current plan:

# explain (costs off) select max(b order by b), max(c order by c) from t group by a;
                 QUERY PLAN
---------------------------------------------
 GroupAggregate
   Group Key: a
   ->  Incremental Sort
         Sort Key: a, b
         Presorted Key: a
         ->  Index Scan using t_a_c_idx on t
(6 rows)
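
One hypothetical way to express the idea, again as a Python sketch rather than planner code: when several candidate pathkeys are possible, prefer one that an existing index ordering already provides. The function pick_pathkeys and the tuple representation are my own assumptions for illustration, and a real implementation would have to compare costs rather than just take the first match.

```python
def pick_pathkeys(candidates, index_orderings):
    """Illustrative only: prefer candidate pathkeys that are a leading
    prefix of some index's column ordering; otherwise fall back to the
    first candidate.

    candidates: list of tuples of column names, one per candidate ordering.
    index_orderings: list of tuples of column names, one per index.
    """
    for keys in candidates:
        for idx_cols in index_orderings:
            # If the index is shorter than 'keys', the slice cannot match.
            if tuple(idx_cols[:len(keys)]) == tuple(keys):
                return tuple(keys)
    return tuple(candidates[0])


# Candidates for: max(b order by b), max(c order by c) ... group by a
# with an index on t(a, c).
chosen = pick_pathkeys([("a", "b"), ("a", "c")], [("a", "c")])
```

Here the sketch picks {a, c}, which the scan of t_a_c_idx already provides, so no extra sort would be needed for those pathkeys.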

Thanks
Richard 
