Home > mailing lists

Re: Add proper planner support for ORDER BY / DISTINCT aggregates - Mailing list pgsql-hackers

From	Ronan Dunklau
Subject	Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Date	July 19, 2021 06:32:34
Msg-id	4285468.TQqHX9kgJM@aivenronan Whole thread Raw
In response to	Re: Add proper planner support for ORDER BY / DISTINCT aggregates (David Rowley <dgrowleyml@gmail.com>)
Responses	Re: Add proper planner support for ORDER BY / DISTINCT aggregates Re: Add proper planner support for ORDER BY / DISTINCT aggregates
List	pgsql-hackers

Tree view

> Looks like I did a sloppy job of that.  I messed up the condition in
> standard_qp_callback() that sets the ORDER BY aggregate pathkeys so
> that it accidentally set them when there was an unsortable GROUP BY
> clause, as highlighted by the postgres_fdw tests failing.  I've now
> added a comment to explain why the condition is the way it is so that
> I don't forget again.
> 
> Here's a cleaned-up version that passes make check-world.
> 

I've noticed that when the ORDER BY is a grouping key (which to be honest 
doesn't seem to make much sense to me), the sort key is duplicated, as 
demonstrated by one of the modified tests (partition_aggregate.sql). 

This leads to additional sort nodes being added when there is no necessity to 
do so. In the case of sort and index pathes, the duplicate keys are not 
considered,  I think the same should apply here.

It means the logic for appending the order by pathkeys to the existing group 
by pathkeys would ideally need to remove the redundant group by keys from the 
order by keys, considering this example:

regression=# explain select sum(unique1 order by ten, two), sum(unique1 order 
by four), sum(unique1 order by two, four) from tenk2 group by ten;
                               QUERY PLAN                               
------------------------------------------------------------------------
 GroupAggregate  (cost=1109.39..1234.49 rows=10 width=28)
   Group Key: ten
   ->  Sort  (cost=1109.39..1134.39 rows=10000 width=16)
         Sort Key: ten, ten, two
         ->  Seq Scan on tenk2  (cost=0.00..445.00 rows=10000 width=16)


We would ideally like to sort on ten, two, four to satisfy the first and last 
aggref at once. Stripping the common prefix (ten) would eliminate this problem. 

Also, existing regression tests cover the first problem (order by a grouping 
key) but I feel like they should be extended with a case similar as the above 
to check which pathkeys are used in the "multiple ordered aggregates + group 
by" cases. 


-- 
Ronan Dunklau

pgsql-hackers by date:

From: Peter Eisentraut
Date: 19 July 2021, 06:29:42
Subject: Re: Replace remaining castNode(…, lfirst(…)) and friends calls with l*_node()

From: Masahiko Sawada
Date: 19 July 2021, 06:35:47
Subject: Re: Skipping logical replication transactions on subscriber side

Re: Add proper planner support for ORDER BY / DISTINCT aggregates - Mailing list pgsql-hackers

Previous

Next