Re: Todo: Teach planner to evaluate multiple windows in the optimal order - Mailing list pgsql-hackers

From Ankit Kumar Pandey
Subject Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date
Msg-id 7f77ee7d-bd04-d8e2-bb34-42395fd1f7c2@gmail.com
Whole thread Raw
In response to Re: Todo: Teach planner to evaluate multiple windows in the optimal order  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Todo: Teach planner to evaluate multiple windows in the optimal order
List pgsql-hackers

Hi,

On 03/01/23 08:21, David Rowley wrote:

I do think you'll likely want to put any WindowClauses which have
pathkeys which are a true subset or true superset of the ORDER BY /
DISTINCT pathkeys last.  If they're a superset then we won't need to
perform any additional ordering for the DISTINCT / ORDER BY clause.
If they're a subset then we might be able to perform an Incremental
Sort, which is likely much cheaper than a full sort.  The existing
code should handle that part. You just need to make
select_active_windows() more intelligent.

I think current implementation does exactly this.

#1. If order by clause in the window function is subset of order by in query

create table abcd(a int, b int, c int, d int);
insert into abcd select x,y,z,c from generate_series(1,5) x, generate_Series(1,5)y, generate_Series(1,5) z, generate_Series(1,5) c;
explain analyze select a,row_number() over (order by b),count(*) over (order by a,b) from abcd order by a,b,c;
                                                            QUERY PLAN                                                            
--------------------------------------------------------------------------------------------------------------------------
-------- Incremental Sort  (cost=80.32..114.56 rows=625 width=28) (actual time=1.440..3.311 rows=625 loops=1)   Sort Key: a, b, c   Presorted Key: a, b   Full-sort Groups: 13  Sort Method: quicksort  Average Memory: 28kB  Peak Memory: 28kB   ->  WindowAgg  (cost=79.24..91.74 rows=625 width=28) (actual time=1.272..2.567 rows=625 loops=1)         ->  Sort  (cost=79.24..80.80 rows=625 width=20) (actual time=1.233..1.296 rows=625 loops=1)               Sort Key: a, b               Sort Method: quicksort  Memory: 64kB               ->  WindowAgg  (cost=39.27..50.21 rows=625 width=20) (actual time=0.304..0.786 rows=625 loops=1)                     ->  Sort  (cost=39.27..40.84 rows=625 width=12) (actual time=0.300..0.354 rows=625 loops=1)                           Sort Key: b                           Sort Method: quicksort  Memory: 54kB                           ->  Seq Scan on abcd  (cost=0.00..10.25 rows=625 width=12) (actual time=0.021..0.161 rows=625 l
oops=1) Planning Time: 0.068 ms Execution Time: 3.509 ms
(15 rows)

Here, as window function (row count) has two cols a, b for order by, incremental sort is performed for remaining col in query,

which makes sense.


#2. If order by clause in the Window function is superset of order by in query

explain analyze select a,row_number() over (order by a,b,c),count(*) over (order by a,b) from abcd order by a;
                                                      QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------- WindowAgg  (cost=39.27..64.27 rows=625 width=28) (actual time=1.089..3.020 rows=625 loops=1)   ->  WindowAgg  (cost=39.27..53.34 rows=625 width=20) (actual time=1.024..1.635 rows=625 loops=1)         ->  Sort  (cost=39.27..40.84 rows=625 width=12) (actual time=1.019..1.084 rows=625 loops=1)               Sort Key: a, b, c               Sort Method: quicksort  Memory: 54kB               ->  Seq Scan on abcd  (cost=0.00..10.25 rows=625 width=12) (actual time=0.023..0.265 rows=625 loops=1) Planning Time: 0.071 ms Execution Time: 3.156 ms
(8 rows)

No, additional sort is needed to be performed in this case, as you referred.

On 03/01/23 08:21, David Rowley wrote:
If they're a superset then we won't need to perform any additional ordering for the DISTINCT / ORDER BY clause.
If they're a subset then we might be able to perform an Incremental
Sort, which is likely much cheaper than a full sort.  

So question is, how current implementation is different from desired one?

 
-- 
Regards,
Ankit Kumar Pandey

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: WIN32 pg_import_system_collations
Next
From: "Drouvot, Bertrand"
Date:
Subject: Re: Split index and table statistics into different types of stats