Todo: Teach planner to evaluate multiple windows in the optimal order - Mailing list pgsql-hackers
From | Ankit Kumar Pandey |
---|---|
Subject | Todo: Teach planner to evaluate multiple windows in the optimal order |
Date | |
Msg-id | 83d80853-a45c-d85c-68eb-59acfe7fb5fb@gmail.com Whole thread Raw |
Responses |
Re: Todo: Teach planner to evaluate multiple windows in the optimal order
|
List | pgsql-hackers |
Hi,
While looking at one of the todo item in Window function, namely:
Teach planner to evaluate multiple windows in the optimal order Currently windows are always evaluated in the query-specified order.
From threads, relevant points.
Point #1
and Point #2In the above query Oracle 10g performs 2 sorts, DB2 and Sybase perform 3sorts. We also perform 3.
Repro:Teach planner to decide which window to evaluate first based on costs.
Currently the first window in the query is evaluated first, there may be no
index to help sort the first window, but perhaps there are for other windows
in the query. This may allow an index scan instead of a seqscan -> sort.
select pg_catalog.version();
version
----------------------------------------------------------------------------------------------------
PostgreSQL 16devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 12.2.0-3ubuntu1) 12.2.0, 64-bit
(1 row)
create table empsalary(depname text, empno int, salary int);
insert into empsalary values (select substr(md5(random()::text), 0, 25), generate_series(1,10), generate_series(10000,12000));
explain SELECT depname, SUM(salary) OVER (ORDER BY salary), SUM(salary) OVER (ORDER BY empno) FROM empsalary ORDER BY salary;
QUERY PLAN
--------------------------------------------------------------------------------------
WindowAgg (cost=289.47..324.48 rows=2001 width=49)
-> Sort (cost=289.47..294.47 rows=2001 width=41)
Sort Key: salary
-> WindowAgg (cost=144.73..179.75 rows=2001 width=41)
-> Sort (cost=144.73..149.73 rows=2001 width=33)
Sort Key: empno
-> Seq Scan on empsalary (cost=0.00..35.01 rows=2001 width=33)
(7 rows)
As it is seen, for case #1, issue looks like resolved and only 2 sorts are performed.
For #2, index col ordering is changed.
create index idx_emp on empsalary (empno);
explain SELECT depname, SUM(salary) OVER (ORDER BY salary), SUM(salary) OVER (ORDER BY empno) FROM empsalary ORDER BY salary;
QUERY PLAN
------------------------------------------------------------------------------------------------
WindowAgg (cost=204.03..239.04 rows=2001 width=49)
-> Sort (cost=204.03..209.03 rows=2001 width=41)
Sort Key: salary
-> WindowAgg (cost=0.28..94.31 rows=2001 width=41)
-> Index Scan using idx_emp on empsalary (cost=0.28..64.29 rows=2001 width=33)
(5 rows)
explain SELECT depname, SUM(salary) OVER (ORDER BY empno), SUM(salary) OVER (ORDER BY salary) FROM empsalary ORDER BY salary;
QUERY PLAN
------------------------------------------------------------------------------------------------
WindowAgg (cost=204.03..239.04 rows=2001 width=49)
-> Sort (cost=204.03..209.03 rows=2001 width=41)
Sort Key: salary
-> WindowAgg (cost=0.28..94.31 rows=2001 width=41)
-> Index Scan using idx_emp on empsalary (cost=0.28..64.29 rows=2001 width=33)
(5 rows)
In both cases, index scan is performed, which means this issue is resolved as well.
Is this todo still relevant?
Further down the threads:
I do think the patch has probably left some low-hanging fruit on the
simpler end of the difficulty spectrum, namely when the window stuff
requires only one ordering that could be done either explicitly or
by an indexscan. That choice should ideally be done with a proper
cost comparison taking any LIMIT into account. I think right now
the LIMIT might not be accounted for, or might be considered even
when it shouldn't be because another sort is needed anyway.
I am not sure if I understand this fully but does it means proposed todo (to use indexes) should be refined to
teach planner to take into account of cost model while deciding to use index or not in window functions? Meaning not always go with index route (modify point #2)?
-- Regards, Ankit Kumar Pandey
pgsql-hackers by date: