Re: Todo: Teach planner to evaluate multiple windows in the optimal order - Mailing list pgsql-hackers

From Ankit Kumar Pandey
Subject Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date
Msg-id 4ef7f947-526b-c4ec-f7a0-8b4586e693e4@gmail.com
Whole thread Raw
In response to Re: Todo: Teach planner to evaluate multiple windows in the optimal order  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Re: Todo: Teach planner to evaluate multiple windows in the optimal order
List pgsql-hackers
> On 09/01/23 17:53, David Rowley wrote:

> We need to keep looping backwards until we find the first WindowClause
> which does not contain the pathkeys of the ORDER BY.

I found the cause of confusion, *first* WindowClause means first from

forward direction. Since we are looping from backward, I took it first

from the last.


eg:

select count(*) over (order by a), count(*) over (order by a,b), 
count(*) over (order by a,b,c) from abcd order by a,b,c,d;

first window clause is count(*) over (order by a) which we are using for 
order-by pushdown.


This is in sync with implementation as well.


> I couldn't quite understand why the foreach() loop's
> condition couldn't just be "if (foreach_current_index(l) ==
> sort_pushdown_idx)", but I see that if we don't check if the path is
> already correctly sorted that we could end up pushing the sort down
> onto the path that's already correctly sorted.  We decided we didn't
> want to move the sort around if it does not reduce the amount of
> sorting.

Yes, this was the reason, the current patch handles this without is_sort 
now, which is great.

> All the tests you added still pass. Although, I didn't
> really study the tests yet to see if everything we talked about is
> covered.

It covers general cases and exceptions. Also, I did few additional 
tests. Looked good.

> It turned out the sort_pushdown_idx = foreach_current_index(l) - 1;
> break; didn't work as if all the WindowClauses have pathkeys contained
> in the order by pathkeys then we don't ever set sort_pushdown_idx. I
> adjusted it to do:

> if (pathkeys_contained_in(window_pathkeys, orderby_pathkeys))
>    sort_pushdown_idx = foreach_current_index(l);
> else
>   break;

Yes, that would have been problematic. I have verified this case

and on related note, I have added a test case that ensure order by pushdown

shouldn't happen if window function's order by is superset of query's 
order by.

> I also fixed up the outdated comments and changed it so we only set
> orderby_pathkeys once instead of once per loop in the
> foreach_reverse() loop.

Thanks, code look a lot neater now (is_sorted is gone and handled in a 
better way).

> I gave some thought to whether doing foreach_delete_current() is safe
> within a foreach_reverse() loop. I didn't try it, but I couldn't see
> any reason why not.  It is pretty late here and I'd need to test that
> to be certain. If it turns out not to be safe then we need to document
> that fact in the comments of the foreach_reverse() macro and the
> foreach_delete_current() macro.

I tested foreach_delete_current inside foreach_reverse loop.

It worked fine.


I have attached patch with one extra test case (as mentioned above). 
Rest of the changes are looking fine.

Ran pgbench again and optimized version still had a lead (168 tps vs 135 
tps) in performance.


Do we have any pending items for this patch now?

-- 

Regards,
Ankit Kumar Pandey

Attachment

pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE
Next
From: Nathan Bossart
Date:
Subject: Re: wake up logical workers after ALTER SUBSCRIPTION