Re: Todo: Teach planner to evaluate multiple windows in the optimal order - Mailing list pgsql-hackers

From David Rowley
Subject Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Date
Msg-id CAApHDvpigtPVJOJ4EmWOAODtPTccz=L_gdPXDVgG2Z=iSPfNmQ@mail.gmail.com
Whole thread Raw
In response to Re: Todo: Teach planner to evaluate multiple windows in the optimal order  (Ankit Kumar Pandey <itsankitkp@gmail.com>)
Responses Re: Todo: Teach planner to evaluate multiple windows in the optimal order
List pgsql-hackers
On Mon, 9 Jan 2023 at 06:17, Ankit Kumar Pandey <itsankitkp@gmail.com> wrote:
> I have addressed all points 1-6 in the attached patch.

A few more things:

1. You're still using the 'i' variable in the foreach loop.
foreach_current_index() will work.

2. I think the "index" variable needs a better name. sort_pushdown_idx maybe.

3. I don't think you need to set "index" on every loop. Why not just
set it to foreach_current_index(l) - 1; break;

4. You're still setting orderby_pathkeys in the foreach loop. That's
already been set above and it won't have changed.

5. I don't think there's any need to check pathkeys_contained_in() in
the foreach loop anymore. With #3 the index will be -1 if the
optimisation cannot apply. You could likely also get rid of
try_sort_pushdown too and just make the condition "if
(sort_pushdown_idx == foreach_current_index(l))".  I'm a little unsure
why there's still the is_sorted check there.  Shouldn't that always be
false now that you're looping until the pathkeys don't match in the
foreach_reverse loop?

Correct me if I'm wrong as I've not tested, but I think the new code
in the foreach loop can just become:

if (sort_pushdown_idx == foreach_current_index(l))
{
    Assert(!is_sorted);
    window_pathkeys = orderby_pathkeys;
    is_sorted = pathkeys_count_contained_in(window_pathkeys,
path->pathkeys, &presorted_keys);
}


> I have one doubt regarding runCondition, do we only need to ensure
> that window function which has subset sort clause of main query should
> not have runCondition or none of the window functions should not contain
> runCondition? I have gone with later case but wanted to clarify.

Actually, maybe it's ok just to check the top-level WindowClause for
runConditions. It's only that one that'll filter rows.  That probably
simplifies the code quite a bit. Lower-level runConditions only serve
to halt the evaluation of WindowFuncs when the runCondition is no
longer met.

>
>
> > Also, do you want to have a go at coding up the sort bound pushdown
> > too?  It'll require removing the limitCount restriction and adjusting
> > ExecSetTupleBound() to recurse through a WindowAggState. I think it's
> > pretty easy. You could try it then play around with it to make sure it
> > works and we get the expected performance.
>
> I tried this in the patch but kept getting `retrieved too many tuples in
> a bounded sort`.
>
> Added following code in ExecSetTupleBound which correctly found sortstate
>
> and set bound value.
>
>         else if(IsA(child_node, WindowAggState))
>
>         {
>
>                 WindowAggState  *winstate = (WindowAggState *) child_node;
>
>                 if (outerPlanState(winstate))
>
>                         ExecSetTupleBound(tuples_needed, outerPlanState(winstate));
>
>         }
>
> I think problem is that are not using limit clause inside window
> function (which
> may need to scan all tuples) so passing bound value to
> WindowAggState->sortstate
> is not working as we might expect. Or maybe I am getting it wrong? I was
> trying to
> have top-N sort for limit clause if orderby pushdown happens.

hmm, perhaps the Limit would have to be put between the WindowAgg and
Sort for it to work.  Maybe that's more complexity than it's worth.

David



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: POC: Lock updated tuples in tuple_update() and tuple_delete()
Next
From: Peter Geoghegan
Date:
Subject: Re: Fixing a couple of buglets in how VACUUM sets visibility map bits