Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options - Mailing list pgsql-hackers

From Erwin Brandstetter
Subject Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options
Date
Msg-id CAGHENJ7oNWBKWLn-+qte4NujhakHdxsaq4KTC_zskRvLpNjszQ@mail.gmail.com
Whole thread Raw
In response to Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, 13 Oct 2022 at 02:34, David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 12 Oct 2022 at 16:33, Vik Fearing <vik@postgresfriends.org> wrote:
> Per spec, the ROW_NUMBER() window function is not even allowed to have a
> frame specified.
>
>      b) The window framing clause of WDX shall not be present.
>
> Also, the specification for ROW_NUMBER() is:
>
>      f) ROW_NUMBER() OVER WNS is equivalent to the <window function>:
>
>          COUNT (*) OVER (WNS1 ROWS UNBOUNDED PRECEDING)
>
>
> So I don't think we need to test for anything at all and can
> indiscriminately add or replace the frame with ROWS UNBOUNDED PRECEDING.

Thanks for digging that out.

Just above that I see:

RANK() OVER WNS is equivalent to:
( COUNT (*) OVER (WNS1 RANGE UNBOUNDED PRECEDING)
- COUNT (*) OVER (WNS1 RANGE CURRENT ROW) + 1 )

and

DENSE_RANK() OVER WNS is equivalent to the <window function>:
COUNT (DISTINCT ROW ( VE1, ..., VEN ) )
OVER (WNS1 RANGE UNBOUNDED PRECEDING)

So it looks like the same can be done for rank() and dense_rank() too.
I've added support for those in the attached.

This also got me thinking that maybe we should be a bit more generic
with the support function node tag name. After looking at the
nodeWindowAgg.c code for a while, I wondered if we might want to add
some optimisations in the future that makes WindowAgg not bother
storing tuples for row_number(), rank() and dense_rank().  That might
save a bit of overhead from the tuple store.  I imagined that we'd
want to allow the expansion of this support request so that the
support function could let the planner know if any tuples will be
accessed by the window function or not.  The
SupportRequestWFuncOptimizeFrameOpts name didn't seem very fitting for
that so I adjusted it to become SupportRequestOptimizeWindowClause
instead.

The updated patch is attached.
David

I am thinking of building a test case to run
- all existing window functions
- with all basic variants of frame definitions
- once with ROWS, once with RANGE
- on basic table that has duplicate and NULL values in partition and ordering columns
- in all supported major versions

To verify for which of our window functions ROWS vs. RANGE never makes a difference.
That should be obvious in most cases, just to be sure.

Do you think this would be helpful?

Regards
Erwin
 

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: havingQual vs hasHavingQual buglets
Next
From: Tom Lane
Date:
Subject: Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options