Thread: Allow WindowFuncs prosupport function to use more optimal WindowClause options

Allow WindowFuncs prosupport function to use more optimal WindowClause options

From
David Rowley
Date:
Over on [1], Erwin mentions that row_number() over (ORDER BY ... ROWS
UNBOUNDED PRECEDING) is substantially faster than the default RANGE
UNBOUNDED PRECEDING WindowClause options.  The difference between
these options are that nodeWindowAgg.c checks for peer rows for RANGE
but not for ROWS.  That would make a difference for many of our
built-in window and aggregate functions, but row_number() does not
really care.

To quantify the performance difference, take the following example:

create table ab (a int, b int) ;
insert into ab
select x,y from generate_series(1,1000)x,generate_series(1,1000)y;
create index on ab(a,b);

--  range unbounded (the default)
explain (analyze, costs off, timing off)
select a,b from (select a,b,row_number() over (partition by a order by
a range unbounded preceding) rn from ab) ab where rn <= 1;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Subquery Scan on ab (actual rows=1000 loops=1)
   ->  WindowAgg (actual rows=1000 loops=1)
         Run Condition: (row_number() OVER (?) <= 1)
         ->  Index Only Scan using ab_a_b_idx on ab ab_1 (actual
rows=1000000 loops=1)
               Heap Fetches: 1000000
 Planning Time: 0.091 ms
 Execution Time: 336.172 ms
(7 rows)

If that were switched to use ROWS UNBOUNDED PRECEDING then the
executor would not have to check for peer rows in the window frame.

explain (analyze, costs off, timing off)
select a,b from (select a,b,row_number() over (partition by a order by
a rows unbounded preceding) rn from ab) ab where rn <= 1;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Subquery Scan on ab (actual rows=1000 loops=1)
   ->  WindowAgg (actual rows=1000 loops=1)
         Run Condition: (row_number() OVER (?) <= 1)
         ->  Index Only Scan using ab_a_b_idx on ab ab_1 (actual
rows=1000000 loops=1)
               Heap Fetches: 0
 Planning Time: 0.178 ms
 Execution Time: 75.101 ms
(7 rows)

Time: 75.673 ms (447% faster)

You can see that this executes quite a bit more quickly. As Erwin
pointed out to me (off-list), this along with the monotonic window
function optimisation that was added in PG15 the performance of this
gets close to DISTINCT ON.

explain (analyze, costs off, timing off)
select distinct on (a) a,b from ab order by a,b;
                                 QUERY PLAN
----------------------------------------------------------------------------
 Unique (actual rows=1000 loops=1)
   ->  Index Only Scan using ab_a_b_idx on ab (actual rows=1000000 loops=1)
         Heap Fetches: 0
 Planning Time: 0.071 ms
 Execution Time: 77.988 ms
(5 rows)

I've not really done any analysis into which other window functions
can use this optimisation. The attached only adds support to
row_number()'s support function and only converts exactly "RANGE
UNBOUNDED PRECEDING AND CURRENT ROW" into "ROW UNBOUNDED PRECEDING AND
CURRENT ROW".  That might need to be relaxed a little, but I've done
no analysis to find that out.   Erwin mentioned to me that he's not
currently in a position to produce a patch for this, so here's the
patch. I'm hoping the existence of this might coax Erwin into doing
some analysis into what other window functions we can support and what
other frame options can be optimised.

[1] https://postgr.es/m/CAGHENJ7LBBszxS+SkWWFVnBmOT2oVsBhDMB1DFrgerCeYa_DyA@mail.gmail.com

Attachment
On 10/12/22 04:40, David Rowley wrote:
> I've not really done any analysis into which other window functions
> can use this optimisation. The attached only adds support to
> row_number()'s support function and only converts exactly "RANGE
> UNBOUNDED PRECEDING AND CURRENT ROW" into "ROW UNBOUNDED PRECEDING AND
> CURRENT ROW".  That might need to be relaxed a little, but I've done
> no analysis to find that out. 

Per spec, the ROW_NUMBER() window function is not even allowed to have a 
frame specified.

     b) The window framing clause of WDX shall not be present.

Also, the specification for ROW_NUMBER() is:

     f) ROW_NUMBER() OVER WNS is equivalent to the <window function>:

         COUNT (*) OVER (WNS1 ROWS UNBOUNDED PRECEDING)


So I don't think we need to test for anything at all and can 
indiscriminately add or replace the frame with ROWS UNBOUNDED PRECEDING.
-- 
Vik Fearing




Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options

From
Erwin Brandstetter
Date:

On Wed, 12 Oct 2022 at 05:33, Vik Fearing <vik@postgresfriends.org> wrote:
On 10/12/22 04:40, David Rowley wrote:
> I've not really done any analysis into which other window functions
> can use this optimisation. The attached only adds support to
> row_number()'s support function and only converts exactly "RANGE
> UNBOUNDED PRECEDING AND CURRENT ROW" into "ROW UNBOUNDED PRECEDING AND
> CURRENT ROW".  That might need to be relaxed a little, but I've done
> no analysis to find that out.

Per spec, the ROW_NUMBER() window function is not even allowed to have a
frame specified.

     b) The window framing clause of WDX shall not be present.

Also, the specification for ROW_NUMBER() is:

     f) ROW_NUMBER() OVER WNS is equivalent to the <window function>:

         COUNT (*) OVER (WNS1 ROWS UNBOUNDED PRECEDING)


So I don't think we need to test for anything at all and can
indiscriminately add or replace the frame with ROWS UNBOUNDED PRECEDING.


To back this up:
SQL Server returns an error right away if you  try to add a window frame
https://dbfiddle.uk/SplT-F3E

> Msg 10752 Level 15 State 3 Line 1
> The function 'row_number' may not have a window frame.

And Oracle reports a syntax error:

row_number() is defined without a "windowing clause" (in Oravle's nomenclature)

Allowing the same in Postgres (and defaulting to RANGE mode) seems like (a) genuine bug(s) after all.

Regards
Erwin
On Wed, 12 Oct 2022 at 16:33, Vik Fearing <vik@postgresfriends.org> wrote:
> Per spec, the ROW_NUMBER() window function is not even allowed to have a
> frame specified.
>
>      b) The window framing clause of WDX shall not be present.
>
> Also, the specification for ROW_NUMBER() is:
>
>      f) ROW_NUMBER() OVER WNS is equivalent to the <window function>:
>
>          COUNT (*) OVER (WNS1 ROWS UNBOUNDED PRECEDING)
>
>
> So I don't think we need to test for anything at all and can
> indiscriminately add or replace the frame with ROWS UNBOUNDED PRECEDING.

Thanks for digging that out.

Just above that I see:

RANK() OVER WNS is equivalent to:
( COUNT (*) OVER (WNS1 RANGE UNBOUNDED PRECEDING)
- COUNT (*) OVER (WNS1 RANGE CURRENT ROW) + 1 )

and

DENSE_RANK() OVER WNS is equivalent to the <window function>:
COUNT (DISTINCT ROW ( VE1, ..., VEN ) )
OVER (WNS1 RANGE UNBOUNDED PRECEDING)

So it looks like the same can be done for rank() and dense_rank() too.
I've added support for those in the attached.

This also got me thinking that maybe we should be a bit more generic
with the support function node tag name. After looking at the
nodeWindowAgg.c code for a while, I wondered if we might want to add
some optimisations in the future that makes WindowAgg not bother
storing tuples for row_number(), rank() and dense_rank().  That might
save a bit of overhead from the tuple store.  I imagined that we'd
want to allow the expansion of this support request so that the
support function could let the planner know if any tuples will be
accessed by the window function or not.  The
SupportRequestWFuncOptimizeFrameOpts name didn't seem very fitting for
that so I adjusted it to become SupportRequestOptimizeWindowClause
instead.

The updated patch is attached.

David

Attachment


On Wed, Oct 12, 2022 at 5:35 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 12 Oct 2022 at 16:33, Vik Fearing <vik@postgresfriends.org> wrote:
> Per spec, the ROW_NUMBER() window function is not even allowed to have a
> frame specified.
>
>      b) The window framing clause of WDX shall not be present.
>
> Also, the specification for ROW_NUMBER() is:
>
>      f) ROW_NUMBER() OVER WNS is equivalent to the <window function>:
>
>          COUNT (*) OVER (WNS1 ROWS UNBOUNDED PRECEDING)
>
>
> So I don't think we need to test for anything at all and can
> indiscriminately add or replace the frame with ROWS UNBOUNDED PRECEDING.

Thanks for digging that out.

Just above that I see:

RANK() OVER WNS is equivalent to:
( COUNT (*) OVER (WNS1 RANGE UNBOUNDED PRECEDING)
- COUNT (*) OVER (WNS1 RANGE CURRENT ROW) + 1 )

and

DENSE_RANK() OVER WNS is equivalent to the <window function>:
COUNT (DISTINCT ROW ( VE1, ..., VEN ) )
OVER (WNS1 RANGE UNBOUNDED PRECEDING)

So it looks like the same can be done for rank() and dense_rank() too.
I've added support for those in the attached.

This also got me thinking that maybe we should be a bit more generic
with the support function node tag name. After looking at the
nodeWindowAgg.c code for a while, I wondered if we might want to add
some optimisations in the future that makes WindowAgg not bother
storing tuples for row_number(), rank() and dense_rank().  That might
save a bit of overhead from the tuple store.  I imagined that we'd
want to allow the expansion of this support request so that the
support function could let the planner know if any tuples will be
accessed by the window function or not.  The
SupportRequestWFuncOptimizeFrameOpts name didn't seem very fitting for
that so I adjusted it to become SupportRequestOptimizeWindowClause
instead.

The updated patch is attached.

David
Hi,

+       req->frameOptions = (FRAMEOPTION_ROWS |
+                            FRAMEOPTION_START_UNBOUNDED_PRECEDING |
+                            FRAMEOPTION_END_CURRENT_ROW);

The bit combination appears multiple times in the patch.
Maybe define the combination as a constant in supportnodes.h and reference it in the code.

Cheers

Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options

From
Erwin Brandstetter
Date:
On Thu, 13 Oct 2022 at 02:34, David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 12 Oct 2022 at 16:33, Vik Fearing <vik@postgresfriends.org> wrote:
> Per spec, the ROW_NUMBER() window function is not even allowed to have a
> frame specified.
>
>      b) The window framing clause of WDX shall not be present.
>
> Also, the specification for ROW_NUMBER() is:
>
>      f) ROW_NUMBER() OVER WNS is equivalent to the <window function>:
>
>          COUNT (*) OVER (WNS1 ROWS UNBOUNDED PRECEDING)
>
>
> So I don't think we need to test for anything at all and can
> indiscriminately add or replace the frame with ROWS UNBOUNDED PRECEDING.

Thanks for digging that out.

Just above that I see:

RANK() OVER WNS is equivalent to:
( COUNT (*) OVER (WNS1 RANGE UNBOUNDED PRECEDING)
- COUNT (*) OVER (WNS1 RANGE CURRENT ROW) + 1 )

and

DENSE_RANK() OVER WNS is equivalent to the <window function>:
COUNT (DISTINCT ROW ( VE1, ..., VEN ) )
OVER (WNS1 RANGE UNBOUNDED PRECEDING)

So it looks like the same can be done for rank() and dense_rank() too.
I've added support for those in the attached.

This also got me thinking that maybe we should be a bit more generic
with the support function node tag name. After looking at the
nodeWindowAgg.c code for a while, I wondered if we might want to add
some optimisations in the future that makes WindowAgg not bother
storing tuples for row_number(), rank() and dense_rank().  That might
save a bit of overhead from the tuple store.  I imagined that we'd
want to allow the expansion of this support request so that the
support function could let the planner know if any tuples will be
accessed by the window function or not.  The
SupportRequestWFuncOptimizeFrameOpts name didn't seem very fitting for
that so I adjusted it to become SupportRequestOptimizeWindowClause
instead.

The updated patch is attached.
David

I am thinking of building a test case to run
- all existing window functions
- with all basic variants of frame definitions
- once with ROWS, once with RANGE
- on basic table that has duplicate and NULL values in partition and ordering columns
- in all supported major versions

To verify for which of our window functions ROWS vs. RANGE never makes a difference.
That should be obvious in most cases, just to be sure.

Do you think this would be helpful?

Regards
Erwin
 
Erwin Brandstetter <brsaweda@gmail.com> writes:
> I am thinking of building a test case to run
> - all existing window functions
> - with all basic variants of frame definitions
> - once with ROWS, once with RANGE
> - on basic table that has duplicate and NULL values in partition and
> ordering columns
> - in all supported major versions

> To verify for which of our window functions ROWS vs. RANGE never makes a
> difference.
> That should be obvious in most cases, just to be sure.

> Do you think this would be helpful?

Doubt it.  Per the old saying "testing can prove the presence of bugs,
but not their absence", this could prove that some functions *do*
respond to these options, but it cannot prove that a function
*doesn't*.  Maybe you just didn't try the right test case.

If you want to try something like that as a heuristic to see which
cases are worth looking at closer, sure, but it's only a heuristic.

            regards, tom lane



On Tue, 18 Oct 2022 at 12:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Erwin Brandstetter <brsaweda@gmail.com> writes:
> > I am thinking of building a test case to run
> > - all existing window functions
> > - with all basic variants of frame definitions
> > - once with ROWS, once with RANGE
> > - on basic table that has duplicate and NULL values in partition and
> > ordering columns
> > - in all supported major versions
>
> > To verify for which of our window functions ROWS vs. RANGE never makes a
> > difference.
> > That should be obvious in most cases, just to be sure.
>
> > Do you think this would be helpful?
>
> Doubt it.  Per the old saying "testing can prove the presence of bugs,
> but not their absence", this could prove that some functions *do*
> respond to these options, but it cannot prove that a function
> *doesn't*.  Maybe you just didn't try the right test case.

I suppose this is kind of like fuzz testing.  Going by "git log
--grep=sqlsmith", fuzzing certainly has found bugs for us in the past.
I personally wouldn't discourage Erwin from doing this.

For me, my first port of call will be to study the code of each window
function to see if the frame options can affect the result. I *do*
need to spend more time on this still. It would be good to have some
extra assurance on having read the code with some more exhaustive
testing results. If Erwin was to find result variations that I missed
then we might avoid writing some new bugs.

Also, I just did spend a little more time reading a few window
functions and I see percent_rank() is another candidate for this
optimisation. I've never needed to use that function before, but from
the following experiment, it seems to just be (rank() over (order by
...) - 1) / (count(*) over () - 1). Since rank() is already on the
list and count(*) over() contains all rows in the frame, then it seems
percent_rank() can join the club too.

create table t0 as select x*random() as a from generate_series(1,1000000)x;
select * from (select a,percent_rank() over (order by a) pr,(rank()
over (order by a) - 1) / (count(*) over () - 1)::float8 pr2  from t0)
c where pr <> pr2;

David



Thanks for having a look at this.

On Fri, 14 Oct 2022 at 10:52, Zhihong Yu <zyu@yugabyte.com> wrote:
> +       req->frameOptions = (FRAMEOPTION_ROWS |
> +                            FRAMEOPTION_START_UNBOUNDED_PRECEDING |
> +                            FRAMEOPTION_END_CURRENT_ROW);
>
> The bit combination appears multiple times in the patch.
> Maybe define the combination as a constant in supportnodes.h and reference it in the code.

I don't believe supportnodes.h has any business having any code that's
related to actual implementations of the support request type.  If we
were to have such a definition then I think it would belong in
windowfuncs.c.  I'd rather see each implementation of the support
request spell out exactly what they mean, which is what the patch does
already.

David





On Mon, Oct 17, 2022 at 5:05 PM David Rowley <dgrowleyml@gmail.com> wrote:
Thanks for having a look at this.

On Fri, 14 Oct 2022 at 10:52, Zhihong Yu <zyu@yugabyte.com> wrote:
> +       req->frameOptions = (FRAMEOPTION_ROWS |
> +                            FRAMEOPTION_START_UNBOUNDED_PRECEDING |
> +                            FRAMEOPTION_END_CURRENT_ROW);
>
> The bit combination appears multiple times in the patch.
> Maybe define the combination as a constant in supportnodes.h and reference it in the code.

I don't believe supportnodes.h has any business having any code that's
related to actual implementations of the support request type.  If we
were to have such a definition then I think it would belong in
windowfuncs.c.  I'd rather see each implementation of the support
request spell out exactly what they mean, which is what the patch does
already.

David
Hi,
I am fine with keeping the code where it is now.

Cheers
On Thu, 13 Oct 2022 at 13:34, David Rowley <dgrowleyml@gmail.com> wrote:
> So it looks like the same can be done for rank() and dense_rank() too.
> I've added support for those in the attached.

The attached adds support for percent_rank(), cume_dist() and ntile().

David

Attachment
On 10/20/22 22:02, David Rowley wrote:
> On Thu, 13 Oct 2022 at 13:34, David Rowley <dgrowleyml@gmail.com> wrote:
>> So it looks like the same can be done for rank() and dense_rank() too.
>> I've added support for those in the attached.
> 
> The attached adds support for percent_rank(), cume_dist() and ntile().


Shouldn't it be able to detect that these two windows are the same and 
only do one WindowAgg pass?


explain (verbose, costs off)
select row_number() over w1,
        lag(amname) over w2
from pg_am
window w1 as (order by amname),
        w2 as (w1 rows unbounded preceding)
;


                            QUERY PLAN
-----------------------------------------------------------------
  WindowAgg
    Output: (row_number() OVER (?)), lag(amname) OVER (?), amname
    ->  WindowAgg
          Output: amname, row_number() OVER (?)
          ->  Sort
                Output: amname
                Sort Key: pg_am.amname
                ->  Seq Scan on pg_catalog.pg_am
                      Output: amname
(9 rows)

-- 
Vik Fearing




On Sun, 23 Oct 2022 at 03:03, Vik Fearing <vik@postgresfriends.org> wrote:
> Shouldn't it be able to detect that these two windows are the same and
> only do one WindowAgg pass?
>
>
> explain (verbose, costs off)
> select row_number() over w1,
>         lag(amname) over w2
> from pg_am
> window w1 as (order by amname),
>         w2 as (w1 rows unbounded preceding)
> ;

Good thinking. I think the patch should also optimise that case. It
requires re-doing a similar de-duplication phase the same as what's
done in transformWindowFuncCall().  I've added code to do that in the
attached version.

This got me wondering if the support function, instead of returning
some more optimal versions of the frameOptions, I wondered if it
should just return which aspects of the WindowClause it does not care
about.  For example,

SELECT row_number() over (), lag(relname) over (order by relname)
from pg_class;

could, in theory, have row_number() reuse the WindowAgg for lag.  Here
because the WindowClause for row_number() has an empty ORDER BY
clause, I believe it could just reuse the lag's WindowClause. It
wouldn't be able to do that if row_number() had an ORDER BY, or if
row_number() were some other WindowFunc that cared about peer rows.
I'm currently thinking this might not be worth the trouble as it seems
a bit unlikely that someone would use row_number() and not care about
the ORDER BY. However, maybe the row_number() could reuse some other
WindowClause with a more strict ordering. My current thoughts are that
this feels a bit too unlikely to apply in enough cases for it to be
worthwhile. I just thought I'd mention it for the sake of the
archives.

David


Thanks for taking it for a spin.

David

Attachment
On Wed, 26 Oct 2022 at 14:38, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sun, 23 Oct 2022 at 03:03, Vik Fearing <vik@postgresfriends.org> wrote:
> > Shouldn't it be able to detect that these two windows are the same and
> > only do one WindowAgg pass?
> >
> >
> > explain (verbose, costs off)
> > select row_number() over w1,
> >         lag(amname) over w2
> > from pg_am
> > window w1 as (order by amname),
> >         w2 as (w1 rows unbounded preceding)
> > ;
>
> Good thinking. I think the patch should also optimise that case. It
> requires re-doing a similar de-duplication phase the same as what's
> done in transformWindowFuncCall().  I've added code to do that in the
> attached version.

I've spent a bit more time on this now and added a few extra
regression tests.  The previous version had nothing to test to ensure
that an aggregate function being used as a window function does not
have its frame options changed when it's sharing the same WindowClause
as a WindowFunc which can have the frame options changed.

I've now pushed the final result.  Thank you to everyone who provided
input on this.

David



On 12/23/22 00:47, David Rowley wrote:
> On Wed, 26 Oct 2022 at 14:38, David Rowley <dgrowleyml@gmail.com> wrote:
> 
> I've now pushed the final result.  Thank you to everyone who provided
> input on this.

This is a very good improvement.  Thank you for working on it.
-- 
Vik Fearing