Re: Bug in row_number() optimization - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Bug in row_number() optimization
Date
Msg-id CAMbWs49DaFZViFVD-SwdWxk3JrWvyRq9j+_oopiAZKxM0eEZCg@mail.gmail.com
Whole thread Raw
In response to Re: Bug in row_number() optimization  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Bug in row_number() optimization
List pgsql-hackers

On Fri, Nov 25, 2022 at 11:26 AM David Rowley <dgrowleyml@gmail.com> wrote:
There are two different pass-through modes that the WindowAgg can move
into when it detects that the run condition is no longer true:

1) WINDOWAGG_PASSTHROUGH and
2) WINDOWAGG_PASSTHROUGH_STRICT

#2 is used when the WindowAgg is the top-level one in this query
level. Remember we'll need multiple WindowAgg nodes when there are
multiple different windows to evaluate.  The reason that we need #1 is
that if there are multiple WindowAggs, then the top-level one (or just
any WindowAgg above it) might need all the rows from a lower-level
WindowAgg. 
 
Thanks for the explanation!  I think now I understand pass-through modes
much better.
 
What I mean by "WindowAggs above us cannot reference the result of
another WindowAgg" is that the evaluation of sum(qty) over (order by
date) can't see the "rn" column. SQL does not allow it. 
 
I think I get your point.  Yeah, the 'rn' column is not needed for the
evaluation of WindowAggs above.  But ISTM it might be needed to evaluate
the quals of WindowAggs above.  Such as in the plan below

explain (costs off) SELECT * FROM
   (SELECT
           count(salary) OVER (PARTITION BY depname || '') c1, -- w1
           row_number() OVER (PARTITION BY depname) rn -- w2
    FROM empsalary
) e WHERE rn <= 1;
                            QUERY PLAN
-------------------------------------------------------------------
 Subquery Scan on e
   ->  WindowAgg
         Filter: ((row_number() OVER (?)) <= 1)
         ->  Sort
               Sort Key: (((empsalary.depname)::text || ''::text))
               ->  WindowAgg
                     Run Condition: (row_number() OVER (?) <= 1)
                     ->  Sort
                           Sort Key: empsalary.depname
                           ->  Seq Scan on empsalary
(10 rows)

The 'rn' column is calculated in the lower-level WindowAgg, and it is
needed to evaluate the 'Filter' of the upper-level WindowAgg.  In
pass-through mode, the lower-level WindowAgg would not be evaluated any
more, so we need to mark the 'rn' column to something that can false the
'Filter'.  Considering the 'Filter' is a strict function, marking it as
NULL would do.  I think this is why this patch works.
 
Just thinking of the patch a bit more, what I wrote ends up
continually zeroing the values and marking the columns as NULL. Likely
we can just do this once when we do: winstate->status =
WINDOWAGG_PASSTHROUGH; 
 
Yes, I also think we can do this only once when we go into pass-through
mode.

Thanks
Richard

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: xml2: add test for coverage
Next
From: Dean Rasheed
Date:
Subject: Re: Bug in MERGE's test for tables with rules