Re: Row pattern recognition - Mailing list pgsql-hackers

From Vik Fearing
Subject Re: Row pattern recognition
Msg-id 60651930-70bb-c849-1862-e8f7eb109094@postgresfriends.org
In response to Re: Row pattern recognition  (Tatsuo Ishii <ishii@sraoss.co.jp>)
List pgsql-hackers
On 7/28/23 09:09, Tatsuo Ishii wrote:
>>> We already recalculate a frame each time a row is processed even
>>> without RPR. See ExecWindowAgg.
>>
>> Yes, after each row.  Not for each function.
> 
> Ok, I understand now. On a closer look at the code, I realized that each
> window function calls update_frameheadpos, which computes the frame
> head position. But actually it checks winstate->framehead_valid and if
> it's already true (probably set by another window function), then it
> does nothing.
> 
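For illustration only, the caching behaviour described above can be sketched in a few lines of standalone C. This is not the PostgreSQL source: DemoWinState, demo_update_frameheadpos, demo_advance_row and demo_count_recomputations are hypothetical names, and the frame head is simply assumed to sit at the current row. Only the idea of a "valid" flag that is cleared once per row and checked by every caller is taken from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the window state. */
typedef struct DemoWinState
{
    long currentpos;      /* position of the current row */
    long frameheadpos;    /* cached frame head position */
    bool framehead_valid; /* is frameheadpos valid for the current row? */
    int  recomputations;  /* number of real recomputations (demo only) */
} DemoWinState;

/* Called by every window function; does real work at most once per row. */
void demo_update_frameheadpos(DemoWinState *ws)
{
    assert(ws != NULL);
    if (ws->framehead_valid)
        return;                        /* another function already did it */
    ws->frameheadpos = ws->currentpos; /* assumption: frame starts at CURRENT ROW */
    ws->framehead_valid = true;
    ws->recomputations++;
}

/* Moving to the next row invalidates the cached frame head. */
void demo_advance_row(DemoWinState *ws)
{
    assert(ws != NULL);
    ws->currentpos++;
    ws->framehead_valid = false;
}

/* Run nfuncs window functions over nrows rows; return the recomputation count. */
int demo_count_recomputations(int nrows, int nfuncs)
{
    DemoWinState ws = {0, 0, false, 0};

    for (int row = 0; row < nrows; row++)
    {
        for (int f = 0; f < nfuncs; f++)
            demo_update_frameheadpos(&ws);
        demo_advance_row(&ws);
    }
    return ws.recomputations;
}
```

In this sketch the recomputation count depends only on the number of rows, not on the number of window functions, which is the "after each row, not for each function" point.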
>>> Also RPR always requires a frame option ROWS BETWEEN CURRENT ROW,
>>> which means the frame head is changed each time current row position
>>> changes.
>>
>> Off topic for now: I wonder why this restriction is in place and
>> whether we should respect or ignore it.  That is a discussion for
>> another time, though.
> 
> My guess is that anything other than ROWS BETWEEN CURRENT ROW has
> little or no meaning. Consider the following example:

Yes, that makes sense.

>>>> I strongly disagree with this.  Window functions do not need to know
>>>> how the frame is defined, and indeed they should not.
>>> We already break the rule by defining *support functions. See
>>> windowfuncs.c.
>> The support functions don't know anything about the frame, they just
>> know when a window function is monotonically increasing and execution
>> can either stop or be "passed through".
> 
> I see the following code in window_row_number_support:
> 
>         /*
>          * The frame options can always become "ROWS BETWEEN UNBOUNDED
>          * PRECEDING AND CURRENT ROW".  row_number() always just increments by
>          * 1 with each row in the partition.  Using ROWS instead of RANGE
>          * saves effort checking peer rows during execution.
>          */
>         req->frameOptions = (FRAMEOPTION_NONDEFAULT |
>                              FRAMEOPTION_ROWS |
>                              FRAMEOPTION_START_UNBOUNDED_PRECEDING |
>                              FRAMEOPTION_END_CURRENT_ROW);
> 
> I think it not only knows about the frame but even changes the frame
> options. This seems far from "don't know anything about the frame", no?

That's the planner support function.  The row_number() function itself 
is not even allowed to *have* a frame, per spec.  We allow it, but as 
you can see from that support function, we completely replace it.

So none of the partition-level window functions are affected by RPR 
anyway.
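
As a rough sketch of what "completely replace" means here, the following standalone C mimics a planner-side hook that overwrites whatever frame the user wrote. The DEMO_* flag values, DemoOptimizeRequest and demo_row_number_support are made up for this example and are not the definitions from the PostgreSQL headers; only the shape of the assignment follows the snippet quoted above.

```c
/* Illustrative flag values, local to this sketch (not PostgreSQL's). */
#define DEMO_FRAMEOPTION_NONDEFAULT                0x01
#define DEMO_FRAMEOPTION_RANGE                     0x02
#define DEMO_FRAMEOPTION_ROWS                      0x04
#define DEMO_FRAMEOPTION_START_UNBOUNDED_PRECEDING 0x08
#define DEMO_FRAMEOPTION_END_CURRENT_ROW           0x10
#define DEMO_FRAMEOPTION_END_UNBOUNDED_FOLLOWING   0x20

/* Hypothetical request object handed to a planner support function. */
typedef struct DemoOptimizeRequest
{
    int frameOptions; /* in: frame as written by the user; out: frame to use */
} DemoOptimizeRequest;

/*
 * Planner-side hook: the incoming frame is not inspected at all, it is
 * simply overwritten.  The executor-side function never sees the original.
 */
void demo_row_number_support(DemoOptimizeRequest *req)
{
    req->frameOptions = (DEMO_FRAMEOPTION_NONDEFAULT |
                         DEMO_FRAMEOPTION_ROWS |
                         DEMO_FRAMEOPTION_START_UNBOUNDED_PRECEDING |
                         DEMO_FRAMEOPTION_END_CURRENT_ROW);
}
```

Whatever frame goes in, the same ROWS frame comes out, so in this sketch the user-specified frame has no influence on the result.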

>> I have two comments about this:
>>
>> It isn't just for convenience, it is for correctness.  The window
>> functions do not need to know which rows they are *not* operating on.
>>
>> There is no such thing as a "full" or "reduced" frame.  The standard
>> uses those terms to explain the difference between before and after
>> RPR is applied, but window functions do not get to choose which frame
>> they apply over.  They only ever apply over the reduced window frame.
> 
> I agree that "full window frame" and "reduced window frame" do not
> exist at the same time, and in the end (after computation of reduced
> frame), only "reduced" frame is visible to window
> functions/aggregates. But I still do think that "full window frame"
> and "reduced window frame" are important concepts to explain/understand
> how RPR works.

If we are just using those terms for documentation, then okay.
-- 
Vik Fearing



