Re: Row pattern recognition - Mailing list pgsql-hackers
From | Vik Fearing |
---|---|
Subject | Re: Row pattern recognition |
Date | |
Msg-id | 60651930-70bb-c849-1862-e8f7eb109094@postgresfriends.org |
In response to | Re: Row pattern recognition (Tatsuo Ishii <ishii@sraoss.co.jp>) |
List | pgsql-hackers |
On 7/28/23 09:09, Tatsuo Ishii wrote:
>>> We already recalculate a frame each time a row is processed even
>>> without RPR. See ExecWindowAgg.
>>
>> Yes, after each row. Not for each function.
>
> Ok, I understand now. Closer look at the code, I realized that each
> window function calls update_frameheadpos, which computes the frame
> head position. But actually it checks winstate->framehead_valid and if
> it's already true (probably by other window function), then it does
> nothing.
>
>>> Also RPR always requires a frame option ROWS BETWEEN CURRENT ROW,
>>> which means the frame head is changed each time current row position
>>> changes.
>>
>> Off topic for now: I wonder why this restriction is in place and
>> whether we should respect or ignore it. That is a discussion for
>> another time, though.
>
> My guess is, it is because other than ROWS BETWEEN CURRENT ROW has
> little or no meaning. Consider following example:

Yes, that makes sense.

>>>> I strongly disagree with this. Window function do not need to know
>>>> how the frame is defined, and indeed they should not.
>>> We already break the rule by defining *support functions. See
>>> windowfuncs.c.
>> The support functions don't know anything about the frame, they just
>> know when a window function is monotonically increasing and execution
>> can either stop or be "passed through".
>
> I see following code in window_row_number_support:
>
>     /*
>      * The frame options can always become "ROWS BETWEEN UNBOUNDED
>      * PRECEDING AND CURRENT ROW". row_number() always just increments by
>      * 1 with each row in the partition. Using ROWS instead of RANGE
>      * saves effort checking peer rows during execution.
>      */
>     req->frameOptions = (FRAMEOPTION_NONDEFAULT |
>                          FRAMEOPTION_ROWS |
>                          FRAMEOPTION_START_UNBOUNDED_PRECEDING |
>                          FRAMEOPTION_END_CURRENT_ROW);
>
> I think it not only knows about frame but it even changes the frame
> options. This seems far from "don't know anything about the frame", no?

That's the planner support function. The row_number() function itself
is not even allowed to *have* a frame, per spec. We allow it, but as
you can see from that support function, we completely replace it.

So all of the partition-level window functions are not affected by RPR
anyway.

>> I have two comments about this:
>>
>> It isn't just for convenience, it is for correctness. The window
>> functions do not need to know which rows they are *not* operating on.
>>
>> There is no such thing as a "full" or "reduced" frame. The standard
>> uses those terms to explain the difference between before and after
>> RPR is applied, but window functions do not get to choose which frame
>> they apply over. They only ever apply over the reduced window frame.
>
> I agree that "full window frame" and "reduced window frame" do not
> exist at the same time, and in the end (after computation of reduced
> frame), only "reduced" frame is visible to window
> functions/aggregates. But I still do think that "full window frame"
> and "reduced window frame" are important concept to explain/understand
> how RPR works.

If we are just using those terms for documentation, then okay.

--
Vik Fearing
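
For readers following the frame-head exchange quoted above, here is a minimal sketch of the "compute once per row, reuse for every window function" behavior that the thread attributes to update_frameheadpos and winstate->framehead_valid. It is not PostgreSQL source. The struct and every function name (SketchWinState, sketch_update_frameheadpos, sketch_advance_row, sketch_run) are hypothetical, and the frame is assumed to start at CURRENT ROW so that the head is simply the current row position.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the executor's window state; NOT WindowAggState. */
typedef struct SketchWinState
{
    long currentpos;        /* position of the current row in the partition */
    long frameheadpos;      /* cached frame head for the current row */
    bool framehead_valid;   /* true once the head was computed for this row */
    int  head_computations; /* counts real recomputations, for illustration */
} SketchWinState;

/* Compute the frame head unless an earlier caller already did so for this row. */
void
sketch_update_frameheadpos(SketchWinState *ws)
{
    if (ws->framehead_valid)
        return;             /* another window function got here first */

    /* Assumption: the frame starts at CURRENT ROW, so head == current row. */
    ws->frameheadpos = ws->currentpos;
    ws->framehead_valid = true;
    ws->head_computations++;
}

/* Moving to the next row invalidates the cached head. */
void
sketch_advance_row(SketchWinState *ws)
{
    ws->currentpos++;
    ws->framehead_valid = false;
}

/*
 * Simulate nrows rows, with nfuncs window functions each asking for the
 * frame head on every row. Returns how many times the head was actually
 * recomputed.
 */
int
sketch_run(int nrows, int nfuncs)
{
    SketchWinState ws = {0, 0, false, 0};

    for (int r = 0; r < nrows; r++)
    {
        for (int f = 0; f < nfuncs; f++)
            sketch_update_frameheadpos(&ws);
        sketch_advance_row(&ws);
    }
    return ws.head_computations;
}
```

Under these assumptions, sketch_run(5, 3) recomputes the head 5 times rather than 15: once after each row, not once for each function.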