Re: Row pattern recognition - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Row pattern recognition
Date
Msg-id 20230728.200230.1800661389729335298.t-ishii@sranhm.sra.co.jp
Whole thread Raw
In response to Re: Row pattern recognition  (Vik Fearing <vik@postgresfriends.org>)
Responses Re: Row pattern recognition
List pgsql-hackers
>> Attached is the v3 patch. In this patch following changes are made.
> 
> Excellent.  Thanks!

You are welcome!

> A few quick comments:
> 
> - PERMUTE is still misspelled as PREMUTE

Oops. Will fix.

> - PATTERN variables do not have to exist in the DEFINE clause.  They are
> - considered TRUE if not present.

Do you think we really need this? I found a criticism regarding this.

https://link.springer.com/article/10.1007/s13222-022-00404-3
"3.2 Explicit Definition of All Row Pattern Variables"

What do you think?

>> - I am working on making window aggregates RPR aware now. The
>>    implementation is in progress and far from completeness. An example
>>    is below. I think row 2, 3, 4 of "count" column should be NULL
>>    instead of 3, 2, 0, 0. Same thing can be said to other
>>    rows. Probably this is an effect of moving aggregate but I still
>>    studying the window aggregation code.
> 
> This tells me again that RPR is not being run in the right place.  All
> windowed aggregates and frame-level window functions should Just Work
> with no modification.

I am not touching each aggregate function. I am modifying
eval_windowaggregates() in nodeWindowAgg.c, which calls each aggregate
function. Do you think it's not the right place to make window
aggregates RPR aware?

>> SELECT company, tdate, first_value(price) OVER W, count(*) OVER w FROM
>> stock
>>   WINDOW w AS (
>>   PARTITION BY company
>>   ORDER BY tdate
>>   ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
>>   AFTER MATCH SKIP PAST LAST ROW
>>   INITIAL
>>   PATTERN (START UP+ DOWN+)
>>   DEFINE
>>    START AS TRUE,
>>    UP AS price > PREV(price),
>>    DOWN AS price < PREV(price)
>> );
>>   company  |   tdate    | first_value | count
>> ----------+------------+-------------+-------
>>   company1 | 2023-07-01 |         100 |     4
>>   company1 | 2023-07-02 |             |     3
>>   company1 | 2023-07-03 |             |     2
>>   company1 | 2023-07-04 |             |     0
>>   company1 | 2023-07-05 |             |     0
>>   company1 | 2023-07-06 |          90 |     4
>>   company1 | 2023-07-07 |             |     3
>>   company1 | 2023-07-08 |             |     2
>>   company1 | 2023-07-09 |             |     0
>>   company1 | 2023-07-10 |             |     0
>>   company2 | 2023-07-01 |          50 |     4
>>   company2 | 2023-07-02 |             |     3
>>   company2 | 2023-07-03 |             |     2
>>   company2 | 2023-07-04 |             |     0
>>   company2 | 2023-07-05 |             |     0
>>   company2 | 2023-07-06 |          60 |     4
>>   company2 | 2023-07-07 |             |     3
>>   company2 | 2023-07-08 |             |     2
>>   company2 | 2023-07-09 |             |     0
>>   company2 | 2023-07-10 |             |     0
> 
> In this scenario, row 1's frame is the first 5 rows and specified SKIP
> PAST LAST ROW, so rows 2-5 don't have *any* frame (because they are
> skipped) and the result of the outer count should be 0 for all of them
> because there are no rows in the frame.

Ok. Just I want to make sure. If it's other aggregates like sum or
avg, the result of the outer aggregates should be NULL.

> When we get to adding count in the MEASURES clause, there will be a
> difference between no match and empty match, but that does not apply
> here.

Can you elaborate more? I understand that "no match" and "empty match"
are different things. But I do not understand how the difference
affects the result of count.

>> I am going to add this thread to CommitFest and plan to add both of
>> you as reviewers. Thanks in advance.
> 
> My pleasure.  Thank you for working on this difficult feature.

Thank you for accepting being registered as a reviewer. Your comments
are really helpful.
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: cataloguing NOT NULL constraints
Next
From: Matthias van de Meent
Date:
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan