Re: Row pattern recognition - Mailing list pgsql-hackers
| From | Henson Choi |
|---|---|
| Subject | Re: Row pattern recognition |
| Date | |
| Msg-id | CAAAe_zCQkrPyZ86QFwmcM7Jq8anUCJWNXsSGtDNVWhWnrpOW9A@mail.gmail.com Whole thread Raw |
| In response to | Re: Row pattern recognition (Tatsuo Ishii <ishii@postgresql.org>) |
| Responses |
Re: Row pattern recognition
|
| List | pgsql-hackers |
Hi Ishii-san,
Thank you for the quick feedback and helpful spec clarifications!
2026년 1월 3일 (토) AM 10:05, Tatsuo Ishii <ishii@postgresql.org>님이 작성:
Hi Henson,
Thank you for the detailed design documentation. While learning the
document, just a quick comment and question.
> Hi hackers,
>
> I've been working on an alternative implementation approach for Row
> Pattern Recognition (RPR) that addresses some limitations in the
> current regex-based patch.
>
> Key differences from the existing approach:
>
> 1. Streaming NFA instead of regex engine
> - Process rows incrementally without accumulating encoded strings
> - Avoids O(V^N) combinatorial explosion in multi-variable scenarios
Sounds good. As you already pointed out, current approach has a memory
space problem. If Streaming NFA could solve the issue, it would be
really good.
> 2. No 26-variable limit
> - Variables identified by index, not alphabet encoding
Sounds good.
> 3. Proper Lexical Order support
> - Respects PATTERN alternative order for ONE ROW PER MATCH
Here do you have "ONE ROW PER MATCH" option of RPR in your mind? In my
understanding it's in MATCH_RECOGNIZE clause, but not in RPR in Window
clause. (ALL ROWS PER MATCH is not applicable either in RPR in
Window clause).
You're absolutely right - thank you for the correction!
I'm currently exploring what this NFA structure could theoretically
support, rather than strictly binding to the spec yet. Since I'm
still learning from Oracle manuals and your code without access to
the full SQL:2016 standard, I mixed up MATCH_RECOGNIZE syntax with
Window clause RPR - sorry for the confusion!
Once I have a clearer understanding of the spec, I'll properly
define how it integrates with PostgreSQL's grammar.
Your spec knowledge is invaluable as I continue learning. Thank you!
I'm currently exploring what this NFA structure could theoretically
support, rather than strictly binding to the spec yet. Since I'm
still learning from Oracle manuals and your code without access to
the full SQL:2016 standard, I mixed up MATCH_RECOGNIZE syntax with
Window clause RPR - sorry for the confusion!
Once I have a clearer understanding of the spec, I'll properly
define how it integrates with PostgreSQL's grammar.
Your spec knowledge is invaluable as I continue learning. Thank you!
> 4. GREEDY/RELUCTANT quantifier support
> - Longest vs shortest match semantics
Sounds good.
> 5. Incremental MEASURES computation
> - Aggregate values computed during matching, no rescan needed
>
> The attached document describes the design in detail, including:
> - Data structures (Pattern, MatchState, MatchContext)
> - Execution flow and state transitions
> - Context absorption optimization for greedy patterns
> - AST optimization passes
>
> This is still a work in progress. I'd appreciate any feedback on
> the approach before proceeding with PostgreSQL integration.
I understand you are still in the design phase and sorry for jumping
into an implementation question. but If you have an idea, please
advice:
Not at all - implementation questions are valuable advice that
help catch what I might miss in the design phase!
help catch what I might miss in the design phase!
How do you handle '{' and '}' in the PATTERN clause in the raw parser?
I am not a parser expert but it seems it's not easy to handle '{' and
'}' in gram.y.
You're right - it's not straightforward. Both gram.y and scan.l
need to be modified together.
I've attached a rough patch showing one possible approach. However,
since the OR handling has also been modified in SQL/PGQ, there will
likely be conflicts that need to be resolved when integrating.
need to be modified together.
I've attached a rough patch showing one possible approach. However,
since the OR handling has also been modified in SQL/PGQ, there will
likely be conflicts that need to be resolved when integrating.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
Attachment
pgsql-hackers by date: