Re: Row pattern recognition - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Row pattern recognition
Date
Msg-id 20260118.210929.125236923268769711.ishii@postgresql.org
Whole thread Raw
In response to Re: Row pattern recognition  (Henson Choi <assam258@gmail.com>)
List pgsql-hackers
Hi Henson,

>  Attached is the patch with the following changes:
>
> 1. Removed 26 variables limit and fixed count overflow:
>    - Deleted defineInitial field from WindowClause (parsenodes.h),
>      WindowAgg (plannodes.h), and related code in createplan.c and
> parse_rpr.c
>    - Now uses RPR_VARID_MAX (252) constant for the limit check
>    - Fixed NFA count overflow issue:
>      * Changed RPRNFAState.counts from int16 to int32 (execnodes.h)
>      * Added RPR_COUNT_MAX (INT32_MAX) constant (rpr.h)
>      * Handles pattern matching failures correctly at >32,767 repetitions
>      * Added INT32_MAX boundary handling to prevent overflow
>    - Updated test cases in rpr.sql:
>      * Updated variable limit tests from 26 to 252
>      * Added 100K row large partition test to verify int32 overflow handling
>
> 2. Simplified context absorption logic (under review):
>    - Key insight: Most absorption opportunities provide no practical benefit
>      because subsequent contexts are discarded immediately after a match
>      succeeds. Only absorption at unbounded pattern start shows measurable
> impact.
>    - Changes:
>      * Planner (rpr.c): Added isUnboundedStart() to identify patterns where
>        leading unbounded quantifiers enable safe absorption
>      * Executor (nodeWindowAgg.c): Simplified to only absorb at pattern
> start,
>        separated cleanup logic into nfa_cleanup_dead_contexts()
>      * Added pattern->isAbsorbable flag to enable absorption only when safe
>    - Impact: Reduces context growth from quadratic to linear.
>      Example from rpr_explain.sql Test 3.1 (PATTERN A+ B, 50 rows):
>      * With absorption: 51 total contexts, 3 peak, 30 absorbed
>      * Without absorption: Would create O(n²) contexts (each row starts a
>        context, creating 1+2+3+4 contexts per 5-row segment)
>      * Absorption maintains linear O(n) growth instead of quadratic
> explosion
>    - Note: Changes in EXPLAIN ANALYZE output suggest there may be issues in
>      the matching process, particularly in absorption and NFA state
> transitions.
>      I need to review these areas more thoroughly before I can be confident
> in
>      their correctness.
>
> 3. Added NFA statistics to EXPLAIN ANALYZE output (under review):
>    - Extended explain.c to show NFA construction and matching statistics
>    - Added comprehensive statistics tracking (execnodes.h):
>      * NFALengthStats structure for min/max/avg length measurements
>      * State statistics: peak, total created, merged (deduplicated)
>      * Context statistics: peak, total created, absorbed, skipped, pruned
>      * Match statistics: succeeded/failed counts with length distributions
>    - Implemented statistics collection in nodeWindowAgg.c
>    - New test suite: rpr_explain.sql for testing NFA statistics output
>    - Note: rpr_explain is not included in parallel_schedule (runs
> separately)
>    - Statistics reveal pattern complexity impact:
>      Example from Test 2.7 (PATTERN ((A | B)+)+, 1000 rows):
>      * NFA States: 2,664 peak, 892,447 total created
>      * Peak grows linearly with rows (~2.6x), likely due to state merging
>      * High total shows significant state churn despite controlled peak
>      * This demonstrates complex pattern behavior that users should
> understand
>      * Question: Should we add a GUC parameter to limit NFA state peak count
>        (e.g., max_rpr_nfa_states) to provide administrators with safety
> controls?
>    - This feature is still under review and may be refined in future patches
>
> 4. Major refactoring of pattern compilation and optimization (rpr.c):
>    - Split large monolithic functions into smaller, semantically meaningful
>      single-purpose functions for better readability
>    - Pattern compilation refactored:
>      * Added collectDefineVariables() to populate variable names from DEFINE
>      * Split scanRPRPattern() into recursive worker and wrapper
>      * Extracted allocateRPRPattern() for memory allocation
>      * Split fillRPRPattern() by node type: fillRPRPatternVar(),
>        fillRPRPatternGroup(), fillRPRPatternAlt()
>    - Pattern optimization converted to dispatcher pattern:
>      * optimizeSeqPattern() - flattens nested SEQ, merges consecutive
> vars/groups
>      * optimizeAltPattern() - flattens nested ALT, removes duplicates
>      * optimizeGroupPattern() - multiplies quantifiers, unwraps single
> children
>    - Each optimization step broken into separate helper functions:
>      * flattenSeqChildren(), mergeConsecutiveVars(),
> mergeConsecutiveGroups()
>      * flattenAltChildren(), removeDuplicateAlternatives()
>      * tryMultiplyQuantifiers(), tryUnwrapGroup()
>    - New optimization: mergeGroupPrefixSuffix()
>      * A B (A B)+ -> (A B){2,}
>      * (A B)+ A B -> (A B){2,}
>      * A B (A B)+ A B -> (A B){3,}
>    - Code structure reorganized with categorized forward declarations
>    - Added test cases in rpr.sql to verify new optimizations:
>      * Consecutive GROUP merge tests
>      * Quantifier multiplication tests (including INF blocking)
>      * Group prefix/suffix merge tests
>
> 5. Fixed quantifier validation:
>    - Corrected bounds checking in gram.y (e.g., {,m} now requires m >= 1)
>    - Improved error messages for invalid quantifier ranges
>
> Additional notes:
> - The NFA statistics output format in EXPLAIN ANALYZE was quickly
>   implemented for debugging purposes. I would appreciate feedback from
>   experienced PostgreSQL developers on:
>   * Whether the output format and structure are appropriate
>   * Whether these statistics would be valuable for query tuning in
> production,
>     or if they are only useful for development/debugging
>   * If production-worthy, what simplifications might make them more
> accessible
>     to users unfamiliar with NFA internals
> - Reluctant quantifier handling needs further clarification. If I cannot
>   make confident fixes before the next review cycle, I may temporarily
>   raise errors for reluctant cases until the logic is properly sorted out.
>
> Next steps:
> - The executor implementation (nodeWindowAgg.c) is the only major component
>   remaining for detailed review.
> - I will focus on thoroughly reviewing the absorption logic and NFA state
>   transition mechanisms to address the concerns noted in section 2.
> - After making necessary improvements, I will send the next patch with
>   more confidence in the executor correctness.

Ok, I have merged your patches with my patches to create v40
patches.

My patches are to fix issue with RPR + IGNORE NULLS case. I modified
ignorenulls_getfuncarginframe() which is called from window functions
when IGNORE NULLS option is specified. To cope with RPR, I added
row_is_in_reduced_frame() call to it. Also test cases were added.

Attached are the v40 patches.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Attachment

pgsql-hackers by date:

Previous
From: Henson Choi
Date:
Subject: Re: Row pattern recognition