Re: Row pattern recognition - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Row pattern recognition
Date
Msg-id 20230809.174112.1806434641666895071.t-ishii@sranhm.sra.co.jp
Whole thread Raw
In response to Re: Row pattern recognition  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Responses Re: Row pattern recognition
Re: Row pattern recognition
List pgsql-hackers
Attached is the v4 patch. Differences from previous patch include:

> - PERMUTE is still misspelled as PREMUTE

Fixed.

> - PATTERN variables do not have to exist in the DEFINE clause.  They are
> - considered TRUE if not present.

Fixed. Moreover new regression test case is added.

- It was possible that tle nodes in DEFINE clause do not appear in the
  plan's target list. This makes impossible to evaluate expressions in
  the DEFINE because it does not appear in the outer plan's target
  list. To fix this, call findTargetlistEntrySQL99 (with resjunk is
  true) so that the missing TargetEntry is added to the outer plan
  later on.

- I eliminated some hacks in handling the Var node in DEFINE
  clause. Previously I replaced varattno of Var node in a plan tree by
  hand so that it refers to correct varattno in the outer plan
  node. In this patch I modified set_upper_references so that it calls
  fix_upper_expr for those Var nodes in the DEFINE clause. See v4-0003
  patch for more details.

- I found a bug with pattern matching code. It creates a string for
  subsequent regular expression matching. It uses the initial letter
  of each define variable name. For example, if the varname is "foo",
  then "f" is used. Obviously this makes trouble if we have two or
  more variables starting with same "f" (e.g. "food"). To fix this, I
  assign [a-z] to each variable instead of its initial letter. However
  this way limits us not to have more than 26 variables. I hope 26 is
  enough for most use cases.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
From 5c6430e171ab9f9a75df92fac25949cb96fe1da2 Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Wed, 9 Aug 2023 16:56:17 +0900
Subject: [PATCH v4 1/7] Row pattern recognition patch for raw parser.

---
 src/backend/parser/gram.y       | 216 +++++++++++++++++++++++++++++---
 src/include/nodes/parsenodes.h  |  56 +++++++++
 src/include/parser/kwlist.h     |   8 ++
 src/include/parser/parse_node.h |   1 +
 4 files changed, 267 insertions(+), 14 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 15ece871a0..62c1919538 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -251,6 +251,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     DefElem       *defelt;
     SortBy       *sortby;
     WindowDef  *windef;
+    RPCommonSyntax    *rpcom;
+    RPSubsetItem    *rpsubset;
     JoinExpr   *jexpr;
     IndexElem  *ielem;
     StatsElem  *selem;
@@ -453,8 +455,12 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
                 TriggerTransitions TriggerReferencing
                 vacuum_relation_list opt_vacuum_relation_list
                 drop_option_list pub_obj_list
-
-%type <node>    opt_routine_body
+                row_pattern_measure_list row_pattern_definition_list
+                opt_row_pattern_subset_clause
+                row_pattern_subset_list row_pattern_subset_rhs
+                row_pattern
+%type <rpsubset>     row_pattern_subset_item
+%type <node>    opt_routine_body row_pattern_term
 %type <groupclause> group_clause
 %type <list>    group_by_list
 %type <node>    group_by_item empty_grouping_set rollup_clause cube_clause
@@ -551,6 +557,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <range>    relation_expr_opt_alias
 %type <node>    tablesample_clause opt_repeatable_clause
 %type <target>    target_el set_target insert_column_item
+                row_pattern_measure_item row_pattern_definition
 
 %type <str>        generic_option_name
 %type <node>    generic_option_arg
@@ -633,6 +640,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <list>    window_clause window_definition_list opt_partition_clause
 %type <windef>    window_definition over_clause window_specification
                 opt_frame_clause frame_extent frame_bound
+%type <rpcom>    opt_row_pattern_common_syntax opt_row_pattern_skip_to
+%type <boolean>    opt_row_pattern_initial_or_seek
+%type <list>    opt_row_pattern_measures
 %type <ival>    opt_window_exclusion_clause
 %type <str>        opt_existing_window_name
 %type <boolean> opt_if_not_exists
@@ -659,7 +669,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
                 json_object_constructor_null_clause_opt
                 json_array_constructor_null_clause_opt
 
-
 /*
  * Non-keyword token types.  These are hard-wired into the "flex" lexer.
  * They must be listed first so that their numeric codes do not depend on
@@ -702,7 +711,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
 
     DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
-    DEFERRABLE DEFERRED DEFINER DELETE_P DELIMITER DELIMITERS DEPENDS DEPTH DESC
+    DEFERRABLE DEFERRED DEFINE DEFINER DELETE_P DELIMITER DELIMITERS DEPENDS DEPTH DESC
     DETACH DICTIONARY DISABLE_P DISCARD DISTINCT DO DOCUMENT_P DOMAIN_P
     DOUBLE_P DROP
 
@@ -718,7 +727,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     HANDLER HAVING HEADER_P HOLD HOUR_P
 
     IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE
-    INCLUDING INCREMENT INDENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
+    INCLUDING INCREMENT INDENT INDEX INDEXES INHERIT INHERITS INITIAL INITIALLY INLINE_P
     INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
     INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
 
@@ -731,7 +740,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     LEADING LEAKPROOF LEAST LEFT LEVEL LIKE LIMIT LISTEN LOAD LOCAL
     LOCALTIME LOCALTIMESTAMP LOCATION LOCK_P LOCKED LOGGED
 
-    MAPPING MATCH MATCHED MATERIALIZED MAXVALUE MERGE METHOD
+    MAPPING MATCH MATCHED MATERIALIZED MAXVALUE MEASURES MERGE METHOD
     MINUTE_P MINVALUE MODE MONTH_P MOVE
 
     NAME_P NAMES NATIONAL NATURAL NCHAR NEW NEXT NFC NFD NFKC NFKD NO NONE
@@ -743,8 +752,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     ORDER ORDINALITY OTHERS OUT_P OUTER_P
     OVER OVERLAPS OVERLAY OVERRIDING OWNED OWNER
 
-    PARALLEL PARAMETER PARSER PARTIAL PARTITION PASSING PASSWORD
-    PLACING PLANS POLICY
+    PARALLEL PARAMETER PARSER PARTIAL PARTITION PASSING PASSWORD PAST
+    PATTERN PERMUTE PLACING PLANS POLICY
     POSITION PRECEDING PRECISION PRESERVE PREPARE PREPARED PRIMARY
     PRIOR PRIVILEGES PROCEDURAL PROCEDURE PROCEDURES PROGRAM PUBLICATION
 
@@ -755,12 +764,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     RESET RESTART RESTRICT RETURN RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP
     ROUTINE ROUTINES ROW ROWS RULE
 
-    SAVEPOINT SCALAR SCHEMA SCHEMAS SCROLL SEARCH SECOND_P SECURITY SELECT
+    SAVEPOINT SCALAR SCHEMA SCHEMAS SCROLL SEARCH SECOND_P SECURITY SEEK SELECT
     SEQUENCE SEQUENCES
+
     SERIALIZABLE SERVER SESSION SESSION_USER SET SETS SETOF SHARE SHOW
     SIMILAR SIMPLE SKIP SMALLINT SNAPSHOT SOME SQL_P STABLE STANDALONE_P
     START STATEMENT STATISTICS STDIN STDOUT STORAGE STORED STRICT_P STRIP_P
-    SUBSCRIPTION SUBSTRING SUPPORT SYMMETRIC SYSID SYSTEM_P SYSTEM_USER
+    SUBSCRIPTION SUBSET SUBSTRING SUPPORT SYMMETRIC SYSID SYSTEM_P SYSTEM_USER
 
     TABLE TABLES TABLESAMPLE TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN
     TIES TIME TIMESTAMP TO TRAILING TRANSACTION TRANSFORM
@@ -853,6 +863,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  */
 %nonassoc    UNBOUNDED        /* ideally would have same precedence as IDENT */
 %nonassoc    IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
+%nonassoc    MEASURES AFTER INITIAL SEEK PATTERN
 %left        Op OPERATOR        /* multi-character ops and user-defined operators */
 %left        '+' '-'
 %left        '*' '/' '%'
@@ -15832,7 +15843,8 @@ over_clause: OVER window_specification
         ;
 
 window_specification: '(' opt_existing_window_name opt_partition_clause
-                        opt_sort_clause opt_frame_clause ')'
+                        opt_sort_clause opt_row_pattern_measures opt_frame_clause
+                        opt_row_pattern_common_syntax ')'
                 {
                     WindowDef  *n = makeNode(WindowDef);
 
@@ -15840,10 +15852,12 @@ window_specification: '(' opt_existing_window_name opt_partition_clause
                     n->refname = $2;
                     n->partitionClause = $3;
                     n->orderClause = $4;
+                    n->rowPatternMeasures = $5;
                     /* copy relevant fields of opt_frame_clause */
-                    n->frameOptions = $5->frameOptions;
-                    n->startOffset = $5->startOffset;
-                    n->endOffset = $5->endOffset;
+                    n->frameOptions = $6->frameOptions;
+                    n->startOffset = $6->startOffset;
+                    n->endOffset = $6->endOffset;
+                    n->rpCommonSyntax = $7;
                     n->location = @1;
                     $$ = n;
                 }
@@ -15867,6 +15881,31 @@ opt_partition_clause: PARTITION BY expr_list        { $$ = $3; }
             | /*EMPTY*/                                { $$ = NIL; }
         ;
 
+/*
+ * ROW PATTERN MEASURES
+ */
+opt_row_pattern_measures: MEASURES row_pattern_measure_list    { $$ = $2; }
+            | /*EMPTY*/                                { $$ = NIL; }
+        ;
+
+row_pattern_measure_list:
+            row_pattern_measure_item
+                    { $$ = list_make1($1); }
+            | row_pattern_measure_list ',' row_pattern_measure_item
+                    { $$ = lappend($1, $3); }
+        ;
+
+row_pattern_measure_item:
+            a_expr AS ColLabel
+                {
+                    $$ = makeNode(ResTarget);
+                    $$->name = $3;
+                    $$->indirection = NIL;
+                    $$->val = (Node *) $1;
+                    $$->location = @1;
+                }
+        ;
+
 /*
  * For frame clauses, we return a WindowDef, but only some fields are used:
  * frameOptions, startOffset, and endOffset.
@@ -16026,6 +16065,139 @@ opt_window_exclusion_clause:
             | /*EMPTY*/                { $$ = 0; }
         ;
 
+opt_row_pattern_common_syntax:
+opt_row_pattern_skip_to opt_row_pattern_initial_or_seek
+                PATTERN '(' row_pattern ')'
+                opt_row_pattern_subset_clause
+                DEFINE row_pattern_definition_list
+            {
+                RPCommonSyntax *n = makeNode(RPCommonSyntax);
+                n->rpSkipTo = $1->rpSkipTo;
+                n->rpSkipVariable = $1->rpSkipVariable;
+                n->initial = $2;
+                n->rpPatterns = $5;
+                n->rpSubsetClause = $7;
+                n->rpDefs = $9;
+                $$ = n;
+            }
+            | /*EMPTY*/        { $$ = NULL; }
+    ;
+
+opt_row_pattern_skip_to:
+            AFTER MATCH SKIP TO NEXT ROW
+                {
+                    RPCommonSyntax *n = makeNode(RPCommonSyntax);
+                    n->rpSkipTo = ST_NEXT_ROW;
+                    n->rpSkipVariable = NULL;
+                    $$ = n;
+            }
+            | AFTER MATCH SKIP PAST LAST_P ROW
+                {
+                    RPCommonSyntax *n = makeNode(RPCommonSyntax);
+                    n->rpSkipTo = ST_PAST_LAST_ROW;
+                    n->rpSkipVariable = NULL;
+                    $$ = n;
+                }
+/*
+            | AFTER MATCH SKIP TO FIRST_P ColId        %prec FIRST_P
+                {
+                    RPCommonSyntax *n = makeNode(RPCommonSyntax);
+                    n->rpSkipTo = ST_FIRST_VARIABLE;
+                    n->rpSkipVariable = $6;
+                    $$ = n;
+                }
+            | AFTER MATCH SKIP TO LAST_P ColId        %prec LAST_P
+                {
+                    RPCommonSyntax *n = makeNode(RPCommonSyntax);
+                    n->rpSkipTo = ST_LAST_VARIABLE;
+                    n->rpSkipVariable = $6;
+                    $$ = n;
+                }
+ * Shift/reduce
+            | AFTER MATCH SKIP TO ColId
+                {
+                    RPCommonSyntax *n = makeNode(RPCommonSyntax);
+                    n->rpSkipTo = ST_VARIABLE;
+                    n->rpSkipVariable = $5;
+                    $$ = n;
+                }
+*/
+            | /*EMPTY*/
+                {
+                    RPCommonSyntax *n = makeNode(RPCommonSyntax);
+                    /* temporary set default to ST_NEXT_ROW */
+                    n->rpSkipTo = ST_PAST_LAST_ROW;
+                    n->rpSkipVariable = NULL;
+                    $$ = n;
+                }
+    ;
+
+opt_row_pattern_initial_or_seek:
+            INITIAL            { $$ = true; }
+            | SEEK
+                {
+                    ereport(ERROR,
+                            (errcode(ERRCODE_SYNTAX_ERROR),
+                             errmsg("SEEK is not supported"),
+                             errhint("Use INITIAL."),
+                             parser_errposition(@1)));
+                }
+            | /*EMPTY*/        { $$ = true; }
+        ;
+
+row_pattern:
+            row_pattern_term                            { $$ = list_make1($1); }
+            | row_pattern row_pattern_term                { $$ = lappend($1, $2); }
+        ;
+
+row_pattern_term:
+            ColId    { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "", (Node *)makeString($1), NULL, @1); }
+            | ColId '*'    { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "*", (Node *)makeString($1), NULL, @1); }
+            | ColId '+'    { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "+", (Node *)makeString($1), NULL, @1); }
+            | ColId '?'    { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "?", (Node *)makeString($1), NULL, @1); }
+        ;
+
+opt_row_pattern_subset_clause:
+            SUBSET row_pattern_subset_list    { $$ = $2; }
+            | /*EMPTY*/                                                { $$ = NIL; }
+        ;
+
+row_pattern_subset_list:
+            row_pattern_subset_item                                    { $$ = list_make1($1); }
+            | row_pattern_subset_list ',' row_pattern_subset_item    { $$ = lappend($1, $3); }
+            | /*EMPTY*/                                                { $$ = NIL; }
+        ;
+
+row_pattern_subset_item: ColId '=' '(' row_pattern_subset_rhs ')'
+            {
+                RPSubsetItem *n = makeNode(RPSubsetItem);
+                n->name = $1;
+                n->rhsVariable = $4;
+                $$ = n;
+            }
+        ;
+
+row_pattern_subset_rhs:
+            ColId                                { $$ = list_make1(makeStringConst($1, @1)); }
+            | row_pattern_subset_rhs ',' ColId    { $$ = lappend($1, makeStringConst($3, @1)); }
+            | /*EMPTY*/                            { $$ = NIL; }
+        ;
+
+row_pattern_definition_list:
+            row_pattern_definition                                        { $$ = list_make1($1); }
+            | row_pattern_definition_list ',' row_pattern_definition    { $$ = lappend($1, $3); }
+        ;
+
+row_pattern_definition:
+            ColId AS a_expr
+                {
+                    $$ = makeNode(ResTarget);
+                    $$->name = $1;
+                    $$->indirection = NIL;
+                    $$->val = (Node *) $3;
+                    $$->location = @1;
+                }
+        ;
 
 /*
  * Supporting nonterminals for expressions.
@@ -17121,6 +17293,7 @@ unreserved_keyword:
             | INDEXES
             | INHERIT
             | INHERITS
+            | INITIAL
             | INLINE_P
             | INPUT_P
             | INSENSITIVE
@@ -17148,6 +17321,7 @@ unreserved_keyword:
             | MATCHED
             | MATERIALIZED
             | MAXVALUE
+            | MEASURES
             | MERGE
             | METHOD
             | MINUTE_P
@@ -17190,6 +17364,9 @@ unreserved_keyword:
             | PARTITION
             | PASSING
             | PASSWORD
+            | PAST
+            | PATTERN
+            | PERMUTE
             | PLANS
             | POLICY
             | PRECEDING
@@ -17240,6 +17417,7 @@ unreserved_keyword:
             | SEARCH
             | SECOND_P
             | SECURITY
+            | SEEK
             | SEQUENCE
             | SEQUENCES
             | SERIALIZABLE
@@ -17265,6 +17443,7 @@ unreserved_keyword:
             | STRICT_P
             | STRIP_P
             | SUBSCRIPTION
+            | SUBSET
             | SUPPORT
             | SYSID
             | SYSTEM_P
@@ -17452,6 +17631,7 @@ reserved_keyword:
             | CURRENT_USER
             | DEFAULT
             | DEFERRABLE
+            | DEFINE
             | DESC
             | DISTINCT
             | DO
@@ -17614,6 +17794,7 @@ bare_label_keyword:
             | DEFAULTS
             | DEFERRABLE
             | DEFERRED
+            | DEFINE
             | DEFINER
             | DELETE_P
             | DELIMITER
@@ -17689,6 +17870,7 @@ bare_label_keyword:
             | INDEXES
             | INHERIT
             | INHERITS
+            | INITIAL
             | INITIALLY
             | INLINE_P
             | INNER_P
@@ -17738,6 +17920,7 @@ bare_label_keyword:
             | MATCHED
             | MATERIALIZED
             | MAXVALUE
+            | MEASURES
             | MERGE
             | METHOD
             | MINVALUE
@@ -17791,6 +17974,9 @@ bare_label_keyword:
             | PARTITION
             | PASSING
             | PASSWORD
+            | PAST
+            | PATTERN
+            | PERMUTE
             | PLACING
             | PLANS
             | POLICY
@@ -17847,6 +18033,7 @@ bare_label_keyword:
             | SCROLL
             | SEARCH
             | SECURITY
+            | SEEK
             | SELECT
             | SEQUENCE
             | SEQUENCES
@@ -17878,6 +18065,7 @@ bare_label_keyword:
             | STRICT_P
             | STRIP_P
             | SUBSCRIPTION
+            | SUBSET
             | SUBSTRING
             | SUPPORT
             | SYMMETRIC
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index fe003ded50..c78bd849f9 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -547,6 +547,44 @@ typedef struct SortBy
     int            location;        /* operator location, or -1 if none/unknown */
 } SortBy;
 
+/*
+ * AFTER MATCH row pattern skip to types in row pattern common syntax
+ */
+typedef enum RPSkipTo
+{
+    ST_NONE,                    /* AFTER MATCH omitted */
+    ST_NEXT_ROW,                /* SKIP TO NEXT ROW */
+    ST_PAST_LAST_ROW,            /* SKIP TO PAST LAST ROW */
+    ST_FIRST_VARIABLE,            /* SKIP TO FIRST variable name */
+    ST_LAST_VARIABLE,            /* SKIP TO LAST variable name */
+    ST_VARIABLE                    /* SKIP TO variable name */
+} RPSkipTo;
+
+/*
+ * Row Pattern SUBSET clause item
+ */
+typedef struct RPSubsetItem
+{
+    NodeTag        type;
+    char       *name;            /* Row Pattern SUBSET clause variable name */
+    List       *rhsVariable;    /* Row Pattern SUBSET rhs variables (list of char *string) */
+} RPSubsetItem;
+
+/*
+ * RowPatternCommonSyntax - raw representation of row pattern common syntax
+ *
+ */
+typedef struct RPCommonSyntax
+{
+    NodeTag        type;
+    RPSkipTo    rpSkipTo;        /* Row Pattern AFTER MATCH SKIP type */
+    char       *rpSkipVariable;    /* Row Pattern Skip To variable name, if any */
+    bool        initial;        /* true if <row pattern initial or seek> is initial */
+    List       *rpPatterns;        /* PATTERN variables (list of A_Expr) */
+    List       *rpSubsetClause;    /* row pattern subset clause (list of RPSubsetItem), if any */
+    List       *rpDefs;            /* row pattern definitions clause (list of ResTarget) */
+} RPCommonSyntax;
+
 /*
  * WindowDef - raw representation of WINDOW and OVER clauses
  *
@@ -562,6 +600,8 @@ typedef struct WindowDef
     char       *refname;        /* referenced window name, if any */
     List       *partitionClause;    /* PARTITION BY expression list */
     List       *orderClause;    /* ORDER BY (list of SortBy) */
+    List       *rowPatternMeasures;    /* row pattern measures (list of ResTarget) */
+    RPCommonSyntax *rpCommonSyntax;    /* row pattern common syntax */
     int            frameOptions;    /* frame_clause options, see below */
     Node       *startOffset;    /* expression for starting bound, if any */
     Node       *endOffset;        /* expression for ending bound, if any */
@@ -1483,6 +1523,11 @@ typedef struct GroupingSet
  * the orderClause might or might not be copied (see copiedOrder); the framing
  * options are never copied, per spec.
  *
+ * "defineClause" is Row Pattern Recognition DEFINE clause (list of
+ * TargetEntry). TargetEntry.resname represents row pattern definition
+ * variable name. "patternVariable" and "patternRegexp" represents PATTERN
+ * clause.
+ * 
  * The information relevant for the query jumbling is the partition clause
  * type and its bounds.
  */
@@ -1514,6 +1559,17 @@ typedef struct WindowClause
     Index        winref;            /* ID referenced by window functions */
     /* did we copy orderClause from refname? */
     bool        copiedOrder pg_node_attr(query_jumble_ignore);
+    /* Row Pattern AFTER MACH SKIP clause */
+    RPSkipTo    rpSkipTo;        /* Row Pattern Skip To type */
+    bool        initial;        /* true if <row pattern initial or seek> is initial */
+    /* Row Pattern DEFINE clause (list of TargetEntry) */
+    List        *defineClause;
+    /* Row Pattern DEFINE variable initial names (list of String) */
+    List        *defineInitial;
+    /* Row Pattern PATTERN variable name (list of String) */
+    List        *patternVariable;
+    /* Row Pattern PATTERN regular expression quantifier ('+' or ''. list of String) */
+    List        *patternRegexp;
 } WindowClause;
 
 /*
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 5984dcfa4b..431e0625f1 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -128,6 +128,7 @@ PG_KEYWORD("default", DEFAULT, RESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("defaults", DEFAULTS, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("deferrable", DEFERRABLE, RESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("deferred", DEFERRED, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("define", DEFINE, RESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("definer", DEFINER, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("delete", DELETE_P, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("delimiter", DELIMITER, UNRESERVED_KEYWORD, BARE_LABEL)
@@ -212,6 +213,7 @@ PG_KEYWORD("index", INDEX, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("indexes", INDEXES, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("inherit", INHERIT, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("inherits", INHERITS, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("initial", INITIAL, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("initially", INITIALLY, RESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("inline", INLINE_P, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("inner", INNER_P, TYPE_FUNC_NAME_KEYWORD, BARE_LABEL)
@@ -265,6 +267,7 @@ PG_KEYWORD("match", MATCH, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("matched", MATCHED, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("materialized", MATERIALIZED, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("maxvalue", MAXVALUE, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("measures", MEASURES, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("merge", MERGE, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("method", METHOD, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("minute", MINUTE_P, UNRESERVED_KEYWORD, AS_LABEL)
@@ -326,6 +329,9 @@ PG_KEYWORD("partial", PARTIAL, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("partition", PARTITION, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("passing", PASSING, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("password", PASSWORD, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("past", PAST, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("pattern", PATTERN, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("permute", PERMUTE, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("placing", PLACING, RESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("plans", PLANS, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("policy", POLICY, UNRESERVED_KEYWORD, BARE_LABEL)
@@ -385,6 +391,7 @@ PG_KEYWORD("scroll", SCROLL, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("search", SEARCH, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("second", SECOND_P, UNRESERVED_KEYWORD, AS_LABEL)
 PG_KEYWORD("security", SECURITY, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("seek", SEEK, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("select", SELECT, RESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("sequence", SEQUENCE, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("sequences", SEQUENCES, UNRESERVED_KEYWORD, BARE_LABEL)
@@ -416,6 +423,7 @@ PG_KEYWORD("stored", STORED, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("strict", STRICT_P, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("strip", STRIP_P, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("subscription", SUBSCRIPTION, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("subset", SUBSET, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("substring", SUBSTRING, COL_NAME_KEYWORD, BARE_LABEL)
 PG_KEYWORD("support", SUPPORT, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("symmetric", SYMMETRIC, RESERVED_KEYWORD, BARE_LABEL)
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index f589112d5e..6640090910 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -51,6 +51,7 @@ typedef enum ParseExprKind
     EXPR_KIND_WINDOW_FRAME_RANGE,    /* window frame clause with RANGE */
     EXPR_KIND_WINDOW_FRAME_ROWS,    /* window frame clause with ROWS */
     EXPR_KIND_WINDOW_FRAME_GROUPS,    /* window frame clause with GROUPS */
+    EXPR_KIND_RPR_DEFINE,        /* DEFINE */
     EXPR_KIND_SELECT_TARGET,    /* SELECT target list item */
     EXPR_KIND_INSERT_TARGET,    /* INSERT target list item */
     EXPR_KIND_UPDATE_SOURCE,    /* UPDATE assignment source item */
-- 
2.25.1

From 50ac2f93492c949dbbe8c5e4df19eb541a75f6fb Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Wed, 9 Aug 2023 16:56:17 +0900
Subject: [PATCH v4 2/7] Row pattern recognition patch (parse/analysis).

---
 src/backend/parser/parse_agg.c    |   7 +
 src/backend/parser/parse_clause.c | 292 +++++++++++++++++++++++++++++-
 src/backend/parser/parse_expr.c   |   4 +
 src/backend/parser/parse_func.c   |   3 +
 4 files changed, 305 insertions(+), 1 deletion(-)

diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 85cd47b7ae..aa7a1cee80 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -564,6 +564,10 @@ check_agglevels_and_constraints(ParseState *pstate, Node *expr)
             errkind = true;
             break;
 
+        case EXPR_KIND_RPR_DEFINE:
+            errkind = true;
+            break;
+
             /*
              * There is intentionally no default: case here, so that the
              * compiler will warn if we add a new ParseExprKind without
@@ -953,6 +957,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
         case EXPR_KIND_CYCLE_MARK:
             errkind = true;
             break;
+        case EXPR_KIND_RPR_DEFINE:
+            errkind = true;
+            break;
 
             /*
              * There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 334b9b42bd..60020a7025 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -100,7 +100,10 @@ static WindowClause *findWindowClause(List *wclist, const char *name);
 static Node *transformFrameOffset(ParseState *pstate, int frameOptions,
                                   Oid rangeopfamily, Oid rangeopcintype, Oid *inRangeFunc,
                                   Node *clause);
-
+static void transformRPR(ParseState *pstate, WindowClause *wc, WindowDef *windef, List **targetlist);
+static List *transformDefineClause(ParseState *pstate, WindowClause *wc, WindowDef *windef, List **targetlist);
+static void transformPatternClause(ParseState *pstate, WindowClause *wc, WindowDef *windef);
+static List *transformMeasureClause(ParseState *pstate, WindowClause *wc, WindowDef *windef);
 
 /*
  * transformFromClause -
@@ -2950,6 +2953,10 @@ transformWindowDefinitions(ParseState *pstate,
                                              rangeopfamily, rangeopcintype,
                                              &wc->endInRangeFunc,
                                              windef->endOffset);
+
+        /* Process Row Pattern Recognition related clauses */
+        transformRPR(pstate, wc, windef, targetlist);
+
         wc->runCondition = NIL;
         wc->winref = winref;
 
@@ -3815,3 +3822,286 @@ transformFrameOffset(ParseState *pstate, int frameOptions,
 
     return node;
 }
+
+/*
+ * transformRPR
+ *        Process Row Pattern Recognition related clauses
+ */
+static void
+transformRPR(ParseState *pstate, WindowClause *wc, WindowDef *windef, List **targetlist)
+{
+    /*
+     * Window definition exists?
+     */
+    if (windef == NULL)
+        return;
+
+    /*
+     * Row Pattern Common Syntax clause exists?
+     */
+    if (windef->rpCommonSyntax == NULL)
+        return;
+
+    /* Check Frame option. Frame must start at current row */
+    if ((wc->frameOptions & FRAMEOPTION_START_CURRENT_ROW) == 0)
+        ereport(ERROR,
+                (errcode(ERRCODE_SYNTAX_ERROR),
+                 errmsg("FRAME must start at current row when row patttern recognition is used")));
+
+    /* Transform AFTER MACH SKIP TO clause */
+    wc->rpSkipTo = windef->rpCommonSyntax->rpSkipTo;
+
+    /* Transform SEEK or INITIAL clause */
+    wc->initial = windef->rpCommonSyntax->initial;
+
+    /* Transform DEFINE clause into list of TargetEntry's */
+    wc->defineClause = transformDefineClause(pstate, wc, windef, targetlist);
+
+    /* Check PATTERN clause and copy to patternClause */
+    transformPatternClause(pstate, wc, windef);
+
+    /* Transform MEASURE clause */
+    transformMeasureClause(pstate, wc, windef);
+}
+
+/*
+ * transformDefineClause Process DEFINE clause and transform ResTarget into
+ *        list of TargetEntry.
+ *
+ * XXX we only support column reference in row pattern definition search
+ * condition, e.g. "price". <row pattern definition variable name>.<column
+ * reference> is not supported, e.g. "A.price".
+ */
+static List *
+transformDefineClause(ParseState *pstate, WindowClause *wc, WindowDef *windef, List **targetlist)
+{
+    /* DEFINE variable name initials */
+    static    char    *defineVariableInitials = "abcdefghijklmnopqrstuvwxyz";
+
+    ListCell        *lc, *l;
+    ResTarget        *restarget, *r;
+    List            *restargets;
+    char            *name;
+    int                initialLen;
+    int                i;
+
+    /*
+     * If Row Definition Common Syntax exists, DEFINE clause must exist.
+     * (the raw parser should have already checked it.)
+     */
+    Assert(windef->rpCommonSyntax->rpDefs != NULL);
+
+    /*
+     * Check and add "A AS A IS TRUE" if pattern variable is missing in DEFINE
+     * per the SQL standard.
+     */
+    restargets = NIL;
+    foreach(lc, windef->rpCommonSyntax->rpPatterns)
+    {
+        A_Expr    *a;
+        bool    found = false;
+
+        if (!IsA(lfirst(lc), A_Expr))
+            ereport(ERROR,
+                    errmsg("node type is not A_Expr"));
+
+        a = (A_Expr *)lfirst(lc);
+        name = strVal(a->lexpr);
+
+        foreach(l, windef->rpCommonSyntax->rpDefs)
+        {
+            restarget = (ResTarget *)lfirst(l);
+
+            if (!strcmp(restarget->name, name))
+            {
+                found = true;
+                break;
+            }
+        }
+
+        if (!found)
+        {
+            /*
+             * "name" is missing. So create "name AS name IS TRUE" ResTarget
+             * node and add it to the temporary list.
+             */
+            A_Const       *n;
+
+            restarget = makeNode(ResTarget);
+            n = makeNode(A_Const);
+            n->val.boolval.type = T_Boolean;
+            n->val.boolval.boolval = true;
+            n->location = -1;
+            restarget->name = pstrdup(name);
+            restarget->indirection = NIL;
+            restarget->val = (Node *)n;
+            restarget->location = -1;
+            restargets = lappend((List *)restargets, restarget);
+        }
+    }
+
+    if (list_length(restargets) >= 1)
+    {
+        /* add missing DEFINEs */
+        windef->rpCommonSyntax->rpDefs = list_concat(windef->rpCommonSyntax->rpDefs,
+                                                     restargets);
+        list_free(restargets);
+    }
+
+    /*
+     * Check for duplicate row pattern definition variables.  The standard
+     * requires that no two row pattern definition variable names shall be
+     * equivalent.
+     */
+    restargets = NIL;
+    foreach(lc, windef->rpCommonSyntax->rpDefs)
+    {
+        restarget = (ResTarget *)lfirst(lc);
+        name = restarget->name;
+
+        /*
+         * Add DEFINE expression (Restarget->val) to the targetlist as a
+         * TargetEntry if it does not exist yet. Planner will add the column
+         * ref var node to the outer plan's target list later on. This makes
+         * DEFINE expression could access the outer tuple while evaluating
+         * PATTERN.
+         *
+         * XXX: adding whole expressions of DEFINE to the plan.targetlist is
+         * not so good, because it's not necessary to evalute the expression
+         * in the target list while running the plan. We should extract the
+         * var nodes only then add them to the plan.targetlist.
+         */
+        findTargetlistEntrySQL99(pstate, (Node *)restarget->val, targetlist, EXPR_KIND_RPR_DEFINE);
+
+        /*
+         * Make sure that the row pattern definition search condition is a
+         * boolean expression.
+         */
+        transformWhereClause(pstate, restarget->val,
+                             EXPR_KIND_RPR_DEFINE, "DEFINE");
+
+        foreach(l, restargets)
+        {
+            char        *n;
+
+            r = (ResTarget *) lfirst(l);
+            n = r->name;
+
+            if (!strcmp(n, name))
+                ereport(ERROR,
+                        (errcode(ERRCODE_SYNTAX_ERROR),
+                         errmsg("row pattern definition variable name \"%s\" appears more than once in DEFINE
clause",
+                                name),
+                         parser_errposition(pstate, exprLocation((Node *)r))));
+        }
+        restargets = lappend(restargets, restarget);
+    }
+    list_free(restargets);
+
+    /*
+     * Create list of row pattern DEFINE variable name's initial.
+     * We assign [a-z] to them (up to 26 variable names are allowed).
+     */
+    restargets = NIL;
+    i = 0;
+    initialLen = strlen(defineVariableInitials);
+
+    foreach(lc, windef->rpCommonSyntax->rpDefs)
+    {
+        char    initial[2];
+
+        restarget = (ResTarget *)lfirst(lc);
+        name = restarget->name;
+
+        if (i >= initialLen)
+        {
+            ereport(ERROR,
+                    (errcode(ERRCODE_SYNTAX_ERROR),
+                     errmsg("number of row pattern definition variable names exceeds %d", initialLen),
+                     parser_errposition(pstate, exprLocation((Node *)restarget))));
+        }
+        initial[0] = defineVariableInitials[i++];
+        initial[1] = '\0';
+        wc->defineInitial = lappend(wc->defineInitial, makeString(pstrdup(initial)));
+    }
+
+    return transformTargetList(pstate, windef->rpCommonSyntax->rpDefs,
+                               EXPR_KIND_RPR_DEFINE);
+}
+
+/*
+ * transformPatternClause
+ *        Process PATTERN clause and return PATTERN clause in the raw parse tree
+ */
+static void
+transformPatternClause(ParseState *pstate, WindowClause *wc, WindowDef *windef)
+{
+    ListCell    *lc, *l;
+
+    /*
+     * Row Pattern Common Syntax clause exists?
+     */
+    if (windef->rpCommonSyntax == NULL)
+        return;
+
+    /*
+     * Primary row pattern variable names in PATTERN clause must appear in
+     * DEFINE clause as row pattern definition variable names.
+     */
+    wc->patternVariable = NIL;
+    wc->patternRegexp = NIL;
+    foreach(lc, windef->rpCommonSyntax->rpPatterns)
+    {
+        A_Expr    *a;
+        char    *name;
+        char    *regexp;
+        bool    found = false;
+
+        if (!IsA(lfirst(lc), A_Expr))
+            ereport(ERROR,
+                    errmsg("node type is not A_Expr"));
+
+        a = (A_Expr *)lfirst(lc);
+        name = strVal(a->lexpr);
+
+        foreach(l, windef->rpCommonSyntax->rpDefs)
+        {
+            ResTarget    *restarget = (ResTarget *)lfirst(l);
+
+            if (!strcmp(restarget->name, name))
+            {
+                found = true;
+                break;
+            }
+        }
+
+        if (!found)
+        {
+            ereport(ERROR,
+                    (errcode(ERRCODE_SYNTAX_ERROR),
+                     errmsg("primary row pattern variable name \"%s\" does not appear in DEFINE clause",
+                            name),
+                     parser_errposition(pstate, exprLocation((Node *)a))));
+        }
+        wc->patternVariable = lappend(wc->patternVariable, makeString(pstrdup(name)));
+        regexp = strVal(lfirst(list_head(a->name)));
+        wc->patternRegexp = lappend(wc->patternRegexp, makeString(pstrdup(regexp)));
+    }
+}
+
+/*
+ * transformMeasureClause
+ *        Process MEASURE clause
+ *    XXX MEASURE clause is not supported yet
+ */
+static List *
+transformMeasureClause(ParseState *pstate, WindowClause *wc, WindowDef *windef)
+{
+    if (windef->rowPatternMeasures == NIL)
+        return NIL;
+
+    ereport(ERROR,
+            (errcode(ERRCODE_SYNTAX_ERROR),
+             errmsg("%s","MEASURE clause is not supported yet"),
+             parser_errposition(pstate, exprLocation((Node *)windef->rowPatternMeasures))));
+}
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index fed8e4d089..8921b7ae01 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -557,6 +557,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
         case EXPR_KIND_COPY_WHERE:
         case EXPR_KIND_GENERATED_COLUMN:
         case EXPR_KIND_CYCLE_MARK:
+        case EXPR_KIND_RPR_DEFINE:
             /* okay */
             break;
 
@@ -1770,6 +1771,7 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
         case EXPR_KIND_VALUES:
         case EXPR_KIND_VALUES_SINGLE:
         case EXPR_KIND_CYCLE_MARK:
+        case EXPR_KIND_RPR_DEFINE:
             /* okay */
             break;
         case EXPR_KIND_CHECK_CONSTRAINT:
@@ -3149,6 +3151,8 @@ ParseExprKindName(ParseExprKind exprKind)
             return "GENERATED AS";
         case EXPR_KIND_CYCLE_MARK:
             return "CYCLE";
+        case EXPR_KIND_RPR_DEFINE:
+            return "DEFINE";
 
             /*
              * There is intentionally no default: case here, so that the
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index b3f0b6a137..2ff3699538 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2656,6 +2656,9 @@ check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
         case EXPR_KIND_CYCLE_MARK:
             errkind = true;
             break;
+        case EXPR_KIND_RPR_DEFINE:
+            errkind = true;
+            break;
 
             /*
              * There is intentionally no default: case here, so that the
-- 
2.25.1

From ca2d129d4cc11f06ae40d584c5192ea58b3ee880 Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Wed, 9 Aug 2023 16:56:17 +0900
Subject: [PATCH v4 3/7] Row pattern recognition patch (planner).

---
 src/backend/optimizer/plan/createplan.c | 23 ++++++++++++++++++-----
 src/backend/optimizer/plan/setrefs.c    | 23 +++++++++++++++++++++++
 src/include/nodes/plannodes.h           | 15 +++++++++++++++
 3 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index af48109058..828d8d5b62 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -286,9 +286,10 @@ static WindowAgg *make_windowagg(List *tlist, Index winref,
                                  int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators, Oid *ordCollations,
                                  int frameOptions, Node *startOffset, Node *endOffset,
                                  Oid startInRangeFunc, Oid endInRangeFunc,
-                                 Oid inRangeColl, bool inRangeAsc, bool inRangeNullsFirst,
-                                 List *runCondition, List *qual, bool topWindow,
-                                 Plan *lefttree);
+                                 Oid inRangeColl, bool inRangeAsc, bool inRangeNullsFirst, List *runCondition,
+                                 RPSkipTo rpSkipTo, List *patternVariable, List *patternRegexp, List *defineClause,
+                                 List *defineInitial,
+                                 List *qual, bool topWindow, Plan *lefttree);
 static Group *make_group(List *tlist, List *qual, int numGroupCols,
                          AttrNumber *grpColIdx, Oid *grpOperators, Oid *grpCollations,
                          Plan *lefttree);
@@ -2679,6 +2680,11 @@ create_windowagg_plan(PlannerInfo *root, WindowAggPath *best_path)
                           wc->inRangeAsc,
                           wc->inRangeNullsFirst,
                           wc->runCondition,
+                          wc->rpSkipTo,
+                          wc->patternVariable,
+                          wc->patternRegexp,
+                          wc->defineClause,
+                          wc->defineInitial,
                           best_path->qual,
                           best_path->topwindow,
                           subplan);
@@ -6582,8 +6588,10 @@ make_windowagg(List *tlist, Index winref,
                int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators, Oid *ordCollations,
                int frameOptions, Node *startOffset, Node *endOffset,
                Oid startInRangeFunc, Oid endInRangeFunc,
-               Oid inRangeColl, bool inRangeAsc, bool inRangeNullsFirst,
-               List *runCondition, List *qual, bool topWindow, Plan *lefttree)
+               Oid inRangeColl, bool inRangeAsc, bool inRangeNullsFirst, List *runCondition, 
+               RPSkipTo rpSkipTo, List *patternVariable, List *patternRegexp, List *defineClause,
+               List *defineInitial,
+               List *qual, bool topWindow, Plan *lefttree)
 {
     WindowAgg  *node = makeNode(WindowAgg);
     Plan       *plan = &node->plan;
@@ -6609,6 +6617,11 @@ make_windowagg(List *tlist, Index winref,
     node->inRangeAsc = inRangeAsc;
     node->inRangeNullsFirst = inRangeNullsFirst;
     node->topWindow = topWindow;
+    node->rpSkipTo = rpSkipTo,
+    node->patternVariable = patternVariable;
+    node->patternRegexp = patternRegexp;
+    node->defineClause = defineClause;
+    node->defineInitial = defineInitial;
 
     plan->targetlist = tlist;
     plan->lefttree = lefttree;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 97fa561e4e..2ed00b5d41 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2456,6 +2456,29 @@ set_upper_references(PlannerInfo *root, Plan *plan, int rtoffset)
                        NRM_EQUAL,
                        NUM_EXEC_QUAL(plan));
 
+    /*
+     *    Modifies an expression tree in each DEFINE clause so that all Var
+     *    nodes reference outputs of a subplan.
+     */
+    if (IsA(plan, WindowAgg))
+    {
+        WindowAgg  *wplan = (WindowAgg *) plan;
+
+        foreach(l, wplan->defineClause)
+        {
+            TargetEntry *tle = (TargetEntry *) lfirst(l);
+
+            tle->expr = (Expr *)
+                fix_upper_expr(root,
+                               (Node *) tle->expr,
+                               subplan_itlist,
+                               OUTER_VAR,
+                               rtoffset,
+                               NRM_EQUAL,
+                               NUM_EXEC_QUAL(plan));
+        }
+    }
+
     pfree(subplan_itlist);
 }
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 1b787fe031..19815a98bb 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -1096,6 +1096,21 @@ typedef struct WindowAgg
     /* nulls sort first for in_range tests? */
     bool        inRangeNullsFirst;
 
+    /* Row Pattern Recognition AFTER MACH SKIP clause */
+    RPSkipTo    rpSkipTo;        /* Row Pattern Skip To type */    
+
+    /* Row Pattern PATTERN variable name (list of String) */
+    List        *patternVariable;
+
+    /* Row Pattern RPATTERN regular expression quantifier ('+' or ''. list of String) */
+    List        *patternRegexp;
+
+    /* Row Pattern DEFINE clause (list of TargetEntry) */
+    List       *defineClause;
+
+    /* Row Pattern DEFINE variable initial names (list of String) */
+    List        *defineInitial;
+
     /*
      * false for all apart from the WindowAgg that's closest to the root of
      * the plan
-- 
2.25.1

From 0967c10ad0c157fefee876b6a9ec89373de6fda9 Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Wed, 9 Aug 2023 16:56:17 +0900
Subject: [PATCH v4 4/7] Row pattern recognition patch (executor).

---
 src/backend/executor/nodeWindowAgg.c | 674 ++++++++++++++++++++++++++-
 src/backend/utils/adt/windowfuncs.c  |  38 +-
 src/include/catalog/pg_proc.dat      |   6 +
 src/include/nodes/execnodes.h        |  19 +
 src/include/windowapi.h              |   8 +
 5 files changed, 732 insertions(+), 13 deletions(-)

diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 310ac23e3a..5354e47045 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -36,6 +36,7 @@
 #include "access/htup_details.h"
 #include "catalog/objectaccess.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_collation_d.h"
 #include "catalog/pg_proc.h"
 #include "executor/executor.h"
 #include "executor/nodeWindowAgg.h"
@@ -48,6 +49,7 @@
 #include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/datum.h"
+#include "utils/fmgroids.h"
 #include "utils/expandeddatum.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
@@ -182,8 +184,9 @@ static void begin_partition(WindowAggState *winstate);
 static void spool_tuples(WindowAggState *winstate, int64 pos);
 static void release_partition(WindowAggState *winstate);
 
-static int    row_is_in_frame(WindowAggState *winstate, int64 pos,
+static int  row_is_in_frame(WindowAggState *winstate, int64 pos,
                             TupleTableSlot *slot);
+
 static void update_frameheadpos(WindowAggState *winstate);
 static void update_frametailpos(WindowAggState *winstate);
 static void update_grouptailpos(WindowAggState *winstate);
@@ -195,9 +198,20 @@ static Datum GetAggInitVal(Datum textInitVal, Oid transtype);
 
 static bool are_peers(WindowAggState *winstate, TupleTableSlot *slot1,
                       TupleTableSlot *slot2);
-static bool window_gettupleslot(WindowObject winobj, int64 pos,
-                                TupleTableSlot *slot);
 
+static void attno_map(Node *node);
+static bool attno_map_walker(Node *node, void *context);
+static int row_is_in_reduced_frame(WindowObject winobj, int64 pos);
+
+static int64 evaluate_pattern(WindowObject winobj, int64 current_pos, 
+                              char *vname, StringInfo encoded_str, bool *result);
+
+static bool get_slots(WindowObject winobj, int64 current_pos);
+
+static int search_str_set(char *pattern, StringInfo *str_set, int set_size);
+static void search_str_set_recurse(char *pattern, StringInfo *str_set, int set_size, int set_index,
+                                   char *encoded_str, int *resultlen);
+static char    pattern_initial(WindowAggState *winstate, char *vname);
 
 /*
  * initialize_windowaggregate
@@ -673,6 +687,9 @@ eval_windowaggregates(WindowAggState *winstate)
     WindowObject agg_winobj;
     TupleTableSlot *agg_row_slot;
     TupleTableSlot *temp_slot;
+    bool        reduced_frame_set;
+    bool        check_reduced_frame;
+    int            num_rows_in_reduced_frame;
 
     numaggs = winstate->numaggs;
     if (numaggs == 0)
@@ -790,6 +807,7 @@ eval_windowaggregates(WindowAggState *winstate)
             (winstate->frameOptions & FRAMEOPTION_EXCLUSION) ||
             winstate->aggregatedupto <= winstate->frameheadpos)
         {
+            elog(DEBUG1, "peraggstate->restart  is set");
             peraggstate->restart = true;
             numaggs_restart++;
         }
@@ -861,8 +879,10 @@ eval_windowaggregates(WindowAggState *winstate)
      * If we created a mark pointer for aggregates, keep it pushed up to frame
      * head, so that tuplestore can discard unnecessary rows.
      */
+#ifdef NOT_USED
     if (agg_winobj->markptr >= 0)
         WinSetMarkPosition(agg_winobj, winstate->frameheadpos);
+#endif
 
     /*
      * Now restart the aggregates that require it.
@@ -919,6 +939,10 @@ eval_windowaggregates(WindowAggState *winstate)
         ExecClearTuple(agg_row_slot);
     }
 
+    reduced_frame_set = false;
+    check_reduced_frame = false;
+    num_rows_in_reduced_frame = 0;
+
     /*
      * Advance until we reach a row not in frame (or end of partition).
      *
@@ -930,12 +954,18 @@ eval_windowaggregates(WindowAggState *winstate)
     {
         int            ret;
 
+        elog(DEBUG1, "===== loop in frame starts: " INT64_FORMAT, winstate->aggregatedupto);
+
         /* Fetch next row if we didn't already */
         if (TupIsNull(agg_row_slot))
         {
             if (!window_gettupleslot(agg_winobj, winstate->aggregatedupto,
                                      agg_row_slot))
+            {
+                if (check_reduced_frame)
+                    winstate->aggregatedupto--;
                 break;            /* must be end of partition */
+            }
         }
 
         /*
@@ -944,10 +974,47 @@ eval_windowaggregates(WindowAggState *winstate)
          */
         ret = row_is_in_frame(winstate, winstate->aggregatedupto, agg_row_slot);
         if (ret < 0)
+        {
+            if (winstate->patternVariableList != NIL && check_reduced_frame)
+                winstate->aggregatedupto--;
             break;
+        }
         if (ret == 0)
             goto next_tuple;
 
+        if (winstate->patternVariableList != NIL)
+        {
+            if (!reduced_frame_set)
+            {
+                num_rows_in_reduced_frame = row_is_in_reduced_frame(winstate->agg_winobj, winstate->aggregatedupto);
+                reduced_frame_set = true;
+                elog(DEBUG1, "set num_rows_in_reduced_frame: %d pos: " INT64_FORMAT,
+                     num_rows_in_reduced_frame, winstate->aggregatedupto);
+
+                if (num_rows_in_reduced_frame <= 0)
+                    break;
+
+                else if (num_rows_in_reduced_frame > 0)
+                    check_reduced_frame = true;
+            }
+
+            if (check_reduced_frame)
+            {
+                elog(DEBUG1, "decrease num_rows_in_reduced_frame: %d pos: " INT64_FORMAT,
+                     num_rows_in_reduced_frame, winstate->aggregatedupto);
+                num_rows_in_reduced_frame--;
+                if (num_rows_in_reduced_frame < 0)
+                {
+                    /*
+                     * No more rows remain in the reduced frame. Finish
+                     * accumulating row into the aggregates.
+                     */
+                    winstate->aggregatedupto--;
+                    break;
+                }
+            }
+        }
+
         /* Set tuple context for evaluation of aggregate arguments */
         winstate->tmpcontext->ecxt_outertuple = agg_row_slot;
 
@@ -976,6 +1043,8 @@ next_tuple:
         ExecClearTuple(agg_row_slot);
     }
 
+    elog(DEBUG1, "===== break loop in frame starts: " INT64_FORMAT, winstate->aggregatedupto);
+
     /* The frame's end is not supposed to move backwards, ever */
     Assert(aggregatedupto_nonrestarted <= winstate->aggregatedupto);
 
@@ -2053,6 +2122,8 @@ ExecWindowAgg(PlanState *pstate)
 
     CHECK_FOR_INTERRUPTS();
 
+    elog(DEBUG1, "ExecWindowAgg called. pos: " INT64_FORMAT , winstate->currentpos);
+
     if (winstate->status == WINDOWAGG_DONE)
         return NULL;
 
@@ -2388,6 +2459,9 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
     TupleDesc    scanDesc;
     ListCell   *l;
 
+    TargetEntry    *te;
+    Expr        *expr;
+
     /* check for unsupported flags */
     Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
 
@@ -2483,6 +2557,16 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
     winstate->temp_slot_2 = ExecInitExtraTupleSlot(estate, scanDesc,
                                                    &TTSOpsMinimalTuple);
 
+    winstate->prev_slot = ExecInitExtraTupleSlot(estate, scanDesc,
+                                                 &TTSOpsMinimalTuple);
+
+    winstate->next_slot = ExecInitExtraTupleSlot(estate, scanDesc,
+                                                 &TTSOpsMinimalTuple);
+
+    winstate->null_slot = ExecInitExtraTupleSlot(estate, scanDesc,
+                                                 &TTSOpsMinimalTuple);
+    winstate->null_slot = ExecStoreAllNullTuple(winstate->null_slot);
+
     /*
      * create frame head and tail slots only if needed (must create slots in
      * exactly the same cases that update_frameheadpos and update_frametailpos
@@ -2667,6 +2751,39 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
     winstate->inRangeAsc = node->inRangeAsc;
     winstate->inRangeNullsFirst = node->inRangeNullsFirst;
 
+    /* Set up SKIP TO type */
+    winstate->rpSkipTo = node->rpSkipTo;
+    /* Set up row pattern recognition PATTERN clause */
+    winstate->patternVariableList = node->patternVariable;
+    winstate->patternRegexpList = node->patternRegexp;
+
+    /* Set up row pattern recognition DEFINE clause */
+    winstate->defineInitial = node->defineInitial;
+    winstate->defineVariableList = NIL;
+    winstate->defineClauseList = NIL;
+    if (node->defineClause != NIL)
+    {
+        /*
+         * Tweak arg var of PREV/NEXT so that it refers to scan/inner slot.
+         */
+        foreach(l, node->defineClause)
+        {
+            char        *name;
+            ExprState    *exps;
+
+            te = lfirst(l);
+            name = te->resname;
+            expr = te->expr;
+
+            elog(DEBUG1, "defineVariable name: %s", name);
+            winstate->defineVariableList = lappend(winstate->defineVariableList,
+                                                   makeString(pstrdup(name)));
+            attno_map((Node *)expr);
+            exps = ExecInitExpr(expr, (PlanState *) winstate);
+            winstate->defineClauseList = lappend(winstate->defineClauseList, exps);
+        }
+    }
+
     winstate->all_first = true;
     winstate->partition_spooled = false;
     winstate->more_partitions = false;
@@ -2674,6 +2791,57 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
     return winstate;
 }
 
+/*
+ * Rewrite varno of Var node that is the argument of PREV/NET so that it sees
+ * scan tuple (PREV) or inner tuple (NEXT).
+ */
+static void
+attno_map(Node *node)
+{
+    (void) expression_tree_walker(node, attno_map_walker, NULL);
+}
+
+static bool
+attno_map_walker(Node *node, void *context)
+{
+    FuncExpr    *func;
+    int            nargs;
+    Expr        *expr;
+    Var            *var;
+    
+    if (node == NULL)
+        return false;
+
+    if (IsA(node, FuncExpr))
+    {
+        func = (FuncExpr *)node;
+
+        if (func->funcid == F_PREV || func->funcid == F_NEXT)
+        {
+            /* sanity check */
+            nargs = list_length(func->args);
+            if (list_length(func->args) != 1)
+                elog(ERROR, "PREV/NEXT must have 1 argument but function %d has %d args", func->funcid, nargs);
+
+            expr = (Expr *) lfirst(list_head(func->args));
+            if (!IsA(expr, Var))
+                elog(ERROR, "PREV/NEXT's arg is not Var");    /* XXX: is it possible that arg type is Const? */
+            var = (Var *)expr;
+
+            if (func->funcid == F_PREV)
+                /*
+                 * Rewrite varno from OUTER_VAR to regular var no so that the
+                 * var references scan tuple.
+                 */
+                var->varno = var->varnosyn;
+            else
+                var->varno = INNER_VAR;
+            elog(DEBUG1, "PREV/NEXT's varno is rewritten to: %d", var->varno);
+        }
+    }
+    return expression_tree_walker(node, attno_map_walker, NULL);
+}
+
 /* -----------------
  * ExecEndWindowAgg
  * -----------------
@@ -2691,6 +2859,8 @@ ExecEndWindowAgg(WindowAggState *node)
     ExecClearTuple(node->agg_row_slot);
     ExecClearTuple(node->temp_slot_1);
     ExecClearTuple(node->temp_slot_2);
+    ExecClearTuple(node->prev_slot);
+    ExecClearTuple(node->next_slot);
     if (node->framehead_slot)
         ExecClearTuple(node->framehead_slot);
     if (node->frametail_slot)
@@ -2740,6 +2910,8 @@ ExecReScanWindowAgg(WindowAggState *node)
     ExecClearTuple(node->agg_row_slot);
     ExecClearTuple(node->temp_slot_1);
     ExecClearTuple(node->temp_slot_2);
+    ExecClearTuple(node->prev_slot);
+    ExecClearTuple(node->next_slot);
     if (node->framehead_slot)
         ExecClearTuple(node->framehead_slot);
     if (node->frametail_slot)
@@ -3080,7 +3252,7 @@ are_peers(WindowAggState *winstate, TupleTableSlot *slot1,
  *
  * Returns true if successful, false if no such row
  */
-static bool
+bool
 window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot)
 {
     WindowAggState *winstate = winobj->winstate;
@@ -3100,7 +3272,7 @@ window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot)
         return false;
 
     if (pos < winobj->markpos)
-        elog(ERROR, "cannot fetch row before WindowObject's mark position");
+        elog(ERROR, "cannot fetch row: " INT64_FORMAT " before WindowObject's mark position: " INT64_FORMAT,  pos,
winobj->markpos);
 
 
     oldcontext = MemoryContextSwitchTo(winstate->ss.ps.ps_ExprContext->ecxt_per_query_memory);
 
@@ -3420,14 +3592,54 @@ WinGetFuncArgInFrame(WindowObject winobj, int argno,
     WindowAggState *winstate;
     ExprContext *econtext;
     TupleTableSlot *slot;
-    int64        abs_pos;
-    int64        mark_pos;
 
     Assert(WindowObjectIsValid(winobj));
     winstate = winobj->winstate;
     econtext = winstate->ss.ps.ps_ExprContext;
     slot = winstate->temp_slot_1;
 
+    if (WinGetSlotInFrame(winobj, slot,
+                          relpos, seektype, set_mark,
+                          isnull, isout) == 0)
+    {
+        econtext->ecxt_outertuple = slot;
+        return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
+                            econtext, isnull);
+    }
+
+    if (isout)
+        *isout = true;
+    *isnull = true;
+    return (Datum) 0;
+}
+
+/*
+ * WinGetSlotInFrame
+ * slot: TupleTableSlot to store the result
+ * relpos: signed rowcount offset from the seek position
+ * seektype: WINDOW_SEEK_HEAD or WINDOW_SEEK_TAIL
+ * set_mark: If the row is found/in frame and set_mark is true, the mark is
+ *        moved to the row as a side-effect.
+ * isnull: output argument, receives isnull status of result
+ * isout: output argument, set to indicate whether target row position
+ *        is out of frame (can pass NULL if caller doesn't care about this)
+ *
+ * Returns 0 if we successfullt got the slot. false if out of frame.
+ * (also isout is set)
+ */
+int
+WinGetSlotInFrame(WindowObject winobj, TupleTableSlot *slot,
+                     int relpos, int seektype, bool set_mark,
+                     bool *isnull, bool *isout)
+{
+    WindowAggState *winstate;
+    int64        abs_pos;
+    int64        mark_pos;
+    int            num_reduced_frame;
+
+    Assert(WindowObjectIsValid(winobj));
+    winstate = winobj->winstate;
+
     switch (seektype)
     {
         case WINDOW_SEEK_CURRENT:
@@ -3494,6 +3706,12 @@ WinGetFuncArgInFrame(WindowObject winobj, int argno,
                          winstate->frameOptions);
                     break;
             }
+            num_reduced_frame = row_is_in_reduced_frame(winobj, winstate->frameheadpos);
+            if (num_reduced_frame < 0)
+                goto out_of_frame;
+            else if (num_reduced_frame > 0)
+                if (relpos >= num_reduced_frame)
+                    goto out_of_frame;
             break;
         case WINDOW_SEEK_TAIL:
             /* rejecting relpos > 0 is easy and simplifies code below */
@@ -3565,6 +3783,12 @@ WinGetFuncArgInFrame(WindowObject winobj, int argno,
                     mark_pos = 0;    /* keep compiler quiet */
                     break;
             }
+
+            num_reduced_frame = row_is_in_reduced_frame(winobj, winstate->frameheadpos + relpos);
+            if (num_reduced_frame < 0)
+                goto out_of_frame;
+            else if (num_reduced_frame > 0)
+                abs_pos = winstate->frameheadpos + relpos + num_reduced_frame - 1;
             break;
         default:
             elog(ERROR, "unrecognized window seek type: %d", seektype);
@@ -3583,15 +3807,13 @@ WinGetFuncArgInFrame(WindowObject winobj, int argno,
         *isout = false;
     if (set_mark)
         WinSetMarkPosition(winobj, mark_pos);
-    econtext->ecxt_outertuple = slot;
-    return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
-                        econtext, isnull);
+    return 0;
 
 out_of_frame:
     if (isout)
         *isout = true;
     *isnull = true;
-    return (Datum) 0;
+    return -1;
 }
 
 /*
@@ -3622,3 +3844,433 @@ WinGetFuncArgCurrent(WindowObject winobj, int argno, bool *isnull)
     return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
                         econtext, isnull);
 }
+
+WindowAggState *
+WinGetAggState(WindowObject winobj)
+{
+    return winobj->winstate;
+}
+
+/*
+ * row_is_in_reduced_frame
+ * Determine whether a row is in the current row's reduced window frame according
+ * to row pattern matching
+ *
+ * The row must has been already determined that it is in a full window frame
+ * and fetched it into slot.
+ *
+ * Returns:
+ * = 0, RPR is not defined.
+ * >0, if the row is the first in the reduced frame. Return the number of rows in the reduced frame.
+ * -1, if the row is unmatched row
+ * -2, if the row is in the reduced frame but needed to be skipped because of
+ * AFTER MATCH SKIP PAST LAST ROW
+ */
+static
+int row_is_in_reduced_frame(WindowObject winobj, int64 pos)
+{
+    WindowAggState *winstate = winobj->winstate;
+    ListCell    *lc1, *lc2;
+    bool        expression_result;
+    int            num_matched_rows;
+    int64        original_pos;
+    bool        anymatch;
+    StringInfo    encoded_str;
+    StringInfo    pattern_str = makeStringInfo();
+
+    /*
+     * Array of pattern variables evaluted to true.
+     * Each character corresponds to pattern variable.
+     * Example:
+     * str_set[0] = "AB";
+     * str_set[1] = "AC";
+     * In this case at row 0 A and B are true, and A and C are true in row 1.
+     */
+    #define ENCODED_STR_ARRAY_ALLOC_SIZE    128
+    StringInfo    *str_set = NULL;
+    int        str_set_index;
+    int        str_set_size;
+
+    if (winstate->patternVariableList == NIL)
+    {
+        /*
+         * RPR is not defined. Assume that we are always in the the reduced
+         * window frame.
+         */
+        return 0;
+    }
+
+    /* save original pos */
+    original_pos = pos;
+
+    /*
+     * Check whether the row speicied by pos is in the reduced frame. The
+     * second and subsequent rows need to be recognized as "unmatched" rows if
+     * AFTER MATCH SKIP PAST LAST ROW is defined.
+     */
+    if (winstate->rpSkipTo == ST_PAST_LAST_ROW &&
+        pos > winstate->headpos_in_reduced_frame &&
+        pos < (winstate->headpos_in_reduced_frame + winstate->num_rows_in_reduced_frame))
+        return -2;
+        
+    /*
+     * Loop over until none of pattern matches or encounters end of frame.
+     */
+    for (;;)
+    {
+        int64    result_pos = -1;
+
+        /*
+         * Loop over each PATTERN variable.
+         */
+        anymatch = false;
+        encoded_str = makeStringInfo();
+
+        forboth(lc1, winstate->patternVariableList, lc2, winstate->patternRegexpList)
+        {
+            char    *vname = strVal(lfirst(lc1));
+            char    *quantifier = strVal(lfirst(lc2));
+
+            elog(DEBUG1, "pos: " INT64_FORMAT " pattern vname: %s quantifier: %s", pos, vname, quantifier);
+
+            expression_result = false;
+
+            /* evaluate row pattern against current row */
+            result_pos = evaluate_pattern(winobj, pos, vname, encoded_str, &expression_result);
+            if (expression_result)
+            {
+                elog(DEBUG1, "expression result is true");
+                anymatch = true;
+            }
+
+            /*
+             * If out of frame, we are done.
+             */
+             if (result_pos < 0)
+                 break;
+        }
+
+        if (!anymatch)
+        {
+            /* none of patterns matched. */
+            break;
+        }
+
+        /* build encoded string array */
+        if (str_set == NULL)
+        {
+            str_set_index = 0;
+            str_set_size = ENCODED_STR_ARRAY_ALLOC_SIZE * sizeof(StringInfo);
+            str_set = palloc(str_set_size);
+        }
+
+        str_set[str_set_index++] = encoded_str;
+
+        elog(DEBUG1, "pos: " INT64_FORMAT " str_set_index: %d encoded_str: %s", pos, str_set_index,
encoded_str->data);
+
+        if (str_set_index >= str_set_size)
+        {
+            str_set_size *= 2;
+            str_set = repalloc(str_set, str_set_size);
+        }
+
+        /* move to next row */
+        pos++;
+
+        if (result_pos < 0)
+        {
+            /* out of frame */
+            break;
+        }
+    }
+
+    if (str_set == NULL)
+    {
+        /* no matches found in the first row */
+        return -1;
+    }
+
+    elog(DEBUG1, "pos: " INT64_FORMAT " encoded_str: %s", pos, encoded_str->data);
+
+    /* build regular expression */
+    pattern_str = makeStringInfo();
+    appendStringInfoChar(pattern_str, '^');
+    forboth (lc1, winstate->patternVariableList, lc2, winstate->patternRegexpList)
+    {
+        char    *vname = strVal(lfirst(lc1));
+        char    *quantifier = strVal(lfirst(lc2));
+        char     initial;
+
+        initial = pattern_initial(winstate, vname);
+        Assert(initial != 0);
+        appendStringInfoChar(pattern_str, initial);
+        if (quantifier[0])
+            appendStringInfoChar(pattern_str, quantifier[0]);
+        elog(DEBUG1, "vname: %s initial: %c quantifier: %s", vname, initial, quantifier);
+    }
+
+    elog(DEBUG1, "pos: " INT64_FORMAT " pattern: %s", pos, pattern_str->data);
+
+    /* look for matching pattern variable sequence */
+    num_matched_rows = search_str_set(pattern_str->data, str_set, str_set_index);
+    if (num_matched_rows <= 0)
+        return -1;
+
+    /*
+     * We are at the first row in the reduced frame.  Save the number of
+     * matched rows as the number of rows in the reduced frame.
+     */
+    winstate->headpos_in_reduced_frame = original_pos;
+    winstate->num_rows_in_reduced_frame = num_matched_rows;
+
+    return num_matched_rows;
+}
+
+/*
+ * search set of encode_str.
+ * set_size: size of set_str array.
+ */
+static
+int search_str_set(char *pattern, StringInfo *str_set, int set_size)
+{
+    char        *encoded_str = palloc0(set_size+1);
+    int            resultlen = 0;
+
+    search_str_set_recurse(pattern, str_set, set_size, 0, encoded_str, &resultlen);
+    elog(DEBUG1, "search_str_set returns %d", resultlen);
+    return resultlen;
+}
+
+static
+void search_str_set_recurse(char *pattern, StringInfo *str_set,
+                            int set_size, int set_index, char *encoded_str, int *resultlen)
+{
+    char    *p;
+
+    if (set_index >= set_size)
+    {
+        Datum    d;
+        text    *res;
+        char    *substr;
+
+        /*
+         * We first perform pattern matching using regexp_instr, then call
+         * textregexsubstr to get matched substring to know how log the
+         * matched string is. That is the number of rows in the reduced window
+         * frame.  The reason why we can't call textregexsubstr is, it error
+         * out if pattern is not match.
+         */
+        if (DatumGetInt32(DirectFunctionCall2Coll(regexp_instr, DEFAULT_COLLATION_OID,
+                                                  PointerGetDatum(cstring_to_text(encoded_str)),
+                                                  PointerGetDatum(cstring_to_text(pattern)))) > 0)
+        {
+            d = DirectFunctionCall2Coll(textregexsubstr,
+                                        DEFAULT_COLLATION_OID,
+                                        PointerGetDatum(cstring_to_text(encoded_str)),
+                                        PointerGetDatum(cstring_to_text(pattern)));
+            if (d != 0)
+            {
+                int        len;
+
+                res = DatumGetTextPP(d);
+                substr = text_to_cstring(res);
+                len = strlen(substr);
+                if (len > *resultlen)
+                    /* remember the longest match */
+                    *resultlen = len;
+            }
+        }
+        return;
+    }
+
+    p = str_set[set_index]->data;
+    while (*p)
+    {
+        encoded_str[set_index] = *p;
+        p++;
+        search_str_set_recurse(pattern, str_set, set_size, set_index + 1, encoded_str, resultlen);
+    }
+}
+
+
+/*
+ * Evaluate expression associated with PATTERN variable vname.
+ * relpos is relative row position in a frame (starting from 0).
+ * "quantifier" is the quatifier part of the PATTERN regular expression.
+ * Currently only '+' is allowed.
+ * result is out paramater representing the expression evaluation result
+ * is true of false.
+ * Return values are:
+ * >=0: the last match absolute row position
+ * other wise out of frame.
+ */
+static
+int64 evaluate_pattern(WindowObject winobj, int64 current_pos, 
+                        char *vname, StringInfo encoded_str, bool *result)
+{
+    WindowAggState    *winstate = winobj->winstate;
+    ExprContext        *econtext = winstate->ss.ps.ps_ExprContext;
+    ListCell        *lc1, *lc2, *lc3;
+    ExprState        *pat;
+    Datum            eval_result;
+    bool            out_of_frame = false;
+    bool            isnull;
+
+    forthree (lc1, winstate->defineVariableList, lc2, winstate->defineClauseList, lc3, winstate->defineInitial)
+    {
+        char    initial;
+        char    *name = strVal(lfirst(lc1));
+
+        elog(DEBUG1, "evaluate_pattern: define variable: %s, pattern variable: %s", name, vname);
+
+        if (strcmp(vname, name))
+            continue;
+
+        initial = *(strVal(lfirst(lc3)));
+
+        /* set expression to evaluate */
+        pat = lfirst(lc2);
+
+        /* get current, previous and next tuples */
+        if (!get_slots(winobj, current_pos))
+        {
+            out_of_frame = true;
+        }
+        else
+        {
+            /* evaluate the expression */
+            eval_result = ExecEvalExpr(pat, econtext, &isnull);
+            if (isnull)
+            {
+                /* expression is NULL */
+                elog(DEBUG1, "expression for %s is NULL at row: " INT64_FORMAT, vname, current_pos);
+                *result = false;
+            }
+            else
+            {
+                if (!DatumGetBool(eval_result))
+                {
+                    /* expression is false */
+                    elog(DEBUG1, "expression for %s is false at row: " INT64_FORMAT, vname, current_pos);
+                    *result = false;
+                }
+                else
+                {
+                    /* expression is true */
+                    elog(DEBUG1, "expression for %s is true at row: " INT64_FORMAT, vname, current_pos);
+                    appendStringInfoChar(encoded_str, initial);
+                    *result = true;
+                }
+            }
+            break;
+        }
+
+        if (out_of_frame)
+        {
+            *result = false;
+            return -1;
+        }
+    }
+    return current_pos;
+}
+
+/*
+ * Get current, previous and next tuples.
+ * Returns false if current row is out of partition/full frame.
+ */
+static
+bool get_slots(WindowObject winobj, int64 current_pos)
+{
+    WindowAggState *winstate = winobj->winstate;
+    TupleTableSlot *slot;
+    int        ret;
+    ExprContext *econtext;
+
+    econtext = winstate->ss.ps.ps_ExprContext;
+
+    /* set up current row tuple slot */
+    slot = winstate->temp_slot_1;
+    if (!window_gettupleslot(winobj, current_pos, slot))
+    {
+        elog(DEBUG1, "current row is out of partition at:" INT64_FORMAT, current_pos);
+        return false;
+
+        ret = row_is_in_frame(winstate, current_pos, slot);
+        if (ret <= 0)
+        {
+            elog(DEBUG1, "current row is out of frame at: " INT64_FORMAT, current_pos);
+            return false;
+        }
+    }
+    econtext->ecxt_outertuple = slot;
+
+    /* for PREV */
+    if (current_pos > 0)
+    {
+        slot = winstate->prev_slot;
+        if (!window_gettupleslot(winobj, current_pos - 1, slot))
+        {
+            elog(DEBUG1, "previous row is out of partition at: " INT64_FORMAT, current_pos - 1);
+            econtext->ecxt_scantuple = winstate->null_slot;
+        }
+        else
+        {
+            ret = row_is_in_frame(winstate, current_pos - 1, slot);
+            if (ret <= 0)
+            {
+                elog(DEBUG1, "previous row is out of frame at: " INT64_FORMAT, current_pos - 1);
+                econtext->ecxt_scantuple = winstate->null_slot;
+            }
+            else
+            {
+                econtext->ecxt_scantuple = slot;
+            }
+        }
+    }
+    else
+        econtext->ecxt_scantuple = winstate->null_slot;
+
+    /* for NEXT */
+    slot = winstate->next_slot;
+    if (!window_gettupleslot(winobj, current_pos + 1, slot))
+    {
+        elog(DEBUG1, "next row is out of partiton at: " INT64_FORMAT, current_pos + 1);
+        econtext->ecxt_innertuple = winstate->null_slot;
+    }
+    else
+    {
+        ret = row_is_in_frame(winstate, current_pos + 1, slot);
+        if (ret <= 0)
+        {
+            elog(DEBUG1, "next row is out of frame at: " INT64_FORMAT, current_pos + 1);
+            econtext->ecxt_innertuple = winstate->null_slot;
+        }
+        else
+            econtext->ecxt_innertuple = slot;
+    }
+    return true;
+}
+
+/*
+ * Return pattern variable initial character
+ * matching with pattern variable name vname.
+ * If not found, return 0.
+ */
+static
+char    pattern_initial(WindowAggState *winstate, char *vname)
+{
+    char        initial;
+    char        *name;
+    ListCell    *lc1, *lc2;
+
+    forboth (lc1, winstate->defineVariableList, lc2, winstate->defineInitial)
+    {
+        name = strVal(lfirst(lc1));                /* DEFINE variable name */
+        initial = *(strVal(lfirst(lc2)));        /* DEFINE variable initial */
+
+
+        if (!strcmp(name, vname))
+                return initial;                    /* found */
+    }
+    return 0;
+}
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b87a624fb2..e4cab36ec9 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,9 @@
  */
 #include "postgres.h"
 
+#include "catalog/pg_collation_d.h"
+#include "executor/executor.h"
+#include "nodes/execnodes.h"
 #include "nodes/supportnodes.h"
 #include "utils/builtins.h"
 #include "windowapi.h"
@@ -36,11 +39,19 @@ typedef struct
     int64        remainder;        /* (total rows) % (bucket num) */
 } ntile_context;
 
+/*
+ * rpr process information.
+ * Used for AFTER MATCH SKIP PAST LAST ROW
+ */
+typedef struct SkipContext
+{
+    int64        pos;    /* last row absolute position */
+} SkipContext;
+
 static bool rank_up(WindowObject winobj);
 static Datum leadlag_common(FunctionCallInfo fcinfo,
                             bool forward, bool withoffset, bool withdefault);
 
-
 /*
  * utility routine for *_rank functions.
  */
@@ -673,7 +684,7 @@ window_last_value(PG_FUNCTION_ARGS)
     bool        isnull;
 
     result = WinGetFuncArgInFrame(winobj, 0,
-                                  0, WINDOW_SEEK_TAIL, true,
+                                  0, WINDOW_SEEK_TAIL, false,
                                   &isnull, NULL);
     if (isnull)
         PG_RETURN_NULL();
@@ -713,3 +724,26 @@ window_nth_value(PG_FUNCTION_ARGS)
 
     PG_RETURN_DATUM(result);
 }
+
+/*
+ * prev
+ * Dummy function to invoke RPR's navigation operator "PREV".
+ * This is *not* a window function.
+ */
+Datum
+window_prev(PG_FUNCTION_ARGS)
+{
+    PG_RETURN_DATUM(PG_GETARG_DATUM(0));
+}
+
+/*
+ * next
+ * Dummy function to invoke RPR's navigation operation "NEXT".
+ * This is *not* a window function.
+ */
+Datum
+window_next(PG_FUNCTION_ARGS)
+{
+    PG_RETURN_DATUM(PG_GETARG_DATUM(0));
+}
+
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 6996073989..fa100b2665 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -10397,6 +10397,12 @@
 { oid => '3114', descr => 'fetch the Nth row value',
   proname => 'nth_value', prokind => 'w', prorettype => 'anyelement',
   proargtypes => 'anyelement int4', prosrc => 'window_nth_value' },
+{ oid => '6122', descr => 'previous value',
+  proname => 'prev', provolatile => 's', prorettype => 'anyelement',
+  proargtypes => 'anyelement', prosrc => 'window_prev' },
+{ oid => '6123', descr => 'next value',
+  proname => 'next', provolatile => 's', prorettype => 'anyelement',
+  proargtypes => 'anyelement', prosrc => 'window_next' },
 
 # functions for range types
 { oid => '3832', descr => 'I/O',
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cb714f4a19..2bd6fcb5e1 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2519,6 +2519,15 @@ typedef struct WindowAggState
     int64        groupheadpos;    /* current row's peer group head position */
     int64        grouptailpos;    /* " " " " tail position (group end+1) */
 
+    /* these fields are used in Row pattern recognition: */
+    RPSkipTo    rpSkipTo;        /* Row Pattern Skip To type */    
+    List       *patternVariableList;    /* list of row pattern variables names (list of String) */
+    List       *patternRegexpList;    /* list of row pattern regular expressions ('+' or ''. list of String) */
+    List       *defineVariableList;    /* list of row pattern definition variables (list of String) */
+    List       *defineClauseList;    /* expression for row pattern definition
+                                     * search conditions ExprState list */
+    List       *defineInitial;        /* list of row pattern definition variable initials (list of String) */
+
     MemoryContext partcontext;    /* context for partition-lifespan data */
     MemoryContext aggcontext;    /* shared context for aggregate working data */
     MemoryContext curaggcontext;    /* current aggregate's working data */
@@ -2555,6 +2564,16 @@ typedef struct WindowAggState
     TupleTableSlot *agg_row_slot;
     TupleTableSlot *temp_slot_1;
     TupleTableSlot *temp_slot_2;
+
+    /* temporary slots for RPR */
+    TupleTableSlot *prev_slot;    /* PREV row navigation operator */
+    TupleTableSlot *next_slot;    /* NEXT row navigation operator */
+    TupleTableSlot *null_slot;    /* all NULL slot */
+
+    /* head of the reduced window frame */
+    int64        headpos_in_reduced_frame;
+    /* number of rows in the reduced window frame */
+    int64        num_rows_in_reduced_frame;
 } WindowAggState;
 
 /* ----------------
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index b8c2c565d1..1e292648e9 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -58,7 +58,15 @@ extern Datum WinGetFuncArgInFrame(WindowObject winobj, int argno,
                                   int relpos, int seektype, bool set_mark,
                                   bool *isnull, bool *isout);
 
+extern int WinGetSlotInFrame(WindowObject winobj, TupleTableSlot *slot,
+                             int relpos, int seektype, bool set_mark,
+                             bool *isnull, bool *isout);
+
 extern Datum WinGetFuncArgCurrent(WindowObject winobj, int argno,
                                   bool *isnull);
 
+extern WindowAggState *WinGetAggState(WindowObject winobj);
+
+extern bool window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot);
+
 #endif                            /* WINDOWAPI_H */
-- 
2.25.1

From 833cd2182cdd1137f144bab93635252d1aa0fb29 Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Wed, 9 Aug 2023 16:56:17 +0900
Subject: [PATCH v4 5/7] Row pattern recognition patch (docs).

---
 doc/src/sgml/advanced.sgml   | 52 ++++++++++++++++++++++++++++++++++
 doc/src/sgml/func.sgml       | 54 ++++++++++++++++++++++++++++++++++++
 doc/src/sgml/ref/select.sgml | 38 +++++++++++++++++++++++--
 3 files changed, 142 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index 755c9f1485..eda3612822 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -537,6 +537,58 @@ WHERE pos < 3;
     <literal>rank</literal> less than 3.
    </para>
 
+   <para>
+    Row pattern common syntax can be used with row pattern common syntax to
+    perform row pattern recognition in a query. Row pattern common syntax
+    includes two sub clauses. <literal>DEFINE</literal> defines definition
+    variables along with an expression. The expression must be a logical
+    expression, which means it must
+    return <literal>TRUE</literal>, <literal>FALSE</literal>
+    or <literal>NULL</literal>. Moreover if the expression comprises a column
+    reference, it must be the argument of <function>rpr</function>. An example
+    of <literal>DEFINE</literal> is as follows.
+
+<programlisting>
+DEFINE
+ LOWPRICE AS price <= 100,
+ UP AS price > PREV(price),
+ DOWN AS price < PREV(price)
+</programlisting>
+
+    Note that <function>PREV</function> returns price column in the previous
+    row if it's called in a context of row pattern recognition. So in the
+    second line means the definition variable "UP" is <literal>TRUE</literal>
+    when price column in the current row is greater than the price column in
+    the previous row. Likewise, "DOWN" is <literal>TRUE</literal> when when
+    price column in the current row is lower than the price column in the
+    previous row.
+   </para>
+   <para>
+    Once <literal>DEFINE</literal> exists, <literal>PATTERN</literal> can be
+    used. <literal>PATTERN</literal> defines a sequence of rows that satisfies
+    certain conditions.  For example following <literal>PATTERN</literal>
+    defines that a row starts with the condition "LOWPRICE", then one or more
+    rows satisfy "UP" and finally one or more rows satisfy "DOWN". If a
+    sequence of rows found, rpr returns the column at the starting row.
+    Example of a <literal>SELECT</literal> using the <literal>DEFINE</literal>
+    and <literal>PATTERN</literal> clause is as follows.
+
+<programlisting>    
+SELECT company, tdate, price, max(price) OVER w FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ INITIAL
+ PATTERN (LOWPRICE UP+ DOWN+)
+ DEFINE
+  LOWPRICE AS price <= 100,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+</programlisting>
+   </para>
+
    <para>
     When a query involves multiple window functions, it is possible to write
     out each one with a separate <literal>OVER</literal> clause, but this is
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be2f54c914..6cda164522 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21715,6 +21715,7 @@ SELECT count(*) FROM sometable;
         returns <literal>NULL</literal> if there is no such row.
        </para></entry>
       </row>
+
      </tbody>
     </tgroup>
    </table>
@@ -21754,6 +21755,59 @@ SELECT count(*) FROM sometable;
    Other frame specifications can be used to obtain other effects.
   </para>
 
+  <para>
+   Row pattern recognition navigation functions are listed in
+   <xref linkend="functions-rpr-navigation-table"/>.  These functions
+   can be used to describe DEFINE clause of Row pattern recognition.
+  </para>
+
+   <table id="functions-rpr-navigation-table">
+    <title>Row Pattern Navigation Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>prev</primary>
+        </indexterm>
+        <function>prev</function> ( <parameter>value</parameter> <type>anyelement</type> )
+        <returnvalue>anyelement</returnvalue>
+       </para>
+       <para>
+        Returns the column value at the previous row;
+        returns NULL if there is no previous row in the window frame.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>next</primary>
+        </indexterm>
+        <function>next</function> ( <parameter>value</parameter> <type>anyelement</type> )
+        <returnvalue>anyelement</returnvalue>
+       </para>
+       <para>
+        Returns the column value at the next row;
+        returns NULL if there is no next row in the window frame.
+       </para></entry>
+      </row>
+
+     </tbody>
+    </tgroup>
+   </table>
+
   <note>
    <para>
     The SQL standard defines a <literal>RESPECT NULLS</literal> or
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 0ee0cc7e64..8d3becd57a 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -966,8 +966,8 @@ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceabl
     The <replaceable class="parameter">frame_clause</replaceable> can be one of
 
 <synopsis>
-{ RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
-{ RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [
<replaceable>frame_exclusion</replaceable>]
 
+{ RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
[row_pattern_common_syntax]
+{ RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [
<replaceable>frame_exclusion</replaceable>] [row_pattern_common_syntax]
 
 </synopsis>
 
     where <replaceable>frame_start</replaceable>
@@ -1074,6 +1074,40 @@ EXCLUDE NO OTHERS
     a given peer group will be in the frame or excluded from it.
    </para>
 
+   <para>
+    The
+    optional <replaceable class="parameter">row_pattern_common_syntax</replaceable>
+    defines the <firstterm>row pattern recognition condition</firstterm> for
+    this
+    window. <replaceable class="parameter">row_pattern_common_syntax</replaceable>
+    includes following subclauses. <literal>AFTER MATCH SKIP PAST LAST
+    ROW</literal> or <literal>AFTER MATCH SKIP TO NEXT ROW</literal> controls
+    how to proceed to next row position after a match
+    found. With <literal>AFTER MATCH SKIP PAST LAST ROW</literal> (the
+    default) next row position is next to the last row of previous match. On
+    the other hand, with <literal>AFTER MATCH SKIP TO NEXT ROW</literal> next
+    row position is always next to the last row of previous
+    match. <literal>DEFINE</literal> defines definition variables along with a
+    boolean expression. <literal>PATTERN</literal> defines a sequence of rows
+    that satisfies certain conditions using variables defined
+    in <literal>DEFINE</literal> clause. If the variable is not defined in
+    the <literal>DEFINE</literal> clause, it is implicitly assumed
+    following is defined in the <literal>DEFINE</literal> clause.
+
+<synopsis>
+<literal>variable_name</literal> AS TRUE
+</synopsis>
+
+    Note that the maximu number of variables defined
+    in <literal>DEFINE</literal> clause is 26.
+
+<synopsis>
+[ AFTER MATCH SKIP PAST LAST ROW | AFTER MATCH SKIP TO NEXT ROW ]
+PATTERN <replaceable class="parameter">pattern_variable_name</replaceable>[+] [, ...]
+DEFINE <replaceable class="parameter">definition_varible_name</replaceable> AS <replaceable
class="parameter">expression</replaceable>[, ...]
 
+</synopsis>    
+   </para>
+
    <para>
     The purpose of a <literal>WINDOW</literal> clause is to specify the
     behavior of <firstterm>window functions</firstterm> appearing in the query's
-- 
2.25.1

From 398f9f7f93b1523555f83b99cb456355c87b50fa Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Wed, 9 Aug 2023 16:56:17 +0900
Subject: [PATCH v4 6/7] Row pattern recognition patch (tests).

---
 src/test/regress/expected/rpr.out  | 463 +++++++++++++++++++++++++++++
 src/test/regress/parallel_schedule |   2 +-
 src/test/regress/sql/rpr.sql       | 229 ++++++++++++++
 3 files changed, 693 insertions(+), 1 deletion(-)
 create mode 100644 src/test/regress/expected/rpr.out
 create mode 100644 src/test/regress/sql/rpr.sql

diff --git a/src/test/regress/expected/rpr.out b/src/test/regress/expected/rpr.out
new file mode 100644
index 0000000000..340ccf242c
--- /dev/null
+++ b/src/test/regress/expected/rpr.out
@@ -0,0 +1,463 @@
+--
+-- Test for row pattern definition clause
+--
+CREATE TEMP TABLE stock (
+       company TEXT,
+       tdate DATE,
+       price INTEGER
+                  );
+INSERT INTO stock VALUES ('company1', '2023-07-01', 100);
+INSERT INTO stock VALUES ('company1', '2023-07-02', 200);
+INSERT INTO stock VALUES ('company1', '2023-07-03', 150);
+INSERT INTO stock VALUES ('company1', '2023-07-04', 140);
+INSERT INTO stock VALUES ('company1', '2023-07-05', 150);
+INSERT INTO stock VALUES ('company1', '2023-07-06', 90);
+INSERT INTO stock VALUES ('company1', '2023-07-07', 110);
+INSERT INTO stock VALUES ('company1', '2023-07-08', 130);
+INSERT INTO stock VALUES ('company1', '2023-07-09', 120);
+INSERT INTO stock VALUES ('company1', '2023-07-10', 130);
+INSERT INTO stock VALUES ('company2', '2023-07-01', 50);
+INSERT INTO stock VALUES ('company2', '2023-07-02', 2000);
+INSERT INTO stock VALUES ('company2', '2023-07-03', 1500);
+INSERT INTO stock VALUES ('company2', '2023-07-04', 1400);
+INSERT INTO stock VALUES ('company2', '2023-07-05', 1500);
+INSERT INTO stock VALUES ('company2', '2023-07-06', 60);
+INSERT INTO stock VALUES ('company2', '2023-07-07', 1100);
+INSERT INTO stock VALUES ('company2', '2023-07-08', 1300);
+INSERT INTO stock VALUES ('company2', '2023-07-09', 1200);
+INSERT INTO stock VALUES ('company2', '2023-07-10', 1300);
+SELECT * FROM stock;
+ company  |   tdate    | price 
+----------+------------+-------
+ company1 | 07-01-2023 |   100
+ company1 | 07-02-2023 |   200
+ company1 | 07-03-2023 |   150
+ company1 | 07-04-2023 |   140
+ company1 | 07-05-2023 |   150
+ company1 | 07-06-2023 |    90
+ company1 | 07-07-2023 |   110
+ company1 | 07-08-2023 |   130
+ company1 | 07-09-2023 |   120
+ company1 | 07-10-2023 |   130
+ company2 | 07-01-2023 |    50
+ company2 | 07-02-2023 |  2000
+ company2 | 07-03-2023 |  1500
+ company2 | 07-04-2023 |  1400
+ company2 | 07-05-2023 |  1500
+ company2 | 07-06-2023 |    60
+ company2 | 07-07-2023 |  1100
+ company2 | 07-08-2023 |  1300
+ company2 | 07-09-2023 |  1200
+ company2 | 07-10-2023 |  1300
+(20 rows)
+
+-- basic test using PREV
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w,
+ nth_value(tdate, 2) OVER w AS nth_second
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+ company  |   tdate    | price | first_value | last_value | nth_second 
+----------+------------+-------+-------------+------------+------------
+ company1 | 07-01-2023 |   100 |         100 |        140 | 07-02-2023
+ company1 | 07-02-2023 |   200 |             |            | 
+ company1 | 07-03-2023 |   150 |             |            | 
+ company1 | 07-04-2023 |   140 |             |            | 
+ company1 | 07-05-2023 |   150 |             |            | 
+ company1 | 07-06-2023 |    90 |          90 |        120 | 07-07-2023
+ company1 | 07-07-2023 |   110 |             |            | 
+ company1 | 07-08-2023 |   130 |             |            | 
+ company1 | 07-09-2023 |   120 |             |            | 
+ company1 | 07-10-2023 |   130 |             |            | 
+ company2 | 07-01-2023 |    50 |          50 |       1400 | 07-02-2023
+ company2 | 07-02-2023 |  2000 |             |            | 
+ company2 | 07-03-2023 |  1500 |             |            | 
+ company2 | 07-04-2023 |  1400 |             |            | 
+ company2 | 07-05-2023 |  1500 |             |            | 
+ company2 | 07-06-2023 |    60 |          60 |       1200 | 07-07-2023
+ company2 | 07-07-2023 |  1100 |             |            | 
+ company2 | 07-08-2023 |  1300 |             |            | 
+ company2 | 07-09-2023 |  1200 |             |            | 
+ company2 | 07-10-2023 |  1300 |             |            | 
+(20 rows)
+
+-- omit "START" in DEFINE but it is ok because "START AS TRUE" is
+-- implicitly defined. per spec.
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w,
+ nth_value(tdate, 2) OVER w AS nth_second
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+ company  |   tdate    | price | first_value | last_value | nth_second 
+----------+------------+-------+-------------+------------+------------
+ company1 | 07-01-2023 |   100 |         100 |        140 | 07-02-2023
+ company1 | 07-02-2023 |   200 |             |            | 
+ company1 | 07-03-2023 |   150 |             |            | 
+ company1 | 07-04-2023 |   140 |             |            | 
+ company1 | 07-05-2023 |   150 |             |            | 
+ company1 | 07-06-2023 |    90 |          90 |        120 | 07-07-2023
+ company1 | 07-07-2023 |   110 |             |            | 
+ company1 | 07-08-2023 |   130 |             |            | 
+ company1 | 07-09-2023 |   120 |             |            | 
+ company1 | 07-10-2023 |   130 |             |            | 
+ company2 | 07-01-2023 |    50 |          50 |       1400 | 07-02-2023
+ company2 | 07-02-2023 |  2000 |             |            | 
+ company2 | 07-03-2023 |  1500 |             |            | 
+ company2 | 07-04-2023 |  1400 |             |            | 
+ company2 | 07-05-2023 |  1500 |             |            | 
+ company2 | 07-06-2023 |    60 |          60 |       1200 | 07-07-2023
+ company2 | 07-07-2023 |  1100 |             |            | 
+ company2 | 07-08-2023 |  1300 |             |            | 
+ company2 | 07-09-2023 |  1200 |             |            | 
+ company2 | 07-10-2023 |  1300 |             |            | 
+(20 rows)
+
+-- the first row start with less than or equal to 100
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (LOWPRICE UP+ DOWN+)
+ DEFINE
+  LOWPRICE AS price <= 100,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+ company  |   tdate    | price | first_value | last_value 
+----------+------------+-------+-------------+------------
+ company1 | 07-01-2023 |   100 |         100 |        140
+ company1 | 07-02-2023 |   200 |             |           
+ company1 | 07-03-2023 |   150 |             |           
+ company1 | 07-04-2023 |   140 |             |           
+ company1 | 07-05-2023 |   150 |             |           
+ company1 | 07-06-2023 |    90 |          90 |        120
+ company1 | 07-07-2023 |   110 |             |           
+ company1 | 07-08-2023 |   130 |             |           
+ company1 | 07-09-2023 |   120 |             |           
+ company1 | 07-10-2023 |   130 |             |           
+ company2 | 07-01-2023 |    50 |          50 |       1400
+ company2 | 07-02-2023 |  2000 |             |           
+ company2 | 07-03-2023 |  1500 |             |           
+ company2 | 07-04-2023 |  1400 |             |           
+ company2 | 07-05-2023 |  1500 |             |           
+ company2 | 07-06-2023 |    60 |          60 |       1200
+ company2 | 07-07-2023 |  1100 |             |
+ company2 | 07-08-2023 |  1300 |             |           
+ company2 | 07-09-2023 |  1200 |             |           
+ company2 | 07-10-2023 |  1300 |             |           
+(20 rows)
+
+-- second row raises 120%
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (LOWPRICE UP+ DOWN+)
+ DEFINE
+  LOWPRICE AS price <= 100,
+  UP AS price > PREV(price) * 1.2,
+  DOWN AS price < PREV(price)
+);
+ company  |   tdate    | price | first_value | last_value 
+----------+------------+-------+-------------+------------
+ company1 | 07-01-2023 |   100 |         100 |        140
+ company1 | 07-02-2023 |   200 |             |           
+ company1 | 07-03-2023 |   150 |             |           
+ company1 | 07-04-2023 |   140 |             |           
+ company1 | 07-05-2023 |   150 |             |           
+ company1 | 07-06-2023 |    90 |             |           
+ company1 | 07-07-2023 |   110 |             |           
+ company1 | 07-08-2023 |   130 |             |           
+ company1 | 07-09-2023 |   120 |             |           
+ company1 | 07-10-2023 |   130 |             |           
+ company2 | 07-01-2023 |    50 |          50 |       1400
+ company2 | 07-02-2023 |  2000 |             |           
+ company2 | 07-03-2023 |  1500 |             |           
+ company2 | 07-04-2023 |  1400 |             |           
+ company2 | 07-05-2023 |  1500 |             |           
+ company2 | 07-06-2023 |    60 |             |           
+ company2 | 07-07-2023 |  1100 |             |           
+ company2 | 07-08-2023 |  1300 |             |           
+ company2 | 07-09-2023 |  1200 |             |           
+ company2 | 07-10-2023 |  1300 |             |           
+(20 rows)
+
+-- using NEXT
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UPDOWN)
+ DEFINE
+  START AS TRUE,
+  UPDOWN AS price > PREV(price) AND price > NEXT(price)
+);
+ company  |   tdate    | price | first_value | last_value 
+----------+------------+-------+-------------+------------
+ company1 | 07-01-2023 |   100 |         100 |        200
+ company1 | 07-02-2023 |   200 |             |           
+ company1 | 07-03-2023 |   150 |             |           
+ company1 | 07-04-2023 |   140 |         140 |        150
+ company1 | 07-05-2023 |   150 |             |           
+ company1 | 07-06-2023 |    90 |             |           
+ company1 | 07-07-2023 |   110 |         110 |        130
+ company1 | 07-08-2023 |   130 |             |           
+ company1 | 07-09-2023 |   120 |             |           
+ company1 | 07-10-2023 |   130 |             |           
+ company2 | 07-01-2023 |    50 |          50 |       2000
+ company2 | 07-02-2023 |  2000 |             |           
+ company2 | 07-03-2023 |  1500 |             |           
+ company2 | 07-04-2023 |  1400 |        1400 |       1500
+ company2 | 07-05-2023 |  1500 |             |           
+ company2 | 07-06-2023 |    60 |             |           
+ company2 | 07-07-2023 |  1100 |        1100 |       1300
+ company2 | 07-08-2023 |  1300 |             |           
+ company2 | 07-09-2023 |  1200 |             |           
+ company2 | 07-10-2023 |  1300 |             |           
+(20 rows)
+
+-- using AFTER MATCH SKIP TO NEXT ROW
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ INITIAL
+ PATTERN (START UPDOWN)
+ DEFINE
+  START AS TRUE,
+  UPDOWN AS price > PREV(price) AND price > NEXT(price)
+);
+ company  |   tdate    | price | first_value | last_value 
+----------+------------+-------+-------------+------------
+ company1 | 07-01-2023 |   100 |         100 |        200
+ company1 | 07-02-2023 |   200 |             |           
+ company1 | 07-03-2023 |   150 |             |           
+ company1 | 07-04-2023 |   140 |         140 |        150
+ company1 | 07-05-2023 |   150 |             |           
+ company1 | 07-06-2023 |    90 |             |           
+ company1 | 07-07-2023 |   110 |         110 |        130
+ company1 | 07-08-2023 |   130 |             |           
+ company1 | 07-09-2023 |   120 |             |           
+ company1 | 07-10-2023 |   130 |             |           
+ company2 | 07-01-2023 |    50 |          50 |       2000
+ company2 | 07-02-2023 |  2000 |             |           
+ company2 | 07-03-2023 |  1500 |             |           
+ company2 | 07-04-2023 |  1400 |        1400 |       1500
+ company2 | 07-05-2023 |  1500 |             |           
+ company2 | 07-06-2023 |    60 |             |           
+ company2 | 07-07-2023 |  1100 |        1100 |       1300
+ company2 | 07-08-2023 |  1300 |             |           
+ company2 | 07-09-2023 |  1200 |             |           
+ company2 | 07-10-2023 |  1300 |             |           
+(20 rows)
+
+-- match everything
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ INITIAL
+ PATTERN (A+)
+ DEFINE
+  A AS TRUE
+);
+ company  |   tdate    | price | first_value | last_value 
+----------+------------+-------+-------------+------------
+ company1 | 07-01-2023 |   100 |         100 |        130
+ company1 | 07-02-2023 |   200 |             |           
+ company1 | 07-03-2023 |   150 |             |           
+ company1 | 07-04-2023 |   140 |             |           
+ company1 | 07-05-2023 |   150 |             |           
+ company1 | 07-06-2023 |    90 |             |           
+ company1 | 07-07-2023 |   110 |             |           
+ company1 | 07-08-2023 |   130 |             |           
+ company1 | 07-09-2023 |   120 |             |           
+ company1 | 07-10-2023 |   130 |             |           
+ company2 | 07-01-2023 |    50 |          50 |       1300
+ company2 | 07-02-2023 |  2000 |             |           
+ company2 | 07-03-2023 |  1500 |             |           
+ company2 | 07-04-2023 |  1400 |             |           
+ company2 | 07-05-2023 |  1500 |             |           
+ company2 | 07-06-2023 |    60 |             |           
+ company2 | 07-07-2023 |  1100 |             |           
+ company2 | 07-08-2023 |  1300 |             |           
+ company2 | 07-09-2023 |  1200 |             |           
+ company2 | 07-10-2023 |  1300 |             |           
+(20 rows)
+
+-- backtracking with reclassification of rows
+-- using AFTER MATCH SKIP PAST LAST ROW
+SELECT company, tdate, price, first_value(tdate) OVER w, last_value(tdate) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ INITIAL
+ PATTERN (A+ B+)
+ DEFINE
+  A AS price > 100,
+  B AS price > 100
+);
+ company  |   tdate    | price | first_value | last_value 
+----------+------------+-------+-------------+------------
+ company1 | 07-01-2023 |   100 |             | 
+ company1 | 07-02-2023 |   200 | 07-02-2023  | 07-05-2023
+ company1 | 07-03-2023 |   150 |             | 
+ company1 | 07-04-2023 |   140 |             | 
+ company1 | 07-05-2023 |   150 |             | 
+ company1 | 07-06-2023 |    90 |             | 
+ company1 | 07-07-2023 |   110 | 07-07-2023  | 07-10-2023
+ company1 | 07-08-2023 |   130 |             | 
+ company1 | 07-09-2023 |   120 |             | 
+ company1 | 07-10-2023 |   130 |             | 
+ company2 | 07-01-2023 |    50 |             | 
+ company2 | 07-02-2023 |  2000 | 07-02-2023  | 07-05-2023
+ company2 | 07-03-2023 |  1500 |             | 
+ company2 | 07-04-2023 |  1400 |             |
+ company2 | 07-05-2023 |  1500 |             | 
+ company2 | 07-06-2023 |    60 |             | 
+ company2 | 07-07-2023 |  1100 | 07-07-2023  | 07-10-2023
+ company2 | 07-08-2023 |  1300 |             | 
+ company2 | 07-09-2023 |  1200 |             | 
+ company2 | 07-10-2023 |  1300 |             | 
+(20 rows)
+
+-- backtracking with reclassification of rows
+-- using AFTER MATCH SKIP TO NEXT ROW
+SELECT company, tdate, price, first_value(tdate) OVER w, last_value(tdate) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ INITIAL
+ PATTERN (A+ B+)
+ DEFINE
+  A AS price > 100,
+  B AS price > 100
+);
+ company  |   tdate    | price | first_value | last_value 
+----------+------------+-------+-------------+------------
+ company1 | 07-01-2023 |   100 |             | 
+ company1 | 07-02-2023 |   200 | 07-02-2023  | 07-05-2023
+ company1 | 07-03-2023 |   150 | 07-03-2023  | 07-05-2023
+ company1 | 07-04-2023 |   140 | 07-04-2023  | 07-05-2023
+ company1 | 07-05-2023 |   150 |             | 
+ company1 | 07-06-2023 |    90 |             | 
+ company1 | 07-07-2023 |   110 | 07-07-2023  | 07-10-2023
+ company1 | 07-08-2023 |   130 | 07-08-2023  | 07-10-2023
+ company1 | 07-09-2023 |   120 | 07-09-2023  | 07-10-2023
+ company1 | 07-10-2023 |   130 |             | 
+ company2 | 07-01-2023 |    50 |             | 
+ company2 | 07-02-2023 |  2000 | 07-02-2023  | 07-05-2023
+ company2 | 07-03-2023 |  1500 | 07-03-2023  | 07-05-2023
+ company2 | 07-04-2023 |  1400 | 07-04-2023  | 07-05-2023
+ company2 | 07-05-2023 |  1500 |             | 
+ company2 | 07-06-2023 |    60 |             | 
+ company2 | 07-07-2023 |  1100 | 07-07-2023  | 07-10-2023
+ company2 | 07-08-2023 |  1300 | 07-08-2023  | 07-10-2023
+ company2 | 07-09-2023 |  1200 | 07-09-2023  | 07-10-2023
+ company2 | 07-10-2023 |  1300 |             | 
+(20 rows)
+
+--
+-- Error cases
+--
+-- row pattern definition variable name must not appear more than once
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ ORDER BY tdate
+ INITIAL
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price),
+  UP AS price > PREV(price)
+);
+ERROR:  syntax error at or near "ORDER"
+LINE 6:  ORDER BY tdate
+         ^
+-- pattern variable name must appear in DEFINE
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP+ DOWN+ END)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+ERROR:  syntax error at or near "END"
+LINE 8:  PATTERN (START UP+ DOWN+ END)
+                                  ^
+-- FRAME must start at current row when row patttern recognition is used
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+ERROR:  FRAME must start at current row when row patttern recognition is used
+-- SEEK is not supported
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ SEEK
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+ERROR:  SEEK is not supported
+LINE 8:  SEEK
+         ^
+HINT:  Use INITIAL.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 4df9d8503b..896531002b 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -98,7 +98,7 @@ test: publication subscription
 # Another group of parallel tests
 # select_views depends on create_view
 # ----------
-test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data
windowxmlmap functional_deps advisory_lock indirect_toast equivclass
 
+test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data
windowxmlmap functional_deps advisory_lock indirect_toast equivclass rpr
 
 
 # ----------
 # Another group of parallel tests (JSON related)
diff --git a/src/test/regress/sql/rpr.sql b/src/test/regress/sql/rpr.sql
new file mode 100644
index 0000000000..c4f704cdc4
--- /dev/null
+++ b/src/test/regress/sql/rpr.sql
@@ -0,0 +1,229 @@
+--
+-- Test for row pattern definition clause
+--
+
+CREATE TEMP TABLE stock (
+       company TEXT,
+       tdate DATE,
+       price INTEGER
+                  );
+INSERT INTO stock VALUES ('company1', '2023-07-01', 100);
+INSERT INTO stock VALUES ('company1', '2023-07-02', 200);
+INSERT INTO stock VALUES ('company1', '2023-07-03', 150);
+INSERT INTO stock VALUES ('company1', '2023-07-04', 140);
+INSERT INTO stock VALUES ('company1', '2023-07-05', 150);
+INSERT INTO stock VALUES ('company1', '2023-07-06', 90);
+INSERT INTO stock VALUES ('company1', '2023-07-07', 110);
+INSERT INTO stock VALUES ('company1', '2023-07-08', 130);
+INSERT INTO stock VALUES ('company1', '2023-07-09', 120);
+INSERT INTO stock VALUES ('company1', '2023-07-10', 130);
+INSERT INTO stock VALUES ('company2', '2023-07-01', 50);
+INSERT INTO stock VALUES ('company2', '2023-07-02', 2000);
+INSERT INTO stock VALUES ('company2', '2023-07-03', 1500);
+INSERT INTO stock VALUES ('company2', '2023-07-04', 1400);
+INSERT INTO stock VALUES ('company2', '2023-07-05', 1500);
+INSERT INTO stock VALUES ('company2', '2023-07-06', 60);
+INSERT INTO stock VALUES ('company2', '2023-07-07', 1100);
+INSERT INTO stock VALUES ('company2', '2023-07-08', 1300);
+INSERT INTO stock VALUES ('company2', '2023-07-09', 1200);
+INSERT INTO stock VALUES ('company2', '2023-07-10', 1300);
+
+SELECT * FROM stock;
+
+-- basic test using PREV
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w,
+ nth_value(tdate, 2) OVER w AS nth_second
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+
+-- omit "START" in DEFINE but it is ok because "START AS TRUE" is
+-- implicitly defined. per spec.
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w,
+ nth_value(tdate, 2) OVER w AS nth_second
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+
+-- the first row start with less than or equal to 100
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (LOWPRICE UP+ DOWN+)
+ DEFINE
+  LOWPRICE AS price <= 100,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+
+-- second row raises 120%
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (LOWPRICE UP+ DOWN+)
+ DEFINE
+  LOWPRICE AS price <= 100,
+  UP AS price > PREV(price) * 1.2,
+  DOWN AS price < PREV(price)
+);
+
+-- using NEXT
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UPDOWN)
+ DEFINE
+  START AS TRUE,
+  UPDOWN AS price > PREV(price) AND price > NEXT(price)
+);
+
+-- using AFTER MATCH SKIP TO NEXT ROW
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ INITIAL
+ PATTERN (START UPDOWN)
+ DEFINE
+  START AS TRUE,
+  UPDOWN AS price > PREV(price) AND price > NEXT(price)
+);
+
+-- match everything
+
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ INITIAL
+ PATTERN (A+)
+ DEFINE
+  A AS TRUE
+);
+
+-- backtracking with reclassification of rows
+-- using AFTER MATCH SKIP PAST LAST ROW
+SELECT company, tdate, price, first_value(tdate) OVER w, last_value(tdate) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ INITIAL
+ PATTERN (A+ B+)
+ DEFINE
+  A AS price > 100,
+  B AS price > 100
+);
+
+-- backtracking with reclassification of rows
+-- using AFTER MATCH SKIP TO NEXT ROW
+SELECT company, tdate, price, first_value(tdate) OVER w, last_value(tdate) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ INITIAL
+ PATTERN (A+ B+)
+ DEFINE
+  A AS price > 100,
+  B AS price > 100
+);
+
+--
+-- Error cases
+--
+
+-- row pattern definition variable name must not appear more than once
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ ORDER BY tdate
+ INITIAL
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price),
+  UP AS price > PREV(price)
+);
+
+-- pattern variable name must appear in DEFINE
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP+ DOWN+ END)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+
+-- FRAME must start at current row when row patttern recognition is used
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
+
+-- SEEK is not supported
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ SEEK
+ PATTERN (START UP+ DOWN+)
+ DEFINE
+  START AS TRUE,
+  UP AS price > PREV(price),
+  DOWN AS price < PREV(price)
+);
-- 
2.25.1

From c5ae7386f1d62242ac1b8c6ca769bbd1d0062317 Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Wed, 9 Aug 2023 16:56:17 +0900
Subject: [PATCH v4 7/7] Allow to print raw parse tree.

---
 src/backend/tcop/postgres.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 36cc99ec9c..c01e90f735 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -653,6 +653,10 @@ pg_parse_query(const char *query_string)
     }
 #endif
 
+    if (Debug_print_parse)
+        elog_node_display(LOG, "raw parse tree", raw_parsetree_list,
+                          Debug_pretty_print);
+
     TRACE_POSTGRESQL_QUERY_PARSE_DONE(query_string);
 
     return raw_parsetree_list;
-- 
2.25.1


pgsql-hackers by date:

Previous
From: Anthonin Bonnefoy
Date:
Subject: Re: POC: Extension for adding distributed tracing - pg_tracing
Next
From: David Rowley
Date:
Subject: Re: Avoid stack frame setup in performance critical routines using tail calls