Re: SQL Property Graph Queries (SQL/PGQ) - Mailing list pgsql-hackers

From Henson Choi
Subject Re: SQL Property Graph Queries (SQL/PGQ)
Date
Msg-id CAAAe_zD9FP-sVqNs5-Dwc0CqNqT2B40CsYnwDGXpOLMT5KnhSg@mail.gmail.com
Whole thread Raw
In response to Re: SQL Property Graph Queries (SQL/PGQ)  (Junwang Zhao <zhjwpku@gmail.com>)
Responses Re: SQL Property Graph Queries (SQL/PGQ)
List pgsql-hackers
Hi Junwang,

> The fix is simply:
>
>   tr = replace_property_refs(rte->relid, pf->whereClause, graph_path);
>
> Ashutosh, could you include this fix in the next patch revision?

This fixes the crash.
 
Thanks for confirming.
 
> Also, I'd like to check — do you see any potential side effects from
> passing the full graph_path instead of list_make1(pe)? Since the mutator
> now has access to all element mappings, I want to make sure there are no
> unintended interactions in other code paths.

One concern is that if we support

MATCH (a IS users)-[]->(x IS users)<-[]-(b IS users WHERE b.name != a.name)

user may expect the following also works:

MATCH (a IS users WHERE b.name != a.name)-[]->(x IS users)<-[]-(b IS users)

but the second actually failed to pass the transform phase.

I tested neo4j, both are well supported.

Good catch. You're right -- both orderings should work.

The standard is explicit about this. Subclause 10.6 Syntax Rule 18
(ISO/IEC 9075-16:2023) states:

  "The scope of an <element variable> that is declared by an
   <element pattern> EP is the innermost <graph table> containing EP."

The scope is the entire <graph table>, not "from the point of
declaration onward." So a forward reference like your second example
is just as valid as the backward reference in the first.

The current implementation registers variables into gpstate->variables
sequentially inside transformGraphElementPattern(), which makes forward
references invisible at transform time.

So we might follow the same behavior. The solution I came out is in
transformPathTerm
we collect gpstate->variables before each transformGraphElementPattern.

Something like:

 transformPathTerm(ParseState *pstate, List *path_term)
 {
        List       *result = NIL;
+       GraphTableParseState *gpstate = pstate->p_graph_table_pstate;
+
+       /*
+        * First gather all element variables from this path term so that WHERE
+        * clauses in any element pattern can reference variables
appearing anywhere
+        * in the term, regardless of order.
+        */
+       foreach_node(GraphElementPattern, gep, path_term)
+       {
+               if (gep->variable)
+               {
+                       String *v = makeString(pstrdup(gep->variable));
+
+                       if (!list_member(gpstate->variables, v))
+                               gpstate->variables =
lappend(gpstate->variables, v);
+               }
+       }

        foreach_node(GraphElementPattern, gep, path_term)
                result = lappend(result,

Thoughts?

This correctly matches the standard's scoping semantics for a
single path term.

One thing worth noting for the future: the standard says the
variable scope covers the entire <graph table> (SR 18), which
means cross-path-term references should also work once we support
multiple path patterns. For example:

  MATCH (a IS users WHERE b.name != a.name)-[]->(x IS users),
        (b IS users)-[]->(y IS users)

Here `b` is declared in the second path term but forward-referenced
in the first. Pre-collecting inside transformPathTerm would not see
`b` when processing the first path term. Moving the collection up
to transformPathPatternList, spanning all path terms before any
transformation, would cover both cases. I'll be posting a
multi-pattern path matching patch soon in a separate thread.

Regards,
Henson 

pgsql-hackers by date:

Previous
From: Imran Zaheer
Date:
Subject: Re: [WIP] Pipelined Recovery
Next
From: lakshmi
Date:
Subject: Re: parallel data loading for pgbench -i