From: Henson Choi
Subject: SQL/PGQ: Support multi-pattern path matching in GRAPH_TABLE
Msg-id: CAAAe_zAxTXev97Zhc=Z+LH6ZWQ8UbHQ5LAo_iVkqHm=49SZS+g@mail.gmail.com
List: pgsql-hackers

Hi hackers,

Now that the SQL/PGQ core has been committed, I'd like to propose
extending GRAPH_TABLE to accept multiple path patterns in the MATCH
clause, i.e. the comma-separated form:

    SELECT ... FROM GRAPH_TABLE (g
        MATCH <path_pattern_1>, <path_pattern_2>, ...
        COLUMNS (...)
    );

This shape is not supported today — the parser rejects it with

    multiple path patterns in one GRAPH_TABLE clause not supported

and the rewriter asserts that the path_pattern_list has exactly one
entry. Among the features that are not yet covered, I think this one
has the highest practical need: many realistic graph queries express
joins and star-shaped traversals most naturally as multiple
comma-separated patterns, and without it those queries have to be
rewritten into more awkward forms. The attached patch lifts the
restriction and wires the existing path-rewriting pipeline through
the list of path patterns.

What the patch does
-------------------

Parser side: the error that rejected multi-pattern MATCH is
removed, so such queries now reach the rewriter.

Rewriter side: each path pattern is processed as its own chain,
so adjacency linking never crosses a path boundary.  Element
variables that share a name across paths are still merged into
the same element, so shared variables produce joins and paths
with no variable in common produce cross products.

An earlier version flattened all element patterns into a single
list, but that treated elements from adjacent paths as adjacent
within one path and broke on vertex-vertex boundaries.  The
per-path approach is the minimal fix; a more principled cross-path
join construction is left for this thread to settle.
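
To make the difference between the two approaches concrete, here is a
toy model in Python. It is only an illustration, not the patch's C
code: the (kind, name) tuple representation and the function names
adjacency_links() and merged_elements() are made up for this sketch.
It shows that flattening creates a spurious vertex-vertex link at the
path boundary, while per-path linking does not, and that a variable
name seen in two paths can still be collected in one place.

```python
from typing import Dict, List, Tuple

# Toy model: an element pattern is (kind, variable_name); a path is a list of them.
Element = Tuple[str, str]
Path = List[Element]


def adjacency_links(paths: List[Path], flatten: bool = False) -> List[Tuple[str, str]]:
    """Return (left_name, right_name) for every pair of neighbouring elements.

    With flatten=True all paths are concatenated into one chain first,
    which mimics the earlier flattened approach described above.
    """
    chains = [[e for p in paths for e in p]] if flatten else paths
    links = []
    for chain in chains:
        for left, right in zip(chain, chain[1:]):
            links.append((left[1], right[1]))
    return links


def merged_elements(paths: List[Path]) -> Dict[str, List[Tuple[int, int]]]:
    """Map each variable name to every (path_index, position) where it occurs."""
    seen: Dict[str, List[Tuple[int, int]]] = {}
    for path_index, path in enumerate(paths):
        for position, (_kind, name) in enumerate(path):
            seen.setdefault(name, []).append((path_index, position))
    return seen


# (a)-[e1]->(b), (b)-[e2]->(c)
paths = [
    [("vertex", "a"), ("edge", "e1"), ("vertex", "b")],
    [("vertex", "b"), ("edge", "e2"), ("vertex", "c")],
]

per_path = adjacency_links(paths)
flat = adjacency_links(paths, flatten=True)

# Links that exist only because the paths were flattened into one chain.
spurious = [link for link in flat if link not in per_path]
```

In this model `spurious` contains only the ("b", "b") link, i.e. the
vertex-vertex boundary between the two paths, and
`merged_elements(paths)["b"]` lists both occurrences of b.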

Examples
--------

Shared variable (join):

    MATCH (a IS vl1)-[e1 IS el1]->(b IS vl2),
          (b)-[e2 IS el2]->(c IS vl3)

    -- b is shared -> the two patterns are joined on b.

Star/hub:

    MATCH (a IS vl1)-[]->(b IS vl2),
          (a)-[]->(c IS vl3),
          (d IS vl2)-[]->(a)

    -- three patterns meeting at a.

Disconnected patterns (cross product):

    MATCH (a IS vl1), (b IS vl3)

Partial connection mixed with a disconnected piece:

    MATCH (a)-[]->(b), (b)-[]->(c), (d IS vl1)
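
One way to read the four examples above is to group the patterns by
shared variable names: patterns in the same group are joined, and
separate groups combine as a cross product. The Python sketch below
does that grouping with a small union-find. It is a reading aid under
that assumption, not the rewriter's actual algorithm; join_groups()
and the set-of-names representation are hypothetical, and anonymous
edges such as -[]-> are simply left out because they carry no name.

```python
from typing import Dict, List, Set


def join_groups(patterns: List[Set[str]]) -> List[List[int]]:
    """Group pattern indexes that are connected through shared variable names."""
    parent = list(range(len(patterns)))

    def find(i: int) -> int:
        # Follow parent pointers to the group representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_owner: Dict[str, int] = {}
    for index, names in enumerate(patterns):
        for name in sorted(names):
            if name in first_owner:
                # Same variable seen before: merge the two patterns' groups.
                parent[find(index)] = find(first_owner[name])
            else:
                first_owner[name] = index

    groups: Dict[int, List[int]] = {}
    for index in range(len(patterns)):
        groups.setdefault(find(index), []).append(index)
    return sorted(groups.values())


shared = join_groups([{"a", "e1", "b"}, {"b", "e2", "c"}])
star = join_groups([{"a", "b"}, {"a", "c"}, {"d", "a"}])
disconnected = join_groups([{"a"}, {"b"}])
mixed = join_groups([{"a", "b"}, {"b", "c"}, {"d"}])
```

Under this reading, the shared-variable and star examples each form a
single group, the disconnected example forms two groups, and the mixed
example forms one joined group plus a standalone (d IS vl1).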

Feedback requested
------------------

I would appreciate feedback along these axes:

  * standard conformance — whether the shape and the handling of
    multi-pattern MATCH are aligned with SQL/PGQ (ISO/IEC 9075-16);
  * semantics — whether the behavior on shared variables,
    disconnected patterns, and their combinations is the right one;
  * functionality — coverage gaps, cases the patch does not yet
    handle, or constructs that should be rejected but currently are
    not (and vice versa);
  * robustness — correctness under edge cases, error handling, and
    anything that could destabilize the existing GRAPH_TABLE path;
  * code shape — whether generate_queries_for_path_pattern() has
    grown large enough that the per-path body should be factored
    out into a helper.  I kept the function intact in this round,
    but would gladly split it if reviewers prefer that.

Review comments, objections, and alternative approaches are all
welcome — please don't hesitate to push back on anything that looks
off.

Thanks,
Henson

Reference to the SQL/PGQ main thread:
  https://postgr.es/m/CAAAe_zAEEAb=piH4n-mZUhqcL=oKbDv4v-_7C_7KyXroem=HUg@mail.gmail.com
