Re: SQL Property Graph Queries (SQL/PGQ) - Mailing list pgsql-hackers
| From | Henson Choi |
|---|---|
| Subject | Re: SQL Property Graph Queries (SQL/PGQ) |
| Date | |
| Msg-id | CAAAe_zBLUHXKODFD=_AKggERF7=i8PAz0ag4LhpLzOP5PXfbFA@mail.gmail.com Whole thread Raw |
| In response to | Re: SQL Property Graph Queries (SQL/PGQ) (Junwang Zhao <zhjwpku@gmail.com>) |
| Responses |
Re: SQL Property Graph Queries (SQL/PGQ)
|
| List | pgsql-hackers |
Hi Junwang,
Thanks for the feedback!
Thanks for the feedback!
2026년 1월 9일 (금) PM 7:51, Junwang Zhao <zhjwpku@gmail.com>님이 작성:
On Mon, Jan 5, 2026 at 9:56 PM Henson Choi <assam258@gmail.com> wrote:
>
> Hi hackers,
>
> I think it may be a bit early for this, but I wanted to share this design
> document now so we can revisit and discuss when the time is right.
>
> ===================================================================
> Variable Length Edge (VLE) Implementation Strategies for PostgreSQL
> ===================================================================
>
> 1. Overview
> -----------
>
> This document describes two implementation strategies for Variable
> Length Edge (VLE) traversal in PostgreSQL-based graph databases.
>
> VLE Query Example:
>
> MATCH (a)-[*1..5]->(b) RETURN a, b
> MATCH p = (a)-[:knows*]->(b) RETURN p
>
> Two implementation approaches:
>
> 1. Unoptimized: Query rewriting to WITH RECURSIVE (CTE)
> 2. Optimized: Custom executor node with DFS algorithm
>
Interesting ideas. I was thinking about expanding the edge pattern using the
lower and upper but it seems it can not handle the * case.
Right. With a bounded quantifier like -[*1..3]->, we can unroll it into
a fixed number of JOINs at compile time. But -[*]-> has no upper bound
known at compile time, so unrolling isn't possible.
That said, the traversal is still finite due to the edge uniqueness
constraint (e.id <> ALL(p.edge_ids)) which enforces TRAIL semantics.
This prevents traversing the same edge twice, not revisiting vertices.
In a graph with E edges, the maximum path length is E. The challenge
isn't infinity - it's that we need runtime iteration with dynamic
termination.
a fixed number of JOINs at compile time. But -[*]-> has no upper bound
known at compile time, so unrolling isn't possible.
That said, the traversal is still finite due to the edge uniqueness
constraint (e.id <> ALL(p.edge_ids)) which enforces TRAIL semantics.
This prevents traversing the same edge twice, not revisiting vertices.
In a graph with E edges, the maximum path length is E. The challenge
isn't infinity - it's that we need runtime iteration with dynamic
termination.
ISTM these two approaches can both work when we need runtime
pruning, for example if we are going to support path mode.
Edge predicates ([e* WHERE ...]) are applied at each hop, while path-level
predicates (length, aggregates) require completion. The real difference is
the execution model: CTE uses BFS storing all intermediate paths, whereas
a custom executor enables DFS with backtracking and O(depth) memory.
The first option can be costly, since you have to pre calculate the recursive
CTE, it can not utilize the predicate from previous hops. Can it work with
quantifier * case?
Yes, CTE can handle * - the edge uniqueness constraint (e.id <> ALL(...))
ensures termination.
As mentioned above, edge predicates ([e* WHERE ...]) go into the recursion,
while path-level predicates (length, aggregates) stay outside. Since e* is
a collection, scalar access like "WHERE e.weight > 50" outside the pattern
would be invalid - edge filtering belongs inside [e* WHERE ...].
ensures termination.
As mentioned above, edge predicates ([e* WHERE ...]) go into the recursion,
while path-level predicates (length, aggregates) stay outside. Since e* is
a collection, scalar access like "WHERE e.weight > 50" outside the pattern
would be invalid - edge filtering belongs inside [e* WHERE ...].
The second option seems more promising, but it needs to touch the planner
and executor. I guess you need a special node to represent the VLE in
the rewrite phase.
Right. Whether to introduce new syntax or hijack existing constructs
(like SEARCH DEPTH FIRST) needs deeper thought.
This won't be easy, but it's an unavoidable challenge. Adding a new plan
node is one thing, but PlanState stack with push/pop could affect all
executor nodes. I expect this will spark lively discussion among planner
hackers.
Best regards,
Henson
(like SEARCH DEPTH FIRST) needs deeper thought.
This won't be easy, but it's an unavoidable challenge. Adding a new plan
node is one thing, but PlanState stack with push/pop could affect all
executor nodes. I expect this will spark lively discussion among planner
hackers.
Best regards,
Henson
pgsql-hackers by date: