Home > mailing lists

Re: SQL Property Graph Queries (SQL/PGQ) - Mailing list pgsql-hackers

From	Henson Choi
Subject	Re: SQL Property Graph Queries (SQL/PGQ)
Date	January 5 16:57:53
Msg-id	CAAAe_zDbS-uDYgbQ3pQn+BD02JvCTek2LzpdK4b-xy48k7ZMmg@mail.gmail.com Whole thread Raw
In response to	Re: SQL Property Graph Queries (SQL/PGQ) (Henson Choi <assam258@gmail.com>)
List	pgsql-hackers

Tree view

Hi hackers,

I think it may be a bit early for this, but I wanted to share this design
document now so we can revisit and discuss when the time is right.

=====================================================
Shortestpath Implementation Strategies for PostgreSQL
=====================================================

1. Overview
-----------

This document describes two implementation strategies for shortest path
queries in PostgreSQL-based graph databases.

Shortestpath Query Example:

MATCH p = shortestpath((a)-[*]->(b)) RETURN p
MATCH p = allshortestpaths((a)-[:knows*]->(b)) RETURN p

Two implementation approaches:

1. Unoptimized: Query rewriting to WITH RECURSIVE (CTE)
2. Optimized: Custom executor node with bidirectional BFS

2. Unoptimized Approach: WITH RECURSIVE Rewriting
-------------------------------------------------

2.1 Query Transformation

Cypher query:

MATCH p = shortestpath((a:person)-[:knows*]->(b:person))
WHERE a.name = 'Alice' AND b.name = 'Bob'
RETURN p

Rewritten SQL using WITH RECURSIVE:

WITH RECURSIVE sp_paths AS (
-- Base case: start vertex
SELECT v.id as start_vid,
v.id as end_vid,
ARRAY[v.id] as vertex_ids,
ARRAY[]::bigint[] as edge_ids,
0 as depth
FROM person v
WHERE name = 'Alice'

UNION ALL

-- Recursive case: expand one hop
SELECT p.start_vid,
e.end_id,
p.vertex_ids || e.end_id,
p.edge_ids || e.id,
p.depth + 1
FROM sp_paths p
JOIN knows e ON p.end_vid = e.start_id
WHERE e.end_id <> ALL(p.vertex_ids) -- Cycle prevention
)
SELECT vertex_ids, edge_ids, depth
FROM sp_paths
WHERE end_vid = (SELECT id FROM person WHERE name = 'Bob')
ORDER BY depth
LIMIT 1;

2.2 Execution Plan

Limit
-> Sort
Sort Key: depth
-> CTE Scan on sp_paths
Filter: (end_vid = $bob_id)
CTE sp_paths
Recursive Union
Seq Scan on person v
Filter: (name = 'Alice')
Hash Join
Hash Cond: (p.end_vid = e.start_id)
Filter: (e.end_id <> ALL(p.vertex_ids))
WorkTable Scan on sp_paths p
Hash
Seq Scan on knows e

2.3 Problems

┌─────────────────────┬──────────────────────────────────────────┐
│ Problem │ Description │
├─────────────────────┼──────────────────────────────────────────┤
│ Unidirectional │ Search expands only from start vertex │
│ │ Misses optimization from end vertex │
├─────────────────────┼──────────────────────────────────────────┤
│ Full Exploration │ Must explore all paths up to min depth │
│ │ before finding shortest │
├─────────────────────┼──────────────────────────────────────────┤
│ No Early Termination│ Cannot stop when shortest found │
│ │ CTE runs to completion │
├─────────────────────┼──────────────────────────────────────────┤
│ Memory Usage │ O(V) visited vertices per iteration │
│ │ All partial paths stored │
├─────────────────────┼──────────────────────────────────────────┤
│ Cycle Prevention │ <> ALL(array) grows with path length │
│ │ O(path_length) per edge check │
└─────────────────────┴──────────────────────────────────────────┘

2.4 Search Space Comparison

Unidirectional BFS (CTE):

Start ──────────────────────────────────────> End
╲ ╱
╲ Explores entire cone ╱
╲ from start ╱
╲ ╱
╲ Search space: O(B^D) ╱
╲ ╱
────────────────────────

Bidirectional BFS (Optimized):

Start ────────> <──────── End
╲ ╱╲ ╱
╲ ╱ ╲ ╱
╲ ╱ ╲ ╱
╲ ╱ Meet ╲ ╱
╲╱ Point ╲
Search space: O(2 * B^(D/2))

3. Optimized Approach: Custom Executor Node
-------------------------------------------

3.1 Design Principle

Instead of query rewriting, implement a dedicated executor node that:

- Performs bidirectional BFS from both start and end vertices
- Uses hash tables to track visited vertices from each direction
- Detects meeting point where both searches intersect
- Terminates early when shortest path is found

3.2 Plan Node Structure

┌────────────────────────────────────────────────────────────┐
│ ShortestpathPlan │
│ - Bidirectional BFS with dual hash tables │
│ - Early termination on path found │
└────────────────────────────────────────────────────────────┘
│ │
left │ │ right
▼ ▼
┌───────────────────────────┐ ┌───────────────────────────┐
│ Hash2Side (Left) │ │ Hash2Side (Right) │
│ - KeyTable + HashTable │ │ - KeyTable + HashTable │
│ - Forward direction │ │ - Reverse direction │
└───────────────────────────┘ └───────────────────────────┘
│ │
▼ ▼
┌───────────────────────────┐ ┌───────────────────────────┐
│ SubPlan (Forward) │ │ SubPlan (Reverse) │
│ Index Cond: │ │ Index Cond: │
│ start_id = $current │ │ end_id = $current │
└───────────────────────────┘ └───────────────────────────┘

Execution Flow:

Initialize:
Left HashTable = {start_vertex}
Right HashTable = {end_vertex}
│
▼
Expansion loop (smaller side first):
│
├──→ Compare |Left frontier| vs |Right frontier|
│ │
│ └──→ Expand smaller side
│ │
│ └──→ Check intersection with other side
│
├──→ Repeat until intersection found
│
└──→ Return shortest path(s)

3.3 Key Concept: Dual Hash Tables

The executor maintains two hash tables for bidirectional search:

┌─────────────────────────────────────────────────────────────┐
│ Left Direction (Start → End) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ KeyTable: Previous hop vertices (expansion base) │ │
│ │ HashTable: Current hop results (newly discovered) │ │
│ └─────────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────┤
│ Right Direction (End → Start) │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ KeyTable: Previous hop vertices (expansion base) │ │
│ │ HashTable: Current hop results (newly discovered) │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

After each hop:
- Swap KeyTable and HashTable (new base for next hop)
- Check for intersection between Left and Right HashTables
- If intersection found, reconstruct path through meeting point

3.4 Key Concept: Smaller Side Expansion

At each hop, expand from the direction with fewer frontier vertices.
This minimizes the total number of edges scanned.

Example (highly unbalanced tree):

Graph structure:
- Forward (Left->Right): branching factor = 100
- Backward (Right->Left): branching factor = 2
- Path distance: 6 hops

Simple alternating:

Hop 1: expand Left (1) -> Left=100, Right=1
Hop 2: expand Right (1) -> Left=100, Right=2
Hop 3: expand Left (100) -> Left=10,000, Right=2
Hop 4: expand Right (2) -> Left=10,000, Right=4
Hop 5: expand Left (10,000) -> Left=1,000,000, Right=4
Hop 6: expand Right (4) -> meet!

Total vertices expanded: 1 + 1 + 100 + 2 + 10,000 + 4 = 10,108

With smaller side expansion:

Hop 1: Left=1, Right=1 -> expand Left -> Left=100, Right=1
Hop 2: 1 < 100 -> expand Right -> Left=100, Right=2
Hop 3: 2 < 100 -> expand Right -> Left=100, Right=4
Hop 4: 4 < 100 -> expand Right -> Left=100, Right=8
Hop 5: 8 < 100 -> expand Right -> Left=100, Right=16
Hop 6: 16 < 100 -> expand Right -> meet!

Total vertices expanded: 1 + 1 + 2 + 4 + 8 + 16 = 32

Speedup: 10,108 / 32 = ~300x fewer vertices explored

3.5 Key Concept: Cycle Detection via Hash Lookup

Hash table insertion automatically prevents revisiting vertices.

Two tuple types in HashTable:

┌─────────────────────────────────────────────────────────────┐
│ Long Tuple (path data) │
│ ┌─────────┬─────────┬─────────────────────────────────┐ │
│ │ Header │ Graphid │ Edge Rowids (path to vertex) │ │
│ └─────────┴─────────┴─────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────┤
│ Short Tuple (visited marker) │
│ ┌─────────┬─────────┐ │
│ │ Header │ Graphid │ <- No path data, just marker │
│ └─────────┴─────────┘ │
└─────────────────────────────────────────────────────────────┘

Insertion logic:

1. Inserting long tuple (with path):
- Check if short tuple (marker) exists for same Graphid
- If marker exists: reject insertion (already visited)

2. Inserting short tuple (marker):
- Insert marker first
- Remove all existing long tuples with same Graphid
- Vertex is now "locked" - no more paths to this vertex

Example:

Left Hop 1: Expand from A
HashTable = {B(long), C(long)} <- paths A->B, A->C

Left and Right meet at C -> shortest path found

Insert C(short) marker:
HashTable = {B(long), C(short)} <- C's long tuple removed

Later, another path tries to reach C (e.g., D->C):
-> C(short) marker exists -> insertion rejected
-> Already processed at shortest hop

Benefits:
- O(1) duplicate detection via hash lookup
- No separate "visited" set needed
- Automatic cleanup of superseded paths
- Memory efficient (markers are minimal size)

3.6 Key Concept: Batch Synchronization

When hash table exceeds memory, it splits into multiple batches.
Left and Right hash tables must have matching batch structure:

Problem without synchronization:

Left: nbatch=1 [all data in batch 0]
Right: nbatch=4 [batch0] [batch1] [batch2] [batch3]

-> Cannot find intersection (different batch structures)

Solution:

When Right splits into 4 batches, clone structure to Left:

Left: nbatch=4 [batch0] [batch1] [batch2] [batch3]
Right: nbatch=4 [batch0] [batch1] [batch2] [batch3]
| | | |
Join only matching batches

This ensures correct intersection detection across batch boundaries.

3.7 Execution Plan (Optimized)

EXPLAIN MATCH p = shortestpath((a:person)-[:knows*]->(b:person))
WHERE a.name='Alice' AND b.name='Bob' RETURN p

Shortestpath (limit=1)
-> Hash2Side (left direction=FORWARD)
-> Index Scan on knows e
Index Cond: (start_id = $current_vertex)
-> Hash2Side (right direction=REVERSE)
-> Index Scan on knows e
Index Cond: (end_id = $current_vertex)

4. Comparison Summary
---------------------

┌─────────────────────┬─────────────────────────┬─────────────────────────┐
│ Aspect │ Unoptimized (CTE) │ Optimized (Executor) │
├─────────────────────┼─────────────────────────┼─────────────────────────┤
│ Implementation │ Query rewriting │ Custom executor node │
├─────────────────────┼─────────────────────────┼─────────────────────────┤
│ Search Direction │ Unidirectional │ Bidirectional │
├─────────────────────┼─────────────────────────┼─────────────────────────┤
│ Search Space │ O(B^D) │ O(2 * B^(D/2)) │
├─────────────────────┼─────────────────────────┼─────────────────────────┤
│ Early Termination │ Not possible │ Stop when path found │
├─────────────────────┼─────────────────────────┼─────────────────────────┤
│ Data Structure │ WorkTable (tuplestore) │ Dual hash tables │
├─────────────────────┼─────────────────────────┼─────────────────────────┤
│ Cycle Prevention │ Array membership check │ Hash table lookup O(1) │
├─────────────────────┼─────────────────────────┼─────────────────────────┤
│ Code Complexity │ Low (rewriting only) │ High (new executor) │
├─────────────────────┼─────────────────────────┼─────────────────────────┤
│ PostgreSQL Changes │ None │ New plan/executor node │
└─────────────────────┴─────────────────────────┴─────────────────────────┘

5. Search Space Example
-----------------------

Graph: Social network, avg 100 connections per person
Query: shortestpath((Alice)-[*]->(Bob)), actual distance = 6 hops

Unidirectional BFS (CTE):
Hop 1: 100 vertices
Hop 2: 10,000 vertices
Hop 3: 1,000,000 vertices
Hop 4: 100,000,000 vertices
Hop 5: 10,000,000,000 vertices
Hop 6: Find Bob

Total vertices explored: ~10 billion

Bidirectional BFS with smaller side expansion:
Hop 1: Left=1, Right=1 -> expand Left -> Left=100
Hop 2: 1 < 100 -> expand Right -> Right=100
Hop 3: 100 = 100 -> expand Left -> Left=10,000
Hop 4: 100 < 10,000 -> expand Right -> Right=10,000
Hop 5: 10,000 = 10,000 -> expand Left -> Left=1,000,000
Hop 6: 10,000 < 1,000,000-> expand Right -> Meet!

Total vertices explored: ~1 million (each side explores ~500K)

Speedup: ~10,000x fewer vertices explored

6. Key Techniques Summary
-------------------------

┌────┬───────────────────────────┬────────────────────────────────────────┐
│ # │ Technique │ Benefit │
├────┼───────────────────────────┼────────────────────────────────────────┤
│ 1 │ Bidirectional BFS │ Exponential reduction in search space │
│ 2 │ Dual Hash Tables │ O(1) visited check, O(1) intersection │
│ 3 │ Smaller Side Expansion │ Expand from side with fewer vertices │
│ 4 │ Early Termination │ Stop immediately when path found │
│ 5 │ Cycle Detection │ Hash lookup prevents revisiting vertex │
│ 6 │ Batch Synchronization │ Match nbatch between Left/Right hashes │
│ 7 │ Path Reconstruction │ Merge left and right paths at meeting │
│ 8 │ Parameterized Subplan │ Reuse edge scan plan per vertex │
│ 9 │ Batch Processing │ Handle memory overflow with batching │
│ 10 │ Limit Control │ shortestpath (1) vs allshortestpaths │
└────┴───────────────────────────┴────────────────────────────────────────┘

7. Conclusion
-------------

The unoptimized CTE-based approach is simpler to implement but suffers from:

- Unidirectional search with O(B^D) complexity
- No early termination capability
- Inefficient cycle prevention with arrays
- Must explore all paths before finding shortest

The optimized executor-based approach requires more implementation
effort but provides:

- Bidirectional BFS with O(B^(D/2)) complexity
- Early termination when shortest path found
- Efficient hash-based visited tracking
- Support for both shortestpath and allshortestpaths

For production graph database systems, the bidirectional BFS approach
is essential for practical shortest path queries on large graphs.

This document focuses on the conceptual design. Implementation details
such as hash table sizing, batch overflow handling, and path reconstruction
algorithms are left for follow-up discussion.

pgsql-hackers by date:

From: Henson Choi
Date: 05 January, 16:55:52
Subject: Re: SQL Property Graph Queries (SQL/PGQ)

From: Alvaro Herrera
Date: 05 January, 17:07:00
Subject: Re: Adding REPACK [concurrently]

Re: SQL Property Graph Queries (SQL/PGQ) - Mailing list pgsql-hackers

Previous

Next