Re: SQL Property Graph Queries (SQL/PGQ) - Mailing list pgsql-hackers
From | Junwang Zhao |
---|---|
Subject | Re: SQL Property Graph Queries (SQL/PGQ) |
Date | |
Msg-id | CAEG8a3JrO--th6N6oQebqAvZZBHShOJ5Z-fZSS8uNJZs-tQ_tQ@mail.gmail.com Whole thread Raw |
In response to | Re: SQL Property Graph Queries (SQL/PGQ) (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
List | pgsql-hackers |
Hi Ashutosh, On Mon, Apr 7, 2025 at 1:19 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Sat, Apr 5, 2025 at 6:20 PM Junwang Zhao <zhjwpku@gmail.com> wrote: > > > > Hi Ashutosh and Peter, > > > > Since this PGQ feature won't be in PG 18, I'd like to raise a discussion of > > the possibility of implementing the quantifier feature, which I think is a > > quite useful feature in the graph database area. > > I agree that quantifiers feature is very useful; it's being used in > many usecases. However, it's a bit of a complex feature. IMO, we > should keep that discussion as well as the patch in a separate thread, > so that this patchset doesn't grow too large to review and also > discussion in this thread can remain focused. Once we get the current > patch set reviewed and committed we can tackle the quantifier problem > in a separate discussion. Of course that doesn't mean that we can not > start discussion, try POC and even a working patch for quantifier > support. > > Peter may think otherwise. > > > > > I'll start with a graph definition first. > > > > `Person(id, name, age, sex)` with id as PK > > `Knows(id, start_id, end_id, since)` with id as PK, start_id and > > end_id FK referencing Person's id > > > > insert into Person values(1, 'A', 31, 'M'), (2, 'B', 30, 'F'), (3, > > 'C', 33, 'M'), (4, 'D', 31, 'F'), (5, 'E', 32, 'M'), (6, 'F', 33, > > 'M'); > > insert into Knows values (1, 1, 2, '2020'); -- A knows B since 2020 > > insert into Knows values (2, 1, 3, '2021'); -- A knows C since 2021 > > insert into Knows values (3, 1, 4, '2020'); -- A knows D since 2020 > > insert into Knows values (4, 2, 4, '2023'); -- B knows D since 2023 > > insert into Knows values (5, 3, 5, '2022'); -- C knows E since 2022 > > insert into Knows values (6, 2, 6, '2021'); -- B knows F since 2021 > > insert into Knows values (7, 4, 6, '2020'); -- D knows F since 2020 > > > > Then we create a property graph: > > > > CREATE property graph new_graph > > VERTEX TABLES (Person) > > EDGE TABLES (Knows); > > > > If we want to find A's non-directly known friends within 3 hops, we can query: > > > > select name from graph_table (new_graph match (a:Person WHERE a.name = > > 'A') --> (b:Person) --> (c:Person) COLUMNS (c.name)) > > union > > select name from graph_table (new_graph match (a:Person WHERE a.name = > > 'A') --> (b:Person) -->(c:Person)-->(d:Person) COLUMNS (d.name)); > > > > Or if we support quantifier, we can simply the query as: > > > > select name from graph_table (new_graph match (a:Person WHERE a.name = > > 'A') -->{2,3} (b:Person) COLUMNS (b.name)); > > > > In the current design of PostgreSQL, we can rewrite this pattern with > > quantifiers to > > the union form with some effort. > > > > But what if the pattern is more complicated, for example: > > > > 1. select name, since from graph_table (new_graph match (a:Person > > WHERE a.name = 'A') -[r:Knows]->{2,3} (b:Person) COLUMNS (b.name, > > r.since)); > > Can we support the r.since column? I guess not, in this case r is a > > variable length edge. > > > > 2. select name, count from graph_table (new_graph match (a:Person > > WHERE a.name = 'A') -[r:Knows]->{2,3} (b:Person) COLUMNS (b.name, > > count(r))); > > Can we support this count aggregation(this is called horizontal > > aggregation in Oracle's pgql)? How can the executor know the length of > > the variable length edge? > > > > 3. What if the query doesn't specify the Label of edge, and there can > > be different edge labels of r, can we easily do the rewrite? > > > > I did some study of the apache age, they have fixed columns for node > > labels(id, agtype) > > and edge labels(id, source_id, end_id, agtype), agtype is kind of > > json. So they can > > resolve the above question easily. > > > > Above are just my random thoughts of the quantifier feature, I don't have a copy > > of the PGQ standard, so I'd like to hear your opinion about this. > > > > I think the questions you have raised are valid. If we decide to > discuss this in a separate thread, I will start that thread just by > responding to these questions and design I have in mind. I'm ok with starting a new thread for quantifier discussion, and I'd really happy to know your design on this. > > -- > Best Wishes, > Ashutosh Bapat -- Regards Junwang Zhao
pgsql-hackers by date: