Re: SQL Property Graph Queries (SQL/PGQ) - Mailing list pgsql-hackers
| From | Ashutosh Bapat |
|---|---|
| Subject | Re: SQL Property Graph Queries (SQL/PGQ) |
| Date | |
| Msg-id | CAExHW5tR4O0vjeqTCPr2VB5pYjNYbJgbCBEQf63NtU5Pz1MiOQ@mail.gmail.com Whole thread |
| In response to | Re: SQL Property Graph Queries (SQL/PGQ) (Peter Eisentraut <peter@eisentraut.org>) |
| Responses |
Re: SQL Property Graph Queries (SQL/PGQ)
|
| List | pgsql-hackers |
On Wed, Mar 25, 2026 at 2:11 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 20.03.26 14:42, Ashutosh Bapat wrote: > >> We should either: > >> > >> (1) Support G041 fully -- both directions. Junwang's patch handles > >> this cleanly with minimal code change, or > >> (2) Not support G041 -- reject all non-local element pattern > >> references, throwing an error for both directions. > >> > >> What do you think? > > > > We will support the cross variable references in future once we > > support more path patterns leading to a more mature code in that area. > > But it doesn't harm to support backward references which can be easily > > supported now. I don't see the code to do so getting disturbed as it > > matures. For the segmentation fault a simple fix to add graph_pattern > > suffices for now. Let's see if Peter has a different opinion. > > I think we should not allow non-local references in either direction > (until we implement feature G041), option 2 above. Allowing only > backward references seems kind of an arbitrary implementation artifact. > Sorry, I didn't explain well earlier. It isn't arbitrary. It mimics backward lateral references. Consider following query Query1 SELECT x1.a, g.* FROM x1, GRAPH_TABLE (myshop MATCH (x1 IS customers)-[IS customer_orders]->(o IS orders WHERE o.order_id = x1.a) COLUMNS (x1.name AS customer_name, x1.customer_id AS cid)) g; x1 is a table referenced in from clause as well as an element pattern variable in path pattern. If we don't allow backward (cross) referencing, x1.a in the element pattern o resolves to the table x1's column a, but x1.name resolves to element x1's property reference name. So within the same graph pattern x1 may get resolved to two different things, even though x1 was declared before using it. Is that how we expect it to happen? Let's call this inconsistency1. However, if we allow backward references, we resolve x1 to the table column reference till the point where x1 appears in the path pattern and then onwards resolve x1 to the element pattern variable. So x1.a and x1.name are both resolved into a property of element x1. Ofc, in this example x1.a will result in an error since there is no property with name a in the property graph. But in case label "customers" also exposes property a, x1.a will be resolved to a property reference - applying resolution rule consistently everywhere. The behaviour remains consistent even with a forward reference like Query 2 SELECT x1.a, g.* FROM x1, GRAPH_TABLE (myshop MATCH (c IS customers WHERE c.address = 'US' AND c.customer_id = x1.a)-[IS customer_orders]->(x1 IS orders) COLUMNS (c.name AS customer_name, c.customer_id AS cid, x1.order_id)) g; x1 in element pattern c is resolved to table x1 since that's the only known resolution so far. x1.order_id however gets resolved to order_id property of label orders as by then there is a closer reference available. If we do not support backward references now, but start supporting it later, the Query 1's semantics will change, breaking backward compatibility. Let's call this "backward incompatibility 1". If we do not support forward element pattern references, that will be inline with the fact that we don't support forward lateral references. Those products which do not support lateral with graph_table don't have to worry about any of these problems when supporting cross element references. We have the following options 1. Don't support cross referencing element variables, but then we have "inconsistency1". 2. Support backward referencing element variables, with a risk of "backward incompatibility 1". I think the standard leaves room to not support forward referencing at all while supporting backward referencing 3. Support both backward and forward referencing now, with a risk that it may become a hurdle when we support more advanced features in graph pattern matching. Given that we are in an early phase of implementing PGQ query support, avoiding such hurdles is better. I am leaning towards 2 + not supporting forward referencing at all. But in the attached patchset I have implemented option 1 as you suggested. If you are fine with option 1, let's use that patch. On Wed, Mar 25, 2026 at 3:04 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 25.03.26 10:12, Ashutosh Bapat wrote: > > On Wed, Mar 25, 2026 at 2:27 PM Peter Eisentraut <peter@eisentraut.org> wrote: > >> > >> This functionality that you can leave off empty vertex patterns at the > >> beginning and end is part of the optional feature G047, which we don't > >> claim to support. So we should reject these path patterns. > >> > > > > Thanks for the clarification. What about the implicit empty vertex > > patterns in the middle of the path? Should we reject these path > > patterns also? > Yes, I think so. > > The specification of feature G047 is: > > "... in conforming SQL language, any <edge pattern> shall be immediately > preceded and followed by a <vertex pattern>." > > So if you write > > () -[]-> -[]-> () > > then the first edge pattern is not followed immediately by a vertex > pattern, and so it would be invalid without feature G047. Implemented it this way in the attached patchset. 0001: is a small adjustment to make sure that we add an element variable name to the graph table namespace only once. This isn't a problem right now, but we will have to do that anyway in [1] and it has some effect one 0002. 0002: prohibits cross element variable references within graph_table clause. It adds a new member GraphTableParseState::cur_variable to track the variable name of the current pattern being transformed. We can not use llast() simply because a. pattern currently being transformed may not have name, so llast will be misleading b. changes in 0001. We may want to combine 0001 and 0002 when committing. 0003: prohibits consecutive element patterns of the same kind 0004: some cleanup remaining from 5282bf535e474dc2517f2e835d147420ae2144de. We may want to wait to accumulate more cleanups. [1] https://www.postgresql.org/message-id/CAExHW5tYCE9QyCvVraKUeesKW5RTR%2Bmrzsg3u64qSps-RPJR5A%40mail.gmail.com -- Best Wishes, Ashutosh Bapat
Attachment
pgsql-hackers by date: