SQL/PGQ: All properties reference - Mailing list pgsql-hackers
| From | Ashutosh Bapat |
|---|---|
| Subject | SQL/PGQ: All properties reference |
| Date | |
| Msg-id | CAExHW5tYCE9QyCvVraKUeesKW5RTR+mrzsg3u64qSps-RPJR5A@mail.gmail.com |
| Responses | Re: SQL/PGQ: All properties reference |
| List | pgsql-hackers |
Hi Peter and Pgsql-Hackers,

I am starting a new thread to discuss the all properties reference feature, which was not committed with the main patch [1].

A <variable>.* is called an all properties reference and is allowed only in the COLUMNS clause. Interpreting subclauses 9.2 and 9.3 together, it expands to a list of graph property references <variable>.p1, ..., <variable>.pn, where p1, ..., pn are the properties of the labels which satisfy the label expression in the element pattern identified by <variable>. The graph property references are added to the COLUMNS clause in place of the all properties reference, just like <table>.* expands in SELECT's targetlist.

In the current implementation, we delay resolving graph property references (<variable>.<property>) until the query is generated (generate_query_for_graph_path()). If we delay resolving the all properties reference until then, we cannot determine the data types and names of the columns in the COLUMNS list. So we need to do that when the COLUMNS clause is resolved. This means that the properties associated with the labels need to be resolved earlier. Since the properties are not associated with labels directly but through the elements, we need to find at least one element for every label in the label expression. In brief, all the namespace resolution needs to happen before we transform the COLUMNS clause. The patch rearranges the code that way.

I like the resultant code since
a. it handles errors more gracefully and can provide the error location as well
b. it avoids repeated property and element label lookups
c. the code seems closer to how we create namespaces for table references.

On the flip side, I think it fetches properties which may or may not be needed by the query, similar to how we compute the tuple descriptor of a table referenced in the query even though we don't need all the columns. But overall I think it's better code as well.

There are some things that still need work, as below.

o. Order of properties in all properties reference
----------------------------------------------------------------
The standard (subclause 9.3) mentions this as implementation dependent. Since properties are associated with labels, which are logical entities, I don't think we need to define property numbers like attribute numbers. The natural order is ordering by property name (even across labels). Do we have any other option?

o. Difference from Oracle
----------------------------------
Consider the following query from the patch (please refer to graph_table.sql for details):

    SELECT * FROM GRAPH_TABLE (g1 MATCH (src IS vl1 | vl2 WHERE src.vprop1 = 10 OR src.vprop1 = 1020) COLUMNS (src.*));

The element pattern has only two labels, vl1 and vl2, in it. If I understand subclauses 9.2 and 9.3 correctly, only the properties of the labels that appear in the label expression should be part of the all properties reference. The expected output in the patch is based on this interpretation, which has columns vname | vprop1 | vprop2 | lprop1. However, Oracle includes elname as well, which is not associated with vl1 or vl2. I think Oracle's output is wrong. But maybe I am misinterpreting those clauses. Peter, what do you think?

o. pg_node_attrs for new fields
-----------------------------------------
Need to think about pg_node_attrs for the new members of GraphLabelRef added in the patch. Possibly the current annotation is right, but I need to check again.

o. Need to document what a <variable>.* expands to
Planning to work on it after the main patch is somewhat stable.

[1] postgr.es/m/bcf58f6e-d0bd-49f8-b074-e3ee69ef6567@eisentraut.org

--
Best Wishes,
Ashutosh Bapat
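The expansion rule discussed in the message can be modelled with a small sketch. This is not PostgreSQL code and not the patch's implementation: the function expand_all_properties, the label names l1/l2/l3 and their properties are all hypothetical, and the label expression is simplified to a plain list of the labels named in it. The sketch assumes the interpretation given in the message (only properties of labels appearing in the label expression are included) and the name-based ordering proposed there.

```python
from typing import Dict, Iterable, List


def expand_all_properties(
    variable: str,
    labels_in_expression: Iterable[str],
    label_properties: Dict[str, List[str]],
) -> List[str]:
    """Expand <variable>.* into <variable>.<property> references.

    Hypothetical model: collects the properties of every label named in
    the label expression, de-duplicates them, and orders them by name.
    """
    names = set()
    for label in labels_in_expression:
        if label not in label_properties:
            # Resolving labels up front lets the error name the bad label.
            raise KeyError(f"label {label!r} does not exist in the graph")
        names.update(label_properties[label])
    # Order by property name, across labels (the ordering floated above).
    return [f"{variable}.{name}" for name in sorted(names)]


# Made-up graph catalog: label -> property names (not taken from the patch).
label_properties = {
    "l1": ["name", "p1", "p2"],
    "l2": ["name", "p1", "q1"],
    "l3": ["name", "other"],
}

# Models COLUMNS (src.*) for an element pattern (src IS l1 | l2).
columns = expand_all_properties("src", ["l1", "l2"], label_properties)
print(columns)
```

Under this model the property "other", which belongs only to l3, is left out of the expansion, which mirrors the reading under which elname would not appear for (src IS vl1 | vl2).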