Re: Idea: Avoid JOINs by using path expressions to follow FKs - Mailing list pgsql-hackers
From | Martin Jonsson |
---|---|
Subject | Re: Idea: Avoid JOINs by using path expressions to follow FKs |
Date | |
Msg-id | 2128453883.3993456.1617218663313@mail.yahoo.com Whole thread Raw |
In response to | Re: Idea: Avoid JOINs by using path expressions to follow FKs (Pavel Stehule <pavel.stehule@gmail.com>) |
List | pgsql-hackers |
SAP has implemented something similar all across their stack. In their HANA database, application platform ABAP and also their cloud. So clearly they find it very popular:-) It is called CDS (Core Data Services) views. Here is a quick overview:
- Superset of SQL to declare views and associations between views. They are views with sort of named joins. The code is parsed and stored as a normal SQL view as well as metadata. Note this metadata is not technically part of the database SQL layer but rather the database application layer. The user normally sees no difference.
- Superset of SQL to query in a very similar way as described above with paths. This is parsed to normal SQL with joins, taking into consideration above metadata. Can only work on the above views.
This has obvious limitations, most mentioned earlier in this thread. Specifically, join types are limited. Still it eases the pain considerably of writing queries. The SAP system I work on now has 400K tables and over 1 million fields. Most keys are composite.. One needs to be a super hero to keep that data model in memory....
This might be an extreme case but I'm sure there are other use cases. SAP technical users are actually quite happy to work with it since, in my humble opinion, it is in a way SQL light. The nice data model parts without the pesky complicated stuff.
It is not really an ORM but it makes ORM work significantly simpler by keeping the metadata on the database. The hierarchical CRUD stuff of ORM legend is squarely out of scope.
I've been seeing this type of question appearing regularly in this forum and maybe this SAP way holds water as a solution? In that case, the work should probably be in user land but close to the database. Maybe as as extension with a SQL preprocessor? Much of the grammar and parsing work is already available, at least as inspiration.
Martin
Le lundi 29 mars 2021, 12:48:58 UTC+2, Pavel Stehule <pavel.stehule@gmail.com> a écrit :
po 29. 3. 2021 v 12:01 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Sun, Mar 28, 2021, at 16:04, Tom Lane wrote:I'm imagining a syntax in whichyou give the constraint name instead of the column name. Thoughtexperiment: how could the original syntax proposal make any use ofa multi-column foreign key?Thanks for coming up with this genius idea.At first I didn't see the beauty of it; I wrongly thought the constraint name needed to beunique per schema, but I realize we could just use the foreign table's nameas the constraint name, which will allow a nice syntax:SELECT DISTINCT order_details.orders.customers.company_nameFROM order_detailsWHERE order_details.products.product_name = 'Chocolade';
This syntax is similar to Oracle's object references (this is example from thread from Czech Postgres list last week)
Select e.last_name employee,
e.department_ref.department_name department,
e.department_ref.manager_ref.last_name dept_manager
From employees_obj e
where e.initials() like 'K_';
e.department_ref.department_name department,
e.department_ref.manager_ref.last_name dept_manager
From employees_obj e
where e.initials() like 'K_';
I see few limitations: a) there is not support for outer join, b) there is not support for aliasing - and it probably doesn't too nice, when you want to returns more (but not all) columns
Regards
Pavel
Given this data model:CREATE TABLE customers (customer_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,company_name text,PRIMARY KEY (customer_id));CREATE TABLE orders (order_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,customer_id bigint NOT NULL,PRIMARY KEY (order_id),CONSTRAINT customers FOREIGN KEY (customer_id) REFERENCES customers);CREATE TABLE products (product_id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,product_name text NOT NULL,PRIMARY KEY (product_id));CREATE TABLE order_details (order_id bigint NOT NULL,product_id bigint NOT NULL,PRIMARY KEY (order_id, product_id),CONSTRAINT orders FOREIGN KEY (order_id) REFERENCES orders,CONSTRAINT products FOREIGN KEY (product_id) REFERENCES products);> Not saying I think this suggestion is a good idea, though. We've seen> many frameworks that hide joins, and the results are ... less than> universally good.Yeah, I'm pretty much not sold on this idea either. I think it wouldlead to the same problems we see with ORMs, namely that people writequeries that are impossible to execute efficiently and then blamethe database for their poor choice of schema.I think this concern is valid for the original syntax,but I actually think the idea on using foreign key constraint nameseffectively solves an entire class of query writing bugs.Users writing queries using this syntax are guaranteed to be awareof the existence of the foreign keys, otherwise they couldn't writethe query this way, since they must use the foreign keyconstraint names in the path expression.This ensures it's not possible to produce a nonsensical JOINon the wrong columns, a problem for which traditional JOINshave no means to protect against.Even with foreign keys, indexes could of course be missing,causing an inefficient query anyway, but at least the classesof potential problems is reduced by one.I think what's neat is how this syntax works excellent in combinationwith traditional JOINs, allowing the one which feels most natural foreach part of the query to be used.Let's also remember foreign keys did first appear in SQL-89,so they couldn't have been taken into account when SQL-86was designed. Maybe they would have came up with the ideaof making more use of foreign key constraints,if they would have been invented from the very beginning.However, it's not too late to fix this, it seems doable withoutbreaking any backwards compatibility. I think there is a riskour personal preferences are biased due to being experiencedSQL users. I think it's likely newcomers to SQL would reallyfancy this proposed syntax, and cause them to prefer PostgreSQLover some other NoSQL product.If we can provide such newcomers with a built-in solution,I think that better than telling them they shoulduse some ORM/tool/macro to simplify their query writing./Joel
pgsql-hackers by date: