Re: Idea: Avoid JOINs by using path expressions to follow FKs - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Idea: Avoid JOINs by using path expressions to follow FKs
Date
Msg-id 52b17e8e-41ab-4d00-a9a8-c37e03a97888@www.fastmail.com
Whole thread Raw
In response to Re: Idea: Avoid JOINs by using path expressions to follow FKs  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Idea: Avoid JOINs by using path expressions to follow FKs  ("Joel Jacobson" <joel@compiler.org>)
Re: Idea: Avoid JOINs by using path expressions to follow FKs  (Isaac Morland <isaac.morland@gmail.com>)
List pgsql-hackers
On Wed, Mar 31, 2021, at 19:16, Julien Rouhaud wrote:
On Wed, Mar 31, 2021 at 5:19 PM Joel Jacobson <joel@compiler.org> wrote:
>
> If using the -> notation, you would only need to manually
> inspect the tables involved in the remaining JOINs;
> since you could be confident all uses of -> cannot affect cardinality.

Talking about that, do you have some answers to the points raised in
my previous mail, which is how it's supposed to behave when a table is
both join using your "->" syntax and a plain JOIN, how to join the
same table multiple time using this new syntax, and how to add
predicates to the join clause using  this new syntax.

It's tricky, I don't see a good solution.

My original proposal aimed to improve syntax conciseness.
While this would be nice, I see much more potential value in Tom's idea
of somehow making use of foreign key constrain names.

Instead of trying to hack it into the <select list> part of a query,
maybe it's more fruitful to see if we can find a way to integrate it into the <from clause>.

Perhaps something along the lines of what Vik suggested earlier:
> FROM a JOIN b WITH a_b_fk

The problem I have with the above is "b" is redundant information,
since the foreign key is always between two specific tables,
and given "a" and "a_b_fk" we know we are joining "b".

I would prefer a new chainable binary operator.

Pavel raised some concerns about using "->" since used by the standard already,
but perhaps it is less of a problem when only used in the <from clause>?
Otherwise we could use something else entirely.

Here comes some ideas on <from clause> syntax.

With default foreign key constraint names:

SELECT DISTINCT customers.company_name
FROM order_details->order_details_product_id_fkey AS products
JOIN order_details->order_details_order_id_fkey->orders_customer_id_fkey AS customers
WHERE products.product_name = 'Chocolade';

In a PostgreSQL-only environment, foreign keys could be renamed:

ALTER TABLE orders RENAME CONSTRAINT orders_customer_id_fkey TO customers;
ALTER TABLE order_details RENAME CONSTRAINT order_details_order_id_fkey TO orders;
ALTER TABLE order_details RENAME CONSTRAINT order_details_product_id_fkey TO products;

Then we would get:

SELECT DISTINCT customers.company_name
FROM order_details->products
JOIN order_details->orders->customers
WHERE products.product_name = 'Chocolade';

Which would be the same thing as:

SELECT DISTINCT customers.company_name
FROM order_details
JOIN order_details->products
JOIN order_details->orders
JOIN orders->customers
WHERE products.product_name = 'Chocolade';

Type of join can be specified as well as aliases, just like normal:

SELECT DISTINCT c.company_name
FROM order_details AS od
JOIN od->products AS p
FULL JOIN od->orders AS o
LEFT JOIN o->customers AS c
WHERE p.product_name = 'Chocolade';

(FULL and LEFT join makes no sense in this example, but just to illustrate join types works just like normal)

I don't know how challenging this would be to integrate into the grammar though.
Here are some other ideas which might be easier to parse:

SELECT DISTINCT customers.company_name
FROM order_details->products
JOIN ON order_details->orders->customers
WHERE products.product_name = 'Chocolade';

SELECT DISTINCT customers.company_name
FROM order_details->products
JOIN USING order_details->orders->customers
WHERE products.product_name = 'Chocolade';

SELECT DISTINCT customers.company_name
FROM order_details->products
JOIN WITH order_details->orders->customers
WHERE products.product_name = 'Chocolade';

More syntax ideas?

Semantic ideas:

* When chaining, all joins on the chain would be made of the same type.
* To use different join types, you would write a separate join.
* All tables joined in the chain, would be accessible in the <select list>, via the names of the foreign key constraints.
* Only the last link on the chain can be given an alias. If you want to alias something in the middle, split the chain into two separate joins (, so that the one in the middle becomes the last one, which can then be given an alias.)
 
Thoughts?

/Joel

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Redundant errdetail prefix "The error was:" in some logical replication messages
Next
From: Tom Lane
Date:
Subject: Re: Crash in record_type_typmod_compare