Re: Idea: Avoid JOINs by using path expressions to follow FKs - Mailing list pgsql-hackers

From Isaac Morland
Subject Re: Idea: Avoid JOINs by using path expressions to follow FKs
Date
Msg-id CAMsGm5exUX75Y4bQ756Q87tbiFtCrbou7qxtG=kft5vL+DkOEQ@mail.gmail.com
Whole thread Raw
In response to Re: Idea: Avoid JOINs by using path expressions to follow FKs  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: Idea: Avoid JOINs by using path expressions to follow FKs  ("Joel Jacobson" <joel@compiler.org>)
List pgsql-hackers
On Wed, 31 Mar 2021 at 15:32, Joel Jacobson <joel@compiler.org> wrote:
On Wed, Mar 31, 2021, at 19:16, Julien Rouhaud wrote:
On Wed, Mar 31, 2021 at 5:19 PM Joel Jacobson <joel@compiler.org> wrote:
>
> If using the -> notation, you would only need to manually
> inspect the tables involved in the remaining JOINs;
> since you could be confident all uses of -> cannot affect cardinality.

Talking about that, do you have some answers to the points raised in
my previous mail, which is how it's supposed to behave when a table is
both join using your "->" syntax and a plain JOIN, how to join the
same table multiple time using this new syntax, and how to add
predicates to the join clause using  this new syntax.

It's tricky, I don't see a good solution.

My original proposal aimed to improve syntax conciseness.
While this would be nice, I see much more potential value in Tom's idea
of somehow making use of foreign key constrain names.

Maybe I have a different proposal in mind than anybody else, but I don't think there is a problem with multiple joins to the same table. If the joins are via the same constraint, then a single join is enough, and if they are via different constraints, the constraints have unique names.

I think if TA is a table with a foreign key constraint CB to another table TB, then the hypothetical expression:

TA -> CB

really just means:

(select TB from TB where (TB.[primary key columns) = (TA.[source columns of constraint CB]))

You can then add .fieldname to get the required fieldname. The issue is that writing it this way is hopelessly verbose, but the short form is fine. The query planner also needs to be guaranteed to collapse multiple references through the same constraint to a single actual join (and then take all the multiple fields requested).

If TA is a table with a foreign key constraint CB to TB, which has a foreign key constraint CC to TC, then this expression:

TA -> CB -> CC

just means, by the same definition (except I won't expand it fully, only one level):

(select TC from TC where (TC.[primary key columns) = ((TA -> CB).[source columns of constraint CC]))

Which reminds me, I often find myself wanting to write something like a.(f1, f2, f3) = b.(f1, f2, f3) rather than (a.f1, a.f2, a.f3) = (b.f1, b.f2, b.f3). But that's another story.

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: libpq debug log
Next
From: Stephen Frost
Date:
Subject: Re: Add docs stub for recovery.conf