Re: Idea: Avoid JOINs by using path expressions to follow FKs - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Idea: Avoid JOINs by using path expressions to follow FKs
Date
Msg-id 20210331061824.xdb3flwapflylqku@nol
Whole thread Raw
In response to Re: Idea: Avoid JOINs by using path expressions to follow FKs  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: Idea: Avoid JOINs by using path expressions to follow FKs
List pgsql-hackers
On Wed, Mar 31, 2021 at 12:50:19AM +0200, Joel Jacobson wrote:
> On Tue, Mar 30, 2021, at 22:01, Isaac Morland wrote:
> > On Tue, 30 Mar 2021 at 15:33, Joel Jacobson <joel@compiler.org> wrote:
> >>> Also, should the join be a left join, which would therefore return a NULL when there is no matching record? Or
couldwe have a variation such as ->? to give a left join (NULL when no matching record) with -> using an inner join
(recordis not included in result when no matching record).
 
> >> 
> >> Interesting idea, but I think we can keep it simple, and still support the case you mention:
> >> 
> >> If we only have -> and you want to exclude records where the column is NULL (i.e. INNER JOIN),
> >> I think we should just use the WHERE clause and filter on such condition.
> >> 
> > 
> > Just to be clear, it will always be a left join? Agreed that getting the inner join behaviour can be done in the
WHEREclause. I think this is a case where simple is good. As long as the left join case is supported I'm happy.
 
> 
> Hmm, I guess, since technically, if all foreign key column(s) are declared as NOT NULL, we would know for sure such
valuesexist, so a LEFT JOIN and INNER JOIN would always produce the same result.
 
> I'm not sure if the query planner could produce different plans though, and if an INNER JOIN could be more efficient.
Ifit matters, then I think we should generate an INNER JOIN for the "all column(s) NOT NULL" case.
 

I'm not sure who is supposed to be the target for this proposal.

As far as I understand this won't change the fact that users will still have to
understand the "relational" part of RDBMS, understand what is a JOIN
cardinality and everything that comes with it.  So you think that people who
are too lazy to learn the proper JOIN syntax will still bother to learn about
relational algebra and understand what they're doing, and I'm very doubtful
about that.

You also think that writing a proper JOIN is complex, but somehow writing a
proper WHERE clause to subtly change the query behavior is not a problem, or
that if users want to use aggregate or anything more complex then they'll
happily open the documentation and learn how to do that.  In my experience what
will happen is that instead users will keep using that limited subset of SQL
features and build creative and incredibly inefficient systems to avoid using
anything else and will then complain that postgres is too slow.

As an example just yesterday some user complained that it's not possible to
write a trigger on a table that could intercept inserting a textual value on an
integer field and replace it with the referenced value.  And he rejected our
suggested solution to replace the "INSERT INTO sometable VALUES..." with
"INSERT INTO sometable SELECT ...".  And no this proposal would not have
changed anything because changing the python script doing the import to add
some minimal SQL knowledge was apparently too problematic.  Instead he will
insert the data in a temporary table and dispatch everything on a per-row
basis, using triggers.  So here again the problem wasn't the syntax but having
to deal with a relational rather than an imperative approach.

Even if I'm totally wrong about that, I still think your proposal will lead to
problematic or ambiguous situation unless you come up with a syntax that can
fully handle the JOIN grammar.  For instance, what should happen if the query
also contains an explicit JOIN for the same relation?  I can see many reason
why this would happen with this proposal given the set of features it can
handle.  For instance:

- you want multiple JOIN.  Like one OUTER JOIN and one INNER JOIN for the same
  relation
- you want to push predicates on an OUTER JOIN



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Remove page-read callback from XLogReaderState.
Next
From: Markus Wanner
Date:
Subject: Re: [PATCH] add concurrent_abort callback for output plugin