Re: Removing INNER JOINs - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Removing INNER JOINs
Date
Msg-id CA+TgmoY8ZQESeVOkioxeEiS-G_=Kg0g+rKmk=pdv8kNTibuOaA@mail.gmail.com
Whole thread Raw
In response to Re: Removing INNER JOINs  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Removing INNER JOINs  (Stephen Frost <sfrost@snowman.net>)
Re: Removing INNER JOINs  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Sun, Nov 30, 2014 at 12:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Bottom line, given all the restrictions on whether the optimization can
> happen, I have very little enthusiasm for the whole idea.  I do not think
> the benefit will be big enough to justify the amount of mess this will
> introduce.

This optimization applies to a tremendous number of real-world cases,
and we really need to have it.  This was a huge problem for me in my
previous life as a web developer.  The previous work that we did to
remove LEFT JOINs was an enormous help, but it's not enough; we need a
way to remove INNER JOINs as well.

I thought that David's original approach of doing this in the planner
was a good one.  That fell down because of the possibility that
apparently-valid referential integrity constraints might not be valid
at execution time if the triggers were deferred.  But frankly, that
seems like an awfully nitpicky thing for this to fall down on.  Lots
of web applications are going to issue only SELECT statements that run
as as single-statement transactions, and so that issue, so troubling
in theory, will never occur in practice.  That doesn't mean that we
don't need to account for it somehow to make the code safe, but any
argument that it abridges the use case significantly is, in my
opinion, not credible.

Anyway, David was undeterred by the rejection of that initial approach
and rearranged everything, based on suggestions from Andres and later
Simon, into the form it's reached now.  Kudos to him for his
persistance.  But your point that we might have chosen a whole
different plan if it had known that this join was cheaper is a good
one.  However, that takes us right back to square one, which is to do
this at plan time.  I happen to think that's probably better anyway,
but I fear we're just going around in circles here.  We can either do
it at plan time and find some way of handling the fact that there
might be deferred triggers that haven't fired yet; or we can do it at
execution time and live with the fact that we might have chosen a plan
that is not optimal, though still better than executing a
completely-unnecessary join.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: How about a option to disable autovacuum cancellation on lock conflict?
Next
From: Andres Freund
Date:
Subject: Re: How about a option to disable autovacuum cancellation on lock conflict?