Re: Allowing join removals for more join types - Mailing list pgsql-hackers

From David Rowley
Subject Re: Allowing join removals for more join types
Date
Msg-id CAApHDvrUc-kLg+EnaEpu_EHT65voqh7KSZkLYEcf5W3bwBfmNA@mail.gmail.com
Whole thread Raw
In response to Re: Allowing join removals for more join types  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Allowing join removals for more join types  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Jul 9, 2014 at 12:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowley@gmail.com> writes:
> On 9 July 2014 09:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> On review it looks like analyzejoins.c would possibly benefit from an
>> earlier fast-path check as well.

> Do you mean for non-subqueries? There already is a check to see if the
> relation has no indexes.

Oh, sorry, that was a typo: I meant to write pathnode.c.  Specifically,
we could skip the translate_sub_tlist step.  Admittedly that's not
hugely expensive, but as long as we have the infrastructure for a quick
check it might be worth doing.


>> TBH I find the checks for FOR UPDATE and volatile functions to be
>> questionable as well.

> Well, that's a real tough one for me as I only added that based on what you
> told me here:
>> I doubt you should drop a subquery containing FOR UPDATE, either.
>> That's a side effect, just as much as a volatile function would be.

Hah ;-).  But the analogy to qual pushdown hadn't occurred to me at the
time.


Ok, I've removed the check for volatile functions and FOR UPDATE.
 
> As far as I know the FOR UPDATE check is pretty much void as of now anyway,
> since the current state of query_is_distinct_for() demands that there's
> either a DISTINCT, GROUP BY or just a plain old aggregate without any
> grouping, which will just return a single row, neither of these will allow
> FOR UPDATE anyway.

True.

> So the effort here should be probably be more focused on if we should allow
> the join removal when the subquery contains volatile functions. We should
> probably think fairly hard on this now as I'm still planning on working on
> INNER JOIN removals at some point and I'm thinking we should likely be
> consistent between the 2 types of join for when it comes to FOR UPDATE and
> volatile functions, and I'm thinking right now that for INNER JOINs that,
> since they're INNER that we could remove either side of the join. In that
> case maybe it would be harder for the user to understand why their volatile
> function didn't get executed.

Color me dubious.  In exactly what circumstances would it be valid to
suppress an inner join involving a sub-select?


hmm, probably I didn't think this through enough before commenting as I don't actually have any plans for subselects with INNER JOINs. Though saying that I guess there are cases that can be removed... Anything that queries a single table that has a foreign key matching the join condition, where the subquery does not filter or group the results. Obviously something about the query would have to exist that caused it not to be flattened, perhaps some windowing functions... 


I've attached an updated patch which puts in some fast path code for subquery type joins. I'm really not too sure on a good name for this function. I've ended up with query_supports_distinctness() which I'm not that keen on, but I didn't manage to come up with anything better.

Regards

David Rowley
Attachment

pgsql-hackers by date:

Previous
From: Sawada Masahiko
Date:
Subject: Re: add line number as prompt option to psql
Next
From: Christoph Martin
Date:
Subject: [PATCH] Fix search_path default value separator.