Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 5617917F.1010608@lab.ntt.co.jp
In response to Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Jeevan Chalke <jeevan.chalke@enterprisedb.com>)
List pgsql-hackers
On 2015/09/12 1:38, Robert Haas wrote:
> On Thu, Sep 10, 2015 at 11:36 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> I've proposed the following API changes:
>>
>> * I modified create_foreignscan_path, which is called from
>> postgresGetForeignJoinPaths/postgresGetForeignPaths, so that a path,
>> subpath, is passed as the eighth argument of the function. subpath
>> represents a local join execution path if scanrelid==0, but NULL if
>> scanrelid>0.

> OK, I see now.  But I don't much like the way
> get_unsorted_unparameterized_path() looks.
>
> First, it's basically praying that MergePath, HashPath, and NestPath
> can be flat-copied without breaking anything.  In general, we have
> copyfuncs.c support for nodes that we need to be able to copy, and we
> use copyObject() to do it.  Even if what you've got here works today,
> it's not very future-proof.

Agreed.

> Second, what guarantee do we have that we'll find a path with no
> pathkeys and a NULL param_info?  Why can't all of the paths for a join
> relation have pathkeys?  Why can't they all be parameterized?  I can't
> think of anything that would guarantee that.

There is no such guarantee.  I modified the patch that way simply
because the latest postgres_fdw patch doesn't support creating a remote
query for a presorted or parameterized path for a remote join.

> Third, even if such a guarantee existed, why is this the right
> behavior?  Any join type will produce the same output; it's just a
> question of performance.  And if you have only one tuple on each side,
> surely a nested loop would be fine.

Yeah, I think we would also need to consider the parameterization.

> It seems to me that what you ought to be doing is using data hung off
> the fdw_private field of each RelOptInfo to cache a NestPath that can
> be used for EPQ rechecks at that level.  When you go to consider
> pushing down another join, you can build up a new NestPath that's
> suitable for the new level.  That seems much cleaner than groveling
> through the list of surviving paths and hoping you find the right kind
> of thing.

Agreed.
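
To make the suggestion concrete, here is a minimal sketch of the caching
idea using toy structs.  ToyPath, ToyRel and the toy_* functions are
hypothetical stand-ins written for illustration only; they are not
PostgreSQL's Path, RelOptInfo, or any planner API, and this is not code
from the patch.  The only thing borrowed from the discussion is the idea
that each relation's fdw_private holds a path usable for EPQ rechecks,
and that a join level builds its nested loop from its inputs' cached
paths.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for illustration; these are NOT PostgreSQL's real structs. */
typedef struct ToyPath
{
    const char     *relname;    /* set for a base-relation scan, NULL for a join */
    struct ToyPath *outer;      /* join inputs; both NULL for a scan */
    struct ToyPath *inner;
} ToyPath;

typedef struct ToyRel
{
    void *fdw_private;          /* here: the ToyPath cached for EPQ rechecks */
} ToyRel;

static ToyPath *
toy_make_path(const char *relname, ToyPath *outer, ToyPath *inner)
{
    ToyPath *path = malloc(sizeof(ToyPath));

    if (path == NULL)
        abort();
    path->relname = relname;
    path->outer = outer;
    path->inner = inner;
    return path;
}

/* Base relation: cache a plain scan path as its recheck path. */
static ToyRel *
toy_make_base_rel(const char *relname)
{
    ToyRel *rel = malloc(sizeof(ToyRel));

    if (rel == NULL)
        abort();
    rel->fdw_private = toy_make_path(relname, NULL, NULL);
    return rel;
}

/*
 * Join relation: build the nested loop for this level directly from the
 * paths cached on the two inputs, without searching any path list.
 */
static ToyRel *
toy_make_join_rel(ToyRel *outer, ToyRel *inner)
{
    ToyRel *rel = malloc(sizeof(ToyRel));

    if (rel == NULL)
        abort();
    assert(outer->fdw_private != NULL && inner->fdw_private != NULL);
    rel->fdw_private = toy_make_path(NULL,
                                     (ToyPath *) outer->fdw_private,
                                     (ToyPath *) inner->fdw_private);
    return rel;
}

/* Count the base relations reachable from a cached path. */
static int
toy_count_base_rels(const ToyPath *path)
{
    assert(path != NULL);
    if (path->relname != NULL)
        return 1;
    return toy_count_base_rels(path->outer) + toy_count_base_rels(path->inner);
}
```

Joining a with b and then the result with c leaves a cached path on the
top-level relation that covers all three base relations, and its outer
side is exactly the path cached on the a-b join.  The sketch never frees
memory, which is acceptable only because it is a toy.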

(I have never been against an FDW author creating the local join
execution path itself.  I modified the patch to pick a local join
execution path from the path list simply because that was the simplest
approach.  The main point I'd like to discuss about the patch is the
changes to the core code.)

> And all that having been said, I still don't really understand why you
> are resisting the idea of providing a callback so that the FDW can
> execute arbitrary code in the recheck path.  There doesn't seem to be
> any reason not to let the FDW take control of the rechecks if it
> wishes, and there's no real cost in complexity that I can see.

I thought such a callback would put a considerable development burden
on FDW authors, so I was rather against the idea of providing it.
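
For reference, the kind of optional callback under discussion could be
shaped roughly like the sketch below.  ToyFdwRoutine, recheck_scan and
the other toy_* names are hypothetical and chosen only for illustration;
this is not PostgreSQL's FdwRoutine and not an API proposed in this
thread.  It shows only the dispatch: the core code calls the FDW's
recheck function if one is set and otherwise falls back to a default.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not PostgreSQL's FdwRoutine. */
typedef struct ToyTuple
{
    int outer_key;
    int inner_key;
} ToyTuple;

typedef bool (*ToyRecheckFn) (const ToyTuple *tuple);

typedef struct ToyFdwRoutine
{
    ToyRecheckFn recheck_scan;  /* optional; NULL means "use the default" */
} ToyFdwRoutine;

/* Stand-in for whatever the core code would do by default. */
static bool
toy_default_recheck(const ToyTuple *tuple)
{
    (void) tuple;               /* the toy default accepts every tuple */
    return true;
}

/* Example FDW-supplied recheck: re-evaluate a pushed-down join condition. */
static bool
toy_join_recheck(const ToyTuple *tuple)
{
    return tuple->outer_key == tuple->inner_key;
}

/* Core-side dispatch: the FDW takes over the recheck only if it wishes to. */
static bool
toy_recheck(const ToyFdwRoutine *routine, const ToyTuple *tuple)
{
    if (routine->recheck_scan != NULL)
        return routine->recheck_scan(tuple);
    return toy_default_recheck(tuple);
}
```

An FDW that leaves recheck_scan as NULL writes no extra code at all in
this sketch, while one that sets it takes on the cost of implementing
the recheck itself.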

I know we still haven't reached a consensus on whether to address this
issue by using a local join execution path.

Best regards,
Etsuro Fujita



