Thread: FDW: ForeignPlan and parameterized paths

FDW: ForeignPlan and parameterized paths

From: Ronan Dunklau

Hello.

I've noticed that, when implementing an FDW, it is difficult to use a plan
whose best path is a parameterized path, because the parameterized clause is
not easily available at plan time.

This is my understanding of how it works:

- The clauses from the best path's RestrictInfo nodes are not included in the
scan_clauses argument to the GetForeignPlan function.

- They are, however, directly available on the path, but at this point the 
clauses are of the form InnerVar OPERATOR OuterVar. The outer Var node is then 
replaced by a Param node, using the replace_nestloop_params function.
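
The substitution described in the second point can be illustrated with a toy
model. This is only a sketch under stated assumptions: ToyNode, ToyClause and
toy_replace_outer_vars are hypothetical names invented for this example, not
PostgreSQL structs or functions, and the real replace_nestloop_params operates
on arbitrary expression trees rather than on a flat two-sided clause.

```c
/* Toy stand-ins for planner expression nodes; NOT the real PostgreSQL structs. */
typedef enum { TOY_VAR, TOY_PARAM } ToyKind;

typedef struct ToyNode {
    ToyKind kind;
    int relid;    /* TOY_VAR: relation the column belongs to */
    int attno;    /* TOY_VAR: column number within that relation */
    int paramid;  /* TOY_PARAM: slot filled from the outer side at run time */
} ToyNode;

typedef struct ToyClause {
    ToyNode left;
    const char *op;
    ToyNode right;
} ToyClause;

/*
 * Turn "InnerVar OP OuterVar" into "InnerVar OP Param": every Var that does
 * not belong to inner_relid is replaced by a Param with a fresh paramid.
 * Returns the number of nodes replaced.
 */
static int toy_replace_outer_vars(ToyClause *clause, int inner_relid,
                                  int *next_paramid)
{
    ToyNode *sides[2] = { &clause->left, &clause->right };
    int replaced = 0;

    for (int i = 0; i < 2; i++) {
        ToyNode *n = sides[i];

        if (n->kind == TOY_VAR && n->relid != inner_relid) {
            n->kind = TOY_PARAM;
            n->paramid = (*next_paramid)++;
            replaced++;
        }
    }
    return replaced;
}
```

With relation 2 as the inner (foreign) side, a clause "rel2.c1 = rel1.c1" keeps
its left Var and gets a Param with paramid 0 on the right.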

It could be useful to make the "parameterized" version of the clause (in the
form InnerVar OPERATOR Param) available to the FDW at plan time.

Would this be possible? Perhaps by replacing the clauses on the RestrictInfo
nodes from the path's param info with their "parameterized" versions, and then
adding these to the scan clauses passed to GetForeignPlan?

Regards,

--
Ronan Dunklau



Re: FDW: ForeignPlan and parameterized paths

From: Tom Lane

Ronan Dunklau <rdunklau@gmail.com> writes:
> I've noticed that, when implementing an FDW, it is difficult to use a plan
> whose best path is a parameterized path, because the parameterized clause is
> not easily available at plan time.

> This is what I understood from how it works:

> - The clauses coming from the best path restrictinfo are not available in the 
> scan_clauses argument to the GetForeignPlan function.

> - They are, however, directly available on the path, but at this point the 
> clauses are of the form InnerVar OPERATOR OuterVar. The outer Var node is then 
> replaced by a Param node, using the replace_nestloop_params function.

> It could be useful to make the "parameterized" version of the clause (in the
> form InnerVar OPERATOR Param) available to the FDW at plan time.

> Would this be possible?

I intentionally did the nestloop_params substitution after calling
GetForeignPlan not before.  It's not apparent to me why it would be
useful to do it before, because the FDW is going to have no idea what
those params represent.  (Note that they represent values coming from
some other, probably local, relation; not from the foreign table.)
        regards, tom lane



Re: FDW: ForeignPlan and parameterized paths

From: Ronan Dunklau

> I intentionally did the nestloop_params substitution after calling
> GetForeignPlan not before.  It's not apparent to me why it would be
> useful to do it before, because the FDW is going to have no idea what
> those params represent.  (Note that they represent values coming from
> some other, probably local, relation; not from the foreign table.)

Even if the FDW has no idea what they represent, it can identify a
clause of the form Var Operator Param and store the param reference
(paramid) so that it can retrieve the param value at execution time.
If the chosen best path is a parameterized path that was built by
the FDW, this lets it push down the restriction.

If this isn't possible, the only way I found to use those clauses
would be at scan time.

Let's assume atable is a local relation, aftable is a foreign
table, and the query looks like this:
   select * from atable t1 inner join aftable t2 on t1.c1 = t2.c1


The FDW identifies the join clause on its column c1 and builds a
parameterized path on this column (maybe because this column is unique
and indexed on the remote side).

The planner chooses this path, building a nested loop that rescans the
foreign table with the parameter set to the current outer-relation
value (maybe because the local relation is much smaller than the
remote relation).

In that case, it seems particularly important to have access to the
clause, so that the nested loop can work as intended and avoid a full
sequential scan on the remote side.
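
The cost difference in this scenario can be made concrete with a toy
simulation. It is a sketch, not FDW code: ToyRemote, toy_scan_all,
toy_scan_param and toy_nestloop are hypothetical names, the "remote table" is
an in-memory array, and the only thing measured is how many rows would cross
the wire in each case.

```c
#include <stddef.h>

/* Hypothetical remote table holding only the join column c1. */
typedef struct ToyRemote {
    const int *c1;        /* values of column c1 */
    size_t nrows;         /* number of remote rows */
    size_t rows_shipped;  /* rows transferred to the local side so far */
} ToyRemote;

/* Rescan without a pushed-down clause: every row is shipped, then filtered locally. */
static size_t toy_scan_all(ToyRemote *r, int key)
{
    size_t matches = 0;

    for (size_t i = 0; i < r->nrows; i++) {
        r->rows_shipped++;
        if (r->c1[i] == key)
            matches++;
    }
    return matches;
}

/* Rescan with "c1 = <outer value>" pushed down: only matching rows are shipped. */
static size_t toy_scan_param(ToyRemote *r, int key)
{
    size_t matches = 0;

    for (size_t i = 0; i < r->nrows; i++) {
        if (r->c1[i] == key) {
            r->rows_shipped++;
            matches++;
        }
    }
    return matches;
}

typedef size_t (*ToyScanFn)(ToyRemote *, int);

/* Nested loop: rescan the remote side once per outer row; returns joined row count. */
static size_t toy_nestloop(const int *outer, size_t nouter,
                           ToyRemote *r, ToyScanFn scan)
{
    size_t joined = 0;

    for (size_t i = 0; i < nouter; i++)
        joined += scan(r, outer[i]);
    return joined;
}
```

Both scan functions produce the same join result; with 2 outer rows and 6
remote rows, the unparameterized rescan ships 12 rows while the parameterized
one ships only the 2 matching rows.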

Or is there another way to achieve the same goal?

Regards,

--
Ronan Dunklau



Re: FDW: ForeignPlan and parameterized paths

From: Tom Lane

Ronan Dunklau <rdunklau@gmail.com> writes:
>> I intentionally did the nestloop_params substitution after calling
>> GetForeignPlan not before.  It's not apparent to me why it would be
>> useful to do it before, because the FDW is going to have no idea what
>> those params represent.  (Note that they represent values coming from
>> some other, probably local, relation; not from the foreign table.)

> Even if the FDW has no idea what they represent, it can identify a
> clause of the form Var Operator Param and store the param reference
> (paramid) so that it can retrieve the param value at execution time.

I don't see any plausible reason for an FDW to special-case nestloop
params like that.  What an FDW should be looking for is clauses of the
form Var-of-foreign-table Operator Expression-not-involving-foreign-table,
and a Param is just one case of Expression-not-involving-foreign-table.
(Compare the handling of indexscan clauses: indxpath.c doesn't much care
what's on the righthand side of an indexable clause, so long as there
is no Var of the indexed table there.)
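
The test described above can be sketched on a toy expression tree. Everything
here is an assumption made for illustration: ToyExpr, toy_mentions_rel and
toy_is_pushable are hypothetical names, not PostgreSQL APIs, and a real FDW
would also have to check that the operator itself can be evaluated remotely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression nodes; NOT the real PostgreSQL node types. */
typedef enum { TOY_VAR, TOY_PARAM, TOY_CONST, TOY_OP } ToyKind;

typedef struct ToyExpr {
    ToyKind kind;
    int relid;                    /* TOY_VAR only: owning relation */
    const struct ToyExpr *left;   /* TOY_OP only */
    const struct ToyExpr *right;  /* TOY_OP only */
} ToyExpr;

/* True if a Var of the given relation appears anywhere in the expression. */
static bool toy_mentions_rel(const ToyExpr *e, int relid)
{
    if (e == NULL)
        return false;
    switch (e->kind) {
    case TOY_VAR:
        return e->relid == relid;
    case TOY_OP:
        return toy_mentions_rel(e->left, relid) ||
               toy_mentions_rel(e->right, relid);
    default:
        return false;   /* Params and Consts reference no relation */
    }
}

/*
 * Accept "Var-of-foreign-table OP expression-not-involving-foreign-table".
 * The right-hand side may be an outer Var, a Param, a Const, or any
 * combination of them; the check does not care which.
 */
static bool toy_is_pushable(const ToyExpr *clause, int foreign_relid)
{
    if (clause == NULL || clause->kind != TOY_OP ||
        clause->left == NULL || clause->right == NULL)
        return false;
    if (clause->left->kind != TOY_VAR || clause->left->relid != foreign_relid)
        return false;
    return !toy_mentions_rel(clause->right, foreign_relid);
}
```

Because the predicate never looks for a Param specifically, it gives the same
answer for a clause in its original "Var OP OuterVar" form and for the same
clause after the outer Var has been replaced by a Param.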

Moreover, in order to do effective parameterized-path creation in the
first place, the FDW's GetForeignPaths function will already have had
to recognize these same clauses in their original form.  If we do the
param substitution before calling GetForeignPlan, that will just mean
that the two functions can't share code anymore.

Or in short: the fact that the righthand-side expression gets replaced
(perhaps only partially) by a Param is an implementation detail of the
executor's expression evaluation methods.  The FDW shouldn't care about
that, only about the result of the expression.
        regards, tom lane
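
The closing point, that the FDW only needs the result of the right-hand-side
expression, can be sketched as follows. This is a toy model under stated
assumptions: ToyExpr, toy_eval and toy_remote_qual are hypothetical names, the
expression language is limited to integer Params, Consts and addition, and
formatting the value into a text filter is just one simple way of handing it
to a remote source.

```c
#include <stdio.h>

/* Toy right-hand-side expression; NOT the real PostgreSQL node types. */
typedef enum { TOY_PARAM, TOY_CONST, TOY_ADD } ToyKind;

typedef struct ToyExpr {
    ToyKind kind;
    int value;                 /* TOY_CONST: the constant; TOY_PARAM: param index */
    const struct ToyExpr *l;   /* TOY_ADD only */
    const struct ToyExpr *r;   /* TOY_ADD only */
} ToyExpr;

/* Stand-in for executor expression evaluation: params[] holds current values. */
static int toy_eval(const ToyExpr *e, const int *params)
{
    switch (e->kind) {
    case TOY_PARAM:
        return params[e->value];
    case TOY_CONST:
        return e->value;
    case TOY_ADD:
        return toy_eval(e->l, params) + toy_eval(e->r, params);
    }
    return 0;   /* not reached for valid nodes */
}

/*
 * Build the remote filter from the evaluated result. The caller never
 * inspects whether the expression contained a Param, only its value.
 * Returns what snprintf returns.
 */
static int toy_remote_qual(char *buf, size_t buflen, const char *column,
                           const ToyExpr *rhs, const int *params)
{
    return snprintf(buf, buflen, "%s = %d", column, toy_eval(rhs, params));
}
```

On each rescan the params array would hold the current outer-row values, so
the same plan-time expression yields a different remote filter per outer row.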