Thread: FDW: ForeignPlan and parameterized paths
Hello.

I've noticed that, when implementing an FDW, it is difficult to use a plan whose best path is a parameterized path. This comes from the fact that the parameterized clause is not easily available at plan time.

This is what I understood of how it works:

- The clauses coming from the best path's restrictinfo are not available in the scan_clauses argument to the GetForeignPlan function.
- They are, however, directly available on the path, but at this point the clauses are of the form InnerVar OPERATOR OuterVar. The outer Var node is then replaced by a Param node, using the replace_nestloop_params function.

It could be useful to make the "parameterized" version of the clause (in the form InnerVar OPERATOR Param) available to the FDW at plan time.

Would this be possible? Maybe by replacing the clauses on the restrictinfo nodes from the path's param info with the "parameterized" clauses, and then adding these to the scan clauses passed to GetForeignPlan?

Regards,

--
Ronan Dunklau
Ronan Dunklau <rdunklau@gmail.com> writes:
> I've noticed that, when implementing an FDW, it is difficult to use a plan whose
> best path is a parameterized path. This comes from the fact that the
> parameterized clause is not easily available at plan time.

> This is what I understood of how it works:

> - The clauses coming from the best path's restrictinfo are not available in the
> scan_clauses argument to the GetForeignPlan function.
> - They are, however, directly available on the path, but at this point the
> clauses are of the form InnerVar OPERATOR OuterVar. The outer Var node is then
> replaced by a Param node, using the replace_nestloop_params function.

> It could be useful to make the "parameterized" version of the clause (in the
> form InnerVar OPERATOR Param) available to the FDW at plan time.

> Would this be possible?

I intentionally did the nestloop_params substitution after calling GetForeignPlan, not before. It's not apparent to me why it would be useful to do it before, because the FDW is going to have no idea what those params represent. (Note that they represent values coming from some other, probably local, relation; not from the foreign table.)

regards, tom lane
> I intentionally did the nestloop_params substitution after calling
> GetForeignPlan, not before. It's not apparent to me why it would be
> useful to do it before, because the FDW is going to have no idea what
> those params represent. (Note that they represent values coming from
> some other, probably local, relation; not from the foreign table.)

Even if the FDW has no idea what they represent, it can identify a clause of the form Var Operator Param, which allows it to store the param reference (paramid) and retrieve the param value at execution time. If the chosen best path is a parameterized path that was built by the FDW, this allows the restriction to be pushed down.

If this isn't possible, the only way I have found to use those clauses is at scan time.

Let's assume atable is a local relation, aftable is a foreign table, and the query looks like this:

    select * from atable t1 inner join aftable t2 on t1.c1 = t2.c1

The FDW identifies the join clause on its column c1 and builds a parameterized path on this column (maybe because this column is unique and indexed on the remote side).

The planner chooses this path, building a nested loop that rescans the foreign table with the parameter set to the current value from the outer relation (maybe because the local relation is much smaller than the remote relation).

In that case, it seems particularly important to have access to the clause, so that the nested loop can work as intended: avoiding a full seqscan on the remote side.

Or is there another way to achieve the same goal?

Regards,

--
Ronan Dunklau
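To make the scenario above concrete, here is a minimal sketch in Python of what is at stake. It is a toy model, not PostgreSQL code: REMOTE, REMOTE_INDEX, remote_scan and nested_loop are hypothetical names, and the remote table is assumed to hold 1000 rows with a unique, indexed c1. It only counts how many remote rows are read when the join value is pushed down on each rescan versus when every rescan is a full remote scan filtered locally.

```python
# Toy model of the remote side: 1000 rows, with a unique "index" on c1.
REMOTE = [{"c1": i, "payload": f"row{i}"} for i in range(1000)]
REMOTE_INDEX = {row["c1"]: row for row in REMOTE}


def remote_scan(param=None):
    """Hypothetical remote fetch: returns (rows, number of remote rows read).

    With param=None the whole remote table is read (full seqscan).
    With a param value, only the matching row is read via the index.
    """
    if param is None:
        return list(REMOTE), len(REMOTE)
    row = REMOTE_INDEX.get(param)
    if row is None:
        return [], 0
    return [row], 1


def nested_loop(outer_rows, push_down):
    """Join outer_rows to the remote table on c1, one rescan per outer row."""
    results, rows_read = [], 0
    for outer in outer_rows:
        param = outer["c1"]  # value supplied by the outer (local) relation
        rows, read = remote_scan(param if push_down else None)
        rows_read += read
        # The join clause is always rechecked locally in this sketch.
        results.extend((outer, r) for r in rows if r["c1"] == param)
    return results, rows_read


atable = [{"c1": 3}, {"c1": 42}, {"c1": 5000}]
pushed, read_pushed = nested_loop(atable, push_down=True)
plain, read_plain = nested_loop(atable, push_down=False)
```

Both calls produce the same join result; the difference is only in how many remote rows each one reads, which is the cost the parameterized path is meant to avoid.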
Ronan Dunklau <rdunklau@gmail.com> writes:
>> I intentionally did the nestloop_params substitution after calling
>> GetForeignPlan, not before. It's not apparent to me why it would be
>> useful to do it before, because the FDW is going to have no idea what
>> those params represent. (Note that they represent values coming from
>> some other, probably local, relation; not from the foreign table.)

> Even if the FDW has no idea what they represent, it can identify a
> clause of the form Var Operator Param, which allows it to store the param
> reference (paramid) and retrieve the param value at execution time.

I don't see any plausible reason for an FDW to special-case nestloop params like that. What an FDW should be looking for is clauses of the form Var-of-foreign-table Operator Expression-not-involving-foreign-table, and a Param is just one case of Expression-not-involving-foreign-table. (Compare the handling of indexscan clauses: indxpath.c doesn't much care what's on the righthand side of an indexable clause, so long as there is no Var of the indexed table there.)

Moreover, in order to do effective parameterized-path creation in the first place, the FDW's GetForeignPaths function will already have had to recognize these same clauses in their original form. If we do the param substitution before calling GetForeignPlan, that will just mean that the two functions can't share code anymore.

Or in short: the fact that the righthand-side expression gets replaced (perhaps only partially) by a Param is an implementation detail of the executor's expression evaluation methods. The FDW shouldn't care about that, only about the result of the expression.

regards, tom lane
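The test described above (Var-of-foreign-table Operator Expression-not-involving-foreign-table) can be sketched as follows. This is a simplified Python model, not PostgreSQL's API: the Var, Param, Const and OpExpr classes are toy stand-ins for the planner's node types, and references_rel and is_pushable are hypothetical helper names. It shows why one check can accept the clause both in its original form and after the outer Var has been replaced by a Param.

```python
from dataclasses import dataclass
from typing import Union


# Toy stand-ins for planner expression nodes (NOT PostgreSQL's structs).
@dataclass(frozen=True)
class Var:
    relid: int    # which relation the column belongs to
    attname: str  # column name


@dataclass(frozen=True)
class Param:
    paramid: int


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class OpExpr:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Param, Const, OpExpr]


def references_rel(expr, relid):
    """True if any Var inside expr belongs to relation relid."""
    if isinstance(expr, Var):
        return expr.relid == relid
    if isinstance(expr, OpExpr):
        return references_rel(expr.left, relid) or references_rel(expr.right, relid)
    return False  # Param and Const never reference a relation


def is_pushable(clause, foreign_relid):
    """Var-of-foreign-table OP expression-not-involving-foreign-table."""
    return (
        isinstance(clause, OpExpr)
        and isinstance(clause.left, Var)
        and clause.left.relid == foreign_relid
        and not references_rel(clause.right, foreign_relid)
    )


LOCAL, FOREIGN = 1, 2
original = OpExpr("=", Var(FOREIGN, "c1"), Var(LOCAL, "c1"))   # t2.c1 = t1.c1
substituted = OpExpr("=", Var(FOREIGN, "c1"), Param(0))        # t2.c1 = $0
self_ref = OpExpr("=", Var(FOREIGN, "c1"), Var(FOREIGN, "c2")) # t2.c1 = t2.c2
```

In this model is_pushable accepts both original and substituted and rejects self_ref, so the same routine could serve at path-creation time and at plan time without ever looking for a Param specifically.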