54.4. Foreign Data Wrapper Query Planning
The FDW callback functions GetForeignRelSize, GetForeignPaths, GetForeignPlan, PlanForeignModify, and GetForeignJoinPaths must fit into the workings of the Postgres Pro planner. Here are some notes about what they must do.
The information in root and baserel can be used to reduce the amount of information that has to be fetched from the foreign table (and therefore reduce the cost). baserel->baserestrictinfo is particularly interesting, as it contains restriction quals (WHERE clauses) that should be used to filter the rows to be fetched. (The FDW itself is not required to enforce these quals, as the core executor can check them instead.) baserel->reltargetlist can be used to determine which columns need to be fetched; but note that it only lists columns that have to be emitted by the ForeignScan plan node, not columns that are used in qual evaluation but not output by the query.
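The column selection described above amounts to a union of two attribute sets. The following is a minimal standalone sketch, not PostgreSQL code: AttrSet, attr_add, attr_is_member, and attrs_to_fetch are hypothetical names, and the 32-bit mask is an assumption made to keep the example short. A real FDW would collect attribute numbers from baserel->reltargetlist and from the clauses it intends to evaluate locally (postgres_fdw, for instance, gathers them with pull_varattnos).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: bit i set means "column number i is referenced". */
typedef uint32_t AttrSet;

/* Mark one column (1-based, at most 31 in this sketch) as referenced. */
AttrSet attr_add(AttrSet set, int attno)
{
    assert(attno >= 1 && attno <= 31);
    return set | ((AttrSet) 1 << attno);
}

/* Report whether a column number is in the set. */
int attr_is_member(AttrSet set, int attno)
{
    assert(attno >= 1 && attno <= 31);
    return (int) ((set >> attno) & 1u);
}

/*
 * Columns the remote side must return: everything the scan node emits,
 * plus everything referenced by quals that will be checked locally.
 */
AttrSet attrs_to_fetch(AttrSet target_attrs, AttrSet local_qual_attrs)
{
    return target_attrs | local_qual_attrs;
}
```

With column 1 in the target list and column 3 used only in a locally checked qual, both columns end up in the fetch set while column 2 is skipped.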
Various private fields are available for the FDW planning functions to keep information in. Generally, whatever you store in FDW private fields should be palloc'd, so that it will be reclaimed at the end of planning.
baserel->fdw_private is a void pointer that is available for FDW planning functions to store information relevant to the particular foreign table. The core planner does not touch it except to initialize it to NULL when the RelOptInfo node is created. It is useful for passing information forward from GetForeignRelSize to GetForeignPaths and/or GetForeignPaths to GetForeignPlan, thereby avoiding recalculation.
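The hand-off through fdw_private can be modelled in a few lines of standalone C. This is a sketch under stated assumptions, not FDW code: RelInfo stands in for RelOptInfo, MyPlanState and all function names are hypothetical, the cost formula is invented for illustration, and malloc is used where a real FDW would use palloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the planner's per-relation node. */
typedef struct RelInfo
{
    double rows;        /* row estimate filled in by the FDW */
    void  *fdw_private; /* starts out NULL; owned by the FDW */
} RelInfo;

/* Hypothetical per-table planning state kept by the FDW. */
typedef struct MyPlanState
{
    double remote_rows;
    double startup_cost;
} MyPlanState;

/* Counts how often the expensive step runs. */
int remote_estimate_calls = 0;

/* Stand-in for an expensive step such as asking the remote server. */
double fetch_remote_row_estimate(void)
{
    remote_estimate_calls++;
    return 1000.0;
}

/* Models GetForeignRelSize: compute once and stash the result. */
void model_get_rel_size(RelInfo *rel)
{
    MyPlanState *st = malloc(sizeof(MyPlanState)); /* palloc in a real FDW */

    assert(st != NULL);
    st->remote_rows = fetch_remote_row_estimate();
    st->startup_cost = 100.0;
    rel->rows = st->remote_rows;
    rel->fdw_private = st;
}

/* Models GetForeignPaths: reuse the stashed state instead of recomputing. */
double model_get_path_cost(const RelInfo *rel)
{
    const MyPlanState *st = rel->fdw_private;

    assert(st != NULL);
    return st->startup_cost + 0.01 * st->remote_rows;
}
```

The later stage reads the cached estimate through the void pointer, so the expensive step runs only once per relation.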
GetForeignPaths can identify the meaning of different access paths by storing private information in the fdw_private field of ForeignPath nodes. fdw_private is declared as a List pointer, but could actually contain anything since the core planner does not touch it. However, best practice is to use a representation that's dumpable by nodeToString, for use with debugging support available in the backend.
GetForeignPlan can examine the fdw_private field of the selected ForeignPath node, and can generate fdw_exprs and fdw_private lists to be placed in the ForeignScan plan node, where they will be available at execution time. Both of these lists must be represented in a form that copyObject knows how to copy. The fdw_private list has no other restrictions and is not interpreted by the core backend in any way. The fdw_exprs list, if not NIL, is expected to contain expression trees that are intended to be executed at run time. These trees will undergo post-processing by the planner to make them fully executable.
In GetForeignPlan, generally the passed-in target list can be copied into the plan node as-is. The passed scan_clauses list contains the same clauses as baserel->baserestrictinfo, but may be re-ordered for better execution efficiency. In simple cases the FDW can just strip RestrictInfo nodes from the scan_clauses list (using extract_actual_clauses) and put all the clauses into the plan node's qual list, which means that all the clauses will be checked by the executor at run time. More complex FDWs may be able to check some of the clauses internally, in which case those clauses can be removed from the plan node's qual list so that the executor doesn't waste time rechecking them.
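The more complex case, where some clauses are checked by the FDW and the rest stay in the plan node's qual list, is essentially a partition of the clause list. The sketch below models that partition in standalone C. Clause, its remote_ok flag, and split_scan_clauses are hypothetical stand-ins; a real FDW works on List structures of RestrictInfo nodes and decides shippability by inspecting each expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one restriction clause. */
typedef struct Clause
{
    const char *text;      /* e.g. "a = 1" */
    int         remote_ok; /* nonzero if the FDW can check it itself */
} Clause;

/*
 * Split clauses[0..n) into those the FDW checks itself and those left in
 * the plan node's qual list for the executor. Both output arrays must have
 * room for n entries. Returns the number of local (executor-checked) quals
 * and stores the number of FDW-checked clauses in *n_remote.
 */
size_t split_scan_clauses(const Clause *clauses, size_t n,
                          const Clause **remote, size_t *n_remote,
                          const Clause **local)
{
    size_t n_local = 0;

    assert(clauses != NULL || n == 0);
    assert(n_remote != NULL);
    *n_remote = 0;
    for (size_t i = 0; i < n; i++)
    {
        if (clauses[i].remote_ok)
            remote[(*n_remote)++] = &clauses[i];
        else
            local[n_local++] = &clauses[i];
    }
    return n_local;
}
```

An FDW that cannot check any clause itself corresponds to every remote_ok flag being zero, in which case all clauses land in the local list, matching the simple case described above.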
As an example, the FDW might identify some restriction clauses of the form foreign_variable = sub_expression, which it determines can be executed on the remote server given the locally-evaluated value of the sub_expression. The actual identification of such a clause should happen during GetForeignPaths, since it would affect the cost estimate for the path. The path's fdw_private field would probably include a pointer to the identified clause's RestrictInfo node. Then GetForeignPlan would remove that clause from scan_clauses, but add the sub_expression to fdw_exprs to ensure that it gets massaged into executable form. It would probably also put control information into the plan node's fdw_private field to tell the execution functions what to do at run time. The query transmitted to the remote server would involve something like WHERE foreign_variable = $1, with the parameter value obtained at run time from evaluation of the fdw_exprs expression tree.
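To make the shape of the transmitted query concrete, here is a small standalone sketch that formats such a statement with a numbered parameter placeholder. The function build_remote_query and its arguments are hypothetical, the SELECT * form is an assumption, and identifier quoting is deliberately left out. A real FDW would build the text with the backend's string utilities and quote identifiers properly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "SELECT * FROM <table> WHERE <column> = $<param_no>" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails.
 * Identifier quoting is omitted in this sketch.
 */
int build_remote_query(char *buf, size_t buflen,
                       const char *table, const char *column, int param_no)
{
    int n;

    assert(buf != NULL && table != NULL && column != NULL);
    assert(param_no >= 1);
    n = snprintf(buf, buflen, "SELECT * FROM %s WHERE %s = $%d",
                 table, column, param_no);
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

The placeholder number would correspond to the position of the sub_expression in fdw_exprs, whose value is computed locally at run time and sent along as the parameter.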
Any clauses removed from the plan node's qual list must instead be added to fdw_recheck_quals or rechecked by RecheckForeignScan in order to ensure correct behavior at the READ COMMITTED isolation level. When a concurrent update occurs for some other table involved in the query, the executor may need to verify that all of the original quals are still satisfied for the tuple, possibly against a different set of parameter values. Using fdw_recheck_quals is typically easier than implementing checks inside RecheckForeignScan, but this method will be insufficient when outer joins have been pushed down, since the join tuples in that case might have some fields go to NULL without rejecting the tuple entirely.
Another ForeignScan field that can be filled by FDWs is fdw_scan_tlist, which describes the tuples returned by the FDW for this plan node. For simple foreign table scans this can be set to NIL, implying that the returned tuples have the row type declared for the foreign table. A non-NIL value must be a target list (list of TargetEntry nodes) containing Vars and/or expressions representing the returned columns. This might be used, for example, to show that the FDW has omitted some columns that it noticed won't be needed for the query. Also, if the FDW can compute expressions used by the query more cheaply than can be done locally, it could add those expressions to fdw_scan_tlist. Note that join plans (created from paths made by GetForeignJoinPaths) must always supply fdw_scan_tlist to describe the set of columns they will return.
The FDW should always construct at least one path that depends only on the table's restriction clauses. In join queries, it might also choose to construct path(s) that depend on join clauses, for example foreign_variable = local_variable. Such clauses will not be found in baserel->baserestrictinfo but must be sought in the relation's join lists. A path using such a clause is called a “parameterized path”. It must identify the other relations used in the selected join clause(s) with a suitable value of param_info; use get_baserel_parampathinfo to compute that value. In GetForeignPlan, the local_variable portion of the join clause would be added to fdw_exprs, and then at run time the case works the same as for an ordinary restriction clause.
If an FDW supports remote joins, GetForeignJoinPaths should produce ForeignPath nodes for potential remote joins in much the same way as GetForeignPaths works for base tables. Information about the intended join can be passed forward to GetForeignPlan in the same ways described above. However, baserestrictinfo is not relevant for join relations; instead, the relevant join clauses for a particular join are passed to GetForeignJoinPaths as a separate parameter (extra->restrictlist).
When planning an UPDATE or DELETE, PlanForeignModify can look up the RelOptInfo struct for the foreign table and make use of the baserel->fdw_private data previously created by the scan-planning functions. However, in INSERT the target table is not scanned, so there is no RelOptInfo for it. The List returned by PlanForeignModify has the same restrictions as the fdw_private list of a ForeignScan plan node; that is, it must contain only structures that copyObject knows how to copy.
INSERT with an ON CONFLICT clause does not support specifying the conflict target, as unique constraints or exclusion constraints on remote tables are not locally known. This in turn implies that ON CONFLICT DO UPDATE is not supported, since specifying the conflict target is mandatory there.