WIP: Join push-down for foreign tables - Mailing list pgsql-hackers
From | Shigeru Hanada |
---|---|
Subject | WIP: Join push-down for foreign tables |
Date | |
Msg-id | 4E7072C9.10508@gmail.com Whole thread Raw |
Responses |
Re: WIP: Join push-down for foreign tables
Re: WIP: Join push-down for foreign tables |
List | pgsql-hackers |
Hi all, I'd like to propose $SUBJECT for further foreign query optimization. I've not finished development, but I'd appreciate it if I got someone's review on my WIP code and its design. Changes I made ============== (1) Add foreign server OID to RelOptInfo I think it would be nice to know whether a join comes from one foreign server or not without digging into child nodes during considering paths for a query. So I added serverid field to RelOptInfo, which defaults to InvalidOid ,and is set to OID of the server if the node and all of its children are from same foreign server. This also avoids looking catalog up for foreign table entry to determine FDW routine. (2) Add new planner node, ForeignJoinPath ForeignJoinPath derives JoinPath, like other Join nodes, and holds FdwPlan like ForeignPath node. This node is used to represent a pushed-down join between foreign tables and/or another foreign join in early planning phase, for all of combination such as table-table, table-join, join-table and join-join will be considered. In addition, though they might generate same internal (FDW-specific) plan, reversed combination is considered; planner generates two ForeignJoinPath for both (A & B) and (B & A). During creation of this node, planner calls new FDW handler function PlanForeignJoin to get a FdwPlan which includes costs and internal plan of a foreign join. If a FDW can't (or doesn't want to) handle this join, just return NULL is OK, and then planner gives such optimization up and considers other usual join methods such as nested loop and hash join. A subtree which has a ForeignJoin on its top is translated into a ForeignScan node during constructing a plan tree. This behavior is different from other join path nodes such as NestPath and MergePath, because they have child plan nodes correspond to path nodes. (3) Add EXPALIN support for foreign join (currently just for debug) ForeignScan might not be a simple foreign table scan, so ExplainScanTarget() can't be used for it. An idea I have is adding ExplainForeignScanTarget() to handle ForeignScan separately from other scan nodes. (4) Add new GUC parameter, enable_foreignjoin If this was off, planner never generates ForeignJoinPath. In such case, foreign tables will be joined with one of NestLoop, MergeJoin and HashJoin. Known issue =========== I'm sorry but attached patch, join_pushdown_v1.patch, is WIP, so currently some kind of query fails. Known failure patterns are: *) SELECT * FROM A JOIN B (...) doesn't work. Specifying columns in SELECT clause explicitly like "SELECT A.col1, A.col2, ..." seems to work. *) ORDER BY causes error if no column is specified in SELECT clause from sort key's table. Probably more problems still are there... PG-wrapper as sample implementation =================================== pgsql_fdw-0.1.0.tar.gz is an WIP implementation of PG-wrapper, which can (hopefully) handle both simple foreign table scan and multiple foreign joins. You can build it with placing in contrib/, or using pgxs. Note that it has some issues such as memory leak of PGresult. I'm planning to propose this wrapper as a contrib module, but it would be after clearing such issues. Regards, -- Shigeru Hanada
Attachment
pgsql-hackers by date: