Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs) - Mailing list pgsql-hackers
From | Shigeru Hanada |
---|---|
Subject | Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs) |
Date | |
Msg-id | CAEZqfEczvBEsnvrwi8dPD1AaP6-3Oa3KoHii5e5MNQ=Njc84OA@mail.gmail.com Whole thread Raw |
In response to | Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs) (Shigeru HANADA <shigeru.hanada@gmail.com>) |
Responses |
Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs) |
List | pgsql-hackers |
Attached is the v15 patch of foreign join support for postgres_fdw. This patch is based on current master, and having being removed some hunks which are not essential. And I wrote description of changes done by the patch. It is little bit long but I hope it would help understanding what the patch does. The total LOC of the patch is 3.7k, 1.8k for code and 2.0k for regression tests. This is not a small patch, as Robert says, so I'd like to summarize changed done by this patch and explain why they are necessary. Outline of join push-down support for postgres_fdw ================================================== This patch provides new capability to join between foriegn tables managed by same foreign server on remote side, by constructing a remote query containing join clause, and executing it as source of a pseudo foreign scan. This patch is based on Custom/Foreign join patch written by Kohei KaiGai. PostgreSQL's planning for a query containing join is done with these steps: 1. generate possible scan paths for each base relations 2. generate join paths with bottom-up approach 3. generate plan nodes required for the cheapest path 4. execute the plan nodes to obtain result tuples Generating path node -------------------- As of now, postgres_fdw generates a ForeignPath which represents a result of a join for each RelOptInfo, and planner can determine which path is cheapest from its cost values. GetForeignJoinPaths is called once for each join combination, i.e. A JOIN B and B JOIN A are considered separately. So GetForeignJoinPath should return immediately to skip its job when the call is the reversed combination of already considered one. For this purpose, I added update_safe flag to PgFdwRelationInfo. This flag is always set for simple foriegn scans, but for join relation it is set only when the join can be pushed down. The reason of adding this flag is that checking RelOptInfo#fdw_private is MULL can't prevent useless processing for a join combination which is reversed one of already considered join which can't be pushed down. postgres_fdw's GetForeignJoinPaths() does various checks, to ensure that the result has same semantics as local joins. Now postgres_fdw have these criteria: a) join type must be one of INNER/LEFT OUTER/RIGHT OUTER/FULL OUTER join This check is done with given jointype argument. IOW, CROSS joins and SEMI/ANTI joins are not pushed down. This is because 1) CROSS joins would produe more result than separeted join sources, and 2) ANTI/SEMI joins need to be deparsed as sub-query and it seems to take some more time to implement. b) Both outer and inner must have RelOptInfo#fdw_private Having fdw_private means that the RelOptInfo is safe to push down, so having no fdw_private means that portion is not safe to push down and thus the whole join is not safe to push down. c) All relations in the join must belong to the same server This check is done with serverid stored in RelOptInfo#fdw_private as PgFdwRelationInfo. Joining relations belong to different servers is not leagal. Even they finally have completely same connection information, they should accessed via different libpq sessions. Responsibility of checking server matching is under discussion in the Custom/Foreign join thread, and I'd vote for checking it in core. If it is decided, I remove this criterion soon. d) All relations must have the same effective user id This check is done with userid stored in PgFdwRelationInfo, which is valid only when underlying relations have the same effective user id. Here "effective user id" means id of the user executing the query, or the owner of the view when the foreign table is accessed via view. Tom suggested that it is necessary to check that user mapping matches for security reason, and now I think it's better than checking effective user as current postgres_fdw does. e) Each source relation must not have any local filter Evaluating conditions of join source talbe potentially produces different result in OUTER join cases. This can be relaxed for the cases when the join is INNER and local filters don't contain any volatile function/operator, but it is left as future enhancement. f) All join conditions of non-inner join must be safe to push down This is similar to e). A join which passes all criteria above is safe to push-down, so postgres_fdw create a ForeignPath for the join and add it to RelOptInfo. Currently postgres_fdw doesn't set pathkeys (ordering information) nor require_outer (information for parameterized path). PgFdwRelationInfo is used to store various planning information specific to postgres_fdw. To support join push-down, I added some fields which are necessary to deparse join query recursively in deparseSelectSql. - outer RelOptInfo, to generate source relatoin as subquery - inner RelOptInfo, to generate source relation as subquery - jointype, to deparse JOIN keyword string (e.g. INNER JOIN) - joinclauses, to deprase ON part of JOIN clause - otherclauses, to deparse WHERE clause of join query I also moved it from postgres_fdw.c to postgres_fdw.h, because deparse.c needs to refer the definition during deparsing query. Generating plan node -------------------- Once planner find the cheapest path, it generates plan tree from the path tree. During the steps, planner calls GetForeignPlan for the ForeignPath in the top of a join tree. IOW, GetForeignPlan is not called for underlying joins and scans, so postgres_fdw needs a way to do its task (mainly generating query string) recursively. GetForeignPlan generates remote query and store it in ForeignScan node to be returned. The construction of remote query is done by calling deparseSelectSql for given RelOptInfo. This function was modified to accept both base relations and join relations to support join push-down. Main part of generating join query is implemented in deparseJoinSql, which is a newly added function, and deparseSelectSql calls it if given relation was a join relation. One big change about deparsing base relation is aliasing. This patch adds column alias to SELECT clause even original query is a simple single table SELECT. fdw=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM pgbench_branches b; QUERY PLAN ------------------------------------------------------------------------------------ Foreign Scan on public.pgbench_branches b Output: bid, bbalance, filler Remote SQL: SELECT bid a9, bbalance a10, filler a11 FROM public.pgbench_branches (3 rows) As you see, every column has alias in format "a%d" with index value derived from pg_attribute.attnum. Index value is attnum + 8, and the magic number "8" comes from FirstLowInvalidHeapAttributeNumber for the adjustment that makes attribute number of system attributes positive. To make the code more readable, I introduced local macro GET_RELATIVE_ATTNO(), but I'm not sure that this name is very nice. This column alias system allows to refer a particular column pointed by Var node in upper join level without consideration about column position in SELECT clause. To achieve this, deparseVar is expanded to handle variables used in join query, and deparseJoinVar is added to deparse column reference with table alias "l" or "r" for outer or inner respectively. As mentioned at the beginning of this section, GetForeignPlan is called only for the top node of the join tree, so we need to do something recursively. So postgres_fdw has information about join tree structure in PgFdwRelationInfo and pass it via RelOptInfo#fdw_private. This link works down to the base relation at the leaf of the join tree. When deparseSelectSql is called for a join relation, it calls deparseSelectSql for each outer and inner RelOptInfo to generate query strings, and pass them to deparseJoinSql as parts for FROM clause subquery. For example of two way join: [table definition] CREATE FOREIGN TABLE table1 ( id int NOT NULL, name text, ... ) SERVER server; CREATE FOREIGN TABLE table2 ( id int NOT NULL, name text, ... ) SERVER server; [original query] SELECT t1.name, t2.name FROM table1 t1 INNER JOIN table2 t2 ON t1.id = t2.id; [remote query] SELECT l.a2, r.a2 FROM (SELECT id, name FROM table1) l (a1, a2) INNER JOIN (SELECT id, name FROM table2) r (a1, a2) ON (l.a1 = r.a1) ; During deparsing join query, deparseJoinSql maintains fdw_scan_tlist list to hold TargetEntry for columns used in SELECT clause of every query. One thing tricky is "peusdo projection" which is done by deparseProjectionSql for whole-row reference. This is done by put the query string in FROM subquery and add whole-row reference in outer SELECT clause. This is done by ExecProjection in 9.4 and older, but we need to do this in postgres_fdw because ExecProjection is not called any more. For various conditions, such as join conditions in JOIN clauses and filter conditions in WHERE clauses, appendCondition is used to deparse condition strings. This was expanded version of appendWhereClause. Note that appendConditions accepts list of RestrictInfo or list of Expr as source, and downcasts them properly. As of 9.4 appendWhereClause accepted only RestrictInfo, but join conditions are Expr, so I made it little flexible. Finally deparseJoinSql construct whole query by putting parts into the right places. Note that column aliases are not written in SELECT clause but in FROM clause, after table alias. This simpifies SELECT clause construction simpler. debug stuffs ------------ bms_to_str is a function which prints contents of a bitmapset in human-readable format. I added this for debug purpose, but IMO it's ok to have such function as public bitmapset API. 2015-05-03 10:51 GMT+09:00 Shigeru HANADA <shigeru.hanada@gmail.com>: > Thanks for the comments. > > 2015/05/01 22:35、Robert Haas <robertmhaas@gmail.com> のメール: >> On Mon, Apr 27, 2015 at 5:07 AM, Shigeru Hanada >> <shigeru.hanada@gmail.com> wrote: >>> 2015-04-27 11:00 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>: >>>> Hanada-san, could you adjust your postgres_fdw patch according to >>>> the above new (previous?) definition. >>> >>> The attached v14 patch is the revised version for your v13 patch. It also contains changed for Ashutosh’s comments. >> >> We should probably move this discussion to a new thread now that the >> other patch is committed. Changing subject line accordingly. >> >> Generally, there's an awful lot of changes in this patch - it is over >> 2000 insertions and more than 450 deletions - and it's not awfully >> obvious why all of those changes are there. I think this patch needs >> a detailed README to accompany it explaining what the various changes >> in the patch are and why those things got changed; or maybe there is a >> way to break it up into multiple patches so that we can take a more >> incremental approach. I am really suspicious of the amount of >> wholesale reorganization of code that this patch is doing. It's >> really hard to validate that a reorganization like that is necessary, >> or that it's correct, and it's gonna make back-patching noticeably >> harder in the future. If we really need this much code churn it needs >> careful justification; if we don't, we shouldn't do it. >> > > I agree. I’ll write detailed description for the patch and repost the new one with rebasing onto current HEAD. I’m sorrybut it will take a day or so... > >> +SET enable_mergejoin = off; -- planner choose MergeJoin even it has >> higher costs, so disable it for testing. >> >> This seems awfully strange. Why would the planner choose a plan if it >> had a higher cost? > > I thought so but I couldn’t reproduce such situation now. I’ll investigate it more. If the issue has gone, I’ll removethat SET statement for straightforward tests. > > >> >> - * If the table or the server is configured to use remote estimates, >> - * identify which user to do remote access as during planning. This >> + * Identify which user to do remote access as during planning. This >> * should match what ExecCheckRTEPerms() does. If we fail due >> to lack of >> * permissions, the query would have failed at runtime anyway. >> */ >> - if (fpinfo->use_remote_estimate) >> - { >> - RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root); >> - Oid userid = rte->checkAsUser ? >> rte->checkAsUser : GetUserId(); >> - >> - fpinfo->user = GetUserMapping(userid, fpinfo->server->serverid); >> - } >> - else >> - fpinfo->user = NULL; >> + rte = planner_rt_fetch(baserel->relid, root); >> + fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId(); >> >> So, wait a minute, remote estimates aren't optional any more? > > No, it seems to be removed accidentally. I’ll check the reason again though, but I’ll revert the change unless I findany problem. > > -- > Shigeru HANADA > shigeru.hanada@gmail.com > > > > -- Shigeru HANADA
Attachment
pgsql-hackers by date: