Re: Join push-down support for foreign tables - Mailing list pgsql-hackers
From | Shigeru Hanada |
---|---|
Subject | Re: Join push-down support for foreign tables |
Date | |
Msg-id | CAEZqfEeTJRcjJ=ZUoEDetdnw1yp2DjXxEdO7MeOP4pe1w3NnZA@mail.gmail.com |
In response to | Re: Join push-down support for foreign tables (Shigeru HANADA <shigeru.hanada@gmail.com>) |
Responses |
Re: Join push-down support for foreign tables
|
List | pgsql-hackers |
2014-09-08 8:07 GMT+09:00 Shigeru HANADA <shigeru.hanada@gmail.com>:
> (2014/09/04 21:37), Robert Haas wrote:
> On Wed, Sep 3, 2014 at 5:16 AM,
>> Probably both the initial cost and final cost calculations should be
>> delegated to the FDW, but maybe within postgres_fdw, the initial cost
>> should do only the work that can be done without contacting the remote
>> server; then, let the final cost step do that if appropriate.  But I'm
>> not entirely sure what is best here.
>
> Agreed.  I'll design planner API along that way for now.

I tried several implementation patterns but have not found a feasible
one yet, so I'd like to hear hackers' ideas.

* Foreign join hook point

At first I thought that following the existing cost estimation approach
was the way to go, but I now tend to think it doesn't fit foreign joins:
for local joins the join method is tightly coupled to sortedness,
whereas for a foreign join it is not.

In the current planner, add_paths_to_joinrel is aware of sortedness, and
the functions called directly from it are aware of the join method.
This does not seem to fit the abstraction level of FDWs.  The FDW
interface is highly abstracted, unlike custom plan providers for
example, so most of the work, including pathkeys consideration, should
be delegated to the FDW, IMO.

Besides that, the order of join consideration is another issue.  At
first I tried adding foreign join consideration last (after hash join
consideration), but after some thought I noticed that early elimination
would work better if we try the foreign join first, because in most
cases a foreign join is the cheapest way to accomplish a join between
two foreign relations.

So currently I think that delegating the whole join consideration to
FDWs, by calling a new FDW API in add_paths_to_joinrel before the other
join methods are considered, would be promising.

This means that an FDW can add multiple arbitrary paths to the
RelOptInfo in a single call.  Of course this allows an FDW to make a
round trip per path, but that would be an optimization issue, and an FDW
can compare among its own candidates those it can obtain without a
round trip.

* Supported join types

INNER and (LEFT|RIGHT|FULL) OUTER would be safe to push down, even
though some OUTER JOINs might not be much faster than a local join.  I'm
not sure that SEMI and ANTI joins are safe to push down.  Can we leave
the matter to FDWs, or should we forbid FDWs from pushing them down by
not calling the foreign join API?  Anyway, SEMI/ANTI would not be
supported in the first version.

* Blockers of join push-down

Pushing down a join means that the foreign scans for the inner and outer
relations are skipped, so some elements block push-down.  Basically the
criteria are the same as for scan push-down and update push-down.

After some thought, I think we should check only for unsafe expressions
in the join quals and restrict quals.  This limitation is necessary to
avoid differences between the results with and without push-down.  The
target list seems to contain only Vars for the necessary columns, but we
should check that too.

* WIP patch

Attached is a WIP patch for reviewing the design.  The work remaining is
1) judging whether to push down, and 2) generating the join SQL.  For
2), I'm thinking of referring to Postgres-XC's join shipping mechanism.

Any comments or questions are welcome.

--
Shigeru HANADA
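To make the "Supported join types" and "Blockers of join push-down" criteria above concrete, here is a minimal, self-contained C sketch of the push-down decision. It is not taken from the WIP patch and does not use PostgreSQL's planner structures: SketchJoinType, SketchQual and every sketch_* function are hypothetical stand-ins, and the remote_safe flag simply assumes that some earlier step has already classified each expression. The target list check mentioned above could presumably reuse the same expression test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the planner's join type; not PostgreSQL's JoinType. */
typedef enum {
    SKETCH_JOIN_INNER,
    SKETCH_JOIN_LEFT,
    SKETCH_JOIN_RIGHT,
    SKETCH_JOIN_FULL,
    SKETCH_JOIN_SEMI,
    SKETCH_JOIN_ANTI
} SketchJoinType;

/* Hypothetical stand-in for a qual expression that was classified earlier. */
typedef struct {
    const char *text;        /* expression text, for illustration only */
    bool        remote_safe; /* assumed: same result when evaluated remotely */
} SketchQual;

/* First version per the proposal: INNER and LEFT/RIGHT/FULL OUTER only. */
static bool
sketch_join_type_supported(SketchJoinType jointype)
{
    switch (jointype) {
        case SKETCH_JOIN_INNER:
        case SKETCH_JOIN_LEFT:
        case SKETCH_JOIN_RIGHT:
        case SKETCH_JOIN_FULL:
            return true;
        default:
            return false;    /* SEMI and ANTI are not pushed down yet */
    }
}

/* A single unsafe expression blocks push-down of the whole list. */
static bool
sketch_quals_are_safe(const SketchQual *quals, size_t nquals)
{
    assert(quals != NULL || nquals == 0);
    for (size_t i = 0; i < nquals; i++) {
        if (!quals[i].remote_safe)
            return false;
    }
    return true;
}

/* Push down only if the join type is supported and all quals are safe. */
static bool
sketch_can_push_down_join(SketchJoinType jointype,
                          const SketchQual *joinquals, size_t njoinquals,
                          const SketchQual *restrictquals, size_t nrestrictquals)
{
    return sketch_join_type_supported(jointype)
        && sketch_quals_are_safe(joinquals, njoinquals)
        && sketch_quals_are_safe(restrictquals, nrestrictquals);
}
```

Under these assumptions, an INNER join whose quals are all marked safe would be a push-down candidate, while a single unsafe restrict qual or a SEMI/ANTI join type would leave the join to be executed locally.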
Attachment