Re: Join push-down support for foreign tables - Mailing list pgsql-hackers

From Shigeru Hanada
Subject Re: Join push-down support for foreign tables
Date
Msg-id CAEZqfEeTJRcjJ=ZUoEDetdnw1yp2DjXxEdO7MeOP4pe1w3NnZA@mail.gmail.com
In response to Re: Join push-down support for foreign tables  (Shigeru HANADA <shigeru.hanada@gmail.com>)
Responses Re: Join push-down support for foreign tables
List pgsql-hackers
2014-09-08 8:07 GMT+09:00 Shigeru HANADA <shigeru.hanada@gmail.com>:
> (2014/09/04 21:37), Robert Haas wrote:
>> On Wed, Sep 3, 2014 at 5:16 AM,
>> Probably both the initial cost and final cost calculations should be
>> delegated to the FDW, but maybe within postgres_fdw, the initial cost
>> should do only the work that can be done without contacting the remote
>> server; then, let the final cost step do that if appropriate.  But I'm
>> not entirely sure what is best here.
>
> Agreed.  I'll design planner API along that way for now.

I tried several implementation approaches but have not found a feasible
one yet, so I'd like to hear hackers' ideas.

* Foreign join hook point
At first I thought that following the existing cost-estimation approach
was the way to go, but I now tend to think it doesn't fit foreign joins:
each local join method is tightly coupled to sortedness, whereas a
foreign join is not.

In the current planner, add_paths_to_joinrel is aware of sortedness,
and the functions called directly from it are aware of the join method.
But this does not seem to fit the abstraction level of FDWs.  The FDW
interface is highly abstracted, unlike, say, custom plan providers, so
most of the work should be delegated to the FDW, including pathkeys
consideration, IMO.

Besides that, the order of join consideration is another issue.  At
first I tried adding foreign join consideration last (after hash join
consideration), but after some thought I realized that early
elimination would work better if we try the foreign join first,
because in most cases a foreign join is the cheapest way to accomplish
a join between two foreign relations.

So currently I think the promising approach is to delegate the whole
join consideration to FDWs, by calling a new FDW API in
add_paths_to_joinrel before the other join methods are considered.

This means that an FDW can add multiple arbitrary paths to the
RelOptInfo in one call.  Of course this allows an FDW to do a round
trip per path, but that would be an optimization issue, and an FDW can
still compare among themselves the candidates it can obtain without a
round trip.
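
To make the ordering argument concrete, here is a minimal standalone
sketch, not the WIP patch.  Every name in it (JoinRel, add_path_sketch,
ForeignJoinPathsHook, example_fdw_join_paths,
add_paths_to_joinrel_sketch) is hypothetical, the costs are made-up
placeholders, and the real add_path also considers pathkeys and other
properties that are ignored here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a join RelOptInfo. */
typedef struct JoinRel
{
    bool   both_foreign;   /* assumption: both inputs are foreign tables
                            * on the same server */
    bool   has_path;       /* has any path been kept so far? */
    double cheapest_cost;  /* total cost of the cheapest kept path */
    int    rejected;       /* candidates discarded on arrival */
} JoinRel;

/* Hypothetical FDW callback: may add any number of paths in one call. */
typedef void (*ForeignJoinPathsHook) (JoinRel *rel);

/* Keep a candidate only if it beats the cheapest path seen so far. */
void
add_path_sketch(JoinRel *rel, double total_cost)
{
    if (rel->has_path && total_cost >= rel->cheapest_cost)
    {
        rel->rejected++;
        return;
    }
    rel->has_path = true;
    rel->cheapest_cost = total_cost;
}

/* Example FDW: offers two candidates without any remote round trip. */
void
example_fdw_join_paths(JoinRel *rel)
{
    add_path_sketch(rel, 120.0);    /* e.g. unsorted remote join */
    add_path_sketch(rel, 150.0);    /* e.g. remote join with ORDER BY */
}

/* Foreign join is considered first, local join methods afterwards. */
void
add_paths_to_joinrel_sketch(JoinRel *rel, ForeignJoinPathsHook fdw_hook)
{
    if (rel->both_foreign && fdw_hook != NULL)
        fdw_hook(rel);

    add_path_sketch(rel, 900.0);    /* placeholder: merge join */
    add_path_sketch(rel, 700.0);    /* placeholder: nested loop */
    add_path_sketch(rel, 500.0);    /* placeholder: hash join */
}
```

With these placeholder costs, the cheap foreign path is kept first and
every local candidate is rejected on arrival; with no FDW callback, each
local path replaces the previous one and hash join wins.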

* Supported join types
INNER and (LEFT|RIGHT|FULL) OUTER joins would be safe to push down,
even though some OUTER JOINs might not be much faster than a local
join.  I'm not sure that SEMI and ANTI joins are safe to push down.
Can we leave the matter to FDWs, or should we forbid FDWs from pushing
them down by not calling the foreign join API?  Anyway, SEMI/ANTI would
not be supported in the first version.
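
If the core side is what forbids SEMI/ANTI, the gate could look roughly
like the sketch below.  The enum is a local stand-in that mirrors only a
subset of the planner's JoinType values, and join_type_is_pushable is a
hypothetical name, not an existing function.

```c
#include <stdbool.h>

/* Local stand-in mirroring a subset of the planner's JoinType enum. */
typedef enum SketchJoinType
{
    SKETCH_JOIN_INNER,
    SKETCH_JOIN_LEFT,
    SKETCH_JOIN_FULL,
    SKETCH_JOIN_RIGHT,
    SKETCH_JOIN_SEMI,
    SKETCH_JOIN_ANTI
} SketchJoinType;

/*
 * Hypothetical gate: decide whether the foreign join API is called at
 * all for a given join type.
 */
bool
join_type_is_pushable(SketchJoinType jointype)
{
    switch (jointype)
    {
        case SKETCH_JOIN_INNER:
        case SKETCH_JOIN_LEFT:
        case SKETCH_JOIN_FULL:
        case SKETCH_JOIN_RIGHT:
            return true;
        case SKETCH_JOIN_SEMI:
        case SKETCH_JOIN_ANTI:
            return false;       /* not supported in the first version */
    }
    return false;               /* unknown join type: stay local */
}
```

Leaving the matter to FDWs instead would mean dropping this gate and
passing the join type through to the callback.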

* Blockers of join push-down
Pushing down a join means that the foreign scans for the inner and
outer relations are skipped, so some elements block push-down.
Basically the criteria are the same as for scan push-down and update
push-down.

After some thought, I think we should check only for unsafe
expressions in the join quals and restriction quals.  This limitation
is necessary to avoid differences in results between the pushed-down
and local cases.  The target list seems to contain only Vars for the
necessary columns, but we should check that too.
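
As a rough illustration of those checks (not code from the patch), the
sketch below walks a toy expression tree.  SketchExpr, its remote_safe
flag, and the three functions are all hypothetical; how an FDW actually
decides that an operator or function is safe to evaluate remotely is
left out.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum SketchNodeKind
{
    SK_VAR,
    SK_CONST,
    SK_OPEXPR,
    SK_FUNCEXPR
} SketchNodeKind;

/* Toy expression node; real planner nodes are far richer. */
typedef struct SketchExpr
{
    SketchNodeKind kind;
    bool        remote_safe;    /* assumption: set by the FDW for
                                 * operators and functions it trusts */
    const struct SketchExpr *left;
    const struct SketchExpr *right;
} SketchExpr;

/* An expression is safe only if every node beneath it is safe. */
bool
expr_is_safe_to_push(const SketchExpr *expr)
{
    if (expr == NULL)
        return true;
    if ((expr->kind == SK_OPEXPR || expr->kind == SK_FUNCEXPR) &&
        !expr->remote_safe)
        return false;
    return expr_is_safe_to_push(expr->left) &&
        expr_is_safe_to_push(expr->right);
}

/* Applies to both join quals and restriction quals. */
bool
quals_are_safe(const SketchExpr *const *quals, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (!expr_is_safe_to_push(quals[i]))
            return false;
    }
    return true;
}

/* The target list is expected to hold plain Vars only. */
bool
tlist_is_plain_vars(const SketchExpr *const *tlist, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (tlist[i] == NULL || tlist[i]->kind != SK_VAR)
            return false;
    }
    return true;
}
```

A join would be a push-down candidate only when quals_are_safe holds
for both qual lists and tlist_is_plain_vars holds for the target list.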

* WIP patch
Attached is a WIP patch for reviewing the design.  The remaining work
is 1) judging whether to push down or not, and 2) generating the join
SQL.  For 2), I'm thinking of referring to Postgres-XC's join shipping
mechanism.

Any comments or questions are welcome.
--
Shigeru HANADA
