On Mon, Mar 14, 2016 at 1:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> On 14/03/16 02:43, Kouhei Kaigai wrote:
>>> Even though I couldn't check the new planner implementation entirely,
>>> it seems to be the points below are good candidate to inject CustomPath
>>> (and potentially ForeignScan).
>>>
>>> - create_grouping_paths
>>> - create_window_paths
>>> - create_distinct_paths
>>> - create_ordered_paths
>>> - just below of the create_modifytable_path
>>> (may be valuable for foreign-update pushdown)
>
>> To me that seems too low inside the planning tree, perhaps adding it
>> just to the subquery_planner before SS_identify_outer_params would be
>> better, that's the place where you see the path for the whole (sub)query
>> so you can search and modify what you need from there.
>
> I don't like either of those too much. The main thing I've noticed over
> the past few days is that you can't readily generate custom upper-level
> Paths unless you know what PathTarget grouping_planner is expecting each
> level to produce. So what I was toying with doing is (1) having
> grouping_planner put all those targets into the PlannerInfo, perhaps
> in an array indexed by UpperRelationKind; and (2) adding a hook call
> immediately after those targets are computed, say right before
> the create_grouping_paths() call (approximately planner.c:1738
> in HEAD). It should be sufficient to have one hook there since
> you can inject Paths into any of the upper relations at that point;
> moreover, that's late enough that you shouldn't have to recompute
> anything you figured out during scan/join planning.
That is a nice set of characteristics for a hook.
> Now, a simple hook function is probably about the best we can offer
> CustomScan providers, since we don't really know what they're going
> to do. But I'm pretty unenthused about telling FDW authors to use
> such a hook, because generally speaking they're simply going to waste
> cycles finding out that they aren't involved in a given query.
> It would be better if we invent an FDW callback that's meant to be
> invoked at this stage, but only call it for FDW(s) actively involved
> in the query. I'm not sure exactly what that ought to look like though.
> Maybe only call the FDW identified as possible owner of the topmost
> scan/join relation?
I think in the short term that's as well as we're going to do, so +1.
In the long run, I'm interested in making FDWs be able to optimize
queries like foreigntab JOIN localtab ON foreigntab.x = localtab.x
(e.g. by copying localtab to the remote side when it's small); but
that will require revisiting some old decisions, too. What you're
proposing here sounds consistent with what we've done so far.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company