Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Foreign join pushdown vs EvalPlanQual |
Date | |
Msg-id | 564461AC.6050400@lab.ntt.co.jp |
In response to | Re: Foreign join pushdown vs EvalPlanQual (Kouhei Kaigai <kaigai@ak.jp.nec.com>) |
Responses | Re: Foreign join pushdown vs EvalPlanQual |
List | pgsql-hackers |
Robert and Kaigai-san,

Sorry, I sent in an unfinished email.

On 2015/11/12 15:30, Kouhei Kaigai wrote:
>> On 2015/11/12 2:53, Robert Haas wrote:
>>> On Sun, Nov 8, 2015 at 11:13 PM, Etsuro Fujita
>>> <fujita.etsuro@lab.ntt.co.jp> wrote:
>>>> To test this change, I think we should update the postgres_fdw patch so as
>>>> to add the RecheckForeignScan.
>>>>
>>>> Having said that, as I said previously, I don't see much value in adding the
>>>> callback routine, to be honest. I know KaiGai-san considers that that would
>>>> be useful for custom joins, but I don't think that that would be useful even
>>>> for foreign joins, because I think that in case of foreign joins, the
>>>> practical implementation of that routine in FDWs would be to create a
>>>> secondary plan and execute that plan by performing ExecProcNode, as my patch
>>>> does [1]. Maybe I'm missing something, though.

>>> I really don't see why you're fighting on this point. Making this a
>>> generic feature will require only a few extra lines of code for FDW
>>> authors. If this were going to cause some great inconvenience for FDW
>>> authors, then I'd agree it isn't worth it. But I see zero evidence
>>> that this is actually the case.

>> Really? I think there would be not a little burden on an FDW author;
>> when postgres_fdw delegates to the subplan to the remote server, for
>> example, it would need to create a remote join query by looking at
>> tuples possibly fetched and stored in estate->es_epqTuple[], send the
>> query and receive the result during the callback routine.

> I cannot understand why it is the only solution.

I didn't say that.

>> Furthermore,
>> what I'm most concerned about is that wouldn't be efficient. So, my

> You have to add "because ..." sentence here because I and Robert
> think a little inefficiency is not a problem.

Sorry, my explanation was not enough.
The reason for that is that in the above postgres_fdw case for example,
the overhead in sending the query to the remote end and transferring the
result to the local end would not be negligible. Yeah, we might be able
to apply a special handling for the improved efficiency when using early
row locking, but otherwise can we do the same thing?

> Please don't start the sentence from "I think ...". We all knows
> your opinion, but what I've wanted to see is "the reason why my
> approach is valuable is ...".

I didn't say that my approach is *valuable* either. What I think is, I
see zero evidence that there is a good use-case for an FDW to do
something other than doing an ExecProcNode in the callback routine, as I
said below, so I don't see the need to add such a routine while that
would cause maybe not a large, but not a little burden for writing such
a routine on FDW authors.

> Nobody prohibits postgres_fdw performs a secondary join here.
> All you need to do is, picking up a sub-plan tree from FDW's private
> field then call ExecProcNode() inside the callback.

>> As I said before, I know that KaiGai-san considers that
>> that approach would be useful for custom joins. But I see zero evidence
>> that there is a good use-case for an FDW.

>>> From my point of view I'm now
>>> thinking this solution has two parts:
>>>
>>> (1) Let foreign scans have inner and outer subplans. For this
>>> purpose, we only need one, but it's no more work to enable both, so we
>>> may as well. If we had some reason, we could add a list of subplans
>>> of arbitrary length, but there doesn't seem to be an urgent need for
>>> that.

I did the same thing in an earlier version of the patch I posted.
Although I agreed on Robert's comment "The Plan tree and the PlanState
tree should be mirror images of each other; breaking that equivalence
will cause confusion, at least.", I think that that would make code much
simpler, especially the code for setting chgParam for inner/outer
subplans.
But one thing I'm concerned about is enable both inner and outer plans,
because I think that that would make the planner postprocessing
complicated, depending on what the foreign scans do by the inner/outer
subplans. Is it worth doing so? Maybe I'm missing something, though.

>>> (2) Add a recheck callback.
>>>
>>> If the foreign data wrapper wants to adopt the solution you're
>>> proposing, the recheck callback can call
>>> ExecProcNode(outerPlanState(node)). I don't think this should end up
>>> being more than a few lines of code, although of course we should
>>> verify that.

Yeah, I think FDWs would probably need to create a subplan accordingly
at planning time, and then initializing/closing the plan at execution
time. I think we could facilitate subplan creation by providing helper
functions for that, though.

Best regards,
Etsuro Fujita
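
The recheck approach discussed in this message (a callback that simply runs a local subplan over the refetched rows) can be sketched as a toy model in C. This is only an illustration under stated assumptions: every type and function name beginning with `Toy` or `toy_` is hypothetical and is not part of PostgreSQL or postgres_fdw. Only the idea of calling ExecProcNode on the outer subplan comes from the thread. The model assumes one refetched row per side of the join and a simple equality join condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for executor structures; not PostgreSQL's types. */
typedef struct ToyRow { int key; int val; } ToyRow;
typedef struct ToyJoined { int key; int left_val; int right_val; } ToyJoined;

typedef struct ToyPlanState ToyPlanState;
struct ToyPlanState {
    /* Plays the role of ExecProcNode: fills *out and returns true if a
     * tuple was produced, false when there is nothing (more) to return. */
    bool (*exec_proc_node)(ToyPlanState *self, ToyJoined *out);
    ToyPlanState *outer;      /* local subplan attached to the foreign scan */
    const ToyRow *epq_left;   /* refetched row for the left relation */
    const ToyRow *epq_right;  /* refetched row for the right relation */
    bool done;                /* the local join yields at most one tuple */
};

/* Local join over the refetched rows: succeeds only if the join
 * condition (key equality, an assumption of this sketch) still holds. */
bool toy_local_join_exec(ToyPlanState *self, ToyJoined *out)
{
    if (self->done || self->epq_left == NULL || self->epq_right == NULL)
        return false;
    self->done = true;
    if (self->epq_left->key != self->epq_right->key)
        return false;
    out->key = self->epq_left->key;
    out->left_val = self->epq_left->val;
    out->right_val = self->epq_right->val;
    return true;
}

/* Toy recheck callback: delegate to the outer subplan, in the spirit of
 * ExecProcNode(outerPlanState(node)) as suggested in the thread. */
bool toy_recheck_foreign_scan(ToyPlanState *scan, ToyJoined *slot)
{
    assert(scan != NULL && scan->outer != NULL);
    return scan->outer->exec_proc_node(scan->outer, slot);
}

/* Build a scan node with a local join subplan and run one recheck. */
bool toy_demo(int left_key, int right_key, ToyJoined *slot)
{
    ToyRow left = { left_key, 10 };
    ToyRow right = { right_key, 20 };
    ToyPlanState join = { toy_local_join_exec, NULL, &left, &right, false };
    ToyPlanState scan = { NULL, &join, NULL, NULL, false };
    return toy_recheck_foreign_scan(&scan, slot);
}
```

In this model, `toy_demo(1, 1, &slot)` passes the recheck and fills `slot` with the joined row, while `toy_demo(1, 2, &slot)` fails it because the refetched rows no longer join. The callback itself stays a few lines long; the work sits in building and initializing the subplan, which is the part the message suggests helper functions could ease.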