Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Foreign join pushdown vs EvalPlanQual |
Date | |
Msg-id | 56289A04.9080007@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Foreign join pushdown vs EvalPlanQual (Kouhei Kaigai <kaigai@ak.jp.nec.com>) |
List | pgsql-hackers |
On 2015/10/20 9:36, Kouhei Kaigai wrote: > Even if we fetch whole-row of both side, join pushdown is exactly working > because we can receive less number of rows than local join + 2 of foreign- > scan. (If planner works well, we can expect join-path that increases number > of rows shall be dropped.) > > One downside of my proposition is growth of width for individual rows. > It is a trade-off situation. The above approach takes no changes for > existing EPQ infrastructure, thus, its implementation design is clear. > On the other hands, your approach will reduce traffic over the network, > however, it is still unclear how we integrate scanrelid==0 with EPQ > infrastructure. I agree with KaiGai-san that his proposition (or my proposition based on secondary plans) is still a performance improvement over the current implementation on local joining plus early row locking, since that that wouldn't have to transfer useless data that didn't satisfy join conditions at all! > On the other hands, in case of custom-scan that takes underlying local > scan-nodes, thus, any kind of ROW_MARK_* except for ROW_MARK_COPY will > happen. I think width of the joined tuples are relatively minor issue > than FDW cases. However, we cannot expect the fetched rows are protected > by early row-locking mechanism, so probability of re-fetching rows and > reconstruction of joined-tuple has relatively higher priority. I see. >> There is also some possible loss of efficiency with this approach. >> Suppose that we have two tables ft1 and ft2 which are being joined, >> and we push down the join. They are being joined on an integer >> column, and the join needs to select several other columns as well. >> However, ft1 and ft2 are very wide tables that also contain some text >> columns. The query is like this: >> >> SELECT localtab.a, ft1.p, ft2.p FROM localtab LEFT JOIN (ft1 JOIN ft2 >> ON ft1.x = ft2.x AND ft1.huge ~ 'stuff' AND f2.huge2 ~ 'nonsense') ON >> localtab.q = ft1.q; >> >> If we refetch each row individually, we will need a wholerow image of >> ft1 and ft2 that includes all columns, or at least f1.huge and >> f2.huge2. If we just fetch a wholerow image of the join output, we >> can exclude those. The only thing we need to recheck is that it's >> still the case that localtab.q = ft1.q (because the value of >> localtab.q might have changed). As KaiGai-san mentioned above, what we need to discuss more about with Robert's proposition is how to integrate that into the existing EPQ machinery. For example, when, where, and how should we refetch the whole-row image of the join output in the case of late row locking? IMV I think that that would need to add a new FDW API different from RefetchForeignRow, say RefetchForeignJoinRow. IMO I think that another benefit from the proposition from KaiGai-san (or me) would be that that could provide the whole functionality for row locking in remote joins, without an additional development burden on an FDW author; the author only has to write GetForeignRowMarkType and RefetchForeignRow, which I think is relatively easy. I think that in the proposition, the use of rowmark types such as ROW_MARK_SHARE or ROW_MARK_EXCLUSIVE for foreign tables in remote joins would be quite inefficient, but I think that the use of ROW_MARK_REFERENCE instead of ROW_MARK_COPY would be an option for the workload where EPQ rechecks are rarely invoked, because we just need to transfer ctids, not whole-row images. Best regards, Etsuro Fujita
pgsql-hackers by date: