Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 20151002.172639.18062431.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: Foreign join pushdown vs EvalPlanQual  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
Re: Foreign join pushdown vs EvalPlanQual  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List pgsql-hackers
Hello, thank you for the explanation. I understand the background now.

In the current planner implementation, row marks are tightly bound
to initial RTEs. This is quite natural given the purpose of row marks.

During join search, a joinrel must be compatible between local
joins and remote joins, and of course so must its target list.
So it is quite difficult to add a wholerow resjunk entry for
joinrels before the whole join tree is completed, even if we
allowed row marks that are not bound to base RTEs.

The result of make_rel_from_joinlist contains only the winning
paths, so we might be able to transform the target list of this
joinrel so that it has join wholerows (and no unnecessary RTE
wholerows), but I don't see any clean way to do that.

As a result, all that LockRows can collect for EPQ are tuples
for base relations. There is no room to pass a joined whole row so far.
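
To illustrate the point, here is a toy model of per-RTE EPQ
storage. It is only a sketch, not PostgreSQL code: ToyTuple,
ToyEPQState and the toy_epq_* functions are hypothetical names,
and the one-slot-per-range-table-index layout is my simplification
of how the EPQ tuples are kept. It shows that an index of zero,
which is what a pushed-down join scan has as scanrelid, has no
place to store a joined row.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are PostgreSQL definitions. */
#define TOY_MAX_RTES 8

typedef struct ToyTuple
{
    int id;
} ToyTuple;

typedef struct ToyEPQState
{
    int             nrtes;                  /* number of base RTEs */
    const ToyTuple *tuples[TOY_MAX_RTES];   /* slot for rti is [rti - 1] */
} ToyEPQState;

/*
 * Store a tuple for range table index rti (1-based).
 * Returns 0 on success, -1 when no slot exists for that index.
 */
static int
toy_epq_set_tuple(ToyEPQState *epq, int rti, const ToyTuple *tup)
{
    assert(epq != NULL);
    if (rti < 1 || rti > epq->nrtes || rti > TOY_MAX_RTES)
        return -1;              /* rti == 0 (a join scan) lands here */
    epq->tuples[rti - 1] = tup;
    return 0;
}

/* Fetch the tuple stored for rti, or NULL when there is no such slot. */
static const ToyTuple *
toy_epq_get_tuple(const ToyEPQState *epq, int rti)
{
    assert(epq != NULL);
    if (rti < 1 || rti > epq->nrtes || rti > TOY_MAX_RTES)
        return NULL;
    return epq->tuples[rti - 1];
}
```

In this model, base rows for rti 1 and 2 can be stored and fetched,
while any attempt to store a joined row under index 0 is rejected.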


At Fri, 2 Oct 2015 05:04:44 +0000, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote in
<9A28C8860F777E439AA12E8AEA7694F80114DBFB@BPXM15GP.gisp.nec.co.jp>
> > > I never say FDW should refetch tuples on the recheck routine.
> > > All I suggest is, projection to generate a joined tuple and
> > > recheck according to the qualifier pushed down are role of
> > > FDW driver, because it knows the best strategy to do the job.
> > 
> > I have no objection that rechecking is FDW's job.
> > 
> > I think you are thinking that all ROW_MARK_COPY base rows are
> > held in ss_ScanTupleSlot so simply calling recheckMtd on the slot
> > gives enough data to the function. (EPQState would also be needed
> > to retrieve, though..) Right?
> >
> Not ss_ScanTupleSlot. It is initialized according to fdw_scan_tlist
> in case of scanrelid==0, regardless of base foreign relation's
> definition.

Sorry, EvalPlanQualFetchRowMarks retrieves wholerows from
epqstate->origslot.

> My expectation is, FDW callback construct tts_values/tts_isnull
> of ss_ScanTupleSlot according to the preloaded tuples in EPQ slots
> and underlying projection. Only FDW driver knows the best way to
> construct this result tuple.

Currently only the FDW itself knows precisely how the joined
relation is made.
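
As a rough sketch of what such a callback would have to do, the
toy function below builds a joined row from two preloaded base
rows and rechecks a pushed-down join condition. Everything here
is hypothetical: ToyBaseRow, ToyJoinedRow and toy_join_recheck
are made-up names, and the equality condition on "key" merely
stands in for whatever qualifier the FDW actually pushed down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are PostgreSQL definitions. */
typedef struct ToyBaseRow
{
    int key;
    int val;
} ToyBaseRow;

typedef struct ToyJoinedRow
{
    int key;
    int left_val;
    int right_val;
} ToyJoinedRow;

/*
 * Recheck the (assumed) pushed-down condition left.key = right.key
 * against the preloaded base rows. On success, project the joined
 * row into *out and return true; otherwise return false and leave
 * *out untouched.
 */
static bool
toy_join_recheck(const ToyBaseRow *left, const ToyBaseRow *right,
                 ToyJoinedRow *out)
{
    assert(out != NULL);

    /* A missing base row means there is nothing to join. */
    if (left == NULL || right == NULL)
        return false;

    /* The pushed-down join qualifier no longer holds. */
    if (left->key != right->key)
        return false;

    /* Projection: only this function knows the joined row layout. */
    out->key = left->key;
    out->left_val = left->val;
    out->right_val = right->val;
    return true;
}
```

The caller only supplies base rows and gets back either a joined
row or a "does not match" answer, which is the division of labor
being discussed.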

> You can pull out EState reference from PlanState portion of the
> ForeignScanState, so nothing needs to be changed.

Exactly.

> > > > > Apart from FDW requirement, custom-scan/join needs recheckMtd is
> > > > > called when scanrelid==0 to avoid assertion fail. I hope FDW has
> > > > > symmetric structure, however, not a mandatory requirement for me.
...
> > Hmm. What I said by "works as expected" is that the function
> > stores the tuple for the "foreign join" scan node. If it doesn't,
> > you're right.
> >
> Which slot of the EPQ slot will save the joined tuple?

Yes, that is the second significant problem. Furthermore, as
described above, a way to inject a joined wholerow var into the
target list for the pushed-down join seems even more difficult
to find.

> scanrelid is zero, and we have no identifier of join planstate.
> 
> > > Who can provide a projection to generate joined tuple?
> > > It is a job of individual plan-state-node to be walked on during
> > > EvalPlanQualNext().
> > 
> > EvalPlanQualNext simply does recheck tuples stored in epqTuples,
> > which are designed to be provided by EvalPlanQualFetchRowMarks.
> > 
> > I think that that premise shouldn't be broken for convenience...
> > 
> Do I see something different or understand incorrectly?
> EvalPlanQualNext() walks down entire subtree of the Lock node.
> (epqstate->planstate is entire subplan of Lock node.)
> 
>   TupleTableSlot *
>   EvalPlanQualNext(EPQState *epqstate)
>   {
>       MemoryContext oldcontext;
>       TupleTableSlot *slot;
>   
>       oldcontext = MemoryContextSwitchTo(epqstate->estate->es_query_cxt);
>       slot = ExecProcNode(epqstate->planstate);
>       MemoryContextSwitchTo(oldcontext);
>   
>       return slot;
>   }
> 
> If and when relations joins are kept in the sub-plan, ExecProcNode()
> processes the projection by join, doesn't it?

Yes, but an alternative plan has to be prepared to do such a
projection.

> Why projection by join is not a part of EvalPlanQualNext()?
> It is the core of its job. Unless projection by join, upper node cannot
> recheck the tuple come from child nodes.

What I'm uneasy about is that foreign join introduces a difference
in behavior between ordinary fetching and EPQ fetching. Having a
joined whole row would be quite natural, but it seems hard to get.

Another reason is that ExecScanFetch with scanrelid == 0 under EPQ
is an FDW/CS-specific feature and looks like a kind of hack. (Even
if it would be one of many.)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



