Re: [v9.5] Custom Plan API - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: [v9.5] Custom Plan API
Msg-id 9A28C8860F777E439AA12E8AEA7694F8010883E0@BPXM15GP.gisp.nec.co.jp
In response to [v9.5] Custom Plan API  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses Re: [v9.5] Custom Plan API
List pgsql-hackers
> On Mon, Nov 24, 2014 at 6:57 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > Indeed, I don't think it is a good idea to start from this harder portion.
> > Let's focus on just varno/varattno remapping to replace join relation
> > by custom-scan, as an immediate target.
> 
> We still need something like this for FDWs, as well.  The potential gains
> there are enormous.  Anything we do had better fit in nicely with that,
> rather than looking like a separate hack.
> 
Today I talked with Hanada-san to clarify which parts the two features have
in common and how to implement them. We concluded that the two features can
share most of their infrastructure.
Below is an outline of join replacement by foreign-/custom-scan.

The overall design is to inject a foreign-/custom-scan node in place of the
built-in join logic, chosen based on the estimated cost. From the viewpoint
of the core backend, such a node looks like a sub-query scan that joins the
relations internally.

What we need to do is as follows:

(1) Add a hook to add_paths_to_joinrel()
It gives extensions (including FDW drivers and custom-scan providers) a chance
to add alternative paths for a particular join of relations, using
ForeignScanPath or CustomScanPath, if they can run the join instead of the
built-in paths.
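
To illustrate the hook in (1), here is a minimal standalone sketch, not the
actual patch. All names (PathSketch, JoinRelSketch, join_path_hook,
add_path_sketch and so on) and all cost values are hypothetical stand-ins,
not PostgreSQL's planner structures; the real add_path() also prunes
dominated paths, which this model does not.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PATHS 8

/* Hypothetical stand-ins; these are not PostgreSQL's planner structures. */
typedef struct PathSketch {
    const char *kind;        /* e.g. "HashJoin", "CustomScan" */
    double      total_cost;  /* estimated cost used to pick a winner */
} PathSketch;

typedef struct JoinRelSketch {
    PathSketch paths[MAX_PATHS];
    int        npaths;
} JoinRelSketch;

/* Hook type: an extension may add alternative paths for this join. */
typedef void (*join_path_hook_type)(JoinRelSketch *joinrel);
join_path_hook_type join_path_hook = NULL;

void add_path_sketch(JoinRelSketch *rel, const char *kind, double cost)
{
    assert(rel->npaths < MAX_PATHS);
    rel->paths[rel->npaths].kind = kind;
    rel->paths[rel->npaths].total_cost = cost;
    rel->npaths++;
}

/* Models add_paths_to_joinrel(): built-in paths first, then the hook. */
void add_paths_to_joinrel_sketch(JoinRelSketch *joinrel)
{
    add_path_sketch(joinrel, "HashJoin", 100.0);    /* made-up cost */
    if (join_path_hook != NULL)
        join_path_hook(joinrel);
}

/* Returns the cheapest path; the relation must have at least one path. */
const PathSketch *cheapest_path(const JoinRelSketch *rel)
{
    assert(rel->npaths > 0);
    const PathSketch *best = &rel->paths[0];
    for (int i = 1; i < rel->npaths; i++)
        if (rel->paths[i].total_cost < best->total_cost)
            best = &rel->paths[i];
    return best;
}

/* Example provider offering a cheaper custom-scan path (made-up cost). */
void demo_custom_scan_hook(JoinRelSketch *joinrel)
{
    add_path_sketch(joinrel, "CustomScan", 40.0);
}
```

With the hook unset, only the built-in path is considered; once an extension
installs demo_custom_scan_hook, its path competes on cost like any other.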

(2) Inform the core backend of the varno/varattno mapping
One thing we need to pay attention to is that a foreign-/custom-scan node that
runs in place of a built-in join node must return a mixture of values that
come from both relations. When an FDW driver fetches a remote record (or a
record computed by an external computing resource), the most reasonable way is
to store it in ecxt_scantuple of the ExprContext, then run the projection with
Var nodes that reference this slot.
This needs infrastructure that tracks the relationship between each original
Var node and its alternative varno/varattno. We think the alternative should
be INDEX_VAR plus a virtual attribute number, so that it references
ecxt_scantuple naturally; this infrastructure is helpful for both ForeignScan
and CustomScan.
We'd like to add a List *fdw_varmap / List *custom_varmap field to the
respective plan nodes. It holds the original Var nodes, each of which is
mapped to the position given by its list index (e.g., the first Var node
becomes varno=INDEX_VAR, varattno=1).
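
A minimal standalone sketch of the mapping in (2), under stated assumptions:
VarRef, remap_var() and INDEX_VAR_SKETCH are hypothetical stand-ins (the real
Var node and INDEX_VAR live in the backend headers), and a plain array plays
the role of the proposed fdw_varmap/custom_varmap list.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the backend's INDEX_VAR; the value here is only a placeholder. */
#define INDEX_VAR_SKETCH 65002

/* Hypothetical reduced form of a Var node: just the two fields we remap. */
typedef struct VarRef {
    int varno;     /* range-table index of the source relation */
    int varattno;  /* attribute number within that relation */
} VarRef;

/*
 * Look up 'orig' in the var map. On a match, write the replacement
 * reference (INDEX_VAR_SKETCH, 1-based list position) to *out and return 1.
 * Return 0 if the original Var is not produced by the scan.
 */
int remap_var(const VarRef *varmap, int nvars, VarRef orig, VarRef *out)
{
    assert(varmap != NULL && out != NULL);
    for (int i = 0; i < nvars; i++) {
        if (varmap[i].varno == orig.varno &&
            varmap[i].varattno == orig.varattno) {
            out->varno = INDEX_VAR_SKETCH;
            out->varattno = i + 1;   /* first list entry maps to varattno=1 */
            return 1;
        }
    }
    return 0;
}
```

For a join whose scan tuple carries (rel 1, att 2), (rel 2, att 1) and
(rel 2, att 3) in that order, a reference to (rel 2, att 1) would be rewritten
to the second column of the scan tuple.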

(3) Reverse mapping on EXPLAIN
For EXPLAIN support, the Var nodes above that reference the pseudo relation
scan need to be resolved back to the originals. All we need to do is
initialize dpns->inner_tlist in set_deparse_planstate() according to the
mapping above.
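
The reverse lookup in (3) can be sketched the same way. This is a standalone
model only: VarRef, resolve_index_var() and INDEX_VAR_SKETCH are hypothetical
names, and the array stands in for the target list that the deparser would
consult.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the backend's INDEX_VAR; the value here is only a placeholder. */
#define INDEX_VAR_SKETCH 65002

/* Hypothetical reduced form of a Var node. */
typedef struct VarRef {
    int varno;
    int varattno;
} VarRef;

/*
 * Given a reference into the scan tuple (INDEX_VAR_SKETCH, n), return the
 * original Var stored at list position n, or NULL if 'ref' is not a
 * scan-tuple reference or n is out of range.
 */
const VarRef *resolve_index_var(const VarRef *varmap, int nvars, VarRef ref)
{
    assert(varmap != NULL);
    if (ref.varno != INDEX_VAR_SKETCH)
        return NULL;
    if (ref.varattno < 1 || ref.varattno > nvars)
        return NULL;
    return &varmap[ref.varattno - 1];   /* varattno is 1-based */
}
```

EXPLAIN could then print the original relation and column for each resolved
reference instead of an opaque position in the scan tuple.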

(4) Case of scanrelid == 0
To skip opening/closing (foreign) tables, we need a mark that tells the
backend to initialize the scan node according to the pseudo Var node list
rather than a table definition.
As the earlier custom-scan patch did, scanrelid == 0 is a straightforward mark
showing that the scan node is not tied to a particular real relation.
So we also need to add special-case handling around the foreign-/custom-scan
code.
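
A rough standalone sketch of the special case in (4). ScanSketch,
scan_needs_relation() and scan_tuple_natts() are hypothetical names, and the
rel_natts array is an assumed stand-in for per-relation table definitions
indexed by range-table position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced scan node. */
typedef struct ScanSketch {
    unsigned int scanrelid;   /* 0 = not tied to a real relation */
    int          varmap_len;  /* number of entries in the Var map */
} ScanSketch;

/* Returns 1 if executor startup should open the underlying table. */
int scan_needs_relation(const ScanSketch *scan)
{
    assert(scan != NULL);
    return scan->scanrelid != 0;
}

/*
 * Width of the scan tuple: taken from the Var map when scanrelid == 0,
 * otherwise from the table definition (rel_natts is indexed by
 * scanrelid - 1, since range-table indexes start at 1).
 */
int scan_tuple_natts(const ScanSketch *scan, const int *rel_natts,
                     unsigned int nrels)
{
    assert(scan != NULL);
    if (scan->scanrelid == 0)
        return scan->varmap_len;
    assert(rel_natts != NULL && scan->scanrelid <= nrels);
    return rel_natts[scan->scanrelid - 1];
}
```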

We expect the above changes to be small enough to implement basic join
push-down functionality (which does not involve external computation of
complicated expression nodes), yet valuable enough to support in v9.5.

Please comment on the proposal above.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
