RFC: ExecNodeExtender - Mailing list pgsql-hackers

From Kohei KaiGai
Subject RFC: ExecNodeExtender
Date
Msg-id CADyhKSXadW02Q5sn=d-4YU0FgKRTPF3F6cAoWpJWm1Q9dui4cw@mail.gmail.com
Whole thread Raw
Responses Re: RFC: ExecNodeExtender
List pgsql-hackers
I'd like to propose a feature that allows extensions to replace a part
of plan-tree underlying PlannedStmt
by self-defined exec node being associated with several callback
functions implemented at extension
module.
Right now, about 30 built-in exec nodes are implemented, and all the
query execution plan is
a combination of them, but no way to extend it on the fly. This API
enhancement will make sense to
implement cutting edge features, but not upstreamed yet, without
undesirable source branch.

The API I'd like to propose is as follows:

typedef struct {   const char *extender_name;   ExecNodeExtender_BeginExec_function fn_begin_exec;
ExecNodeExtender_GetNext_function    fn_getnext;   ExecNodeExtender_ReScan_function     fn_rescan;
ExecNodeExtender_EndExec_function  fn_end_exec;   ExecNodeExtender_Explain_function     fn_explain;
 
} ENExecRoutine;

Index RegisterExecNodeExtender(const ENExecRoutine *routine);

It registers a set of callbacks being identified with a unique name,
then returns a unique index
to the internal array to save multiple self-define exec node.
Once self-define exec node is registered, extension module can replace
a part of plan-tree
underlying PlannedStmt by ENExec node with the index value being returned on its
registration time. Probably, planner_hook is the best position for
extensions to modify
PlannedStmt being constructed by standard planner.

typedef struct {   Plan    plan;   Index   enexec_id;   List   *private_list;
} ENExec;

Please note that it does not intend to inject self-defined node on
planner stage (on initial
implementation at least) to avoid unnecessary patch complexity.
Then, ENExec node shall be transformed to ENExecState at ExecInitNode() time.

typedef strcut {   PlanState      ps;   ENExecRoutine *routine;   void          *private;
} ENExecState;

The overall process is much similar to APIs of FDWs.
fn_begin_exec() shall be invoked on ExecInitNode() to initialize
relevant ENExecState, then
fn_end_exec() shall be also invoked on ExecEndNode().

One thing different from FDW is, it does not assume self-defined exec
node is always table
scan, thus, fn_getnext() has equivalent job with ExecForeignScan(),
not IterateForeignScan.
Extension module can choose whether it uses ExecScan() to implement an
alternative
scan, or something others like sort or aggregate.

I'm not certain whether PlannedStmt in text format by nodeToString()
needs to be valid
beyond server restart, or not.
One thing we have to pay attention is, server restart with module
unload may make orphan
PlannedStmt that contains self-defined node being not valid, if it has
to be valid beyond
server restarting. Please correct me.

Also, I don't think ExecNodeExtender is not a good naming, because it
is a bit long and
abbreviation (ENE?) is hard to imagine the feature. Please give this
feature a cool and
well understandable name.

I plan to submit a first working patch with PoC contrib module towards
commit-fest:2nd,
because we have less than 10 days now, and we already have enough queued patches
including row-level security...

Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCH] Add support for TAS/S_UNLOCK for aarch64
Next
From: Tom Lane
Date:
Subject: Re: Configurable location for extension .control files