Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c) - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F80117663D@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)
List pgsql-hackers
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
> Sent: Saturday, November 21, 2015 8:05 AM
> To: Robert Haas
> Cc: Andres Freund; Amit Kapila; pgsql-hackers
> Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support
> on readfuncs.c)
> 
> > On Thu, Nov 19, 2015 at 10:11 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > > The above two points are for the case if and when extension want to use
> > > variable length fields for its private fields.
> > > So, nodeAlloc() callback is not a perfect answer for the use case because
> > > length of the variable length fields shall be (usually) determined by the
> > > value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
> > > we cannot determine the correct length before read.
> > >
> > > Let's assume an custom-scan extension that wants to have:
> > >
> > >   typedef struct {
> > >       CustomScan    cscan;
> > >       int           priv_value_1;
> > >       long          priv_value_2;
> > >       extra_info_t  extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
> > >       /* its length equal to number of sub-plans */
> > >   } ExampleScan;
> > >
> > > The "extnodename" within CustomScan allows to pull callback functions
> > > to handle read node from the serialized format.
> > > However, nodeAlloc() callback cannot determine the correct length
> > > (= number of sub-plans in this example) prior to read 'cscan' part.
> > >
> > > So, I'd like to suggest _readCustomScan (and other extendable nodes
> > > also) read tokens on local CustomScan variable once, then call
> > >   Node *(nodeRead)(Node *local_node)
> > > to allocate entire ExampleScan node and read other private fields.
> > >
> > > The local_node is already filled up by the core backend prior to
> > > the callback invocation, so extension can know how many sub-plans
> > > are expected thus how many private tokens shall appear.
> > > It also means extension can know exact length of the ExampleScan
> > > node, so it can allocate the node as expected then fill up
> > > remaining private tokens.
> >
> > On second thought, I think we should insist that nodes have to be
> > fixed-size.  This sounds like a mess.  If the node needs a variable
> > amount of storage for something, it can store a pointer to a
> > separately-allocated array someplace inside of it.  That's what
> > existing nodes do, and it works fine.
> >
> OK, let's have "node_size" instead of nodeAlloc callback and other
> stuffs.
> 
> Please wait for a patch.
>
I'm now implementing. The above design perfectly works on ForeignScan.
On the other hands, I'd like to have deeper consideration for CustomScan.

My recent patch adds LibraryName and SymbolName on CustomScanMethods
to lookup the method table even if library is not loaded yet.
However, this ExtensibleNodeMethods relies custom scan provider shall
be loaded, by parallel infrastructure, prior to the deserialization.
It means extension has a chance to register itself as well.

My idea is, redefine CustomScanMethod as follows:

typedef struct ExtensibleNodeMethods
{   const char *extnodename;   Size        node_size;   Node     *(*nodeCopy)(const Node *from);   bool
(*nodeEqual)(constNode *a, const Node *b);   void      (*nodeOut)(struct StringInfoData *str, const Node *node);   void
    (*nodeRead)(Node *node);
 
} ExtensibleNodeMethods;

typedef struct CustomScanMethods
{   union {       const char *CustomName;       ExtensibleNodeMethods  xnode;   };   /* Create execution state
(CustomScanState)from a CustomScan plan node */   Node       *(*CreateCustomScanState) (struct CustomScan *cscan);
 
} CustomScanMethods;

It allows to pull CustomScanMethods using GetExtensibleNodeMethods
by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
Thus, we don't need to care about LibraryName and SymbolName.

This kind of trick is not needed for ForeignScan because all the pointer
stuff are reproducible by catalog, however, method table of CustomScan
is not.

How about your opinion?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Redefine default result from PQhost()?
Next
From: YUriy Zhuravlev
Date:
Subject: Re: WIP: About CMake v2