Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Parallel Seq Scan
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F80114956C@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
In response to Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
> > On Thu, Sep 17, 2015 at 4:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > >
> > > I haven't studied the planner logic in enough detail yet to have a
> > > clear opinion on this.  But what I do think is that this is a very
> > > good reason why we should bite the bullet and add outfuncs/readfuncs
> > > support for all Plan nodes.  Otherwise, we're going to have to scan
> > > subplans for nodes we're not expecting to see there, which seems
> > > silly.  We eventually want to allow all of those nodes in the worker
> > > anyway.
> > >
> >
> > makes sense to me.  There are 39 plan nodes and it seems we have
> > support for all of them in outfuncs and needs to add for most of them
> > in readfuncs.
> >
> 
> Attached patch (read_funcs_v1.patch) contains support for all the plan
> and other nodes (like SubPlan which could be required for worker) except
> CustomScan node.  CustomScan contains TextOutCustomScan and doesn't
> contain corresponding Read function pointer, we could add the support for
> same, but I am not sure if CustomScan is required to be passed to worker
> in near future, so I am leaving it for now.
>
Oh... I did exactly duplicated job a few days before.
https://github.com/kaigai/sepgsql/blob/readfuncs/src/backend/nodes/readfuncs.c

Regarding of CustomScan node, I'd like to run on worker process as soon as
possible once it gets supported. I'm highly motivated.

Andres raised a related topic a few weeks before:
http://www.postgresql.org/message-id/20150825181933.GA19326@awork2.anarazel.de

Here are two issues:

* How to reproduce "methods" pointer on another process. Extension may not be loaded via shared_preload_libraries.
-> One solution is to provide a pair of library and symbol name of the method  table, instead of the pointer. I think
itis a reasonable idea.
 

* How to treat additional output of TextOutCustomScan.
-> Here are two solutions. (1) Mark TextOutCustomScan as an obsolete callback,  however, it still makes Andres concern
becausewe need to form/deform private  data for copyObject safe. (2) Add TextReadCustomScan (and NodeEqualCustomScan?)
callbackto process private fields.
 

> To verify the patch, I have done 2 things, first I have added elog to
> the newly supported read funcs and then in planner, I have used
> nodeToString and stringToNode on planned_stmt and then used the
> newly generated planned_stmt for further execution.  After making these
> changes, I have ran make check-world and ensures that it covers all the
> newly added nodes.
> 
> Note, that as we don't populate funcid's in expressions during read, the
> same has to be updated by traversing the tree and updating in different
> expressions based on node type.  Attached patch (read_funcs_test_v1)
> contains the changes required for testing the patch.  I am not very sure
> about what do about some of the ForeignScan fields (fdw_private) in order
> to update the funcid as the data in those expressions could be FDW specific.
> This is anyway for test, so doesn't matter much, but the same will be
> required to support read of ForeignScan node by worker.
>
Because of interface contract, it is role of FDW driver to put nodes which
are safe to copyObject on fdw_exprs and fdw_private field. Unless FDW driver
does not violate, fdw_exprs and fdw_private shall be reproduced on the worker
side. (Of course, we cannot guarantee nobody has local pointer on private
field...)
Sorry, I cannot understand the sentence of funcid population. It seems to me
funcid is displayed as-is, and _readFuncExpr() does nothing special...?


Robert Haas said:
> I think it would be worth doing something like this:
> 
> #define READ_ATTRNUMBER_ARRAY(fldname, len) \
>     token = pg_strtok(&length); \
>     local_node->fldname = readAttrNumberCols(len);
> 
> And similarly for READ_OID_ARRAY, READ_BOOL_ARRAY, READ_INT_ARRAY.
>
Like this?
https://github.com/kaigai/sepgsql/blob/readfuncs/src/backend/nodes/readfuncs.c#L133

I think outfuncs.c also have similar macro to centralize the format of array.
Actually, most of boolean array are displayed using booltostr(), however, only
_outMergeJoin() uses "%d" format to display boolean as integer.
It is a bit inconsistent manner.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Less than ideal error reporting in pg_stat_statements
Next
From: Robert Haas
Date:
Subject: Re: planstate_tree_walker oversight CustomScan