Thread: CustomScan and readfuncs.c
Hello, Under the investigation of ParallelAppend, I noticed here is a few problems in CustomScan, that prevents to reproduce an equivalent plan node on the background worker from serialized string. 1. CustomScanMethods->TextOutCustomScan callback ------------------------------------------------ This callback allows to output custom information on nodeToString. Originally, we intend to use this callback for debug only, because CustomScan must be copyObject() safe, thus, all the private data also must be stored in custom_exprs or custom_private. However, it will lead another problem when we try to reproduce CustomScan node from the string form generated by outfuncs.c. If TextOutCustomScan prints something, upcoming _readCustomScan has to deal with unexpected number of tokens in unexpected format. I'd like to propose to omit this callback prior to v9.5 release, for least compatibility issues. 2. Reproduce method table on background worker ---------------------------------------------- The method field of CustomPath/Scan/ScanState is expected to be a reference to a static structure. Thus, copyObject() does not copy the entire table, but only pointers. However, we have no way to guarantee the callback functions have same entrypoint addressed on background workers. So, we may need an infrastructure to reproduce same CustomScan node with same callback function tables, probably, identified by name. We may have a few ways to solve the problem. * Add system catalog, function returns pointer The simplest way, like FDW. System catalog has a name and function to return callback pointers. It also needs SQL statement support, even a little down side. * Registered by name, during shared_preload_libraries only Like an early version of CustomScan interface, it requires custom- scan providers to register a pair of name and callbacks, but only when shared_preload_libraries is processed, to guarantee the callbacks are also registered in the background workers also. (Is this assumption right on windows?) Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
Kouhei Kaigai <kaigai@ak.jp.nec.com> writes: > Under the investigation of ParallelAppend, I noticed here is a few > problems in CustomScan, that prevents to reproduce an equivalent > plan node on the background worker from serialized string. > 1. CustomScanMethods->TextOutCustomScan callback > ------------------------------------------------ > This callback allows to output custom information on nodeToString. > Originally, we intend to use this callback for debug only, because > CustomScan must be copyObject() safe, thus, all the private data > also must be stored in custom_exprs or custom_private. > However, it will lead another problem when we try to reproduce > CustomScan node from the string form generated by outfuncs.c. > If TextOutCustomScan prints something, upcoming _readCustomScan > has to deal with unexpected number of tokens in unexpected format. Um ... wait a second. There is no support in readfuncs for any plan node type, and never has been, and I seriously doubt that there ever should be. I do not think it makes sense to ship plans around in the way you seem to have in mind. (Also, I don't think the problems you mention are exactly unique to CustomScan. There's no reason to assume that FDW plans could survive this treatment either, since we do not know what's in the fdw_private stuff; certainly no one has ever suggested that it should not contain pointers to static data.) > I'd like to propose to omit this callback prior to v9.5 release, > for least compatibility issues. I regard our commitment to cross-version compatibility for the custom scan APIs as being essentially zero, for reasons previously discussed. So if this goes away in 9.6 it will not matter, but we might as well leave it in for now for debug support. regards, tom lane
> Kouhei Kaigai <kaigai@ak.jp.nec.com> writes: > > Under the investigation of ParallelAppend, I noticed here is a few > > problems in CustomScan, that prevents to reproduce an equivalent > > plan node on the background worker from serialized string. > > > 1. CustomScanMethods->TextOutCustomScan callback > > ------------------------------------------------ > > This callback allows to output custom information on nodeToString. > > Originally, we intend to use this callback for debug only, because > > CustomScan must be copyObject() safe, thus, all the private data > > also must be stored in custom_exprs or custom_private. > > However, it will lead another problem when we try to reproduce > > CustomScan node from the string form generated by outfuncs.c. > > If TextOutCustomScan prints something, upcoming _readCustomScan > > has to deal with unexpected number of tokens in unexpected format. > > Um ... wait a second. There is no support in readfuncs for any > plan node type, and never has been, and I seriously doubt that there > ever should be. I do not think it makes sense to ship plans around > in the way you seem to have in mind. (Also, I don't think the > problems you mention are exactly unique to CustomScan. There's no > reason to assume that FDW plans could survive this treatment either, > since we do not know what's in the fdw_private stuff; certainly no > one has ever suggested that it should not contain pointers to static > data.) > Yep, no Plan node types are supported at this moment, however, will appear soon by the Funnel + PartialSeqScan nodes. It serializes a partial plan subtree using nodeToString() then gives the flatten PlannedStmt to background workers. I'm now investigating to apply same structure to Append not to kick child nodes in parallel. Once various plan node types appear in readfuncs.c, we have to care about this problem, don't it? I'm working for the patch submission of ParallelAppend on the next commit-fest, so like to make a consensus how to treat this matter. > > I'd like to propose to omit this callback prior to v9.5 release, > > for least compatibility issues. > > I regard our commitment to cross-version compatibility for the > custom scan APIs as being essentially zero, for reasons previously > discussed. So if this goes away in 9.6 it will not matter, but we > might as well leave it in for now for debug support. > I don't argue this point strongly. If TextOutCustomScan shall be obsoleted on v9.6, it is just kindness for developers not to use this callback. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
> 2. Reproduce method table on background worker > ---------------------------------------------- > The method field of CustomPath/Scan/ScanState is expected to be > a reference to a static structure. Thus, copyObject() does not > copy the entire table, but only pointers. > However, we have no way to guarantee the callback functions have > same entrypoint addressed on background workers. So, we may need > an infrastructure to reproduce same CustomScan node with same > callback function tables, probably, identified by name. > We may have a few ways to solve the problem. > > * Add system catalog, function returns pointer > The simplest way, like FDW. System catalog has a name and function > to return callback pointers. It also needs SQL statement support, > even a little down side. > I tried to design a DDL statement and relevant system catalog as follows. #define CustomPlanRelationId 3999 CATALOG(pg_custom_plan,3999) { NameData custom_name; regproc custom_handler;} FormData_pg_custom_plan; This simple catalog saves a pair of name and handler function of custom plan provider. Like FDW, this handler function returns pointers to the entrypoint to be called by set_(rel|join)_pathlist_hook and relevant CustomXXXMethods table. User can register a custom plan provider using the following statement: CREATE CUSTOM PLAN <name> HANDLER <function_name>; And unregister: DROP CUSTOM PLAN <name>; This enhancement allows background workers to reproduce CustomScan node that was serialized by nodeToString(), as long as provider is specified by the name. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: Kaigai Kouhei(海外 浩平) > Sent: Monday, July 27, 2015 8:42 AM > To: 'Tom Lane' > Cc: pgsql-hackers@postgresql.org > Subject: RE: [HACKERS] CustomScan and readfuncs.c > > > Kouhei Kaigai <kaigai@ak.jp.nec.com> writes: > > > Under the investigation of ParallelAppend, I noticed here is a few > > > problems in CustomScan, that prevents to reproduce an equivalent > > > plan node on the background worker from serialized string. > > > > > 1. CustomScanMethods->TextOutCustomScan callback > > > ------------------------------------------------ > > > This callback allows to output custom information on nodeToString. > > > Originally, we intend to use this callback for debug only, because > > > CustomScan must be copyObject() safe, thus, all the private data > > > also must be stored in custom_exprs or custom_private. > > > However, it will lead another problem when we try to reproduce > > > CustomScan node from the string form generated by outfuncs.c. > > > If TextOutCustomScan prints something, upcoming _readCustomScan > > > has to deal with unexpected number of tokens in unexpected format. > > > > Um ... wait a second. There is no support in readfuncs for any > > plan node type, and never has been, and I seriously doubt that there > > ever should be. I do not think it makes sense to ship plans around > > in the way you seem to have in mind. (Also, I don't think the > > problems you mention are exactly unique to CustomScan. There's no > > reason to assume that FDW plans could survive this treatment either, > > since we do not know what's in the fdw_private stuff; certainly no > > one has ever suggested that it should not contain pointers to static > > data.) > > > Yep, no Plan node types are supported at this moment, however, will > appear soon by the Funnel + PartialSeqScan nodes. > It serializes a partial plan subtree using nodeToString() then gives > the flatten PlannedStmt to background workers. > I'm now investigating to apply same structure to Append not to kick > child nodes in parallel. > Once various plan node types appear in readfuncs.c, we have to care > about this problem, don't it? I'm working for the patch submission > of ParallelAppend on the next commit-fest, so like to make a consensus > how to treat this matter. > > > > I'd like to propose to omit this callback prior to v9.5 release, > > > for least compatibility issues. > > > > I regard our commitment to cross-version compatibility for the > > custom scan APIs as being essentially zero, for reasons previously > > discussed. So if this goes away in 9.6 it will not matter, but we > > might as well leave it in for now for debug support. > > > I don't argue this point strongly. If TextOutCustomScan shall be > obsoleted on v9.6, it is just kindness for developers not to use > this callback. > > Thanks, > -- > NEC Business Creation Division / PG-Strom Project > KaiGai Kohei <kaigai@ak.jp.nec.com>
On Sun, Jul 26, 2015 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Um ... wait a second. There is no support in readfuncs for any > plan node type, and never has been, and I seriously doubt that there > ever should be. As KaiGai says, the parallel query stuff contemplates changing this; how else will we get the plan from the master to each worker? We could invent a bespoke serialization format, but that seems to have exactly nothing to recommend it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company