Thread: Construction of Plan-node by CSP (RE: Custom/Foreign-Join-APIs)
> -----Original Message-----
> Sent: Friday, May 15, 2015 8:44 AM
> To: 'Tom Lane'; Kohei KaiGai
> Cc: Robert Haas; Thom Brown; Shigeru Hanada; pgsql-hackers@postgreSQL.org
> Subject: RE: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
>
> > A possible compromise that we could perhaps still wedge into 9.5 is to
> > extend CustomPath with a List of child Paths, and CustomScan with a List
> > of child Plans, which createplan.c would know to build from the Paths,
> > and other modules would then also be aware of these children. I find that
> > uglier than a separate join node type, but it would be tolerable I guess.
> >
> The attached patch implements what you suggested as is.
> It allows custom-scan providers to have child Paths without exporting
> create_plan_recurse(), and enables to represent N-way join naturally.
> Please add any solution, even if we don't reach the consensus of how
> create_plan_recurse (and other useful static functions) are visible to
> extensions.
>
I updated the patch against the latest master branch to fix this problem.

Let me restate the problem. (I really have a hard time of it)

When an extension tries to implement its own join logic using the custom-scan interface, it adds a CustomPath, with its cost estimate, in set_join_pathlist_hook. Once the path is chosen by the planner, the PlanCustomPath callback is invoked, and the custom-scan provider constructs its CustomScan node according to the path. I expected it to recursively initialize the underlying Path nodes (those that work as join input, if any) using create_plan_recurse().

However, at this moment we have not reached 100% consensus on exporting this function to extensions, so a later commit made this function static again.

Instead of this approach, Tom suggested adding a list of child Paths to the CustomPath node, so that createplan.c calls create_plan_recurse() for each entry of the list without this function being exported.
I can agree with this approach as an alternative to the previously public create_plan_recurse(), and the attached patch implements this idea as is. (Do I understand his suggestion correctly?)

Below is what is expected of a custom-scan provider that takes underlying Path/Plan nodes.

1. It adds a list of underlying Path nodes to custom_children of the CustomPath node.
2. In PlanCustomPath, it adds the Plan nodes (initialized by createplan.c and passed as an argument) to lefttree, righttree and/or custom_children of the CustomScan node.
3. In BeginCustomScan, it calls ExecInitNode() to begin execution of the underlying plan nodes. Then, if it has more than 2 children, it attaches these PlanState objects to the custom_children list for EXPLAIN output.

As long as an extension follows the above interface contract, it can have underlying child Path/Plan/PlanState nodes without a direct call to create_plan_recurse(), as previously argued.

I think it is a reasonable enough solution for the problem. What do people think?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
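
The division of labour in steps 1 and 2 above can be sketched with a toy model. This is not PostgreSQL code: ToyPath, ToyPlan, toy_create_plan(), toy_create_plan_recurse() and toy_plan_custom_path() are hypothetical stand-ins invented for illustration, and the real CustomPath/CustomScan structs and callback signatures differ. The sketch only shows the idea that the core builds the child plans through a static recursive function and hands them to the provider's callback, so the provider never calls the recursive function itself. Step 3 (executor start-up) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CHILDREN 8

/* Toy stand-ins (assumptions); NOT the PostgreSQL Path/Plan structs. */
typedef struct ToyPath {
    const char *label;                            /* e.g. "SeqScan t1" */
    int nchildren;                                /* 0 for a leaf scan */
    struct ToyPath *custom_children[MAX_CHILDREN];
} ToyPath;

typedef struct ToyPlan {
    const char *label;
    int nchildren;
    struct ToyPlan *custom_children[MAX_CHILDREN];
} ToyPlan;

/* Hypothetical provider callback: receives child plans already built. */
typedef ToyPlan *(*ToyPlanCustomPathFn)(const ToyPath *best_path,
                                        ToyPlan **child_plans,
                                        int nchildren);

/* "Core" side: the recursive builder stays static, i.e. not exported. */
static ToyPlan *toy_create_plan_recurse(const ToyPath *path,
                                        ToyPlanCustomPathFn provider)
{
    ToyPlan *child_plans[MAX_CHILDREN];
    int i;

    assert(path->nchildren >= 0 && path->nchildren <= MAX_CHILDREN);

    if (path->nchildren == 0) {
        /* Leaf: build a plain scan plan directly. */
        ToyPlan *leaf = calloc(1, sizeof(ToyPlan));
        if (leaf == NULL)
            abort();
        leaf->label = path->label;
        return leaf;
    }

    /* Build every child plan first, in list order... */
    for (i = 0; i < path->nchildren; i++)
        child_plans[i] = toy_create_plan_recurse(path->custom_children[i],
                                                 provider);

    /* ...then let the provider assemble its own node from them. */
    return provider(path, child_plans, path->nchildren);
}

/* The only entry point visible outside the "core". */
ToyPlan *toy_create_plan(const ToyPath *best_path,
                         ToyPlanCustomPathFn provider)
{
    return toy_create_plan_recurse(best_path, provider);
}

/* "Extension" side: attaches the plans it was given; no recursion here. */
ToyPlan *toy_plan_custom_path(const ToyPath *best_path,
                              ToyPlan **child_plans, int nchildren)
{
    ToyPlan *scan = calloc(1, sizeof(ToyPlan));
    int i;

    if (scan == NULL)
        abort();
    scan->label = best_path->label;
    scan->nchildren = nchildren;
    for (i = 0; i < nchildren; i++)
        scan->custom_children[i] = child_plans[i];
    return scan;
}
```

Calling toy_create_plan() on a toy path with four leaf children returns one plan node that holds four child plans in the same order. For brevity the sketch never frees the allocated nodes.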
Attachment
On Mon, May 25, 2015 at 5:08 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> I updated the patch to fix up this problem towards the latest master
> branch.
[ ... ]
> Instead of this approach, Tom suggested to add a list of child Paths
> on CustomPath node, then createplan.c calls create_plan_recurse() for
> each entry of the list, without this function getting exported.

Tom, do you want to review this patch and figure out how to solve the underlying problem? If not, I will take care of it. But I will be unhappy if I put time and effort into this and then you insist on changing everything afterwards, again.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> Tom, do you want to review this patch and figure out how to solve the
> underlying problem? If not, I will take care of it. But I will be
> unhappy if I put time and effort into this and then you insist on
> changing everything afterwards, again.

[ sorry for slow response, been busy ] I will take a look.

regards, tom lane
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Thursday, May 28, 2015 5:35 AM
> To: Robert Haas
> Cc: Kaigai Kouhei(海外 浩平); Thom Brown; Kohei KaiGai; Shigeru Hanada;
> pgsql-hackers@postgreSQL.org
> Subject: Re: [HACKERS] Construction of Plan-node by CSP (RE:
> Custom/Foreign-Join-APIs)
>
> Robert Haas <robertmhaas@gmail.com> writes:
> > Tom, do you want to review this patch and figure out how to solve the
> > underlying problem? If not, I will take care of it. But I will be
> > unhappy if I put time and effort into this and then you insist on
> > changing everything afterwards, again.
>
> [ sorry for slow response, been busy ] I will take a look.
>
Tom, how is your availability?

From my side, I adjusted my extension (PG-Strom) to fit the infrastructure you proposed, and confirmed that it works even if a custom-scan that replaces a join of relations takes more than two Path nodes in the custom_children list of CustomPath, without exporting create_plan_recurse().

Below is an example of a custom-scan (GpuJoin) that involves a four-relation join. Its code base is the latest master + custom-join-children.v2.patch; unchanged from the last post.
postgres=# explain analyze select avg(x) from t0 natural join t1 natural join t2 natural join t3 group by cat;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=298513.77..298514.10 rows=26 width=12) (actual time=5622.028..5622.033 rows=26 loops=1)
   Group Key: t0.cat
   ->  Custom Scan (GpuJoin)  (cost=4702.00..249169.85 rows=9868784 width=12) (actual time=540.718..2075.566 rows=10000000 loops=1)
         Bulkload: On (density: 100.00%)
         Depth 1: Logic: GpuHashJoin, HashKeys: (cid), JoinQual: (cid = cid), nrows_ratio: 0.98936439
         Depth 2: Logic: GpuHashJoin, HashKeys: (bid), JoinQual: (bid = bid), nrows_ratio: 0.99748135
         Depth 3: Logic: GpuHashJoin, HashKeys: (aid), JoinQual: (aid = aid), nrows_ratio: 1.00000000
         ->  Custom Scan (BulkScan) on t0  (cost=0.00..242858.60 rows=10000060 width=24) (actual time=8.555..903.864 rows=10000000 loops=1)
         ->  Seq Scan on t3  (cost=0.00..734.00 rows=40000 width=4) (actual time=0.019..4.370 rows=40000 loops=1)
         ->  Seq Scan on t2  (cost=0.00..734.00 rows=40000 width=4) (actual time=0.004..4.182 rows=40000 loops=1)
         ->  Seq Scan on t1  (cost=0.00..734.00 rows=40000 width=4) (actual time=0.005..4.275 rows=40000 loops=1)
 Planning time: 0.918 ms
 Execution time: 6178.264 ms
(13 rows)

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
Attachment
> Robert Haas <robertmhaas@gmail.com> writes:
> > Tom, do you want to review this patch and figure out how to solve the
> > underlying problem? If not, I will take care of it. But I will be
> > unhappy if I put time and effort into this and then you insist on
> > changing everything afterwards, again.
>
> [ sorry for slow response, been busy ] I will take a look.
>
> regards, tom lane
>
Tom, please don't forget the problem.

It is still a problem for a custom-scan provider that tries to implement its own join logic, so we still have to apply an additional patch (or copy and paste createplan.c into the module's source).

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>