Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API) - Mailing list pgsql-hackers
From | Shigeru Hanada |
---|---|
Subject | Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API) |
Date | |
Msg-id | CAEZqfEcNvjqq-P=jxnW1Pb4T9wvpcPoRCN7G6cc46JGuB7dY8w@mail.gmail.com Whole thread Raw |
In response to | Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API) (Kouhei Kaigai <kaigai@ak.jp.nec.com>) |
List | pgsql-hackers |
Kaigai-san, The v6 patch was cleanly applied on master branch. I'll rebase my patch onto it, but before that I have a comment about name of the new FDW API handler GetForeignJoinPath. Obviously FDW can add multiple paths at a time, like GetForeignPaths, so IMO it should be renamed to GetForeignJoinPaths, with plural form. In addition to that, new member of RelOptInfo, fdw_handler, should be initialized explicitly in build_simple_rel. Please see attached a patch for these changes. I'll review the v6 path afterwards. 2015-03-03 20:20 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>: > Sorry, I misoperated on patch creation. > Attached one is the correct version. > -- > NEC OSS Promotion Center / PG-Strom Project > KaiGai Kohei <kaigai@ak.jp.nec.com> > > >> -----Original Message----- >> From: pgsql-hackers-owner@postgresql.org >> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai >> Sent: Tuesday, March 03, 2015 6:31 PM >> To: Kaigai Kouhei(海外 浩平); Robert Haas >> Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada >> Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API) >> >> The attached version of custom/foreign-join interface patch >> fixes up the problem reported on the join-pushdown support >> thread. >> >> The previous version referenced *_ps_tlist on setrefs.c, to >> check whether the Custom/ForeignScan node is associated with >> a particular base relation, or not. >> This logic considered above nodes performs base relation scan, >> if *_ps_tlist is valid. However, it was incorrect in case when >> underlying pseudo-scan relation has empty targetlist. >> Instead of the previous logic, it shall be revised to check >> scanrelid itself. If zero, it means Custom/ForeignScan node is >> not associated with a particular base relation, thus, its slot >> descriptor for scan shall be constructed based on *_ps_tlist. >> >> >> Also, I noticed a potential problem if CSP/FDW driver want to >> displays expression nodes using deparse_expression() but >> varnode within this expression does not appear in the *_ps_tlist. >> For example, a remote query below shall return rows with two >> columns. >> >> SELECT atext, btext FROM tbl_a, tbl_b WHERE aid = bid; >> >> Thus, ForeignScan will perform like as a scan on relation with >> two columns, and FDW driver will set two TargetEntry on the >> fdw_ps_tlist. If FDW is designed to keep the join condition >> (aid = bid) using expression node form, it is expected to be >> saved on custom/fdw_expr variable, then setrefs.c rewrites the >> varnode according to *_ps_tlist. >> It means, we also have to add *_ps_tlist both of "aid" and "bid" >> to avoid failure on variable lookup. However, these additional >> entries changes the definition of the slot descriptor. >> So, I adjusted ExecInitForeignScan and ExecInitCustomScan to >> use ExecCleanTypeFromTL(), not ExecTypeFromTL(), when it construct >> the slot descriptor based on the *_ps_tlist. >> It expects CSP/FDW drivers to add target-entries with resjunk=true, >> if it wants to have additional entries for variable lookups on >> EXPLAIN command. >> >> Fortunately or unfortunately, postgres_fdw keeps its remote query >> in cstring form, so it does not need to add junk entries on the >> fdw_ps_tlist. >> >> Thanks, >> -- >> NEC OSS Promotion Center / PG-Strom Project >> KaiGai Kohei <kaigai@ak.jp.nec.com> >> >> >> > -----Original Message----- >> > From: pgsql-hackers-owner@postgresql.org >> > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai >> > Sent: Sunday, February 15, 2015 11:01 PM >> > To: Kaigai Kouhei(海外 浩平); Robert Haas >> > Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada >> > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API) >> > >> > The attached patch is a rebased version of join replacement with >> > foreign-/custom-scan. Here is no feature updates at this moment >> > but SGML documentation is added (according to Michael's comment). >> > >> > This infrastructure allows foreign-data-wrapper and custom-scan- >> > provider to add alternative scan paths towards relations join. >> > From viewpoint of the executor, it looks like a scan on a pseudo- >> > relation that is materialized from multiple relations, even though >> > FDW/CSP internally processes relations join with their own logic. >> > >> > Its basic idea is, (1) scanrelid==0 indicates this foreign/custom >> > scan node runs on a pseudo relation and (2) fdw_ps_tlist and >> > custom_ps_tlist introduce the definition of the pseudo relation, >> > because it is not associated with a tangible relation unlike >> > simple scan case, thus planner cannot know the expected record >> > type to be returned without these additional information. >> > These two enhancement enables extensions to process relations >> > join internally, and to perform as like existing scan node from >> > viewpoint of the core backend. >> > >> > Also, as an aside. I had a discussion with Hanada-san about this >> > interface off-list. He had an idea to keep create_plan_recurse() >> > static, using a special list field in CustomPath structure to >> > chain underlying Path node. If core backend translate the Path >> > node to Plan node if valid list given, extension does not need to >> > call create_plan_recurse() by itself. >> > I have no preference about this. Does anybody have opinion? >> > >> > Thanks, >> > -- >> > NEC OSS Promotion Center / PG-Strom Project >> > KaiGai Kohei <kaigai@ak.jp.nec.com> >> > >> > >> > > -----Original Message----- >> > > From: pgsql-hackers-owner@postgresql.org >> > > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai >> > > Sent: Thursday, January 15, 2015 8:03 AM >> > > To: Robert Haas >> > > Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada >> > > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan >> > > API) >> > > >> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> >> > > wrote: >> > > > > When custom-scan node replaced a join-plan, it shall have at least >> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs >> > > > > to be able to call create_plan_recurse() to transform the underlying >> > > > > path-nodes to plan-nodes, because this custom-scan node may take >> > > > > other built-in scan or sub-join nodes as its inner/outer input. >> > > > > In case of FDW, it shall kick any underlying scan relations to >> > > > > remote side, thus we may not expect ForeignScan has underlying plans... >> > > > >> > > > Do you have an example of this? >> > > > >> > > Yes, even though full code set is too large for patch submission... >> > > >> > > https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880 >> > > >> > > This create_gpuhashjoin_plan() is PlanCustomPath callback of GpuHashJoin. >> > > It takes GpuHashJoinPath inherited from CustomPath that has multiple >> > > underlying scan/join paths. >> > > Once it is called back from the backend, it also calls create_plan_recurse() >> > > to make inner/outer plan nodes according to the paths. >> > > >> > > In the result, we can see the following query execution plan that CustomScan >> > > takes underlying scan plans. >> > > >> > > postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2; >> > > QUERY PLAN >> > > ---------------------------------------------------------------------- >> > > ------------ >> > > Custom Scan (GpuHashJoin) (cost=2968.00..140120.31 rows=3970922 >> > > width=143) >> > > Hash clause 1: (aid = aid) >> > > Hash clause 2: (bid = bid) >> > > Bulkload: On >> > > -> Custom Scan (GpuScan) on t0 (cost=500.00..57643.00 rows=4000009 >> > > width=77) >> > > -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000 >> > > width=37) >> > > hash keys: aid >> > > nBatches: 1 Buckets: 46000 Memory Usage: 99.99% >> > > -> Seq Scan on t1 (cost=0.00..734.00 rows=40000 width=37) >> > > -> Custom Scan (MultiHash) (cost=734.00..734.00 rows=40000 >> > > width=37) >> > > hash keys: bid >> > > nBatches: 1 Buckets: 46000 Memory Usage: 49.99% >> > > -> Seq Scan on t2 (cost=0.00..734.00 rows=40000 >> > > width=37) >> > > (13 rows) >> > > >> > > Thanks, >> > > -- >> > > NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei >> > > <kaigai@ak.jp.nec.com> >> > > >> > > >> > > > -----Original Message----- >> > > > From: Robert Haas [mailto:robertmhaas@gmail.com] >> > > > Sent: Thursday, January 15, 2015 2:07 AM >> > > > To: Kaigai Kouhei(海外 浩平) >> > > > Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada >> > > > Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS] >> > > > [v9.5] Custom Plan API) >> > > > >> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> >> > > wrote: >> > > > > When custom-scan node replaced a join-plan, it shall have at least >> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs >> > > > > to be able to call create_plan_recurse() to transform the underlying >> > > > > path-nodes to plan-nodes, because this custom-scan node may take >> > > > > other built-in scan or sub-join nodes as its inner/outer input. >> > > > > In case of FDW, it shall kick any underlying scan relations to >> > > > > remote side, thus we may not expect ForeignScan has underlying plans... >> > > > >> > > > Do you have an example of this? >> > > > >> > > > -- >> > > > Robert Haas >> > > > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL >> > > > Company >> > > >> > > -- >> > > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make >> > > changes to your subscription: >> > > http://www.postgresql.org/mailpref/pgsql-hackers -- Shigeru HANADA
Attachment
pgsql-hackers by date: