Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API) - Mailing list pgsql-hackers

From Shigeru Hanada
Subject Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date
Msg-id CAEZqfEcNvjqq-P=jxnW1Pb4T9wvpcPoRCN7G6cc46JGuB7dY8w@mail.gmail.com
Whole thread Raw
In response to Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List pgsql-hackers
Kaigai-san,

The v6 patch was cleanly applied on master branch.  I'll rebase my
patch onto it, but before that I have a comment about name of the new
FDW API handler GetForeignJoinPath.

Obviously FDW can add multiple paths at a time, like GetForeignPaths,
so IMO it should be renamed to GetForeignJoinPaths, with plural form.

In addition to that, new member of RelOptInfo, fdw_handler, should be
initialized explicitly in build_simple_rel.

Please see attached a patch for these changes.

I'll review the v6 path afterwards.


2015-03-03 20:20 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
> Sorry, I misoperated on patch creation.
> Attached one is the correct version.
> --
> NEC OSS Promotion Center / PG-Strom Project
> KaiGai Kohei <kaigai@ak.jp.nec.com>
>
>
>> -----Original Message-----
>> From: pgsql-hackers-owner@postgresql.org
>> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
>> Sent: Tuesday, March 03, 2015 6:31 PM
>> To: Kaigai Kouhei(海外 浩平); Robert Haas
>> Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada
>> Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
>>
>> The attached version of custom/foreign-join interface patch
>> fixes up the problem reported on the join-pushdown support
>> thread.
>>
>> The previous version referenced *_ps_tlist on setrefs.c, to
>> check whether the Custom/ForeignScan node is associated with
>> a particular base relation, or not.
>> This logic considered above nodes performs base relation scan,
>> if *_ps_tlist is valid. However, it was incorrect in case when
>> underlying pseudo-scan relation has empty targetlist.
>> Instead of the previous logic, it shall be revised to check
>> scanrelid itself. If zero, it means Custom/ForeignScan node is
>> not associated with a particular base relation, thus, its slot
>> descriptor for scan shall be constructed based on *_ps_tlist.
>>
>>
>> Also, I noticed a potential problem if CSP/FDW driver want to
>> displays expression nodes using deparse_expression() but
>> varnode within this expression does not appear in the *_ps_tlist.
>> For example, a remote query below shall return rows with two
>> columns.
>>
>>   SELECT atext, btext FROM tbl_a, tbl_b WHERE aid = bid;
>>
>> Thus, ForeignScan will perform like as a scan on relation with
>> two columns, and FDW driver will set two TargetEntry on the
>> fdw_ps_tlist. If FDW is designed to keep the join condition
>> (aid = bid) using expression node form, it is expected to be
>> saved on custom/fdw_expr variable, then setrefs.c rewrites the
>> varnode according to *_ps_tlist.
>> It means, we also have to add *_ps_tlist both of "aid" and "bid"
>> to avoid failure on variable lookup. However, these additional
>> entries changes the definition of the slot descriptor.
>> So, I adjusted ExecInitForeignScan and ExecInitCustomScan to
>> use ExecCleanTypeFromTL(), not ExecTypeFromTL(), when it construct
>> the slot descriptor based on the *_ps_tlist.
>> It expects CSP/FDW drivers to add target-entries with resjunk=true,
>> if it wants to have additional entries for variable lookups on
>> EXPLAIN command.
>>
>> Fortunately or unfortunately, postgres_fdw keeps its remote query
>> in cstring form, so it does not need to add junk entries on the
>> fdw_ps_tlist.
>>
>> Thanks,
>> --
>> NEC OSS Promotion Center / PG-Strom Project
>> KaiGai Kohei <kaigai@ak.jp.nec.com>
>>
>>
>> > -----Original Message-----
>> > From: pgsql-hackers-owner@postgresql.org
>> > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
>> > Sent: Sunday, February 15, 2015 11:01 PM
>> > To: Kaigai Kouhei(海外 浩平); Robert Haas
>> > Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada
>> > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)
>> >
>> > The attached patch is a rebased version of join replacement with
>> > foreign-/custom-scan. Here is no feature updates at this moment
>> > but SGML documentation is added (according to Michael's comment).
>> >
>> > This infrastructure allows foreign-data-wrapper and custom-scan-
>> > provider to add alternative scan paths towards relations join.
>> > From viewpoint of the executor, it looks like a scan on a pseudo-
>> > relation that is materialized from multiple relations, even though
>> > FDW/CSP internally processes relations join with their own logic.
>> >
>> > Its basic idea is, (1) scanrelid==0 indicates this foreign/custom
>> > scan node runs on a pseudo relation and (2) fdw_ps_tlist and
>> > custom_ps_tlist introduce the definition of the pseudo relation,
>> > because it is not associated with a tangible relation unlike
>> > simple scan case, thus planner cannot know the expected record
>> > type to be returned without these additional information.
>> > These two enhancement enables extensions to process relations
>> > join internally, and to perform as like existing scan node from
>> > viewpoint of the core backend.
>> >
>> > Also, as an aside. I had a discussion with Hanada-san about this
>> > interface off-list. He had an idea to keep create_plan_recurse()
>> > static, using a special list field in CustomPath structure to
>> > chain underlying Path node. If core backend translate the Path
>> > node to Plan node if valid list given, extension does not need to
>> > call create_plan_recurse() by itself.
>> > I have no preference about this. Does anybody have opinion?
>> >
>> > Thanks,
>> > --
>> > NEC OSS Promotion Center / PG-Strom Project
>> > KaiGai Kohei <kaigai@ak.jp.nec.com>
>> >
>> >
>> > > -----Original Message-----
>> > > From: pgsql-hackers-owner@postgresql.org
>> > > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
>> > > Sent: Thursday, January 15, 2015 8:03 AM
>> > > To: Robert Haas
>> > > Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada
>> > > Subject: Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan
>> > > API)
>> > >
>> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com>
>> > > wrote:
>> > > > > When custom-scan node replaced a join-plan, it shall have at least
>> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs
>> > > > > to be able to call create_plan_recurse() to transform the underlying
>> > > > > path-nodes to plan-nodes, because this custom-scan node may take
>> > > > > other built-in scan or sub-join nodes as its inner/outer input.
>> > > > > In case of FDW, it shall kick any underlying scan relations to
>> > > > > remote side, thus we may not expect ForeignScan has underlying plans...
>> > > >
>> > > > Do you have an example of this?
>> > > >
>> > > Yes, even though full code set is too large for patch submission...
>> > >
>> > > https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880
>> > >
>> > > This create_gpuhashjoin_plan() is PlanCustomPath callback of GpuHashJoin.
>> > > It takes GpuHashJoinPath inherited from CustomPath that has multiple
>> > > underlying scan/join paths.
>> > > Once it is called back from the backend, it also calls create_plan_recurse()
>> > > to make inner/outer plan nodes according to the paths.
>> > >
>> > > In the result, we can see the following query execution plan that CustomScan
>> > > takes underlying scan plans.
>> > >
>> > > postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2;
>> > >                                     QUERY PLAN
>> > > ----------------------------------------------------------------------
>> > > ------------
>> > >  Custom Scan (GpuHashJoin)  (cost=2968.00..140120.31 rows=3970922
>> > > width=143)
>> > >    Hash clause 1: (aid = aid)
>> > >    Hash clause 2: (bid = bid)
>> > >    Bulkload: On
>> > >    ->  Custom Scan (GpuScan) on t0  (cost=500.00..57643.00 rows=4000009
>> > > width=77)
>> > >    ->  Custom Scan (MultiHash)  (cost=734.00..734.00 rows=40000
>> > > width=37)
>> > >          hash keys: aid
>> > >          nBatches: 1  Buckets: 46000  Memory Usage: 99.99%
>> > >          ->  Seq Scan on t1  (cost=0.00..734.00 rows=40000 width=37)
>> > >          ->  Custom Scan (MultiHash)  (cost=734.00..734.00 rows=40000
>> > > width=37)
>> > >                hash keys: bid
>> > >                nBatches: 1  Buckets: 46000  Memory Usage: 49.99%
>> > >                ->  Seq Scan on t2  (cost=0.00..734.00 rows=40000
>> > > width=37)
>> > > (13 rows)
>> > >
>> > > Thanks,
>> > > --
>> > > NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
>> > > <kaigai@ak.jp.nec.com>
>> > >
>> > >
>> > > > -----Original Message-----
>> > > > From: Robert Haas [mailto:robertmhaas@gmail.com]
>> > > > Sent: Thursday, January 15, 2015 2:07 AM
>> > > > To: Kaigai Kouhei(海外 浩平)
>> > > > Cc: Tom Lane; pgsql-hackers@postgreSQL.org; Shigeru Hanada
>> > > > Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS]
>> > > > [v9.5] Custom Plan API)
>> > > >
>> > > > On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com>
>> > > wrote:
>> > > > > When custom-scan node replaced a join-plan, it shall have at least
>> > > > > two child plan-nodes. The callback handler of PlanCustomPath needs
>> > > > > to be able to call create_plan_recurse() to transform the underlying
>> > > > > path-nodes to plan-nodes, because this custom-scan node may take
>> > > > > other built-in scan or sub-join nodes as its inner/outer input.
>> > > > > In case of FDW, it shall kick any underlying scan relations to
>> > > > > remote side, thus we may not expect ForeignScan has underlying plans...
>> > > >
>> > > > Do you have an example of this?
>> > > >
>> > > > --
>> > > > Robert Haas
>> > > > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
>> > > > Company
>> > >
>> > > --
>> > > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make
>> > > changes to your subscription:
>> > > http://www.postgresql.org/mailpref/pgsql-hackers



--
Shigeru HANADA

Attachment

pgsql-hackers by date:

Previous
From: Asif Naeem
Date:
Subject: Re: Install shared libs in lib/ and bin/ with MSVC (Was: install libpq.dll in bin directory on Windows / Cygwin)
Next
From: Andres Freund
Date:
Subject: Re: autogenerated column names + views are a dump hazard