Re: Join push-down support for foreign tables - Mailing list pgsql-hackers

From Shigeru Hanada
Subject Re: Join push-down support for foreign tables
Date
Msg-id CAEZqfEepwyW3K55oserRmMTh4twW1eBEND9uJ382Cc9NwLJ=Fg@mail.gmail.com
Whole thread Raw
In response to Re: Join push-down support for foreign tables  (Shigeru Hanada <shigeru.hanada@gmail.com>)
Responses Re: Join push-down support for foreign tables
Re: Join push-down support for foreign tables
List pgsql-hackers
Attached is the revised/rebased version of the $SUBJECT.

This patch is based on Kaigai-san's custom/foreign join patch, so
please apply it before this patch.  In this version I changed some
points from original postgres_fdw.

1) Disabled SELECT clause optimization
~9.4 postgres_fdw lists only columns actually used in SELECT clause,
but AFAIS it makes SQL generation complex.  So I disabled such
optimization and put "NULL" for unnecessary columns in SELECT clause
of remote query.

2) Extended deparse context
To allow deparsing based on multiple source relations, I added some
members to context structure.  They are unnecessary for simple query
with single foreign table, but IMO it should be integrated.

With Kaigai-san's advise, changes for supporting foreign join on
postgres_fdw is minimized into postgres_fdw itself.  But I added new
FDW API named GetForeignJoinPaths() to keep the policy that all
interface between core and FDW should be in FdwRoutine, instead of
using hook function.  Now I'm writing document about it, and will post
it in a day.

2015-02-19 16:19 GMT+09:00 Shigeru Hanada <shigeru.hanada@gmail.com>:
> 2015-02-17 10:39 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>> Let me put some comments in addition to where you're checking now.
>>
>> [design issues]
>> * Cost estimation
>> Estimation and evaluation of cost for remote join query is not an
>> obvious issue. In principle, local side cannot determine the cost
>> to run remote join without remote EXPLAIN, because local side has
>> no information about JOIN logic applied on the remote side.
>> Probably, we have to put an assumption for remote join algorithm,
>> because local planner has no idea about remote planner's choice
>> unless foreign-join don't take "use_remote_estimate".
>> I think, it is reasonable assumption (even if it is incorrect) to
>> calculate remote join cost based on local hash-join algorithm.
>> If user wants more correct estimation, remote EXPLAIN will make
>> more reliable cost estimation.
>
> Hm, I guess that you chose hash-join as "least-costed join".  In the
> pgbench model, most combination between two tables generate hash join
> as cheapest path.  Remote EXPLAIN is very expensive in the context of
> planning, so it would easily make the plan optimization meaningless.
> But giving an option to users is good, I agree.
>
>>
>> It also needs a consensus whether cost for remote CPU execution is
>> equivalent to local CPU. If we think local CPU is rare resource
>> than remote one, a discount rate will make planner more preferable
>> to choose remote join than local one
>
> Something like cpu_cost_ratio as a new server-level FDW option?
>
>>
>> Once we assume a join algorithm for remote join, unit cost for
>> remote CPU, we can calculate a cost for foreign join based on
>> the local join logic plus cost for network translation (maybe
>> fdw_tuple_cost?).
>
> Yes, sum of these costs is the total cost of a remote join.
>     o fdw_startup_cost
>     o hash-join cost, estimated as a local join
>     o fdw_tuple_cost * rows * width
>
>> * FDW options
>> Unlike table scan, FDW options we should refer is unclear.
>> Table level FDW options are associated with a foreign table as
>> literal. I think we have two options here:
>> 1. Foreign-join refers FDW options for foreign-server, but ones
>>    for foreign-tables are ignored.
>> 2. Foreign-join is prohibited when both of relations don't have
>>    identical FDW options.
>> My preference is 2. Even though N-way foreign join, it ensures
>> all the tables involved with (N-1)-way foreign join has identical
>> FDW options, thus it leads we can make N-way foreign join with
>> all identical FDW options.
>> One exception is "updatable" flag of postgres_fdw. It does not
>> make sense on remote join, so I think mixture of updatable and
>> non-updatable foreign tables should be admitted, however, it is
>> a decision by FDW driver.
>>
>> Probably, above points need to take time for getting consensus.
>> I'd like to see your opinion prior to editing your patch.
>
> postgres_fdw can't push down a join which contains foreign tables on
> multiple servers, so use_remote_estimate and fdw_startup_cost  are the
> only FDW options to consider.  So we have options for each option.
>
> 1-a. If all foreign tables in the join has identical
> use_remote_estimate, allow pushing down.
> 1-b. If any of foreign table in the join has true as
> use_remote_estimate, use remote estimate.
>
> 2-a. If all foreign tables in the join has identical fdw_startup_cost,
> allow pushing down.
> 2-b. Always use max value in the join. (cost would be more expensive)
> 2-c. Always use min value in the join.  (cost would be cheaper)
>
> I prefer 1-a and 2-b, so more joins avoid remote EXPLAIN but have
> reasonable cost about startup.
>
> I agree about "updatable" option.
>
>>
>> [implementation issues]
>> The interface does not intend to add new Path/Plan type for each scan
>> that replaces foreign joins. What postgres_fdw should do is, adding
>> ForeignPath towards a particular joinrel, then it populates ForeignScan
>> with remote join query once it got chosen by the planner.
>
> That idea is interesting, and make many things simpler.  Please let me consider.
>
>>
>> A few functions added in src/backend/foreign/foreign.c are not
>> called by anywhere, at this moment.
>>
>> create_plan_recurse() is reverted to static. It is needed for custom-
>> join enhancement, if no other infrastructure can support.
>
> I made it back to static because I thought that create_plan_recurse
> can be called by core before giving control to FDWs.  But I'm not sure
> it can be applied to custom scans.  I'll recheck that part.
>
>
> --
> Shigeru HANADA



--
Shigeru HANADA

Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: remove pg_standby?
Next
From: Greg Stark
Date:
Subject: Re: Providing catalog view to pg_hba.conf file - Patch submission