Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs) - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Date
Msg-id CAFjFpRfB6S4vWMjfBKM+gngXBXgw+SY0ssdUD2Yt21=oVSi2FQ@mail.gmail.com
Whole thread Raw
In response to Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Thom Brown <thom@linux.com>)
Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
Hi All,
PFA patches for postgres_fdw join pushdown, taken care of all TODOs in my last mail.

Here is the list of things that have been improved/added new as compared to Hanada-san's previous patch at [1].

1. Condition handling for join
Patch in [1] allowed a foreign join to be pushed down if only all the conditions were safe to push down to the foreign server. This patch differentiates these conditions into 1. conditions to be applied while joining (ON clause) 2. conditions to be applied after joining (WHERE clause). For a join to be safe to pushdown, only conditions in 1 need to be all safe to pushdown. The conditions in second category, which are not safe to be pushed down can be applied locally. This allows more avenue for join pushdown. For an INNER join all the conditions can be applied on the cross product. Hence we can push down an INNER join even if one or more of the conditions are not safe to be pushed down. This patch includes the optimization as well.

2. Targetlist handling:
The columns required to evaluate the non-pushable conditions on a join relation need to be fetched from the foreign server. In previous patch the SELECT clauses were built from rel->reltargetlist, which doesn't contain these columns. This patch includes those columns as well.

3. Column projection:
Earlier patch required another layer of SQL to project whole-row attribute from a base relation. This patch takes care of that while constructing and deparsing
targetlist. This reduces the complexity and length of the query to be sent to the foreign server e.g.

With the projection in previous patch the query looked like
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
                                              QUERY PLAN                                                                                                                                                                                                                                                   
... explain output clipped
               Remote SQL: SELECT l.a1, l.a2, l.a3, l.a4, r.a1 FROM (SELECT l.a7, ROW(l.a10, l.a11, l.a12, l.a13, l.a14, l.a15, l.a16, l.a17), l.a10, l.a12 FROM (SELECT "C 1" a10, c2 a11, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17, ctid a7 FROM "S 1"."T 1") l) l (a1, a2, a3, a4) INNER JOIN (SELECT ROW(r.a9, r.a10, r.a12, r.a13, r.a14, r.a15, r.a16, r.a17), r.a9 FROM (SELECT "C 1" a9, c2 a10, c3 a12, c4 a13, c5 a14, c6 a15, c7 a16, c8 a17 FROM "S 1"."T 1") r) r (a1, a2) ON ((l.a3 = r.a2))

With this patch it looks like
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.ctid, t1, t2, t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
                                               QUERY PLAN                                                                                                                                              
... explain output clipped
               Remote SQL: SELECT l.a3, l.a4, l.a1, l.a2, r.a2 FROM (SELECT "C 1", c3, ctid, ROW("C 1", c2, c3, c4, c5, c6, c7, c8) FROM "S 1"."T 1") l (a1, a2, a3, a4) INNER JOIN (SELECT "C 1", ROW("C 1", c2, c3, c4, c5, c6, c7, c8) FROM "S 1"."T 1") r (a1, a2) ON (TRUE) WHERE ((l.a1 = r.a1))
(9 rows)

4. Local cost estimation
Previous patch had a TODO left for estimating join cost locally, when use_remote_estimate is false. This patch adds support for the same. The relevant
discussion in mail thread [2], [3].

5. This patch adds a GUC enable_foreignjoin to enable or disable join pushdown through core.

6. Added more tests to test lateral references, unsafe to push conditions at various places in the query,

Many cosmetic improvements like adding static function declarations, comment improvements and making code readable.

[1] http://www.postgresql.org/message-id/CAEZqfEe9KGy=1_waGh2rgZPg0o4pqgD+iauYaj8wTze+CYJUHg@mail.gmail.com
[2] http://www.postgresql.org/message-id/CAFjFpRcqSwUs+tb5iyp1M3c-w0k3xaB6H5mw4+N2q2iuAfSzKA@mail.gmail.com
[3] http://www.postgresql.org/message-id/CAFjFpRepSC2e3mZ1uYSopJD6R19fOZ0dNNf9Z=gnyKSB6wGk5g@mail.gmail.com

I will be working next on (in that order)
1. eval_plan_qual fix for foreign join. (Considered as a must-have for 9.6)
2. Pushing down ORDER BY clause along with join pushdown
3. Parameterization of foreign join paths (Given the complexity of the feature this may not make it into 9.6)

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

Previous
From: "Shulgin, Oleksandr"
Date:
Subject: Re: More stable query plans via more predictable column statistics
Next
From: David Rowley
Date:
Subject: Re: Combining Aggregates