[idea] more aggressive join pushdown on postgres_fdw - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject [idea] more aggressive join pushdown on postgres_fdw
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F8010F20AD@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
Responses Re: [idea] more aggressive join pushdown on postgres_fdw  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

Yesterday, JPUG held an unconference event at Tokyo, and
Hanada-san had a talk about join-pushdown feature of
postgres_fdw.
At this talk, someone proposed an interesting idea to
make join pushdown more aggressive/effective.
Let me share it with pgsql-hackers.

He said, we may have a workload to join a large foreign-
scan and a small local-scan regardless of the plan type.

For example: joinrel (expected nrows = 5)   + outerrel ForeignScan (expected nrows = 1000000)   + innerrel LocalScan
(expectednrows = 5) 

In this case, we may be able to run the entire joinrel
on the remote side then fetch just 5 rows, if fdw-driver
construct VALUES() clause according to the contents of
LocalScan then makes an entire join query with another
one kept in ForeignScan.

If above ForeignScan have the following remote query, SELECT a, b, c FROM t0 WHERE d < 1000000
we may be able to construct the query below to run remote
join with local (small) relation.
 SELECT a, b, c, x, y FROM   (SELECT a, b, c FROM t0 WHERE d < 1000000) AS ft   JOIN   (VALUES (1,'aaa'), (2,'bbb'),
(3,'ccc'),          (4,'ddd'), (5,'eee')) AS lt (x, y)   ON ft.a = lt.x 

The VALUES clauses can be mechanically constructed according
to the result set of LocalScan, and it is not difficult to
make such a remote query on top of the existing ForeignScan.
In the result, it will reduce amount of network traffic and
CPU cycles to form/deform tuples dramatically.

I don't intend to implement this idea urgently (of course,
join pushdown for both ForeignScan case has higher priority),
however, it makes sense to keep the future direction in mind.

Also, as an aside, even though Hanada-san mentioned ForeignScan
does not need an infrastructure to initialize child path nodes,
this idea may require ForeignScan to have local child path.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Next
From: David Steele
Date:
Subject: Re: [CORE] postpone next week's release