Thread: Nested loop join condition does not get pushed down to foreign scan

Nested loop join condition does not get pushed down to foreign scan

From
Albe Laurenz
Date:
I just noticed something surprising:

-- create a larger local table
CREATE TABLE llarge (id integer NOT NULL, val integer NOT NULL);
INSERT INTO llarge SELECT i, i%100 FROM generate_series(1, 10000) i;
ALTER TABLE llarge ADD PRIMARY KEY (id);

-- create a small local table
CREATE TABLE small (id integer PRIMARY KEY, val text NOT NULL);
INSERT INTO small VALUES (1, 'one');

-- create a foreign table based on llarge
CREATE EXTENSION postgres_fdw;
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '5432', dbname 'test');
CREATE USER MAPPING FOR myself SERVER loopback OPTIONS (user 'myself', password 'mypassword');
CREATE FOREIGN TABLE rlarge (id integer NOT NULL, val integer NOT NULL) SERVER loopback OPTIONS (table_name 'llarge');

SET enable_hashjoin = off;
-- plan for a nested loop join with a local table
EXPLAIN (COSTS off) SELECT * FROM small JOIN llarge USING (id);                 QUERY PLAN
----------------------------------------------Nested Loop  ->  Seq Scan on small  ->  Index Scan using llarge_pkey on
llarge       Index Cond: (id = small.id)
 
(4 rows)

-- plan for a nested loop join with a foreign table
EXPLAIN (COSTS off) SELECT * FROM small JOIN rlarge USING (id);             QUERY PLAN
---------------------------------------Nested Loop  Join Filter: (small.id = rlarge.id)  ->  Seq Scan on small  ->
ForeignScan on rlarge
 
(4 rows)


Is there a fundamental reason why the join condition does not get pushed down into
the foreign scan or is that an omission that can easily be fixed?

Yours,
Laurenz Albe

Re: Nested loop join condition does not get pushed down to foreign scan

From
Ashutosh Bapat
Date:
On Tue, Sep 13, 2016 at 4:05 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> I just noticed something surprising:
>
> -- create a larger local table
> CREATE TABLE llarge (id integer NOT NULL, val integer NOT NULL);
> INSERT INTO llarge SELECT i, i%100 FROM generate_series(1, 10000) i;
> ALTER TABLE llarge ADD PRIMARY KEY (id);
>
> -- create a small local table
> CREATE TABLE small (id integer PRIMARY KEY, val text NOT NULL);
> INSERT INTO small VALUES (1, 'one');
>
> -- create a foreign table based on llarge
> CREATE EXTENSION postgres_fdw;
> CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '5432', dbname 'test');
> CREATE USER MAPPING FOR myself SERVER loopback OPTIONS (user 'myself', password 'mypassword');
> CREATE FOREIGN TABLE rlarge (id integer NOT NULL, val integer NOT NULL) SERVER loopback OPTIONS (table_name
'llarge');
>
> SET enable_hashjoin = off;
> -- plan for a nested loop join with a local table
> EXPLAIN (COSTS off) SELECT * FROM small JOIN llarge USING (id);
>                   QUERY PLAN
> ----------------------------------------------
>  Nested Loop
>    ->  Seq Scan on small
>    ->  Index Scan using llarge_pkey on llarge
>          Index Cond: (id = small.id)
> (4 rows)
>
> -- plan for a nested loop join with a foreign table
> EXPLAIN (COSTS off) SELECT * FROM small JOIN rlarge USING (id);
>               QUERY PLAN
> ---------------------------------------
>  Nested Loop
>    Join Filter: (small.id = rlarge.id)
>    ->  Seq Scan on small
>    ->  Foreign Scan on rlarge
> (4 rows)
>
>
> Is there a fundamental reason why the join condition does not get pushed down into
> the foreign scan or is that an omission that can easily be fixed?
>

While creating the foreign table, if you specify use_remote_estimate =
true for the table OR do so for the foreign server, postgres_fdw
creates parameterized paths for that foreign relation. If using a
parameterized path reduces cost of the join, it will use a nested loop
join with inner relation parameterized by the outer relation, pushing
join conditions down into the foreign scan.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company