Re: Function scan FDW pushdown - Mailing list pgsql-hackers

From Alexander Pyhalov
Subject Re: Function scan FDW pushdown
Date
Msg-id b91abe323f12f4210a53feb1ead3ac0d@postgrespro.ru
Whole thread
In response to Re: Function scan FDW pushdown  (solaimurugan vellaipandiyan <drsolaimurugan.v@gmail.com>)
List pgsql-hackers
solaimurugan vellaipandiyan писал(а) 2026-05-08 08:17:
> Hi,
> 
> I tested the v3 patch on current master using a postgres_fdw loopback 
> setup.
> 
> The patch mostly applied cleanly for me, with only a small manual
> conflict resolution needed in postgres_fdw.c (missing
> optimizer/clauses.h include). PostgreSQL built and started
> successfully after that.
> 
> For testing, I used queries involving generate_series() together with
> a foreign table, for example:
> 
> SELECT *
> FROM ft
> JOIN generate_series(1,3) g
> ON ft.id = g;
> 
> I also tried:
> 1. implicit join syntax
> 2. LATERAL usage
> 3. disabling hashjoin/mergejoin to force different join paths
> 
> In all cases, the plans still showed a local Function Scan and local
> join execution. The Remote SQL remained:
> SELECT id FROM public.localtab
> So I wasn't able to observe function scan pushdown with these 
> testcases.
> Maybe I'm missing a query shape or planner condition required to
> trigger the new functionality. It would help to know which cases are
> currently expected to be pushed down.
> 
> Regards,
> Solaimurugan V


Hi. Of course, this depends on foreign scan cost. Foreign join should be 
cheaper than local one.
The cost of foreign join consists from
1) cost of evaluating inner and outer part of the join,
2) total costs of retreiving rows (nrows * cost of qual evaluation),
3) cost of preparing to evaluate quals (remote and local),
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.
Look at estimate_path_cost_size().

The cost of retreiving foreign rows consists from
1) cost of estimating baserestrictinfos
2) cost of remote sequential scan
3) cost of evaluating foreign relation target list
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.

If we compare function scan pushdown for foreign table and function with 
foreign join pushdown of two foreign tables, we don't have  two 
penalties from fdw_startup_cost here. So, for function scan pushdown to 
be chosen, it should filter out significant part of data, as data 
transfer cost likely dominates in foreign scan costs estimation. You can 
see this on the following toy example.


CREATE EXTENSION postgres_fdw;

SELECT current_database() AS current_database,
   current_setting('port') AS current_port \gset
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
         OPTIONS (dbname :'current_database', port :'current_port');
CREATE USER MAPPING FOR current_user SERVER loopback;
CREATE table t1 (i int, j text);
create foreign table ft1 (i int, j text) server loopback options 
(table_name 't1');

-- No adequate data about relation, we assume 1000+ rows in table, so 
foreign join is chosen.

#  explain select * from ft1 , generate_series(1,10) g where i=g;
                       QUERY PLAN
------------------------------------------------------
  Foreign Scan  (cost=100.00..172.16 rows=68 width=40)
    Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)

-- We have actual data, there's no use to push down foreign join
postgres=# analyze ft1;
ANALYZE
postgres=# explain select * from ft1 , generate_series(1,10) g where 
i=g;
                                  QUERY PLAN
-----------------------------------------------------------------------------
  Hash Join  (cost=100.02..100.16 rows=1 width=40)
    Hash Cond: (g.g = ft1.i)
    ->  Function Scan on generate_series g  (cost=0.00..0.10 rows=10 
width=4)
    ->  Hash  (cost=100.00..100.00 rows=1 width=36)
          ->  Foreign Scan on ft1  (cost=100.00..100.00 rows=1 width=36)

-- If we had two foreign scans below join here, its startup cost would 
be more than 2*100.
-- But evaluating function locally saves us from such great penalty.
-- Let's add some actual data.

postgres=# insert into ft1 select i, 'test'||i from 
generate_series(1,1000) i;
INSERT 0 1000
postgres=# analyze ft1;
ANALYZE

-- Now we have actual data, transfering 1000 rows (plain foreign scan 
costs ~ 410) and joining locally is more expensive than doing
-- remote join and transfering 10 rows.

postgres=# explain select * from ft1 , generate_series(1,10) g where 
i=g;
                       QUERY PLAN
------------------------------------------------------
  Foreign Scan  (cost=100.00..143.20 rows=10 width=15)
    Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)


-- 
Best regards,
Alexander Pyhalov,
Postgres Professional



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Include schema-qualified names in publication error messages.
Next
From: Michael Paquier
Date:
Subject: Support for 8-byte TOAST values, round two