From: Bruce Momjian
Subject: Re: Why is DEFAULT_FDW_TUPLE_COST so insanely low?
Msg-id: ZT2dBCFrx89lLLgM@momjian.us
In response to: Why is DEFAULT_FDW_TUPLE_COST so insanely low? (David Rowley <dgrowleyml@gmail.com>)
List: pgsql-hackers

Has anything been done about this issue?

---------------------------------------------------------------------------

On Wed, Aug  3, 2022 at 02:56:12AM +1200, David Rowley wrote:
> Over on [1] I was complaining that I thought DEFAULT_FDW_TUPLE_COST,
> which is defined as 0.01 was unrealistically low.
> 
> For comparison, cpu_tuple_cost, something we probably expect to be in
> a CPU cache is also 0.01.  We've defined DEFAULT_PARALLEL_TUPLE_COST
> to be 0.1, which is 10x cpu_tuple_cost.  That's coming from a shared
> memory segment.  So why do we think DEFAULT_FDW_TUPLE_COST should be
> the same as cpu_tuple_cost when that's probably pulling a tuple from
> some remote server over some (possibly slow) network?
> 
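For reference, the defaults being compared can be checked from psql; the
values noted below are the stock settings at the time of this thread, and
DEFAULT_FDW_TUPLE_COST itself is not a GUC but the compiled-in default for
postgres_fdw's per-server fdw_tuple_cost option:

SHOW cpu_tuple_cost;       -- 0.01 by default
SHOW parallel_tuple_cost;  -- 0.1 by default
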
> I did a little experiment in the attached .sql file and did some maths
> to try to figure out what it's really likely to be costing us. I tried
> this with and without the attached hack to have the planner not
> consider remote grouping, just to see how much slower pulling a million
> tuples through the FDW would be.
> 
> I set up a loopback server on localhost (which has about the lowest
> possible network latency) and found the patched query to the foreign
> server took:
> 
> Execution Time: 530.000 ms
> 
> This is pulling all one million tuples over and doing the aggregate locally.
> 
> Unpatched, the query took:
> 
> Execution Time: 35.334 ms
> 
> so about 15x faster.
> 
> If I take the seqscan cost for querying the local table, which is
> 14425.00, multiply that by 15 (the extra time it took to pull the 1
> million tuples), then divide by 1 million to get the extra cost per
> tuple, that comes to about 0.216.  So that says
> DEFAULT_FDW_TUPLE_COST is about 21x lower than it should be.
> 
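As a back-of-envelope restatement of the arithmetic above, using only the
figures quoted in the message:

-- 14425.00 = local seqscan cost, 15 = slowdown factor, 1,000,000 tuples pulled
SELECT 14425.00 * 15 / 1000000 AS approx_cost_per_foreign_tuple;  -- ~0.216

which is roughly 21x the current 0.01 default.
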
> I tried cranking DEFAULT_FDW_TUPLE_COST up to 0.5 to see what plans
> would change in the postgres_fdw regression tests and quite a number
> changed. Many seem to be pushing the sorts down to the remote server
> where they were being done locally before. A few others just seem
> weird. For example, the first one seems to be blindly adding a remote
> sort when it does no good. I think it would take quite a bit of study
> with a debugger to figure out what's going on with many of these.
> 
> Does anyone have any ideas why DEFAULT_FDW_TUPLE_COST was set so low?
> 
> Does anyone object to it being set to something more realistic?
> 
> David
> 
> [1] https://www.postgresql.org/message-id/CAApHDvpXiXLxg4TsA8P_4etnuGQqAAbHWEOM4hGe=DCaXmi_jA@mail.gmail.com

> diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
> index 64632db73c..b4e3b91d7f 100644
> --- a/src/backend/optimizer/plan/planner.c
> +++ b/src/backend/optimizer/plan/planner.c
> @@ -3921,7 +3921,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
>       * If there is an FDW that's responsible for all baserels of the query,
>       * let it consider adding ForeignPaths.
>       */
> -    if (grouped_rel->fdwroutine &&
> +    if (0 && grouped_rel->fdwroutine &&
>          grouped_rel->fdwroutine->GetForeignUpperPaths)
>          grouped_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_GROUP_AGG,
>                                                        input_rel, grouped_rel,

> ALTER SYSTEM SET max_parallel_workers_per_gather = 0;
> SELECT pg_reload_conf();
> 
> CREATE EXTENSION postgres_fdw;
> CREATE EXTENSION pg_prewarm;
> 
> 
> DO $d$
>     BEGIN
>         EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
>             OPTIONS (dbname '$$||current_database()||$$',
>                      port '$$||current_setting('port')||$$'
>             )$$;
>     END;
> $d$;
> 
> CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
> 
> CREATE TABLE public.t (a INT);
> INSERT INTO t SELECT x FROM generate_series(1,1000000) x;
> VACUUM FREEZE ANALYZE t;
> SELECT pg_prewarm('t');
> 
> CREATE FOREIGN TABLE ft (
>     a INT
> ) SERVER loopback  OPTIONS (schema_name 'public', table_name 't');
> 
> EXPLAIN (ANALYZE) SELECT COUNT(*) FROM ft;
> EXPLAIN (ANALYZE) SELECT COUNT(*) FROM t;
> 
> DROP FOREIGN TABLE ft;
> DROP TABLE t;
> DROP SERVER loopback CASCADE;
> ALTER SYSTEM RESET max_parallel_workers_per_gather;
> SELECT pg_reload_conf();
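
As an aside, the 0.5 value mentioned earlier does not require recompiling to
experiment with: postgres_fdw exposes fdw_tuple_cost as a per-server option,
so it can be overridden on the loopback server created by the script above.
A minimal sketch, run before the cleanup statements at the end of the script:

-- Override the compiled-in DEFAULT_FDW_TUPLE_COST for this server only.
ALTER SERVER loopback OPTIONS (ADD fdw_tuple_cost '0.5');
EXPLAIN (VERBOSE) SELECT COUNT(*) FROM ft;  -- check whether the plan shape changes
ALTER SERVER loopback OPTIONS (DROP fdw_tuple_cost);  -- revert to the built-in default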

> --- "expected\\postgres_fdw.out"    2022-08-03 01:34:42.806967000 +1200
> +++ "results\\postgres_fdw.out"    2022-08-03 02:33:40.719712900 +1200
> @@ -2164,8 +2164,8 @@
>  -- unsafe conditions on one side (c8 has a UDT), not pushed down.
>  EXPLAIN (VERBOSE, COSTS OFF)
>  SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
> -                                 QUERY PLAN                                  
> ------------------------------------------------------------------------------
> +                                                          QUERY PLAN
> +------------------------------------------------------------------------------------------------------------------------------
>   Limit
>     Output: t1.c1, t2.c1, t1.c3
>     ->  Sort
> @@ -2182,7 +2182,7 @@
>                       ->  Foreign Scan on public.ft1 t1
>                             Output: t1.c1, t1.c3
>                             Filter: (t1.c8 = 'foo'::user_enum)
> -                           Remote SQL: SELECT "C 1", c3, c8 FROM "S 1"."T 1"
> +                           Remote SQL: SELECT "C 1", c3, c8 FROM "S 1"."T 1" ORDER BY c3 ASC NULLS LAST, "C 1" ASC NULLS LAST
>  (17 rows)
>  
>  SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
> @@ -2873,13 +2873,13 @@
>   Sort
>     Output: (sum(c1)), c2
>     Sort Key: (sum(ft1.c1))
> -   ->  HashAggregate
> +   ->  GroupAggregate
>           Output: sum(c1), c2
>           Group Key: ft1.c2
>           Filter: (avg((ft1.c1 * ((random() <= '1'::double precision))::integer)) > '100'::numeric)
>           ->  Foreign Scan on public.ft1
>                 Output: c1, c2
> -               Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1"
> +               Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" ORDER BY c2 ASC NULLS LAST
>  (10 rows)
>  
>  -- Remote aggregate in combination with a local Param (for the output
> @@ -3123,12 +3123,12 @@
>   Sort
>     Output: (sum(c1) FILTER (WHERE ((((c1 / c1))::double precision * random()) <= '1'::double precision))), c2
>     Sort Key: (sum(ft1.c1) FILTER (WHERE ((((ft1.c1 / ft1.c1))::double precision * random()) <= '1'::double precision)))
> -   ->  HashAggregate
> +   ->  GroupAggregate
>           Output: sum(c1) FILTER (WHERE ((((c1 / c1))::double precision * random()) <= '1'::double precision)), c2
>           Group Key: ft1.c2
>           ->  Foreign Scan on public.ft1
>                 Output: c1, c2
> -               Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1"
> +               Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" ORDER BY c2 ASC NULLS LAST
>  (9 rows)
>  
>  explain (verbose, costs off)
> @@ -3885,24 +3885,21 @@
>  -- subquery using stable function (can't be sent to remote)
>  PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
>  EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st2(10, 20);
> -                                                QUERY PLAN                                                
> -----------------------------------------------------------------------------------------------------------
> - Sort
> +                                                            QUERY PLAN
> +----------------------------------------------------------------------------------------------------------------------------------
> + Nested Loop Semi Join
>     Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> -   Sort Key: t1.c1
> -   ->  Nested Loop Semi Join
> +   Join Filter: (t1.c3 = t2.c3)
> +   ->  Foreign Scan on public.ft1 t1
>           Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> -         Join Filter: (t1.c3 = t2.c3)
> -         ->  Foreign Scan on public.ft1 t1
> -               Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> -               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20))
> -         ->  Materialize
> +         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20)) ORDER BY "C 1" ASC NULLS LAST
> +   ->  Materialize
> +         Output: t2.c3
> +         ->  Foreign Scan on public.ft2 t2
>                 Output: t2.c3
> -               ->  Foreign Scan on public.ft2 t2
> -                     Output: t2.c3
> -                     Filter: (date(t2.c4) = '01-17-1970'::date)
> -                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
> -(15 rows)
> +               Filter: (date(t2.c4) = '01-17-1970'::date)
> +               Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
> +(12 rows)
>  
>  EXECUTE st2(10, 20);
>   c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8  
> @@ -9381,21 +9378,19 @@
>  -- test FOR UPDATE; partitionwise join does not apply
>  EXPLAIN (COSTS OFF)
>  SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1;
> -                          QUERY PLAN                          
> ---------------------------------------------------------------
> +                       QUERY PLAN                       
> +--------------------------------------------------------
>   LockRows
> -   ->  Sort
> -         Sort Key: t1.a
> -         ->  Hash Join
> -               Hash Cond: (t2.b = t1.a)
> +   ->  Nested Loop
> +         Join Filter: (t1.a = t2.b)
> +         ->  Append
> +               ->  Foreign Scan on ftprt1_p1 t1_1
> +               ->  Foreign Scan on ftprt1_p2 t1_2
> +         ->  Materialize
>                 ->  Append
>                       ->  Foreign Scan on ftprt2_p1 t2_1
>                       ->  Foreign Scan on ftprt2_p2 t2_2
> -               ->  Hash
> -                     ->  Append
> -                           ->  Foreign Scan on ftprt1_p1 t1_1
> -                           ->  Foreign Scan on ftprt1_p2 t1_2
> -(12 rows)
> +(10 rows)
>  
>  SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1;
>    a  |  b  
> @@ -9430,18 +9425,16 @@
>  SET enable_partitionwise_aggregate TO false;
>  EXPLAIN (COSTS OFF)
>  SELECT a, sum(b), min(b), count(*) FROM pagg_tab GROUP BY a HAVING avg(b) < 22 ORDER BY 1;
> -                        QUERY PLAN                         
> ------------------------------------------------------------
> - Sort
> -   Sort Key: pagg_tab.a
> -   ->  HashAggregate
> -         Group Key: pagg_tab.a
> -         Filter: (avg(pagg_tab.b) < '22'::numeric)
> -         ->  Append
> -               ->  Foreign Scan on fpagg_tab_p1 pagg_tab_1
> -               ->  Foreign Scan on fpagg_tab_p2 pagg_tab_2
> -               ->  Foreign Scan on fpagg_tab_p3 pagg_tab_3
> -(9 rows)
> +                     QUERY PLAN                      
> +-----------------------------------------------------
> + GroupAggregate
> +   Group Key: pagg_tab.a
> +   Filter: (avg(pagg_tab.b) < '22'::numeric)
> +   ->  Append
> +         ->  Foreign Scan on fpagg_tab_p1 pagg_tab_1
> +         ->  Foreign Scan on fpagg_tab_p2 pagg_tab_2
> +         ->  Foreign Scan on fpagg_tab_p3 pagg_tab_3
> +(7 rows)
>  
>  -- Plan with partitionwise aggregates is enabled
>  SET enable_partitionwise_aggregate TO true;
> @@ -9475,34 +9468,32 @@
>  -- Should have all the columns in the target list for the given relation
>  EXPLAIN (VERBOSE, COSTS OFF)
>  SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1;
> -                               QUERY PLAN                               
> -------------------------------------------------------------------------
> - Sort
> -   Output: t1.a, (count(((t1.*)::pagg_tab)))
> +                                         QUERY PLAN                                         
> +--------------------------------------------------------------------------------------------
> + Merge Append
>     Sort Key: t1.a
> -   ->  Append
> -         ->  HashAggregate
> -               Output: t1.a, count(((t1.*)::pagg_tab))
> -               Group Key: t1.a
> -               Filter: (avg(t1.b) < '22'::numeric)
> -               ->  Foreign Scan on public.fpagg_tab_p1 t1
> -                     Output: t1.a, t1.*, t1.b
> -                     Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1
> -         ->  HashAggregate
> -               Output: t1_1.a, count(((t1_1.*)::pagg_tab))
> -               Group Key: t1_1.a
> -               Filter: (avg(t1_1.b) < '22'::numeric)
> -               ->  Foreign Scan on public.fpagg_tab_p2 t1_1
> -                     Output: t1_1.a, t1_1.*, t1_1.b
> -                     Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2
> -         ->  HashAggregate
> -               Output: t1_2.a, count(((t1_2.*)::pagg_tab))
> -               Group Key: t1_2.a
> -               Filter: (avg(t1_2.b) < '22'::numeric)
> -               ->  Foreign Scan on public.fpagg_tab_p3 t1_2
> -                     Output: t1_2.a, t1_2.*, t1_2.b
> -                     Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3
> -(25 rows)
> +   ->  GroupAggregate
> +         Output: t1.a, count(((t1.*)::pagg_tab))
> +         Group Key: t1.a
> +         Filter: (avg(t1.b) < '22'::numeric)
> +         ->  Foreign Scan on public.fpagg_tab_p1 t1
> +               Output: t1.a, t1.*, t1.b
> +               Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1 ORDER BY a ASC NULLS LAST
> +   ->  GroupAggregate
> +         Output: t1_1.a, count(((t1_1.*)::pagg_tab))
> +         Group Key: t1_1.a
> +         Filter: (avg(t1_1.b) < '22'::numeric)
> +         ->  Foreign Scan on public.fpagg_tab_p2 t1_1
> +               Output: t1_1.a, t1_1.*, t1_1.b
> +               Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2 ORDER BY a ASC NULLS LAST
> +   ->  GroupAggregate
> +         Output: t1_2.a, count(((t1_2.*)::pagg_tab))
> +         Group Key: t1_2.a
> +         Filter: (avg(t1_2.b) < '22'::numeric)
> +         ->  Foreign Scan on public.fpagg_tab_p3 t1_2
> +               Output: t1_2.a, t1_2.*, t1_2.b
> +               Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3 ORDER BY a ASC NULLS LAST
> +(23 rows)
>  
>  SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1;
>   a  | count 
> @@ -9518,24 +9509,23 @@
>  -- When GROUP BY clause does not match with PARTITION KEY.
>  EXPLAIN (COSTS OFF)
>  SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING sum(a) < 700 ORDER BY 1;
> -                           QUERY PLAN                            
> ------------------------------------------------------------------
> - Sort
> -   Sort Key: pagg_tab.b
> -   ->  Finalize HashAggregate
> -         Group Key: pagg_tab.b
> -         Filter: (sum(pagg_tab.a) < 700)
> -         ->  Append
> -               ->  Partial HashAggregate
> -                     Group Key: pagg_tab.b
> -                     ->  Foreign Scan on fpagg_tab_p1 pagg_tab
> -               ->  Partial HashAggregate
> -                     Group Key: pagg_tab_1.b
> -                     ->  Foreign Scan on fpagg_tab_p2 pagg_tab_1
> -               ->  Partial HashAggregate
> -                     Group Key: pagg_tab_2.b
> -                     ->  Foreign Scan on fpagg_tab_p3 pagg_tab_2
> -(15 rows)
> +                        QUERY PLAN                         
> +-----------------------------------------------------------
> + Finalize GroupAggregate
> +   Group Key: pagg_tab.b
> +   Filter: (sum(pagg_tab.a) < 700)
> +   ->  Merge Append
> +         Sort Key: pagg_tab.b
> +         ->  Partial GroupAggregate
> +               Group Key: pagg_tab.b
> +               ->  Foreign Scan on fpagg_tab_p1 pagg_tab
> +         ->  Partial GroupAggregate
> +               Group Key: pagg_tab_1.b
> +               ->  Foreign Scan on fpagg_tab_p2 pagg_tab_1
> +         ->  Partial GroupAggregate
> +               Group Key: pagg_tab_2.b
> +               ->  Foreign Scan on fpagg_tab_p3 pagg_tab_2
> +(14 rows)
>  
>  -- ===================================================================
>  -- access rights and superuser


-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.


