join pushdown and issue with foreign update - Mailing list pgsql-hackers

From Alexander Pyhalov
Subject join pushdown and issue with foreign update
Date
Msg-id 7817fb9ebd6661cdf9b67dec6e129a78@postgrespro.ru
Whole thread Raw
Responses Re: join pushdown and issue with foreign update
List pgsql-hackers
Hi.

There's issue with join pushdown after

commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed Mar 31 11:52:34 2021 -0400

     Rework planning and execution of UPDATE and DELETE

To make sure that join pushdown path selected, one can patch
contrib/postgres_fdw/postgres_fdw.c in the following way:

diff --git a/contrib/postgres_fdw/postgres_fdw.c 
b/contrib/postgres_fdw/postgres_fdw.c
index c48a421e88b..c2bf6833050 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -5959,6 +5959,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
         /* Estimate costs for bare join relation */
         estimate_path_cost_size(root, joinrel, NIL, NIL, NULL,
                                                         &rows, &width, 
&startup_cost, &total_cost);
+
+       startup_cost = total_cost = 0;
         /* Now update this information in the joinrel */
         joinrel->rows = rows;
         joinrel->reltarget->width = width;

Now, this simple test shows the issue:

create extension postgres_fdw;

DO $d$
     BEGIN
         EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER 
postgres_fdw
             OPTIONS (dbname '$$||current_database()||$$',
                      port '$$||current_setting('port')||$$')$$;
    END;
$d$;

CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;

CREATE TABLE base_tbl (a int, b int);
CREATE FOREIGN TABLE remote_tbl (a int, b int)
   SERVER loopback OPTIONS (table_name 'base_tbl');

insert into remote_tbl select generate_series(1,100), 
generate_series(1,100);

explain verbose update remote_tbl d set a= case when current_timestamp> 
'2012-02-02'::timestamp then 5 else 6 end FROM remote_tbl AS t (a, b) 
WHERE d.a = (t.a);
                                                                          
                                                 QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.remote_tbl d  (cost=0.00..42.35 rows=0 width=0)
    Remote SQL: UPDATE public.base_tbl SET a = $2 WHERE ctid = $1
    ->  Foreign Scan  (cost=0.00..42.35 rows=8470 width=74)
          Output: CASE WHEN (CURRENT_TIMESTAMP > '2012-02-02 
00:00:00'::timestamp without time zone) THEN 5 ELSE 6 END, d.ctid, d.*, 
t.*
          Relations: (public.remote_tbl d) INNER JOIN (public.remote_tbl 
t)
          Remote SQL: SELECT r1.ctid, CASE WHEN (r1.*)::text IS NOT NULL 
THEN ROW(r1.a, r1.b) END, CASE WHEN (r2.*)::text IS NOT NULL THEN 
ROW(r2.a, r2.b) END FROM (public.base_tbl r1 INNER JOIN public.base_tbl 
r2 ON (((r1.a = r2.a)))) FOR UPDATE OF r1
          ->  Merge Join  (cost=433.03..566.29 rows=8470 width=70)
                Output: d.ctid, d.*, t.*
                Merge Cond: (d.a = t.a)
                ->  Sort  (cost=211.00..214.10 rows=1241 width=42)
                      Output: d.ctid, d.*, d.a
                      Sort Key: d.a
                      ->  Foreign Scan on public.remote_tbl d  
(cost=100.00..147.23 rows=1241 width=42)
                            Output: d.ctid, d.*, d.a
                            Remote SQL: SELECT a, b, ctid FROM 
public.base_tbl FOR UPDATE
                ->  Sort  (cost=222.03..225.44 rows=1365 width=36)
                      Output: t.*, t.a
                      Sort Key: t.a
                      ->  Foreign Scan on public.remote_tbl t  
(cost=100.00..150.95 rows=1365 width=36)
                            Output: t.*, t.a
                            Remote SQL: SELECT a, b FROM public.base_tbl
update remote_tbl d set a= case when current_timestamp> 
'2012-02-02'::timestamp then 5 else 6 end FROM remote_tbl AS t (a, b) 
WHERE d.a = (t.a);

You'll get
ERROR:  input of anonymous composite types is not implemented
CONTEXT:  whole-row reference to foreign table "remote_tbl"

make_tuple_from_result_row() (called by fetch_more_data()), will try to 
call InputFunctionCall() for ROW(r1.a, r1.b) and will get error in 
record_in().

Here ROW(r2.a, r2.b) would have attribute type id, corresponding to 
remote_tbl, but ROW(r1.a, r1.b) would have atttypid 2249 (RECORD).

Before 86dc90056dfdbd9d1b891718d2e5614e3e432f35 the plan would be 
different and looked like


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.remote_tbl d  (cost=0.00..73.54 rows=14708 width=46)
    Remote SQL: UPDATE public.base_tbl SET a = $2 WHERE ctid = $1
    ->  Foreign Scan  (cost=0.00..73.54 rows=14708 width=46)
          Output: CASE WHEN (CURRENT_TIMESTAMP > '2012-02-02 
00:00:00'::timestamp without time zone) THEN d.a ELSE 6 END, d.b, 
d.ctid, t.*
          Relations: (public.remote_tbl d) INNER JOIN (public.remote_tbl 
t)
          Remote SQL: SELECT r1.a, r1.b, r1.ctid, CASE WHEN (r2.*)::text 
IS NOT NULL THEN ROW(r2.a, r2.b) END FROM (public.base_tbl r1 INNER JOIN 
public.base_tbl r2 ON (((r1.a = r2.a)))) FOR UPDATE OF r1
          ->  Merge Join  (cost=516.00..747.39 rows=14708 width=46)
                Output: d.a, d.b, d.ctid, t.*
                Merge Cond: (d.a = t.a)
                ->  Sort  (cost=293.97..299.35 rows=2155 width=14)
                      Output: d.a, d.b, d.ctid
                      Sort Key: d.a
                      ->  Foreign Scan on public.remote_tbl d  
(cost=100.00..174.65 rows=2155 width=14)
                            Output: d.a, d.b, d.ctid
                            Remote SQL: SELECT a, b, ctid FROM 
public.base_tbl FOR UPDATE
                ->  Sort  (cost=222.03..225.44 rows=1365 width=36)
                      Output: t.*, t.a
                      Sort Key: t.a
                      ->  Foreign Scan on public.remote_tbl t  
(cost=100.00..150.95 rows=1365 width=36)
                            Output: t.*, t.a
                            Remote SQL: SELECT a, b FROM public.base_tbl

Here ROW(r2.a, r2.b) would have attribute type id, corresponding to 
remote_tbl.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Incorrect snapshots while promoting hot standby node when 2PC is used
Next
From: Dilip Kumar
Date:
Subject: Re: Decoding speculative insert with toast leaks memory