Re: Problem while updating a foreign table pointing to apartitioned table on foreign server - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Problem while updating a foreign table pointing to apartitioned table on foreign server
Date
Msg-id 20180808.173050.191913193.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server
List pgsql-hackers
Hello. Please find the attached.

At Fri, 3 Aug 2018 11:48:38 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRcF-j+B8W8o-wrvOguA0=r8SJ-rCrzWAnHT2V66NxGfFQ@mail.gmail.com>
> The purpose of non-Var node is to avoid adding the attribute to
> relation descriptor. Idea is to create a new node, which will act as a
> place holder for table oid or row id (whatever) to be fetched from the
> foreign server. I don't understand why do you think we need it to be
> added to the relation descriptor.

I choosed to expand tuple descriptor for junk column added to
foreign relaions. We might be better to have new member in
ForeignScan but I didn't so that we can backpatch it.

What the patch does are:

- This abuses ForeignScan->fdw_scan_tlist to store the additional
  junk columns when foreign simple relation scan (that is, not a
  join).

  Several places seems to be assuming that fdw_scan_tlist may be
  used foreign scan on simple relation but I didn't find that
  actually happens. This let us avoid adding new data members to
  core data structure. Separate member would be preferable for
  new version.

- The remote OID request is added to targetlist as a non-system
  junk column. get_relation_info exands per-column storage in
  creating RelOptInfo so that the additional junk columns can be
  handled.

- ExecInitForeignScan is changed so that it expands created tuple
  descriptor if it finds the junk columns stored in
  fdw_scan_tlist so that make_tuple_from_result_row can store
  them. ( ExecEvalWholeRowVar needed to modify so that it ignores
  the expanded portion of tuple descriptor.)

I'm not sure whether the following ponits are valid.

- If fdw_scan_tlist is used for simple relation scans, this would
  break the case. (ExecInitForeignScan,  set_foreignscan_references)

- I'm using the name "tableoid" for the junk column but it also
  can be used in user query. The two points to different targets
  so it doesn't matter at all, except that it makes a bit
  confusing explain output.

- Explain stuff doesn't have a crue for the name of the added
  junk. It is shown as <added_junk> in EXPLAIN output.

| Update on public.fp
|   Remote SQL: UPDATE public.p SET b = $3 WHERE tableoid = $1 AND ctid = $2
|   ->  Foreign Scan on public.fp
|         Output: a, (b + 1), "<added_junk>", ctid
|         Filter: (random() <= '1'::double precision)
|         Remote SQL: SELECT a, b, tableoid AS __remote_tableoid, ctid
|                     FROM public.p WHERE ((a = 0)) FOR UPDATE

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From fe660a5ab953d68a671861479fce0b3e60a57cd8 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 8 Aug 2018 12:14:58 +0900
Subject: [PATCH 2/2] Regression test for update/delete on foreign partitioned
 table

Add test for foreign update on remote partitioned tables.
---
 contrib/postgres_fdw/expected/postgres_fdw.out | 221 ++++++++++++++++---------
 contrib/postgres_fdw/sql/postgres_fdw.sql      |  27 +++
 2 files changed, 167 insertions(+), 81 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f5498c62bd..9ae329ab4f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -5497,15 +5497,15 @@ INSERT INTO ft2 (c1,c2,c3)
   SELECT id, id % 10, to_char(id, 'FM00000') FROM generate_series(2001, 2010) id;
 EXPLAIN (verbose, costs off)
 UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;            -- can't be pushed down
-                                                QUERY PLAN                                                
-----------------------------------------------------------------------------------------------------------
+                                                         QUERY PLAN
    
 

+----------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $2 WHERE ctid = $1 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3, c4, c5, c6,
c7,c8
 
    ->  Foreign Scan on public.ft2
-         Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, ctid
+         Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, "<added_junk>", ctid
          Filter: (postgres_fdw_abs(ft2.c1) > 2000)
-         Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" FOR UPDATE
+         Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, tableoid, ctid FROM "S 1"."T 1" FOR UPDATE
 (7 rows)
 
 UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;
@@ -5532,13 +5532,13 @@ UPDATE ft2 SET c3 = 'baz'

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.ft2
    Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2,
ft5.c3
-   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $2 WHERE ctid = $1 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8
+   Remote SQL: UPDATE "S 1"."T 1" SET c3 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3, c4, c5, c6,
c7,c8
 
    ->  Nested Loop
-         Output: ft2.c1, ft2.c2, NULL::integer, 'baz'::text, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.ctid, ft4.*,
ft5.*,ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2, ft5.c3
 
+         Output: ft2.c1, ft2.c2, NULL::integer, 'baz'::text, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8,
ft2."<added_junk>",ft2.ctid, ft4.*, ft5.*, ft4.c1, ft4.c2, ft4.c3, ft5.c1, ft5.c2, ft5.c3
 
          Join Filter: (ft2.c2 === ft4.c1)
          ->  Foreign Scan on public.ft2
-               Output: ft2.c1, ft2.c2, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.ctid
-               Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR
UPDATE
+               Output: ft2.c1, ft2.c2, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2."<added_junk>", ft2.ctid
+               Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, tableoid, ctid FROM "S 1"."T 1" WHERE (("C 1" >
2000))FOR UPDATE
 
          ->  Foreign Scan
                Output: ft4.*, ft4.c1, ft4.c2, ft4.c3, ft5.*, ft5.c1, ft5.c2, ft5.c3
                Relations: (public.ft4) INNER JOIN (public.ft5)
@@ -5570,24 +5570,24 @@ DELETE FROM ft2
   USING ft4 INNER JOIN ft5 ON (ft4.c1 === ft5.c1)
   WHERE ft2.c1 > 2000 AND ft2.c2 = ft4.c1
   RETURNING ft2.c1, ft2.c2, ft2.c3;       -- can't be pushed down
-
                                             QUERY PLAN
                                                                                                   
 

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
                                                   QUERY PLAN
                                                                                                                
 

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft2
    Output: ft2.c1, ft2.c2, ft2.c3
-   Remote SQL: DELETE FROM "S 1"."T 1" WHERE ctid = $1 RETURNING "C 1", c2, c3
+   Remote SQL: DELETE FROM "S 1"."T 1" WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3
    ->  Foreign Scan
-         Output: ft2.ctid, ft4.*, ft5.*
+         Output: ft2."<added_junk>", ft2.ctid, ft4.*, ft5.*
          Filter: (ft4.c1 === ft5.c1)
          Relations: ((public.ft2) INNER JOIN (public.ft4)) INNER JOIN (public.ft5)
-         Remote SQL: SELECT r1.ctid, CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2.c1, r2.c2, r2.c3) END, CASE WHEN
(r3.*)::textIS NOT NULL THEN ROW(r3.c1, r3.c2, r3.c3) END, r2.c1, r3.c1 FROM (("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2
ON(((r1.c2 = r2.c1)) AND ((r1."C 1" > 2000)))) INNER JOIN "S 1"."T 4" r3 ON (TRUE)) FOR UPDATE OF r1
 
+         Remote SQL: SELECT r1.tableoid, r1.ctid, CASE WHEN (r2.*)::text IS NOT NULL THEN ROW(r2.c1, r2.c2, r2.c3)
END,CASE WHEN (r3.*)::text IS NOT NULL THEN ROW(r3.c1, r3.c2, r3.c3) END, r2.c1, r3.c1 FROM (("S 1"."T 1" r1 INNER JOIN
"S1"."T 3" r2 ON (((r1.c2 = r2.c1)) AND ((r1."C 1" > 2000)))) INNER JOIN "S 1"."T 4" r3 ON (TRUE)) FOR UPDATE OF r1
 
          ->  Nested Loop
-               Output: ft2.ctid, ft4.*, ft5.*, ft4.c1, ft5.c1
+               Output: ft2."<added_junk>", ft2.ctid, ft4.*, ft5.*, ft4.c1, ft5.c1
                ->  Nested Loop
-                     Output: ft2.ctid, ft4.*, ft4.c1
+                     Output: ft2."<added_junk>", ft2.ctid, ft4.*, ft4.c1
                      Join Filter: (ft2.c2 = ft4.c1)
                      ->  Foreign Scan on public.ft2
-                           Output: ft2.ctid, ft2.c2
-                           Remote SQL: SELECT c2, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
+                           Output: ft2."<added_junk>", ft2.ctid, ft2.c2
+                           Remote SQL: SELECT c2, tableoid, ctid FROM "S 1"."T 1" WHERE (("C 1" > 2000)) FOR UPDATE
                      ->  Foreign Scan on public.ft4
                            Output: ft4.*, ft4.c1
                            Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3"
@@ -6229,13 +6229,13 @@ SELECT * FROM foreign_tbl;
 
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 5;
-                                      QUERY PLAN                                       
----------------------------------------------------------------------------------------
+                                            QUERY PLAN                                            
+--------------------------------------------------------------------------------------------------
  Update on public.foreign_tbl
-   Remote SQL: UPDATE public.base_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+   Remote SQL: UPDATE public.base_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl."<added_junk>", foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
 (5 rows)
 
 UPDATE rw_view SET b = b + 5; -- should fail
@@ -6243,13 +6243,13 @@ ERROR:  new row violates check option for view "rw_view"
 DETAIL:  Failing row contains (20, 20).
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 15;
-                                      QUERY PLAN                                       
----------------------------------------------------------------------------------------
+                                            QUERY PLAN                                             
+---------------------------------------------------------------------------------------------------
  Update on public.foreign_tbl
-   Remote SQL: UPDATE public.base_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+   Remote SQL: UPDATE public.base_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl."<added_junk>", foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.base_tbl WHERE ((a < b)) FOR UPDATE
 (5 rows)
 
 UPDATE rw_view SET b = b + 15; -- ok
@@ -6316,14 +6316,14 @@ SELECT * FROM foreign_tbl;
 
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 5;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                             QUERY PLAN                                              
+-----------------------------------------------------------------------------------------------------
  Update on public.parent_tbl
    Foreign Update on public.foreign_tbl
-     Remote SQL: UPDATE public.child_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+     Remote SQL: UPDATE public.child_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 5), foreign_tbl."<added_junk>", foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
 (6 rows)
 
 UPDATE rw_view SET b = b + 5; -- should fail
@@ -6331,14 +6331,14 @@ ERROR:  new row violates check option for view "rw_view"
 DETAIL:  Failing row contains (20, 20).
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE rw_view SET b = b + 15;
-                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
+                                             QUERY PLAN                                              
+-----------------------------------------------------------------------------------------------------
  Update on public.parent_tbl
    Foreign Update on public.foreign_tbl
-     Remote SQL: UPDATE public.child_tbl SET b = $2 WHERE ctid = $1 RETURNING a, b
+     Remote SQL: UPDATE public.child_tbl SET b = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING a, b
    ->  Foreign Scan on public.foreign_tbl
-         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl.ctid
-         Remote SQL: SELECT a, b, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
+         Output: foreign_tbl.a, (foreign_tbl.b + 15), foreign_tbl."<added_junk>", foreign_tbl.ctid
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.child_tbl WHERE ((a < b)) FOR UPDATE
 (6 rows)
 
 UPDATE rw_view SET b = b + 15; -- ok
@@ -6808,13 +6808,13 @@ BEFORE UPDATE ON rem1
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 EXPLAIN (verbose, costs off)
 UPDATE rem1 set f2 = '';          -- can't be pushed down
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
  Update on public.rem1
-   Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1
+   Remote SQL: UPDATE public.loc1 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Foreign Scan on public.rem1
-         Output: f1, ''::text, ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: f1, ''::text, "<added_junk>", ctid, rem1.*
+         Remote SQL: SELECT f1, f2, tableoid, ctid FROM public.loc1 FOR UPDATE
 (5 rows)
 
 EXPLAIN (verbose, costs off)
@@ -6832,13 +6832,13 @@ AFTER UPDATE ON rem1
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 EXPLAIN (verbose, costs off)
 UPDATE rem1 set f2 = '';          -- can't be pushed down
-                                  QUERY PLAN                                   
--------------------------------------------------------------------------------
+                                           QUERY PLAN                                            
+-------------------------------------------------------------------------------------------------
  Update on public.rem1
-   Remote SQL: UPDATE public.loc1 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2
+   Remote SQL: UPDATE public.loc1 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2
    ->  Foreign Scan on public.rem1
-         Output: f1, ''::text, ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: f1, ''::text, "<added_junk>", ctid, rem1.*
+         Remote SQL: SELECT f1, f2, tableoid, ctid FROM public.loc1 FOR UPDATE
 (5 rows)
 
 EXPLAIN (verbose, costs off)
@@ -6866,13 +6866,13 @@ UPDATE rem1 set f2 = '';          -- can be pushed down
 
 EXPLAIN (verbose, costs off)
 DELETE FROM rem1;                 -- can't be pushed down
-                             QUERY PLAN                              
----------------------------------------------------------------------
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
  Delete on public.rem1
-   Remote SQL: DELETE FROM public.loc1 WHERE ctid = $1
+   Remote SQL: DELETE FROM public.loc1 WHERE tableoid = $1 AND ctid = $2
    ->  Foreign Scan on public.rem1
-         Output: ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: "<added_junk>", ctid, rem1.*
+         Remote SQL: SELECT f1, f2, tableoid, ctid FROM public.loc1 FOR UPDATE
 (5 rows)
 
 DROP TRIGGER trig_row_before_delete ON rem1;
@@ -6890,13 +6890,13 @@ UPDATE rem1 set f2 = '';          -- can be pushed down
 
 EXPLAIN (verbose, costs off)
 DELETE FROM rem1;                 -- can't be pushed down
-                               QUERY PLAN                               
-------------------------------------------------------------------------
+                                        QUERY PLAN                                        
+------------------------------------------------------------------------------------------
  Delete on public.rem1
-   Remote SQL: DELETE FROM public.loc1 WHERE ctid = $1 RETURNING f1, f2
+   Remote SQL: DELETE FROM public.loc1 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2
    ->  Foreign Scan on public.rem1
-         Output: ctid, rem1.*
-         Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
+         Output: "<added_junk>", ctid, rem1.*
+         Remote SQL: SELECT f1, f2, tableoid, ctid FROM public.loc1 FOR UPDATE
 (5 rows)
 
 DROP TRIGGER trig_row_after_delete ON rem1;
@@ -7147,12 +7147,12 @@ select * from bar where f1 in (select f1 from foo) for share;
 -- Check UPDATE with inherited target and an inherited source table
 explain (verbose, costs off)
 update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Hash Join
          Output: bar.f1, (bar.f2 + 100), bar.ctid, foo.ctid, foo.*, foo.tableoid
          Inner Unique: true
@@ -7171,12 +7171,12 @@ update bar set f2 = f2 + 100 where f1 in (select f1 from foo);
                                  Output: foo2.ctid, foo2.*, foo2.tableoid, foo2.f1
                                  Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct1
    ->  Hash Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, foo.ctid, foo.*, foo.tableoid
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2."<added_junk>", bar2.ctid, foo.ctid, foo.*, foo.tableoid
          Inner Unique: true
          Hash Cond: (bar2.f1 = foo.f1)
          ->  Foreign Scan on public.bar2
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-               Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+               Output: bar2.f1, bar2.f2, bar2.f3, bar2."<added_junk>", bar2.ctid
+               Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 FOR UPDATE
          ->  Hash
                Output: foo.ctid, foo.*, foo.tableoid, foo.f1
                ->  HashAggregate
@@ -7208,12 +7208,12 @@ update bar set f2 = f2 + 100
 from
   ( select f1 from foo union all select f1+3 from foo ) ss
 where bar.f1 = ss.f1;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
+                                            QUERY PLAN                                            
+--------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2
    ->  Hash Join
          Output: bar.f1, (bar.f2 + 100), bar.ctid, (ROW(foo.f1))
          Hash Cond: (foo.f1 = bar.f1)
@@ -7233,14 +7233,14 @@ where bar.f1 = ss.f1;
                ->  Seq Scan on public.bar
                      Output: bar.f1, bar.f2, bar.ctid
    ->  Merge Join
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, (ROW(foo.f1))
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2."<added_junk>", bar2.ctid, (ROW(foo.f1))
          Merge Cond: (bar2.f1 = foo.f1)
          ->  Sort
-               Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
+               Output: bar2.f1, bar2.f2, bar2.f3, bar2."<added_junk>", bar2.ctid
                Sort Key: bar2.f1
                ->  Foreign Scan on public.bar2
-                     Output: bar2.f1, bar2.f2, bar2.f3, bar2.ctid
-                     Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+                     Output: bar2.f1, bar2.f2, bar2.f3, bar2."<added_junk>", bar2.ctid
+                     Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 FOR UPDATE
          ->  Sort
                Output: (ROW(foo.f1)), foo.f1
                Sort Key: foo.f1
@@ -7438,17 +7438,17 @@ AFTER UPDATE OR DELETE ON bar2
 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
 explain (verbose, costs off)
 update bar set f2 = f2 + 100;
-                                      QUERY PLAN                                      
---------------------------------------------------------------------------------------
+                                               QUERY PLAN                                               
+--------------------------------------------------------------------------------------------------------
  Update on public.bar
    Update on public.bar
    Foreign Update on public.bar2
-     Remote SQL: UPDATE public.loct2 SET f2 = $2 WHERE ctid = $1 RETURNING f1, f2, f3
+     Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2, f3
    ->  Seq Scan on public.bar
          Output: bar.f1, (bar.f2 + 100), bar.ctid
    ->  Foreign Scan on public.bar2
-         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2.ctid, bar2.*
-         Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 FOR UPDATE
+         Output: bar2.f1, (bar2.f2 + 100), bar2.f3, bar2."<added_junk>", bar2.ctid, bar2.*
+         Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 FOR UPDATE
 (9 rows)
 
 update bar set f2 = f2 + 100;
@@ -7466,18 +7466,18 @@ NOTICE:  trig_row_after(23, skidoo) AFTER ROW UPDATE ON bar2
 NOTICE:  OLD: (7,277,77),NEW: (7,377,77)
 explain (verbose, costs off)
 delete from bar where f2 < 400;
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                                              QUERY PLAN                                               
+-------------------------------------------------------------------------------------------------------
  Delete on public.bar
    Delete on public.bar
    Foreign Delete on public.bar2
-     Remote SQL: DELETE FROM public.loct2 WHERE ctid = $1 RETURNING f1, f2, f3
+     Remote SQL: DELETE FROM public.loct2 WHERE tableoid = $1 AND ctid = $2 RETURNING f1, f2, f3
    ->  Seq Scan on public.bar
          Output: bar.ctid
          Filter: (bar.f2 < 400)
    ->  Foreign Scan on public.bar2
-         Output: bar2.ctid, bar2.*
-         Remote SQL: SELECT f1, f2, f3, ctid FROM public.loct2 WHERE ((f2 < 400)) FOR UPDATE
+         Output: bar2."<added_junk>", bar2.ctid, bar2.*
+         Remote SQL: SELECT f1, f2, f3, tableoid, ctid FROM public.loct2 WHERE ((f2 < 400)) FOR UPDATE
 (10 rows)
 
 delete from bar where f2 < 400;
@@ -7568,6 +7568,65 @@ drop table loct1;
 drop table loct2;
 drop table parent;
 -- ===================================================================
+-- test update foreign partiton table
+-- ===================================================================
+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+NOTICE:  merging column "a" with inherited definition
+NOTICE:  merging column "b" with inherited definition
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+NOTICE:  merging column "a" with inherited definition
+NOTICE:  merging column "b" with inherited definition
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid, ctid, * FROM fp1;
+ tableoid | ctid  | a | b 
+----------+-------+---+---
+    16638 | (0,1) | 0 | 1
+    16638 | (0,1) | 1 | 1
+(2 rows)
+
+-- random() causes non-direct foreign update
+EXPLAIN VERBOSE UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Update on public.fp1  (cost=100.00..144.31 rows=3 width=18)
+   Remote SQL: UPDATE public.p1 SET b = $3 WHERE tableoid = $1 AND ctid = $2
+   ->  Foreign Scan on public.fp1  (cost=100.00..144.31 rows=3 width=18)
+         Output: a, (b + 1), "<added_junk>", ctid
+         Filter: (random() <= '1'::double precision)
+         Remote SQL: SELECT a, b, tableoid, ctid FROM public.p1 WHERE ((a = 0)) FOR UPDATE
+(6 rows)
+
+UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+SELECT tableoid, ctid, * FROM fp1; -- Only one tuple should be updated
+ tableoid | ctid  | a | b 
+----------+-------+---+---
+    16638 | (0,2) | 0 | 2
+    16638 | (0,1) | 1 | 1
+(2 rows)
+
+-- Reset ctid
+TRUNCATE c1;
+TRUNCATE c2;
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+DELETE FROM fp1 WHERE a = 1 and random() <= 1;
+SELECT tableoid, ctid, * FROM fp1; -- Only one tuple should be deleted
+ tableoid | ctid  | a | b 
+----------+-------+---+---
+    16638 | (0,1) | 0 | 1
+(1 row)
+
+-- cleanup
+DROP FOREIGN TABLE fp1;
+DROP TABLE p1 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table c1
+drop cascades to table c2
+-- ===================================================================
 -- test tuple routing for foreign-table partitions
 -- ===================================================================
 -- Test insert tuple routing
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index e1b955f3f0..7b9dc027a0 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -1846,6 +1846,33 @@ drop table loct1;
 drop table loct2;
 drop table parent;
 
+-- ===================================================================
+-- test update foreign partiton table
+-- ===================================================================
+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid, ctid, * FROM fp1;
+-- random() causes non-direct foreign update
+EXPLAIN VERBOSE UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1;
+SELECT tableoid, ctid, * FROM fp1; -- Only one tuple should be updated
+-- Reset ctid
+TRUNCATE c1;
+TRUNCATE c2;
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+DELETE FROM fp1 WHERE a = 1 and random() <= 1;
+SELECT tableoid, ctid, * FROM fp1; -- Only one tuple should be deleted
+
+-- cleanup
+DROP FOREIGN TABLE fp1;
+DROP TABLE p1 CASCADE;
+
 -- ===================================================================
 -- test tuple routing for foreign-table partitions
 -- ===================================================================
-- 
2.16.3

From b7ef61b5fe14392fc2288ebe553d368fe83923d5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 8 Aug 2018 12:15:04 +0900
Subject: [PATCH 1/2] Fix foreign update on remote partitioned tables

postgres_fdw's non-direct foreign update was using only ctid to
identify the remote tuple in the second update step. This can cause
false updates/deletes on the remote side. This patch lets foreign
scans to use remote table oid along with ctid as remote tuple
identifier.
---
 contrib/file_fdw/file_fdw.c            |   2 +-
 contrib/postgres_fdw/deparse.c         | 135 +++++++++++++++------------
 contrib/postgres_fdw/postgres_fdw.c    | 161 +++++++++++++++++++++++++--------
 src/backend/executor/execExprInterp.c  |  41 +++++++--
 src/backend/executor/nodeForeignscan.c |  44 ++++++++-
 src/backend/foreign/foreign.c          |  13 ++-
 src/backend/optimizer/plan/setrefs.c   |   2 +-
 src/backend/optimizer/util/plancat.c   |  41 ++++++++-
 src/backend/utils/adt/ruleutils.c      |   6 +-
 src/include/foreign/foreign.h          |   3 +-
 10 files changed, 330 insertions(+), 118 deletions(-)

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 2cf09aecf6..4c03700191 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -453,7 +453,7 @@ get_file_fdw_attribute_options(Oid relid)
         if (attr->attisdropped)
             continue;
 
-        options = GetForeignColumnOptions(relid, attnum);
+        options = GetForeignColumnOptions(relid, attnum, false);
         foreach(lc, options)
         {
             DefElem    *def = (DefElem *) lfirst(lc);
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6001f4d25e..9e5b0e3cc0 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1088,6 +1088,42 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
     }
 }
 
+/*
+ * Adds one element in target/returning list if it is in attrs_used.
+ *
+ * If deparsestr is given, just use it. Otherwise resolves the name using rte.
+ */
+static inline void
+deparseAddTargetListItem(StringInfo buf,
+                         List **retrieved_attrs, Bitmapset *attrs_used,
+                         Index rtindex, AttrNumber attnum,
+                         char *deparsestr, RangeTblEntry *rte,
+                         bool is_returning, bool qualify_col,
+                         bool have_wholerow, bool *first)
+{
+    if (!have_wholerow &&
+        !bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, attrs_used))
+        return;
+
+    if (!*first)
+        appendStringInfoString(buf, ", ");
+    else if (is_returning)
+        appendStringInfoString(buf, " RETURNING ");
+    *first = false;
+
+    if (deparsestr)
+    {
+        if (qualify_col)
+            ADD_REL_QUALIFIER(buf, rtindex);
+
+        appendStringInfoString(buf, deparsestr);
+    }
+    else
+        deparseColumnRef(buf, rtindex, attnum, rte, qualify_col);
+    
+    *retrieved_attrs = lappend_int(*retrieved_attrs, attnum);
+}
+
 /*
  * Emit a target list that retrieves the columns specified in attrs_used.
  * This is used for both SELECT and RETURNING targetlists; the is_returning
@@ -1128,58 +1164,27 @@ deparseTargetList(StringInfo buf,
         if (attr->attisdropped)
             continue;
 
-        if (have_wholerow ||
-            bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
-                          attrs_used))
-        {
-            if (!first)
-                appendStringInfoString(buf, ", ");
-            else if (is_returning)
-                appendStringInfoString(buf, " RETURNING ");
-            first = false;
-
-            deparseColumnRef(buf, rtindex, i, rte, qualify_col);
-
-            *retrieved_attrs = lappend_int(*retrieved_attrs, i);
-        }
+        deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                                 rtindex, i, NULL, rte,
+                                 is_returning, qualify_col, have_wholerow,
+                                 &first);
     }
 
     /*
-     * Add ctid and oid if needed.  We currently don't support retrieving any
-     * other system columns.
+     * Add ctid, oid and tableoid if needed. The attribute name and number are
+     * assigned in postgresAddForeignUpdateTargets.
      */
-    if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
-
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "ctid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       SelfItemPointerAttributeNumber);
-    }
-    if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber,
-                      attrs_used))
-    {
-        if (!first)
-            appendStringInfoString(buf, ", ");
-        else if (is_returning)
-            appendStringInfoString(buf, " RETURNING ");
-        first = false;
-
-        if (qualify_col)
-            ADD_REL_QUALIFIER(buf, rtindex);
-        appendStringInfoString(buf, "oid");
-
-        *retrieved_attrs = lappend_int(*retrieved_attrs,
-                                       ObjectIdAttributeNumber);
-    }
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, tupdesc->natts + 1, "tableoid",
+                             NULL, is_returning, qualify_col, false, &first);
+    
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, SelfItemPointerAttributeNumber, "ctid",
+                             NULL, is_returning, qualify_col, false, &first);
+    
+    deparseAddTargetListItem(buf, retrieved_attrs, attrs_used,
+                             rtindex, ObjectIdAttributeNumber, "oid",
+                             NULL, is_returning, qualify_col, false, &first);
 
     /* Don't generate bad syntax if no undropped columns */
     if (first && !is_returning)
@@ -1728,7 +1733,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
     deparseRelation(buf, rel);
     appendStringInfoString(buf, " SET ");
 
-    pindex = 2;                    /* ctid is always the first param */
+    pindex = 3;                    /* tableoid and ctid always precede */
     first = true;
     foreach(lc, targetAttrs)
     {
@@ -1742,7 +1747,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
         appendStringInfo(buf, " = $%d", pindex);
         pindex++;
     }
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_update_after_row,
@@ -1858,7 +1863,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
 {
     appendStringInfoString(buf, "DELETE FROM ");
     deparseRelation(buf, rel);
-    appendStringInfoString(buf, " WHERE ctid = $1");
+    appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
 
     deparseReturningList(buf, rte, rtindex, rel,
                          rel->trigdesc && rel->trigdesc->trig_delete_after_row,
@@ -2033,7 +2038,7 @@ deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs)
 
         /* Use attribute name or column_name option. */
         colname = NameStr(TupleDescAttr(tupdesc, i)->attname);
-        options = GetForeignColumnOptions(relid, i + 1);
+        options = GetForeignColumnOptions(relid, i + 1, false);
 
         foreach(lc, options)
         {
@@ -2160,7 +2165,7 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
     }
     else
     {
-        char       *colname = NULL;
+        const char *colname = NULL;
         List       *options;
         ListCell   *lc;
 
@@ -2171,7 +2176,7 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
          * If it's a column of a foreign table, and it has the column_name FDW
          * option, use that value.
          */
-        options = GetForeignColumnOptions(rte->relid, varattno);
+        options = GetForeignColumnOptions(rte->relid, varattno, true);
         foreach(lc, options)
         {
             DefElem    *def = (DefElem *) lfirst(lc);
@@ -2188,11 +2193,29 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
          * FDW option, use attribute name.
          */
         if (colname == NULL)
-            colname = get_attname(rte->relid, varattno, false);
+            colname = get_attname(rte->relid, varattno, true);
+
+        if (colname == NULL)
+        {
+            /*
+             * This may be additional junk column. Make sure it is that.
+             * We must already have required lock on the relation.
+             */
+            Relation rel = heap_open(rte->relid, NoLock);
+            int natts = RelationGetNumberOfAttributes(rel);
+            heap_close(rel, NoLock);
+
+            /* XX: shouldn't we use the same message with get_attname? */
+            if (varattno != natts + 1)
+                elog(ERROR, "name resolution failed for attribute %d of relation %u",
+                     varattno, rte->relid);
+                
+            colname = "tableoid";
+        }
 
         if (qualify_col)
             ADD_REL_QUALIFIER(buf, varno);
-
+        
         appendStringInfoString(buf, quote_identifier(colname));
     }
 }
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 0803c30a48..162fbeed48 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -179,6 +179,7 @@ typedef struct PgFdwModifyState
 
     /* info about parameters for prepared statement */
     AttrNumber    ctidAttno;        /* attnum of input resjunk ctid column */
+    AttrNumber    toidAttno;        /* attnum of input resjunk tableoid column */
     int            p_nums;            /* number of parameters to transmit */
     FmgrInfo   *p_flinfo;        /* output conversion functions for them */
 
@@ -392,6 +393,7 @@ static PgFdwModifyState *create_foreign_modify(EState *estate,
                       List *retrieved_attrs);
 static void prepare_foreign_modify(PgFdwModifyState *fmstate);
 static const char **convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot);
 static void store_returning_result(PgFdwModifyState *fmstate,
@@ -1140,10 +1142,13 @@ postgresGetForeignPlan(PlannerInfo *root,
     List       *fdw_recheck_quals = NIL;
     List       *retrieved_attrs;
     StringInfoData sql;
-    ListCell   *lc;
 
     if (IS_SIMPLE_REL(foreignrel))
     {
+        Relation frel;
+        int         base_nattrs;
+        ListCell *lc;
+
         /*
          * For base relations, set scan_relid as the relid of the relation.
          */
@@ -1191,6 +1196,29 @@ postgresGetForeignPlan(PlannerInfo *root,
          * should recheck all the remote quals.
          */
         fdw_recheck_quals = remote_exprs;
+
+        /*
+         * We may have put tableoid junk column to the targetlist. Add the
+         * junk column to fdw_scan_tlist so that core can take care of it.  We
+         * should have only one junk column but we don't premise that here.
+         */
+        frel = heap_open(foreigntableid, NoLock);
+        base_nattrs = RelationGetNumberOfAttributes(frel);
+        heap_close(frel, NoLock);
+        
+        foreach (lc, root->parse->targetList)
+        {
+            TargetEntry *tle = lfirst_node(TargetEntry, lc);
+            Var *var = (Var *) tle->expr;
+
+            /*
+             * We need only additional non-system junk vars for the scanned
+             * relation here
+             */
+            if (tle->resjunk && IsA(var, Var) &&
+                base_nattrs < var->varattno && var->varno == scan_relid)
+                fdw_scan_tlist = lappend(fdw_scan_tlist, tle);
+        }
     }
     else
     {
@@ -1383,16 +1411,12 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
      * into local representation and error reporting during that process.
      */
     if (fsplan->scan.scanrelid > 0)
-    {
         fsstate->rel = node->ss.ss_currentRelation;
-        fsstate->tupdesc = RelationGetDescr(fsstate->rel);
-    }
     else
-    {
         fsstate->rel = NULL;
-        fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
-    }
 
+    /* We use the tuple descriptor privided by core */
+    fsstate->tupdesc = node->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
     fsstate->attinmeta = TupleDescGetAttInMetadata(fsstate->tupdesc);
 
     /*
@@ -1541,14 +1565,41 @@ postgresAddForeignUpdateTargets(Query *parsetree,
                                 Relation target_relation)
 {
     Var           *var;
-    const char *attrname;
     TargetEntry *tle;
 
     /*
-     * In postgres_fdw, what we need is the ctid, same as for a regular table.
+     * In postgres_fdw, what we need is the tableoid and ctid, same as for a
+     * regular table.
      */
 
-    /* Make a Var representing the desired value */
+    /*
+     * Table OID is needed to retrieved as a non-system junk column in the
+     * returning tuple. We add it as a column after all regular columns.
+     */
+    var = makeVar(parsetree->resultRelation,
+                  RelationGetNumberOfAttributes(target_relation) + 1,
+                  OIDOID,
+                  -1,
+                  InvalidOid,
+                  0);
+
+    /*
+     * Wrap it in a resjunk TLE with a name accessible later by FDW. However
+     * we can use an arbitrary resname since this won't be used in remote
+     * query and this column is not used to join with other relations, just
+     * use understandable name. Doesn't seem that we explicitly free this tle
+     * but give pstrdup'ed string here just in case.
+     */
+    tle = makeTargetEntry((Expr *) var,
+                          list_length(parsetree->targetList) + 1,
+                          pstrdup("tableoid"),
+                          true);
+
+    /* ... and add it to the query's targetlist */
+    parsetree->targetList = lappend(parsetree->targetList, tle);
+
+
+    /* Do the same for ctid */
     var = makeVar(parsetree->resultRelation,
                   SelfItemPointerAttributeNumber,
                   TIDOID,
@@ -1556,15 +1607,11 @@ postgresAddForeignUpdateTargets(Query *parsetree,
                   InvalidOid,
                   0);
 
-    /* Wrap it in a resjunk TLE with the right name ... */
-    attrname = "ctid";
-
     tle = makeTargetEntry((Expr *) var,
                           list_length(parsetree->targetList) + 1,
-                          pstrdup(attrname),
+                          pstrdup("ctid"),
                           true);
 
-    /* ... and add it to the query's targetlist */
     parsetree->targetList = lappend(parsetree->targetList, tle);
 }
 
@@ -1769,7 +1816,7 @@ postgresExecForeignInsert(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Convert parameters needed by prepared statement to text form */
-    p_values = convert_prep_stmt_params(fmstate, NULL, slot);
+    p_values = convert_prep_stmt_params(fmstate, InvalidOid, NULL, slot);
 
     /*
      * Execute the prepared statement.
@@ -1824,7 +1871,7 @@ postgresExecForeignUpdate(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
+    Datum        toiddatum, ctiddatum;
     bool        isNull;
     const char **p_values;
     PGresult   *res;
@@ -1835,17 +1882,26 @@ postgresExecForeignUpdate(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
+    toiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->toidAttno,
+                                     &isNull);
+    /* shouldn't ever get a null result... */
+    if (isNull)
+        elog(ERROR, "tableoid is NULL");
+
+    /* Get the ctid that was passed up as a resjunk column */
+    ctiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->ctidAttno,
+                                     &isNull);
     /* shouldn't ever get a null result... */
     if (isNull)
         elog(ERROR, "ctid is NULL");
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        slot);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    slot);
 
     /*
      * Execute the prepared statement.
@@ -1900,7 +1956,7 @@ postgresExecForeignDelete(EState *estate,
                           TupleTableSlot *planSlot)
 {
     PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
-    Datum        datum;
+    Datum        ctiddatum, toiddatum;
     bool        isNull;
     const char **p_values;
     PGresult   *res;
@@ -1911,17 +1967,26 @@ postgresExecForeignDelete(EState *estate,
         prepare_foreign_modify(fmstate);
 
     /* Get the ctid that was passed up as a resjunk column */
-    datum = ExecGetJunkAttribute(planSlot,
-                                 fmstate->ctidAttno,
-                                 &isNull);
+    toiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->toidAttno,
+                                     &isNull);
+    /* shouldn't ever get a null result... */
+    if (isNull)
+        elog(ERROR, "tableoid is NULL");
+
+    /* Get the ctid that was passed up as a resjunk column */
+    ctiddatum = ExecGetJunkAttribute(planSlot,
+                                     fmstate->ctidAttno,
+                                     &isNull);
     /* shouldn't ever get a null result... */
     if (isNull)
         elog(ERROR, "ctid is NULL");
 
     /* Convert parameters needed by prepared statement to text form */
     p_values = convert_prep_stmt_params(fmstate,
-                                        (ItemPointer) DatumGetPointer(datum),
-                                        NULL);
+                                    DatumGetObjectId(toiddatum),
+                                    (ItemPointer) DatumGetPointer(ctiddatum),
+                                    NULL);
 
     /*
      * Execute the prepared statement.
@@ -2458,7 +2523,6 @@ postgresBeginDirectModify(ForeignScanState *node, int eflags)
             tupdesc = RelationGetDescr(dmstate->rel);
 
         dmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
-
         /*
          * When performing an UPDATE/DELETE .. RETURNING on a join directly,
          * initialize a filter to extract an updated/deleted tuple from a scan
@@ -3345,7 +3409,7 @@ create_foreign_modify(EState *estate,
         fmstate->attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
     /* Prepare for output conversion of parameters used in prepared stmt. */
-    n_params = list_length(fmstate->target_attrs) + 1;
+    n_params = list_length(fmstate->target_attrs) + 2;
     fmstate->p_flinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * n_params);
     fmstate->p_nums = 0;
 
@@ -3353,13 +3417,24 @@ create_foreign_modify(EState *estate,
     {
         Assert(subplan != NULL);
 
+        /* Find the remote tableoid resjunk column in the subplan's result */
+        fmstate->toidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
+                                                          "tableoid");
+        if (!AttributeNumberIsValid(fmstate->toidAttno))
+            elog(ERROR, "could not find junk tableoid column");
+
+        /* First transmittable parameter will be table oid */
+        getTypeOutputInfo(OIDOID, &typefnoid, &isvarlena);
+        fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
+        fmstate->p_nums++;
+
         /* Find the ctid resjunk column in the subplan's result */
         fmstate->ctidAttno = ExecFindJunkAttributeInTlist(subplan->targetlist,
                                                           "ctid");
         if (!AttributeNumberIsValid(fmstate->ctidAttno))
             elog(ERROR, "could not find junk ctid column");
 
-        /* First transmittable parameter will be ctid */
+        /* Second transmittable parameter will be ctid */
         getTypeOutputInfo(TIDOID, &typefnoid, &isvarlena);
         fmgr_info(typefnoid, &fmstate->p_flinfo[fmstate->p_nums]);
         fmstate->p_nums++;
@@ -3442,6 +3517,7 @@ prepare_foreign_modify(PgFdwModifyState *fmstate)
  */
 static const char **
 convert_prep_stmt_params(PgFdwModifyState *fmstate,
+                         Oid tableoid,
                          ItemPointer tupleid,
                          TupleTableSlot *slot)
 {
@@ -3453,10 +3529,15 @@ convert_prep_stmt_params(PgFdwModifyState *fmstate,
 
     p_values = (const char **) palloc(sizeof(char *) * fmstate->p_nums);
 
-    /* 1st parameter should be ctid, if it's in use */
-    if (tupleid != NULL)
+    /* First two parameters should be tableoid and ctid, if it's in use */
+    if (tableoid != InvalidOid)
     {
+        Assert (tupleid != NULL);
+
         /* don't need set_transmission_modes for TID output */
+        p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
+                                              ObjectIdGetDatum(tableoid));
+        pindex++;
         p_values[pindex] = OutputFunctionCall(&fmstate->p_flinfo[pindex],
                                               PointerGetDatum(tupleid));
         pindex++;
@@ -3685,8 +3766,8 @@ rebuild_fdw_scan_tlist(ForeignScan *fscan, List *tlist)
         new_tlist = lappend(new_tlist,
                             makeTargetEntry(tle->expr,
                                             list_length(new_tlist) + 1,
-                                            NULL,
-                                            false));
+                                            tle->resname,
+                                            tle->resjunk));
     }
     fscan->fdw_scan_tlist = new_tlist;
 }
@@ -5576,12 +5657,12 @@ make_tuple_from_result_row(PGresult *res,
      */
     oldcontext = MemoryContextSwitchTo(temp_context);
 
-    if (rel)
-        tupdesc = RelationGetDescr(rel);
+    if (fsstate)
+        tupdesc = fsstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
     else
     {
-        Assert(fsstate);
-        tupdesc = fsstate->ss.ss_ScanTupleSlot->tts_tupleDescriptor;
+        Assert(rel);
+        tupdesc = RelationGetDescr(rel);
     }
 
     values = (Datum *) palloc0(tupdesc->natts * sizeof(Datum));
@@ -5623,7 +5704,7 @@ make_tuple_from_result_row(PGresult *res,
         errpos.cur_attno = i;
         if (i > 0)
         {
-            /* ordinary column */
+            /* ordinary column and tableoid */
             Assert(i <= tupdesc->natts);
             nulls[i - 1] = (valstr == NULL);
             /* Apply the input function even to nulls, to support domains */
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 9d6e25aae5..c4d75c611b 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -3883,14 +3883,39 @@ ExecEvalWholeRowVar(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
             slot_tupdesc = slot->tts_tupleDescriptor;
 
             if (var_tupdesc->natts != slot_tupdesc->natts)
-                ereport(ERROR,
-                        (errcode(ERRCODE_DATATYPE_MISMATCH),
-                         errmsg("table row type and query-specified row type do not match"),
-                         errdetail_plural("Table row contains %d attribute, but query expects %d.",
-                                          "Table row contains %d attributes, but query expects %d.",
-                                          slot_tupdesc->natts,
-                                          slot_tupdesc->natts,
-                                          var_tupdesc->natts)));
+            {
+                bool sane = false;
+
+                /*
+                 * Foreign scan may have added junk columns at the end of
+                 * tuple. We don't assume it as a inconsistency and just igore
+                 * them here.
+                 */
+                if (var_tupdesc->natts < slot_tupdesc->natts)
+                {
+                    int i;
+
+                    sane = true;
+                    for (i = var_tupdesc->natts; i < slot_tupdesc->natts ; i++)
+                    {
+                        if (slot_tupdesc->attrs[i].attrelid != 0)
+                        {
+                            sane = false;
+                            break;
+                        }
+                    }
+                }
+
+                if (!sane)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_DATATYPE_MISMATCH),
+                             errmsg("table row type and query-specified row type do not match"),
+                             errdetail_plural("Table row contains %d attribute, but query expects %d.",
+                                              "Table row contains %d attributes, but query expects %d.",
+                                              slot_tupdesc->natts,
+                                              slot_tupdesc->natts,
+                                              var_tupdesc->natts)));
+            }
 
             for (i = 0; i < var_tupdesc->natts; i++)
             {
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index a2a28b7ec2..3eaa23194e 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -172,10 +172,13 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
     }
 
     /*
-     * Determine the scan tuple type.  If the FDW provided a targetlist
-     * describing the scan tuples, use that; else use base relation's rowtype.
+     * Determine the scan tuple type.  If currentRelation is NULL, use the
+     * targetlist provided by the FDW; else use base relation's rowtype. FDW
+     * may have provided fdw_scan_tlist for relation scan. They must consists
+     * only of junk colums and we extend the tuple descriptor for the base
+     * relation with them.
      */
-    if (node->fdw_scan_tlist != NIL || currentRelation == NULL)
+    if (currentRelation == NULL)
     {
         TupleDesc    scan_tupdesc;
 
@@ -190,6 +193,41 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
 
         /* don't trust FDWs to return tuples fulfilling NOT NULL constraints */
         scan_tupdesc = CreateTupleDescCopy(RelationGetDescr(currentRelation));
+
+        /*
+         * If we have fdw_scan_tlist here, it should consists only of junk
+         * columns.  Extend the tuple descriptor with them so that the FDW can
+         * handle the columns.
+         */
+        if (node->fdw_scan_tlist != NIL)
+        {
+            ListCell *lc;
+            AttrNumber oldnattrs PG_USED_FOR_ASSERTS_ONLY = scan_tupdesc->natts;
+            AttrNumber newnattrs =
+                scan_tupdesc->natts + list_length(node->fdw_scan_tlist);
+
+            scan_tupdesc = (TupleDesc)
+                repalloc(scan_tupdesc,
+                         offsetof(struct tupleDesc, attrs) +
+                         newnattrs * sizeof(FormData_pg_attribute));
+            scan_tupdesc->natts = newnattrs;
+                
+            foreach (lc, node->fdw_scan_tlist)
+            {
+                TargetEntry *tle = lfirst_node(TargetEntry, lc);
+                Var *var = (Var *) tle->expr;
+
+                Assert(IsA(tle->expr, Var) &&
+                           tle->resjunk && var->varattno > oldnattrs);
+                TupleDescInitEntry(scan_tupdesc,
+                                   var->varattno,
+                                   tle->resname,
+                                   var->vartype,
+                                   var->vartypmod,
+                                   0);
+            }                
+        }
+
         ExecInitScanTupleSlot(estate, &scanstate->ss, scan_tupdesc);
         /* Node's targetlist will contain Vars with varno = scanrelid */
         tlistvarno = scanrelid;
diff --git a/src/backend/foreign/foreign.c b/src/backend/foreign/foreign.c
index eac78a5d31..f5c7f7af73 100644
--- a/src/backend/foreign/foreign.c
+++ b/src/backend/foreign/foreign.c
@@ -249,9 +249,12 @@ GetForeignTable(Oid relid)
 /*
  * GetForeignColumnOptions - Get attfdwoptions of given relation/attnum
  * as list of DefElem.
+ *
+ * If no such attribute exists and missing_ok is true, NIL is returned;
+ * otherwise a not-intended-for-user-consumption error is thrown.
  */
 List *
-GetForeignColumnOptions(Oid relid, AttrNumber attnum)
+GetForeignColumnOptions(Oid relid, AttrNumber attnum, bool missing_ok)
 {
     List       *options;
     HeapTuple    tp;
@@ -262,8 +265,12 @@ GetForeignColumnOptions(Oid relid, AttrNumber attnum)
                          ObjectIdGetDatum(relid),
                          Int16GetDatum(attnum));
     if (!HeapTupleIsValid(tp))
-        elog(ERROR, "cache lookup failed for attribute %d of relation %u",
-             attnum, relid);
+    {
+        if (!missing_ok)
+            elog(ERROR, "cache lookup failed for attribute %d of relation %u",
+                 attnum, relid);
+        return NIL;
+    }
     datum = SysCacheGetAttr(ATTNUM,
                             tp,
                             Anum_pg_attribute_attfdwoptions,
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 69dd327f0c..3a0b67508a 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1194,7 +1194,7 @@ set_foreignscan_references(PlannerInfo *root,
     if (fscan->scan.scanrelid > 0)
         fscan->scan.scanrelid += rtoffset;
 
-    if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
+    if (fscan->scan.scanrelid == 0)
     {
         /*
          * Adjust tlist, qual, fdw_exprs, fdw_recheck_quals to reference
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8369e3ad62..cfcb912bbb 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -33,6 +33,7 @@
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/plancat.h"
@@ -58,7 +59,8 @@ int            constraint_exclusion = CONSTRAINT_EXCLUSION_PARTITION;
 /* Hook for plugins to get control in get_relation_info() */
 get_relation_info_hook_type get_relation_info_hook = NULL;
 
-
+static AttrNumber tlist_max_attrnum(List *tlist, Index varno,
+                                    AttrNumber relattrnum);
 static void get_relation_foreign_keys(PlannerInfo *root, RelOptInfo *rel,
                           Relation relation, bool inhparent);
 static bool infer_collation_opclass_match(InferenceElem *elem, Relation idxRel,
@@ -76,6 +78,33 @@ static PartitionScheme find_partition_scheme(PlannerInfo *root, Relation rel);
 static void set_baserel_partition_key_exprs(Relation relation,
                                 RelOptInfo *rel);
 
+/*
+ * tlist_max_attrnum
+ *   Find the largest varattno in the targetlist
+ *
+ * FDWs may add junk columns for internal usage. This function finds the
+ * maximum attribute number including such columns. Such additional columns
+ * are always Var so we don't go deeper.
+ */
+
+static AttrNumber
+tlist_max_attrnum(List *tlist, Index varno, AttrNumber relattrnum)
+{
+    AttrNumber    maxattrnum = relattrnum;
+    ListCell   *lc;
+
+    foreach (lc, tlist)
+    {
+        TargetEntry *tle = lfirst_node(TargetEntry, lc);
+        Var            *var = (Var *) tle->expr;
+
+        if (IsA(var, Var) && var->varno == varno && maxattrnum < var->varattno)
+            maxattrnum = var->varattno;
+    }
+
+    return maxattrnum;
+}
+
 /*
  * get_relation_info -
  *      Retrieves catalog information for a given relation.
@@ -112,6 +141,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
     Relation    relation;
     bool        hasindex;
     List       *indexinfos = NIL;
+    AttrNumber  max_attrnum;
 
     /*
      * We need not lock the relation since it was already locked, either by
@@ -126,8 +156,15 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("cannot access temporary or unlogged relations during recovery")));
 
+    max_attrnum = RelationGetNumberOfAttributes(relation);
+
+    /* Foreign table may have exanded this relation with junk columns */
+    if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE)
+        max_attrnum = tlist_max_attrnum(root->parse->targetList,
+                                        varno, max_attrnum);
+
     rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
-    rel->max_attr = RelationGetNumberOfAttributes(relation);
+    rel->max_attr = max_attrnum;
     rel->reltablespace = RelationGetForm(relation)->reltablespace;
 
     Assert(rel->max_attr >= rel->min_attr);
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 03e9a28a63..e3b3f57e66 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -6671,9 +6671,9 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
     {
         /* Get column name to use from the colinfo struct */
         if (attnum > colinfo->num_cols)
-            elog(ERROR, "invalid attnum %d for relation \"%s\"",
-                 attnum, rte->eref->aliasname);
-        attname = colinfo->colnames[attnum - 1];
+            attname = "<added_junk>";
+        else
+            attname = colinfo->colnames[attnum - 1];
         if (attname == NULL)    /* dropped column? */
             elog(ERROR, "invalid attnum %d for relation \"%s\"",
                  attnum, rte->eref->aliasname);
diff --git a/src/include/foreign/foreign.h b/src/include/foreign/foreign.h
index 3ca12e64d2..5b1fec2be8 100644
--- a/src/include/foreign/foreign.h
+++ b/src/include/foreign/foreign.h
@@ -77,7 +77,8 @@ extern ForeignDataWrapper *GetForeignDataWrapperByName(const char *name,
                             bool missing_ok);
 extern ForeignTable *GetForeignTable(Oid relid);
 
-extern List *GetForeignColumnOptions(Oid relid, AttrNumber attnum);
+extern List *GetForeignColumnOptions(Oid relid, AttrNumber attnum,
+                                     bool missing_ok);
 
 extern Oid    get_foreign_data_wrapper_oid(const char *fdwname, bool missing_ok);
 extern Oid    get_foreign_server_oid(const char *servername, bool missing_ok);
-- 
2.16.3


pgsql-hackers by date:

Previous
From: Marina Polyakova
Date:
Subject: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Next
From: Michael Paquier
Date:
Subject: Re: Temporary tables prevent autovacuum, leading to XID wraparound