Re: Problem while updating a foreign table pointing to apartitioned table on foreign server - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Problem while updating a foreign table pointing to apartitioned table on foreign server
Date
Msg-id 20180418.131317.164571488.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Problem while updating a foreign table pointing to a partitionedtable on foreign server  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server
List pgsql-hackers
Hello.

At Mon, 16 Apr 2018 17:05:28 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRfcgwsHRmpvoOK-GUQi-n8MgAS+OxcQo=aBDn1COywmcg@mail.gmail.com>
> Hi,
> Consider this scenario
> 
> postgres=# CREATE TABLE plt (a int, b int) PARTITION BY LIST(a);
> postgres=# CREATE TABLE plt_p1 PARTITION OF plt FOR VALUES IN (1);
> postgres=# CREATE TABLE plt_p2 PARTITION OF plt FOR VALUES IN (2);
> postgres=# INSERT INTO plt VALUES (1, 1), (2, 2);
> postgres=# CREATE FOREIGN TABLE fplt (a int, b int) SERVER loopback
> OPTIONS (table_name 'plt');
> postgres=# SELECT tableoid::regclass, ctid, * FROM fplt;
>  tableoid | ctid  | a | b
> ----------+-------+---+---
>  fplt     | (0,1) | 1 | 1
>  fplt     | (0,1) | 2 | 2
> (2 rows)
> 
> -- Need to use random() so that following update doesn't turn into a
> direct UPDATE.
> postgres=# EXPLAIN (VERBOSE, COSTS OFF)
> postgres-# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE
> 20 END) WHERE a = 1;
>                                          QUERY PLAN
> --------------------------------------------------------------------------------------------
>  Update on public.fplt
>    Remote SQL: UPDATE public.plt SET b = $2 WHERE ctid = $1
>    ->  Foreign Scan on public.fplt
>          Output: a, CASE WHEN (random() <= '1'::double precision) THEN
> 10 ELSE 20 END, ctid
>          Remote SQL: SELECT a, ctid FROM public.plt WHERE ((a = 1)) FOR UPDATE
> (5 rows)
> 
> postgres=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE
> 20 END) WHERE a = 1;
> postgres=# SELECT tableoid::regclass, ctid, * FROM fplt;
>  tableoid | ctid  | a | b
> ----------+-------+---+----
>  fplt     | (0,2) | 1 | 10
>  fplt     | (0,2) | 2 | 10
> (2 rows)
> 
> We expect only 1 row with a = 1 to be updated, but both the rows get
> updated. This happens because both the rows has ctid = (0, 1) and
> that's the only qualification used for UPDATE and DELETE. Thus when a
> non-direct UPDATE is run on a foreign table which points to a
> partitioned table or inheritance hierarchy on the foreign server, it
> will update rows from all child table which have ctids same as the
> qualifying rows. Same is the case with DELETE.

Anyway I think we should warn or error out if one nondirect
update touches two nor more tuples in the first place.

=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1;
ERROR:  updated 2 rows for a tuple identity on the remote end


> There are two ways to fix this
> 1. Use WHERE CURRENT OF with cursors to update rows. This means that
> we fetch only one row at a time and update it. This can slow down the
> execution drastically.
> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of
> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions.
> 
> PFA patch along the lines of 2nd approach and along with the
> testcases. The idea is to inject tableoid attribute to be fetched from
> the foreign server similar to ctid and then add it to the DML
> statement being constructed.
> 
> It does fix the problem. But the patch as is interferes with the way
> we handle tableoid currently. That can be seen from the regression
> diffs that the patch causes.  RIght now, every tableoid reference gets
> converted into the tableoid of the foreign table (and not the tableoid
> of the foreign table). Somehow we need to differentiate between the
> tableoid injected for DML and tableoid references added by the user in
> the original query and then use tableoid on the foreign server for the
> first and local foreign table's oid for the second. Right now, I don't
> see a simple way to do that.

We cannot add no non-system (junk) columns not defined in foreign
table columns. We could pass tableoid via a side channel but we
get wrong value if the scan is not consists of only one foreign
relation. I don't think adding remote_tableoid in HeapTupleData
is acceptable. Explicity defining remote_tableoid column in
foreign relation might work but it makes things combersome..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index e1c2639fde..7cd31cb6ab 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1895,6 +1895,13 @@ postgresExecForeignUpdate(EState *estate,
 
     MemoryContextReset(fmstate->temp_cxt);
 
+    /* ERROR if more than one row was updated on the remote end */
+    if (n_rows > 1)
+        ereport(ERROR,
+                (errcode (ERRCODE_FDW_ERROR), /* XXX */
+                 errmsg ("updated %d rows for a tuple identity on the remote end",
+                         n_rows)));
+
     /* Return NULL if nothing was updated on the remote end */
     return (n_rows > 0) ? slot : NULL;
 }

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: pruning disabled for array, enum, record, range type partitionkeys
Next
From: Amit Langote
Date:
Subject: Re: ON CONFLICT DO UPDATE for partitioned tables