bug in update tuple routing with foreign partitions - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | bug in update tuple routing with foreign partitions |
Date | |
Msg-id | 21e7eaa4-0d4d-20c2-a1f7-c7e96f4ce440@lab.ntt.co.jp Whole thread Raw |
Responses |
Re: bug in update tuple routing with foreign partitions
Re: bug in update tuple routing with foreign partitions |
List | pgsql-hackers |
Hi, (added Fujita-san) I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos to use for partition routing targets. Specifically, the bug occurs when UPDATE targets include a foreign partition that is locally modified (as opposed to being modified directly on the remove server) and its ResultRelInfo (called subplan result rel in the source code) is reused for tuple routing: -- setup create extension postgres_fdw ; create server loopback foreign data wrapper postgres_fdw; create user mapping for current_user server loopback; create table p (a int) partition by list (a); create table p1 partition of p for values in (1); create table p2base (a int check (a = 2)); create foreign table p2 partition of p for values in (2) server loopback options (table_name 'p2base'); insert into p values (1), (2); -- see in the plan that foreign partition p2 is locally modified explain verbose update p set a = 2 from (values (1), (2)) s(x) where a = s.x returning *; QUERY PLAN ───────────────────────────────────────────────────────────────────────────────── Update on public.p (cost=0.05..236.97 rows=50 width=42) Output: p1.a, "*VALUES*".column1 Update on public.p1 Foreign Update on public.p2 Remote SQL: UPDATE public.p2base SET a = $2 WHERE ctid = $1 RETURNING a -> Hash Join (cost=0.05..45.37 rows=26 width=42) Output: 2, p1.ctid, "*VALUES*".*, "*VALUES*".column1 Hash Cond: (p1.a = "*VALUES*".column1) -> Seq Scan on public.p1 (cost=0.00..35.50 rows=2550 width=10) Output: p1.ctid, p1.a -> Hash (cost=0.03..0.03 rows=2 width=32) Output: "*VALUES*".*, "*VALUES*".column1 -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=32) Output: "*VALUES*".*, "*VALUES*".column1 -> Hash Join (cost=100.05..191.59 rows=24 width=42) Output: 2, p2.ctid, "*VALUES*".*, "*VALUES*".column1 Hash Cond: (p2.a = "*VALUES*".column1) -> Foreign Scan on public.p2 (cost=100.00..182.27 rows=2409 width=10) Output: p2.ctid, p2.a Remote SQL: SELECT a, ctid FROM public.p2base FOR UPDATE -> Hash (cost=0.03..0.03 rows=2 width=32) Output: "*VALUES*".*, "*VALUES*".column1 -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=32) Output: "*VALUES*".*, "*VALUES*".column1 -- as opposed to directly on remote side (because there's no local join) explain verbose update p set a = 2 returning *; QUERY PLAN ───────────────────────────────────────────────────────────────────────────── Update on public.p (cost=0.00..227.40 rows=5280 width=10) Output: p1.a Update on public.p1 Foreign Update on public.p2 -> Seq Scan on public.p1 (cost=0.00..35.50 rows=2550 width=10) Output: 2, p1.ctid -> Foreign Update on public.p2 (cost=100.00..191.90 rows=2730 width=10) Remote SQL: UPDATE public.p2base SET a = 2 RETURNING a (8 rows) Running the first update query crashes: update p set a = 2 from (values (1), (2)) s(x) where a = s.x returning tableoid::regclass, *; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. The problem seems to occur because ExecSetupPartitionTupleRouting thinks it can reuse p2's ResultRelInfo for tuple routing. In this case, it can't be used, because its ri_FdwState contains information that will be needed for postgresExecForeignUpdate to work, but it's still used today. Because it's assigned to be used for tuple routing, its ri_FdwState will be overwritten by postgresBeginForeignInsert that's invoked by the tuple routing code using the aforementioned ResultRelInfo. Crash occurs when postgresExecForeignUpdate () can't find the information it's expecting in the ri_FdwState. The solution is to teach ExecSetupPartitionTupleRouting to avoid using a subplan result rel if its ri_FdwState is already set. I was wondering if it should also check ri_usesFdwDirectModify is true, but in that case, ri_FdwState is unused, so it's perhaps safe for tuple routing code to scribble on it. I have attached 2 patches, one for PG 11 where the problem first appeared and another for HEAD. The patch for PG 11 is significantly bigger due to having to handle the complexities of mapping of subplan result rel indexes to leaf partition indexes. Most of that complexity is gone in HEAD due to 3f2393edefa5, so the patch for HEAD is much simpler. I've also added a test in postgres_fdw.sql to exercise this test case. Thanks, Amit
Attachment
pgsql-hackers by date: