Re: bug in update tuple routing with foreign partitions - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: bug in update tuple routing with foreign partitions |
Date | |
Msg-id | 5CADAB76.9060506@lab.ntt.co.jp Whole thread Raw |
In response to | bug in update tuple routing with foreign partitions (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
Responses |
Re: bug in update tuple routing with foreign partitions
|
List | pgsql-hackers |
(2019/03/06 18:33), Amit Langote wrote: > I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos > to use for partition routing targets. Specifically, the bug occurs when > UPDATE targets include a foreign partition that is locally modified (as > opposed to being modified directly on the remove server) and its > ResultRelInfo (called subplan result rel in the source code) is reused for > tuple routing: > > -- setup > create extension postgres_fdw ; > create server loopback foreign data wrapper postgres_fdw; > create user mapping for current_user server loopback; > create table p (a int) partition by list (a); > create table p1 partition of p for values in (1); > create table p2base (a int check (a = 2)); > create foreign table p2 partition of p for values in (2) server loopback > options (table_name 'p2base'); > insert into p values (1), (2); > > -- see in the plan that foreign partition p2 is locally modified > explain verbose update p set a = 2 from (values (1), (2)) s(x) where a = > s.x returning *; > QUERY PLAN > > ───────────────────────────────────────────────────────────────────────────────── > Update on public.p (cost=0.05..236.97 rows=50 width=42) > Output: p1.a, "*VALUES*".column1 > Update on public.p1 > Foreign Update on public.p2 > Remote SQL: UPDATE public.p2base SET a = $2 WHERE ctid = $1 RETURNING a > -> Hash Join (cost=0.05..45.37 rows=26 width=42) > Output: 2, p1.ctid, "*VALUES*".*, "*VALUES*".column1 > Hash Cond: (p1.a = "*VALUES*".column1) > -> Seq Scan on public.p1 (cost=0.00..35.50 rows=2550 width=10) > Output: p1.ctid, p1.a > -> Hash (cost=0.03..0.03 rows=2 width=32) > Output: "*VALUES*".*, "*VALUES*".column1 > -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 > width=32) > Output: "*VALUES*".*, "*VALUES*".column1 > -> Hash Join (cost=100.05..191.59 rows=24 width=42) > Output: 2, p2.ctid, "*VALUES*".*, "*VALUES*".column1 > Hash Cond: (p2.a = "*VALUES*".column1) > -> Foreign Scan on public.p2 (cost=100.00..182.27 rows=2409 > width=10) > Output: p2.ctid, p2.a > Remote SQL: SELECT a, ctid FROM public.p2base FOR UPDATE > -> Hash (cost=0.03..0.03 rows=2 width=32) > Output: "*VALUES*".*, "*VALUES*".column1 > -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 > width=32) > Output: "*VALUES*".*, "*VALUES*".column1 > > > -- as opposed to directly on remote side (because there's no local join) > explain verbose update p set a = 2 returning *; > QUERY PLAN > ───────────────────────────────────────────────────────────────────────────── > Update on public.p (cost=0.00..227.40 rows=5280 width=10) > Output: p1.a > Update on public.p1 > Foreign Update on public.p2 > -> Seq Scan on public.p1 (cost=0.00..35.50 rows=2550 width=10) > Output: 2, p1.ctid > -> Foreign Update on public.p2 (cost=100.00..191.90 rows=2730 width=10) > Remote SQL: UPDATE public.p2base SET a = 2 RETURNING a > (8 rows) > > Running the first update query crashes: > > update p set a = 2 from (values (1), (2)) s(x) where a = s.x returning > tableoid::regclass, *; > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: Failed. > > The problem seems to occur because ExecSetupPartitionTupleRouting thinks > it can reuse p2's ResultRelInfo for tuple routing. In this case, it can't > be used, because its ri_FdwState contains information that will be needed > for postgresExecForeignUpdate to work, but it's still used today. Because > it's assigned to be used for tuple routing, its ri_FdwState will be > overwritten by postgresBeginForeignInsert that's invoked by the tuple > routing code using the aforementioned ResultRelInfo. Crash occurs when > postgresExecForeignUpdate () can't find the information it's expecting in > the ri_FdwState. Agreed, as I said before in another thread. > The solution is to teach ExecSetupPartitionTupleRouting to avoid using a > subplan result rel if its ri_FdwState is already set. I'm not sure that is a good idea, because that requires to create a new ResultRelInfo, which is not free; I think it requires a lot of work. Another solution to avoid that would be to store the fmstate created by postgresBeginForeignInsert() into the ri_FdwState, not overwrite the ri_FdwState, like the attached. This would not need any changes to the core, and this would not cause any overheads either, IIUC. What do you think about that? > I have attached 2 patches, one for PG 11 where the problem first appeared > and another for HEAD. The patch for PG 11 is significantly bigger due to > having to handle the complexities of mapping of subplan result rel indexes > to leaf partition indexes. Most of that complexity is gone in HEAD due to > 3f2393edefa5, so the patch for HEAD is much simpler. Thanks for the patches! > I've also added a > test in postgres_fdw.sql to exercise this test case. Thanks again, but the test case you added works well without any fix: +-- Check case where the foreign partition is a subplan target rel and +-- foreign parttion is locally modified (target table being joined +-- locally prevents a direct/remote modification plan). +explain (verbose, costs off) +update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x returning *; + QUERY PLAN +------------------------------------------------------------------------------ + Update on public.utrtest + Output: remp.a, remp.b, "*VALUES*".column1 + Foreign Update on public.remp + Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 RETURNING a, b + Update on public.locp + -> Hash Join + Output: 1, remp.b, remp.ctid, "*VALUES*".*, "*VALUES*".column1 + Hash Cond: (remp.a = "*VALUES*".column1) + -> Foreign Scan on public.remp + Output: remp.b, remp.ctid, remp.a + Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE + -> Hash + Output: "*VALUES*".*, "*VALUES*".column1 + -> Values Scan on "*VALUES*" + Output: "*VALUES*".*, "*VALUES*".column1 + -> Hash Join + Output: 1, locp.b, locp.ctid, "*VALUES*".*, "*VALUES*".column1 + Hash Cond: (locp.a = "*VALUES*".column1) + -> Seq Scan on public.locp + Output: locp.b, locp.ctid, locp.a + -> Hash + Output: "*VALUES*".*, "*VALUES*".column1 + -> Values Scan on "*VALUES*" + Output: "*VALUES*".*, "*VALUES*".column1 +(24 rows) In this test case, the foreign partition is updated before ri_FdwState is overwritten by postgresBeginForeignInsert() that's invoked by the tuple routing code, so it would work without any fix. So I modified this test case as such. Sorry for the long delay, again. Best regards, Etsuro Fujita
Attachment
pgsql-hackers by date: