Re: bug in update tuple routing with foreign partitions - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: bug in update tuple routing with foreign partitions |
Date | |
Msg-id | 5CAF257A.9050604@lab.ntt.co.jp Whole thread Raw |
In response to | Re: bug in update tuple routing with foreign partitions (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
Responses |
Re: bug in update tuple routing with foreign partitions
|
List | pgsql-hackers |
(2019/04/11 14:33), Amit Langote wrote: > On 2019/04/10 17:38, Etsuro Fujita wrote: >> (2019/03/06 18:33), Amit Langote wrote: >>> I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos >>> to use for partition routing targets. Specifically, the bug occurs when >>> UPDATE targets include a foreign partition that is locally modified (as >>> opposed to being modified directly on the remove server) and its >>> ResultRelInfo (called subplan result rel in the source code) is reused for >>> tuple routing: >> >>> The solution is to teach ExecSetupPartitionTupleRouting to avoid using a >>> subplan result rel if its ri_FdwState is already set. >> >> I'm not sure that is a good idea, because that requires to create a new >> ResultRelInfo, which is not free; I think it requires a lot of work. > > After considering this a bit more and studying your proposed solution, I > tend to agree. Beside the performance considerations you point out that I > too think are valid, I realize that going with my approach would mean > embedding some assumptions in the core code about ri_FdwState; in this > case, assuming that a subplan result rel's ri_FdwState is not to be > re-purposed for tuple routing insert, which is only based on looking at > postgres_fdw's implementation. Yeah, that assumption might not apply to some other FDWs. > Not to mention the ensuing complexity in > the core tuple routing code -- it now needs to account for the fact that > not all subplan result rels may have been re-purposed for tuple-routing, > something that's too complicated to handle in PG 11. I think so too. > IOW, let's call this a postgres_fdw bug and fix it there as you propose. Agreed. >> Another solution to avoid that would be to store the fmstate created by >> postgresBeginForeignInsert() into the ri_FdwState, not overwrite the >> ri_FdwState, like the attached. This would not need any changes to the >> core, and this would not cause any overheads either, IIUC. What do you >> think about that? > > +1. Just one comment: > > + /* > + * If the given resultRelInfo already has PgFdwModifyState set, it means > + * the foreign table is an UPDATE subplan resultrel; in which case, store > + * the resulting state into the PgFdwModifyState. > + */ > > The last "PgFdwModifyState" should be something else? Maybe, > sub-PgModifyState or sub_fmstate? OK, will revise. My favorite would be the latter. >>> I've also added a >>> test in postgres_fdw.sql to exercise this test case. >> >> Thanks again, but the test case you added works well without any fix: > > Oops, I should really adopt a habit of adding a test case before code. > >> +-- Check case where the foreign partition is a subplan target rel and >> +-- foreign parttion is locally modified (target table being joined >> +-- locally prevents a direct/remote modification plan). >> +explain (verbose, costs off) >> +update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x >> returning *; >> + QUERY PLAN >> +------------------------------------------------------------------------------ >> >> + Update on public.utrtest >> + Output: remp.a, remp.b, "*VALUES*".column1 >> + Foreign Update on public.remp >> + Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 RETURNING >> a, b >> + Update on public.locp >> + -> Hash Join >> + Output: 1, remp.b, remp.ctid, "*VALUES*".*, "*VALUES*".column1 >> + Hash Cond: (remp.a = "*VALUES*".column1) >> + -> Foreign Scan on public.remp >> + Output: remp.b, remp.ctid, remp.a >> + Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE >> + -> Hash >> + Output: "*VALUES*".*, "*VALUES*".column1 >> + -> Values Scan on "*VALUES*" >> + Output: "*VALUES*".*, "*VALUES*".column1 >> + -> Hash Join >> + Output: 1, locp.b, locp.ctid, "*VALUES*".*, "*VALUES*".column1 >> + Hash Cond: (locp.a = "*VALUES*".column1) >> + -> Seq Scan on public.locp >> + Output: locp.b, locp.ctid, locp.a >> + -> Hash >> + Output: "*VALUES*".*, "*VALUES*".column1 >> + -> Values Scan on "*VALUES*" >> + Output: "*VALUES*".*, "*VALUES*".column1 >> +(24 rows) >> >> In this test case, the foreign partition is updated before ri_FdwState is >> overwritten by postgresBeginForeignInsert() that's invoked by the tuple >> routing code, so it would work without any fix. So I modified this test >> case as such. > > Thanks for fixing that. I hadn't noticed that foreign partition remp is > updated before local partition locp; should've added a locp0 for values > (0) so that remp appears second in the ModifyTable's subplans. Or, well, > make the foreign table remp appear later by making it a partition for > values in (3) as your patch does. > > BTW, have you noticed that the RETURNING clause returns the same row twice? I noticed this, but I didn't think it hard. :( > +update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x > returning *; > + a | b | x > +---+-------------------+--- > + 3 | qux triggered ! | 2 > + 3 | xyzzy triggered ! | 3 > + 3 | qux triggered ! | 3 > +(3 rows) > > You can see that the row that's moved into remp is returned twice (one > with "qux triggered!" in b column), whereas it should've been only once? Yeah, this is unexpected behavior, so will look into this. Thanks for pointing that out! Best regards, Etsuro Fujita
pgsql-hackers by date: