Re: bug in update tuple routing with foreign partitions - Mailing list pgsql-hackers

From Amit Langote
Subject Re: bug in update tuple routing with foreign partitions
Date
Msg-id a79eac5f-00aa-3cf3-fdda-a7ad5d0d4c91@lab.ntt.co.jp
Whole thread Raw
In response to Re: bug in update tuple routing with foreign partitions  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: bug in update tuple routing with foreign partitions
List pgsql-hackers
Fujita-san,

On 2019/04/17 21:49, Etsuro Fujita wrote:
> (2019/04/11 20:31), Etsuro Fujita wrote:
>> (2019/04/11 14:33), Amit Langote wrote:
>>> BTW, have you noticed that the RETURNING clause returns the same row
>>> twice?
>>
>> I noticed this, but I didn't think it hard. :(
>>
>>> +update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x
>>> returning *;
>>> + a | b | x
>>> +---+-------------------+---
>>> + 3 | qux triggered ! | 2
>>> + 3 | xyzzy triggered ! | 3
>>> + 3 | qux triggered ! | 3
>>> +(3 rows)
>>>
>>> You can see that the row that's moved into remp is returned twice (one
>>> with "qux triggered!" in b column), whereas it should've been only once?
>>
>> Yeah, this is unexpected behavior, so will look into this.

Thanks for investigating.

> I think the reason for that is: the row routed to remp is incorrectly
> fetched as a to-be-updated row when updating remp as a subplan targetrel.

Yeah.  In the fully-local case, that is, where both the source and the
target partition of a row movement operation are local tables, heap AM
ensures that tuples that's moved into a given relation in the same command
(by way of row movement) are not returned as to-be-updated, because it
deems such tuples invisible.  The "same command" part being crucial for
that to work.

In the case where the target of a row movement operation is a foreign
table partition, the INSERT used as part of row movement and subsequent
UPDATE of the same foreign table are distinct commands for the remote
server.  So, the rows inserted by the 1st command (as part of the row
movement) are deemed visible by the 2nd command (UPDATE) even if both are
operating within the same transaction.

I guess there's no easy way for postgres_fdw to make the remote server
consider them as a single command.  IOW, no way to make the remote server
not touch those "moved-in" rows during the UPDATE part of the local query.
 
> The right way to fix this would be to have some way in postgres_fdw in
> which we don't fetch such rows when updating such subplan targetrels.  I
> tried to figure out a (simple) way to do that, but I couldn't.

Yeah, that seems a bit hard to ensure with our current infrastructure.

> One
> probably-simple solution I came up with is to sort subplan targetrels into
> the order in which foreign-table subplan targetrels get processed first in
> ExecModifyTable().  (Note: currently, rows can be moved from local
> partitions to a foreign-table partition, but rows cannot be moved from
> foreign-table partitions to another partition, so we wouldn't encounter
> this situation once we sort like that.)  But I think that's ugly, and I
> don't think it's a good idea to change the core, just for postgres_fdw.

Agreed that it seems like contorting the core code to accommodate
limitations of postgres_fdw.

> So what I'm thinking is to throw an error for cases like this.  (Though, I
> think we should keep to allow rows to be moved from local partitions to a
> foreign-table subplan targetrel that has been updated already.)

Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
two cases?

Thanks,
Amit




pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [HACKERS] Weaker shmem interlock w/o postmaster.pid
Next
From: Amit Langote
Date:
Subject: Re: bug in update tuple routing with foreign partitions