Thread: BUG #17809: MERGE ... UPDATE fails with BEFORE ROW UPDATE trigger when target row updated concurrently
BUG #17809: MERGE ... UPDATE fails with BEFORE ROW UPDATE trigger when target row updated concurrently
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17809 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 15.2 Operating system: Ubuntu 22.04 Description: When executing the following script: cat << "EOF" | psql CREATE TABLE target (tid integer, val integer); INSERT INTO target VALUES (1, 0); CREATE OR REPLACE FUNCTION brut_func() RETURNS trigger LANGUAGE plpgsql AS 'BEGIN RETURN NEW; END;'; CREATE TRIGGER brut BEFORE UPDATE ON target FOR EACH ROW EXECUTE PROCEDURE brut_func(); EOF for ((i=1;i<=10;i++)); do echo "iteration $i" psql -c "UPDATE TARGET SET val = 0" & psql -c "MERGE INTO target t1 USING target t2 ON t1.tid = t2.tid WHEN MATCHED THEN UPDATE SET val = 0" & wait ls tmpdb/core* && break coredumpctl --no-pager && break done I get a server crash with the following stack trace: Program terminated with signal SIGSEGV, Segmentation fault. warning: Section `.reg-xstate/1808258' in core file too small. #0 0x000055e66490440e in internalGetUpdateNewTuple (relinfo=0x55e66595e040, planSlot=0x55e665998288, oldSlot=0x55e665969978, relaction=0x0) at nodeModifyTable.c:741 741 econtext = newProj->pi_exprContext; (gdb) bt #0 0x000055e66490440e in internalGetUpdateNewTuple (relinfo=0x55e66595e040, planSlot=0x55e665998288, oldSlot=0x55e665969978, relaction=0x0) at nodeModifyTable.c:741 #1 0x000055e6649043e0 in ExecGetUpdateNewTuple (relinfo=0x55e66595e040, planSlot=0x55e665998288, oldSlot=0x55e665969978) at nodeModifyTable.c:726 #2 0x000055e66487e835 in ExecBRUpdateTriggers (estate=0x55e66595dba0, epqstate=0x55e66595df10, relinfo=0x55e66595e040, tupleid=0x7ffec9335002, fdw_trigtuple=0x0, newslot=0x55e665969748, tmfd=0x7ffec93350b0) at trigger.c:3037 #3 0x000055e66490622e in ExecUpdatePrologue (context=0x7ffec9335080, resultRelInfo=0x55e66595e040, tupleid=0x7ffec9335002, oldtuple=0x0, slot=0x55e665969748) at nodeModifyTable.c:1912 #4 0x000055e6649078b2 in ExecMergeMatched (context=0x7ffec9335080, resultRelInfo=0x55e66595e040, tupleid=0x7ffec9335002, canSetTag=true) at nodeModifyTable.c:2884 #5 0x000055e6649075da in ExecMerge (context=0x7ffec9335080, resultRelInfo=0x55e66595e040, tupleid=0x7ffec9335002, canSetTag=true) at nodeModifyTable.c:2750 #6 0x000055e6649097f2 in ExecModifyTable (pstate=0x55e66595de28) at nodeModifyTable.c:3866 #7 0x000055e6648c60d7 in ExecProcNodeFirst (node=0x55e66595de28) at execProcnode.c:464 #8 0x000055e6648b9353 in ExecProcNode (node=0x55e66595de28) at ../../../src/include/executor/executor.h:259 #9 0x000055e6648bc25a in ExecutePlan (estate=0x55e66595dba0, planstate=0x55e66595de28, use_parallel_mode=false, operation=CMD_MERGE, sendTuples=false, numberTuples=0, direction=ForwardScanDirection, dest=0x55e665955bf8, execute_once=true) at execMain.c:1636 #10 0x000055e6648b9a8a in standard_ExecutorRun (queryDesc=0x55e66594c960, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:363 #11 0x000055e6648b9875 in ExecutorRun (queryDesc=0x55e66594c960, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:307 #12 0x000055e664b31952 in ProcessQuery (plan=0x55e665955b08, sourceText=0x55e6658593b0 "MERGE INTO target t1 USING target t2 ON t1.tid = t2.tid WHEN MATCHED THEN UPDATE SET val = 0", params=0x0, queryEnv=0x0, dest=0x55e665955bf8, qc=0x7ffec93354e0) at pquery.c:160 #13 0x000055e664b33545 in PortalRunMulti (portal=0x55e6658d5190, isTopLevel=true, setHoldSnapshot=false, dest=0x55e665955bf8, altdest=0x55e665955bf8, qc=0x7ffec93354e0) at pquery.c:1277 #14 0x000055e664b32a28 in PortalRun (portal=0x55e6658d5190, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x55e665955bf8, altdest=0x55e665955bf8, qc=0x7ffec93354e0) at pquery.c:791 #15 0x000055e664b2b803 in exec_simple_query ( query_string=0x55e6658593b0 "MERGE INTO target t1 USING target t2 ON t1.tid = t2.tid WHEN MATCHED THEN UPDATE SET val = 0") at postgres.c:1250 #16 0x000055e664b30722 in PostgresMain (dbname=0x55e665885938 "regression", username=0x55e665885918 "law") at postgres.c:4593 #17 0x000055e664a55c23 in BackendRun (port=0x55e66587d420) at postmaster.c:4511 #18 0x000055e664a554aa in BackendStartup (port=0x55e66587d420) at postmaster.c:4239 #19 0x000055e664a51437 in ServerLoop () at postmaster.c:1806 #20 0x000055e664a50b94 in PostmasterMain (argc=3, argv=0x55e665853610) at postmaster.c:1478 #21 0x000055e664944ad5 in main (argc=3, argv=0x55e665853610) at main.c:202 At first sight it appeared like another manifestation of bug #17798, but it seems that this is a distinct defect. As I can see, ExecModifyTable() for the "case CMD_MERGE" can pass to ExecMerge() the resultRelInfo structure with ri_projectNew == null (but ri_projectNewInfoValid == true (set in ExecInitMergeTupleSlots)). This is not an issue till GetTupleForTrigger() in ExecBRUpdateTriggers() gets a tuple. But when a concurrent update occurs, GetTupleForTrigger() fails and ExecGetUpdateNewTuple() gets called, where internalGetUpdateNewTuple() tries to get relinfo->ri_projectNew->pi_exprContext... The issue appeared with the MERGE introduction (7103ebb7a).
Re: BUG #17809: MERGE ... UPDATE fails with BEFORE ROW UPDATE trigger when target row updated concurrently
From
Dean Rasheed
Date:
On Mon, 27 Feb 2023 at 08:06, PG Bug reporting form <noreply@postgresql.org> wrote: > > When executing the following script I get a server crash with the following stack trace: > Program terminated with signal SIGSEGV, Segmentation fault. > Confirmed. I can reliably reproduce this using 2 psql sessions as follows: -- Setup DROP TABLE IF EXISTS target; CREATE TABLE target (tid integer, val integer); INSERT INTO target VALUES (1, 0); CREATE OR REPLACE FUNCTION brut_func() RETURNS trigger LANGUAGE plpgsql AS $$ BEGIN RETURN NEW; END; $$; CREATE TRIGGER brut BEFORE UPDATE ON target FOR EACH ROW EXECUTE PROCEDURE brut_func(); -- Session 1 BEGIN; UPDATE TARGET SET val = 0; -- Session 2 MERGE INTO target t1 USING target t2 ON t1.tid = t2.tid WHEN MATCHED THEN UPDATE SET val = 0; -- Session 1 COMMIT; What's happening is that (in session 2) the trigger code in ExecBRUpdateTriggers() detects the concurrent update and calls ExecGetUpdateNewTuple() to get a new tuple. However, ExecGetUpdateNewTuple() just calls internalGetUpdateNewTuple(), when really it ought to call mergeGetUpdateNewTuple(), since we're in a MERGE. Unfortunately there's no way for ExecGetUpdateNewTuple() to know that, because it doesn't have the required context information. A quick hack that fixes the problem is to pass the current MergeActionState to ExecGetUpdateNewTuple(), which also then requires that it be passed to ExecBRUpdateTriggers(). Changing the signature of 2 public functions in that way in back branches doesn't seem very nice though. OTOH, it's difficult to see how it can be fixed any other way. I don't really like this change to ExecGetUpdateNewTuple(), but if we are going to change it like this, then I think we should go all the way and get rid of the GetUpdateNewTuple callback, and the separate internalGetUpdateNewTuple() and mergeGetUpdateNewTuple() functions, and just do it all in ExecGetUpdateNewTuple(), so that all the code is in one place, making it easier to follow. Thoughts? Regards, Dean
Attachment
Re: BUG #17809: MERGE ... UPDATE fails with BEFORE ROW UPDATE trigger when target row updated concurrently
From
Alvaro Herrera
Date:
On 2023-Feb-27, Dean Rasheed wrote: > What's happening is that (in session 2) the trigger code in > ExecBRUpdateTriggers() detects the concurrent update and calls > ExecGetUpdateNewTuple() to get a new tuple. However, > ExecGetUpdateNewTuple() just calls internalGetUpdateNewTuple(), when > really it ought to call mergeGetUpdateNewTuple(), since we're in a > MERGE. Unfortunately there's no way for ExecGetUpdateNewTuple() to > know that, because it doesn't have the required context information. Hmm, right. > A quick hack that fixes the problem is to pass the current > MergeActionState to ExecGetUpdateNewTuple(), which also then requires > that it be passed to ExecBRUpdateTriggers(). Changing the signature of > 2 public functions in that way in back branches doesn't seem very nice > though. > > OTOH, it's difficult to see how it can be fixed any other way. > > I don't really like this change to ExecGetUpdateNewTuple(), but if we > are going to change it like this, then I think we should go all the > way and get rid of the GetUpdateNewTuple callback, and the separate > internalGetUpdateNewTuple() and mergeGetUpdateNewTuple() functions, > and just do it all in ExecGetUpdateNewTuple(), so that all the code is > in one place, making it easier to follow. I'm pretty sure that I split things this way (using a callback) because the MERGE code was at arms-length, being in a different file, and there was no reasonable way, without duplicating a lot of other code, to implement the different behavior other than using a callback. I then moved the MERGE code from a separate file into nodeModifyTable.c and could have put it back in one piece, but didn't remember that it was there. It even looks like doing that may even make the code simpler, so perhaps we should just do that -- nodeModifyTable.c if fully aware of MERGE now, so it's not a problem. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"
Re: BUG #17809: MERGE ... UPDATE fails with BEFORE ROW UPDATE trigger when target row updated concurrently
From
Dean Rasheed
Date:
On Tue, 7 Mar 2023 at 11:50, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > A quick hack that fixes the problem is to pass the current > > MergeActionState to ExecGetUpdateNewTuple(), which also then requires > > that it be passed to ExecBRUpdateTriggers(). Changing the signature of > > 2 public functions in that way in back branches doesn't seem very nice > > though. > > > > OTOH, it's difficult to see how it can be fixed any other way. > > > > I don't really like this change to ExecGetUpdateNewTuple(), but if we > > are going to change it like this, then I think we should go all the > > way and get rid of the GetUpdateNewTuple callback, and the separate > > internalGetUpdateNewTuple() and mergeGetUpdateNewTuple() functions, > > and just do it all in ExecGetUpdateNewTuple(), so that all the code is > > in one place, making it easier to follow. > > I'm pretty sure that I split things this way (using a callback) because > the MERGE code was at arms-length, being in a different file, and there > was no reasonable way, without duplicating a lot of other code, to > implement the different behavior other than using a callback. > > I then moved the MERGE code from a separate file into nodeModifyTable.c > and could have put it back in one piece, but didn't remember that it was > there. It even looks like doing that may even make the code simpler, so > perhaps we should just do that -- nodeModifyTable.c if fully aware of > MERGE now, so it's not a problem. > So I wrote a patch doing that, and added some isolation tests confirming that it fixes the reported issue. Unfortunately, I think this exposes a different issue -- if the target table has a BEFORE ROW trigger, and the target tuple is concurrently updated, the trigger code will lock the updated tuple, and the merge code won't see it as a concurrent update, and so it won't re-check the WHEN condition. That leads to some quite surprising results, as shown in the new isolation test cases. This makes me wonder if perhaps the merge code ought to lock the target tuple before checking WHEN conditions, in the same way that the trigger code does before checking trigger WHEN clauses. If it did that, the trigger code would never have to re-compute the new tuple, so this change to ExecBRUpdateTriggers() wouldn't be necessary. It would also probably resolve the issue with cross-partition updates and concurrently deleted tuples over on [1] and I think the requirement for cpUpdateRetrySlot would go away. That seems like a pretty invasive set of changes though. [1] https://www.postgresql.org/message-id/20230215194224.egfyt3j5ewy4c6m3%40alvherre.pgsql Regards, Dean
Attachment
Re: BUG #17809: MERGE ... UPDATE fails with BEFORE ROW UPDATE trigger when target row updated concurrently
From
Dean Rasheed
Date:
On Thu, 9 Mar 2023 at 13:11, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > So I wrote a patch doing that, and added some isolation tests > confirming that it fixes the reported issue. > > Unfortunately, I think this exposes a different issue -- if the target > table has a BEFORE ROW trigger, and the target tuple is concurrently > updated, the trigger code will lock the updated tuple, and the merge > code won't see it as a concurrent update, and so it won't re-check the > WHEN condition. That leads to some quite surprising results, as shown > in the new isolation test cases. > I have been thinking about this some more, and I added a bunch of additional isolation tests to test MERGE UPDATE/DELETE against concurrent UPDATE/DELETE for normal tables, tables with BEFORE ROW triggers, and partitioned tables where the MERGE does a cross-partition update. Various of those tests currently fail, either with the crash reported in this thread, or because they execute the wrong merge action, or no action at all. To fix this, it's necessary to pass the TM_Result and TM_FailureData results from the trigger code's attempt to lock the old tuple, and from the cross-partition-update code's attempt to delete the old tuple, up to the merge code. This allows the merge code to handle those cases in the same way as it does for non-partitioned tables without triggers. Overall, this leads to a few simplifications in nodeModifyTable.c: 1). ModifyTableContext->GetUpdateNewTuple() is now never called for MERGE, so it and mergeGetUpdateNewTuple() can be deleted, and ExecGetUpdateNewTuple() can be restored to how it was in v14. 2). ModifyTableContext->cpUpdateRetrySlot is no longer used by MERGE, so it can also be deleted, and ExecCrossPartitionUpdate() can just return retry_slot directly to ExecUpdateAct(), as it used to do. 3). ExecMergeMatched() becomes a bit simpler, since it no longer needs special-case code for cross-partition updates. Overall, this removes around 65 lines from nodeModifyTable.c, but more importantly, it seems to fix the concurrent update issues. Attached is an updated patch that I'm more happy with now, and a slightly modified one for v15, keeping the trigger API backwards-compatible for extensions. Regards, Dean
Attachment
Re: BUG #17809: MERGE ... UPDATE fails with BEFORE ROW UPDATE trigger when target row updated concurrently
From
Dean Rasheed
Date:
On Sat, 11 Mar 2023 at 18:12, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > Attached is an updated patch that I'm more happy with now, and a > slightly modified one for v15, keeping the trigger API > backwards-compatible for extensions. > Going over this again, I spotted another couple of things: Passing recheckIndexes to ExecUpdateEpilogue() is pointless, since that's something that's only ever computed and used locally within ExecUpdateEpilogue(). It's also pointless for the caller to list_free() it, since the caller only has a pointer to the empty list, not the list built in ExecUpdateEpilogue(), which it should free. This is pretty harmless, but I might as well tidy it up while hacking on this. If a BEFORE ROW trigger returns NULL to skip an update or delete, ExecMergeMatched() will still update estate->es_processed, and so the final row count in the command tag will be wrong. I'll tackle that in a follow-on patch, since it's really a separate bug. Regards, Dean
Attachment
Re: BUG #17809: MERGE ... UPDATE fails with BEFORE ROW UPDATE trigger when target row updated concurrently
From
Dean Rasheed
Date:
On Sun, 12 Mar 2023 at 11:14, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > On Sat, 11 Mar 2023 at 18:12, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > > > Attached is an updated patch that I'm more happy with now, and a > > slightly modified one for v15, keeping the trigger API > > backwards-compatible for extensions. > > If a BEFORE ROW trigger returns NULL to skip an update or delete, > ExecMergeMatched() will still update estate->es_processed, and so the > final row count in the command tag will be wrong. I'll tackle that in > a follow-on patch, since it's really a separate bug. > OK, I have pushed fixes for both those issues. Regards, Dean