Re: Adding a test for speculative insert abort case - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Adding a test for speculative insert abort case |
Date | |
Msg-id | 20200208004046.t4odt6rqulhx6sbf@alap3.anarazel.de Whole thread Raw |
In response to | Re: Adding a test for speculative insert abort case (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Adding a test for speculative insert abort case
|
List | pgsql-hackers |
Hi, I'm currently fighting with a race I'm observing in about 1/4 of the runs. I get: @@ -361,16 +361,18 @@ locktype mode granted speculative tokenShareLock f speculative tokenExclusiveLock t step controller_unlock_2_4: SELECT pg_advisory_unlock(2, 4); pg_advisory_unlock t s1: NOTICE: blurt_and_lock_123() called for k1 in session 1 s1: NOTICE: acquiring advisory lock on 2 +s1: NOTICE: blurt_and_lock_123() called for k1 in session 1 +s1: NOTICE: acquiring advisory lock on 2 step s1_upsert: <... completed> step s2_upsert: <... completed> step controller_show: SELECT * FROM upserttest; key data k1 inserted s2 with conflict update s1 (this is the last permutation) The issue is basically that s1 goes through the check_exclusion_or_unique_constraint() conflict check twice. I added a bit of debugging information (using fprintf, with elog it was much harder to hit): Success: 2020-02-07 16:14:56.501 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167003: acquiring xact lock 2020-02-07 16:14:56.512 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.522 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 1167002: releasing xact lock 2 3 1167004: acquired xact lock 1167004: xid 8466 acquiring 5 2020-02-07 16:14:56.523 PST [1167004][6/1014:8466] LOG: INSERT @ 0/9CC70F0: - Heap/INSERT: off 2 flags 0x0C 2020-02-07 16:14:56.523 PST [1167004][6/1014:8466] NOTICE: blurt_and_lock_123() called for k1 in session 2 2020-02-07 16:14:56.523 PST [1167004][6/1014:8466] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:14:56.523 PST [1167004][6/1014:8466] NOTICE: acquiring advisory lock on 2 2020-02-07 16:14:56.523 PST [1167004][6/1014:8466] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167004: acquiring xact lock 2020-02-07 16:14:56.533 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.544 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.555 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.565 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.576 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.587 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 1167002: releasing xact lock 2 2 1167004: acquired xact lock 2020-02-07 16:14:56.588 PST [1167004][6/1014:8466] LOG: INSERT @ 0/9CC7150: - Btree/NEWROOT: lev 0 2020-02-07 16:14:56.588 PST [1167004][6/1014:8466] LOG: INSERT @ 0/9CC7190: - Btree/INSERT_LEAF: off 1 2020-02-07 16:14:56.588 PST [1167004][6/1014:8466] NOTICE: blurt_and_lock_4() called for k1 in session 2 2020-02-07 16:14:56.588 PST [1167004][6/1014:8466] CONTEXT: PL/pgSQL function blurt_and_lock_4(text) line 3 at RAISE 2020-02-07 16:14:56.588 PST [1167004][6/1014:8466] NOTICE: acquiring advisory lock on 4 2020-02-07 16:14:56.588 PST [1167004][6/1014:8466] CONTEXT: PL/pgSQL function blurt_and_lock_4(text) line 4 at RAISE 1167004: acquiring xact lock 2020-02-07 16:14:56.598 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.609 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.620 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.630 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 1167002: releasing xact lock 1 2 1167003: acquired xact lock 2020-02-07 16:14:56.631 PST [1167003][5/1254:8465] LOG: INSERT @ 0/9CC71D0: - Btree/INSERT_LEAF: off 1 2020-02-07 16:14:56.631 PST [1167003][5/1254:8465] NOTICE: blurt_and_lock_4() called for k1 in session 1 2020-02-07 16:14:56.631 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_4(text) line 3 at RAISE 2020-02-07 16:14:56.631 PST [1167003][5/1254:8465] NOTICE: acquiring advisory lock on 4 2020-02-07 16:14:56.631 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_4(text) line 4 at RAISE 1167003: acquiring xact lock 1167003: acquired xact lock 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] LOG: INSERT @ 0/9CC7230: - Btree/NEWROOT: lev 0 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] LOG: INSERT @ 0/9CC7270: - Btree/INSERT_LEAF: off 1 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] LOG: INSERT @ 0/9CC72A8: - Heap/DELETE: off 1 flags 0x08 1167003: xid 8465 releasing lock 5 1167003: retry due to conflict 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] NOTICE: blurt_and_lock_123() called for k1 in session 1 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] NOTICE: acquiring advisory lock on 2 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167003: acquiring xact lock 1167003: acquired xact lock 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] NOTICE: blurt_and_lock_123() called for k1 in session 1 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] NOTICE: acquiring advisory lock on 2 2020-02-07 16:14:56.632 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167003: acquiring xact lock 1167003: acquired xact lock 1167003: xid 8465 waiting xwait 8466 (xmin 8466 xmax 0) spec 5 2020-02-07 16:14:56.642 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.653 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.667 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:14:56.677 PST [1167001][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 1167002: releasing xact lock 2 4 1167004: acquired xact lock 2020-02-07 16:14:56.678 PST [1167004][6/1014:8466] LOG: INSERT @ 0/9CC72E8: - Btree/INSERT_LEAF: off 2 2020-02-07 16:14:56.678 PST [1167004][6/1014:8466] LOG: INSERT @ 0/9CC7318: - Heap/HEAP_CONFIRM: off 2 1167004: xid 8466 releasing lock 5 2020-02-07 16:14:56.678 PST [1167004][6/1014:8466] LOG: INSERT @ 0/9CC7348: - Transaction/COMMIT: 2020-02-07 16:14:56.678602-08 2020-02-07 16:14:56.678 PST [1167004][6/1014:8466] LOG: xlog flush request 0/9CC7348; write 0/9CBDF58; flush 0/9CBDF58 2020-02-07 16:14:56.678 PST [1167004][6/1014:8466] STATEMENT: INSERT INTO upserttest(key, data) VALUES('k1', 'inserted s2')ON CONFLICT (blurt_and_lock_123(key)) DO UPDATE SET data = upserttest.data || ' with conflict update s2'; 2020-02-07 16:14:56.678 PST [1167003][5/1254:8465] NOTICE: blurt_and_lock_123() called for k1 in session 1 2020-02-07 16:14:56.678 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:14:56.678 PST [1167003][5/1254:8465] NOTICE: acquiring advisory lock on 2 2020-02-07 16:14:56.678 PST [1167003][5/1254:8465] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167003: acquiring xact lock 1167003: acquired xact lock 2020-02-07 16:14:56.678 PST [1167003][5/1254:8465] LOG: INSERT @ 0/9CC7380: - Heap/LOCK: off 2: xid 8465: flags 0x00 LOCK_ONLYEXCL_LOCK 2020-02-07 16:14:56.679 PST [1167003][5/1254:8465] LOG: INSERT @ 0/9CC73F0: - Heap/HOT_UPDATE: off 2 xmax 8465 flags 0x10; new off 3 xmax 8465 2020-02-07 16:14:56.679 PST [1167003][5/1254:8465] LOG: INSERT @ 0/9CC7420: - Transaction/COMMIT: 2020-02-07 16:14:56.679085-08 2020-02-07 16:14:56.679 PST [1167003][5/1254:8465] LOG: xlog flush request 0/9CC7420; write 0/9CC7060; flush 0/9CC7060 2020-02-07 16:14:56.679 PST [1167003][5/1254:8465] STATEMENT: INSERT INTO upserttest(key, data) VALUES('k1', 'inserted s1')ON CONFLICT (blurt_and_lock_123(key)) DO UPDATE SET data = upserttest.data || ' with conflict update s1'; fail: 1167056: releasing xact lock 2 3 1167058: acquired xact lock 1167058: xid 8490 acquiring 5 2020-02-07 16:16:43.990 PST [1167058][6/11:8490] LOG: INSERT @ 0/9D2D1D8: - Heap/INSERT: off 2 flags 0x0C 2020-02-07 16:16:43.990 PST [1167058][6/11:8490] NOTICE: blurt_and_lock_123() called for k1 in session 2 2020-02-07 16:16:43.990 PST [1167058][6/11:8490] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:16:43.990 PST [1167058][6/11:8490] NOTICE: acquiring advisory lock on 2 2020-02-07 16:16:43.990 PST [1167058][6/11:8490] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167058: acquiring xact lock 2020-02-07 16:16:44.000 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.011 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.022 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.033 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.044 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.054 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 1167056: releasing xact lock 2 2 1167058: acquired xact lock 2020-02-07 16:16:44.055 PST [1167058][6/11:8490] LOG: INSERT @ 0/9D2D238: - Btree/NEWROOT: lev 0 2020-02-07 16:16:44.055 PST [1167058][6/11:8490] LOG: INSERT @ 0/9D2D278: - Btree/INSERT_LEAF: off 1 2020-02-07 16:16:44.056 PST [1167058][6/11:8490] NOTICE: blurt_and_lock_4() called for k1 in session 2 2020-02-07 16:16:44.056 PST [1167058][6/11:8490] CONTEXT: PL/pgSQL function blurt_and_lock_4(text) line 3 at RAISE 2020-02-07 16:16:44.056 PST [1167058][6/11:8490] NOTICE: acquiring advisory lock on 4 2020-02-07 16:16:44.056 PST [1167058][6/11:8490] CONTEXT: PL/pgSQL function blurt_and_lock_4(text) line 4 at RAISE 1167058: acquiring xact lock 2020-02-07 16:16:44.066 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.076 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.087 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.098 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 1167056: releasing xact lock 1 2 1167057: acquired xact lock 2020-02-07 16:16:44.099 PST [1167057][5/13:8489] LOG: INSERT @ 0/9D2D2B8: - Btree/INSERT_LEAF: off 1 2020-02-07 16:16:44.099 PST [1167057][5/13:8489] NOTICE: blurt_and_lock_4() called for k1 in session 1 2020-02-07 16:16:44.099 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_4(text) line 3 at RAISE 2020-02-07 16:16:44.099 PST [1167057][5/13:8489] NOTICE: acquiring advisory lock on 4 2020-02-07 16:16:44.099 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_4(text) line 4 at RAISE 1167057: acquiring xact lock 1167057: acquired xact lock 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] LOG: INSERT @ 0/9D2D318: - Btree/NEWROOT: lev 0 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] LOG: INSERT @ 0/9D2D358: - Btree/INSERT_LEAF: off 1 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] LOG: INSERT @ 0/9D2D390: - Heap/DELETE: off 1 flags 0x08 1167057: xid 8489 releasing lock 5 1167057: retry due to conflict 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] NOTICE: blurt_and_lock_123() called for k1 in session 1 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] NOTICE: acquiring advisory lock on 2 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167057: acquiring xact lock 1167057: acquired xact lock 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] NOTICE: blurt_and_lock_123() called for k1 in session 1 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] NOTICE: acquiring advisory lock on 2 2020-02-07 16:16:44.100 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167057: acquiring xact lock 1167057: acquired xact lock 1167057: xid 8489 waiting xwait 8490 (xmin 8490 xmax 0) spec 5 2020-02-07 16:16:44.110 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.121 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.135 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 2020-02-07 16:16:44.145 PST [1167055][3/0:0] DEBUG: bind <unnamed> to isolationtester_waiting 1167056: releasing xact lock 2 4 1167058: acquired xact lock 2020-02-07 16:16:44.146 PST [1167058][6/11:8490] LOG: INSERT @ 0/9D2D3D0: - Btree/INSERT_LEAF: off 2 2020-02-07 16:16:44.146 PST [1167058][6/11:8490] LOG: INSERT @ 0/9D2D400: - Heap/HEAP_CONFIRM: off 2 1167058: xid 8490 releasing lock 5 2020-02-07 16:16:44.146 PST [1167058][6/11:8490] LOG: INSERT @ 0/9D2D430: - Transaction/COMMIT: 2020-02-07 16:16:44.146767-08 2020-02-07 16:16:44.146 PST [1167058][6/11:8490] LOG: xlog flush request 0/9D2D430; write 0/9D24058; flush 0/9D24058 2020-02-07 16:16:44.146 PST [1167058][6/11:8490] STATEMENT: INSERT INTO upserttest(key, data) VALUES('k1', 'inserted s2')ON CONFLICT (blurt_and_lock_123(key)) DO UPDATE SET data = upserttest.data || ' with conflict update s2'; 2020-02-07 16:16:44.146 PST [1167057][5/13:8489] NOTICE: blurt_and_lock_123() called for k1 in session 1 2020-02-07 16:16:44.146 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:16:44.146 PST [1167057][5/13:8489] NOTICE: acquiring advisory lock on 2 2020-02-07 16:16:44.146 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167057: acquiring xact lock 1167057: acquired xact lock 1167057: xid 8489 waiting xwait 8490 (xmin 8490 xmax 0) spec 0 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] NOTICE: blurt_and_lock_123() called for k1 in session 1 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 3 at RAISE 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] NOTICE: acquiring advisory lock on 2 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] CONTEXT: PL/pgSQL function blurt_and_lock_123(text) line 7 at RAISE 1167057: acquiring xact lock 1167057: acquired xact lock 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] LOG: INSERT @ 0/9D2D468: - Heap/LOCK: off 2: xid 8489: flags 0x00 LOCK_ONLYEXCL_LOCK 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] LOG: INSERT @ 0/9D2D4D8: - Heap/HOT_UPDATE: off 2 xmax 8489 flags 0x10; new off 3 xmax 8489 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] LOG: INSERT @ 0/9D2D508: - Transaction/COMMIT: 2020-02-07 16:16:44.147348-08 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] LOG: xlog flush request 0/9D2D508; write 0/9D2D148; flush 0/9D2D148 2020-02-07 16:16:44.147 PST [1167057][5/13:8489] STATEMENT: INSERT INTO upserttest(key, data) VALUES('k1', 'inserted s1')ON CONFLICT (blurt_and_lock_123(key)) DO UPDATE SET data = upserttest.data || ' with conflict update s1'; The important bit is here the different "xid .* waiting xwait .* spec*" lines. In the success case we see: 1167003: xid 8465 waiting xwait 8466 (xmin 8466 xmax 0) spec 5 1167002: releasing xact lock 2 4 1167004: acquired xact lock 1167004: xid 8466 releasing lock 5 2020-02-07 16:14:56.678 PST [1167004][6/1014:8466] LOG: INSERT @ 0/9CC7348: - Transaction/COMMIT: 2020-02-07 16:14:56.678602-08 1167003: acquired xact lock In the failing case we see: 1167057: xid 8489 waiting xwait 8490 (xmin 8490 xmax 0) spec 5 1167056: releasing xact lock 2 4 1167058: acquired xact lock 1167058: xid 8490 releasing lock 5 2020-02-07 16:16:44.146 PST [1167058][6/11:8490] LOG: INSERT @ 0/9D2D430: - Transaction/COMMIT: 2020-02-07 16:16:44.146767-08 1167057: xid 8489 waiting xwait 8490 (xmin 8490 xmax 0) spec 0 1167057: acquired xact lock I think the issue here is that determines whether s1 can finish its check_exclusion_or_unique_constraint() check with one retry is whether it reaches it does the tuple visibility test before s2's transaction has actually marked itself as visible (note that ProcArrayEndTransaction is after RecordTransactionCommit logging the COMMIT above). I think the fix is quite easy: Ensure that there *always* will be the second wait iteration on the transaction (in addition to the already always existing wait on the speculative token). Which is just adding s2_begin s2_commit steps. Simple, but took me a few hours to understand :/. I've attached that portion of my changes. Will interrupt scheduled programming for a bit of exercise now. Greetings, Andres Freund
Attachment
pgsql-hackers by date: