Thread: Re: pgsql: Avoid spurious deadlocks when upgrading a tuple lock
On 2019-Jun-14, Tom Lane wrote:

> I wrote:
> >> Hm, I don't get that warning.  Does this patch silence it, please?
>
> > Uh, no patch attached?  But initializing the variable where it's
> > declared would certainly silence it.
>
> BTW, after looking around a bit I wonder if this complaint isn't
> exposing an actual logic bug.  Shouldn't skip_tuple_lock have
> a lifetime similar to first_time?

I think there are worse problems here.  I tried the attached isolation
spec.  Note that the only difference in the two permutations is that s0
finishes earlier in one than the other; yet the first one works fine and
the second one hangs until killed by the 180s timeout.  (s3 isn't
released for a reason I'm not sure I understand.)

I don't think I'm going to have time to investigate this deeply over the
weekend, so I think the safest course of action is to revert this for
next week's set.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Jun-14, Tom Lane wrote:
>> BTW, after looking around a bit I wonder if this complaint isn't
>> exposing an actual logic bug.  Shouldn't skip_tuple_lock have
>> a lifetime similar to first_time?

> I think there are worse problems here.  I tried the attached isolation
> spec.  Note that the only difference in the two permutations is that s0
> finishes earlier in one than the other; yet the first one works fine and
> the second one hangs until killed by the 180s timeout.  (s3 isn't
> released for a reason I'm not sure I understand.)

Ugh.

> I don't think I'm going to have time to investigate this deeply over the
> weekend, so I think the safest course of action is to revert this for
> next week's set.

+1.  This is an old bug, we don't have to improve it for this release.

			regards, tom lane
On 2019-Jun-14, Alvaro Herrera wrote:

> I think there are worse problems here.  I tried the attached isolation
> spec.  Note that the only difference in the two permutations is that s0
> finishes earlier in one than the other; yet the first one works fine and
> the second one hangs until killed by the 180s timeout.  (s3 isn't
> released for a reason I'm not sure I understand.)

Actually, those behaviors both seem correct to me now that I look
closer.  So this was a false alarm.  In the code before de87a084c0, the
first permutation deadlocks, and the second permutation hangs.  The only
behavior change is that the first one no longer deadlocks, which is the
desired change.

I'm still trying to form a case to exercise the case of skip_tuple_lock
having the wrong lifetime.

The fact that both permutations behave differently, even though the
only difference is where s0 commits relative to the s3_share step, is an
artifact of our unusual tuple locking implementation.

-- 
Álvaro Herrera
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> On 2019-Jun-14, Alvaro Herrera wrote:
>
>> I think there are worse problems here.  I tried the attached isolation
>> spec.  Note that the only difference in the two permutations is that s0
>> finishes earlier in one than the other; yet the first one works fine and
>> the second one hangs until killed by the 180s timeout.  (s3 isn't
>> released for a reason I'm not sure I understand.)
>
> Actually, those behaviors both seem correct to me now that I look
> closer.  So this was a false alarm.  In the code before de87a084c0, the
> first permutation deadlocks, and the second permutation hangs.  The only
> behavior change is that the first one no longer deadlocks, which is the
> desired change.
>
> I'm still trying to form a case to exercise the case of skip_tuple_lock
> having the wrong lifetime.

Hm… I think it was an oversight on my part not to give skip_tuple_lock
the same lifetime as have_tuple_lock or first_time (also initializing it
to false at the same time).  Even if it might not break anything in an
obvious way right now, a backward jump to the l3 label will leave
skip_tuple_lock uninitialized, making it very dangerous for any future
code that relies on its value.

> The fact that both permutations behave differently, even though the
> only difference is where s0 commits relative to the s3_share step, is an
> artifact of our unusual tuple locking implementation.

Cheers,
Oleksii
On 2019-Jun-16, Oleksii Kliukin wrote:

> Hm… I think it was an oversight on my part not to give skip_tuple_lock
> the same lifetime as have_tuple_lock or first_time (also initializing
> it to false at the same time).  Even if it might not break anything in
> an obvious way right now, a backward jump to the l3 label will leave
> skip_tuple_lock uninitialized, making it very dangerous for any future
> code that relies on its value.

But that's not the danger ... with the current coding, it's initialized
to false every time through that block; that means the tuple lock will
never be skipped if we jump back to l3.  So the danger is that the first
iteration sets the variable, then jumps back; second iteration
initializes the variable again, so instead of skipping the lock, it
takes it, causing a spurious deadlock.

-- 
Álvaro Herrera
On 2019-Jun-15, Alvaro Herrera wrote:

> But that's not the danger ... with the current coding, it's initialized
> to false every time through that block; that means the tuple lock will
> never be skipped if we jump back to l3.  So the danger is that the first
> iteration sets the variable, then jumps back; second iteration
> initializes the variable again, so instead of skipping the lock, it
> takes it, causing a spurious deadlock.

So, I'm too lazy today to generate a case that fully reproduces the
deadlock, because you need to stall 's2' a little bit using the
well-known advisory lock trick, but this one hits the code that would
re-initialize the variable.  I'm going to push the change of lifetime of
the variable for now.

setup
{
  drop table if exists tlu_job;
  create table tlu_job (id integer primary key, name text);
  insert into tlu_job values(1, 'a');
}

teardown
{
  drop table tlu_job;
}

session "s0"
setup { begin; set deadlock_timeout=1; }
step "s0_fornokeyupdate" { select id from tlu_job where id = 1 for no key update; }
step "s0_update" { update tlu_job set name = 's0' where id = 1; }
step "s0_commit" { commit; }

session "s1"
setup { begin; set deadlock_timeout=1; }
step "s1_for_key_share" { select id from tlu_job where id = 1 for key share; }
step "s1_for_update" { select id from tlu_job where id = 1 for update; }
step "s1_rollback" { rollback; }

session "s2"
setup { begin; set deadlock_timeout=1; }
step "s2_for_key_share" { select id from tlu_job where id = 1 for key share; }
step "s2_for_share" { select id from tlu_job where id = 1 for share; }
step "s2_rollback" { rollback; }

session "s3"
setup { begin; set deadlock_timeout=1; }
step "s3_update" { update tlu_job set name = 'c' where id = 1; }
step "s3_rollback" { rollback; }

permutation "s1_for_key_share" "s2_for_key_share" "s0_fornokeyupdate" "s2_for_share" "s0_update" "s0_commit" "s1_rollback" "s2_rollback" "s3_rollback"

-- 
Álvaro Herrera
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I'm going to push the change of lifetime of the variable for now.

If you're going to push anything before tomorrow's wrap, please do it
*now* not later.  We're running out of time to get a full sample of
buildfarm results.

			regards, tom lane
On 2019-Jun-16, Tom Lane wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > I'm going to push the change of lifetime of the variable for now.
>
> If you're going to push anything before tomorrow's wrap, please do it
> *now* not later.  We're running out of time to get a full sample of
> buildfarm results.

Yeah, I had to bail out earlier today, so the only thing I'm confident
pushing now is a revert.

-- 
Álvaro Herrera
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Jun-16, Tom Lane wrote:
>> If you're going to push anything before tomorrow's wrap, please do it
>> *now* not later.  We're running out of time to get a full sample of
>> buildfarm results.

> Yeah, I had to bail out earlier today, so the only thing I'm confident
> pushing now is a revert.

Yeah, let's do that.  I don't want to risk shipping broken code.
We can try again for the next updates.

			regards, tom lane
On Sun, Jun 16, 2019 at 09:10:13PM -0400, Tom Lane wrote:
> Yeah, let's do that.  I don't want to risk shipping broken code.
> We can try again for the next updates.

Could you revert asap please then?
--
Michael
On 2019-Jun-17, Michael Paquier wrote:

> On Sun, Jun 16, 2019 at 09:10:13PM -0400, Tom Lane wrote:
> > Yeah, let's do that.  I don't want to risk shipping broken code.
> > We can try again for the next updates.
>
> Could you revert asap please then?

Done.  I initially thought to keep the test in place, but then realized
it might be unstable, so I removed that too.

-- 
Álvaro Herrera
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Jun-17, Michael Paquier wrote:
>> Could you revert asap please then?

> Done.

Thanks.

			regards, tom lane
On Sun, Jun 16, 2019 at 10:27:25PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> On 2019-Jun-17, Michael Paquier wrote:
>>> Could you revert asap please then?
>
>> Done.
>
> Thanks.

Thanks, Alvaro.
--
Michael
On 2019-Jun-16, Alvaro Herrera wrote:

> So, I'm too lazy today to generate a case that fully reproduces the
> deadlock, because you need to stall 's2' a little bit using the
> well-known advisory lock trick, but this one hits the code that would
> re-initialize the variable.

Here's such a case.  I was unable to reproduce the condition with a
smaller sequence of commands.  This one does hit the deadlock when used
with the previous code, as expected; with the fixed code (i.e.
skip_tuple_lock in the outer scope and the same lifetime as
"first_time") it works fine, no deadlock.

I'm going to push the fixed commit this afternoon, including this as an
additional permutation in the spec file.

setup
{
  drop table if exists tlu_job;
  create table tlu_job (id integer primary key, name text);
  insert into tlu_job values(1, 'a');
}

teardown
{
  drop table tlu_job;
}

session "s0"
setup { begin; }
step "s0_keyshare" { select id from tlu_job where id = 1 for key share; }
step "s0_share" { select id from tlu_job where id = 1 for share; }
step "s0_rollback" { rollback; }

session "s1"
setup { begin; }
step "s1_keyshare" { select id from tlu_job where id = 1 for key share; }
step "s1_savept_e" { savepoint s1_e; }
step "s1_share" { select id from tlu_job where id = 1 for share; }
step "s1_savept_f" { savepoint s1_f; }
step "s1_fornokeyupd" { select id from tlu_job where id = 1 for no key update; }
step "s1_rollback_f" { rollback to s1_f; }
step "s1_rollback_e" { rollback to s1_e; }
step "s1_rollback" { rollback; }

session "s2"
setup { begin; }
step "s2_keyshare" { select id from tlu_job where id = 1 for key share; }
step "s2_fornokeyupd" { select id from tlu_job where id = 1 for no key update; }
step "s2_rollback" { rollback; }

session "s3"
setup { begin; }
step "s3_for_update" { select id from tlu_job where id = 1 for update; }
step "s3_rollback" { rollback; }

permutation "s1_keyshare" "s3_for_update" "s2_keyshare" "s1_savept_e" "s1_share" "s1_savept_f" "s1_fornokeyupd" "s2_fornokeyupd" "s0_keyshare" "s1_rollback_f" "s0_share" "s1_rollback_e" "s1_rollback" "s2_rollback" "s0_rollback" "s3_rollback"

-- 
Álvaro Herrera
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> But that's not the danger ... with the current coding, it's initialized
> to false every time through that block; that means the tuple lock will
> never be skipped if we jump back to l3.  So the danger is that the first
> iteration sets the variable, then jumps back; second iteration
> initializes the variable again, so instead of skipping the lock, it
> takes it, causing a spurious deadlock.

Sorry, I was confused, as I was looking only at
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=de87a084c0a5ac927017cd0834b33a932651cfc9
without taking your subsequent commit that silences compiler warnings at
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3da73d6839dc47f1f47ca57974bf28e5abd9b572
into consideration.  With that commit, the danger is indeed in resetting
the skip mechanism on each jump and potentially causing deadlocks.

Cheers,
Oleksii
On 2019-Jun-18, Oleksii Kliukin wrote:

> Sorry, I was confused, as I was looking only at
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=de87a084c0a5ac927017cd0834b33a932651cfc9
> without taking your subsequent commit that silences compiler warnings at
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3da73d6839dc47f1f47ca57974bf28e5abd9b572
> into consideration.  With that commit, the danger is indeed in resetting
> the skip mechanism on each jump and potentially causing deadlocks.

Yeah, I understand the confusion.

Anyway, as bugs go, this one seems pretty benign.  It would result in an
unexplained deadlock, very rarely, and only for people who use a very
strange locking pattern that includes (row-level) lock upgrades.  I
think it also requires aborted savepoints, though I don't rule out the
possibility that there might be a way to reproduce this without that.

I pushed the patch again just now, with the new permutation.

-- 
Álvaro Herrera
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> Anyway, as bugs go, this one seems pretty benign.  It would result in a
> unexplained deadlock, very rarely, and only for people who use a very
> strange locking pattern that includes (row-level) lock upgrades.  I
> think it also requires aborted savepoints too, though I don't rule out
> the possibility that there might be a way to reproduce this without
> that.
>
> I pushed the patch again just now, with the new permutation.

Thank you very much for working on it and committing the fix!

Cheers,
Oleksii