Re: pgsql: Avoid spurious deadlocks when upgrading a tuple lock - Mailing list pgsql-hackers

From Oleksii Kliukin
Subject Re: pgsql: Avoid spurious deadlocks when upgrading a tuple lock
Date
Msg-id DB0AA460-7962-4E8D-AC1A-958457DB6563@hintbits.com
Whole thread Raw
In response to Re: pgsql: Avoid spurious deadlocks when upgrading a tuple lock  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: pgsql: Avoid spurious deadlocks when upgrading a tuple lock
List pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> On 2019-Jun-16, Oleksii Kliukin wrote:
>
>> Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>>> On 2019-Jun-14, Alvaro Herrera wrote:
>>>
>>>> I think there are worse problems here.  I tried the attached isolation
>>>> spec.  Note that the only difference in the two permutations is that s0
>>>> finishes earlier in one than the other; yet the first one works fine and
>>>> the second one hangs until killed by the 180s timeout.  (s3 isn't
>>>> released for a reason I'm not sure I understand.)
>>>
>>> Actually, those behaviors both seem correct to me now that I look
>>> closer.  So this was a false alarm.  In the code before de87a084c0, the
>>> first permutation deadlocks, and the second permutation hangs.  The only
>>> behavior change is that the first one no longer deadlocks, which is the
>>> desired change.
>>>
>>> I'm still trying to form a case to exercise the case of skip_tuple_lock
>>> having the wrong lifetime.
>>
>> Hm… I think it was an oversight from my part not to give skip_lock_tuple the
>> same lifetime as have_tuple_lock or first_time (also initializing it to
>> false at the same time). Even if now it might not break anything in an
>> obvious way, a backward jump to l3 label will leave skip_lock_tuple
>> uninitialized, making it very dangerous for any future code that will rely
>> on its value.
>
> But that's not the danger ... with the current coding, it's initialized
> to false every time through that block; that means the tuple lock will
> never be skipped if we jump back to l3.  So the danger is that the first
> iteration sets the variable, then jumps back; second iteration
> initializes the variable again, so instead of skipping the lock, it
> takes it, causing a spurious deadlock.

Sorry, I was confused, as I was looking only at
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=de87a084c0a5ac927017cd0834b33a932651cfc9

without taking your subsequent commit that silences compiler warnings at
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3da73d6839dc47f1f47ca57974bf28e5abd9b572
into consideration. With that commit, the danger is indeed in resetting the
skip mechanism on each jump and potentially causing deadlocks.

Cheers,
Oleksii


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs
Next
From: Pavel Stehule
Date:
Subject: Re: idea: log_statement_sample_rate - bottom limit for sampling