Thread: upgrades in row-level locks can deadlock

upgrades in row-level locks can deadlock

From

Oleksii Kliukin

Date:

22 May 2019, 15:27:15

Hello,

I have recently observed a deadlock on one of our production servers related
to locking only a single row in a job table. There were two functions involved
in the deadlock, the first one acquires a “for key share” lock on the row that
represents the job it works on and subsequently updates it with the job’s end
time (we need multiple jobs to be operating on a single row concurrently,
that’s why there is a "for key share" lock). The other function starts by
acquiring the “for update” lock on the job row and then performs actions that
should not be run in parallel with other jobs.

The deadlock can be easily reproduced with the following statements. The
queries run against a table job (id integer primary key, name text) with a
single row of (1,'a'))

X1: select id from job where name = 'a' for key share;
Y: select id from job where name = 'a' for update; -- starts waiting for X1
X2: select id from job where name = 'a' for key share;
X1: update job set name = 'b' where id = 1;
X2: update job set name = 'c' where id = 1; -- starts waiting for X1
X1: rollback;

At this point, Y is terminated by the deadlock detector:

"deadlock detected",
Process 53937 waits for ShareLock on transaction 488; blocked by process 53953.
Process 53953 waits for ExclusiveLock on tuple (0,1) of relation 16386 of database 12931;
blocked by process 53937.
Process 53937: select id from job where name = 'a' for update;
Process 53953: update job set name = 'c' where id = 1;",

The deadlock is between X2 and Y. Y waits for X2 to finish, as X2 holds a "key
share" lock, incompatible with "for update" that Y attempts to acquire. On the
other hand, X2 needs to acquire the row lock to perform the update, and that
is a two-phase process: first, get the tuple lock and then wait for
conflicting transactions to finish, releasing the tuple lock afterward. X2
tries to acquire the tuple lock, but it is owned by Y. PostgreSQL detects the
deadlock and terminates Y.

Such a deadlock only occurs when three or more sessions locking the same row
are present and the lock is upgraded in at least one session. With only two
sessions the upgrade does not go through the lock manager, as there are no
conflicts with locks stored in the tuple.

That gave me an idea on how to change PostgreSQL to avoid deadlocking under
the condition above. When detecting the lock upgrade from the multixact, we
can avoid acquiring the tuple lock; however, we should still wait for the
mutlixact members that hold conflicting locks, to avoid acquiring incompatible
ones.

The patch is attached. I had to tweak heap_update and heap_delete alongside
the heap_lock_tuple, as they acquire row locks as well. I am not very happy
with overloading DoesMultiXactIdConflict with another function to check if
current transaction id is among the multixact members, perhaps it is worth to
have a separate function for this. We can figure this out if we agree this is
the problem that needs to be solved and on the solution. The other possible
objection is related to the statement from README.tuplock that we need to go
through the lock manager to avoid starving waiting exclusive-lockers. Since
this patch omits the tuple lock only when the lock upgrade happens, it does
limit the starvation condition to the cases when the lock compatible with the
one the waiting process asks for is acquired first and then gets upgraded to
the incompatible one. Since under present conditions the same operation will
likely deadlock and cancel the exclusive waiter altogether, I don't see this
as a strong objection.

Cheers,
Oleksii

Attachment

deadlock_on_row_lock_upgrades_v1.diff

Re: upgrades in row-level locks can deadlock

From

Alvaro Herrera

Date:

11 June 2019, 20:40:25

Hello

On 2019-May-22, Oleksii Kliukin wrote:

> I have recently observed a deadlock on one of our production servers related
> to locking only a single row in a job table. There were two functions involved
> in the deadlock, the first one acquires a “for key share” lock on the row that
> represents the job it works on and subsequently updates it with the job’s end
> time (we need multiple jobs to be operating on a single row concurrently,
> that’s why there is a "for key share" lock). The other function starts by
> acquiring the “for update” lock on the job row and then performs actions that
> should not be run in parallel with other jobs.
> 
> The deadlock can be easily reproduced with the following statements. The
> queries run against a table job (id integer primary key, name text) with a
> single row of (1,'a'))

Hmm, great find.

> X1: select id from job where name = 'a' for key share;
> Y: select id from job where name = 'a' for update; -- starts waiting for X1
> X2: select id from job where name = 'a' for key share;
> X1: update job set name = 'b' where id = 1;
> X2: update job set name = 'c' where id = 1; -- starts waiting for X1
> X1: rollback;
> 
> At this point, Y is terminated by the deadlock detector:

Yeah, this seems undesirable in general terms.  Here's a quick
isolationtester spec that reproduces the problem:

setup {
    drop table if exists tlu_job;
    create table tlu_job (id integer primary key, name text);
    insert into tlu_job values (1, 'a');
}

teardown {
    drop table tlu_job;
}

session "s1"
setup                { begin; }
step "s1_keyshare"    { select id from tlu_job where name = 'a' for key share; }
step "s1_update"    { update tlu_job set name = 'b' where id = 1; }
step "s1_rollback"    { rollback; }

session "s2"
setup                { begin; }
step "s2_keyshare"    { select id from tlu_job where name = 'a' for key share; }
step "s2_update"    { update tlu_job set name = 'c' where id = 1; }
step "s2_commit"    { commit; }

session "s3"
setup                { begin; }
step "s3_forupd"    { select id from tlu_job where name = 'a' for update; }
teardown            { commit; }

# Alexey's permutation
permutation "s1_keyshare" "s3_forupd" "s2_keyshare" "s1_update" "s2_update" "s1_rollback" "s2_commit"

(X1 is s1, X2 is s2, Y is s3).

Permutations such as that one report a deadlock with the original code,
and does not report a deadlock after your proposed patch.

permutation "s1_keyshare" "s1_update" "s2_keyshare" "s3_forupd" "s2_update" "s1_rollback" "s2_commit"

But semantically, I wonder if your transactions are correct.  If you
intend to modify the row in s1 and s2, shouldn't you be acquiring FOR NO
KEY UPDATE lock instead?  I don't see how can s1 and s2 coexist
peacefully.  Also, can your Y transaction use FOR NO KEY UPDATE instead
.. unless you intend to delete the tuple in that transaction?


I'm mulling over your proposed fix.  I don't much like the idea that
DoesMultiXactIdConflict() returns that new boolean -- seems pretty
ad-hoc -- but I don't see any way to do better than that ...  (If we get
down to details, DoesMultiXactIdConflict needn't initialize that
boolean: better let the callers do.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: upgrades in row-level locks can deadlock

From

Oleksii Kliukin

Date:

12 June 2019, 15:14:13

Oleksii Kliukin <alexk@hintbits.com> wrote:

> Hi,
>
> Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
>> On 2019-May-22, Oleksii Kliukin wrote:
>>
>>> X1: select id from job where name = 'a' for key share;
>>> Y: select id from job where name = 'a' for update; -- starts waiting for X1
>>> X2: select id from job where name = 'a' for key share;
>>> X1: update job set name = 'b' where id = 1;
>>> X2: update job set name = 'c' where id = 1; -- starts waiting for X1
>>> X1: rollback;
>>>
>>> At this point, Y is terminated by the deadlock detector:
>>
>> Yeah, this seems undesirable in general terms.  Here's a quick
>> isolationtester spec that reproduces the problem:
>>
>> setup {
>>     drop table if exists tlu_job;
>>     create table tlu_job (id integer primary key, name text);
>>     insert into tlu_job values (1, 'a');
>> }
>>
>> teardown {
>>     drop table tlu_job;
>> }
>>
>> session "s1"
>> setup                { begin; }
>> step "s1_keyshare"    { select id from tlu_job where name = 'a' for key share; }
>> step "s1_update"    { update tlu_job set name = 'b' where id = 1; }
>> step "s1_rollback"    { rollback; }
>>
>> session "s2"
>> setup                { begin; }
>> step "s2_keyshare"    { select id from tlu_job where name = 'a' for key share; }
>> step "s2_update"    { update tlu_job set name = 'c' where id = 1; }
>> step "s2_commit"    { commit; }
>>
>> session "s3"
>> setup                { begin; }
>> step "s3_forupd"    { select id from tlu_job where name = 'a' for update; }
>> teardown            { commit; }
>
> Thank you! I can make it even simpler;  s1 just acquires for share lock, s3
> gets for update one and s2 takes for share lock first, and then tries to
> acquire for update one; once s1 finishes, s3 deadlocks.
>
>> But semantically, I wonder if your transactions are correct.  If you
>> intend to modify the row in s1 and s2, shouldn't you be acquiring FOR NO
>> KEY UPDATE lock instead?  I don't see how can s1 and s2 coexist
>> peacefully.  Also, can your Y transaction use FOR NO KEY UPDATE instead
>> .. unless you intend to delete the tuple in that transaction?
>
> It is correct. I wanted to make sure jobs that acquire for key share lock
> can run concurrently most of the time; they only execute one update at the
> end of the job, and it is just to update the last run timestamp.
>
>> I'm mulling over your proposed fix.  I don't much like the idea that
>> DoesMultiXactIdConflict() returns that new boolean -- seems pretty
>> ad-hoc -- but I don't see any way to do better than that ...  (If we get
>> down to details, DoesMultiXactIdConflict needn't initialize that
>> boolean: better let the callers do.)
>
> I am also not happy about the new parameter to DoesMultiXactIdConflict, but
> calling a separate function to fetch the presence of the current transaction
> in the multixact would mean doing the job of DoesMultiXactIdConflict twice.
> I am open to suggestions on how to make it nicer.
>
> Attached is a slightly modified patch that avoids initializing
> has_current_xid inside DoesMultiXactIdConflict and should apply cleanly to
> the current master.

And here is the v3 that also includes the isolation test I described above
(quoting my previous message in full as I accidentally sent it off-list,
sorry about that).

Cheers,
Oleksii

Attachment

deadlock_on_row_lock_upgrades_v3.diff

Re: upgrades in row-level locks can deadlock

From

Alvaro Herrera

Date:

12 June 2019, 16:46:49

On 2019-Jun-12, Oleksii Kliukin wrote:

> Thank you! I can make it even simpler;  s1 just acquires for share lock, s3
> gets for update one and s2 takes for share lock first, and then tries to
> acquire for update one; once s1 finishes, s3 deadlocks.

Cool.  I think it would be worthwhile to include a number of reasonable
permutations instead of just one, and make sure they all work correctly.
I don't think we need to include all possible permutations, just a few.
I think we need at least enough permutations to cover the two places of
the code that are modified by the patch, for both values of
have_current_xid (so there should be four permutations, I think).

Please don't simplify the table name to just "t" -- the reason I used
another name is that we want these tests to be able to run concurrently
at some point; ref.
https://postgr.es/m/20180124231006.z7spaz5gkzbdvob5@alvherre.pgsql

> > But semantically, I wonder if your transactions are correct.  If you
> > intend to modify the row in s1 and s2, shouldn't you be acquiring FOR NO
> > KEY UPDATE lock instead?  I don't see how can s1 and s2 coexist
> > peacefully.  Also, can your Y transaction use FOR NO KEY UPDATE instead
> > .. unless you intend to delete the tuple in that transaction?
> 
> It is correct. I wanted to make sure jobs that acquire for key share lock
> can run concurrently most of the time; they only execute one update at the
> end of the job, and it is just to update the last run timestamp.

I see.  Under READ COMMITTED it works okay, I suppose.

> > I'm mulling over your proposed fix.  I don't much like the idea that
> > DoesMultiXactIdConflict() returns that new boolean -- seems pretty
> > ad-hoc -- but I don't see any way to do better than that ...  (If we get
> > down to details, DoesMultiXactIdConflict needn't initialize that
> > boolean: better let the callers do.)
> 
> I am also not happy about the new parameter to DoesMultiXactIdConflict, but
> calling a separate function to fetch the presence of the current transaction
> in the multixact would mean doing the job of DoesMultiXactIdConflict twice.
> I am open to suggestions on how to make it nicer.

Yeah, I didn't find anything better either.  We could make things more
complex that we resolve the multixact once and then extract the two
sepraate bits of information that we need from that ... but it ends up
being uglier and messier for no real gain.  So let's go with your
original idea.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: upgrades in row-level locks can deadlock

From

Oleksii Kliukin

Date:

13 June 2019, 13:42:53

Hello,

Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> On 2019-Jun-12, Oleksii Kliukin wrote:
>
>> Thank you! I can make it even simpler;  s1 just acquires for share lock, s3
>> gets for update one and s2 takes for share lock first, and then tries to
>> acquire for update one; once s1 finishes, s3 deadlocks.
>
> Cool.  I think it would be worthwhile to include a number of reasonable
> permutations instead of just one, and make sure they all work correctly.
> I don't think we need to include all possible permutations, just a few.
> I think we need at least enough permutations to cover the two places of
> the code that are modified by the patch, for both values of
> have_current_xid (so there should be four permutations, I think).

Makes sense. For the symmetry I have included those that perform lock
upgrades in one session and those that doesn’t, while the other sessions
acquire locks, do updates or deletes. For those that don’t upgrade locks the
test checks that the locks are acquired in the correct order.

> Please don't simplify the table name to just "t" -- the reason I used
> another name is that we want these tests to be able to run concurrently
> at some point; ref.
> https://postgr.es/m/20180124231006.z7spaz5gkzbdvob5@alvherre.pgsql

Alright, thanks.

>
>>> But semantically, I wonder if your transactions are correct.  If you
>>> intend to modify the row in s1 and s2, shouldn't you be acquiring FOR NO
>>> KEY UPDATE lock instead?  I don't see how can s1 and s2 coexist
>>> peacefully.  Also, can your Y transaction use FOR NO KEY UPDATE instead
>>> .. unless you intend to delete the tuple in that transaction?
>>
>> It is correct. I wanted to make sure jobs that acquire for key share lock
>> can run concurrently most of the time; they only execute one update at the
>> end of the job, and it is just to update the last run timestamp.
>
> I see.  Under READ COMMITTED it works okay, I suppose.
>
>>> I'm mulling over your proposed fix.  I don't much like the idea that
>>> DoesMultiXactIdConflict() returns that new boolean -- seems pretty
>>> ad-hoc -- but I don't see any way to do better than that ...  (If we get
>>> down to details, DoesMultiXactIdConflict needn't initialize that
>>> boolean: better let the callers do.)
>>
>> I am also not happy about the new parameter to DoesMultiXactIdConflict, but
>> calling a separate function to fetch the presence of the current transaction
>> in the multixact would mean doing the job of DoesMultiXactIdConflict twice.
>> I am open to suggestions on how to make it nicer.
>
> Yeah, I didn't find anything better either.  We could make things more
> complex that we resolve the multixact once and then extract the two
> sepraate bits of information that we need from that ... but it ends up
> being uglier and messier for no real gain.  So let's go with your
> original idea.

Ok, the v4 is attached. I have addressed your suggestion for the isolation
tests, added a paragraph to README.tuplock explaining why do we skip
LockTuple to avoid a deadlock in the session that upgrades its lock.

Cheers,
Oleksii

Attachment

deadlock_on_row_lock_upgrades_v4.diff

Re: upgrades in row-level locks can deadlock

From

Robert Haas

Date:

13 June 2019, 17:32:49

On Wed, Jun 12, 2019 at 12:47 PM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Please don't simplify the table name to just "t" -- the reason I used
> another name is that we want these tests to be able to run concurrently
> at some point; ref.
> https://postgr.es/m/20180124231006.z7spaz5gkzbdvob5@alvherre.pgsql

Not only that, but 't' is completely ungreppable.  If you name the
table 'walrus' and five years from now somebody sees an error about it
in some buildfarm log or whatever, they can type 'git grep walrus' to
find the test, and they'll probably only get that one hit.  If you
name it 't', well...

[rhaas pgsql]$ git grep t | wc -l
 1653468

Not very helpful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: upgrades in row-level locks can deadlock

From

Alvaro Herrera

Date:

13 June 2019, 18:00:14

On 2019-Jun-13, Oleksii Kliukin wrote:

> Makes sense. For the symmetry I have included those that perform lock
> upgrades in one session and those that doesn’t, while the other sessions
> acquire locks, do updates or deletes. For those that don’t upgrade locks the
> test checks that the locks are acquired in the correct order.

Thanks for the updated patch!  I'm about to push to branches 9.6-master.
It applies semi-cleanly (only pgindent-maturity whitespace conflicts).

The [pg11 version of the] patch does applies to 9.5 cleanly ... but the
isolation test doesn't work, because isolationtester was not smart
enough back then.  Since there have been no previous reports of this
problem, and to avoid pushing untested code, I'm going to refrain from
back-patching there.  My guess is that it should work ...

In 9.4 there are quite some severe conflicts, because 27846f02c176 was
not back-patched there.  (The bug number "#8470" still floats in my
memory from time to time.  Shudder)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: upgrades in row-level locks can deadlock

From

Alvaro Herrera

Date:

13 June 2019, 21:37:31

On 2019-Jun-13, Alvaro Herrera wrote:

> On 2019-Jun-13, Oleksii Kliukin wrote:
> 
> > Makes sense. For the symmetry I have included those that perform lock
> > upgrades in one session and those that doesn’t, while the other sessions
> > acquire locks, do updates or deletes. For those that don’t upgrade locks the
> > test checks that the locks are acquired in the correct order.
> 
> Thanks for the updated patch!  I'm about to push to branches 9.6-master.
> It applies semi-cleanly (only pgindent-maturity whitespace conflicts).

Done, thanks for the report and patch!

I tried hard to find a scenario that this patch breaks, but couldn't
find anything.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: upgrades in row-level locks can deadlock

From

Oleksii Kliukin

Date:

14 June 2019, 07:52:18

Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> On 2019-Jun-13, Alvaro Herrera wrote:
>
>> On 2019-Jun-13, Oleksii Kliukin wrote:
>>
>>> Makes sense. For the symmetry I have included those that perform lock
>>> upgrades in one session and those that doesn’t, while the other sessions
>>> acquire locks, do updates or deletes. For those that don’t upgrade locks the
>>> test checks that the locks are acquired in the correct order.
>>
>> Thanks for the updated patch!  I'm about to push to branches 9.6-master.
>> It applies semi-cleanly (only pgindent-maturity whitespace conflicts).
>
> Done, thanks for the report and patch!
>
> I tried hard to find a scenario that this patch breaks, but couldn't
> find anything.

Thank you very much for reviewing and committing it!

Cheers,
Oleksii