Thread: trying to make sense of deadlocks

trying to make sense of deadlocks

From
Richard Yen
Date:
Hi,

I'm trying to make sense of a situation I ran into this morning.
Apparently, there were numerous deadlocks (approx. 75 in a 30-min
period) while procs were trying to write to a table (16634, "account")
in my database.

Just to give you a sense of what's going on, process 22583 tried to do
an insert, followed by an update, and the insert took 225925.724 ms:

Feb  9 06:50:11 tii-db2.oak postgres[22583]: [8-1] 2009-02-09
06:50:11.785 PST [user=www_master,db=tii 192.168.1.172(46203) PID:
22583 XID:554622832]LOG:  duration: 225925.724 ms
Feb  9 06:50:11 tii-db2.oak postgres[22583]: [8-2]  statement: INSERT
INTO r_student_account (account, student) VALUES (39271, 15535213)

Unfortunately, the update never finished:
Feb  9 07:11:41 tii-db2.oak postgres[22346]: [9-1] 2009-02-09
07:11:41.888 PST [user=www_master,db=tii 192.168.1.172(46059) PID:
22346 XID:554621636]ERROR:  deadlock detected
Feb  9 07:11:41 tii-db2.oak postgres[22346]: [9-2] 2009-02-09
07:11:41.888 PST [user=www_master,db=tii 192.168.1.172(46059) PID:
22346 XID:554621636]DETAIL:  Process 22346 waits
Feb  9 07:11:41 tii-db2.oak postgres[22346]: [9-3]  for ShareLock on
transaction 554622832; blocked by process 22583.
Feb  9 07:11:41 tii-db2.oak postgres[22346]: [9-4]   Process 22583
waits for ExclusiveLock on tuple (3,60) of relation 16634 of database
16384; blocked by process 22346.
Feb  9 07:11:41 tii-db2.oak postgres[22346]: [9-5] 2009-02-09
07:11:41.888 PST [user=www_master,db=tii 192.168.1.172(46059) PID:
22346 XID:554621636]STATEMENT:  UPDATE account SET
Feb  9 07:11:41 tii-db2.oak postgres[22346]: [9-6]  student_count =
student_count + 1 WHERE id=39271
...snip...
Feb  9 07:12:11 tii-db2.oak postgres[20654]: [9-1] 2009-02-09
07:12:11.890 PST [user=www_master,db=tii 192.168.1.233(49622) PID:
20654 XID:554623388]ERROR:  deadlock detected
Feb  9 07:12:11 tii-db2.oak postgres[20654]: [9-2] 2009-02-09
07:12:11.890 PST [user=www_master,db=tii 192.168.1.233(49622) PID:
20654 XID:554623388]DETAIL:  Process 20654 waits
Feb  9 07:12:11 tii-db2.oak postgres[20654]: [9-3]  for ShareLock on
transaction 554622832; blocked by process 22583.
Feb  9 07:12:11 tii-db2.oak postgres[20654]: [9-4]   Process 22583
waits for ExclusiveLock on tuple (3,60) of relation 16634 of database
16384; blocked by process 20654.
Feb  9 07:12:11 tii-db2.oak postgres[20654]: [9-5] 2009-02-09
07:12:11.890 PST [user=www_master,db=tii 192.168.1.233(49622) PID:
20654 XID:554623388]STATEMENT:  UPDATE account SET
Feb  9 07:12:11 tii-db2.oak postgres[20654]: [9-6]  student_count =
student_count + 1 WHERE id=39271
...snip...
Feb  9 07:12:41 tii-db2.oak postgres[23699]: [9-1] 2009-02-09
07:12:41.891 PST [user=www_master,db=tii 192.168.1.172(46559) PID:
23699 XID:554631713]ERROR:  deadlock detected
Feb  9 07:12:41 tii-db2.oak postgres[23699]: [9-2] 2009-02-09
07:12:41.891 PST [user=www_master,db=tii 192.168.1.172(46559) PID:
23699 XID:554631713]DETAIL:  Process 23699 waits
Feb  9 07:12:41 tii-db2.oak postgres[23699]: [9-3]  for ShareLock on
transaction 554622832; blocked by process 22583.
Feb  9 07:12:41 tii-db2.oak postgres[23699]: [9-4]   Process 22583
waits for ExclusiveLock on tuple (3,60) of relation 16634 of database
16384; blocked by process 23699.
Feb  9 07:12:41 tii-db2.oak postgres[23699]: [9-5] 2009-02-09
07:12:41.891 PST [user=www_master,db=tii 192.168.1.172(46559) PID:
23699 XID:554631713]STATEMENT:  UPDATE account SET
Feb  9 07:12:41 tii-db2.oak postgres[23699]: [9-6]  student_count =
student_count + 1 WHERE id=39271
...snip, more deadlocks, etc...
Feb  9 07:16:22 tii-db2.oak postgres[22583]: [9-1] 2009-02-09
07:16:22.716 PST [user=www_master,db=tii 192.168.1.172(46203) PID:
22583 XID:554622832]FATAL:  terminating connection
Feb  9 07:16:22 tii-db2.oak postgres[22583]: [9-2]  due to
administrator command
Feb  9 07:16:22 tii-db2.oak postgres[22583]: [9-3] 2009-02-09
07:16:22.716 PST [user=www_master,db=tii 192.168.1.172(46203) PID:
22583 XID:554622832]STATEMENT:  UPDATE account SET
Feb  9 07:16:22 tii-db2.oak postgres[22583]: [9-4]  student_count =
student_count + 1 WHERE id=39271

It seems like all the deadlocks are for tuple (3,60), but strangely,
tuple (3,60) on the account table doesn't exist.  Perhaps it was
deleted?  According to the account table, the account with id = 39271
(which the UPDATE statements call for) corresponds to tuple (3,15):

tii=# select ctid,id from account where id = 39271;
   ctid  |  id
--------+-------
  (3,15) | 39271
(1 row)

Other procs failed with deadlocks against other procs, but on the same
table/tuple combination (relation 16634, tuple (3,60)).  Why would
there be deadlock on something that's not going to be written to?
Also, my deadlock_timeout is set to 30sec.

Might someone be able to help me make more sense of this?

Thanks for your time,
--Richard

Re: trying to make sense of deadlocks

From
"Adam Rich"
Date:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Richard Yen
> Sent: Monday, February 09, 2009 4:18 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] trying to make sense of deadlocks
>
> Hi,
>
> I'm trying to make sense of a situation I ran into this morning.
> Apparently, there were numerous deadlocks (approx. 75 in a 30-min
> period) while procs were trying to write to a table (16634, "account")
> in my database.
>
> Just to give you a sense of what's going on, process 22583 tried to do
> an insert, followed by an update, and the insert took 225925.724 ms:
>
>
> Might someone be able to help me make more sense of this?
>

The two processes that are conflicting, can you list everything they do
since the beginning of the transaction until the deadlock?






Re: trying to make sense of deadlocks

From
Tom Lane
Date:
Richard Yen <dba@richyen.com> writes:
> It seems like all the deadlocks are for tuple (3,60), but strangely,
> tuple (3,60) on the account table doesn't exist.  Perhaps it was
> deleted?  According to the account table, the account with id = 39271
> (which the UPDATE statements call for) corresponds to tuple (3,15):

Is this PG 8.3?

What I'm guessing is happening is that (3,60) is where the updated
version of (3,15) gets put.  It never becomes visible to you because
the updating transaction never manages to commit due to the deadlock.
It would be possible/likely for the same CTID to be assigned over and
over in repeated trials if you're using 8.3 --- in prior versions this
theory gets a lot weaker because a vacuum pass would have to occur to
clean out the failed update tuple.

In any case what you seem to be looking at is updates of the same set
of two or more rows, but done in different orders by different
transactions.

            regards, tom lane

Re: trying to make sense of deadlocks

From
Richard Yen
Date:
在 Feb 9, 2009 8:52 PM 時, Tom Lane 寫到:

> Richard Yen <dba@richyen.com> writes:
>> It seems like all the deadlocks are for tuple (3,60), but strangely,
>> tuple (3,60) on the account table doesn't exist.  Perhaps it was
>> deleted?  According to the account table, the account with id = 39271
>> (which the UPDATE statements call for) corresponds to tuple (3,15):
>
> Is this PG 8.3?
>>
Yes, I'm using 8.3.5

> What I'm guessing is happening is that (3,60) is where the updated
> version of (3,15) gets put.  It never becomes visible to you because
> the updating transaction never manages to commit due to the deadlock.
> It would be possible/likely for the same CTID to be assigned over and
> over in repeated trials if you're using 8.3 --- in prior versions this
> theory gets a lot weaker because a vacuum pass would have to occur to
> clean out the failed update tuple.
That makes sense.  I should've thought of that earlier.

> In any case what you seem to be looking at is updates of the same set
> of two or more rows, but done in different orders by different
> transactions.
Thanks for the tip!

--Richard