Thread: Deadlock Problem

Deadlock Problem

From
Matthias Schmitt
Date:
Hello,

I am in trouble with more and more deadlock problems. We are
programming a web application with multiple users editing content at
the same time. Multiple times a day PostgreSQL runs into a deadlock,
which can only be resolved by killing some of the clients. Here is an
example from the pg_locks table:

SELECT l.mode, l.granted, l.pid, l.transaction, d.datname, r.relname
FROM pg_locks l, pg_database d, pg_class r WHERE d.oid=l.database AND
r.oid=l.relation;
        mode       | granted |  pid  | transaction |      datname       |         relname
------------------+---------+-------+-------------+--------------------+--------------------------
 AccessShareLock  | t       | 12708 |             | p247_website_1_1_0 | pg_locks
 AccessShareLock  | t       | 12708 |             | p247_website_1_1_0 | pg_class
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | upd_dependency_id
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | preset
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | preset
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | file
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | node_pkey
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | descr_node
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | systemuser_usergroup_rel
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | permgroup_permission_rel
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | usergroup_permgroup_rel
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | account_permission_grant
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | permission
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | account_permgroup_grant
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | environment
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | environment
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | systemuser
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | account
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | account
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | nodetype
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | nodetype
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | upd_templatefile
 RowExclusiveLock | t       | 12726 |             | p247_website_1_1_0 | upd_template
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | upd_template
 RowExclusiveLock | t       | 12714 |             | p247_website_1_1_0 | upd_template
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | upd_template
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | upd_pagevalue
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | descr_upd_page
 RowExclusiveLock | t       | 12714 |             | p247_website_1_1_0 | upd_page
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | upd_page
 RowExclusiveLock | t       | 12726 |             | p247_website_1_1_0 | upd_page
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | upd_page
 RowExclusiveLock | t       | 12714 |             | p247_website_1_1_0 | upd_dependency
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | upd_dependency
 RowExclusiveLock | t       | 12726 |             | p247_website_1_1_0 | node
 AccessShareLock  | t       | 12726 |             | p247_website_1_1_0 | node
 RowExclusiveLock | t       | 12714 |             | p247_website_1_1_0 | node
 RowShareLock     | t       | 12714 |             | p247_website_1_1_0 | node
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | node
 AccessShareLock  | t       | 12714 |             | p247_website_1_1_0 | descr_node_en
(40 rows)

I expected deadlocks to be reported by PostgreSQL after the configured
timeout with an error message, but this one is just hanging up all
clients.

We are running PostgreSQL 7.3.6. Client software is written in Perl
using the DBI interface.

I would really appreciate any comments on where to search for the problem.

Matthias Schmitt


magic moving pixel s.a.    Phone: +352 54 75 75 - 0
Technoport Schlassgoart    Fax  : +352 54 75 75 - 54
66, rue de Luxembourg      URL  : http://www.mmp.lu
L-4221 Esch-sur-Alzette

Re: Deadlock Problem

From
Tom Lane
Date:
Matthias Schmitt <freak002@mmp.lu> writes:
> I am in trouble with more and more deadlock problems. We are
> programming a web application with multiple users editing content at
> the same time. Multiple times a day PostgreSQL runs into a deadlock,
> which can only be resolved by killing some of the clients. Here is an
> example from the pg_locks table:

All of the rows you showed us have granted=t.  No blockage is evident,
let alone any deadlock.

            regards, tom lane
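
To surface actual blockage, the same query can be restricted to ungranted entries. Note also that the inner joins in the query above drop transaction-level lock rows, whose relation and database columns are NULL, so a waiting entry can be hidden entirely. A sketch using the same 7.3-era pg_locks columns:

```sql
-- Show only lock requests that are still waiting.  LEFT JOINs keep
-- transaction-level lock rows, whose relation/database are NULL.
SELECT l.mode, l.pid, l.transaction, d.datname, r.relname
FROM pg_locks l
LEFT JOIN pg_database d ON d.oid = l.database
LEFT JOIN pg_class r ON r.oid = l.relation
WHERE NOT l.granted;
```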

Re: Deadlock Problem

From
Matthias Schmitt
Date:
> Matthias Schmitt <freak002@mmp.lu> writes:
>> I am in trouble with more and more deadlock problems. We are
>> programming a web application with multiple users editing content at
>> the same time. Multiple times a day PostgreSQL runs into a deadlock,
>> which can only be resolved by killing some of the clients. Here is an
>> example from the pg_locks table:
>
> All of the rows you showed us have granted=t.  No blockage is evident,
> let alone any deadlock.

Hello,

we tried to reduce the possible error sources. So we logged the last
statements sent to the database and were able to reproduce our problem
with psql alone. I did the following in two psql shell environments:

shell no 1:

CREATE TABLE the_test (
id   int4 PRIMARY KEY,
name varchar(32)
);

insert into the_test values (1, 'hello world');

begin;
update the_test set name = 'still alive' where id = 1;

To keep the transaction open I did not issue any commit or rollback
command.

shell no 2:

begin;
update the_test set name = 'still alive' where id = 1;

The second shell hangs now forever. The pg_locks table shows:

select * from pg_locks;
 relation | database | transaction |  pid  |       mode       | granted
----------+----------+-------------+-------+------------------+---------
  1980976 |  1980969 |             | 16034 | AccessShareLock  | t
    16757 |        1 |             | 16100 | AccessShareLock  | t
          |          |      762472 | 16036 | ExclusiveLock    | t
          |          |      762473 | 16034 | ExclusiveLock    | t
          |          |      762472 | 16034 | ShareLock        | f
  1980974 |  1980969 |             | 16036 | AccessShareLock  | t
  1980974 |  1980969 |             | 16036 | RowExclusiveLock | t
          |          |      762478 | 16100 | ExclusiveLock    | t
  1980974 |  1980969 |             | 16034 | AccessShareLock  | t
  1980974 |  1980969 |             | 16034 | RowExclusiveLock | t
(10 rows)

In our application, records in several different tables may be updated
in different orders, depending on the task at hand. Shouldn't a
time-out error resolve those problems?

Thank you.

Matthias Schmitt


magic moving pixel s.a.    Phone: +352 54 75 75 - 0
Technoport Schlassgoart    Fax  : +352 54 75 75 - 54
66, rue de Luxembourg      URL  : http://www.mmp.lu
L-4221 Esch-sur-Alzette
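
When concurrent transactions update rows of several tables in varying orders, deadlocks become possible. One common mitigation is to take all row locks in a single consistent order before updating; a sketch against a hypothetical table t (SELECT ... FOR UPDATE locks the rows without changing them):

```sql
BEGIN;
-- Lock every row this transaction will touch, in one fixed order,
-- so two concurrent transactions cannot wait on each other in a cycle.
SELECT id FROM t WHERE id IN (1, 2) ORDER BY id FOR UPDATE;
UPDATE t SET name = 'edited' WHERE id = 2;
UPDATE t SET name = 'edited' WHERE id = 1;
COMMIT;
```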


Re: Deadlock Problem

From
Andrew Sullivan
Date:
On Tue, Mar 16, 2004 at 09:50:43AM +0100, Matthias Schmitt wrote:
> begin;
> update the_test set name = 'still alive' where id = 1;
>
> To keep the transaction open I did not issue any commit or rollback
> command.

>
> shell no 2:
>
> begin;
> update the_test set name = 'still alive' where id = 1;
>
> The second shell hangs now forever. The pg_locks table shows:

First, you haven't tried to COMMIT anywhere.  Nothing will happen in
that case.  For all you know, one of these is going to ROLLBACK and
resolve the lock on its own.  The second one you issue is just going
to sit there, sure.

You actually haven't described a deadlock here.  This is just a
straightforward wait-and-see lock for transaction 2: that update
statement will indeed wait forever until it sees what 1 has done.
You'd only get a deadlock in case transaction 2 first did something
that T1 _next_ had to depend on.  You really need two conflicting
locks for a deadlock to happen.
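
A genuine deadlock, by contrast, can be reproduced in two psql sessions by updating two rows in opposite orders; a sketch against a hypothetical table t containing rows id = 1 and id = 2:

```sql
-- Session 1:
BEGIN;
UPDATE t SET name = 'one' WHERE id = 1;
-- Session 2:
BEGIN;
UPDATE t SET name = 'two' WHERE id = 2;
-- Session 1 again (blocks, waiting for session 2's row):
UPDATE t SET name = 'one' WHERE id = 2;
-- Session 2 again (closes the cycle; after deadlock_timeout the
-- server aborts one transaction with "ERROR:  deadlock detected"
-- and the other proceeds):
UPDATE t SET name = 'two' WHERE id = 1;
```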

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The fact that technology doesn't work is no bar to success in the marketplace.
        --Philip Greenspun

Re: Deadlock Problem

From
Tom Lane
Date:
Matthias Schmitt <freak002@mmp.lu> writes:
> I did the following in two psql shell environments:

> shell no 1:

> CREATE TABLE the_test (
> id   int4 PRIMARY KEY,
> name varchar(32)
> );

> insert into the_test values (1, 'hello world');

> begin;
> update the_test set name = 'still alive' where id = 1;

> To keep the transaction open I did not issue any commit or rollback
> command.

> shell no 2:

> begin;
> update the_test set name = 'still alive' where id = 1;

> The second shell hangs now forever.

Well, of course.  It has to wait to see if the previous update of the
row commits or not, so that it knows which version of the row to start
from.  (In this trivial case it doesn't really matter, but in more
complex cases such as where different fields are being updated, it
does.)

This is *not* a deadlock, however, as transaction 1 is free to make
progress.  The fact that you've got a client holding an open transaction
and not doing anything is a client-side design error, not a deadlock.

> Shouldn't a time-out error resolve those problems?

Sure, and it works fine:

regression=# set statement_timeout TO 10000;
SET
regression=# begin;
BEGIN
regression=# update the_test set name = 'still alive' where id = 1;
-- about ten seconds elapse, then:
ERROR:  canceling query due to user request
regression=#

            regards, tom lane