Not your father's question about deadlocks - Mailing list pgsql-general

From Clarence Gardner
Subject Not your father's question about deadlocks
Date
Msg-id 455A3FD0.1050506@adecn.com
Whole thread Raw
Responses Re: Not your father's question about deadlocks
List pgsql-general
Once upon a time, I put a question regarding deadlocks to the group,
and Tom Lane immediately answered with this:

 >The guy waiting on the tuple-specific lock is second in
 >line to actually mung the tuple.  Whoever is first in line behind the
 >current tenant will be blocked trying to acquire ShareLock on the
 >current tenant's transaction ID.
 >
 >What you appear to have is a situation where two transactions are trying
 >to lock or update the same two rows in different orders.  Without a lot
 >more info about your application logic, I couldn't guess why this seems
 >to be associated with having more than two transaction interested in the
 >same tuple.
 >
 >Note that the guy looking for ShareLock on the tuple is evidently either
 >doing SELECT FOR SHARE on this tuple, or trying to install a new tuple
 >referencing this one as a foreign key (which does SELECT FOR SHARE under
 >the hood).  But he's blocked by someone who's done either SELECT FOR
 >UPDATE or an actual UPDATE on that tuple.

I'm still creeping up on the problem. I configured postgres to resolve
a deadlock only after a month, and now I seem to have a complete
snapshot of one of them. The details are below. The part that doesn't
fit with what Tom says is that there are only two transactions involved.
The first and third "guys" are the same guy!

The first one, I'll call it X71, is the one doing the insert. The
LedgerDetail table references various other tables; the one of interest
is the buys table, and it is inserting a record referencing buy #955. It
had done several other inserts before this, some of them also referencing
buy #955. (In fact, the immediately preceding statement was one such.)
This is evidenced by the RowShareLock that it holds on the buys table.

The other one, X78, is executing a stored procedure which has only one
reference to buys, this statement:
    update buys set budget=budget+remainder where id=buynum;

So the timeline is this
   X71                          X78
   -----------------------      ----------------------
   insert buy955 reference
   insert buy955 reference
   insert buy955 reference
                                update buy955 (blocks)
   insert buy955 reference (blocks)

Here is the state of the system. I have edited it somewhat. The pg_locks
table actually shows five waiters; the three that I've deleted started
much later than when the deadlock happened, so I've deleted all the pg_locks
rows that belonged to their transactions. I've also deleted all pg_locks
rows for X71 and X78 that referred to relations that the other transaction
had no lock on. And I've deleted empty or constant columns, and the relation
oid column.
======================================================================
   27478 |  insert into LedgerDetail


(ledger,strategy,advId,campaignId,buyId,creativeId,siteId,providerType,adspotId,checkbookId,count,dr,cr,dr_usd,cr_usd,time)

values('imp',',61144,',533,676,955,4076,2,5,656,31275,14,0,110936,0,142142,1163530800)
         | 2006-11-14 19:22:43.760703+00 | 2006-11-14 19:22:29.342683+00
   99056 |  FETCH 1 FROM "PgSQL_0CDD618C"
         | 2006-11-14 19:22:43.748025+00 | 2006-11-13 19:21:40.038948+00

2006-11-14 11:22:43 PST[99056]LOG:  statement: DECLARE "PgSQL_0CDD618C"
CURSOR FOR
               select repayBuyBudget(955, 31275, fromusd(ecc2usd(7790),
'EUR'))

         relation          |   locktype    |  page | tuple |
transactionid |  transaction |  pid  |       mode       | granted

---------------------------+---------------+-------+-------+---------------+--------------+-------+------------------+---------
 buys                      | relation      |       |
|               |    101953371 | 27478 | AccessShareLock  | t
 buys                      | relation      |       |
|               |    101953371 | 27478 | RowShareLock     | t
 buys                      | tuple         |    38 |    11
|               |    101953371 | 27478 | ShareLock        | f
 buys_pkey                 | relation      |       |
|               |    101953371 | 27478 | AccessShareLock  | t

                           | transactionid |       |       |
101953371 |    101954678 | 99056 | ShareLock        | f
 buys                      | relation      |       |
|               |    101954678 | 99056 | AccessShareLock  | t
 buys                      | relation      |       |
|               |    101954678 | 99056 | RowExclusiveLock | t
 buys                      | tuple         |    38 |    11
|               |    101954678 | 99056 | ExclusiveLock    | t
 buys_pkey                 | relation      |       |
|               |    101954678 | 99056 | AccessShareLock  | t
 buys_pkey                 | relation      |       |
|               |    101954678 | 99056 | RowExclusiveLock | t
======================================================================

That scenario seems quite simple, but I can't reproduce the deadlock with
this seemingly-identical sequence. The "Sn" prompts show the sequence of
statements done from psql in two terminal sessions.
    S1 => create table t1(f1 int primary key);
    NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"t1_pkey" for table "t1"
    CREATE TABLE
    S2 => create table t2(f1 int references t1);
    CREATE TABLE
    S3 => insert into t1 values(1);
    INSERT 0 1
    S4 => begin;
    BEGIN
    S7 => update t1 set f1=1;   (blocks)

    S5 => begin;
    BEGIN
    S6 => insert into t2 values(1);
    INSERT 0 1
    S8 => insert into t2 values(1);
    INSERT 0 1

S8 is the statement that seems analagous to the one that blocks in the
real system, but it doesn't here.

Apart from that test being different somehow that I can't figure out,
situation on the real system seems to suggest that I need to either
lock all the reference tables at the beginning of X71 (uncomfortable),
or have it commit after each insert (unacceptable).

Thanks in advance for your help.



pgsql-general by date:

Previous
From: "whytwelve13@yahoo.com"
Date:
Subject: Re: PgSQL not recognized
Next
From: mwrynn@gmail.com
Date:
Subject: schema rename - is analyze necessary?