Silent deadlock possible in current sources - Mailing list pgsql-hackers

From: Tom Lane
Subject: Silent deadlock possible in current sources
Message-ID: 20211.967658324@sss.pgh.pa.us
List: pgsql-hackers

Observe:

heap_update()
{
    /* lock page containing old copy of tuple */
    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
    ...
    /* Find buffer for new tuple */
    if ((unsigned) MAXALIGN(newtup->t_len) <= PageGetFreeSpace((Page) dp))
        newbuf = buffer;
    else
        newbuf = RelationGetBufferForTuple(relation, newtup->t_len, buffer);
    ...
    if (newbuf != buffer)
    {
        LockBuffer(newbuf, BUFFER_LOCK_UNLOCK);
        WriteBuffer(newbuf);
    }
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    WriteBuffer(buffer);
}

RelationGetBufferForTuple(Relation relation, Size len, Buffer Ubuf)
{
    if (!relation->rd_myxactonly)
        LockPage(relation, 0, ExclusiveLock);
    ...
    buffer = ReadBuffer(relation, lastblock - 1);
    if (buffer != Ubuf)
        LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
    ...
    if (!relation->rd_myxactonly)
        UnlockPage(relation, 0, ExclusiveLock);
    ...
}

In other words, if heap_update can't fit the new copy of the tuple on
the same page it's already on, then *while still holding the exclusive
lock on the old tuple's buffer*, it calls RelationGetBufferForTuple
which tries to grab the relation's extension lock and then the exclusive
lock on the last page of the relation.  The code is smart enough to deal
with the case that the old tuple is in the last page of the relation
(in which case we already have the exclusive buffer lock on that page,
and mustn't ask for it twice).

BUT: suppose two different processes are trying to do this at about
the same time, and neither can fit its new tuple on the page the old
tuple is on.  Process A is updating a tuple in the last page of the
relation and Process B is updating a tuple in some earlier page.  Both
are able to get their exclusive buffer locks on their old tuples' pages.
Now, suppose that Process B is a little bit ahead and so it is first
to reach the LockPage operation.  It gets the relation extension lock.
Now it wants to get an exclusive buffer lock on the last page of the
relation.  It can't, because Process A already holds that lock, and
Process A is in turn waiting to get the relation extension lock that
Process B holds.
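
To make the interleaving concrete, here is a toy replay of it.  This is
only a sketch under stated assumptions: ModelLock, model_try_lock and
replay_heap_update_interleaving are hypothetical names invented for
illustration, each lock is reduced to a "who holds it" field, and a
failed try stands in for what would be a blocking wait in the backend.
None of this is the real bufmgr or lmgr code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not backend code: a lock just records its holder. */
typedef struct { int holder; } ModelLock;    /* 0 means free */

enum { PROC_A = 1, PROC_B = 2 };

/* Returns false where the real backend would block on another holder. */
static bool model_try_lock(ModelLock *lock, int proc)
{
    assert(proc != 0);
    if (lock->holder != 0 && lock->holder != proc)
        return false;
    lock->holder = proc;
    return true;
}

/*
 * Replays the schedule described above.  Returns true if both processes
 * end up wanting a lock that the other one holds.
 */
static bool replay_heap_update_interleaving(void)
{
    ModelLock last_page = {0}, earlier_page = {0}, extension = {0};

    /* heap_update: each process locks the page holding its old tuple. */
    bool a_old = model_try_lock(&last_page, PROC_A);
    bool b_old = model_try_lock(&earlier_page, PROC_B);

    /* B reaches LockPage first and gets the extension lock. */
    bool b_ext = model_try_lock(&extension, PROC_B);

    /* B now wants the last page, which A still holds. */
    bool b_last = model_try_lock(&last_page, PROC_B);

    /* A reaches LockPage and wants the extension lock, which B holds. */
    bool a_ext = model_try_lock(&extension, PROC_A);

    return a_old && b_old && b_ext && !b_last && !a_ext;
}
```

Calling replay_heap_update_interleaving() returns true: B is stuck behind
A's buffer lock while A is stuck behind B's extension lock, and neither
releases anything before waiting.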

This deadlock is not detected or reported because the buffer lock
mechanism doesn't have any deadlock detection capability (buffer locks
aren't done via the lock manager, which might be considered a bug in
itself).  Instead, the two processes just silently lock up, and
thereafter so will all other processes that try to update or insert
in that relation.
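
A small sketch of why nothing gets reported.  It assumes, for
illustration only, that a deadlock check amounts to looking for a cycle
in a "who waits for whom" table, and that only waits taken through the
lock manager (here, the extension lock) ever appear in that table.
has_wait_cycle, NOT_WAITING and the two helper functions are made-up
names; this is not the backend's actual deadlock-check code.

```c
#include <assert.h>
#include <stdbool.h>

#define NOT_WAITING (-1)

enum { PROC_A = 0, PROC_B = 1, NPROCS = 2 };

/*
 * waits_for[i] is the process that process i is blocked behind, or
 * NOT_WAITING.  A blocked process waits on exactly one lock, so it is
 * enough to follow the chain from each process and see if it returns.
 */
static bool has_wait_cycle(const int waits_for[], int nprocs)
{
    assert(nprocs > 0);
    for (int start = 0; start < nprocs; start++)
    {
        int cur = start;

        for (int steps = 0; steps < nprocs; steps++)
        {
            cur = waits_for[cur];
            if (cur == NOT_WAITING)
                break;
            assert(cur >= 0 && cur < nprocs);
            if (cur == start)
                return true;
        }
    }
    return false;
}

/* What the lock manager can see: only A's wait on the extension lock. */
static bool lock_manager_view_has_cycle(void)
{
    int waits_for[NPROCS] = { [PROC_A] = PROC_B, [PROC_B] = NOT_WAITING };

    return has_wait_cycle(waits_for, NPROCS);
}

/* The full picture, including B's wait on A's buffer lock. */
static bool full_view_has_cycle(void)
{
    int waits_for[NPROCS] = { [PROC_A] = PROC_B, [PROC_B] = PROC_A };

    return has_wait_cycle(waits_for, NPROCS);
}
```

In this model lock_manager_view_has_cycle() is false and
full_view_has_cycle() is true: the cycle only exists once the buffer-lock
wait is included, and that is the edge nobody is tracking.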


This bug did not exist in 7.0.* because heap_update used to release
its exclusive lock on the source page while extending the relation:
        /*
         * New item won't fit on same page as old item, have to look for a
         * new place to put it. Note that we have to unlock current buffer
         * context - not good but RelationPutHeapTupleAtEnd uses extend
         * lock.
         */
        LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
        RelationPutHeapTupleAtEnd(relation, newtup);
        LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 

I'm inclined to think that that is the correct solution and the new
approach is simply broken.  But, not knowing what Vadim had in mind
while making this change, I'm going to leave it to him to fix this.
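
For comparison, the 7.0.* ordering can be replayed in the same sort of
toy model.  Again a sketch only: ModelLock, model_try_lock, model_unlock
and replay_release_before_extend are hypothetical names, and the replay
assumes that RelationPutHeapTupleAtEnd releases the extension lock and
the last-page buffer lock before it returns, which the excerpt above
suggests but does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not backend code: a lock just records its holder. */
typedef struct { int holder; } ModelLock;    /* 0 means free */

enum { PROC_A = 1, PROC_B = 2 };

/* Returns false where the real backend would block on another holder. */
static bool model_try_lock(ModelLock *lock, int proc)
{
    assert(proc != 0);
    if (lock->holder != 0 && lock->holder != proc)
        return false;
    lock->holder = proc;
    return true;
}

static void model_unlock(ModelLock *lock, int proc)
{
    assert(lock->holder == proc);
    lock->holder = 0;
}

/* Returns true if both processes complete with the 7.0.* ordering. */
static bool replay_release_before_extend(void)
{
    ModelLock last_page = {0}, earlier_page = {0}, extension = {0};
    bool ok = true;

    /* Each process locks the page holding its old tuple ... */
    ok = ok && model_try_lock(&last_page, PROC_A);
    ok = ok && model_try_lock(&earlier_page, PROC_B);

    /* ... and drops that lock before going for the extension lock. */
    model_unlock(&last_page, PROC_A);
    model_unlock(&earlier_page, PROC_B);

    /* B gets the extension lock first; A has to wait for it. */
    ok = ok && model_try_lock(&extension, PROC_B);
    bool a_had_to_wait = !model_try_lock(&extension, PROC_A);

    /* A holds nothing B needs, so B locks the last page and finishes. */
    ok = ok && model_try_lock(&last_page, PROC_B);
    model_unlock(&last_page, PROC_B);
    model_unlock(&extension, PROC_B);
    ok = ok && model_try_lock(&earlier_page, PROC_B);   /* re-lock old page */

    /* A's wait ends; it extends and then re-locks its old page. */
    ok = ok && model_try_lock(&extension, PROC_A);
    ok = ok && model_try_lock(&last_page, PROC_A);
    model_unlock(&last_page, PROC_A);
    model_unlock(&extension, PROC_A);
    ok = ok && model_try_lock(&last_page, PROC_A);      /* re-lock old page */

    return ok && a_had_to_wait;
}
```

Here replay_release_before_extend() returns true: A does wait for the
extension lock, but it waits empty-handed, so B is never blocked behind
it and both updates go through.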


Although this specific lockup mode didn't exist in 7.0.*, it does
suggest a possible cause of the deadlocks-with-no-deadlock-report
behavior that a couple of people have reported with 7.0: maybe there
is another logic path that allows a deadlock involving two buffer locks,
or a buffer lock and a normal lock.  I'm on the warpath now ...
        regards, tom lane

