Thread: deadlock

deadlock

From: rihad

Hi,

I've come across a strange deadlock that I need your help with. There
are two copies of the same Perl daemon running on a 2-CPU box. The
program is pretty simple (because I wrote it :)) so I can trace its
execution path fairly well: there is a single "LOCK table foo" partway
through a transaction, and it sometimes ends up like this:

DETAIL:  Process 91376 waits for AccessExclusiveLock on relation 16488
of database 16386; blocked by process 92387.
Process 92387 waits for AccessExclusiveLock on relation 16488 of
database 16386; blocked by process 91376.

After the exclusive lock there is also exactly one SELECT and then one
UPDATE involving table foo, among other tables, each taking its usual
implicit locks on it. I've read in the manual that it's okay to stack
locks this way as long as the more restrictive locks precede the less
restrictive ones. Mind you, there may be many requests per second, and
some of them can and will arrive at the same wall-clock time because of
the two CPUs at work. Can locking break under these circumstances? I'd
rather find an informed fix for this than have to detect the deadlock
and restart the transaction.
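
Roughly, the transaction has this shape (a simplified sketch only; the
real statements and column names differ, and earlier statements in the
transaction are omitted):

    BEGIN;
    -- ... earlier work in the transaction ...
    LOCK TABLE foo;                               -- ACCESS EXCLUSIVE by default
    SELECT val FROM foo WHERE id = 1;             -- exactly one SELECT
    UPDATE foo SET val = val + 1 WHERE id = 1;    -- then one UPDATE
    COMMIT;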

PostgreSQL 8.3.1
FreeBSD 7.0
p5-DBI-1.60.1
p5-DBD-Pg-1.49


Thanks for any tips.

Re: deadlock

From: Scott Marlowe

On Thu, Apr 3, 2008 at 10:29 AM, rihad <rihad@mail.ru> wrote:
> Hi,
>
>  I've come across a strange deadlock that I need your help with. There are
> two copies of the same Perl daemon running on a 2 cpu box. The program is
> pretty simple (because I wrote it :)) so I can trace its pathway fairly
> well: in it, there's a single "LOCK table foo" occurring part way through a
> transaction that sometimes ends up as this:

For what reason is the table being locked?

Re: deadlock

From: Craig Ringer

rihad wrote:
> Hi,
>
> I've come across a strange deadlock that I need your help with. There
> are two copies of the same Perl daemon running on a 2 cpu box. The
> program is pretty simple (because I wrote it :)) so I can trace its
> pathway fairly well: in it, there's a single "LOCK table foo" occurring
> part way through a transaction that sometimes ends up as this:
>
> DETAIL:  Process 91376 waits for AccessExclusiveLock on relation 16488
> of database 16386; blocked by process 92387.
> Process 92387 waits for AccessExclusiveLock on relation 16488 of
> database 16386; blocked by process 91376.

If there are only two processes, and each is waiting for an ACCESS
EXCLUSIVE lock on the same relation and being blocked by the other one,
then presumably both have weaker locks that conflict with ACCESS
EXCLUSIVE on that relation.

Process 1 can't proceed with the ACCESS EXCLUSIVE lock because process 2
has a lesser lock on the table.

Process 2 can't proceed with the ACCESS EXCLUSIVE lock because process 1
has a lesser lock on the table.

Deadlock.

I don't see any other way the situation could arise, but I'm *very* far
from an expert.

Note that many statements take out fairly weak locks on a table.
See: http://www.postgresql.org/docs/8.3/static/explicit-locking.html
In particular, even a basic SELECT takes out an ACCESS SHARE, which
conflicts with ACCESS EXCLUSIVE.
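
For example, an interleaving like this will deadlock (a sketch; the
SELECT just stands in for whatever each daemon runs against the table
before the LOCK):

    -- session 1
    BEGIN;
    SELECT count(*) FROM foo;   -- takes ACCESS SHARE on foo

    -- session 2
    BEGIN;
    SELECT count(*) FROM foo;   -- takes ACCESS SHARE on foo

    -- session 1
    LOCK TABLE foo;             -- wants ACCESS EXCLUSIVE, blocks on session 2

    -- session 2
    LOCK TABLE foo;             -- wants ACCESS EXCLUSIVE, blocks on session 1;
                                -- PostgreSQL detects the deadlock and aborts one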

If you are going to lock the table with ACCESS EXCLUSIVE you need to do
one of the following:

- Take out the ACCESS EXCLUSIVE lock before doing anything else with the
table;
- Rewrite to avoid the need for the ACCESS EXCLUSIVE lock (say, by using
appropriate SELECT ... FOR UPDATE/SHARE row-level locking); or
- Be prepared to retry transactions when deadlocks arise.

I'd prefer to avoid the exclusive lock entirely if possible, and failing
that I'd want to take it out before doing anything else.
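
For instance, something along these lines (just a sketch with made-up
column names, since I don't know your schema):

    -- Option 1: take the strongest lock first
    BEGIN;
    LOCK TABLE foo IN ACCESS EXCLUSIVE MODE;
    SELECT id FROM foo WHERE claimed_by IS NULL LIMIT 1;
    UPDATE foo SET claimed_by = 42 WHERE id = 1;   -- id returned above
    COMMIT;

    -- Option 2: row-level locking instead of a table lock
    BEGIN;
    SELECT id FROM foo WHERE claimed_by IS NULL LIMIT 1 FOR UPDATE;
    UPDATE foo SET claimed_by = 42 WHERE id = 1;   -- id returned above
    COMMIT;
    -- (if the SELECT returns no row, another session grabbed it; try again)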

> After the exclusive lock, there is also exactly one SELECT

But what about BEFORE the LOCK statement? That's what matters.

> I've read in the manuals that it's okay to stack
> locks this way as long as the more restrictive locks precede less
> restrictive ones.

Yep, and since ACCESS EXCLUSIVE is the most restrictive lock, you need
to take it out before doing ANYTHING else unless you can prove through
other means (say, knowledge about locking on other tables/records) that
a deadlock won't arise.

> Mind you, there may be many requests per second, and
> some of them can and will happen at the same wall clock time due to 2
> cpus at work. Can locking break under these circumstances?

I'd be VERY surprised.

--
Craig Ringer

Re: deadlock

From: rihad

> rihad wrote:
>> Hi,
>>
>> I've come across a strange deadlock that I need your help with. There
>> are two copies of the same Perl daemon running on a 2 cpu box. The
>> program is pretty simple (because I wrote it :)) so I can trace its
>> pathway fairly well: in it, there's a single "LOCK table foo" occurring
>> part way through a transaction that sometimes ends up as this:
>>
>> DETAIL:  Process 91376 waits for AccessExclusiveLock on relation 16488
>> of database 16386; blocked by process 92387.
>> Process 92387 waits for AccessExclusiveLock on relation 16488 of
>> database 16386; blocked by process 91376.
>
> If there are only two processes, and each is waiting for an ACCESS
> EXCLUSIVE lock on the same relation and being blocked by the other one,
> then presumably both have weaker locks that conflict with ACCESS
> EXCLUSIVE on that relation.
>
> Process 1 can't proceed with the ACCESS EXCLUSIVE lock because process 2
> has a lesser lock on the table.
>
> Process 2 can't proceed with the ACCESS EXCLUSIVE lock because process 1
> has a lesser lock on the table.
>
> Deadlock.
>
> I don't see any other way the situation could arise, but I'm *very* far
> from an expert.
>
Indeed, there is one SELECT and, conditionally, one UPDATE on the table
before the exclusive LOCK. I've re-read the manual, particularly this
line:

"One should also ensure that the first lock acquired on an object in a
transaction is the highest mode that will be needed for that object."

Since the SELECT and UPDATE come before the LOCK on bw_pool, the bug is
obvious. Sadly I can't use any weaker locking, because I need exclusive
access to the OS's firewall after getting bw_id. I had moved the LOCK
further into the transaction to mimic fine-grained locking, so one
solution is to move it back to the beginning.
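
So the transaction would now start out roughly like this (a sketch; the
real code has a few more statements):

    BEGIN;
    -- take the strongest lock on bw_pool before touching it at all
    LOCK TABLE bw_pool IN ACCESS EXCLUSIVE MODE;

    -- allocate a free slot
    UPDATE bw_pool
    SET user_id = ?
    WHERE bw_id =
            (SELECT MIN(bw_id) FROM bw_pool WHERE user_id IS NULL)
    RETURNING bw_id;

    -- manipulate the firewall using the returned bw_id, then
    COMMIT;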


Thank you!

Re: deadlock

From: rihad

> rihad wrote:
>> Given this type query:
>>
>>         UPDATE bw_pool
>>         SET user_id=?
>>         WHERE bw_id=
>>                 (SELECT MIN(bw_id) FROM bw_pool WHERE user_id IS NULL)
>>         RETURNING bw_id
>
> Can you use a SERIALIZABLE transaction and avoid the explicit lock?
>
Not really. Since the LOCK on bw_pool guards the firewall manipulation
that follows (of which there is exactly one), I'm not really prepared to
restart transactions, whether due to deadlocks or serialization
failures. It's easier for me to prevent deadlocks altogether by
carefully ordering the queries from most to least restrictive lock,
albeit at the price of making the whole transaction single-threaded,
even the parts of it that don't need to be. I had indeed been trying to
hold the exclusive lock over as little code as possible (fine-grained
locking), but neglected the importance of the lock ordering.

Re: deadlock

From: rihad

> Scott Marlowe wrote:
>
>     Sure, but you have to trap that all the time.  The solution using a
>     cycling sequence keeps you from ever seeing that (unless you managed
>     to check out all 9,999 other values while still getting the current
>     one).  No locking needed, dozens of updaters running concurrently and
>     no need to track update errors.
>
> Yep, that does sound like it'd be nicer, at least if locks are
> becoming free at a reasonable rate (ie you don't have to step through
> most of the table to find a free lock). I was working on the probably
> mistaken assumption that the OP wanted the "next" / "first" available
> slot, not any free slot.
>
> If there are very few free locks at any given time I have the feeling
> the sequence approach could spend a lot of time just scanning through
> the table looking for free entries. Then again, using an aggregate
> subquery is far from free either, and it's a whole lot nicer to just
> repeat one statement until it succeeds rather than retrying the whole
> transaction if it conflicts with another (which will happen often if
> there's really high demand for locks).
>
> In fact, both transactions trying to grab the lowest free lock is
> practically a recipe for serialization failures, making it even less
> attractive. With only two concurrent connections it'd work OK if one
> used min() and the other used max() ... but add another couple and
> you're in trouble.
>
> The serial based approach sounds a fair bit better.


Serial access to the firewall is what I'm now emulating with "LOCK
bw_pool" as the first statement in the transaction. AFAIK Postgres's
serializable transactions give you decent parallelism at the price of
having to retry them on serialization errors, and thus can't be relied
upon for acting on a shared external resource (like manipulating the
firewall) after having LOCKed some table. It's a pity there's no way to
release the lock any earlier; it is only released implicitly at the end
of the transaction.
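
For reference, if I've understood the cycling-sequence idea correctly,
it would look roughly like this (a sketch only; I'm assuming the pool
holds bw_ids 1..10000, and the sequence name is made up):

    -- one-time setup: a sequence that cycles over the pool of ids
    CREATE SEQUENCE bw_pool_seq MINVALUE 1 MAXVALUE 10000 CYCLE;

    -- per request: pick a candidate id, then try to claim it
    SELECT nextval('bw_pool_seq');                 -- say it returns 1234

    UPDATE bw_pool
    SET user_id = ?
    WHERE bw_id = 1234 AND user_id IS NULL
    RETURNING bw_id;
    -- if no row comes back the slot was already taken: call nextval()
    -- again and retry the UPDATE; no table lock, no transaction restart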