Thread: Deadlocks -- what can I do about them?
Hello, I'm using PostgreSQL 7.4.3 on a RedHat 9 Linux server (a P4 HyperThreaded, using the SMP kernel, in case this makes a difference). I'm not 100% sure I understand exactly why I am causing them, but let's say that several inserts inside an SQL transaction (i.e., a BEGIN / COMMIT-ROLLBACK block), where each insert references two different foreign-keys kind of sets the alarm. The thing is, what can I do? Is it ok if I check the error and whenever a deadlock is detected, a execute a delay of some random number of milliseconds and then try again? After some preliminary analysis, I'm not sure there is anything (sensible) that I can do about those SQL insert statements or the fact that they're enclosed in an SQL transaction. I can't seem to see why those would be fundamentally wrong. That's why I'm looking for a solution that may involve recovering from the error. The transaction has been rolled back, so I'm guessing I could try again a little bit later, no? Is there some standard practice to deal with these deadlocks, or to avoid them in a situation like I described (I know, I didn't post any details, but details would only bore you at this point, I guess) Thanks for any advice/comments, Carlos --
Ok, now I'm really intrigued by what looks to me (possibly from a naive point of view) like a bug, or rather, a limitation on the implementation. I can't find a reasonable justification why the following would cause a deadlock: I run two instances of psql using the same DB on the same machine. On one of them, I run: create table A (id int primary key); create table B (id int primary key); create table AB ( A_id int references A(id), B_id int references B(id) ); Then I add a few records (all this from the same instance of psql): insert into A values (1); insert into A values (2); insert into B values (10); insert into B values (11); Ok, now, I try two concurrent transactions, by executing commands alternating from one psql instance to the other one: I'll prefix each line with 1: or 2: indicating which console I execute it on -- the commands were executing in the time sequence corresponding to the lines below: 1: begin; 2: begin; 1: insert into AB values (1,10); 2: insert into AB values (2,10); <AT THIS POINT, CONSOLE 2 BLOCKS> 1: insert into AB values (2,11); At this point, console 1 blocks for a second or two, and then I get an error, reporting that a deadlock was detected; then, console 2 unblocks. I can't see how it is justified that the above causes a deadlock. I do understand how the deadlock is happening: trans. 1 puts a lock on rows 1 of A and row 10 of B -- meaning, "nobody touches these rows until I'm finished"; then trans 2. locks row 2 of A, but is put on hold waiting to lock row 10 of B, since there is already a lock on it. When trans. A now tries to put a lock on row 2 of A, the deadlock happens. The thing is, why? Why is this a deadlock? When we look at the low-level details, sure; but when you look at the nature of what's happening at a conceptual level, a deadlock is not justified, IMHO: Trans. 1 doesn't really need to put a mutex type of lock around row 1 of A -- it simply needs to atomically flag the order: "nobody delete or modify this row of table A"... Another trans. that attempts to place the same order should not block -- it should succeed and return immediately and continue with the transaction; there is no conflict in the above example -- the first transaction does not want to allow anyone to mess with row 1 of A; the other transaction wants exactly the same, so it seems to me that the lock is more restrictive than it needs to be. I don't know about the internals of how transactions and locks and FK constraints are handled, but I'm analyzing it and describing what seems to be happening internally, based on the behaviour I observe. Any comments? Carlos --
On Tue, 24 Aug 2004, Carlos Moreno wrote: > Ok, now I'm really intrigued by what looks to me > (possibly from a naive point of view) like a bug, > or rather, a limitation on the implementation. Yep. See recent (and historical) discussions on needing a weaker lock than FOR UPDATE for handling foreign keys.
moreno@mochima.com (Carlos Moreno) wrote in message news:<412A5C47.5070908@mochima.com>... > Hello, > > I'm using PostgreSQL 7.4.3 on a RedHat 9 Linux server > (a P4 HyperThreaded, using the SMP kernel, in case this > makes a difference). > > I'm not 100% sure I understand exactly why I am causing > them, but let's say that several inserts inside an SQL > transaction (i.e., a BEGIN / COMMIT-ROLLBACK block), > where each insert references two different foreign-keys > kind of sets the alarm. > > The thing is, what can I do? Is it ok if I check the > error and whenever a deadlock is detected, a execute > a delay of some random number of milliseconds and > then try again? > > After some preliminary analysis, I'm not sure there is > anything (sensible) that I can do about those SQL insert > statements or the fact that they're enclosed in an SQL > transaction. I can't seem to see why those would be > fundamentally wrong. That's why I'm looking for a > solution that may involve recovering from the error. > The transaction has been rolled back, so I'm guessing > I could try again a little bit later, no? > > Is there some standard practice to deal with these > deadlocks, or to avoid them in a situation like I > described (I know, I didn't post any details, but > details would only bore you at this point, I guess) > > Thanks for any advice/comments, > > Carlos Best thing to do is know your locking. IF you are using Pg you have several different levels avail. You know what causes deadlocks, the key is to use as little locking as is necesary so that under no circumstances /can/ you get a deadlock, while at the same time ensuring consistency (ACID). Pg even supports row-level locks I think so you have a lot of options. Sometimes complex trx can cause challenges but there is always a way. gl
>>>>> "CM" == Carlos Moreno <moreno@mochima.com> writes: CM> Ok, now I'm really intrigued by what looks to me CM> (possibly from a naive point of view) like a bug, CM> or rather, a limitation on the implementation. [[ ... ]] CM> I don't know about the internals of how transactions CM> and locks and FK constraints are handled, but I'm CM> analyzing it and describing what seems to be CM> happening internally, based on the behaviour I CM> observe. FWIW I get bit by this quite a bit. Unfortunately all the deadlock avoidance theory doesn't help you since you're not explicitly getting the locks, and as you see, ordering the insert/update operations such as to avoid conflicting locks is hard to do. If I could designate the transaction I prefer to be killed, it would save me a lot: often my short easy to repeat transaction wins out over some large multi-thousand row select/insertion operation. -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D. Khera Communications, Inc. Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/