Thread: row-level deadlock problem
Hello. I'm running postgresql 7.4.6 on linux 2.4.21(Redhat Enterpise 3). I have problems with deadlocks caused by(at least I think so) row-level locks and I can't find the reason. First I thought this has something with fk constraints, but removing it didn't change anything. Here is simplified schema of my table: CREATE TABLE stats ( counter integer, color varchar(6), shape varchar(6), size integer, d date ); There are non-unique btree indexes on color,shape and size. There is no primary key. This table is modified in plpgsql function, launched like this: # SELECT updatestats('red'); All statement run in auto-commit mode, there is no explicit BEGIN/COMMIT anywhere. Function updatestats goes like this: CREATE FUNCTION updatestats(text) RETURNS integer AS ' DECLARE color_var ALIAS FOR $1; BEGIN UPDATE stats SET counter=counter+1 WHERE color=color_var AND shape IS NULL AND d=current_date; IF NOT FOUND THEN INSERT INTO stats (color,counter,d) VALUES(color_var,1,current_date); END IF; RETURN 1; END; ' LANGUAGE plpgsql; Everything is ok until function updatestats is called frequently, > ~ 3 times per second. Then I get following error: postgres[2247]: [89-1] ERROR: deadlock detected postgres[2247]: [89-2] DETAIL: Process 2247 waits for ShareLock on transaction 148407635; blocked by process 2248. postgres[2247]: [89-3] Process 2248 waits for ShareLock on transaction 148407641; blocked by process 2247. postgres[2247]: [89-4] CONTEXT: PL/pgSQL function "updatestats" line 4 at SQL statement Last query for both childs is the same: UPDATE stats SET counter=counter+1 WHERE color=$1 AND shape IS NULL AND d=current_date called from: SELECT updatestats('red'); It always locks at first UPDATE statement. I don't understand where is a deadlock possibility in such simple function. I know that waiting for share lock on transactionmeans waiting for row-level lock acquired by this transaction. There's no explicit locking, no SELECT FOR UPDATEstatements, all fk constraints has been dropped. Table stats is also modified by other functions, but I have deadlocks only for statements calling updatesstats, always twocalls with the same 'color' argument. Am I missing something obvious? I have no idea what can cause these deadlocks and how to avoid them. Number of deadlock events during one day is so big that it looks like it happens everytime two updatestats function are runningconcurrently. All sugestions are welcomed, thanks in advance. -- Kamil Kaczkowski kamil@kamil.eisp.pl
Kamil Kaczkowski <kamil@kamil.eisp.pl> writes: > I have problems with deadlocks caused by(at least I think so) row-level > locks and I can't find the reason. The failure seems clearly a deadlock on row-level locks. Are you certain you've removed all relevant FKs (those pointing to the table as well as out of it)? Another possible explanation is if the UPDATE command can update more than one row --- in that case different backends might happen to reach the target rows in different orders. regards, tom lane
On Fri, 26 Nov 2004, Tom Lane wrote: > Kamil Kaczkowski <kamil@kamil.eisp.pl> writes: > > I have problems with deadlocks caused by(at least I think so) row-level > > locks and I can't find the reason. > > The failure seems clearly a deadlock on row-level locks. Are you > certain you've removed all relevant FKs (those pointing to the table > as well as out of it)? Yes, I browsed whole database schema, all FKs has been dropped. > Another possible explanation is if the UPDATE > command can update more than one row --- in that case different backends > might happen to reach the target rows in different orders. This could be it. Yes, this UPDATE changes several rows, I didn't know this can be a problem. My understanding was that row-level lock at UPDATE statement is somehow atomic and it locks all rows matched at once. But what's the solution? How can I force UPDATEs to lock rows in the same order? There's no ORDER BY clause for UPDATE. Thanks for help. -- Kamil Kaczkowski kamil@kamil.eisp.pl
Kamil Kaczkowski <kamil@kamil.eisp.pl> writes: >> Another possible explanation is if the UPDATE >> command can update more than one row --- in that case different backends >> might happen to reach the target rows in different orders. > Yes, this UPDATE changes several rows, I didn't know this can be a > problem. > My understanding was that row-level lock at UPDATE statement is somehow > atomic and it locks all rows matched at once. Nope. > But what's the solution? How can I force UPDATEs to lock rows in the same > order? There's no ORDER BY clause for UPDATE. Change things so you don't need to update more than one row per query, perhaps? The lack of any primary key on that table was already pretty disturbing from a database-theory point of view. Maybe you should rethink the table layout. regards, tom lane
On Fri, 26 Nov 2004, Tom Lane wrote: > > My understanding was that row-level lock at UPDATE statement is somehow > > atomic and it locks all rows matched at once. > > Nope. > > > But what's the solution? How can I force UPDATEs to lock rows in the same > > order? There's no ORDER BY clause for UPDATE. > > Change things so you don't need to update more than one row per query, > perhaps? The lack of any primary key on that table was already pretty > disturbing from a database-theory point of view. Maybe you should > rethink the table layout. Yes, I know. I'm not the developer of this application, my job is only to find reasons behind those deadlocks and suggest solution. Anyway I'm suprised that UPDATE statements are so locking sensitive. Consider more general case with following table: CREATE TABLE test(id serial not null primary key,val integer not null); Does it mean that there's no simple way(without explicit locking) to run: UPDATE test SET val=1; safely in concurrent transactions(let's name them T1,T2) assuming that other transactions are modifing table frequently, e.g.: T1: UPDATE started ... some other activity on table test ... T2: UPDATE started(with different order then T1's update) DEADLOCK If it looks like this, I'm probably lucky it didn't bite me before. Best regards. -- Kamil Kaczkowski kamil@kamil.eisp.pl
On Sat, Nov 27, 2004 at 03:20:16AM +0100, Kamil Kaczkowski wrote: > > Change things so you don't need to update more than one row per query, > > perhaps? The lack of any primary key on that table was already pretty > > disturbing from a database-theory point of view. Maybe you should > > rethink the table layout. > Yes, I know. I'm not the developer of this application, my job is > only to find reasons behind those deadlocks and suggest solution. > Anyway I'm suprised that UPDATE statements are so locking sensitive. It's not the locking on the UPDATE that's getting you. Multiple updates can run concurrently (depending on your serialization level anyway, I'm talking about default setup here). Where the problem is is the foreign key locks. The usual thing is to sort the rows you are updating in such a way that the foreign keys references are always processed in the same order, hence can't deadlock. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
On Sat, 27 Nov 2004, Martijn van Oosterhout wrote: > On Sat, Nov 27, 2004 at 03:20:16AM +0100, Kamil Kaczkowski wrote: > > > Change things so you don't need to update more than one row per query, > > > perhaps? The lack of any primary key on that table was already pretty > > > disturbing from a database-theory point of view. Maybe you should > > > rethink the table layout. > > Yes, I know. I'm not the developer of this application, my job is > > only to find reasons behind those deadlocks and suggest solution. > > Anyway I'm suprised that UPDATE statements are so locking sensitive. > > It's not the locking on the UPDATE that's getting you. Multiple updates > can run concurrently (depending on your serialization level anyway, I'm > talking about default setup here). > > Where the problem is is the foreign key locks. The usual thing is to > sort the rows you are updating in such a way that the foreign keys > references are always processed in the same order, hence can't > deadlock. See earlier posts in this thread, I have no foreign key constraints on this table. Regards. -- Kamil Kaczkowski kamil@kamil.eisp.pl
On Sat, Nov 27, 2004 at 03:40:11PM +0100, Kamil Kaczkowski wrote: > > It's not the locking on the UPDATE that's getting you. Multiple updates > > can run concurrently (depending on your serialization level anyway, I'm > > talking about default setup here). > > > > Where the problem is is the foreign key locks. The usual thing is to > > sort the rows you are updating in such a way that the foreign keys > > references are always processed in the same order, hence can't > > deadlock. > See earlier posts in this thread, I have no foreign key constraints on > this table. Well, there has to be something, since an UPDATE in Read Committed mode simply doesn't have any locks that can deadlock. It has to be at least a SELECT FOR UPDATE, which is the lock foreign keys use. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Martijn van Oosterhout <kleptog@svana.org> writes: > Well, there has to be something, since an UPDATE in Read Committed mode > simply doesn't have any locks that can deadlock. You're mistaken; it takes a row lock on each row it updates. I'm not sure why the two UPDATEs are visiting the same rows in different orders, but if they do the failure is certainly possible. regards, tom lane
On Sat, Nov 27, 2004 at 12:58:36PM -0500, Tom Lane wrote: > Martijn van Oosterhout <kleptog@svana.org> writes: > > Well, there has to be something, since an UPDATE in Read Committed mode > > simply doesn't have any locks that can deadlock. > > You're mistaken; it takes a row lock on each row it updates. I'm not > sure why the two UPDATEs are visiting the same rows in different orders, > but if they do the failure is certainly possible. One of them could be using an indexscan while the other is not. If the heap is in reverse order compared to the scan, that would explain it. Is there a solution to this problem? The FK deadlock problem can be fixed with shared row locks, but this is a different one. -- Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) "La felicidad no es mañana. La felicidad es ahora"
On Sat, 27 Nov 2004, Alvaro Herrera wrote: > > You're mistaken; it takes a row lock on each row it updates. I'm not > > sure why the two UPDATEs are visiting the same rows in different orders, > > but if they do the failure is certainly possible. > > One of them could be using an indexscan while the other is not. If the > heap is in reverse order compared to the scan, that would explain it. > > Is there a solution to this problem? The FK deadlock problem can be > fixed with shared row locks, but this is a different one. > > In my case deadlock happens between two identical statements executed from different transactions and they have the same execution plan(index scan on one attribute - 'color' in schema I presented). Also, there's no other modifing statements in logs at the time so only thing that can change scan order is the first UPDATE called of two involved in deadlock(or the same statement exeduted from some third transaction, but I don't think so). Could someone with more knowledge on internals of btree indexes provide info on how this can happen, what causes index scan order to change? Thinking of it, possibility of such event should be very low, but I'm getting 2000 deadlocks during busy day. Regards. -- Kamil Kaczkowski kamil@kamil.eisp.pl
Kamil Kaczkowski <kamil@kamil.eisp.pl> writes: >>> You're mistaken; it takes a row lock on each row it updates. I'm not >>> sure why the two UPDATEs are visiting the same rows in different orders, >>> but if they do the failure is certainly possible. >> >> One of them could be using an indexscan while the other is not. If the >> heap is in reverse order compared to the scan, that would explain it. >> > In my case deadlock happens between two identical statements executed > from different transactions and they have the same execution plan(index > scan on one attribute - 'color' in schema I presented). That's a bit hard to believe; once the rows are entered in the index their relative order won't change anymore, so it's real hard to see how two indexscans could visit them in different orders. IIRC you said that these commands were being done inside plpgsql functions, so it's possible that the planner is doing something different with the parameterized plans than what you see in a simple EXPLAIN with values already inserted. Still, it's odd that you might get different plans in different executions of the same function. I think there is some factor we're not seeing here. Is it possible that one backend has a cached plan much older than the other one, and that the planner's plan choice changed over time? regards, tom lane
On Sat, 27 Nov 2004, Tom Lane wrote: > > In my case deadlock happens between two identical statements executed > > from different transactions and they have the same execution plan(index > > scan on one attribute - 'color' in schema I presented). > > That's a bit hard to believe; once the rows are entered in the index > their relative order won't change anymore, so it's real hard to see how > two indexscans could visit them in different orders. > > IIRC you said that these commands were being done inside plpgsql > functions, so it's possible that the planner is doing something > different with the parameterized plans than what you see in a simple > EXPLAIN with values already inserted. Still, it's odd that you might > get different plans in different executions of the same function. > > I think there is some factor we're not seeing here. Is it possible that > one backend has a cached plan much older than the other one, and that > the planner's plan choice changed over time? Theorically possible, but chances are very low. I think condition which now use index is most selective all the time, even if not, I don't believe planner could change plan so quickly with dataset change. Also, connections to this database have rather short life. I can add EXECUTE in front of UPDATE to verify cached plan theory. I'll try that during the week, but I don't think it'd change anything. I agree it looks like we're missing something. Any other theories, suggestions what should I check? I'll try to isolate problem to test case by writing scripts to simulate actual load but it'll take same time. Regards. -- Kamil Kaczkowski kamil@kamil.eisp.pl