Thread: row-level deadlock problem

row-level deadlock problem

From
Kamil Kaczkowski
Date:
Hello.
I'm running postgresql 7.4.6 on linux 2.4.21(Redhat Enterpise 3).
I have problems with deadlocks caused by(at least I think so) row-level
locks and I can't find the reason.
First I thought this has something with fk constraints, but removing it didn't change anything.
Here is simplified schema of my table:

CREATE TABLE stats (
counter integer,
color varchar(6),
shape varchar(6),
size integer,
d date
);

There are non-unique btree indexes on color,shape and size. There is no
primary key.

This table is modified in plpgsql function, launched like this:
# SELECT updatestats('red');
All statement run in auto-commit mode, there is no explicit BEGIN/COMMIT anywhere.
Function updatestats goes like this:

CREATE FUNCTION updatestats(text) RETURNS integer AS '
DECLARE
    color_var ALIAS FOR $1;
BEGIN
    UPDATE stats SET counter=counter+1 WHERE color=color_var AND shape IS NULL AND d=current_date;
    IF NOT FOUND THEN
        INSERT INTO stats (color,counter,d) VALUES(color_var,1,current_date);
    END IF;
    RETURN 1;
END;
' LANGUAGE plpgsql;

Everything is ok until function updatestats is called frequently, > ~ 3 times per second.
Then I get following error:
postgres[2247]: [89-1] ERROR:  deadlock detected
postgres[2247]: [89-2] DETAIL:  Process 2247 waits for ShareLock on transaction 148407635; blocked by process 2248.
postgres[2247]: [89-3] Process 2248 waits for ShareLock on transaction 148407641; blocked by process 2247.
postgres[2247]: [89-4] CONTEXT:  PL/pgSQL function "updatestats" line 4 at SQL statement
Last query for both childs is the same:
UPDATE stats SET counter=counter+1 WHERE color=$1 AND shape IS NULL AND d=current_date
called from: SELECT updatestats('red');
It always locks at first UPDATE statement.

I don't understand where is a deadlock possibility in such simple function. I know that waiting for share lock on
transactionmeans waiting for row-level lock acquired by this transaction. There's no explicit locking, no SELECT FOR
UPDATEstatements, all fk constraints has been dropped. 
Table stats is also modified by other functions, but I have deadlocks only for statements calling updatesstats, always
twocalls with the same 'color' argument. 
Am I missing something obvious? I have no idea what can cause these deadlocks and how to avoid them.
Number of deadlock events during one day is so big that it looks like it happens everytime two updatestats function are
runningconcurrently. 
All sugestions are welcomed, thanks in advance.
--
Kamil Kaczkowski
kamil@kamil.eisp.pl

Re: row-level deadlock problem

From
Tom Lane
Date:
Kamil Kaczkowski <kamil@kamil.eisp.pl> writes:
> I have problems with deadlocks caused by(at least I think so) row-level
> locks and I can't find the reason.

The failure seems clearly a deadlock on row-level locks.  Are you
certain you've removed all relevant FKs (those pointing to the table
as well as out of it)?  Another possible explanation is if the UPDATE
command can update more than one row --- in that case different backends
might happen to reach the target rows in different orders.

            regards, tom lane

Re: row-level deadlock problem

From
Kamil Kaczkowski
Date:
On Fri, 26 Nov 2004, Tom Lane wrote:

> Kamil Kaczkowski <kamil@kamil.eisp.pl> writes:
> > I have problems with deadlocks caused by(at least I think so) row-level
> > locks and I can't find the reason.
>
> The failure seems clearly a deadlock on row-level locks.  Are you
> certain you've removed all relevant FKs (those pointing to the table
> as well as out of it)?
Yes, I browsed whole database schema, all FKs has been dropped.
> Another possible explanation is if the UPDATE
> command can update more than one row --- in that case different backends
> might happen to reach the target rows in different orders.
This could be it.
Yes, this UPDATE changes several rows, I didn't know this can be a
problem.
My understanding was that row-level lock at UPDATE statement is somehow
atomic and it locks all rows matched at once.
But what's the solution? How can I force UPDATEs to lock rows in the same
order? There's no ORDER BY clause for UPDATE.
Thanks for help.
--
Kamil Kaczkowski
kamil@kamil.eisp.pl

Re: row-level deadlock problem

From
Tom Lane
Date:
Kamil Kaczkowski <kamil@kamil.eisp.pl> writes:
>> Another possible explanation is if the UPDATE
>> command can update more than one row --- in that case different backends
>> might happen to reach the target rows in different orders.

> Yes, this UPDATE changes several rows, I didn't know this can be a
> problem.

> My understanding was that row-level lock at UPDATE statement is somehow
> atomic and it locks all rows matched at once.

Nope.

> But what's the solution? How can I force UPDATEs to lock rows in the same
> order? There's no ORDER BY clause for UPDATE.

Change things so you don't need to update more than one row per query,
perhaps?  The lack of any primary key on that table was already pretty
disturbing from a database-theory point of view.  Maybe you should
rethink the table layout.

            regards, tom lane

Re: row-level deadlock problem

From
Kamil Kaczkowski
Date:
On Fri, 26 Nov 2004, Tom Lane wrote:

> > My understanding was that row-level lock at UPDATE statement is somehow
> > atomic and it locks all rows matched at once.
>
> Nope.
>
> > But what's the solution? How can I force UPDATEs to lock rows in the same
> > order? There's no ORDER BY clause for UPDATE.
>
> Change things so you don't need to update more than one row per query,
> perhaps?  The lack of any primary key on that table was already pretty
> disturbing from a database-theory point of view.  Maybe you should
> rethink the table layout.
Yes, I know. I'm not the developer of this application, my job is
only to find reasons behind those deadlocks and suggest solution.
Anyway I'm suprised that UPDATE statements are so locking sensitive.
Consider more general case with following table:
CREATE TABLE test(id serial not null primary key,val integer not null);

Does it mean that there's no simple way(without explicit locking) to run:
UPDATE test SET val=1;
safely in concurrent transactions(let's name them T1,T2) assuming that
other transactions are modifing table frequently, e.g.:
T1: UPDATE started
... some other activity on table test ...
T2: UPDATE started(with different order then T1's update)
DEADLOCK

If it looks like this, I'm probably lucky it didn't bite me before.
Best regards.
--
Kamil Kaczkowski
kamil@kamil.eisp.pl

Re: row-level deadlock problem

From
Martijn van Oosterhout
Date:
On Sat, Nov 27, 2004 at 03:20:16AM +0100, Kamil Kaczkowski wrote:
> > Change things so you don't need to update more than one row per query,
> > perhaps?  The lack of any primary key on that table was already pretty
> > disturbing from a database-theory point of view.  Maybe you should
> > rethink the table layout.
> Yes, I know. I'm not the developer of this application, my job is
> only to find reasons behind those deadlocks and suggest solution.
> Anyway I'm suprised that UPDATE statements are so locking sensitive.

It's not the locking on the UPDATE that's getting you. Multiple updates
can run concurrently (depending on your serialization level anyway, I'm
talking about default setup here).

Where the problem is is the foreign key locks. The usual thing is to
sort the rows you are updating in such a way that the foreign keys
references are always processed in the same order, hence can't
deadlock.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: row-level deadlock problem

From
Kamil Kaczkowski
Date:
On Sat, 27 Nov 2004, Martijn van Oosterhout wrote:

> On Sat, Nov 27, 2004 at 03:20:16AM +0100, Kamil Kaczkowski wrote:
> > > Change things so you don't need to update more than one row per query,
> > > perhaps?  The lack of any primary key on that table was already pretty
> > > disturbing from a database-theory point of view.  Maybe you should
> > > rethink the table layout.
> > Yes, I know. I'm not the developer of this application, my job is
> > only to find reasons behind those deadlocks and suggest solution.
> > Anyway I'm suprised that UPDATE statements are so locking sensitive.
>
> It's not the locking on the UPDATE that's getting you. Multiple updates
> can run concurrently (depending on your serialization level anyway, I'm
> talking about default setup here).
>
> Where the problem is is the foreign key locks. The usual thing is to
> sort the rows you are updating in such a way that the foreign keys
> references are always processed in the same order, hence can't
> deadlock.
See earlier posts in this thread, I have no foreign key constraints on
this table.
Regards.
--
Kamil Kaczkowski
kamil@kamil.eisp.pl

Re: row-level deadlock problem

From
Martijn van Oosterhout
Date:
On Sat, Nov 27, 2004 at 03:40:11PM +0100, Kamil Kaczkowski wrote:
> > It's not the locking on the UPDATE that's getting you. Multiple updates
> > can run concurrently (depending on your serialization level anyway, I'm
> > talking about default setup here).
> >
> > Where the problem is is the foreign key locks. The usual thing is to
> > sort the rows you are updating in such a way that the foreign keys
> > references are always processed in the same order, hence can't
> > deadlock.
> See earlier posts in this thread, I have no foreign key constraints on
> this table.

Well, there has to be something, since an UPDATE in Read Committed mode
simply doesn't have any locks that can deadlock. It has to be at least
a SELECT FOR UPDATE, which is the lock foreign keys use.

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: row-level deadlock problem

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Well, there has to be something, since an UPDATE in Read Committed mode
> simply doesn't have any locks that can deadlock.

You're mistaken; it takes a row lock on each row it updates.  I'm not
sure why the two UPDATEs are visiting the same rows in different orders,
but if they do the failure is certainly possible.

            regards, tom lane

Re: row-level deadlock problem

From
Alvaro Herrera
Date:
On Sat, Nov 27, 2004 at 12:58:36PM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Well, there has to be something, since an UPDATE in Read Committed mode
> > simply doesn't have any locks that can deadlock.
>
> You're mistaken; it takes a row lock on each row it updates.  I'm not
> sure why the two UPDATEs are visiting the same rows in different orders,
> but if they do the failure is certainly possible.

One of them could be using an indexscan while the other is not.  If the
heap is in reverse order compared to the scan, that would explain it.

Is there a solution to this problem?  The FK deadlock problem can be
fixed with shared row locks, but this is a different one.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"La felicidad no es mañana. La felicidad es ahora"

Re: row-level deadlock problem

From
Kamil Kaczkowski
Date:
On Sat, 27 Nov 2004, Alvaro Herrera wrote:

> > You're mistaken; it takes a row lock on each row it updates.  I'm not
> > sure why the two UPDATEs are visiting the same rows in different orders,
> > but if they do the failure is certainly possible.
>
> One of them could be using an indexscan while the other is not.  If the
> heap is in reverse order compared to the scan, that would explain it.
>
> Is there a solution to this problem?  The FK deadlock problem can be
> fixed with shared row locks, but this is a different one.
>
>
In my case deadlock happens between two identical statements executed
from different transactions and they have the same execution plan(index
scan on one attribute - 'color' in schema I presented).
Also, there's no other modifing statements in logs at the time so only
thing that can change scan order is the first UPDATE called of two
involved in deadlock(or the same statement exeduted from some third
transaction, but I don't think so).
Could someone with more knowledge on internals of btree indexes provide
info on how this can happen, what causes index scan order to change?
Thinking of it, possibility of such event should be very low, but I'm
getting 2000 deadlocks during busy day.
Regards.
--
Kamil Kaczkowski
kamil@kamil.eisp.pl

Re: row-level deadlock problem

From
Tom Lane
Date:
Kamil Kaczkowski <kamil@kamil.eisp.pl> writes:
>>> You're mistaken; it takes a row lock on each row it updates.  I'm not
>>> sure why the two UPDATEs are visiting the same rows in different orders,
>>> but if they do the failure is certainly possible.
>>
>> One of them could be using an indexscan while the other is not.  If the
>> heap is in reverse order compared to the scan, that would explain it.
>>
> In my case deadlock happens between two identical statements executed
> from different transactions and they have the same execution plan(index
> scan on one attribute - 'color' in schema I presented).

That's a bit hard to believe; once the rows are entered in the index
their relative order won't change anymore, so it's real hard to see how
two indexscans could visit them in different orders.

IIRC you said that these commands were being done inside plpgsql
functions, so it's possible that the planner is doing something
different with the parameterized plans than what you see in a simple
EXPLAIN with values already inserted.  Still, it's odd that you might
get different plans in different executions of the same function.

I think there is some factor we're not seeing here.  Is it possible that
one backend has a cached plan much older than the other one, and that
the planner's plan choice changed over time?

            regards, tom lane

Re: row-level deadlock problem

From
Kamil Kaczkowski
Date:
On Sat, 27 Nov 2004, Tom Lane wrote:

> > In my case deadlock happens between two identical statements executed
> > from different transactions and they have the same execution plan(index
> > scan on one attribute - 'color' in schema I presented).
>
> That's a bit hard to believe; once the rows are entered in the index
> their relative order won't change anymore, so it's real hard to see how
> two indexscans could visit them in different orders.
>
> IIRC you said that these commands were being done inside plpgsql
> functions, so it's possible that the planner is doing something
> different with the parameterized plans than what you see in a simple
> EXPLAIN with values already inserted.  Still, it's odd that you might
> get different plans in different executions of the same function.
>
> I think there is some factor we're not seeing here.  Is it possible that
> one backend has a cached plan much older than the other one, and that
> the planner's plan choice changed over time?
Theorically possible, but chances are very low.
I think condition which now use index is most selective all the time, even
if not, I don't believe planner could change plan so quickly with dataset
change. Also, connections to this database have rather short life.
I can add EXECUTE in front of UPDATE to verify cached plan
theory. I'll try that during the week, but I don't think it'd change
anything.
I agree it looks like we're missing something. Any other theories,
suggestions what should I check?
I'll try to isolate problem to test case by writing scripts to simulate actual load but it'll take same time.
Regards.
--
Kamil Kaczkowski
kamil@kamil.eisp.pl