Thread: postgresql93-9.3.5: deadlock when updating parent table expected?

postgresql93-9.3.5: deadlock when updating parent table expected?

From
Dmitry O Litvintsev
Date:
Hi,

I recently updated to postgresql93-9.3.5 (from 9.2.9). I see frequent deadlocks
when updating parent table in insert into child table. There is foreign key constraint between
child table and parent table. Parent table is updated on by trigger in insert into child table. So
pretty much standard thing. Is it expected to deadlock?

A simplified version:

 create table volume (
   id serial primary key,
   name varchar,
   counter integer default(0));

 create table file (
   id serial primary key,
   name varchar,
   volume bigint, foreign key (volume) references volume(id));

 create or replace function update_volume_file_counter()
 returns "trigger" as $$
 begin
 if (tg_op='INSERT') then
  update volume set counter=counter+1 where volume.id=new.volume;
  return new;
 elseif (tg_op='DELETE') then
  update volume set counter=counter-1 where volume.id=old.volume;
  return old;
 end if;
 end;
 $$
 language plpgsql;

 create trigger update_volume_counter
    after insert or delete on file
    for each row
    execute procedure update_volume_file_counter();

So record is inserted into file table and counter gets updated in volume table. Nothing
fancy.

insert into volume (name) values ('foo');
insert into file(name,volume) values ('f1',(select id from volume where name='foo'));
insert into file(name,volume) values ('f2',(select id from volume where name='foo'));

select * from volume;
 id | name | counter
----+------+---------
  2 | foo  |       2
(1 row)

delete from file where name='f2';
DELETE 1
billing=# select * from volume;
 id | name | counter
----+------+---------
  2 | foo  |       1
(1 row)

So, counter increments/decrements as it should.
Works fine.
But in real life application where multiple threads are inserting into file
table I see sometimes:

CSTERROR:  deadlock detected
 Process 24611 waits for ExclusiveLock on tuple (1749,58) of relation 138328329 of database 138328263; blocked by
process25082. 
 Process 25082 waits for ShareLock on transaction 14829630; blocked by process 24611.
 Process 24611:             update volume set counter=counter+1 where id=new.volume;
 Process 25082:             insert into file(name,volume) values('f10000',(select id from volume where name='foo'));
  CSTHINT:  See server log for query details.

(not a "real" log file excerpt).

This does not happen all the time, happens sometimes when multiple threads "add" file to the same volume;.

Question - am I doing something wrong or this deadlock is expected? ( I read somewhere
that when inserting into  child table the corresponding record of parent table is locked).
I did not seem to encounter this issue in postgresql 9.2 and 8.4 which I had before.

Should I drop foreign key constraint ?

Thanks,
Dmitry

Re: postgresql93-9.3.5: deadlock when updating parent table expected?

From
Bill Moran
Date:
On Wed, 18 Feb 2015 04:55:47 +0000
Dmitry O Litvintsev <litvinse@fnal.gov> wrote:

> Hi,
>
> I recently updated to postgresql93-9.3.5 (from 9.2.9). I see frequent deadlocks
> when updating parent table in insert into child table. There is foreign key constraint between
> child table and parent table. Parent table is updated on by trigger in insert into child table. So
> pretty much standard thing. Is it expected to deadlock?
>
> A simplified version:
>
>  create table volume (
>    id serial primary key,
>    name varchar,
>    counter integer default(0));
>
>  create table file (
>    id serial primary key,
>    name varchar,
>    volume bigint, foreign key (volume) references volume(id));
>
>  create or replace function update_volume_file_counter()
>  returns "trigger" as $$
>  begin
>  if (tg_op='INSERT') then
>   update volume set counter=counter+1 where volume.id=new.volume;
>   return new;
>  elseif (tg_op='DELETE') then
>   update volume set counter=counter-1 where volume.id=old.volume;
>   return old;
>  end if;
>  end;
>  $$
>  language plpgsql;
>
>  create trigger update_volume_counter
>     after insert or delete on file
>     for each row
>     execute procedure update_volume_file_counter();
>
> So record is inserted into file table and counter gets updated in volume table. Nothing
> fancy.
>
> insert into volume (name) values ('foo');
> insert into file(name,volume) values ('f1',(select id from volume where name='foo'));
> insert into file(name,volume) values ('f2',(select id from volume where name='foo'));
>
> select * from volume;
>  id | name | counter
> ----+------+---------
>   2 | foo  |       2
> (1 row)
>
> delete from file where name='f2';
> DELETE 1
> billing=# select * from volume;
>  id | name | counter
> ----+------+---------
>   2 | foo  |       1
> (1 row)
>
> So, counter increments/decrements as it should.
> Works fine.
> But in real life application where multiple threads are inserting into file
> table I see sometimes:
>
> CSTERROR:  deadlock detected
>  Process 24611 waits for ExclusiveLock on tuple (1749,58) of relation 138328329 of database 138328263; blocked by
process25082. 
>  Process 25082 waits for ShareLock on transaction 14829630; blocked by process 24611.
>  Process 24611:             update volume set counter=counter+1 where id=new.volume;
>  Process 25082:             insert into file(name,volume) values('f10000',(select id from volume where name='foo'));
>   CSTHINT:  See server log for query details.
>
> (not a "real" log file excerpt).
>
> This does not happen all the time, happens sometimes when multiple threads "add" file to the same volume;.
>
> Question - am I doing something wrong or this deadlock is expected? ( I read somewhere
> that when inserting into  child table the corresponding record of parent table is locked).
> I did not seem to encounter this issue in postgresql 9.2 and 8.4 which I had before.

Operations on the file table will take out a sharelock on the
corresponding row in the volume table, to ensure the foreign
key isn't made invalid by another process while this transaction
is in progress.

In order for this to deadlock, I believe you would have to have
2 processes operating on the same volume id at the same time.
You're using ambiguous terms like "sometimes" to describe the
frequency. The chance of it happening is a factor of how much
INSERT/DELETE traffic you have on the file table, and how often
those INSERT/DELETEs center around a single volume id.

> Should I drop foreign key constraint ?

If you don't feel that the relational guarantees provided by the
constraint are necessary, then you should delete it. You should
NOT delete the foreign key in an attempt to reduce deadlocks, as
you'd simply be avoiding one relational problem by allowing another
to happen.

Deadlocks are a perfectly normal consequence of high write activity.
It's possible to design schemas and access patterns that can't
deadlock, but it's time-consuming, complex, and generally performs
poorly.

In this case, however, I think you can avoid the deadlock with the
following:

BEGIN;
SELECT id FROM volume WHERE id = $? FOR UPDATE;
-- INSERT or DELETE here
COMMIT;

I don't believe this will create a significant performance
degradation, but you'll have to test it against your workload
to be sure.

A more general solution is to have your application code catch
deadlocks and replay the applicable transaction when they happen.
A deadlock can generally be consider "I can't do that right now,
please try again later" and unless the server is under a
tremendous load, the second attemp usually succeeds (of course,
there is a chance that it will deadlock again, so you have to
take into account that it might take an arbitrary number of
attempts before it succeeds)

I've seen this all too many times: many application developers
assume that a deadlock is an error that should never happen, and
this seems to result from the fact that most application developers
have only worked on applications that are 99% read and only 1%
write, thus they see deadlock scenarios so seldom that they have
no experience with them. Education is really the key here, and
teaching developers that the following pattern is terrible
design:

try {
  // SQL operations here
}
catch (SQLException e) {
  e.printStackTrace();
}

--
Bill Moran


Re: postgresql93-9.3.5: deadlock when updating parent table expected?

From
Alvaro Herrera
Date:
Dmitry O Litvintsev wrote:
> Hi,
>
> I recently updated to postgresql93-9.3.5 (from 9.2.9). I see frequent deadlocks
> when updating parent table in insert into child table. There is foreign key constraint between
> child table and parent table. Parent table is updated on by trigger in insert into child table. So
> pretty much standard thing. Is it expected to deadlock?

This is probably caused by a bug that was fixed in 9.3.6:

Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Branch: master [0e5680f47] 2014-12-26 13:52:27 -0300
Branch: REL9_4_STABLE Release: REL9_4_1 [0e3a1f71d] 2014-12-26 13:52:27 -0300
Branch: REL9_3_STABLE Release: REL9_3_6 [048912386] 2014-12-26 13:52:27 -0300

    Grab heavyweight tuple lock only before sleeping

    We were trying to acquire the lock even when we were subsequently
    not sleeping in some other transaction, which opens us up unnecessarily
    to deadlocks.  In particular, this is troublesome if an update tries to
    lock an updated version of a tuple and finds itself doing EvalPlanQual
    update chain walking; more than two sessions doing this concurrently
    will find themselves sleeping on each other because the HW tuple lock
    acquisition in heap_lock_tuple called from EvalPlanQualFetch races with
    the same tuple lock being acquired in heap_update -- one of these
    sessions sleeps on the other one to finish while holding the tuple lock,
    and the other one sleeps on the tuple lock.

    Per trouble report from Andrew Sackville-West in
    http://www.postgresql.org/message-id/20140731233051.GN17765@andrew-ThinkPad-X230

    His scenario can be simplified down to a relatively simple
    isolationtester spec file which I don't include in this commit; the
    reason is that the current isolationtester is not able to deal with more
    than one blocked session concurrently and it blocks instead of raising
    the expected deadlock.  In the future, if we improve isolationtester, it
    would be good to include the spec file in the isolation schedule.  I
    posted it in
    http://www.postgresql.org/message-id/20141212205254.GC1768@alvh.no-ip.org

    Hat tip to Mark Kirkwood, who helped diagnose the trouble.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: postgresql93-9.3.5: deadlock when updating parent table expected?

From
Dmitry O Litvintsev
Date:
Thank you, Bill,

Yes, deadlock occurs when there are multiple processes insert
concurrently into file table with the same volume id field.
I used "sometimes"  as opposed to "all the time".

I think you advise to retry transaction or add select for update prior
to insert. I will pursue this (together with upgrade to 9.3.6 suggested by
Alvaro).

Thanks,
Dmitry


Re: postgresql93-9.3.5: deadlock when updating parent table expected?

From
Bill Moran
Date:
On Wed, 18 Feb 2015 18:30:09 +0000
Dmitry O Litvintsev <litvinse@fnal.gov> wrote:
>
> Yes, deadlock occurs when there are multiple processes insert
> concurrently into file table with the same volume id field.
> I used "sometimes"  as opposed to "all the time".

I resonded in that way since I frequently hear people complaining,
"we're seeing all kinds of deadlocks" as if the whole world is
collapsing under the deadlocks, but when pressed for more information
it turns out they're only seeing a few deadlocks per hour when the
system is under the heaviest load -- that scenario is hardly
unexpected. As a result, having more detailed information than
just "sometimes" helps to understand what's really going on.

> I think you advise to retry transaction or add select for update prior
> to insert. I will pursue this (together with upgrade to 9.3.6 suggested by
> Alvaro).

The nice thing about a retry strategy is that it always works.
The problem with a retry strategy is that it's easy to do wrong
(i.e. it may be more than just the transaction that needs to
restart ... depending on what data has changed, calculations may
need to be redone, the user requeried for certain information,
etc).

The problem with the SELECT ... FOR UPDATE is that it's a bit
slower, and can be significantly slower unders some circumstances,
but it's easier to implement correctly.

The good news form Alvaro is that this is probably happening more
frequently than necessary because of the bug he mentioned ... so
upgrading may cause the problem to happen infrequently enough that
you don't really care about it. The solutions I suggest are still
relevent, they just might not be as immediately important.

--
Bill Moran


Re: postgresql93-9.3.5: deadlock when updating parent table expected?

From
dinesh kumar
Date:
Hi,

If you feel FOR UPDATE is taking much time, then I believe,we can solve this kind of issues using advisory locks, .

Regards,
Dinesh


On Wed, Feb 18, 2015 at 10:45 AM, Bill Moran <wmoran@potentialtech.com> wrote:
On Wed, 18 Feb 2015 18:30:09 +0000
Dmitry O Litvintsev <litvinse@fnal.gov> wrote:
>
> Yes, deadlock occurs when there are multiple processes insert
> concurrently into file table with the same volume id field.
> I used "sometimes"  as opposed to "all the time".

I resonded in that way since I frequently hear people complaining,
"we're seeing all kinds of deadlocks" as if the whole world is
collapsing under the deadlocks, but when pressed for more information
it turns out they're only seeing a few deadlocks per hour when the
system is under the heaviest load -- that scenario is hardly
unexpected. As a result, having more detailed information than
just "sometimes" helps to understand what's really going on.

> I think you advise to retry transaction or add select for update prior
> to insert. I will pursue this (together with upgrade to 9.3.6 suggested by
> Alvaro).

The nice thing about a retry strategy is that it always works.
The problem with a retry strategy is that it's easy to do wrong
(i.e. it may be more than just the transaction that needs to
restart ... depending on what data has changed, calculations may
need to be redone, the user requeried for certain information,
etc).

The problem with the SELECT ... FOR UPDATE is that it's a bit
slower, and can be significantly slower unders some circumstances,
but it's easier to implement correctly.

The good news form Alvaro is that this is probably happening more
frequently than necessary because of the bug he mentioned ... so
upgrading may cause the problem to happen infrequently enough that
you don't really care about it. The solutions I suggest are still
relevent, they just might not be as immediately important.

--
Bill Moran


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: postgresql93-9.3.5: deadlock when updating parent table expected?

From
Dmitry O Litvintsev
Date:
Thanks, Alvaro,

Yes indeed.  I have a test that causes the deadlock almost immediately.
I have upgraded to 9.3.6 and have been running for a few hours now w/o
deadlock errors observed.

Dmitry
________________________________________
From: Alvaro Herrera [alvherre@2ndquadrant.com]
Sent: Wednesday, February 18, 2015 6:19 AM
To: Dmitry O Litvintsev
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] postgresql93-9.3.5: deadlock when updating parent table expected?

Dmitry O Litvintsev wrote:
> Hi,
>
> I recently updated to postgresql93-9.3.5 (from 9.2.9). I see frequent deadlocks
> when updating parent table in insert into child table. There is foreign key constraint between
> child table and parent table. Parent table is updated on by trigger in insert into child table. So
> pretty much standard thing. Is it expected to deadlock?

This is probably caused by a bug that was fixed in 9.3.6:

Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Branch: master [0e5680f47] 2014-12-26 13:52:27 -0300
Branch: REL9_4_STABLE Release: REL9_4_1 [0e3a1f71d] 2014-12-26 13:52:27 -0300
Branch: REL9_3_STABLE Release: REL9_3_6 [048912386] 2014-12-26 13:52:27 -0300

    Grab heavyweight tuple lock only before sleeping

    We were trying to acquire the lock even when we were subsequently
    not sleeping in some other transaction, which opens us up unnecessarily
    to deadlocks.  In particular, this is troublesome if an update tries to
    lock an updated version of a tuple and finds itself doing EvalPlanQual
    update chain walking; more than two sessions doing this concurrently
    will find themselves sleeping on each other because the HW tuple lock
    acquisition in heap_lock_tuple called from EvalPlanQualFetch races with
    the same tuple lock being acquired in heap_update -- one of these
    sessions sleeps on the other one to finish while holding the tuple lock,
    and the other one sleeps on the tuple lock.

    Per trouble report from Andrew Sackville-West in
    http://www.postgresql.org/message-id/20140731233051.GN17765@andrew-ThinkPad-X230

    His scenario can be simplified down to a relatively simple
    isolationtester spec file which I don't include in this commit; the
    reason is that the current isolationtester is not able to deal with more
    than one blocked session concurrently and it blocks instead of raising
    the expected deadlock.  In the future, if we improve isolationtester, it
    would be good to include the spec file in the isolation schedule.  I
    posted it in
    http://www.postgresql.org/message-id/20141212205254.GC1768@alvh.no-ip.org

    Hat tip to Mark Kirkwood, who helped diagnose the trouble.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services