Thread: postgresql93-9.3.5: deadlock when updating parent table expected?
Hi,

I recently updated to postgresql93-9.3.5 (from 9.2.9). I now see frequent deadlocks when updating the parent table on insert into the child table. There is a foreign key constraint between the child table and the parent table, and the parent table is updated by a trigger on insert into the child table. So, a pretty standard setup. Is it expected to deadlock?

A simplified version:

create table volume (
    id serial primary key,
    name varchar,
    counter integer default(0));

create table file (
    id serial primary key,
    name varchar,
    volume bigint,
    foreign key (volume) references volume(id));

create or replace function update_volume_file_counter()
returns "trigger" as $$
begin
    if (tg_op='INSERT') then
        update volume set counter=counter+1 where volume.id=new.volume;
        return new;
    elsif (tg_op='DELETE') then
        update volume set counter=counter-1 where volume.id=old.volume;
        return old;
    end if;
end;
$$ language plpgsql;

create trigger update_volume_counter
    after insert or delete on file
    for each row
    execute procedure update_volume_file_counter();

So a record is inserted into the file table and the counter gets updated in the volume table. Nothing fancy.

insert into volume (name) values ('foo');
insert into file(name,volume) values ('f1',(select id from volume where name='foo'));
insert into file(name,volume) values ('f2',(select id from volume where name='foo'));

select * from volume;
 id | name | counter
----+------+---------
  2 | foo  |       2
(1 row)

delete from file where name='f2';
DELETE 1

select * from volume;
 id | name | counter
----+------+---------
  2 | foo  |       1
(1 row)

So, the counter increments/decrements as it should. Works fine.

But in the real-life application, where multiple threads are inserting into the file table, I sometimes see:

ERROR:  deadlock detected
DETAIL:  Process 24611 waits for ExclusiveLock on tuple (1749,58) of relation 138328329 of database 138328263; blocked by process 25082.
         Process 25082 waits for ShareLock on transaction 14829630; blocked by process 24611.
         Process 24611: update volume set counter=counter+1 where id=new.volume;
         Process 25082: insert into file(name,volume) values('f10000',(select id from volume where name='foo'));
HINT:  See server log for query details.

(Not a "real" log file excerpt.)

This does not happen all the time; it happens sometimes when multiple threads "add" a file to the same volume.

Question: am I doing something wrong, or is this deadlock expected? (I read somewhere that when inserting into the child table, the corresponding record of the parent table is locked.) I did not seem to encounter this issue in postgresql 9.2 and 8.4, which I had before.

Should I drop the foreign key constraint?

Thanks,
Dmitry
On Wed, 18 Feb 2015 04:55:47 +0000
Dmitry O Litvintsev <litvinse@fnal.gov> wrote:

> Hi,
>
> I recently updated to postgresql93-9.3.5 (from 9.2.9). I see frequent deadlocks
> when updating parent table in insert into child table. There is foreign key constraint between
> child table and parent table. Parent table is updated on by trigger in insert into child table. So
> pretty much standard thing. Is it expected to deadlock?

[schema and test case snipped]

> This does not happen all the time, happens sometimes when multiple threads "add" file
> to the same volume.
>
> Question - am I doing something wrong or this deadlock is expected? (I read somewhere
> that when inserting into child table the corresponding record of parent table is locked).
> I did not seem to encounter this issue in postgresql 9.2 and 8.4 which I had before.

Operations on the file table will take a share lock on the corresponding row in the volume table, to ensure the foreign key isn't made invalid by another process while the transaction is in progress. For this to deadlock, I believe you would need two processes operating on the same volume id at the same time.

You're using ambiguous terms like "sometimes" to describe the frequency. The chance of it happening is a factor of how much INSERT/DELETE traffic you have on the file table, and how often those INSERTs/DELETEs center on a single volume id.

> Should I drop foreign key constraint ?

If you don't feel that the relational guarantees provided by the constraint are necessary, then you can drop it. But you should NOT drop the foreign key in an attempt to reduce deadlocks, as you'd simply be avoiding one relational problem by allowing another to happen. Deadlocks are a perfectly normal consequence of high write activity.
It's possible to design schemas and access patterns that can't deadlock, but it's time-consuming, complex, and generally performs poorly. In this case, however, I think you can avoid the deadlock with the following:

BEGIN;
SELECT id FROM volume WHERE id = $? FOR UPDATE;
-- INSERT or DELETE here
COMMIT;

I don't believe this will create a significant performance degradation, but you'll have to test it against your workload to be sure.

A more general solution is to have your application code catch deadlocks and replay the applicable transaction when they happen. A deadlock can generally be considered "I can't do that right now, please try again later", and unless the server is under a tremendous load, the second attempt usually succeeds. (Of course, there is a chance that it will deadlock again, so you have to take into account that it might take an arbitrary number of attempts before it succeeds.)

I've seen this all too many times: many application developers assume that a deadlock is an error that should never happen. This seems to result from the fact that most application developers have only worked on applications that are 99% read and only 1% write, so they see deadlock scenarios so seldom that they have no experience with them. Education is really the key here, and teaching developers that the following pattern is terrible design:

try {
    // SQL operations here
} catch (SQLException e) {
    e.printStackTrace();
}

--
Bill Moran
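Applied to the schema from the original post, Bill's lock-first suggestion might look like the following (a sketch only; the file name 'f10001' is a placeholder in the style of the earlier examples):

```
BEGIN;
-- Lock the parent row first, so concurrent inserters for the same
-- volume queue here instead of colliding inside the trigger.
SELECT id FROM volume WHERE name = 'foo' FOR UPDATE;
INSERT INTO file (name, volume)
    VALUES ('f10001', (SELECT id FROM volume WHERE name = 'foo'));
COMMIT;
```

The trade-off is throughput: all writers for a given volume serialize on that row lock, which is the performance cost Bill says needs to be tested against the real workload.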
Dmitry O Litvintsev wrote:

> Hi,
>
> I recently updated to postgresql93-9.3.5 (from 9.2.9). I see frequent deadlocks
> when updating parent table in insert into child table. There is foreign key constraint between
> child table and parent table. Parent table is updated on by trigger in insert into child table. So
> pretty much standard thing. Is it expected to deadlock?

This is probably caused by a bug that was fixed in 9.3.6:

Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Branch: master [0e5680f47] 2014-12-26 13:52:27 -0300
Branch: REL9_4_STABLE Release: REL9_4_1 [0e3a1f71d] 2014-12-26 13:52:27 -0300
Branch: REL9_3_STABLE Release: REL9_3_6 [048912386] 2014-12-26 13:52:27 -0300

    Grab heavyweight tuple lock only before sleeping

    We were trying to acquire the lock even when we were subsequently
    not sleeping in some other transaction, which opens us up unnecessarily
    to deadlocks.  In particular, this is troublesome if an update tries to
    lock an updated version of a tuple and finds itself doing EvalPlanQual
    update chain walking; more than two sessions doing this concurrently
    will find themselves sleeping on each other because the HW tuple lock
    acquisition in heap_lock_tuple called from EvalPlanQualFetch races with
    the same tuple lock being acquired in heap_update -- one of these
    sessions sleeps on the other one to finish while holding the tuple lock,
    and the other one sleeps on the tuple lock.

    Per trouble report from Andrew Sackville-West in
    http://www.postgresql.org/message-id/20140731233051.GN17765@andrew-ThinkPad-X230

    His scenario can be simplified down to a relatively simple
    isolationtester spec file which I don't include in this commit; the
    reason is that the current isolationtester is not able to deal with more
    than one blocked session concurrently and it blocks instead of raising
    the expected deadlock.  In the future, if we improve isolationtester, it
    would be good to include the spec file in the isolation schedule.

I posted it in
http://www.postgresql.org/message-id/20141212205254.GC1768@alvh.no-ip.org

Hat tip to Mark Kirkwood, who helped diagnose the trouble.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thank you, Bill,

Yes, the deadlock occurs when there are multiple processes inserting concurrently into the file table with the same volume id field. I used "sometimes" as opposed to "all the time".

I think you advise to retry the transaction, or to add a select for update prior to the insert. I will pursue this (together with the upgrade to 9.3.6 suggested by Alvaro).

Thanks,
Dmitry
On Wed, 18 Feb 2015 18:30:09 +0000
Dmitry O Litvintsev <litvinse@fnal.gov> wrote:

> Yes, deadlock occurs when there are multiple processes insert
> concurrently into file table with the same volume id field.
> I used "sometimes" as opposed to "all the time".

I responded that way because I frequently hear people complaining, "we're seeing all kinds of deadlocks" as if the whole world is collapsing under the deadlocks, but when pressed for more information it turns out they're only seeing a few deadlocks per hour when the system is under the heaviest load -- a scenario that is hardly unexpected. As a result, having more detailed information than just "sometimes" helps to understand what's really going on.

> I think you advise to retry transaction or add select for update prior
> to insert. I will pursue this (together with upgrade to 9.3.6 suggested by
> Alvaro).

The nice thing about a retry strategy is that it always works. The problem with a retry strategy is that it's easy to do wrong (i.e. it may be more than just the transaction that needs to restart ... depending on what data has changed, calculations may need to be redone, the user re-queried for certain information, etc).

The problem with SELECT ... FOR UPDATE is that it's a bit slower, and can be significantly slower under some circumstances, but it's easier to implement correctly.

The good news from Alvaro is that this is probably happening more frequently than necessary because of the bug he mentioned ... so upgrading may make the problem happen infrequently enough that you don't really care about it. The solutions I suggest are still relevant, they just might not be as immediately important.

--
Bill Moran
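The retry strategy Bill describes can be sketched as a small helper in the style of his earlier Java fragment. This is an illustration, not code from the thread: the class and method names are invented, and the key detail is that only deadlocks are retried (PostgreSQL reports them with SQLSTATE 40P01), while everything the transaction computed is redone inside the callable on each attempt.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

public class DeadlockRetry {
    // PostgreSQL's SQLSTATE for deadlock_detected
    static final String DEADLOCK_SQLSTATE = "40P01";

    // Run txBody, retrying up to maxAttempts times, but only when the
    // failure is a deadlock; any other SQLException is rethrown at once.
    public static <T> T runWithRetry(Callable<T> txBody, int maxAttempts) throws Exception {
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return txBody.call();
            } catch (SQLException e) {
                if (!DEADLOCK_SQLSTATE.equals(e.getSQLState())) {
                    throw e; // not a deadlock: do not retry
                }
                last = e; // deadlock: loop and redo the whole unit of work
            }
        }
        throw last; // every attempt deadlocked (maxAttempts assumed >= 1)
    }
}
```

In a real application, txBody would open the transaction, redo any dependent calculations, run the INSERT, and commit -- which is exactly Bill's point that "more than just the transaction" may need to restart, so the whole unit of work belongs inside the callable.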
Thanks, Alvaro,

Yes indeed. I have a test that causes the deadlock almost immediately. I have upgraded to 9.3.6 and have been running for a few hours now without deadlock errors observed.

Dmitry

________________________________________
From: Alvaro Herrera [alvherre@2ndquadrant.com]
Sent: Wednesday, February 18, 2015 6:19 AM
To: Dmitry O Litvintsev
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] postgresql93-9.3.5: deadlock when updating parent table expected?

[quoted message snipped]