Thread: Foreign Keys and Deadlocks
Howdy, We have a process that's deadlocking frequently. It's basically multiple threads inserting data into a single table. That table has FK constraints to 3 other tables. I understand how an FK check will cause a sharelock to be acquired on the reference table and in some instances that leads to or at least participates in a deadlock. I don't think that's the case here, (or at least not the entire case) but I could use some assistance in helping to convince my developers of that ;). They'd like to just remove the FK and be done with it. I've had this link sent to me probably 100 times in the past day or so to support the theory that postgres is just deadlocking itself: http://archives.postgresql.org/pgsql-general/2004-01/msg00272.php I think that's a misinterpretation, and I also assume PGs come quite a ways since then (i'm on PG9.0). The actual error is: Nov 1 13:59:19 db001 postgres-dev[480]: [5470-1] user=x,db=y,pid=480,2011-11-01 20:59:19 UTC ERROR: deadlock detected Nov 1 13:59:19 db001 postgres-dev[480]: [5470-2] user=x,db=y,pid=480,2011-11-01 20:59:19 UTC DETAIL: Process 480 waits forShareLock on transaction 4537069; blocked by process 471. Nov 1 13:59:19 db001 postgres-dev[480]: [5470-3] #011Process 471 waits for ShareLock on transaction 4537063; blocked by process480. Nov 1 13:59:19 db001 postgres-dev[480]: [5470-4] #011Process 480: insert into a (col1, col2, col3, col4) values ($1, $2,$3, $4) Nov 1 13:59:19 db001 postgres-dev[480]: [5470-5] #011Process 471: insert into a (col1, col2, col3, col4) values ($1, $2,$3, $4) Nov 1 13:59:19 db001 postgres-dev[480]: [5470-6] user=x,db=y,pid=480,2011-11-01 20:59:19 UTC HINT: See server log for querydetails. Here's the scenario table a ( int col1 references b, int col2 references c, int col3 references d, text col4 ) The app, basically, does a ton of parallel, possibly duplicate, inserts into table a. That's all it's supposed to be doing (hibernate's involved though, so anything goes). Nothing else touches those tables. Is it possible for a deadlock to occur under those circumstances? I suspect that it has to be a transaction, and that further up in the TX is an update to one of the reference tables in each TX. If we remove the FKs we no longer get the deadlock, but I'd actually like to know the cause. Thanks Dave
On Thu, Nov 03, 2011 at 03:30:20PM -0700, David Kerr wrote: - Howdy, - - We have a process that's deadlocking frequently. It's basically multiple threads inserting data into a single table. - - That table has FK constraints to 3 other tables. - - I understand how an FK check will cause a sharelock to be acquired on the reference table and in some instances that - leads to or at least participates in a deadlock. - - I don't think that's the case here, (or at least not the entire case) but I could use some assistance in helping - to convince my developers of that ;). They'd like to just remove the FK and be done with it. [snip] So it appears that I'm the big dummy, and that you can deadlock with just inserts. I did more digging and found some good discussions on the subject in general, but most of the examples out there contain explicit updates (which is why i was confused) but it looks like it's being addressed. http://justatheory.com/computers/databases/postgresql/fk-locks-project.html http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg158205.html http://www.commandprompt.com/blogs/alvaro_herrera/2010/11/fixing_foreign_key_deadlocks/ Attached is the script to reproduce it with only inserts (for postarities sake) drop table a; drop table b; drop table c; drop table d; create table b ( bref int, description text); alter table b add primary key (bref); create table c ( cref int, description text); alter table c add primary key (cref); create table d ( dref int, description text); alter table d add primary key (dref); create table a ( bref int, cref int, dref int, description text); alter table a add primary key (bref, cref); alter table a add foreign key (bref) REFERENCES b(bref); alter table a add foreign key (cref) REFERENCES c(cref); alter table a add foreign key (dref) REFERENCES d(dref); insert into b values (1,'hello'); insert into b values (2,'hello2'); insert into b values (3,'hello3'); insert into b values (4,'hello4'); insert into c values (1,'hello'); insert into c values (2,'hello2'); insert into c values (3,'hello3'); insert into c values (4,'hello4'); insert into d values (1,'hello'); insert into d values (2,'hello2'); insert into d values (3,'hello3'); insert into d values (4,'hello4'); Fire up 2 psqls #SESSION1 ## STEP1 begin; insert into a values (1,1,1,'hello'); ##STEP3 insert into a values (1,2,1,'hello2'); #SESSION2 ## STEP2 begin; insert into a values (1,2,1,'hello2'); ## STEP4 insert into a values (1,1,1,'hello'); You'll get: ERROR: deadlock detected DETAIL: Process 8382 waits for ShareLock on transaction 7222455; blocked by process 6981. Process 6981 waits for ShareLock on transaction 7222456; blocked by process 8382. HINT: See server log for query details
Hi David, On Thu, 2011-11-03 at 15:30 -0700, David Kerr wrote: > I suspect that it has to be a transaction, and that further up in the TX is an update to one of > the reference tables in each TX. This is your cause - updating the referenced table in the same transaction. That will want an exclusive lock on the row, but the shared lock taken by the foreign key check (in another process) is conflicting, and will deadlock when the other process will also want to update some row in the referenced table which is locked by a foreign key check in this process. While the lock on the referenced row was changed to be a shared lock instead of an exclusive lock as in older postgres versions (see http://archives.postgresql.org/pgsql-general/2002-11/msg00397.php for the original problem, which is relaxed now), the lock is still too strong and the deadlock problem remains. A solution is not trivial at all, and involves only locking the row for changes of the referenced columns (which postgres can't do currently). While getting rid of the foreign key will solve your problem, I think it's not the best solution - you can perhaps design a way to not update the referenced tables in the same transaction. Here we adopted a different solution - we run a patched postgres which skips that lock altogether, which means a partially broken foreign key code which mostly works but can leave orphans. I will not recommend to do that though - the reasons we did it that way is that it was the path of least resistance as the application was also running on other DBs (which were the primary DB at that time) and there was no way to make extensive changes to the application code. If I were to change the code, I would have separated the updated fields from the parent table to yet another child table, and have the parent table never updated. That will still have some potential for deadlock (if you don't order the inserts/updates properly) but much less. Cheers, Csaba.
Excerpts from David Kerr's message of vie nov 04 13:01:29 -0300 2011: > I did more digging and found some good discussions on the subject in general, but > most of the examples out there contain explicit updates (which is why i was confused) > but it looks like it's being addressed. > > > http://justatheory.com/computers/databases/postgresql/fk-locks-project.html > http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg158205.html > http://www.commandprompt.com/blogs/alvaro_herrera/2010/11/fixing_foreign_key_deadlocks/ > > Attached is the script to reproduce it with only inserts (for postarities sake) Actually, your script as presented has nothing to do with foreign keys. The cause for the lock and the deadlock is not in the tuple lock code, but in the primary key uniqueness check. You can duplicate your issue with a single one-column table: Session one: alvherre=# create table pk (a int primary key); NOTICE: CREATE TABLE / PRIMARY KEY creará el índice implícito «pk_pkey» para la tabla «pk» CREATE TABLE alvherre=# begin; BEGIN alvherre=# insert into pk values (1); INSERT 0 1 Session two: alvherre=# begin; BEGIN alvherre=# insert into pk values (2); INSERT 0 1 alvherre=# insert into pk values (1); <blocks> Now go back to session one and alvherre=# insert into pk values (2); ERROR: se ha detectado un deadlock DETALLE: El proceso 17430 espera ShareLock en transacción 710; bloqueado por proceso 17495. El proceso 17495 espera ShareLock en transacción 709; bloqueado por proceso 17430. SUGERENCIA: Vea el registro del servidor para obtener detalles de las consultas. This case is not helped by the patch I'm working on. As far as I can see, if you got rid of the PK in table a in your example script, things should work just fine. There is no way to cause FK-induced deadlocks with only inserts in 8.1 and later. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wed, Nov 09, 2011 at 11:11:23AM -0300, Alvaro Herrera wrote: - - Excerpts from David Kerr's message of vie nov 04 13:01:29 -0300 2011: - - > I did more digging and found some good discussions on the subject in general, but - > most of the examples out there contain explicit updates (which is why i was confused) - > but it looks like it's being addressed. - > - > - > http://justatheory.com/computers/databases/postgresql/fk-locks-project.html - > http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg158205.html - > http://www.commandprompt.com/blogs/alvaro_herrera/2010/11/fixing_foreign_key_deadlocks/ - > - > Attached is the script to reproduce it with only inserts (for postarities sake) - - Actually, your script as presented has nothing to do with foreign keys. - The cause for the lock and the deadlock is not in the tuple lock code, - but in the primary key uniqueness check. You can duplicate your issue - with a single one-column table: - - Session one: - - alvherre=# create table pk (a int primary key); - NOTICE: CREATE TABLE / PRIMARY KEY creará el Ãndice implÃcito «pk_pkey» para la tabla «pk» - CREATE TABLE - alvherre=# begin; - BEGIN - alvherre=# insert into pk values (1); - INSERT 0 1 - - Session two: - - alvherre=# begin; - BEGIN - alvherre=# insert into pk values (2); - INSERT 0 1 - alvherre=# insert into pk values (1); - <blocks> - - Now go back to session one and - - alvherre=# insert into pk values (2); - ERROR: se ha detectado un deadlock - DETALLE: El proceso 17430 espera ShareLock en transacción 710; bloqueado por proceso 17495. - El proceso 17495 espera ShareLock en transacción 709; bloqueado por proceso 17430. - SUGERENCIA: Vea el registro del servidor para obtener detalles de las consultas. - - - This case is not helped by the patch I'm working on. As far as I can - see, if you got rid of the PK in table a in your example script, things - should work just fine. There is no way to cause FK-induced deadlocks - with only inserts in 8.1 and later. Ok, well that's good to know. I had planned on testing my script w/o FKs but it slipped my mind. So, aside from removing the PKs do i have any other options? (we use Hibernate and i don't think that I'll be able to removet he Primary Keys, and a serial primary key probably isn't great for this table because it's sort of a staging area (so it gets written to and wiped out frequently) Would you consider this a problem in Pg or is it unavoidable? Thanks Dave
Excerpts from David Kerr's message of mié nov 09 14:52:01 -0300 2011: > On Wed, Nov 09, 2011 at 11:11:23AM -0300, Alvaro Herrera wrote: > - This case is not helped by the patch I'm working on. As far as I can > - see, if you got rid of the PK in table a in your example script, things > - should work just fine. There is no way to cause FK-induced deadlocks > - with only inserts in 8.1 and later. > > Ok, well that's good to know. I had planned on testing my script w/o FKs but it slipped > my mind. > > So, aside from removing the PKs do i have any other options? (we use Hibernate and > i don't think that I'll be able to removet he Primary Keys, and a serial primary key > probably isn't great for this table because it's sort of a staging area (so it gets > written to and wiped out frequently) Not sure about that. > Would you consider this a problem in Pg or is it unavoidable? Well, you have to ensure that only one copy of two or more concurrent insertions of a given PK value will survive. Otherwise the unique constraint would be violated. This is currently implemented with sleeps in the second inserter, which waits until the first transaction is closed. Note that there is some code to support deferred uniqueness checks, which might help, but I don't know if it can be applied to primary keys. I'd recommend reading the manual on that subject. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Hi David, On Wed, 2011-11-09 at 09:52 -0800, David Kerr wrote: > So, aside from removing the PKs do i have any other options? Sure you have: order the inserts by primary key inside each transaction. Then you will not get deadlocks, but inserting the same key again will fail of course (but that's the purpose of the primary key, right ?) Ordering inserts/updates by the columns which cause locks is the first thing to do to avoid dead-locks... Cheers, Csaba.
On Thu, Nov 10, 2011 at 09:09:06AM +0100, Csaba Nagy wrote: - Hi David, - - On Wed, 2011-11-09 at 09:52 -0800, David Kerr wrote: - > So, aside from removing the PKs do i have any other options? - - Sure you have: order the inserts by primary key inside each transaction. - Then you will not get deadlocks, but inserting the same key again will - fail of course (but that's the purpose of the primary key, right ?) - - Ordering inserts/updates by the columns which cause locks is the first - thing to do to avoid dead-locks... - - Cheers, - Csaba. ah, hmmm. i'm not sure if that's an option based on how the program works but I'll forward the suggestion onto the devleoper. thanks!