We're getting deadlock error messages in the production database logs
during times of inactivity, where the only other thing using the
database (we think) is the every-15-minutes pg_dump process. There
are still database connections up-and-running from unused Hibernate
Java processes, but they Shouldn't Be doing anything (and shouldn't
be holding locks, etc).
The deadlock error message looks like this:
ERROR: deadlock detected
DETAIL: Process 1120 waits for ShareLock on transaction 5847116;
blocked by process 1171.
Process 1171 waits for ExclusiveLock on tuple (6549,28) of relation
37637 of database 37574; blocked by process 1120.
Relation 37636 is the users table (schema attached).
Process 1120 was running an UPDATE query and changing a single row in
the users table. The users table does have foreign keys to 4 other
tables. Is it possible that those foreign key constraints acquire
locks in a different order than pg_dump (as it's SELECTing from the
tables), and it's hitting at *just* the right time to cause a deadlock?
I've tried to reproduce it on a test machine by running pgbench
(after adding foreign keys to the pgbench tables) and pg_dump in
tight loops in two concurrent shell scripts, but no deadlock.
Any ideas on how to track this down?
Under what conditions does a process acquire a ShareLock on another
transaction?
Thanks!
- Chris