pgsql: Detect the deadlocks between backends and the startup process. - Mailing list pgsql-committers

From Fujii Masao
Subject pgsql: Detect the deadlocks between backends and the startup process.
Date
Msg-id E1kwzfu-0005La-S7@gemulon.postgresql.org
List pgsql-committers
Detect the deadlocks between backends and the startup process.

Deadlocks involving a recovery conflict on lock can occur between
hot-standby backends and the startup process. If a backend takes
an access exclusive lock on a table and that eventually triggers
a deadlock, the deadlock is detected as expected. Previously, however,
if the startup process took an access exclusive lock and that
eventually triggered a deadlock, the deadlock could not be detected
and could persist even after deadlock_timeout had passed. This is
a bug.

The cause of this bug was that the code handling the recovery
conflict on lock did not consider the deadlock case at all. It assumed
that deadlocks involving the startup process and backends could be
detected by the deadlock detector invoked within the backends.
That assumption was incorrect: the startup process should also
invoke the deadlock detector when necessary.

To fix this bug, this commit makes the startup process invoke
the deadlock detector if deadlock_timeout is reached while handling
the recovery conflict on lock. Specifically, in that case, the startup
process requests all the backends holding the conflicting locks to
check themselves for deadlocks.
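
As a rough illustration of the behavior described above, the following
is a minimal, self-contained sketch and not the code from this commit.
All names (SketchBackend, sketch_on_lock_wait, and so on) are
hypothetical, and the backend state is modeled with plain structs
rather than PostgreSQL's shared memory and signaling. It only shows the
decision: once the startup process has waited for deadlock_timeout, it
asks every backend holding a conflicting lock to check itself for
deadlocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hot-standby backend; not a PostgreSQL struct. */
typedef struct SketchBackend
{
    int  pid;
    bool holds_conflicting_lock;   /* holds a lock the startup process needs */
    bool deadlock_check_requested; /* set when asked to run a deadlock check */
} SketchBackend;

/*
 * Ask every backend that holds a conflicting lock to check itself for
 * deadlocks. Returns the number of backends that were asked.
 */
int
sketch_request_deadlock_checks(SketchBackend *backends, size_t n)
{
    int requested = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (backends[i].holds_conflicting_lock)
        {
            backends[i].deadlock_check_requested = true;
            requested++;
        }
    }
    return requested;
}

/*
 * Called while the startup process is blocked on a recovery conflict on
 * lock. waited_ms is how long it has been waiting; deadlock_timeout_ms
 * plays the role of the deadlock_timeout setting. Nothing is requested
 * until the timeout is reached.
 */
int
sketch_on_lock_wait(SketchBackend *backends, size_t n,
                    int waited_ms, int deadlock_timeout_ms)
{
    assert(waited_ms >= 0 && deadlock_timeout_ms >= 0);

    if (waited_ms < deadlock_timeout_ms)
        return 0;               /* keep waiting; too early to suspect a deadlock */

    return sketch_request_deadlock_checks(backends, n);
}
```

In this sketch, a backend that does not hold a conflicting lock is never
asked to run a check, which matches the description that only the
holders of the conflicting locks are involved.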

Back-patch to v9.6. v9.5 also has this bug, but per discussion we decided
not to back-patch the fix to v9.5, because v9.5 lacks some of the
infrastructure code (e.g., 37c54863cf) that this bug fix depends on.
We could back-patch that code as well, but since the next minor
release is the final one for v9.5, doing so is risky: if the back-patch
unexpectedly introduced a new bug into v9.5, there would be no
chance to fix it. We determined that back-patching to v9.5 would carry
more risk than benefit.

Author: Fujii Masao
Reviewed-by: Bertrand Drouvot, Masahiko Sawada, Kyotaro Horiguchi
Discussion: https://postgr.es/m/4041d6b6-cf24-a120-36fa-1294220f8243@oss.nttdata.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/8900b5a9d59a645b3485f5b046c4c7871b2c4026

Modified Files
--------------
src/backend/storage/ipc/procarray.c |   9 ++-
src/backend/storage/ipc/standby.c   | 143 ++++++++++++++++++++++++++++--------
src/backend/storage/lmgr/proc.c     |   3 +
src/backend/tcop/postgres.c         |  16 +++-
src/include/storage/procarray.h     |   2 +
5 files changed, 141 insertions(+), 32 deletions(-)

