Triaging pg_ctl shutdown hang - Mailing list pgsql-admin

From Joseph Hammerman
Subject Triaging pg_ctl shutdown hang
Date
Msg-id CAHs7QM_yx=KhjwHub7PyqvaosTpb9AQxXzHPFx7Pnu+0hvxLaw@mail.gmail.com
List pgsql-admin
Hi pgsql-admins list,

We recently had an incident precipitated by a fast shutdown (pg_ctl stop -m fast) of PostgreSQL 9.6.22 hanging. Two processes would not exit: the postmaster and the logger process. We had limited visibility into the underlying conditions, since in fast shutdown mode the server refuses new connections and disconnects existing sessions. Even after we escalated to an immediate shutdown, the processes did not exit.

I’m trying to put together a checklist of data to capture so that we can determine the root cause of the hang if we encounter this issue again. For example, we could run echo w > /proc/sysrq-trigger to dump the kernel stack traces of tasks in uninterruptible sleep to the kernel log. Is it worth running strace on the postmaster process and its surviving children? Does pg_controldata surface any useful data?
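To make the checklist concrete, here is a rough sketch of the per-process capture I have in mind. It is not a tested tool. It assumes Linux, assumes the server processes report a comm of "postgres", and assumes it runs as root so /proc/<pid>/stack is readable (that file only exists on kernels built with stack trace support). The function names and the comm_prefix and proc_root parameters are made up for illustration.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST ')' rather than on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]  # e.g. 'S' sleeping, 'D' uninterruptible sleep
    return int(pid_str), comm, state


def read_text(path):
    """Read a /proc file, returning a placeholder if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError as exc:
        return "<unreadable: %s>" % exc.strerror


def snapshot(comm_prefix="postgres", proc_root="/proc"):
    """Collect state, wait channel and kernel stack for matching processes."""
    rows = []
    if not os.path.isdir(proc_root):
        return rows
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were scanning, or odd format
        if not comm.startswith(comm_prefix):
            continue
        rows.append({
            "pid": pid,
            "state": state,
            # cmdline is NUL-separated; postgres writes its process title here
            "title": read_text(os.path.join(base, "cmdline")).replace("\0", " "),
            "wchan": read_text(os.path.join(base, "wchan")),
            "kernel_stack": read_text(os.path.join(base, "stack")),
        })
    return rows
```

Calling snapshot() and printing each returned row should show, for the postmaster and the logger, whether they are in state D and which kernel function they are waiting in. That seems like the first thing to record before attaching strace.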

As a follow-up question, is there a way to keep an administrative backdoor connection open, or to obtain a new one, while a fast shutdown is hanging?

Thanks in advance for any clarity or guidance anyone on the list can provide.

Joe Hammerman
