Thread: BUG #18469: OOM occurs and backend processes are kept in Zombie state.
From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      18469
Logged by:          song yutao
Email address:      2986538596@qq.com
PostgreSQL version: 12.16
Operating system:   Linux
Description:

I was performing a lot of operations on a server deployed with PostgreSQL
12.16. As heavy operations were performed continuously, memory consumption
increased, the OS eventually ran out of memory (OOM), and some background
connection processes that were taking up too much memory were killed. However,
these processes were not successfully killed and remained in zombie state. In
the meantime, the whole database seems to be stuck, and connections via psql
time out.

Below is the status after the OOM happened:

Ruby 7822 0.0 0.6 4485088 110940 の May06 10:24 /usr/pgsql/bin/postmaster -D /var/lib/pgsql/data
Ruby 7874 0.3 0.0 o o sZ May06 33:30 [postmaster] <defunct>
Ruby 7893 0.0 0.0 。 。 sz May06 3:34 [postmaster] <defunct>
Ruby 7919 0.0 0.0 70592 4344 Ss May06 3:27 postgres: stats collector
Ruby 9061 0.0 0.1 4485000 17836 ? Ss May06 3:19 postgres: walwriter
Ruby 9062 0.0 0.0 4486544 2428 ? ss May06 0:03 postgres: autovacuum launcher
Ruby 9063 0.0 0.0 66364 992 ? ss May06 1:27 postgres: archivers last was 00000002000002C5000000FB
Ruby 9064 0.0 0.0 4486384 3280 ? sS May06 00:0 postgres: logical replication launcher
Ruby 14403 0.1 0.0 4487084 3788 ? Ss May06 18:53 postgres: walsender rdsRepl 192.168.13.78(43284) strean
Ruby Ruby 2170474 2170401 0.0 0.0 0.0 0.0 May11 0:05 0:05 [postmaster] <defunct> [postmaster] <defunct>

I would like to know whether the postmaster process is stuck because of the
zombie processes.
PG Bug reporting form <noreply@postgresql.org> writes:
> I was performing a lot of operations on a server deployed with PostgreSQL
> 12.16. As heavy operations were performed continuously, memory consumption
> increased, the OS eventually ran out of memory (OOM), and some background
> connection processes that were taking up too much memory were killed. However,
> these processes were not successfully killed and remained in zombie state. In
> the meantime, the whole database seems to be stuck, and connections via psql
> time out.

It sounds to me like the OOM killer decided to kill the postmaster
process, rather than the child process(es) that were actually eating
memory. That's *extremely* unhelpful behavior. There is some advice
in our manual about configuring your system to not do that.

> Below is the status after the OOM happened:
> Ruby 7822 0.0 0.6 4485088 110940 の May06 10:24 /usr/pgsql/bin/postmaster -D /var/lib/pgsql/data

It's not clear to me where this postmaster process came from, but
it appears to be younger than the other postgres-related processes
you're showing, so they are not its children.

I'd manually nuke all of these processes and start fresh.

regards, tom lane
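
To check which parent each defunct process belongs to (and so whether the running postmaster is really their parent), something like the following sketch may help. It is not from the thread: the names parse_stat and list_zombies are hypothetical, and it assumes Linux with /proc mounted. A zombie entry stays in the process table until its parent calls wait() on it, so the parent PID is the field worth looking at.

```python
import os
from typing import List, NamedTuple


class ProcStat(NamedTuple):
    pid: int
    comm: str
    state: str
    ppid: int


def parse_stat(text: str) -> ProcStat:
    """Parse the leading fields of a Linux /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces
    or parentheses, so split on the first '(' and the last ')'.
    """
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar].strip())
    comm = text[lpar + 1:rpar]
    fields = text[rpar + 1:].split()
    # After the command name: fields[0] is the state, fields[1] the parent PID.
    return ProcStat(pid, comm, fields[0], int(fields[1]))


def list_zombies(proc_root: str = "/proc") -> List[ProcStat]:
    """Return every process under proc_root whose state is 'Z' (zombie)."""
    zombies: List[ProcStat] = []
    if not os.path.isdir(proc_root):
        return zombies  # not Linux, or /proc not mounted
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                st = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unreadable
        if st.state == "Z":
            zombies.append(st)
    return zombies


if __name__ == "__main__":
    for z in list_zombies():
        print(f"zombie pid={z.pid} comm={z.comm} parent pid={z.ppid}")
```

Comparing the reported parent PIDs with the PID of the running postmaster shows whether the defunct entries are its children or leftovers from an earlier one.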