
Re: debug a lockup - Mailing list pgsql-admin

From	Scott Ribe
Subject	Re: debug a lockup
Date	February 11 03:12:19
Msg-id	A724D8F7-D1F2-401E-8307-618AAF5B2A13@elevated-dev.com
In response to	Re: debug a lockup (Aislan Luiz Wendling <aislanluiz@hotmail.com>)
List	pgsql-admin


OK, we figured it out--I think.

pgbench was stuck in restart_syscall(<...resuming interrupted read...

it was set to open 100 connections

there were ~20 pg sessions in idle, and the last one (highest pid) in auth

that one was in write to fd 2
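
For anyone wanting to repeat this kind of autopsy on a hung process, here is a minimal sketch, assuming a Linux /proc filesystem and enough permissions to read the target's entries. It shows where a file descriptor points and the scheduler state of the process. The helper names fd_target and proc_state are made up for illustration and are not part of any PostgreSQL or Kubernetes tooling.

```python
import os


def fd_target(pid, fd):
    """Return what /proc/<pid>/fd/<fd> points at, or None if it cannot be read."""
    try:
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        # No /proc (non-Linux), no such pid/fd, or insufficient permissions.
        return None


def proc_state(pid):
    """Return the 'State:' value from /proc/<pid>/status, or None if unreadable."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("State:"):
                    return line.split(":", 1)[1].strip()
    except OSError:
        return None
    return None


# Hypothetical usage against a suspect backend pid:
#   fd_target(pid, 2)   -> where stderr points, e.g. a pipe or a pty
#   proc_state(pid)     -> e.g. "S (sleeping)" for a process blocked in a syscall
```

A backend whose fd 2 resolves to a pipe or pty and that sits in a sleeping state while in a write is consistent with the observation above, though it does not by itself prove the reader stopped draining.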

So... This is running in kubernetes. I was doing some load testing against a storage service (thus 100 connections). PG
was launched manually in a bash session connected to the pod, in k9s. There were ~20 total bash sessions open in k9s
across 15 nodes.

Theory: k9s glitched and stopped reading the piped file descriptor, buffer filled, and PG blocked on the write. (I have
seen prior evidence of less-than-perfect handling of output by k9s). Particularly, I had logging of connections on, so
at auth it would have been writing to stderr.
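
The general mechanism in this theory can be illustrated with a minimal sketch: when nothing reads from a pipe, the writer can only push a bounded number of bytes before the kernel refuses more. The sketch assumes a Unix-like system and an ordinary anonymous pipe; how k9s actually plumbs a pod's stderr is not known here, and the function name fill_pipe is made up for illustration. The write end is put in non-blocking mode so the script reports the limit instead of hanging the way a blocking writer would.

```python
import os


def fill_pipe(chunk=4096):
    """Write to a pipe that nobody reads; return how many bytes fit before it is full."""
    r, w = os.pipe()
    # Non-blocking so a full pipe raises BlockingIOError instead of hanging.
    # A writer in the default blocking mode would simply sleep in write() here.
    os.set_blocking(w, False)
    total = 0
    try:
        while True:
            total += os.write(w, b"x" * chunk)
    except BlockingIOError:
        pass  # Pipe buffer is full: this is the point where a blocking writer stalls.
    finally:
        os.close(r)
        os.close(w)
    return total


if __name__ == "__main__":
    print("bytes accepted before the pipe filled:", fill_pipe())
```

On Linux the default pipe capacity is 64 KiB, so a server logging every connection to an undrained pipe can reach that limit fairly quickly under a 100-connection load test.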

This happened in one of probably over 100 runs of the same test, so not readily reproducible and I wanted to autopsy it
before killing off the hung processes. Unless someone pokes a hole in my theory, at this point I think it is neither
pgbench nor PG nor Pure/Portworx at fault.

--
Scott Ribe
scott_ribe@elevated-dev.com
https://www.linkedin.com/in/scottribe/
