Re: Pg stuck at 100% cpu, for multiple days - Mailing list pgsql-hackers

From hubert depesz lubaczewski
Subject Re: Pg stuck at 100% cpu, for multiple days
Date
Msg-id 20210831061110.GB32253@depesz.com
In response to Re: Pg stuck at 100% cpu, for multiple days  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: Pg stuck at 100% cpu, for multiple days  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Aug 30, 2021 at 09:09:20PM +0200, Laurenz Albe wrote:
> On Mon, 2021-08-30 at 17:18 +0200, hubert depesz lubaczewski wrote:
> > The thing is - I can't close it with pg_terminate_backend(), and I'd
> > rather not kill -9, as it will, I think, close all other connections,
> > and this is prod server.
> 
> Of course the cause should be fixed, but to serve your immediate need:
> 
> After calling pg_terminate_backend(), you can attach gdb to the backend and then run
> 
>   print ProcessInterrupts()
> 
> That will cause the backend to exit normally without crashing the server.
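
For reference, the two steps quoted above could be scripted roughly as below. This is only a sketch under assumptions: the PID is a made-up placeholder, the helper names are hypothetical, gdb must be installed with symbols for the postgres binary, and it has to be run as the OS user that owns the backend. It only builds the SQL statement and the gdb command line (using gdb's `-p`, `-batch` and `-ex` options) and prints them; it does not run anything against a server.

```python
import shlex


def terminate_sql(backend_pid: int) -> str:
    """SQL that asks the backend to terminate (sets the pending interrupt)."""
    return f"SELECT pg_terminate_backend({int(backend_pid)});"


def build_gdb_command(backend_pid: int) -> list[str]:
    """argv for a one-shot gdb session that calls ProcessInterrupts()."""
    if backend_pid <= 0:
        raise ValueError("backend_pid must be a positive integer")
    return [
        "gdb",
        "-p", str(backend_pid),              # attach to the stuck backend
        "-batch",                            # run the commands, then detach and exit
        "-ex", "print ProcessInterrupts()",  # the call suggested in the quoted mail
    ]


if __name__ == "__main__":
    pid = 12345  # hypothetical PID, taken from pg_stat_activity in real use
    print(terminate_sql(pid))
    print(shlex.join(build_gdb_command(pid)))
```

The order matters: pg_terminate_backend() has to be called first so that the interrupt is pending when gdb invokes ProcessInterrupts().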

I got this mail too late, and the decision was made to restart Pg. After
the restart everything cleaned up nicely.

So, while I can't help more with diagnosing the problem, I think it
might be good to try to figure out what could have happened.

On my end I gathered some more data:
1. the logical replication app is debezium
2. as far as I can tell it was patched against
   https://issues.redhat.com/browse/DBZ-1596
3. the app was gone (the kubernetes cluster was shut down) in the meantime.
4. the backend was up and running for 12 days, in a tight loop.

depesz


