Re: soft lockup - CPU#16 stuck for 3124s! [postmaster:2273] - Mailing list pgsql-general

From Ron Johnson
Subject Re: soft lockup - CPU#16 stuck for 3124s! [postmaster:2273]
Msg-id CANzqJaAda267=Noy_bQceGdvHTVs+fQm=AixhRdiqe6rWfZ07g@mail.gmail.com
In response to Re: soft lockup - CPU#16 stuck for 3124s! [postmaster:2273]  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: soft lockup - CPU#16 stuck for 3124s! [postmaster:2273]  (Matthias Apitz <guru@unixarea.de>)
List pgsql-general
On Fri, Mar 22, 2024 at 1:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Matthias Apitz <guru@unixarea.de> writes:
>> We have a PostgreSQL 15.1 server in production at a customer for some
>> weeks (migrated from an older version) on SuSE SLES 15.

>> The customer is facing machine locks and before the Linux server does
>> not respond any more (not even on SSH, only power-cycle reset helps to
>> get it back), short before the fault a lot of messages are in
>> /var/log/messages of the content:

>> # grep watchdog: /var/log/messages
>> ...
>> 2024-03-22T13:11:32.056154+01:00 sunrise kernel: [327844.313048][   C25] watchdog: BUG: soft lockup - CPU#25 stuck for 3069s! [migration/25:166]
>> 2024-03-22T13:12:28.056244+01:00 sunrise kernel: [327900.310267][   C16] watchdog: BUG: soft lockup - CPU#16 stuck for 3124s! [postmaster:2273]
>> 2024-03-22T13:12:28.056340+01:00 sunrise kernel: [327900.311052][   C25] watchdog: BUG: soft lockup - CPU#25 stuck for 3121s! [migration/25:166]

> Sounds like failing hardware to me :-(

Updating to 15.6 would rule out any bugs squashed in the last 15 months.
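
If it helps to see which CPUs and tasks the watchdog is complaining about, here is a rough sketch that summarizes the soft-lockup lines from the log. It assumes the lines look like the ones quoted above; the function name summarize_lockups and the regular expression are my own illustration, not part of any PostgreSQL or kernel tool.

```python
import re

# Matches kernel watchdog lines of the form quoted in this thread, e.g.
# "... watchdog: BUG: soft lockup - CPU#16 stuck for 3124s! [postmaster:2273]"
# (pattern is an assumption based on those three sample lines only)
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]]+):(?P<pid>\d+)\]"
)


def summarize_lockups(lines):
    """Return {cpu: (longest_stuck_seconds, task, pid)} for soft-lockup lines."""
    worst = {}
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m is None:
            continue  # not a soft-lockup report
        cpu = int(m["cpu"])
        secs = int(m["secs"])
        # Keep only the longest stall reported for each CPU.
        if cpu not in worst or secs > worst[cpu][0]:
            worst[cpu] = (secs, m["task"], int(m["pid"]))
    return worst
```

Something like summarize_lockups(open("/var/log/messages", errors="replace")) would then give one entry per CPU. If the stalls show up in kernel threads such as migration/25 as well as in postmaster, that would at least be consistent with the problem sitting below PostgreSQL.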
 
