Re: Anti-critical-section assertion failure in mcxt.c reached by walsender - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Anti-critical-section assertion failure in mcxt.c reached by walsender
Date
Msg-id 20210508001418.GA3076445@rfd.leadboat.com
In response to Re: Anti-critical-section assertion failure in mcxt.c reached by walsender  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Anti-critical-section assertion failure in mcxt.c reached by walsender
List pgsql-hackers
On Fri, May 07, 2021 at 01:18:19PM -0400, Tom Lane wrote:
> Realizing that 9989d37d prevents the assertion failure, I went
> to see if thorntail had shown EIO failures without assertions.
> Looking back 180 days, I found these:
> 
>   sysname  |    branch     |      snapshot       |       stage        | l
> -----------+---------------+---------------------+--------------------+----------------------------------------
>  thorntail | HEAD          | 2021-03-19 21:28:15 | recoveryCheck      | 2021-03-20 00:48:48.117 MSK [4089174:11] 008_fsm_truncation.pl PANIC:  could not fdatasync file "000000010000000000000002": Input/output error
>  thorntail | HEAD          | 2021-04-06 16:08:10 | recoveryCheck      | 2021-04-06 19:30:54.103 MSK [3355008:11] 008_fsm_truncation.pl PANIC:  could not fdatasync file "000000010000000000000002": Input/output error
>  thorntail | REL9_6_STABLE | 2021-04-12 02:38:04 | pg_basebackupCheck | pg_basebackup: could not fsync file "000000010000000000000013": Input/output error
> 
> So indeed the kernel-or-hardware problem is affecting other branches.

Having a flaky buildfarm member is bad news.  I'll LD_PRELOAD the attached to
prevent fsync from reaching the kernel.  Hopefully, that will make the
hardware-or-kernel trouble unreachable.  (Changing 008_fsm_truncation.pl
wouldn't avoid this, because fsync=off doesn't affect syncs outside the
backend.)
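
A shim of this kind can be very small.  The following is a minimal sketch,
not the attached file: the file name nosync.c and the choice to stub only
fsync() and fdatasync() are assumptions, and other sync paths (for example
sync_file_range(), msync(), or files opened with O_SYNC/O_DSYNC) are not
covered.

```c
/*
 * nosync.c (hypothetical name): LD_PRELOAD shim that turns fsync() and
 * fdatasync() into no-ops, so the calls never reach the kernel.
 *
 * Assumed build and use on Linux:
 *   cc -shared -fPIC -o nosync.so nosync.c
 *   LD_PRELOAD=$PWD/nosync.so <command>
 */
#include <unistd.h>

/* Same signature as the libc function; report success without syncing. */
int
fsync(int fd)
{
	(void) fd;					/* descriptor is intentionally ignored */
	return 0;
}

/* Same signature as the libc function; report success without syncing. */
int
fdatasync(int fd)
{
	(void) fd;					/* descriptor is intentionally ignored */
	return 0;
}
```

Because the dynamic linker resolves these symbols for every process started
with LD_PRELOAD set, the stubs would apply to frontend programs such as
pg_basebackup as well as to the backend, which is the part fsync=off does
not reach.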

Attachment
