Re: Anti-critical-section assertion failure in mcxt.c reached by walsender - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Anti-critical-section assertion failure in mcxt.c reached by walsender
Msg-id 43381.1620407899@sss.pgh.pa.us
In response to Re: Anti-critical-section assertion failure in mcxt.c reached by walsender  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Anti-critical-section assertion failure in mcxt.c reached by walsender
List pgsql-hackers
I wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
>> Oh, and I see that 13 has 9989d37d "Remove XLogFileNameP() from the
>> tree" to fix this exact problem.

> Hah, so that maybe explains why thorntail has only shown this in
> the v12 branch.  Should we consider back-patching that?

Realizing that 9989d37d prevents the assertion failure, I went
to see if thorntail had shown EIO failures without assertions.
Looking back 180 days, I found these:

  sysname  |    branch     |      snapshot       |       stage        | l
-----------+---------------+---------------------+--------------------+------------------------------------------------------------------------------------------------------------------------------------------------
 thorntail | HEAD          | 2021-03-19 21:28:15 | recoveryCheck      | 2021-03-20 00:48:48.117 MSK [4089174:11] 008_fsm_truncation.pl PANIC:  could not fdatasync file "000000010000000000000002": Input/output error
 thorntail | HEAD          | 2021-04-06 16:08:10 | recoveryCheck      | 2021-04-06 19:30:54.103 MSK [3355008:11] 008_fsm_truncation.pl PANIC:  could not fdatasync file "000000010000000000000002": Input/output error
 thorntail | REL9_6_STABLE | 2021-04-12 02:38:04 | pg_basebackupCheck | pg_basebackup: could not fsync file "000000010000000000000013": Input/output error

So indeed the kernel-or-hardware problem is affecting other branches.
I suspect that the lack of reports in the pre-v12 branches is mostly
down to there having been many fewer runs on those branches within
the past couple months.

            regards, tom lane


