Re: [BUG] Panic due to incorrect missingContrecPtr after promotion - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
Date
Msg-id 20220808.130654.541433441863454305.horikyota.ntt@gmail.com
In response to Re: [BUG] Panic due to incorrect missingContrecPtr after promotion  ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses Re: [BUG] Panic due to incorrect missingContrecPtr after promotion
List pgsql-hackers
At Fri, 5 Aug 2022 21:28:16 +0000, "Imseih (AWS), Sami" <simseih@amazon.com> wrote in 
> > Would you mind trying the second attached to abtain detailed log on
> > your testing environment? With the patch, the modified TAP test yields
> > the log lines like below.
> 
> I applied the logging patch to 13.7 ( attached is the backport ) and repro'd the 
> Issue.
> 
> I stripped out the relevant parts of the file. Let me know if this is
> helpful.

Thank you very much!

> postgresql.log.2022-08-05-17:2022-08-05 17:18:51 UTC::@:[359]:LOG:  ### [F] @0/10000000: abort=(0/0)0/0, miss=(0/0)0/0,SbyMode=0, SbyModeReq=1
> postgresql.log.2022-08-05-17:2022-08-05 17:22:21 UTC::@:[359]:LOG:  ### [S] @0/10000060: abort=(0/0)0/0, miss=(0/0)0/0,SbyMode=1, SbyModeReq=1

The server seems to have started as a standby after the primary
crashed. Is that correct?

> postgresql.log.2022-08-05-18:2022-08-05 18:38:14 UTC::@:[359]:LOG:  ### [F] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0,SbyMode=1, SbyModeReq=1
> postgresql.log.2022-08-05-18:2022-08-05 18:38:14 UTC::@:[359]:LOG:  ### [S] @6/B6CB27D0: abort=(0/0)0/0, miss=(0/0)0/0,SbyMode=0, SbyModeReq=1

Archive recovery ended here. The server should have been promoted at
that time. Do you see any interesting log lines around this time?

> postgresql.log.2022-08-05-18:2022-08-05 18:50:13 UTC::@:[359]:LOG:  ### [S] @6/B8000198: abort=(0/0)0/0, miss=(0/0)0/0,SbyMode=0, SbyModeReq=1

But recovery continues in non-standby mode. I don't see why it
behaves that way.

> postgresql.log.2022-08-05-18:2022-08-05 18:50:20 UTC::@:[359]:LOG:  ### [A] @6/F3FFFF20: abort=(6/F3FFFF20)0/0, miss=(6/F4000000)0/0,SbyMode=0, SbyModeReq=1
> postgresql.log.2022-08-05-18:2022-08-05 18:50:20 UTC::@:[359]:LOG:  ### [S] @6/F4000030: abort=(0/0)6/F3FFFF20, miss=(0/0)6/F4000000,SbyMode=1, SbyModeReq=1

Then archive recovery starts again.
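
For anyone reading a longer log, the SbyMode flips noted above can be
picked out mechanically. The following is only a rough sketch in
Python, not part of the logging patch: the regular expression is
inferred from the quoted lines alone, and the function and group names
(parse_debug_lines, standby_mode_changes, abort_a, abort_b, and so on)
are made up for illustration. It attaches no meaning to the [F]/[S]/[A]
tags; it only extracts the fields and reports where SbyMode changes.

```python
import re

# Pattern inferred from the quoted "###" debug lines. The group names are
# illustrative labels, not identifiers taken from the logging patch.
# "\s*" after each comma also tolerates a line wrap before "miss=".
LINE_RE = re.compile(
    r"### \[(?P<tag>[A-Z])\] @(?P<lsn>[0-9A-F]+/[0-9A-F]+): "
    r"abort=\((?P<abort_a>[0-9A-F/]+)\)(?P<abort_b>[0-9A-F/]+),\s*"
    r"miss=\((?P<miss_a>[0-9A-F/]+)\)(?P<miss_b>[0-9A-F/]+),\s*"
    r"SbyMode=(?P<sby>[01]),\s*SbyModeReq=(?P<sby_req>[01])"
)


def parse_debug_lines(text):
    """Return one dict of captured fields per '###' debug line in text."""
    return [m.groupdict() for m in LINE_RE.finditer(text)]


def standby_mode_changes(records):
    """Return (lsn, old, new) for each record whose SbyMode differs
    from the SbyMode of the record before it."""
    changes = []
    for prev, cur in zip(records, records[1:]):
        if prev["sby"] != cur["sby"]:
            changes.append((cur["lsn"], prev["sby"], cur["sby"]))
    return changes
```

Feeding the whole log file to parse_debug_lines() and passing the
result to standby_mode_changes() lists each LSN at which SbyMode
differs from the preceding debug line, which are the points questioned
above.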


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


