Re: Crash in new pgstats code - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Crash in new pgstats code
Msg-id CA+hUKGL4g=fn6Zne8o3hv4Ek=u8OWK4kcopZfmSj4Rp=ueSckA@mail.gmail.com
In response to Crash in new pgstats code  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Crash in new pgstats code  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Apr 18, 2022 at 7:19 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Sat, Apr 16, 2022 at 02:36:33PM -0700, Andres Freund wrote:
> > which I haven't seen locally. Looks like we have some race between
> > startup process and walreceiver? That seems not great.  I'm a bit
> > confused that walreceiver and archiving are both active at the same time
> > in the first place - that doesn't seem right as things are set up
> > currently.
>
> Yeah, that should be exclusively one or the other, never both.
> WaitForWALToBecomeAvailable() would be a hot spot when it comes to
> decide when a WAL receiver should be spawned by the startup process.
> Except from the recent refactoring of xlog.c or the WAL prefetch work,
> there has not been many changes in this area lately.

Hmm, I'm not sure what is happening here and will try to dig in
tomorrow. One observation from some log scraping: kestrel logged
similar output, with "could not link file", several times before the
main prefetching commit (5dc0418).  I looked back 3 months on
kestrel/HEAD and found these:

 commit  | log
---------+-------------------------------------------------------------------------------------------------------------------
 411b913 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-03-27%2010:57:20&stg=recovery-check
 3d067c5 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-03-29%2017:52:32&stg=recovery-check
 cd7ea75 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-03-30%2015:25:03&stg=recovery-check
 8e053dc | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-03-30%2020:27:44&stg=recovery-check
 4e34747 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-04-04%2020:32:24&stg=recovery-check
 01effb1 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-04-06%2007:32:40&stg=recovery-check
 fbfe691 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-04-07%2005:10:05&stg=recovery-check
 5dc0418 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-04-07%2007:51:00&stg=recovery-check
 bd037dc | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-04-11%2022:00:58&stg=recovery-check
 a4b5754 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-04-12%2004:40:44&stg=recovery-check
 7129a97 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-04-15%2022:42:07&stg=recovery-check
 9f4f0a0 | https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=kestrel&dt=2022-04-16%2020:05:34&stg=recovery-check
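
For anyone wanting to repeat this kind of log scraping, here is a
minimal sketch in Python.  It is not the query or script used to build
the list above; it only illustrates the idea of filtering per-run stage
logs for the "could not link file" message.  The function name
matching_runs and the shape of its input (a dict mapping a commit hash
to the already-downloaded log text for that run) are assumptions made
for illustration.

```python
import re

# The message fragment quoted in this email. Matching the bare phrase is an
# assumption; real log lines carry more context around it.
PATTERN = re.compile(r"could not link file")


def matching_runs(logs):
    """Return (commit, count) pairs for runs whose log contains the message.

    `logs` is a hypothetical input: a dict mapping a commit hash to the
    full text of that run's recovery-check stage log. Runs without a
    match are omitted; input order is preserved.
    """
    hits = []
    for commit, text in logs.items():
        count = len(PATTERN.findall(text))
        if count:
            hits.append((commit, count))
    return hits
```

Fetching the logs is left out on purpose: how the stage logs are
retrieved and stored will vary, so the sketch only covers the filtering
step.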


