Re: Crash in new pgstats code - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Crash in new pgstats code
Msg-id 20220418145003.ni7tl6pmyokvj2ie@alap3.anarazel.de
In response to Re: Crash in new pgstats code  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Crash in new pgstats code  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi,

On 2022-04-18 22:45:07 +1200, Thomas Munro wrote:
> On Mon, Apr 18, 2022 at 7:19 PM Michael Paquier <michael@paquier.xyz> wrote:
> > On Sat, Apr 16, 2022 at 02:36:33PM -0700, Andres Freund wrote:
> > > which I haven't seen locally. Looks like we have some race between
> > > startup process and walreceiver? That seems not great.  I'm a bit
> > > confused that walreceiver and archiving are both active at the same time
> > > in the first place - that doesn't seem right as things are set up
> > > currently.
> >
> > Yeah, that should be exclusively one or the other, never both.
> > WaitForWALToBecomeAvailable() would be a hot spot when it comes to
> > decide when a WAL receiver should be spawned by the startup process.
> > Except from the recent refactoring of xlog.c or the WAL prefetch work,
> > there has not been many changes in this area lately.
> 
> Hmm, well I'm not sure what is happening here and will try to dig
> tomorrow, but one observation from some log scraping is that kestrel
> logged similar output with "could not link file" several times before
> the main prefetching commit (5dc0418).  I looked back 3 months on
> kestrel/HEAD and found these:

Kestrel doesn't even go back that far - I set it up 23 days ago...

I'm formally on vacation till Thursday, I'll try to look at earlier
instances then. Unless it's already figured out :). I failed at
reproducing it locally, despite a fair bit of effort.

The BF really should break out individual tests into their own stage
logs. The recovery-check stage is 13MB and 150k lines by now.

Greetings,

Andres Freund


