Re: walreceiver termination - Mailing list pgsql-general

From Justin King
Subject Re: walreceiver termination
Msg-id CAE39h23=1sg0zDopX2tm8ggz1Axh60RiTNzR2G8YpVLB52RPww@mail.gmail.com
In response to Re: walreceiver termination  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: walreceiver termination
List pgsql-general
On Thu, Apr 23, 2020 at 3:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Justin King <kingpin867@gmail.com> writes:
> > I assume it would be related to the following:
> > LOG:  incorrect resource manager data checksum in record at 2D6/C259AB90
> > since the walreceiver terminates just after this - but I'm unclear
> > what precisely this means.
>
> What it indicates is corrupt data in the WAL stream.  When reading WAL
> after crash recovery, we assume that that indicates end of WAL.  When
> pulling live data from a source server, it suggests some actual problem
> ... but killing the walreceiver and trying to re-fetch the data might
> be a reasonable response to that.  I'm not sure offhand what the startup
> code thinks it's doing in this context.  It might either be attempting
> to retry, or concluding that it's come to the end of WAL and it ought
> to promote to being a live server.  If you don't see the walreceiver
> auto-restarting then I'd suspect that the latter is happening.
>
>                         regards, tom lane

walreceiver is definitely not restarting -- replication stops cold
right at that segment.  I'm a little unclear where to go from here --
is there additional info that would be useful?
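
One piece of information that may help is which WAL segment file holds the record named in the log line. The following is a minimal Python sketch of that mapping, not something taken from the thread. It assumes the default 16 MB WAL segment size and timeline 1; both are assumptions, and the function name `lsn_to_wal_segment` is hypothetical.

```python
def lsn_to_wal_segment(lsn, timeline=1, seg_size=16 * 1024 * 1024):
    """Map an LSN such as '2D6/C259AB90' to (segment file name, byte offset).

    timeline=1 and seg_size=16 MB are assumed defaults; adjust both to
    match the actual cluster.
    """
    hi_str, lo_str = lsn.split("/")
    # An LSN is a 64-bit byte position printed as two 32-bit hex halves.
    pos = (int(hi_str, 16) << 32) | int(lo_str, 16)
    segs_per_id = 0x100000000 // seg_size  # segments per 4 GB "log id"
    seg_no = pos // seg_size
    # File name layout: timeline, log id, segment within log id (8 hex digits each).
    name = "%08X%08X%08X" % (timeline, seg_no // segs_per_id, seg_no % segs_per_id)
    return name, pos % seg_size


name, offset = lsn_to_wal_segment("2D6/C259AB90")
print(name, hex(offset))  # 00000001000002D6000000C2 0x59ab90
```

Under those assumptions the record sits at offset 0x59AB90 in segment 00000001000002D6000000C2. Comparing that file on the primary and on the standby, or reading it with pg_waldump, should show whether the bad checksum is present at the source or was introduced in transit.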


