On Thu, Apr 23, 2020 at 3:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Justin King <kingpin867@gmail.com> writes:
> > I assume it would be related to the following:
> > LOG: incorrect resource manager data checksum in record at 2D6/C259AB90
> > since the walreceiver terminates just after this - but I'm unclear
> > what precisely this means.
>
> What it indicates is corrupt data in the WAL stream. When reading WAL
> after crash recovery, we assume that that indicates end of WAL. When
> pulling live data from a source server, it suggests some actual problem
> ... but killing the walreceiver and trying to re-fetch the data might
> be a reasonable response to that. I'm not sure offhand what the startup
> code thinks it's doing in this context. It might either be attempting
> to retry, or concluding that it's come to the end of WAL and it ought
> to promote to being a live server. If you don't see the walreceiver
> auto-restarting then I'd suspect that the latter is happening.
>
> regards, tom lane

The walreceiver is definitely not restarting -- replication stops cold
right at that segment. I'm a little unclear where to go from here --
is there additional info that would be useful?
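
For reference, here is a minimal sketch of how the LSN in the log line above
maps to a WAL segment file name and byte offset, so the same file can be
compared on the primary and the standby. It assumes the default 16 MB
wal_segment_size and timeline 1, neither of which is stated in this thread,
and the function name is hypothetical:

```python
def wal_segment_for_lsn(lsn: str, timeline: int = 1,
                        seg_size: int = 16 * 1024 * 1024):
    """Return (segment_file_name, offset_in_segment) for an LSN like '2D6/C259AB90'.

    timeline and seg_size are assumptions: adjust them to match the cluster.
    """
    high_str, low_str = lsn.split("/")
    # An LSN is a 64-bit byte position printed as two 32-bit hex halves.
    position = (int(high_str, 16) << 32) | int(low_str, 16)

    segno = position // seg_size               # absolute segment number
    segs_per_logid = 0x100000000 // seg_size   # segments per 4 GB "log id"

    # Segment names are three 8-digit uppercase hex fields:
    # timeline, log id, segment within that log id.
    name = "%08X%08X%08X" % (timeline,
                             segno // segs_per_logid,
                             segno % segs_per_logid)
    return name, position % seg_size


if __name__ == "__main__":
    name, offset = wal_segment_for_lsn("2D6/C259AB90")
    print(name, hex(offset))
```

If the timeline or segment size differs on this cluster, the file name will
differ accordingly.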