Re: Make mesage at end-of-recovery less scary. - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Make mesage at end-of-recovery less scary.
Date
Msg-id 20200323194736.uijmupfle2ccvyah@alap3.anarazel.de
Whole thread Raw
In response to Re: Make mesage at end-of-recovery less scary.  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: Make mesage at end-of-recovery less scary.  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
Hi,

On 2020-03-23 10:37:16 +0100, Peter Eisentraut wrote:
> On 2020-03-05 08:06, Kyotaro Horiguchi wrote:
> > | [20866] LOG:  replication terminated by primary server
> > | [20866] DETAIL:  End of WAL reached on timeline 1 at 0/30001C8.
> > | [20866] FATAL:  could not send end-of-streaming message to primary: no COPY in progress

IMO it's a bug that we see this FATAL. I seem to recall that we didn't
use to get that?


> > | [20851] LOG:  reached end of WAL at 0/30001C8 on timeline 1 in archive during standby mode
> > | [20851] DETAIL:  invalid record length at 0/30001C8: wanted 24, got 0
> > 
> > I changed the above to the below, which looks more adequate.
> > 
> > | [24271]  LOG:  replication terminated by primary server on timeline 1 at 0/3000240.
> > | [24271]  FATAL:  could not send end-of-streaming message to primary: no COPY in progress
> > | [24267]  LOG:  reached end of WAL at 0/3000240 on timeline 1 in archive during standby mode
> > | [24267]  DETAIL:  invalid record length at 0/3000240: wanted 24, got 0
> 
> Is this the before and after?  That doesn't seem like a substantial
> improvement to me.  You still get the "scary" message at the end.

It seems like a minor improvement - folding the DETAIL into the message
makes sense to me here. But it indeed doesn't really address the main
issue.

I think we don't want to elide the information about how the end of the
WAL was detected - there are some issues where I found that quite
helpful. But we could reformulate it to be clearer that it's informative
output, not a bug.  E.g. something roughly like

LOG:  reached end of WAL at 0/3000240 on timeline 1 in archive during standby mode
DETAIL: End detected due to invalid record length at 0/3000240: expected 24, got 0
(I first elided the position in the DETAIL, but it could differ from the
one in LOG)

I don't find that very satisfying, but I can't come up with something
that provides the current information, while being less scary than my
suggestion?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Assert() failures during RI checks
Next
From: Andres Freund
Date:
Subject: Re: Make mesage at end-of-recovery less scary.