Re: Wall shiping replica failed to recover database with error: invalid contrecord length 1956 at FED/38FFE208 - Mailing list pgsql-general

From Tom Lane
Subject Re: Wall shiping replica failed to recover database with error: invalid contrecord length 1956 at FED/38FFE208
Date
Msg-id 30577.1573752493@sss.pgh.pa.us
Whole thread Raw
In response to Re: Wall shiping replica failed to recover database with error:invalid contrecord length 1956 at FED/38FFE208  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Wall shiping replica failed to recover database with error:invalid contrecord length 1956 at FED/38FFE208
List pgsql-general
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> At Wed, 2 Oct 2019 19:24:02 -0400, Stephen Frost <sfrost@snowman.net> wrote in
>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>>> Yeah, those messages are all pretty ancient, from when WAL was new and not
>>> to be trusted much.  Perhaps the thing to do is move the existing info
>>> into DETAIL and make the primary message be something like "reached
>>> apparent end of WAL stream".

>> Yes, +1 on that.

> What do you think about this?

It seems overcomplicated.  Why do you need to reach into
emode_for_corrupt_record's private state?  I think we could just
change the message text without that, and leave the emode
machinations as-is.

I don't understand the restriction to "if (RecPtr == InvalidXLogRecPtr)"
either?  Maybe that's fine but the comment fails to explain it.

Another point is that, as the comment for emode_for_corrupt_record
notes, we don't really want to consider that we've hit end-of-WAL
when reading any source except XLOG_FROM_PG_WAL.  So I think the
change of message ought to include a test of the source, but this
doesn't.

Also, the business with an "operation" string violates the message
translatability guideline about "don't assemble a message out of
phrases".  If we want to have that extra detail, it's better just
to make three separate ereport() calls with separately translatable
messages.

Also, it seems wrong that the page TLI check, just below, is where
it is and isn't part of the main set of page header sanity checks.
That's sort of unrelated to this patch, except not really, because
shouldn't a failure of that test also be treated as an "end of WAL"
condition?

            regards, tom lane



pgsql-general by date:

Previous
From: github kran
Date:
Subject: PostGreSQL Replication and question on maintenance
Next
From: Ron
Date:
Subject: Query which shows FK child columns?