Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures
Date
Msg-id CALj2ACWM7j3m0N9frSUCFMUpRv6m0Jk3dd8kFwNOQZ58s+70Jw@mail.gmail.com
Whole thread Raw
In response to Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures
List pgsql-hackers
On Tue, Sep 20, 2022 at 12:57 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2022-Sep-19, Bharath Rupireddy wrote:
>
> > We have a bunch of messages [1] that have an offset, but not LSN in
> > the error message. Firstly, is there an easiest way to figure out LSN
> > from offset reported in the error messages? If not, is adding LSN to
> > these messages along with offset a good idea? Of course, we can't just
> > convert offset to LSN using XLogSegNoOffsetToRecPtr() and report, but
> > something meaningful like reporting the LSN of the page that we are
> > reading-in or writing-out etc.
>
> Maybe add errcontext() somewhere that reports the LSN would be
> appropriate.  For example, the page_read() callbacks have the LSN
> readily available, so the ones in backend could install the errcontext
> callback; or perhaps ReadPageInternal can do it #ifndef FRONTEND.  Not
> sure what is best of those options, but either of those sounds better
> than sticking the LSN in a lower-level routine that doesn't necessarily
> have the info already.

All of the error messages [1] have the LSN from which offset was
calculated, I think we can just append that to the error messages
(something like ".... offset %u, LSN %X/%X: %m") and not complicate
it. Thoughts?

[1]
errmsg("could not read from WAL segment %s, offset %u: %m",
errmsg("could not read from WAL segment %s, offset %u: %m",
errmsg("could not write to log file %s "
       "at offset %u, length %zu: %m",
errmsg("unexpected timeline ID %u in WAL segment %s, offset %u",
errmsg("could not read from WAL segment %s, offset %u: read %d of %zu",
pg_log_error("received write-ahead log record for offset %u with no file open",
"invalid magic number %04X in WAL segment %s, offset %u",
"invalid info bits %04X in WAL segment %s, offset %u",
"invalid info bits %04X in WAL segment %s, offset %u",
"unexpected pageaddr %X/%X in WAL segment %s, offset %u",
"out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Melih Mutlu
Date:
Subject: Re: Summary function for pg_buffercache
Next
From: Robert Haas
Date:
Subject: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands