Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures - Mailing list pgsql-hackers

From Maxim Orlov
Subject Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures
Date
Msg-id CACG=ezZfCTJJTd=uNgMJn2x+5mcCZ0qwLHsLFZVCX=h4LOxEtw@mail.gmail.com
Whole thread Raw
In response to Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures
List pgsql-hackers


On Tue, 15 Nov 2022 at 13:02, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Fri, Nov 11, 2022 at 5:52 PM Maxim Orlov <orlovmg@gmail.com> wrote:

> But, in my view, some improvements may be proposed. We should be more, let's say, cautious (or a paranoid if you wish),
> in pg_walfile_offset_lsn while dealing with user input values. What I mean by that is:
>  - to init input vars of sscanf, i.e. tli, log andseg;
>  - check the return value of sscanf call;
>  - check filename max length.

IsXLogFileName() will take care of this. Also, I've added a new inline
function XLogIdFromFileName() that parses file name and returns log
and seg along with XLogSegNo and timeline id. This new function avoids
an extra sscanf() call as well.

> Another question arises for me: is this function can be called during recovery? If not, we should ereport about this, should we?

It's just a utility function and doesn't depend on any of the server's
current state (unlike pg_walfile_name()), hence it doesn't matter if
this function is called during recovery.

> And one last note here: pg_walfile_offset_lsn is accepting NULL values and return NULL in this case. From a theoretical
> point of view, this is perfectly fine. Actually, I think this is exactly how it supposed to be, but I'm not sure if there are no other opinions here.

These functions are marked as 'STRICT', meaning a null is returned,
without even calling the function, if any of the input parameters is
null. I think we can keep the behaviour the same as its friends.
 
Thanks for the explanations. I think you are right.
 
>    errmsg("offset must not be negative or greater than or equal to the WAL segment size")));

Changed.
 
Confirm. And a timeline_id is added.
 
While on this, I noticed that the pg_walfile_name_offset() isn't
covered in tests. I took an opportunity and added a simple test case
along with pg_walfile_offset_lsn().

I'm attaching the v3 patch set for further review.
 
Great job! We should mark this patch as RFC, shall we?

--
Best regards,
Maxim Orlov.

pgsql-hackers by date:

Previous
From: Juan José Santamaría Flecha
Date:
Subject: Re: Meson add host_system to PG_VERSION_STR
Next
From: Pavel Borisov
Date:
Subject: Re: [PoC] configurable out of disk space elog level