Re: Print physical file path when checksum check fails - Mailing list pgsql-hackers

From Hubert Zhang
Subject Re: Print physical file path when checksum check fails
Date
Msg-id CAB0yremmEKPyvzdx9ufXSQ-4w_6QSkoY3xRgHu6_jLc-TC-VMQ@mail.gmail.com
Whole thread Raw
In response to Re: Print physical file path when checksum check fails  (Hubert Zhang <hzhang@pivotal.io>)
Responses Re: Print physical file path when checksum check fails
List pgsql-hackers
On Wed, Feb 12, 2020 at 5:22 PM Hubert Zhang <hzhang@pivotal.io> wrote:
Thanks Andres,

On Tue, Feb 11, 2020 at 5:30 AM Andres Freund <andres@anarazel.de> wrote:
HHi,

On 2020-02-10 16:04:21 +0800, Hubert Zhang wrote:
> Currently we only print block number and relation path when checksum check
> fails. See example below:
>
> ERROR: invalid page in block 333571 of relation base/65959/656195

> DBA complains that she needs additional work to calculate which physical
> file is broken, since one physical file can only contain `RELSEG_SIZE`
> number of blocks. For large tables, we need to use many physical files with
> additional suffix, e.g. 656195.1, 656195.2 ...
>
> Is that a good idea to also print the physical file path in error message?
> Like below:
>
> ERROR: invalid page in block 333571 of relation base/65959/656195, file
> path base/65959/656195.2

I think that'd be a nice improvement. But:

I don't think the way you did it is right architecturally. The
segmenting is really something that lives within md.c, and we shouldn't
further expose it outside of that. And e.g. the undo patchset uses files
with different segmentation - but still goes through bufmgr.c.

I wonder if this partially signals that the checksum verification piece
is architecturally done in the wrong place currently? It's imo not good
that every place doing an smgrread() needs to separately verify
checksums. OTOH, it doesn't really belong inside smgr.c.


This layering issue was also encountered in
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3eb77eba5a51780d5cf52cd66a9844cd4d26feb0
so perhaps we should work to reuse the FileTag it introduces to
represent segments, without hardcoding the specific segment size?


I checked the FileTag commit. It calls `register_xxx_segment` inside md.c to store the sync request into a hashtable and used by checkpointer later.

Checksum verify is simpler. We could move the `PageIsVerified` into md.c (mdread). But we can not elog error inside md.c because read buffer mode RBM_ZERO_ON_ERROR is at bugmgr.c level.

One idea is to change save the error message(or the FileTag) at (e.g. a static string in bufmgr.c) by calling `register_checksum_failure` inside mdread in md.c.

As for your concern about the need to do checksum verify after every smgrread, we now move the checksum verify logic into md.c, but we still need to check the checksum verify result after smgrread and reset buffer to zero if mode is RBM_ZERO_ON_ERROR.  

If this idea is OK, I will submit the new PR.


Based on Andres's comments, here is the new patch for moving checksum verify logic into mdread() instead of call PageIsVerified in every smgrread(). Also using FileTag to print the physical file name when checksum verify failed, which handle segmenting inside md.c as well.

--
Thanks

Hubert Zhang
Attachment

pgsql-hackers by date:

Previous
From: Corey Huinker
Date:
Subject: Re: Resolving the python 2 -> python 3 mess
Next
From: Michael Paquier
Date:
Subject: Re: tiny documentation fix