On Thu, Mar 27, 2025 at 12:06:45PM -0400, Robert Haas wrote:
> On Thu, Mar 27, 2025 at 11:58 AM Andres Freund <andres@anarazel.de> wrote:
> > So, today we have the weird situation that *some* checksum errors on shared
> > relations get attributed to the current database (if they happen in a backend
> > normally accessing a shared relation), whereas others get reported to the
> > "shared relations" "database" (if they happen during a base backup). That
> > seems ... not optimal.
> >
> > One question is whether we consider this a bug that should be backpatched.
>
> I think it would be defensible if pg_basebackup reported all errors
> with OID 0 and backend connections reported all errors with OID
> MyDatabaseId, but it seems hard to justify having pg_basebackup take
> care to report things using the correct database OID and individual
> backend connections not take care to do the same thing. So I think
> this is a bug. If fixing it in the back-branches is too annoying, I
> think it would be reasonable to fix it only in master, but
> back-patching seems OK too.

Getting better reporting for shared relations in the back branches
would be nice, but that's going to require some invasive surgery,
isn't it?

With only PageIsVerifiedExtended(), we currently don't know the OID of
the relation whose block is corrupted. There are two callers using
PIV_REPORT_STAT on HEAD:
- The checksum reports from RelationCopyStorage() know the
SMgrRelation.
- ReadBuffersOperation() has an optional Relation and an
SMgrRelationData.

We could just refactor PageIsVerifiedExtended() so that it reports
why the verification failed and lets the callers report the checksum
failure with a relation OID, splitting the data for shared and
non-shared relations?
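
As a rough illustration of that idea, a self-contained toy model (not
PostgreSQL code) could look like the following. Every name in it
(ToyPage, ToyRelLocator, PageVerifyResult, toy_page_verify() and so on)
is hypothetical, and the checksum is a stand-in rather than the real
page checksum. It only shows the shape: the verification routine
returns a reason and does no reporting, and the caller, which knows the
relation, attributes the failure to database OID 0 for shared relations
or to the relation's own database otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)
#define TOY_PAGE_MAGIC 0x5047
#define TOY_MAX_DBS 8

/* Hypothetical: why verification failed, instead of a bare bool. */
typedef enum PageVerifyResult
{
    PAGE_VERIFY_OK,
    PAGE_VERIFY_BAD_HEADER,
    PAGE_VERIFY_BAD_CHECKSUM
} PageVerifyResult;

/* Toy page: a magic number, a stored checksum and a small payload. */
typedef struct ToyPage
{
    uint16_t magic;
    uint16_t checksum;
    uint8_t  data[64];
} ToyPage;

/* Toy locator; db_oid is assumed to be InvalidOid for shared relations. */
typedef struct ToyRelLocator
{
    Oid db_oid;
    Oid rel_number;
} ToyRelLocator;

/* Toy per-database failure counters, standing in for the stats system. */
typedef struct ToyDbStats
{
    bool used;
    Oid  db_oid;
    int  failures;
} ToyDbStats;

static ToyDbStats toy_stats[TOY_MAX_DBS];

/* Stand-in checksum over the magic number and payload. */
uint16_t
toy_checksum(const ToyPage *page)
{
    uint32_t sum = page->magic;

    for (size_t i = 0; i < sizeof(page->data); i++)
        sum = (sum * 31 + page->data[i]) & 0xFFFF;
    return (uint16_t) sum;
}

/* Verification only: returns the reason, reports nothing. */
PageVerifyResult
toy_page_verify(const ToyPage *page)
{
    assert(page != NULL);
    if (page->magic != TOY_PAGE_MAGIC)
        return PAGE_VERIFY_BAD_HEADER;
    if (page->checksum != toy_checksum(page))
        return PAGE_VERIFY_BAD_CHECKSUM;
    return PAGE_VERIFY_OK;
}

/* Count one failure for db_oid; silently dropped if the table is full. */
void
toy_report_checksum_failure(Oid db_oid)
{
    for (int i = 0; i < TOY_MAX_DBS; i++)
    {
        if (!toy_stats[i].used)
        {
            /* Slots fill in order, so the first free one means "not found". */
            toy_stats[i].used = true;
            toy_stats[i].db_oid = db_oid;
            toy_stats[i].failures = 0;
        }
        if (toy_stats[i].db_oid == db_oid)
        {
            toy_stats[i].failures++;
            return;
        }
    }
}

/* Number of failures recorded so far for db_oid. */
int
toy_checksum_failures(Oid db_oid)
{
    for (int i = 0; i < TOY_MAX_DBS; i++)
        if (toy_stats[i].used && toy_stats[i].db_oid == db_oid)
            return toy_stats[i].failures;
    return 0;
}

/* Caller side: it knows the relation, so it decides where to report. */
bool
toy_read_block(const ToyRelLocator *rloc, const ToyPage *page)
{
    PageVerifyResult res;

    assert(rloc != NULL);
    res = toy_page_verify(page);
    if (res == PAGE_VERIFY_BAD_CHECKSUM)
        toy_report_checksum_failure(rloc->db_oid);
    return res == PAGE_VERIFY_OK;
}
```

With this split, a corrupted block read through a locator whose db_oid
is InvalidOid is counted under OID 0, while the same corruption read
through a regular relation is counted under that relation's database,
regardless of which kind of caller performed the read.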
--
Michael