Re: Exceptional md.c paths for recovery and zero_damaged_pages - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Exceptional md.c paths for recovery and zero_damaged_pages
Date
Msg-id 59281296-4cca-4a6f-9025-837e0e701555@iki.fi
Whole thread Raw
In response to Exceptional md.c paths for recovery and zero_damaged_pages  (Andres Freund <andres@anarazel.de>)
Responses Re: Exceptional md.c paths for recovery and zero_damaged_pages
Re: Exceptional md.c paths for recovery and zero_damaged_pages
List pgsql-hackers
On 14/12/2024 01:44, Andres Freund wrote:
> The zero_damaged_pages path in bufmgr.c makes sense to me, but this one seems
> less sane to me.  If you want to recover from a data corruption event and
> can't dump the data because a seqscan stumbles over an invalid page -
> zero_damaged_pages makes sense.
> 
> Seqscans or tidscans won't reach the mdreadv() path, because they check the
> relation size first.  Which leaves access from indexes - e.g. an index pointer
> beyond the end of the heap.  But in that case it's not sane to use
> zero_damaged_pages, because that's almost a guarantee for worsening corruption
> in the future, because the now empty heap page will eventually be filled with
> new tuples, which now will be pointed to by index entries pointing that were
> created before the zeroing.

Well, if you need to do zero_damage_pages=off, you're screwed already, 
so I don't know think the worsening corruption argument matters much. 
And you have the same problem by pages zeroed by a seqscan too. To avoid 
that, you'd want to mark the page explicitly as "damaged, do not reuse" 
rather than just zero it, but that'd be a lot of new code.

Hmm, looking at index_fetch_heap(), I'm surprised it doesn't throw an 
error or even a warning if the heap tuple isn't found. That would seem 
like a useful sanity check. An index tuple should never point to a 
non-existent heap TID I believe.

> I'm wondering if we should just put an error into the relevant paths in HEAD
> and see whether it triggers for anybody in the next months. Having all these
> untested paths in md.c forever doesn't seem great.

+1

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
Next
From: Tom Lane
Date:
Subject: Re: Adding NetBSD and OpenBSD to Postgres CI