On Sat, Nov 6, 2010 at 11:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Gurjeet Singh <singh.gurjeet@gmail.com> writes:
> > .) The basic idea is to have a magic number in every PageHeader before it is
> > written to disk, and check for this magic number when performing page
> > validity checks.
> Um ... and exactly how does that differ from the existing behavior?
Right now a zero-filled page is considered valid and is treated as a new page; see the /* Check all-zeroes case */ branch in PageHeaderIsValid(), and PageIsNew(). This means that looking at a zero-filled page on disk (say, after a crash) gives us no clue whether it was indeed left zeroed by Postgres or whether the FS/storage failed to do its job.
With the proposed change, a valid page (a page actually written by Postgres) will have either a sensible LSN or the magic LSN; the LSN will never be zero. OTOH, if we encounter a zero-filled page ( => LSN = {0,0} ), it would clearly implicate elements outside Postgres in zeroing that page.
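To make the three cases concrete, here is a minimal standalone sketch of the check. It is not the actual PageHeaderIsValid() code: the Sketch* names, the enum, and the 8192-byte block size are made up for illustration, and it only assumes that the LSN is the first field of the page header and is a pair of 32-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ        8192         /* assumed block size */
#define SKETCH_MAGIC_XRECOFF 0xFFFFFFFFu  /* the {n, 0xFFFFFFFF} magic offset */

/* Hypothetical stand-in for a two-part LSN at the start of a page. */
typedef struct
{
    uint32_t xlogid;
    uint32_t xrecoff;
} SketchLSN;

typedef enum
{
    SKETCH_PAGE_ZERO_LSN,       /* LSN = {0,0}: not written by Postgres under the proposal */
    SKETCH_PAGE_NEW_MAGIC_LSN,  /* new page carrying the magic LSN */
    SKETCH_PAGE_REAL_LSN        /* page carrying a sensible LSN */
} SketchPageKind;

/* Classify a page purely by the LSN stored in its first eight bytes. */
SketchPageKind
sketch_classify_page(const unsigned char *page)
{
    SketchLSN lsn;

    assert(page != NULL);
    memcpy(&lsn, page, sizeof lsn);  /* copy out to avoid alignment issues */

    if (lsn.xlogid == 0 && lsn.xrecoff == 0)
        return SKETCH_PAGE_ZERO_LSN;
    if (lsn.xrecoff == SKETCH_MAGIC_XRECOFF)
        return SKETCH_PAGE_NEW_MAGIC_LSN;
    return SKETCH_PAGE_REAL_LSN;
}
```

A caller would treat SKETCH_PAGE_ZERO_LSN as "suspect the FS/storage" instead of silently accepting the page as new; how that gets reported is left open here.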
> The amount of fragility introduced by the assumptions you have to make
> for this seems to me to be vastly riskier than the risk you are trying
> to respond to.
I understand that it is a pretty low-level change, but IMHO the change is minimal and is being applied in well-understood places. All the assumptions listed have held for quite a while, and I don't see them being affected in the near future. The most crucial assumptions we have to work with are that XLogPtr{n, 0xFFFFFFFF} will never be used, and that mdextend() is the only place that extends a relation (until we implement an md.c sibling, say flash.c or tape.c; the last change to md.c regarding mdextend() was in January 2007).
Only mdextend() and PageHeaderIsValid() need to know this change in behaviour, and all the other APIs work and behave the same as they do now.
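For the write side, a rough sketch of what the extend path would do to the buffer before it goes to disk. Again this is standalone illustration, not md.c code: the demo_* functions and the block size are hypothetical, and which xlogid value would be used for "n" is an assumption left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLCKSZ        8192         /* assumed block size */
#define DEMO_MAGIC_XRECOFF 0xFFFFFFFFu  /* the {n, 0xFFFFFFFF} magic offset */

/*
 * Hypothetical helper: build the page image an extend routine would write.
 * The page is zeroed as it is today, then the magic LSN is stamped into the
 * first eight bytes so the on-disk page is never entirely zero.
 */
void
demo_prepare_extend_page(unsigned char *page, uint32_t xlogid)
{
    uint32_t lsn[2];

    assert(page != NULL);
    memset(page, 0, DEMO_BLCKSZ);

    lsn[0] = xlogid;              /* the "n" part; choice is an assumption */
    lsn[1] = DEMO_MAGIC_XRECOFF;  /* offset assumed never to occur in a real LSN */
    memcpy(page, lsn, sizeof lsn);
}

/* Returns 1 if every byte after the LSN field is zero, else 0. */
int
demo_rest_is_zero(const unsigned char *page)
{
    size_t i;

    assert(page != NULL);
    for (i = 2 * sizeof(uint32_t); i < DEMO_BLCKSZ; i++)
    {
        if (page[i] != 0)
            return 0;
    }
    return 1;
}
```

Everything past the LSN stays zeroed, so code that looks at the rest of a new page would see what it sees today; only the validity check needs to recognise the stamp.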
This change would increase the diagnosability of zero-page issues, and help users point fingers at the right places.