Re: Protecting against unexpected zero-pages: proposal - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Protecting against unexpected zero-pages: proposal
Msg-id 25798.1289233932@sss.pgh.pa.us
In response to Re: Protecting against unexpected zero-pages: proposal  (Gurjeet Singh <singh.gurjeet@gmail.com>)
Responses Re: Protecting against unexpected zero-pages: proposal
List pgsql-hackers

Gurjeet Singh <singh.gurjeet@gmail.com> writes:
> On Sat, Nov 6, 2010 at 11:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Um ... and exactly how does that differ from the existing behavior?

> Right now a zero-filled page is considered valid, and is treated as a new page;
> PageHeaderIsValid()->/* Check all-zeroes case */, and PageIsNew(). This
> means that looking at a zero-filled page on disk (say, after a crash) does
> not give us any clue whether it was indeed left zeroed by Postgres, or whether
> the FS/storage failed to do its job.

I think this is really a non-problem.  You said earlier that the
underlying filesystem uses 4K blocks.  Filesystem misfeasance would
therefore presumably affect 4K at a time.  If you see that both halves
of an 8K block are zero, it's far more likely that Postgres left it that
way than that the filesystem messed up.  Of course, if only one half of
an 8K page went to zeroes, you know the filesystem or disk did it.
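
As a rough illustration of the half-page argument above, here is a minimal
Python sketch that classifies pages in a raw copy of a relation file. It
assumes the default 8K page size and the 4K filesystem block size mentioned in
the thread. The names classify_page and summarize are hypothetical helpers, not
anything in PostgreSQL. Note that a zero second half can also occur on a
legitimately initialized but empty page, so "partial_zero" is only a prompt for
a closer look, not proof of a storage fault.

```python
from collections import Counter

PAGE_SIZE = 8192      # assumed: PostgreSQL's default block size
FS_BLOCK_SIZE = 4096  # assumed: the 4K filesystem block size from the thread


def classify_page(page: bytes, fs_block: int = FS_BLOCK_SIZE) -> str:
    """Classify one page by which filesystem-block-sized chunks are all zeroes."""
    if len(page) == 0 or len(page) % fs_block != 0:
        raise ValueError("page length must be a positive multiple of fs_block")
    chunks = [page[i:i + fs_block] for i in range(0, len(page), fs_block)]
    # any() over a bytes object is False only when every byte is zero.
    zero_flags = [not any(chunk) for chunk in chunks]
    if all(zero_flags):
        return "all_zero"      # consistent with a page Postgres left as new
    if any(zero_flags):
        return "partial_zero"  # only part of the page is zeroed: look closer
    return "nonzero"


def summarize(data: bytes, page_size: int = PAGE_SIZE) -> Counter:
    """Count classifications over every full page in a relation-file image."""
    counts = Counter()
    for off in range(0, len(data) - page_size + 1, page_size):
        counts[classify_page(data[off:off + page_size])] += 1
    return counts
```

Running summarize over the bytes of several affected files gives per-category
counts, which is one way to look for a pattern across multiple instances rather
than trying to prove any single case.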

There are also crosschecks that you can apply: if it's a heap page, are
there any index pages with pointers to it?  If it's an index page, are
there downlinks or sibling links to it from elsewhere in the index?
A page that Postgres left as zeroes would not have any references to it.
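
The heap-page crosscheck can be sketched in the same spirit. This Python
fragment assumes you have already collected, by whatever means, the block
numbers of the zeroed heap pages and the heap TIDs stored in the table's
indexes; how those are extracted is outside the sketch. The function name
split_by_references and the (block, offset) tuple layout are assumptions made
for illustration only.

```python
from typing import Iterable, List, Set, Tuple

Tid = Tuple[int, int]  # assumed layout: (heap block number, line pointer offset)


def split_by_references(
    zero_blocks: Iterable[int], index_tids: Iterable[Tid]
) -> Tuple[List[int], List[int]]:
    """Split zeroed heap blocks into (referenced, unreferenced) by index TIDs."""
    referenced_blocks: Set[int] = {block for block, _offset in index_tids}
    zeroed = set(zero_blocks)
    # An index entry pointing into a zeroed block suggests the data was lost.
    suspicious = sorted(b for b in zeroed if b in referenced_blocks)
    # No references at all is what a page Postgres left as zeroes looks like.
    plausible = sorted(b for b in zeroed if b not in referenced_blocks)
    return suspicious, plausible
```

For example, with zeroed blocks 3, 7 and 9 and index entries pointing only into
blocks 1 and 7, block 7 lands in the first list and blocks 3 and 9 in the
second.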

IMO there are a lot of methods that can separate filesystem misfeasance
from Postgres errors, probably with greater reliability than this hack.
I would also suggest that you don't really need to prove conclusively
that any particular instance is one or the other --- a pattern across
multiple instances will tell you what you want to know.

> This change would increase the diagnosability of zero-page issues, and help
> the users point fingers at the right places.

[ shrug... ] If there were substantial user clamor for diagnosing
zero-page issues, I might be for this.  As is, I think it's a
non-problem.  What's more, if I did believe that this was a safe and
reliable technique, I'd be unhappy about the opportunity cost of
reserving it for zero-page testing rather than other purposes.
        regards, tom lane

