Hi,
We just came across a situation where a corrupted HFS+ filesystem
appears to return ERANGE on a customer machine. Our first reaction was
to turn zero_damaged_pages on to allow taking a pg_dump backup of the
database, but surprisingly this does not work. A quick glance at the
code shows the reason:

    if (nbytes != BLCKSZ)
    {
        if (nbytes < 0)
            ereport(ERROR,
                    (errcode_for_file_access(),
                     errmsg("could not read block %u in file \"%s\": %m",
                            blocknum, FilePathName(v->mdfd_vfd))));

        /*
         * Short read: we are at or past EOF, or we read a partial block at
         * EOF.  Normally this is an error; upper levels should never try to
         * read a nonexistent block.  However, if zero_damaged_pages is ON or
         * we are InRecovery, we should instead return zeroes without
         * complaining.  This allows, for example, the case of trying to
         * update a block that was later truncated away.
         */
        if (zero_damaged_pages || InRecovery)
            MemSet(buffer, 0, BLCKSZ);
        else
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg("could not read block %u in file \"%s\": read only %d of %d bytes",
                            blocknum, FilePathName(v->mdfd_vfd),
                            nbytes, BLCKSZ)));
    }

Note that zero_damaged_pages only enters the picture if it's a short
read, not if the read actually fails completely.
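
To make the branch order explicit, here is a minimal standalone sketch of the same decision. It is not PostgreSQL code: classify_read, ReadOutcome and SKETCH_BLCKSZ are hypothetical names made up for illustration, and 8192 is only assumed as the usual default block size. It returns an outcome code where the real code would call ereport().

```c
#include <string.h>
#include <sys/types.h>

/* Stand-in for BLCKSZ; 8192 is assumed here as the usual default. */
#define SKETCH_BLCKSZ 8192

/* Hypothetical outcome codes; the real code raises errors via ereport(). */
typedef enum
{
    READ_OK,        /* a full block was read */
    READ_ZEROED,    /* short read, buffer filled with zeroes */
    READ_SHORT_ERR, /* short read, reported as data corruption */
    READ_IO_ERR     /* read() failed; the zeroing branch is never reached */
} ReadOutcome;

/*
 * Hypothetical helper that follows the same branch order as the snippet
 * quoted above: the nbytes < 0 test comes first, so zero_damaged is only
 * consulted for short reads.
 */
static ReadOutcome
classify_read(ssize_t nbytes, int zero_damaged, int in_recovery, char *buffer)
{
    if (nbytes == SKETCH_BLCKSZ)
        return READ_OK;

    if (nbytes < 0)
        return READ_IO_ERR;     /* e.g. errno == ERANGE ends up here */

    if (zero_damaged || in_recovery)
    {
        memset(buffer, 0, SKETCH_BLCKSZ);
        return READ_ZEROED;
    }

    return READ_SHORT_ERR;
}
```

With zero_damaged set, classify_read(-1, 1, 0, buf) still yields READ_IO_ERR and leaves buf untouched, while classify_read(100, 1, 0, buf) zeroes the buffer. That is the asymmetry we ran into.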
Is this by design, or is this just an oversight?
See
http://lists.gnu.org/archive/html/rdiff-backup-users/2007-12/msg00053.html
I don't yet have any evidence that the filesystem is actually corrupt,
but the error message from the kernel is "Result out of range" (ERANGE),
which is not documented as a possible result of read() on Mac OS X.
--
Álvaro Herrera <alvherre@alvh.no-ip.org>