Re: Should mdxxx functions(e.g. mdread, mdwrite, mdsync etc) PANIC instead of ERROR when I/O failed? - Mailing list pgsql-hackers

From Jacky Leng
Subject Re: Should mdxxx functions(e.g. mdread, mdwrite, mdsync etc) PANIC instead of ERROR when I/O failed?
Date
Msg-id h16v1p$2q8o$1@news.hub.org
In response to Should mdxxx functions(e.g. mdread, mdwrite, mdsync etc) PANIC instead of ERROR when I/O failed?  ("Jacky Leng" <lengjianquan@163.com>)
List pgsql-hackers
>> I think the reasoning is that if those functions reported a PANIC the
>> chance you could recover your data is zero, because you need the
>> database system to read the other (good) data.

I do not see why a PANIC reduces the chance of recovering my data. AFAICS,
my data has already been corrupted (because of the bad block here); whether
we PANIC or not, the read operation on the bad block should get the same result.


> Also, in the case you're complaining about, the problem was that there
> wasn't any O/S error report that we could have PANIC'd about anyhow.

No, the O/S did report the error, which led to the 453 ERROR messages from
postgres. The O/S error messages (I got these using dmesg) look like this:

  end_request: I/O error, dev sda, sector 504342711
  ata1: EH complete
  SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
  sda: Write Protect is off
  sda: Mode Sense: 00 3a 00 00
  SCSI device sda: drive cache: write back
  ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
  ata1.00: (irq_stat 0x40000008)
  ata1.00: cmd 60/08:00:b0:a8:0f/00:00:1e:00:00/40 tag 0 cdb 0x0 data 4096 in
           res 41/40:08:b7:a8:0f/06:00:1e:00:00/00 Emask 0x9 (media error)
  ata1.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
  ata1.00: ata_hpa_resize 1: sectors = 976773168, hpa_sectors = 976773168
 


> We already do refuse
> to read a page into shared buffers if there's a read error on it,
> so it's not clear to me how you think that an ERROR leaves things
> in an unstable state.
>

In my case, it seems that the O/S does not ensure that if an I/O operation
(read, write, sync, etc) on a block fails, then all later I/O operations
on this block will also fail. For example:

1. As I noted before, although reads of the bad db-block in my data failed
   453 times, the 454th read operation succeeded (but some data (the bad
   sector) had been set to all-zero). So, even though each of the 453 failed
   I/Os reported an ERROR, there is still a chance that the bad db-block
   can be read into shared buffers.

2. Besides, I have seen a case like this:
   1) An mdsync operation failed with the message
      "ERROR: could not fsync segment XXX of relation XXX: ??"
      The O/S error message (I got this using the dmesg command) is like this:
        Buffer I/O error on device ^AXX205503, logical block 43837786
        lost page write due to I/O error on ^AXX205503
   2) This leaves a half-written db-block in my data. But the page can still
      be read into shared buffers successfully later, which leads to a curious
      situation that reports "ERROR:  could not access status of transaction XXXXX".
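
To make the missing guarantee concrete, here is a minimal sketch in C of a "sticky" read failure: once a read of a block has failed, every later read of that block is refused, even if the O/S would now return data. This is only an illustration of the property described above, not what md.c or the kernel does. The names FailedBlocks, block_has_failed, remember_failure and sticky_read_block are hypothetical, and the fixed-size table is an arbitrary simplification.

```c
#define _XOPEN_SOURCE 700       /* for pread() */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_FAILED_BLOCKS 64    /* arbitrary limit for this sketch */

/* Hypothetical bookkeeping: block numbers that have ever failed a read. */
typedef struct FailedBlocks
{
    long    blocks[MAX_FAILED_BLOCKS];
    size_t  count;
} FailedBlocks;

static bool
block_has_failed(const FailedBlocks *fb, long blkno)
{
    for (size_t i = 0; i < fb->count; i++)
        if (fb->blocks[i] == blkno)
            return true;
    return false;
}

static void
remember_failure(FailedBlocks *fb, long blkno)
{
    /* If the table is full the failure is silently not recorded. */
    if (!block_has_failed(fb, blkno) && fb->count < MAX_FAILED_BLOCKS)
        fb->blocks[fb->count++] = blkno;
}

/*
 * Read one block of blcksz bytes. Returns 0 on success, -1 on failure
 * with errno set. A block that has failed once keeps failing with EIO
 * on every later call, without the file being touched again.
 */
static int
sticky_read_block(FailedBlocks *fb, int fd, long blkno,
                  void *buf, size_t blcksz)
{
    ssize_t n;

    assert(fb != NULL && buf != NULL);

    if (block_has_failed(fb, blkno))
    {
        errno = EIO;
        return -1;
    }

    n = pread(fd, buf, blcksz, (off_t) blkno * (off_t) blcksz);
    if (n != (ssize_t) blcksz)
    {
        if (n >= 0)
            errno = EIO;        /* short read: treat it as an I/O error */
        remember_failure(fb, blkno);
        return -1;
    }
    return 0;
}
```

With bookkeeping like this, the 454th read in example 1 would be rejected instead of quietly returning a page whose bad sector has been zeroed. Whether the server should go further and PANIC is the separate question this thread is about.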
 





