Thread: bufmgr and smgr don't talk to each other, apparently
I have just noticed something that's been broken for a good long while
(at least since 6.3): bufmgr.c expects that I/O errors will result in
an SM_FAIL return code from the smgr.c routines, but smgr.c does no
such thing: it does elog(ERROR) if it sees a failure.  All of the
"error handling" paths in bufmgr.c are dead code and have been since
at least 6.3.

It seems to me that we should either reduce smgr.c's elog()s to NOTICEs,
or rip out all of the dead code in bufmgr.c.  I'm inclined to the
latter, since the former seems likely to create new bugs.

I'm also thinking that AbortBufferIO is *way* overstepping its authority
by forcing a postmaster restart if it notices a double write failure.
The dirty buffer is a problem, no doubt, but this solution looks like
urban renewal via A-bomb.  I'd rather just keep failing anytime some
transaction tries to write the buffer --- better that than taking out
all active transactions whether they'd ever touched that buffer or not.
If the write failure really is permanent, the dbadmin would eventually
have to intervene via a manual restart, but a manual restart at the time
of the dbadmin's choosing seems better than forcing a failure under load.

Comments?

			regards, tom lane
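To make the mismatch described above concrete, here is a minimal standalone sketch, not the actual PostgreSQL source. All toy_* names are hypothetical, the SM_SUCCESS/SM_FAIL values are placeholders chosen for the sketch, and elog(ERROR) is modeled with longjmp() on the assumption that it transfers control to a top-level handler instead of returning to its caller.

```c
#include <setjmp.h>
#include <stdio.h>

/* Placeholder status codes for this sketch only. */
#define SM_SUCCESS 1
#define SM_FAIL    0

jmp_buf toy_error_context;      /* where the toy "elog(ERROR)" jumps to */
int toy_fail_branch_hits = 0;   /* counts visits to the SM_FAIL branch */

/* Hypothetical stand-in for elog(ERROR): never returns to its caller. */
void toy_elog_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(toy_error_context, 1);
}

/* Toy storage-manager write: reports failure by raising an error,
 * so SM_FAIL is never actually returned. */
int toy_smgr_write(int io_fails)
{
    if (io_fails)
        toy_elog_error("cannot write block");
    return SM_SUCCESS;
}

/* Toy buffer-manager caller, written as if SM_FAIL could come back. */
int toy_bufmgr_flush(int io_fails)
{
    int status = toy_smgr_write(io_fails);

    if (status == SM_FAIL)
    {
        toy_fail_branch_hits++;     /* dead code: control never gets here */
        return -1;
    }
    return 0;
}

/* Top-level driver: 0 = write succeeded, 1 = error escaped via longjmp,
 * 2 = caller saw SM_FAIL (which this sketch can never produce). */
int toy_run_flush(int io_fails)
{
    if (setjmp(toy_error_context) != 0)
        return 1;
    return toy_bufmgr_flush(io_fails) == 0 ? 0 : 2;
}
```

Calling toy_run_flush(1) takes the longjmp path and leaves toy_fail_branch_hits at zero, which is the sense in which the recovery branch in the caller is dead code.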
> -----Original Message-----
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org] On
> Behalf Of Tom Lane
>
> I have just noticed something that's been broken for a good long while
> (at least since 6.3): bufmgr.c expects that I/O errors will result in
> an SM_FAIL return code from the smgr.c routines, but smgr.c does no
> such thing: it does elog(ERROR) if it sees a failure.  All of the

except smgropen().
It's not easy to return from mdxxx() in case of errors.
Fortunately I managed to return from mdopen() in 'file non-existent'
cases.

> "error handling" paths in bufmgr.c are dead code and have been since
> at least 6.3.
>
> It seems to me that we should either reduce smgr.c's elog()s to NOTICEs,
> or rip out all of the dead code in bufmgr.c.  I'm inclined to the
> latter, since the former seems likely to create new bugs.
>

I also prefer the latter.
Even though smgr returns SM_FAIL, md stuff already calls
elog(ERROR) in many places.

Regards.

Hiroshi Inoue
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >> (at least since 6.3): bufmgr.c expects that I/O errors will result in >> an SM_FAIL return code from the smgr.c routines, but smgr.c does no >> such thing: it does elog(ERROR) if it sees a failure. All of the > except smgropen(). Right. I'm mainly looking at the block read/write/flush calls, which have a lot of now-useless error recovery code after them. > I also prefer the latter. Even though smgr returns SM_FAIL,md stuff > already calls elog(ERROR) in many places. Good point, and the fd.c level may have some elogs too... regards, tom lane