bufmgr and smgr don't talk to each other, apparently - Mailing list pgsql-hackers

From Tom Lane
Subject bufmgr and smgr don't talk to each other, apparently
Date
Msg-id 1853.964822211@sss.pgh.pa.us
Responses RE: bufmgr and smgr don't talk to each other, apparently  ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
List pgsql-hackers
I have just noticed something that's been broken for a good long while
(at least since 6.3): bufmgr.c expects that I/O errors will result in
an SM_FAIL return code from the smgr.c routines, but smgr.c does no
such thing: it does elog(ERROR) if it sees a failure.  All of the
"error handling" paths in bufmgr.c are dead code and have been since
at least 6.3.
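
To make the mismatch concrete, here is a minimal, self-contained sketch. It is not PostgreSQL source: `toy_elog_error`, `toy_smgr_write`, `toy_buffer_flush`, `run_flush` and the numeric values of `SM_SUCCESS`/`SM_FAIL` are all hypothetical stand-ins, and `setjmp`/`longjmp` is only assumed here as a model of how elog(ERROR) leaves the current call stack. The point is that a callee which unwinds on failure never hands a failure code back, so the caller's check for it cannot fire.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are not the real PostgreSQL definitions. */
#define SM_SUCCESS 0
#define SM_FAIL    1

jmp_buf error_context;  /* where the toy elog(ERROR) unwinds to */
bool    fallback_ran;   /* set only if the caller's SM_FAIL branch executes */

/* Toy elog(ERROR): report the problem and unwind; never returns. */
void toy_elog_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(error_context, 1);
}

/* Toy smgr write: on I/O failure it raises instead of returning SM_FAIL. */
int toy_smgr_write(bool io_ok)
{
    if (!io_ok)
        toy_elog_error("cannot write block");
    return SM_SUCCESS;
}

/* Toy bufmgr caller, written as though SM_FAIL could come back. */
int toy_buffer_flush(bool io_ok)
{
    int status = toy_smgr_write(io_ok);

    if (status == SM_FAIL)
    {
        /* Unreachable in practice: the callee never returns SM_FAIL. */
        fallback_ran = true;
        return SM_FAIL;
    }
    return SM_SUCCESS;
}

/* Returns the flush status, or -1 if the error unwound past the caller. */
int run_flush(bool io_ok)
{
    fallback_ran = false;
    if (setjmp(error_context) != 0)
        return -1;
    return toy_buffer_flush(io_ok);
}

/* Usage: the failing case unwinds, and the SM_FAIL branch never runs. */
void demo(void)
{
    assert(run_flush(true) == SM_SUCCESS);
    assert(run_flush(false) == -1 && !fallback_ran);
}
```

In this toy, the only ways to bring the `SM_FAIL` branch back to life are to make the callee return a code instead of unwinding, or to delete the branch, which mirrors the two options discussed next.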

It seems to me that we should either reduce smgr.c's elog()s to NOTICEs,
or rip out all of the dead code in bufmgr.c.  I'm inclined to the
latter, since the former seems likely to create new bugs.

I'm also thinking that AbortBufferIO is *way* overstepping its authority
by forcing a postmaster restart if it notices a double write failure.
The dirty buffer is a problem, no doubt, but this solution looks like
urban renewal via A-bomb.  I'd rather just keep failing anytime some
transaction tries to write the buffer --- better that than taking out
all active transactions whether they'd ever touched that buffer or not.
If the write failure really is permanent, the dbadmin would eventually
have to intervene via a manual restart, but a manual restart at the time
of the dbadmin's choosing seems better than forcing a failure under
load.
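
As a rough illustration of the two policies, here is a toy model. It is not the actual AbortBufferIO logic: `ToyBuffer`, `ToyOutcome` and both functions are hypothetical names invented for this sketch, and the "escalating" variant only assumes the behaviour described above (a second write failure on the same buffer forces a restart).

```c
#include <stdbool.h>

/* Hypothetical toy types; not PostgreSQL's buffer descriptor. */
typedef struct ToyBuffer
{
    bool dirty;     /* contents not yet safely on disk */
    bool io_error;  /* a previous write attempt failed */
} ToyBuffer;

typedef enum ToyOutcome
{
    TOY_WRITTEN,         /* write succeeded, buffer is now clean */
    TOY_FAIL_THIS_XACT,  /* only the calling transaction gets an error */
    TOY_RESTART_ALL      /* every active transaction is taken down */
} ToyOutcome;

/* Policy being criticised: a second failure on the same buffer escalates. */
ToyOutcome toy_write_escalating(ToyBuffer *buf, bool disk_ok)
{
    if (disk_ok)
    {
        buf->dirty = false;
        buf->io_error = false;
        return TOY_WRITTEN;
    }
    if (buf->io_error)
        return TOY_RESTART_ALL;     /* double write failure */
    buf->io_error = true;
    return TOY_FAIL_THIS_XACT;
}

/* Proposed policy: never escalate; the buffer stays dirty and each
 * write attempt fails only for the transaction that made it. */
ToyOutcome toy_write_keep_failing(ToyBuffer *buf, bool disk_ok)
{
    if (disk_ok)
    {
        buf->dirty = false;
        buf->io_error = false;
        return TOY_WRITTEN;
    }
    buf->io_error = true;
    return TOY_FAIL_THIS_XACT;
}
```

Under the second function, transactions that never touch the bad buffer are unaffected, and the buffer is written normally once the underlying problem goes away or the admin restarts at a convenient time.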

Comments?
        regards, tom lane

