Thread: pgsql: Address set of issues with errno handling
Address set of issues with errno handling System calls mixed up in error code paths are causing two issues which several code paths have not correctly handled: 1) For write() calls, sometimes the system may return less bytes than what has been written without errno being set. Some paths were careful enough to consider that case, and assumed that errno should be set to ENOSPC, other calls missed that. 2) errno generated by a system call is overwritten by other system calls which may succeed once an error code path is taken, causing what is reported to the user to be incorrect. This patch uses the brute-force approach of correcting all those code paths. Some refactoring could happen in the future, but this is let as future work, which is not targeted for back-branches anyway. Author: Michael Paquier Reviewed-by: Ashutosh Sharma Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/6cb3372411fd6ed8ba0f8d36ae578a2daa517c16 Modified Files -------------- src/backend/access/heap/rewriteheap.c | 5 ++++ src/backend/access/transam/twophase.c | 23 +++++++++++++++++ src/backend/access/transam/xlog.c | 7 +++++ src/backend/access/transam/xlogutils.c | 6 +++-- src/backend/replication/basebackup.c | 3 +++ src/backend/replication/logical/origin.c | 15 +++++++++++ src/backend/replication/logical/reorderbuffer.c | 4 ++- src/backend/replication/logical/snapbuild.c | 20 +++++++++++++++ src/backend/replication/slot.c | 7 ++++- src/bin/pg_basebackup/walmethods.c | 34 +++++++++++++++++++++++-- 10 files changed, 118 insertions(+), 6 deletions(-)
Michael Paquier <michael@paquier.xyz> writes: > Address set of issues with errno handling I happened to look at this patch while working up the release notes, and I don't think it's right at all for this aspect: > 1) For write() calls, sometimes the system may return less bytes than > what has been written without errno being set. Some paths were careful > enough to consider that case, and assumed that errno should be set to > ENOSPC, other calls missed that. AFAICS, what you did about that was like this: @@ -816,7 +828,12 @@ tar_close(Walfile f, WalCloseMethod method) if (!tar_data->compression) { if (write(tar_data->fd, tf->header, 512) != 512) + { + /* if write didn't set errno, assume problem is no disk space */ + if (errno == 0) + errno = ENOSPC; return -1; + } } #ifdef HAVE_LIBZ else That's not good enough, because there is no reason to suppose that errno is initially zero; in reality it'll be whatever was left over from the last failed syscall, perhaps far distant from here. The places that do this correctly do it like so: errno = 0; if (write(...) != expected-length) { /* if write didn't set errno, assume problem is no disk space */ if (errno == 0) errno = ENOSPC; ... } You need to go back and add the pre-clearing of errno in each of these places, otherwise the added code is basically useless. regards, tom lane
On Fri, Aug 03, 2018 at 04:04:36PM -0400, Tom Lane wrote: > That's not good enough, because there is no reason to suppose that errno > is initially zero; in reality it'll be whatever was left over from the > last failed syscall, perhaps far distant from here. The places that do > this correctly do it like so: > > [... code ...] Yes, I have visibly assumed that the last syscall was already failing so those conditions would not be hit. If the code gets changed again that would be a problem. > You need to go back and add the pre-clearing of errno in each of these > places, otherwise the added code is basically useless. I looked at all code paths enforcing ENOSPC on write() calls, and attached is a patch to address this issue for all of them. What do you think? I can of course get that addressed before the next set of minor releases. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Fri, Aug 03, 2018 at 04:04:36PM -0400, Tom Lane wrote: >> You need to go back and add the pre-clearing of errno in each of these >> places, otherwise the added code is basically useless. > I looked at all code paths enforcing ENOSPC on write() calls, and > attached is a patch to address this issue for all of them. What do you > think? I can of course get that addressed before the next set of minor > releases. That looks good as far as it goes. I didn't cross-check that you hit everyplace that needs this, but if you grepped for references to ENOSPC then you presumably found them all. IMO, it's OK to push this this weekend if you have time. regards, tom lane
On Sat, Aug 04, 2018 at 11:21:28AM -0400, Tom Lane wrote: > That looks good as far as it goes. I didn't cross-check that you > hit everyplace that needs this, but if you grepped for references to > ENOSPC then you presumably found them all. That's exactly how the history goes. > IMO, it's OK to push this this weekend if you have time. OK, done now down to 9.3. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Sat, Aug 04, 2018 at 11:21:28AM -0400, Tom Lane wrote: >> IMO, it's OK to push this this weekend if you have time. > OK, done now down to 9.3. Thanks! regards, tom lane