Thread: pgsql: Address set of issues with errno handling

pgsql: Address set of issues with errno handling

From
Michael Paquier
Date:
Address set of issues with errno handling

System calls mixed up in error code paths are causing two issues which
several code paths have not correctly handled:
1) For write() calls, sometimes the system may return less bytes than
what has been written without errno being set.  Some paths were careful
enough to consider that case, and assumed that errno should be set to
ENOSPC, other calls missed that.
2) errno generated by a system call is overwritten by other system calls
which may succeed once an error code path is taken, causing what is
reported to the user to be incorrect.

This patch uses the brute-force approach of correcting all those code
paths.  Some refactoring could happen in the future, but this is let as
future work, which is not targeted for back-branches anyway.

Author: Michael Paquier
Reviewed-by: Ashutosh Sharma
Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6cb3372411fd6ed8ba0f8d36ae578a2daa517c16

Modified Files
--------------
src/backend/access/heap/rewriteheap.c           |  5 ++++
src/backend/access/transam/twophase.c           | 23 +++++++++++++++++
src/backend/access/transam/xlog.c               |  7 +++++
src/backend/access/transam/xlogutils.c          |  6 +++--
src/backend/replication/basebackup.c            |  3 +++
src/backend/replication/logical/origin.c        | 15 +++++++++++
src/backend/replication/logical/reorderbuffer.c |  4 ++-
src/backend/replication/logical/snapbuild.c     | 20 +++++++++++++++
src/backend/replication/slot.c                  |  7 ++++-
src/bin/pg_basebackup/walmethods.c              | 34 +++++++++++++++++++++++--
10 files changed, 118 insertions(+), 6 deletions(-)


Re: pgsql: Address set of issues with errno handling

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> Address set of issues with errno handling

I happened to look at this patch while working up the release notes,
and I don't think it's right at all for this aspect:

> 1) For write() calls, sometimes the system may return less bytes than
> what has been written without errno being set.  Some paths were careful
> enough to consider that case, and assumed that errno should be set to
> ENOSPC, other calls missed that.

AFAICS, what you did about that was like this:

@@ -816,7 +828,12 @@ tar_close(Walfile f, WalCloseMethod method)
    if (!tar_data->compression)
    {
        if (write(tar_data->fd, tf->header, 512) != 512)
+       {
+           /* if write didn't set errno, assume problem is no disk space */
+           if (errno == 0)
+               errno = ENOSPC;
            return -1;
+       }
    }
 #ifdef HAVE_LIBZ
    else

That's not good enough, because there is no reason to suppose that errno
is initially zero; in reality it'll be whatever was left over from the
last failed syscall, perhaps far distant from here.  The places that do
this correctly do it like so:

        errno = 0;
        if (write(...) != expected-length)
        {
            /* if write didn't set errno, assume problem is no disk space */
            if (errno == 0)
                errno = ENOSPC;
            ...
        }

You need to go back and add the pre-clearing of errno in each of these
places, otherwise the added code is basically useless.

            regards, tom lane


Re: pgsql: Address set of issues with errno handling

From
Michael Paquier
Date:
On Fri, Aug 03, 2018 at 04:04:36PM -0400, Tom Lane wrote:
> That's not good enough, because there is no reason to suppose that errno
> is initially zero; in reality it'll be whatever was left over from the
> last failed syscall, perhaps far distant from here.  The places that do
> this correctly do it like so:
>
> [... code ...]

Yes, I have visibly assumed that the last syscall was already failing
so those conditions would not be hit.  If the code gets changed again
that would be a problem.

> You need to go back and add the pre-clearing of errno in each of these
> places, otherwise the added code is basically useless.

I looked at all code paths enforcing ENOSPC on write() calls, and
attached is a patch to address this issue for all of them.  What do you
think?  I can of course get that addressed before the next set of minor
releases.
--
Michael

Attachment

Re: pgsql: Address set of issues with errno handling

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Fri, Aug 03, 2018 at 04:04:36PM -0400, Tom Lane wrote:
>> You need to go back and add the pre-clearing of errno in each of these
>> places, otherwise the added code is basically useless.

> I looked at all code paths enforcing ENOSPC on write() calls, and
> attached is a patch to address this issue for all of them.  What do you
> think?  I can of course get that addressed before the next set of minor
> releases.

That looks good as far as it goes.  I didn't cross-check that you
hit everyplace that needs this, but if you grepped for references to
ENOSPC then you presumably found them all.

IMO, it's OK to push this this weekend if you have time.

            regards, tom lane


Re: pgsql: Address set of issues with errno handling

From
Michael Paquier
Date:
On Sat, Aug 04, 2018 at 11:21:28AM -0400, Tom Lane wrote:
> That looks good as far as it goes.  I didn't cross-check that you
> hit everyplace that needs this, but if you grepped for references to
> ENOSPC then you presumably found them all.

That's exactly how the history goes.

> IMO, it's OK to push this this weekend if you have time.

OK, done now down to 9.3.
--
Michael

Attachment

Re: pgsql: Address set of issues with errno handling

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Sat, Aug 04, 2018 at 11:21:28AM -0400, Tom Lane wrote:
>> IMO, it's OK to push this this weekend if you have time.

> OK, done now down to 9.3.

Thanks!

            regards, tom lane