Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes) - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)
Date
Msg-id CA+hUKGLpkcNRkrwMCXuJu1jc5-38bvN-yEq5B0FpSV91U9ZwMQ@mail.gmail.com
Whole thread Raw
In response to Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)
List pgsql-hackers
On Fri, Nov 22, 2024 at 10:55 PM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> but with attached _lseeki64 dirty patch I've got success and resulting tarball greater than 2^31 too:
>
> C:\compiledpg\bin>pg_basebackup.exe -U postgres -D "c:\backup7" -F t -P -X stream -c fast --compress=none
--create-slot--slot=slot11 
> Password:
> 18134035/18134035 kB (100%), 1/1 tablespace
> C:\compiledpg\bin>
> C:\compiledpg\bin>dir c:\backup7\pg_wal.tar
> [..]
> 11/22/2024  10:37 AM     4,026,778,112 pg_wal.tar
>
> PoC patch is attached and I'll register CFentry to see how the tests will go (all of this MSVC/MinGW stuff is
frighteningto me, of course this will fail on non-Win32...). 

Great news.

This will indeed require solving a few archeology puzzles to
back-patch.  For lseek(), I think it should be OK to just point it to
_lseeki64() in win32_port.h as I showed earlier, because _lseeki64()
was present in ancient msvcrt.dll.  It's technically possible that an
extension might be upset by the change but it'd have to be doing
something pretty strange, like using lseek's type, or other such
assumptions, not terribly likely.  A more cautious change would do
that just inside that translation unit to avoid upsetting anything
else (or fixing anything else...), but that's probably too cautious,
let's just do it in win32_port.h.

For ftruncate(), though you didn't see a problem with that, Jacob is
quite right that it must be able to corrupt stuff, and I retract that
replacement code I showed earlier, it's not really OK to move the file
position around.  I would like to use _s_chsize() instead, but it
arrived in msvcr80.dll so I don't think old MinGW can use it.  Here's
a new idea (sketch code not tested):

static inline int
ftruncate(int fd, pgoff_t length)
{
#if defined(_UCRT) || defined(_MSC_VER)
    /* MinGW + UCRT and all supported MSVC versions have this. */
    errno = _s_chsize(fd, length);
    return errno == 0 ? 0 : -1;
#else
    /* MinGW + ancient msvcrt.dll has only _chsize, limited by off_t (long). */
    if (length > LONG_MAX)
    {
        /* A clear error is better than silent corruption. */
        errno = EFBIG;
        return - 1;
    }
    return _chsize(fd, length);
#endif
}

I'm not sure if we'd ever know if we broke MinGW + msvcrt.dll in the
back branches, ie if any computer actually existing on earth would
ever compile and run the second branch, and if so, be used for serious
database work and then be able to reach the failure case.  I think an
honest but disappointing errno is probably OK in that hypothetical
case.  It's hard to know how far to go when programming ghost
computers.  Thoughts, anyone?

(Obviously an exorcism is overdue in master, working on it...)



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Allow non-superuser to cancel superuser tasks.
Next
From: John Naylor
Date:
Subject: Re: SIMD optimization for list_sort