Thread: pgsql: Fix our Windows stat() emulation to handle file sizes > 4GB.

pgsql: Fix our Windows stat() emulation to handle file sizes > 4GB.

From
Andrew Dunstan
Date:
Fix our Windows stat() emulation to handle file sizes > 4GB.

Hack things so that our idea of "struct stat" is equivalent to Windows'
struct __stat64, allowing it to have a wide enough st_size field.

Instead of relying on native stat(), use GetFileInformationByHandle().
This avoids a number of issues with Microsoft's multiple and rather
slipshod emulations of stat().  We still need to jump through hoops
to deal with ERROR_DELETE_PENDING, though :-(

Pull the relevant support code out of dirmod.c and put it into
its own file, win32stat.c.

Still TODO: do we need to do something different with lstat(),
rather than treating it identically to stat()?

Juan José Santamaría Flecha, reviewed by Emil Iggland;
based on prior work by Michael Paquier, Sergey Zubkovsky, and others

Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905CF5099@g01jpexmbkw24
Discussion: https://postgr.es/m/15858-9572469fd3b73263@postgresql.org
(cherry picked from commit bed90759fcbcd72d4d06969eebab81e47326f9a2)

Author: Alexandra Wang <alexandra.wang.oss@gmail.com>

Branch
------
REL_13_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/a9beed67670e680edeadd2a3cf7557a3c9808adf
Author: Tom Lane <tgl@sss.pgh.pa.us>

Modified Files
--------------
configure                     |   6 +
configure.in                  |   1 +
src/include/port/win32_port.h |  44 +++++--
src/port/dirmod.c             |  52 --------
src/port/win32stat.c          | 299 ++++++++++++++++++++++++++++++++++++++++++
src/tools/msvc/Mkvcbuild.pm   |   2 +-
6 files changed, 339 insertions(+), 65 deletions(-)


On Thu, Nov 7, 2024 at 6:19 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> Juan José Santamaría Flecha, reviewed by Emil Iggland;
> based on prior work by Michael Paquier, Sergey Zubkovsky, and others
>
> Author: Alexandra Wang <alexandra.wang.oss@gmail.com>

This authorship information is confusing.

--
Robert Haas
EDB: http://www.enterprisedb.com



Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Nov 7, 2024 at 6:19 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>> Juan José Santamaría Flecha, reviewed by Emil Iggland;
>> based on prior work by Michael Paquier, Sergey Zubkovsky, and others
>>
>> Author: Alexandra Wang <alexandra.wang.oss@gmail.com>

> This authorship information is confusing.

Yes.  Worse, there is no link to what prompted this sudden burst
of back-patches of years-old commits.  After digging around I guess
it was this thread:

https://www.postgresql.org/message-id/flat/CA%2BpA0kLc5VxOaO4WfLjmh7W0V%2BquVvVtT5CaRVRAZMuh0zft4Q%40mail.gmail.com

I'm nervous about pushing these in mere hours before a release freeze,
primarily because there's no way to be sure whether any necessary
follow-up fixes got missed.  If we were unwilling to back-patch at
the time, is reversing that decision such a good idea?

I'm also confused about how to document them in the release notes.
Alexandra should get some credit I guess for collecting and testing
the patches, but she's not the original author(s).

            regards, tom lane



I wrote:
> I'm also confused about how to document them in the release notes.
> Alexandra should get some credit I guess for collecting and testing
> the patches, but she's not the original author(s).

After thinking for awhile, I'm going to fold all of these into one
release-note entry:

     <para>
      Fix misbehavior with junction points on Windows, particularly
      in <application>pg_rewind</application> (Alexandra Wang)
     </para>

     <para>
      This entailed back-patching previous fixes by Thomas Munro, Peter
      Eisentraut, Alexander Lakhin, and Juan José Santamaría Flecha.
      Those changes were originally not back-patched out of caution, but
      they have been in use in later branches for long enough to deem
      them safe.
     </para>

If anyone's really unhappy with that plan, let me know.

            regards, tom lane



Re: pgsql: Fix our Windows stat() emulation to handle file sizes > 4GB.

From
Andrew Dunstan
Date:


On Sat, Nov 9, 2024 at 2:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Nov 7, 2024 at 6:19 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>> Juan José Santamaría Flecha, reviewed by Emil Iggland;
>> based on prior work by Michael Paquier, Sergey Zubkovsky, and others
>>
>> Author: Alexandra Wang <alexandra.wang.oss@gmail.com>

> This authorship information is confusing.

Yes.  Worse, there is no link to what prompted this sudden burst
of back-patches of years-old commits.  After digging around I guess
it was this thread:

https://www.postgresql.org/message-id/flat/CA%2BpA0kLc5VxOaO4WfLjmh7W0V%2BquVvVtT5CaRVRAZMuh0zft4Q%40mail.gmail.com


Right. I meant to add that link to the commit messages. Mea culpa.

Alexandra isn't the original author of these patches, but she did the work to determine exactly what needed to be backpatched, and tested it. I figured that was worth an Author credit. There didn't seem to be some other appropriate tag.

 


I'm nervous about pushing these in mere hours before a release freeze,
primarily because there's no way to be sure whether any necessary
follow-up fixes got missed.  If we were unwilling to back-patch at
the time, is reversing that decision such a good idea?


My impression was that this was something we just didn't get around to. I wouldn't have pushed these so close to release if this hadn't been code already tested for a long time in release 16+. Maybe we missed something, but I doubt it.
 

cheers

andrew

Re: pgsql: Fix our Windows stat() emulation to handle file sizes > 4GB.

From
Andrew Dunstan
Date:


On Sat, Nov 9, 2024 at 6:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> I'm also confused about how to document them in the release notes.
> Alexandra should get some credit I guess for collecting and testing
> the patches, but she's not the original author(s).

After thinking for awhile, I'm going to fold all of these into one
release-note entry:

     <para>
      Fix misbehavior with junction points on Windows, particularly
      in <application>pg_rewind</application> (Alexandra Wang)
     </para>

     <para>
      This entailed back-patching previous fixes by Thomas Munro, Peter
      Eisentraut, Alexander Lakhin, and Juan José Santamaría Flecha.
      Those changes were originally not back-patched out of caution, but
      they have been in use in later branches for long enough to deem
      them safe.
     </para>

If anyone's really unhappy with that plan, let me know.



WFM, Thanks.

cheers

andrew

Re: pgsql: Fix our Windows stat() emulation to handle file sizes > 4GB.

From
Thomas Munro
Date:
On Sat, Nov 9, 2024 at 9:30 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> My impression was that this was something we just didn't get around to. I wouldn't have pushed these so close to
releaseif this hadn't been code already tested for a long time in release 16+. Maybe we missed something, but I doubt
it.

c5cb8f3b did have a couple more follow-ups: 4517358e and f71007fb.
These fixed initdb when run under a junction point:

https://www.postgresql.org/message-id/4590c37927d7b8ee84f9855d83229018%40postgrespro.ru



Re: pgsql: Fix our Windows stat() emulation to handle file sizes > 4GB.

From
Andrew Dunstan
Date:


On Sat, Nov 9, 2024 at 11:20 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Sat, Nov 9, 2024 at 9:30 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> My impression was that this was something we just didn't get around to. I wouldn't have pushed these so close to release if this hadn't been code already tested for a long time in release 16+. Maybe we missed something, but I doubt it.

c5cb8f3b did have a couple more follow-ups: 4517358e and f71007fb.
These fixed initdb when run under a junction point:

https://www.postgresql.org/message-id/4590c37927d7b8ee84f9855d83229018%40postgrespro.ru


Oh, ouch. I guess it's too late to add those for this release. Still, an initdb failure is not as bad as a pg_rewind failure, IMNSHO.

cheers

andrew

Re: pgsql: Fix our Windows stat() emulation to handle file sizes > 4GB.

From
Thomas Munro
Date:
On Sun, Nov 10, 2024 at 8:12 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> Oh, ouch. I guess it's too late to add those for this release. Still, an initdb failure is not as bad as a pg_rewind
failure,IMNSHO. 

Yeah, I don't think it's all that bad.  It seems both unlikely
(certain kinds of junction points with no intermediate directories, or
chains of junction points), and easy to work around once you get the
idea by experimentation (probably: add subdir, or tell initdb the real
target path directly, or move into place).

In hindsight I should have posted that observation onto the -hackers
thread where people able to test on Windows would have been more
likely to act, not buried here in -committers (and I didn't want to
commit blind to pre-CI branches in a hurry myself).  Sorry about that.