Thread: Checking pgwin32_is_junction() errors
Hi, The comment for pgwin32_is_junction() says "Assumes the file exists, so will return false if it doesn't (since a nonexistent file is not a junction)". In fact that's the behaviour for any kind of error, and although we set errno in that case, no caller ever checks it. I think it'd be better to add missing_ok and elevel parameters, following existing patterns. Unfortunately, it can't use the generic frontend logging to implement elevel in frontend code from its current location, because pgport can't call pgcommon. For now I came up with a kludge to work around that problem, but I don't like it, and would need to come up with something better... Sketch code attached.
Attachment
On Thu, Mar 24, 2022 at 04:30:26PM +1300, Thomas Munro wrote: > I think it'd be better to add missing_ok and elevel parameters, > following existing patterns. Unfortunately, it can't use the generic > frontend logging to implement elevel in frontend code from its current > location, because pgport can't call pgcommon. For now I came up with > a kludge to work around that problem, but I don't like it, and would > need to come up with something better... The only barrier reason why elevel if needed is because of pg_wal in SyncDataDirectory() that cannot fail hard. I don't have a great idea here, except using a bits32 with some bitwise flags to control the behavior of the routine, aka something close to a MISSING_OK and a FAIL_HARD_ON_ERROR. This pattern exists already in some of the *Extended() routines. -- Michael
Attachment
Here's a better idea, now that I'm emboldened by having working CI for Windows frankenbuilds, and since I broke some stuff in this area on MSYS[1], which caused me to look more closely at this area. Why don't we just nuke pgwin32_is_junction() from orbit, and teach Windows how to lstat()? We're already defining our own replacement stat() used in both MSVC and MSYS builds along with our own junction point-based symlink() and readlink() functions, and lstat() was already suggested in a comment in win32stat.c. There's one curious change in the draft patch attached: you can't unlink() a junction point, you have to rmdir() it. Previously, things that traverse directories without ever calling pgwin32_is_junction() would see junction points as S_ISDIR() and call rmdir(), which was OK, but now they see S_ISLNK() and call unlink(). So I taught unlink() to try both things. Which is kinda weird, and not beautiful, especially when combined with the existing looping weirdness. 0001 is a copy of v2 of Melih Mutlu's CI patch[2] to show cfbot how to test this on MSYS (alongside the normal MSVC result), but that's not part of this submission. [1] https://www.postgresql.org/message-id/flat/b9ddf605-6b36-f90d-7c30-7b3e95c46276%40dunslane.net [2] https://www.postgresql.org/message-id/flat/CAGPVpCSKS9E0An4%3De7ZDnme%2By%3DWOcQFJYJegKO8kE9%3Dgh8NJKQ%40mail.gmail.com
Attachment
On Thu, Jul 28, 2022 at 9:31 PM Thomas Munro <thomas.munro@gmail.com> wrote: > There's one curious change in the draft patch attached: you can't > unlink() a junction point, you have to rmdir() it. Previously, things > that traverse directories without ever calling pgwin32_is_junction() > would see junction points as S_ISDIR() and call rmdir(), which was OK, > but now they see S_ISLNK() and call unlink(). So I taught unlink() to > try both things. Which is kinda weird, and not beautiful, especially > when combined with the existing looping weirdness. Here's a new attempt at unlink(), this time in its own patch. This version is a little more careful about calling rmdir() only after checking that it is a junction point, so that unlink("a directory") fails just like on Unix (well, POSIX says that that should fail with EPERM, not EACCES, and implementations are allowed to make it work anyway, but it doesn't seem helpful to allow it to work there when every OS I know of fails with EPERM or EISDIR). That check is racy, but should be good enough for our purposes, no (see comment for a note on that)? Longer term, I wonder if we should get rid of our use of symlinks, and instead just put paths in a file and do our own path translation. But for now, this patch set completes the set of junction point-based emulations, and, IMHO, cleans up a confusing aspect of our code. As before, 0001 is just for cfbot to add an MSYS checkmark.
Attachment
On 2022-08-01 Mo 01:09, Thomas Munro wrote: > On Thu, Jul 28, 2022 at 9:31 PM Thomas Munro <thomas.munro@gmail.com> wrote: >> There's one curious change in the draft patch attached: you can't >> unlink() a junction point, you have to rmdir() it. Previously, things >> that traverse directories without ever calling pgwin32_is_junction() >> would see junction points as S_ISDIR() and call rmdir(), which was OK, >> but now they see S_ISLNK() and call unlink(). So I taught unlink() to >> try both things. Which is kinda weird, and not beautiful, especially >> when combined with the existing looping weirdness. > Here's a new attempt at unlink(), this time in its own patch. This > version is a little more careful about calling rmdir() only after > checking that it is a junction point, so that unlink("a directory") > fails just like on Unix (well, POSIX says that that should fail with > EPERM, not EACCES, and implementations are allowed to make it work > anyway, but it doesn't seem helpful to allow it to work there when > every OS I know of fails with EPERM or EISDIR). That check is racy, > but should be good enough for our purposes, no (see comment for a note > on that)? > > Longer term, I wonder if we should get rid of our use of symlinks, and > instead just put paths in a file and do our own path translation. But > for now, this patch set completes the set of junction point-based > emulations, and, IMHO, cleans up a confusing aspect of our code. > > As before, 0001 is just for cfbot to add an MSYS checkmark. I'll try it out on fairywren/drongo. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2022-08-01 Mo 16:06, Andrew Dunstan wrote: > On 2022-08-01 Mo 01:09, Thomas Munro wrote: >> On Thu, Jul 28, 2022 at 9:31 PM Thomas Munro <thomas.munro@gmail.com> wrote: >>> There's one curious change in the draft patch attached: you can't >>> unlink() a junction point, you have to rmdir() it. Previously, things >>> that traverse directories without ever calling pgwin32_is_junction() >>> would see junction points as S_ISDIR() and call rmdir(), which was OK, >>> but now they see S_ISLNK() and call unlink(). So I taught unlink() to >>> try both things. Which is kinda weird, and not beautiful, especially >>> when combined with the existing looping weirdness. >> Here's a new attempt at unlink(), this time in its own patch. This >> version is a little more careful about calling rmdir() only after >> checking that it is a junction point, so that unlink("a directory") >> fails just like on Unix (well, POSIX says that that should fail with >> EPERM, not EACCES, and implementations are allowed to make it work >> anyway, but it doesn't seem helpful to allow it to work there when >> every OS I know of fails with EPERM or EISDIR). That check is racy, >> but should be good enough for our purposes, no (see comment for a note >> on that)? >> >> Longer term, I wonder if we should get rid of our use of symlinks, and >> instead just put paths in a file and do our own path translation. But >> for now, this patch set completes the set of junction point-based >> emulations, and, IMHO, cleans up a confusing aspect of our code. >> >> As before, 0001 is just for cfbot to add an MSYS checkmark. > > > I'll try it out on fairywren/drongo. > > They are happy with patches 2, 3, and 4. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Thu, Aug 4, 2022 at 9:28 AM Andrew Dunstan <andrew@dunslane.net> wrote: > On 2022-08-01 Mo 16:06, Andrew Dunstan wrote: > > I'll try it out on fairywren/drongo. > They are happy with patches 2, 3, and 4. Thanks for testing! If there are no objections, I'll go ahead and commit these later today.
On Thu, Aug 4, 2022 at 9:42 AM Thomas Munro <thomas.munro@gmail.com> wrote: > On Thu, Aug 4, 2022 at 9:28 AM Andrew Dunstan <andrew@dunslane.net> wrote: > > On 2022-08-01 Mo 16:06, Andrew Dunstan wrote: > > > I'll try it out on fairywren/drongo. > > > They are happy with patches 2, 3, and 4. > > Thanks for testing! > > If there are no objections, I'll go ahead and commit these later today. Hmm, POSIX says st_link should contain the length of a symlink's target path, so I suppose we should probably set that even though we never consult it. Here's a version that does that. I also removed the rest of the now redundant #ifdef S_ISLNK conditions.
Attachment
On Fri, Aug 5, 2022 at 9:17 PM Thomas Munro <thomas.munro@gmail.com> wrote: > Hmm, POSIX says st_link should contain the length of a symlink's > target path, so I suppose we should probably set that even though we > never consult it. Here's a version that does that. I also removed > the rest of the now redundant #ifdef S_ISLNK conditions. Pushed. Hmm, this stuff could *really* use a little test framework that's run by check-world, that exercises these various replacement operations. But I also suspect that problems in this area are likely to be due to concurrency. It's hard to make a simple test that simulates the case where a file is unlinked between system calls within stat() and hits the STATUS_DELETE_PENDING case. That check is code I cargo-culted in this patch. So much of the stuff we've had in the tree relating to that area has been wrong in the past...
On 2022-08-06 08:02, Thomas Munro wrote: > > Pushed. > > Hmm, this stuff could *really* use a little test framework that's run > by check-world, that exercises these various replacement operations. > But I also suspect that problems in this area are likely to be due to > concurrency. It's hard to make a simple test that simulates the case > where a file is unlinked between system calls within stat() and hits > the STATUS_DELETE_PENDING case. That check is code I cargo-culted in > this patch. So much of the stuff we've had in the tree relating to > that area has been wrong in the past... Hello, hackers! initdb on my windows 10 system stopped working after the commit c5cb8f3b: "Provide lstat() for Windows." The error message is: creating directory C:/HOME/data ... initdb: error: could not create directory "C:/HOME": File exists "C:/HOME" is the junction point to the second volume on my hard drive - "\??\Volume{GUID}\" which name pgreadlink() erroneously strips here: https://github.com/postgres/postgres/blob/7e29a79a46d30dc236d097825ab849158929d977/src/port/dirmod.c#L357. So initdb could not stat the file with name "Volume{GUID}", tried to create it and failed. With the attached patch initdb works fine again. -- regards, Roman
Attachment
On Mon, Aug 8, 2022 at 8:23 PM <r.zharkov@postgrespro.ru> wrote: > initdb on my windows 10 system stopped working after the commit > c5cb8f3b: "Provide lstat() for Windows." > The error message is: creating directory C:/HOME/data ... initdb: > error: could not create directory "C:/HOME": File exists > > "C:/HOME" is the junction point to the second volume on my hard drive - > "\??\Volume{GUID}\" which name pgreadlink() erroneously strips here: > https://github.com/postgres/postgres/blob/7e29a79a46d30dc236d097825ab849158929d977/src/port/dirmod.c#L357. > So initdb could not stat the file with name "Volume{GUID}", tried to > create it and failed. > With the attached patch initdb works fine again. - if (r > 4 && strncmp(buf, "\\??\\", 4) == 0) + if (r > 4 && strncmp(buf, "\\??\\", 4) == 0 && + strncmp(buf, "\\??\\Volume", 10) != 0) { memmove(buf, buf + 4, strlen(buf + 4) + 1); r -= 4; Hmm. I suppose the problem must be in pg_mkdir_p(). Our symlink() emulation usually adds this "\??\" prefix (making it an "NT object path"?), because junction points only work if they are in that format. Then our readlink() emulation removes it again, but in the case of your \??\Volume{GUID} path, created by you, not our symlink() emulation, removing "\??\" apparently makes it unopenable with CreateFile() (I guess that's what fails? What's the error?). So your patch just says: don't strip "\??\" if it's followed by "Volume". I don't understand all the kinds of DOS, Windows and NT paths (let me take a moment to say how much I love Unix), but here's a guess: could it be that NT "\??\C:\foo" = DOS "C:\foo", but NT "\??\Volume..." = DOS "\Volume..."? In other words, if it hasn't got a drive letter, maybe it still needs an initial "\" (or if not that, then *something* special, because otherwise it looks like a relative path). Would it be better to say: if it doesn't begin with "\??\X:", where X could be any letter, then don't modify it? Maybe [1] has some clues. It seems to give the info in a higher density form than the Windows docs (at least to the uninitiated like me wanting a quick overview with examples). Hmm, I wonder if we could get away from doing our own path mangling and use some of the proper library calls mentioned on that page... [1] https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
On Tue, Aug 9, 2022 at 8:30 AM Thomas Munro <thomas.munro@gmail.com> wrote: > On Mon, Aug 8, 2022 at 8:23 PM <r.zharkov@postgrespro.ru> wrote: > > "C:/HOME" is the junction point to the second volume on my hard drive - > > "\??\Volume{GUID}\" which name pgreadlink() erroneously strips here: > > https://github.com/postgres/postgres/blob/7e29a79a46d30dc236d097825ab849158929d977/src/port/dirmod.c#L357. > ... Would it > be better to say: if it doesn't begin with "\??\X:", where X could be > any letter, then don't modify it? Concretely, I wonder if this is a good fix at least in the short term. Does this work for you, and do the logic and explanation make sense?
Attachment
On 2022-08-09 03:30, Thomas Munro wrote: > Then our readlink() emulation removes it again, but in the case of > your \??\Volume{GUID} path, created by you, not our symlink() > emulation, removing "\??\" apparently makes it unopenable with > CreateFile() (I guess that's what fails? What's the error?). So your > patch just says: don't strip "\??\" if it's followed by "Volume". Sorry, I thought wrong that everyone sees the backtrace on my screen. Failes the CreateFile() function with fileName = "Volume{GUID}\" at [1]. And the GetLastError() returnes 2 (ERROR_FILE_NOT_FOUND). Call Stack: initdb.exe!pgwin32_open_handle(const char * fileName, ...) Line 111 C initdb.exe!_pglstat64(const char * name, stat * buf) Line 128 C initdb.exe!_pgstat64(const char * name, stat * buf) Line 221 C initdb.exe!pg_mkdir_p(char * path, int omode) Line 123 C initdb.exe!create_data_directory() Line 2537 C initdb.exe!initialize_data_directory() Line 2696 C initdb.exe!main(int argc, char * * argv) Line 3102 C > I don't understand all the kinds of DOS, Windows and NT paths (let me > take a moment to say how much I love Unix), but here's a guess: could > it be that NT "\??\C:\foo" = DOS "C:\foo", but NT "\??\Volume..." = > DOS "\Volume..."? In other words, if it hasn't got a drive letter, > maybe it still needs an initial "\" (or if not that, then *something* > special, because otherwise it looks like a relative path). It seems to me, when we call CreateFile() Windows Object Manager searches DOS devices (drive letters in our case) in DOS Device namespaces. But it doesn't search the "Volume{GUID}" devices which must be named as "\\?\Volume{GUID}\" [2]. > Would it be better to say: if it doesn't begin with "\??\X:", where X > could be any letter, then don't modify it? > I think it will be better. [1] https://github.com/postgres/postgres/blob/7e29a79a46d30dc236d097825ab849158929d977/src/port/open.c#L86 [2] https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-volume
On 2022-08-09 05:44, Thomas Munro wrote: > On Tue, Aug 9, 2022 at 8:30 AM Thomas Munro <thomas.munro@gmail.com> > wrote: >> On Mon, Aug 8, 2022 at 8:23 PM <r.zharkov@postgrespro.ru> wrote: >> > "C:/HOME" is the junction point to the second volume on my hard drive - >> > "\??\Volume{GUID}\" which name pgreadlink() erroneously strips here: >> > https://github.com/postgres/postgres/blob/7e29a79a46d30dc236d097825ab849158929d977/src/port/dirmod.c#L357. > >> ... Would it >> be better to say: if it doesn't begin with "\??\X:", where X could be >> any letter, then don't modify it? > > Concretely, I wonder if this is a good fix at least in the short term. > Does this work for you, and do the logic and explanation make sense? Yes, this patch works well with my junction points. I checked a few variants: 21.07.2022 15:11 <JUNCTION> HOME [\??\Volume{GUID}\] 09.08.2022 15:06 <JUNCTION> Test1 [\\?\Volume{GUID}\] 09.08.2022 15:06 <JUNCTION> Test2 [\\.\Volume{GUID}\] 09.08.2022 15:17 <JUNCTION> Test3 [\??\Volume{GUID}\] 09.08.2022 15:27 <JUNCTION> Test4 [C:\temp\1] 09.08.2022 15:28 <JUNCTION> Test5 [C:\HOME\Temp\1] After hours of reading the documentation and debugging, it seems to me we can use REPARSE_GUID_DATA_BUFFER structure instead of our REPARSE_JUNCTION_DATA_BUFFER [1]. DataBuffer doesn't contain any prefixes, so we don't need to strip them. But we still need to construct a correct volume name if a junction point is a volume mount point. Is it worth to check this idea? [1] https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-reparse_guid_data_buffer
On Tue, Aug 9, 2022 at 10:59 PM <r.zharkov@postgrespro.ru> wrote: > On 2022-08-09 05:44, Thomas Munro wrote: > > On Tue, Aug 9, 2022 at 8:30 AM Thomas Munro <thomas.munro@gmail.com> > > wrote: > >> On Mon, Aug 8, 2022 at 8:23 PM <r.zharkov@postgrespro.ru> wrote: > >> > "C:/HOME" is the junction point to the second volume on my hard drive - > >> > "\??\Volume{GUID}\" which name pgreadlink() erroneously strips here: > >> > https://github.com/postgres/postgres/blob/7e29a79a46d30dc236d097825ab849158929d977/src/port/dirmod.c#L357. > > > >> ... Would it > >> be better to say: if it doesn't begin with "\??\X:", where X could be > >> any letter, then don't modify it? > > > > Concretely, I wonder if this is a good fix at least in the short term. > > Does this work for you, and do the logic and explanation make sense? > > Yes, this patch works well with my junction points. Thanks for testing! I did a bit more reading on this stuff, so that I could update the comments with the correct terminology from Windows APIs. I also realised that the pattern we could accept to symlink() and expect to work is not just "C:..." (could be RtlPathTypeDriveRelative, which wouldn't actually work in a junction point) but "C:\..." (RtlPathTypeDriveAbsolute). I tweaked it a bit to test for that. > I checked a few variants: > > 21.07.2022 15:11 <JUNCTION> HOME [\??\Volume{GUID}\] > 09.08.2022 15:06 <JUNCTION> Test1 [\\?\Volume{GUID}\] > 09.08.2022 15:06 <JUNCTION> Test2 [\\.\Volume{GUID}\] > 09.08.2022 15:17 <JUNCTION> Test3 [\??\Volume{GUID}\] > 09.08.2022 15:27 <JUNCTION> Test4 [C:\temp\1] > 09.08.2022 15:28 <JUNCTION> Test5 [C:\HOME\Temp\1] One more thing I wondered about, now that we're following junctions outside PGDATA: can a junction point to another junction? If so, I didn't allow for that: stat() gives up after one hop, because I figured that was enough for the stuff we expect inside PGDATA and I couldn't find any evidence in the man pages that referred to chains. But if you *are* allowed to create a junction "c:\huey" that points to junction "c:\dewey" that points to "c:\louie", and then you do initdb -D c:\huey\pgdata, then I guess it would fail. Would you mind checking if that is a real possibility, and if so, testing this chain-following patch to see if it fixes it? > After hours of reading the documentation and debugging, it seems to me > we can use REPARSE_GUID_DATA_BUFFER structure instead of our > REPARSE_JUNCTION_DATA_BUFFER [1]. DataBuffer doesn't contain any > prefixes, > so we don't need to strip them. But we still need to construct a correct > volume name if a junction point is a volume mount point. Is it worth to > check this idea? I don't know. I think I prefer our current approach, because it can handle anything (raw/full NT paths) and doesn't try to be very clever, and I don't want to change to a different scheme for no real benefit...
Attachment
On 2022-08-11 07:55, Thomas Munro wrote: >> I checked a few variants: >> >> 21.07.2022 15:11 <JUNCTION> HOME [\??\Volume{GUID}\] >> 09.08.2022 15:06 <JUNCTION> Test1 [\\?\Volume{GUID}\] >> 09.08.2022 15:06 <JUNCTION> Test2 [\\.\Volume{GUID}\] >> 09.08.2022 15:17 <JUNCTION> Test3 [\??\Volume{GUID}\] >> 09.08.2022 15:27 <JUNCTION> Test4 [C:\temp\1] >> 09.08.2022 15:28 <JUNCTION> Test5 [C:\HOME\Temp\1] > > One more thing I wondered about, now that we're following junctions > outside PGDATA: can a junction point to another junction? If so, I > didn't allow for that: stat() gives up after one hop, because I > figured that was enough for the stuff we expect inside PGDATA and I > couldn't find any evidence in the man pages that referred to chains. > But if you *are* allowed to create a junction "c:\huey" that points to > junction "c:\dewey" that points to "c:\louie", and then you do initdb > -D c:\huey\pgdata, then I guess it would fail. Would you mind > checking if that is a real possibility, and if so, testing this > chain-following patch to see if it fixes it? I made some junctions and rechecked both patches. 11.08.2022 16:11 <JUNCTION> donald [C:\huey] 11.08.2022 13:23 <JUNCTION> huey [C:\dewey] 11.08.2022 13:23 <JUNCTION> dewey [C:\louie] 11.08.2022 16:57 <DIR> louie With the small attached patch initdb succeeded in any of these "directories". If the junction chain is too long, initdb fails with "could not create directory" as expected. initdb -D huey/pgdata ... Success. initdb -N -D donald ... Success. 11.08.2022 17:32 <DIR> 1 11.08.2022 17:32 <JUNCTION> 2 [C:\1] 11.08.2022 17:32 <JUNCTION> 3 [C:\2] 11.08.2022 17:32 <JUNCTION> 4 [C:\3] 11.08.2022 17:32 <JUNCTION> 5 [C:\4] 11.08.2022 17:32 <JUNCTION> 6 [C:\5] 11.08.2022 17:32 <JUNCTION> 7 [C:\6] 11.08.2022 17:32 <JUNCTION> 8 [C:\7] 11.08.2022 17:32 <JUNCTION> 9 [C:\8] 11.08.2022 17:32 <JUNCTION> 10 [C:\9] initdb -D 10/pgdata ... creating directory 10/pgdata ... initdb: error: could not create directory "10": File exists initdb -D 9/pgdata ... Success.
Attachment
On Thu, Aug 11, 2022 at 10:40 PM <r.zharkov@postgrespro.ru> wrote: > On 2022-08-11 07:55, Thomas Munro wrote: > >> I checked a few variants: > >> > >> 21.07.2022 15:11 <JUNCTION> HOME [\??\Volume{GUID}\] > >> 09.08.2022 15:06 <JUNCTION> Test1 [\\?\Volume{GUID}\] > >> 09.08.2022 15:06 <JUNCTION> Test2 [\\.\Volume{GUID}\] > >> 09.08.2022 15:17 <JUNCTION> Test3 [\??\Volume{GUID}\] > >> 09.08.2022 15:27 <JUNCTION> Test4 [C:\temp\1] > >> 09.08.2022 15:28 <JUNCTION> Test5 [C:\HOME\Temp\1] > > > > One more thing I wondered about, now that we're following junctions > > outside PGDATA: can a junction point to another junction? If so, I > > didn't allow for that: stat() gives up after one hop, because I > > figured that was enough for the stuff we expect inside PGDATA and I > > couldn't find any evidence in the man pages that referred to chains. > > But if you *are* allowed to create a junction "c:\huey" that points to > > junction "c:\dewey" that points to "c:\louie", and then you do initdb > > -D c:\huey\pgdata, then I guess it would fail. Would you mind > > checking if that is a real possibility, and if so, testing this > > chain-following patch to see if it fixes it? > > I made some junctions and rechecked both patches. > > 11.08.2022 16:11 <JUNCTION> donald [C:\huey] > 11.08.2022 13:23 <JUNCTION> huey [C:\dewey] > 11.08.2022 13:23 <JUNCTION> dewey [C:\louie] > 11.08.2022 16:57 <DIR> louie > > With the small attached patch initdb succeeded in any of these > "directories". If the junction chain is too long, initdb fails with > "could not create directory" as expected. Thanks for testing and for that fix! I do intend to push this, and a nearby fix for unlink(), but first I want to have test coverage for all this stuff so we can demonstrate comprehensively that it works via automated testing, otherwise it's just impossible to maintain (at least for me, Unix guy). I have a prototype test suite based on writing TAP tests in C and I've already found more subtle ancient bugs around the Windows porting layer... more soon.