Thread: port conflicts when running tests concurrently on windows.
Hi, With the meson patch applied the tests on windows run concurrently by default. Unfortunately that fails semi-regularly. The reason for this basically is that windows defaults to using TCP in tests, and that the tap-test port determination is very racy: > # When selecting a port, we look for an unassigned TCP port number, > # even if we intend to use only Unix-domain sockets. This is clearly > # necessary on $use_tcp (Windows) configurations, and it seems like a > # good idea on Unixen as well. > $port = get_free_port(); > ... > =item get_free_port() > > Locate an unprivileged (high) TCP port that's not currently bound to > anything. This is used by C<new()>, and also by some test cases that need to > start other, non-Postgres servers. > > Ports assigned to existing PostgreSQL::Test::Cluster objects are automatically > excluded, even if those servers are not currently running. > > XXX A port available now may become unavailable by the time we start > the desired service. I don't think there's an easy way to make this race-free. We'd need to teach postmaster to use pre-opened socket or something like that. An alternative to that would be to specify a base port number externally. In the meson branch I already did that for the pg_regress style tests, since they don't have the automatic port thing above. But for tap tests there's currently no way to pass in a base-port that I can see. Is it perhaps time to to use unix sockets on windows by default (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? Greetings, Andres Freund
Hi, On 2021-12-08 14:45:50 -0800, Andres Freund wrote: > Is it perhaps time to to use unix sockets on windows by default > (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? On its own PG_TEST_USE_UNIX_SOCKETS doesn't work at all on windows - it fails trying to use /tmp/ as a socket directory. Using PG_REGRESS_SOCK_DIR fixes that for PG_REGRESS. But the tap tests don't look at that :(. Greetings, Andres Freund
Hi, On 2021-12-08 16:36:14 -0800, Andres Freund wrote: > On 2021-12-08 14:45:50 -0800, Andres Freund wrote: > > Is it perhaps time to to use unix sockets on windows by default > > (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? > > On its own PG_TEST_USE_UNIX_SOCKETS doesn't work at all on windows - it fails > trying to use /tmp/ as a socket directory. Using PG_REGRESS_SOCK_DIR fixes > that for PG_REGRESS. But the tap tests don't look at that :(. The tap failures in turn are caused by the Cluster.pm choosing a socket directory with backslashes. Those backslashes are then treated as an escape character both by guc.c and libpq. I think this can be addressed by something like diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm index 9467a199c8f..c2a8487bbab 100644 --- a/src/test/perl/PostgreSQL/Test/Cluster.pm +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm @@ -119,7 +119,15 @@ INIT $use_tcp = !$PostgreSQL::Test::Utils::use_unix_sockets; $test_localhost = "127.0.0.1"; $last_host_assigned = 1; - $test_pghost = $use_tcp ? $test_localhost : PostgreSQL::Test::Utils::tempdir_short; + if ($use_tcp) + { + $test_pghost = $test_localhost; + } + else + { + $test_pghost = PostgreSQL::Test::Utils::tempdir_short; + $test_pghost =~ s!\\!/!g if $PostgreSQL::Test::Utils::windows_os; + } $ENV{PGHOST} = $test_pghost; $ENV{PGDATABASE} = 'postgres'; I wonder if we need a host2unix() helper accompanying perl2host()? Seems nicer than sprinkling s!\\!/!g if $PostgreSQL::Test::Utils::windows_os in a growing number of places... Greetings, Andres Freund
Hi, On 2021-12-08 17:03:07 -0800, Andres Freund wrote: > On 2021-12-08 16:36:14 -0800, Andres Freund wrote: > > On 2021-12-08 14:45:50 -0800, Andres Freund wrote: > > > Is it perhaps time to to use unix sockets on windows by default > > > (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? > > > > On its own PG_TEST_USE_UNIX_SOCKETS doesn't work at all on windows - it fails > > trying to use /tmp/ as a socket directory. Using PG_REGRESS_SOCK_DIR fixes > > that for PG_REGRESS. But the tap tests don't look at that :(. > > The tap failures in turn are caused by the Cluster.pm choosing a socket > directory with backslashes. Those backslashes are then treated as an escape > character both by guc.c and libpq. > > I think this can be addressed by something like > [...] That indeed seems to pass [1] where it previously failed [2]. There's another failure that I haven't diagnosed yet, but it's independent of tcp vs unix sockets. [1] https://cirrus-ci.com/task/5055530235330560?logs=ssl_test#L5 [2] https://cirrus-ci.com/task/5524596901281792?logs=ssl_test#L5 Greetings, Andres Freund
On Thu, Dec 9, 2021 at 11:46 AM Andres Freund <andres@anarazel.de> wrote: > Is it perhaps time to to use unix sockets on windows by default > (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? Makes sense. As a data point, it looks like this feature is in all supported releases of Windows. It arrived in 1803, already EOL'd, and IIUC even a Windows Server 2016 "LTSC" system that's been disconnected from the internet and refusing all updates reaches "mainstream EOL" next month. (Not a Windows person myself, but I've been looking at this stuff while contemplating various filesystem-related changes... there it's a little murkier, you need a WSL1-era kernel *and* you need to be running on top of local NTFS, which is harder for us to expect.) https://docs.microsoft.com/en-us/lifecycle/products/windows-10-enterprise-and-education https://docs.microsoft.com/en-us/windows-server/get-started/windows-server-release-info https://en.wikipedia.org/wiki/List_of_Microsoft_Windows_versions
On 09.12.21 03:44, Thomas Munro wrote: > On Thu, Dec 9, 2021 at 11:46 AM Andres Freund<andres@anarazel.de> wrote: >> Is it perhaps time to to use unix sockets on windows by default >> (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? Makes sense to get this to work, at least as an option. > Makes sense. As a data point, it looks like this feature is in all > supported releases of Windows. It arrived in 1803, already EOL'd, and > IIUC even a Windows Server 2016 "LTSC" system that's been disconnected > from the internet and refusing all updates reaches "mainstream EOL" > next month. I believe the "18" in "1803" refers to 2018. We have Windows buildfarm members that mention 2016 and 2017 in their title. Would those be in trouble?
On 12/9/21 08:35, Peter Eisentraut wrote: > On 09.12.21 03:44, Thomas Munro wrote: >> On Thu, Dec 9, 2021 at 11:46 AM Andres Freund<andres@anarazel.de> >> wrote: >>> Is it perhaps time to to use unix sockets on windows by default >>> (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? > > Makes sense to get this to work, at least as an option. > >> Makes sense. As a data point, it looks like this feature is in all >> supported releases of Windows. It arrived in 1803, already EOL'd, and >> IIUC even a Windows Server 2016 "LTSC" system that's been disconnected >> from the internet and refusing all updates reaches "mainstream EOL" >> next month. > > I believe the "18" in "1803" refers to 2018. We have Windows > buildfarm members that mention 2016 and 2017 in their title. Would > those be in trouble? Probably not if they have been updated. I have Windows machines substantially older that 2018 but now running versions dated later. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 12/8/21 20:03, Andres Freund wrote: > > I wonder if we need a host2unix() helper accompanying perl2host()? Seems nicer > than sprinkling s!\\!/!g if $PostgreSQL::Test::Utils::windows_os in a growing > number of places... > Probably a good idea. I would call it canonical_path or something like that. / works quite happily as a path separator in almost all contexts on Windows - there are a handful of command line programs that choke on it - but sometimes you need to quote the path. WHen I recently provided for cross version upgrade testing on MSVC builds I just quoted everything. On Unix/Msys the shell just removes the quotes. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi, On 2021-12-09 14:35:37 +0100, Peter Eisentraut wrote: > On 09.12.21 03:44, Thomas Munro wrote: > > On Thu, Dec 9, 2021 at 11:46 AM Andres Freund<andres@anarazel.de> wrote: > > > Is it perhaps time to to use unix sockets on windows by default > > > (i.e. PG_TEST_USE_UNIX_SOCKETS), at least when on a new enough windows? > > Makes sense to get this to work, at least as an option. With https://github.com/anarazel/postgres/commit/046203741803da863f6129739fd215f8a32ec357 all tests pass. pg_regress requires PG_REGRESS_SOCK_DIR because it checks for TMPDIR, but windows only has TMP and TEMP. > > Makes sense. As a data point, it looks like this feature is in all > > supported releases of Windows. It arrived in 1803, already EOL'd, and > > IIUC even a Windows Server 2016 "LTSC" system that's been disconnected > > from the internet and refusing all updates reaches "mainstream EOL" > > next month. > > I believe the "18" in "1803" refers to 2018. We have Windows buildfarm > members that mention 2016 and 2017 in their title. Would those be in > trouble? Perhaps it could make sense to print the windows version somewhere as part of a windows build? Perhaps in the buildfarm client? Seems like it could be generally useful, outside of this specific issue. The most minimal thing would be to just print cmd /c ver or such. systeminfo.exe output could also be useful, but has a bit of runtime (1.5s on my windows VM). Greetings, Andres Freund
On 09.12.21 19:41, Andres Freund wrote: > Withhttps://github.com/anarazel/postgres/commit/046203741803da863f6129739fd215f8a32ec357 > all tests pass. pg_regress requires PG_REGRESS_SOCK_DIR because it checks for > TMPDIR, but windows only has TMP and TEMP. This looks reasonable so far. The commit messages 8f3ec75de4060d86176ad4ac998eeb87a39748c2 and 1d53432ff940b789c2431ba476a2a6e2db3edf84 contain some notes about what I thought at the time didn't work yet. In particular, the pg_upgrade tests don't support the use of Unix sockets on Windows. (Those tests have been rewritten since, so I don't know what the status is.) Also, the comment in pg_regress.c at remove_temp() is still there. These things should probably be addressed before we can consider making this the default.
Hi, On 2021-12-10 10:22:13 +0100, Peter Eisentraut wrote: > On 09.12.21 19:41, Andres Freund wrote: > > Withhttps://github.com/anarazel/postgres/commit/046203741803da863f6129739fd215f8a32ec357 > > all tests pass. pg_regress requires PG_REGRESS_SOCK_DIR because it checks for > > TMPDIR, but windows only has TMP and TEMP. > > This looks reasonable so far. I pushed that part, since we clearly need something like them. > The commit messages 8f3ec75de4060d86176ad4ac998eeb87a39748c2 and > 1d53432ff940b789c2431ba476a2a6e2db3edf84 contain some notes about what I > thought at the time didn't work yet. > In particular, the pg_upgrade tests > don't support the use of Unix sockets on Windows. (Those tests have been > rewritten since, so I don't know what the status is.) ISTM we still use two different implementations of the pg_upgrade tests :(. I recall there being some recent-ish work on moving it to be a tap test, but apparently not yet committed. It doesn't look like the vcregress.pl implementation respects PG_TEST_USE_UNIX_SOCKETS right now. > pg_regress.c at remove_temp() is still there. These things should probably > be addressed before we can consider making this the default. Hm, not immediately obvious what to do about this. Do you know if windows has restrictions around the length of unix domain sockets? If not, I wonder if it could be worth using the data directory as the socket path on windows instead of the separate temp directory? Greetings, Andres Freund
On 13.12.21 20:33, Andres Freund wrote: >> pg_regress.c at remove_temp() is still there. These things should probably >> be addressed before we can consider making this the default. > > Hm, not immediately obvious what to do about this. Do you know if windows has > restrictions around the length of unix domain sockets? If not, I wonder if it > could be worth using the data directory as the socket path on windows instead > of the separate temp directory? According to src/include/port/win32.h, it's the same 108 or so bytes that everyone else uses. That might not be the kernel limit, however, just the way the external API is defined. After the reading the material again, I think that comment might be overly cautious. The list of things not to do in a signal handler on Windows is not significantly different than those for say Linux, yet we do a lot of them anyway. I'm tempted to just ignore the advice and do it anyway, while ignoring errors, which is what it already does.