Thread: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Bharath Rupireddy
Date:
Hi, It looks like regression tests are failing on Windows Server 2019 [1]. Ignore if it's reported elsewhere. [1] https://github.com/postgres/postgres/runs/5419324953 ❌ 00:02 test_pl [08:28:53.758] [08:28:53.758] c:\cirrus>call "C:/Program Files/Git/usr/bin/timeout.exe" -v -k60s 15m perl src/tools/msvc/vcregress.pl plcheck [08:28:53.953] ============================================================ [08:28:53.953] Checking plpgsql [08:28:53.977] (using postmaster on Unix socket, default port) [08:28:53.977] ============== dropping database "pl_regression" ============== [08:28:56.044] psql: error: connection to server at "localhost" (::1), port 5432 failed: Connection refused (0x0000274D/10061) [08:28:56.044] Is the server running on that host and accepting TCP/IP connections? [08:28:56.044] connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused (0x0000274D/10061) [08:28:56.044] Is the server running on that host and accepting TCP/IP connections? [08:28:56.050] command failed: "c:/cirrus/Debug/psql/psql" -X -c "SET client_min_messages = warning" -c "DROP DATABASE IF EXISTS \"pl_regression\"" "postgres" [08:28:56.063] [08:28:56.063] c:\cirrus>if 2 NEQ 0 exit /b 2 [08:28:56.084] [08:28:56.084] Exit status: 2 Regards, Bharath Rupireddy.
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Thomas Munro
Date:
On Fri, Mar 4, 2022 at 10:37 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > It looks like regression tests are failing on Windows Server 2019 [1]. > Ignore if it's reported elsewhere. It's broken since 46ab07ff "Clean up assorted failures under clang's -fsanitize=undefined checks.": https://github.com/postgres/postgres/commits/master I don't see what that patch has to do with the symptoms. It looks a bit like the cluster started by the "startcreate" step ("pg_ctl.exe start ...") is mysteriously no longer running when we get to the "test_pl" step ("Connection refused").
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes: > On Fri, Mar 4, 2022 at 10:37 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: >> It looks like regression tests are failing on Windows Server 2019 [1]. >> Ignore if it's reported elsewhere. > It's broken since 46ab07ff "Clean up assorted failures under clang's > -fsanitize=undefined checks.": > https://github.com/postgres/postgres/commits/master > I don't see what that patch has to do with the symptoms. The buildfarm is not unhappy, so I'd be looking at what changed recently in the cfbot's setup. I find it a little suspicious that startcreate is merrily starting the postmaster on port 5432, without concern for the possibility that that port is in use or blocked by a packet filter. regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-04 09:54:28 -0500, Tom Lane wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > > https://github.com/postgres/postgres/commits/master > > > I don't see what that patch has to do with the symptoms. > > The buildfarm is not unhappy, so I'd be looking at what changed > recently in the cfbot's setup. Very odd. > I find it a little suspicious that startcreate is merrily starting > the postmaster on port 5432, without concern for the possibility > that that port is in use or blocked by a packet filter. It's a container that's just created for that run, so there shouldn't be anything else running. We also see that the server successfully starts: 2022-03-03 23:24:15.261 GMT [5504][postmaster] LOG: starting PostgreSQL 15devel, compiled by Visual C++ build 1929, 64-bit 2022-03-03 23:24:15.270 GMT [5504][postmaster] LOG: listening on IPv6 address "::1", port 5432 2022-03-03 23:24:15.270 GMT [5504][postmaster] LOG: listening on IPv4 address "127.0.0.1", port 5432 2022-03-03 23:24:15.321 GMT [5808][startup] LOG: database system was shut down at 2022-03-03 23:24:15 GMT 2022-03-03 23:24:15.341 GMT [5504][postmaster] LOG: database system is ready to accept connections Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes: > On 2022-03-04 09:54:28 -0500, Tom Lane wrote: >> I find it a little suspicious that startcreate is merrily starting >> the postmaster on port 5432, without concern for the possibility >> that that port is in use or blocked by a packet filter. > It's a container that's just created for that run, so there shouldn't be > anything else running. We also see that the server successfully starts: True, so the port was free. The postmaster wouldn't know whether there's a relevant packet filter though. The reason I'm harping on this is that it's hard to see why this postmaster start would fail before anything was actually done, when several previous starts on other ports behaved just fine. Actually ... looking closer at the failure log: [09:41:50.589] Checking plpgsql [09:41:50.608] (using postmaster on Unix socket, default port) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [09:41:50.608] ============== dropping database "pl_regression" ============== [09:41:52.731] psql: error: connection to server at "localhost" (::1), port 5432 failed: Connection refused (0x0000274D/10061) [09:41:52.731] Is the server running on that host and accepting TCP/IP connections? [09:41:52.731] connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused (0x0000274D/10061) [09:41:52.731] Is the server running on that host and accepting TCP/IP connections? pg_regress thinks the postmaster is listening to a Unix socket. Maybe it's *only* listening to a Unix socket. In any case, psql has very clearly not been told to use a Unix socket. Something is not wired up correctly there. regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-04 09:10:14 -0800, Andres Freund wrote: > > I find it a little suspicious that startcreate is merrily starting > > the postmaster on port 5432, without concern for the possibility > > that that port is in use or blocked by a packet filter. > > It's a container that's just created for that run, so there shouldn't be > anything else running. We also see that the server successfully starts: > > 2022-03-03 23:24:15.261 GMT [5504][postmaster] LOG: starting PostgreSQL 15devel, compiled by Visual C++ build 1929, 64-bit > 2022-03-03 23:24:15.270 GMT [5504][postmaster] LOG: listening on IPv6 address "::1", port 5432 > 2022-03-03 23:24:15.270 GMT [5504][postmaster] LOG: listening on IPv4 address "127.0.0.1", port 5432 > 2022-03-03 23:24:15.321 GMT [5808][startup] LOG: database system was shut down at 2022-03-03 23:24:15 GMT > 2022-03-03 23:24:15.341 GMT [5504][postmaster] LOG: database system is ready to accept connections Oh, huh. I added a printout of the running tasks, and it sure looks like postgres is not running anymore when plcheck is run, without anything ending up in the server log... It looks like the CI agent had some changes about stdin / stdout of the shells used to start scripts. The version between the last successful run and the first failing run changed. https://github.com/cirruslabs/cirrus-ci-agent/commits/master I wonder if we're missing some steps, at least on windows, to make pg_ctl start independent of the starting shell? Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-04 12:23:43 -0500, Tom Lane wrote: > Actually ... looking closer at the failure log: > > [09:41:50.589] Checking plpgsql > [09:41:50.608] (using postmaster on Unix socket, default port) > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > [09:41:50.608] ============== dropping database "pl_regression" ============== > [09:41:52.731] psql: error: connection to server at "localhost" (::1), port 5432 failed: Connection refused (0x0000274D/10061) > [09:41:52.731] Is the server running on that host and accepting TCP/IP connections? > [09:41:52.731] connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused (0x0000274D/10061) > [09:41:52.731] Is the server running on that host and accepting TCP/IP connections? > > pg_regress thinks the postmaster is listening to a Unix socket. > Maybe it's *only* listening to a Unix socket. In any case, > psql has very clearly not been told to use a Unix socket. > Something is not wired up correctly there. I saw that too, but I don't think it's the primary problem. The server is listening on ::1 as we know from the log, and quite evidently psql is trying to connect to that too. This output also looked this way before the failures. pg_regress outputs the above message when neither PGHOST, PGPORT or --host/--port are set. On windows that nevertheless ends up with connecting to localhost. Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes: > On 2022-03-04 12:23:43 -0500, Tom Lane wrote: >> pg_regress thinks the postmaster is listening to a Unix socket. > pg_regress outputs the above message when neither PGHOST, PGPORT or > --host/--port are set. On windows that nevertheless ends up with connecting to > localhost. Yeah, I just traced that down. pg_regress was not updated when libpq's behavior was changed for platform-varying DEFAULT_PGSOCKET_DIR. I'll go fix that, but you're right that it's cosmetic. regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-04 09:30:37 -0800, Andres Freund wrote: > I wonder if we're missing some steps, at least on windows, to make pg_ctl > start independent of the starting shell? Sure looks that way. On windows, if I do pg_ctl start, then hit ctrl-c, the server shuts down. Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
On 2022-03-04 09:46:44 -0800, Andres Freund wrote: > On 2022-03-04 09:30:37 -0800, Andres Freund wrote: > > I wonder if we're missing some steps, at least on windows, to make pg_ctl > > start independent of the starting shell? > > Sure looks that way. On windows, if I do pg_ctl start, then hit ctrl-c, the > server shuts down. Short term the easiest fix might be to start postgres for those tests as a service. But it seems we should fix whatever the cause of that terminal-connectedness behaviour is. I'm out for ~2-3h. I started a test run with using a service just now: https://cirrus-ci.com/task/5519573792325632 but I very well might have typoed something...
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-04 09:57:35 -0800, Andres Freund wrote: > On 2022-03-04 09:46:44 -0800, Andres Freund wrote: > > On 2022-03-04 09:30:37 -0800, Andres Freund wrote: > > > I wonder if we're missing some steps, at least on windows, to make pg_ctl > > > start independent of the starting shell? > > > > Sure looks that way. On windows, if I do pg_ctl start, then hit ctrl-c, the > > server shuts down. > > Short term the easiest fix might be to start postgres for those tests as a > service. But it seems we should fix whatever the cause of that > terminal-connectedness behaviour is. > > I'm out for ~2-3h. I started a test run with using a service just now: > https://cirrus-ci.com/task/5519573792325632 but I very well might have typoed > something... Seems to have worked for the first few tests at least. Unless somebody wants to clean up that commit and push it, I'll do so once I'm back. Perhaps pg_ctl needs to call FreeConsole() or such? https://docs.microsoft.com/en-us/windows/console/closing-a-console Or perhaps pg_ctl ought to pass CREATE_NEW_PROCESS_GROUP to CreateProcess()? The lack of a process group would explain why we're getting signalled on ctrl-c... Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Thomas Munro
Date:
On Sat, Mar 5, 2022 at 7:10 AM Andres Freund <andres@anarazel.de> wrote: > Or perhaps pg_ctl ought to pass CREATE_NEW_PROCESS_GROUP to CreateProcess()? > The lack of a process group would explain why we're getting signalled on > ctrl-c... I thought that sounded promising, given that the recent Cirrus agent commit you pointed to says "Always kill spawned shell's process group to avoid pipe FD hangs", and given that we do something conceptually similar on Unix. It doesn't seem to help, though... https://cirrus-ci.com/task/5572163880091648
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-04 09:57:35 -0800, Andres Freund wrote: > On 2022-03-04 09:46:44 -0800, Andres Freund wrote: > > On 2022-03-04 09:30:37 -0800, Andres Freund wrote: > > > I wonder if we're missing some steps, at least on windows, to make pg_ctl > > > start independent of the starting shell? > > > > Sure looks that way. On windows, if I do pg_ctl start, then hit ctrl-c, the > > server shuts down. > > Short term the easiest fix might be to start postgres for those tests as a > service. But it seems we should fix whatever the cause of that > terminal-connectedness behaviour is. > > I'm out for ~2-3h. I started a test run with using a service just now: > https://cirrus-ci.com/task/5519573792325632 but I very well might have typoed > something... That fixed the immediate problem, but dblink, postgres_fdw tests failed: https://api.cirrus-ci.com/v1/artifact/task/5519573792325632/log/contrib/dblink/regression.diffs FROM dblink(connection_parameters(),'SELECT * FROM foo') AS t(a int, b text, c text[]) WHERE t.a > 7; - a | b | c ----+---+------------ - 8 | i | {a8,b8,c8} - 9 | j | {a9,b9,c9} -(2 rows) - +ERROR: could not establish connection +DETAIL: connection to server at "localhost" (::1), port 5432 failed: FATAL: role "SYSTEM" does not exist https://api.cirrus-ci.com/v1/artifact/task/5519573792325632/log/contrib/postgres_fdw/regression.diffs +ERROR: could not connect to server "loopback" and it also seems to redirect logging to the event log without further configuration... The dblink and fdw failures can presumably be fixed by passing current_user as the username. That seems like a good idea anyway? Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes: > That fixed the immediate problem, but dblink, postgres_fdw tests failed: > +ERROR: could not establish connection > +DETAIL: connection to server at "localhost" (::1), port 5432 failed: FATAL: > role "SYSTEM" does not exist [ scratches head... ] Where is libpq getting that username from? Why is it different from whatever we'd determined during initdb? (Maybe a case-folding issue??) > The dblink and fdw failures can presumably be fixed by passing current_user as > the username. That seems like a good idea anyway? I don't think it's a good idea to hack that without understanding why it's suddenly going wrong. regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-05 09:29:01 +1300, Thomas Munro wrote: > On Sat, Mar 5, 2022 at 7:10 AM Andres Freund <andres@anarazel.de> wrote: > > Or perhaps pg_ctl ought to pass CREATE_NEW_PROCESS_GROUP to CreateProcess()? > > The lack of a process group would explain why we're getting signalled on > > ctrl-c... > > I thought that sounded promising, given that the recent Cirrus agent > commit you pointed to says "Always kill spawned shell's process group > to avoid pipe FD hangs", and given that we do something conceptually > similar on Unix. It doesn't seem to help, though... > > https://cirrus-ci.com/task/5572163880091648 I suspect one also needs the console detach thing. I don't really understand why start_postmaster() bothers to wrap postmaster start through cmd.exe, particularly when it prevents us from getting postmaster's pid. Also the caveats around cmd.exe and sharing mode. Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-04 16:51:32 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > That fixed the immediate problem, but dblink, postgres_fdw tests failed: > > +ERROR: could not establish connection > > +DETAIL: connection to server at "localhost" (::1), port 5432 failed: FATAL: > > role "SYSTEM" does not exist > > [ scratches head... ] Where is libpq getting that username from? > Why is it different from whatever we'd determined during initdb? > (Maybe a case-folding issue??) When running as a service (via pg_ctl register) the default username to run under appears to be SYSTEM. Which then differs from the user that vcregress.pl runs under. Trying to make it use the current user now - I don't know what permissions services are needed to run a service as a user or such. > > The dblink and fdw failures can presumably be fixed by passing current_user as > > the username. That seems like a good idea anyway? > > I don't think it's a good idea to hack that without understanding > why it's suddenly going wrong. I think I understand why - seems more a question of whether we want to support running tests against a server running as a different user. Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes: > I don't really understand why start_postmaster() bothers to wrap postmaster > start through cmd.exe, particularly when it prevents us from getting > postmaster's pid. Also the caveats around cmd.exe and sharing mode. I think the basic idea is to avoid having to write our own code to do the I/O redirections --- that's certainly why we use a shell on the Unix side. But it might well be that biting the bullet and handling redirection ourselves would be easier than coping with all these other problems. regards, tom lane
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
On 2022-03-04 14:04:44 -0800, Andres Freund wrote: > Trying to make it use the current user now - I don't know what > permissions services are needed to run a service as a user or such. My first attempt of using %USERNAME% failed: [22:10:20.862] c:\cirrus>call tmp_install\bin\pg_ctl.exe register -D tmp_check/db "-UContainerAdministrator" [22:10:20.889] pg_ctl: could not register service "PostgreSQL": error code 1057 seems to require a domain for some reason: [22:33:54.599] c:\cirrus>call tmp_install\bin\pg_ctl.exe register -Dtmp_check/db -U "User Manager\ContainerAdministrator" but that then doesn't start: [22:33:54.660] c:\cirrus>call net start PostgreSQL [22:33:55.887] System error 1068 has occurred. So it indeed seems hard to start a service as the current user. At least with my limited windows knowledge.
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Thomas Munro
Date:
On Sat, Mar 5, 2022 at 10:56 AM Andres Freund <andres@anarazel.de> wrote: > I suspect one also needs the console detach thing. I tried adding DETACHED_PROCESS (which should be like calling FreeConsole() in child process) and then I tried CREATE_NEW_CONSOLE instead, on top of CREATE_NEW_PROCESS_GROUP. Neither helped, though I lost the postmaster's output. Things I tried and their output are here: https://github.com/macdice/postgres/commits/hack3
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-05 13:21:26 +1300, Thomas Munro wrote: > On Sat, Mar 5, 2022 at 10:56 AM Andres Freund <andres@anarazel.de> wrote: > > I suspect one also needs the console detach thing. > > I tried adding DETACHED_PROCESS (which should be like calling > FreeConsole() in child process) and then I tried CREATE_NEW_CONSOLE > instead, on top of CREATE_NEW_PROCESS_GROUP. Neither helped, though I > lost the postmaster's output. I think the issue with the process group is real, but independent of the failing test.Locally just specifying CREATE_NEW_PROCESS_GROUP fixes the problem of a pg_ctl start'ed database being stopped after ctrl-c. I think CI failing is due to cirrus bug, assuming a bit too much similarity between unix and windows.... https://github.com/cirruslabs/cirrus-ci-agent/issues/218#issuecomment-1059657781 As indicated in the message above it, there's a workaround. Not sure if worth committing, if they were to fix it in a few days? cfbot could be repaired by just adding a repo environment variable of CIRRUS_AGENT_VERSION 1.73.2... OTOH, committing and reverting a single line + comment is cheap. Greetings, Andres Freund
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Thomas Munro
Date:
On Sat, Mar 5, 2022 at 4:19 PM Andres Freund <andres@anarazel.de> wrote: > https://github.com/cirruslabs/cirrus-ci-agent/issues/218#issuecomment-1059657781 Oof. Nice detective work. > As indicated in the message above it, there's a workaround. Not sure if worth > committing, if they were to fix it in a few days? cfbot could be repaired by > just adding a repo environment variable of CIRRUS_AGENT_VERSION 1.73.2... > OTOH, committing and reverting a single line + comment is cheap. I vote for committing that workaround into the tree temporarily, because it's not just cfbot, it's also everyone's dev branches on Github + the official mirror that are red.
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-05 16:39:21 +1300, Thomas Munro wrote: > On Sat, Mar 5, 2022 at 4:19 PM Andres Freund <andres@anarazel.de> wrote: > > https://github.com/cirruslabs/cirrus-ci-agent/issues/218#issuecomment-1059657781 > > Oof. Nice detective work. Thanks. > > As indicated in the message above it, there's a workaround. Not sure if worth > > committing, if they were to fix it in a few days? cfbot could be repaired by > > just adding a repo environment variable of CIRRUS_AGENT_VERSION 1.73.2... > > OTOH, committing and reverting a single line + comment is cheap. > > I vote for committing that workaround into the tree temporarily, > because it's not just cfbot, it's also everyone's dev branches on > Github + the official mirror that are red. I'll do so after making dinner, unless you want to do so sooner. It did fix the problem (intermixed with a few irrelevant changes): https://cirrus-ci.com/task/4928987829895168 - Andres
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
On 2022-03-04 20:06:43 -0800, Andres Freund wrote: > On 2022-03-05 16:39:21 +1300, Thomas Munro wrote: > > I vote for committing that workaround into the tree temporarily, > > because it's not just cfbot, it's also everyone's dev branches on > > Github + the official mirror that are red. > > I'll do so after making dinner, unless you want to do so sooner. It did fix > the problem (intermixed with a few irrelevant changes): https://cirrus-ci.com/task/4928987829895168 Pushed.
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andrew Dunstan
Date:
On 3/4/22 17:04, Andres Freund wrote: > Hi, > > On 2022-03-04 16:51:32 -0500, Tom Lane wrote: >> Andres Freund <andres@anarazel.de> writes: >>> That fixed the immediate problem, but dblink, postgres_fdw tests failed: >>> +ERROR: could not establish connection >>> +DETAIL: connection to server at "localhost" (::1), port 5432 failed: FATAL: >>> role "SYSTEM" does not exist >> [ scratches head... ] Where is libpq getting that username from? >> Why is it different from whatever we'd determined during initdb? >> (Maybe a case-folding issue??) > When running as a service (via pg_ctl register) the default username to run > under appears to be SYSTEM. Which then differs from the user that vcregress.pl > runs under. Trying to make it use the current user now - I don't know what > permissions services are needed to run a service as a user or such. SeServiceLogonRight is what the user needs to run the service. To manage it (e.g. stop or start it) they need some extra permissions, see for example <http://get-carbon.org/Grant-ServicePermission.html> cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
Hi, On 2022-03-04 22:01:22 -0800, Andres Freund wrote: > On 2022-03-04 20:06:43 -0800, Andres Freund wrote: > > On 2022-03-05 16:39:21 +1300, Thomas Munro wrote: > > > I vote for committing that workaround into the tree temporarily, > > > because it's not just cfbot, it's also everyone's dev branches on > > > Github + the official mirror that are red. > > > > I'll do so after making dinner, unless you want to do so sooner. It did fix > > the problem (intermixed with a few irrelevant changes): https://cirrus-ci.com/task/4928987829895168 > > Pushed. Cirrus now provides a way to get the old behaviour back without pinning an old agent version. See attached and a run passing the problematic steps [1]: The way this is intended to be done in cirrus (rather than preventing it from killing "escaped" processes) would be to use 'background_script' to run something longer running. Unfortunately that's surprisingly hard with our tooling, or maybe I'm just daft: 1) We don't have a way to wait for server startup to finish if we don't block on pg_ctl start. So we'd have to use 'sleep', write a loop around pq_isready. Perhaps pg_ctl should have an option to wait for server startup / shutdown without doing the starting/stopping itself? 2) There's no trivial way of starting postgres with pg_ctl and then waiting for the server to be shut down in the background script. The easiest would be to start psql just wait for it to be killed by an immediate shutdown :/. 3) We can't just start postgres in the foreground to get around 2), because pg_ctl does the dropping of permissions we need, rather than postgres itself. It'd also need 1). Seems like we're missing some fairly basic tooling somehow... Regards, Andres [1] https://cirrus-ci.com/task/5810196135018496
Attachment
Re: Regression tests failures on Windows Server 2019 - on master at commit # d816f366b
From
Andres Freund
Date:
On 2022-03-19 11:16:48 -0700, Andres Freund wrote: > Cirrus now provides a way to get the old behaviour back without pinning an old > agent version. See attached I've pushed that now. If we can come up with a better recipe for starting postgres in the background, cool. But until then...