Re: Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc?
Date
Msg-id 20230823192239.jxew5s3sjru63lio@awork3.anarazel.de
Whole thread Raw
In response to Re: Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc?  (Daniel Gustafsson <daniel@yesql.se>)
Responses Re: Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc?
List pgsql-hackers
Hi,

On 2023-08-23 14:48:26 +0200, Daniel Gustafsson wrote:
> > On 23 Aug 2023, at 08:58, Andres Freund <andres@anarazel.de> wrote:
>
> > I'm hoping to push this fairly soon, as I'll be on vacation the last week of
> > August. I'll be online intermittently though, if there are issues, I can react
> > (very limited connectivity for middday Aug 29th - midday Aug 31th though). I'd
> > appreciate a quick review or two.
>
> I've been reading over these and the thread, and while not within my area of
> expertise, nothing really sticks out.

Thanks!


> I'll do another pass, but below are a few small comments so far.
>
> I don't know Windows to know the implications, but should the below file have
> some sort of warning about not doing that for production/shared systems, only
> for dedicated test instances?

Ah, I should have explained that: I'm not planning to apply
- regress: Check for postgres startup completion more often
- ci: windows: Disabling write cache flushing during test
right now. Compared to the other patches the wins are much smaller and/or more
work is needed to make them good.

I think it might be worth going for
- ci: switch tasks to debugoptimized build
because that provides a fair bit of gain. But it might be more hurtful than
helpful due to costing more when ccache doesn't work...


> +++ b/src/tools/ci/windows_write_cache.ps1
> @@ -0,0 +1,20 @@
> +# Define the write cache to be power protected. This reduces the rate of cache
> +# flushes, which seems to help metadata heavy workloads on NTFS. We're just
> +# testing here anyway, so ...
> +#
> +# Let's do so for all disks, this could be useful beyond cirrus-ci.
>
> One thing in 0010 caught my eye, and while not introduced in this patchset it
> might be of interest here.  In the below hunks we loop X ticks around
> system(psql), with the loop assuming the server can come up really quickly and
> sleeping if it doesn't.  On my systems I always reach the pg_usleep after
> failing the check, but if I reverse the check such it first sleeps and then
> checks I only need to check once instead of twice.

I think there's more effective ways to make this cheaper. The basic thing
would be to use libpq instead of forking of psql to make a connection
check. Medium term, I think we should invent a way for pg_ctl and other
tooling (including pg_regress) to wait for the service to come up. E.g. having
a named pipe that postmaster opens once the server is up, which should allow
multiple clients to use select/epoll/... to wait for it without looping.

ISTM making pg_regress use libpq w/ PQping() should be a pretty simple patch?
The non-polling approach obviously is even better, but also requires more
thought (and documentation and ...).


> @@ -2499,7 +2502,7 @@ regression_main(int argc, char *argv[],
>          else
>              wait_seconds = 60;
>
> -        for (i = 0; i < wait_seconds; i++)
> +        for (i = 0; i < wait_seconds * WAITS_PER_SEC; i++)
>          {
>              /* Done if psql succeeds */
>              fflush(NULL);
> @@ -2519,7 +2522,7 @@ regression_main(int argc, char *argv[],
>                       outputdir);
>              }
>
> -            pg_usleep(1000000L);
> +            pg_usleep(1000000L / WAITS_PER_SEC);
>          }
>          if (i >= wait_seconds)
>          {
>
> It's a micro-optimization, but if we're changing things here to chase cycles it
> might perhaps be worth doing?

I wouldn't quite call not waiting for 1s for the server to start, when it does
so within a few ms, chasing cycles ;). For short tests it's a substantial
fraction of the overall runtime...

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Make all Perl warnings fatal
Next
From: Peter Eisentraut
Date:
Subject: Make documentation builds reproducible