Re: CI and test improvements - Mailing list pgsql-hackers

From Andres Freund
Subject Re: CI and test improvements
Date
Msg-id 20221002215123.6ejlhqr2y3at4efw@awork3.anarazel.de
In response to Re: CI and test improvements  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
Hi,

On 2022-10-02 16:35:06 -0500, Justin Pryzby wrote:
> Maybe - that would avoid waiting 4 minutes for a windows instance to
> start in the (hopefully atypical) case of a patch that fails in 1-2
> minutes under linux/freebsd.
> 
> If the patch were completely broken, the windows task would take ~4min
> to start, plus up to ~4min before failing to compile or failing an early
> test.  6-8 minutes isn't nothing, but doesn't seem worth the added
> complexity.

Avoiding 6-8 minutes of wasted Windows time would, I think, allow us to crank
cfbot's concurrency up a notch or two.


> Also, this would mean that in the common case, the slowest task would be
> delayed until after the SanityCheck task instance starts, compiles, and
> runs some test :( Your best case is 32sec, but I doubt that's going to
> be typical.

Even the worst case isn't that bad: the uncached minimal build takes 67s.
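
To put rough numbers on this trade-off, here is a back-of-the-envelope sketch
in Python. The timings are the ones quoted in this thread (6-8 minutes of
Windows time for a broken patch, 32s to 67s for the sanity check). The failure
rate is a purely hypothetical parameter, not a measured cfbot figure, and the
function names are made up for illustration.

```python
def expected_windows_minutes_saved(failure_rate, wasted_min=6.0, wasted_max=8.0):
    """Expected Windows minutes saved per CI run if broken patches never
    reach the Windows task.

    failure_rate is a hypothetical fraction of runs that fail early on
    linux/freebsd; wasted_min/wasted_max are the 6-8 minutes quoted above.
    """
    midpoint = (wasted_min + wasted_max) / 2
    return failure_rate * midpoint


def added_latency_seconds(cached=True):
    """Extra delay before the slow tasks start: 32s in the best (cached)
    case, 67s for an uncached minimal build, as quoted in the thread."""
    return 32 if cached else 67


if __name__ == "__main__":
    for rate in (0.1, 0.25, 0.5):  # hypothetical failure rates
        saved = expected_windows_minutes_saved(rate)
        print(
            f"failure rate {rate:.0%}: ~{saved:.2f} Windows minutes saved per run, "
            f"for {added_latency_seconds(True)}-{added_latency_seconds(False)}s "
            f"of added latency"
        )
```

Whether the saving is worth the added latency depends entirely on the real
failure rate, which this sketch does not know.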


> I was thinking about the idea of cfbot handling "tasks" separately,
> similar to what it used to do with travis/appveyor.  The logic for
> "windows tasks are only run if linux passes tests" could live there.

I don't really see the advantage of doing that over just increasing
concurrency by a bit.


> > +    # no options enabled, should be small
> > +    CCACHE_MAXSIZE: "150M"
> 
> Actually, tasks can share caches if the "cache key" is set.
> If there was a separate "Sanity" task, I think it should use whatever
> flags linux (or freebsd) use to avoid doing two compilations (with lots
> of cache misses for patches which modify *.h files, which would then
> happen twice, in serial).

I think the price of using exactly the same flags is higher than the gain. It
will also rarely work if we use the container task for the sanity check, as the
timestamps of the compiler, system headers, etc. will be different.
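
The point about compiler timestamps can be illustrated with a deliberately
simplified model of a compiler-cache key. This is a sketch, not ccache's
actual algorithm: it only assumes that the key covers the source, the flags,
and the compiler's size and modification time (which is roughly what ccache's
default `compiler_check = mtime` setting takes into account). The function
name and the sample values are hypothetical.

```python
import hashlib


def cache_key(source: bytes, flags: list[str],
              compiler_size: int, compiler_mtime: int) -> str:
    """Simplified stand-in for a compiler-cache key (NOT ccache's real hash)."""
    h = hashlib.sha256()
    h.update(source)
    h.update(b"\0")
    h.update("\0".join(flags).encode())
    h.update(b"\0")
    # Identify the compiler by size and mtime, not by its contents.
    h.update(f"{compiler_size}:{compiler_mtime}".encode())
    return h.hexdigest()


if __name__ == "__main__":
    src = b"int main(void) { return 0; }\n"
    flags = ["-O2", "-Wall"]

    # Same compiler binary size, but installed at different times in a
    # VM image and in a container image (made-up values).
    key_vm = cache_key(src, flags, 1_000_000, 1_660_000_000)
    key_container = cache_key(src, flags, 1_000_000, 1_664_000_000)

    print("keys match:", key_vm == key_container)
```

Under this model, identical flags are not enough for two tasks to share cache
entries: a differing compiler mtime alone changes every key.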


> > +  # use always: to continue after failures. Task that did not run count as a
> > +  # success, so we need to recheck SanityChecks's condition here ...
> 
> > -  # task that did not run, count as a success, so we need to recheck Linux'
> > -  # condition here ...
> 
> Another/better justification/description is that "cirrus warns if the
> depending task has different only_if conditions than the dependant task".

That doesn't really seem easier to understand to me.

Greetings,

Andres Freund


