Re: Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc?
Msg-id 20230823065833.vnnfwoi3n2sbw5kf@awork3.anarazel.de
In response to Cirrus-ci is lowering free CI cycles - what to do with cfbot, etc?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hi,

On 2023-08-07 19:15:41 -0700, Andres Freund wrote:
> As some of you might have seen when running CI, cirrus-ci is restricting how
> much CI cycles everyone can use for free (announcement at [1]). This takes
> effect September 1st.
>
> This obviously has consequences both for individual users of CI as well as
> cfbot.
>
> [...]

> Potential paths forward for individual CI:
>
> - migrate wholesale to another CI provider
>
> - split CI tasks across different CI providers, rely on github et al
>   displaying the CI status for different platforms
>
> - give up
>
>
> Potential paths forward for cfbot, in addition to the above:
>
> - Pay for compute / ask the various cloud providers to grant us compute
>   credits. At least some of the cloud providers can be used via cirrus-ci.
>
> - Host (some) CI runners ourselves. Particularly with macos and windows, that
>   could provide significant savings.

To make that possible, we need to make the compute resources for CI
configurable on a per-repository basis.  I experimented with a bunch of ways
to do that and got stuck on it for a while. But as of today we have
sufficient macos runners available for cfbot, so... I think the approach I
finally settled on is decent, although not great. It's described in the "main"
commit message:

    ci: Prepare to make compute resources for CI configurable

    cirrus-ci will soon restrict the amount of free resources every user gets (as
    have many other CI providers). For most users of CI that should not be an
    issue. But e.g. for cfbot it will be an issue.

    To allow configuring different resources on a per-repository basis, introduce
    infrastructure for overriding the task execution environment. Unfortunately
    this is not entirely trivial, as yaml anchors have to be defined before their
    use, and cirrus-ci only allows injecting additional contents at the end of
    .cirrus.yml.

    To deal with that, move the definition of the CI tasks to
    .cirrus.tasks.yml. The main .cirrus.yml is loaded first, then, if defined, the
    file referenced by the REPO_CI_CONFIG_GIT_URL variable, will be added,
    followed by the contents of .cirrus.tasks.yml. That allows
    REPO_CI_CONFIG_GIT_URL to override the yaml anchors defined in .cirrus.yml.

    Unfortunately git's default merge / rebase strategy does not handle copied
    files, just renamed ones. To avoid painful rebasing over this change, this
    commit just renames .cirrus.yml to .cirrus.tasks.yml, without adding a new
    .cirrus.yml. That's done in the followup commit, which moves the relevant
    portion of .cirrus.tasks.yml to .cirrus.yml.  Until that is done,
    REPO_CI_CONFIG_GIT_URL does not fully work.

    The subsequent commit adds documentation for how to configure custom compute
    resources to src/tools/ci/README

    Discussion: https://postgr.es/m/20230808021541.7lbzdefvma7qmn3w@awork3.anarazel.de
    Backpatch: 15-, where CI support was added
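
The load order described in the commit message can be illustrated with a
minimal Python sketch. This is not the patch's code and not how cirrus-ci
assembles its configuration internally; it only models the ordering. The
anchor name "linux_image" and the three YAML fragments are hypothetical. It
relies on the YAML rule that an alias refers to the most recent preceding
node with that anchor, so a definition injected between .cirrus.yml and
.cirrus.tasks.yml takes precedence over the default.

```python
import re


def assemble_config(main_yml, repo_override_yml, tasks_yml):
    """Concatenate fragments in the order the commit message describes:
    .cirrus.yml, then the optional per-repository file, then .cirrus.tasks.yml."""
    parts = [main_yml]
    if repo_override_yml is not None:  # REPO_CI_CONFIG_GIT_URL may be unset
        parts.append(repo_override_yml)
    parts.append(tasks_yml)
    return "\n".join(parts)


def anchor_in_effect(config_text, name):
    """Return the line holding the last '&name' definition that precedes
    the first '*name' alias, i.e. the definition the alias resolves to."""
    use = re.search(r"\*" + re.escape(name) + r"\b", config_text)
    if use is None:
        return None
    before = config_text[:use.start()]
    definition = re.compile(r"&" + re.escape(name) + r"\b")
    defs = [line.strip() for line in before.splitlines() if definition.search(line)]
    return defs[-1] if defs else None


# Hypothetical fragments, for illustration only.
MAIN = 'default_image: &linux_image "debian-default"'
OVERRIDE = 'override_image: &linux_image "custom-runner"'
TASKS = "task:\n  image: *linux_image"

with_override = assemble_config(MAIN, OVERRIDE, TASKS)
without_override = assemble_config(MAIN, None, TASKS)
```

With the override fragment present, anchor_in_effect(with_override,
"linux_image") returns the override line; without it, the default from the
main file stays in effect. The same reasoning shows why the task definitions
have to come last: an anchor has to be defined before any alias that uses it.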


I don't love moving most of the contents of .cirrus.yml into a new file, but I
don't see another way. I did implement it without that as well (see [1]), but
that ends up considerably harder to understand, and hardcodes what cfbot
needs.  Splitting the commit, as explained above, at least makes git rebase
fairly painless. FWIW, I did merge the changes into 15, with only reasonable
conflicts (due to new tasks, autoconf->meson).


A prerequisite commit converts "SanityCheck" and "CompilerWarnings" to use a
full VM instead of a container - that way providing custom compute resources
doesn't have to deal with containers in addition to VMs. It also looks like
the increased startup overhead is outweighed by the reduction in runtime
overhead.


I'm hoping to push this fairly soon, as I'll be on vacation the last week of
August. I'll be online intermittently though, so I can react if there are
issues (with very limited connectivity from midday Aug 29th to midday Aug
31st). I'd appreciate a quick review or two.


Greetings,

Andres Freund

[1] https://github.com/anarazel/postgres/commit/b95fd302161b951f1dc14d586162ed3d85564bfc
