Thread: ci: Macos failures due to MacPorts behaviour change

ci: Macos failures due to MacPorts behaviour change

From: Andres Freund

Hi,

I noticed that CI tasks on the postgres GitHub repo fail for some branches on
macOS [1].

Initially I thought the problem was related to some outdated cache of the
macports installation. And indeed clearing that out does fix the issue, but
only temporarily: fixing one branch would cause some others to fail.

A fair bit of debugging later I realized that the problem is due to
src/tools/ci/ci_macports_packages.sh not actually installing missing packages
unless starting without a cache.

The cache key for the macports installation currently does not include the
major version. However, the md5 of src/tools/ci/ci_macports_packages.sh is
included in the cache key, and branches 17 and later have a typo fix in that
script, leading to a different cache key.

Whenever the cache was built with 15, 16 would fail, because gmake was
installed but not meson. And vice versa. It gets worse, see further down.


While we could fix the issue by including the major version in the key, that'd
only be a partial fix, because the goal is to be able to adjust the list of
packages.

The reason src/tools/ci/ci_macports_packages.sh does not "incrementally"
install new packages anymore is that port setrequested changed its error
behaviour:

port setrequested $package errors out if $package is not installed. In the
past this was also true if multiple packages were passed in. However, now an
error is only raised if the first package is not installed:

andres@m4-dev ~ % sudo port -v setrequested bzip2 libiconv non-existing-package; echo $?
Setting requested flag for bzip2 @1.0.8_0 to 1
Setting requested flag for libiconv @1.17_0 to 1
0

andres@m4-dev ~ % sudo port -v setrequested non-existing-package bzip2 libiconv; echo $?
Error: non-existing-package is not installed
1

This means that src/tools/ci/ci_macports_packages.sh would only enter the
install-a-missing-package path if the first mentioned package was missing. As
the first package did not change between 15 and 16, we'd never install the
missing gmake/meson.
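
A minimal sketch of that failure mode, assuming a stand-in function instead of
the real port command. fake_setrequested, is_installed, batch_needs_install
and the INSTALLED list are hypothetical names used only for illustration; the
stand-in copies the exit-status behaviour shown in the transcript above.

```shell
#!/bin/sh
# Assumed set of installed ports, for illustration only.
INSTALLED="bzip2 libiconv gmake"

# Succeeds if package $1 is in the INSTALLED list.
is_installed() {
    case " $INSTALLED " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical stand-in for "port setrequested": as in the transcript,
# only the first package decides the exit status.
fake_setrequested() {
    is_installed "$1"
}

# Batch check: a single call for the whole package list.
batch_needs_install() {
    if fake_setrequested "$@"; then
        echo no
    else
        echo yes
    fi
}

batch_needs_install bzip2 libiconv meson   # prints "no", although meson is missing
batch_needs_install meson bzip2 libiconv   # prints "yes"
```

With a batch check like this, a missing package is only noticed when it
happens to be listed first.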

What's worse, because the remove-unneeded-packages path *does* work, we
eventually end up with a cache that has neither gmake nor meson installed
causing both 15 and 16 to fail.
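
The way the shared cache degrades can be sketched as a small simulation. This
is not the real script: CACHE, contains and run_branch are hypothetical names,
and the package lists ("ccache gmake" for 15, "ccache meson" for 16) are
assumed for illustration, with ccache standing in for the unchanged first
package.

```shell
#!/bin/sh
# Simulated contents of the shared cache, initially built by branch 15.
CACHE="ccache gmake"

# Succeeds if the space-separated list $1 contains the word $2.
contains() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Simulate one CI run that wants the packages given as arguments.
run_branch() {
    wanted="$*"
    # Broken install path: only taken when the first package is missing.
    if ! contains "$CACHE" "$1"; then
        CACHE="$wanted"
        return
    fi
    # Working remove path: drop everything that is not in the wanted list.
    kept=""
    for pkg in $CACHE; do
        if contains "$wanted" "$pkg"; then
            kept="$kept $pkg"
        fi
    done
    CACHE="${kept# }"
}

run_branch ccache meson   # branch 16: gmake is removed, meson is not installed
run_branch ccache gmake   # branch 15: gmake is not reinstalled either
echo "$CACHE"             # prints "ccache"
```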


The easiest fix I can see is to simply loop over the to-be-installed
packages and mark them as requested one-by-one. That's a few seconds slower,
but that's not too bad.  Anyone got a better idea?
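
A rough sketch of such a loop, again with a stand-in for the real command.
fake_setrequested, is_installed, missing_packages and the INSTALLED list are
hypothetical; in CI the stand-in would presumably be replaced by a call to
port setrequested with a single package.

```shell
#!/bin/sh
# Assumed set of installed ports, for illustration only.
INSTALLED="bzip2 libiconv gmake"

# Succeeds if package $1 is in the INSTALLED list.
is_installed() {
    case " $INSTALLED " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical stand-in for "port setrequested" called with one package.
fake_setrequested() {
    is_installed "$1"
}

# Check every package on its own and print the ones that are missing.
missing_packages() {
    missing=""
    for pkg in "$@"; do
        if ! fake_setrequested "$pkg"; then
            missing="$missing $pkg"
        fi
    done
    echo "${missing# }"
}

missing_packages bzip2 libiconv meson   # prints "meson"
```

Because each package gets its own call, the position of a missing package in
the list no longer matters.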

I'll try to report a bug to macports, but I suspect we ought to fix this for
CI before this is addressed via a new macports release.

Greetings,

Andres Freund

[1] https://cirrus-ci.com/github/postgres/postgres/



Re: ci: Macos failures due to MacPorts behaviour change

From: Andres Freund

Hi,

On 2024-11-21 14:24:26 +1300, Thomas Munro wrote:
> Oh, and yeah, we should include the branch name in the cache key.
> Something like the attached.

I think that'd be too granular - we'd end up with lots of copies of
effectively the same cache, but which won't be exactly the same due to timestamps
and such.


> I guess the alternative would be to set the package list the same
> across all branches, even though they need different stuff, so they
> could share the same cache without fighting over it?

I don't think that'd work well either, imagine adding a new package to the
list...

The right approach probably is to include the list of packages in the key. A
bit annoying to change, because we'd need to move the list of packages to an
environment variable or file, but doable. I think something like

env:
  MACOS_PACKAGE_LIST: >-
    ccache
    icu
...
    fingerprint_script: |
      ...
      echo $MACOS_PACKAGE_LIST
    ...
  setup_additional_packages_script: |
    sh src/tools/ci/ci_macports_packages.sh $MACOS_PACKAGE_LIST

should work?
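
To illustrate why keying on the package list lets branches with identical
lists share a cache, here is a small sketch. cache_key is a hypothetical
helper that uses cksum as a stand-in hash; the real key would be computed by
the CI system from the output of fingerprint_script.

```shell
#!/bin/sh
# Hypothetical helper: derive a key from the package list alone.
cache_key() {
    # Put one package per line so that spacing differences do not matter,
    # then keep only the checksum field of the cksum output.
    printf '%s\n' "$*" | tr -s ' ' '\n' | cksum | cut -d ' ' -f 1
}

# Assumed package lists, for illustration only.
key_a=$(cache_key ccache icu meson)
key_b=$(cache_key ccache icu meson)
key_c=$(cache_key ccache icu gmake)

[ "$key_a" = "$key_b" ] && echo "same list: shared cache"
[ "$key_a" != "$key_c" ] && echo "different list: separate cache"
```

Adding or removing a package changes the key, so a fresh cache gets built
instead of an old one being reused with the wrong contents.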


> For some reason CI is not allowing me to
> see the output from macOS right now (?!) so I couldn't see what
> "Populate macports cache" printed out[1], but I think this should be
> right... will try again tomorrow.

I can see it for your link at the moment, fwiw.

Greetings,

Andres Freund



Re: ci: Macos failures due to MacPorts behaviour change

From: Andres Freund

Hi,

On 2024-11-27 16:33:02 +0300, Nazir Bilal Yavuz wrote:
> I think this is a nice solution and it worked successfully [1]. Now,
> REL_[17 | 16]_* and master branches use the same cache which is
> different from the REL_15_STABLE branch's cache.
> 
> In case you want to continue with this, the patches are attached. I
> merged 'using a loop in the install script' from Thomas' patch and the
> change above.

Thanks!  I added one comment and pushed it after giving it a spin myself.

Greetings,

Andres