Thread: Cirrus CI for macOS branches 16 and 15 broken

Cirrus CI for macOS branches 16 and 15 broken

From
Peter Eisentraut
Date:
The Cirrus CI for REL_16_STABLE and REL_15_STABLE for macOS is 
apparently broken right now.  Here is a log example:

[13:57:11.305] sh src/tools/ci/ci_macports_packages.sh \
[13:57:11.305]   ccache \
[13:57:11.305]   icu \
[13:57:11.305]   kerberos5 \
[13:57:11.305]   lz4 \
[13:57:11.305]   meson \
[13:57:11.305]   openldap \
[13:57:11.305]   openssl \
[13:57:11.305]   p5.34-io-tty \
[13:57:11.305]   p5.34-ipc-run \
[13:57:11.305]   python312 \
[13:57:11.305]   tcl \
[13:57:11.305]   zstd
[13:57:11.325] macOS major version: 14
[13:57:12.554] MacPorts package URL: https://github.com/macports/macports-base/releases/download/v2.9.3/MacPorts-2.9.3-14-Sonoma.pkg
[14:01:37.252] installer: Package name is MacPorts
[14:01:37.252] installer: Installing at base path /
[14:01:37.252] installer: The install was successful.
[14:01:37.257]
[14:01:37.257] real    4m23.837s
[14:01:37.257] user    0m0.385s
[14:01:37.257] sys    0m0.339s
[14:01:37.282] macportsuser root
[14:01:37.431] Error: /opt/local/bin/port: Failed to initialize MacPorts, sqlite error: attempt to write a readonly database (8) while executing query: CREATE INDEX registry.snapshot_file_id ON snapshot_files(id)
[14:01:37.599] Error: No ports matched the given expression


REL_17_STABLE and up are working.

I know there have been some changes recently to manage the OS version 
change.  Are these older branches expected to work?



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Tomas Vondra
Date:
On 8/18/24 16:07, Peter Eisentraut wrote:
> The Cirrus CI for REL_16_STABLE and REL_15_STABLE for macOS is
> apparently broken right now.  Here is a log example:
> 
> [13:57:11.305] sh src/tools/ci/ci_macports_packages.sh \
> [13:57:11.305]   ccache \
> [13:57:11.305]   icu \
> [13:57:11.305]   kerberos5 \
> [13:57:11.305]   lz4 \
> [13:57:11.305]   meson \
> [13:57:11.305]   openldap \
> [13:57:11.305]   openssl \
> [13:57:11.305]   p5.34-io-tty \
> [13:57:11.305]   p5.34-ipc-run \
> [13:57:11.305]   python312 \
> [13:57:11.305]   tcl \
> [13:57:11.305]   zstd
> [13:57:11.325] macOS major version: 14
> [13:57:12.554] MacPorts package URL: https://github.com/macports/macports-base/releases/download/v2.9.3/MacPorts-2.9.3-14-Sonoma.pkg
> [14:01:37.252] installer: Package name is MacPorts
> [14:01:37.252] installer: Installing at base path /
> [14:01:37.252] installer: The install was successful.
> [14:01:37.257]
> [14:01:37.257] real    4m23.837s
> [14:01:37.257] user    0m0.385s
> [14:01:37.257] sys    0m0.339s
> [14:01:37.282] macportsuser root
> [14:01:37.431] Error: /opt/local/bin/port: Failed to initialize MacPorts, sqlite error: attempt to write a readonly database (8) while executing query: CREATE INDEX registry.snapshot_file_id ON snapshot_files(id)
> [14:01:37.599] Error: No ports matched the given expression
> 
> 
> REL_17_STABLE and up are working.
> 

Are they? I see exactly the same failure on all branches, including 17
and master. For example, this one is on master:

https://cirrus-ci.com/task/5918517050998784

regards

-- 
Tomas Vondra



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Thomas Munro
Date:
On Mon, Aug 19, 2024 at 2:07 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> [14:01:37.431] Error: /opt/local/bin/port: Failed to initialize
> MacPorts, sqlite error: attempt to write a readonly database (8) while
> executing query: CREATE INDEX registry.snapshot_file_id ON
> snapshot_files(id)

Hmmm.  Basically there is a loop-back disk device that gets cached
between runs (same technique as ccache), on which macports is
installed.  This makes it ready to test stuff fast, with all the
dependencies ready and being updated only when they need to be
upgraded.  It is both clever and scary due to the path dependency...
(Cf other OSes, where we have a base image with all the right packages
installed already, no "memory" between runs like that.)

The macOS major version and hash of the MacPorts package install
script are in the cache key for that (see 64c39bd5), so a change to
that script would make a totally fresh installation, and hopefully
work.  I will look into that, but it would also be nice to understand
how it got itself into that state so we can avoid it...
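
As a rough illustration of the cache-key idea described above, here is a
minimal sketch.  The function name, the key format and the use of cksum are
all assumptions made for this example; it is not the actual Cirrus
configuration from 64c39bd5.

```sh
# Hypothetical helper: derive a cache key from the OS major version and
# the contents of the package install script.  Any change to either one
# produces a different key, and therefore a cache miss.
macports_cache_key() {
    os_major=$1
    script=$2
    # With input on stdin, cksum prints "<checksum> <size>"; join the two.
    digest=$(cksum < "$script" | awk '{ print $1 "-" $2 }')
    printf 'macports-%s-%s\n' "$os_major" "$digest"
}
```

On a macOS runner the first argument could come from something like
`sw_vers -productVersion | cut -d. -f1`, and the second would be the path to
ci_macports_packages.sh.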

> I know there have been some changes recently to manage the OS version
> change.  Are these older branches expected to work?

Yes.



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Thomas Munro
Date:
On Mon, Aug 19, 2024 at 7:52 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> The macOS major version and hash of the MacPorts package install
> script are in the cache key for that (see 64c39bd5), so a change to
> that script would make a totally fresh installation, and hopefully
> work.  I will look into that, but it would also be nice to understand
> how it got itself into that state so we can avoid it...

Oh, it already is a cache miss and thus a fresh installation, in
Tomas's example.  I can reproduce that in my own GitHub account by
making a trivial change to ci_macports_packages.sh so I get a cache
miss too.  It appears to install macports just fine, and then a later
command fails in MacPorts' sqlite package registry database, "attempt
to write a readonly database".  At a wild guess, what has changed here
to trigger this new condition is that MacPorts has noticed a new
stable release of itself available and taken some new code path
related to upgrading.  No idea why it thinks its package database is
read-only, though... looking...



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Oh, it already is a cache miss and thus a fresh installation, in
> Tomas's example.  I can reproduce that in my own GitHub account by
> making a trivial change to ci_macports_packages.sh so I get a cache
> miss too.  It appears to install macports just fine, and then a later
> command fails in MacPorts' sqlite package registry database, "attempt
> to write a readonly database".  At a wild guess, what has changed here
> to trigger this new condition is that MacPorts has noticed a new
> stable release of itself available and taken some new code path
> related to upgrading.  No idea why it thinks its package database is
> read-only, though... looking...

Indeed, MacPorts seems to have recently put out a 2.10.1 release.
This is not specific to the CI installation though.  What I saw on
my laptop, following my usual process for a MacPorts update, was:

$ sudo port -v selfupdate
... reported installing 2.10.1 ...
$ port outdated  # to see what will be upgraded
... failed with "write a readonly database" error!
$ sudo port upgrade outdated
... it's busily rebuilding a pile o' stuff ...

I didn't think to try it, but I bet "sudo port outdated" would
have worked.  I'm also betting that something in the CI update
recipe is taking the same shortcut of omitting "sudo".  That
works in the normal case, but seemingly not after a MacPorts base
update.

            regards, tom lane



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> I still don't know what's happening.  In case it helps someone else
> see it, the error comes from "sudo port unsetrequested installed".
> But in any case, switching to 2.10.1 seems to do the trick.  See
> attached.

Interesting.  Now that I've finished "sudo port upgrade outdated",
my laptop is back to a state where unprivileged "port outdated"
is successful.

What this smells like is that MacPorts has to do some kind of database
update as a result of its major version change, and there are code
paths that are not expecting that to get invoked.  It makes sense
that unprivileged "port outdated" would fail to perform the database
update, but not quite as much for "sudo port unsetrequested installed"
to fail.  That case seems like a MacPorts bug; maybe worth filing?

            regards, tom lane



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Thomas Munro
Date:
On Mon, Aug 19, 2024 at 10:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > I still don't know what's happening.  In case it helps someone else
> > see it, the error comes from "sudo port unsetrequested installed".
> > But in any case, switching to 2.10.1 seems to do the trick.  See
> > attached.
>
> Interesting.  Now that I've finished "sudo port upgrade outdated",
> my laptop is back to a state where unprivileged "port outdated"
> is successful.
>
> What this smells like is that MacPorts has to do some kind of database
> update as a result of its major version change, and there are code
> paths that are not expecting that to get invoked.  It makes sense
> that unprivileged "port outdated" would fail to perform the database
> update, but not quite as much for "sudo port unsetrequested installed"
> to fail.  That case seems like a MacPorts bug; maybe worth filing?

Huh.  Right, interesting theory.  OK, I'll push that patch to use
2.10.1 anyway, and report what we observed to see what they say.

It's funny that when I had an automatic "pick latest" thing, it broke
on their beta release, but when I pinned it to 2.9.3, it broke when
they made a new stable release anyway.  A middle way would be to use a
pattern that skips alpha/beta/etc...
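
A minimal sketch of such a pattern, assuming release tags look like the
"v2.9.3" seen in the package URL in the log and that pre-releases carry a
suffix after the version number.  The function name is hypothetical, and the
tag list would have to come from somewhere else (it is read from stdin here).

```sh
# Hypothetical helper: read release tags on stdin, one per line, and print
# the highest one that is a plain stable version (vX.Y or vX.Y.Z).
# Anything with a suffix, such as -beta1 or -rc1, is filtered out.
latest_stable_tag() {
    grep -E '^v[0-9]+(\.[0-9]+)+$' |
        sort -t. -k1.2,1n -k2,2n -k3,3n |
        tail -n 1
}
```

The sort keys compare each dotted component numerically, so 2.10.x sorts
after 2.9.x rather than before it.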



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Mon, Aug 19, 2024 at 10:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What this smells like is that MacPorts has to do some kind of database
>> update as a result of its major version change, and there are code
>> paths that are not expecting that to get invoked.  It makes sense
>> that unprivileged "port outdated" would fail to perform the database
>> update, but not quite as much for "sudo port unsetrequested installed"
>> to fail.  That case seems like a MacPorts bug; maybe worth filing?

> Huh.  Right, interesting theory.  OK, I'll push that patch to use
> 2.10.1 anyway, and report what we observed to see what they say.

Actually, it's a bug that it's trying to force an upgrade on us, isn't
it?  Or does the CI recipe include something that asks for that?

            regards, tom lane



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Tom Lane
Date:
I wrote:
> Interesting.  Now that I've finished "sudo port upgrade outdated",
> my laptop is back to a state where unprivileged "port outdated"
> is successful.

I confirmed on another machine that, immediately after "sudo port
selfupdate" from 2.9.3 to 2.10.1, I get

$ port outdated
sqlite error: attempt to write a readonly database (8) while executing query: CREATE INDEX registry.snapshot_file_id ON snapshot_files(id)

but if I do "sudo port outdated", I get the right thing:

$ sudo port outdated
The following installed ports are outdated:
bash                           5.2.26_0 < 5.2.32_0
bind9                          9.18.27_0 < 9.20.0_3
... etc etc ...

and then once I've done that, unprivileged "port outdated" works
again:

$ port outdated
The following installed ports are outdated:
bash                           5.2.26_0 < 5.2.32_0
bind9                          9.18.27_0 < 9.20.0_3
... yadda yadda ...

So there's definitely some behind-the-back updating going on there.
I'm not sure why the CI script should trigger that though.  It
does do a couple of "port" calls without "sudo", but not in places
where the state should be only partially upgraded.

            regards, tom lane



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Thomas Munro
Date:
On Mon, Aug 19, 2024 at 12:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not sure why the CI script should trigger that though.  It
> does do a couple of "port" calls without "sudo", but not in places
> where the state should be only partially upgraded.

Oooh, I think I see where we missed a sudo:

if [ -n "$(port -q installed installed)" ] ; then
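
A sketch of what running that check through sudo could look like.  The
wrapper function name is made up for this example, and whether this call is
really the one that trips over the registry upgrade is not established here.

```sh
# Hypothetical wrapper: succeed if MacPorts reports at least one installed
# port.  The query goes through sudo so that any registry update MacPorts
# decides to perform is not attempted as an unprivileged user.
has_installed_ports() {
    [ -n "$(sudo port -q installed installed)" ]
}
```

The condition in the script would then read "if has_installed_ports ; then".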



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Mon, Aug 19, 2024 at 12:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm not sure why the CI script should trigger that though.  It
>> does do a couple of "port" calls without "sudo", but not in places
>> where the state should be only partially upgraded.

> Oooh, I think I see where we missed a sudo:

> if [ -n "$(port -q installed installed)" ] ; then

I wondered about that too, but you should still have a plain 2.9.3
installation at that point.  AFAICT you'd only be at risk between

    sudo port selfupdate
    sudo port upgrade outdated

and there's nothing but a comment there.

            regards, tom lane



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Peter Eisentraut
Date:
On 19.08.24 01:44, Thomas Munro wrote:
> On Mon, Aug 19, 2024 at 10:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thomas Munro <thomas.munro@gmail.com> writes:
>>> I still don't know what's happening.  In case it helps someone else
>>> see it, the error comes from "sudo port unsetrequested installed".
>>> But in any case, switching to 2.10.1 seems to do the trick.  See
>>> attached.
>>
>> Interesting.  Now that I've finished "sudo port upgrade outdated",
>> my laptop is back to a state where unprivileged "port outdated"
>> is successful.
>>
>> What this smells like is that MacPorts has to do some kind of database
>> update as a result of its major version change, and there are code
>> paths that are not expecting that to get invoked.  It makes sense
>> that unprivileged "port outdated" would fail to perform the database
>> update, but not quite as much for "sudo port unsetrequested installed"
>> to fail.  That case seems like a MacPorts bug; maybe worth filing?
> 
> Huh.  Right, interesting theory.  OK, I'll push that patch to use
> 2.10.1 anyway, and report what we observed to see what they say.

REL_15_STABLE is fixed now.  REL_16_STABLE now fails with another thing:

[08:57:02.082] export PKG_CONFIG_PATH="/opt/local/lib/pkgconfig/"
[08:57:02.082] meson setup \
[08:57:02.082]   --buildtype=debug \
[08:57:02.082]   -Dextra_include_dirs=/opt/local/include \
[08:57:02.082]   -Dextra_lib_dirs=/opt/local/lib \
[08:57:02.083]   -Dcassert=true \
[08:57:02.083]   -Duuid=e2fs -Ddtrace=auto \
[08:57:02.083]   -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
[08:57:02.083]   build
[08:57:02.097] /var/folders/0n/_7v_bpwd1w71f8l0b0kjw5br0000gn/T/scriptscbc157a91c26ed806bb3701f4d85e91d.sh: line 6: meson: command not found
[08:57:03.078]
[08:57:03.078] Exit status: 127




Re: Cirrus CI for macOS branches 16 and 15 broken

From
Thomas Munro
Date:
On Wed, Aug 21, 2024 at 9:04 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> REL_15_STABLE is fixed now.  REL_16_STABLE now fails with another thing:
...
> line 6: meson: command not found

Huh.  I don't see that in my own account.  And postgres/postgres is
currently out of juice until the end of the month (I know why,
something to fix, a topic for a different forum).  Can you share the
link?



Re: Cirrus CI for macOS branches 16 and 15 broken

From
Peter Eisentraut
Date:
On 21.08.24 11:28, Thomas Munro wrote:
> On Wed, Aug 21, 2024 at 9:04 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>> REL_15_STABLE is fixed now.  REL_16_STABLE now fails with another thing:
> ...
>> line 6: meson: command not found
> 
> Huh.  I don't see that in my own account.  And postgres/postgres is
> currently out of juice until the end of the month (I know why,
> something to fix, a topic for a different forum).  Can you share the
> link?

It fixed itself after I cleaned the task caches.