Thread: CI, macports, darwin version problems
Hi,

Problem #1: we're still using Ventura, but Cirrus has started doing this:

    Only ghcr.io/cirruslabs/macos-runner:sonoma is allowed. Automatically upgraded.

It doesn't do it to cfbot, which runs macOS stuff on PGDG-hosted Mac Minis, but it does it to regular users who use free compute minutes tagged "instance:OSXCommunityInstance". This causes them to fail, because:

    [11:17:42.711] Error: Current platform "darwin 23" does not match expected platform "darwin 22"

Sure enough, the sysinfo task shows "... Darwin Kernel Version 23.5.0...", but for cfbot it's still 22.y.z. So probably it's time to change to macOS 14 AKA sonoma AKA darwin 23.

Problem #2: Once you do that with a simple s/ventura/sonoma/, it still "upgrades" to macos-runner:sonoma, which is not the same as macos-sonoma-base:latest. It has more versions of xcode installed? Not sure what else will break with that because I haven't successfully run it yet due to the next problem, but blind patch attached.

Problem #3: If you have a macports installation cached (eg for CI in your github account), then the pre-existing macports installation will be for the wrong darwin version (error shown above). So I think we need to teach src/tools/ci/ci_macports_packages.sh to detect that condition and do a clean install. I can look into that, but does anyone already know how to do it?

I know how to find out which darwin version is running:

    uname -r | sed 's/\..*//'

What I don't know is how to find the darwin version for a macports installation. I have found a terrible way to deduce it:

    sqlite3 /opt/local/var/macports/registry/registry.db "select max(os_major) from ports where os_major != 'any'"

But that's stupid. There must be a way to ask it what version it was installed for ... I think it's the variable macports::os_major[2] (which is written in TCL, a language I can't follow too well), but I can't figure out where it's reading it from.... I hope there is a text file under /opt/local or at worst a SQLite database, or a way to ask the port command to spit that number out or ask it if it thinks migration is necessary...

[1] https://github.com/cirruslabs/macos-image-templates/pkgs/container/macos-ventura-xcode
[2] https://github.com/macports/macports-base/blob/bf27e0c98c7443877e081d5f6b6
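The detection described under Problem #3 could be sketched roughly as follows. This is only an illustration, not the patch attached to the message: the function names `darwin_major_of` and `needs_clean_install` are hypothetical, and the values are fixed ones taken from the error message above so that the sketch runs anywhere.

```shell
#!/bin/sh
# Reduce a Darwin release string such as "23.5.0" to its major number,
# using the same sed expression quoted in the message.
darwin_major_of() {
    printf '%s\n' "$1" | sed 's/\..*//'
}

# Succeed (exit status 0) when a cached MacPorts tree built for major
# version $2 cannot be reused on a host running major version $1.
needs_clean_install() {
    [ "$1" != "$2" ]
}

# Illustration with fixed values from the thread. On a real runner,
# running_major would come from "$(uname -r)"; how to obtain cached_major
# from the installation is exactly the open question in the message.
running_major=$(darwin_major_of "23.5.0")
cached_major=22
if needs_clean_install "$running_major" "$cached_major"; then
    echo "cached MacPorts is for darwin $cached_major, host is darwin $running_major: reinstall"
fi
```

Keeping the comparison in its own function means the source of the cached version (the registry query, or something cleaner) can be swapped without touching the decision logic.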
Attachment
Thomas Munro <thomas.munro@gmail.com> writes: > I know how to find out which darwin version is running: uname -r | sed > 's/\..*//'. What I don't know is how to find the darwin version for a > macports installation. "port platform"? regards, tom lane
On Wed, Jun 26, 2024 at 12:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > > I know how to find out which darwin version is running: uname -r | sed > > 's/\..*//'. What I don't know is how to find the darwin version for a > > macports installation. > > "port platform"? Thanks, that's exactly what I was looking for. But I thought of an easier way: instead of trying to do my own cache invalidation with shell script and duct tape, I can include the current OS major version in the cache key used to carry the macports directory between CI runs. Hopefully Cirrus's cache machinery is smart enough to age out the old stuff eventually. This seems to have the desired effect. I've registered this thread to see how cfbot likes this, and see if anyone sees a problem with switching to the "macos-runner:sonoma" image, or the cache key scheme. https://commitfest.postgresql.org/48/5076/ FTR there is a newer macOS release that recently came out, Sequoia aka macOS 15, but the image available to us for CI is marked beta so I figured we can wait a bit longer for that.
Attachment
Thomas Munro <thomas.munro@gmail.com> writes: > But I thought of an easier way: instead of trying to do my own cache > invalidation with shell script and duct tape, I can include the > current OS major version in the cache key used to carry the > macports directory between CI runs. Hopefully Cirrus's cache machinery > is smart enough to age out the old stuff eventually. Sounds reasonable. > FTR there is a newer macOS release that recently came out, Sequoia aka > macOS 15, but the image available to us for CI is marked beta so I > figured we can wait a bit longer for that. Indeed not; that's only beta and will be so till September-ish. We don't really want to touch it yet because of this issue: https://www.postgresql.org/message-id/flat/CAMBWrQnEwEJtgOv7EUNsXmFw2Ub4p5P%2B5QTBEgYwiyjy7rAsEQ%40mail.gmail.com I'm not sure what the resolution of that will be, but we surely don't want to gate CI improvement on that. regards, tom lane
On Wed, Jun 26, 2024 at 4:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > > But I thought of an easier way: instead of trying to do my own cache > > invalidation with shell script and duct tape, I can include the > > current OS major version in the cache key used to carry the > > macports directory between CI runs. Hopefully Cirrus's cache machinery > > is smart enough to age out the old stuff eventually. > > Sounds reasonable. cfbot didn't like v2. It seems that github accounts using "instance:OSXCommunityInstance" are forced to use ghcr.io/cirruslabs/macos-runner:sonoma no matter what you ask for (example: [1]), while accounts configured to use user-supplied runners like the Mac Minis that cfbot is using *can't* use ghcr.io/cirruslabs/macos-runner:sonoma, and fail (example: [2]). I don't know why. So I think we should request ghcr.io/cirruslabs/macos-sonoma-base:latest. Personal github accounts will use macos-runner:sonoma instead, but at least it's the same OS release. Here's a new version like that, to see if cfbot likes it. Given that the OS release affects the macports_url we have to specify, I think this either means that we'll have to stay in sync with whatever macOS version is being forced for "instance:OSXCommunityInstance" users, or construct the macports_url automatically. Here is an attempt at the latter, as a second patch. Seems to work OK. For example, the setup_additional_packages step currently prints out: [06:23:08.584] macOS major version: 14 [06:23:09.672] MacPorts package URL: https://github.com/macports/macports-base/releases/download/v2.9.3/MacPorts-2.9.3-14-Sonoma.pkg As for the difference between the two types of image, they're described at [3]. 
The -runner images seem to be part of a project for faster starting VMs[4], which sounds like a pretty good reason to want to standardise on images to make pre-started instances fungible but there is perhaps also potential for selecting different xcode versions. > > FTR there is a newer macOS release that recently came out, Sequoia aka > > macOS 15, but the image available to us for CI is marked beta so I > > figured we can wait a bit longer for that. > > Indeed not; that's only beta and will be so till September-ish. > We don't really want to touch it yet because of this issue: > > https://www.postgresql.org/message-id/flat/CAMBWrQnEwEJtgOv7EUNsXmFw2Ub4p5P%2B5QTBEgYwiyjy7rAsEQ%40mail.gmail.com > > I'm not sure what the resolution of that will be, but we surely > don't want to gate CI improvement on that. Urgh. Also we have to wait for MacPorts to make a release for Sequoia, which might involve lots of maintainers hunting stuff like that. (If Cirrus starts forcing people to use Sequoia before then, that'd be a problem.) [1] https://cirrus-ci.com/task/4747151899623424 [2] https://cirrus-ci.com/task/6601239016767488 [3] https://github.com/cirruslabs/macos-image-templates [4] https://cirrus-runners.app/blog/2024/04/11/optimizing-startup-time-of-cirrus-runners/
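Constructing the macports_url automatically, as described in the message above, might look something like this. It is a sketch under assumptions, not the second patch itself: the function names are hypothetical, the MacPorts version is passed in as a parameter, and only the two macOS releases discussed in the thread are mapped.

```shell
#!/bin/sh
# Map a macOS major version to its marketing name. Only the releases
# discussed in the thread are listed; anything else is rejected.
macos_codename() {
    case "$1" in
        13) echo "Ventura" ;;
        14) echo "Sonoma" ;;
        *)  return 1 ;;
    esac
}

# Build the package URL in the pattern printed by the
# setup_additional_packages step.
# $1 = MacPorts version, $2 = macOS major version
macports_pkg_url() {
    codename=$(macos_codename "$2") || return 1
    echo "https://github.com/macports/macports-base/releases/download/v$1/MacPorts-$1-$2-$codename.pkg"
}

echo "MacPorts package URL: $(macports_pkg_url 2.9.3 14)"
```

Failing for an unknown major version, rather than guessing a name, would make a forced upgrade to a release without a MacPorts package show up as an explicit error.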
Attachment
On Thu, Jun 27, 2024 at 6:32 PM Thomas Munro <thomas.munro@gmail.com> wrote: > So I think we should request > ghcr.io/cirruslabs/macos-sonoma-base:latest. Personal github accounts > will use macos-runner:sonoma instead, but at least it's the same OS > release. Here's a new version like that, to see if cfbot likes it. The first cfbot run of v3 was successful, but a couple of days later when retested it failed with the dreaded "Error: ShouldBeAtLeastOneLayer". (It also failed on Windows, just because master was temporarily broken, unrelated to any of this. Note also that the commit message created by cfbot now includes the patch version, making the test history easier to grok, thanks Jelte!) https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf/5076 One difference that jumps out is that the successful v3 run has label worker:jc-m2-1 (Mac hosted by Joe), and the failure has worker:pgx-m2-1 (Mac hosted by Christophe P). Is this a software version issue, ie need newer Tart to use that image, or could be a difficulty fetching the image? CCing our Mac Mini pool attendants. Temporary options include disabling pgx-m2-1 from the pool, or teaching .cirrus.task.yml to use Ventura for cfbot but Sonoma for anyone else's github account, but ideally we'd figure out why it's not working... This new information also invalidates my previous hypothesis, that the new "macos-runner:sonoma" image can't work on self-hosted Macs, because that was also on pgx-m2-1.
On 7/2/24 17:39, Thomas Munro wrote: > One difference that jumps out is that the successful v3 run has label > worker:jc-m2-1 (Mac hosted by Joe), and the failure has > worker:pgx-m2-1 (Mac hosted by Christophe P). Is this a software > version issue, ie need newer Tart to use that image, or could be a > difficulty fetching the image? CCing our Mac Mini pool attendants. How can I help? Do you need to know versions of some of the stuff on my mac mini? -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi,

On July 3, 2024 3:17:29 PM GMT+02:00, Joe Conway <mail@joeconway.com> wrote:
>On 7/2/24 17:39, Thomas Munro wrote:
>> One difference that jumps out is that the successful v3 run has label
>> worker:jc-m2-1 (Mac hosted by Joe), and the failure has
>> worker:pgx-m2-1 (Mac hosted by Christophe P). Is this a software
>> version issue, ie need newer Tart to use that image, or could be a
>> difficulty fetching the image? CCing our Mac Mini pool attendants.
>
>How can I help? Do you need to know versions of some of the stuff on my mac mini?

Fwiw, I seem to recall that macos vms didn't work on hosts that are older than the guest. So I think it might be worth upgrading Christophe's Mac mini.

Greetings,

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Hi, On 2024-07-03 09:39:06 +1200, Thomas Munro wrote: > On Thu, Jun 27, 2024 at 6:32 PM Thomas Munro <thomas.munro@gmail.com> wrote: > > So I think we should request > > ghcr.io/cirruslabs/macos-sonoma-base:latest. Personal github accounts > > will use macos-runner:sonoma instead, but at least it's the same OS > > release. Here's a new version like that, to see if cfbot likes it. > > The first cfbot run of v3 was successful, but a couple of days later > when retested it failed with the dreaded "Error: > ShouldBeAtLeastOneLayer". (It also failed on Windows, just because > master was temporarily broken, unrelated to any of this. Note also > that the commit message created by cfbot now includes the patch > version, making the test history easier to grok, thanks Jelte!) > > https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf/5076 > > One difference that jumps out is that the successful v3 run has label > worker:jc-m2-1 (Mac hosted by Joe), and the failure has > worker:pgx-m2-1 (Mac hosted by Christophe P). Is this a software > version issue, ie need newer Tart to use that image, or could be a > difficulty fetching the image? CCing our Mac Mini pool attendants. > > Temporary options include disabling pgx-m2-1 from the pool, or > teaching .cirrus.task.yml to use Ventura for cfbot but Sonoma for > anyone else's github account, but ideally we'd figure out why it's not > working... Yep, I think we'll have to do that, unless it has been fixed by now. > This new information also invalidates my previous hypothesis, that the > new "macos-runner:sonoma" image can't work on self-hosted Macs, > because that was also on pgx-m2-1. Besides the base-os-version issue, another theory is that the newer image is just very large (141GB) and that we've seen some other issues related to Christophe's internet connection not being the fastest. 
WRT your patches:

- I think we ought to switch to the -runner image, otherwise we'll just continue to get that "upgraded" warning

- With a fingerprint_script specified, we need to add
    reupload_on_changes: true
  otherwise it'll not be updated.

- I think the fingerprint_script should use sw_vers, just as the script does. I see no reason to differ?

- We could just
    sw_vers -productVersion | sed 's/\..*//g'
  instead of the more complicated version you used, I doubt that they're going to go away from numerical major versions...

Greetings,

Andres Freund
On Tue, Jul 16, 2024 at 10:48 AM Andres Freund <andres@anarazel.de> wrote: > WRT your patches: > - I think we ought to switch to the -runner image, otherwise we'll just > continue to get that "upgraded" warning Right, let's try it. > - With a fingerprint_script specified, we need to add > reupload_on_changes: true > otherwise it'll not be updated. Ahh, I see. > - I think the fingerprint_script should use sw_vers, just as the script > does. I see no reason to differ? Yeah might as well. I started with Darwin versions because that is what MacPorts complains about, but they move in lockstep. > - We could just sw_vers -productVersion | sed 's/\..*//g' instead of the more > complicated version you used, I doubt that they're going to go away from > numerical major versions... Yep. I've attached a new version like that. Let's see which runner machine gets it and how it turns out...
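The fingerprint agreed on above could be sketched as below. The sed expression is the one quoted from the review; the function name `macos_major_of` and the exact text echoed are illustrative assumptions, and the version string is hard-coded so the sketch does not depend on sw_vers being present.

```shell
#!/bin/sh
# Strip everything from the first dot onward, leaving the major version.
# Same expression as suggested in the review: sed 's/\..*//g'
macos_major_of() {
    printf '%s\n' "$1" | sed 's/\..*//g'
}

# On a macOS runner the argument would be "$(sw_vers -productVersion)".
# A fixed value stands in for it here.
product_version="14.5"
echo "macOS major version: $(macos_major_of "$product_version")"
```

Because the output changes only when the major version changes, a cache keyed on it would be shared across minor updates and replaced on an OS upgrade.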
Attachment
On Tue, Jul 16, 2024 at 3:19 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I've attached a new version like that. Let's see which runner machine
> gets it and how it turns out...

It failed[1] on pgx-m2-1: "Error: ShouldBeAtLeastOneLayer". So I temporarily disabled that machine from the pool and clicked the re-run button, and it failed[2] on jc-m2-1: "Error: The operation couldn’t be completed. No space left on device" after a long period during which it was presumably trying to download that image. I could try this experiment again if Joe could see a way to free up some disk space. I've reenabled pgx-m2-1 for now.

[1] https://cirrus-ci.com/task/5127256689868800
[2] https://cirrus-ci.com/task/6446688024395776
On 7/16/24 00:34, Thomas Munro wrote: > temporarily disabled that machine from the pool and click the re-run > button, and it failed[2] on jc-m2-1: "Error: The operation couldn’t be > completed. No space left on device" after a long period during which > it was presumably trying to download that image. I could try this > experiment again if Joe could see a way to free up some disk space. Hmmm, sorry, will take a look now -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 7/16/24 08:28, Joe Conway wrote:
> On 7/16/24 00:34, Thomas Munro wrote:
>> temporarily disabled that machine from the pool and click the re-run
>> button, and it failed[2] on jc-m2-1: "Error: The operation couldn’t be
>> completed. No space left on device" after a long period during which
>> it was presumably trying to download that image. I could try this
>> experiment again if Joe could see a way to free up some disk space.
>
> Hmmm, sorry, will take a look now

I am not super strong on Macs in general, but cannot see anything full:

df -h
Filesystem       Size   Used  Avail  Capacity  iused   ifree       %iused  Mounted on
/dev/disk3s1s1   228Gi  8.7Gi 111Gi  8%        356839  1165143240  0%      /
devfs            199Ki  199Ki 0Bi    100%      690     0           100%    /dev
/dev/disk3s6     228Gi  20Ki  111Gi  1%        0       1165143240  0%      /System/Volumes/VM
/dev/disk3s2     228Gi  5.0Gi 111Gi  5%        1257    1165143240  0%      /System/Volumes/Preboot
/dev/disk3s4     228Gi  28Mi  111Gi  1%        47      1165143240  0%      /System/Volumes/Update
/dev/disk1s2     500Mi  6.0Mi 483Mi  2%        1       4941480     0%      /System/Volumes/xarts
/dev/disk1s1     500Mi  6.2Mi 483Mi  2%        29      4941480     0%      /System/Volumes/iSCPreboot
/dev/disk1s3     500Mi  492Ki 483Mi  1%        55      4941480     0%      /System/Volumes/Hardware
/dev/disk3s5     228Gi  102Gi 111Gi  48%       365768  1165143240  0%      /System/Volumes/Data
map auto_home    0Bi    0Bi   0Bi    100%      0       0           100%    /System/Volumes/Data/home

As far as I can tell, the 100% usage for /dev and /System/Volumes/Data/home are irrelevant.

¯\_(ツ)_/¯

I ran an update to the latest Ventura and rebooted as part of that. Can you check again?

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
hi, On 2024-07-16 09:38:21 -0400, Joe Conway wrote: > On 7/16/24 08:28, Joe Conway wrote: > > On 7/16/24 00:34, Thomas Munro wrote: > > > temporarily disabled that machine from the pool and click the re-run > > > button, and it failed[2] on jc-m2-1: "Error: The operation couldn’t be > > > completed. No space left on device" after a long period during which > > > it was presumably trying to download that image. I could try this > > > experiment again if Joe could see a way to free up some disk space. > > > > Hmmm, sorry, will take a look now > > I am not super strong on Macs in general, but cannot see anything full: > /dev/disk3s5 228Gi 102Gi 111Gi 48% 365768 1165143240 0% > /System/Volumes/Data Unfortunately the 'base disk' for sonoma is 144GB large... It might be worth trying to pull it separately from a CI job, under your control. As the CI user (it'll be downloaded redundantly if you do it as your user!), you can do: tart pull ghcr.io/cirruslabs/macos-runner:sonoma It's possible you have some old images stored as your user, check "tart list" for both users. Greetings, Andres Freund
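A pre-flight check along the lines of the exchange above could be sketched like this. It is an illustration only: the helper names are hypothetical, and it treats the 144 GB figure as the space needed on disk and compares it loosely with df's GiB figures, both of which are assumptions rather than anything the thread establishes.

```shell
#!/bin/sh
# Convert a count of 1024-byte blocks (the unit reported by `df -Pk`)
# to whole GiB, rounding down.
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Succeed when available space ($1) is at least the required space ($2).
has_room() {
    [ "$1" -ge "$2" ]
}

# Fixed values from the thread: 111Gi available, a 144GB base disk.
# On a real host, available KiB could be read with:
#   df -Pk "$HOME" | awk 'NR==2 {print $4}'
avail_gib=111
image_gib=144
if has_room "$avail_gib" "$image_gib"; then
    echo "enough space: tart pull ghcr.io/cirruslabs/macos-runner:sonoma"
else
    echo "only ${avail_gib} GiB free, image needs about ${image_gib}: free space first"
fi
```

With these numbers the check fails, which is consistent with the "No space left on device" error reported for jc-m2-1.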
On 7/16/24 11:44, Andres Freund wrote:
> hi,
>
> On 2024-07-16 09:38:21 -0400, Joe Conway wrote:
>> On 7/16/24 08:28, Joe Conway wrote:
>> > On 7/16/24 00:34, Thomas Munro wrote:
>> > > temporarily disabled that machine from the pool and click the re-run
>> > > button, and it failed[2] on jc-m2-1: "Error: The operation couldn’t be
>> > > completed. No space left on device" after a long period during which
>> > > it was presumably trying to download that image. I could try this
>> > > experiment again if Joe could see a way to free up some disk space.
>> >
>> > Hmmm, sorry, will take a look now
>>
>> I am not super strong on Macs in general, but cannot see anything full:
>
>> /dev/disk3s5 228Gi 102Gi 111Gi 48% 365768 1165143240 0%
>> /System/Volumes/Data
>
> Unfortunately the 'base disk' for sonoma is 144GB large...
>
> It might be worth trying to pull it separately from a CI job, under your
> control. As the CI user (it'll be downloaded redundantly if you do it as your
> user!), you can do:
>   tart pull ghcr.io/cirruslabs/macos-runner:sonoma
>
> It's possible you have some old images stored as your user, check
> "tart list" for both users.

Hmm, this is not the easiest ever to parse for me...

macmini:~ ci-run$ tart list
Source  Name                                                                                                                 Disk  Size  State
local   ventura-base-test                                                                                                    50    20    stopped
oci     ghcr.io/cirruslabs/macos-ventura-base:latest                                                                         50    21    stopped
oci     ghcr.io/cirruslabs/macos-ventura-base@sha256:bddfa1e2b6f6ec41b5db844b06a6784a2bffe0b071965470efebd95ea3355b4f  50    21    stopped

macmini:~ jconway$ tart list
Source  Name                                                                                                                 Disk  Size  State
local   ventura-test                                                                                                         50    20    stopped
oci     ghcr.io/cirruslabs/macos-ventura-base:latest                                                                         50    50    stopped
oci     ghcr.io/cirruslabs/macos-ventura-base@sha256:a4d4861123427a23ad3dc53a6a1d4d20d6bc1a0df82bd1495cc53217075c0a8c  50    50    stopped

So does that mean I have 6 copies of the ventura image? How do I get rid of them?

Or maybe simpler -- how do people typically add storage to a mac mini? I don't mind buying an external disk or whatever.

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi, On 2024-07-16 12:12:37 -0400, Joe Conway wrote: > > It's possible you have some old images stored as your user, check > > "tart list" for both users. > > Hmm, this is not the easiest ever to parse for me... Unfortunately due to the wrapping it's not easy to read here either... I don't think it quite indicates 6 - the ones with :latest are just aliases for the one with the hash, I believe. > macmini:~ ci-run$ tart list > Source Name Disk Size State > local ventura-base-test 50 20 stopped > oci ghcr.io/cirruslabs/macos-ventura-base:latest 50 21 stopped > oci ghcr.io/cirruslabs/macos-ventura-base@sha256:bddfa1e2b6f6ec41b5db844b06a6784a2bffe0b071965470efebd95ea3355b4f 50 21 stopped > > macmini:~ jconway$ tart list I'd delete all of the ones stored for jconway - that's just redundant. tart delete ghcr.io/cirruslabs/macos-ventura-base:latest > Or maybe simpler -- how do people typically add storage to a mac mini? I > don't mind buying an external disk or whatever. That I do not know, not a mac person at all... Greetings, Andres Freund
On 7/17/24 13:01, Andres Freund wrote: > On 2024-07-16 12:12:37 -0400, Joe Conway wrote: >> > It's possible you have some old images stored as your user, check >> > "tart list" for both users. >> >> Hmm, this is not the easiest ever to parse for me... > > Unfortunately due to the wrapping it's not easy to read here either... > > I don't think it quite indicates 6 - the ones with :latest are just aliases > for the one with the hash, I believe. makes sense >> macmini:~ ci-run$ tart list >> Source Name Disk Size State >> local ventura-base-test 50 20 stopped >> oci ghcr.io/cirruslabs/macos-ventura-base:latest 50 21 stopped >> oci ghcr.io/cirruslabs/macos-ventura-base@sha256:bddfa1e2b6f6ec41b5db844b06a6784a2bffe0b071965470efebd95ea3355b4f 50 21 stopped >> >> macmini:~ jconway$ tart list > > I'd delete all of the ones stored for jconway - that's just redundant. done > tart delete ghcr.io/cirruslabs/macos-ventura-base:latest and done tart list for both users shows nothing now. >> Or maybe simpler -- how do people typically add storage to a mac mini? I >> don't mind buying an external disk or whatever. > > That I do not know, not a mac person at all... Well maybe unneeded? -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Andres Freund <andres@anarazel.de> writes: > On 2024-07-16 12:12:37 -0400, Joe Conway wrote: >> Or maybe simpler -- how do people typically add storage to a mac mini? I >> don't mind buying an external disk or whatever. > That I do not know, not a mac person at all... I think USB SSD is the way at present. MacRumors has some reviews/testing, eg this one: https://www.macrumors.com/review/hyper-usb-hubs-ssd-enclosure/ regards, tom lane
Hi, On 2024-07-17 13:20:06 -0400, Joe Conway wrote: > > > Or maybe simpler -- how do people typically add storage to a mac mini? I > > > don't mind buying an external disk or whatever. > > > > That I do not know, not a mac person at all... > > Well maybe unneeded? Does "tart pull ghcr.io/cirruslabs/macos-runner:sonoma" as the CI user succeed? Greetings, Andres Freund
On 7/17/24 16:41, Andres Freund wrote: > Hi, > > On 2024-07-17 13:20:06 -0400, Joe Conway wrote: >> > > Or maybe simpler -- how do people typically add storage to a mac mini? I >> > > don't mind buying an external disk or whatever. >> > >> > That I do not know, not a mac person at all... >> >> Well maybe unneeded? > > Does "tart pull ghcr.io/cirruslabs/macos-runner:sonoma" as the CI user > succeed? Yes, with about 25 GB to spare. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, Jul 18, 2024 at 9:58 AM Joe Conway <mail@joeconway.com> wrote: > On 7/17/24 16:41, Andres Freund wrote: > > Does "tart pull ghcr.io/cirruslabs/macos-runner:sonoma" as the CI user > > succeed? > > Yes, with about 25 GB to spare. Thanks. Now it works! But for some reason it spends several minutes in the "scheduling" stage before it starts. Are there any logs that might give a clue what it was doing, for example for this run? https://cirrus-ci.com/task/5963784852865024
Hi,

On Thu, 18 Jul 2024 at 07:40, Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Thu, Jul 18, 2024 at 9:58 AM Joe Conway <mail@joeconway.com> wrote:
> > On 7/17/24 16:41, Andres Freund wrote:
> > > Does "tart pull ghcr.io/cirruslabs/macos-runner:sonoma" as the CI user
> > > succeed?
> >
> > Yes, with about 25 GB to spare.
>
> Thanks. Now it works! But for some reason it spends several minutes
> in the "scheduling" stage before it starts. Are there any logs that
> might give a clue what it was doing, for example for this run?
>
> https://cirrus-ci.com/task/5963784852865024

Could it be pulling the 'macos-runner:sonoma' image on every run?

I cross-compared, and for every new version of the 'macos-sonoma-base:latest' image [1], scheduling takes ~4 minutes [2]. After that, consecutive runs are scheduled in well under a minute [2] until a new version of the image is released.

Also, according to their manifests, the uncompressed size of the runner image is 5x that of the sonoma-base image [3]. This is very close to the ratio of scheduling times between 'macos-runner:sonoma' and a newly pulled 'macos-sonoma-base:latest' (22 mins / 4 mins).

[1] https://github.com/cirruslabs/macos-image-templates/pkgs/container/macos-sonoma-base/versions

[2]
https://cirrus-ci.com/task/5299490515582976 -> 4 minutes, first pull
https://cirrus-ci.com/task/6081946936147968 -> 20 seconds
https://cirrus-ci.com/task/6078712070799360 -> 4 minutes, new version of the image was released on the same day (6th of July)
https://cirrus-ci.com/task/6539977129984000 -> 40 seconds
https://cirrus-ci.com/task/5839361126694912 -> 40 seconds
https://cirrus-ci.com/task/6708845278396416 -> 4 minutes, new version of the image was released a day ago

[3]
https://github.com/cirruslabs/macos-image-templates/pkgs/container/macos-sonoma-base/245087497?tag=latest
https://github.com/orgs/cirruslabs/packages/container/macos-runner/242649219?tag=sonoma

-- 
Regards,
Nazir Bilal Yavuz
Microsoft
On 7/18/24 04:12, Nazir Bilal Yavuz wrote: > Hi, > > On Thu, 18 Jul 2024 at 07:40, Thomas Munro <thomas.munro@gmail.com> wrote: >> >> On Thu, Jul 18, 2024 at 9:58 AM Joe Conway <mail@joeconway.com> wrote: >> > On 7/17/24 16:41, Andres Freund wrote: >> > > Does "tart pull ghcr.io/cirruslabs/macos-runner:sonoma" as the CI user >> > > succeed? >> > >> > Yes, with about 25 GB to spare. >> >> Thanks. Now it works! But for some reason it spends several minutes >> in the "scheduling" stage before it starts. Are there any logs that >> might give a clue what it was doing, for example for this run? >> https://cirrus-ci.com/task/5963784852865024 I only see this in the log: time="2024-07-17T23:13:56-04:00" level=info msg="started task 5963784852865024" time="2024-07-17T23:42:24-04:00" level=info msg="task 5963784852865024 completed" > Could it be pulling the ''macos-runner:sonoma' image on every run? Or perhaps since this was the first run it simply needed to pull the image for the first time? The scheduling timing (21:24) looks a lot like what I observed when I did the test for the time to download. Unfortunately I did not time the test though. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 7/18/24 07:55, Joe Conway wrote: > On 7/18/24 04:12, Nazir Bilal Yavuz wrote: >> Could it be pulling the ''macos-runner:sonoma' image on every run? > > Or perhaps since this was the first run it simply needed to pull the > image for the first time? > > The scheduling timing (21:24) looks a lot like what I observed when I > did the test for the time to download. Unfortunately I did not time the > test though. Actually it does look like the image is gone now based on the free space on the volume, so maybe it is pulling every run and cleaning up rather than caching for some reason? Filesystem Size Used Avail Capacity /dev/disk3s5 228Gi 39Gi 161Gi 20% -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi,

On Thu, 18 Jul 2024 at 15:00, Joe Conway <mail@joeconway.com> wrote:
>
> On 7/18/24 07:55, Joe Conway wrote:
> > On 7/18/24 04:12, Nazir Bilal Yavuz wrote:
> >> Could it be pulling the 'macos-runner:sonoma' image on every run?
> >
> > Or perhaps since this was the first run it simply needed to pull the
> > image for the first time?

It was not the first run; Thomas reran it a couple of times, but all of them were in the same build. So I thought that CI might set some setting to pull the image when starting the build, so that it re-downloads the image for all the tasks in the same build. But that looks wrong because of what you said below.

> >
> > The scheduling timing (21:24) looks a lot like what I observed when I
> > did the test for the time to download. Unfortunately I did not time the
> > test though.
>
> Actually it does look like the image is gone now based on the free space
> on the volume, so maybe it is pulling every run and cleaning up rather
> than caching for some reason?
>
> Filesystem     Size   Used  Avail  Capacity
> /dev/disk3s5   228Gi  39Gi  161Gi  20%

That is interesting. Only one thing comes to my mind. It seems that 'tart prune' runs automatically when there is no space left and tart thinks it can reclaim space by removing some things [1]. So it could be that 'tart prune' somehow ran automatically and deleted the sonoma image. I think you can check if this is the case. You can check these locations [2] as the CI user to see when the ventura images were created. If they were created less than 1 day ago, I think the current space is not enough to pull both ventura and sonoma images.

[1] https://github.com/cirruslabs/tart/issues/33#issuecomment-1134789129
[2] https://tart.run/faq/#vm-location-on-disk

-- 
Regards,
Nazir Bilal Yavuz
Microsoft
On 7/18/24 08:55, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Thu, 18 Jul 2024 at 15:00, Joe Conway <mail@joeconway.com> wrote:
>>
>> On 7/18/24 07:55, Joe Conway wrote:
>> > On 7/18/24 04:12, Nazir Bilal Yavuz wrote:
>> >> Could it be pulling the ''macos-runner:sonoma' image on every run?
>> >
>> > Or perhaps since this was the first run it simply needed to pull the
>> > image for the first time?
>
> It was not the first run, Thomas rerun it a couple of times but all of
> them were in the same build. So, I thought that CI may set some
> settings to pull the image while starting the build, so it
> re-downloads the image for all the tasks in the same build. But that
> looks wrong because of what you said below.
>
>> >
>> > The scheduling timing (21:24) looks a lot like what I observed when I
>> > did the test for the time to download. Unfortunately I did not time the
>> > test though.
>>
>> Actually it does look like the image is gone now based on the free space
>> on the volume, so maybe it is pulling every run and cleaning up rather
>> than caching for some reason?
>>
>> Filesystem Size Used Avail Capacity
>> /dev/disk3s5 228Gi 39Gi 161Gi 20%
>
> That is interesting. Only one thing comes to my mind. It seems that
> the 'tart prune' command runs automatically to reclaim space when
> there is no space left and thinks it can reclaim the space by removing
> some things [1]. So, it could be that somehow 'tart prune' ran
> automatically and deleted the sonoma image. I think you can check if
> this is the case. You can check these locations [2] from ci-user to
> see when ventura images are created. If they have been created less
> than 1 day ago, I think the current space is not enough to pull both
> ventura and sonoma images.

I think you nailed it (this will wrap badly):

8<-----------------
macmini:~ ci-run$ ll ~/.tart/cache/OCIs/ghcr.io/cirruslabs/*
/Users/ci-run/.tart/cache/OCIs/ghcr.io/cirruslabs/macos-runner:
total 0
drwxr-xr-x  2 ci-run  staff   64 Jul 17 23:53 .
drwxr-xr-x  5 ci-run  staff  160 Jul 17 17:16 ..

/Users/ci-run/.tart/cache/OCIs/ghcr.io/cirruslabs/macos-sonoma-base:
total 0
drwxr-xr-x  2 ci-run  staff   64 Jul 17 13:18 .
drwxr-xr-x  5 ci-run  staff  160 Jul 17 17:16 ..

/Users/ci-run/.tart/cache/OCIs/ghcr.io/cirruslabs/macos-ventura-base:
total 0
drwxr-xr-x  4 ci-run  staff  128 Jul 17 23:53 .
drwxr-xr-x  5 ci-run  staff  160 Jul 17 17:16 ..
lrwxr-xr-x  1 ci-run  staff  140 Jul 17 23:53 latest -> /Users/ci-run/.tart/cache/OCIs/ghcr.io/cirruslabs/macos-ventura-base/sha256:bddfa1e2b6f6ec41b5db844b06a6784a2bffe0b071965470efebd95ea3355b4f
drwxr-xr-x  5 ci-run  staff  160 Jul 17 23:53 sha256:bddfa1e2b6f6ec41b5db844b06a6784a2bffe0b071965470efebd95ea3355b4f
8<-----------------

So perhaps I am back to needing more storage...

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
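The check performed by hand in the listing above (which image directories under the tart cache are empty) could be scripted roughly as follows. The helper names are hypothetical, and the demonstration runs against a throwaway directory rather than a real ~/.tart/cache/OCIs/ghcr.io/cirruslabs tree; an empty directory is taken to mean "pruned or never pulled", which is an interpretation of the listing, not documented tart behaviour.

```shell
#!/bin/sh
# Succeed when $1 is a directory that contains no entries besides . and ..
is_empty_dir() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Print one status line per image directory under the cache root $1.
report_cache() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        if is_empty_dir "$dir"; then
            echo "$(basename "$dir"): empty (pruned or never pulled)"
        else
            echo "$(basename "$dir"): present"
        fi
    done
}

# Demonstration on a temporary tree that mimics the listing:
# macos-runner is empty, macos-ventura-base still has content.
demo=$(mktemp -d)
mkdir "$demo/macos-runner" "$demo/macos-ventura-base"
: > "$demo/macos-ventura-base/latest"
report_cache "$demo"
rm -rf "$demo"
```

On the real host, the cache root shown in the listing would be passed to report_cache as the CI user.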
Hi, On Thu, 18 Jul 2024 at 17:01, Joe Conway <mail@joeconway.com> wrote: > > On 7/18/24 08:55, Nazir Bilal Yavuz wrote: > > Hi, > > > > On Thu, 18 Jul 2024 at 15:00, Joe Conway <mail@joeconway.com> wrote: > >> > >> On 7/18/24 07:55, Joe Conway wrote: > >> > On 7/18/24 04:12, Nazir Bilal Yavuz wrote: > >> >> Could it be pulling the ''macos-runner:sonoma' image on every run? > >> > > >> > Or perhaps since this was the first run it simply needed to pull the > >> > image for the first time? > > > > It was not the first run, Thomas rerun it a couple of times but all of > > them were in the same build. So, I thought that CI may set some > > settings to pull the image while starting the build, so it > > re-downloads the image for all the tasks in the same build. But that > > looks wrong because of what you said below. > > > >> > > >> > The scheduling timing (21:24) looks a lot like what I observed when I > >> > did the test for the time to download. Unfortunately I did not time the > >> > test though. > >> > >> Actually it does look like the image is gone now based on the free space > >> on the volume, so maybe it is pulling every run and cleaning up rather > >> than caching for some reason? > >> > >> Filesystem Size Used Avail Capacity > >> /dev/disk3s5 228Gi 39Gi 161Gi 20% > > > > That is interesting. Only one thing comes to my mind. It seems that > > the 'tart prune' command runs automatically to reclaim space when > > there is no space left and thinks it can reclaim the space by removing > > some things [1]. So, it could be that somehow 'tart prune' ran > > automatically and deleted the sonoma image. I think you can check if > > this is the case. You can check these locations [2] from ci-user to > > see when ventura images are created. If they have been created less > > than 1 day ago, I think the current space is not enough to pull both > > ventura and sonoma images. 
> > I think you nailed it (this will wrap badly): > 8<----------------- > macmini:~ ci-run$ ll ~/.tart/cache/OCIs/ghcr.io/cirruslabs/* > /Users/ci-run/.tart/cache/OCIs/ghcr.io/cirruslabs/macos-runner: > total 0 > drwxr-xr-x 2 ci-run staff 64 Jul 17 23:53 . > drwxr-xr-x 5 ci-run staff 160 Jul 17 17:16 .. > > /Users/ci-run/.tart/cache/OCIs/ghcr.io/cirruslabs/macos-sonoma-base: > total 0 > drwxr-xr-x 2 ci-run staff 64 Jul 17 13:18 . > drwxr-xr-x 5 ci-run staff 160 Jul 17 17:16 .. > > /Users/ci-run/.tart/cache/OCIs/ghcr.io/cirruslabs/macos-ventura-base: > total 0 > drwxr-xr-x 4 ci-run staff 128 Jul 17 23:53 . > drwxr-xr-x 5 ci-run staff 160 Jul 17 17:16 .. > lrwxr-xr-x 1 ci-run staff 140 Jul 17 23:53 latest -> > /Users/ci-run/.tart/cache/OCIs/ghcr.io/cirruslabs/macos-ventura-base/sha256:bddfa1e2b6f6ec41b5db844b06a6784a2bffe0b071965470efebd95ea3355b4f > drwxr-xr-x 5 ci-run staff 160 Jul 17 23:53 > sha256:bddfa1e2b6f6ec41b5db844b06a6784a2bffe0b071965470efebd95ea3355b4f > 8<----------------- > > So perhaps I am back to needing more storage... You might not need more storage. Thomas knows better, but AFAIU, CFBot will pull only sonoma images after the patch in this thread gets merged. And your storage seems enough for storing it. -- Regards, Nazir Bilal Yavuz Microsoft
On 7/18/24 10:23, Nazir Bilal Yavuz wrote: > On Thu, 18 Jul 2024 at 17:01, Joe Conway <mail@joeconway.com> wrote: >> So perhaps I am back to needing more storage... > > You might not need more storage. Thomas knows better, but AFAIU, CFBot > will pull only sonoma images after the patch in this thread gets > merged. And your storage seems enough for storing it. I figured I would go ahead and buy it. Basically $250 total for a 2TB WD_BLACK NVMe plus a mac mini expansion enclosure. Should be delivered Sunday. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 7/18/24 10:33, Joe Conway wrote: > On 7/18/24 10:23, Nazir Bilal Yavuz wrote: >> On Thu, 18 Jul 2024 at 17:01, Joe Conway <mail@joeconway.com> wrote: >>> So perhaps I am back to needing more storage... >> >> You might not need more storage. Thomas knows better, but AFAIU, CFBot >> will pull only sonoma images after the patch in this thread gets >> merged. And your storage seems enough for storing it. > > I figured I would go ahead and buy it. Basically $250 total for a 2TB > WD_BLACK NVMe plus a mac mini expansion enclosure. Should be delivered > Sunday. I installed and mounted the new volume, moved "~/.tart" to /Volumes/extnvme and created a symlink, and then restarted the ci process, but now I am getting continuous errors streaming to the log: 8<------------------ macmini:~ ci-run$ ll /Users/ci-run/.tart lrwxr-xr-x 1 ci-run staff 29 Jul 21 15:53 /Users/ci-run/.tart -> /Volumes/extnvme/ci-run/.tart macmini:~ ci-run$ df -h /Volumes/extnvme Filesystem Size Used Avail Capacity iused ifree %iused Mounted on /dev/disk5s2 1.8Ti 76Gi 1.7Ti 5% 105 18734532280 0% /Volumes/extnvme macmini:~ ci-run$ tail -n1 log/cirrus-worker.log time="2024-07-21T16:09:29-04:00" level=error msg="failed to poll upstream https://grpc.cirrus-ci.com:443: rpc error: code = NotFound desc = Can't find worker by session token!" 8<------------------ Any ideas? -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 7/21/24 16:15, Joe Conway wrote: > On 7/18/24 10:33, Joe Conway wrote: >> On 7/18/24 10:23, Nazir Bilal Yavuz wrote: >>> On Thu, 18 Jul 2024 at 17:01, Joe Conway <mail@joeconway.com> wrote: >>>> So perhaps I am back to needing more storage... >>> >>> You might not need more storage. Thomas knows better, but AFAIU, CFBot >>> will pull only sonoma images after the patch in this thread gets >>> merged. And your storage seems enough for storing it. >> >> I figured I would go ahead and buy it. Basically $250 total for a 2TB >> WD_BLACK NVMe plus a mac mini expansion enclosure. Should be delivered >> Sunday. > > I installed and mounted the new volume, moved "~/.tart" to > /Volumes/extnvme and created a symlink, and the restarted the ci > process, but now I am getting continuous errors streaming to the log: > > 8<------------------ > macmini:~ ci-run$ tail -n1 log/cirrus-worker.log > time="2024-07-21T16:09:29-04:00" level=error msg="failed to poll > upstream https://grpc.cirrus-ci.com:443: rpc error: code = NotFound desc > = Can't find worker by session token!" > 8<------------------ > > Any ideas? Hmmm, maybe nevermind? I rebooted the mac mini and now it seems to be working. Maybe someone can confirm. There ought to be plenty of space available for sonoma and ventura at the same time now. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Jul 22, 2024 at 8:34 AM Joe Conway <mail@joeconway.com> wrote: > Hmmm, maybe nevermind? I rebooted the mac mini and now it seems to be > working. Maybe someone can confirm. There ought to be plenty of space > available for sonoma and ventura at the same time now. Thanks for doing that. Initial results are that it's running the tests much more slowly. Example: https://cirrus-ci.com/task/5607066713194496 I wonder if there is a way to use the external drive for caching images and so on, but the faster (?) internal drive for work...
On 7/21/24 17:26, Thomas Munro wrote: > On Mon, Jul 22, 2024 at 8:34 AM Joe Conway <mail@joeconway.com> wrote: >> Hmmm, maybe nevermind? I rebooted the mac mini and now it seems to be >> working. Maybe someone can confirm. There ought to be plenty of space >> available for sonoma and ventura at the same time now. > > Thanks for doing that. Initial results are that it's running the > tests much more slowly. Example: > > https://cirrus-ci.com/task/5607066713194496 > > I wonder if there is a way to use the external drive for caching > images and so on, but the faster (?) internal drive for work... Maybe -- I moved the symlink to include only the "cache" part of the tree under ~/.tart. 8<-------------- macmini:.tart ci-run$ cd ~ macmini:~ ci-run$ ll .tart total 0 drwxr-xr-x 5 ci-run staff 160 Jul 22 08:42 . drwxr-x---+ 25 ci-run staff 800 Jul 22 08:41 .. lrwxr-xr-x 1 ci-run staff 35 Jul 22 08:42 cache -> /Volumes/extnvme/ci-run/.tart/cache drwxr-xr-x 3 ci-run staff 96 Jul 22 08:43 tmp drwxr-xr-x 2 ci-run staff 64 Jul 22 08:39 vms 8<-------------- Previously I had the entire "~/.tart" directory tree on the external drive. Please check again. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
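For anyone repeating this on another runner, the relocation described above could be sketched roughly as follows. This is only an illustration, not what was actually typed on the machine: the helper name relocate_dir is made up, the paths in the comments are the ones from this thread, and the worker should presumably be stopped before anything is moved.

```shell
#!/bin/sh
# Hypothetical helper (not part of tart or the cirrus CLI): move a directory
# to another volume and leave a symlink behind at the original path.
relocate_dir() {
    src=$1     # e.g. /Users/ci-run/.tart/cache
    dest=$2    # e.g. /Volumes/extnvme/ci-run/.tart/cache

    # An existing symlink means the directory was already relocated.
    if [ -L "$src" ]; then
        echo "already a symlink: $src" >&2
        return 1
    fi
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "destination already exists: $dest" >&2
        return 1
    fi

    # Create the parent on the target volume, move the data, then link back.
    mkdir -p "$(dirname "$dest")" &&
    mv "$src" "$dest" &&
    ln -s "$dest" "$src"
}
```

Called as relocate_dir "$HOME/.tart/cache" /Volumes/extnvme/ci-run/.tart/cache, this would leave ~/.tart/tmp and ~/.tart/vms on the internal drive, which is the layout shown in the listing above.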
Hi, On 2024-07-22 08:46:03 -0400, Joe Conway wrote: > On 7/21/24 17:26, Thomas Munro wrote: > > On Mon, Jul 22, 2024 at 8:34 AM Joe Conway <mail@joeconway.com> wrote: > > > Hmmm, maybe nevermind? I rebooted the mac mini and now it seems to be > > > working. Maybe someone can confirm. There ought to be plenty of space > > > available for sonoma and ventura at the same time now. > > > > Thanks for doing that. Initial results are that it's running the > > tests much more slowly. Example: > > > > https://cirrus-ci.com/task/5607066713194496 > > > > I wonder if there is a way to use the external drive for caching > > images and so on, but the faster (?) internal drive for work... > > Maybe -- I moved the symlink to include only the "cache" part of the tree > under ~/.tart. > [...] > Previously I had the entire "~/.tart" directory tree on the external drive. > > Please check again. That looks like it did the trick! E.g. [1] has good timings. I triggered a run with Sonoma that did end up scheduled on your machine [2], let's see how that goes. Looks like it's perhaps downloading the image again :/. Greetings, Andres Freund [1] https://cirrus-ci.com/task/6187836754362368 [2] https://cirrus-ci.com/task/5190473306865664
On Tue, Jul 23, 2024 at 7:37 AM Andres Freund <andres@anarazel.de> wrote: > [2] https://cirrus-ci.com/task/5190473306865664 "Error: “disk.img” couldn’t be copied to “3FA983DD-3078-4B28-A969-BCF86F8C9585” because there isn’t enough space." Could it be copying the whole image every time, in some way that would get copy-on-write on the same file system, but having to copy physically here? That is, instead of using some kind of chain of overlay disk image files as seen elsewhere, is this Tart thing relying on file system COW for that? Perhaps that is happening here[1] but I don't immediately know how to find out where that Swift standard library call turns into system calls... [1] https://github.com/cirruslabs/tart/blob/main/Sources/tart/VMDirectory.swift#L119
On 7/23/24 06:31, Thomas Munro wrote: > On Tue, Jul 23, 2024 at 7:37 AM Andres Freund <andres@anarazel.de> wrote: >> [2] https://cirrus-ci.com/task/5190473306865664 > > "Error: “disk.img” couldn’t be copied to > “3FA983DD-3078-4B28-A969-BCF86F8C9585” because there isn’t enough > space." > > Could it be copying the whole image every time, in some way that would > get copy-on-write on the same file system, but having to copy > physically here? That is, instead of using some kind of chain of > overlay disk image files as seen elsewhere, is this Tart thing relying > on file system COW for that? Perhaps that is happening here[1] but I > don't immediately know how to find out where that Swift standard > library call turns into system calls... > > [1] https://github.com/cirruslabs/tart/blob/main/Sources/tart/VMDirectory.swift#L119 I tried moving ~/.tart/tmp to the external drive as well, but that failed -- I *think* because tart is trying to do some kind of hardlink between the files in ~/.tart/tmp and ~/.tart/vms. So I moved that back and at least the ventura runs are working again. <facepalm>I also noticed that when I set up the external drive, I somehow automatically configured time machine to run (it was not done intentionally), and it seemed that the backups were consuming space on the primary drive </facepalm>. Did I mention I really hate messing with macos ;-). Any idea how to disable time machine entirely? The settings app provides next to zero configuration of the thing. Anyway, maybe with the time machine stuff removed there is enough space? I guess if all else fails I will have to get the mac mini with more built in storage in order to accommodate sonoma. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
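Both the copy-on-write theory and the hardlink guess come down to whether two paths live on the same file system, since neither file clones nor hard links can span volumes. A rough way to check that from the shell is sketched below. The helper names device_of and same_filesystem are made up for this sketch, and it assumes that the first column of df -P identifies the backing volume; whether tart really relies on clones or hard links is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the device (first column of `df -P`) backing a
# path. df resolves symlinks, so a symlinked directory reports the volume
# that its target lives on.
device_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Succeeds when both paths are backed by the same device.
same_filesystem() {
    [ "$(device_of "$1")" = "$(device_of "$2")" ]
}
```

For example, same_filesystem ~/.tart/tmp ~/.tart/vms || echo "tmp and vms are on different volumes" would have flagged the layout that failed here.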
On 7/23/24 10:44, Joe Conway wrote: > I guess if all else fails I will have to get the mac mini with more > built in storage in order to accommodate sonoma. I *think* I finally have it in a good place. I replaced the nvme enclosure that I bought the other day (which had a 10G interface speed) with a new one (which has 40G rated speed). The entire ~/.tart directory is a symlink to /Volumes/extnvme. The last two runs completed successfully and at about the same speed as the PGX macmini does. Let me know if you see any issues. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, Jul 25, 2024 at 7:25 AM Joe Conway <mail@joeconway.com> wrote: > I *think* I finally have it in a good place. I replaced the nvme > enclosure that I bought the other day (which had a 10G interface speed) > with a new one (which has 40G rated speed). The entire ~/.tart directory > is a symlink to /Volumes/extnvme. The last two runs completed > successfully and at about the same speed as the PGX macmini does. Looking good! Thanks. I have now pushed the patch to switch CI to Sonoma, back-patched as far as 15. Let's see how that goes. I have also paused the pgx machine for now, until Christophe is available to help us fix it.
On Thu, Jul 25, 2024 at 11:35 AM Thomas Munro <thomas.munro@gmail.com> wrote: > Looking good! Thanks. I have now pushed the patch to switch CI to > Sonoma, back-patched as far as 15. Let's see how that goes. I have > also paused the pgx machine for now, until Christophe is available to > help us fix it. Cfbot builds are working nicely on Sonoma. But... unfortunately the github.com/postgres/postgres CI was working for REL_15_STABLE only, and not 16, 17 or master. After scratching my head for a moment, I realised that the new logic for finding the most recent version of MacPorts is now picking up a new beta version of MacPorts that has just been published, and apparently it doesn't work quite right and couldn't install meson. D'oh! I have pushed a fix for that: I went back to requesting 2.9.3 until we are ready to change it. Long version: I had imagined we might one day need to nail down the version in a commented out line already, I just failed to think about non-release versions appearing in the list. An alternative might be to use a pattern that matches stable releases eg [0-9][0-9.]* to exclude stuff like 2.10-beta1 but for now I have cold feet about that lack of control. While thinking about that, it occurred to me that it might also be better if it also re-installs from scratch whenever our script that installs MacPorts changes, so I included its md5 in the cache key. Otherwise it's a bit hard to test, since the cached installation survives across rebuilds by design (that's why cfbot isn't affected, it installed 2.9.3 before they unleashed 2.10-beta1 and cached the result). This way, if someone changes 2.9.3 to 2.whatever in that script, it'll do a fresh installation of MacPorts on the first build in a given github account, and then later builds will work from the cached copy. Seems like desirable behaviour for future maintenance work.
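The "stable releases only" alternative mentioned above could look something like the following sketch. It reads the pattern as "digits and dots only, starting with a digit"; the function names are invented for illustration, and this is not what was pushed (the pushed fix pins 2.9.3).

```shell
#!/bin/sh
# Succeeds only for version strings made of digits and dots that start with
# a digit, such as 2.9.3. Anything with a suffix like -beta1 is rejected.
is_stable_version() {
    case "$1" in
        "" | *[!0-9.]*) return 1 ;;   # empty, or contains some other character
        [0-9]*)         return 0 ;;   # digits and dots, starting with a digit
        *)              return 1 ;;   # starts with a dot
    esac
}

# Reads one version per line on stdin and prints only the stable ones,
# preserving their order.
filter_stable_versions() {
    while IFS= read -r v; do
        if is_stable_version "$v"; then
            printf '%s\n' "$v"
        fi
    done
}
```

With this, a list containing 2.9.3 and 2.10-beta1 would be reduced to 2.9.3 before picking the newest entry. The trade-off is the one already noted: a pinned version gives more control than any pattern.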
> On Jul 24, 2024, at 16:35, Thomas Munro <thomas.munro@gmail.com> wrote: > I have > also paused the pgx machine for now, until Christophe is available to > help us fix it. Present. I haven't fully digested the thread, but is there a fix that doesn't involve adding more storage to the machine (I'm happy to do that, just wanted to confirm)? -- Christophe Pettus / christophe.pettus@pgexperts.com Chief Executive Officer / PGX Inc. / 24x7 Support, Consulting, Development / pgexperts.com See us at PgConf.nyc and PgConf.eu!
On Thu, Jul 25, 2024 at 4:55 PM Christophe Pettus <christophe.pettus@pgexperts.com> wrote: > Present. I haven't fully digested the thread, but is there a fix that doesn't involve adding more storage to the machine(I'm happy to do that, just wanted to confirm)? How much disk space is free after deleting existing cached images?
On 7/25/24 01:09, Thomas Munro wrote: > On Thu, Jul 25, 2024 at 4:55 PM Christophe Pettus > <christophe.pettus@pgexperts.com> wrote: >> Present. I haven't fully digested the thread, but is there a fix >> that doesn't involve adding more storage to the machine (I'm happy >> to do that, just wanted to confirm)? > > How much disk space is free after deleting existing cached images? FWIW, here is the TL;DR: * Bought to expand storage: ----------- https://www.amazon.com/dp/B0B7CMZ3QH?ref=ppx_yo2ov_dt_b_product_details https://www.amazon.com/dp/B0BYPVNBTQ?psc=1&ref=ppx_yo2ov_dt_b_product_details ----------- * Moved "~/.tart" ----------- macmini:~ ci-run$ ll ~/.tart ... /Users/ci-run/.tart -> /Volumes/extnvme/ci-run/.tart ----------- -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
> On Jul 24, 2024, at 22:09, Thomas Munro <thomas.munro@gmail.com> wrote: > How much disk space is free after deleting existing cached images? At the moment, the system (and only real) volume is only 20% used (without deleting anything): cfbot@Cockerel ~ % df ~ Filesystem 512-blocks Used Available Capacity iused ifree %iused Mounted on /dev/disk3s5 965595304 180477256 725211416 20% 637368 3626057080 0% /System/Volumes/Data -- Christophe Pettus / christophe.pettus@pgexperts.com Chief Executive Officer / PGX Inc. / 24x7 Support, Consulting, Development / pgexperts.com See us at PgConf.nyc and PgConf.eu!
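Since that df output is in 512-byte blocks, the numbers are easier to compare with the image sizes discussed earlier after converting to GiB. A small sketch of the arithmetic (the function name blocks_to_gib is made up for illustration):

```shell
#!/bin/sh
# Convert a count of 512-byte blocks to whole GiB, rounded down.
# 1 GiB = 1073741824 bytes = 2097152 blocks of 512 bytes.
blocks_to_gib() {
    echo $(( $1 / 2097152 ))
}
```

Here blocks_to_gib 725211416 gives 345 and blocks_to_gib 965595304 gives 460, so roughly 345 GiB are available on a volume of about 460 GiB.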
> On Jul 31, 2024, at 10:55, Christophe Pettus <christophe.pettus@pgexperts.com> wrote: > > > >> On Jul 24, 2024, at 22:09, Thomas Munro <thomas.munro@gmail.com> wrote: >> How much disk space is free after deleting existing cached images? > > At the moment, the system (and only real) volume is only 20% used (without deleting anything): > > cfbot@Cockerel ~ % df ~ > Filesystem 512-blocks Used Available Capacity iused ifree %iused Mounted on > /dev/disk3s5 965595304 180477256 725211416 20% 637368 3626057080 0% /System/Volumes/Data A quick search shows that the issue is most likely an old version of `tart`. I've upgraded both to the current cirrus/cli version. Can you let me know if things look resolved? -- Christophe Pettus / christophe.pettus@pgexperts.com Chief Executive Officer / PGX Inc. / 24x7 Support, Consulting, Development / pgexperts.com See us at PgConf.nyc and PgConf.eu!
On Thu, Aug 1, 2024 at 6:08 AM Christophe Pettus <christophe.pettus@pgexperts.com> wrote: > A quick search shows that the issue is most likely an old version of `tart`. I've upgraded both to the current cirrus/cliversion. Can you let me know if things look resolved? I re-enabled it in the pool that cfbot uses for a couple of hours, and it said[1]: Persistent worker failed to start the task: tart command returned non-zero exit code: "root privileges are required to run and passwordless sudo was not available" I recall Joe and Andres dealing with something like that at some point on their Macs, but I don't have the details... [1] https://cirrus-ci.com/task/5597845632319488
On 8/1/24 21:42, Thomas Munro wrote: > On Thu, Aug 1, 2024 at 6:08 AM Christophe Pettus > <christophe.pettus@pgexperts.com> wrote: >> A quick search shows that the issue is most likely an old version of `tart`. I've upgraded both to the current cirrus/cliversion. Can you let me know if things look resolved? > > I re-enabled it in the pool that cfbot uses for a couple of hours, and > it said[1]: > > Persistent worker failed to start the task: tart command returned > non-zero exit code: "root privileges are required to run and > passwordless sudo was not available" > > I recall Joe and Andres dealing with something like that at some point > on their Macs, but I don't have the details... > > [1] https://cirrus-ci.com/task/5597845632319488
I think the solution was that the ci runner had to be executed directly as the ci user.
8<-----------------
macmini:~ ci-run$ cat /Users/ci-run/bin/ci1.sh
#!/bin/bash
WORKER_NAME=jc-m2-1
TOKEN=/Users/ci-run/cirrus-token.txt
WORKER_YML=/Users/ci-run/cirrus-worker-macos.yml
BREW_BIN=/opt/homebrew/bin
CIRRUS=${BREW_BIN}/cirrus
CAT=/bin/cat

export PATH=${BREW_BIN}:${PATH}
${CIRRUS} worker run \
    -f "${WORKER_YML}" \
    --name "${WORKER_NAME}" \
    --token "$(${CAT} ${TOKEN})"
macmini:~ ci-run$ /Users/ci-run/bin/ci1.sh &
8<-----------------
I tried making this run like a service using launchctl, but that was giving the permissions errors. I finally gave up trying to figure it out and just accepted that I need to manually start the script whenever I reboot the mac. BTW, if there are any MacOS launchctl wizards around, I am all ears :-)
-- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Sat, Aug 3, 2024 at 12:07 AM Joe Conway <mail@joeconway.com> wrote: > I tried making this run like a service using launchctl, but that was > giving the permissions errors. I finally gave up trying to figure it out > and just accepted that I need to manually start the script whenever I > reboot the mac. It seems to be unhappy recently: Persistent worker failed to start the task: tart isolation failed: failed to create VM cloned from "ghcr.io/cirruslabs/macos-runner:sonoma": tart command returned non-zero exit code: "tart/VMStorageOCI.swift:5: Fatal error: 'try!' expression unexpectedly raised an error: Error Domain=NSCocoaErrorDomain Code=512 \"The file “ci-run” couldn’t be saved in the folder “Users”.\" UserInfo={NSFilePath=/Users/ci-run, NSUnderlyingError=0x6000019f0720 {Error Domain=NSPOSIXErrorDomain Code=20 \"Not a directory\"}}"
On 9/8/24 16:55, Thomas Munro wrote: > On Sat, Aug 3, 2024 at 12:07 AM Joe Conway <mail@joeconway.com> wrote: >> I tried making this run like a service using launchctl, but that was >> giving the permissions errors. I finally gave up trying to figure it out >> and just accepted that I need to manually start the script whenever I >> reboot the mac. > > It seems to be unhappy recently: > > Persistent worker failed to start the task: tart isolation failed: > failed to create VM cloned from > "ghcr.io/cirruslabs/macos-runner:sonoma": tart command returned > non-zero exit code: "tart/VMStorageOCI.swift:5: Fatal error: 'try!' > expression unexpectedly raised an error: Error > Domain=NSCocoaErrorDomain Code=512 \"The file “ci-run” couldn’t be > saved in the folder “Users”.\" UserInfo={NSFilePath=/Users/ci-run, > NSUnderlyingError=0x6000019f0720 {Error Domain=NSPOSIXErrorDomain > Code=20 \"Not a directory\"}}" Seems the mounted drive got unmounted somehow ¯\_(ツ)_/¯ Please check it out and let me know if it is working properly now. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Sep 9, 2024 at 1:37 PM Joe Conway <mail@joeconway.com> wrote: > Seems the mounted drive got unmounted somehow ¯\_(ツ)_/¯ > > Please check it out and let me know if it is working properly now. Looks good, thanks!
On Tue, Sep 10, 2024 at 3:04 PM Thomas Munro <thomas.munro@gmail.com> wrote: > On Mon, Sep 9, 2024 at 1:37 PM Joe Conway <mail@joeconway.com> wrote: > > Seems the mounted drive got unmounted somehow ¯\_(ツ)_/¯ > > > > Please check it out and let me know if it is working properly now. > > Looks good, thanks! ... but it's broken again.
On 9/12/24 02:05, Thomas Munro wrote: > On Tue, Sep 10, 2024 at 3:04 PM Thomas Munro <thomas.munro@gmail.com> wrote: >> On Mon, Sep 9, 2024 at 1:37 PM Joe Conway <mail@joeconway.com> wrote: >>> Seems the mounted drive got unmounted somehow ¯\_(ツ)_/¯ >>> >>> Please check it out and let me know if it is working properly now. >> >> Looks good, thanks! > > ... but it's broken again. The external volume somehow got unmounted again :-( I have rebooted it and restarted the process now. -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
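Given that the volume has dropped off twice now, one cheap safeguard would be a pre-flight check in the start script, so that the worker refuses to start (or can be stopped) while ~/.tart points at a missing volume. The following is only a sketch under that assumption: the function name tart_home_ready is invented, and it does nothing about why the drive unmounts in the first place.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeeds only if the given path, after
# following any symlink, is an existing and writable directory. With
# ~/.tart symlinked to an external volume, an unmounted volume leaves a
# dangling symlink and this check fails.
tart_home_ready() {
    [ -d "$1/" ] && [ -w "$1/" ]
}
```

In a script like ci1.sh this could be used before the worker is launched, for example: tart_home_ready /Users/ci-run/.tart || { echo "external volume not mounted" >&2; exit 1; }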