Thread: Docs Build in CI failing with "failed to load external entity"
Hi,

I know in the past docs builds failing with "failed to load external
entity" have happened on macos. But, recently I've noticed this
failure for docs build on CI (not on macos) -- docs build is one of
the jobs run under the "Compiler Warnings" task. See an example of
this on CI for the github mirror [1].

Anyone know what the story is here? I couldn't find an existing thread
on this specific issue.

- Melanie

[1] https://github.com/postgres/postgres/runs/32028560196

Melanie Plageman <melanieplageman@gmail.com> writes:
> I know in the past docs builds failing with "failed to load external
> entity" have happened on macos. But, recently I've noticed this
> failure for docs build on CI (not on macos) -- docs build is one of
> the jobs run under the "Compiler Warnings" task.

It looks to me like a broken docbook installation on (one of?)
the CI machines. Note that the *first* complaint is

[19:23:20.590] file:///etc/xml/catalog:1: parser error : Document is empty

I suspect that the subsequent "failed to load external entity"
complaints happen because the XML processor doesn't find any DTD
objects in the local catalog, so it tries to go out to the net for
them, and is foiled by the --no-net switch we use.

regards, tom lane

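The diagnosis above can be checked directly on an affected machine. What follows is a minimal sketch, assuming a POSIX shell. The function name `check_xml_catalog` is hypothetical and not part of the PostgreSQL tree or the CI scripts, and the well-formedness step only runs when `xmllint` happens to be installed.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): report whether an XML
# catalog file looks usable. Returns 0 if the file exists, is non-empty,
# and (when xmllint is available) parses as well-formed XML; returns 1
# otherwise. Defaults to the path named in the error message.
check_xml_catalog() {
    catalog="${1:-/etc/xml/catalog}"

    # A missing or zero-length file matches "parser error : Document is empty".
    if [ ! -s "$catalog" ]; then
        echo "catalog missing or empty: $catalog" >&2
        return 1
    fi

    # Optional well-formedness check. xmllint spells the flags --nonet
    # (refuse network fetches) and --noout (do not print the document).
    if command -v xmllint >/dev/null 2>&1; then
        if ! xmllint --nonet --noout "$catalog" 2>/dev/null; then
            echo "catalog is not well-formed XML: $catalog" >&2
            return 1
        fi
    fi

    echo "catalog looks usable: $catalog"
    return 0
}
```

Calling `check_xml_catalog` with no argument inspects `/etc/xml/catalog`. A non-zero exit status on a CI image would point at the image rather than at the documentation sources.
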
On Fri, Oct 25, 2024 at 4:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Melanie Plageman <melanieplageman@gmail.com> writes:
> > I know in the past docs builds failing with "failed to load external
> > entity" have happened on macos. But, recently I've noticed this
> > failure for docs build on CI (not on macos) -- docs build is one of
> > the jobs run under the "Compiler Warnings" task.
>
> It looks to me like a broken docbook installation on (one of?)
> the CI machines. Note that the *first* complaint is
>
> [19:23:20.590] file:///etc/xml/catalog:1: parser error : Document is empty

Yeah. That CI job runs on a canned Debian image that is rebuilt and
republished every couple of days to make sure it's using up to date
packages and kernel etc. Perhaps the package installation silently
corrupted /etc/xml/catalog, given that multiple packages probably mess
with it, though I don't have a specific theory for how that could
happen, given that package installation seems to be serial... The
installation log doesn't seem to show anything suspicious.

https://github.com/anarazel/pg-vm-images/
https://cirrus-ci.com/github/anarazel/pg-vm-images
https://cirrus-ci.com/build/5427240429682688
https://api.cirrus-ci.com/v1/task/6621385303261184/logs/build_image.log

I tried simply reinstalling docbook-xml in my own github account
(which is showing the problem), and it cleared the error:

   setup_additional_packages_script: |
-    #apt-get update
-    #DEBIAN_FRONTEND=noninteractive apt-get -y install ...
+    apt-get update
+    DEBIAN_FRONTEND=noninteractive apt-get -y install --reinstall docbook-xml

https://cirrus-ci.com/task/6458406242877440

I wonder if this will magically fix itself when the next CI image
build cron job kicks off. I have no idea what time zone this page is
showing but it should happen in another day or so, unless Andres is
around to kick it sooner:

https://cirrus-ci.com/github/anarazel/pg-vm-images

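The reinstall workaround in the message above runs unconditionally. As a sketch only, the same idea can be guarded so that the repair runs just when the catalog is missing or empty. The function name `repair_catalog_if_empty` is hypothetical, and the repair command is passed in as arguments rather than hard-coded, so nothing here is tied to a particular package manager.

```shell
#!/bin/sh
# Hypothetical guard (not from the thread): run a repair command only
# when the catalog file is missing or empty.
# Usage: repair_catalog_if_empty CATALOG_PATH COMMAND [ARGS...]
repair_catalog_if_empty() {
    catalog="$1"
    shift

    # -s is true only for an existing file with size greater than zero.
    if [ -s "$catalog" ]; then
        echo "catalog OK, nothing to do: $catalog"
        return 0
    fi

    echo "catalog missing or empty, running: $*" >&2
    # Run the caller-supplied repair command; its exit status is returned.
    "$@"
}

# Example invocation, mirroring the reinstall reported to clear the error
# (needs root on a Debian-based image, so it is left commented out):
#   repair_catalog_if_empty /etc/xml/catalog \
#       env DEBIAN_FRONTEND=noninteractive \
#       apt-get -y install --reinstall docbook-xml
```

Whether a reinstall of `docbook-xml` restores the catalog in every case is not established by the thread; it is only reported to have cleared the error in one CI run.
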
Hi,

On 2024-10-25 08:22:42 +0300, Thomas Munro wrote:
> On Fri, Oct 25, 2024 at 4:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Melanie Plageman <melanieplageman@gmail.com> writes:
> > > I know in the past docs builds failing with "failed to load external
> > > entity" have happened on macos. But, recently I've noticed this
> > > failure for docs build on CI (not on macos) -- docs build is one of
> > > the jobs run under the "Compiler Warnings" task.
> >
> > It looks to me like a broken docbook installation on (one of?)
> > the CI machines. Note that the *first* complaint is
> >
> > [19:23:20.590] file:///etc/xml/catalog:1: parser error : Document is empty
>
> Yeah. That CI job runs on a canned Debian image that is rebuilt and
> republished every couple of days to make sure it's using up to date
> packages and kernel etc. Perhaps the package installation silently
> corrupted /etc/xml/catalog, given that multiple packages probably mess
> with it, though I don't have a specific theory for how that could
> happen, given that package installation seems to be serial... The
> installation log doesn't seem to show anything suspicious.

Yea, it's clearly corrupted - the file is empty. I don't understand how that
can happen, particularly without any visible error. I certainly can't
reproduce it when installing the packages exactly the same way it happens for
the image.

I also don't think this happened before, despite the recipe for building the
images not having meaningfully changed in quite a while. So it must be some
rare edge case.

> I wonder if this will magically fix itself when the next CI image
> build cron job kicks off. I have no idea what time zone this page is
> showing but it should happen in another day or so, unless Andres is
> around to kick it sooner:
>
> https://cirrus-ci.com/github/anarazel/pg-vm-images

I did trigger a rebuild of the image just now. Hopefully that'll fix it.

Greetings,

Andres Freund

Hi,

On 2024-10-25 04:14:03 -0400, Andres Freund wrote:
> On 2024-10-25 08:22:42 +0300, Thomas Munro wrote:
> > I wonder if this will magically fix itself when the next CI image
> > build cron job kicks off. I have no idea what time zone this page is
> > showing but it should happen in another day or so, unless Andres is
> > around to kick it sooner:
> >
> > https://cirrus-ci.com/github/anarazel/pg-vm-images
>
> I did trigger a rebuild of the image just now. Hopefully that'll fix it.

It did.

Greetings,

Andres Freund

On Fri, Oct 25, 2024 at 4:31 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2024-10-25 04:14:03 -0400, Andres Freund wrote:
> > On 2024-10-25 08:22:42 +0300, Thomas Munro wrote:
> > > I wonder if this will magically fix itself when the next CI image
> > > build cron job kicks off. I have no idea what time zone this page is
> > > showing but it should happen in another day or so, unless Andres is
> > > around to kick it sooner:
> > >
> > > https://cirrus-ci.com/github/anarazel/pg-vm-images
> >
> > I did trigger a rebuild of the image just now. Hopefully that'll fix it.
>
> It did.

I noticed that CI for my fork of Postgres, which had been failing on
docs build and on test-running on the injection points test only on
freebsd, started working as expected again this morning. All of this
is a bit of magic to me -- are the CI images you build used by all of
our CIs?

- Melanie

On 2024-10-25 09:34:41 -0400, Melanie Plageman wrote:
> I noticed that CI for my fork of Postgres, which had been failing on
> docs build and on test-running on the injection points test only on
> freebsd, started working as expected again this morning. All of this
> is a bit of magic to me -- are the CI images you build used by all of
> our CIs?

Yes. Installing the packages every time would be far far too time consuming.