Thread: Docs Build in CI failing with "failed to load external entity"

From: Melanie Plageman

Hi,

I know in the past docs builds failing with "failed to load external
entity" have happened on macOS. But recently I've noticed this
failure for the docs build on CI (not on macOS) -- the docs build is one of
the jobs run under the "Compiler Warnings" task.

See an example of this on CI for the github mirror [1].

Anyone know what the story is here? I couldn't find an existing thread
on this specific issue.

- Melanie

[1] https://github.com/postgres/postgres/runs/32028560196



Melanie Plageman <melanieplageman@gmail.com> writes:
> I know in the past docs builds failing with "failed to load external
> entity" have happened on macOS. But recently I've noticed this
> failure for the docs build on CI (not on macOS) -- the docs build is one of
> the jobs run under the "Compiler Warnings" task.

It looks to me like a broken docbook installation on (one of?)
the CI machines.  Note that the *first* complaint is

[19:23:20.590] file:///etc/xml/catalog:1: parser error : Document is empty

I suspect that the subsequent "failed to load external entity"
complaints happen because the XML processor doesn't find any DTD
objects in the local catalog, so it tries to go out to the net for
them, and is foiled by the --no-net switch we use.

            regards, tom lane
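
A minimal sketch of how the diagnosis above could be checked on a CI machine, assuming a POSIX shell. The function name catalog_state is hypothetical and is not part of the PostgreSQL tree or the CI scripts; it only distinguishes a missing, zero-byte, or non-empty catalog file and does not validate the XML itself.

```shell
# Hypothetical helper (not from the thread): classify an XML catalog file.
# Prints "missing", "empty", or "present"; returns 0 only for "present".
catalog_state() {
    # Default to the path named in the CI error message.
    catalog="${1:-/etc/xml/catalog}"
    if [ ! -e "$catalog" ]; then
        echo "missing"
        return 1
    elif [ ! -s "$catalog" ]; then
        # Zero-byte file: the state that yields "Document is empty".
        echo "empty"
        return 1
    fi
    echo "present"
    return 0
}
```

On an affected machine, `catalog_state /etc/xml/catalog` would presumably print "empty", matching the "Document is empty" parser error. A stricter check could additionally run `xmllint --noout` on the file to confirm that it is well-formed.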



Re: Docs Build in CI failing with "failed to load external entity"

From: Thomas Munro

On Fri, Oct 25, 2024 at 4:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Melanie Plageman <melanieplageman@gmail.com> writes:
> > I know in the past docs builds failing with "failed to load external
> > entity" have happened on macOS. But recently I've noticed this
> > failure for the docs build on CI (not on macOS) -- the docs build is one of
> > the jobs run under the "Compiler Warnings" task.
>
> It looks to me like a broken docbook installation on (one of?)
> the CI machines.  Note that the *first* complaint is
>
> [19:23:20.590] file:///etc/xml/catalog:1: parser error : Document is empty

Yeah.  That CI job runs on a canned Debian image that is rebuilt and
republished every couple of days to make sure it's using up to date
packages and kernel etc.  Perhaps the package installation silently
corrupted /etc/xml/catalog, given that multiple packages probably mess
with it, though I don't have a specific theory for how that could
happen, given that package installation seems to be serial...  The
installation log doesn't seem to show anything suspicious.

https://github.com/anarazel/pg-vm-images/
https://cirrus-ci.com/github/anarazel/pg-vm-images
https://cirrus-ci.com/build/5427240429682688
https://api.cirrus-ci.com/v1/task/6621385303261184/logs/build_image.log

I tried simply reinstalling docbook-xml in my own github account
(which is showing the problem), and it cleared the error:

   setup_additional_packages_script: |
-    #apt-get update
-    #DEBIAN_FRONTEND=noninteractive apt-get -y install ...
+    apt-get update
+    DEBIAN_FRONTEND=noninteractive apt-get -y install --reinstall docbook-xml

https://cirrus-ci.com/task/6458406242877440

I wonder if this will magically fix itself when the next CI image
build cron job kicks off.  I have no idea what time zone this page is
showing but it should happen in another day or so, unless Andres is
around to kick it sooner:

https://cirrus-ci.com/github/anarazel/pg-vm-images
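
The reinstall workaround above could be made conditional, so that it only runs when the catalog is actually broken. The following is a sketch under stated assumptions: the function name repair_catalog_if_broken is hypothetical, and the repair command is passed in as arguments rather than hard-coded, so nothing here is claimed to be what the CI scripts do.

```shell
# Hypothetical guard (not from the thread): run a repair command only when
# the catalog file is missing or zero bytes.
# Usage: repair_catalog_if_broken CATALOG_PATH REPAIR_COMMAND [ARGS...]
repair_catalog_if_broken() {
    catalog="$1"
    shift
    if [ -s "$catalog" ]; then
        # Non-empty file: leave the installation alone.
        echo "catalog OK: $catalog"
        return 0
    fi
    echo "catalog missing or empty: $catalog; running: $*" >&2
    # Run the caller-supplied repair command and return its exit status.
    "$@"
}
```

In a CI script this might be invoked, after `apt-get update` and with DEBIAN_FRONTEND=noninteractive exported, as `repair_catalog_if_broken /etc/xml/catalog apt-get -y install --reinstall docbook-xml`, which reuses the reinstall command shown in the diff.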



Re: Docs Build in CI failing with "failed to load external entity"

From: Andres Freund

Hi,

On 2024-10-25 08:22:42 +0300, Thomas Munro wrote:
> On Fri, Oct 25, 2024 at 4:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Melanie Plageman <melanieplageman@gmail.com> writes:
> > > I know in the past docs builds failing with "failed to load external
> > > entity" have happened on macOS. But recently I've noticed this
> > > failure for the docs build on CI (not on macOS) -- the docs build is one of
> > > the jobs run under the "Compiler Warnings" task.
> >
> > It looks to me like a broken docbook installation on (one of?)
> > the CI machines.  Note that the *first* complaint is
> >
> > [19:23:20.590] file:///etc/xml/catalog:1: parser error : Document is empty
> 
> Yeah.  That CI job runs on a canned Debian image that is rebuilt and
> republished every couple of days to make sure it's using up to date
> packages and kernel etc.  Perhaps the package installation silently
> corrupted /etc/xml/catalog, given that multiple packages probably mess
> with it, though I don't have a specific theory for how that could
> happen, given that package installation seems to be serial...  The
> installation log doesn't seem to show anything suspicious.

Yeah, it's clearly corrupted: the file is empty.  I don't understand how that
can happen, particularly without any visible error. I certainly can't
reproduce it when installing the packages exactly the way the image build
does.

I also don't think this happened before, despite the recipe for building the
images not having meaningfully changed in quite a while. So it must be some
rare edge case.


> I wonder if this will magically fix itself when the next CI image
> build cron job kicks off.  I have no idea what time zone this page is
> showing but it should happen in another day or so, unless Andres is
> around to kick it sooner:
> 
> https://cirrus-ci.com/github/anarazel/pg-vm-images

I did trigger a rebuild of the image just now. Hopefully that'll fix it.

Greetings,

Andres Freund



Re: Docs Build in CI failing with "failed to load external entity"

From: Andres Freund

Hi,

On 2024-10-25 04:14:03 -0400, Andres Freund wrote:
> On 2024-10-25 08:22:42 +0300, Thomas Munro wrote:
> > I wonder if this will magically fix itself when the next CI image
> > build cron job kicks off.  I have no idea what time zone this page is
> > showing but it should happen in another day or so, unless Andres is
> > around to kick it sooner:
> > 
> > https://cirrus-ci.com/github/anarazel/pg-vm-images
> 
> I did trigger a rebuild of the image just now. Hopefully that'll fix it.

It did.

Greetings,

Andres Freund



Re: Docs Build in CI failing with "failed to load external entity"

From: Melanie Plageman

On Fri, Oct 25, 2024 at 4:31 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2024-10-25 04:14:03 -0400, Andres Freund wrote:
> > On 2024-10-25 08:22:42 +0300, Thomas Munro wrote:
> > > I wonder if this will magically fix itself when the next CI image
> > > build cron job kicks off.  I have no idea what time zone this page is
> > > showing but it should happen in another day or so, unless Andres is
> > > around to kick it sooner:
> > >
> > > https://cirrus-ci.com/github/anarazel/pg-vm-images
> >
> > I did trigger a rebuild of the image just now. Hopefully that'll fix it.
>
> It did.

I noticed that CI for my fork of Postgres, which had been failing on
the docs build and, on FreeBSD only, on the injection points test,
started working as expected again this morning. All of this
is a bit of magic to me -- are the CI images you build used by all of
our CIs?

- Melanie



Re: Docs Build in CI failing with "failed to load external entity"

From: Andres Freund

On 2024-10-25 09:34:41 -0400, Melanie Plageman wrote:
> I noticed that CI for my fork of Postgres, which had been failing on
> the docs build and, on FreeBSD only, on the injection points test,
> started working as expected again this morning. All of this
> is a bit of magic to me -- are the CI images you build used by all of
> our CIs?

Yes. Installing the packages every time would be far, far too time-consuming.