Re: test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?) - Mailing list pgsql-hackers

From Christoph Berg
Subject Re: test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)
Msg-id 20141124200824.GA22662@msg.df7cb.de
In response to Re: test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: test_shm_mq failing on mips*  (Christoph Berg <cb@df7cb.de>)
List pgsql-hackers
Re: Robert Haas 2014-11-24 <CA+TgmoacnppMdgg4n14U2BjujNDNMOU8xxHhPMvO+0u92ckH+w@mail.gmail.com>
> > https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=mipsel&ver=9.4~rc1-1&stamp=1416547779
> >
> > mips had the problem as well in the past (9.4 beta3):
> >
> > https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=mips&ver=9.4~beta3-3&stamp=1413607370
> > https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=mips&ver=9.4~beta3-1&stamp=1412893135
> 
> For how long was it hung before you killed it?

That was an automated build, killed "after 300 minutes of inactivity".
(I guess that means it didn't output anything on the terminal for that
long, but I've never looked closer into sbuild.)
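
As a rough illustration of what such an inactivity timeout could look
like: the following is a hypothetical Python sketch, not sbuild's actual
implementation. The function name run_with_inactivity_timeout and its
parameters are made up for illustration. It kills a child process once
the process has produced no output for a given number of seconds.

```python
import subprocess
import sys
import threading
import time


def run_with_inactivity_timeout(cmd, idle_limit_s):
    """Run cmd; kill it if it prints nothing for idle_limit_s seconds.

    Hypothetical helper, not taken from sbuild.
    Returns (returncode, killed_for_inactivity, output_lines).
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    last_output = [time.monotonic()]  # updated whenever a line arrives
    killed = [False]
    lines = []

    def watchdog():
        # Poll until the child exits or has been silent for too long.
        while proc.poll() is None:
            if time.monotonic() - last_output[0] > idle_limit_s:
                killed[0] = True
                proc.kill()
                return
            time.sleep(min(idle_limit_s / 10, 1.0))

    watcher = threading.Thread(target=watchdog, daemon=True)
    watcher.start()
    for line in proc.stdout:  # ends at EOF, i.e. once the child is gone
        last_output[0] = time.monotonic()
        lines.append(line.rstrip("\n"))
    proc.wait()
    watcher.join()
    return proc.returncode, killed[0], lines


if __name__ == "__main__":
    # A child that prints once and then hangs, like a stuck test run.
    hang = "import time; print('start', flush=True); time.sleep(60)"
    print(run_with_inactivity_timeout([sys.executable, "-c", hang], 2.0))
```

With a limit of 300 * 60 seconds this would match the "300 minutes of
inactivity" message, under the assumption that inactivity really means
"no terminal output".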

> > The mips beta3 failures eventually went away when the build was done
> > on a different machine. This was the first time the mipsel build was
> > done on this build machine, so it seems the problem might well be
> > caused by some subarchitecture difference.
> 
> Does it fail every time when run on a machine where it fails sometimes?

So far there's a consistent host -> fail-or-not mapping, but most
mips/mipsel build hosts have seen only one attempt that actually got
far enough to run the shm_mq test.

> It might not be related to the subarchitecture difference in any
> particularly interesting way; it could just be a race condition that
> is triggered, or not, depending on the precise timing of things, which
> might vary based on subarchitecture, compiler, running kernel version,
> etc.

Compiler and kernel should mostly be the same on all hosts (gcc from
Debian unstable inside the build chroots and kernel from stable on the
host system). The common bit on the failing hosts is that both mips
and mipsel are registered as "Broadcom BCM91250A aka SWARM", which
hints at a subarch problem. But of course that could just be a
coincidence.

Atm I don't have access to the boxes where it was failing (the builds
succeed on the mips(el) porter hosts available to Debian developers).
I'll see if I can arrange access there and run a test.

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/


