Thread: Sending out a request for more buildfarm animals?

Sending out a request for more buildfarm animals?

From
Andres Freund
Date:
Hi,

There's very little coverage of non-mainstream platforms/compilers in
the buildfarm atm. Maybe we should send an email on -announce asking for
new ones?
OS-wise there's no coverage for:
* AIX (at all)
* HP-UX (for master at least)
(* Tru64)
(* UnixWare)

Architecture-wise there's no coverage for:
* some ARM architecture variants
* mips
* s390/x
* sparc 32bit
(* s390)
(* alpha)
(* mipsel)
(* M68K)

A couple of those aren't that important (my opinion indicated by ()),
but the other ones really should be covered or desupported.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Noah Misch
Date:
On Fri, May 02, 2014 at 05:04:01PM +0200, Andres Freund wrote:
> There's pretty little coverage of non mainstream platforms/compilers in
> the buildfarm atm. Maybe we should send an email on -announce asking for
> new ones?
> There's no coverage for OS-wise;
> * AIX (at all)
> * HP-UX (for master at least)
> (* Tru64)
> (* UnixWare)
> 
> Architecture wise there's no coverage for:
> * some ARM architecture varians
> * mips
> * s390/x
> * sparc 32bit
> (* s390)
> (* alpha)
> (* mipsel)
> (* M68K)
> 
> A couple of those aren't that important (my opinion indicated by ()),
> but the other ones really should be covered or desupported.

More coverage of non-gcc compilers would be an asset to the buildfarm.

+1 for sending a call for help to -announce.  I agree with your importance
estimates, particularly on the OS side.  -1 for making code-level changes to
"desupport" a platform based on the lack of a buildfarm member, though I don't
mind documentation/advocacy changes on that basis.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: Sending out a request for more buildfarm animals?

From
Andres Freund
Date:
On 2014-05-02 21:07:55 -0400, Noah Misch wrote:
> On Fri, May 02, 2014 at 05:04:01PM +0200, Andres Freund wrote:
> > There's pretty little coverage of non mainstream platforms/compilers in
> > the buildfarm atm. Maybe we should send an email on -announce asking for
> > new ones?
> > There's no coverage for OS-wise;
> > * AIX (at all)
> > * HP-UX (for master at least)
> > (* Tru64)
> > (* UnixWare)
> > 
> > Architecture wise there's no coverage for:
> > * some ARM architecture varians
> > * mips
> > * s390/x
> > * sparc 32bit
> > (* s390)
> > (* alpha)
> > (* mipsel)
> > (* M68K)
> > 
> > A couple of those aren't that important (my opinion indicated by ()),
> > but the other ones really should be covered or desupported.
> 
> More coverage of non-gcc compilers would be an asset to the buildfarm.
> 
> +1 for sending a call for help to -announce.  I agree with your importance
> estimates, particularly on the OS side.  -1 for making code-level changes to
> "desupport" a platform based on the lack of a buildfarm member, though I don't
> mind documentation/advocacy changes on that basis.

I was thinking of changing
http://www.postgresql.org/docs/devel/static/supported-platforms.html to
list untested platforms similar to the way M32R and VAX are
documented. I.e. code exists, but we have no clue whether it works.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Dave Page
Date:
Hamid@EDB; Can you please have someone configure anole to build git
head as well as the other branches? Thanks.

Andres, Andrew; I think the only other gap EDB could fill at the
moment is RHEL6 on Power7 (though we do have a couple of Power8 boxes
on order that should be here pretty soon). Dotterel is building some
branches (including head). I'm not sure what generation of Power CPU
that box has. Bernd?

On Fri, May 2, 2014 at 4:04 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> There's pretty little coverage of non mainstream platforms/compilers in
> the buildfarm atm. Maybe we should send an email on -announce asking for
> new ones?
> There's no coverage for OS-wise;
> * AIX (at all)
> * HP-UX (for master at least)
> (* Tru64)
> (* UnixWare)
>
> Architecture wise there's no coverage for:
> * some ARM architecture varians
> * mips
> * s390/x
> * sparc 32bit
> (* s390)
> (* alpha)
> (* mipsel)
> (* M68K)
>
> A couple of those aren't that important (my opinion indicated by ()),
> but the other ones really should be covered or desupported.
>
> Greetings,
>
> Andres Freund
>
> --
>  Andres Freund                     http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services



-- 
Dave Page
Chief Architect, Tools & Installers
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake



Re: Sending out a request for more buildfarm animals?

From
Bernd Helmle
Date:
It's a POWER 7 machine.

On 3 May 2014 10:31:34 CEST, Dave Page <dave.page@enterprisedb.com> wrote:
> Hamid@EDB; Can you please have someone configure anole to build git
> head as well as the other branches? Thanks.
>
> Andres, Andrew; I think the only other gap EDB could fill at the
> moment is RHEL6 on Power7 (though we do have a couple of Power8 boxes
> on order that should be here pretty soon). Dotterel is building some
> branches (including head). I'm not sure what generation of Power CPU
> that box has. Bernd?
>
> On Fri, May 2, 2014 at 4:04 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Hi,
> >
> > There's pretty little coverage of non mainstream platforms/compilers in
> > the buildfarm atm. Maybe we should send an email on -announce asking for
> > new ones?
> > There's no coverage for OS-wise;
> > * AIX (at all)
> > * HP-UX (for master at least)
> > (* Tru64)
> > (* UnixWare)
> >
> > Architecture wise there's no coverage for:
> > * some ARM architecture varians
> > * mips
> > * s390/x
> > * sparc 32bit
> > (* s390)
> > (* alpha)
> > (* mipsel)
> > (* M68K)
> >
> > A couple of those aren't that important (my opinion indicated by ()),
> > but the other ones really should be covered or desupported.
> >
> > Greetings,
> >
> > Andres Freund
> >
> > --
> >  Andres Freund                     http://www.2ndQuadrant.com/
> >  PostgreSQL Development, 24x7 Support, Training & Services

-- 
This message was sent from my Android mobile phone with K-9 Mail.



Re: Sending out a request for more buildfarm animals?

From
Noah Misch
Date:
On Sat, May 03, 2014 at 10:09:56AM +0200, Andres Freund wrote:
> On 2014-05-02 21:07:55 -0400, Noah Misch wrote:
> > +1 for sending a call for help to -announce.  I agree with your importance
> > estimates, particularly on the OS side.  -1 for making code-level changes to
> > "desupport" a platform based on the lack of a buildfarm member, though I don't
> > mind documentation/advocacy changes on that basis.
> 
> I was thinking of changing
> http://www.postgresql.org/docs/devel/static/supported-platforms.html to
> list untested platforms similar to the way M32R and VAX are
> documented. I.e. code exists, but we have no clue whether it works.

Sounds perfect.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 3.5.2014 03:07, Noah Misch wrote:
> More coverage of non-gcc compilers would be an asset to the buildfarm.

Does that include non-gcc compilers on Linux/x86 platforms?

Magpie is pretty much dedicated to the buildfarm, and it's pretty much
doing nothing most of the time, so running the tests with other
compilers (llvm/icc/...) would be just fine. Not sure how to do that,
though. Should I run the tests with multiple configurations, or should
we have one animal for each config?

Tomas



Re: Sending out a request for more buildfarm animals?

From
Tom Lane
Date:
Tomas Vondra <tv@fuzzy.cz> writes:
> Magpie is pretty much dedicated to the buildfarm, and it's pretty much
> doing nothing most of the time, so running the tests with other
> compilers  (llvm/ic/...) would be just fine. Not sure how to do that,
> though. Should I run the tests with multiple configurations, or should
> we have one animal for each config?

I believe the intent is one animal name per configuration.
        regards, tom lane



Re: Sending out a request for more buildfarm animals?

From
Andrew Dunstan
Date:
On 05/03/2014 12:42 PM, Tomas Vondra wrote:
> On 3.5.2014 03:07, Noah Misch wrote:
>> More coverage of non-gcc compilers would be an asset to the buildfarm.
> Does that include non-gcc compilers on Linux/x86 platforms?
>
> Magpie is pretty much dedicated to the buildfarm, and it's pretty much
> doing nothing most of the time, so running the tests with other
> compilers  (llvm/ic/...) would be just fine. Not sure how to do that,
> though. Should I run the tests with multiple configurations, or should
> we have one animal for each config?
>

No, don't run with multiple configs. That makes it much harder to see 
where problems come from. One animal per config, please.

cheers

andrew







Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 3.5.2014 19:01, Andrew Dunstan wrote:
> 
> On 05/03/2014 12:42 PM, Tomas Vondra wrote:
>> On 3.5.2014 03:07, Noah Misch wrote:
>>> More coverage of non-gcc compilers would be an asset to the buildfarm.
>> Does that include non-gcc compilers on Linux/x86 platforms?
>>
>> Magpie is pretty much dedicated to the buildfarm, and it's pretty much
>> doing nothing most of the time, so running the tests with other
>> compilers  (llvm/ic/...) would be just fine. Not sure how to do that,
>> though. Should I run the tests with multiple configurations, or should
>> we have one animal for each config?
>>
> 
> No, don't run with multiple configs. That makes it much harder to see
> where problems come from. One animal per config, please.

Yeah, that's what I thought.

I've requested another animal for clang, I'll do the same with the intel
compiler once I get the clang one running.

Are there any other compilers / something else we could run on this box?

regards
Tomas



Re: Sending out a request for more buildfarm animals?

From
Josh Berkus
Date:
Andres,

> There's pretty little coverage of non mainstream platforms/compilers in
> the buildfarm atm. Maybe we should send an email on -announce asking for
> new ones?
> There's no coverage for OS-wise;
> * AIX (at all)
> * HP-UX (for master at least)
> (* Tru64)
> (* UnixWare)

Do we want a SmartOS (opensolaris/Joyent) animal?  Do we already have one?

> Architecture wise there's no coverage for:
> * some ARM architecture varians

I could run a buildfarm animal on a Raspberry Pi if the Postgres
community will replace my flash cards as they burn out.

> * mips
> * s390/x
> * sparc 32bit

Do we really care about sparc 32bit at this point?  You're talking a
10-year-old machine, there.

> (* s390)
> (* alpha)
> (* mipsel)
> (* M68K)
> 
> A couple of those aren't that important (my opinion indicated by ()),
> but the other ones really should be covered or desupported.
> 
> Greetings,
> 
> Andres Freund
> 


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Sending out a request for more buildfarm animals?

From
Magnus Hagander
Date:

On Sun, May 4, 2014 at 9:35 PM, Josh Berkus <josh@agliodbs.com> wrote:
> > Architecture wise there's no coverage for:
> > * some ARM architecture varians
>
> I could run a buildfarm animal on a Raspberry Pi if the Postgres
> community will replace my flash cards as they burn out.

Heikki already does that - it's chipmunk.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: Sending out a request for more buildfarm animals?

From
Heikki Linnakangas
Date:
On 05/04/2014 11:13 PM, Magnus Hagander wrote:
> On Sun, May 4, 2014 at 9:35 PM, Josh Berkus <josh@agliodbs.com> wrote:
>
>>> Architecture wise there's no coverage for:
>>> * some ARM architecture varians
>>
>> I could run a buildfarm animal on a Raspberry Pi if the Postgres
>> community will replace my flash cards as they burn out.
>
> Heikki already does that - it's chipmunk.

Michael Paquier's hamster is also a Raspberry Pi.

- Heikki



Re: Sending out a request for more buildfarm animals?

From
Michael Paquier
Date:
On Mon, May 5, 2014 at 3:40 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 05/04/2014 11:13 PM, Magnus Hagander wrote:
>>
>> On Sun, May 4, 2014 at 9:35 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>
>>>> Architecture wise there's no coverage for:
>>>> * some ARM architecture varians
>>>
>>>
>>> I could run a buildfarm animal on a Raspberry Pi if the Postgres
>>> community will replace my flash cards as they burn out.
>>
>>
>> Heikki already does that - it's chipmunk.
>
>
> Michael Paquier's hamster is also a Raspberry Pi.

Yep, hamster runs Arch Linux on the Pi; Heikki's machine uses Raspbian.
-- 
Michael



Re: Sending out a request for more buildfarm animals?

From
Andres Freund
Date:
Hi,

On 2014-05-04 12:35:44 -0700, Josh Berkus wrote:
> > There's pretty little coverage of non mainstream platforms/compilers in
> > the buildfarm atm. Maybe we should send an email on -announce asking for
> > new ones?
> > There's no coverage for OS-wise;
> > * AIX (at all)
> > * HP-UX (for master at least)
> > (* Tru64)
> > (* UnixWare)
> 
> Do we want a SmartOS (opensolaris/Joyent) animal?  Do we already have one?

I don't think so. We only have solaris 10 afaics.

> > * mips
> > * s390/x
> > * sparc 32bit
> 
> Do we really care about sparc 32bit at this point?  You're talking a
> 10-year-old machine, there.

I personally don't really, but the last time it came up significant
parts of the community were of the opposite opinion. And I'd rather have it
tested and actually supported than supposedly supported.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Alvaro Herrera
Date:
Andres Freund wrote:

> > > * sparc 32bit
> > 
> > Do we really care about sparc 32bit at this point?  You're talking a
> > 10-year-old machine, there.
> 
> I personally don't really, but the last time it came up significant
> parts of community opinionated the other way. And I'd rather have it
> tested and actually supported than supposedly supported.

The thing is that a machine as weird as Sparc uncovers strange failures
that don't show up in other architectures.  For instance, spoonbill
(sparc64 IIRC) has turned up rather interesting numbers of problems.
I think it's useful to have such a thing in the buildfarm.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 4.5.2014 21:29, Tomas Vondra wrote:
> On 3.5.2014 19:01, Andrew Dunstan wrote:
>>
>> On 05/03/2014 12:42 PM, Tomas Vondra wrote:
>>> On 3.5.2014 03:07, Noah Misch wrote:
>>>> More coverage of non-gcc compilers would be an asset to the buildfarm.
>>> Does that include non-gcc compilers on Linux/x86 platforms?
>>>
>>> Magpie is pretty much dedicated to the buildfarm, and it's pretty much
>>> doing nothing most of the time, so running the tests with other
>>> compilers  (llvm/ic/...) would be just fine. Not sure how to do that,
>>> though. Should I run the tests with multiple configurations, or should
>>> we have one animal for each config?
>>>
>>
>> No, don't run with multiple configs. That makes it much harder to see
>> where problems come from. One animal per config, please.
> 
> Yeah, that's what I thought.
> 
> I've requested another animal for clang, I'll do the same with the intel
> compiler once I get the clang one running.
> 
> Are there any other compilers / something else we could run on this box?

OK, both new animals are up and apparently running - treepie for clang,
fulmar for icc.

I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
some time ago, so I went and enabled that on all three animals. Let's
see how long that will take.

I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
and CLOBBER_CACHE_RECURSIVELY. Would it be a good idea to enable these
as well?

The time requirements will be much higher (especially for the
RECURSIVELY option), but running that once a week shouldn't be a big
deal - the machine is pretty much dedicated to the buildfarm.

regards
Tomas



Re: Sending out a request for more buildfarm animals?

From
Tom Lane
Date:
Tomas Vondra <tv@fuzzy.cz> writes:
> I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
> some time ago, so I went and enabled that on all three animals. Let's
> see how long that will take.

> I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
> and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
> as well?

> The time requirements will be much higher (especially for the
> RECURSIVELY option), but running that once a week shouldn't be a big
> deal - the machine is pretty much dedicated to the buildfarm.

I've never had the patience to run the regression tests to completion with 
CLOBBER_CACHE_RECURSIVELY at all, let alone do it on a regular basis.
(I wonder if there's some easy way to run it for just a few regression
tests...)

I think testing CLOBBER_FREED_MEMORY would be sensible though.
        regards, tom lane



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 6.5.2014 22:24, Tom Lane wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>> I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
>> some time ago, so I went and enabled that on all three animals. Let's
>> see how long that will take.
> 
>> I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
>> and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
>> as well?
> 
>> The time requirements will be much higher (especially for the
>> RECURSIVELY option), but running that once a week shouldn't be a big
>> deal - the machine is pretty much dedicated to the buildfarm.
> 
> I've never had the patience to run the regression tests to completion
> with CLOBBER_CACHE_RECURSIVELY at all, let alone do it on a regular 
> basis. (I wonder if there's some easy way to run it for just a few 
> regression tests...)

Now, that's a challenge ;-)

> 
> I think testing CLOBBER_FREED_MEMORY would be sensible though.

OK, I've enabled this for now.

Tomas



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 6.5.2014 23:01, Tomas Vondra wrote:
> On 6.5.2014 22:24, Tom Lane wrote:
>> Tomas Vondra <tv@fuzzy.cz> writes:
>>> I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
>>> some time ago, so I went and enabled that on all three animals. Let's
>>> see how long that will take.
>>
>>> I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
>>> and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
>>> as well?
>>
>>> The time requirements will be much higher (especially for the
>>> RECURSIVELY option), but running that once a week shouldn't be a big
>>> deal - the machine is pretty much dedicated to the buildfarm.
>>
>> I've never had the patience to run the regression tests to completion
>> with CLOBBER_CACHE_RECURSIVELY at all, let alone do it on a regular 
>> basis. (I wonder if there's some easy way to run it for just a few 
>> regression tests...)
> 
> Now, that's a challenge ;-)
> 
>>
>> I think testing CLOBBER_FREED_MEMORY would be sensible though.
> 
> OK, I've enabled this for now.

Hmmmm, with CLOBBER_CACHE_ALWAYS + CLOBBER_FREED_MEMORY the tests take
~20h on a single branch/animal. With a single locale (e.g. "C") it would
take ~4h, but we're testing a bunch of additional czech/slovak locales.

The tests are running in sequence (magpie->treepie->fulmar) so with all
6 branches, this would take ~14 days to complete. I don't mind if the
machine is running tests 100% of the time, that's why it's in the buildfarm,
but I'd rather see the failures soon after the commit (and two weeks is
well over the "soon" edge, IMHO).

So I'm thinking about how to improve this. I'd like to keep the options
for all the branches (e.g. not just HEAD, as a few other animals do).
But I'm thinking about running the tests in parallel, somehow - the
machine has 4 cores, and most of the time only one of them is used. I
don't expect a perfect ~3x speedup, but getting ~2x would be nice.

Any recommendations on how to do that? I see there's 'base_port' in the
config - is it enough to tweak this, or do I need to run the animals
separately using e.g. lxc?

regards
Tomas
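
A quick sanity check of the runtime estimate in the message above, as a minimal sketch. It assumes the figures quoted there (~20h per branch with all locales, ~4h with the "C" locale only, 6 branches, 3 animals run strictly in sequence). The function name and the `speedup` parameter are hypothetical, introduced only for illustration.

```python
HOURS_PER_DAY = 24


def full_cycle_days(hours_per_run, branches, animals, speedup=1.0):
    """Days needed to run every branch on every animal back to back.

    speedup is a hypothetical factor for running builds in parallel
    (1.0 means strictly sequential).
    """
    total_hours = hours_per_run * branches * animals
    return total_hours / speedup / HOURS_PER_DAY


# All locales, sequential: roughly the "~14 days" mentioned above.
all_locales = full_cycle_days(20, branches=6, animals=3)
# Single "C" locale, sequential.
c_only = full_cycle_days(4, branches=6, animals=3)
# All locales with the hoped-for ~2x gain from parallel runs.
parallel_2x = full_cycle_days(20, branches=6, animals=3, speedup=2.0)

print(f"all locales:  {all_locales:.1f} days")
print(f"C only:       {c_only:.1f} days")
print(f"2x parallel:  {parallel_2x:.1f} days")
```

Under these assumptions, dropping the extra locales shortens the cycle far more than a ~2x parallel speedup would.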



Re: Sending out a request for more buildfarm animals?

From
Alvaro Herrera
Date:
Tomas Vondra wrote:

> Hmmmm, with CLOBBER_CACHE_ALWAYS + CLOBBER_FREED_MEMORY the tests take
> ~20h on a single branch/animal. With a single locale (e.g. "C") it would
> take ~4h, but we're testing a bunch of additional czech/slovak locales.
> 
> The tests are running in sequence (magpie->treepie->fulmar) so with all
> 6 branches, this would take ~14 days to complete. I don't mind the
> machine is running tests 100% of the time, that's why it's in buildfarm,
> but I'd rather see the failures soon after the commit (and two weeks is
> well over the "soon" edge, IMHO).
> 
> So I'm thinking about how to improve this. I'd like to keep the options
> for all the branches (e.g. not just HEAD, as a few other animals do).

I think doing the CLOBBER_CACHE_ALWAYS + CLOBBER_FREED_MEMORY runs in a
separate set of animals that does a single locale is enough.  It's
pretty dubious that there is any point in doing the CLOBBER stuff in
multiple locales.

Perhaps run these in HEAD only once a day, and do the other branches
once a week, or something like that.

If you add a seventh animal that does the recursive clobber thingy,
running say once a month, that could be useful too.  Since you have
spare CPUs, it shouldn't matter if it takes a whole week to run.

The advantage of doing this in separate animals is that you can run them
in parallel with your regular non-CLOBBER animals (which would run more
frequently.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
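
The schedule suggested above can be roughly budgeted as follows. This is only a sketch: it assumes every single-locale CLOBBER run takes the ~4h Tomas quoted, that there are 3 compilers, and that HEAD runs daily while the 5 other branches run weekly. The recursive clobber animal is left out because no timing for it is given in the thread. The function name is hypothetical.

```python
HOURS_PER_WEEK = 7 * 24


def weekly_clobber_hours(hours_per_run, compilers, back_branches):
    """CPU hours per week for HEAD daily plus back branches weekly."""
    head = hours_per_run * compilers * 7               # HEAD once a day
    back = hours_per_run * compilers * back_branches   # others once a week
    return head + back


hours = weekly_clobber_hours(4, compilers=3, back_branches=5)
utilization = hours / HOURS_PER_WEEK

print(f"{hours} CPU hours per week, {utilization:.0%} of one core")
```

If the assumed ~4h figure holds, the CLOBBER animals alone would keep one core busy most of the week, which is why running them next to the regular animals on spare cores matters.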



Re: Sending out a request for more buildfarm animals?

From
Andrew Dunstan
Date:


On 05/08/2014 12:21 PM, Tomas Vondra wrote:
> On 6.5.2014 23:01, Tomas Vondra wrote:
>> On 6.5.2014 22:24, Tom Lane wrote:
>>> Tomas Vondra <tv@fuzzy.cz> writes:
>>>> I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
>>>> some time ago, so I went and enabled that on all three animals. Let's
>>>> see how long that will take.
>>>> I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
>>>> and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
>>>> as well?
>>>> The time requirements will be much higher (especially for the
>>>> RECURSIVELY option), but running that once a week shouldn't be a big
>>>> deal - the machine is pretty much dedicated to the buildfarm.
>>> I've never had the patience to run the regression tests to completion
>>> with CLOBBER_CACHE_RECURSIVELY at all, let alone do it on a regular
>>> basis. (I wonder if there's some easy way to run it for just a few
>>> regression tests...)
>> Now, that's a challenge ;-)
>>
>>> I think testing CLOBBER_FREED_MEMORY would be sensible though.
>> OK, I've enabled this for now.
> Hmmmm, with CLOBBER_CACHE_ALWAYS + CLOBBER_FREED_MEMORY the tests take
> ~20h on a single branch/animal. With a single locale (e.g. "C") it would
> take ~4h, but we're testing a bunch of additional czech/slovak locales.
>
> The tests are running in sequence (magpie->treepie->fulmar) so with all
> 6 branches, this would take ~14 days to complete. I don't mind the
> machine is running tests 100% of the time, that's why it's in buildfarm,
> but I'd rather see the failures soon after the commit (and two weeks is
> well over the "soon" edge, IMHO).
>
> So I'm thinking about how to improve this. I'd like to keep the options
> for all the branches (e.g. not just HEAD, as a few other animals do).
> But I'm thinking about running the tests in parallel, somehow - the
> machine has 4 cores, and most of the time only one of them is used. I
> don't expect a perfect ~3x speedup, but getting ~2x would be nice.
>
> Any recommendations how to do that? I see there's 'base_port' in the
> config - is it enough to tweak this, or do I need to run separate the
> animals using e.g. lxc?


Here is what I do on my FreeBSD VM. I have 2 animals, nightjar and 
friarbird. They have the same buildroot. friarbird is set up to build 
with CLOBBER_CACHE_ALWAYS, building just HEAD and just testing C locale; 
nightjar builds all branches we are interested in and tests locale 
cs_CZ.utf8 in addition to C.

Other than those differences they are pretty similar.

Here is the crontab that drives them:
    27 5-22 * * * cd bf && ./run_branches.pl --run-all --verbose --config=nightjarx.conf >> bf.out 2>&1
    20 0 * * * cd bf && ./run_branches.pl --run-all --verbose --config=friarbird.conf --skip-steps=install-check >> bf.out 2>&1


The buildfarm code has enough locking smarts to make sure we don't get 
any build collisions doing this.

If you have an animal to do a special type of build (e.g. CLOBBER_foo) 
then it's probably a good idea to set a note for that animal - see the 
buildfarm program setnotes.pl. friarbird has the note set "Uses 
-DCLOBBER_CACHE_ALWAYS".


If you want to do this in parallel, then you will need different 
buildroots and different base ports for each animal. I would not run the 
same animal on different branches concurrently, that is quite likely to 
end up in port collisions.


HTH.

cheers

andrew
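
The advice above about parallel animals (different buildroots and different base ports) can be checked mechanically before starting them. The following is a hypothetical helper, not part of the buildfarm client: the paths, port numbers, and the `min_port_gap` safety margin are all assumptions, since the thread does not say how many ports each animal uses above its base port.

```python
from itertools import combinations


def find_conflicts(animals, min_port_gap=100):
    """Return a list of problems for animals meant to run concurrently.

    animals maps an animal name to (buildroot, base_port).
    min_port_gap is an assumed safety margin between base ports.
    """
    problems = []
    pairs = combinations(sorted(animals.items()), 2)
    for (a, (root_a, port_a)), (b, (root_b, port_b)) in pairs:
        if root_a == root_b:
            problems.append(f"{a} and {b} share buildroot {root_a}")
        if abs(port_a - port_b) < min_port_gap:
            problems.append(
                f"{a} and {b} have base ports closer than {min_port_gap}"
            )
    return problems


# Hypothetical setup with a deliberate mistake in the last entry.
animals = {
    "magpie": ("/home/bf/magpie", 5678),
    "treepie": ("/home/bf/treepie", 5778),
    "fulmar": ("/home/bf/treepie", 5800),
}

conflicts = find_conflicts(animals)
for line in conflicts:
    print(line)
```

A shared buildroot is only a problem for animals that run at the same time; nightjar and friarbird share one safely because the client's locking keeps them from overlapping.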





Re: Sending out a request for more buildfarm animals?

From
Alvaro Herrera
Date:
Andrew Dunstan wrote:

> Here is what I do on my FreeBSD VM. I have 2 animals, nightjar and
> friarbird. They have the same buildroot. friarbird is set up to
> build with CLOBBER_CACHE_ALWAYS, building just HEAD and just testing
> C locale; nightjar builds all branches we are interested in and
> tests locale cs_CZ.utf8 in addition to C.

So nightjar would build frequently almost all the time, but as soon as
friarbird is doing a CLOBBER run nightjar would just stop running?  That
seems a bit odd, given that the CLOBBER runs take a lot longer than
non-CLOBBER ones.  (I guess it makes sense if you don't want to devote
double CPU time now and then to running the CLOBBER animal, but other
than that there doesn't seem to be a point to setting it up like that.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Andrew Dunstan
Date:
On 05/08/2014 04:09 PM, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>> Here is what I do on my FreeBSD VM. I have 2 animals, nightjar and
>> friarbird. They have the same buildroot. friarbird is set up to
>> build with CLOBBER_CACHE_ALWAYS, building just HEAD and just testing
>> C locale; nightjar builds all branches we are interested in and
>> tests locale cs_CZ.utf8 in addition to C.
> So nightjar would build frequently almost all the time, but as soon as
> friarbird is doing a CLOBBER run nightjar would just stop running?  That
> seems a bit odd, given that the CLOBBER runs take a lot longer than
> non-CLOBBER ones.  (I guess it makes sense if you don't want to devote
> double CPU time now and then to running the CLOBBER animal, but other
> than that there doesn't seem to be a point to setting it up like that.)
>


Why? This was actually discussed when I set this up and Tom opined that 
a once a day run with CLOBBER_CACHE_ALWAYS was plenty. It takes about 4 
/12 hours. The rest of the time nightjar runs. friarbird runs a bit 
after midnight US East Coast time, which is generally a slowish time for 
commits, so not running nightjar at that time seems perfectly reasonable.

I really don't get what your objection to the setup is. And no, I don't 
want them to run concurrently, I'd rather spread out the cycles.

cheers

andrew



Re: Sending out a request for more buildfarm animals?

From
Andrew Dunstan
Date:
On 05/08/2014 04:54 PM, Andrew Dunstan wrote:
>
>
>
> Why? This was actually discussed when I set this up and Tom opined 
> that a once a day run with CLOBBER_CACHE_ALWAYS was plenty. It takes 
> about 4 /12 hours. The rest of the time nightjar runs. friarbird runs 
> a bit after midnight US East Coast time, which is generally a slowish 
> time for commits, so not running nightjar at that time seems perfectly 
> reasonable.
>
>

er, that's 4 1/2 hours.

cheers

andrew




Re: Sending out a request for more buildfarm animals?

From
Alvaro Herrera
Date:
Andrew Dunstan wrote:

> I really don't get what your objection to the setup is. And no, I
> don't want them to run concurrently, I'd rather spread out the
> cycles.

I wasn't objecting, merely making an observation.  Note that Tomas mentioned
he's okay with running 4 builds at once.  My main point here, really, is
that having a larger number of animals shouldn't be an impediment for a
more complex permutation of configurations, if he's okay with doing
that.  I assume you wouldn't object to my approving four extra animals
running on the same machine, if Tomas wants to go for that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Andrew Dunstan
Date:
On 05/08/2014 05:21 PM, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>> I really don't get what your objection to the setup is. And no, I
>> don't want them to run concurrently, I'd rather spread out the
>> cycles.
> I wasn't objecting, merely an observation.  Note that Tomas mentioned
> he's okay with running 4 builds at once.  My main point here, really, is
> that having a larger number of animals shouldn't be an impediment for a
> more complex permutation of configurations, if he's okay with doing
> that.  I assume you wouldn't object to my approving four extra animals
> running on the same machine, if Tomas wants to go for that.
>


No, that's fine.

cheers

andrew



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 8.5.2014 23:48, Andrew Dunstan wrote:
> 
> On 05/08/2014 05:21 PM, Alvaro Herrera wrote:
>> Andrew Dunstan wrote:
>>
>>> I really don't get what your objection to the setup is. And no, I
>>> don't want them to run concurrently, I'd rather spread out the
>>> cycles.
>> I wasn't objecting, merely an observation. Note that Tomas
>> mentioned he's okay with running 4 builds at once. My main point
>> here, really, is that having a larger number of animals shouldn't
>> be an impediment for a more complex permutation of configurations,
>> if he's okay with doing that. I assume you wouldn't object to my
>> approving four extra animals running on the same machine, if Tomas
>> wants to go for that.

So, if I get this right, the proposal is to have 7 animals:


1) all branches/locales, frequent builds (every few hours)
  magpie  - gcc
  fulmar  - icc
  treepie - clang

2) single branch/locale, CLOBBER, built once a week
  magpie2 - gcc
  fulmar2 - icc
  treepie - clang

3) single branch/locale, recursive CLOBBER, built once a month


I don't particularly mind the number of animals, although I was shooting
for a lower number.

The only question is - should we use 3 animals for the recursive CLOBBER
too? I mean, one for each compiler?

regards
Tomas



Re: Sending out a request for more buildfarm animals?

From
Alvaro Herrera
Date:
Tomas Vondra wrote:

> So, if I get this right, the proposal is to have 7 animals:

It's your machine, so you decide what you want.  I'm only throwing out
some ideas.

> 1) all branches/locales, frequent builds (every few hours)
>   magpie  - gcc
>   fulmar  - icc
>   treepie - clang
> 
> 2) single branch/locale, CLOBBER, built once a week
>   magpie2 - gcc
>   fulmar2 - icc
>   treepie - clang
> 
> 3) single branch/locale, recursive CLOBBER, built once a month

Check.  Not those "2" names though.

> I don't particularly mind the number of animals, although I was shooting
> for lower number.

Consider that if the recursive clobber fails, we don't want that failure
to appear "diluted" among many successes of runs using the same animal
with non-recursive clobber.

> The only question is - should we use 3 animals for the recursive CLOBBER
> too? I mean, one for each compiler?

I guess it depends how likely we think that a different compiler will
change the behavior of the shared invalidation queue.  Somebody else
would have to answer that.  If not, then clearly we need only 5 animals.
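
The totals mentioned here (5, 7, or more animals) follow from which build modes get one animal per compiler. A minimal sketch of that arithmetic, assuming the three compilers and three modes from the quoted proposal; the function name and mode labels are made up for illustration and are not part of the buildfarm:

```python
COMPILERS = ("gcc", "icc", "clang")
MODES = ("plain", "clobber", "recursive clobber")


def animal_count(per_compiler_modes):
    """Count animals when modes in per_compiler_modes get one animal per
    compiler and every other mode gets a single shared animal."""
    return sum(
        len(COMPILERS) if mode in per_compiler_modes else 1
        for mode in MODES
    )


# Only the plain builds are split by compiler.
print(animal_count({"plain"}))
# Plain and CLOBBER builds are split by compiler.
print(animal_count({"plain", "clobber"}))
# Every mode is split by compiler.
print(animal_count(set(MODES)))
```

The three calls correspond to 5 animals, the 7 in the proposal, and 9 if the recursive CLOBBER builds also run once per compiler.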

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Robert Haas
Date:
On Fri, May 9, 2014 at 11:18 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> I guess it depends how likely we think that a different compiler will
> change the behavior of the shared invalidation queue.  Somebody else
> would have to answer that.  If not, then clearly we need only 5 animals.

This may be heresy, but one of the things that drives me nuts about
the buildfarm is that the names of the animals are all weird stuff
that I've never heard of, and things on the same machine have
completely unrelated names.  Would it be crazy to think we might name
all of these animals in some way that lets people associate them with
each other?  e.g. brownbear, blackbear, polarbear, grizzlybear,
teddybear?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Sending out a request for more buildfarm animals?

From
Alvaro Herrera
Date:
Robert Haas wrote:
> On Fri, May 9, 2014 at 11:18 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > I guess it depends how likely we think that a different compiler will
> > change the behavior of the shared invalidation queue.  Somebody else
> > would have to answer that.  If not, then clearly we need only 5 animals.
> 
> This may be heresy, but one of the things that drives me nuts about
> the buildfarm is that the names of the animals are all weird stuff
> that I've never heard of, and things on the same machine have
> completely unrelated names.  Would it be crazy to think we might name
> all of these animals in some way that lets people associate them with
> each other?  e.g. brownbear, blackbear, polarbear, grizzlybear,
> teddybear?

Sure.  I guess it'd be better if people announced somewhere their intention
to create many animals, so that we know to pick related names.
Right now the interface to requesting a new animal is 100% focused on an
individual animal.  Someone had several animals that were all moths, for
instance, IIRC.

Should we consider renaming Tomas' recent animals?  Not sure that this
would reduce confusion, and it might be heresy as well.  Andrew?

Would it help if the buildfarm page had pics of each animal next to its
name, or something like that?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Robert Haas
Date:
On Fri, May 9, 2014 at 11:32 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> On Fri, May 9, 2014 at 11:18 AM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>> > I guess it depends how likely we think that a different compiler will
>> > change the behavior of the shared invalidation queue.  Somebody else
>> > would have to answer that.  If not, then clearly we need only 5 animals.
>>
>> This may be heresy, but one of the things that drives me nuts about
>> the buildfarm is that the names of the animals are all weird stuff
>> that I've never heard of, and things on the same machine have
>> completely unrelated names.  Would it be crazy to think we might name
>> all of these animals in some way that lets people associate them with
>> each other?  e.g. brownbear, blackbear, polarbear, grizzlybear,
>> teddybear?
>
> Sure.  I guess it'd be better that people notify somewhere the intention
> to create many animals, somehow, so that we know to pick related names.
> Right now the interface to requesting a new animal is 100% focused on an
> individual animal.  Someone had several animals that were all moths, for
> instance, IIRC.
>
> Should we consider renaming Tomas' recent animals?  Not sure that this
> would reduce confusion, and it might be heresy as well.  Andrew?
>
> Would it help if the buildfarm page had pics of each animal next to its
> name, or something like that?

I'm not sure how helpful pictures would really be, but I bet I'd have
more *fun* looking at the buildfarm status page.  :-)

I don't know that I have all the answers as to what would really be
best here.  If we were starting over I think a taxonomy might be more
useful than what we have today - e.g. mammals for Linux, avians for
BSD-derived systems, reptiles for other System V-derived systems, and
invertebrates for Windows.  But it's surely not worth renaming
everything now.  Some easy way to group things on the same actual
system might be worthwhile, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Sending out a request for more buildfarm animals?

From
Andrew Dunstan
Date:
On 05/09/2014 11:25 AM, Robert Haas wrote:
> On Fri, May 9, 2014 at 11:18 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> I guess it depends how likely we think that a different compiler will
>> change the behavior of the shared invalidation queue.  Somebody else
>> would have to answer that.  If not, then clearly we need only 5 animals.
> This may be heresy, but one of the things that drives me nuts about
> the buildfarm is that the names of the animals are all weird stuff
> that I've never heard of, and things on the same machine have
> completely unrelated names.  Would it be crazy to think we might name
> all of these animals in some way that lets people associate them with
> each other?  e.g. brownbear, blackbear, polarbear, grizzlybear,
> teddybear?
>



I've done that a bit in the past. At one stage all my Windows animals 
were some sort of bat. There's nothing magical about the names. It's 
just a text field and can be whatever we like. I initially started with 
animals because it seemed like a category that was likely to supply a 
virtually endless list of names.

We could maybe use more generic names to start with and then add 
specialized names to extra animals on the same machine. But that's 
really pretty much a hack, and something I would criticize if shown it 
in a client's schema. If we want to be able to group machines on the 
same box then we should have a database table or field that groups them 
cleanly. That's going to require a bit of thought on how to do it with 
minimal disruption.

cheers

andrew




Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 9.5.2014 17:18, Alvaro Herrera wrote:
> Tomas Vondra wrote:
> 
>> So, if I get this right, the proposal is to have 7 animals:
> 
> It's your machine, so you decide what you want.  I'm only throwing
> out some ideas.
> 
>> 1) all branches/locales, frequent builds (every few hours)
>>   magpie  - gcc
>>   fulmar  - icc
>>   treepie - clang
>> 
>> 2) single branch/locale, CLOBBER, built once a week
>>   magpie2 - gcc
>>   fulmar2 - icc
>>   treepie - clang
>> 
>> 3) single branch/locale, recursive CLOBBER, built once a month
> 
> Check.  Not those "2" names though.

Sure. That was just for illustration purposes.

>> I don't particularly mind the number of animals, although I was
>> shooting for lower number.
> 
> Consider that if the recursive clobber fails, we don't want that
> failure to appear "diluted" among many successes of runs using the
> same animal with non-recursive clobber.
> 
>> The only question is - should we use 3 animals for the recursive
>> CLOBBER too? I mean, one for each compiler?
> 
> I guess it depends how likely we think that a different compiler
> will change the behavior of the shared invalidation queue.  Somebody
> else would have to answer that.  If not, then clearly we need only 5
> animals.

Well, I think you're forgetting CLOBBER_FREED_MEMORY - that's not just
about the invalidation queue. And I think we've been bitten by compilers
optimizing out parts of the code before (e.g. because we relied on
undefined behaviour).

regards
Tomas



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 9.5.2014 20:09, Andrew Dunstan wrote:
> 
> I've done that a bit in the past. At one stage all my Windows animals
> were some sort of bat. There's nothing magical about the names. It's
> just a text field and can be whatever we like. I initially started with
> animals because it seemed like a category that was likely to supply a
> virtually endless list of names.
> 
> We could maybe use more generic names to start with and then add
> specialized names to extra animals on the same machine. But that's
> really pretty much a hack, and something I would criticize if shown it
> in a client's schema. If we want to be able to group machines on the
> same box then we should have a database table or field that groups them
> cleanly. That's going to require a bit of thought on how to do it with
> minimal disruption.

I'm not really sure what would be the purpose of this information? I
mean, why do we need to identify the animals running on the same
machine? And what if they run in different VMs on the same hardware?

And I certainly prefer animal names to e.g. animal001 and similar
naming schemes.

regards
Tomas



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 9.5.2014 00:47, Tomas Vondra wrote:
> On 8.5.2014 23:48, Andrew Dunstan wrote:
>>
>> On 05/08/2014 05:21 PM, Alvaro Herrera wrote:
>>> Andrew Dunstan wrote:
>>>
>>>> I really don't get what your objection to the setup is. And no, I
>>>> don't want them to run concurrently, I'd rather spread out the
>>>> cycles.
>>> I wasn't objecting, merely an observation. Note that Tomas
>>> mentioned he's okay with running 4 builds at once. My main point
>>> here, really, is that having a larger number of animals shouldn't
>>> be an impediment for a more complex permutation of configurations,
>>> if he's okay with doing that. I assume you wouldn't object to my
>>> approving four extra animals running on the same machine, if Tomas
>>> wants to go for that.
> 
> So, if I get this right, the proposal is to have 7 animals:
> 
> 
> 1) all branches/locales, frequent builds (every few hours)
>   magpie  - gcc
>   fulmar  - icc
>   treepie - clang
> 
> 2) single branch/locale, CLOBBER, built once a week
>   magpie2 - gcc
>   fulmar2 - icc
>   treepie - clang
> 
> 3) single branch/locale, recursive CLOBBER, built once a month

I've just noticed that the CLOBBER tests completed fine on the first
branch, but failed after the second one with this error:

Query for: stage=OK&animal=magpie&ts=1399599933
Target:
http://www.pgbuildfarm.org/cgi-bin/pgstatus.pl/3c4d6bf5c9ac87be37fcbd0d046f02ae5b39d09e
Status Line: 493 snapshot too old: Wed May  7 04:36:57 2014 GMT
Content:
snapshot to old: Wed May  7 04:36:57 2014 GMT

Now, I assume this happens because the test duration exceeds 24h (or
whatever limit is there), and it probably won't be a problem after
switching to the single branch/locale combination.

But won't that be a problem on the animal running tests with
CLOBBER_CACHE_RECURSIVELY? That's supposed to run much longer, and I
wouldn't be surprised if it exceeds 24h on a single branch/locale.
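
The concern can be made concrete with a small age check. This is only a sketch: the 24-hour limit is the guess made above, not a confirmed server setting, and the function names and the report times are hypothetical. Only the snapshot string comes from the error message.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the real server-side limit is not stated in this thread.
ASSUMED_MAX_AGE = timedelta(hours=24)


def parse_snapshot(text):
    """Parse a timestamp in the form used by the error message."""
    parsed = datetime.strptime(text, "%a %b %d %H:%M:%S %Y GMT")
    return parsed.replace(tzinfo=timezone.utc)


def is_too_old(snapshot, report_time, max_age=ASSUMED_MAX_AGE):
    """Return True if the run is reported later than max_age after the snapshot."""
    return report_time - snapshot > max_age


snapshot = parse_snapshot("Wed May  7 04:36:57 2014 GMT")

# Hypothetical report times: a run that took 2 hours and one that took 30.
print(is_too_old(snapshot, snapshot + timedelta(hours=2)))
print(is_too_old(snapshot, snapshot + timedelta(hours=30)))
```

Under this assumption, any single run that takes longer than the limit would be rejected when it reports, regardless of how many branches it covers.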

regards
Tomas



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 9.5.2014 00:47, Tomas Vondra wrote:
> On 8.5.2014 23:48, Andrew Dunstan wrote:
>>
>> On 05/08/2014 05:21 PM, Alvaro Herrera wrote:
>>> Andrew Dunstan wrote:
>>>
>>>> I really don't get what your objection to the setup is. And no, I
>>>> don't want them to run concurrently, I'd rather spread out the
>>>> cycles.
>>> I wasn't objecting, merely an observation. Note that Tomas
>>> mentioned he's okay with running 4 builds at once. My main point
>>> here, really, is that having a larger number of animals shouldn't
>>> be an impediment for a more complex permutation of configurations,
>>> if he's okay with doing that. I assume you wouldn't object to my
>>> approving four extra animals running on the same machine, if Tomas
>>> wants to go for that.
> 
> So, if I get this right, the proposal is to have 7 animals:
> 
> 
> 1) all branches/locales, frequent builds (every few hours)
>   magpie  - gcc
>   fulmar  - icc
>   treepie - clang
> 
> 2) single branch/locale, CLOBBER, built once a week
>   magpie2 - gcc
>   fulmar2 - icc
>   treepie - clang
> 
> 3) single branch/locale, recursive CLOBBER, built once a month
> 
> 
> I don't particularly mind the number of animals, although I was shooting
> for lower number.
> 
> The only question is - should we use 3 animals for the recursive CLOBBER
> too? I mean, one for each compiler?

OK. I've switched the three original animals (magpie, fulmar, treepie)
back to the original configuration (no clobber, all branches, multiple
locales).

And I've requested 6 more animals - two for each compiler. One set for
tests with basic CLOBBER, one set for recursive CLOBBER.

Each group will run in a separate VM, in a round-robin manner.

regards
Tomas



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 10.5.2014 20:21, Tomas Vondra wrote:
> On 9.5.2014 00:47, Tomas Vondra wrote:
>
>> So, if I get this right, the proposal is to have 7 animals:
>>
>>
>> 1) all branches/locales, frequent builds (every few hours)
>>   magpie  - gcc
>>   fulmar  - icc
>>   treepie - clang
>>
>> 2) single branch/locale, CLOBBER, built once a week
>>   magpie2 - gcc
>>   fulmar2 - icc
>>   treepie - clang
>>
>> 3) single branch/locale, recursive CLOBBER, built once a month
>>
>>
>> I don't particularly mind the number of animals, although I was shooting
>> for lower number.
>>
>> The only question is - should we use 3 animals for the recursive CLOBBER
>> too? I mean, one for each compiler?
> 
> OK. I've switched the three original animals (magpie, fulmar, treepie)
> back to the original configuration (no clobber, all branches, multiple
> locales).
> 
> And I've requested 6 more animals - two for each compiler. One set for
> tests with basic CLOBBER, one set for recursive CLOBBER.

Can someone please approve the animals I've requested a few days ago?
I'm already running the clobber tests with '--nosend --nostatus' and
it's already reporting some errors. Would be nice to get it to the
buildfarm.

regards
Tomas



Re: Sending out a request for more buildfarm animals?

From
Andres Freund
Date:
On 2014-05-13 20:42:16 +0200, Tomas Vondra wrote:
> Can someone please approve the animals I've requested a few days ago?
> I'm already running the clobber tests with '--nosend --nostatus' and
> it's already reporting some errors. Would be nice to get it to the
> buildfarm.

Can you provide some details about those failures until then?

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
"Tomas Vondra"
Date:
On 14 May 2014, 13:51, Andres Freund wrote:
> On 2014-05-13 20:42:16 +0200, Tomas Vondra wrote:
>> Can someone please approve the animals I've requested a few days ago?
>> I'm already running the clobber tests with '--nosend --nostatus' and
>> it's already reporting some errors. Would be nice to get it to the
>> buildfarm.
>
> Can you provide some details about those failures until then?

Sure.

Apparently there's something wrong with 'test-decoding-check':

============== running regression test queries        ==============
test ddl                      ... FAILED
test rewrite                  ... ok
test toast                    ... FAILED
test permissions              ... ok
test decoding_in_xact         ... ok
test binary                   ... ok
============== shutting down postmaster               ==============

The whole logfile is attached, complete logs are available at
http://www.fuzzy.cz/tmp/buildlogs.tgz

This only happens on animals executed with

-DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY -DMEMORY_CONTEXT_CHECKING
-DRANDOMIZE_ALLOCATED_MEMORY -DCLOBBER_CACHE_RECURSIVELY

it does not happen with

CPPFLAGS => '-DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY
-DMEMORY_CONTEXT_CHECKING -DRANDOMIZE_ALLOCATED_MEMORY',

So clearly this is about CLOBBER_CACHE_RECURSIVELY. Also, it fails on all
three animals (one for each compiler - gcc, icc, clang).
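
The comparison of the two flag sets can be written out as a set difference. A minimal sketch using the two flag strings quoted above; the variable and function names are illustrative only:

```python
FAILING_FLAGS = (
    "-DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY -DMEMORY_CONTEXT_CHECKING "
    "-DRANDOMIZE_ALLOCATED_MEMORY -DCLOBBER_CACHE_RECURSIVELY"
)
PASSING_FLAGS = (
    "-DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY "
    "-DMEMORY_CONTEXT_CHECKING -DRANDOMIZE_ALLOCATED_MEMORY"
)


def flags_only_in(first, second):
    """Return the flags present in the first string but not in the second."""
    return set(first.split()) - set(second.split())


# The only flag unique to the failing configuration.
print(flags_only_in(FAILING_FLAGS, PASSING_FLAGS))
```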

regards
Tomas

Re: Sending out a request for more buildfarm animals?

From
Andres Freund
Date:
On 2014-05-14 15:08:08 +0200, Tomas Vondra wrote:
> On 14 May 2014, 13:51, Andres Freund wrote:
> > On 2014-05-13 20:42:16 +0200, Tomas Vondra wrote:
> >> Can someone please approve the animals I've requested a few days ago?
> >> I'm already running the clobber tests with '--nosend --nostatus' and
> >> it's already reporting some errors. Would be nice to get it to the
> >> buildfarm.
> >
> > Can you provide some details about those failures until then?
>
> Sure.

Thanks.

> Apparently there's something wrong with 'test-decoding-check':

Man. I shouldn't have asked... My code. There's some output in there
that's probably triggered by the extraordinarily long runtimes, but
there's definitely something else wrong.
My gut feeling says it's in RelationGetIndexList().

> This only happens on animals executed with
>
> -DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY -DMEMORY_CONTEXT_CHECKING
> -DRANDOMIZE_ALLOCATED_MEMORY -DCLOBBER_CACHE_RECURSIVELY
>
> it does not happen with
>
> CPPFLAGS => '-DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY
> -DMEMORY_CONTEXT_CHECKING -DRANDOMIZE_ALLOCATED_MEMORY',
>
> So clearly this is about CLOBBER_CACHE_RECURSIVELY. Also, it fails on all
> three animals (one for each compiler - gcc, icc, clang).

I tested it with CLOBBER_CACHE_ALWAYS, but not RECURSIVELY... So it's
entirely possible that I've missed something.


Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 13.5.2014 20:42, Tomas Vondra wrote:
> On 10.5.2014 20:21, Tomas Vondra wrote:
>> On 9.5.2014 00:47, Tomas Vondra wrote:
>>
>> And I've requested 6 more animals - two for each compiler. One set for
>> tests with basic CLOBBER, one set for recursive CLOBBER.
> 
> Can someone please approve the animals I've requested a few days ago?
> I'm already running the clobber tests with '--nosend --nostatus' and
> it's already reporting some errors. Would be nice to get it to the
> buildfarm.

So the new animals:

CLOBBER_CACHE_ALWAYS (+ others)
 markhor  gcc
 tick     clang/llvm
 leech    icc

CLOBBER_CACHE_RECURSIVELY (+ others)
 addax    gcc
 mite     clang/llvm
 barnacle icc

The builds trigger every hour, but it may take hours / days to get first
results.

Tomas



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 14.5.2014 15:17, Andres Freund wrote:
> On 2014-05-14 15:08:08 +0200, Tomas Vondra wrote:
>> On 14 May 2014, 13:51, Andres Freund wrote:
>>> On 2014-05-13 20:42:16 +0200, Tomas Vondra wrote:
>>>> Can someone please approve the animals I've requested a few days ago?
>>>> I'm already running the clobber tests with '--nosend --nostatus' and
>>>> it's already reporting some errors. Would be nice to get it to the
>>>> buildfarm.
>>>
>>> Can you provide some details about those failures until then?
>>
>> Sure.
> 
> Thanks.
> 
>> Apparently there's something wrong with 'test-decoding-check':
> 
> Man. I shouldn't have asked... My code. There's some output in there
> that's probably triggered by the extraordinarily long runtimes, but
> there's definitely something else wrong.
> My gut feeling says it's in RelationGetIndexList().

The cache invalidation bug was apparently fixed, but we're still getting
failures (see for example markhor):

http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=markhor&br=HEAD

I see there's a transaction (COMMIT+BEGIN) - is this caused by the
extremely long runtimes?

Tomas



Re: Sending out a request for more buildfarm animals?

From
Andres Freund
Date:
On 2014-05-25 01:02:25 +0200, Tomas Vondra wrote:
> On 14.5.2014 15:17, Andres Freund wrote:
> The cache invalidation bug was apparently fixed, but we're still getting
> failures (see for example markhor):
> 
> http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=markhor&br=HEAD
> 
> I see there's a transaction (COMMIT+BEGIN) - is this caused by the
> extremely long runtimes?

Yes, that's the reason. Normally the test doesn't trigger autovacuum at
all, but if it's running for a *long* time it can. I haven't yet figured
out a good way to deal with that.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-05-25 01:02:25 +0200, Tomas Vondra wrote:
>> The cache invalidation bug was apparently fixed, but we're still getting
>> failures (see for example markhor):
>> http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=markhor&br=HEAD
>> I see there's a transaction (COMMIT+BEGIN) - is this caused by the
>> extremely long runtimes?

> Yes, that's the reason. Normally the test doesn't trigger autovacuum at
> all, but if it's running for a *long* time it can. I haven't yet figured
> out a good way to deal with that.

Any way to make the test print only WAL entries arising from the
foreground transaction?
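
One way to picture such a filter is as a post-processing step over the decoded output. The sketch below is not an existing option or the test's actual mechanism. It assumes the stray transaction shows up as a BEGIN line immediately followed by a COMMIT line, and the sample lines are made up to resemble the test's output:

```python
def drop_empty_transactions(lines):
    """Remove BEGIN/COMMIT pairs that have no change lines between them."""
    kept = []
    for line in lines:
        if line == "COMMIT" and kept and kept[-1] == "BEGIN":
            kept.pop()  # discard the BEGIN of the empty transaction
            continue    # and skip its COMMIT
        kept.append(line)
    return kept


# Hypothetical output: one real transaction, then an empty one.
sample = [
    "BEGIN",
    "table public.t: INSERT: id[integer]:1",
    "COMMIT",
    "BEGIN",
    "COMMIT",
]
for line in drop_empty_transactions(sample):
    print(line)
```

This only hides transactions that produce no decoded changes. It would not help if a background transaction emitted change lines of its own.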

If an autovac run can trigger this failure, then I would think it would
happen sometimes, probabilistically, even when the test runtime wasn't
all that long.  That would be very unhappy-making, eg for packagers
who would like build runs to reliably work the first time.  So I think
this is important even without the desire to run CLOBBER_CACHE regression
tests.

Another idea is to provide a variant "expected" file, but that seems
a bit fragile: if you can get one extra transaction, why not two?
        regards, tom lane



Re: Sending out a request for more buildfarm animals?

From
Andres Freund
Date:
On 2014-05-25 16:58:39 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-05-25 01:02:25 +0200, Tomas Vondra wrote:
> >> The cache invalidation bug was apparently fixed, but we're still getting
> >> failures (see for example markhor):
> >> http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=markhor&br=HEAD
> >> I see there's a transaction (COMMIT+BEGIN) - is this caused by the
> >> extremely long runtimes?
> 
> > Yes, that's the reason. Normally the test doesn't trigger autovacuum at
> > all, but if it's running for a *long* time it can. I haven't yet figured
> > out a good way to deal with that.
> 
> Any way to make the test print only WAL entries arising from the
> foreground transaction?

None that doesn't suck, so far :(. The least bad I can think of is to
just add an xinfo flag for such 'background' transaction commits. Don't
like it much...

> If an autovac run can trigger this failure, then I would think it would
> happen sometimes, probabilistically, even when the test runtime wasn't
> all that long.  That would be very unhappy-making, eg for packagers
> who would like build runs to reliably work the first time.  So I think
> this is important even without the desire to run CLOBBER_CACHE regression
> tests.

Agreed.

> Another idea is to provide a variant "expected" file, but that seems
> a bit fragile: if you can get one extra transaction, why not two?

Yeah, that's not going to work. Autovac's transaction could appear at
different places in the changestream. We probably don't want an expected
file listing all of them.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Sending out a request for more buildfarm animals?

From
Tomas Vondra
Date:
On 15.5.2014 00:41, Tomas Vondra wrote:
> On 13.5.2014 20:42, Tomas Vondra wrote:
>> On 10.5.2014 20:21, Tomas Vondra wrote:
>>> On 9.5.2014 00:47, Tomas Vondra wrote:
>>>
>>> And I've requested 6 more animals - two for each compiler. One set for
>>> tests with basic CLOBBER, one set for recursive CLOBBER.
>>
>> Can someone please approve the animals I've requested a few days ago?
>> I'm already running the clobber tests with '--nosend --nostatus' and
>> it's already reporting some errors. Would be nice to get it to the
>> buildfarm.
> 
> So the new animals:
> 
> CLOBBER_CACHE_ALWAYS (+ others)
>  markhor  gcc
>  tick     clang/llvm
>  leech    icc
> 
> CLOBBER_CACHE_RECURSIVELY (+ others)
>  addax    gcc
>  mite     clang/llvm
>  barnacle icc
> 
> The builds trigger every hour, but it may take hours / days to get first
> results.

Just a quick update from barnacle, running the tests with recursive
CLOBBER. Right now the "top" shows this:

top - 23:16:55 up 43 days, 23:43, 1 user, load average: 3.16, 2.83, 2.61
Tasks:  36 total,   2 running,  34 sleeping,   0 stopped,   0 zombie
Cpu(s): 75.7%us, 7.6%sy, 0.0%ni, 16.4%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:   8059116k total,  7108540k used,   950576k free,   554060k buffers
Swap:  8388600k total,     1600k used,  8387000k free,  4809468k cached
 PID  %CPU    TIME+  COMMAND
7916  99.7 45373:24  postgres: pgbuild regression [local] CREATE INDEX
 426  33.9  0:01.02  postgres: autovacuum worker process   regression
7871   0.3  0:10.79  postgres: wal writer process
   1   0.0  0:00.09  /sbin/init

So, it's running for >1 month without a crash (or obvious memory leaks).
Good ;-)

Anyway, I've been checking the animal regularly and I've seen the
'CREATE INDEX' for a very long time (at least 2 weeks, I guess). Sadly
there's no timestamp in log_line_prefix, so it's difficult to say
exactly which commands took the longest, but the last two commands
logged in postmaster.log are

[537ddad8.1eec:6] LOG:  statement: CREATE UNIQUE INDEX
test_replica_identity_keyab_key ON test_replica_identity (keya, keyb);
[537ddad8.1eec:7] LOG:  statement: CREATE UNIQUE INDEX
test_replica_identity_nonkey ON test_replica_identity (keya, nonkey);
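
Even without a timestamp, the prefix carries some timing information. Assuming the prefix is built from the %c and %l escapes (the animal's actual log_line_prefix is not shown in this thread), the part before the colon is PostgreSQL's session ID: the backend start time and the process ID, both in hexadecimal, separated by a dot. A minimal sketch that decodes it; the function name is illustrative:

```python
from datetime import datetime, timezone


def decode_session_id(session_id):
    """Split a %c session ID into (backend start time in UTC, process ID)."""
    start_hex, pid_hex = session_id.split(".")
    started = datetime.fromtimestamp(int(start_hex, 16), tz=timezone.utc)
    return started, int(pid_hex, 16)


started, pid = decode_session_id("537ddad8.1eec")
print(started.isoformat(), pid)
```

Under that assumption the PID decodes to 7916, which matches the CREATE INDEX backend in the top output, and the start time falls on 22 May 2014 (UTC). This gives the start of the session, not of an individual statement, so it bounds the runtime rather than timing each command.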

regards
Tomas