Re: strange buildfarm failures - Mailing list pgsql-hackers

From Tom Lane
Subject Re: strange buildfarm failures
Msg-id 6407.1177538756@sss.pgh.pa.us
In response to Re: strange buildfarm failures  (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>)
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> Stefan Kaltenbrunner wrote:
>> two of my buildfarm members had different but pretty weird looking
>> failures lately:
>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=quagga&dt=2007-04-25%2002:03:03
>> and
>> 
>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=emu&dt=2007-04-24%2014:35:02
>> 
>> any ideas on what might be causing those?

> lionfish just failed too:

> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-04-25%2005:30:09

And had a similar failure a few days ago.  The curious thing is that
what we get in the postmaster log is

LOG:  server process (PID 23405) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes

You would think SIGABRT would come from an assertion failure, but
there's no preceding assertion message in the log.  The other
characteristic of these crashes is that *all* of the failing regression
instances report "terminating connection because of crash of another
server process", which suggests strongly that the crash was in an
autovacuum process (if it were bgwriter or stats collector the
postmaster would've said so).  So I think the recent autovac patches
are at fault.  I spent a bit of time trolling for a spot where the code
might abort() without having printed anything, but didn't find one.
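
To illustrate why a silent abort() is so hard to pin down, here is a minimal sketch (not PostgreSQL code) of a parent that forks a child and reports how it ended. The function name run_child_and_describe and the message wording are assumptions made for this example; the point is only that when the child calls abort() without printing anything, all the parent learns is the signal number.

```python
import os
import resource


def run_child_and_describe(child_fn):
    """Fork, run child_fn in the child, and describe how the child ended.

    Hypothetical helper for illustration; it mimics the style of the
    postmaster log line but is not how the postmaster is implemented.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: disable core files so the demo leaves no debris.
        try:
            resource.setrlimit(resource.RLIMIT_CORE, (0, 0))
        except (ValueError, OSError):
            pass
        try:
            child_fn()
        finally:
            # Reached only if child_fn returns or raises; abort() never returns.
            os._exit(0)

    # Parent process: wait for the child and decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return "process was terminated by signal %d" % os.WTERMSIG(status)
    return "process exited with exit code %d" % os.WEXITSTATUS(status)


if __name__ == "__main__":
    # os.abort() raises SIGABRT in the child without writing any message.
    print(run_child_and_describe(os.abort))
```

The parent sees the same thing whether the abort came from a failed assertion or from a bare abort() call; only a message printed by the child beforehand, or a core dump, tells the two apart.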

If any of the buildfarm owners can get a stack trace from the core dump
of one of these events, it'd be mighty helpful.
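
For anyone who has not pulled a backtrace out of a core file before, something along these lines should work, assuming gdb is installed and the core file was kept. The helper name backtrace_command and both paths are hypothetical placeholders; substitute the postgres executable and core file from the buildfarm member's own install.

```python
import subprocess


def backtrace_command(postgres_binary, core_file):
    """Build a gdb invocation that prints a backtrace and exits.

    -batch makes gdb quit after running its commands; -ex bt runs "bt".
    """
    return ["gdb", "-batch", "-ex", "bt", postgres_binary, core_file]


def print_backtrace(postgres_binary, core_file):
    """Run gdb and return its output as text (requires gdb on PATH)."""
    result = subprocess.run(
        backtrace_command(postgres_binary, core_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(backtrace_command("/path/to/bin/postgres", "/path/to/core")))
```

A binary built with debug symbols gives a far more useful trace than a stripped one.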
        regards, tom lane

