Thread: Fwd: PGBuildfarm member colugos Branch HEAD Status changed from OK to StartDb-C:3 failure

If anyone is interested, I think this failure was accompanied by the following:

Process:         postgres [35159]
Path:            /usr/local/src/build-farm-3.2_llvm/builds/HEAD/inst/bin/postgres
Identifier:      postgres
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  postgres [35036]

Date/Time:       2010-05-19 17:12:19.213 -0600
OS Version:      Mac OS X 10.6.3 (10D573)
Report Version:  6

Interval Since Last Report:          394069 sec
Crashes Since Last Report:           10
Per-App Crashes Since Last Report:   2
Anonymous UUID:                      77053C10-A2B5-4078-A796-5862E233A1AC

Exception Type:  EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Application Specific Information:
abort() called

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   libSystem.B.dylib             0x00007fff88268886 __kill + 10
1   libSystem.B.dylib             0x00007fff88308eae abort + 83
2   postgres                       0x00000001004cd513 errfinish + 1059
3   postgres                       0x000000010008ae1e UpdateControlFile + 382
4   postgres                       0x0000000100091f93 CreateCheckPoint + 659
5   postgres                       0x00000001000917c1 ShutdownXLOG + 193
6   postgres                       0x00000001002fa042 BackgroundWriterMain + 1378
7   postgres                       0x00000001000d0139 AuxiliaryProcessMain + 1993
8   postgres                       0x000000010030c924 StartChildProcess + 356
9   postgres                       0x0000000100309ab9 reaper + 489
10  libSystem.B.dylib             0x00007fff8827a80a _sigtramp + 26
11  libSystem.B.dylib             0x00007fff8825e286 select$DARWIN_EXTSN + 10
12  postgres                       0x00000001003076cc ServerLoop + 364
13  postgres                       0x0000000100306a04 PostmasterMain + 5588
14  postgres                       0x000000010025cd0b main + 667
15  postgres                       0x0000000100000da4 start + 52

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000000  rbx: 0x00007fff5fbff0e0  rcx: 0x00007fff5fbfe548  rdx: 0x0000000000000000
  rdi: 0x0000000000008957  rsi: 0x0000000000000006  rbp: 0x00007fff5fbfe560  rsp: 0x00007fff5fbfe548
   r8: 0x0000000101002890   r9: 0x00007fff5fbfe438  r10: 0x00007fff882648ca  r11: 0x0000000000000202
  r12: 0x0000000000000000  r13: 0x0000000000000000  r14: 0x0000000000000000  r15: 0x0000000000000000
  rip: 0x00007fff88268886  rfl: 0x0000000000000202  cr2: 0x0000000100536050

Binary Images:
       0x100000000 -        0x1006eafef +postgres ??? (???) <20ED8627-555A-780D-6CD7-B7AACBD814EC> /usr/local/src/build-farm-3.2_llvm/builds/HEAD/inst/bin/postgres
       0x100917000 -        0x100a38fff +libxml2.2.dylib 10.7.0 (compatibility 10.0.0) <CC8AA05E-419A-8855-CA51-3E988F6AF074> /opt/local/lib/libxml2.2.dylib
       0x100a69000 -        0x100aaafef +libssl.0.9.8.dylib 0.9.8 (compatibility 0.9.8) <E95CC9C8-7EA2-49DC-8ADB-38ABB54ADCD4> /opt/local/lib/libssl.0.9.8.dylib
       0x100abe000 -        0x100bd0ff7 +libcrypto.0.9.8.dylib 0.9.8 (compatibility 0.9.8) <869559F9-EF7E-94F5-6810-2D4B9163F7A0> /opt/local/lib/libcrypto.0.9.8.dylib
       0x100c33000 -        0x100c47ff7 +libz.1.dylib 1.2.5 (compatibility 1.0.0) <CED4D01F-2054-94F0-E944-962F279BC84C> /opt/local/lib/libz.1.dylib
       0x100c4b000 -        0x100d47ff7 +libiconv.2.dylib 8.0.0 (compatibility 8.0.0) <D674866F-82E0-B1ED-4A97-9B8ED4EE6C3B> /opt/local/lib/libiconv.2.dylib
    0x7fff5fc00000 -     0x7fff5fc3bdef  dyld 132.1 (???) <B633F790-4DDB-53CD-7ACF-2A3682BCEA9F> /usr/lib/dyld
    0x7fff803aa000 -     0x7fff803abff7  com.apple.TrustEvaluationAgent 1.1 (1) <51867586-1C71-AE37-EAAD-535A58DD3550> /System/Library/PrivateFrameworks/TrustEvaluationAgent.framework/Versions/A/TrustEvaluationAgent
    0x7fff81525000 -     0x7fff81537fe7  libsasl2.2.dylib 3.15.0 (compatibility 3.0.0) <76B83C8D-8EFE-4467-0F75-275648AFED97> /usr/lib/libsasl2.2.dylib
    0x7fff81865000 -     0x7fff8189dff7  libssl.0.9.8.dylib 0.9.8 (compatibility 0.9.8) <FAB9687F-0A86-A13F-7644-CE02E71140DB> /usr/lib/libssl.0.9.8.dylib
    0x7fff82491000 -     0x7fff82495ff7  libmathCommon.A.dylib 315.0.0 (compatibility 1.0.0) <95718673-FEEE-B6ED-B127-BCDBDB60D4E5> /usr/lib/system/libmathCommon.A.dylib
    0x7fff8491e000 -     0x7fff84a2dfe7  libcrypto.0.9.8.dylib 0.9.8 (compatibility 0.9.8) <826C2437-F760-E049-1719-9C69A3BAA4B0> /usr/lib/libcrypto.0.9.8.dylib
    0x7fff8549d000 -     0x7fff854befff  libresolv.9.dylib 40.0.0 (compatibility 1.0.0) <1AE68BBB-6536-125C-DE2A-13CA916D0EC4> /usr/lib/libresolv.9.dylib
    0x7fff86bad000 -     0x7fff86beafff  com.apple.LDAPFramework 2.0 (120.1) <16383FF5-0537-6298-73C9-473AEC9C149C> /System/Library/Frameworks/LDAP.framework/Versions/A/LDAP
    0x7fff86e61000 -     0x7fff86e72ff7  libz.1.dylib 1.2.3 (compatibility 1.0.0) <C1154E2E-B1CB-1FAD-77ED-B139BA1AB073> /usr/lib/libz.1.dylib
    0x7fff88219000 -     0x7fff883d8ff7  libSystem.B.dylib 125.0.1 (compatibility 1.0.0) <CB9A4929-61AF-DE71-5635-133E9EC95783> /usr/lib/libSystem.B.dylib
    0x7fffffe00000 -     0x7fffffe01fff  libSystem.B.dylib ??? (???) <CB9A4929-61AF-DE71-5635-133E9EC95783> /usr/lib/libSystem.B.dylib


Begin forwarded message:

From: PG Build Farm <pgbuildfarm-web@hosting-two.commandprompt.com>
Date: May 19, 2010 5:12:14 PM MDT
Subject: PGBuildfarm member colugos Branch HEAD Status changed from OK to StartDb-C:3 failure


The PGBuildfarm member colugos had the following event on branch HEAD:

Status changed from OK to StartDb-C:3 failure

The snapshot timestamp for the build that triggered this notification is: 2010-05-19 18:55:25

The specs of this machine are:
OS:  OS X / 10.6
Arch: x86_64
Comp: LLVM / 4.2.1

For more information, see http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=colugos&br=HEAD


Robert Creager <rsc@logicalchaos.org> writes:
> If anyone is interested, I think this failure was accompanied by the following:
> [ apparent PANIC in UpdateControlFile ]

Hmm, do you have the panic message in the postmaster log?  So far as I
can tell, the postmaster log isn't captured anywhere in the buildfarm
report of this event.  (Which seems like a buildfarm bug.)

Given that polecat has already run a couple of later buildfarm
iterations, I'm not too hopeful that you do have that log file.
Was there any special environment here, like running out of disk space?
        regards, tom lane


Robert Creager <robert@logicalchaos.org> writes:
> On May 20, 2010, at 11:54 AM, Tom Lane wrote:
>> Was there any special environment here, like running out of disk space?

> Not that I'm aware of.  I did empty trash sometime yesterday after noticing I was at around 1 GB of free disk.
> Not sure if that correlates or not.  Maybe on failure the buildfarm should check the disk usage of the disk it's on
> and add that to the report?

I'm just guessing, but it seems like a system-level condition like being
out of disk space would explain concurrent similar failures for both of
your animals.

A quick look at UpdateControlFile shows that it definitely will PANIC
if it gets any sort of failure while trying to write pg_control.
It would be interesting to know what sort of failure it got, but
I'm not going to (ahem) panic about it.  It's annoying though that
the buildfarm script didn't capture the relevant log file in this
particular failure case.  Andrew, can we get that fixed?
        regards, tom lane
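
[Editor's illustration] The behavior described above, where any failure while writing pg_control is treated as fatal, can be sketched roughly as follows. This is not PostgreSQL's code (the real UpdateControlFile is C and reports through its error machinery); the names update_control_file and ControlFilePanic are hypothetical, and an exception stands in for the PANIC/abort() seen in the crash report.

```python
import os


class ControlFilePanic(RuntimeError):
    """Hypothetical stand-in for a PANIC: the caller is expected not to recover."""


def update_control_file(path, data):
    """Write data to path and flush it to disk; escalate any failure.

    Every failure mode (open, short write, write error, fsync error) is
    converted into ControlFilePanic, mirroring the "any sort of failure"
    behavior described in the message above.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    except OSError as exc:
        raise ControlFilePanic("could not open %s: %s" % (path, exc)) from exc
    try:
        written = os.write(fd, data)
        if written != len(data):
            # A short write with no error code is a typical disk-full symptom.
            raise ControlFilePanic(
                "short write to %s: %d of %d bytes" % (path, written, len(data))
            )
        os.fsync(fd)
    except OSError as exc:
        raise ControlFilePanic("could not write %s: %s" % (path, exc)) from exc
    finally:
        os.close(fd)
```

Including the underlying OS error text in the message is the part that matters for a report like this one: it is what would distinguish an out-of-space condition from, say, a permissions problem.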


On May 20, 2010, at 11:54 AM, Tom Lane wrote:

> Robert Creager <rsc@logicalchaos.org> writes:
>> If anyone is interested, I think this failure was accompanied by the following:
>> [ apparent PANIC in UpdateControlFile ]
>
> Hmm, do you have the panic message in the postmaster log?  So far as I
> can tell, the postmaster log isn't captured anywhere in the buildfarm
> report of this event.  (Which seems like a buildfarm bug.)

'Course not.

>
> Given that polecat has already run a couple of later buildfarm
> iterations, I'm not too hopeful that you do have that log file.
> Was there any special environment here, like running out of disk space?

Not that I'm aware of.  I did empty trash sometime yesterday after noticing I was at around 1 GB of free disk.  Not sure
if that correlates or not.  Maybe on failure the buildfarm should check the disk usage of the disk it's on and add that
to the report?

I've updated my OSX clients to keep error builds.

Later,
Rob
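
[Editor's illustration] The suggestion above, recording disk usage when a build fails, might look something like the sketch below. The buildfarm client is not written in Python and has no such hook that this thread confirms; disk_usage_summary and its output format are assumptions made purely for illustration. Only shutil.disk_usage is a real standard-library call.

```python
import shutil


def disk_usage_summary(path="."):
    """Return a one-line summary of the filesystem holding path.

    Hypothetical helper: a failure report could append this line so that
    an out-of-space condition is visible alongside the failing stage.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    gib = 1024 ** 3
    return "disk usage for {}: total={:.1f} GiB used={:.1f} GiB free={:.1f} GiB".format(
        path, usage.total / gib, usage.used / gib, usage.free / gib
    )


if __name__ == "__main__":
    print(disk_usage_summary("."))
```

Capturing this at the moment of failure matters because, as in this case, free space may have changed by the time anyone looks at the machine.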

On Thu, May 20, 2010 2:12 pm, Tom Lane wrote:
> It's annoying though that
> the buildfarm script didn't capture the relevant log file in this
> particular failure case.  Andrew, can we get that fixed?
>


It was captured, but apparently had no new content - not surprising if it
ran out of space.

cheers