Re: Gerbil build farm failure - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: Gerbil build farm failure
Msg-id 20050920174436.GM7630@pervasive.com
In response to Gerbil build farm failure  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Gerbil build farm failure
List pgsql-hackers
On Tue, Sep 20, 2005 at 01:17:10PM -0400, Bruce Momjian wrote:
> I worked with Jim Nasby and we found this is the line that is failing on
> Gerbil in the build farm during initdb: tqual.c, line 844 in 8.0.X
> 
>     if (HeapTupleHeaderGetCmin(tuple) >= snapshot->curcid)
> 
> This particular line was last modified in 2002.  However, this was a
> file that was changed as part of the VACUUM tuple chain commit:
> 
>     revision 1.81.4.2
>     date: 2005/08/25 19:45:01;  author: tgl;  state: Exp;  lines: +7 -4
>     Back-patch fixes for problems with VACUUM destroying t_ctid chains too soon,
>     and with insufficient paranoia in code that follows t_ctid links.
>     This patch covers the 8.0 branch.
> 
> and the date of the commit to 8.0.X corresponds to the date that
> failures started to happen:
> 
>     http://pgbuildfarm.org/cgi-bin/show_history.pl?nm=gerbil&br=REL8_0_STABLE

BTW, I want to point out for others that when initdb dumps core, trying
to get a stack trace out of the initdb binary will probably be useless,
because initdb is just calling other binaries. In this case we had
success with the postgres binary. Had I known this, I would have had this
stack trace available a couple of weeks ago. :(
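
For anyone hitting the same thing, here is a rough sketch of pulling a
backtrace from the core file using the binary that actually crashed. The
function name and the paths in the example are hypothetical, and the
-batch and -ex options assume a reasonably recent gdb:

```shell
# Print a backtrace from a core file, using the binary that really crashed.
# When initdb "dumps core", that binary is usually postgres, not initdb.
core_backtrace() {
    binary=$1   # crashed executable, e.g. the postgres binary
    core=$2     # core file left behind by the crash
    if [ ! -x "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: core_backtrace <crashed-binary> <core-file>" >&2
        return 1
    fi
    # -batch exits after running the commands; -ex bt prints the stack trace.
    gdb -batch -ex bt "$binary" "$core"
}

# Example (both paths are assumptions; adjust for the local install):
#   core_backtrace /usr/local/pgsql/bin/postgres ./core
```

Passing the initdb binary here instead would load the wrong symbols, so
the trace would likely be meaningless.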

http://lnk.nu/developer.postgresql.org/3zx.c is the annotated version of
tqual.c. As Bruce mentioned, the line referenced in the core file probably
isn't the culprit. http://lnk.nu/pgbuildfarm.org/3zz.pl has the list of
files that changed to break gerbil.

Here's the output from gdb:
#0  HeapTupleSatisfiesSnapshot (tuple=0xfe28fc78, snapshot=0xd7, buffer=295) at tqual.c:844
844     tqual.c: No such file or directory.       in tqual.c
(gdb) bt
#0  HeapTupleSatisfiesSnapshot (tuple=0xfe28fc78, snapshot=0xd7, buffer=295) at tqual.c:844
#1  0x0004bdd0 in heap_update ()
#2  0x000ec4b0 in ExecutorRun (queryDesc=0x0, direction=-4198192, count=16) at execMain.c:1592
(gdb)

I'm in the process of trying to get this machine moved someplace where I
could give a developer ssh access. That should hopefully happen by the
end of the week.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

