Thread: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
[Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Dave Page
Date:
I've been seeing this failure intermittently on Narwhal HEAD, and once on 8.1. Other branches have been OK, as have other animals running on the same physical box. Narwhal-HEAD is run more often than any other builds however.

Anyone have any idea what might be wrong? It seems unlikely to be a hardware issue given that it's the exact same test failures each time.

Regards, Dave.

-------- Original Message --------
Subject: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure
Date: Fri, 20 Apr 2007 13:46:22 -0700 (PDT)
From: PG Build Farm <pgbuildfarm-web@hosting-two.commandprompt.com>
To: pgbuildfarm-status-chngs@pgfoundry.org, pgbuildfarm-status-green@pgfoundry.org

The PGBuildfarm member narwhal had the following event on branch HEAD:

Status changed from OK to InstallCheck failure

The snapshot timestamp for the build that triggered this notification is: 2007-04-20 20:00:01

The specs of this machine are:
OS: Windows Server 2003 R2 / 5.2.3790
Arch: i686
Comp: GCC / 3.4.2 (mingw-special)

For more information, see http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=narwhal&br=HEAD
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Tom Lane
Date:
Dave Page <dpage@postgresql.org> writes:
> I've been seeing this failure intermittently on Narwhal HEAD, and once
> on 8.1. Other branches have been OK, as have other animals running on
> the same physical box. Narwhal-HEAD is run more often than any other
> builds however.

> Anyone have any idea what might be wrong? It seems unlikely to be a
> hardware issue given that it's the exact same test failures each time.

Yeah, I'd been wondering about that too, but have no clue what's up. It seems particularly odd that all the failures are in installcheck not check.

If you want to poke at it, I'd suggest changing the ERROR to PANIC (it's in bufmgr.c) to cause a core dump, run installchecks till you get a panic, and then look around in the dump to see what you can find. It'd be particularly interesting to see what the buffer actually contains. Also you could look at the corresponding page of the disk file (which in theory should be the same as the buffer contents, since this error check is only made just after a read() ...)

regards, tom lane
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Dave Page
Date:
Tom Lane wrote:
> Dave Page <dpage@postgresql.org> writes:
>> I've been seeing this failure intermittently on Narwhal HEAD, and once
>> on 8.1. Other branches have been OK, as have other animals running on
>> the same physical box. Narwhal-HEAD is run more often than any other
>> builds however.
>
>> Anyone have any idea what might be wrong? It seems unlikely to be a
>> hardware issue given that it's the exact same test failures each time.
>
> Yeah, I'd been wondering about that too, but have no clue what's up.
> It seems particularly odd that all the failures are in installcheck
> not check.
>
> If you want to poke at it, I'd suggest changing the ERROR to PANIC
> (it's in bufmgr.c) to cause a core dump, run installchecks till you
> get a panic, and then look around in the dump to see what you can find.
> It'd be particularly interesting to see what the buffer actually
> contains. Also you could look at the corresponding page of the disk
> file (which in theory should be the same as the buffer contents,
> since this error check is only made just after a read() ...)

Hmm, I'll give it a go when I'm back in the office, but bear in mind this is a Mingw build on which debugging is nigh-on impossible.

Regards, Dave.
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Tom Lane
Date:
Dave Page <dpage@postgresql.org> writes:
> Tom Lane wrote:
>> If you want to poke at it, I'd suggest changing the ERROR to PANIC
>> (it's in bufmgr.c) to cause a core dump, run installchecks till you
>> get a panic, and then look around in the dump to see what you can find.
>> It'd be particularly interesting to see what the buffer actually
>> contains. Also you could look at the corresponding page of the disk
>> file (which in theory should be the same as the buffer contents,
>> since this error check is only made just after a read() ...)

> Hmm, I'll give it a go when I'm back in the office, but bear in mind
> this is a Mingw build on which debugging is nigh-on impossible.

I was afraid of that. Well, at least get a dump of page 104 in that index so we can see what's on-disk.

regards, tom lane
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Dave Page
Date:
Tom Lane wrote:
> Dave Page <dpage@postgresql.org> writes:
>> Tom Lane wrote:
>>> If you want to poke at it, I'd suggest changing the ERROR to PANIC
>>> (it's in bufmgr.c) to cause a core dump, run installchecks till you
>>> get a panic, and then look around in the dump to see what you can find.
>>> It'd be particularly interesting to see what the buffer actually
>>> contains. Also you could look at the corresponding page of the disk
>>> file (which in theory should be the same as the buffer contents,
>>> since this error check is only made just after a read() ...)
>
>> Hmm, I'll give it a go when I'm back in the office, but bear in mind
>> this is a Mingw build on which debugging is nigh-on impossible.
>
> I was afraid of that. Well, at least get a dump of page 104 in that
> index so we can see what's on-disk.

Sure - I'll have to try with 8.1/8.2 unless you have a pg_filedump that'll work with -HEAD?

/D
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Tom Lane
Date:
Dave Page <dpage@postgresql.org> writes:
> Tom Lane wrote:
>> I was afraid of that. Well, at least get a dump of page 104 in that
>> index so we can see what's on-disk.

> Sure - I'll have to try with 8.1/8.2 unless you have a pg_filedump
> that'll work with -HEAD?

No, I don't, but a plain hex/ascii dump is probably the best thing anyway, since we know the page header is wrong. So use any old version of pg_filedump with -d switch.

regards, tom lane
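
[Editorial sketch, not part of the original message.] The plain hex/ASCII dump of a single page suggested above can also be produced without pg_filedump. The following is a minimal Python sketch under stated assumptions: the block size is the default 8192 bytes, the page lives in a single relation segment file, and the names `read_block` and `hexdump` are hypothetical helpers made up for illustration, not part of PostgreSQL or pg_filedump.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size


def read_block(path, block_no, block_size=BLOCK_SIZE):
    """Return the raw bytes of block `block_no` from the file at `path`."""
    with open(path, "rb") as f:
        f.seek(block_no * block_size)
        data = f.read(block_size)
    if len(data) != block_size:
        raise ValueError(
            f"block {block_no} is short or missing: got {len(data)} bytes"
        )
    return data


def hexdump(data, base_offset=0, width=16):
    """Format bytes as offset, hex column and printable-ASCII column."""
    pad = width * 3 - 1  # width of a full hex column, used for alignment
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{base_offset + i:08x}  {hex_part:<{pad}}  |{ascii_part}|")
    return "\n".join(lines)


# Usage (the path is a placeholder, not a real file name from this thread):
#   page = read_block("path/to/index/file", 104)
#   print(hexdump(page, base_offset=104 * BLOCK_SIZE))
```

Dumping the raw bytes rather than decoding them is deliberate here: since the page header is already known to be wrong, any tool that tries to interpret the header first may obscure what is actually on disk.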
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
"Zeugswetter Andreas ADI SD"
Date:
> Hmm, I'll give it a go when I'm back in the office, but bear
> in mind this is a Mingw build on which debugging is nigh-on
> impossible.

I use the Snapshot http://prdownloads.sf.net/mingw/gdb-6.3-2.exe?download from sf.net. It has some issues, but it is definitely useable.

Andreas
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Dave Page
Date:
Zeugswetter Andreas ADI SD wrote:
>> Hmm, I'll give it a go when I'm back in the office, but bear
>> in mind this is a Mingw build on which debugging is nigh-on
>> impossible.
>
> I use the Snapshot
> http://prdownloads.sf.net/mingw/gdb-6.3-2.exe?download from sf.net.
> It has some issues, but it is definitely useable.

I'll give it a go - thanks.

Regards, Dave.
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Dave Page
Date:
Dave Page wrote:
>> If you want to poke at it, I'd suggest changing the ERROR to PANIC
>> (it's in bufmgr.c) to cause a core dump, run installchecks till you
>> get a panic, and then look around in the dump to see what you can find.

Well, in typical fashion after 25+ runs this morning there's not a failure in sight :-(. I'll keep trying this afternoon, but in case that doesn't work, I've tweaked my buildfarm config to leave error trees in place so maybe we can catch it that way (though that'll be without the PANIC of course).

Regards, Dave
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Tom Lane
Date:
Dave Page <dpage@postgresql.org> writes:
> I've been seeing this failure intermittently on Narwhal HEAD, and once
> on 8.1. Other branches have been OK, as have other animals running on
> the same physical box. Narwhal-HEAD is run more often than any other
> builds however.

Oh, this is interesting:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=baiji&dt=2007-04-26%2022:00:02

Different compiler, different OS, not quite the same block number (109, whereas IIRC all the previous examples have complained of block 104). Is this the same physical machine as narwhal?

regards, tom lane
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Dave Page
Date:
Tom Lane wrote:
> Dave Page <dpage@postgresql.org> writes:
>> I've been seeing this failure intermittently on Narwhal HEAD, and once
>> on 8.1. Other branches have been OK, as have other animals running on
>> the same physical box. Narwhal-HEAD is run more often than any other
>> builds however.
>
> Oh, this is interesting:
>
> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=baiji&dt=2007-04-26%2022:00:02
>
> Different compiler, different OS, not quite the same block number (109,
> whereas IIRC all the previous examples have complained of block 104).
> Is this the same physical machine as narwhal?

Yes, it is. It's an FC6 box running VMWare server, with a Win 2k3r2 VM and a Vista ultimate VM, both with mingw and msvc animals.

I'm still not convinced it's a hardware problem - aside from the fact that it's the same error every time (although, I note in this case it was in check, not installcheck), I would expect at least one of SMART, FC6, VMware or 2k3/Vista to spot that there was a problem. I have also recreated the virtual disks of both VMs since this started happening. I wonder if we're hitting some odd bug in VMware.

Anyhoo, unfortunately Baiji wasn't set to keep error builds - I've changed that now and will run it a few times again. I'll also run a sector level check of Narwhal's virtual disk and see if that complains.

Regards, Dave.
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Tom Lane
Date:
Dave Page <dpage@postgresql.org> writes:
> Tom Lane wrote:
>> Is this the same physical machine as narwhal?

> Yes, it is. It's an FC6 box running VMWare server, with a Win 2k3r2 VM
> and a Vista ultimate VM, both with mingw and msvc animals.

> I'm still not convinced it's a hardware problem - aside from the fact
> that it's the same error every time (although, I note in this case it
> was in check, not installcheck), I would expect at least one of SMART,
> FC6, VMware or 2k3/Vista to spot that there was a problem. I have also
> recreated the virtual disks of both VMs since this started happening. I
> wonder if we're hitting some odd bug in VMware.

I concur it's too regular to be a hardware issue. The VMware idea is a bit plausible though. If that's it, we ought to see failures of this ilk on all four animals sooner or later ...

regards, tom lane
Re: [Fwd: PGBuildfarm member narwhal Branch HEAD Status changed from OK to InstallCheck failure]
From
Dave Page
Date:
Tom Lane wrote:
> I concur it's too regular to be a hardware issue. The VMware idea is
> a bit plausible though. If that's it, we ought to see failures of this
> ilk on all four animals sooner or later ...

I've run full disk scans in both Windows VMs, and forced an fsck of the host just to be on the safe side and nothing showed up.

By chance it seems that VMWare released an update just yesterday so I've upgraded everything. Hopefully the problem will go away now, but I'm not holding my breath!

If not, one other option would be to roll back a couple of versions of VMware - an older version hosted Bandicoot for some time with no problems.

Regards, Dave