It happened again: Server hung up solid - Mailing list pgsql-hackers

From The Hermit Hacker
Subject It happened again: Server hung up solid
Date
Msg-id Pine.BSF.4.21.0005072034270.87721-100000@thelab.hub.org
Responses Re: It happened again: Server hung up solid  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: It happened again: Server hung up solid  (The Hermit Hacker <scrappy@hub.org>)
List pgsql-hackers
Okay, this is with code of ~May 4th ... a 'psql' connection to the
database hangs solid.

errout is dated:

pgsql% !ls
ls -lt
total 13324
-rw-------   1 pgsql  pgsql  4842715 May  7 10:57 errout.5432

and the last few lines contain:

ERROR:  parser: parse error at or near "vpti"
pq_recvbuf: unexpected EOF on client connection
pq_flush: send() failed: Broken pipe
pq_recvbuf: recv() failed: Connection reset by peer
pq_recvbuf: unexpected EOF on client connection
pq_recvbuf: unexpected EOF on client connection
pq_flush: send() failed: Broken pipe
pq_recvbuf: recv() failed: Connection reset by peer

But, of course, none of these lines carries a date/time, so I can't tell when they were written ...
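One way to get at least approximate times on those lines would be to follow the log through a small filter. This is only a sketch, written in POSIX sh rather than tcsh; `stamp_lines` is a hypothetical helper name, and the stamp records when the filter read the line, not when the backend wrote it.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): prefix every line read
# from stdin with the local date and time at which it was read.
stamp_lines() {
    # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}
```

It could be fed with something like `tail -f /pgsql/errout.5432 | stamp_lines >> /pgsql/errout.5432.stamped`, where the `.stamped` file name is an assumption for illustration.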

ps shows:

USER    PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
pgsql 33515  0.0  0.0     0    0  ??  Z     4:45PM   0:00.00  (postgres)
pgsql 33516  0.0  0.0     0    0  ??  Z     4:45PM   0:00.00  (postgres)
pgsql 93757  0.0  0.2  1456 1088  p0  S    Wed03PM   0:01.11 -su (tcsh)
pgsql  7100  0.0  0.5 38692 2616  ??  Is   Fri12AM   8:43.44 /pgsql/bin/postmas
pgsql 33667  0.0  0.0   396  224  p0  R+    7:35PM   0:00.00 ps ux
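The two entries with STAT `Z` are zombies: processes that have exited but have not yet been reaped by their parent. A minimal sketch for pulling their PIDs out of `ps ux` output follows, in POSIX sh rather than tcsh. `list_zombies` is a hypothetical name, and the filter assumes a header row containing `PID` and `STAT` columns, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: read `ps ux` style output on stdin and print the
# PID of every process whose STAT field starts with Z (zombie).
list_zombies() {
    awk '
        NR == 1 {
            # Locate the PID and STAT columns from the header row.
            for (i = 1; i <= NF; i++) {
                if ($i == "PID")  pidcol = i
                if ($i == "STAT") statcol = i
            }
            next
        }
        pidcol && statcol && $statcol ~ /^Z/ { print $pidcol }
    '
}
```

Run as `ps ux | list_zombies`; on the listing above it would pick out 33515 and 33516.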

and postmaster is started with:

pgsql% cat pgstart
#!/bin/tcsh
setenv PORT 5432
setenv POSTMASTER /pgsql/bin/postmaster
unlimit
${POSTMASTER} -B 4096 -N 128 -S -o "-F -o /pgsql/errout.${PORT} -S 32768" \
        -i -p ${PORT} -D/pgsql/data

The machine is a dual PIII with 512 MB of RAM, running FreeBSD 4.0-STABLE
from April 22nd ...

pgsql% truss -p 7100

Shows zilch ...

Since this is a production server, I can't just leave it there hung like
that, but if someone wants to give some instructions on what to do the
next time this happens, please feel free to do so, and I'll add that to my
list ... maybe run a gdb command on it, since truss doesn't appear to
help?
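For the gdb idea, one possible sketch is to attach to the hung process, dump a stack trace to a file, and detach again before restarting. It is written in POSIX sh rather than tcsh and assumes a gdb that supports the `-batch`, `-ex` and `-p` options. `capture_backtrace`, the `GDB` override variable and the default output path are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: attach gdb to a running process, print a backtrace
# and detach. In batch mode gdb exits after running its commands, which
# releases the process again.
#   $1 = PID to inspect (e.g. the postmaster, 7100 above)
#   $2 = output file (assumed default: /tmp/postmaster.<pid>.bt)
# GDB may be set to use a different debugger binary.
capture_backtrace() {
    pid=$1
    out=${2:-/tmp/postmaster.$pid.bt}
    "${GDB:-gdb}" -batch -ex bt -p "$pid" > "$out" 2>&1
}
```

For example, `capture_backtrace 7100` before killing the postmaster would leave the trace in `/tmp/postmaster.7100.bt`. The trace is only readable if the binary still has its symbols.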

At this time, I consider this to be a show-stopper for the release. This
is the same thing that happened last time, when the result appeared to be
index corruption. This time, I ran a VACUUM after restarting and it
doesn't show a problem, so the hang and the corruption might not have been
related, just a fluke ...

Marc G. Fournier                   ICQ#7615664              IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 


