Thread: VACUUM causes violent postmaster death
Server process (pid 13361) exited with status 26 at Fri Nov 3 17:49:44 2000
Terminating any active server processes...
NOTICE: Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.

This happens fairly regularly. I assume exit code 26 is used to dictate
that a specific error has occurred.

The database is a decent size (~3M records) with about 4 indexes.

-Dan
--
Man is a rational animal who always loses his temper when he is called
upon to act in accordance with the dictates of reason.
        -- Oscar Wilde

* Dan Moschuk <dan@freebsd.org> [001103 14:55] wrote:
>
> Server process (pid 13361) exited with status 26 at Fri Nov 3 17:49:44 2000
> Terminating any active server processes...
> NOTICE: Message from PostgreSQL backend:
>         The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
>         I have rolled back the current transaction and am going to terminate your database system connection and exit.
>         Please reconnect to the database system and repeat your query.
>
> This happens fairly regularly. I assume exit code 26 is used to dictate
> that a specific error has occurred.
>
> The database is a decent size (~3M records) with about 4 indexes.

What version of postgresql? Tom Lane recently fixed some severe problems
with vacuum and heavily used databases, the fix should be in the latest
7.0.2-patches/7.0.3 release.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."

Dan Moschuk <dan@freebsd.org> writes:
> Server process (pid 13361) exited with status 26 at Fri Nov 3 17:49:44 2000

What's signal 26 on your system? (Look in /usr/include/signal.h or
/usr/include/signum.h or /usr/include/sys/signal.h)

regards, tom lane

| > Server process (pid 13361) exited with status 26 at Fri Nov 3 17:49:44 2000
|
| What's signal 26 on your system? (Look in /usr/include/signal.h or
| /usr/include/signum.h or /usr/include/sys/signal.h)

dan@spirit:/home/dan grep 26 /usr/include/sys/signal.h
#define SIGVTALRM 26 /* virtual time alarm */

Cheers,
-Dan

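Grepping a header works, but which header holds the definitions varies by system. As a hedged alternative, the sketch below asks the running platform for the name of a signal number. The helper name `signal_name` is hypothetical, made up for illustration; it is not part of PostgreSQL or of anything posted in this thread.

```python
import signal


def signal_name(number: int) -> str:
    """Return the platform's symbolic name for a signal number.

    signal_name is a hypothetical helper, used here for illustration only.
    """
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number does not correspond to a named signal on this platform.
        return f"unknown signal {number}"


if __name__ == "__main__":
    # Signal numbers are platform specific; 26 is the value from the log above.
    print(26, signal_name(26))
```

The result reflects the machine the script runs on, so it needs to be run on the same host (or at least the same OS) as the server.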
| > This happens fairly regularly. I assume exit code 26 is used to dictate
| > that a specific error has occurred.
| >
| > The database is a decent size (~3M records) with about 4 indexes.
|
| What version of postgresql? Tom Lane recently fixed some severe problems
| with vacuum and heavily used databases, the fix should be in the latest
| 7.0.2-patches/7.0.3 release.

It's 7.0.2-patches from about two or three weeks ago.

-Dan

* Dan Moschuk <dan@freebsd.org> [001103 15:32] wrote:
>
> | > This happens fairly regularly. I assume exit code 26 is used to dictate
> | > that a specific error has occurred.
> | >
> | > The database is a decent size (~3M records) with about 4 indexes.
> |
> | What version of postgresql? Tom Lane recently fixed some severe problems
> | with vacuum and heavily used databases, the fix should be in the latest
> | 7.0.2-patches/7.0.3 release.
>
> It's 7.0.2-patches from about two or three weeks ago.

Make sure pgsql/src/backend/commands/vacuum.c is at:

revision 1.148.2.1
date: 2000/09/19 21:01:04; author: tgl; state: Exp; lines: +37 -19
Back-patch fix to ensure that VACUUM always calls FlushRelationBuffers.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]

Dan Moschuk <dan@freebsd.org> writes:
> | > Server process (pid 13361) exited with status 26 at Fri Nov 3 17:49:44 2000
> |
> | What's signal 26 on your system?

> #define SIGVTALRM 26 /* virtual time alarm */

Well, that sure shouldn't be happening. You aren't perhaps running it
under a ulimit setting that limits total process CPU time, are you?

regards, tom lane

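One way to check the ulimit question is to read the CPU-time resource limit from inside a process. The sketch below is a minimal illustration, assuming a Unix system where Python's `resource` module is available. The function name `describe_cpu_limit` is hypothetical. It reports the limits of the process that runs it, which match the postmaster's only if both are started from the same environment.

```python
import resource


def describe_cpu_limit() -> str:
    """Describe the CPU-time limit of the current process.

    describe_cpu_limit is a hypothetical helper, for illustration only.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)

    def fmt(value: int) -> str:
        # RLIM_INFINITY means no limit is set.
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value}s"

    return f"soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    print(describe_cpu_limit())
```

Running this from the shell or startup script that launches the postmaster shows whether a CPU limit is inherited. Note that POSIX specifies SIGXCPU as the signal for an exceeded soft CPU limit, so an "unlimited" result rules the limit out, while a finite one may still not account for SIGVTALRM on its own.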
I don't think Dan's problem is related to the recently found VACUUM bugs.

Killing a backend with SIGVTALRM suggests that something thinks the
backend's been running too long. ulimit is a likely suspect. Another
possibility is some sort of profiling mechanism gone haywire. There's
nothing in our source code that would invoke that signal, so it's got
to be some outside agency, I think.

regards, tom lane
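To illustrate how SIGVTALRM can terminate a process that never asked for it to be handled, the sketch below arms a virtual interval timer in a child process and lets the default signal action apply. This is not a reproduction of Dan's failure. It assumes a Unix system, and the names `CHILD` and `run_child` are hypothetical.

```python
import signal
import subprocess
import sys

# Child program: restore the default action for SIGVTALRM, arm a virtual
# (user CPU time) interval timer, then burn CPU. The loop is bounded so the
# child exits on its own if the signal is never delivered.
CHILD = """
import signal, time
signal.signal(signal.SIGVTALRM, signal.SIG_DFL)
signal.setitimer(signal.ITIMER_VIRTUAL, 0.05)
deadline = time.process_time() + 5
while time.process_time() < deadline:
    pass
"""


def run_child() -> int:
    """Run the child; return the number of the signal that killed it, or 0."""
    result = subprocess.run([sys.executable, "-c", CHILD])
    # subprocess reports death by signal as a negative return code.
    return -result.returncode if result.returncode < 0 else 0


if __name__ == "__main__":
    killed_by = run_child()
    if killed_by:
        print("child killed by", signal.Signals(killed_by).name, killed_by)
    else:
        print("child exited without being killed by a signal")
```

The parent sees only that the child died from a signal, much as the postmaster reports the backend's exit status. Finding out what armed the timer means looking at whatever runs in or around the process, such as profiling hooks or wrapper scripts.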