Thread: VACUUM causes violent postmaster death

VACUUM causes violent postmaster death

From
Dan Moschuk
Date:
Server process (pid 13361) exited with status 26 at Fri Nov  3 17:49:44 2000
Terminating any active server processes...
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
 

This happens fairly regularly.  I assume exit code 26 is used to indicate
that a specific error has occurred.

The database is a decent size (~3M records) with about 4 indexes.

-Dan
-- 
Man is a rational animal who always loses his temper when he is called
upon to act in accordance with the dictates of reason.               -- Oscar Wilde


Re: VACUUM causes violent postmaster death

From
Alfred Perlstein
Date:
* Dan Moschuk <dan@freebsd.org> [001103 14:55] wrote:
> 
> Server process (pid 13361) exited with status 26 at Fri Nov  3 17:49:44 2000
> Terminating any active server processes...
> NOTICE:  Message from PostgreSQL backend:
>         The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
>         I have rolled back the current transaction and am going to terminate your database system connection and exit.
>         Please reconnect to the database system and repeat your query.
> 
> This happens fairly regularly.  I assume exit code 26 is used to indicate
> that a specific error has occurred.
> 
> The database is a decent size (~3M records) with about 4 indexes.

What version of postgresql?  Tom Lane recently fixed some severe problems
with vacuum and heavily used databases; the fix should be in the latest
7.0.2-patches/7.0.3 release.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Re: VACUUM causes violent postmaster death

From
Tom Lane
Date:
Dan Moschuk <dan@freebsd.org> writes:
> Server process (pid 13361) exited with status 26 at Fri Nov  3 17:49:44 2000

What's signal 26 on your system?  (Look in /usr/include/signal.h or
/usr/include/signum.h or /usr/include/sys/signal.h)
        regards, tom lane
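
For readers following along: the question above reads the 26 as a signal number, which is how a raw wait() status looks when the child was killed rather than exiting on its own. A minimal Python sketch of that decoding, as an alternative to grepping the headers, might look like this. The helper name describe_wait_status is made up for illustration, and the signal name it reports depends on the platform the script runs on.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Explain a raw wait() status such as the 26 in the postmaster log.

    Hypothetical helper for illustration; not part of PostgreSQL.
    """
    if os.WIFSIGNALED(status):
        signum = os.WTERMSIG(status)
        try:
            # Signal numbering is platform specific, so look the name up
            # on the machine where the status was produced.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    if os.WIFEXITED(status):
        return f"exited normally with code {os.WEXITSTATUS(status)}"
    return f"unrecognised status {status}"


if __name__ == "__main__":
    print(describe_wait_status(26))
```

Run on the machine that produced the log, this should name the same signal that the header files define as number 26.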


Re: VACUUM causes violent postmaster death

From
Dan Moschuk
Date:
| > Server process (pid 13361) exited with status 26 at Fri Nov  3 17:49:44 2000
| 
| What's signal 26 on your system?  (Look in /usr/include/signal.h or
| /usr/include/signum.h or /usr/include/sys/signal.h)

dan@spirit:/home/dan grep 26 /usr/include/sys/signal.h
#define SIGVTALRM       26      /* virtual time alarm */

Cheers,
-Dan
-- 
Man is a rational animal who always loses his temper when he is called
upon to act in accordance with the dictates of reason.               -- Oscar Wilde


Re: VACUUM causes violent postmaster death

From
Dan Moschuk
Date:
| > This happens fairly regularly.  I assume exit code 26 is used to indicate
| > that a specific error has occurred.
| > 
| > The database is a decent size (~3M records) with about 4 indexes.
| 
| What version of postgresql?  Tom Lane recently fixed some severe problems
| with vacuum and heavily used databases; the fix should be in the latest
| 7.0.2-patches/7.0.3 release.

It's 7.0.2-patches from about two or three weeks ago.

-Dan
-- 
Man is a rational animal who always loses his temper when he is called
upon to act in accordance with the dictates of reason.               -- Oscar Wilde


Re: VACUUM causes violent postmaster death

From
Alfred Perlstein
Date:
* Dan Moschuk <dan@freebsd.org> [001103 15:32] wrote:
> 
> | > This happens fairly regularly.  I assume exit code 26 is used to indicate
> | > that a specific error has occurred.
> | > 
> | > The database is a decent size (~3M records) with about 4 indexes.
> | 
> | What version of postgresql?  Tom Lane recently fixed some severe problems
> | with vacuum and heavily used databases; the fix should be in the latest
> | 7.0.2-patches/7.0.3 release.
> 
> It's 7.0.2-patches from about two or three weeks ago.

Make sure pgsql/src/backend/commands/vacuum.c is at:

revision 1.148.2.1
date: 2000/09/19 21:01:04;  author: tgl;  state: Exp;  lines: +37 -19
Back-patch fix to ensure that VACUUM always calls FlushRelationBuffers.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Re: VACUUM causes violent postmaster death

From
Tom Lane
Date:
Dan Moschuk <dan@freebsd.org> writes:
> | > Server process (pid 13361) exited with status 26 at Fri Nov  3 17:49:44 2000
> | 
> | What's signal 26 on your system?

> #define SIGVTALRM       26      /* virtual time alarm */

Well, that sure shouldn't be happening.  You aren't perhaps running it
under a ulimit setting that limits total process CPU time, are you?
        regards, tom lane
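
One way to check the ulimit theory is to inspect the limits from inside the environment that starts the postmaster. The sketch below is an assumption about how one might do that in Python, not something from the thread; cpu_time_limits is a hypothetical helper. It reads the CPU-time resource limit and also the virtual interval timer, since ITIMER_VIRTUAL is the mechanism that normally delivers SIGVTALRM.

```python
import resource
import signal


def _show(limit: int) -> str:
    """Render an rlimit value, treating RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit}s"


def cpu_time_limits() -> dict:
    """Report CPU-time limits that apply to the current process.

    Hypothetical helper for illustration only.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    # getitimer returns (seconds until next expiry, repeat interval);
    # both are 0.0 when the timer is not armed.
    delay, interval = signal.getitimer(signal.ITIMER_VIRTUAL)
    return {
        "cpu_soft_limit": _show(soft),
        "cpu_hard_limit": _show(hard),
        "virtual_timer_seconds": delay,
        "virtual_timer_interval": interval,
    }


if __name__ == "__main__":
    for key, value in cpu_time_limits().items():
        print(f"{key}: {value}")
```

Resource limits are inherited by child processes, so the numbers are only meaningful when the script is run from the same shell or startup script that launches the postmaster.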


Re: VACUUM causes violent postmaster death

From
Tom Lane
Date:
I don't think Dan's problem is related to the recently found VACUUM
bugs.  Killing a backend with SIGVTALRM suggests that something thinks
the backend's been running too long.  ulimit is a likely suspect.
Another possibility is some sort of profiling mechanism gone haywire.
There's nothing in our source code that would invoke that signal, so
it's got to be some outside agency, I think.
        regards, tom lane
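
To see what a virtual timer can do to a process that has no handler installed for it, here is a small, self-contained demonstration. It is only an illustration under assumed conditions (a Unix system with fork), not a reproduction of Dan's setup, and run_child_with_virtual_timer is a made-up name. A child process arms ITIMER_VIRTUAL, burns CPU, and is killed by SIGVTALRM; the parent then sees a wait status that decodes to that signal.

```python
import os
import signal
import time


def run_child_with_virtual_timer(cpu_seconds: float = 0.05) -> int:
    """Fork a child that is killed by SIGVTALRM; return its wait status.

    Hypothetical demonstration helper, Unix only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: make sure the default action (terminate) is in effect,
        # then arm a one-shot timer that counts user-mode CPU time.
        signal.signal(signal.SIGVTALRM, signal.SIG_DFL)
        signal.setitimer(signal.ITIMER_VIRTUAL, cpu_seconds)
        # Burn CPU so the virtual timer advances. The wall-clock deadline
        # is a safety net in case the signal never arrives.
        deadline = time.monotonic() + 10.0
        while time.monotonic() < deadline:
            pass
        os._exit(1)
    # Parent: collect the raw wait status of the child.
    _, status = os.waitpid(pid, 0)
    return status


if __name__ == "__main__":
    status = run_child_with_virtual_timer()
    if os.WIFSIGNALED(status):
        print("child killed by signal", os.WTERMSIG(status))
    else:
        print("child was not killed by a signal; raw status", status)
```

Nothing in the child asks to be killed; the timer alone is enough, which is the sense in which something outside the backend's own code could produce the log line at the top of the thread.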