Thread: How to deal with crashes?
Hello,

I'm posting to pgsql-general since I'm not sure that this should be a bug report. Please correct me if I'm wrong.

I have PostgreSQL 7.2, compiled by GCC egcs-2.91.66, Linux version 2.2.20, i686.

Periodically, Postgres crashes. The following lines are added to the log file:

2002-04-17 13:55:18 [17524] DEBUG: server process (pid 23600) was terminated by signal 11
2002-04-17 13:55:18 [17524] DEBUG: terminating any other active server processes
2002-04-17 13:55:18 [17524] DEBUG: all server processes terminated; reinitializing shared memory and semaphores
2002-04-17 13:55:18 [17524] DEBUG: startup process (pid 23605) was terminated by signal 11
2002-04-17 13:55:18 [17524] DEBUG: aborting startup due to startup process failure

When I do pg_ctl start, the following is logged:

2002-04-17 14:00:04 [26188] DEBUG: database system was interrupted at 2002-04-17 13:48:34 MSD
2002-04-17 14:00:04 [26188] DEBUG: checkpoint record is at 0/7F9D0C4
2002-04-17 14:00:04 [26188] DEBUG: redo record is at 0/7F9D0C4; undo record is at 0/0; shutdown FALSE
2002-04-17 14:00:04 [26188] DEBUG: next transaction id: 551189; next oid: 210621
2002-04-17 14:00:04 [26188] DEBUG: database system was not properly shut down; automatic recovery in progress
2002-04-17 14:00:04 [26188] DEBUG: ReadRecord: record with zero length at 0/7F9D104
2002-04-17 14:00:04 [26188] DEBUG: redo is not required
2002-04-17 14:00:08 [26258] FATAL 1: The database system is starting up
2002-04-17 14:00:08 [26262] FATAL 1: The database system is starting up
2002-04-17 14:00:13 [26365] FATAL 1: The database system is starting up
2002-04-17 14:00:13 [26375] FATAL 1: The database system is starting up
2002-04-17 14:00:24 [26188] DEBUG: database system is ready

I do not run a debug version, so a core dump is not available (I'm not sure that it's a good idea to run a debug version on the real web site, and this problem occurs only on that particular server).

Please tell me, what should I do to solve my problem?
Is it safe to run a debug version on the public web server?
What will be the penalties of doing that? (The server runs ~10000 queries daily.)

Thank you in advance,
Andrey
On Wed, Apr 24, 2002 at 05:30:33PM +0400, Andrey Mishchenko wrote:
> Periodically, Postgres crashes. The following lines are added to the log
> file:
> 2002-04-17 13:55:18 [17524] DEBUG: server process (pid 23600) was
> terminated by signal 11
> 2002-04-17 13:55:18 [17524] DEBUG: terminating any other active server
> processes

etc...

Are you logging the queries? It would be helpful if you could identify the query actually causing the problem.

> I do not run debug version, so core dump is not available (I'm not sure
> that it's a good idea to run debug version on the real web site, and this
> problem occures only on that particular server).

Debug info doesn't actually cost anything speed-wise or memory-wise. And that's all you need to get useful info out of a core dump.

> Please tell me, what should I do solve my problem?
> Is it safe to run debug version on the public web server?
> What will be the penalties of doing that? (server runs ~10000 queries
> daily)?

Logging the queries is only really an issue if you don't rotate the logs on a regular basis.

HTH,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
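The "useful info" from a core dump would typically be a stack backtrace read with a debugger. The following is only a rough sketch of how that step might be scripted, assuming gdb is installed, core dumps are enabled for the server's environment, and the binary was built with debug symbols (for PostgreSQL, the configure option --enable-debug). The two paths in the usage comment are hypothetical placeholders, not locations taken from this thread.

```python
import subprocess


def backtrace_command(executable, core_file):
    """Build a gdb invocation that prints a backtrace and exits.

    -batch makes gdb quit after running its commands;
    -ex bt runs the "bt" (backtrace) command against the loaded core file.
    """
    return ["gdb", "-batch", "-ex", "bt", executable, core_file]


def get_backtrace(executable, core_file):
    """Run gdb and return whatever it printed on standard output.

    Nothing is raised on a non-zero exit status; the caller is expected
    to inspect the returned text.
    """
    result = subprocess.run(
        backtrace_command(executable, core_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Hypothetical usage (both paths are assumptions for illustration only):
#   print(get_backtrace("/usr/local/pgsql/bin/postgres", "/path/to/core"))
```

Without debug symbols the backtrace usually shows only raw addresses, which is why a build with -g is needed before a core file is of much use.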
Yes, I'm logging queries. Here are the log records that precede the Postgres crash.

Postgres crashes only after a new session is opened (DEBUG: connection: ...). It seems to me that there is no correlation between the crash and the queries executed before the new session is opened (the crash occurs after different SELECT or UPDATE queries).

As for the debug version, I know that Windows programs may run a number of times slower in a debug build. Isn't this the case for Postgres on Linux?

2002-04-17 13:55:00 [23382] DEBUG: connection: host=[local] user=spa-www database=spa
2002-04-17 13:55:01 [23382] DEBUG: query: SELECT count(*) FROM skus
2002-04-17 13:55:01 [23382] DEBUG: query: COMMIT
2002-04-17 13:55:01 [23382] DEBUG: ProcessUtility: COMMIT
2002-04-17 13:55:01 [23382] NOTICE: COMMIT: no transaction in progress
2002-04-17 13:55:01 [23386] DEBUG: connection: host=[local] user=spa-www database=spa
2002-04-17 13:55:01 [23386] DEBUG: query: SELECT count(*) FROM skus
2002-04-17 13:55:01 [23386] DEBUG: query: COMMIT
2002-04-17 13:55:01 [23386] DEBUG: ProcessUtility: COMMIT
2002-04-17 13:55:01 [23386] NOTICE: COMMIT: no transaction in progress
2002-04-17 13:55:18 [23600] DEBUG: connection: host=[local] user=spa-www database=spa
2002-04-17 13:55:18 [17524] DEBUG: server process (pid 23600) was terminated by signal 11
2002-04-17 13:55:18 [17524] DEBUG: terminating any other active server processes
2002-04-17 13:55:18 [17524] DEBUG: all server processes terminated; reinitializing shared memory and semaphores
2002-04-17 13:55:18 [17524] DEBUG: startup process (pid 23605) was terminated by signal 11
2002-04-17 13:55:18 [17524] DEBUG: aborting startup due to startup process failure
2002-04-17 14:00:04 [26188] DEBUG: database system was interrupted at 2002-04-17 13:48:34 MSD
2002-04-17 14:00:04 [26188] DEBUG: checkpoint record is at 0/7F9D0C4
2002-04-17 14:00:04 [26188] DEBUG: redo record is at 0/7F9D0C4; undo record is at 0/0; shutdown FALSE
2002-04-17 14:00:04 [26188] DEBUG: next transaction id: 551189; next oid: 210621
2002-04-17 14:00:04 [26188] DEBUG: database system was not properly shut down; automatic recovery in progress
2002-04-17 14:00:04 [26188] DEBUG: ReadRecord: record with zero length at 0/7F9D104
2002-04-17 14:00:04 [26188] DEBUG: redo is not required

Thank you,
Andrey
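Matching each crashed PID against what that same PID logged beforehand can be done mechanically. Below is a minimal sketch, assuming log lines in exactly the "timestamp [pid] LEVEL: message" layout shown in this thread. The function name crash_report and the SAMPLE list are made up for illustration; the sample lines themselves are copied from the log above.

```python
import re
from collections import defaultdict

# "2002-04-17 13:55:18 [23600] DEBUG: message" or "... FATAL 1: message"
LOG_LINE = re.compile(r"^(\S+ \S+) \[(\d+)\] (\w+(?: \d+)?):\s+(.*)$")
# Postmaster report that a child process died from a signal
CRASH = re.compile(
    r"(?:server|startup) process \(pid (\d+)\) was terminated by signal (\d+)"
)


def crash_report(lines):
    """Map each crashed pid to (signal, messages that pid logged before dying)."""
    history = defaultdict(list)  # pid -> messages logged by that pid so far
    report = {}
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not in the expected layout
        _timestamp, pid, _level, message = match.groups()
        crash = CRASH.search(message)
        if crash:
            dead_pid = int(crash.group(1))
            signal_number = int(crash.group(2))
            report[dead_pid] = (signal_number, list(history.get(dead_pid, [])))
        else:
            history[int(pid)].append(message)
    return report


SAMPLE = [
    "2002-04-17 13:55:01 [23386] DEBUG: query: SELECT count(*) FROM skus",
    "2002-04-17 13:55:01 [23386] DEBUG: query: COMMIT",
    "2002-04-17 13:55:18 [23600] DEBUG: connection: host=[local] user=spa-www database=spa",
    "2002-04-17 13:55:18 [17524] DEBUG: server process (pid 23600) was terminated by signal 11",
    "2002-04-17 13:55:18 [17524] DEBUG: terminating any other active server processes",
    "2002-04-17 13:55:18 [17524] DEBUG: startup process (pid 23605) was terminated by signal 11",
]

if __name__ == "__main__":
    for pid, (signal_number, messages) in crash_report(SAMPLE).items():
        print(pid, "signal", signal_number, "logged before crash:", messages)
```

On the sample, PID 23600 has only its "connection:" line in its history and PID 23605 has nothing, so neither crashed process had logged a query before it died.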
"Andrey" <am@NOSPAM.netactor.net> writes: > As for debug version, I know what Windows programs may run number of times > slower in debug build. Isn't this a case for Postgres on Linux? The gcc boys say that -g makes no difference in the generated code. I've never bothered to test the assertion; I assume they know what they're talking about. > 2002-04-17 13:55:18 [23600] DEBUG: connection: host=[local] user=spa-www > database=spa > 2002-04-17 13:55:18 [17524] DEBUG: server process (pid 23600) was > terminated by signal 11 This is most curious --- PID 23600 evidently crashed during startup, before it had received any queries. You definitely need to prepare a debug version and get a stack backtrace from one of the coredumps before we'll be able to say much more than that. regards, tom lane
On Wed, 24 Apr 2002, Andrey Mishchenko wrote:

> I do not run debug version, so core dump is not available (I'm not sure
> that it's a good idea to run debug version on the real web site, and this
> problem occures only on that particular server).

If by "debug version" we mean a version compiled with -g, it will be fine. It shouldn't affect performance at all. All it will do is make the binary bigger on the disk (not in memory) and make it take longer to compile.

> What will be the penalties of doing that? (server runs ~10000 queries
> daily)?

This server runs ten thousand queries per day? This is a very low volume of queries, so unless the queries are hideously complex, you shouldn't need to worry about performance. 86,400 seconds per day / 10,000 queries works out to one query every 8.64 seconds. Of course the load will not be perfectly even, but even so I doubt you'll be seeing a short-term average of more than one query every two seconds. I recommend you use at least a 386SX-16 system for this. :-)

BTW, there are tools like "top" and "iostat" that can tell you how loaded your system is. Or you can probably even just look at the disk lights.

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC
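The load estimate above is simple enough to check by hand, but a small sketch makes it easy to redo with other numbers. The function names here are invented for illustration, and the two-second figure is just the short-term guess from the message, not a measurement.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def mean_seconds_between_queries(queries_per_day):
    """Average gap between queries if the load were perfectly even."""
    return SECONDS_PER_DAY / queries_per_day


def implied_peak_factor(peak_gap_seconds, queries_per_day):
    """How many times busier than average a given short-term gap would be."""
    return mean_seconds_between_queries(queries_per_day) / peak_gap_seconds


if __name__ == "__main__":
    daily = 10_000
    print("mean gap (s):", mean_seconds_between_queries(daily))
    # One query every two seconds is the guessed short-term peak.
    print("peak vs. average:", implied_peak_factor(2.0, daily))
```

So the guessed peak of one query every two seconds corresponds to bursts a little over four times the daily average rate.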