newbie admin question: postgresql server crashes - Mailing list pgsql-admin
From | Murthy Kambhampaty |
---|---|
Subject | newbie admin question: postgresql server crashes |
Date | |
Msg-id | E631530D51ABD411B823009027855C5B02797F@THOR Whole thread Raw |
List | pgsql-admin |
Our postgresql server crashed this morning unexpectedly. Attached is our server log from before the crash. What's perplexing is the sequence "2002-08-01 10:17:30 DEBUG: server process (pid 13715) was terminated by signal 11 2002-08-01 10:17:30 DEBUG: terminating any other active server processes 2002-08-01 10:17:30 DEBUG: all server processes terminated; reinitializing shared memory and semaphores 2002-08-01 10:17:30 FATAL 1: The database system is starting up 2002-08-01 10:17:30 DEBUG: database system was interrupted at 2002-08-01 05:30:41 EDT" But there is nor report of any problems at 05:30:41 EDT. I should note that we deployed this server last week with checkpoint_segments=3 and fsync off. The server crashed repeatedly on file imports when the checkpoing interval dropped below 5 secs. This would lead to corrupted tables, and vacuuming those tables would also cause the server to crash with the "pg_clog/nnnn no such file" message, which I gather means the table has corrupted data. Since then I have increased the checkpoint_segments to 64, and as the log shows the interval is back to five minutes per the documentation, and with fsync on, it is my understanding that we should not be seeing any corrupted tables. However, the server would crash if we vacuumed any of the tables previously corrupted, and then crash again with "signal 11" just about an hour later. So we started restarting the server after every crash while vacuuming a corrupted table (we had several we wanted to check through, and didn't want to just drop and recreate). But we got through those, and now don't think we have any corrupted tables (we can vacuum all that are left with success). So the crashes on "signal 11" since this morning are totally perplexing. The server is a quad PIII Xeon 550 MHz 2MB cache, with 4GB mem, mylex acceleraid 352 attached raid5 array running 2.4.18-SGI_XFS-1.1 on an extended partition (not LVM). Help?! (I'm not on the list, though in the process of joining, so please copy me on replies.) Thanks, Murthy S. Murthy Kambhampaty Glassman-Oliver Economic Consultants, Inc. Washington, DC 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A6 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A7 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A8 2002-08-01 04:58:51 DEBUG: recycled transaction log file 0000004000000095 2002-08-01 04:58:51 DEBUG: recycled transaction log file 0000004000000096 2002-08-01 04:58:51 DEBUG: recycled transaction log file 0000004000000097 2002-08-01 04:58:51 DEBUG: recycled transaction log file 0000004000000098 2002-08-01 04:58:51 DEBUG: recycled transaction log file 0000004000000099 2002-08-01 04:58:51 DEBUG: recycled transaction log file 000000400000009A 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A9 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000AA 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000AB 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000AC 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000AD 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000AE 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000AF 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000B0 2002-08-01 04:58:51 DEBUG: recycled transaction log file 000000400000009B 2002-08-01 04:58:51 DEBUG: recycled transaction log file 000000400000009C 2002-08-01 04:58:51 DEBUG: recycled transaction log file 000000400000009D 2002-08-01 04:58:51 DEBUG: recycled transaction log file 000000400000009E 2002-08-01 04:58:51 DEBUG: recycled transaction log file 000000400000009F 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A0 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A1 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A2 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A3 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A4 2002-08-01 04:58:51 DEBUG: recycled transaction log file 00000040000000A5 2002-08-01 05:00:55 DEBUG: connection: host=[local] user=[xxxxx] database=[xxxxx] 2002-08-01 05:00:55 NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index 'level6_sales_fact_tot_dry_pkey' for table 'level6_sales_fact_tot_dry' 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B1 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B2 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B3 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B4 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B5 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B6 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B7 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B8 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000B9 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000BA 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000BB 2002-08-01 05:04:17 DEBUG: recycled transaction log file 00000040000000BC 2002-08-01 05:05:10 DEBUG: connection: host=[local] user=[xxxxx] database=[xxxxx] 2002-08-01 05:05:10 NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index 'level5_sales_dim_tot_dry_pkey' for table 'level5_sales_dim_tot_dry' 2002-08-01 05:15:00 DEBUG: connection: host=[local] user=[xxxxx] database=[xxxxx] 2002-08-01 05:15:00 NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index 'level5_sales_fact_tot_dry_pkey' for table 'level5_sales_fact_tot_dry' 2002-08-01 09:06:12 DEBUG: connection: host=[local] user=[xxxxx] database=[xxxxx] 2002-08-01 09:06:12 DEBUG: connection: host=[local] user=[xxxxx] database=[xxxxx] 2002-08-01 10:17:30 DEBUG: server process (pid 13715) was terminated by signal 11 2002-08-01 10:17:30 DEBUG: terminating any other active server processes 2002-08-01 10:17:30 DEBUG: all server processes terminated; reinitializing shared memory and semaphores 2002-08-01 10:17:30 FATAL 1: The database system is starting up 2002-08-01 10:17:30 DEBUG: database system was interrupted at 2002-08-01 05:30:41 EDT 2002-08-01 10:17:30 DEBUG: checkpoint record is at 40/BDC984D0 2002-08-01 10:17:30 DEBUG: redo record is at 40/BDC984D0; undo record is at 0/0; shutdown FALSE 2002-08-01 10:17:30 DEBUG: next transaction id: 54583; next oid: 3618449 2002-08-01 10:17:30 DEBUG: database system was not properly shut down; automatic recovery in progress 2002-08-01 10:17:30 DEBUG: ReadRecord: record with zero length at 40/BDC98510 2002-08-01 10:17:30 DEBUG: redo is not required 2002-08-01 10:17:30 FATAL 1: The database system is starting up 2002-08-01 10:17:30 FATAL 1: The database system is starting up 2002-08-01 10:17:30 FATAL 1: The database system is starting up 2002-08-01 10:17:32 DEBUG: database system is ready 2002-08-01 11:40:01 DEBUG: fast shutdown request 2002-08-01 11:40:01 DEBUG: shutting down 2002-08-01 11:40:03 DEBUG: database system is shut down 2002-08-01 11:40:12 DEBUG: database system was shut down at 2002-08-01 11:40:03 EDT 2002-08-01 11:40:12 DEBUG: checkpoint record is at 40/BDC98550 2002-08-01 11:40:12 DEBUG: redo record is at 40/BDC98550; undo record is at 0/0; shutdown TRUE 2002-08-01 11:40:12 DEBUG: next transaction id: 54583; next oid: 3618449 2002-08-01 11:40:12 DEBUG: database system is ready 2002-08-01 11:40:20 DEBUG: connection: host=[local] user=[xxxxx] database=[xxxxx] 2002-08-01 11:40:21 DEBUG: connection: host=[local] user=[xxxxx] database=[xxxxx] 2002-08-01 12:51:44 DEBUG: server process (pid 8695) was terminated by signal 11 2002-08-01 12:51:44 DEBUG: terminating any other active server processes 2002-08-01 12:51:44 DEBUG: all server processes terminated; reinitializing shared memory and semaphores 2002-08-01 12:51:44 FATAL 1: The database system is starting up 2002-08-01 12:51:44 DEBUG: database system was interrupted at 2002-08-01 11:40:12 EDT 2002-08-01 12:51:44 DEBUG: checkpoint record is at 40/BDC98550 2002-08-01 12:51:44 DEBUG: redo record is at 40/BDC98550; undo record is at 0/0; shutdown TRUE 2002-08-01 12:51:44 DEBUG: next transaction id: 54583; next oid: 3618449 2002-08-01 12:51:44 DEBUG: database system was not properly shut down; automatic recovery in progress 2002-08-01 12:51:44 FATAL 1: The database system is starting up 2002-08-01 12:51:44 FATAL 1: The database system is starting up 2002-08-01 12:51:44 DEBUG: ReadRecord: record with zero length at 40/BDC98590 2002-08-01 12:51:44 DEBUG: redo is not required 2002-08-01 12:51:44 FATAL 1: The database system is starting up 2002-08-01 12:51:46 DEBUG: database system is ready
pgsql-admin by date: