[7.4.1] signal 11 while accesing DB - Mailing list pgsql-general
From | Marcin |
---|---|
Subject | [7.4.1] signal 11 while accesing DB |
Date | |
Msg-id | 20050124174606Z2660446-20169+489@kps3.test.onet.pl |
List | pgsql-general |
Hi,

Yesterday morning one of our PostgreSQL engines stopped unexpectedly, with the following log:

postgres[4192]: [5105-1] LOG: server process (PID 26678) was terminated by signal 11
postgres[4192]: [5106-1] LOG: terminating any other active server processes
postgres[4192]: [5107-1] LOG: all server processes terminated; reinitializing
postgres[26717]: [5108-1] LOG: database system was interrupted at 2005-01-23 07:23:37 CET
postgres[26717]: [5109-1] LOG: checkpoint record is at 31B/FE2C9BA4
postgres[26717]: [5110-1] LOG: redo record is at 31B/FE00A484; undo record is at 0/0; shutdown FALSE
postgres[26717]: [5111-1] LOG: next transaction ID: 390435693; next OID: 678576128
postgres[26717]: [5112-1] LOG: database system was not properly shut down; automatic recovery in progress
postgres[26717]: [5113-1] LOG: redo starts at 31B/FE00A484
postgres[26717]: [5114-1] LOG: record with zero length at 31B/FEFCE3A8
postgres[26717]: [5115-1] LOG: redo done at 31B/FEFCE384
postgres[26717]: [5116-1] LOG: removing transaction log file "0000031B000000FB"
postgres[26717]: [5117-1] LOG: removing transaction log file "0000031B000000FC"
postgres[26717]: [5118-1] LOG: removing transaction log file "0000031B000000FD"
postgres[26717]: [5119-1] LOG: database system is ready

and then it kept crashing whenever a client connected to one specific database:

postgres[4192]: [5108-1] LOG: server process (PID 4501) was terminated by signal 11
postgres[4192]: [5109-1] LOG: terminating any other active server processes
postgres[4192]: [5110-1] LOG: all server processes terminated; reinitializing
postgres[4621]: [5111-1] LOG: database system was interrupted at 2005-01-23 07:25:32 CET
postgres[4621]: [5112-1] LOG: checkpoint record is at 31B/FEFCE3A8
postgres[4621]: [5113-1] LOG: redo record is at 31B/FEFCE3A8; undo record is at 0/0; shutdown TRUE
postgres[4621]: [5114-1] LOG: next transaction ID: 390439982; next OID: 678592512
postgres[4621]: [5115-1] LOG: database system was not properly shut down; automatic recovery in progress
postgres[4621]: [5116-1] LOG: record with zero length at 31B/FEFCE3E8
postgres[4621]: [5117-1] LOG: redo is not required
postgres[4621]: [5118-1] LOG: database system is ready
postgres[4192]: [5111-1] LOG: server process (PID 16866) was terminated by signal 11

I found two ways to crash the server:
1. Connecting to the one specific database ("data")
2. Creating another database and trying to access it

Other databases work fine; I can add or remove records with no problem.

I copied the PGDATA to another machine for debugging and restored the PGDATA from the nightly dump. Everything went smoothly.

On the debugging machine, with exactly the same version of PostgreSQL, the backend keeps crashing:

[16350] DEBUG: /usr/lib/postgresql/bin/postmaster: PostmasterMain: initial environ dump:
[16350] DEBUG: -----------------------------------------
[16350] DEBUG: PWD=/DEBUG/lib/postgres/data
[16350] DEBUG: USER=postgres
[16350] DEBUG: MAIL=/root/Mailbox
[16350] DEBUG: LANG=pl_PL
[16350] DEBUG: LOGNAME=root
[16350] DEBUG: SHLVL=4
[16350] DEBUG: SHELL=/bin/sh
[16350] DEBUG: HOME=/var/lib/postgres
[16350] DEBUG: TERM=linux
[16350] DEBUG: PGDATA=/DEBUG/lib/postgres/data
[16350] DEBUG: PATH=/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games
[16350] DEBUG: LESSCHARSET=latin1
[16350] DEBUG: SSH_TTY=/dev/pts/0
[16350] DEBUG: _=/usr/lib/postgresql/bin/postmaster
[16350] DEBUG: -----------------------------------------
[16350] DEBUG: found "/usr/lib/postgresql/bin/postgres" using argv[0]
[16350] LOG: could not create IPv6 socket: Rodzina adresów nie obsługiwana przez protokół
[16350] DEBUG: invoking IpcMemoryCreate(size=31940608)
[16350] DEBUG: found "/usr/lib/postgresql/bin/postmaster" using argv[0]
[16356] LOG: database system was shut down at 2005-01-23 14:36:26 CET
[16356] LOG: checkpoint record is at 31C/1F7C150
[16356] LOG: redo record is at 31C/1F7C150; undo record is at 0/0; shutdown TRUE
[16356] LOG: next transaction ID: 390466146; next OID: 679780352
[16356] LOG: database system is ready
[16356] DEBUG: proc_exit(0)
[16356] DEBUG: shmem_exit(0)
[16356] DEBUG: exit(0)
[16350] DEBUG: reaping dead processes
[16350] DEBUG: forked new backend, pid=16435 socket=8
[16435] DEBUG: /usr/lib/postgresql/bin/postmaster child[16435]: starting with (
[16435] DEBUG: postgres
[16435] DEBUG: -v196608
[16435] DEBUG: -p
[16435] DEBUG: template1
[16435] DEBUG: )
[16435] DEBUG: InitPostgres
[16435] DEBUG: StartTransactionCommand
[16435] DEBUG: CommitTransactionCommand
[16435] DEBUG: proc_exit(0)
[16435] DEBUG: shmem_exit(0)
[16435] DEBUG: exit(0)
[16350] DEBUG: reaping dead processes
[16350] DEBUG: child process (PID 16435) exited with exit code 0
[16350] DEBUG: forked new backend, pid=16510 socket=8
[16510] DEBUG: /usr/lib/postgresql/bin/postmaster child[16510]: starting with (
[16510] DEBUG: postgres
[16510] DEBUG: -v196608
[16510] DEBUG: -p
[16510] DEBUG: data
[16510] DEBUG: )
[16510] DEBUG: InitPostgres
[16350] DEBUG: reaping dead processes
[16350] DEBUG: child process (PID 16510) was terminated by signal 11
[16350] LOG: server process (PID 16510) was terminated by signal 11
[16350] LOG: terminating any other active server processes
[16350] LOG: all server processes terminated; reinitializing
[16350] DEBUG: shmem_exit(0)
[16350] DEBUG: invoking IpcMemoryCreate(size=31940608)
[16516] LOG: database system was interrupted at 2005-01-24 08:54:48 CET
[16516] LOG: checkpoint record is at 31C/1F7C150
[16516] LOG: redo record is at 31C/1F7C150; undo record is at 0/0; shutdown TRUE
[16516] LOG: next transaction ID: 390466146; next OID: 679780352
[16516] LOG: database system was not properly shut down; automatic recovery in progress
[16516] LOG: record with zero length at 31C/1F7C190
[16516] LOG: redo is not required
[16516] LOG: database system is ready

The postgres process left a core file, and the backtrace in gdb shows:

(gdb) bt
#0 0x081b4e79 in GetRawDatabaseInfo ()
#1 0x081b4865 in InitPostgres ()
#2 0x08154741 in PostgresMain ()
#3 0x0813437d in ClosePostmasterPorts ()
#4 0x08133c0f in ClosePostmasterPorts ()
#5 0x08132519 in PostmasterMain ()
#6 0x08131f50 in PostmasterMain ()
#7 0x081088df in main ()
#8 0x001ef14f in __libc_start_main () from /lib/libc.so.6

Some specs:
Main machine: Dual Xeon 2.8GHz, 2GB ECC RAM, two 73GB 15K RPM SCSI disks.
Both machines run Debian Woody with 2.4 kernels and a self-compiled PostgreSQL 7.4.1.

The server has been running for nearly two years now, with PostgreSQL the whole time, and since this is the first problem I have encountered, I don't suspect a hardware fault.

Thanks for your help,
-- Marcin
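
For anyone wanting to pull the crash records out of a longer log of this kind, the following is a minimal sketch, not something used in the original report. It assumes the postmaster's crash message has exactly the wording seen in the logs above ("server process (PID N) was terminated by signal M"); the names `CRASH_RE`, `find_crashes` and `SAMPLE` are made up for illustration, and the sample lines are copied from the log excerpts in this message.

```python
import re

# Matches the postmaster's crash report as it appears in the logs above, e.g.
# "LOG: server process (PID 26678) was terminated by signal 11".
# Assumption: the message wording is the same throughout the log file.
CRASH_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")


def find_crashes(log_text):
    """Return (pid, signal) tuples for every crashed backend, in log order."""
    return [(int(pid), int(sig)) for pid, sig in CRASH_RE.findall(log_text)]


# A few lines taken from the log excerpts in this message.
SAMPLE = """\
postgres[4192]: [5105-1] LOG: server process (PID 26678) was terminated by signal 11
postgres[4192]: [5106-1] LOG: terminating any other active server processes
postgres[4192]: [5108-1] LOG: server process (PID 4501) was terminated by signal 11
postgres[4192]: [5111-1] LOG: server process (PID 16866) was terminated by signal 11
"""

if __name__ == "__main__":
    for pid, sig in find_crashes(SAMPLE):
        print(f"backend {pid} terminated by signal {sig}")
```

The pattern deliberately matches only the LOG-level "server process" line, so the DEBUG-level "child process ... was terminated" line for the same backend is not counted twice.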