Thread: simple query terminated by signal 11
Hi List,

I ran into an error while dumping a database.

After investigating, I found a table that may be corrupted, but I am not sure, and I don't know how to repair it. Could it be a hard drive error?

Here are the logs:

# all fine: SELECT * FROM hst_sales_report WHERE id = 5078866

[6208 / 2006-06-19 18:46:17 CEST]LOG: 00000: connection received: host=[local] port=
[6208 / 2006-06-19 18:46:17 CEST]LOCATION: BackendRun, postmaster.c:2679
[6208 / 2006-06-19 18:46:17 CEST]LOG: 00000: connection authorized: user=postgres database=backoffice_db
[6208 / 2006-06-19 18:46:17 CEST]LOCATION: BackendRun, postmaster.c:2751
[6208 / 2006-06-19 18:46:17 CEST]LOG: 00000: statement: SELECT * FROM hst_sales_report WHERE id = 5078866
[6208 / 2006-06-19 18:46:17 CEST]LOCATION: pg_parse_query, postgres.c:526
[6208 / 2006-06-19 18:46:18 CEST]LOG: 00000: duration: 117.638 ms
[6208 / 2006-06-19 18:46:18 CEST]LOCATION: exec_simple_query, postgres.c:1076
[6208 / 2006-06-19 18:46:18 CEST]LOG: 00000: disconnection: session time: 0:00:00.12 user=postgres database=backoffice_db host=[local] port=
[6208 / 2006-06-19 18:46:18 CEST]LOCATION: log_disconnections, postgres.c:3447

# now the error: SELECT * FROM hst_sales_report WHERE id = 5078867

[6216 / 2006-06-19 18:46:23 CEST]LOG: 00000: connection received: host=[local] port=
[6216 / 2006-06-19 18:46:23 CEST]LOCATION: BackendRun, postmaster.c:2679
[6216 / 2006-06-19 18:46:23 CEST]LOG: 00000: connection authorized: user=postgres database=backoffice_db
[6216 / 2006-06-19 18:46:23 CEST]LOCATION: BackendRun, postmaster.c:2751
[6216 / 2006-06-19 18:46:23 CEST]LOG: 00000: statement: SELECT * FROM hst_sales_report WHERE id = 5078867
[6216 / 2006-06-19 18:46:23 CEST]LOCATION: pg_parse_query, postgres.c:526
[3762 / 2006-06-19 18:46:23 CEST]LOG: 00000: server process (PID 6216) was terminated by signal 11
[3762 / 2006-06-19 18:46:23 CEST]LOCATION: LogChildExit, postmaster.c:2358
[3762 / 2006-06-19 18:46:23 CEST]LOG: 00000: terminating any other active server processes
[3762 / 2006-06-19 18:46:23 CEST]LOCATION: HandleChildCrash, postmaster.c:2251
[3985 / 2006-06-19 18:46:23 CEST]WARNING: 57P02: terminating connection because of crash of another server process
[3985 / 2006-06-19 18:46:23 CEST]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
[3985 / 2006-06-19 18:46:23 CEST]HINT: In a moment you should be able to reconnect to the database and repeat your command.
[3985 / 2006-06-19 18:46:23 CEST]LOCATION: quickdie, postgres.c:1945
[3762 / 2006-06-19 18:46:23 CEST]LOG: 00000: all server processes terminated; reinitializing
[3762 / 2006-06-19 18:46:23 CEST]LOCATION: reaper, postmaster.c:2150
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: database system was interrupted at 2006-06-19 18:42:49 CEST
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: StartupXLOG, xlog.c:4094
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: checkpoint record is at 11/3E77AB1C
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: StartupXLOG, xlog.c:4163
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: redo record is at 11/3E774940; undo record is at 0/0; shutdown FALSE
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: StartupXLOG, xlog.c:4191
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: next transaction ID: 3899415; next OID: 46429694
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: StartupXLOG, xlog.c:4194
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: database system was not properly shut down; automatic recovery in progress
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: StartupXLOG, xlog.c:4250
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: redo starts at 11/3E774940
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: StartupXLOG, xlog.c:4287
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: record with zero length at 11/3E77AD20
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: ReadRecord, xlog.c:2496
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: redo done at 11/3E77ACF8
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: StartupXLOG, xlog.c:4345
[6217 / 2006-06-19 18:46:23 CEST]LOG: 00000: database system is ready
[6217 / 2006-06-19 18:46:23 CEST]LOCATION: StartupXLOG, xlog.c:4557

Can anyone help me, please?

Regards,
Thomas
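In a log like the one above, the statement that killed a backend can be found by matching the PID in the "terminated by signal" entry against the last statement logged by that PID. The following is only a minimal sketch of that lookup, assuming the "[PID / timestamp]LEVEL: text" prefix used in this log and one log entry per line; the function name find_crashing_statements is made up for illustration and is not part of PostgreSQL.

```python
import re

# Prefix format assumed from the log in this thread: "[PID / timestamp]LEVEL: text"
LINE_RE = re.compile(r"^\[(\d+) / ([^\]]+)\](\w+):\s*(.*)$")
CRASH_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")
STMT_RE = re.compile(r"statement: (.*)$")


def find_crashing_statements(log_lines):
    """Return (pid, signal, last_statement) for every backend reported as crashed."""
    last_statement = {}  # backend PID -> most recent statement text logged by it
    crashes = []
    for raw in log_lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip lines that do not carry the expected prefix
        pid, _timestamp, _level, text = match.groups()
        stmt = STMT_RE.search(text)
        if stmt:
            last_statement[int(pid)] = stmt.group(1)
            continue
        crash = CRASH_RE.search(text)
        if crash:
            dead_pid = int(crash.group(1))
            crashes.append(
                (dead_pid, int(crash.group(2)), last_statement.get(dead_pid))
            )
    return crashes


# Three entries copied from the log above.
SAMPLE = [
    "[6216 / 2006-06-19 18:46:23 CEST]LOG: 00000: statement: "
    "SELECT * FROM hst_sales_report WHERE id = 5078867",
    "[6216 / 2006-06-19 18:46:23 CEST]LOCATION: pg_parse_query, postgres.c:526",
    "[3762 / 2006-06-19 18:46:23 CEST]LOG: 00000: server process (PID 6216) "
    "was terminated by signal 11",
]

if __name__ == "__main__":
    for pid, signal_number, statement in find_crashing_statements(SAMPLE):
        print(pid, signal_number, statement)
```

The crash entry is written by the postmaster (PID 3762 here), not by the backend that died, which is why the lookup goes through the PID named inside the message rather than the PID in the prefix.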
"Thomas Chille" <thomas.chille@gmail.com> wrote
> Hi List,
>
> i run in to an error while dumping a db.
>
> after investigating it, i found a possible corrupted table. but i am not sure.
> and i dont know how i can repair it? could it be a harddrive error?
>
>
> # now the error: SELECT * FROM hst_sales_report WHERE id = 5078867
>
> [6216 / 2006-06-19 18:46:23 CEST]LOG: 00000: connection received:
> host=[local] port=
> [6216 / 2006-06-19 18:46:23 CEST]LOCATION: BackendRun, postmaster.c:2679
> [6216 / 2006-06-19 18:46:23 CEST]LOG: 00000: connection authorized:
> user=postgres database=backoffice_db
> [6216 / 2006-06-19 18:46:23 CEST]LOCATION: BackendRun, postmaster.c:2751
> [6216 / 2006-06-19 18:46:23 CEST]LOG: 00000: statement: SELECT * FROM
> hst_sales_report WHERE id = 5078867
> [6216 / 2006-06-19 18:46:23 CEST]LOCATION: pg_parse_query, postgres.c:526
> [3762 / 2006-06-19 18:46:23 CEST]LOG: 00000: server process (PID
> 6216) was terminated by signal 11
> [3762 / 2006-06-19 18:46:23 CEST]LOCATION: LogChildExit, postmaster.c:2358
> [3762 / 2006-06-19 18:46:23 CEST]LOG: 00000: terminating any other
> active server processes
> [3762 / 2006-06-19 18:46:23 CEST]LOCATION: HandleChildCrash, postmaster.c:2251
> [3985 / 2006-06-19 18:46:23 CEST]WARNING: 57P02: terminating
> connection because of crash of another server process
> [3985 / 2006-06-19 18:46:23 CEST]DETAIL: The postmaster has commanded
> this server process to roll back the current transaction and exit,
> because another server process exited abnormally and possibly
> corrupted shared memory.

Which version are you using? In any case, barring a random hardware error, we expect Postgres to detect and report the problem rather than silently dump core. So can you gather the core dump and post it here?

Regards,
Qingqing
Hi Qingqing,

Thanks for your reply!

The PostgreSQL version is 8.0.4, and it runs on a Debian-based Linux server with kernel 2.6.11.2.

I have never dealt with a core dump before, but after setting "ulimit -c 1024" I got one. I don't know how to post it, because it is 1.5 MB in size. I will try to attach it as a gzip file.

I also could not install dbg on the affected system, so I tried to examine the core dump on another machine (Gentoo) with Postgres 8.0.4 and got the following output:

spoonpc01 ~ # gdb /usr/bin/postgres core
GNU gdb 6.4
Copyright 2005 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/tls/libthread_db.so.1".

warning: core file may not match specified executable file.
(no debugging symbols found)
Core was generated by `postgres: postgres backoffice_db [local] SELECT '.
Program terminated with signal 11, Segmentation fault.
#0 0x080753c2 in DataFill ()
(gdb) where
#0 0x080753c2 in DataFill ()
#1 0xb74253d4 in ?? ()
#2 0x0000001d in ?? ()
#3 0x08356fa8 in ?? ()
#4 0x08379420 in ?? ()
#5 0x00000000 in ?? ()
(gdb)

I can also say that the error is reproducible every time with the same query.

Thanks in advance

On 6/20/06, Qingqing Zhou <zhouqq@cs.toronto.edu> wrote:
>
> "Thomas Chille" <thomas.chille@gmail.com> wrote
> > Hi List,
> >
> > i run in to an error while dumping a db.
> >
> > after investigating it, i found a possible corrupted table. but i am not
> sure.
> > and i dont know how i can repair it? could it be a harddrive error?
"Thomas Chille" <thomas@chille.de> wrote
>
> I don't know how to post it, because the size is 1,5 MB?! I try to
> attch it as gzip.
>

No ... I mean the "bt" result of the core dump.

$gdb <postgres_exe_path> -c <core_file_name>
bt

> .
> Program terminated with signal 11, Segmentation fault.
> #0 0x080753c2 in DataFill ()
> (gdb) where
> #0 0x080753c2 in DataFill ()
> #1 0xb74253d4 in ?? ()
> #2 0x0000001d in ?? ()
> #3 0x08356fa8 in ?? ()
> #4 0x08379420 in ?? ()
> #5 0x00000000 in ?? ()
> (gdb)
>

Since it is repeatable in your machine, you can compile a new postgres version with "--enable-cassert" (enable assertions in code) and "--enable-debug" (enable gcc debug support) configuration. Then run it on your data and "bt" the core dump.

Regards,
Qingqing
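The "bt" step can also be run non-interactively, and the result checked for frames that gdb could not resolve, which is the symptom in the backtrace quoted above (every frame below DataFill shows "??"). This is only a rough sketch: it assumes a gdb that accepts the -batch and -ex options, and the helper names build_backtrace_command, backtrace and frame_names are hypothetical, chosen here for illustration.

```python
import re
import subprocess

# Matches gdb frame lines such as "#1 0xb74253d4 in ?? ()".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s+\(", re.MULTILINE)


def build_backtrace_command(postgres_exe, core_file):
    """Build a gdb command line that prints a backtrace and then exits."""
    # -batch: exit after running the given commands; -ex: execute one gdb command.
    return ["gdb", "-batch", "-ex", "bt", postgres_exe, core_file]


def backtrace(postgres_exe, core_file):
    """Run gdb on the core file and return its combined output as text."""
    result = subprocess.run(
        build_backtrace_command(postgres_exe, core_file),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return result.stdout


def frame_names(bt_text):
    """Return the function name of each frame; unresolved frames appear as '??'."""
    return [name for _number, name in FRAME_RE.findall(bt_text)]


# The frames posted earlier in this thread.
POSTED_BT = """\
#0 0x080753c2 in DataFill ()
#1 0xb74253d4 in ?? ()
#2 0x0000001d in ?? ()
#3 0x08356fa8 in ?? ()
#4 0x08379420 in ?? ()
#5 0x00000000 in ?? ()
"""

if __name__ == "__main__":
    names = frame_names(POSTED_BT)
    unresolved = sum(1 for name in names if name == "??")
    print(f"{unresolved} of {len(names)} frames unresolved")
```

A backtrace in which most frames are unresolved usually says little about where the crash happened, which is one reason to examine the core with the same binary that produced it, built with debug symbols.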
Thanks for your tips!

> Since it is repeatable in your machine, you can compile a new postgres
> version with "--enable-cassert" (enable assertions in code) and
> "--enable-debug" (enable gcc debug support) configuration. Then run it on
> your data and "bt" the core dump.

I will try to find out the reason for that behavior. For now I was able to drop the damaged table and restore it from an older backup, so everything works fine again.

Regards,
Thomas