Re: BUG #5905: segfault on pg_terminate_backend - Mailing list pgsql-bugs
From | Merlin Moncure |
---|---|
Subject | Re: BUG #5905: segfault on pg_terminate_backend |
Date | |
Msg-id | AANLkTik_NFbQ5eXizWJrK_cRjBYtxUJKVSMhtz2GT3jT@mail.gmail.com Whole thread Raw |
In response to | BUG #5905: segfault on pg_terminate_backend ("Matt Harrington" <matt.harrington@rentrak.com>) |
List | pgsql-bugs |
On Tue, Mar 1, 2011 at 12:50 PM, Matt Harrington <matt.harrington@rentrak.com> wrote: > > The following bug has been logged online: > > Bug reference: =A0 =A0 =A05905 > Logged by: =A0 =A0 =A0 =A0 =A0Matt Harrington > Email address: =A0 =A0 =A0matt.harrington@rentrak.com > PostgreSQL version: 8.4.3 > Operating system: =A0 CentOS 4.7 > Description: =A0 =A0 =A0 =A0segfault on pg_terminate_backend > Details: > > I have occasionally triggered a segfault while executing > pg_terminate_backend (or kill on child postgres processes.) =A0I still ha= ve > the core file and associated logs. > > $ gdb -c core.5746 /usr/local/stow/postgresql-8.4.3/bin/postmaster > GNU gdb Red Hat Linux (6.3.0.0-1.159.el4rh) > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you > are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. =A0Type "show warranty" for deta= ils. > This GDB was configured as "x86_64-redhat-linux-gnu"...Using host > libthread_db library "/lib64/tls/libthread_db.so.1". > > Core was generated by `postgres: postgres att_dev [local] SELECT > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0'. > Program terminated with signal 11, Segmentation fault. > Reading symbols from /lib64/libcrypt.so.1...done. > Loaded symbols for /lib64/libcrypt.so.1 > Reading symbols from /lib64/libdl.so.2...done. > Loaded symbols for /lib64/libdl.so.2 > Reading symbols from /lib64/tls/libm.so.6...done. > Loaded symbols for /lib64/tls/libm.so.6 > Reading symbols from /lib64/tls/libc.so.6...done. > Loaded symbols for /lib64/tls/libc.so.6 > Reading symbols from /lib64/ld-linux-x86-64.so.2...done. > Loaded symbols for /lib64/ld-linux-x86-64.so.2 > Reading symbols from > /data_storage/local/stow/postgresql-8.4.3/lib/auto_explain.so...done. > Loaded symbols for > /data_storage/local/stow/postgresql-8.4.3/lib/auto_explain.so > Reading symbols from > /data_storage/local/stow/postgresql-8.4.3/lib/pg_stat_statements.so...don= e. > Loaded symbols for > /data_storage/local/stow/postgresql-8.4.3/lib/pg_stat_statements.so > Reading symbols from > /data_storage/local/stow/postgresql-8.4.3/lib/plpgsql.so...done. > Loaded symbols for /data_storage/local/stow/postgresql-8.4.3/lib/plpgsql.= so > #0 =A0ResourceOwnerEnlargeCatCacheRefs (owner=3D0x0) at resowner.c:601 > 601 =A0 =A0 =A0 =A0 =A0 =A0 if (owner->ncatrefs < owner->maxcatrefs) > (gdb) bt > #0 =A0ResourceOwnerEnlargeCatCacheRefs (owner=3D0x0) at resowner.c:601 > #1 =A00x000000000066586e in SearchCatCache (cache=3D0xa26ff0, v1=3DVariab= le "v1" > is not available. > ) at catcache.c:1182 > #2 =A00x0000000000536095 in sql_exec_error_callback (arg=3DVariable "arg"= is not > available. > ) at functions.c:883 > #3 =A00x000000000067355c in errfinish (dummy=3DVariable "dummy" is not > available. > ) at elog.c:386 > #4 =A00x00000000005cf3e5 in log_disconnections (code=3DVariable "code" is= not > available. > ) at postgres.c:4019 > #5 =A00x00000000005bd020 in proc_exit_prepare (code=3D1) at ipc.c:182 > #6 =A00x00000000005bd048 in proc_exit (code=3D0) at ipc.c:95 > #7 =A00x00000000006736d6 in errfinish (dummy=3DVariable "dummy" is not > available. > ) at elog.c:488 > #8 =A00x00000000005cb4c1 in ProcessInterrupts () at postgres.c:2653 > #9 =A00x00000000005342fa in ExecScan (node=3D0xaccb60, accessMtd=3D0x5400= 00 > <SeqNext>) at execScan.c:101 > #10 0x000000000052d86d in ExecProcNode (node=3D0xaccb60) at > execProcnode.c:363 > #11 0x000000000053c83c in ExecHashJoin (node=3D0xb49920) at > nodeHashjoin.c:131 > #12 0x000000000052d7bd in ExecProcNode (node=3D0xb49920) at > execProcnode.c:412 > #13 0x0000000000540b1a in ExecSort (node=3D0xb49810) at nodeSort.c:102 > #14 0x000000000052d79a in ExecProcNode (node=3D0xb49810) at > execProcnode.c:423 > #15 0x000000000052b5ca in standard_ExecutorRun (queryDesc=3D0xb153a0, > direction=3DForwardScanDirection, count=3D10) at execMain.c:1504 > #16 0x0000002a98417197 in explain_ExecutorRun (queryDesc=3D0xb153a0, > direction=3DForwardScanDirection, count=3D10) at auto_explain.c:168 > #17 0x0000002a9851a3e3 in pgss_ExecutorRun (queryDesc=3D0xb153a0, > direction=3DForwardScanDirection, count=3D10) at pg_stat_statements.c:514 > #18 0x00000000005cff49 in PortalRunSelect (portal=3D0xaa2db0, forward=3DV= ariable > "forward" is not available. > ) at pquery.c:953 > #19 0x00000000005d0e80 in PortalRunFetch (portal=3D0xaa2db0, > fdirection=3DFETCH_FORWARD, count=3D10, dest=3D0x92db20) at pquery.c:1402 > #20 0x0000000000548a81 in _SPI_cursor_operation (portal=3D0xaa2db0, > direction=3DFETCH_FORWARD, count=3D10, dest=3D0x92db20) at spi.c:2088 > #21 0x0000002ae7a195f4 in exec_for_query (estate=3D0x7fbfffcd50, > stmt=3D0xaec558, portal=3D0xaa2db0, prefetch_ok=3D1 '\001') at pl_exec.c:= 4211 > #22 0x0000002ae7a197f6 in exec_stmt_fors (estate=3D0x7fbfffcd50, > stmt=3D0xaec558) at pl_exec.c:1852 > #23 0x0000002ae7a16699 in exec_stmts (estate=3D0x7fbfffcd50, stmts=3DVari= able > "stmts" is not available. > ) at pl_exec.c:1269 > #24 0x0000002ae7a17ff1 in exec_stmt_block (estate=3D0x0, block=3D0xaec650= ) at > pl_exec.c:1137 > #25 0x0000002ae7a18b69 in plpgsql_exec_function (func=3D0xa88458, > fcinfo=3D0x7fbfffd020) at pl_exec.c:315 > #26 0x0000002ae7a0f940 in plpgsql_call_handler (fcinfo=3D0x7fbfffd020) at > pl_handler.c:95 > #27 0x000000000052fd64 in ExecMakeTableFunctionResult (funcexpr=3D0xabe00= 0, > econtext=3D0xabdd60, expectedDesc=3D0xabdec0, randomAccess=3D0 '\0') > =A0 =A0at execQual.c:2016 > #28 0x000000000053f561 in FunctionNext (node=3D0xabdc50) at > nodeFunctionscan.c:64 > #29 0x0000000000534331 in ExecScan (node=3D0x0, accessMtd=3D0x53f4f0 > <FunctionNext>) at execScan.c:68 > #30 0x000000000052d81d in ExecProcNode (node=3D0xabdc50) at > execProcnode.c:385 > #31 0x0000000000537acc in ExecAppend (node=3D0xaa5dd0) at nodeAppend.c:269 > #32 0x000000000052d88d in ExecProcNode (node=3D0xaa5dd0) at > execProcnode.c:348 > #33 0x000000000052b5ca in standard_ExecutorRun (queryDesc=3D0xb5dbc0, > direction=3DForwardScanDirection, count=3D0) at execMain.c:1504 > #34 0x0000002a98417197 in explain_ExecutorRun (queryDesc=3D0xb5dbc0, > direction=3DForwardScanDirection, count=3D0) at auto_explain.c:168 > #35 0x0000002a9851a3e3 in pgss_ExecutorRun (queryDesc=3D0xb5dbc0, > direction=3DForwardScanDirection, count=3D0) at pg_stat_statements.c:514 > #36 0x0000000000536da4 in fmgr_sql (fcinfo=3D0x7fbfffd9d0) at functions.c= :472 > #37 0x000000000052fd64 in ExecMakeTableFunctionResult (funcexpr=3D0xaf14a= 0, > econtext=3D0xaf11d0, expectedDesc=3D0xaf1360, randomAccess=3D0 '\0') > =A0 =A0at execQual.c:2016 > #38 0x000000000053f561 in FunctionNext (node=3D0xaf10c0) at > nodeFunctionscan.c:64 > #39 0x0000000000534331 in ExecScan (node=3D0x0, accessMtd=3D0x53f4f0 > <FunctionNext>) at execScan.c:68 > #40 0x000000000052d81d in ExecProcNode (node=3D0xaf10c0) at > execProcnode.c:385 > #41 0x000000000052b5ca in standard_ExecutorRun (queryDesc=3D0xa98450, > direction=3DForwardScanDirection, count=3D0) at execMain.c:1504 > ---Type <return> to continue, or q <return> to quit--- > #42 0x0000002a98417197 in explain_ExecutorRun (queryDesc=3D0xa98450, > direction=3DForwardScanDirection, count=3D0) at auto_explain.c:168 > #43 0x0000002a9851a3e3 in pgss_ExecutorRun (queryDesc=3D0xa98450, > direction=3DForwardScanDirection, count=3D0) at pg_stat_statements.c:514 > #44 0x00000000005cff49 in PortalRunSelect (portal=3D0xaa2ca0, forward=3DV= ariable > "forward" is not available. > ) at pquery.c:953 > #45 0x00000000005d0760 in PortalRun (portal=3D0xaa2ca0, > count=3D9223372036854775807, isTopLevel=3D1 '\001', dest=3D0xa55c98, > altdest=3D0xa55c98, > =A0 =A0completionTag=3D0x7fbfffe360 "") at pquery.c:779 > #46 0x00000000005cc2d6 in exec_simple_query ( > =A0 =A0query_string=3D0xa53da0 "select * from > rtk_role_maint.sql_to_refresh_grants_for_dml =A0 =A0 =A0 ('att', 'att_dml= ') ;") at > postgres.c:991 > #47 0x00000000005cdcdf in PostgresMain (argc=3D4, argv=3DVariable "argv" = is not > available. > ) at postgres.c:3614 > #48 0x00000000005a4ee0 in ServerLoop () at postmaster.c:3447 > #49 0x00000000005a686d in PostmasterMain (argc=3D3, argv=3D0x9b04e0) at > postmaster.c:1040 > #50 0x00000000005530be in main (argc=3D3, argv=3D0x9b04e0) at main.c:188 > > > $ grep -A70 'Segmentation fault' postgresql-2011-03-01_000000.log > <6719, 2011-03-01 06:00:15 PST> LOG: =A0server process (PID 5746) was > terminated by signal 11: Segmentation fault > <6719, 2011-03-01 06:00:15 PST> LOG: =A0terminating any other active serv= er > processes > <5940, 2011-03-01 06:00:15 PST> WARNING: =A0terminating connection becaus= e of > crash of another server process > <5940, 2011-03-01 06:00:15 PST> DETAIL: =A0The postmaster has commanded t= his > server process to roll back the current transaction and exit, because > another server process exited abnormally and possibly corrupted shared > memory. > <5940, 2011-03-01 06:00:15 PST> HINT: =A0In a moment you should be able to > reconnect to the database and repeat your command. > ... > <6040, 2011-03-01 06:00:17 PST> FATAL: =A0the database system is in recov= ery > mode > <6039, 2011-03-01 06:00:17 PST> LOG: =A0database system was interrupted; = last > known up at 2011-03-01 05:51:18 PST > <6039, 2011-03-01 06:00:17 PST> LOG: =A0database system was not properly = shut > down; automatic recovery in progress > <6039, 2011-03-01 06:00:17 PST> LOG: =A0redo starts at 7D0/6FF7E490 > <6039, 2011-03-01 06:00:17 PST> LOG: =A0unexpected pageaddr 7D0/20F9A000 = in > log file 2000, segment 111, offset 16359424 > <6039, 2011-03-01 06:00:17 PST> LOG: =A0redo done at 7D0/6FF97728 > <6039, 2011-03-01 06:00:17 PST> LOG: =A0last completed transaction was at= log > time 2011-03-01 06:00:04.083136-08 > <6039, 2011-03-01 06:00:17 PST> LOG: =A0checkpoint starting: end-of-recov= ery > immediate > <6039, 2011-03-01 06:00:17 PST> LOG: =A0checkpoint complete: wrote 24 buf= fers > (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=3D0= .009 > s, sync=3D0.002 s, total=3D0.011 s > <6044, 2011-03-01 06:00:18 PST> LOG: =A0autovacuum launcher started > <6719, 2011-03-01 06:00:18 PST> LOG: =A0database system is ready to accept > connections > > Unfortunately, I have not been successful in generating a reproducible ca= se; > it may be a race condition edge case. did you read: http://archives.postgresql.org/pgsql-bugs/2010-03/msg00170.ph= p? this may have been fixed -- check follow ups in pgsql-commits. and upgrade to latest 8.4!! merlin
pgsql-bugs by date: