Thread: Core dump on 7.1.3 on Linux 2.2.19
On a production server I am getting periodic core dumps from postgres. The server can go for days or weeks fine without any problems, but does dump core every so often.

It happened to me this afternoon while running a 'vacuum analyze verbose'. I have attached the stack trace below. I looked at a core from the vacuum as well as another core file from a prior operation (which wasn't a vacuum) and they had the same stack. So I don't think this is a vacuum problem.

Any ideas? (I intend to rebuild to get some better info in the stack trace, but it may be a while before I get around to that).

thanks,
--Barry

[root@xythos1 26382]# gdb postgres core
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...

warning: core file may not match specified executable file.
Core was generated by `postgres: postgres files 127.0.0.1 SELECT '.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libreadline.so.4.1...done.
Loaded symbols for /usr/lib/libreadline.so.4.1
Reading symbols from /lib/libtermcap.so.2...done.
Loaded symbols for /lib/libtermcap.so.2
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/local/pgsql/lib/plpgsql.so...done.
Loaded symbols for /usr/local/pgsql/lib/plpgsql.so
#0  0x80b9693 in ExecEvalVar ()
(gdb) where
#0  0x80b9693 in ExecEvalVar ()
#1  0x80ba219 in ExecEvalExpr ()
#2  0x80b9c6b in ExecEvalFuncArgs ()
#3  0x80b9ce4 in ExecMakeFunctionResult ()
#4  0x80b9e81 in ExecEvalOper ()
#5  0x80ba289 in ExecEvalExpr ()
#6  0x80ba39a in ExecQual ()
#7  0x80bdca1 in IndexNext ()
#8  0x80ba83f in ExecScan ()
#9  0x80bde78 in ExecIndexScan ()
#10 0x80b8fc1 in ExecProcNode ()
#11 0x80bf6f4 in ExecNestLoop ()
#12 0x80b8ffd in ExecProcNode ()
#13 0x80b8c5e in EvalPlanQualNext ()
#14 0x80b8c35 in EvalPlanQual ()
#15 0x80b818a in ExecutePlan ()
#16 0x80b7738 in ExecutorRun ()
#17 0x80fc3af in ProcessQuery ()
#18 0x80faebe in pg_exec_query_string ()
#19 0x80fbea6 in PostgresMain ()
#20 0x80e6fc8 in DoBackend ()
#21 0x80e6bc7 in BackendStartup ()
#22 0x80e5e3d in ServerLoop ()
#23 0x80e5888 in PostmasterMain ()
#24 0x80c7107 in main ()
#25 0x400ecf31 in __libc_start_main (main=0x80c6fd4 <main>, argc=3,
    ubp_av=0xbffffa74, init=0x8065314 <_init>, fini=0x813e19c <_fini>,
    rtld_fini=0x4000e274 <_dl_fini>, stack_end=0xbffffa6c)
    at ../sysdeps/generic/libc-start.c:129
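
Checking whether two core files show the same stack, as described above, can be scripted instead of done by eye. The following is only a minimal sketch: it assumes each trace is the saved text of a gdb `where` listing in the format shown above, with every frame carrying a hex address, and the function names `frame_names` and `same_stack` are made up for illustration.

```python
import re

# Matches gdb frame lines such as "#13 0x80b8c5e in EvalPlanQualNext ()".
# Assumption: every frame has the "0x... in name (" shape seen in this trace.
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-fA-F]+ in (\w+) \(")


def frame_names(backtrace_text):
    """Return the function names in frame order, innermost frame first."""
    frames = {}
    for match in FRAME_RE.finditer(backtrace_text):
        # gdb prints frame #0 once on load and again after "where";
        # setdefault keeps only the first occurrence of each frame number.
        frames.setdefault(int(match.group(1)), match.group(2))
    return [frames[number] for number in sorted(frames)]


def same_stack(trace_a, trace_b):
    """True if both traces list the same functions in the same order."""
    return frame_names(trace_a) == frame_names(trace_b)
```

Only function names are compared, not addresses, so the check still works if the two cores come from differently built binaries.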
Barry Lind <barry@xythos.com> writes:
> It happened to me this afternoon while running a 'vacuum analyze
> verbose'. I have attached the stack trace below.

That trace is certainly not from a vacuum operation.

I'd suggest rebuilding with --enable-debug; we won't be able to learn much without that. Until you do that, possibly it'd help to turn on query logging so that we can learn what query is crashing.

I find the presence of EvalPlanQual in the backtrace suggestive. I don't trust that code at all ;-) ... but without a lot more info we're not going to be able to figure out anything.

BTW, EvalPlanQual is only called if the query is an UPDATE or DELETE that tries to update a row that's already been updated by a not-yet-committed transaction. That probably explains why you don't see the crash often --- if you deliberately set up the right circumstances, you could perhaps reproduce it on-demand.

regards, tom lane
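
The circumstances described above (one session updating a row that another session has updated but not yet committed) could be set up deliberately along these lines. This is only a sketch under stated assumptions: `connect` is a hypothetical zero-argument callable returning a Python DB-API connection that opens a transaction implicitly, `epq_test` is a hypothetical table with integer columns `id` and `v`, and the function name is made up. The thread does not establish that this reaches EvalPlanQual on a given server or triggers the crash.

```python
import threading
import time


def provoke_concurrent_update(connect, table="epq_test", row_id=1,
                              hold_seconds=1.0):
    """Have session B update a row that session A updated but has not committed.

    `connect` and `table` are hypothetical placeholders, not names from
    the thread.
    """
    sql = "UPDATE %s SET v = v + 1 WHERE id = %d" % (table, row_id)

    conn_a = connect()
    conn_b = connect()

    cur_a = conn_a.cursor()
    cur_a.execute(sql)        # session A updates the row and leaves it uncommitted

    def session_b():
        cur_b = conn_b.cursor()
        cur_b.execute(sql)    # on a real server this waits for A's transaction
        conn_b.commit()

    worker = threading.Thread(target=session_b)
    worker.start()
    time.sleep(hold_seconds)  # give session B time to reach the locked row
    conn_a.commit()           # A commits; B can now proceed against the new row version
    worker.join()

    conn_a.close()
    conn_b.close()
```

A thread is used because session B's UPDATE is expected to block until session A commits, so the two statements cannot be issued one after the other from a single flow of control.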
Re: Core dump on 7.1.3 on Linux 2.2.19
From: "Dmitry G. Mastrukov" (Дмитрий Геннадьевич Мастрюков)
Date:
On Tue, 2001-11-06 at 04:41, Barry Lind wrote:
> On a production server I am getting periodic core dumps from postgres.
> The server can go for days or weeks fine without any problems, but does
> dump core every so often.
>
> It happened to me this afternoon while running a 'vacuum analyze
> verbose'. I have attached the stack trace below. I looked at a core
> from the vacuum as well as another core file from a prior operation
> (which wasn't a vacuum) and they had the same stack. So I don't think
> this is a vacuum problem.
>
> Any ideas? (I intend to rebuild to get some better info in the stack
> trace, but it may be a while before I get around to that).
>

I experienced a problem with 'vacuum analyze' on postgres 7.1.2 with glibc-2.2.2, and it turned out to be a bug in libc. Upgrading to glibc-2.2.3 solved my problem.

Regards,
Dmitry