Thread: Core Dump

Core Dump

From

"Ian Harding"

Date:

01 October 2002, 14:40:44

PostgreSQL just quit on me unexpectedly for the first time ever.  I have no doubt I did something stupid to cause it,
but I can't find any clues in the error logs.  Here is a backtrace of the core file; I am wondering if it tells anyone
anything.

I hacked my pltcl.so the other day, but all has been well up to now.  I added a few SPI_freetuptable() to keep pltcl
from hogging all the memory.  I wonder if I hacked it a little wrong.

                             version
-----------------------------------------------------------------
 PostgreSQL 7.2.1 on i386--netbsdelf, compiled by GCC egcs-1.1.2


Reading symbols from /usr/pkg/lib/postgresql/ltree.so...
(no debugging symbols found)...done.
#0  0x814955a in pfree ()
(gdb) bt
#0  0x814955a in pfree ()
#1  0x8149390 in MemoryContextDelete ()
#2  0x80c8411 in SPI_freetuptable ()
#3  0x4836c418 in pltclu_call_handler ()
#4  0x4838abb9 in TclInvokeStringCommand ()
#5  0x483a43d9 in TclExecuteByteCode ()
#6  0x4838b590 in Tcl_EvalObjEx ()
#7  0x483c67d7 in TclObjInterpProc ()
#8  0x483bfa24 in EvalObjv ()
#9  0x483c00ba in Tcl_EvalEx ()
#10 0x483c03a2 in Tcl_Eval ()
#11 0x4838caf0 in Tcl_GlobalEval ()
#12 0x4836af88 in pltclu_call_handler ()
#13 0x4836a6af in pltcl_call_handler ()
#14 0x80b6d17 in ExecCallTriggerFunc ()
#15 0x80b71ff in ExecBRUpdateTriggers ()
#16 0x80bdc53 in ExecReplace ()
#17 0x80bd996 in ExecutePlan ()
#18 0x80bcf27 in ExecutorRun ()
#19 0x80c8d3b in _SPI_pquery ()
#20 0x80c8a0b in _SPI_execute ()
#21 0x80c7899 in SPI_exec ()
#22 0x4836c0e2 in pltclu_call_handler ()
#23 0x4838abb9 in TclInvokeStringCommand ()
#24 0x483a43d9 in TclExecuteByteCode ()
#25 0x4838b590 in Tcl_EvalObjEx ()
#26 0x483c67d7 in TclObjInterpProc ()
#27 0x483bfa24 in EvalObjv ()
#28 0x483c00ba in Tcl_EvalEx ()
#29 0x483c03a2 in Tcl_Eval ()
#30 0x4838caf0 in Tcl_GlobalEval ()
#31 0x4836a991 in pltclu_call_handler ()
#32 0x4836a6c0 in pltcl_call_handler ()
#33 0x80bf68b in ExecMakeFunctionResult ()
#34 0x80bf72e in ExecEvalFunc ()
#35 0x80bfc41 in ExecEvalExpr ()
#36 0x80bfef5 in ExecTargetList ()
#37 0x80c013d in ExecProject ()
#38 0x80c50dc in ExecResult ()
#39 0x80be72a in ExecProcNode ()
#40 0x80bd9fd in ExecutePlan ()
#41 0x80bcf27 in ExecutorRun ()
#42 0x8105cd6 in ProcessQuery ()
#43 0x81045f9 in pg_exec_query_string ()
#44 0x8105560 in PostgresMain ()
#45 0x80ed60f in DoBackend ()
#46 0x80ecfc9 in BackendStartup ()
#47 0x80ec2e0 in ServerLoop ()
#48 0x80ebefe in PostmasterMain ()
#49 0x80cd67f in main ()
#50 0x80673f9 in ___start ()


Ian A. Harding
Programmer/Analyst II
Tacoma-Pierce County Health Department
(253) 798-3549
iharding@tpchd.org

WWSD - What Would Scooby Doo?

Re: Core Dump

From

Tom Lane

Date:

01 October 2002, 17:08:06

"Ian Harding" <ianh@tpchd.org> writes:
> I hacked my pltcl.so the other day, but all has been well up to now.
> I added a few SPI_freetuptable() to keep pltcl from hogging all the
> memory.  I wonder if I hacked it a little wrong.

Looks that way.  The stack trace doesn't seem completely trustworthy,
though, so you might want to consider recompiling with --enable-debug.

Note that you seem to be inside a re-entrant use of pltcl (outer
function is triggering a trigger also written in pltcl).  I'm wondering
if your tuptable hacking is not taking account of the possibility of
re-entrancy.  This might be a bug that had been latent in pltcl all
along, and was only exposed when you tried to free stuff ...

            regards, tom lane

Re: Core Dump

From

"Nigel J. Andrews"

Date:

01 October 2002, 21:03:44

On Tue, 1 Oct 2002, Tom Lane wrote:

> "Ian Harding" <ianh@tpchd.org> writes:
> > I hacked my pltcl.so the other day, but all has been well up to now.
> > I added a few SPI_freetuptable() to keep pltcl from hogging all the
> > memory.  I wonder if I hacked it a little wrong.
>
> Looks that way.  The stack trace doesn't seem completely trustworthy,
> though, so you might want to consider recompiling with --enable-debug.
>
> Note that you seem to be inside a re-entrant use of pltcl (outer
> function is triggering a trigger also written in pltcl).  I'm wondering
> if your tuptable hacking is not taking account of the possibility of
> re-entrancy.  This might be a bug that had been latent in pltcl all
> along, and was only exposed when you tried to free stuff ...

That's exactly the fault I kicked with my original patch to HEAD. However,
wasn't there very little work done on pltcl.c since 7.2.x, and shouldn't Neil's
(or was it Joe's?) last patch have applied fairly cleanly to 7.2.2?


--
Nigel J. Andrews

Re: Core Dump

From

"Ian Harding"

Date:

03 October 2002, 11:22:33

I have finally got a chance to do more looking and you are correct.  It seems the only invocation of SPI_freetuptable
that is OK (taking into account re-entrancy) is the one in the "If there is no loop body given..." block.  Any time any
of the ones in the "There is a loop body..." bit get called, it explodes.

I assumed the SPI_freetuptable(SPI_tuptable) bit would know to only free the tuple table (whatever that is) from the
most recently executed spi_exec.

To take care of my problem, and not blow up in nested "-array" types of spi_exec constructs, it seems we only need the
line added in the "If there is no loop body given..." blocks.  If there is a loop body, doesn't the memory get freed
when the procedure finishes up anyway?  I guess if you had numerous consecutive large loops within a tcl proc you might
gobble up some memory, but even I don't do that and I am a pretty clumsy programmer.  If they are nested, that should be
all right since the memory bloat was only caused by the innermost (non "-array") call to spi_exec.
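
To make the re-entrancy problem concrete, here is a small self-contained C model.  It is only a sketch and not pltcl
or SPI code: g_tuptable stands in for the global SPI_tuptable, and toy_exec(), toy_free(), loop_unsafe() and
loop_saved() are hypothetical names made up for the illustration.  It assumes the crash comes from a nested call
replacing the global before the outer loop frees "its" table, and shows one possible remedy (remembering the table in
a local variable before running the loop body).  Whether the real fix looks like this is not established here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct TupTable {
    int freed;                      /* 1 once the table has been released */
} TupTable;

#define POOL_SIZE 8
static TupTable pool[POOL_SIZE];    /* static storage, so nothing is really freed */
static int pool_used = 0;
static int double_frees = 0;

static TupTable *g_tuptable = NULL; /* models the global SPI_tuptable */

/* Models SPI_exec: every call publishes a fresh result in the global. */
static void toy_exec(void)
{
    TupTable *t;

    assert(pool_used < POOL_SIZE);
    t = &pool[pool_used++];
    t->freed = 0;
    g_tuptable = t;
}

/* Models SPI_freetuptable: records a double free instead of crashing. */
static void toy_free(TupTable *t)
{
    if (t->freed) {
        double_frees++;
        return;
    }
    t->freed = 1;
}

typedef void (*LoopBody)(void);

/* Unsafe: after the body runs, the global may point at a nested result. */
static void loop_unsafe(LoopBody body)
{
    toy_exec();
    if (body != NULL)
        body();                     /* may call toy_exec() again */
    toy_free(g_tuptable);           /* possibly the inner, already freed table */
}

/* Safer: remember our own table before the body can replace the global. */
static void loop_saved(LoopBody body)
{
    TupTable *mine;

    toy_exec();
    mine = g_tuptable;
    if (body != NULL)
        body();
    toy_free(mine);
}

static void inner_unsafe(void) { loop_unsafe(NULL); }
static void inner_saved(void)  { loop_saved(NULL); }

static void reset(void)
{
    pool_used = 0;
    double_frees = 0;
    g_tuptable = NULL;
}

/* Number of tables handed out since the last reset that were never freed. */
int count_unfreed(void)
{
    int i, n = 0;

    for (i = 0; i < pool_used; i++)
        if (!pool[i].freed)
            n++;
    return n;
}

/* Each runner nests one loop inside another and returns the double-free count. */
int run_unsafe(void) { reset(); loop_unsafe(inner_unsafe); return double_frees; }
int run_saved(void)  { reset(); loop_saved(inner_saved);   return double_frees; }
```

In this model the nested unsafe loop frees the inner table twice and never frees the outer one, which matches both
the crash and the leak discussed in this thread; the variant that saves the pointer frees each table exactly once.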

Thank you for looking at this!

Ian


Re: Core Dump

From

"Nigel J. Andrews"

Date:

03 October 2002, 11:28:51

On Thu, 3 Oct 2002, Ian Harding wrote:

> I have finally got a chance to do more looking and you are correct.  It seems the only invocation of SPI_freetuptable
> that is OK (taking into account re-entrancy) is the one in the "If there is no loop body given..." block.  Any time any
> of the ones in the "There is a loop body..." bit get called, it explodes.
>
> I assumed the SPI_freetuptable(SPI_tuptable) bit would know to only free the tuple table (whatever that is) from the
> most recently executed spi_exec.
>
> To take care of my problem, and not blow up in nested "-array" types of spi_exec constructs, it seems we only need
> the line added in the "If there is no loop body given..." blocks.  If there is a loop body, doesn't the memory get freed
> when the procedure finishes up anyway?  I guess if you had numerous consecutive large loops within a tcl proc you might
> gobble up some memory, but even I don't do that and I am a pretty clumsy programmer.  If they are nested, that should be
> all right since the memory bloat was only caused by the innermost (non "-array") call to spi_exec.

Yes, I think Neil sent a patch that took out this fault but reinserted a
variation of the original leak problem. I know how to fix it I just need to
sort out what has gone on with the source file in the meantime because I can't
see Neil's patch, which did other things as well, in there yet. I will do this
memory problem regardless later tonight or early tomorrow.

(Neil might be Joe, I'll have to look at my saved messages)

--
Nigel J. Andrews

Re: Core Dump

From

Neil Conway

Date:

03 October 2002, 21:56:08

"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:
> Yes, I think Neil sent a patch that took out this fault but
> reinserted a variation of the original leak problem. I know how to
> fix it I just need to sort out what has gone on with the source file
> in the meantime because I can't see Neil's patch, which did other
> things as well, in there yet. I will do this memory problem
> regardless later tonight or early tomorrow.

Ok, sounds good to me -- if you'd like another copy of my patch, just
let me know. The patch I sent was really just a quick hack, so I'm not
surprised it didn't cover all the cases. I'll leave the proper fix
in your hands -- let me know if you're too busy and I can fix it...

> (Neil might be Joe, I'll have to look at my saved messages)

Heh, all us Conways are interchangeable, eh? :-)

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC