Core Dump

Started by Ian Harding over 23 years ago | 6 messages | general
#1 Ian Harding
ianh@tpchd.org

PostgreSQL just quit on me unexpectedly for the first time ever. I have no doubt I did something stupid to cause it, but I can't find any clues in the error logs. Here is a backtrace of the core file; I am wondering if it tells anyone anything.

I hacked my pltcl.so the other day, but all has been well up to now. I added a few SPI_freetuptable() calls to keep pltcl from hogging all the memory. I wonder if I hacked it a little wrong.

version
-----------------------------------------------------------------
PostgreSQL 7.2.1 on i386--netbsdelf, compiled by GCC egcs-1.1.2

Reading symbols from /usr/pkg/lib/postgresql/ltree.so...
(no debugging symbols found)...done.
#0 0x814955a in pfree ()
(gdb) bt
#0 0x814955a in pfree ()
#1 0x8149390 in MemoryContextDelete ()
#2 0x80c8411 in SPI_freetuptable ()
#3 0x4836c418 in pltclu_call_handler ()
#4 0x4838abb9 in TclInvokeStringCommand ()
#5 0x483a43d9 in TclExecuteByteCode ()
#6 0x4838b590 in Tcl_EvalObjEx ()
#7 0x483c67d7 in TclObjInterpProc ()
#8 0x483bfa24 in EvalObjv ()
#9 0x483c00ba in Tcl_EvalEx ()
#10 0x483c03a2 in Tcl_Eval ()
#11 0x4838caf0 in Tcl_GlobalEval ()
#12 0x4836af88 in pltclu_call_handler ()
#13 0x4836a6af in pltcl_call_handler ()
#14 0x80b6d17 in ExecCallTriggerFunc ()
#15 0x80b71ff in ExecBRUpdateTriggers ()
#16 0x80bdc53 in ExecReplace ()
#17 0x80bd996 in ExecutePlan ()
#18 0x80bcf27 in ExecutorRun ()
#19 0x80c8d3b in _SPI_pquery ()
#20 0x80c8a0b in _SPI_execute ()
#21 0x80c7899 in SPI_exec ()
#22 0x4836c0e2 in pltclu_call_handler ()
#23 0x4838abb9 in TclInvokeStringCommand ()
#24 0x483a43d9 in TclExecuteByteCode ()
#25 0x4838b590 in Tcl_EvalObjEx ()
#26 0x483c67d7 in TclObjInterpProc ()
#27 0x483bfa24 in EvalObjv ()
#28 0x483c00ba in Tcl_EvalEx ()
#29 0x483c03a2 in Tcl_Eval ()
#30 0x4838caf0 in Tcl_GlobalEval ()
#31 0x4836a991 in pltclu_call_handler ()
#32 0x4836a6c0 in pltcl_call_handler ()
#33 0x80bf68b in ExecMakeFunctionResult ()
#34 0x80bf72e in ExecEvalFunc ()
#35 0x80bfc41 in ExecEvalExpr ()
#36 0x80bfef5 in ExecTargetList ()
#37 0x80c013d in ExecProject ()
#38 0x80c50dc in ExecResult ()
#39 0x80be72a in ExecProcNode ()
#40 0x80bd9fd in ExecutePlan ()
#41 0x80bcf27 in ExecutorRun ()
#42 0x8105cd6 in ProcessQuery ()
#43 0x81045f9 in pg_exec_query_string ()
#44 0x8105560 in PostgresMain ()
#45 0x80ed60f in DoBackend ()
#46 0x80ecfc9 in BackendStartup ()
#47 0x80ec2e0 in ServerLoop ()
#48 0x80ebefe in PostmasterMain ()
#49 0x80cd67f in main ()
#50 0x80673f9 in ___start ()

Ian A. Harding
Programmer/Analyst II
Tacoma-Pierce County Health Department
(253) 798-3549
iharding@tpchd.org

WWSD - What Would Scooby Doo?

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ian Harding (#1)
Re: Core Dump

"Ian Harding" <ianh@tpchd.org> writes:

I hacked my pltcl.so the other day, but all has been well up to now.
I added a few SPI_freetuptable() to keep pltcl from hogging all the
memory. I wonder if I hacked it a little wrong.

Looks that way. The stack trace doesn't seem completely trustworthy,
though, so you might want to consider recompiling with --enable-debug.

Note that you seem to be inside a re-entrant use of pltcl (outer
function is triggering a trigger also written in pltcl). I'm wondering
if your tuptable hacking is not taking account of the possibility of
re-entrancy. This might be a bug that had been latent in pltcl all
along, and was only exposed when you tried to free stuff ...

regards, tom lane
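
The re-entrancy hazard described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `MockTupTable`, `mock_last_result`, and `mock_exec` are hypothetical stand-ins invented for illustration, not the real SPI structures or functions, and the real `SPI_tuptable` may behave differently in detail. The model shows just one thing: when a single global holds "the most recent result", a nested call made from inside a loop body replaces it, so the outer caller can no longer assume the global still refers to its own table.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a result table; NOT the real SPITupleTable. */
typedef struct MockTupTable {
    int depth;                      /* nesting level of the call that built it */
} MockTupTable;

/* Models one global "most recent result" pointer
 * (assumed to be analogous to SPI_tuptable). */
static MockTupTable *mock_last_result = NULL;

/* Hypothetical executor: builds a table and publishes it in the global. */
static MockTupTable *mock_exec(int depth)
{
    MockTupTable *t = malloc(sizeof *t);

    assert(t != NULL);
    t->depth = depth;
    mock_last_result = t;
    return t;
}

/*
 * The outer call at depth 0 runs a "loop body" that itself calls mock_exec
 * at depth 1 (think: an UPDATE that fires a trigger written in the same
 * language). Returns the depth recorded in the global after the body ran.
 */
int depth_seen_after_loop_body(void)
{
    MockTupTable *outer = mock_exec(0);
    MockTupTable *inner = mock_exec(1);   /* nested call replaces the global */
    int seen = mock_last_result->depth;   /* the global now names the inner table */

    free(inner);
    free(outer);
    mock_last_result = NULL;
    return seen;
}
```

In this model, freeing "whatever the global points at" after the loop body would release the inner table rather than the outer one, which is the kind of mistake a backtrace ending in a free routine would be consistent with.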

#3 Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Tom Lane (#2)
Re: Core Dump

On Tue, 1 Oct 2002, Tom Lane wrote:

"Ian Harding" <ianh@tpchd.org> writes:

I hacked my pltcl.so the other day, but all has been well up to now.
I added a few SPI_freetuptable() to keep pltcl from hogging all the
memory. I wonder if I hacked it a little wrong.

Looks that way. The stack trace doesn't seem completely trustworthy,
though, so you might want to consider recompiling with --enable-debug.

Note that you seem to be inside a re-entrant use of pltcl (outer
function is triggering a trigger also written in pltcl). I'm wondering
if your tuptable hacking is not taking account of the possibility of
re-entrancy. This might be a bug that had been latent in pltcl all
along, and was only exposed when you tried to free stuff ...

That's exactly the fault I kicked with my original patch to HEAD. However,
wasn't there very little work done on pltcl.c since 7.2.x, and shouldn't Neil's
(or was it Joe's?) last patch have applied fairly cleanly to 7.2.2?

--
Nigel J. Andrews

#4 Ian Harding
ianh@tpchd.org
In reply to: Nigel J. Andrews (#3)
Re: Core Dump

I have finally got a chance to do more looking and you are correct. It seems the only invocation of SPI_freetuptable that is OK (taking into account re-entrancy) is the one in the "If there is no loop body given..." block. Any time any of the ones in the "There is a loop body..." bit get called, it explodes.

I assumed the SPI_freetuptable(SPI_tuptable) bit would know to only free the tuple table (whatever that is) from the most recently executed spi_exec.

To take care of my problem, and not blow up in nested "-array" types of spi_exec constructs, it seems we only need the line added in the "If there is no loop body given..." blocks. If there is a loop body, doesn't the memory get freed when the procedure finishes up anyway? I guess if you had numerous consecutive large loops within a tcl proc you might gobble up some memory, but even I don't do that and I am a pretty clumsy programmer. If they are nested, that should be all right since the memory bloat was only caused by the innermost (non "-array") call to spi_exec.

Thank you for looking at this!

Ian
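
One way to free the result in the loop-body case without tripping over nested calls is to copy the pointer into a local variable before the body runs and free that local afterwards. The sketch below is a hedged model of that pattern only: `MockTupTable`, `mock_last_result`, `mock_exec`, `mock_free`, and both call functions are hypothetical names made up for this example, not the pltcl source or the real SPI API, and it is not claimed to be the fix that was eventually applied.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a result table; NOT the real SPITupleTable. */
typedef struct MockTupTable {
    int rows;
} MockTupTable;

/* Models a global "most recent result" pointer (assumption). */
static MockTupTable *mock_last_result = NULL;
static int mock_live_tables = 0;    /* tables allocated but not yet freed */

/* Hypothetical executor: allocates a table and publishes it globally. */
static void mock_exec(int rows)
{
    MockTupTable *t = malloc(sizeof *t);

    assert(t != NULL);
    t->rows = rows;
    mock_live_tables++;
    mock_last_result = t;
}

/* Hypothetical free routine; keeps the live-table count in step. */
static void mock_free(MockTupTable *t)
{
    assert(t != NULL);
    free(t);
    mock_live_tables--;
}

/* "No loop body" case: the result is consumed and freed right away. */
static void inner_call_without_body(void)
{
    mock_exec(1);
    mock_free(mock_last_result);
    mock_last_result = NULL;
}

/*
 * "Loop body" case: capture our own table in a local BEFORE the body runs,
 * because each iteration may run nested calls that overwrite the global.
 * Returns the number of tables still live afterwards.
 */
int outer_call_with_body(int rows)
{
    MockTupTable *mine;
    int i;

    mock_exec(rows);
    mine = mock_last_result;          /* local copy survives nested calls */

    for (i = 0; i < rows; i++)
        inner_call_without_body();    /* resets the global each time */

    mock_free(mine);                  /* free our table, not the global */
    return mock_live_tables;
}
```

Calling `outer_call_with_body(3)` in this model leaves no live tables, and no table is freed twice, because the outer call never consults the global after its loop body has run.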



#5 Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Ian Harding (#4)
Re: Core Dump

On Thu, 3 Oct 2002, Ian Harding wrote:

I have finally got a chance to do more looking and you are correct. It seems the only invocation of SPI_freetuptable that is OK (taking into account re-entrancy) is the one in the "If there is no loop body given..." block. Any time any of the ones in the "There is a loop body..." bit get called, it explodes.

I assumed the SPI_freetuptable(SPI_tuptable) bit would know to only free the tuple table (whatever that is) from the most recently executed spi_exec.

To take care of my problem, and not blow up in nested "-array" types of spi_exec constructs, it seems we only need the line added in the "If there is no loop body given..." blocks. If there is a loop body, doesn't the memory get freed when the procedure finishes up anyway? I guess if you had numerous consecutive large loops within a tcl proc you might gobble up some memory, but even I don't do that and I am a pretty clumsy programmer. If they are nested, that should be all right since the memory bloat was only caused by the innermost (non "-array") call to spi_exec.

Yes, I think Neil sent a patch that took out this fault but reinserted a
variation of the original leak problem. I know how to fix it; I just need to
sort out what has gone on with the source file in the meantime, because I can't
see Neil's patch, which did other things as well, in there yet. I will fix this
memory problem regardless, later tonight or early tomorrow.

(Neil might be Joe, I'll have to look at my saved messages)

--
Nigel J. Andrews

#6 Neil Conway
neilc@samurai.com
In reply to: Nigel J. Andrews (#5)
Re: Core Dump

"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes:

Yes, I think Neil sent a patch that took out this fault but
reinserted a variation of the original leak problem. I know how to
fix it; I just need to sort out what has gone on with the source file
in the meantime because I can't see Neil's patch, which did other
things as well, in there yet. I will do this memory problem
regardless later tonight or early tomorrow.

Ok, sounds good to me -- if you'd like another copy of my patch, just
let me know. The patch I sent was really just a quick hack, so I'm not
surprised it didn't cover all the cases. I'll leave the proper fix
in your hands -- let me know if you're too busy and I can fix it...

(Neil might be Joe, I'll have to look at my saved messages)

Heh, all us Conways are interchangeable, eh? :-)

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC