Gcc 4.4 causes abort in plpython.
Hi,
I've been trying a gcc 4.4 snapshot (20081213) on buildfarm member
panda. It gets an abort during the pl-install-check part.
Here is the backtrace:
Core was generated by `postgres: build-farm pl_regression [local] SELECT '.
Program terminated with signal 6, Aborted.
[New process 3588]
#0 0x00002b41e7662ed5 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00002b41e7662ed5 in raise () from /lib/libc.so.6
#1 0x00002b41e76643f3 in abort () from /lib/libc.so.6
#2 0x00000000006a889d in ExceptionalCondition (
conditionName=<value optimized out>, errorType=<value optimized out>,
fileName=<value optimized out>, lineNumber=<value optimized out>)
at assert.c:57
#3 0x00000000006c8033 in MemoryContextAlloc (context=0x0, size=112)
at mcxt.c:507
#4 0x00000000006abe82 in CopyErrorData () at elog.c:1082
#5 0x00002b41ea61a755 in PLy_spi_execute_plan (ob=<value optimized out>,
list=<value optimized out>, limit=<value optimized out>) at plpython.c:2587
#6 0x00002b41ea61a9a6 in PLy_spi_execute (self=<value optimized out>,
args=0x2b41eae11d20) at plpython.c:2477
#7 0x00002b41ea8e5fdd in PyEval_EvalFrameEx ()
from /usr/lib/libpython2.5.so.1.0
#8 0x00002b41ea8e7385 in PyEval_EvalFrameEx ()
from /usr/lib/libpython2.5.so.1.0
#9 0x00002b41ea8e7bfd in PyEval_EvalCodeEx ()
from /usr/lib/libpython2.5.so.1.0
#10 0x00002b41ea8e7df2 in PyEval_EvalCode () from /usr/lib/libpython2.5.so.1.0
#11 0x00002b41ea61b89b in PLy_procedure_call (proc=0xc62880,
kargs=<value optimized out>, vargs=<value optimized out>) at plpython.c:962
#12 0x00002b41ea61eaae in PLy_function_handler (fcinfo=<value optimized out>,
proc=<value optimized out>) at plpython.c:790
#13 0x00002b41ea61f359 in plpython_call_handler (fcinfo=<value optimized out>)
at plpython.c:355
#14 0x000000000054f171 in ExecMakeFunctionResult (
fcache=<value optimized out>, econtext=<value optimized out>,
isNull=0xbdd3d0 "\177~\177\177\177\177\177\177", isDone=0xbdd488)
at execQual.c:1635
#15 0x000000000054a39b in ExecProject (projInfo=<value optimized out>,
isDone=<value optimized out>) at execQual.c:4922
#16 0x000000000055dfab in ExecResult (node=0xbdc7d8) at nodeResult.c:155
#17 0x0000000000549928 in ExecProcNode (node=0xbdc7d8) at execProcnode.c:338
#18 0x00000000005474c9 in standard_ExecutorRun (
queryDesc=<value optimized out>, direction=ForwardScanDirection,
count=<value optimized out>) at execMain.c:1343
#19 0x00000000005fc878 in PortalRunSelect (portal=0xbd6c58,
forward=<value optimized out>, count=0, dest=0xbd4c60) at pquery.c:942
#20 0x00000000005fdd30 in PortalRun (portal=<value optimized out>,
count=<value optimized out>, isTopLevel=<value optimized out>,
dest=<value optimized out>, altdest=<value optimized out>,
completionTag=<value optimized out>) at pquery.c:768
#21 0x00000000005f90cd in exec_simple_query (
query_string=<value optimized out>) at postgres.c:992
#22 0x00000000005fa707 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, username=<value optimized out>)
at postgres.c:3569
#23 0x00000000005c7227 in ServerLoop () at postmaster.c:3258
#24 0x00000000005c963d in PostmasterMain (argc=3, argv=0xaf3720)
at postmaster.c:1031
#25 0x0000000000571695 in main (argc=3, argv=0xaf3710) at main.c:188
(gdb) frame 3
#3 0x00000000006c8033 in MemoryContextAlloc (context=0x0, size=112)
at mcxt.c:507
507 AssertArg(MemoryContextIsValid(context));
(gdb) p context
$1 = (MemoryContext) 0x0
I've tried looking at it, but I have no idea what could be wrong.
Note that this might be a compiler bug, and it would be nice
if someone could figure out if it's a bug in pgsql or gcc.
kurt
Kurt Roeckx wrote:
#3 0x00000000006c8033 in MemoryContextAlloc (context=0x0, size=112)
at mcxt.c:507
#4 0x00000000006abe82 in CopyErrorData () at elog.c:1082
#5 0x00002b41ea61a755 in PLy_spi_execute_plan (ob=<value optimized out>,
list=<value optimized out>, limit=<value optimized out>) at plpython.c:2587
It's calling CopyErrorData with CurrentMemoryContext pointing to NULL,
which is not impossible since the GCC-inlined version of
MemoryContextSwitchTo does not check that it wasn't (the other version
does -- should we fix that?).
The question is why is that memory context set to NULL. The code looks
like this:
PLy_spi_execute_plan( ... )
{
    MemoryContext oldcontext;
    ...
    oldcontext = CurrentMemoryContext;
    PG_TRY();
    {
        ...
    }
    PG_CATCH();
    {
        MemoryContextSwitchTo(oldcontext);
        CopyErrorData();
        ...
    }
This has been like this for quite a while, which I find surprising
because I got scolded for a similar coding pattern a while back. I think
I found that the variable was reverted to the value it had on entering
the block by the longjmp call. (IIRC Tom complained because his
compiler threw a "variable might be clobbered by longjmp" warning). We
at Command Prompt also had a similar case on the then-proprietary
Replicator code.
I think a simplistic solution is to declare the variable volatile.
Would you test that and report back?
Thanks.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Mon, Dec 29, 2008 at 09:25:47AM -0300, Alvaro Herrera wrote:
I think a simplistic solution is to declare the variable volatile.
Would you test that and report back?
Yes, making oldcontext volatile makes the test pass.
It now fails at the ECPG-Check stage, but it seems that is a common
problem.
Kurt
Kurt Roeckx <kurt@roeckx.be> writes:
On Mon, Dec 29, 2008 at 09:25:47AM -0300, Alvaro Herrera wrote:
I think a simplistic solution is to declare the variable volatile.
Would you test that and report back?
Yes, making oldcontext volatile makes the test pass.
This is a gcc bug and you should report it. Since the variable is
not assigned within the try-block, volatile marking should not be
necessary.
regards, tom lane
On Mon, Dec 29, 2008 at 11:19:56AM -0500, Tom Lane wrote:
Kurt Roeckx <kurt@roeckx.be> writes:
On Mon, Dec 29, 2008 at 09:25:47AM -0300, Alvaro Herrera wrote:
I think a simplistic solution is to declare the variable volatile.
Would you test that and report back?
Yes, making oldcontext volatile makes the test pass.
This is a gcc bug and you should report it. Since the variable is
not assigned within the try-block, volatile marking should not be
necessary.
Reported as:
http://gcc.gnu.org/PR38660
kurt