codlin_moth is up and complaining - PL/Python crash
I revived codlin_moth and it fails during the PL/Python test:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=codlin_moth&dt=2010-02-16%2015:09:05
TRAP: BadArgument("!(((context) != 0 && (((((Node*)((context)))->type)
== T_AllocSetContext))))", File: "mcxt.c", Line: 641)
feaf5005 _lwp_kill (1, 6, 80459c8, fea9bbde) + 15
fea9bbea raise (6, 0, 8045a18, fea725aa) + 22
fea725ca abort (8046670, 8361f80, 8045a48, 8719ccf, 89021f0, 89021e4) + f2
086d07c0 ExceptionalCondition (89021f0, 89021e4, 89021dc, 281) + 58
08719ccf MemoryContextSwitchTo (89264ac, 0, 0, 8045a7c) + 47
fec21990 PLy_spi_execute (0, 8b141cc, 80460f8, fe84abde) + 750
fe84ad6e PyCFunction_Call (8b0ff6c, 8b141cc, 0, fe8a8d92) + 19e
fe8a91a0 call_function (80461bc, 1, 610f2d31, fe8a3206) + 41c
fe8a6221 PyEval_EvalFrameEx (8b5798c, 0, 8b0cbdc, 0) + 3029
fe8a9310 fast_function (8b05144, 80462fc, 0, 0, 0, fe91c63c) + 108
fe8a8e72 call_function (80462fc, 0, 80462d8, fe8a3206) + ee
fe8a6221 PyEval_EvalFrameEx (8b576a4, 0, 8b0cbdc, 8b0cbdc) + 3029
fe8a7cd0 PyEval_EvalCodeEx (8ab4770, 8b0cbdc, 8b0cbdc, 0, 0, 0) + 91c
fe8a3102 PyEval_EvalCode (8ab4770, 8b0cbdc, 8b0cbdc, fec17831) + 32
fec1799c PLy_function_handler (8046980, 8b5d508, 8046880, fec1480f) + 17c
fec14b92 plpython_call_handler (8046980, 8046bb0, 8046be8, 8323774) + 3aa
08324393 ExecEvalFunc (8a033b0, 8a0329c, 8a0390c, 8a039b8) + e33
0832b1bc ExecProject (8a03920, 8046c6c, 2, 8977abc) + 834
08348785 ExecResult (8a03210, 8a03184, 0, 1) + 9d
0831f66f ExecProcNode (8a03210, 1, 8a037ec, 8731314) + 227
0831a186 ExecutorRun (8a02d7c, 1, 0, 8719ad4) + 2de
084d7778 PortalRun (898effc, 7fffffff, 1, 8977b38, 8977b38) + 450
084ceae9 exec_simple_query (8976984, 0, 80473b8, 84d5185) + ba9
084d51a2 PostgresMain (2, 8973b4c, 897398c, 893d00c, 893d008, 130d7661) + 7fa
0844aded BackendRun (898c3d0) + 1cd
084440f3 ServerLoop (1, 89561d4, 3, fea7bb7e, 5c54, feb83cd8) + 973
08443004 PostmasterMain (3) + 119c
0837db12 main (3, 8047b14, 8047b24, 80fa21f) + 1ea
080fa27d _start (3, 8047be8, 8047fb0, 8047fb0, 0, 8047c35) + 7d
It seems that the problem is caused by aggressive compiler optimization.
I changed it to a lower level and now it works fine. Interestingly, the
MemoryContext corruption appears only with PL/Python.
Zdenek
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
I revived codlin_moth and it fails during the PL/Python test:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=codlin_moth&dt=2010-02-16%2015:09:05
All of the MemoryContextSwitchTo calls in plpython seem to be in
patterns like this:
    MemoryContext oldcontext;
    oldcontext = CurrentMemoryContext;
    PG_TRY();
    {
        ... do something ...
    }
    PG_CATCH();
    {
        MemoryContextSwitchTo(oldcontext);
Since oldcontext is only set in the one place, it really shouldn't
require "volatile" decoration, but maybe it does. Can you do some
testing to see if that would fix it?
(Of course, really plpython's bogus approach to error handling ought
to get thrown out and rewritten from scratch, but that's not happening
right now.)
regards, tom lane
On Wed, 2010-02-17 at 11:05 -0500, Tom Lane wrote:
All of the MemoryContextSwitchTo calls in plpython seem to be in
patterns like this:
    MemoryContext oldcontext;
    oldcontext = CurrentMemoryContext;
    PG_TRY();
    {
        ... do something ...
    }
    PG_CATCH();
    {
        MemoryContextSwitchTo(oldcontext);
Since oldcontext is only set in the one place, it really shouldn't
require "volatile" decoration, but maybe it does.
It is my understanding that local automatic variables may be clobbered
by [sig]longjmp unless they are marked volatile. The PG_CATCH branch is
reached by means of a [sig]longjmp. So that would mean that any
variable that you want to use both before the TRY and inside the CATCH
has to be volatile.
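A minimal sketch of the rule described above, outside PostgreSQL: the C standard leaves a non-volatile automatic variable indeterminate after longjmp if it was modified between the setjmp call and the longjmp call. The names catch_point, simulate_error and progress_seen_in_catch are hypothetical and only stand in for the PG_TRY/PG_CATCH machinery; this is not the plpython code.

```c
#include <setjmp.h>

/* Hypothetical stand-in for the jump buffer that PG_TRY would set up. */
static jmp_buf catch_point;

/* Plays the role of an error raised deep inside the "try" block. */
static void simulate_error(void)
{
    longjmp(catch_point, 1);
}

/* Returns the value of `progress` as observed in the "catch" branch. */
int progress_seen_in_catch(void)
{
    /*
     * Modified after setjmp and read after longjmp, so it is declared
     * volatile; without the qualifier its value in the else-branch
     * would be indeterminate.
     */
    volatile int progress = 0;

    if (setjmp(catch_point) == 0)
    {
        progress = 1;
        progress = 2;
        simulate_error();       /* does not return */
    }
    else
    {
        return progress;        /* reached only through longjmp */
    }
    return -1;                  /* not reached in this sketch */
}
```

Calling progress_seen_in_catch() yields the last value stored before the simulated error, because the volatile qualifier forces every store to memory.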
Peter Eisentraut <peter_e@gmx.net> writes:
On Wed, 2010-02-17 at 11:05 -0500, Tom Lane wrote:
Since oldcontext is only set in the one place, it really shouldn't
require "volatile" decoration, but maybe it does.
It is my understanding that local automatic variables may be clobbered
by [sig]longjmp unless they are marked volatile. The PG_CATCH branch is
reached by means of a [sig]longjmp. So that would mean that any
variable that you want to use both before the TRY and inside the CATCH
has to be volatile.
If the rule were quite that strict then we'd need many more "volatile"
markers than we have. I believe the actual implementation issue is that
longjmp restores the register contents to what they were at the time of
the setjmp call, and thus a variable allocated in a register would get
restored to the value it had at entry to PG_TRY whereas a variable
allocated on the stack would still have an up-to-date value. Now the
picture isn't quite that simple since a sufficiently smart compiler
might move the variable's value around within the routine. But the
behavior gcc appears to exhibit is that it won't warn about variables
that are only assigned once before the PG_TRY is entered, and that seems
reasonable to me since such a variable ought to have the correct value
either way.
It might be interesting to modify these bits of code so that the
oldcontext variables are assigned only at declaration:
MemoryContext oldcontext = CurrentMemoryContext;
...
PG_TRY();
and see if that makes the issue go away.
regards, tom lane
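The assign-once case discussed above can be sketched as follows. A variable that is initialized at its declaration and never written again is not "modified between setjmp and longjmp", so a conforming compiler has to preserve it even without volatile. The names error_env, current_level and restore_level_after_error are hypothetical; current_level merely stands in for CurrentMemoryContext. Whether a given compiler at a given optimization level honors this is exactly what the thread is trying to find out.

```c
#include <setjmp.h>

static jmp_buf error_env;        /* hypothetical jump buffer */
static int current_level = 7;    /* stand-in for CurrentMemoryContext */

/* Changes global state, "fails", then restores the saved state. */
int restore_level_after_error(void)
{
    /* Assigned only here, before setjmp, and never written again. */
    int saved_level = current_level;

    if (setjmp(error_env) == 0)
    {
        current_level = 42;      /* work that changes global state */
        longjmp(error_env, 1);   /* simulated error */
    }

    /*
     * Reached only through longjmp. saved_level was not modified after
     * setjmp, so the C standard guarantees it still holds its value.
     */
    current_level = saved_level;
    return current_level;
}
```

Note that current_level is a file-scope object, not an automatic variable, so the indeterminacy rule does not apply to it at all.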
On Wed, 2010-02-17 at 11:26 -0500, Tom Lane wrote:
But the behavior gcc appears to exhibit is that it won't warn about
variables that are only assigned once before the PG_TRY is entered,
and that seems reasonable to me since such a variable ought to have
the correct value either way.
FWIW, this is a Sun Studio build that is complaining here.
On 17.02.10 18:39, Peter Eisentraut wrote:
On Wed, 2010-02-17 at 11:26 -0500, Tom Lane wrote:
But the behavior gcc appears to exhibit is that it won't warn about
variables that are only assigned once before the PG_TRY is entered,
and that seems reasonable to me since such a variable ought to have
the correct value either way.
FWIW, this is a Sun Studio build that is complaining here.
Yes, it is SS12. I added the volatile keyword and the problem disappeared.
The code difference is as follows:
< PLy_spi_execute+0x742: 83 ec 0c subl $0xc,%esp
< PLy_spi_execute+0x745: ff b5 b8 f9 ff ff pushl 0xfffff9b8(%ebp)
< PLy_spi_execute+0x74b: e8 fc ff ff ff call MemoryContextSwitch
PLy_spi_execute+0x742: 8b 85 cc f9 ff ff movl 0xfffff9cc(%ebp),%eax
PLy_spi_execute+0x748: 83 ec 0c subl $0xc,%esp
PLy_spi_execute+0x74b: 50 pushl %eax
PLy_spi_execute+0x74c: e8 fc ff ff ff call MemoryContextSwitch
It is worth mentioning that SS inlines PLy_spi_execute_query into
PLy_spi_execute(), because it has only one caller.
Zdenek
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
On 17.02.10 18:39, Peter Eisentraut wrote:
FWIW, this is a Sun Studio build that is complaining here.
Yes, it is SS12. I added the volatile keyword and the problem disappeared.
OK, I've applied that change in CVS. Please change codlin_moth back to
the higher optimization setting.
regards, tom lane
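For readers who want to see the shape of the fix, here is a rough, self-contained sketch of a volatile-qualified saved context. It is not the committed plpython change: MockContext, switch_to, handler_env and context_restored_after_error are hypothetical stand-ins, and plain setjmp/longjmp replaces PG_TRY/PG_CATCH. One detail worth noting is that when the context type is a pointer typedef, writing volatile in front of it qualifies the pointer variable itself, which is what is needed here.

```c
#include <setjmp.h>

/* Hypothetical stand-ins; these are not PostgreSQL's real definitions. */
typedef struct MockContextData { const char *name; } *MockContext;

static struct MockContextData top_ctx = { "top" };
static struct MockContextData spi_ctx = { "spi" };
static MockContext current_ctx = &top_ctx;
static jmp_buf handler_env;

/* Mimics the switch-and-return-old behavior of a context switch call. */
static MockContext switch_to(MockContext ctx)
{
    MockContext old = current_ctx;
    current_ctx = ctx;
    return old;
}

/* Returns 1 if the original context is current again after the "error". */
int context_restored_after_error(void)
{
    /* Because MockContext is a pointer typedef, this makes the pointer
     * variable volatile, not the structure it points to. */
    volatile MockContext oldcontext;

    oldcontext = current_ctx;

    if (setjmp(handler_env) == 0)    /* plays the role of PG_TRY */
    {
        switch_to(&spi_ctx);
        longjmp(handler_env, 1);     /* simulated error */
    }
    else                             /* plays the role of PG_CATCH */
    {
        switch_to(oldcontext);       /* re-read from memory, not a register */
    }
    return current_ctx == &top_ctx;
}
```

The volatile load of oldcontext in the else-branch corresponds to the extra movl from the stack slot visible in the disassembly posted earlier in the thread.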