Fix for segfault in plpython's exception handling

Started by Rafa de la Torreover 9 years ago5 messageshackers
Jump to latest
#1Rafa de la Torre
rtorre@carto.com

The patch attached (a one-liner) includes information about how to
reproduce the issue and why it happens.

We applied it to our production servers (9.5) and has been working as
expected for a while. I've also checked that it can be applied cleanly to
current master.

Could it make its way into master? maybe 9.5 and 9.6 as well?

Thanks!

--
Rafa de la Torre
rtorre@carto.com

Attachments:

plpython-segfault-excetion-handling.patchtext/x-patch; charset=US-ASCII; name=plpython-segfault-excetion-handling.patchDownload+1-0
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rafa de la Torre (#1)
Re: Fix for segfault in plpython's exception handling

Rafa de la Torre <rtorre@carto.com> writes:

exc = PyErr_NewException(exception_map[i].name, base, dict);
+ Py_INCREF(exc);
PyModule_AddObject(mod, exception_map[i].classname, exc);

Hm. Seems like if this is a problem, the code for the other three
exceptions is being a bit careless: it does do Py_INCREFs on them,
but not soon enough to ensure no problems. Also, I wonder why that
code checks for a null result from PyErr_NewException but this doesn't.

Good catch though. A naive person would have assumed that
PyModule_AddObject would increment the object's refcount, but
the Python docs say "This steals a reference to value", which
I guess must mean that the caller is required to do it.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: Fix for segfault in plpython's exception handling

I wrote:

Rafa de la Torre <rtorre@carto.com> writes:

exc = PyErr_NewException(exception_map[i].name, base, dict);
+ Py_INCREF(exc);
PyModule_AddObject(mod, exception_map[i].classname, exc);

Hm. Seems like if this is a problem, the code for the other three
exceptions is being a bit careless: it does do Py_INCREFs on them,
but not soon enough to ensure no problems. Also, I wonder why that
code checks for a null result from PyErr_NewException but this doesn't.

Good catch though. A naive person would have assumed that
PyModule_AddObject would increment the object's refcount, but
the Python docs say "This steals a reference to value", which
I guess must mean that the caller is required to do it.

For me (testing with Python 2.6.6 on RHEL6), this test case didn't result
in a server crash, but in the wrong exception object name being reported.
Tracing through it showed that a python GC was happening during the loop
adding all the exceptions to the spiexceptions module, so that some of the
exception objects went away and then were immediately reallocated as other
exception objects. The explicit gc call in the test case wasn't
necessary, because the problem happened before that. Fun fun.

I've pushed a patch that deals with all these problems. Thanks for
the report!

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Rafa de la Torre
rtorre@carto.com
In reply to: Tom Lane (#3)
Re: Fix for segfault in plpython's exception handling

Thank you! Glad to see that the report was useful.

On Fri, Dec 9, 2016 at 9:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Rafa de la Torre <rtorre@carto.com> writes:

exc = PyErr_NewException(exception_map[i].name, base,

dict);

+ Py_INCREF(exc);
PyModule_AddObject(mod, exception_map[i].classname, exc);

Hm. Seems like if this is a problem, the code for the other three
exceptions is being a bit careless: it does do Py_INCREFs on them,
but not soon enough to ensure no problems. Also, I wonder why that
code checks for a null result from PyErr_NewException but this doesn't.

Good catch though. A naive person would have assumed that
PyModule_AddObject would increment the object's refcount, but
the Python docs say "This steals a reference to value", which
I guess must mean that the caller is required to do it.

For me (testing with Python 2.6.6 on RHEL6), this test case didn't result
in a server crash, but in the wrong exception object name being reported.
Tracing through it showed that a python GC was happening during the loop
adding all the exceptions to the spiexceptions module, so that some of the
exception objects went away and then were immediately reallocated as other
exception objects. The explicit gc call in the test case wasn't
necessary, because the problem happened before that. Fun fun.

I've pushed a patch that deals with all these problems. Thanks for
the report!

regards, tom lane

--
Rafa de la Torre
rtorre@carto.com

#5Rafa de la Torre
rafael.delatorre.vega@gmail.com
In reply to: Rafa de la Torre (#4)
Re: Fix for segfault in plpython's exception handling

For the record: I tested the patch by Tom Lane in our setup (python 2.7.3-0ubuntu3.8) and works like a charm.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9cda81f0056ca488dbd6cded64db1238aed816b2

Also in 9.5 and 9.6 series. My request in commitfest queue can be closed. Cheers!
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers