plpython crash

Started by Jean-Baptiste Quenot over 14 years ago, 14 messages
#1 Jean-Baptiste Quenot
jbq@caraldi.com

Hi there,

plpython crashes on me on various 64-bit Ubuntu hosts, see the gdb
backtrace at: https://gist.github.com/1140005

Do you believe there were recent bug fixes regarding PLyMapping_ToTuple()?

This is PG 9.0.4 with HEAD of plpython taken in March 2011 and backported.

Please tell me if you need more information.
--
Jean-Baptiste Quenot

#2 Jan Urbański
wulczer@wulczer.org
In reply to: Jean-Baptiste Quenot (#1)
Re: plpython crash

On 11/08/11 18:01, Jean-Baptiste Quenot wrote:

Hi there,

plpython crashes on me on various 64-bit Ubuntu hosts, see the gdb
backtrace at: https://gist.github.com/1140005

Do you believe there were recent bug fixes regarding PLyMapping_ToTuple()?

This is PG 9.0.4 with HEAD of plpython taken in March 2011 and backported.

Please tell me if you need more information.

Hi,

there were no changes to that area of plpython after March 2011.

Could you try to see if the error also appears if you run your app with
current PostgreSQL HEAD (both the server and plpython)?

Which Python version is that? You can get that info by running:

do $$ import sys; plpy.info(sys.version) $$ language plpythonu;

Could you try to extract a self-contained example of how to reproduce
it? If the bug only appears under your application's specific workload,
perhaps you could try running it with Postgres compiled with -O0,
because compiling with -O2 causes the gdb backtrace to be missing
optimised out values and inlined functions?

Cheers,
Jan

#3 Jean-Baptiste Quenot
jbq@caraldi.com
In reply to: Jan Urbański (#2)
Re: plpython crash

Here is the same with -O0:

https://gist.github.com/1140005

sys.version reports this:

INFO: 2.6.6 (r266:84292, Sep 15 2010, 16:41:53)
[GCC 4.4.5]
--
Jean-Baptiste Quenot

#4 Jan Urbański
wulczer@wulczer.org
In reply to: Jean-Baptiste Quenot (#3)
Re: plpython crash

On 12/08/11 13:55, Jean-Baptiste Quenot wrote:

Here is the same with -O0:

https://gist.github.com/1140005

sys.version reports this:

INFO: 2.6.6 (r266:84292, Sep 15 2010, 16:41:53)
[GCC 4.4.5]

I'm still at a loss. Did you reproduce it with git HEAD? I see that the
query being executed is "select * from myfunc()"; would it be possible to
share the code of myfunc?

Jan

#5 Jean-Baptiste Quenot
jbq@caraldi.com
In reply to: Jan Urbański (#4)
Re: plpython crash

After backporting plpython.c from HEAD, this is the error message I get:

ERROR: key "........pg.dropped.6........" not found in mapping
HINT: To return null in a column, add the value None to the mapping
with the key named after the column.
CONTEXT: while creating return value
PL/Python function "myfunc"

What does it mean?
--
Jean-Baptiste Quenot

#6 Jan Urbański
wulczer@wulczer.org
In reply to: Jean-Baptiste Quenot (#5)
Re: plpython crash

On 16/08/11 16:52, Jean-Baptiste Quenot wrote:

After backporting plpython.c from HEAD, this is the error message I get:

ERROR: key "........pg.dropped.6........" not found in mapping
HINT: To return null in a column, add the value None to the mapping
with the key named after the column.
CONTEXT: while creating return value
PL/Python function "myfunc"

What does it mean?

Ah, interesting, I think that this means that you are returning a table
type and that table has a dropped column. The code should skip over
dropped columns, but apparently it does not and tries to find a value
for that column in the mapping you are returning.

I'll try to reproduce it here.

Jan
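
To illustrate the behaviour Jan describes, here is a minimal sketch in plain Python of how a mapping-to-row conversion is expected to skip dropped columns. It is not the C code from plpython.c: the `Attribute` class and the `mapping_to_tuple` function are hypothetical names invented for this example, and representing a dropped column as `None` (NULL) in the resulting row is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Tuple


@dataclass
class Attribute:
    """Hypothetical stand-in for one column of the function's result type."""
    name: str
    is_dropped: bool = False


def mapping_to_tuple(attrs: List[Attribute],
                     mapping: Mapping[str, Any]) -> Tuple[Any, ...]:
    """Build a result row from a mapping, skipping dropped columns."""
    values = []
    for attr in attrs:
        if attr.is_dropped:
            # Assumption: a dropped column keeps its slot in the row but is
            # always NULL, so it is never looked up in the returned mapping.
            values.append(None)
            continue
        if attr.name not in mapping:
            # Mirrors the error reported in the thread for a missing key.
            raise KeyError('key "%s" not found in mapping' % attr.name)
        values.append(mapping[attr.name])
    return tuple(values)


# A result type whose second column has been dropped.
attrs = [
    Attribute("id"),
    Attribute("........pg.dropped.2........", is_dropped=True),
    Attribute("label"),
]
row = mapping_to_tuple(attrs, {"id": 1, "label": "x"})
```

If the `is_dropped` check were skipped, for example because the column list being used is stale, the lookup of the placeholder name would fail with the "not found in mapping" error quoted above.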

#7 Jan Urbański
wulczer@wulczer.org
In reply to: Jan Urbański (#6)
Re: plpython crash

On 16/08/11 17:06, Jan Urbański wrote:

On 16/08/11 16:52, Jean-Baptiste Quenot wrote:

After backporting plpython.c from HEAD, this is the error message I get:

ERROR: key "........pg.dropped.6........" not found in mapping
HINT: To return null in a column, add the value None to the mapping
with the key named after the column.
CONTEXT: while creating return value
PL/Python function "myfunc"

What does it mean?

I did a couple of simple tests and can't see how the code could fail to
skip dropped columns.

It seems like you're missing this commit:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=41282111e6cc73aca4b63dffe950ba7a63e4bd8a

Could you try running this query? (assuming your function is called
'myfunc')

select proname, relname, attname, attisdropped from pg_proc join
pg_class on (prorettype = reltype) join pg_attribute on (attrelid =
pg_class.oid) where proname = 'myfunc';

Cheers,
Jan

#8 Jean-Baptiste Quenot
jbq@caraldi.com
In reply to: Jan Urbański (#7)
Re: plpython crash

Dear Jan,

Sorry I typed the wrong git commands. With latest plpython from
branch master I got the same gdb backtrace as reported before. I
managed to wrap up a testcase that fails 100% of times on my setup:
https://gist.github.com/1149512

Hope it crashes on your side too :-)

This is the result on PG 9.0.4:
https://gist.github.com/1149543

This is the result on PG 9.0.4 with plpython.c backported from HEAD:
https://gist.github.com/1149558

Cheers,
--
Jean-Baptiste Quenot

#9 Jan Urbański
wulczer@wulczer.org
In reply to: Jean-Baptiste Quenot (#8)
Re: plpython crash

On 16/08/11 19:07, Jean-Baptiste Quenot wrote:

Dear Jan,

Sorry I typed the wrong git commands. With latest plpython from
branch master I got the same gdb backtrace as reported before. I
managed to wrap up a testcase that fails 100% of times on my setup:
https://gist.github.com/1149512

Hope it crashes on your side too :-)

Awesome, it segfaults for me with HEAD ;)

Now it's just a simple matter of programming... I'll take a look at it
this evening.

Jan

#10 Jan Urbański
wulczer@wulczer.org
In reply to: Jan Urbański (#9)
Re: plpython crash

On 16/08/11 19:12, Jan Urbański wrote:

On 16/08/11 19:07, Jean-Baptiste Quenot wrote:

Dear Jan,

Sorry I typed the wrong git commands. With latest plpython from
branch master I got the same gdb backtrace as reported before. I
managed to wrap up a testcase that fails 100% of times on my setup:
https://gist.github.com/1149512

Hope it crashes on your side too :-)

Awesome, it segfaults for me with HEAD ;)

Now it's just a simple matter of programming... I'll take a look at it
this evening.

Found it: we're invalidating the compiled functions cache when input
composite arguments change, but not when output composite arguments
change, so the function gets called with pointers to invalid I/O routines.

I'll have a patch ready soon.

Jan
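
The diagnosis above can be sketched as a small simulation in plain Python. This is not the plpython.c implementation: `CachedProc`, `compile_proc`, `cache_is_valid` and the `catalog` dictionary are hypothetical names for illustration, and modelling a composite type's "version" as an (xmin, tid) pair is an assumption based on the fields mentioned later in the thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed "version" of a composite type: (xmin, tid) of its catalog row.
Version = Tuple[int, Tuple[int, int]]


@dataclass
class CachedProc:
    """Hypothetical cached, compiled function."""
    arg_versions: Dict[str, Version]  # composite argument types seen at compile time
    result_type: str
    result_version: Version           # composite result type seen at compile time


def compile_proc(catalog: Dict[str, Version],
                 arg_types: List[str], result_type: str) -> CachedProc:
    """Record the type versions the compiled function was built against."""
    return CachedProc({t: catalog[t] for t in arg_types},
                      result_type, catalog[result_type])


def cache_is_valid(proc: CachedProc, catalog: Dict[str, Version],
                   check_result: bool = True) -> bool:
    """Return True if the cached function may be reused."""
    for type_name, seen in proc.arg_versions.items():
        if catalog[type_name] != seen:
            return False
    # check_result=False models the bug: the output type is never compared.
    if check_result and catalog[proc.result_type] != proc.result_version:
        return False
    return True


catalog = {"in_t": (100, (0, 1)), "out_t": (200, (0, 2))}
proc = compile_proc(catalog, ["in_t"], "out_t")

# The result type changes, e.g. a column is dropped from the table.
catalog["out_t"] = (300, (0, 5))

stale_accepted = cache_is_valid(proc, catalog, check_result=False)  # buggy check
stale_rejected = not cache_is_valid(proc, catalog)                  # fixed check
```

With the output type left out of the comparison, the stale entry is still considered valid and would be called with conversion routines built for the old row layout; including it forces a recompile.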

#11 Jan Urbański
wulczer@wulczer.org
In reply to: Jan Urbański (#10)
1 attachment(s)
Re: plpython crash

On 17/08/11 11:40, Jan Urbański wrote:

On 16/08/11 19:12, Jan Urbański wrote:

On 16/08/11 19:07, Jean-Baptiste Quenot wrote:

[plpython is buggy]

I'll have a patch ready soon.

Here are two patches that fix two separate bugs that you found
simultaneously. Because they're actually separate issues, it turned out
fixing them was a bit more tricky than I expected (fixing one was
unmasking the other one etc).

Thanks for the report!
Jan

Attachments:

0002-Guard-against-return-type-changing-in-PL-Python-func.patch (text/x-patch)

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Urbański (#11)
Re: plpython crash

Jan Urbański <wulczer@wulczer.org> writes:

On 16/08/11 19:07, Jean-Baptiste Quenot wrote:

[plpython is buggy]

Here are two patches that fix two separate bugs that you found
simultaneously. Because they're actually separate issues, it turned out
fixing them was a bit more tricky than I expected (fixing one was
unmasking the other one etc).

These look generally sane although I have some minor stylistic gripes.
Will clean them up and apply in a few hours (I have to leave for an
appointment shortly).

regards, tom lane

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Urbański (#11)
Re: plpython crash

Jan Urbański <wulczer@wulczer.org> writes:

Here are two patches that fix two separate bugs that you found
simultaneously. Because they're actually separate issues, it turned out
fixing them was a bit more tricky than I expected (fixing one was
unmasking the other one etc).

Applied with one non-cosmetic change: I got rid of the test on
TransactionIdIsValid(arg->typrel_xmin) in PLy_input_tuple_funcs,
as well as where you'd copied that logic in PLy_output_tuple_funcs.
AFAICS skipping the update on the xmin/tid, if we're coming through
there a second time, would be simply wrong.

regards, tom lane

#14 Jan Urbański
wulczer@wulczer.org
In reply to: Tom Lane (#13)
Re: plpython crash

On 17/08/11 23:10, Tom Lane wrote:

Jan Urbański <wulczer@wulczer.org> writes:

Here are two patches that fix two separate bugs that you found
simultaneously. Because they're actually separate issues, it turned out
fixing them was a bit more tricky than I expected (fixing one was
unmasking the other one etc).

Applied with one non-cosmetic change: I got rid of the test on
TransactionIdIsValid(arg->typrel_xmin) in PLy_input_tuple_funcs,
as well as where you'd copied that logic in PLy_output_tuple_funcs.
AFAICS skipping the update on the xmin/tid, if we're coming through
there a second time, would be simply wrong.

Thanks!

The way things are set up now I think you never go through
PLy_input_tuple_funcs twice, unless the cache is determined to be
invalid and then you recreate the function from scratch.

But of course it's better to be safe than sorry and even if I'm right
and it was never executed twice, any refactoring effort might have
broken it easily.

Cheers,
Jan