can't load plpython

Started by Alvaro Herrera almost 17 years ago, 12 messages
#1Alvaro Herrera
alvherre@commandprompt.com

Hi,

So I've been trying to get a plpython function that removes accented
letters, based on a Python snippet posted on another thread. The
function is simple enough:

create or replace function unaccent(text) returns text language plpythonu as $$
import unicodedata
s = unicodedata.normalize("NFKD", args[0])
s = ''.join(c for c in s if ord(c) < 127)
return s
$$ ;

However, on HEAD this is crashing for me, and it's right when plpython
loads. Backtrace below.

I already distclean'ed, initdb'd, rebuilt the whole thing from scratch
and I can't make it work. This is on Python 2.5.4, Debian unstable
stuff.

On 8.3 it just fails thusly:
alvherre=# select unaccent('álvaro muñoz');
ERROR: plpython: function "unaccent" failed
DETAIL: <type 'exceptions.TypeError'>: normalize() argument 2 must be unicode, not str

Obviously I don't know Python to fix it :-)

#0 dl_open_worker (a=<value optimized out>) at dl-open.c:369
#1 0x00007f6b8bba9436 in _dl_catch_error (objname=0x7fff93db7950, errstring=0x7fff93db7948,
mallocedp=0x7fff93db795f, operate=0x7f6b8bbad780 <dl_open_worker>, args=0x7fff93db7900)
at dl-error.c:178
#2 0x00007f6b8bbad2ab in _dl_open (
file=0x1349980 "/home/alvherre/Code/CVS/pgsql/install/00head/lib/plpython.so",
mode=-2147483390, caller_dlopen=0x78f1ba, nsid=-2, argc=1, argv=0x7fff93db8c08, env=0x127ceb0)
at dl-open.c:596
#3 0x00007f6b8b04ef5b in dlopen_doit (a=<value optimized out>) at dlopen.c:67
#4 0x00007f6b8bba9436 in _dl_catch_error (objname=0x7f6b8b2510d0, errstring=0x7f6b8b2510d8,
mallocedp=0x7f6b8b2510c8, operate=0x7f6b8b04eef0 <dlopen_doit>, args=0x7fff93db7b20)
at dl-error.c:178
#5 0x00007f6b8b04f30c in _dlerror_run (operate=0x7f6b8b04eef0 <dlopen_doit>, args=0x7fff93db7b20)
at dlerror.c:164
#6 0x00007f6b8b04eec1 in __dlopen (file=<value optimized out>, mode=<value optimized out>)
at dlopen.c:88
#7 0x000000000078f1ba in internal_load_library (
libname=0x13762a0 "/home/alvherre/Code/CVS/pgsql/install/00head/lib/plpython.so")
at /pgsql/source/00head/src/backend/utils/fmgr/dfmgr.c:234
#8 0x000000000078ee6a in load_external_function (filename=0x1376268 "$libdir/plpython",
funcname=0x13721a8 "plpython_call_handler", signalNotFound=1 '\001', filehandle=0x7fff93db7d08)
at /pgsql/source/00head/src/backend/utils/fmgr/dfmgr.c:113
#9 0x0000000000790668 in fmgr_info_C_lang (functionId=16393, finfo=0x7fff93db7e60,
procedureTuple=0x7f6b8bd085c0) at /pgsql/source/00head/src/backend/utils/fmgr/fmgr.c:345
#10 0x00000000007904e1 in fmgr_info_cxt_security (functionId=16393, finfo=0x7fff93db7e60,
mcxt=0x13478b8, ignore_security=0 '\0')
at /pgsql/source/00head/src/backend/utils/fmgr/fmgr.c:276
#11 0x000000000079022a in fmgr_info_cxt (functionId=16393, finfo=0x7fff93db7e60, mcxt=0x13478b8)
at /pgsql/source/00head/src/backend/utils/fmgr/fmgr.c:166
#12 0x0000000000790200 in fmgr_info (functionId=16393, finfo=0x7fff93db7e60)
at /pgsql/source/00head/src/backend/utils/fmgr/fmgr.c:156

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: can't load plpython

Alvaro Herrera <alvherre@commandprompt.com> writes:

... However, on HEAD this is crashing for me, and it's right when plpython
loads. Backtrace below.

Does plpython pass its regression tests for you (I'd suppose not)?

For me on Fedora 10 x86_64, CVS HEAD plus python 2.5.2 passes regression
but the given example still dumps core. postmaster log says

postgres: tgl regression [local] SELECT: Objects/stringobject.c:107: PyString_FromString: Assertion `str != ((void *)0)' failed.
LOG: server process (PID 4714) was terminated by signal 6: Aborted
LOG: terminating any other active server processes

backtrace

#0 0x0000003e1d032f05 in raise () from /lib64/libc.so.6
#1 0x0000003e1d034a73 in abort () from /lib64/libc.so.6
#2 0x0000003e1d02bef9 in __assert_fail () from /lib64/libc.so.6
#3 0x0000003e3367e67c in PyString_FromString ()
from /usr/lib64/libpython2.5.so.1.0
#4 0x0000003e3366ec26 in PyDict_SetItemString ()
from /usr/lib64/libpython2.5.so.1.0
#5 0x0000000000b2149d in PLy_function_build_args (fcinfo=0x7fffff0adb80,
proc=0x2685c80) at plpython.c:1055
#6 0x0000000000b2281e in PLy_function_handler (fcinfo=0x7fffff0adb80,
proc=0x2685c80) at plpython.c:795
#7 0x0000000000b230f6 in plpython_call_handler (fcinfo=0x7fffff0adb80)
at plpython.c:356
#8 0x000000000056009a in ExecMakeFunctionResult (fcache=0x267c0f0,
econtext=0x267bfc0, isNull=0x267cb38 "", isDone=0x267cbf0)
at execQual.c:1665
#9 0x000000000055af44 in ExecTargetList () at execQual.c:5001
#10 ExecProject (projInfo=<value optimized out>, isDone=0x7fffff0ae06c)
at execQual.c:5202
#11 0x000000000056f289 in ExecResult (node=0x267bea8) at nodeResult.c:155
#12 0x000000000055a27d in ExecProcNode (node=0x267bea8) at execProcnode.c:344
#13 0x0000000000557cca in ExecutePlan () at execMain.c:1504

Obviously I don't know Python to fix it :-)

Me either. Something is pretty bad in python-land, it seems.

regards, tom lane

#3Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#2)
Re: can't load plpython

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

... However, on HEAD this is crashing for me, and it's right when plpython
loads. Backtrace below.

Does plpython pass its regression tests for you (I'd suppose not)?

Doh. Silly me. It does pass the regression tests, all six of them. I
guess it's trying to load the unicode stuff that it crashes, not
plpython itself ...

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: can't load plpython

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane wrote:

Does plpython pass its regression tests for you (I'd suppose not)?

Doh. Silly me. It does pass the regression tests, all six of them. I
guess it's trying to load the unicode stuff that it crashes, not
plpython itself ...

Hm, maybe we weren't testing quite the same scenario. What locale
and database_encoding were you using? I tried C/SQL_ASCII and
C/UTF8 and got the same result both ways, but obviously that's not
covering much territory.

regards, tom lane

#5Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#4)
Re: can't load plpython

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane wrote:

Does plpython pass its regression tests for you (I'd suppose not)?

Doh. Silly me. It does pass the regression tests, all six of them. I
guess it's trying to load the unicode stuff that it crashes, not
plpython itself ...

Hm, maybe we weren't testing quite the same scenario. What locale
and database_encoding were you using? I tried C/SQL_ASCII and
C/UTF8 and got the same result both ways, but obviously that's not
covering much territory.

I'm on es_CL.UTF-8. I just tried on C/SQL_ASCII and the regression
tests pass there too (and the function still crashes).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#6Euler Taveira de Oliveira
euler@timbira.com
In reply to: Tom Lane (#2)
1 attachment(s)
Re: can't load plpython

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

... However, on HEAD this is crashing for me, and it's right when plpython
loads. Backtrace below.

Does plpython pass its regression tests for you (I'd suppose not)?

For me on Fedora 10 x86_64, CVS HEAD plus python 2.5.2 passes regression
but the given example still dumps core. postmaster log says

postgres: tgl regression [local] SELECT: Objects/stringobject.c:107: PyString_FromString: Assertion `str != ((void *)0)' failed.
LOG: server process (PID 4714) was terminated by signal 6: Aborted
LOG: terminating any other active server processes

PyString_FromString() [1] fails to return something useful (i.e., a null pointer)
when its argument is null. The trivial fix (attached) is to ensure
that we don't pass a null pointer as the second argument of
PyDict_SetItemString(). Of course, it's a Python bug and I filed it [3].

Obviously I don't know Python to fix it :-)

Me either. Something is pretty bad in python-land, it seems.

Me either. ;)

[1]: http://svn.python.org/view/python/trunk/Objects/stringobject.c?revision=70682&view=markup
[2]: http://svn.python.org/view/python/trunk/Objects/dictobject.c?revision=70550&view=markup
[3]: http://bugs.python.org/issue5627

--
Euler Taveira de Oliveira
http://www.timbira.com/

Attachments:

py.diff (text/plain)
Index: plpython.c
===================================================================
RCS file: /a/pgsql/dev/anoncvs/pgsql/src/pl/plpython/plpython.c,v
retrieving revision 1.118
diff -c -r1.118 plpython.c
*** plpython.c	15 Jan 2009 13:49:56 -0000	1.118
--- plpython.c	31 Mar 2009 16:48:47 -0000
***************
*** 1053,1059 ****
  			}
  
  			if (PyList_SetItem(args, i, arg) == -1 ||
! 				(proc->argnames &&
  				 PyDict_SetItemString(proc->globals, proc->argnames[i], arg) == -1))
  				PLy_elog(ERROR, "PyDict_SetItemString() failed for PL/Python function \"%s\" while setting up arguments", proc->proname);
  			arg = NULL;
--- 1053,1059 ----
  			}
  
  			if (PyList_SetItem(args, i, arg) == -1 ||
! 				(proc->argnames && proc->argnames[i] != NULL &&
  				 PyDict_SetItemString(proc->globals, proc->argnames[i], arg) == -1))
  				PLy_elog(ERROR, "PyDict_SetItemString() failed for PL/Python function \"%s\" while setting up arguments", proc->proname);
  			arg = NULL;
#7Alvaro Herrera
alvherre@commandprompt.com
In reply to: Euler Taveira de Oliveira (#6)
Re: can't load plpython

Euler Taveira de Oliveira wrote:

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

... However, on HEAD this is crashing for me, and it's right when plpython
loads. Backtrace below.

Does plpython pass its regression tests for you (I'd suppose not)?

For me on Fedora 10 x86_64, CVS HEAD plus python 2.5.2 passes regression
but the given example still dumps core. postmaster log says

postgres: tgl regression [local] SELECT: Objects/stringobject.c:107: PyString_FromString: Assertion `str != ((void *)0)' failed.
LOG: server process (PID 4714) was terminated by signal 6: Aborted
LOG: terminating any other active server processes

PyString_FromString() [1] fails to return something useful (i.e., a null pointer)
when its argument is null. The trivial fix (attached) is to ensure
that we don't pass a null pointer as the second argument of
PyDict_SetItemString(). Of course, it's a Python bug and I filed it [3].

I'm not sure I'm reading this right, but doesn't this prevent a
plpython function from working if its parameters don't have names assigned?
I.e., apparently I can't just use args[0]. I'm sure I'm wrong on this ...?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#8Euler Taveira de Oliveira
euler@timbira.com
In reply to: Alvaro Herrera (#7)
1 attachment(s)
Re: can't load plpython

Alvaro Herrera wrote:

Euler Taveira de Oliveira wrote:

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

... However, on HEAD this is crashing for me, and it's right when plpython
loads. Backtrace below.

Does plpython pass its regression tests for you (I'd suppose not)?

For me on Fedora 10 x86_64, CVS HEAD plus python 2.5.2 passes regression
but the given example still dumps core. postmaster log says

postgres: tgl regression [local] SELECT: Objects/stringobject.c:107: PyString_FromString: Assertion `str != ((void *)0)' failed.
LOG: server process (PID 4714) was terminated by signal 6: Aborted
LOG: terminating any other active server processes

PyString_FromString() [1] fails to return something useful (i.e., a null pointer)
when its argument is null. The trivial fix (attached) is to ensure
that we don't pass a null pointer as the second argument of
PyDict_SetItemString(). Of course, it's a Python bug and I filed it [3].

I'm not sure I'm reading this right, but doesn't this prevent a
plpython function from working if its parameters don't have names assigned?

No. See the proc->argnames test before PyDict_SetItemString(). The other test
is just tightening the check.

Indeed, the PyDict_*ItemString() functions suffer from the same disease. :( I
reported upstream too.

Attached is another patch that adds a similar test before PyDict_DelItemString();
it's safe because if we don't have a key we don't know what to remove.
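
The two guards can be modelled outside the backend. What follows is only a Python sketch of the control flow, not the C code in plpython.c: build_args and del_args are hypothetical names, a plain dict stands in for proc->globals, and None stands in for a NULL entry in proc->argnames.

```python
def build_args(globals_dict, argnames, values):
    """Model of argument setup: always fill args, bind only named parameters."""
    args = list(values)
    if argnames is not None:
        for name, value in zip(argnames, values):
            # Guard added by the patch: skip parameters that have no name.
            if name is not None:
                globals_dict[name] = value
    return args


def del_args(globals_dict, argnames):
    """Model of cleanup: remove only the names that were actually bound."""
    if argnames is None:
        return
    for name in argnames:
        # Second guard: without a key there is nothing to remove.
        if name is not None:
            globals_dict.pop(name, None)


# Like add(a int, b int): both parameters are named.
g = {}
build_args(g, ["a", "b"], [1, 2])
named_sum = g["a"] + g["b"]
del_args(g, ["a", "b"])

# Like add2(int, int): no names, so only the positional list is usable.
g2 = {}
args2 = build_args(g2, [None, None], [1, 2])
unnamed_sum = args2[0] + args2[1]
del_args(g2, [None, None])
```

In this model the unnamed case never touches the dict at all, which matches the point that the positional args list stays usable either way.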

Here is my test case (I'm not a python programmer, sorry!).

euler@harman $ cat /tmp/{f,g}.sql
create or replace function unaccent(text) returns text language plpythonu as $$
import unicodedata
s = unicodedata.normalize("NFKD", args[0])
s = ''.join(c for c in s if ord(c) < 127)
return s
$$ ;
drop function add(int, int);
drop function add2(int, int);

create or replace function add(a int, b int) returns int language plpythonu as $$
return a + b
$$ ;

create or replace function add2(int, int) returns int language plpythonu as $$
return args[0] + args[1]
$$ ;

euler@harman $ ./install/bin/psql
psql (8.4devel)
Type "help" for help.

euler=# select unaccent('até');
NOTICE: PL/Python: args[0]: (null)
ERROR: PL/Python: PL/Python function "unaccent" failed
DETAIL: <type 'exceptions.TypeError'>: normalize() argument 2 must be unicode, not str
euler=# select add(1,2);
NOTICE: PL/Python: args[0]: a
NOTICE: PL/Python: args[1]: b
NOTICE: PL/Python: args[0]: a
NOTICE: PL/Python: args[1]: b
add
-----
3
(1 row)

euler=# select add2(1,2);
NOTICE: PL/Python: args[0]: (null)
NOTICE: PL/Python: args[1]: (null)
NOTICE: PL/Python: args[0]: (null)
NOTICE: PL/Python: args[1]: (null)
add2
------
3
(1 row)

--
Euler Taveira de Oliveira
http://www.timbira.com/

Attachments:

py2.diff (text/plain)
Index: plpython.c
===================================================================
RCS file: /a/pgsql/dev/anoncvs/pgsql/src/pl/plpython/plpython.c,v
retrieving revision 1.118
diff -c -r1.118 plpython.c
*** plpython.c	15 Jan 2009 13:49:56 -0000	1.118
--- plpython.c	31 Mar 2009 18:50:54 -0000
***************
*** 1053,1059 ****
  			}
  
  			if (PyList_SetItem(args, i, arg) == -1 ||
! 				(proc->argnames &&
  				 PyDict_SetItemString(proc->globals, proc->argnames[i], arg) == -1))
  				PLy_elog(ERROR, "PyDict_SetItemString() failed for PL/Python function \"%s\" while setting up arguments", proc->proname);
  			arg = NULL;
--- 1053,1059 ----
  			}
  
  			if (PyList_SetItem(args, i, arg) == -1 ||
! 				(proc->argnames && proc->argnames[i] != NULL &&
  				 PyDict_SetItemString(proc->globals, proc->argnames[i], arg) == -1))
  				PLy_elog(ERROR, "PyDict_SetItemString() failed for PL/Python function \"%s\" while setting up arguments", proc->proname);
  			arg = NULL;
***************
*** 1081,1087 ****
  		return;
  
  	for (i = 0; i < proc->nargs; i++)
! 		PyDict_DelItemString(proc->globals, proc->argnames[i]);
  }
  
  
--- 1081,1088 ----
  		return;
  
  	for (i = 0; i < proc->nargs; i++)
! 		if (proc->argnames[i] != NULL)
! 			PyDict_DelItemString(proc->globals, proc->argnames[i]);
  }
  
  
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Euler Taveira de Oliveira (#8)
Re: can't load plpython

Euler Taveira de Oliveira <euler@timbira.com> writes:

Alvaro Herrera wrote:

I'm not sure I'm reading this right, but doesn't this prevent a
plpython function from working if its parameters don't have names assigned?

No. See the proc->argnames test before PyDict_SetItemString(). The other test
is just tightening the check.

Indeed, the PyDict_*ItemString() functions suffer from the same disease. :( I
reported upstream too.

Attached is another patch that adds a similar test before PyDict_DelItemString();
it's safe because if we don't have a key we don't know what to remove.

Applied, thanks, along with a regression test case. As far as I can
tell, plpython functions that have no names given for their parameters
have been broken for months, and we did not notice because whoever
added named-parameter support changed *every single* test case to use
only named parameters. Brilliant.

Alvaro's example now gives me this on Fedora 10:

ERROR: PL/Python: PL/Python function "unaccent" failed
DETAIL: <type 'exceptions.TypeError'>: normalize() argument 2 must be unicode, not str

which is the same as it did in 8.3. I do not know if that's a bug
or expected (making the database encoding be utf8 doesn't help).

Alvaro, would you see if it still crashes for you on Debian?
If so there's some other issue with python 2.5.4 ...

regards, tom lane

#10Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#9)
Re: can't load plpython

Tom Lane wrote:

Alvaro's example now gives me this on Fedora 10:

ERROR: PL/Python: PL/Python function "unaccent" failed
DETAIL: <type 'exceptions.TypeError'>: normalize() argument 2 must be unicode, not str

which is the same as it did in 8.3. I do not know if that's a bug
or expected (making the database encoding be utf8 doesn't help).

Apparently the problem is that "str" is a different type in Python than
"unicode". I could get it to work this way:

create or replace function unaccent(text) returns text language plpythonu as $$
import unicodedata
rv = plpy.execute("select setting from pg_settings where name = 'server_encoding'");
encoding = rv[0]["setting"]
s = args[0].decode(encoding)
s = unicodedata.normalize("NFKD", s)
s = ''.join(c for c in s if ord(c) < 127)
return s
$$;
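
The same decode-then-normalize idea can be tried outside the database. This is a minimal sketch for Python 3 (where text is str and undecoded data is bytes), not the PL/Python function above; unaccent_bytes is a hypothetical name, and the encoding is passed in explicitly instead of being read from pg_settings.

```python
import unicodedata


def unaccent_bytes(raw, encoding):
    """Decode raw bytes, decompose accented letters, keep ASCII only."""
    text = raw.decode(encoding)
    decomposed = unicodedata.normalize("NFKD", text)
    # Same filter as in the thread: combining marks have code points
    # above 127 and are dropped here.
    return "".join(c for c in decomposed if ord(c) < 127)


result = unaccent_bytes("Álvaro muñoz".encode("utf-8"), "utf-8")

# Passing undecoded bytes to normalize() is still a TypeError in Python 3.
try:
    unicodedata.normalize("NFKD", b"abc")
    rejected = False
except TypeError:
    rejected = True
```

Note that the filter also silently drops any character that has no ASCII decomposition, so it is only suitable where losing such characters is acceptable.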

Alvaro, would you see if it still crashes for you on Debian?
If so there's some other issue with python 2.5.4 ...

It works for me now. Thanks to Euler for tracking the Python problem
down and to you for the commit!

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#10)
Re: can't load plpython

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane wrote:

Alvaro, would you see if it still crashes for you on Debian?
If so there's some other issue with python 2.5.4 ...

It works for me now. Thanks to Euler for tracking the Python problem
down and to you for the commit!

Hmph. I wonder what caused that crash you reported originally? The
backtrace doesn't look like it's explained by the argument-name bug:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01344.php

Maybe that backtrace is just bogus, though --- if you'd pointed gdb
at the wrong executable version, or something, you could have come
up with silly results. Anyway, if it's no longer reproducible, we
probably shouldn't spend too much time on it.

regards, tom lane

#12Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#11)
Re: can't load plpython

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

It works for me now. Thanks to Euler for tracking the Python problem
down and to you for the commit!

Hmph. I wonder what caused that crash you reported originally? The
backtrace doesn't look like it's explained by the argument-name bug:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01344.php

Maybe that backtrace is just bogus, though --- if you'd pointed gdb
at the wrong executable version, or something, you could have come
up with silly results.

No, the backtrace is right -- I get the same if I revert the plpython.c
commit. I have no idea why the backtrace looks like this. It's even
compiled with -O0.

Anyway, if it's no longer reproducible, we probably shouldn't spend
too much time on it.

Okay.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support