Re: Array of composite types returned from python

Started by Sim Zacksover 11 years ago10 messages
#1Sim Zacks
sim@compulab.co.il

Am I missing anything, (ie memory leak, undesirable behavior elsewhere)?

-Ed

I applied the patch and it looks like it is working well. As a longtime plpython user, I appreciate the fix.

I have a few comments:
1) I would remove the error message from the PO files as well.

2) You removed the comment:
- /*
- * We don't support arrays of row types yet, so the first argument
- * can be NULL.
- */

But didn't change the code there.
I haven't delved deep enough into the code yet to understand the full meaning, but the comment would indicate that if arrays of row types are supported, the first argument cannot be null.

3) This is such a simple change with no new infrastructure code (PLyObject_ToComposite already exists). Can you think of a reason why this wasn't done until now? Was it a simple miss or purposefully excluded?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Sim Zacks (#1)

Hi.

When this patch was first added to a CF, I had a quick look at it, but
left it for a proper review by someone more familiar with PL/Python
internals for precisely this reason:

2) You removed the comment:
- /*
- * We don't support arrays of row types yet, so the first argument
- * can be NULL.
- */

But didn't change the code there.
I haven't delved deep enough into the code yet to understand the full
meaning, but the comment would indicate that if arrays of row types
are supported, the first argument cannot be null.

I had another look now, and I think removing the comment is fine. It
actually made no sense to me in context, so I went digging a little.

After following a plpython.c → plpy_*.c refactoring (#147c2482) and a
pgindent run (#65e806cb), I found that the comment was added along with
the code by this commit:

commit db7386187f78dfc45b86b6f4f382f6b12cdbc693
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Thu Dec 10 20:43:40 2009 +0000

PL/Python array support

Support arrays as parameters and return values of PL/Python functions.

At the time, the code looked like this:

+               else
+               {
+                       nulls[i] = false;
+                       /* We don't support arrays of row types yet, so the first
+                        * argument can be NULL. */
+                       elems[i] = arg->elm->func(NULL, arg->elm, obj);
+               }

Note that the first argument was actually NULL, so the comment made
sense when it was written. But the code was subsequently changed to
pass in arg->elm by the following commit:

commit 09130e5867d49c72ef0f11bef30c5385d83bf194
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Mon Oct 11 22:16:40 2010 -0400

Fix plpython so that it again honors typmod while assigning to tuple fields.

This was broken in 9.0 while improving plpython's conversion behavior for
bytea and boolean. Per bug report from maizi.

The comment should have been removed at the same time. So I don't think
there's a problem here.

3) This is such a simple change with no new infrastructure code
(PLyObject_ToComposite already exists). Can you think of a reason
why this wasn't done until now? Was it a simple miss or purposefully
excluded?

This is not an authoritative answer: I think the infrastructure was
originally missing, but was later added in #bc411f25 for OUT parameters.
Perhaps it was overlooked at the time that the same code would suffice
for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
comments.)

I think the patch is ready for committer.

-- Abhijit

P.S. I'm a wee bit confused by this mail I'm replying to, because it's
signed "Ed" and looks like a response, but it's "From: Sim Zacks". I've
added the original author's address to the Cc: in case I misunderstood
something.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Abhijit Menon-Sen (#2)

At 2014-06-29 18:08:53 +0530, ams@2ndQuadrant.com wrote:

I think the patch is ready for committer.

That's based on my earlier quick look and the current archaeology. But
I'm not a PL/Python user, and Ronan signed up to review the patch, so I
haven't changed the status.

Ronan, did you get a chance to look at it?

-- Abhijit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Abhijit Menon-Sen (#2)

Abhijit Menon-Sen <ams@2ndQuadrant.com> writes:

I had another look now, and I think removing the comment is fine. It
actually made no sense to me in context, so I went digging a little.
...
Note that the first argument was actually NULL, so the comment made
sense when it was written. But the code was subsequently changed to
pass in arg->elm by the following commit:

commit 09130e5867d49c72ef0f11bef30c5385d83bf194
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Mon Oct 11 22:16:40 2010 -0400

Fix plpython so that it again honors typmod while assigning to tuple fields.

This was broken in 9.0 while improving plpython's conversion behavior for
bytea and boolean. Per bug report from maizi.

The comment should have been removed at the same time. So I don't think
there's a problem here.

Yeah, you're right: the comment is referring to the struct PLyTypeInfo *
argument, which isn't there at all anymore. Mea culpa --- that's the same
sort of failure-to-update-nearby-comments thinko that I regularly mutter
about other people making :-(

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Abhijit Menon-Sen (#2)

Abhijit Menon-Sen <ams@2ndQuadrant.com> writes:

3) This is such a simple change with no new infrastructure code
(PLyObject_ToComposite already exists). Can you think of a reason
why this wasn't done until now? Was it a simple miss or purposefully
excluded?

This is not an authoritative answer: I think the infrastructure was
originally missing, but was later added in #bc411f25 for OUT parameters.
Perhaps it was overlooked at the time that the same code would suffice
for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
comments.)

I think the patch is ready for committer.

I took a quick look at this; not really a review either, but I have
a couple comments.

1. While I think the patch does what it intends to, it's a bit distressing
that it will invoke the information lookups in PLyObject_ToComposite over
again for *each element* of the array. We probably ought to quantify that
overhead to see if it's bad enough that we need to do something about
improving caching, as speculated in the comment in PLyObject_ToComposite.

2. I wonder whether the no-composites restriction in plpy.prepare
(see lines 133ff in plpy_spi.c) could be removed as easily.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Behn, Edward (EBEHN)
EBEHN@arinc.com
In reply to: Tom Lane (#5)
1 attachment(s)

Just writing to check in.

I haven't done anything to look into allowing arrays of composites for input
to PL/Python function. I made the submitted modification for a specific
project that I'm working on that involves python code that returns data
structures.

I also have no idea about a more efficient way to convert composite
elements.
-Ed

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Sunday, June 29, 2014 4:54 PM
To: Abhijit Menon-Sen
Cc: Sim Zacks; Behn, Edward (EBEHN); pgsql-hackers@postgresql.org; Peter
Eisentraut
Subject: Re: [HACKERS] Array of composite types returned from python

Abhijit Menon-Sen <ams@2ndQuadrant.com> writes:

3) This is such a simple change with no new infrastructure code
(PLyObject_ToComposite already exists). Can you think of a reason why
this wasn't done until now? Was it a simple miss or purposefully
excluded?

This is not an authoritative answer: I think the infrastructure was
originally missing, but was later added in #bc411f25 for OUT parameters.
Perhaps it was overlooked at the time that the same code would suffice
for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
comments.)

I think the patch is ready for committer.

I took a quick look at this; not really a review either, but I have a couple
comments.

1. While I think the patch does what it intends to, it's a bit distressing
that it will invoke the information lookups in PLyObject_ToComposite over
again for *each element* of the array. We probably ought to quantify that
overhead to see if it's bad enough that we need to do something about
improving caching, as speculated in the comment in PLyObject_ToComposite.

2. I wonder whether the no-composites restriction in plpy.prepare (see lines
133ff in plpy_spi.c) could be removed as easily.

regards, tom lane

Attachments:

smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#7Ronan Dunklau
ronan.dunklau@dalibo.com
In reply to: Tom Lane (#5)
1 attachment(s)

Le dimanche 29 juin 2014 16:54:03 Tom Lane a écrit :

Abhijit Menon-Sen <ams@2ndQuadrant.com> writes:

3) This is such a simple change with no new infrastructure code
(PLyObject_ToComposite already exists). Can you think of a reason
why this wasn't done until now? Was it a simple miss or purposefully
excluded?

This is not an authoritative answer: I think the infrastructure was
originally missing, but was later added in #bc411f25 for OUT parameters.
Perhaps it was overlooked at the time that the same code would suffice
for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
comments.)

I think the patch is ready for committer.

Sorry for being this late.

I've tested the patch, everything seems to work as expected, including complex
nesting of Composite and array types.

No documentation changes are needed, since the limitation wasn't even
mentioned before.

Regression tests are ok, and the patch seems simple enough. Formatting looks
OK too.

I took a quick look at this; not really a review either, but I have
a couple comments.

1. While I think the patch does what it intends to, it's a bit distressing
that it will invoke the information lookups in PLyObject_ToComposite over
again for *each element* of the array. We probably ought to quantify that
overhead to see if it's bad enough that we need to do something about
improving caching, as speculated in the comment in PLyObject_ToComposite.

I don't know how to do that without implementing the cache itself.

2. I wonder whether the no-composites restriction in plpy.prepare
(see lines 133ff in plpy_spi.c) could be removed as easily.

Hum, I tried that, but its not that easy: lifting the restriction results in a
SEGFAULT when trying to pfree the parameters given to SPI_ExecutePlan (line
320 in plpy_spi.c).

Correct me if I'm wrong, but I think the problem is that HeapTupleGetDatum
returns the t_data field, whereas heap_form_tuple allocation returns the
address of the HeapTuple itself. Then, the datum itself has not been palloced.

Changing the HeapTupleGetDatum call for an heap_copy_tuple_as_datum fixes this
issue, but I'm not sure this the best way to do that.

The attached patch implements this.

regards, tom lane

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Attachments:

PLPythonCompositeArrays_v2.patchtext/x-patch; charset=UTF-8; name=PLPythonCompositeArrays_v2.patchDownload
diff --git a/src/pl/plpython/expected/plpython_composite.out b/src/pl/plpython/expected/plpython_composite.out
index 61b3046..e7a7edb 100644
--- a/src/pl/plpython/expected/plpython_composite.out
+++ b/src/pl/plpython/expected/plpython_composite.out
@@ -270,9 +270,13 @@ yield {'tab': [['first', 1], ['second', 2]],
               {'first': 'fourth', 'second': 4}]}
 $$ LANGUAGE plpythonu;
 SELECT * FROM composite_types_table();
-ERROR:  PL/Python functions cannot return type table_record[]
-DETAIL:  PL/Python does not support conversion to arrays of row types.
-CONTEXT:  PL/Python function "composite_types_table"
+            tab             |            typ             
+----------------------------+----------------------------
+ {"(first,1)","(second,2)"} | {"(third,3)","(fourth,4)"}
+ {"(first,1)","(second,2)"} | {"(third,3)","(fourth,4)"}
+ {"(first,1)","(second,2)"} | {"(third,3)","(fourth,4)"}
+(3 rows)
+
 -- check what happens if the output record descriptor changes
 CREATE FUNCTION return_record(t text) RETURNS record AS $$
 return {'t': t, 'val': 10}
diff --git a/src/pl/plpython/expected/plpython_spi.out b/src/pl/plpython/expected/plpython_spi.out
index 5175794..de547d2 100644
--- a/src/pl/plpython/expected/plpython_spi.out
+++ b/src/pl/plpython/expected/plpython_spi.out
@@ -376,6 +376,15 @@ plan = plpy.prepare("select fname, lname from users where fname like $1 || '%'",
                     ["text"])
 c = plpy.cursor(plan, ["a", "b"])
 $$ LANGUAGE plpythonu;
+CREATE TYPE test_composite_type AS (
+  a1 int,
+  a2 varchar
+);
+CREATE OR REPLACE FUNCTION plan_composite_args() RETURNS test_composite_type AS $$
+plan = plpy.prepare("select $1::test_composite_type as c1", ["test_composite_type"])
+res = plpy.execute(plan, [{"a1": 3, "a2": "label"}])
+return res[0]["c1"]
+$$ LANGUAGE plpythonu;
 SELECT simple_cursor_test();
  simple_cursor_test 
 --------------------
@@ -432,3 +441,9 @@ CONTEXT:  Traceback (most recent call last):
   PL/Python function "cursor_plan_wrong_args", line 4, in <module>
     c = plpy.cursor(plan, ["a", "b"])
 PL/Python function "cursor_plan_wrong_args"
+SELECT plan_composite_args();
+ plan_composite_args 
+---------------------
+ (3,label)
+(1 row)
+
diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out
index b98318c..9576905 100644
--- a/src/pl/plpython/expected/plpython_types.out
+++ b/src/pl/plpython/expected/plpython_types.out
@@ -632,13 +632,12 @@ PL/Python function "test_type_conversion_array_mixed2"
 CREATE FUNCTION test_type_conversion_array_record() RETURNS type_record[] AS $$
 return [None]
 $$ LANGUAGE plpythonu;
-ERROR:  PL/Python functions cannot return type type_record[]
-DETAIL:  PL/Python does not support conversion to arrays of row types.
 SELECT * FROM test_type_conversion_array_record();
-ERROR:  function test_type_conversion_array_record() does not exist
-LINE 1: SELECT * FROM test_type_conversion_array_record();
-                      ^
-HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
+ test_type_conversion_array_record 
+-----------------------------------
+ {NULL}
+(1 row)
+
 CREATE FUNCTION test_type_conversion_array_string() RETURNS text[] AS $$
 return 'abc'
 $$ LANGUAGE plpythonu;
diff --git a/src/pl/plpython/expected/plpython_types_3.out b/src/pl/plpython/expected/plpython_types_3.out
index e104356..f38e2a7 100644
--- a/src/pl/plpython/expected/plpython_types_3.out
+++ b/src/pl/plpython/expected/plpython_types_3.out
@@ -632,13 +632,12 @@ PL/Python function "test_type_conversion_array_mixed2"
 CREATE FUNCTION test_type_conversion_array_record() RETURNS type_record[] AS $$
 return [None]
 $$ LANGUAGE plpython3u;
-ERROR:  PL/Python functions cannot return type type_record[]
-DETAIL:  PL/Python does not support conversion to arrays of row types.
 SELECT * FROM test_type_conversion_array_record();
-ERROR:  function test_type_conversion_array_record() does not exist
-LINE 1: SELECT * FROM test_type_conversion_array_record();
-                      ^
-HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
+ test_type_conversion_array_record 
+-----------------------------------
+ {NULL}
+(1 row)
+
 CREATE FUNCTION test_type_conversion_array_string() RETURNS text[] AS $$
 return 'abc'
 $$ LANGUAGE plpython3u;
diff --git a/src/pl/plpython/plpy_spi.c b/src/pl/plpython/plpy_spi.c
index 060d514..6c3eeff 100644
--- a/src/pl/plpython/plpy_spi.c
+++ b/src/pl/plpython/plpy_spi.c
@@ -130,12 +130,7 @@ PLy_spi_prepare(PyObject *self, PyObject *args)
 
 			plan->types[i] = typeId;
 			typeStruct = (Form_pg_type) GETSTRUCT(typeTup);
-			if (typeStruct->typtype != TYPTYPE_COMPOSITE)
-				PLy_output_datum_func(&plan->args[i], typeTup);
-			else
-				ereport(ERROR,
-						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				   errmsg("plpy.prepare does not support composite types")));
+			PLy_output_datum_func(&plan->args[i], typeTup);
 			ReleaseSysCache(typeTup);
 		}
 
diff --git a/src/pl/plpython/plpy_typeio.c b/src/pl/plpython/plpy_typeio.c
index 566cf6c..a8d1b3e 100644
--- a/src/pl/plpython/plpy_typeio.c
+++ b/src/pl/plpython/plpy_typeio.c
@@ -404,11 +404,7 @@ PLy_output_datum_func2(PLyObToDatum *arg, HeapTuple typeTup)
 		Oid			funcid;
 
 		if (type_is_rowtype(element_type))
-			ereport(ERROR,
-					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					 errmsg("PL/Python functions cannot return type %s",
-							format_type_be(arg->typoid)),
-					 errdetail("PL/Python does not support conversion to arrays of row types.")));
+			arg->func = PLyObject_ToComposite;
 
 		arg->elm = PLy_malloc0(sizeof(*arg->elm));
 		arg->elm->func = arg->func;
@@ -835,11 +831,6 @@ PLySequence_ToArray(PLyObToDatum *arg, int32 typmod, PyObject *plrv)
 		else
 		{
 			nulls[i] = false;
-
-			/*
-			 * We don't support arrays of row types yet, so the first argument
-			 * can be NULL.
-			 */
 			elems[i] = arg->elm->func(arg->elm, -1, obj);
 		}
 		Py_XDECREF(obj);
@@ -947,7 +938,7 @@ PLyMapping_ToComposite(PLyTypeInfo *info, TupleDesc desc, PyObject *mapping)
 	pfree(values);
 	pfree(nulls);
 
-	return HeapTupleGetDatum(tuple);
+	return heap_copy_tuple_as_datum(tuple, desc);
 }
 
 
@@ -1033,7 +1024,7 @@ PLySequence_ToComposite(PLyTypeInfo *info, TupleDesc desc, PyObject *sequence)
 	pfree(values);
 	pfree(nulls);
 
-	return HeapTupleGetDatum(tuple);
+	return heap_copy_tuple_as_datum(tuple, desc);
 }
 
 
@@ -1105,7 +1096,7 @@ PLyGenericObject_ToComposite(PLyTypeInfo *info, TupleDesc desc, PyObject *object
 	pfree(values);
 	pfree(nulls);
 
-	return HeapTupleGetDatum(tuple);
+	return heap_copy_tuple_as_datum(tuple, desc);
 }
 
 /*
diff --git a/src/pl/plpython/sql/plpython_spi.sql b/src/pl/plpython/sql/plpython_spi.sql
index 09f0d7f..4675199 100644
--- a/src/pl/plpython/sql/plpython_spi.sql
+++ b/src/pl/plpython/sql/plpython_spi.sql
@@ -284,6 +284,17 @@ plan = plpy.prepare("select fname, lname from users where fname like $1 || '%'",
 c = plpy.cursor(plan, ["a", "b"])
 $$ LANGUAGE plpythonu;
 
+CREATE TYPE test_composite_type AS (
+  a1 int,
+  a2 varchar
+);
+
+CREATE OR REPLACE FUNCTION plan_composite_args() RETURNS test_composite_type AS $$
+plan = plpy.prepare("select $1::test_composite_type as c1", ["test_composite_type"])
+res = plpy.execute(plan, [{"a1": 3, "a2": "label"}])
+return res[0]["c1"]
+$$ LANGUAGE plpythonu;
+
 SELECT simple_cursor_test();
 SELECT double_cursor_close();
 SELECT cursor_fetch();
@@ -293,3 +304,4 @@ SELECT next_after_close();
 SELECT cursor_fetch_next_empty();
 SELECT cursor_plan();
 SELECT cursor_plan_wrong_args();
+SELECT plan_composite_args();
#8Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Ronan Dunklau (#7)

Hi Ronan.

Based on your review, I'm marking this as ready for committer.

The attached patch implements this.

Your patch looks sensible enough (thanks for adding tests), but I guess
we'll let the reviewer sort out whether to commit the original or your
extended version.

Thanks.

-- Abhijit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ronan Dunklau (#7)

Ronan Dunklau <ronan.dunklau@dalibo.com> writes:

Le dimanche 29 juin 2014 16:54:03 Tom Lane a =E9crit :

1. While I think the patch does what it intends to, it's a bit distressing
that it will invoke the information lookups in PLyObject_ToComposite over
again for *each element* of the array. We probably ought to quantify that
overhead to see if it's bad enough that we need to do something about
improving caching, as speculated in the comment in PLyObject_ToComposite.

I don't know how to do that without implementing the cache itself.

I don't either, but my thought was that we could hack up a simple
one-element cache pretty trivially, eg static info and desc variables
in PLyObject_ToComposite that are initialized the first time through.
You could only test one composite-array type per session with that
sort of kluge, but that would be good enough for doing some simple
performance testing.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#9)

I wrote:

Ronan Dunklau <ronan.dunklau@dalibo.com> writes:

I don't know how to do that without implementing the cache itself.

I don't either, but my thought was that we could hack up a simple
one-element cache pretty trivially, eg static info and desc variables
in PLyObject_ToComposite that are initialized the first time through.
You could only test one composite-array type per session with that
sort of kluge, but that would be good enough for doing some simple
performance testing.

I did that, and found that building and returning a million-element
composite array took about 4.2 seconds without any optimization, and 3.2
seconds with the hacked-up cache (as of HEAD, asserts off). I'd say that
means we might want to do something about it eventually, but it's hardly
the first order of business.

I've committed the patch with a bit of additional cleanup. I credited
Ronan and Ed equally as authors, since I'd say the fix for plpy.prepare
was at least as complex as the original patch.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers