PL/Python adding support for multi-dimensional arrays
Hi,
The current implementation of PL/Python does not allow the use of
multi-dimensional arrays for either input or output parameters. This forces
end users to introduce workarounds like casting arrays to text before
passing them to the functions and parsing them afterwards, which is an
error-prone approach.
This patch adds support for multi-dimensional arrays as both input and
output parameters for PL/Python functions. The number of dimensions
supported is limited by the Postgres MAXDIM macro, which is equal to 6 by
default. Both input and output multi-dimensional arrays must have fixed
dimension sizes, i.e. a 2-d array should represent an MxN matrix, a 3-d
array an MxNxK cube, etc.
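To make the fixed-dimension-size requirement concrete, here is a minimal Python sketch of the check. It is only an illustration, not the patch's C code, and the helper name `array_dims` is made up for this example.

```python
def array_dims(value):
    """Return the dimension sizes of a nested list, e.g. [2, 3] for an MxN matrix.

    Raises ValueError when sub-lists at the same depth differ in shape,
    i.e. when the nested list is not a proper matrix / cube / etc.
    """
    if not isinstance(value, list):
        return []      # a scalar contributes no dimension
    if not value:
        return [0]     # empty list: one dimension of size zero
    first = array_dims(value[0])
    for item in value[1:]:
        if array_dims(item) != first:
            raise ValueError(
                "multi-dimensional arrays must have fixed dimension sizes")
    return [len(value)] + first
```

With this sketch, `array_dims([[1, 2, 3], [4, 5, 6]])` gives `[2, 3]`, while a ragged list such as `[[1, 2], [3]]` is rejected.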
This patch does not support multi-dimensional arrays of composite types, as
composite types in Python might be represented as iterators and there is no
obvious way to find out where the nested array stops and the composite type
structure starts. For example, if we have a composite type of (int, text),
we can try to return "[ [ [1,'a'], [2,'b'] ], [ [3,'c'], [4,'d'] ] ]", and
it is hard to find out that the first two levels of lists are array
dimensions, and the third one represents the structure. Things get even
more complex when you have arrays as members of the composite type. This is
why I think this limitation is reasonable.
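The ambiguity can be shown with a small Python sketch. It only looks at the shape of the returned value and lists the readings that shape allows; the function names are hypothetical and nothing here reflects how the patch itself inspects values.

```python
def nesting_shape(value):
    """Sizes of each nesting level, following the first element at each level."""
    shape = []
    while isinstance(value, (list, tuple)) and len(value) > 0:
        shape.append(len(value))
        value = value[0]
    return shape


def candidate_readings(value):
    """List (array_dims, n_fields) readings for a nested list.

    n_fields is None when every level is read as an array dimension,
    otherwise the innermost level is read as a composite with n_fields fields.
    """
    shape = nesting_shape(value)
    readings = [(shape, None)]
    if len(shape) > 1:
        # Same value, but the innermost lists are taken as composite rows.
        readings.append((shape[:-1], shape[-1]))
    return readings
```

For the value from the example, `candidate_readings([[[1, 'a'], [2, 'b']], [[3, 'c'], [4, 'd']]])` yields both a 3-d reading with dims `[2, 2, 2]` and a 2-d reading with dims `[2, 2]` of two-field composites, and the shape alone does not say which one was meant.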
Given the function:
CREATE FUNCTION test_type_conversion_array_int4(x int4[]) RETURNS int4[] AS
$$
plpy.info(x, type(x))
return x
$$ LANGUAGE plpythonu;
Before patch:
# SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
ERROR: cannot convert multidimensional array to Python list
DETAIL: PL/Python only supports one-dimensional arrays.
CONTEXT: PL/Python function "test_type_conversion_array_int4"
After patch:
# SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
INFO: ([[1, 2, 3], [4, 5, 6]], <type 'list'>)
test_type_conversion_array_int4
---------------------------------
{{1,2,3},{4,5,6}}
(1 row)
--
Best regards,
Alexey Grishchenko
Attachments:
0001-PL-Python-adding-support-for-multi-dimensional-arrays.patch (application/octet-stream, +261 -50)
On Wed, Aug 3, 2016 at 12:49 PM, Alexey Grishchenko <agrishchenko@pivotal.io> wrote:
This patch also incorporates the fix for
https://www.postgresql.org/message-id/CAH38_tkwA5qgLV8zPN1OpPzhtkNKQb30n3xq-2NR9jUfv3qwHA@mail.gmail.com,
as both touch the same piece of code: array manipulation in PL/Python.
--
Best regards,
Alexey Grishchenko
Hi
2016-08-03 13:54 GMT+02:00 Alexey Grishchenko <agrishchenko@pivotal.io>:
I am sending a review of this patch:
1. The implemented functionality is clearly a benefit: passing
multi-dimensional arrays, and considerably faster passing of bigger arrays
2. I was able to apply this patch cleanly, without any errors or warnings
3. There are no errors or warnings
4. All tests passed - I tested Python 2.7 and Python 3.5
5. The code is well commented and clean
6. Documentation is not necessary for this new functionality
7. I would like to see more regression tests for both directions
(Python <-> Postgres) with more than two dimensions
My only objection is that there are not enough regression tests. After
fixing this, the patch will be ready for committers.
Good work, Alexey
Thank you
Regards
Pavel
On 10 August 2016 at 01:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Pavel,
I will pick this up.
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
On 18 September 2016 at 09:27, Dave Cramer <pg@fastcrypt.com> wrote:
Pavel,
Please see the attached patch, which provides more test cases.
I just realized this patch contains the original patch as well. What is the
protocol for sending in subsequent patches?
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
Attachments:
0002-PL-Python-adding-support-for-multi-dimensional-arrays.patch (application/octet-stream, +443 -49)
Hi
2016-09-21 19:53 GMT+02:00 Dave Cramer <pg@fastcrypt.com>:
Now the tests are sufficient, so I'll mark this patch as ready for
committers. I had to fix the tests: there was a lot of stray whitespace,
and the result for Python 3 was missing.
Regards
Pavel
Attachments:
PL-Python-adding-support-for-multi-dimensional-arrays-20160922.patch (text/x-patch, +546 -49)
On 09/22/2016 10:28 AM, Pavel Stehule wrote:
Now the tests are sufficient, so I'll mark this patch as ready for
committers. I had to fix the tests: there was a lot of stray whitespace,
and the result for Python 3 was missing.
Thanks Pavel!
This crashes with arrays with non-default lower bounds:
postgres=# SELECT * FROM test_type_conversion_array_int4('[2:4]={1,2,3}');
INFO: ([1, 2, <NULL>], <type 'list'>)
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
I'd like to see some updates to the docs for this. The manual doesn't
currently say anything about multi-dimensional arrays in PL/Python, but
it should have mentioned that they're not supported. Now that they are
supported, it should mention that, and explain briefly that a
multi-dimensional array is mapped to a Python list of lists.
It seems we don't have any mention in the docs about arrays with
non-default lower bounds at the moment. That's not this patch's fault, but
it would be good to point out that the lower bounds are discarded when an
array is passed to Python.
I find the loop in PLyList_FromArray() quite difficult to understand.
Are the comments there mixing up the "inner" and "outer" dimensions? I
wonder if it would be easier to read if it were written in a recursive
style, rather than iteratively with stacks for the dimensions.
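To illustrate what a recursive formulation could look like, here is a Python sketch that turns a flat, row-major list of elements plus a list of dimension sizes into nested lists. It is not the C code of PLyList_FromArray(); the name `nested_from_flat` and its signature are assumptions made for this example. Lower bounds play no role in it, matching the point above that they are discarded.

```python
def nested_from_flat(elements, dims):
    """Build nested lists from row-major `elements` and dimension sizes `dims`."""

    def build(dim, offset):
        # Returns (nested value for this dimension, next unread offset).
        if dim == len(dims) - 1:
            # Innermost dimension: a plain run of consecutive elements.
            end = offset + dims[dim]
            return elements[offset:end], end
        result = []
        for _ in range(dims[dim]):
            sub, offset = build(dim + 1, offset)
            result.append(sub)
        return result, offset

    if not dims:
        return []
    nested, consumed = build(0, 0)
    if consumed != len(elements):
        raise ValueError("element count does not match the dimensions")
    return nested
```

Each call handles one dimension and delegates the next one to a nested call, so no explicit stack of per-dimension counters is needed.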
On 08/03/2016 02:49 PM, Alexey Grishchenko wrote:
This patch does not support multi-dimensional arrays of composite types, as
composite types in Python might be represented as iterators and there is no
obvious way to find out when the nested array stops and composite type
structure starts. For example, if we have a composite type of (int, text),
we can try to return "[ [ [1,'a'], [2,'b'] ], [ [3,'c'], [4,'d'] ] ]", and
it is hard to find out that the first two lists are lists, and the third
one represents structure. Things are getting even more complex when you
have arrays as members of composite type. This is why I think this
limitation is reasonable.
How do we handle single-dimensional arrays of composite types at the
moment? At a quick glance, it seems that the composite types are just
treated like strings, when they're in an array. That's probably OK, but
it means that there's nothing special about composite types in
multi-dimensional arrays. In any case, we should mention that in the docs.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 9/23/16 2:42 AM, Heikki Linnakangas wrote:
How do we handle single-dimensional arrays of composite types at the
moment? At a quick glance, it seems that the composite types are just
treated like strings, when they're in an array. That's probably OK, but
it means that there's nothing special about composite types in
multi-dimensional arrays. In any case, we should mention that in the docs.
That is how they're handled, but I'd really like to change that. I've
held off because I don't know how to handle the backwards
incompatibility that would introduce. (I've been wondering if we might
add a facility to allow specifying default TRANSFORMs that should be
used for specific data types in specific languages.)
The converse case (a composite with arrays) suffers the same problem
(array is just treated as a string).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
This crashes with arrays with non-default lower bounds:
postgres=# SELECT * FROM test_type_conversion_array_int4('[2:4]={1,2,3}');
INFO: ([1, 2, <NULL>], <type 'list'>)
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Attached patch fixes this bug, and adds a test for it.
I'd like to see some updates to the docs for this. The manual doesn't
currently say anything about multi-dimensional arrays in pl/python, but it
should've mentioned that they're not supported. Now that it is supported,
should mention that, and explain briefly that a multi-dimensional array is
mapped to a python list of lists.
If the code passes I'll fix the docs.
It seems we don't have any mention in the docs about arrays with
non-default lower-bounds ATM. That's not this patch's fault, but it would
be good to point out that the lower bounds are discarded when an array is
passed to python.
I find the loop in PLyList_FromArray() quite difficult to understand. Are
the comments there mixing up the "inner" and "outer" dimensions? I wonder
if that would be easier to read, if it was written in a recursive-style,
rather than iterative with stacks for the dimensions.
Yes, it is fairly convoluted.
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
On 26 September 2016 at 14:52, Dave Cramer <pg@fastcrypt.com> wrote:
Attachments:
PL-Python-adding-support-for-multi-dimensional-arrays-20160926.patch (application/octet-stream, +554 -49)
On 09/27/2016 02:04 PM, Dave Cramer wrote:
On 26 September 2016 at 14:52, Dave Cramer <pg@fastcrypt.com> wrote:
This crashes with arrays with non-default lower bounds:
postgres=# SELECT * FROM test_type_conversion_array_int4('[2:4]={1,2,3}');
INFO: ([1, 2, <NULL>], <type 'list'>)
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Attached patch fixes this bug, and adds a test for it.
I spent some more time massaging this:
* Changed the loops from iterative to recursive style. I think this
indeed is slightly easier to understand.
* Fixed another segfault, with too deeply nested lists:
CREATE or replace FUNCTION test_type_conversion_mdarray_toodeep()
RETURNS int[] AS $$
return [[[[[[[[[[[[[[[[[[1]]]]]]]]]]]]]]]]]]
$$ LANGUAGE plpythonu;
* Also, in PLySequence_ToArray(), we must check that the 'len' of the
array doesn't overflow.
* Fixed reference leak in the loop in PLySequence_ToArray() to count the
number of dimensions.
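The depth and size checks from the list above can be sketched in Python as follows. This is only an illustration of the idea, not the code in PLySequence_ToArray(): `MAXDIM = 6` is the default quoted earlier in the thread, while `MAX_ELEMENTS` is a made-up cap standing in for the C-level overflow check (Python integers do not overflow).

```python
MAXDIM = 6              # default value quoted earlier in this thread
MAX_ELEMENTS = 2 ** 27  # hypothetical cap, chosen only for illustration


def count_dims(value):
    """Return dimension sizes of a nested list, refusing overly deep or huge ones."""
    dims = []
    while isinstance(value, list):
        if len(dims) >= MAXDIM:
            raise ValueError(
                "number of array dimensions exceeds the maximum allowed (%d)"
                % MAXDIM)
        dims.append(len(value))
        if not value:
            break
        value = value[0]   # descend along the first element only

    total = 1
    for size in dims:
        total *= size
        if total > MAX_ELEMENTS:
            raise ValueError("array size exceeds the maximum allowed")
    return dims
```

A list nested 18 levels deep, like the one in test_type_conversion_mdarray_toodeep() above, is rejected by the first check instead of being walked further.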
I'd like to see some updates to the docs for this. The manual doesn't
currently say anything about multi-dimensional arrays in pl/python, but it
should've mentioned that they're not supported. Now that it is supported,
should mention that, and explain briefly that a multi-dimensional array is
mapped to a python list of lists.
If the code passes I'll fix the docs
Please do, thanks!
- Heikki
Attachments:
0001-WIP-Multi-dimensional-arrays-in-PL-python.patch (text/x-patch, +608 -53)
On 27 September 2016 at 14:58, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
see attached
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
Attachments:
0002-WIP-Multi-dimensional-arrays-in-PL-python.patch (application/octet-stream, +625 -52)
On 09/23/2016 10:27 PM, Jim Nasby wrote:
On 9/23/16 2:42 AM, Heikki Linnakangas wrote:
How do we handle single-dimensional arrays of composite types at the
moment? At a quick glance, it seems that the composite types are just
treated like strings, when they're in an array. That's probably OK, but
it means that there's nothing special about composite types in
multi-dimensional arrays. In any case, we should mention that in the docs.
That is how they're handled, but I'd really like to change that. I've
held off because I don't know how to handle the backwards
incompatibility that would introduce. (I've been wondering if we might
add a facility to allow specifying default TRANSFORMs that should be
used for specific data types in specific languages.)
The converse case (a composite with arrays) suffers the same problem
(array is just treated as a string).
I take that back, I don't know what I was talking about. Without this
patch, an array of composite types can be returned, using any of the
three representations for the composite type explained in the docs: a
string, a sequence, or a dictionary. So, all these work, and return the
same value:
create table foo (a int4, b int4);
CREATE FUNCTION comp_array_string() RETURNS foo[] AS $$
return ["(1, 2)"]
$$ LANGUAGE plpythonu;
CREATE FUNCTION comp_array_sequence() RETURNS foo[] AS $$
return [[1, 2]]
$$ LANGUAGE plpythonu;
CREATE FUNCTION comp_array_dict() RETURNS foo[] AS $$
return [{"a": 1, "b": 2}]
$$ LANGUAGE plpythonu;
Jim, I was confused, but you agreed with me. Were you also confused, or
am I missing something?
Now, back to multi-dimensional arrays. I can see that the Sequence
representation is problematic, with arrays, because if you have a python
list of lists, like [[1, 2]], it's not immediately clear if that's a
one-dimensional array of tuples, or two-dimensional array of integers.
Then again, we do have the type definitions available. So is it really
ambiguous?
The string and dict representations don't have that ambiguity at all, so
I don't see why we wouldn't support those, at least.
- Heikki
On 9/29/16 1:51 PM, Heikki Linnakangas wrote:
Jim, I was confused, but you agreed with me. Were you also confused, or
am I missing something?
I was confused by inputs:
CREATE FUNCTION repr(i foo[]) RETURNS text LANGUAGE plpythonu AS
$$return repr(i)$$;
select repr(array[row(1,2)::foo, row(3,4)::foo]);
repr
--------------------
['(1,2)', '(3,4)']
(1 row)
(in ipython...)
In [1]: i=['(1,2)', '(3,4)']
In [2]: type(i)
Out[2]: list
In [3]: type(i[0])
Out[3]: str
I wonder if your examples work only
Now, back to multi-dimensional arrays. I can see that the Sequence
representation is problematic, with arrays, because if you have a python
list of lists, like [[1, 2]], it's not immediately clear if that's a
one-dimensional array of tuples, or two-dimensional array of integers.
Then again, we do have the type definitions available. So is it really
ambiguous?
[[1,2]] is a list of lists...
In [4]: b=[[1,2]]
In [5]: type(b)
Out[5]: list
In [6]: type(b[0])
Out[6]: list
If you want a list of tuples...
In [7]: c=[(1,2)]
In [8]: type(c)
Out[8]: list
In [9]: type(c[0])
Out[9]: tuple
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
On Sat, Oct 1, 2016 at 8:45 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 9/29/16 1:51 PM, Heikki Linnakangas wrote:
Jim, I was confused, but you agreed with me. Were you also confused, or
am I missing something?
I was confused by inputs:
I have marked the patch as returned with feedback. Or Heikki, do you
plan on looking at it more and commit soon?
--
Michael
On 10/01/2016 02:45 AM, Jim Nasby wrote:
On 9/29/16 1:51 PM, Heikki Linnakangas wrote:
Now, back to multi-dimensional arrays. I can see that the Sequence
representation is problematic, with arrays, because if you have a python
list of lists, like [[1, 2]], it's not immediately clear if that's a
one-dimensional array of tuples, or two-dimensional array of integers.
Then again, we do have the type definitions available. So is it really
ambiguous?
[[1,2]] is a list of lists...
In [4]: b=[[1,2]]
In [5]: type(b)
Out[5]: list
In [6]: type(b[0])
Out[6]: list
If you want a list of tuples...
In [7]: c=[(1,2)]
In [8]: type(c)
Out[8]: list
In [9]: type(c[0])
Out[9]: tuple
Hmm, so we would start to treat lists and tuples differently? A Python
list would be converted into an array, and a Python tuple would be
converted into a composite type. That does make a lot of sense. The only
problem is that it's not backwards-compatible. A PL/python function that
returns an SQL array of rows, and does that by returning Python list of
lists, it would start failing.
I think we should bite the bullet and do that anyway. As long as it's
clearly documented, and the error message you get contains a clear hint
on how to fix it, I don't think it would be too painful to adjust
existing applications.
We could continue to accept a Python list for a plain composite type,
this would only affect arrays of composite types.
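A minimal sketch of the list-versus-tuple rule discussed above, written as plain Python rather than as PL/Python internals. The helper name `describe` is hypothetical, and the sketch assumes a rectangular value whose shape can be read from the first element at each level.

```python
def describe(value):
    """Return (array_dims, kind) for a value under the list/tuple rule.

    Lists are peeled off as array dimensions. The first non-list value
    found decides whether the array element is a composite (tuple) or a
    scalar. Only the first element at each level is inspected.
    """
    dims = []
    while isinstance(value, list) and value:
        dims.append(len(value))
        value = value[0]
    if isinstance(value, list):
        kind = "empty"          # an empty list gives no hint about the element
    elif isinstance(value, tuple):
        kind = "composite"
    else:
        kind = "scalar"
    return dims, kind


print(describe([[1, 2]]))   # ([1, 2], 'scalar'): a 1x2 array of integers
print(describe([(1, 2)]))   # ([1], 'composite'): a 1-element array of rows
```

Under this reading, a function that returns an array of rows as a list of lists would be treated as a two-dimensional array of scalars, which is the backwards-compatibility problem described above.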
I don't use PL/python much myself, so I don't feel qualified to make the
call, though. Any 3rd opinions?
- Heikki
2016-10-10 12:31 GMT+02:00 Heikki Linnakangas <hlinnaka@iki.fi>:
On 10/01/2016 02:45 AM, Jim Nasby wrote:
On 9/29/16 1:51 PM, Heikki Linnakangas wrote:
Now, back to multi-dimensional arrays. I can see that the Sequence
representation is problematic, with arrays, because if you have a python
list of lists, like [[1, 2]], it's not immediately clear if that's a
one-dimensional array of tuples, or two-dimensional array of integers.
Then again, we do have the type definitions available. So is it really
ambiguous?
[[1,2]] is a list of lists...
In [4]: b=[[1,2]]
In [5]: type(b)
Out[5]: list
In [6]: type(b[0])
Out[6]: list
If you want a list of tuples...
In [7]: c=[(1,2)]
In [8]: type(c)
Out[8]: list
In [9]: type(c[0])
Out[9]: tuple
Hmm, so we would start to treat lists and tuples differently? A Python
list would be converted into an array, and a Python tuple would be
converted into a composite type. That does make a lot of sense. The only
problem is that it's not backwards-compatible. A PL/python function that
returns an SQL array of rows, and does that by returning Python list of
lists, it would start failing.
Isn't it possible to make the decision at the last moment, at the
PL/Postgres interface? The expected type should be known there.
Regards
Pavel
I think we should bite the bullet and do that anyway. As long as it's
clearly documented, and the error message you get contains a clear hint on
how to fix it, I don't think it would be too painful to adjust existing
applications.
We could continue to accept a Python list for a plain composite type, this
would only affect arrays of composite types.
I don't use PL/python much myself, so I don't feel qualified to make the
call, though. Any 3rd opinions?
- Heikki
On 10 October 2016 at 13:42, Pavel Stehule <pavel.stehule@gmail.com> wrote:
2016-10-10 12:31 GMT+02:00 Heikki Linnakangas <hlinnaka@iki.fi>:
On 10/01/2016 02:45 AM, Jim Nasby wrote:
On 9/29/16 1:51 PM, Heikki Linnakangas wrote:
Now, back to multi-dimensional arrays. I can see that the Sequence
representation is problematic, with arrays, because if you have a python
list of lists, like [[1, 2]], it's not immediately clear if that's a
one-dimensional array of tuples, or two-dimensional array of integers.
Then again, we do have the type definitions available. So is it really
ambiguous?
[[1,2]] is a list of lists...
In [4]: b=[[1,2]]
In [5]: type(b)
Out[5]: list
In [6]: type(b[0])
Out[6]: list
If you want a list of tuples...
In [7]: c=[(1,2)]
In [8]: type(c)
Out[8]: list
In [9]: type(c[0])
Out[9]: tuple
Hmm, so we would start to treat lists and tuples differently? A Python
list would be converted into an array, and a Python tuple would be
converted into a composite type. That does make a lot of sense. The only
problem is that it's not backwards-compatible. A PL/python function that
returns an SQL array of rows, and does that by returning Python list of
lists, it would start failing.
Isn't it possible to make the decision at the last moment, at the
PL/Postgres interface? The expected type should be known there.
Regards
Pavel
I think we should bite the bullet and do that anyway. As long as it's
clearly documented, and the error message you get contains a clear hint on
how to fix it, I don't think it would be too painful to adjust existing
applications.
We could continue to accept a Python list for a plain composite type,
this would only affect arrays of composite types.
I don't use PL/python much myself, so I don't feel qualified to make the
call, though. Any 3rd opinions?
Can't you determine the correct output based on the function's output
definition?
For instance, if the function output was an array type, then we would return
the list as an array;
if the function output was a set of then we return tuples?
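One way to picture this suggestion, as a rough sketch only: the declared result type decides how the same Python value is read. The function name `apply_declared_type` and the `element_is_composite` flag are hypothetical stand-ins for what the function's declared type would provide; they are not PL/Python APIs. The sketch assumes that no member of the composite type is itself an array.

```python
def apply_declared_type(value, element_is_composite):
    """Reshape a nested list according to the declared element type.

    element_is_composite stands in for information taken from the
    function's declared result type (hypothetical flag).
    """
    if not isinstance(value, list):
        return value
    innermost = not any(isinstance(item, list) for item in value)
    if element_is_composite and innermost:
        # The innermost list is read as one row of the composite type.
        return tuple(value)
    return [apply_declared_type(item, element_is_composite) for item in value]


# Declared as int4[]: a 1x2 two-dimensional array of integers.
print(apply_declared_type([[1, 2]], element_is_composite=False))  # [[1, 2]]
# Declared as foo[] with foo(a int, b int): a one-element array of rows.
print(apply_declared_type([[1, 2]], element_is_composite=True))   # [(1, 2)]
```

The "innermost list is a row" rule is what makes this work, and it only holds under the stated assumption about the composite type's members.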
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
On 10/10/2016 08:42 PM, Pavel Stehule wrote:
2016-10-10 12:31 GMT+02:00 Heikki Linnakangas <hlinnaka@iki.fi>:
On 10/01/2016 02:45 AM, Jim Nasby wrote:
On 9/29/16 1:51 PM, Heikki Linnakangas wrote:
Now, back to multi-dimensional arrays. I can see that the Sequence
representation is problematic, with arrays, because if you have a python
list of lists, like [[1, 2]], it's not immediately clear if that's a
one-dimensional array of tuples, or two-dimensional array of integers.
Then again, we do have the type definitions available. So is it really
ambiguous?
[[1,2]] is a list of lists...
In [4]: b=[[1,2]]
In [5]: type(b)
Out[5]: list
In [6]: type(b[0])
Out[6]: list
If you want a list of tuples...
In [7]: c=[(1,2)]
In [8]: type(c)
Out[8]: list
In [9]: type(c[0])
Out[9]: tuple
Hmm, so we would start to treat lists and tuples differently? A Python
list would be converted into an array, and a Python tuple would be
converted into a composite type. That does make a lot of sense. The only
problem is that it's not backwards-compatible. A PL/python function that
returns an SQL array of rows, and does that by returning Python list of
lists, it would start failing.
Isn't it possible to make the decision at the last moment, at the
PL/Postgres interface? The expected type should be known there.
Unfortunately there are cases that are fundamentally ambiguous.
create type comptype as (intarray int[]);
create function array_return() returns comptype[] as $$
return [[[[1]]]];
$$ language plpython;
What does the function return? It could be a two-dimensional array of
comptype, with a one-dimensional intarray, or a one-dimensional array of
comptype, with a two-dimensional intarray.
We could resolve it for simpler cases, but not the general case. The
simple cases would probably cover most things people do in practice. But
if the distinction between a tuple and a list feels natural to Python
programmers, I think it would be more clear in the long run to have
people adjust their applications.
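The ambiguity can be made concrete with a small sketch in plain Python. The helper names below are hypothetical and the code is not part of PL/Python; it assumes non-empty, rectangular values and a composite type with a single array member, as in the comptype example.

```python
def nesting_depth(value):
    """Count how many list/tuple levels wrap the first scalar."""
    depth = 0
    while isinstance(value, (list, tuple)) and value:
        depth += 1
        value = value[0]
    return depth


def interpretations(value):
    """List every (array_dims, member_dims) split for comptype[].

    One nesting level is used up by the composite itself, and both the
    outer array and the int[] member need at least one dimension.
    """
    depth = nesting_depth(value)
    return [(k, depth - 1 - k) for k in range(1, depth - 1)]


def split_at_tuple(value):
    """With tuples marking composites, the split is read off directly."""
    array_dims = 0
    while isinstance(value, list):
        array_dims += 1
        value = value[0]
    if not isinstance(value, tuple):
        raise ValueError("expected a tuple for the composite value")
    member = value[0]
    member_dims = 0
    while isinstance(member, list):
        member_dims += 1
        member = member[0]
    return array_dims, member_dims


print(interpretations([[[[1]]]]))    # [(1, 2), (2, 1)]: two valid readings
print(split_at_tuple([[([1],)]]))    # (2, 1)
print(split_at_tuple([([[1]],)]))    # (1, 2)
```

With lists only, both readings fit the declared type; once the composite is written as a tuple, each value has exactly one reading.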
- Heikki
2016-10-11 7:49 GMT+02:00 Heikki Linnakangas <hlinnaka@iki.fi>:
On 10/10/2016 08:42 PM, Pavel Stehule wrote:
2016-10-10 12:31 GMT+02:00 Heikki Linnakangas <hlinnaka@iki.fi>:
On 10/01/2016 02:45 AM, Jim Nasby wrote:
On 9/29/16 1:51 PM, Heikki Linnakangas wrote:
Now, back to multi-dimensional arrays. I can see that the Sequence
representation is problematic, with arrays, because if you have a python
list of lists, like [[1, 2]], it's not immediately clear if that's a
one-dimensional array of tuples, or two-dimensional array of integers.
Then again, we do have the type definitions available. So is it really
ambiguous?
[[1,2]] is a list of lists...
In [4]: b=[[1,2]]
In [5]: type(b)
Out[5]: list
In [6]: type(b[0])
Out[6]: list
If you want a list of tuples...
In [7]: c=[(1,2)]
In [8]: type(c)
Out[8]: list
In [9]: type(c[0])
Out[9]: tuple
Hmm, so we would start to treat lists and tuples differently? A Python
list would be converted into an array, and a Python tuple would be
converted into a composite type. That does make a lot of sense. The only
problem is that it's not backwards-compatible. A PL/python function that
returns an SQL array of rows, and does that by returning Python list of
lists, it would start failing.
Isn't it possible to make the decision at the last moment, at the
PL/Postgres interface? The expected type should be known there.
Unfortunately there are cases that are fundamentally ambiguous.
create type comptype as (intarray int[]);
create function array_return() returns comptype[] as $$
return [[[[1]]]];
$$ language plpython;
What does the function return? It could be a two-dimensional array of
comptype, with a one-dimensional intarray, or a one-dimensional array of
comptype, with a two-dimensional intarray.
We could resolve it for simpler cases, but not the general case. The
simple cases would probably cover most things people do in practice. But if
the distinction between a tuple and a list feels natural to Python
programmers, I think it would be more clear in the long run to have people
adjust their applications.
I agree. The distinction is natural, and it is our problem that we don't
distinguish them strongly.
Regards
Pavel