Number of dimensions of an array parameter

Started by Thomas Hallgrenover 19 years ago12 messages
#1Thomas Hallgren
thomas@tada.se

I can create a function that takes a two dimension int array:

CREATE FUNCTION twodims(int[][]) RETURNS void AS ...

but there's nothing stopping me from calling this function with an
arbitrary number of dimensions on the array.

I'd like to map a parameter like the one above to a corresponding
representation in Java (it would be int[][] there too). As it turns out,
I can't do that. PostgreSQL will not store any information that can tell
me how many dimensions that where used in the declaration, i.e. it's
impossible to write a language VALIDATOR that, based on the information
in pg_proc, builds a signature where the number of dimensions is reflected.

This leaves me with two choices:

Only allow arrays with one dimension unless the parameter is of a domain
type (domains are apparently stored with the actual number of
dimensions). Any call that uses an array parameter with more then one
dimension will yield an exception.
--OR--
Always map to Object[] instead of mapping to the correct type, . This
will work since an array in Java is also an Object and all primitive
types can be represented as objects (i.e. int can be a
java.lang.Integer). The strong typing and the ability to use primitives
are lost however.

I'm leaning towards #1 and hoping that PostgreSQL will enhance the
parameter type declarations to include the dimensions in future releases.

... After some more testing ...

Unfortunately, I run into problems even when I use domains. Consider the
following:

thhal=# CREATE DOMAIN twodims as int[][];
CREATE DOMAIN
thhal=# SELECT typndims FROM pg_type WHERE typname = 'twodims';
typndims
----------
2
(1 row)

thhal=# SELECT array_dims('{{{1,2,3},{3,4,3}},{{5,3,2},{9,9,9}}}'::twodims);
array_dims
-----------------
[1:2][1:2][1:3]
(1 row)

IMO, there is something seriously wrong here. Clearly the number of
dimensions is a property of the type. Any array with a different number
of dimensions should yield an error or at least be coerced into the
right number of dimensions.

Kind Regards,
Thomas Hallgren

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Hallgren (#1)
Re: Number of dimensions of an array parameter

Thomas Hallgren <thomas@tada.se> writes:

Only allow arrays with one dimension unless the parameter is of a domain
type (domains are apparently stored with the actual number of
dimensions).

No, they don't enforce dimensionality any more than ordinary array
columns do. typndims and attndims are both effectively just booleans:
is it an array or not?

IMO, there is something seriously wrong here. Clearly the number of
dimensions is a property of the type.

[ shrug... ] That's debatable. You could just as well argue that the
exact array size should be enforced by the type system.

regards, tom lane

#3Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Thomas Hallgren (#1)
Re: Number of dimensions of an array parameter

Thomas Hallgren wrote:

I can create a function that takes a two dimension int array:

CREATE FUNCTION twodims(int[][]) RETURNS void AS ...

but there's nothing stopping me from calling this function with an
arbitrary number of dimensions on the array.

I'd like to map a parameter like the one above to a corresponding
representation in Java (it would be int[][] there too). As it turns out,
I can't do that. PostgreSQL will not store any information that can tell
me how many dimensions that where used in the declaration, i.e. it's
impossible to write a language VALIDATOR that, based on the information
in pg_proc, builds a signature where the number of dimensions is reflected.

This leaves me with two choices:

Only allow arrays with one dimension unless the parameter is of a domain
type (domains are apparently stored with the actual number of
dimensions). Any call that uses an array parameter with more then one
dimension will yield an exception.
--OR--
Always map to Object[] instead of mapping to the correct type, . This
will work since an array in Java is also an Object and all primitive
types can be represented as objects (i.e. int can be a
java.lang.Integer). The strong typing and the ability to use primitives
are lost however.

I'm leaning towards #1 and hoping that PostgreSQL will enhance the
parameter type declarations to include the dimensions in future releases.

... After some more testing ...

Unfortunately, I run into problems even when I use domains. Consider the
following:

thhal=# CREATE DOMAIN twodims as int[][];
CREATE DOMAIN
thhal=# SELECT typndims FROM pg_type WHERE typname = 'twodims';
typndims
----------
2
(1 row)

thhal=# SELECT
array_dims('{{{1,2,3},{3,4,3}},{{5,3,2},{9,9,9}}}'::twodims);
array_dims -----------------
[1:2][1:2][1:3]
(1 row)

IMO, there is something seriously wrong here. Clearly the number of
dimensions is a property of the type. Any array with a different number
of dimensions should yield an error or at least be coerced into the
right number of dimensions.

while it would be nice to improve that - it is actually documented quite
clearly.

http://www.postgresql.org/docs/current/static/arrays.html has:

"However, the current implementation does not enforce the array size
limits � the behavior is the same as for arrays of unspecified length.

Actually, the current implementation does not enforce the declared
number of dimensions either. Arrays of a particular element type are all
considered to be of the same type, regardless of size or number of
dimensions. So, declaring number of dimensions or sizes in CREATE TABLE
is simply documentation, it does not affect run-time behavior. "

Stefan

#4Thomas Hallgren
thomas@tada.se
In reply to: Stefan Kaltenbrunner (#3)
Re: Number of dimensions of an array parameter

Stefan Kaltenbrunner wrote:

while it would be nice to improve that - it is actually documented quite
clearly.

http://www.postgresql.org/docs/current/static/arrays.html has:

"However, the current implementation does not enforce the array size
limits � the behavior is the same as for arrays of unspecified length.

Actually, the current implementation does not enforce the declared
number of dimensions either. Arrays of a particular element type are all
considered to be of the same type, regardless of size or number of
dimensions. So, declaring number of dimensions or sizes in CREATE TABLE
is simply documentation, it does not affect run-time behavior. "

A documented flaw is much better than an undocumented one but it's still
a flaw, and a pretty bad one at that. It's like having a compiler that
doesn't complain when you define a C-function that takes an int** and
then pass an int*.

Would it be hard to enforce a real check? The implementation could use
GUC settings like 'enforce_array_dimensions' and 'enforce_array_lengths'
that could be set to false for the legacy implementations that rely on
the current behavior. I know Tom added the ability to have NULL values
in the arrays. Perhaps now is the time to improve the type semantics as
well?

Regards,
Thomas Hallgren

#5Rod Taylor
pg@rbt.ca
In reply to: Thomas Hallgren (#1)
Re: Number of dimensions of an array parameter

thhal=# CREATE DOMAIN twodims as int[][];
CREATE DOMAIN

While still not perfect, you can use a CHECK constraint on the domain to
enforce dimension.

It's not perfect because domain constraints are not enforced in all
locations in versions earlier than 8.2. Adding extra explicit casts can
often work around that though.

ru=# create domain twodims as int[][] check(array_dims(value) =
'[1:2][1:2]');

ru=# select
array_dims('{{{1,2},{3,4}},{{5,3},{9,9}}}'::twodims);
ERROR: value for domain twodims violates check constraint
"twodims_check"

ru=# select array_dims('{{1,2},{3,4}}'::twodims);
array_dims
------------
[1:2][1:2]
(1 row)

If you want to be fancy, use something like this:

check(array_dims(value) ~ '^[1:\\d+][1:\\d+]$');

--

#6Martijn van Oosterhout
kleptog@svana.org
In reply to: Thomas Hallgren (#4)
Re: Number of dimensions of an array parameter

On Mon, May 08, 2006 at 07:31:14PM +0200, Thomas Hallgren wrote:

Would it be hard to enforce a real check? The implementation could use
GUC settings like 'enforce_array_dimensions' and 'enforce_array_lengths'
that could be set to false for the legacy implementations that rely on
the current behavior. I know Tom added the ability to have NULL values
in the arrays. Perhaps now is the time to improve the type semantics as
well?

The big probem is where do you store the number of declared dimensions?
It's not stored anywhere, so there's nowhere to check against either.
If we can fix that first we might get to the checking part.

test=# create function foo(int[][]) returns int4 as 'select 1' language sql;
CREATE FUNCTION
test=# \df foo
List of functions
Result data type | Schema | Name | Argument data types
------------------+--------+------+---------------------
integer | public | foo | integer[]
(1 row)

Have a ncie day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

From each according to his ability. To each according to his ability to litigate.

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Hallgren (#4)
Re: Number of dimensions of an array parameter

Thomas Hallgren <thomas@tada.se> writes:

Would it be hard to enforce a real check? The implementation could use
GUC settings like 'enforce_array_dimensions' and 'enforce_array_lengths'
that could be set to false for the legacy implementations that rely on
the current behavior.

The fact that it doesn't exactly match Java semantics does not make it
"legacy behavior". I don't agree that it's a bug; I think it's a
feature, precisely because many functions can work on arrays of
different dimensions. Why should we change to make PL/Java happier,
when it will move us further away from the semantics of, say, PL/R?

I think reasonable choices for PL/Java would be to reject
multidimensional array arguments, or to silently ignore the
dimensionality and treat the data as 1-D in storage order
(as I think plperl for instance already does).

regards, tom lane

#8Thomas Hallgren
thomas@tada.se
In reply to: Tom Lane (#7)
Re: Number of dimensions of an array parameter

Tom Lane wrote:

Thomas Hallgren <thomas@tada.se> writes:

Would it be hard to enforce a real check? The implementation could use
GUC settings like 'enforce_array_dimensions' and 'enforce_array_lengths'
that could be set to false for the legacy implementations that rely on
the current behavior.

The fact that it doesn't exactly match Java semantics does not make it
"legacy behavior". I don't agree that it's a bug; I think it's a
feature, precisely because many functions can work on arrays of
different dimensions. Why should we change to make PL/Java happier,
when it will move us further away from the semantics of, say, PL/R?

Would it really? The way I see it, the choice of language is irrelevant.
Either you support dimensions or you don't. The way PostgreSQL does it,
you get the impression that it is supported while in fact it's not. I
can't see how anyone would consider that a feature. If you want the
ability to use an arbitrary number of dimensions, then you should have a
syntax that supports that particular use-case. An int[][] cannot be
anything but a two dimensional int array. Not in my book anyway. That
opinion has nothing to do with Java.

I think reasonable choices for PL/Java would be to reject
multidimensional array arguments, or to silently ignore the
dimensionality and treat the data as 1-D in storage order
(as I think plperl for instance already does).

I agree. That's the way I'll do it.

Regards,
Thomas Hallgren

#9Thomas Hallgren
thomas@tada.se
In reply to: Martijn van Oosterhout (#6)
Re: Number of dimensions of an array parameter

Martijn van Oosterhout wrote:

On Mon, May 08, 2006 at 07:31:14PM +0200, Thomas Hallgren wrote:

Would it be hard to enforce a real check? The implementation could use
GUC settings like 'enforce_array_dimensions' and 'enforce_array_lengths'
that could be set to false for the legacy implementations that rely on
the current behavior. I know Tom added the ability to have NULL values
in the arrays. Perhaps now is the time to improve the type semantics as
well?

The big probem is where do you store the number of declared dimensions?
It's not stored anywhere, so there's nowhere to check against either.
If we can fix that first we might get to the checking part.

test=# create function foo(int[][]) returns int4 as 'select 1' language sql;
CREATE FUNCTION
test=# \df foo
List of functions
Result data type | Schema | Name | Argument data types
------------------+--------+------+---------------------
integer | public | foo | integer[]
(1 row)

Let each type have it's own entry in pg_type. I.e. let the int[] and
int[][] be two distinct types (like int and int[] already are). In
addition, perhaps introduce a new syntax that denotes 'arbitrary number
of dimensions' and let that too be a distinct type.

Regards,
Thomas Hallgren

#10Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Stefan Kaltenbrunner (#3)
1 attachment(s)
Re: Number of dimensions of an array parameter

Documentation updated to mention "dimmensions" are not enforced.

---------------------------------------------------------------------------

Stefan Kaltenbrunner wrote:

Thomas Hallgren wrote:

I can create a function that takes a two dimension int array:

CREATE FUNCTION twodims(int[][]) RETURNS void AS ...

but there's nothing stopping me from calling this function with an
arbitrary number of dimensions on the array.

I'd like to map a parameter like the one above to a corresponding
representation in Java (it would be int[][] there too). As it turns out,
I can't do that. PostgreSQL will not store any information that can tell
me how many dimensions that where used in the declaration, i.e. it's
impossible to write a language VALIDATOR that, based on the information
in pg_proc, builds a signature where the number of dimensions is reflected.

This leaves me with two choices:

Only allow arrays with one dimension unless the parameter is of a domain
type (domains are apparently stored with the actual number of
dimensions). Any call that uses an array parameter with more then one
dimension will yield an exception.
--OR--
Always map to Object[] instead of mapping to the correct type, . This
will work since an array in Java is also an Object and all primitive
types can be represented as objects (i.e. int can be a
java.lang.Integer). The strong typing and the ability to use primitives
are lost however.

I'm leaning towards #1 and hoping that PostgreSQL will enhance the
parameter type declarations to include the dimensions in future releases.

... After some more testing ...

Unfortunately, I run into problems even when I use domains. Consider the
following:

thhal=# CREATE DOMAIN twodims as int[][];
CREATE DOMAIN
thhal=# SELECT typndims FROM pg_type WHERE typname = 'twodims';
typndims
----------
2
(1 row)

thhal=# SELECT
array_dims('{{{1,2,3},{3,4,3}},{{5,3,2},{9,9,9}}}'::twodims);
array_dims -----------------
[1:2][1:2][1:3]
(1 row)

IMO, there is something seriously wrong here. Clearly the number of
dimensions is a property of the type. Any array with a different number
of dimensions should yield an error or at least be coerced into the
right number of dimensions.

while it would be nice to improve that - it is actually documented quite
clearly.

http://www.postgresql.org/docs/current/static/arrays.html has:

"However, the current implementation does not enforce the array size
limits ? the behavior is the same as for arrays of unspecified length.

Actually, the current implementation does not enforce the declared
number of dimensions either. Arrays of a particular element type are all
considered to be of the same type, regardless of size or number of
dimensions. So, declaring number of dimensions or sizes in CREATE TABLE
is simply documentation, it does not affect run-time behavior. "

Stefan

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachments:

/bjm//difftext/x-diffDownload
Index: doc/src/sgml/array.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/array.sgml,v
retrieving revision 1.49
diff -c -c -r1.49 array.sgml
*** doc/src/sgml/array.sgml	23 Apr 2006 03:39:47 -0000	1.49
--- doc/src/sgml/array.sgml	9 May 2006 16:30:24 -0000
***************
*** 49,55 ****
  </programlisting>
  
    However, the current implementation does not enforce the array size
!   limits &mdash; the behavior is the same as for arrays of unspecified
    length.
   </para>
  
--- 49,56 ----
  </programlisting>
  
    However, the current implementation does not enforce the array size
!   or dimmension limits &mdash; the behavior is the same as for 
! arrays of unspecified
    length.
   </para>
  
#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#10)
Re: Number of dimensions of an array parameter

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Documentation updated to mention "dimmensions" are not enforced.

This patch seems entirely pointless, since the point is already made
(with correct spelling even) in the very next line.

regards, tom lane

#12Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#11)
Re: Number of dimensions of an array parameter

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Documentation updated to mention "dimmensions" are not enforced.

This patch seems entirely pointless, since the point is already made
(with correct spelling even) in the very next line.

Thanks, reverted.

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +