So what's an "empty" array anyway?

Started by Tom Laneabout 17 years ago25 messages
#1Tom Lane
tgl@sss.pgh.pa.us

Currently, the constructs
'{}'::arraytype
ARRAY[]::arraytype
return zero-dimensional arrays, as does the underlying function
construct_empty_array(). I can't immediately find any way at SQL
level to produce an empty array with one or more dimensions.
However, construct_array and construct_md_array will happily build
zero-length arrays of dimension 1 or higher, leading to weirdnesses
such as illustrated here:
http://archives.postgresql.org/pgsql-general/2008-10/msg00915.php

Seems like we ought to clean this up. I'm not sure which way to jump
though: should we decree that arrays of no elements must always have
zero dimensions, or should we get rid of that and standardize on, say,
1-D array with lower bound 1 and upper bound 0?

A somewhat related issue that I noticed while poking at this is that
array_dims() returns NULL for a zero-dimension array. That seems a bit
bogus too; wouldn't an empty string be saner? Of course the issue
goes away if we get rid of zero-dimension arrays.

Thoughts?

regards, tom lane

#2Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#1)
Re: So what's an "empty" array anyway?

On Tue, Oct 21, 2008 at 1:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Currently, the constructs
'{}'::arraytype
ARRAY[]::arraytype
return zero-dimensional arrays, as does the underlying function
construct_empty_array(). I can't immediately find any way at SQL
level to produce an empty array with one or more dimensions.
However, construct_array and construct_md_array will happily build
zero-length arrays of dimension 1 or higher, leading to weirdnesses
such as illustrated here:
http://archives.postgresql.org/pgsql-general/2008-10/msg00915.php

Seems like we ought to clean this up. I'm not sure which way to jump
though: should we decree that arrays of no elements must always have
zero dimensions, or should we get rid of that and standardize on, say,
1-D array with lower bound 1 and upper bound 0?

A somewhat related issue that I noticed while poking at this is that
array_dims() returns NULL for a zero-dimension array. That seems a bit
bogus too; wouldn't an empty string be saner? Of course the issue
goes away if we get rid of zero-dimension arrays.

Thoughts?

This reminds me a little bit of the zero point polygon issue we
uncovered a while back. IMO, you shouldn't be able to create things
that are not possible at the sql level....it invariably leads to
errors. But why do you have to force empty arrays to 1 dims? This
seems like needless restriction.

ISTM this is the way it should work from SQL level:

'{}'::int[] empty 1d
'{{},{}}'::int[] :: empty 2d

If you dump zero dimension arrays, then the problem about what to do
with array_dims goes away. Otherwise, I'd make:

''::int[] as empty 0d array

merlin

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#1)
Re: So what's an "empty" array anyway?

2008/10/21 Tom Lane <tgl@sss.pgh.pa.us>:

Currently, the constructs
'{}'::arraytype
ARRAY[]::arraytype
return zero-dimensional arrays, as does the underlying function
construct_empty_array(). I can't immediately find any way at SQL
level to produce an empty array with one or more dimensions.
However, construct_array and construct_md_array will happily build
zero-length arrays of dimension 1 or higher, leading to weirdnesses
such as illustrated here:
http://archives.postgresql.org/pgsql-general/2008-10/msg00915.php

Seems like we ought to clean this up. I'm not sure which way to jump
though: should we decree that arrays of no elements must always have
zero dimensions, or should we get rid of that and standardize on, say,
1-D array with lower bound 1 and upper bound 0?

I believe so zero dimensions for empty array should be more clean and
more simple. This solve question about array_dims too. But this empty
dimensionless array should be simple cast to dimensional empty array.

array_ndims(array[]) --> 0
array[1,2] || array[] = array[1,2]
array[[1,2],[1,3]] || array[] = array[[1,2],[1,3]]

or
array[1,2] = array[] -> false
array[[1,2],[1,3]] = array[] -> false

Regards
Pavel Stehule

Show quoted text

A somewhat related issue that I noticed while poking at this is that
array_dims() returns NULL for a zero-dimension array. That seems a bit
bogus too; wouldn't an empty string be saner? Of course the issue
goes away if we get rid of zero-dimension arrays.

Thoughts?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#2)
Re: So what's an "empty" array anyway?

"Merlin Moncure" <mmoncure@gmail.com> writes:

ISTM this is the way it should work from SQL level:

'{}'::int[] empty 1d
'{{},{}}'::int[] :: empty 2d

The first one looks okay, but ISTM the second one is not describing
an "empty" array: the upper dimension is of length 2. In particular
I think that under your proposal array_dims() would probably yield
these results:

[1:0]
[1:2][1:0]

and all of these would be different:

'{{}}'::int[] [1:1][1:0]
'{{},{}}'::int[] [1:2][1:0]
'{{},{},{}}'::int[] [1:3][1:0]

Maybe this is okay but it feels a bit weird.

If you dump zero dimension arrays, then the problem about what to do
with array_dims goes away.

I'm not against dropping zero-dimension arrays ...

regards, tom lane

#5Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#4)
Re: So what's an "empty" array anyway?

On Tue, Oct 21, 2008 at 2:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Merlin Moncure" <mmoncure@gmail.com> writes:

ISTM this is the way it should work from SQL level:

'{}'::int[] empty 1d
'{{},{}}'::int[] :: empty 2d

The first one looks okay, but ISTM the second one is not describing
an "empty" array: the upper dimension is of length 2. In particular
I think that under your proposal array_dims() would probably yield
these results:

[1:0]
[1:2][1:0]

and all of these would be different:

'{{}}'::int[] [1:1][1:0]
'{{},{}}'::int[] [1:2][1:0]
'{{},{},{}}'::int[] [1:3][1:0]

Maybe this is okay but it feels a bit weird.

agreed...you're right...and if this flies, you are still restricted to
making empty arrays for 1d only, so in this case I guess that's where
the array should be locked down.

If you dump zero dimension arrays, then the problem about what to do
with array_dims goes away.

I'm not against dropping zero-dimension arrays ...

yup.

merlin

#6Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#1)
Re: So what's an "empty" array anyway?

On Tue, 2008-10-21 at 13:50 -0400, Tom Lane wrote:

Currently, the constructs
'{}'::arraytype
ARRAY[]::arraytype
return zero-dimensional arrays, as does the underlying function
construct_empty_array(). I can't immediately find any way at SQL
level to produce an empty array with one or more dimensions.
However, construct_array and construct_md_array will happily build
zero-length arrays of dimension 1 or higher, leading to weirdnesses
such as illustrated here:
http://archives.postgresql.org/pgsql-general/2008-10/msg00915.php

Seems like we ought to clean this up. I'm not sure which way to jump
though: should we decree that arrays of no elements must always have
zero dimensions, or should we get rid of that and standardize on, say,
1-D array with lower bound 1 and upper bound 0?

A somewhat related issue that I noticed while poking at this is that
array_dims() returns NULL for a zero-dimension array. That seems a bit
bogus too; wouldn't an empty string be saner? Of course the issue
goes away if we get rid of zero-dimension arrays.

Please remove zero-dimension arrays.

The number of dimensions of an empty array really ought to be NULL, or
if we fix it to be non-NULL then 1+. Zero just makes a weird case for no
reason. An empty string only makes sense in the context of that
particular function, it doesn't really help with other maths.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#7David E. Wheeler
david@kineticode.com
In reply to: Simon Riggs (#6)
Re: So what's an "empty" array anyway?

On Oct 21, 2008, at 12:08, Simon Riggs wrote:

Please remove zero-dimension arrays.

The number of dimensions of an empty array really ought to be NULL, or
if we fix it to be non-NULL then 1+. Zero just makes a weird case
for no
reason. An empty string only makes sense in the context of that
particular function, it doesn't really help with other maths.

If we got rid of zero dimension arrays, how would I declare a new
empty array in a PL/pgSQL function?

Best,

David

#8Andrew Chernow
ac@esilo.com
In reply to: David E. Wheeler (#7)
Re: So what's an "empty" array anyway?

David E. Wheeler wrote:

On Oct 21, 2008, at 12:08, Simon Riggs wrote:

If we got rid of zero dimension arrays, how would I declare a new empty
array in a PL/pgSQL function?

Why would you want to do that? Is there a use case for that?

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#9David E. Wheeler
david@kineticode.com
In reply to: Andrew Chernow (#8)
Re: So what's an "empty" array anyway?

On Oct 21, 2008, at 13:00, Andrew Chernow wrote:

On Oct 21, 2008, at 12:08, Simon Riggs wrote:
If we got rid of zero dimension arrays, how would I declare a new
empty array in a PL/pgSQL function?

Why would you want to do that? Is there a use case for that?

Perhaps not. In older versions of PostgreSQL, I *had* to initialize an
empty array in a DECLARE block or else I couldn't use it with
array_append() to collect things in an array in a loop. I don't have
to do so 8.3, but I keep it that way in some modules for compatibility
reasons.

But since that was perhaps an issue with older versions of PostgreSQL
that has since been addressed, I guess I just think too much like a
Perl hacker, where I can add things to an array as I need to. That's
different from SQL arrays, where you can't add a value to an existing
array, create a new array from an old one plus a new value.

So I guess I don't *have* to have it, but for compatibility with older
versions of PostgreSQL, I think they should be kept.

Best,

David

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#7)
Re: So what's an "empty" array anyway?

"David E. Wheeler" <david@kineticode.com> writes:

If we got rid of zero dimension arrays, how would I declare a new
empty array in a PL/pgSQL function?

Same as before, I think: initialize it to '{}'. What's at stake here
is exactly what does that notation mean ...

regards, tom lane

#11David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#10)
Re: So what's an "empty" array anyway?

On Oct 21, 2008, at 13:58, Tom Lane wrote:

If we got rid of zero dimension arrays, how would I declare a new
empty array in a PL/pgSQL function?

Same as before, I think: initialize it to '{}'. What's at stake here
is exactly what does that notation mean ...

An empty, single-dimension array. But I got the impression from Simon
that he thought it should be NULL.

Best,

David

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#11)
Re: So what's an "empty" array anyway?

"David E. Wheeler" <david@kineticode.com> writes:

On Oct 21, 2008, at 13:58, Tom Lane wrote:

Same as before, I think: initialize it to '{}'. What's at stake here
is exactly what does that notation mean ...

An empty, single-dimension array. But I got the impression from Simon
that he thought it should be NULL.

Well, we can't do that because it would clearly break too much existing
code. '{}' has got to result in something you can successfully
concatenate more elements to. But either the current behavior with
a zero-dimension array, or a one-dimensional length-zero array would
presumably work okay.

regards, tom lane

#13David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#12)
Re: So what's an "empty" array anyway?

On Oct 21, 2008, at 14:16, Tom Lane wrote:

Well, we can't do that because it would clearly break too much
existing
code. '{}' has got to result in something you can successfully
concatenate more elements to.

Right, that's what I was trying to day. Badly, I guess.

But either the current behavior with
a zero-dimension array, or a one-dimensional length-zero array would
presumably work okay.

Right, that sounds right to me.

Thanks,

David

#14Josh Berkus
josh@agliodbs.com
In reply to: David E. Wheeler (#11)
Re: So what's an "empty" array anyway?

An empty, single-dimension array. But I got the impression from Simon
that he thought it should be NULL.

I disagree with Simon *if* that's what he's saying. '{}' isn't equivalent
to NULL any more than 0 or '' is. NULL means "I don't know / Doesn't
apply" wheras '{}' means "purposefully left blank". It's a defect of the
Timestamp type (and a few others) that it doesn't have a standard "zero"
value -- the typical tri-value NULL problem.

I do agree that we ought to support multi-dimensional empty arrays for
consistency. However: is '{}' = '{}{}' or not?

--
--Josh

Josh Berkus
PostgreSQL
San Francisco

#15Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#1)
Re: So what's an "empty" array anyway?

Seems like we ought to clean this up. I'm not sure which way to jump
though: should we decree that arrays of no elements must always have
zero dimensions, or should we get rid of that and standardize on, say,
1-D array with lower bound 1 and upper bound 0?

Isn't the zero-dimensional array vaguely polymorphic? If '{}'::int[]
means a one-dimensional array, how do I create an empty
two-dimensional array onto which I can concatenate one-dimensional
arrays that are all of the same length? I don't necessarily object to
changing this, even if it breaks backward-compatibility, but there
should be SOME easy way to do it.

A somewhat related issue that I noticed while poking at this is that
array_dims() returns NULL for a zero-dimension array. That seems a bit
bogus too; wouldn't an empty string be saner? Of course the issue
goes away if we get rid of zero-dimension arrays.

Most all of the existing array functions contain identical checks for
ndims out of range (and 0 is considered out of range) and return NULL
in that case. This behavior doesn't appear to make a great deal of
sense to me in general. If these functions can be called with an
object that's not an array, then the check is not nearly strong enough
to prevent chaos; if they can't, the check is unnecessary. But maybe
I'm missing something?

In any event, the correct behavior for all of these functions on a
0-dimensional array should probably be reviewed, unless we eliminate
0-dimensional arrays.

...Robert

#16Simon Riggs
simon@2ndQuadrant.com
In reply to: David E. Wheeler (#11)
Re: So what's an "empty" array anyway?

On Tue, 2008-10-21 at 14:03 -0700, David E. Wheeler wrote:

On Oct 21, 2008, at 13:58, Tom Lane wrote:

If we got rid of zero dimension arrays, how would I declare a new
empty array in a PL/pgSQL function?

Same as before, I think: initialize it to '{}'. What's at stake here
is exactly what does that notation mean ...

An empty, single-dimension array. But I got the impression from Simon
that he thought it should be NULL.

I meant the dimension of {} should be NULL.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#17David E. Wheeler
david@kineticode.com
In reply to: Simon Riggs (#16)
Re: So what's an "empty" array anyway?

On Oct 22, 2008, at 00:40, Simon Riggs wrote:

An empty, single-dimension array. But I got the impression from Simon
that he thought it should be NULL.

I meant the dimension of {} should be NULL.

To me that's 0.

Best,

David

#18Josh Berkus
josh@agliodbs.com
In reply to: Simon Riggs (#16)
Re: So what's an "empty" array anyway?

Simon,

I meant the dimension of {} should be NULL.

Oh, so array_dims(1, '{}') IS NULL?

I'm not entirely satisfied with that, but other solutions are just as bad.

--Josh

#19Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#1)
Re: So what's an "empty" array anyway?

Tom Lane wrote:

Seems like we ought to clean this up. I'm not sure which way to jump
though: should we decree that arrays of no elements must always have
zero dimensions, or should we get rid of that and standardize on, say,
1-D array with lower bound 1 and upper bound 0?

It was pointed out to me today that a zero-dimensional matrix is a
scalar. This makes a bit of sense, if you say that

'{{56}}' is of type int[][], 2 dimensions
'{56}' is of type int[], 1 dimension
'56' is of type int, 0 dimensions

Notice that the number of brace pairs in the literal matches the number
of bracket pairs in the type declaration.

By that logic, '{}' has one dimension. I think this also works best in
practice, for example with array concatenation.

#20Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#19)
Re: So what's an "empty" array anyway?

It was pointed out to me today that a zero-dimensional matrix is a scalar.
This makes a bit of sense, if you say that

'{{56}}' is of type int[][], 2 dimensions
'{56}' is of type int[], 1 dimension
'56' is of type int, 0 dimensions

Notice that the number of brace pairs in the literal matches the number of
bracket pairs in the type declaration.

By that logic, '{}' has one dimension. I think this also works best in
practice, for example with array concatenation.

I think it's actually more correct to say that {} has an undefined
number of dimensions. The || operator is valid only if the dimensions
of the two arrays are equal (in which case it appends them) or if the
dimension of one of the arrays is one less than the other (in which
case, if legal, it treats the array of lesser dimension as a new
element to be added to the other array).

But {}, and ONLY {}, can be concatenated with an array of any number
of dimensions.

...Robert

#21Sam Mason
sam@samason.me.uk
In reply to: Peter Eisentraut (#19)
Re: So what's an "empty" array anyway?

On Wed, Nov 12, 2008 at 02:12:19PM +0200, Peter Eisentraut wrote:

Tom Lane wrote:

Seems like we ought to clean this up. I'm not sure which way to jump
though: should we decree that arrays of no elements must always have
zero dimensions, or should we get rid of that and standardize on, say,
1-D array with lower bound 1 and upper bound 0?

It was pointed out to me today that a zero-dimensional matrix is a
scalar. This makes a bit of sense, if you say that

'{{56}}' is of type int[][], 2 dimensions
'{56}' is of type int[], 1 dimension
'56' is of type int, 0 dimensions

Notice that the number of brace pairs in the literal matches the number
of bracket pairs in the type declaration.

By that logic, '{}' has one dimension. I think this also works best in
practice, for example with array concatenation.

I could argue a very convincing case that '{}' should be of *at least*
one dimension. The fact that you can concat it onto any other size
array suggests the current implementation of PG agrees---probably for
very different reasons though. '{{}}' would imply an array of at least
two dimensions, but doesn't appear to be currently supported.

Another weirdo is the fact that this works:

(ARRAY[ARRAY[3]])[1][1];

whereas this doesn't:

((ARRAY[ARRAY[3]])[1])[1];

which seems a little strange, although from a historical point of view
I can understand why it behaves the way it does. From the current
behavior, the following syntax would make more sense:

(ARRAY[ARRAY[3]])[1,1];

I.e. it's obvious that you're pulling stuff out of a two dimensional
array. With the current syntax, I'd expect to get slices of higher
dimensional arrays back:

(ARRAY[ARRAY[3,4]])[1];

would evaluate to:

ARRAY[3,4]

We then get onto issues of arrays currently having to be rectangular;
even though it's trivial to express things that don't want rectangular
arrays:

SELECT array(VALUES (ARRAY[1]),(ARRAY[2,3]));

I'd expect to get a 1D array of 1D arrays back from this---note that
this is distinct from a 2D array, which would indeed be rectangular and
wouldn't work at all. I'd expect the following to be invalid:

SELECT array(VALUES (ARRAY[1]),(ARRAY[ARRAY[2,3]]));

I could go on, but I'm not sure of any good ways of keeping this all
compatible with the current syntax or semantics.

Sam

#22Sam Mason
sam@samason.me.uk
In reply to: Robert Haas (#20)
Re: So what's an "empty" array anyway?

On Wed, Nov 12, 2008 at 09:46:09AM -0500, Robert Haas wrote:

The || operator is valid only if the dimensions
of the two arrays are equal (in which case it appends them) or if the
dimension of one of the arrays is one less than the other (in which
case, if legal, it treats the array of lesser dimension as a new
element to be added to the other array).

In most other languages these would be distinct operations; one to
append or concatenate two arrays of equal dimensionality and another to
add a new element on the end. It's very rare that you don't know which
one you want to do when you're writing code and hence breaking the two
apart both simplifies the implementation of the array handling code and
reduces bugs in the calling code.

Sam

#23Peter Eisentraut
peter_e@gmx.net
In reply to: Robert Haas (#20)
Re: So what's an "empty" array anyway?

Robert Haas wrote:

It was pointed out to me today that a zero-dimensional matrix is a scalar.
This makes a bit of sense, if you say that

'{{56}}' is of type int[][], 2 dimensions
'{56}' is of type int[], 1 dimension
'56' is of type int, 0 dimensions

Notice that the number of brace pairs in the literal matches the number of
bracket pairs in the type declaration.

By that logic, '{}' has one dimension. I think this also works best in
practice, for example with array concatenation.

I think it's actually more correct to say that {} has an undefined
number of dimensions.

That is your opinion, but not my argument above.

The || operator is valid only if the dimensions
of the two arrays are equal (in which case it appends them) or if the
dimension of one of the arrays is one less than the other (in which
case, if legal, it treats the array of lesser dimension as a new
element to be added to the other array).

But {}, and ONLY {}, can be concatenated with an array of any number
of dimensions.

Which doesn't necessarily make much sense ...

#24Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#23)
Re: So what's an "empty" array anyway?

I think it's actually more correct to say that {} has an undefined
number of dimensions.

That is your opinion, but not my argument above.

I'm aware of that. I believe that the semantics of array operations
should carry more weight than a count of curly braces, but certainly
you don't have to agree.

But {}, and ONLY {}, can be concatenated with an array of any number
of dimensions.

Which doesn't necessarily make much sense ...

Well, we could change it, which would render the argument about the
current semantics of {} moot. The obvious thing to do would be create
separate empty arrays of dimensions 1, 2, 3, 4, 5, and 6, say
'{}'::int[], '{{}}'::int[], etc. This would break backward
compatibility for people using multi-dimensional arrays, but that
might not be that many people. (Dunno.)

Alternatively, we could make separate types int[], int[][], int[][][],
etc. Then '{}'::int[] would be an empty one-dimensional array,
'{}'::int[][] would be an empty two-dimensional array, etc.

...Robert

#25Dean Rasheed
dean_rasheed@hotmail.com
In reply to: Robert Haas (#24)
Re: So what's an "empty" array anyway?

Peter Eisentraut wrote:

It was pointed out to me today that a zero-dimensional
matrix is a scalar. This makes a bit of sense, if you say
that

'{{56}}' is of type int[][], 2 dimensions
'{56}' is of type int[], 1 dimension
'56' is of type int, 0 dimensions

Notice that the number of brace pairs in the literal
matches the number of bracket pairs in the type
declaration.

By that logic, '{}' has one dimension. I think this also
works best in practice, for example with array
concatenation.

When I think about this, I think that '{x}' has dimension 1 greater
than 'x'. So '{x}' may be a 2-D array, if 'x' is a 1-D array. But
it is always the case that dim('{x}')>= 1.

It seems logical therefore to treat '{}' as an empty array of
something (possibly other arrays), but its precise meaning is
undefined until you actually coerce it into a specific type. Thus
you could have:

'{}'::int - illegal
'{}'::int[] - empty array of integers
'{}'::int[][] - empty 2-D array of integers

However, I would say that '{{}}' isn't an empty array, since the outer
array has 1 element ('{}'). So this leads to:

'{{}}'::int - illegal
'{{}}'::int[] - illegal
'{{}}'::int[][] - non-empty array containing an empty array of integers
(ie. a 1x0 matrix)
'{{},{}}'::int[][] - non-empty array containing 2 empty arrays of integers
(ie. a 2x0 matrix)

I don't know if Nx0 matrices actually have any use, but that's where
the logic takes me.

Dean

_________________________________________________________________
Win £1000 John Lewis shopping sprees with BigSnapSearch.com
http://clk.atdmt.com/UKM/go/117442309/direct/01/