Support for negative index values in array fetching

Started by Valtonen, Hannuabout 15 years ago11 messages
#1Valtonen, Hannu
hannu.valtonen@hut.fi
1 attachment(s)

Hi,

I ran into the problem of getting the last n elements out of an array
and while some workarounds do exist:
(http://stackoverflow.com/questions/2949881/getting-the-last-element-of-a-postgres-array-declaratively)
I was still annoyed that I couldn't just ask for the last n values in an
array Python/Perl style.

Here's a patch to add support for negative index values in fetching
elements from an array.

i.e.

postgres=# CREATE TABLE blah (a int[]);
CREATE TABLE
Time: 11.357 ms
postgres=# INSERT INTO blah (a) VALUES (ARRAY[1,2,3,4,5,6,7,8,9,10]);
INSERT 0 1
Time: 1.282 ms
postgres=# SELECT a[-1] FROM blah;
a
----
10
(1 row)

Time: 0.450 ms
postgres=# SELECT a[-5:10] FROM blah;
a
--------------
{6,7,8,9,10}
(1 row)

Time: 0.949 ms

While testing this I BTW ran into funny behaviour in setting array
slices, as in:

postgres=# update blah set a[-5] = 12;
UPDATE 1
Time: 1.500 ms
postgres=# select * from blah;
a
------------------------------------------------------------
[-5:10]={12,NULL,NULL,NULL,NULL,NULL,1,2,3,4,5,6,7,8,9,10}
(1 row)

Time: 0.431 ms

And since this negative array expansion behaviour totally surprised me,
I haven't changed that in this patch at all.

--
Hannu Valtonen

Attachments:

array.difftext/plain; name=array.diff; x-mac-creator=0; x-mac-type=0Download
diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
index bb4657e..dc7b6f4 100644
--- a/doc/src/sgml/array.sgml
+++ b/doc/src/sgml/array.sgml
@@ -224,7 +224,9 @@ SELECT name FROM sal_emp WHERE pay_by_quarter[1] <> pay_by_quarter[2];
   By default <productname>PostgreSQL</productname> uses a
   one-based numbering convention for arrays, that is,
   an array of <replaceable>n</> elements starts with <literal>array[1]</literal> and
-  ends with <literal>array[<replaceable>n</>]</literal>.
+  ends with <literal>array[<replaceable>n</>]</literal>. Negative
+  array subscript numbers indicate that the position of the element is calculated from
+  the end of the array, with -1 indicating the last element in the array.
  </para>
 
  <para>
@@ -242,6 +244,34 @@ SELECT pay_by_quarter[3] FROM sal_emp;
  </para>
 
  <para>
+  This query retrieves the last quarter pay of all employees:
+
+<programlisting>
+SELECT pay_by_quarter[-1] FROM sal_emp;
+
+ pay_by_quarter
+----------------
+          10000
+          25000
+(2 rows)
+</programlisting>
+</para>
+
+<para>
+  This query retrieves the pay of all employees for the last three quarters:
+<programlisting>
+SELECT pay_by_quarter[-3:4] FROM sal_emp;
+
+   pay_by_quarter
+---------------------
+ {10000,10000,10000}
+ {25000,25000,25000}
+(2 rows)
+
+</programlisting>
+ </para>
+
+ <para>
   We can also access arbitrary rectangular slices of an array, or
   subarrays.  An array slice is denoted by writing
   <literal><replaceable>lower-bound</replaceable>:<replaceable>upper-bound</replaceable></literal>
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index fb4cbce..9d6c3f1 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -1786,6 +1786,8 @@ array_ref(ArrayType *array,
 	}
 	for (i = 0; i < ndim; i++)
 	{
+	   	if (indx[i] < 0) /* A negative index number indicates a position calculated from the end of the array */
+	           	indx[i] = dim[i] + indx[i] + lb[i];
 		if (indx[i] < lb[i] || indx[i] >= (dim[i] + lb[i]))
 		{
 			*isNull = true;
@@ -1914,6 +1916,10 @@ array_get_slice(ArrayType *array,
 
 	for (i = 0; i < nSubscripts; i++)
 	{
+	   	if (lowerIndx[i] < 0) /* A negative index number indicates a position calculated from the end of the array */
+	           	lowerIndx[i] = dim[i] + lowerIndx[i] + lb[i];
+	   	if (upperIndx[i] < 0) /* A negative index number indicates a position calculated from the end of the array */
+	           	upperIndx[i] = dim[i] + upperIndx[i] + lb[i];
 		if (lowerIndx[i] < lb[i])
 			lowerIndx[i] = lb[i];
 		if (upperIndx[i] >= (dim[i] + lb[i]))
#2Florian Pflug
fgp@phlo.org
In reply to: Valtonen, Hannu (#1)
Re: Support for negative index values in array fetching

On Jan2, 2011, at 11:45 , Valtonen, Hannu wrote:

I ran into the problem of getting the last n elements out of an array and while some workarounds do exist:
(http://stackoverflow.com/questions/2949881/getting-the-last-element-of-a-postgres-array-declaratively) I was still annoyed that I couldn't just ask for the last n values in an array Python/Perl style.

Here's a patch to add support for negative index values in fetching elements from an array.

That won't work. In SQL, array indices don't necessarily start with 0 (or 1, or *any*
single value). Instead, you can each dimension's lower and upper bound for index values
with array_lower() and array_upper().

Here's an example

fgp=> do $$
declare a text[];
begin
a[-1] := 'foo';
a[0] := 'bar';
raise notice 'a[-1] == %', a[-1];
end
$$ language 'plpgsql' ;

This will raise the notice 'a[-1] == foo'!

The only way around that would be to introduce magic constants "lower", "upper" that
can be used within index expressions and evaluate to the indexed dimension's lower
and upper bound. You'd then use

my_array[upper], my_array[upper-1], ...

to refer to the last, second-to-last, ... element in the array. Actually doing this
could get pretty messy, though - not sure if it's really worth the effort...

best regards,
Florian Pflug

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Florian Pflug (#2)
Re: Support for negative index values in array fetching

On sön, 2011-01-02 at 12:47 +0100, Florian Pflug wrote:

Here's a patch to add support for negative index values in fetching elements from an array.

That won't work. In SQL, array indices don't necessarily start with 0 (or 1, or *any*
single value).

FYI, this is true for PostgreSQL, but not in SQL in general. In the
standard, array indexes go from 1 to N.

The only way around that would be to introduce magic constants "lower", "upper" that
can be used within index expressions and evaluate to the indexed dimension's lower
and upper bound. You'd then use

my_array[upper], my_array[upper-1], ...

to refer to the last, second-to-last, ... element in the array. Actually doing this
could get pretty messy, though - not sure if it's really worth the effort...

How about just some functions:

array_first(array, dim)
array_last(array, dim)

Perhaps some variants for splice vs. scalar.

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#3)
Re: Support for negative index values in array fetching

Hello

2011/1/5 Peter Eisentraut <peter_e@gmx.net>:

On sön, 2011-01-02 at 12:47 +0100, Florian Pflug wrote:

Here's a patch to add support for negative index values in fetching elements from an array.

negative arguments for array can be really strange

That won't work. In SQL, array indices don't necessarily start with 0 (or 1, or *any*
single value).

FYI, this is true for PostgreSQL, but not in SQL in general.  In the
standard, array indexes go from 1 to N.

The only way around that would be to introduce magic constants "lower", "upper" that
can be used within index expressions and evaluate to the indexed dimension's lower
and upper bound. You'd then use

  my_array[upper], my_array[upper-1], ...

to refer to the last, second-to-last, ... element in the array. Actually doing this
could get pretty messy, though - not sure if it's really worth the effort...

How about just some functions:

array_first(array, dim)
array_last(array, dim)

Perhaps some variants for splice vs. scalar.

Itakagi has a function trim_array in
http://archives.postgresql.org/message-id/AANLkTinrRubdSSWvqO481sL0EyGz830=mFKAdK_knfgZ@mail.gmail.com
patch. It's similar to array_first.

I understand to a missing functionality for FIFO or LIFO queues
implementation based on array. There can be function that reduce a
array to first or last n items, and functions that returns first or
last items.

some like array_first(array, items), array_last(array, items),
array_remove_first(array, items), array_remove_last(array, items)

or some similar

Pavel

Show quoted text

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Florian Pflug
fgp@phlo.org
In reply to: Peter Eisentraut (#3)
Re: Support for negative index values in array fetching

On Jan5, 2011, at 10:25 , Peter Eisentraut wrote:

On sön, 2011-01-02 at 12:47 +0100, Florian Pflug wrote:

The only way around that would be to introduce magic constants "lower", "upper" that
can be used within index expressions and evaluate to the indexed dimension's lower
and upper bound. You'd then use

my_array[upper], my_array[upper-1], ...

to refer to the last, second-to-last, ... element in the array. Actually doing this
could get pretty messy, though - not sure if it's really worth the effort...

How about just some functions:

array_first(array, dim)
array_last(array, dim)

You image these to return the actual element, not the first and last index value, right?
Because we already have array_lower() and array_upper() which return the lower and upper
index bound for a certain dimension.
(http://www.postgresql.org/docs/9.0/interactive/functions-array.htm)

A more general solution would be a function

array_relative(array anyarray, indices int[])

which would return the element indexed by <indices>, where positive indices are assumed to
be relative to the respective dimension's lower bound and negative indices to the
upper bound + 1.

For slices, we could additionally have

array_relative(array anyarray, indices_start int[], indices_end int[])

best regards,
Florian Pflug

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Florian Pflug (#5)
Re: Support for negative index values in array fetching

2011/1/5 Florian Pflug <fgp@phlo.org>:

On Jan5, 2011, at 10:25 , Peter Eisentraut wrote:

On sön, 2011-01-02 at 12:47 +0100, Florian Pflug wrote:

The only way around that would be to introduce magic constants "lower", "upper" that
can be used within index expressions and evaluate to the indexed dimension's lower
and upper bound. You'd then use

 my_array[upper], my_array[upper-1], ...

to refer to the last, second-to-last, ... element in the array. Actually doing this
could get pretty messy, though - not sure if it's really worth the effort...

How about just some functions:

array_first(array, dim)
array_last(array, dim)

You image these to return the actual element, not the first and last index value, right?
Because we already have array_lower() and array_upper() which return the lower and upper
index bound for a certain dimension.
(http://www.postgresql.org/docs/9.0/interactive/functions-array.htm)

A more general solution would be a function

array_relative(array anyarray, indices int[])

I don't think so this design helps. instead maintaining a data array,
you should to maintain a indices array.

Pavel

Show quoted text

which would return the element indexed by <indices>, where positive indices are assumed to
be relative to the respective dimension's lower bound and negative indices to the
upper bound + 1.

For slices, we could additionally have

array_relative(array anyarray, indices_start int[], indices_end int[])

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Florian Pflug
fgp@phlo.org
In reply to: Pavel Stehule (#6)
Re: Support for negative index values in array fetching

On Jan5, 2011, at 13:08 , Pavel Stehule wrote:

2011/1/5 Florian Pflug <fgp@phlo.org>:

On Jan5, 2011, at 10:25 , Peter Eisentraut wrote:

On sön, 2011-01-02 at 12:47 +0100, Florian Pflug wrote:

The only way around that would be to introduce magic constants "lower", "upper" that
can be used within index expressions and evaluate to the indexed dimension's lower
and upper bound. You'd then use

my_array[upper], my_array[upper-1], ...

to refer to the last, second-to-last, ... element in the array. Actually doing this
could get pretty messy, though - not sure if it's really worth the effort...

How about just some functions:

array_first(array, dim)
array_last(array, dim)

You image these to return the actual element, not the first and last index value, right?
Because we already have array_lower() and array_upper() which return the lower and upper
index bound for a certain dimension.
(http://www.postgresql.org/docs/9.0/interactive/functions-array.htm)

A more general solution would be a function

array_relative(array anyarray, indices int[])

I don't think so this design helps. instead maintaining a data array,
you should to maintain a indices array.

How so? You'd still be able to get the last element by simply writing

array_relative(some_array, array[-1]).

Or, if we made the function variadic, by writing

array_relative(some_array, -1).

It's essentially what the OP proposed, but with the function array_relative() in place of
the indexing operator [].

best regards,
Florian Pflug

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Florian Pflug (#7)
Re: Support for negative index values in array fetching

2011/1/5 Florian Pflug <fgp@phlo.org>:

On Jan5, 2011, at 13:08 , Pavel Stehule wrote:

2011/1/5 Florian Pflug <fgp@phlo.org>:

On Jan5, 2011, at 10:25 , Peter Eisentraut wrote:

On sön, 2011-01-02 at 12:47 +0100, Florian Pflug wrote:

The only way around that would be to introduce magic constants "lower", "upper" that
can be used within index expressions and evaluate to the indexed dimension's lower
and upper bound. You'd then use

 my_array[upper], my_array[upper-1], ...

to refer to the last, second-to-last, ... element in the array. Actually doing this
could get pretty messy, though - not sure if it's really worth the effort...

How about just some functions:

array_first(array, dim)
array_last(array, dim)

You image these to return the actual element, not the first and last index value, right?
Because we already have array_lower() and array_upper() which return the lower and upper
index bound for a certain dimension.
(http://www.postgresql.org/docs/9.0/interactive/functions-array.htm)

A more general solution would be a function

array_relative(array anyarray, indices int[])

I don't think so this design helps. instead maintaining a data array,
you should to maintain a indices array.

How so? You'd still be able to get the last element by simply writing

 array_relative(some_array, array[-1]).

Or, if we made the function variadic, by writing

 array_relative(some_array, -1).

Sorry, but It isn't too intuitive. Minimally for me. Why you don't
thinking about simple functions with only positive arguments. There
are only four combinations. I don't think we must have only one super
function.

we need functionality for:

a) get first n items
b) get items without last n items
c) get last n items
d) skip first n items

I think so this functionality is relative important, so we can use a
richer api.

Maybe we thinking about different use cases.

Pavel

Show quoted text

It's essentially what the OP proposed, but with the function array_relative() in place of
the indexing operator [].

best regards,
Florian Pflug

#9Florian Pflug
fgp@phlo.org
In reply to: Pavel Stehule (#8)
Re: Support for negative index values in array fetching

On Jan5, 2011, at 15:17 , Pavel Stehule wrote:

2011/1/5 Florian Pflug <fgp@phlo.org>:

How so? You'd still be able to get the last element by simply writing

array_relative(some_array, array[-1]).

Or, if we made the function variadic, by writing

array_relative(some_array, -1).

Sorry, but It isn't too intuitive. Minimally for me. Why you don't
thinking about simple functions with only positive arguments. There
are only four combinations. I don't think we must have only one super
function.

we need functionality for:

a) get first n items
b) get items without last n items
c) get last n items
d) skip first n items

Now you've moved the goalpost - the OP wanted to access individual
elements, not slices! To support slices, a three-argument version
of array_relative() would be required, with the signature

array_relative(some_array anyarray, first int[], last int[])

Your requirements (a) to (d) are then easily satisfied

a) array_relative(ary, array[0], array[n-1])
b) array_relative(ary, array[0], array[-n-1])
c) array_relative(ary, array[-n], array[-1])
d) array_relative(ary, array[n], array[-1])

The individual function approach might be a tad more readable for
one-dimensional arrays, but they don't scale well to the general
case.

Maybe the OP could comment on whether any of these solutions
would fit his needs?

best regards,
Florian Pflug

#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Florian Pflug (#9)
Re: Support for negative index values in array fetching

2011/1/5 Florian Pflug <fgp@phlo.org>:

On Jan5, 2011, at 15:17 , Pavel Stehule wrote:

2011/1/5 Florian Pflug <fgp@phlo.org>:

How so? You'd still be able to get the last element by simply writing

 array_relative(some_array, array[-1]).

Or, if we made the function variadic, by writing

 array_relative(some_array, -1).

Sorry, but It isn't too intuitive. Minimally for me. Why you don't
thinking about simple functions with only positive arguments. There
are only four combinations. I don't think we must have only one super
function.

we need functionality for:

a) get first n items
b) get items without last n items
c) get last n items
d) skip first n items

Now you've moved the goalpost - the OP wanted to access individual
elements, not slices! To support slices, a three-argument version
of array_relative() would be required, with the signature

I am not sure. Usually need both

when I play with a stack I need

a) FIFO - first element from array and all others without first element
b) LIFO - last element from array and all others without last element

The game with queues is only one use case that I know where I need
access to relative indexed items in array.

Maybe is other, but I don't know it. ??? I don't know why I need a
access to relative indexed items?

Pavel

 array_relative(some_array anyarray, first int[], last int[])

Your requirements (a) to (d) are then easily satisfied

a) array_relative(ary, array[0], array[n-1])
b) array_relative(ary, array[0], array[-n-1])
c) array_relative(ary, array[-n], array[-1])
d) array_relative(ary, array[n], array[-1])

what is n?? it's not implementable.

Show quoted text

The individual function approach might be a tad more readable for
one-dimensional arrays, but they don't scale well to the general
case.

Maybe the OP could comment on whether any of these solutions
would fit his needs?

best regards,
Florian Pflug

#11Hannu Valtonen
hannu.valtonen@f-secure.com
In reply to: Valtonen, Hannu (#1)
Re: Support for negative index values in array fetching

On 1/5/11 6:19 PM, Florian Pflug wrote:

Sorry, but It isn't too intuitive. Minimally for me. Why you don't
thinking about simple functions with only positive arguments. There
are only four combinations. I don't think we must have only one super
function.

we need functionality for:

a) get first n items
b) get items without last n items
c) get last n items
d) skip first n items

Now you've moved the goalpost - the OP wanted to access individual
elements, not slices! To support slices, a three-argument version
of array_relative() would be required, with the signature

array_relative(some_array anyarray, first int[], last int[])

Your requirements (a) to (d) are then easily satisfied

a) array_relative(ary, array[0], array[n-1])
b) array_relative(ary, array[0], array[-n-1])
c) array_relative(ary, array[-n], array[-1])
d) array_relative(ary, array[n], array[-1])

The individual function approach might be a tad more readable for
one-dimensional arrays, but they don't scale well to the general
case.

Maybe the OP could comment on whether any of these solutions
would fit his needs?

Hi,

(sorry for the late reply, got lost in my Inbox)

For my main use case I just needed the last element of the array, but I
could see myself needing a slice as well. (i.e. give me the last 5 items
in an array)

So in that sense yes, this would fit the bill.

Hannu Valtonen
Lead Software Architect
Technology Office
F-Secure Corporationhttp://www.F-Secure.com