[mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

Started by Jim Nasbyover 18 years ago18 messageshackers
Jump to latest
#1Jim Nasby
Jim.Nasby@BlueTreble.com

ISTM that having a built-in array_to_set function would be awfully
useful... Is the aggregate method below an acceptable way to do it?

----- Forwarded message from Merlin Moncure <mmoncure@gmail.com> -----
On 8/3/07, Guy Fraser <guy@incentre.net> wrote:

On Wed, 2007-08-01 at 07:14 +0530, Merlin Moncure wrote:

On 8/1/07, Decibel! <decibel@decibel.org> wrote:

David Fetter and I just came up with these, perhaps others will find
them useful:

CREATE OR REPLACE FUNCTION array_to_set(anyarray, int) RETURNS SETOF anyelement LANGUAGE SQL AS $$
SELECT $1[i] from generate_series(array_lower($1, $2), array_upper($1, $2)) i
$$;
CREATE OR REPLACE FUNCTION array_to_set(anyarray) RETURNS SETOF anyelement LANGUAGE SQL AS $$
SELECT array_to_set($1, 1)
$$;

very nice, although IMO there is a strong justification for these
functions to be in core and written in C for efficiency (along with
array_accum, which I have hand burn from copying and pasting out of
the documentation).

merlin

Excellent timing guys. :^)

I was trying to build a function to list the items of an array, but
ran into problems and was going to post what I had been working on.

Your functions work great.

In case you don't have the function to generate an array from a set
here is one I have been using :

CREATE AGGREGATE array_accum (
BASETYPE = anyelement,
SFUNC = array_append,
STYPE = anyarray,
INITCOND = '{}'
);

I think that's what just about everyone uses. Unfortunately the
reverse of the function (array_to_set above) AFAIK does not map
directly to the C array API.

merlin

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

----- End forwarded message -----

--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#2Jeremy Drake
pgsql@jdrake.com
In reply to: Jim Nasby (#1)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On Tue, 7 Aug 2007, Decibel! wrote:

ISTM that having a built-in array_to_set function would be awfully
useful... Is the aggregate method below an acceptable way to do it?

Umm, the array_to_set function is not an aggregate. Personally, when I
need this functionality, I use this function conveniently present in the
default install:

select * from information_schema._pg_expandarray(ARRAY['foo', 'bar', 'baz']);
x | n
-----+---
foo | 1
bar | 2
baz | 3
(3 rows)

Not exactly well documented or well known, but it works.

----- Forwarded message from Merlin Moncure <mmoncure@gmail.com> -----
On 8/3/07, Guy Fraser <guy@incentre.net> wrote:

On Wed, 2007-08-01 at 07:14 +0530, Merlin Moncure wrote:

On 8/1/07, Decibel! <decibel@decibel.org> wrote:

David Fetter and I just came up with these, perhaps others will find
them useful:

CREATE OR REPLACE FUNCTION array_to_set(anyarray, int) RETURNS SETOF anyelement LANGUAGE SQL AS $$
SELECT $1[i] from generate_series(array_lower($1, $2), array_upper($1, $2)) i
$$;
CREATE OR REPLACE FUNCTION array_to_set(anyarray) RETURNS SETOF anyelement LANGUAGE SQL AS $$
SELECT array_to_set($1, 1)
$$;

very nice, although IMO there is a strong justification for these
functions to be in core and written in C for efficiency (along with
array_accum, which I have hand burn from copying and pasting out of
the documentation).

merlin

Excellent timing guys. :^)

I was trying to build a function to list the items of an array, but
ran into problems and was going to post what I had been working on.

Your functions work great.

In case you don't have the function to generate an array from a set
here is one I have been using :

CREATE AGGREGATE array_accum (
BASETYPE = anyelement,
SFUNC = array_append,
STYPE = anyarray,
INITCOND = '{}'
);

I think that's what just about everyone uses. Unfortunately the
reverse of the function (array_to_set above) AFAIK does not map
directly to the C array API.

merlin

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

----- End forwarded message -----

--
Mollison's Bureaucracy Hypothesis:
If an idea can survive a bureaucratic review and be implemented
it wasn't worth doing.

#3Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Jeremy Drake (#2)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On Tue, Aug 07, 2007 at 10:18:32PM -0700, Jeremy Drake wrote:

On Tue, 7 Aug 2007, Decibel! wrote:

ISTM that having a built-in array_to_set function would be awfully
useful... Is the aggregate method below an acceptable way to do it?

Umm, the array_to_set function is not an aggregate. Personally, when I

Sorry, should have been more clear... array_to_set as shown below isn't,
but array_accum is, and does the opposite (set_to_array).

need this functionality, I use this function conveniently present in the
default install:

select * from information_schema._pg_expandarray(ARRAY['foo', 'bar', 'baz']);
x | n
-----+---
foo | 1
bar | 2
baz | 3
(3 rows)

Not exactly well documented or well known, but it works.

Worse than that, that's something that's entirely internal and could
change at any release. The fact that it exists for info_schema indicates
further need for these functions to exist in the backend.

----- Forwarded message from Merlin Moncure <mmoncure@gmail.com> -----
On 8/3/07, Guy Fraser <guy@incentre.net> wrote:

On Wed, 2007-08-01 at 07:14 +0530, Merlin Moncure wrote:

On 8/1/07, Decibel! <decibel@decibel.org> wrote:

David Fetter and I just came up with these, perhaps others will find
them useful:

CREATE OR REPLACE FUNCTION array_to_set(anyarray, int) RETURNS SETOF anyelement LANGUAGE SQL AS $$
SELECT $1[i] from generate_series(array_lower($1, $2), array_upper($1, $2)) i
$$;
CREATE OR REPLACE FUNCTION array_to_set(anyarray) RETURNS SETOF anyelement LANGUAGE SQL AS $$
SELECT array_to_set($1, 1)
$$;

very nice, although IMO there is a strong justification for these
functions to be in core and written in C for efficiency (along with
array_accum, which I have hand burn from copying and pasting out of
the documentation).

merlin

Excellent timing guys. :^)

I was trying to build a function to list the items of an array, but
ran into problems and was going to post what I had been working on.

Your functions work great.

In case you don't have the function to generate an array from a set
here is one I have been using :

CREATE AGGREGATE array_accum (
BASETYPE = anyelement,
SFUNC = array_append,
STYPE = anyarray,
INITCOND = '{}'
);

I think that's what just about everyone uses. Unfortunately the
reverse of the function (array_to_set above) AFAIK does not map
directly to the C array API.

merlin

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

----- End forwarded message -----

--
Mollison's Bureaucracy Hypothesis:
If an idea can survive a bureaucratic review and be implemented
it wasn't worth doing.

--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#4Michael Glaesemann
grzm@seespotcode.net
In reply to: Jim Nasby (#3)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On Aug 8, 2007, at 11:41 , Decibel! wrote:

On Tue, Aug 07, 2007 at 10:18:32PM -0700, Jeremy Drake wrote:

select * from information_schema._pg_expandarray(ARRAY['foo',
'bar', 'baz']);
x | n
-----+---
foo | 1
bar | 2
baz | 3
(3 rows)

Not exactly well documented or well known, but it works.

Worse than that, that's something that's entirely internal and could
change at any release. The fact that it exists for info_schema
indicates
further need for these functions to exist in the backend.

Personally, I think expandarray is more appropriate and its
functionality probably more generally useful, as it identifies the
array indices as well. Note you can also rename the columns.

select * from information_schema._pg_expandarray(ARRAY['foo', 'bar',
'baz']) as b(a,i);
a | i
-----+---
foo | 1
bar | 2
baz | 3
(3 rows)

array_to_set really isn't, as AFAICS it didn't guarantee element
uniqueness (but that's just a naming issue).

Michael Glaesemann
grzm seespotcode net

#5Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Michael Glaesemann (#4)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On Wed, Aug 08, 2007 at 12:03:34PM -0500, Michael Glaesemann wrote:

Personally, I think expandarray is more appropriate and its
functionality probably more generally useful, as it identifies the
array indices as well. Note you can also rename the columns.

Sure. My point is that we should have a way to convert arrays to sets
and back in the backend.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#6Michael Glaesemann
grzm@seespotcode.net
In reply to: Jim Nasby (#5)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On Aug 8, 2007, at 12:18 , Decibel! wrote:

On Wed, Aug 08, 2007 at 12:03:34PM -0500, Michael Glaesemann wrote:

Personally, I think expandarray is more appropriate and its
functionality probably more generally useful, as it identifies the
array indices as well. Note you can also rename the columns.

Sure. My point is that we should have a way to convert arrays to sets
and back in the backend.

Can't really argue with you there, as I find array_accum myself.
(Though I'd still nit-pick that this isn't an array to set
conversion, but rather array to--possibly single-column--table.)

Michael Glaesemann
grzm seespotcode net

#7Bruce Momjian
bruce@momjian.us
In reply to: Michael Glaesemann (#6)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

TODO item?

---------------------------------------------------------------------------

Michael Glaesemann wrote:
-- Start of PGP signed section.

On Aug 8, 2007, at 12:18 , Decibel! wrote:

On Wed, Aug 08, 2007 at 12:03:34PM -0500, Michael Glaesemann wrote:

Personally, I think expandarray is more appropriate and its
functionality probably more generally useful, as it identifies the
array indices as well. Note you can also rename the columns.

Sure. My point is that we should have a way to convert arrays to sets
and back in the backend.

Can't really argue with you there, as I find array_accum myself.
(Though I'd still nit-pick that this isn't an array to set
conversion, but rather array to--possibly single-column--table.)

Michael Glaesemann
grzm seespotcode net

-- End of PGP section, PGP failed!

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Bruce Momjian (#7)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

2007/8/14, Bruce Momjian <bruce@momjian.us>:

TODO item?

+ If your life is a hard drive, Christ can be your backup. +

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

I am against. It's too simple do it in SQL language.

Regards
Pavel Stehule

#9Joe Conway
mail@joeconway.com
In reply to: Bruce Momjian (#7)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

Bruce Momjian wrote:

TODO item?

Probably. See SQL2003 UNNEST:

<collection derived table> ::=
UNNEST <left paren> <collection value expression> <right paren>
[ WITH ORDINALITY ]

<collection value expression> ::=
<array value expression>
| <multiset value expression>

Joe

#10Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Pavel Stehule (#8)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:

2007/8/14, Bruce Momjian <bruce@momjian.us>:

TODO item?

+ If your life is a hard drive, Christ can be your backup. +

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

I am against. It's too simple do it in SQL language.

Why make everyone who works with arrays create a function just to do
this? Something that's of use to common users should be included, simple
or not.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#11Merlin Moncure
mmoncure@gmail.com
In reply to: Bruce Momjian (#7)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On 8/14/07, Bruce Momjian <bruce@momjian.us> wrote:

TODO item?

I would say yes...array_accum is virtually an essential function when
working with arrays and the suggested array_to_set (and it's built in
cousin, _pg_expand_array) really should not be built around
generate_series when a C function is faster and will scale much
better.

array_to_set, as suggested in SQL, is something only a relative expert
with PostgreSQL could be expected to write.

Thus could generate_series be relieved from providing the only core
function for set returning functions in the documentation. IMO, this
part of the documentation could use some expansion anyways :)

merlin

#12David Fetter
david@fetter.org
In reply to: Jim Nasby (#10)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On Tue, Aug 14, 2007 at 04:08:27PM -0500, Decibel! wrote:

On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:

2007/8/14, Bruce Momjian <bruce@momjian.us>:

TODO item?

+ If your life is a hard drive, Christ can be your backup. +

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

I am against. It's too simple do it in SQL language.

Why make everyone who works with arrays create a function just to do
this? Something that's of use to common users should be included, simple
or not.

As I recall, this wasn't included because it didn't do all of what the
SQL:2003 standard UNNEST does, although looking it over, the only
thing it "doesn't do" is to allow the results to come back in any
order other than the array index's. Sadly, UNNEST would be a new
feature, and we're *way* past that for 8.3 :/

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

#13Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Nasby (#10)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

2007/8/14, Decibel! <decibel@decibel.org>:

On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:

2007/8/14, Bruce Momjian <bruce@momjian.us>:

TODO item?

+ If your life is a hard drive, Christ can be your backup. +

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

I am against. It's too simple do it in SQL language.

Why make everyone who works with arrays create a function just to do
this? Something that's of use to common users should be included, simple
or not.
--

Unpacking array is more SQL construct for me, than SRF function. With
function you cannot conntrol behave of unpacking. With SQL construct I
can

SELECT DISTINCT a(i) FROM generate_series ... remove duplicities
SELECT a(i) FROM generate_series ORDER BY .. sorted output
etc

But I can

SELECT * FROM generate_series(ARRAY[1,3,4,5,7,10]);

else
FUNCTION generate_series(anyarray) returns setof any

Regards
Pavel Stehule

#14Pavel Stehule
pavel.stehule@gmail.com
In reply to: Merlin Moncure (#11)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

2007/8/15, Merlin Moncure <mmoncure@gmail.com>:

On 8/14/07, Bruce Momjian <bruce@momjian.us> wrote:

TODO item?

I would say yes...array_accum is virtually an essential function when
working with arrays and the suggested array_to_set (and it's built in
cousin, _pg_expand_array) really should not be built around
generate_series when a C function is faster and will scale much
better.

Hello Merlin

array_accum is good sample of PostgreSQL possibilities. But it is slow.

SELECT ARRAY(SELECT ... FROM ...)) is much faster. :(

so I unlike not necessary aggregate functions

I agree. These constructs can be showed in doc

Regards
Pavel Stehule

#15Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Pavel Stehule (#13)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

On Wed, Aug 15, 2007 at 06:47:05AM +0200, Pavel Stehule wrote:

2007/8/14, Decibel! <decibel@decibel.org>:

On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:

2007/8/14, Bruce Momjian <bruce@momjian.us>:

TODO item?

+ If your life is a hard drive, Christ can be your backup. +

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

I am against. It's too simple do it in SQL language.

Why make everyone who works with arrays create a function just to do
this? Something that's of use to common users should be included, simple
or not.
--

Unpacking array is more SQL construct for me, than SRF function. With
function you cannot conntrol behave of unpacking. With SQL construct I
can

Huh? You can do a DISTINCT or an ORDER BY on the output of a SRF.

SELECT DISTINCT a(i) FROM generate_series ... remove duplicities
SELECT a(i) FROM generate_series ORDER BY .. sorted output
etc

But I can

SELECT * FROM generate_series(ARRAY[1,3,4,5,7,10]);

else
FUNCTION generate_series(anyarray) returns setof any

--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#16Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Nasby (#15)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

2007/8/15, Decibel! <decibel@decibel.org>:

On Wed, Aug 15, 2007 at 06:47:05AM +0200, Pavel Stehule wrote:

2007/8/14, Decibel! <decibel@decibel.org>:

On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:

2007/8/14, Bruce Momjian <bruce@momjian.us>:

TODO item?

+ If your life is a hard drive, Christ can be your backup. +

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

I am against. It's too simple do it in SQL language.

Why make everyone who works with arrays create a function just to do
this? Something that's of use to common users should be included, simple
or not.
--

Unpacking array is more SQL construct for me, than SRF function. With
function you cannot conntrol behave of unpacking. With SQL construct I
can

Huh? You can do a DISTINCT or an ORDER BY on the output of a SRF.

Yes, but then you get little bit different my form :)

Show quoted text

SELECT DISTINCT a(i) FROM generate_series ... remove duplicities
SELECT a(i) FROM generate_series ORDER BY .. sorted output
etc

But I can

SELECT * FROM generate_series(ARRAY[1,3,4,5,7,10]);

else
FUNCTION generate_series(anyarray) returns setof any

--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#17Bruce Momjian
bruce@momjian.us
In reply to: Pavel Stehule (#14)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Pavel Stehule wrote:

2007/8/15, Merlin Moncure <mmoncure@gmail.com>:

On 8/14/07, Bruce Momjian <bruce@momjian.us> wrote:

TODO item?

I would say yes...array_accum is virtually an essential function when
working with arrays and the suggested array_to_set (and it's built in
cousin, _pg_expand_array) really should not be built around
generate_series when a C function is faster and will scale much
better.

Hello Merlin

array_accum is good sample of PostgreSQL possibilities. But it is slow.

SELECT ARRAY(SELECT ... FROM ...)) is much faster. :(

so I unlike not necessary aggregate functions

I agree. These constructs can be showed in doc

Regards
Pavel Stehule

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#18Bruce Momjian
bruce@momjian.us
In reply to: Merlin Moncure (#11)
Re: [mmoncure@gmail.com: Re: [GENERAL] array_to_set functions]

Added to TODO:

* Add array_accum() and array_to_set() functions for arrays

http://archives.postgresql.org/pgsql-hackers/2007-08/msg00464.php

---------------------------------------------------------------------------

Merlin Moncure wrote:

On 8/14/07, Bruce Momjian <bruce@momjian.us> wrote:

TODO item?

I would say yes...array_accum is virtually an essential function when
working with arrays and the suggested array_to_set (and it's built in
cousin, _pg_expand_array) really should not be built around
generate_series when a C function is faster and will scale much
better.

array_to_set, as suggested in SQL, is something only a relative expert
with PostgreSQL could be expected to write.

Thus could generate_series be relieved from providing the only core
function for set returning functions in the documentation. IMO, this
part of the documentation could use some expansion anyways :)

merlin

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +