Converting each item in array to a query result row

Started by Postgres Useralmost 17 years ago13 messagesgeneral
Jump to latest
#1Postgres User
postgres.developer@gmail.com

Hi,

I'd writing a query against a function (pg_proc) that contains 2
fields of an array type. Ideally, I'd like to select 1 row from the
table, but return a query row for each item in the array.

For example, if one row contains the array {"a", "b", "c"}
I'd like the query to return 3 rows, one for each of these elements.

Any idea if this is possible?

Thanks.

#2Scott Bailey
artacus@comcast.net
In reply to: Postgres User (#1)
Re: Converting each item in array to a query result row

----- Original Message -----
From: "Postgres User" <postgres.developer@gmail.com>
To: "pgsql-general" <pgsql-general@postgresql.org>
Sent: Friday, May 29, 2009 12:21:11 AM GMT -08:00 Tijuana / Baja California
Subject: [GENERAL] Converting each item in array to a query result row

Hi,

I'd writing a query against a function (pg_proc) that contains 2
fields of an array type. Ideally, I'd like to select 1 row from the
table, but return a query row for each item in the array.

For example, if one row contains the array {"a", "b", "c"}
I'd like the query to return 3 rows, one for each of these elements.

Any idea if this is possible?

Not sure exactly what you want, but this should get you in the ball park
SELECT f.my_field[i] AS value,
(array['I am one','I am two','I am three'])[i] AS description
FROM my_func(foo) f
JOIN generate_series(1,3) i ON 1=1

Postgres 8.4 will has an unpack() function to convert an array to a set. Pavel has a write up about doing this in 8.3 and lower here:

http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks#Array_to_table

I've got some examples of pulling data ouf of arrays here:

http://scottrbailey.wordpress.com/2009/05/20/etl-with-postgres-arrays/

#3Gevik Babakhani
pgdev@xs4all.nl
In reply to: Postgres User (#1)
Re: Converting each item in array to a query result row

Postgres User wrote:

Hi,

I'd writing a query against a function (pg_proc) that contains 2
fields of an array type. Ideally, I'd like to select 1 row from the
table, but return a query row for each item in the array.

For example, if one row contains the array {"a", "b", "c"}
I'd like the query to return 3 rows, one for each of these elements.

Any idea if this is possible?

Thanks.

No matter how you create your sub query results, you still have to
create a sub result record by record.

Perhaps the following helps:

Note that the input parameter is not an array but a string that looks
like an array

------------------------------------------------------------------
create or replace function convert_to_query(p_array varchar) returns
setof record as
$$
declare
result record;
begin
return query
select
data.idx[enumerator.counter]::varchar
from
generate_series(1,array_upper(string_to_array(p_array,','),1))
as enumerator(counter),
string_to_array(p_array,',') as data(idx);
end;
$$
language plpgsql;

select result.field1 from convert_to_query('a,c,b,d,e,f') as
result(field1 varchar);

-----------------------------------------------------------

--
Regards,
Gevik

#4Adam Ruth
adamruth@mac.com
In reply to: Scott Bailey (#2)
Re: Converting each item in array to a query result row

I needed to do this just the other day, here's what I did:

create or replace function explode(_a anyarray) returns setof
anyelement as $$
begin
for i in array_lower(_a,1) .. array_upper(_a,1) loop
return next _a[i];
end loop;
return;
end;
$$ language plpgsql strict immutable;

select * from explode((select col from table where id = whatever)) as
Exploded;

On 29/05/2009, at 6:52 PM, artacus@comcast.net wrote:

Show quoted text

----- Original Message -----
From: "Postgres User" <postgres.developer@gmail.com>
To: "pgsql-general" <pgsql-general@postgresql.org>
Sent: Friday, May 29, 2009 12:21:11 AM GMT -08:00 Tijuana / Baja
California
Subject: [GENERAL] Converting each item in array to a query result row

Hi,

I'd writing a query against a function (pg_proc) that contains 2
fields of an array type. Ideally, I'd like to select 1 row from the
table, but return a query row for each item in the array.

For example, if one row contains the array {"a", "b", "c"}
I'd like the query to return 3 rows, one for each of these elements.

Any idea if this is possible?

Not sure exactly what you want, but this should get you in the ball
park
SELECT f.my_field[i] AS value,
(array['I am one','I am two','I am three'])[i] AS description
FROM my_func(foo) f
JOIN generate_series(1,3) i ON 1=1

Postgres 8.4 will has an unpack() function to convert an array to a
set. Pavel has a write up about doing this in 8.3 and lower here:
http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks#Array_to_table

I've got some examples of pulling data ouf of arrays here:
http://scottrbailey.wordpress.com/2009/05/20/etl-with-postgres-arrays/

#5Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Adam Ruth (#4)
Re: Converting each item in array to a query result row

why complicate so much ? this is a single sql query. It really shocks
me, how people easily lean on plpgsql and for/loops - which are times
slower than simple sql query:

CREATE FUNCTION unnest(anyarray) RETURNS SETOF anyelement AS
$_$
SELECT ($1)[i] FROM generate_series(array_lower($1,1),array_upper($1,1)) i;
$_$
LANGUAGE sql IMMUTABLE;

#6Adam Ruth
adamruth@mac.com
In reply to: Grzegorz Jaśkiewicz (#5)
Re: Converting each item in array to a query result row

Always test your performance assumptions. The plpgsql function is
faster than the sql function, a lot faster on smaller arrays.

unnest - 10 element array - 100,000 times: 6701.746 ms
unnest - 100 element array - 100,000 times: 11847.933 ms
unnest - 1000 element array - 100,000 times: 59472.691 ms

explode - 10 element array - 100,000 times: 1941.942 ms
explode - 100 element array - 100,000 times: 8521.289 ms
explode - 1000 element array - 100,000 times: 44980.048 ms

On 29/05/2009, at 8:55 PM, Grzegorz Jaśkiewicz wrote:

Show quoted text

why complicate so much ? this is a single sql query. It really shocks
me, how people easily lean on plpgsql and for/loops - which are times
slower than simple sql query:

CREATE FUNCTION unnest(anyarray) RETURNS SETOF anyelement AS
$_$
SELECT ($1)[i] FROM
generate_series(array_lower($1,1),array_upper($1,1)) i;
$_$
LANGUAGE sql IMMUTABLE;

#7Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Adam Ruth (#6)
Re: Converting each item in array to a query result row

2009/5/29 Adam Ruth <adamruth@mac.com>:

Always test your performance assumptions. The plpgsql function is faster
than the sql function, a lot faster on smaller arrays.

unnest - 10 element array - 100,000 times: 6701.746 ms
unnest - 100 element array - 100,000 times: 11847.933 ms
unnest - 1000 element array - 100,000 times: 59472.691 ms

explode - 10 element array - 100,000 times: 1941.942 ms
explode - 100 element array - 100,000 times: 8521.289 ms
explode - 1000 element array - 100,000 times: 44980.048 ms

o.O

--
GJ

#8Osvaldo Kussama
osvaldo.kussama@gmail.com
In reply to: Grzegorz Jaśkiewicz (#7)
Re: Converting each item in array to a query result row

2009/5/29 Grzegorz Jaśkiewicz <gryzman@gmail.com>:

2009/5/29 Adam Ruth <adamruth@mac.com>:

Always test your performance assumptions. The plpgsql function is faster
than the sql function, a lot faster on smaller arrays.

unnest - 10 element array - 100,000 times: 6701.746 ms
unnest - 100 element array - 100,000 times: 11847.933 ms
unnest - 1000 element array - 100,000 times: 59472.691 ms

explode - 10 element array - 100,000 times: 1941.942 ms
explode - 100 element array - 100,000 times: 8521.289 ms
explode - 1000 element array - 100,000 times: 44980.048 ms

Version 8.4 has an internal function unnest(anyarray):
http://developer.postgresql.org/pgdocs/postgres/functions-array.html

Osvaldo

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Adam Ruth (#4)
Re: Converting each item in array to a query result row

Adam Ruth <adamruth@mac.com> writes:

I needed to do this just the other day, here's what I did:

create or replace function explode(_a anyarray) returns setof
anyelement as $$

You'd be well advised to call this unnest(), because that's what the
equivalent built-in function in 8.4 is called ;-)

regards, tom lane

#10Grzegorz Jaśkiewicz
gryzman@gmail.com
In reply to: Tom Lane (#9)
Re: Converting each item in array to a query result row

CREATE OR REPLACE FUNCTION explode(_a anyarray) returns setof anyelement AS
$_$
BEGIN
RAISE NOTICE 'poof... just kidding... kabooom';
RETURN unnest(_a);
END;
$_$ LANGUAGE 'plpgsql'

;]

seriously, the 8.4 version is written in C, so I will be really
surprised if it under performs plpgsql one .)

--
GJ

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Adam Ruth (#6)
Re: Converting each item in array to a query result row

Adam Ruth <adamruth@mac.com> writes:

Always test your performance assumptions. The plpgsql function is
faster than the sql function, a lot faster on smaller arrays.

And, of course, it also pays to be precise about what you're testing
and on what. Set-returning SQL functions got a lot faster in 8.4.
Using CVS HEAD on a not-very-fast machine, I get these timings for
the attached script (10000 loop iterations in all cases)

10 elements 100 elements 1000 elements

built-in unnest 2.44 6.52 47.96
SQL function 2.52 6.50 46.71
plpgsql function 3.63 12.47 101.68

So at least in this specific test condition, there's not much
perceptible difference between the SQL function and the builtin,
while plpgsql lags behind.

regards, tom lane

create or replace function testit(n int, l int) returns float8 as $$
declare arr int[];
st timestamptz;
et timestamptz;
begin
arr := '{}';
for i in 1 .. n loop
arr[i] = i;
end loop;
st := clock_timestamp();
for i in 1 .. l loop
perform count(*) from unnest(arr); -- or unnest_sql or unnest_plpgsql
end loop;
et := clock_timestamp();
return extract(epoch from et - st);
end $$ language plpgsql;

CREATE or replace FUNCTION unnest_sql(anyarray) RETURNS SETOF anyelement AS
$_$
SELECT ($1)[i] FROM generate_series(array_lower($1,1),array_upper($1,1)) i;
$_$
LANGUAGE sql IMMUTABLE;

create or replace function unnest_plpgsql(_a anyarray) returns setof
anyelement as $$
begin
for i in array_lower(_a,1) .. array_upper(_a,1) loop
return next _a[i];
end loop;
return;
end;
$$ language plpgsql strict immutable;

#12Adam Ruth
adamruth@mac.com
In reply to: Tom Lane (#11)
Re: Converting each item in array to a query result row

Good point, I should have specified 8.3.7.

Just one more reason to anxiously anticipate upgrading to 8.4.

On 30/05/2009, at 2:56 AM, Tom Lane wrote:

Show quoted text

Adam Ruth <adamruth@mac.com> writes:

Always test your performance assumptions. The plpgsql function is
faster than the sql function, a lot faster on smaller arrays.

And, of course, it also pays to be precise about what you're testing
and on what. Set-returning SQL functions got a lot faster in 8.4.
Using CVS HEAD on a not-very-fast machine, I get these timings for
the attached script (10000 loop iterations in all cases)

10 elements 100 elements 1000 elements

built-in unnest 2.44 6.52 47.96
SQL function 2.52 6.50 46.71
plpgsql function 3.63 12.47 101.68

So at least in this specific test condition, there's not much
perceptible difference between the SQL function and the builtin,
while plpgsql lags behind.

regards, tom lane

create or replace function testit(n int, l int) returns float8 as $$
declare arr int[];
st timestamptz;
et timestamptz;
begin
arr := '{}';
for i in 1 .. n loop
arr[i] = i;
end loop;
st := clock_timestamp();
for i in 1 .. l loop
perform count(*) from unnest(arr); -- or unnest_sql or
unnest_plpgsql
end loop;
et := clock_timestamp();
return extract(epoch from et - st);
end $$ language plpgsql;

CREATE or replace FUNCTION unnest_sql(anyarray) RETURNS SETOF
anyelement AS
$_$
SELECT ($1)[i] FROM
generate_series(array_lower($1,1),array_upper($1,1)) i;
$_$
LANGUAGE sql IMMUTABLE;

create or replace function unnest_plpgsql(_a anyarray) returns setof
anyelement as $$
begin
for i in array_lower(_a,1) .. array_upper(_a,1) loop
return next _a[i];
end loop;
return;
end;
$$ language plpgsql strict immutable;

#13Postgres User
postgres.developer@gmail.com
In reply to: Adam Ruth (#12)
Re: Converting each item in array to a query result row

Thanks for all the replies. I'm going to post the results of using
the recommended approach in another thread.

Show quoted text

On Fri, May 29, 2009 at 1:18 PM, Adam Ruth <adamruth@mac.com> wrote:

Good point, I should have specified 8.3.7.

Just one more reason to anxiously anticipate upgrading to 8.4.

On 30/05/2009, at 2:56 AM, Tom Lane wrote:

Adam Ruth <adamruth@mac.com> writes:

Always test your performance assumptions. The plpgsql function is
faster than the sql function, a lot faster on smaller arrays.

And, of course, it also pays to be precise about what you're testing
and on what.  Set-returning SQL functions got a lot faster in 8.4.
Using CVS HEAD on a not-very-fast machine, I get these timings for
the attached script (10000 loop iterations in all cases)

                       10 elements     100 elements    1000 elements

built-in unnest         2.44            6.52            47.96
SQL function            2.52            6.50            46.71
plpgsql function        3.63            12.47           101.68

So at least in this specific test condition, there's not much
perceptible difference between the SQL function and the builtin,
while plpgsql lags behind.

                       regards, tom lane

create or replace function testit(n int, l int) returns float8 as $$
declare arr int[];
 st timestamptz;
 et timestamptz;
begin
 arr := '{}';
 for i in 1 .. n loop
  arr[i] = i;
 end loop;
 st := clock_timestamp();
 for i in 1 .. l loop
  perform count(*) from unnest(arr); -- or unnest_sql or unnest_plpgsql
 end loop;
 et := clock_timestamp();
 return extract(epoch from et - st);
end $$ language plpgsql;

CREATE or replace FUNCTION unnest_sql(anyarray) RETURNS SETOF anyelement
AS
$_$
SELECT ($1)[i] FROM generate_series(array_lower($1,1),array_upper($1,1))
i;
$_$
  LANGUAGE sql IMMUTABLE;

create or replace function unnest_plpgsql(_a anyarray) returns setof
anyelement as $$
begin
       for i in array_lower(_a,1) .. array_upper(_a,1) loop
               return next _a[i];
       end loop;
       return;
end;
$$ language plpgsql strict immutable;