final patch - plpgsql: for-in-array
Hello
this patch implement a new iteration construct - iteration over an
array. The sense of this new iteration is:
* a simple and cleaner syntax
* a faster execution - this bring down a number of detoast operations
create or replace function subscripts(anyarray, int)
returns int[] as $$
select array(select generate_subscripts($1,$2));
$$ language sql;
create or replace function fora_test()
returns int as $$
declare x int; s int = 0;
a int[] := array[1,2,3,4,5,6,7,8,9,10];
begin
for x in array subscripts(a, 1)
loop
s := s + a[x];
end loop;
return s;
end;
$$ language plpgsql;
create or replace function fora_test()
returns int as $$
declare x int; s int = 0;
begin
for x in array array[1,2,3,4,5,6,7,8,9,10]
loop
s := s + x;
end loop;
return s;
end;
$$ language plpgsql;
create or replace function fora_test()
returns int as $$
declare x int; y int;
a fora_point[] := array[(1,2),(3,4),(5,6)];
begin
for x, y in array a
loop
raise notice 'point=%,%', x, y;
end loop;
return 0;
end;
$$ language plpgsql;
Regards
Pavel Stehule
Attachments:
for-in-arrayapplication/octet-stream; name=for-in-arrayDownload+571-74
On Thu, Sep 30, 2010 at 7:46 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hello
this patch implement a new iteration construct - iteration over an
array. The sense of this new iteration is:
* a simple and cleaner syntax
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?
FOR var IN UNNEST array_expr LOOP
END LOOP
i like this because:
1) is cleaner when array_expr is ARRAY[1,2,3]
2) is not legal now to use the unnest() function without a SELECT in
the context of a FOR loop (or am i missing something?)
3) the unnest() function does the same so seems intuitive what a
FOR-IN-UNNEST do
what i don't know if is this syntax could co-exist with the unnest() function?
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
On Thu, Sep 30, 2010 at 7:46 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hello
this patch implement a new iteration construct - iteration over an
array. The sense of this new iteration is:
* a simple and cleaner syntaxi will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?
Does for-in-array do what unnset does? unnest() flattens the whole
array into scalars irregardless of dimensions:
select unnest(array[array[1,2],array[3,4]]);
unnest
--------
1
2
3
4
If yes, then +1 (unless there is some other problem) otherwise -1.
merlin
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?
Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replace
for x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".
regards, tom lane
2010/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".
this patch is semantically equal to SELECT unnest(..), but it is
evaluated as simple expression and does directly array unpacking and
iteration, - so it means this fragment is significantly >>faster<<.
Regards
Pavel Stehule
Show quoted text
regards, tom lane
2010/11/18 Pavel Stehule <pavel.stehule@gmail.com>:
2010/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".this patch is semantically equal to SELECT unnest(..), but it is
evaluated as simple expression and does directly array unpacking and
iteration, - so it means this fragment is significantly >>faster<<.
Did you implement a method to be able to walk the array and detoast
only the current needed data ?
(I wonder because I have something like that in that garage : select
array_filter(foo,'like','%bar%',10); where 10 is the limit and can be
avoided, foo is the array, like is callback function, '%bar%' the
parameter for the callback function for filtering results.)
It will make my toy in the garage a fast race car (and probably doable
in (plpg)SQL instead of C) ...
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
2010/11/18 Cédric Villemain <cedric.villemain.debian@gmail.com>:
2010/11/18 Pavel Stehule <pavel.stehule@gmail.com>:
2010/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".this patch is semantically equal to SELECT unnest(..), but it is
evaluated as simple expression and does directly array unpacking and
iteration, - so it means this fragment is significantly >>faster<<.Did you implement a method to be able to walk the array and detoast
only the current needed data ?
not only - iteration over array can help with readability but a
general work with SRF (set returning functions is more harder and
slower) - so special loop statement can to safe a some toast op / when
you use a large array and access via index, or can to safe a some work
with memory, because there isn't necessary convert array to set of
tuples. Please, recheck these tests.
test:
CREATE OR REPLACE FUNCTION rndstr() RETURNS text AS $$select
array_to_string(array(select substring('ABCDEFGHIJKLMNOPQ' FROM
(random()*16)::int FOR 1) from generate_series(1,10)),'')$$ LANGUAGE
sql;
create or replace function rndarray(int) returns text[] as $$select
array(select rndstr() from generate_series(1,$1)) $$ language sql;
create table t10(x text[]);
insert into t10 select rndarray(10) from generate_series(1,10000);
create table t100(x text[]);
insert into t100 select rndarray(100) from generate_series(1,10000);
create table t1000(x text[]);
insert into t1000 select rndarray(1000) from generate_series(1,10000);
CREATE OR REPLACE FUNCTION public.filter(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0;
v text;
BEGIN
FOR v IN ARRAY $1
LOOP
EXIT WHEN l = $3;
IF v LIKE $2 THEN
s := s || v;
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$;
postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)
Time: 393.649 ms
postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)
Time: 2804.502 ms
postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t1000;
avg
---------------------
10.0000000000000000
(1 row)
Time: 9729.994 ms
CREATE OR REPLACE FUNCTION public.filter01(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0;
v text;
BEGIN
FOR v IN SELECT UNNEST($1)
LOOP
EXIT WHEN l = $3;
IF v LIKE $2 THEN
s := s || v;
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$;
postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)
Time: 795.383 ms
postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)
Time: 3848.258 ms
postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t1000;
avg
---------------------
10.0000000000000000
(1 row)
Time: 12366.093 ms
The iteration via specialized FOR IN ARRAY is about 25-30% faster than
FOR IN SELECT UNNEST
postgres=# CREATE OR REPLACE FUNCTION public.filter02(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0; i int;
v text;
BEGIN
FOR i IN array_lower($1,1)..array_upper($1,1)
LOOP
EXIT WHEN l = $3;
IF $1[i] LIKE $2 THEN
s := s || $1[i];
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$
;
postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)
Time: 414.960 ms
postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)
Time: 3460.970 ms
there FOR IN ARRAY is faster about 30% then access per index
for T1000 I had to cancel over 1 minute!!!!
(I wonder because I have something like that in that garage : select
array_filter(foo,'like','%bar%',10); where 10 is the limit and can be
avoided, foo is the array, like is callback function, '%bar%' the
parameter for the callback function for filtering results.)It will make my toy in the garage a fast race car (and probably doable
in (plpg)SQL instead of C) ...
it can help with reading of array. But it doesn't help with array
updating :(. For large arrays it can be slow too.
Regards
Pavel Stehule
Show quoted text
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".
Pavel's performance argument is imnsho valid. arrays at present are
the best way to pass data around functions and any optimizations here
are very welcome. Given that, is there any way to mitigate your
concerns on the syntax side?
merlin
On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".Pavel's performance argument is imnsho valid. arrays at present are
the best way to pass data around functions and any optimizations here
are very welcome. Given that, is there any way to mitigate your
concerns on the syntax side?
Can we get the performance benefit any other way? I hate to think
that it will still be slow for people using the already-supported
syntax.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2010/11/18 Pavel Stehule <pavel.stehule@gmail.com>:
2010/11/18 Cédric Villemain <cedric.villemain.debian@gmail.com>:
2010/11/18 Pavel Stehule <pavel.stehule@gmail.com>:
2010/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".this patch is semantically equal to SELECT unnest(..), but it is
evaluated as simple expression and does directly array unpacking and
iteration, - so it means this fragment is significantly >>faster<<.Did you implement a method to be able to walk the array and detoast
only the current needed data ?not only - iteration over array can help with readability but a
general work with SRF (set returning functions is more harder and
slower) - so special loop statement can to safe a some toast op / when
you use a large array and access via index, or can to safe a some work
with memory, because there isn't necessary convert array to set of
tuples. Please, recheck these tests.test:
CREATE OR REPLACE FUNCTION rndstr() RETURNS text AS $$select
array_to_string(array(select substring('ABCDEFGHIJKLMNOPQ' FROM
(random()*16)::int FOR 1) from generate_series(1,10)),'')$$ LANGUAGE
sql;create or replace function rndarray(int) returns text[] as $$select
array(select rndstr() from generate_series(1,$1)) $$ language sql;create table t10(x text[]);
insert into t10 select rndarray(10) from generate_series(1,10000);
create table t100(x text[]);
insert into t100 select rndarray(100) from generate_series(1,10000);
create table t1000(x text[]);
insert into t1000 select rndarray(1000) from generate_series(1,10000);CREATE OR REPLACE FUNCTION public.filter(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0;
v text;
BEGIN
FOR v IN ARRAY $1
LOOP
EXIT WHEN l = $3;
IF v LIKE $2 THEN
s := s || v;
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$;postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)Time: 393.649 ms
postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)Time: 2804.502 ms
postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t1000;
avg
---------------------
10.0000000000000000
(1 row)Time: 9729.994 ms
CREATE OR REPLACE FUNCTION public.filter01(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0;
v text;
BEGIN
FOR v IN SELECT UNNEST($1)
LOOP
EXIT WHEN l = $3;
IF v LIKE $2 THEN
s := s || v;
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$;postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)Time: 795.383 ms
postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)Time: 3848.258 ms
postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t1000;
avg
---------------------
10.0000000000000000
(1 row)Time: 12366.093 ms
The iteration via specialized FOR IN ARRAY is about 25-30% faster than
FOR IN SELECT UNNESTpostgres=# CREATE OR REPLACE FUNCTION public.filter02(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0; i int;
v text;
BEGIN
FOR i IN array_lower($1,1)..array_upper($1,1)
LOOP
EXIT WHEN l = $3;
IF $1[i] LIKE $2 THEN
s := s || $1[i];
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$
;postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)Time: 414.960 ms
postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)Time: 3460.970 ms
there FOR IN ARRAY is faster about 30% then access per index
for T1000 I had to cancel over 1 minute!!!!
I can't test until this week-end. But I will.
(I wonder because I have something like that in that garage : select
array_filter(foo,'like','%bar%',10); where 10 is the limit and can be
avoided, foo is the array, like is callback function, '%bar%' the
parameter for the callback function for filtering results.)It will make my toy in the garage a fast race car (and probably doable
in (plpg)SQL instead of C) ...it can help with reading of array. But it doesn't help with array
updating :(. For large arrays it can be slow too.
select fast is already a good job, thank you.
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
2010/11/18 Robert Haas <robertmhaas@gmail.com>:
On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".Pavel's performance argument is imnsho valid. arrays at present are
the best way to pass data around functions and any optimizations here
are very welcome. Given that, is there any way to mitigate your
concerns on the syntax side?Can we get the performance benefit any other way? I hate to think
that it will still be slow for people using the already-supported
syntax.
If you are able to make unnest() outputting 1st row without detoasting
last field.
I think if we have :
#define DatumGetTextPSlice(X,m,n) ((text *) PG_DETOAST_DATUM_SLICE(X,m,n))
but for array, most is done
Pavel, am I correct ?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
2010/11/18 Robert Haas <robertmhaas@gmail.com>:
On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".Pavel's performance argument is imnsho valid. arrays at present are
the best way to pass data around functions and any optimizations here
are very welcome. Given that, is there any way to mitigate your
concerns on the syntax side?Can we get the performance benefit any other way? I hate to think
that it will still be slow for people using the already-supported
syntax.
I afraid so other ways are more difficult with deeper impacts on
plpgsql executor.
what is a slow:
a) repeated detoasting - access with subscripts - maybe detoasted
values can be cached?
b) evaluation of SRF expression - maybe call of SRF function can be
simple expression,
c) faster evaluation ro query
The most important is @a. Only a few people uses a FOR IN SELECT
UNNEST form now. Probably not optimizable on PLpgSQL level is form FOR
IN SELECT * FROM UNNEST.
FOR IN ARRAY doesn't impacts on expression executing - this patch is
just simple and isolated.
Pavel
Show quoted text
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2010/11/18 Cédric Villemain <cedric.villemain.debian@gmail.com>:
2010/11/18 Robert Haas <robertmhaas@gmail.com>:
On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".Pavel's performance argument is imnsho valid. arrays at present are
the best way to pass data around functions and any optimizations here
are very welcome. Given that, is there any way to mitigate your
concerns on the syntax side?Can we get the performance benefit any other way? I hate to think
that it will still be slow for people using the already-supported
syntax.If you are able to make unnest() outputting 1st row without detoasting
last field.I think if we have :
#define DatumGetTextPSlice(X,m,n) ((text *) PG_DETOAST_DATUM_SLICE(X,m,n))
but for array, most is donePavel, am I correct ?
yes, it can help, but still if you iterate over complete array, you
have to do n - detoast ops.
Pavel
Show quoted text
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
On 11/18/2010 10:33 AM, Robert Haas wrote:
On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure<mmoncure@gmail.com> wrote:
Pavel's performance argument is imnsho valid. arrays at present are
the best way to pass data around functions and any optimizations here
are very welcome. Given that, is there any way to mitigate your
concerns on the syntax side?Can we get the performance benefit any other way? I hate to think
that it will still be slow for people using the already-supported
syntax.
It's not disastrously slower. AFAICT from a very quick glance over the
patch, he's getting the speedup by bypassing the normal mechanism for
evaluating "for x in select ...". So we'd have to special-case that to
trap calls to unnest in the general form. That would be pretty ugly. Or
else make unnest and SPI faster. But that's a much bigger project.
Syntactic sugar is not entirely to be despised, anyway.
cheers
andrew
2010/11/18 Pavel Stehule <pavel.stehule@gmail.com>:
2010/11/18 Cédric Villemain <cedric.villemain.debian@gmail.com>:
2010/11/18 Robert Haas <robertmhaas@gmail.com>:
On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Merlin Moncure <mmoncure@gmail.com> writes:
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?Does for-in-array do what unnset does?
Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replacefor x in select unnest(array_value) loop
with
for x in unnest array_value loop
(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".Pavel's performance argument is imnsho valid. arrays at present are
the best way to pass data around functions and any optimizations here
are very welcome. Given that, is there any way to mitigate your
concerns on the syntax side?Can we get the performance benefit any other way? I hate to think
that it will still be slow for people using the already-supported
syntax.If you are able to make unnest() outputting 1st row without detoasting
last field.I think if we have :
#define DatumGetTextPSlice(X,m,n) ((text *) PG_DETOAST_DATUM_SLICE(X,m,n))
but for array, most is donePavel, am I correct ?
yes, it can help, but still if you iterate over complete array, you
have to do n - detoast ops.
sure.
Pavel
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yes, which begs the question of why bother at all.
Pavel's performance argument is imnsho valid.
Well, that argument is unsupported by any evidence, so far as I've seen.
More to the point, if there is indeed an interesting performance win
here, we could get the same win by internally optimizing the existing
syntax. That would provide the benefit to existing code not just
new code; and it would avoid foreclosing our future options for
extending FOR in not-so-redundant ways.
regards, tom lane
2010/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
Merlin Moncure <mmoncure@gmail.com> writes:
On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yes, which begs the question of why bother at all.
Pavel's performance argument is imnsho valid.
Well, that argument is unsupported by any evidence, so far as I've seen.
More to the point, if there is indeed an interesting performance win
here, we could get the same win by internally optimizing the existing
syntax. That would provide the benefit to existing code not just
new code; and it would avoid foreclosing our future options for
extending FOR in not-so-redundant ways.
sorry, but I don't agree. I don't think, so there are some big space
for optimizing - and if then it means much more code complexity for
current expr executor. Next - FOR IN ARRAY takes fields from array on
request - and it is possible, because a unpacking of array is
controlled by statement - it's impossible do same when unpacking is
inside other functions with same effectivity.
Regards
Pavel Stehule
Show quoted text
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
2010/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
More to the point, if there is indeed an interesting performance win
here, we could get the same win by internally optimizing the existing
syntax.
sorry, but I don't agree. I don't think, so there are some big space
for optimizing - and if then it means much more code complexity for
current expr executor. Next - FOR IN ARRAY takes fields from array on
request - and it is possible, because a unpacking of array is
controlled by statement - it's impossible do same when unpacking is
inside other functions with same effectivity.
Just because you haven't thought about it doesn't mean it's impossible
or impractical.
The first implementation I was thinking of would involve looking at the
SELECT querytree after parsing to see if it's "SELECT UNNEST(something)"
--- that is, empty jointree and so on, single tlist item that is an
invocation of the unnest() function. If it is, you pull out the
argument expression of unnest() and go from there, with exactly the same
execution behavior as in the specialized-syntax patch. This is
perfectly safe if you identify the array_unnest function by OID: since
it's a built-in function you know exactly what it's supposed to do.
But having said that, it's still not apparent to me that array_unnest
itself is markedly slower than what you could hope to do in plpgsql.
I think the real issue here is that plpgsql's simple-expression code
can't be used with set-returning expressions, which means that we have
to go through the vastly more expensive SPI code path. But that
restriction is largely based on fear of re-using expression trees, which
is something we fixed a few weeks ago. I think that it would now be
interesting to look at whether "FOR x IN SELECT simple-expression" could
use the simple-expression code even when the expression returns set.
That approach might bring a useful speedup not just for unnest, but for
many other use-cases as well.
regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes:
Syntactic sugar is not entirely to be despised, anyway.
If it were harmless syntactic sugar I wouldn't be objecting so loudly.
The problem here is that FOR is a syntactic choke point: it's already
overloaded with several different sub-syntaxes that are quite difficult
to separate. Adding another one makes that worse, with the consequences
that we might misinterpret the user's intent, leading either to
misleading/unhelpful error messages or unexpected runtime behavior.
If you consult the archives you can find numerous past instances of
complaints from people who hit such problems with the existing set of
FOR sub-syntaxes, so this isn't hypothetical.
I'm also quite afraid of having a conflict with other future extensions
of FOR, which we might want to introduce either on our own hook or for
compatibility with something Oracle might add to PL/SQL in future.
This might not be the worst place in the entire system to be introducing
inessential syntactic sugar, but it's certainly one of the top ten.
I would *much* rather we get the performance benefit by internal
optimization, and forego inventing syntax.
regards, tom lane
2010/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
Pavel Stehule <pavel.stehule@gmail.com> writes:
2010/11/18 Tom Lane <tgl@sss.pgh.pa.us>:
More to the point, if there is indeed an interesting performance win
here, we could get the same win by internally optimizing the existing
syntax.sorry, but I don't agree. I don't think, so there are some big space
for optimizing - and if then it means much more code complexity for
current expr executor. Next - FOR IN ARRAY takes fields from array on
request - and it is possible, because a unpacking of array is
controlled by statement - it's impossible do same when unpacking is
inside other functions with same effectivity.Just because you haven't thought about it doesn't mean it's impossible
or impractical.The first implementation I was thinking of would involve looking at the SELECT querytree after parsing to see if it's "SELECT UNNEST(something)" --- that is, empty jointree and so on, single tlist item that is an invocation of the unnest() function. If it is, you pull out the argument expression of unnest() and go from there, with exactly the same execution behavior as in the specialized-syntax patch. This is perfectly safe if you identify the array_unnest function by OID: since it's a built-in function you know exactly what it's supposed to do.
this additional control will do slow down for any expression - more -
somebody can use a form: SELECT FROM unnest(), and this form will be
slower.
But having said that, it's still not apparent to me that array_unnest
itself is markedly slower than what you could hope to do in plpgsql.
I think the real issue here is that plpgsql's simple-expression code
can't be used with set-returning expressions, which means that we have
to go through the vastly more expensive SPI code path. But that
restriction is largely based on fear of re-using expression trees, which
is something we fixed a few weeks ago. I think that it would now be
interesting to look at whether "FOR x IN SELECT simple-expression" could
use the simple-expression code even when the expression returns set.
That approach might bring a useful speedup not just for unnest, but for
many other use-cases as well.
any SRF call must not be faster then direct access to array. There is
overhead with tuples.
I don't understand you. Sorry. I don't think, so your objections are objective.
Regards
Pavel
Show quoted text
regards, tom lane