WIP: plpgsql - foreach in
Hello
attached patch contains a implementation of iteration over a array:
Regards
Pavel Stehule
Attachments:
foreach.difftext/x-patch; charset=US-ASCII; name=foreach.diffDownload+672-4
Greetings,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
attached patch contains a implementation of iteration over a array:
I've gone through this patch and, in general, it looks pretty reasonable
to me. There's a number of places where I think additional comments
would be good and maybe some variable name improvments. Also, my
changes should be reviewed to make sure they make sense.
Attached is a patch against master which includes my changes, and a
patch against Pavel's patch, so he can more easily see my changes and
include them if he'd like.
I'm going to mark this returned to author with feedback.
commit 30295015739930e68c33b29da4f7ef535bc293ea
Author: Stephen Frost <sfrost@snowman.net>
Date: Wed Jan 19 17:58:24 2011 -0500
Clean up foreach-in-array PL/PgSQL code/comments
Minor clean-up of the PL/PgSQL foreach-in-array patch, includes
some white-space cleanup, grammar fixes, additional errhint where
it makes sense, etc.
Also added a number of 'XXX' comments asking for clarification
and additional comments on what's happening in the code.
commit f1a02fe3a8fa84217dae32d5ba74e9764c77431c
Author: Stephen Frost <sfrost@snowman.net>
Date: Wed Jan 19 15:11:53 2011 -0500
PL/PgSQL - Add interate-over-array support
This patch adds support for iterating over an array in PL/PgSQL.
Patch Author: Pavel Stehule
Thanks,
Stephen
On Wed, Jan 19, 2011 at 6:04 PM, Stephen Frost <sfrost@snowman.net> wrote:
I'm going to mark this returned to author with feedback.
That implies you don't think it should be considered further for this
CommitFest. Perhaps you mean Waiting on Author?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote:
On Wed, Jan 19, 2011 at 6:04 PM, Stephen Frost <sfrost@snowman.net> wrote:
I'm going to mark this returned to author with feedback.
That implies you don't think it should be considered further for this
CommitFest. Perhaps you mean Waiting on Author?
I did, actually, and that's what I actually marked it as in the CF.
Sorry for any confusion. When I went to mark it in CF, I realized my
mistake.
Thanks,
Stephen
2011/1/20 Stephen Frost <sfrost@snowman.net>:
* Robert Haas (robertmhaas@gmail.com) wrote:
On Wed, Jan 19, 2011 at 6:04 PM, Stephen Frost <sfrost@snowman.net> wrote:
I'm going to mark this returned to author with feedback.
That implies you don't think it should be considered further for this
CommitFest. Perhaps you mean Waiting on Author?I did, actually, and that's what I actually marked it as in the CF.
Sorry for any confusion. When I went to mark it in CF, I realized my
mistake.
ok :), I'll look on it tomorrow.
regards
Pavel
Show quoted text
Thanks,
Stephen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)iEYEARECAAYFAk03ltkACgkQrzgMPqB3kihSmQCePy6+fpC7RJdki5guPRCLp5IZ
EJMAoIqgjb+IsG853/gC9T9xgFg5M5aM
=VLWh
-----END PGP SIGNATURE-----
Hello
I merge your changes and little enhanced comments.
Regards
Pavel Stehule
2011/1/20 Stephen Frost <sfrost@snowman.net>:
Show quoted text
Greetings,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
attached patch contains a implementation of iteration over a array:
I've gone through this patch and, in general, it looks pretty reasonable
to me. There's a number of places where I think additional comments
would be good and maybe some variable name improvments. Also, my
changes should be reviewed to make sure they make sense.Attached is a patch against master which includes my changes, and a
patch against Pavel's patch, so he can more easily see my changes and
include them if he'd like.I'm going to mark this returned to author with feedback.
commit 30295015739930e68c33b29da4f7ef535bc293ea
Author: Stephen Frost <sfrost@snowman.net>
Date: Wed Jan 19 17:58:24 2011 -0500Clean up foreach-in-array PL/PgSQL code/comments
Minor clean-up of the PL/PgSQL foreach-in-array patch, includes
some white-space cleanup, grammar fixes, additional errhint where
it makes sense, etc.Also added a number of 'XXX' comments asking for clarification
and additional comments on what's happening in the code.commit f1a02fe3a8fa84217dae32d5ba74e9764c77431c
Author: Stephen Frost <sfrost@snowman.net>
Date: Wed Jan 19 15:11:53 2011 -0500PL/PgSQL - Add interate-over-array support
This patch adds support for iterating over an array in PL/PgSQL.
Patch Author: Pavel Stehule
Thanks,
Stephen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)iEYEARECAAYFAk03bf8ACgkQrzgMPqB3kihxuwCfZYKFpEraRCIltlUeYtD9AyX0
tvoAnjuxddXhZB6w2/V9oVSD1+K7Idu9
=w38Z
-----END PGP SIGNATURE-----
Attachments:
foreach.difftext/x-patch; charset=US-ASCII; name=foreach.diffDownload+707-9
Pavel,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
I merge your changes and little enhanced comments.
Thanks. Reviewing this further-
Why are you using 'FOREACH' here instead of just making it another
variation of 'FOR'? What is 'FOUND' set to following this? I realize
that might make the code easier, but it really doesn't seem to make much
sense to me from a usability point of view.
There also appears to be some extra whitespace changes that aren't
necessary and a number of places where you don't follow the indentation
conventions (eg: variable definitions in exec_stmt_foreach_a()).
I'm still not really thrilled with how the checking for scalar vs.
array, etc, is handled. Additionally, what is this? :
else if (stmt->row != NULL)
{
ctrl_var = estate->datums[stmt->row->dno];
}
else
{
ctrl_var = estate->datums[stmt->rec->dno];
}
Other comments- I don't like using 'i' and 'j', you really should use
better variable names, especially in large loops which contain other
loops. I'd also suggest changing the outer loop to be equivilant to the
number of iterations that will be done instead of the number of items
and then to *not* update 'i' inside the inner-loop. That structure is
really just confusing, imv (I certainly didn't entirely follow what was
happening there the first time I read it). Isn't there a function you
could use to pull out the array slice you need on each iteration through
the array?
Thanks,
Stephen
On Sun, Jan 23, 2011 at 9:49 PM, Stephen Frost <sfrost@snowman.net> wrote:
Pavel,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
I merge your changes and little enhanced comments.
Thanks. Reviewing this further-
Why are you using 'FOREACH' here instead of just making it another
variation of 'FOR'?
Uh oh. You just reopened the can of worms from hell.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote:
On Sun, Jan 23, 2011 at 9:49 PM, Stephen Frost <sfrost@snowman.net> wrote:
Why are you using 'FOREACH' here instead of just making it another
variation of 'FOR'?Uh oh. You just reopened the can of worms from hell.
hahahaha. Apparently I missed that discussion; also wasn't linked off
the patch. :/ Guess I'll go poke through the archives... Struck me as
obviously wrong to invent something completely new for this, but..
Thanks,
Stephen
* Robert Haas (robertmhaas@gmail.com) wrote:
Uh oh. You just reopened the can of worms from hell.
Alright.. I'm missing what happened to this suggestion of using:
FOR var in ARRAY array_expression ...
I like that a lot more than inventing a new top-level keyword, for the
same reasons that Tom mentioned: using a different initial keyword
makes it awkward to make generic statements about "all types of FOR
loop" (as I noticed while looking through the documentation changes that
should be made for this); and also some of the other comments about how
FOREACH doesn't give you any clue that this is some
array-specific-FOR-loop-thingy.
Thanks,
Stephen
2011/1/24 Stephen Frost <sfrost@snowman.net>:
Pavel,
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
I merge your changes and little enhanced comments.
Thanks. Reviewing this further-
Why are you using 'FOREACH' here instead of just making it another
variation of 'FOR'? What is 'FOUND' set to following this? I realize
that might make the code easier, but it really doesn't seem to make much
sense to me from a usability point of view.
FOR keyword - please, look on thread about my proposal FOR-IN-ARRAY
I work with FOUND variable, because I like a consistent behave with
FOR statement. When FOUND is true after cycle, you are sure, so there
was a minimally one iteration.
There also appears to be some extra whitespace changes that aren't
necessary and a number of places where you don't follow the indentation
conventions (eg: variable definitions in exec_stmt_foreach_a()).
I am really not sure about correct indentation of variables :(, if you
know a correct number of spaces, please, fix it.
I'm still not really thrilled with how the checking for scalar vs.
array, etc, is handled. Additionally, what is this? :else if (stmt->row != NULL)
{
ctrl_var = estate->datums[stmt->row->dno];
}
else
{
ctrl_var = estate->datums[stmt->rec->dno];
}
PLpgSQL distinct between vars, row and record values. These structures
can be different and information about variable's offset in frame can
be on different position in structure. This IF means:
1) get offset of target variable
2) get addr, where is target variable saved in current frame
one note: a scalar or array can be saved on var type, only scalar can
be used on row or record type. This is often used pattern - you can
see it more time in executor.
Other comments- I don't like using 'i' and 'j', you really should use
better variable names, especially in large loops which contain other
loops. I'd also suggest changing the outer loop to be equivilant to the
number of iterations that will be done instead of the number of items
and then to *not* update 'i' inside the inner-loop. That structure is
really just confusing, imv (I certainly didn't entirely follow what was
happening there the first time I read it). Isn't there a function you
could use to pull out the array slice you need on each iteration through
the array?
I don't know a better short index identifiers than I used. But I am
not against to change.
I'll try to redesign main cycle.
Regards
Pavel Stehule
Show quoted text
Thanks,
Stephen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)iEYEARECAAYFAk086MEACgkQrzgMPqB3kigCzQCffx0iVSMjU2UbOgAOaY/MvtOp
iKsAnA5tdhKxTssdXJ+Rda4qkhNVm26g
=Yn5O
-----END PGP SIGNATURE-----
Hello
Other comments- I don't like using 'i' and 'j', you really should use
better variable names, especially in large loops which contain other
loops. I'd also suggest changing the outer loop to be equivilant to the
number of iterations that will be done instead of the number of items
and then to *not* update 'i' inside the inner-loop. That structure is
really just confusing, imv (I certainly didn't entirely follow what was
happening there the first time I read it). Isn't there a function you
could use to pull out the array slice you need on each iteration through
the array?
I looked on code again. There are a few results:
I'll change identifiers 'i' and 'j' with any, that you send. It's
usual identifiers for nested loops and in this case they has really
well known semantic - it's subscript of array.
But others changes are more difficult
we have to iterate over array's items because it allow seq. access to
array's data. I need a global index for function "array_get_isnull". I
can't to use a buildin functions like array_slize_size or
array_get_slice, because there is high overhead of array_seek
function. I redesigned the initial part of main cycle, but code is
little bit longer :(, but I hope, it is more readable.
Regards
Pavel
Show quoted text
I don't know a better short index identifiers than I used. But I am
not against to change.I'll try to redesign main cycle.
Regards
Pavel Stehule
Thanks,
Stephen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)iEYEARECAAYFAk086MEACgkQrzgMPqB3kigCzQCffx0iVSMjU2UbOgAOaY/MvtOp
iKsAnA5tdhKxTssdXJ+Rda4qkhNVm26g
=Yn5O
-----END PGP SIGNATURE-----
Attachments:
foreach.difftext/x-patch; charset=US-ASCII; name=foreach.diffDownload+374-6
2011/1/24 Pavel Stehule <pavel.stehule@gmail.com>:
Hello
Other comments- I don't like using 'i' and 'j', you really should use
better variable names, especially in large loops which contain other
loops. I'd also suggest changing the outer loop to be equivilant to the
number of iterations that will be done instead of the number of items
and then to *not* update 'i' inside the inner-loop. That structure is
really just confusing, imv (I certainly didn't entirely follow what was
happening there the first time I read it). Isn't there a function you
could use to pull out the array slice you need on each iteration through
the array?I looked on code again. There are a few results:
I'll change identifiers 'i' and 'j' with any, that you send. It's
usual identifiers for nested loops and in this case they has really
well known semantic - it's subscript of array.But others changes are more difficult
we have to iterate over array's items because it allow seq. access to
array's data. I need a global index for function "array_get_isnull". I
can't to use a buildin functions like array_slize_size or
array_get_slice, because there is high overhead of array_seek
function. I redesigned the initial part of main cycle, but code is
little bit longer :(, but I hope, it is more readable.
btw, having array_get_isnul accessible from contrib code is nice (I
used to copy/paste it for my own needs)
Regards
Pavel
I don't know a better short index identifiers than I used. But I am
not against to change.I'll try to redesign main cycle.
Regards
Pavel Stehule
Thanks,
Stephen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)iEYEARECAAYFAk086MEACgkQrzgMPqB3kigCzQCffx0iVSMjU2UbOgAOaY/MvtOp
iKsAnA5tdhKxTssdXJ+Rda4qkhNVm26g
=Yn5O
-----END PGP SIGNATURE-------
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
On Mon, Jan 24, 2011 at 20:10, Pavel Stehule <pavel.stehule@gmail.com> wrote:
we have to iterate over array's items because it allow seq. access to
array's data. I need a global index for function "array_get_isnull". I
can't to use a buildin functions like array_slize_size or
array_get_slice, because there is high overhead of array_seek
function. I redesigned the initial part of main cycle, but code is
little bit longer :(, but I hope, it is more readable.
The attached patch only includes changes in plpgsql. I tested it
combined with the previous one, that includes other directories.
* src/pl/plpgsql/src/gram.y
| for_variable K_SLICE ICONST
The parameter for SLICE must be an integer literal. Is it a design?
I got a syntax error when I wrote a function like below. However,
FOREACH returns different types on SLICE = 0 or SLICE >= 1.
(element type for 0, or array type for >=1) So, we cannot write
a true generic function that accepts all SLICE values even if
the syntax supports an expr for the parameter.
CREATE FUNCTION foreach_test(int[], int) RETURNS SETOF int[] AS
$$
DECLARE
a integer[];
BEGIN
FOREACH a SLICE $2 IN $1 LOOP -- syntax error
RETURN NEXT a;
END LOOP;
END;
$$ LANGUAGE plpgsql;
* /doc/src/sgml/plpgsql.sgml
+ FOREACH <replaceable>target</replaceable> <optional> SCALE
s/SCALE/SLICE/ ?
* Why do you use "foreach_a" for some identifiers?
We use "foreach" only for arrays, so "_a" suffix doesn't seem needed.
* accumArrayResult() for SLICE has a bit overhead.
If we want to optimize it, we could use memcpy() because slices are
placed in continuous memory. But I'm not sure the worth; I guess
FOREACH will be used with SLICE = 0 in many cases.
--
Itagaki Takahiro
2011/1/26 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
On Mon, Jan 24, 2011 at 20:10, Pavel Stehule <pavel.stehule@gmail.com> wrote:
we have to iterate over array's items because it allow seq. access to
array's data. I need a global index for function "array_get_isnull". I
can't to use a buildin functions like array_slize_size or
array_get_slice, because there is high overhead of array_seek
function. I redesigned the initial part of main cycle, but code is
little bit longer :(, but I hope, it is more readable.The attached patch only includes changes in plpgsql. I tested it
combined with the previous one, that includes other directories.* src/pl/plpgsql/src/gram.y
| for_variable K_SLICE ICONST
The parameter for SLICE must be an integer literal. Is it a design?
I got a syntax error when I wrote a function like below. However,
FOREACH returns different types on SLICE = 0 or SLICE >= 1.
(element type for 0, or array type for >=1) So, we cannot write
a true generic function that accepts all SLICE values even if
the syntax supports an expr for the parameter.
Yes, it is design. You wrote a reason. A parametrized SLICE needs only
a few lines more, but a) I really don't know a use case, b) there is
significant difference between slice 0 and slice 1
CREATE FUNCTION foreach_test(int[], int) RETURNS SETOF int[] AS
$$
DECLARE
a integer[];
BEGIN
FOREACH a SLICE $2 IN $1 LOOP -- syntax error
RETURN NEXT a;
END LOOP;
END;
$$ LANGUAGE plpgsql;* /doc/src/sgml/plpgsql.sgml
+ FOREACH <replaceable>target</replaceable> <optional> SCALE
s/SCALE/SLICE/ ?* Why do you use "foreach_a" for some identifiers?
We use "foreach" only for arrays, so "_a" suffix doesn't seem needed.
yes, it isn't needed. But FOREACH is a new construct, that can be used
for more different iterations - maybe iterations over hstore,
element's set, ...
so _a means - this is implementation for arrays.
* accumArrayResult() for SLICE has a bit overhead.
If we want to optimize it, we could use memcpy() because slices are
placed in continuous memory. But I'm not sure the worth; I guess
FOREACH will be used with SLICE = 0 in many cases.
I agree with you, so SLICE > 0 will not be used often.
accumArrayResult isn't expensive function - for non varlena types. The
cutting of some array needs a redundant code to CopyArrayEls and
construct_md_array. All core functions are not good for our purpose,
because doesn't expect a seq array reading :(. Next - simply copy can
be done only for arrays without null bitmap, else you have to do some
bitmap rotations :(. So, I don't would do this optimization in this
moment. It has sense for multidimensional numeric arrays, but can be
solved in next version. It's last commitfest now, and I don't do this
patch more complex then it is now.
Regards
Pavel Stehule
Show quoted text
--
Itagaki Takahiro
On Mon, Jan 24, 2011 at 13:05, Stephen Frost <sfrost@snowman.net> wrote:
FOR var in ARRAY array_expression ...
I like that a lot more than inventing a new top-level keyword,
AFAIR, the syntax is not good at an array literal.
FOR var IN ARRAY ARRAY[1,2,5] LOOP ...
And it was the only drawback compared with FOREACH var IN expr.
--
Itagaki Takahiro
* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
On Mon, Jan 24, 2011 at 13:05, Stephen Frost <sfrost@snowman.net> wrote:
FOR var in ARRAY array_expression ...
I like that a lot more than inventing a new top-level keyword,
AFAIR, the syntax is not good at an array literal.
FOR var IN ARRAY ARRAY[1,2,5] LOOP ...
I don't really see why that's "not good"? So you have 'ARRAY' twice..
So what? That's better than having a new top-level FOREACH that doesn't
have any reason to exist except to be different from FOR and to not do
the same thing..
Thanks,
Stephen
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
FOR keyword - please, look on thread about my proposal FOR-IN-ARRAY
I did, and I still don't agree w/ using FOREACH.
I work with FOUND variable, because I like a consistent behave with
FOR statement. When FOUND is true after cycle, you are sure, so there
was a minimally one iteration.
Then the documentation around FOUND needs to be updated for FOREACH, and
there's probably other places that need to be changed too.
There also appears to be some extra whitespace changes that aren't
necessary and a number of places where you don't follow the indentation
conventions (eg: variable definitions in exec_stmt_foreach_a()).I am really not sure about correct indentation of variables :(, if you
know a correct number of spaces, please, fix it.
It's not a matter of a 'correct number of space'- make it the same as
what it is in the rest of the code... The gist of it is to make the
variable names all line up (with maximum use of tabs at 4-spaces per
tab, of course):
int my_var;
char *my_string;
double my_double;
etc, etc.
I'll try to redesign main cycle.
Thanks,
Stephen
2011/1/29 Stephen Frost <sfrost@snowman.net>:
* Itagaki Takahiro (itagaki.takahiro@gmail.com) wrote:
On Mon, Jan 24, 2011 at 13:05, Stephen Frost <sfrost@snowman.net> wrote:
FOR var in ARRAY array_expression ...
I like that a lot more than inventing a new top-level keyword,
AFAIR, the syntax is not good at an array literal.
FOR var IN ARRAY ARRAY[1,2,5] LOOP ...I don't really see why that's "not good"? So you have 'ARRAY' twice..
So what? That's better than having a new top-level FOREACH that doesn't
have any reason to exist except to be different from FOR and to not do
the same thing..
I don't see a problem too, but we didn't find a compromise with this
syntax, so I left it. It is true, so current implementation of FOR
stmt is really baroque and next argument is a compatibility with
PL/SQL. My idea is so FOR stmt will be a compatible with PL/SQL
original, and FOREACH can be a platform for PostgreSQL specific code.
Regards
Pavel
Show quoted text
Thanks,
Stephen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)iEYEARECAAYFAk1D/u8ACgkQrzgMPqB3kij2IwCfZ3W+mGc7LedBdnt9lCa0vYjk
m6QAn0xh7r6oTs+T47k+EuwZRpU2T0X8
=ruBa
-----END PGP SIGNATURE-----
I'll try to redesign main cycle.
Thanks,
please, can you look on code that I sent last time?
Pavel
Show quoted text
Stephen
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)iEUEARECAAYFAk1EAJwACgkQrzgMPqB3kig5bACdH0fm8Klh7Dq1GlICV/Z8yEd4
KQoAlRZEeTrBkKK6TwjZrKmFDDeRfKE=
=JPG4
-----END PGP SIGNATURE-----