proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Started by Pavel Stehuleabout 15 years ago33 messages

pavel.stehule@gmail.com

about 15 years ago

1 attachment(s)

Hello

I am resending a redesigned proposal about special plpgsql statement
that support iteration over an array.

The most conflict issue of last proposal was a syntax. It enhanced
relative complex FOR statement. So now, it's based on new statement
with simple syntax. We can use a keyword FOREACH, this isn't in
conflict with PL/SQL - use a keyword FORALL and it isn't in conflict
with SQL/PSM too. More - this special statement can be used for
PostgreSQL's specific purposes. It can carry a new features in future.
The design of proposed functionality is simple, but respects a
possibility for enhancing a FOREACH cycle for future.

==proposed syntax:==

[ <<label>> ]
FOREACH var [, var [..]] IN ARRAY expr
LOOP
...
END LOOP [ label ]

==the goals:==
* cleaner syntax for full iteration over array
* reduce a overhead from only seq. access to any field in array
(it's not too significant)
* simplify iteration over multidimensional arrays

The most performance issue of access to a untoasted array is "solved"
with other patch.

== Iteration over multidimensional arrays ==
Its designed to reduce one dimension from source array. It can remove
a slicing and simplify code:

CREATE OR REPLACE FUNCTION public.fa(anyarray)
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE i int[];
BEGIN
FOREACH i IN ARRAY $1
LOOP
RAISE NOTICE '%', i;
END LOOP;
END;
$function$

postgres=# select fa(array[[[1,2],[3,4]],[[1,2],[3,4]],[[5,6],[7,8]]]);
NOTICE: {{1,2},{3,4}}
NOTICE: {{1,2},{3,4}}
NOTICE: {{5,6},{7,8}}
fa
----

(1 row)

postgres=# select fa(array[[1,2,3,4],[1,2,3,4],[5,6,7,8]]);
NOTICE: {1,2,3,4}
NOTICE: {1,2,3,4}
NOTICE: {5,6,7,8}
fa
----

(1 row)

ideas, notes?

Regards

Pavel

Attachments:

foreach-in-array.difftext/x-patch; charset=US-ASCII; name=foreach-in-array.diffDownload

*** ./plpgsql/src/gram.y.orig	2010-12-16 09:10:44.464254014 +0100
--- ./plpgsql/src/gram.y	2010-12-16 10:24:45.620740197 +0100
***************
*** 190,196 ****
  %type <stmt>	stmt_return stmt_raise stmt_execsql
  %type <stmt>	stmt_dynexecute stmt_for stmt_perform stmt_getdiag
  %type <stmt>	stmt_open stmt_fetch stmt_move stmt_close stmt_null
! %type <stmt>	stmt_case
  
  %type <list>	proc_exceptions
  %type <exception_block> exception_sect
--- 190,196 ----
  %type <stmt>	stmt_return stmt_raise stmt_execsql
  %type <stmt>	stmt_dynexecute stmt_for stmt_perform stmt_getdiag
  %type <stmt>	stmt_open stmt_fetch stmt_move stmt_close stmt_null
! %type <stmt>	stmt_case stmt_foreach_a
  
  %type <list>	proc_exceptions
  %type <exception_block> exception_sect
***************
*** 239,244 ****
--- 239,245 ----
  %token <keyword>	K_ABSOLUTE
  %token <keyword>	K_ALIAS
  %token <keyword>	K_ALL
+ %token <keyword>	K_ARRAY
  %token <keyword>	K_BACKWARD
  %token <keyword>	K_BEGIN
  %token <keyword>	K_BY
***************
*** 264,269 ****
--- 265,271 ----
  %token <keyword>	K_FETCH
  %token <keyword>	K_FIRST
  %token <keyword>	K_FOR
+ %token <keyword>	K_FOREACH
  %token <keyword>	K_FORWARD
  %token <keyword>	K_FROM
  %token <keyword>	K_GET
***************
*** 739,744 ****
--- 741,748 ----
  						{ $$ = $1; }
  				| stmt_for
  						{ $$ = $1; }
+ 				| stmt_foreach_a
+ 						{ $$ = $1; }
  				| stmt_exit
  						{ $$ = $1; }
  				| stmt_return
***************
*** 1386,1391 ****
--- 1390,1434 ----
  					}
  				;
  
+ stmt_foreach_a		: opt_block_label K_FOREACH for_variable K_IN K_ARRAY expr_until_loop loop_body
+ 					{
+ 						PLpgSQL_stmt_foreach_a *new = palloc0(sizeof(PLpgSQL_stmt_foreach_a));
+ 						new->cmd_type = PLPGSQL_STMT_FOREACH_A;
+ 						new->lineno = plpgsql_location_to_lineno(@2);
+ 						new->label = $1;
+ 						new->expr = $6;
+ 						new->body = $7.stmts;
+ 
+ 						if ($3.rec)
+ 						{
+ 							new->rec = $3.rec;
+ 							check_assignable((PLpgSQL_datum *) new->rec, @3);
+ 						}
+ 						else if ($3.row)
+ 						{
+ 							new->row = $3.row;
+ 							check_assignable((PLpgSQL_datum *) new->row, @3);
+ 						}
+ 						else if ($3.scalar)
+ 						{
+ 							Assert($3.scalar->dtype == PLPGSQL_DTYPE_VAR);
+ 							new->var = (PLpgSQL_var *) $3.scalar;
+ 							check_assignable($3.scalar, @3);
+ 
+ 						}
+ 						else
+ 						{
+ 							ereport(ERROR,
+ 									(errcode(ERRCODE_SYNTAX_ERROR),
+ 									 errmsg("loop variable of loop over arrat must be a record, row variable, scalar variable or list of scalar variables"),
+ 											 parser_errposition(@3)));
+ 						}
+ 
+ 						check_labels($1, $7.end_label, $7.end_label_location);
+ 						$$ = (PLpgSQL_stmt *) new;
+ 					}
+ 				;
+ 
  stmt_exit		: exit_type opt_label opt_exitcond
  					{
  						PLpgSQL_stmt_exit *new;
***************
*** 2035,2040 ****
--- 2078,2084 ----
  unreserved_keyword	:
  				K_ABSOLUTE
  				| K_ALIAS
+ 				| K_ARRAY
  				| K_BACKWARD
  				| K_CONSTANT
  				| K_CURSOR
*** ./plpgsql/src/pl_exec.c.orig	2010-12-16 10:25:37.285549156 +0100
--- ./plpgsql/src/pl_exec.c	2010-12-16 13:22:34.123447850 +0100
***************
*** 107,112 ****
--- 107,114 ----
  			   PLpgSQL_stmt_fors *stmt);
  static int exec_stmt_forc(PLpgSQL_execstate *estate,
  			   PLpgSQL_stmt_forc *stmt);
+ static int exec_stmt_foreach_a(PLpgSQL_execstate *estate,
+ 				    PLpgSQL_stmt_foreach_a *stmt);
  static int exec_stmt_open(PLpgSQL_execstate *estate,
  			   PLpgSQL_stmt_open *stmt);
  static int exec_stmt_fetch(PLpgSQL_execstate *estate,
***************
*** 1312,1317 ****
--- 1314,1323 ----
  			rc = exec_stmt_forc(estate, (PLpgSQL_stmt_forc *) stmt);
  			break;
  
+ 		case PLPGSQL_STMT_FOREACH_A:
+ 			rc = exec_stmt_foreach_a(estate, (PLpgSQL_stmt_foreach_a *) stmt);
+ 			break;
+ 
  		case PLPGSQL_STMT_EXIT:
  			rc = exec_stmt_exit(estate, (PLpgSQL_stmt_exit *) stmt);
  			break;
***************
*** 2028,2033 ****
--- 2034,2240 ----
  
  
  /* ----------
+  * exec_stmt_foreach_a			Implements loop over array
+  *
+  * ----------
+  */
+ static int 
+ exec_stmt_foreach_a(PLpgSQL_execstate *estate, PLpgSQL_stmt_foreach_a *stmt)
+ {
+ 	Datum		value;
+ 	bool		isnull;
+ 	Oid			valtype;
+ 	int	numelems = 0;
+ 	Oid 	array_typelem;
+ 	int	idx;
+ 	ArrayType	*arr;
+ 	char *ptr;
+ 	bits8	*arraynullsptr;
+ 	int16	elmlen;
+ 	bool	elmbyval;
+ 	char	elmalign;
+ 	PLpgSQL_datum *ctrl_var;
+ 	bool		found = false;
+ 	int			rc = PLPGSQL_RC_OK;
+ 	int		nitems = 1;
+ 
+ 	/* get a result of array_expr */
+ 	value = exec_eval_expr(estate, stmt->expr, &isnull, &valtype);
+ 	if (isnull)
+ 		ereport(ERROR,
+ 		 				(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ 				 errmsg("NULL value isn't allowed as parameter of FOREACH-IN-ARRAY")));
+ 
+ 	/* check a result of expression - must be a array */
+ 	array_typelem = get_element_type(valtype);
+ 	
+ 	if (!OidIsValid(array_typelem))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATATYPE_MISMATCH),
+ 				 errmsg("result of expression isn't array"),
+ 				 errdetail("result of expression is %s", 
+ 									format_type_be(valtype))));
+  
+ 	/* copy a result and takes some infos */
+ 	arr = DatumGetArrayTypePCopy(value);
+ 	numelems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr));
+ 	ptr = ARR_DATA_PTR(arr);
+ 	arraynullsptr = ARR_NULLBITMAP(arr);
+ 	get_typlenbyvalalign(ARR_ELEMTYPE(arr),
+ 						&elmlen,
+ 						&elmbyval,
+ 						&elmalign);
+ 
+ 	/* clean a stack */
+ 	exec_eval_cleanup(estate);
+ 
+ 	/* get a target variable */
+ 	if (stmt->var != NULL)
+ 	{
+ 		if (ARR_NDIM(arr) > 1)
+ 		{
+ 			nitems = ArrayGetNItems(ARR_NDIM(arr) - 1, ARR_DIMS(arr) + 1);;
+ 		}
+ 
+ 		ctrl_var = estate->datums[stmt->var->dno];
+ 	}
+ 	else if (stmt->row != NULL)
+ 	{
+ 		if (ARR_NDIM(arr) > 1)
+ 			elog(ERROR, "array must not be assigned to row variable");
+ 
+ 		ctrl_var = estate->datums[stmt->row->dno];
+ 	}
+ 	else 
+ 	{
+ 		if (ARR_NDIM(arr) > 1)
+ 			elog(ERROR, "array must not be assigned to record variable");
+ 
+ 		ctrl_var = estate->datums[stmt->rec->dno];
+ 	}
+ 
+ 	/*
+ 	 * Now do the loop
+ 	 */
+ 	idx = 0;
+ 	while (idx < numelems)
+ 	{
+ 		int		j;
+ 		ArrayBuildState *astate = NULL;
+ 
+ 		found = true;				/* looped at least once */
+ 
+ 		for (j = 0; j < nitems; j++)
+ 		{
+ 			if (arraynullsptr != NULL && !(arraynullsptr[idx / 8] & (1 << (idx % 8))))
+ 			{
+ 				isnull = true;
+ 				value = (Datum) 0;
+ 			}
+ 			else
+ 			{
+ 				isnull = false;
+ 				value = fetch_att(ptr, elmbyval, elmlen);
+ 
+ 				ptr = att_addlength_pointer(ptr, elmlen, ptr);
+ 				ptr = (char *) att_align_nominal(ptr, elmalign);
+ 			}
+ 			
+ 			if (nitems > 1)
+ 			{
+ 				astate = accumArrayResult(astate, value, isnull, 
+ 									array_typelem, 
+ 									CurrentMemoryContext);
+ 			}
+ 			idx++;
+ 		}
+ 
+ 		/* 
+ 		 * store a item to variable - we expecting so almost all
+ 		 * iteration will be over scalar values, so it is reason
+ 		 * why we don't create a fake tuple over scalar and we 
+ 		 * don't use a exec_move_row for scalars. This is about 
+ 		 * four times faster.
+ 		 */
+ 		if (astate == NULL)
+ 			exec_assign_value(estate, ctrl_var,
+ 						    value, array_typelem, &isnull);
+ 		else
+ 		{
+ 			Datum x;
+ 			bool isnull = false;
+ 			
+ 			x = makeMdArrayResult(astate, ARR_NDIM(arr) - 1, ARR_DIMS(arr) + 1, ARR_LBOUND(arr) ? ARR_LBOUND(arr) + 1 : NULL,
+ 										    CurrentMemoryContext, true);
+ 			exec_assign_value(estate, ctrl_var, x, valtype, &isnull);
+ 		}
+ 
+ 		/*
+ 		 * Execute the statements
+ 		 */
+ 		rc = exec_stmts(estate, stmt->body);
+ 
+ 		if (rc == PLPGSQL_RC_RETURN)
+ 			break;				/* break out of the loop */
+ 		else if (rc == PLPGSQL_RC_EXIT)
+ 		{
+ 			if (estate->exitlabel == NULL)
+ 				/* unlabelled exit, finish the current loop */
+ 				rc = PLPGSQL_RC_OK;
+ 			else if (stmt->label != NULL &&
+ 					 strcmp(stmt->label, estate->exitlabel) == 0)
+ 			{
+ 				/* labelled exit, matches the current stmt's label */
+ 				estate->exitlabel = NULL;
+ 				rc = PLPGSQL_RC_OK;
+ 			}
+ 
+ 			/*
+ 			 * otherwise, this is a labelled exit that does not match the
+ 			 * current statement's label, if any: return RC_EXIT so that the
+ 			 * EXIT continues to propagate up the stack.
+ 			 */
+ 			break;
+ 		}
+ 		else if (rc == PLPGSQL_RC_CONTINUE)
+ 		{
+ 			if (estate->exitlabel == NULL)
+ 				/* unlabelled continue, so re-run the current loop */
+ 				rc = PLPGSQL_RC_OK;
+ 			else if (stmt->label != NULL &&
+ 					 strcmp(stmt->label, estate->exitlabel) == 0)
+ 			{
+ 				/* label matches named continue, so re-run loop */
+ 				estate->exitlabel = NULL;
+ 				rc = PLPGSQL_RC_OK;
+ 			}
+ 			else
+ 			{
+ 				/*
+ 				 * otherwise, this is a named continue that does not match the
+ 				 * current statement's label, if any: return RC_CONTINUE so
+ 				 * that the CONTINUE will propagate up the stack.
+ 				 */
+ 				break;
+ 			}
+ 		}
+ 	}
+ 
+ 	pfree(arr);
+ 
+ 	/*
+ 	 * Set the FOUND variable to indicate the result of executing the loop
+ 	 * (namely, whether we looped one or more times). This must be set here so
+ 	 * that it does not interfere with the value of the FOUND variable inside
+ 	 * the loop processing itself.
+ 	 */
+ 	exec_set_found(estate, found);
+ 
+ 	return rc;
+ }
+ 
+ 
+ /* ----------
   * exec_stmt_exit			Implements EXIT and CONTINUE
   *
   * This begins the process of exiting / restarting a loop.
*** ./plpgsql/src/pl_funcs.c.orig	2010-12-16 10:06:08.805286073 +0100
--- ./plpgsql/src/pl_funcs.c	2010-12-16 10:09:28.134644301 +0100
***************
*** 230,235 ****
--- 230,237 ----
  			return _("FOR over SELECT rows");
  		case PLPGSQL_STMT_FORC:
  			return _("FOR over cursor");
+ 		case PLPGSQL_STMT_FOREACH_A:
+ 			return _("FOREACH over array");
  		case PLPGSQL_STMT_EXIT:
  			return "EXIT";
  		case PLPGSQL_STMT_RETURN:
***************
*** 293,298 ****
--- 295,301 ----
  static void dump_close(PLpgSQL_stmt_close *stmt);
  static void dump_perform(PLpgSQL_stmt_perform *stmt);
  static void dump_expr(PLpgSQL_expr *expr);
+ static void dump_foreach_a(PLpgSQL_stmt_foreach_a *stmt);
  
  static void
  dump_ind(void)
***************
*** 375,380 ****
--- 378,386 ----
  		case PLPGSQL_STMT_PERFORM:
  			dump_perform((PLpgSQL_stmt_perform *) stmt);
  			break;
+ 		case PLPGSQL_STMT_FOREACH_A:
+ 			dump_foreach_a((PLpgSQL_stmt_foreach_a *) stmt);
+ 			break;
  		default:
  			elog(ERROR, "unrecognized cmd_type: %d", stmt->cmd_type);
  			break;
***************
*** 595,600 ****
--- 601,621 ----
  }
  
  static void
+ dump_foreach_a(PLpgSQL_stmt_foreach_a *stmt)
+ {
+ 	dump_ind();
+ 	printf("FOREACHA %s", (stmt->rec != NULL) ? stmt->rec->refname : (stmt->row != NULL) ? stmt->row->refname : stmt->var->refname);
+ 	printf("IN ARRAY ");
+ 	dump_expr(stmt->expr);
+ 	printf("\n");
+ 
+ 	dump_stmts(stmt->body);
+ 
+ 	dump_ind();
+ 	printf("    ENDOFOREACHA");
+ }
+ 
+ static void
  dump_open(PLpgSQL_stmt_open *stmt)
  {
  	dump_ind();
*** ./plpgsql/src/plpgsql.h.orig	2010-12-16 09:14:42.645613534 +0100
--- ./plpgsql/src/plpgsql.h	2010-12-16 10:10:09.797151135 +0100
***************
*** 87,92 ****
--- 87,93 ----
  	PLPGSQL_STMT_CASE,
  	PLPGSQL_STMT_LOOP,
  	PLPGSQL_STMT_WHILE,
+ 	PLPGSQL_STMT_FOREACH_A,
  	PLPGSQL_STMT_FORI,
  	PLPGSQL_STMT_FORS,
  	PLPGSQL_STMT_FORC,
***************
*** 427,432 ****
--- 428,446 ----
  
  
  typedef struct
+ {								/* FOREACH item in array loop */
+ 	int			cmd_type;
+ 	int			lineno;
+ 	char	   *label;
+ 	PLpgSQL_var *var;
+ 	PLpgSQL_rec *rec;
+ 	PLpgSQL_row *row;
+ 	PLpgSQL_expr	*expr;
+ 	List	   *body;			/* List of statements */
+ } PLpgSQL_stmt_foreach_a;
+ 
+ 
+ typedef struct
  {								/* FOR statement with integer loopvar	*/
  	int			cmd_type;
  	int			lineno;
*** ./plpgsql/src/pl_scanner.c.orig	2010-12-16 09:11:11.797853910 +0100
--- ./plpgsql/src/pl_scanner.c	2010-12-16 09:51:13.255962643 +0100
***************
*** 77,82 ****
--- 77,83 ----
  	PG_KEYWORD("exit", K_EXIT, RESERVED_KEYWORD)
  	PG_KEYWORD("fetch", K_FETCH, RESERVED_KEYWORD)
  	PG_KEYWORD("for", K_FOR, RESERVED_KEYWORD)
+ 	PG_KEYWORD("foreach", K_FOREACH, RESERVED_KEYWORD)
  	PG_KEYWORD("from", K_FROM, RESERVED_KEYWORD)
  	PG_KEYWORD("get", K_GET, RESERVED_KEYWORD)
  	PG_KEYWORD("if", K_IF, RESERVED_KEYWORD)
***************
*** 105,110 ****
--- 106,112 ----
  static const ScanKeyword unreserved_keywords[] = {
  	PG_KEYWORD("absolute", K_ABSOLUTE, UNRESERVED_KEYWORD)
  	PG_KEYWORD("alias", K_ALIAS, UNRESERVED_KEYWORD)
+ 	PG_KEYWORD("array", K_ARRAY, UNRESERVED_KEYWORD)
  	PG_KEYWORD("backward", K_BACKWARD, UNRESERVED_KEYWORD)
  	PG_KEYWORD("constant", K_CONSTANT, UNRESERVED_KEYWORD)
  	PG_KEYWORD("cursor", K_CURSOR, UNRESERVED_KEYWORD)

Alvaro Herrera

alvherre@commandprompt.com

about 15 years ago

In reply to: Pavel Stehule (#1)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Excerpts from Pavel Stehule's message of jue dic 16 16:19:17 -0300 2010:

The most performance issue of access to a untoasted array is "solved"
with other patch.

Was the other patch applied?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Alvaro Herrera (#2)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Alvaro Herrera <alvherre@commandprompt.com>:

Excerpts from Pavel Stehule's message of jue dic 16 16:19:17 -0300 2010:

The most performance issue of access to a untoasted array is "solved"
with other patch.

Was the other patch applied?

no, it's in queue for next commitfest

https://commitfest.postgresql.org/action/patch_view?id=440

Regards

Pavel

Show quoted text

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Pavel Stehule (#1)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Pavel Stehule <pavel.stehule@gmail.com> writes:

I am resending a redesigned proposal about special plpgsql statement
that support iteration over an array.

OK ...

== Iteration over multidimensional arrays ==
Its designed to reduce one dimension from source array. It can remove
a slicing and simplify code:

This seems like a really bad, confusing idea. I think it should throw
a type-mismatch error in this case. If there is any use-case for such a
thing, which I'm quite unconvinced of, it ought to use a separate syntax
rather than overloading the element-by-element syntax.

regards, tom lane

Merlin Moncure

mmoncure@gmail.com

about 15 years ago

In reply to: Tom Lane (#4)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On Fri, Dec 17, 2010 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

I am resending a redesigned proposal about special plpgsql statement
that support iteration over an array.

OK ...

== Iteration over multidimensional arrays ==
Its designed to reduce one dimension from source array. It can remove
a slicing and simplify code:

This seems like a really bad, confusing idea. I think it should throw
a type-mismatch error in this case. If there is any use-case for such a
thing, which I'm quite unconvinced of, it ought to use a separate syntax
rather than overloading the element-by-element syntax.

I don't agree at all -- iterating arrays by slice is a frequently
requested feature (you can kinda sorta do it by slice notation, but
arr[n] giving null is a -general FAQ. This is how people think arrays
should work. I suppose that having this functionality reserved in a
tiny corner of plpgsql is not so good, but I think foreach... would
become the preferred way to iterate arrays always.

merlin

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Tom Lane (#4)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

I am resending a redesigned proposal about special plpgsql statement
that support iteration over an array.

OK ...

== Iteration over multidimensional arrays ==
Its designed to reduce one dimension from source array. It can remove
a slicing and simplify code:

This seems like a really bad, confusing idea. I think it should throw
a type-mismatch error in this case. If there is any use-case for such a
thing, which I'm quite unconvinced of, it ought to use a separate syntax
rather than overloading the element-by-element syntax.

Without this feature any iteration over 2d and more dimensional array
is not practical. If I have a 2D array, then I would to get a vector.
Access to individual values can be to limiting, because I need a more
cycles to get a complete vector. Usually I can use a array of row
instead a 2d array, but still and in feature there is problem with
iteration over row. So sometime is more practical to use a 2d array.

Actually It raise a type mismatch error, when a user used a scalar
variable and data is a vector (array)

Pavel

Show quoted text

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Merlin Moncure (#5)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Merlin Moncure <mmoncure@gmail.com> writes:

On Fri, Dec 17, 2010 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This seems like a really bad, confusing idea. �I think it should throw
a type-mismatch error in this case. �If there is any use-case for such a
thing, which I'm quite unconvinced of, it ought to use a separate syntax
rather than overloading the element-by-element syntax.

I don't agree at all -- iterating arrays by slice is a frequently
requested feature (you can kinda sorta do it by slice notation, but
arr[n] giving null is a -general FAQ. This is how people think arrays
should work. I suppose that having this functionality reserved in a
tiny corner of plpgsql is not so good, but I think foreach... would
become the preferred way to iterate arrays always.

Well, okay, if it's useful we can have it, but I still say it needs to
be a separate syntax. The example Pavel gives looks like nothing so
much as a beginner's error, ie putting [] on the target variable when
he shouldn't have.

Furthermore, it's underspecified: who's to say how many dimensions of
the array are supposed to get sliced off? There's no reasonable place
to extend this syntax to specify that. It will also be inconsistent
for "foreach scalar in array" to iterate element-by-element no matter
how many dimensions array has, while "foreach array in array" does
something different from that.

regards, tom lane

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Tom Lane (#7)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Tom Lane <tgl@sss.pgh.pa.us>:

Merlin Moncure <mmoncure@gmail.com> writes:

On Fri, Dec 17, 2010 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This seems like a really bad, confusing idea. I think it should throw
a type-mismatch error in this case. If there is any use-case for such a
thing, which I'm quite unconvinced of, it ought to use a separate syntax
rather than overloading the element-by-element syntax.

I don't agree at all -- iterating arrays by slice is a frequently
requested feature (you can kinda sorta do it by slice notation, but
arr[n] giving null is a -general FAQ. This is how people think arrays
should work. I suppose that having this functionality reserved in a
tiny corner of plpgsql is not so good, but I think foreach... would
become the preferred way to iterate arrays always.

Well, okay, if it's useful we can have it, but I still say it needs to
be a separate syntax. The example Pavel gives looks like nothing so
much as a beginner's error, ie putting [] on the target variable when
he shouldn't have.

Now the message is unclean - but it can be enhanced. We can a diagnose
situation when result is multidimensional array and target isn't
array, and the we can to throw user friendly message.

Furthermore, it's underspecified: who's to say how many dimensions of
the array are supposed to get sliced off? There's no reasonable place
to extend this syntax to specify that. It will also be inconsistent
for "foreach scalar in array" to iterate element-by-element no matter
how many dimensions array has, while "foreach array in array" does
something different from that.

it reduce just one dimension. Now I expect, and I think so it is
correct, so user knows a used dimension. Just doesn't know a data. So
user can to decide and fill correct type. The design strictly remove
any U.I. from design. So using a incorect type is bug.

Because a FOREACH syntax is new, we can to enhance it to possible direction:

FOREACH VALUE var IN ARRAY expr
LOOP
END LOOP

and then it will iterate per one field without a dimension reduction.
So this possibility is available and I think so could be implemented
too.

Pavel

Show quoted text

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Pavel Stehule (#8)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Pavel Stehule <pavel.stehule@gmail.com> writes:

2010/12/17 Tom Lane <tgl@sss.pgh.pa.us>:

Furthermore, it's underspecified: who's to say how many dimensions of
the array are supposed to get sliced off? There's no reasonable place
to extend this syntax to specify that. It will also be inconsistent
for "foreach scalar in array" to iterate element-by-element no matter
how many dimensions array has, while "foreach array in array" does
something different from that.

it reduce just one dimension. Now I expect, and I think so it is
correct, so user knows a used dimension. Just doesn't know a data. So
user can to decide and fill correct type. The design strictly remove
any U.I. from design. So using a incorect type is bug.

In other words, your proposal is error-prone to use, restricted in what
it can do, and incapable of being extended later without breaking
things. If there is some redeeming social value to set against those
problems, I'm not seeing it.

What I think we should have is

FOREACH scalar-variable IN ARRAY array-expression

which iterates element by element regardless of how many dimensions the
array has. Then there should be some other syntax for iterating over
slices, and we should give some thought to being able to specify how
"deep" the slice is. I can definitely think of use cases for pulling
off either 1 dimension at a time (so you get vectors) or N-1 dimensions
at a time, and it's not out of the realm of reason to want intermediate
cases.

Maybe

FOR_EACH scalar-variable IN ARRAY array-expression

FOR_SLICE array-variable [DEPTH n] IN ARRAY array-expression

Or I guess you could use the same leading keyword if you make the depth
specification mandatory for the slice case:

FOREACH scalar-variable IN ARRAY array-expression

FOREACH array-variable SLICE n IN ARRAY array-expression

That might be a better idea since it avoids the inevitable argument over
whether the default slice depth should be 1 dimension or N-1 dimensions.

regards, tom lane

#10

Merlin Moncure

mmoncure@gmail.com

about 15 years ago

In reply to: Tom Lane (#9)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On Fri, Dec 17, 2010 at 11:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

2010/12/17 Tom Lane <tgl@sss.pgh.pa.us>:

Furthermore, it's underspecified: who's to say how many dimensions of
the array are supposed to get sliced off? There's no reasonable place
to extend this syntax to specify that. It will also be inconsistent
for "foreach scalar in array" to iterate element-by-element no matter
how many dimensions array has, while "foreach array in array" does
something different from that.

it reduce just one dimension. Now I expect, and I think so it is
correct, so user knows a used dimension. Just doesn't know a data. So
user can to decide and fill correct type. The design strictly remove
any U.I. from design. So using a incorect type is bug.

In other words, your proposal is error-prone to use, restricted in what
it can do, and incapable of being extended later without breaking
things. If there is some redeeming social value to set against those
problems, I'm not seeing it.

What I think we should have is

FOREACH scalar-variable IN ARRAY array-expression

which iterates element by element regardless of how many dimensions the
array has. Then there should be some other syntax for iterating over
slices, and we should give some thought to being able to specify how
"deep" the slice is. I can definitely think of use cases for pulling
off either 1 dimension at a time (so you get vectors) or N-1 dimensions
at a time, and it's not out of the realm of reason to want intermediate
cases.

Maybe

FOR_EACH scalar-variable IN ARRAY array-expression

FOR_SLICE array-variable [DEPTH n] IN ARRAY array-expression

Or I guess you could use the same leading keyword if you make the depth
specification mandatory for the slice case:

FOREACH scalar-variable IN ARRAY array-expression

FOREACH array-variable SLICE n IN ARRAY array-expression

That might be a better idea since it avoids the inevitable argument over
whether the default slice depth should be 1 dimension or N-1 dimensions.

another way:

FOREACH scalar IN ARRAY arr_exp DIMS in dim_var

dim_var being int[], or possibly text, of length #dimensions, giving
per dimesion index.

I like this because it would fit well with alternate form of unnest,
should it ever be written:

create function unnest(anyarray, dims out int[], elem out anyelement)
returns setof...

SLICE notation is still good though, and it's probably faster since
you have less work to do in iteration step? It's certainly easier,
but very plpgsql specific.

merlin

#11

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Tom Lane (#9)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

2010/12/17 Tom Lane <tgl@sss.pgh.pa.us>:

Furthermore, it's underspecified: who's to say how many dimensions of
the array are supposed to get sliced off? There's no reasonable place
to extend this syntax to specify that. It will also be inconsistent
for "foreach scalar in array" to iterate element-by-element no matter
how many dimensions array has, while "foreach array in array" does
something different from that.

it reduce just one dimension. Now I expect, and I think so it is
correct, so user knows a used dimension. Just doesn't know a data. So
user can to decide and fill correct type. The design strictly remove
any U.I. from design. So using a incorect type is bug.

In other words, your proposal is error-prone to use, restricted in what
it can do, and incapable of being extended later without breaking
things. If there is some redeeming social value to set against those
problems, I'm not seeing it.

What I think we should have is

FOREACH scalar-variable IN ARRAY array-expression

which iterates element by element regardless of how many dimensions the
array has. Then there should be some other syntax for iterating over
slices, and we should give some thought to being able to specify how
"deep" the slice is. I can definitely think of use cases for pulling
off either 1 dimension at a time (so you get vectors) or N-1 dimensions
at a time, and it's not out of the realm of reason to want intermediate
cases.

I am not against

Maybe

FOR_EACH scalar-variable IN ARRAY array-expression

FOR_SLICE array-variable [DEPTH n] IN ARRAY array-expression

Or I guess you could use the same leading keyword if you make the depth
specification mandatory for the slice case:

FOREACH scalar-variable IN ARRAY array-expression

FOREACH array-variable SLICE n IN ARRAY array-expression

I prefer FOREACH keyword. The syntax can be enhanced and I like a talk
about it. I am not sure if SLICE is good keyword for this, but I don't
know better - hope so native speakers can select well. I could to use
maybe "DIMENSIONS" ?

Regards

Pavel

Show quoted text

That might be a better idea since it avoids the inevitable argument over
whether the default slice depth should be 1 dimension or N-1 dimensions.

regards, tom lane

#12

Dmitriy Igrishin

dmitigr@gmail.com

about 15 years ago

In reply to: Merlin Moncure (#10)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Merlin Moncure <mmoncure@gmail.com>

On Fri, Dec 17, 2010 at 11:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

2010/12/17 Tom Lane <tgl@sss.pgh.pa.us>:

Furthermore, it's underspecified: who's to say how many dimensions of
the array are supposed to get sliced off? There's no reasonable place
to extend this syntax to specify that. It will also be inconsistent
for "foreach scalar in array" to iterate element-by-element no matter
how many dimensions array has, while "foreach array in array" does
something different from that.

it reduce just one dimension. Now I expect, and I think so it is
correct, so user knows a used dimension. Just doesn't know a data. So
user can to decide and fill correct type. The design strictly remove
any U.I. from design. So using a incorect type is bug.

In other words, your proposal is error-prone to use, restricted in what
it can do, and incapable of being extended later without breaking
things. If there is some redeeming social value to set against those
problems, I'm not seeing it.

What I think we should have is

FOREACH scalar-variable IN ARRAY array-expression

which iterates element by element regardless of how many dimensions the
array has. Then there should be some other syntax for iterating over
slices, and we should give some thought to being able to specify how
"deep" the slice is. I can definitely think of use cases for pulling
off either 1 dimension at a time (so you get vectors) or N-1 dimensions
at a time, and it's not out of the realm of reason to want intermediate
cases.

Maybe

FOR_EACH scalar-variable IN ARRAY array-expression

FOR_SLICE array-variable [DEPTH n] IN ARRAY array-expression

Or I guess you could use the same leading keyword if you make the depth
specification mandatory for the slice case:

FOREACH scalar-variable IN ARRAY array-expression

FOREACH array-variable SLICE n IN ARRAY array-expression

That might be a better idea since it avoids the inevitable argument over
whether the default slice depth should be 1 dimension or N-1 dimensions.

another way:

FOREACH scalar IN ARRAY arr_exp DIMS in dim_var

dim_var being int[], or possibly text, of length #dimensions, giving
per dimesion index.

If dim_var contains length it is need to be renamed:
FOREACH scalar IN ARRAY arr_exp SIZES IN sizes_var.

I like this because it would fit well with alternate form of unnest,
should it ever be written:

create function unnest(anyarray, dims out int[], elem out anyelement)
returns setof...

SLICE notation is still good though, and it's probably faster since
you have less work to do in iteration step? It's certainly easier,
but very plpgsql specific.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
// Dmitriy.

#13

Itagaki Takahiro

itagaki.takahiro@gmail.com

about 15 years ago

In reply to: Merlin Moncure (#10)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On Sat, Dec 18, 2010 at 02:03, Merlin Moncure <mmoncure@gmail.com> wrote:

FOREACH scalar-variable IN ARRAY array-expression
FOR_EACH scalar-variable IN ARRAY array-expression
FOR_SLICE array-variable [DEPTH n] IN ARRAY array-expression
FOREACH scalar-variable IN ARRAY array-expression
FOREACH array-variable SLICE n IN ARRAY array-expression

FOREACH scalar IN ARRAY arr_exp DIMS in dim_var

It should be not a main subject, but I remember there was a discussion
that "IN ARRAY array-expression" looks redundant for a literal array:

IN ARRAY ARRAY[1, 3, 5]

Are there any improvement for the issue?

--
Itagaki Takahiro

#14

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Merlin Moncure (#10)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Merlin Moncure <mmoncure@gmail.com> writes:

another way:

FOREACH scalar IN ARRAY arr_exp DIMS in dim_var

dim_var being int[], or possibly text, of length #dimensions, giving
per dimesion index.

[ scratches head... ] I don't follow what you envision this doing,
exactly?

I'm not thrilled with that specific syntax because it'd require making
DIMS a reserved word, but right at the moment I'm more concerned about
what semantics you have in mind.

regards, tom lane

#15

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Itagaki Takahiro (#13)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Itagaki Takahiro <itagaki.takahiro@gmail.com>:

On Sat, Dec 18, 2010 at 02:03, Merlin Moncure <mmoncure@gmail.com> wrote:

FOREACH scalar-variable IN ARRAY array-expression
FOR_EACH scalar-variable IN ARRAY array-expression
FOR_SLICE array-variable [DEPTH n] IN ARRAY array-expression
FOREACH scalar-variable IN ARRAY array-expression
FOREACH array-variable SLICE n IN ARRAY array-expression

FOREACH scalar IN ARRAY arr_exp DIMS in dim_var

It should be not a main subject, but I remember there was a discussion
that "IN ARRAY array-expression" looks redundant for a literal array:

IN ARRAY ARRAY[1, 3, 5]

Are there any improvement for the issue?

yes. It know it. The reason for this is bigger space for possible
future features related to FOREACH loop.

Regards

Pavel

Show quoted text

--
Itagaki Takahiro

#16

Merlin Moncure

mmoncure@gmail.com

about 15 years ago

In reply to: Tom Lane (#14)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On Fri, Dec 17, 2010 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Merlin Moncure <mmoncure@gmail.com> writes:

another way:

FOREACH scalar IN ARRAY arr_exp DIMS in dim_var

dim_var being int[], or possibly text, of length #dimensions, giving
per dimesion index.

[ scratches head... ] I don't follow what you envision this doing,
exactly?

I'm not thrilled with that specific syntax because it'd require making
DIMS a reserved word, but right at the moment I'm more concerned about
what semantics you have in mind.

It's like _pg_expandarray but alterted support multiple dimensions:

select * from unnest_dims(array[['a','b'],['c','d']]) returns
[1,1], 'a'
[1,2], 'b'
[2,1], 'c'
[2,2], 'd'

this provides alternate way of pulling slices, slower possibly, but
more abstract.

merlin

#17

Andrew Dunstan

andrew@dunslane.net

about 15 years ago

In reply to: Pavel Stehule (#15)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On 12/17/2010 12:15 PM, Pavel Stehule wrote:

2010/12/17 Itagaki Takahiro<itagaki.takahiro@gmail.com>:

It should be not a main subject, but I remember there was a discussion
that "IN ARRAY array-expression" looks redundant for a literal array:

IN ARRAY ARRAY[1, 3, 5]

Are there any improvement for the issue?

yes. It know it. The reason for this is bigger space for possible
future features related to FOREACH loop.

So what you're saying is we need to allow ugliness now so we can have
more ugliness in future? I don't find that a convincing argument. I
share the dislike for this syntax.

cheers

andrew

#18

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Merlin Moncure (#16)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Merlin Moncure <mmoncure@gmail.com> writes:

On Fri, Dec 17, 2010 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ scratches head... ] �I don't follow what you envision this doing,
exactly?

It's like _pg_expandarray but alterted support multiple dimensions:

select * from unnest_dims(array[['a','b'],['c','d']]) returns
[1,1], 'a'
[1,2], 'b'
[2,1], 'c'
[2,2], 'd'

Oh, so that's an *output* not an input. And IIUC what you are returning
is the subscripts associated with the current element, not the array's
dimensions. Seems like it should go beside the normal target variable
then, not at the end.

FOREACH variable_for_value [, variable_for_subscripts ] IN ARRAY ...

regards, tom lane

#19

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Andrew Dunstan (#17)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Andrew Dunstan <andrew@dunslane.net> writes:

On 12/17/2010 12:15 PM, Pavel Stehule wrote:

The reason for this is bigger space for possible
future features related to FOREACH loop.

So what you're saying is we need to allow ugliness now so we can have
more ugliness in future? I don't find that a convincing argument. I
share the dislike for this syntax.

Well, we did beat up Pavel over trying to shoehorn this facility into
the existing FOR syntax, so I can hardly blame him for thinking this
way. The question is whether we're willing to assume that FOREACH will
be limited to iterating over arrays, meaning we'll be stuck with
inventing yet another initial keyword if some other fundamentally
different concept comes along. Right at the moment I can't think of
any plausible candidates, but ...

regards, tom lane

#20

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Andrew Dunstan (#17)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Andrew Dunstan <andrew@dunslane.net>:

On 12/17/2010 12:15 PM, Pavel Stehule wrote:

2010/12/17 Itagaki Takahiro<itagaki.takahiro@gmail.com>:

It should be not a main subject, but I remember there was a discussion
that "IN ARRAY array-expression" looks redundant for a literal array:

IN ARRAY ARRAY[1, 3, 5]

Are there any improvement for the issue?

yes. It know it. The reason for this is bigger space for possible
future features related to FOREACH loop.

So what you're saying is we need to allow ugliness now so we can have more
ugliness in future? I don't find that a convincing argument. I share the
dislike for this syntax.

can be strange from me, but it is. If we close a back door now, then
we have not a space after ten years. There can be possible loops over
records, maybe over other iterable data. With this design is important
one think. A keyword after K_IN must not be a reserved keyword.

I am expecting, so typical use case doesn't be a iteration over
constant array, but over variable

so mostly often you have to write

FOREACH var IN ARRAY second_var
LOOP
...
END LOOP

Regards

Pavel

Show quoted text

cheers

andrew

#21

David E. Wheeler

david@kineticode.com

about 15 years ago

In reply to: Tom Lane (#19)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On Dec 17, 2010, at 9:31 AM, Tom Lane wrote:

Well, we did beat up Pavel over trying to shoehorn this facility into
the existing FOR syntax, so I can hardly blame him for thinking this
way. The question is whether we're willing to assume that FOREACH will
be limited to iterating over arrays, meaning we'll be stuck with
inventing yet another initial keyword if some other fundamentally
different concept comes along. Right at the moment I can't think of
any plausible candidates, but ...

FOREACH pair IN HSTORE…

David

#22

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: David E. Wheeler (#21)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

"David E. Wheeler" <david@kineticode.com> writes:

On Dec 17, 2010, at 9:31 AM, Tom Lane wrote:

Well, we did beat up Pavel over trying to shoehorn this facility into
the existing FOR syntax, so I can hardly blame him for thinking this
way. The question is whether we're willing to assume that FOREACH will
be limited to iterating over arrays, meaning we'll be stuck with
inventing yet another initial keyword if some other fundamentally
different concept comes along. Right at the moment I can't think of
any plausible candidates, but ...

FOREACH pair IN HSTORE�

I don't actually see any problem with allowing that (or any other
"collection" kind of object) with the same syntax as for arrays.

The issue that we had with adding this to FOR was that it wasn't clear
whether the expression after IN should be thought of as a source of
rows, or as a "scalar" expression yielding a collection object that
should get iterated through --- and because SQL allows sub-SELECT as a
kind of expression, this was an actual formal ambiguity not just the
sort of near-ambiguity that trips up users. If you will, it wouldn't
have been clear whether to iterate vertically or horizontally.

The direction that this proposal establishes is that FOR is for vertical
iteration and FOREACH is for horizontal iteration; that is, the argument
of FOREACH is a scalar expression in SQL terms, but it yields some kind
of collection object that we are going to iterate through the members
of. Given that understanding, I'm not seeing a need for the syntax to
distinguish whether the collection object is an array, an hstore, or
some other kind of collection. It's sufficient if we can determine this
by examining the type of the expression.

We would need an extra keyword if there were some third kind of
iteration that was fundamentally different from either of these, but
like I said, I don't see a plausible candidate. So right at the moment,
I'm leaning to the position that we could do without the ARRAY keyword
in FOREACH. If we do think of something else that could need its own
keyword there, it's arguably going to be different enough that a
different leading keyword would be a better idea anyhow.

regards, tom lane

#23

Robert Haas

robertmhaas@gmail.com

about 15 years ago

In reply to: Tom Lane (#22)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On Fri, Dec 17, 2010 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David E. Wheeler" <david@kineticode.com> writes:

On Dec 17, 2010, at 9:31 AM, Tom Lane wrote:

Well, we did beat up Pavel over trying to shoehorn this facility into
the existing FOR syntax, so I can hardly blame him for thinking this
way. The question is whether we're willing to assume that FOREACH will
be limited to iterating over arrays, meaning we'll be stuck with
inventing yet another initial keyword if some other fundamentally
different concept comes along. Right at the moment I can't think of
any plausible candidates, but ...

FOREACH pair IN HSTORE…

I don't actually see any problem with allowing that (or any other
"collection" kind of object) with the same syntax as for arrays.

The issue that we had with adding this to FOR was that it wasn't clear
whether the expression after IN should be thought of as a source of
rows, or as a "scalar" expression yielding a collection object that
should get iterated through --- and because SQL allows sub-SELECT as a
kind of expression, this was an actual formal ambiguity not just the
sort of near-ambiguity that trips up users. If you will, it wouldn't
have been clear whether to iterate vertically or horizontally.

The direction that this proposal establishes is that FOR is for vertical
iteration and FOREACH is for horizontal iteration; that is, the argument
of FOREACH is a scalar expression in SQL terms, but it yields some kind
of collection object that we are going to iterate through the members
of. Given that understanding, I'm not seeing a need for the syntax to
distinguish whether the collection object is an array, an hstore, or
some other kind of collection. It's sufficient if we can determine this
by examining the type of the expression.

We would need an extra keyword if there were some third kind of
iteration that was fundamentally different from either of these, but
like I said, I don't see a plausible candidate. So right at the moment,
I'm leaning to the position that we could do without the ARRAY keyword
in FOREACH. If we do think of something else that could need its own
keyword there, it's arguably going to be different enough that a
different leading keyword would be a better idea anyhow.

Unfortunately, there are likely to be a limited number of such
keywords available. While I agree it's helpful to have a clear
distinction between what FOR does and what FOREACH does, it's wholly
conventional here and won't be obvious without careful reading of the
documentation. If we had FOR and FOREACH and FOREVERY and, uh,
FORGET, it'd quickly become notational soup. I am still wondering if
there's a way to make something like "FOR ELEMENT e IN a" work. I
suspect we'd be less likely to paint ourselves into a corner that way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#24

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Tom Lane (#22)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

We would need an extra keyword if there were some third kind of
iteration that was fundamentally different from either of these, but
like I said, I don't see a plausible candidate. So right at the moment,
I'm leaning to the position that we could do without the ARRAY keyword
in FOREACH. If we do think of something else that could need its own
keyword there, it's arguably going to be different enough that a
different leading keyword would be a better idea anyhow.

Maybe I propage a higher verbosity than is necessary, but it descrease
a risk so code will do some unexpected work. With ARRAY keyword we can
verify so result of expression is really a array. Next advantage is a
clean implementation now and in future. Without a auxilary keyword is
necessary to wait on execution time. So now, when we have full control
over syntax, we can protect self before "FOR" statement
implementation's complexity.

Personally - syntax without ARRAY keyword isn't significant problem
for me. Just I think so using it wisely.

Second semi argument for using ARRAY keyword is a verbosity of
PL/pgSQL. So from this perspective a ARRAY should be minimally
optional and ensure, so expr result will be really a array. But with a
optional ARRAY keyword we leaving a simple enhancing in future (on
parser level).

Pavel

Show quoted text

regards, tom lane

#25

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 15 years ago

In reply to: Robert Haas (#23)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On 17.12.2010 21:04, Robert Haas wrote:

Unfortunately, there are likely to be a limited number of such
keywords available. While I agree it's helpful to have a clear
distinction between what FOR does and what FOREACH does, it's wholly
conventional here and won't be obvious without careful reading of the
documentation. If we had FOR and FOREACH and FOREVERY and, uh,
FORGET, it'd quickly become notational soup. I am still wondering if
there's a way to make something like "FOR ELEMENT e IN a" work. I
suspect we'd be less likely to paint ourselves into a corner that way.

As a side note, Oracle has FORALL, which is a kind of bulk update
operation over a collection type. So whatever we choose, not FORALL...

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#26

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Robert Haas (#23)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Robert Haas <robertmhaas@gmail.com>:

On Fri, Dec 17, 2010 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David E. Wheeler" <david@kineticode.com> writes:

On Dec 17, 2010, at 9:31 AM, Tom Lane wrote:

Well, we did beat up Pavel over trying to shoehorn this facility into
the existing FOR syntax, so I can hardly blame him for thinking this
way. The question is whether we're willing to assume that FOREACH will
be limited to iterating over arrays, meaning we'll be stuck with
inventing yet another initial keyword if some other fundamentally
different concept comes along. Right at the moment I can't think of
any plausible candidates, but ...

FOREACH pair IN HSTORE…

I don't actually see any problem with allowing that (or any other
"collection" kind of object) with the same syntax as for arrays.

The issue that we had with adding this to FOR was that it wasn't clear
whether the expression after IN should be thought of as a source of
rows, or as a "scalar" expression yielding a collection object that
should get iterated through --- and because SQL allows sub-SELECT as a
kind of expression, this was an actual formal ambiguity not just the
sort of near-ambiguity that trips up users. If you will, it wouldn't
have been clear whether to iterate vertically or horizontally.

The direction that this proposal establishes is that FOR is for vertical
iteration and FOREACH is for horizontal iteration; that is, the argument
of FOREACH is a scalar expression in SQL terms, but it yields some kind
of collection object that we are going to iterate through the members
of. Given that understanding, I'm not seeing a need for the syntax to
distinguish whether the collection object is an array, an hstore, or
some other kind of collection. It's sufficient if we can determine this
by examining the type of the expression.

We would need an extra keyword if there were some third kind of
iteration that was fundamentally different from either of these, but
like I said, I don't see a plausible candidate. So right at the moment,
I'm leaning to the position that we could do without the ARRAY keyword
in FOREACH. If we do think of something else that could need its own
keyword there, it's arguably going to be different enough that a
different leading keyword would be a better idea anyhow.

Unfortunately, there are likely to be a limited number of such
keywords available. While I agree it's helpful to have a clear
distinction between what FOR does and what FOREACH does, it's wholly
conventional here and won't be obvious without careful reading of the
documentation. If we had FOR and FOREACH and FOREVERY and, uh,
FORGET, it'd quickly become notational soup. I am still wondering if
there's a way to make something like "FOR ELEMENT e IN a" work. I
suspect we'd be less likely to paint ourselves into a corner that way.

I understand. But it is true too , so now is FOR statement
implementation too rich. You can see to attachment with initial
implementation. It's absolutely clean and simple. There is more valid
ideas then one. One valid idea is so FOR statement is compatible with
PL/SQL (what isn't true now :() and FOREACH can carry a pg's specific
features.

But I absolutely agree with you, so we can use a only one pg specific
keyword. There are FOREACH (pg), FOR (shared with PL/SQL), FORALL (not
implemented yet - use a PL/SQL).

Pavel

Show quoted text

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#27

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Robert Haas (#23)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Robert Haas <robertmhaas@gmail.com> writes:

Unfortunately, there are likely to be a limited number of such
keywords available. While I agree it's helpful to have a clear
distinction between what FOR does and what FOREACH does, it's wholly
conventional here and won't be obvious without careful reading of the
documentation. If we had FOR and FOREACH and FOREVERY and, uh,
FORGET, it'd quickly become notational soup.

All true, but in the absence of any plausible candidate for third or
fourth or fifth types of iteration, this objection seems a bit thin.

I am still wondering if
there's a way to make something like "FOR ELEMENT e IN a" work. I
suspect we'd be less likely to paint ourselves into a corner that way.

I'm afraid that's only really feasible if you are willing for the second
word to be a fully reserved word, so it can be distinguished from a
plain variable name in that position. Which is probably worse than
inventing multiple initial keywords. It doesn't seem to me that this
would reduce the intellectual burden of remembering which syntax does
what, anyway.

regards, tom lane

#28

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Pavel Stehule (#24)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Pavel Stehule <pavel.stehule@gmail.com> writes:

Second semi argument for using ARRAY keyword is a verbosity of
PL/pgSQL. So from this perspective a ARRAY should be minimally
optional and ensure, so expr result will be really a array. But with a
optional ARRAY keyword we leaving a simple enhancing in future (on
parser level).

No.  If we are going to put a keyword there, it can't be optional.
Making it optional would require it to be a fully reserved word
--- and in the case of ARRAY, even that isn't good enough, because
of the conflict with ARRAY[...] syntax.

regards, tom lane

#29

Pavel Stehule

pavel.stehule@gmail.com

about 15 years ago

In reply to: Tom Lane (#28)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

2010/12/17 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

Second semi argument for using ARRAY keyword is a verbosity of
PL/pgSQL. So from this perspective a ARRAY should be minimally
optional and ensure, so expr result will be really a array. But with a
optional ARRAY keyword we leaving a simple enhancing in future (on
parser level).
No.  If we are going to put a keyword there, it can't be optional.
Making it optional would require it to be a fully reserved word
--- and in the case of ARRAY, even that isn't good enough, because
of the conflict with ARRAY[...] syntax.

yes, it's true

Pavel

Show quoted text

regards, tom lane

#30

Robert Haas

robertmhaas@gmail.com

about 15 years ago

In reply to: Tom Lane (#27)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On Fri, Dec 17, 2010 at 2:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Unfortunately, there are likely to be a limited number of such
keywords available. While I agree it's helpful to have a clear
distinction between what FOR does and what FOREACH does, it's wholly
conventional here and won't be obvious without careful reading of the
documentation. If we had FOR and FOREACH and FOREVERY and, uh,
FORGET, it'd quickly become notational soup.

All true, but in the absence of any plausible candidate for third or
fourth or fifth types of iteration, this objection seems a bit thin.

Well, Heikki just pointed out one that Oracle supports, so that makes
at least #3...

I am still wondering if
there's a way to make something like "FOR ELEMENT e IN a" work. I
suspect we'd be less likely to paint ourselves into a corner that way.

I'm afraid that's only really feasible if you are willing for the second
word to be a fully reserved word, so it can be distinguished from a
plain variable name in that position.

What if we cheat and peak ahead an extra token?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#31

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Robert Haas (#30)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Dec 17, 2010 at 2:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Unfortunately, there are likely to be a limited number of such
keywords available. �While I agree it's helpful to have a clear
distinction between what FOR does and what FOREACH does, it's wholly
conventional here and won't be obvious without careful reading of the
documentation. �If we had FOR and FOREACH and FOREVERY and, uh,
FORGET, it'd quickly become notational soup.

All true, but in the absence of any plausible candidate for third or
fourth or fifth types of iteration, this objection seems a bit thin.

Well, Heikki just pointed out one that Oracle supports, so that makes
at least #3...

If you posit that we might someday wish to support what Oracle is doing
there, it seems to me to be a precedent for using a different first
keyword, not for what you're suggesting. I'm not arguing that we might
want to duplicate Oracle's syntax; only that if it's going to be cited
as a precedent that we consider what it's actually a precedent for.

I'm afraid that's only really feasible if you are willing for the second
word to be a fully reserved word, so it can be distinguished from a
plain variable name in that position.

What if we cheat and peak ahead an extra token?

plpgsql's parser is rickety enough that I don't have a lot of confidence
in its ability to do things that way. In particular, there's too much
knowledge at the lexer level instead of the grammar --- you'd have to
have a way of keeping the lexer from returning T_DATUM in this one
particular context, even if "element" happened to match some variable.

regards, tom lane

#32

Robert Haas

robertmhaas@gmail.com

about 15 years ago

In reply to: Tom Lane (#31)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

On Fri, Dec 17, 2010 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Dec 17, 2010 at 2:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Unfortunately, there are likely to be a limited number of such
keywords available. While I agree it's helpful to have a clear
distinction between what FOR does and what FOREACH does, it's wholly
conventional here and won't be obvious without careful reading of the
documentation. If we had FOR and FOREACH and FOREVERY and, uh,
FORGET, it'd quickly become notational soup.

All true, but in the absence of any plausible candidate for third or
fourth or fifth types of iteration, this objection seems a bit thin.

Well, Heikki just pointed out one that Oracle supports, so that makes
at least #3...

If you posit that we might someday wish to support what Oracle is doing
there, it seems to me to be a precedent for using a different first
keyword, not for what you're suggesting. I'm not arguing that we might
want to duplicate Oracle's syntax; only that if it's going to be cited
as a precedent that we consider what it's actually a precedent for.

I don't quite follow what you're getting at here. My goal was to try
to think of something more mnemonic than FOREACH, and I thought
something involving the word "element" or "array" would do the trick.
The problem is only to find a place to put it that's before the word
"IN". But maybe that's hopeless and we should just go with FOREACH.

I'm afraid that's only really feasible if you are willing for the second
word to be a fully reserved word, so it can be distinguished from a
plain variable name in that position.

What if we cheat and peak ahead an extra token?

plpgsql's parser is rickety enough that I don't have a lot of confidence
in its ability to do things that way.

Bummer. Rickety is not good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#33

Tom Lane

tgl@sss.pgh.pa.us

about 15 years ago

In reply to: Robert Haas (#32)

Re: proposal: FOREACH-IN-ARRAY (probably for 9.2?)

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Dec 17, 2010 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

plpgsql's parser is rickety enough that I don't have a lot of confidence
in its ability to do things that way.

Bummer. Rickety is not good.

Agreed, but it's not entirely the parser's fault: the language
definition is pretty d*mn bogus to start with. Read the comments for
the for_variable production, and ask yourself whether you really want
to inject even more difficult-to-disambiguate cases right there.

regards, tom lane