[Fwd: SETOF input parameters (was Re: [HACKERS] proposal: array utility functions phase 1)]

Started by Joe Conwayabout 23 years ago25 messages
#1Joe Conway
mail@joeconway.com
1 attachment(s)

Joe Conway wrote:

Tom Lane wrote:

It seems like somehow we need a level of FROM/WHERE producing some base
rows, and then a set of table function calls to apply to each of the
base rows, and then another level of WHERE to filter the results of the
function calls (in particular to provide join conditions to identify
which rows to match up in the function outputs). I don't see any way to
do this without inventing new SELECT clauses out of whole cloth
... unless SQL99's WITH clause helps, but I don't think it does ...

Well, maybe this is a start. It allows a table function's input
parameter to be declared with setof. The changes involved primarily:

1) a big loop in ExecMakeTableFunctionResult so that functions with set
returning arguments get called for each row of the argument,
and
2) aways initializing the tuplestore in ExecMakeTableFunctionResult and
passing that to the function, even when SFRM_Materialize mode is used.

The result looks like:

create table foot(f1 text, f2 text);
insert into foot values('a','b');
insert into foot values('c','d');
insert into foot values('e','f');

create or replace function test2() returns setof foot as 'select * from
foot order by 1 asc' language 'sql';
create or replace function test(setof foot) returns foot as 'select
$1.f1, $1.f2' language 'sql';

regression=# select * from test(test2());
f1 | f2
----+----
a | b
c | d
e | f
(3 rows)

I know it doesn't solve all the issues discussed, but is it a step
forward? Suggestions?

Patch updated (again) to apply cleanly against cvs. Compiles clean and passes
all regression tests. Any feedback? If not, please apply.

Thanks,

Joe

Attachments:

tablefunc-setof-input.3.patchtext/plain; name=tablefunc-setof-input.3.patchDownload
Index: contrib/tablefunc/tablefunc.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/tablefunc.c,v
retrieving revision 1.11
diff -c -r1.11 tablefunc.c
*** contrib/tablefunc/tablefunc.c	23 Nov 2002 01:54:09 -0000	1.11
--- contrib/tablefunc/tablefunc.c	16 Dec 2002 21:21:47 -0000
***************
*** 53,59 ****
  		  int max_depth,
  		  bool show_branch,
  		  MemoryContext per_query_ctx,
! 		  AttInMetadata *attinmeta);
  static Tuplestorestate *build_tuplestore_recursively(char *key_fld,
  							 char *parent_key_fld,
  							 char *relname,
--- 53,60 ----
  		  int max_depth,
  		  bool show_branch,
  		  MemoryContext per_query_ctx,
! 		  AttInMetadata *attinmeta,
! 		  Tuplestorestate *tupstore);
  static Tuplestorestate *build_tuplestore_recursively(char *key_fld,
  							 char *parent_key_fld,
  							 char *relname,
***************
*** 641,646 ****
--- 642,648 ----
  	char	   *branch_delim = NULL;
  	bool		show_branch = false;
  	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ 	Tuplestorestate *tupstore;
  	TupleDesc	tupdesc;
  	AttInMetadata *attinmeta;
  	MemoryContext per_query_ctx;
***************
*** 673,678 ****
--- 675,681 ----
  			 "allowed in this context");
  
  	/* OK, go to work */
+ 	tupstore = rsinfo->setResult;
  	rsinfo->returnMode = SFRM_Materialize;
  	rsinfo->setResult = connectby(relname,
  								  key_fld,
***************
*** 682,688 ****
  								  max_depth,
  								  show_branch,
  								  per_query_ctx,
! 								  attinmeta);
  	rsinfo->setDesc = tupdesc;
  
  	MemoryContextSwitchTo(oldcontext);
--- 685,692 ----
  								  max_depth,
  								  show_branch,
  								  per_query_ctx,
! 								  attinmeta,
! 								  tupstore);
  	rsinfo->setDesc = tupdesc;
  
  	MemoryContextSwitchTo(oldcontext);
***************
*** 709,732 ****
  		  int max_depth,
  		  bool show_branch,
  		  MemoryContext per_query_ctx,
! 		  AttInMetadata *attinmeta)
  {
- 	Tuplestorestate *tupstore = NULL;
  	int			ret;
- 	MemoryContext oldcontext;
  
  	/* Connect to SPI manager */
  	if ((ret = SPI_connect()) < 0)
  		elog(ERROR, "connectby: SPI_connect returned %d", ret);
  
- 	/* switch to longer term context to create the tuple store */
- 	oldcontext = MemoryContextSwitchTo(per_query_ctx);
- 
- 	/* initialize our tuplestore */
- 	tupstore = tuplestore_begin_heap(true, SortMem);
- 
- 	MemoryContextSwitchTo(oldcontext);
- 
  	/* now go get the whole tree */
  	tupstore = build_tuplestore_recursively(key_fld,
  											parent_key_fld,
--- 713,727 ----
  		  int max_depth,
  		  bool show_branch,
  		  MemoryContext per_query_ctx,
! 		  AttInMetadata *attinmeta,
! 		  Tuplestorestate *tupstore)
  {
  	int			ret;
  
  	/* Connect to SPI manager */
  	if ((ret = SPI_connect()) < 0)
  		elog(ERROR, "connectby: SPI_connect returned %d", ret);
  
  	/* now go get the whole tree */
  	tupstore = build_tuplestore_recursively(key_fld,
  											parent_key_fld,
***************
*** 742,751 ****
  											tupstore);
  
  	SPI_finish();
- 
- 	oldcontext = MemoryContextSwitchTo(per_query_ctx);
- 	tuplestore_donestoring(tupstore);
- 	MemoryContextSwitchTo(oldcontext);
  
  	return tupstore;
  }
--- 737,742 ----
Index: src/backend/commands/functioncmds.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/commands/functioncmds.c,v
retrieving revision 1.24
diff -c -r1.24 functioncmds.c
*** src/backend/commands/functioncmds.c	1 Nov 2002 19:19:58 -0000	1.24
--- src/backend/commands/functioncmds.c	16 Dec 2002 21:21:47 -0000
***************
*** 160,168 ****
  				 TypeNameToString(t));
  		}
  
- 		if (t->setof)
- 			elog(ERROR, "Functions cannot accept set arguments");
- 
  		parameterTypes[parameterCount++] = toid;
  	}
  
--- 160,165 ----
Index: src/backend/executor/execQual.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/executor/execQual.c,v
retrieving revision 1.121
diff -c -r1.121 execQual.c
*** src/backend/executor/execQual.c	15 Dec 2002 16:17:45 -0000	1.121
--- src/backend/executor/execQual.c	16 Dec 2002 21:21:47 -0000
***************
*** 874,879 ****
--- 874,880 ----
  	bool		direct_function_call;
  	bool		first_time = true;
  	bool		returnsTuple = false;
+ 	FuncExprState *fcache = (FuncExprState *) funcexpr;
  
  	/*
  	 * Normally the passed expression tree will be a FuncExprState, since the
***************
*** 888,896 ****
  	if (funcexpr && IsA(funcexpr, FuncExprState) &&
  		IsA(funcexpr->expr, FuncExpr))
  	{
- 		FuncExprState *fcache = (FuncExprState *) funcexpr;
- 		ExprDoneCond argDone;
- 
  		/*
  		 * This path is similar to ExecMakeFunctionResult.
  		 */
--- 889,894 ----
***************
*** 906,943 ****
  			init_fcache(func->funcid, fcache, econtext->ecxt_per_query_memory);
  		}
  
- 		/*
- 		 * Evaluate the function's argument list.
- 		 *
- 		 * Note: ideally, we'd do this in the per-tuple context, but then the
- 		 * argument values would disappear when we reset the context in the
- 		 * inner loop.	So do it in caller context.  Perhaps we should make a
- 		 * separate context just to hold the evaluated arguments?
- 		 */
  		MemSet(&fcinfo, 0, sizeof(fcinfo));
  		fcinfo.flinfo = &(fcache->func);
- 		argDone = ExecEvalFuncArgs(&fcinfo, fcache->args, econtext);
- 		/* We don't allow sets in the arguments of the table function */
- 		if (argDone != ExprSingleResult)
- 			elog(ERROR, "Set-valued function called in context that cannot accept a set");
- 
- 		/*
- 		 * If function is strict, and there are any NULL arguments, skip
- 		 * calling the function and return NULL (actually an empty set).
- 		 */
- 		if (fcache->func.fn_strict)
- 		{
- 			int			i;
- 
- 			for (i = 0; i < fcinfo.nargs; i++)
- 			{
- 				if (fcinfo.argnull[i])
- 				{
- 					*returnDesc = NULL;
- 					return NULL;
- 				}
- 			}
- 		}
  	}
  	else
  	{
--- 904,911 ----
***************
*** 964,1104 ****
  	rsinfo.setResult = NULL;
  	rsinfo.setDesc = NULL;
  
! 	/*
! 	 * Switch to short-lived context for calling the function or expression.
! 	 */
! 	callerContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
! 
! 	/*
! 	 * Loop to handle the ValuePerCall protocol (which is also the same
! 	 * behavior needed in the generic ExecEvalExpr path).
! 	 */
! 	for (;;)
  	{
! 		Datum		result;
! 		HeapTuple	tuple;
! 
! 		/*
! 		 * reset per-tuple memory context before each call of the
! 		 * function or expression. This cleans up any local memory the
! 		 * function may leak when called.
! 		 */
! 		ResetExprContext(econtext);
  
- 		/* Call the function or expression one time */
  		if (direct_function_call)
  		{
! 			fcinfo.isnull = false;
! 			rsinfo.isDone = ExprSingleResult;
! 			result = FunctionCallInvoke(&fcinfo);
! 		}
! 		else
! 		{
! 			result = ExecEvalExpr(funcexpr, econtext,
! 								  &fcinfo.isnull, &rsinfo.isDone);
  		}
  
! 		/* Which protocol does function want to use? */
! 		if (rsinfo.returnMode == SFRM_ValuePerCall)
  		{
  			/*
! 			 * Check for end of result set.
! 			 *
! 			 * Note: if function returns an empty set, we don't build a
! 			 * tupdesc or tuplestore (since we can't get a tupdesc in the
! 			 * function-returning-tuple case)
  			 */
! 			if (rsinfo.isDone == ExprEndResult)
! 				break;
  
  			/*
! 			 * If first time through, build tupdesc and tuplestore for
! 			 * result
  			 */
  			if (first_time)
  			{
  				oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
! 				if (funcrettype == RECORDOID ||
! 					get_typtype(funcrettype) == 'c')
  				{
- 					/*
- 					 * Composite type, so function should have returned a
- 					 * TupleTableSlot; use its descriptor
- 					 */
  					slot = (TupleTableSlot *) DatumGetPointer(result);
  					if (fcinfo.isnull ||
  						!slot ||
  						!IsA(slot, TupleTableSlot) ||
! 						!slot->ttc_tupleDescriptor)
  						elog(ERROR, "ExecMakeTableFunctionResult: Invalid result from function returning tuple");
! 					tupdesc = CreateTupleDescCopy(slot->ttc_tupleDescriptor);
! 					returnsTuple = true;
  				}
  				else
  				{
! 					/*
! 					 * Scalar type, so make a single-column descriptor
! 					 */
! 					tupdesc = CreateTemplateTupleDesc(1, false);
! 					TupleDescInitEntry(tupdesc,
! 									   (AttrNumber) 1,
! 									   "column",
! 									   funcrettype,
! 									   -1,
! 									   0,
! 									   false);
  				}
! 				tupstore = tuplestore_begin_heap(true,	/* randomAccess */
! 												 SortMem);
  				MemoryContextSwitchTo(oldcontext);
- 				rsinfo.setResult = tupstore;
- 				rsinfo.setDesc = tupdesc;
- 			}
  
! 			/*
! 			 * Store current resultset item.
! 			 */
! 			if (returnsTuple)
! 			{
! 				slot = (TupleTableSlot *) DatumGetPointer(result);
! 				if (fcinfo.isnull ||
! 					!slot ||
! 					!IsA(slot, TupleTableSlot) ||
! 					TupIsNull(slot))
! 					elog(ERROR, "ExecMakeTableFunctionResult: Invalid result from function returning tuple");
! 				tuple = slot->val;
  			}
! 			else
  			{
! 				char		nullflag;
! 
! 				nullflag = fcinfo.isnull ? 'n' : ' ';
! 				tuple = heap_formtuple(tupdesc, &result, &nullflag);
! 			}
! 
! 			oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
! 			tuplestore_puttuple(tupstore, tuple);
! 			MemoryContextSwitchTo(oldcontext);
  
! 			/*
! 			 * Are we done?
! 			 */
! 			if (rsinfo.isDone != ExprMultipleResult)
  				break;
  		}
- 		else if (rsinfo.returnMode == SFRM_Materialize)
- 		{
- 			/* check we're on the same page as the function author */
- 			if (!first_time || rsinfo.isDone != ExprSingleResult)
- 				elog(ERROR, "ExecMakeTableFunctionResult: Materialize-mode protocol not followed");
- 			/* Done evaluating the set result */
- 			break;
- 		}
- 		else
- 			elog(ERROR, "ExecMakeTableFunctionResult: unknown returnMode %d",
- 				 (int) rsinfo.returnMode);
  
! 		first_time = false;
  	}
  
  	/* If we have a locally-created tupstore, close it up */
--- 932,1121 ----
  	rsinfo.setResult = NULL;
  	rsinfo.setDesc = NULL;
  
! 	callerContext = CurrentMemoryContext;
! 	while(1)
  	{
! 		ExprDoneCond argDone = ExprSingleResult; /* until proven otherwise */
  
  		if (direct_function_call)
  		{
! 			/*
! 			 * Evaluate the function's argument list.
! 			 *
! 			 * Note: ideally, we'd do this in the per-tuple context, but then the
! 			 * argument values would disappear when we reset the context in the
! 			 * inner loop.	So do it in caller context.  Perhaps we should make a
! 			 * separate context just to hold the evaluated arguments?
! 			 */
! 			MemoryContextSwitchTo(callerContext);
! 			argDone = ExecEvalFuncArgs(&fcinfo, fcache->args, econtext);
! 			if (argDone == ExprEndResult)
! 				break;
! 
! 			/*
! 			 * If function is strict, and there are any NULL arguments, skip
! 			 * calling the function and return NULL (actually an empty set).
! 			 */
! 			if (fcache->func.fn_strict)
! 			{
! 				int			i;
! 
! 				for (i = 0; i < fcinfo.nargs; i++)
! 				{
! 					if (fcinfo.argnull[i])
! 					{
! 						*returnDesc = NULL;
! 						return NULL;
! 					}
! 				}
! 			}
  		}
  
! 		/*
! 		 * Switch to short-lived context for calling the function or expression.
! 		 */
! 		MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
! 
! 		/*
! 		 * Loop to handle the ValuePerCall protocol (which is also the same
! 		 * behavior needed in the generic ExecEvalExpr path).
! 		 */
! 		for (;;)
  		{
+ 			Datum		result;
+ 			HeapTuple	tuple;
+ 
  			/*
! 			 * reset per-tuple memory context before each call of the
! 			 * function or expression. This cleans up any local memory the
! 			 * function may leak when called.
  			 */
! 			ResetExprContext(econtext);
  
  			/*
! 			 * If first time through, build tuplestore for result
  			 */
  			if (first_time)
  			{
  				oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
! 				tupstore = tuplestore_begin_heap(true,	/* randomAccess */
! 												 SortMem);
! 				MemoryContextSwitchTo(oldcontext);
! 				rsinfo.setResult = tupstore;
! 			}
! 
! 			/* Call the function or expression one time */
! 			if (direct_function_call)
! 			{
! 				fcinfo.isnull = false;
! 				rsinfo.isDone = ExprSingleResult;
! 				result = FunctionCallInvoke(&fcinfo);
! 			}
! 			else
! 			{
! 				result = ExecEvalExpr(funcexpr, econtext,
! 									  &fcinfo.isnull, &rsinfo.isDone);
! 			}
! 
! 			/* Which protocol does function want to use? */
! 			if (rsinfo.returnMode == SFRM_ValuePerCall)
! 			{
! 				/*
! 				 * Check for end of result set.
! 				 *
! 				 * Note: if function returns an empty set, we don't build a
! 				 * tupdesc or tuplestore (since we can't get a tupdesc in the
! 				 * function-returning-tuple case)
! 				 */
! 				if (rsinfo.isDone == ExprEndResult)
! 					break;
! 
! 				/*
! 				 * If first time through, build tupdesc for result
! 				 */
! 				if (first_time)
! 				{
! 					oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
! 					if (funcrettype == RECORDOID ||
! 						get_typtype(funcrettype) == 'c')
! 					{
! 						/*
! 						 * Composite type, so function should have returned a
! 						 * TupleTableSlot; use its descriptor
! 						 */
! 						slot = (TupleTableSlot *) DatumGetPointer(result);
! 						if (fcinfo.isnull ||
! 							!slot ||
! 							!IsA(slot, TupleTableSlot) ||
! 							!slot->ttc_tupleDescriptor)
! 							elog(ERROR, "ExecMakeTableFunctionResult: Invalid result from function returning tuple");
! 						tupdesc = CreateTupleDescCopy(slot->ttc_tupleDescriptor);
! 						returnsTuple = true;
! 					}
! 					else
! 					{
! 						/*
! 						 * Scalar type, so make a single-column descriptor
! 						 */
! 						tupdesc = CreateTemplateTupleDesc(1, false);
! 						TupleDescInitEntry(tupdesc,
! 										   (AttrNumber) 1,
! 										   "column",
! 										   funcrettype,
! 										   -1,
! 										   0,
! 										   false);
! 					}
! 					MemoryContextSwitchTo(oldcontext);
! 					rsinfo.setDesc = tupdesc;
! 					first_time = false;
! 				}
! 
! 				/*
! 				 * Store current resultset item.
! 				 */
! 				if (returnsTuple)
  				{
  					slot = (TupleTableSlot *) DatumGetPointer(result);
  					if (fcinfo.isnull ||
  						!slot ||
  						!IsA(slot, TupleTableSlot) ||
! 						TupIsNull(slot))
  						elog(ERROR, "ExecMakeTableFunctionResult: Invalid result from function returning tuple");
! 					tuple = slot->val;
  				}
  				else
  				{
! 					char		nullflag;
! 
! 					nullflag = fcinfo.isnull ? 'n' : ' ';
! 					tuple = heap_formtuple(tupdesc, &result, &nullflag);
  				}
! 
! 				oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
! 				tuplestore_puttuple(tupstore, tuple);
  				MemoryContextSwitchTo(oldcontext);
  
! 				/*
! 				 * Are we done?
! 				 */
! 				if (rsinfo.isDone != ExprMultipleResult)
! 					break;
  			}
! 			else if (rsinfo.returnMode == SFRM_Materialize)
  			{
! 				first_time = false;
  
! 				/* Done evaluating the set result */
  				break;
+ 			}
+ 			else
+ 				elog(ERROR, "ExecMakeTableFunctionResult: unknown returnMode %d",
+ 					 (int) rsinfo.returnMode);
  		}
  
! 		if (direct_function_call == false || argDone == ExprSingleResult)
! 			break;
  	}
  
  	/* If we have a locally-created tupstore, close it up */
Index: src/pl/plpgsql/src/pl_exec.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/pl/plpgsql/src/pl_exec.c,v
retrieving revision 1.75
diff -c -r1.75 pl_exec.c
*** src/pl/plpgsql/src/pl_exec.c	15 Dec 2002 16:17:58 -0000	1.75
--- src/pl/plpgsql/src/pl_exec.c	16 Dec 2002 21:21:47 -0000
***************
*** 351,357 ****
  			MemoryContext oldcxt;
  
  			oldcxt = MemoryContextSwitchTo(estate.tuple_store_cxt);
- 			tuplestore_donestoring(estate.tuple_store);
  			rsi->setResult = estate.tuple_store;
  			if (estate.rettupdesc)
  				rsi->setDesc = CreateTupleDescCopy(estate.rettupdesc);
--- 351,356 ----
***************
*** 1730,1736 ****
  exec_init_tuple_store(PLpgSQL_execstate * estate)
  {
  	ReturnSetInfo *rsi = estate->rsi;
- 	MemoryContext oldcxt;
  
  	/*
  	 * Check caller can handle a set result in the way we want
--- 1729,1734 ----
***************
*** 1741,1751 ****
  		elog(ERROR, "Set-valued function called in context that cannot accept a set");
  
  	estate->tuple_store_cxt = rsi->econtext->ecxt_per_query_memory;
! 
! 	oldcxt = MemoryContextSwitchTo(estate->tuple_store_cxt);
! 	estate->tuple_store = tuplestore_begin_heap(true, SortMem);
! 	MemoryContextSwitchTo(oldcxt);
! 
  	estate->rettupdesc = rsi->expectedDesc;
  }
  
--- 1739,1745 ----
  		elog(ERROR, "Set-valued function called in context that cannot accept a set");
  
  	estate->tuple_store_cxt = rsi->econtext->ecxt_per_query_memory;
! 	estate->tuple_store = rsi->setResult;
  	estate->rettupdesc = rsi->expectedDesc;
  }
  
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#1)
Re: [Fwd: SETOF input parameters (was Re: [HACKERS] proposal: array utility functions phase 1)]

Joe Conway <mail@joeconway.com> writes:

create or replace function test2() returns setof foot as 'select * from
foot order by 1 asc' language 'sql';
create or replace function test(setof foot) returns foot as 'select
$1.f1, $1.f2' language 'sql';

Uh, where exactly are you storing the information that the function
accepts a setof argument?

(We probably should be rejecting the above syntax at the moment, but
I suspect the parser just fails to notice the setof marker.)

A more serious objection is that this doesn't really address the
fundamental issue, namely that you can't drive a SRF from the results of
a query, except indirectly via single-purpose function definitions (like
test2() in your example).

I'm leaning more and more to the thought that we should reconsider the
Berkeley approach.

Another line of thought is to consider the possibilities of subselects
in the target list. For example,

SELECT ..., (SELECT ... FROM mysrf(a, b)) FROM foo WHERE ...;

I believe it's already the case that foo.a and foo.b can be transmitted
as arguments to mysrf() with this notation. The restriction is that the
sub-select can only return a single value (one row, one column) to the
outer query. It doesn't seem too outlandish to allow multiple columns
to be pulled up into the outer SELECT's result list given the above
syntax. I'm less sure about allowing multiple rows though. Any
thoughts?

regards, tom lane

#3Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#2)
Re: [Fwd: SETOF input parameters (was Re: [HACKERS] proposal:

Tom Lane wrote:

A more serious objection is that this doesn't really address the
fundamental issue, namely that you can't drive a SRF from the results of
a query, except indirectly via single-purpose function definitions (like
test2() in your example).

True enough. I've struggled trying to come up with a better way.

I'm leaning more and more to the thought that we should reconsider the
Berkeley approach.

The problem with the Berkley approach is what to do if there are two SRFs in
the target list.

Suppose

f(t1.x) returns:
1 a z
2 b y

and g(t2.y) returns:
3 q
5 w
7 e

and *without* the SRFs the query
select * from t1 join t2 on t1.id = t2.id;
would return:
id | x | id | y
------+------+------+------
4 | k | 4 | d
6 | v | 6 | u

What do we do for
select f(t1.x), g(t2.y), * from t1 join t2 on t1.id = t2.id;
?

Should we return 2 x 2 x 3 rows? Or do we impose a limit of 1 SRF in the
target list?

Another line of thought is to consider the possibilities of subselects
in the target list. For example,

SELECT ..., (SELECT ... FROM mysrf(a, b)) FROM foo WHERE ...;
I believe it's already the case that foo.a and foo.b can be transmitted
as arguments to mysrf() with this notation. The restriction is that the
sub-select can only return a single value (one row, one column) to the
outer query. It doesn't seem too outlandish to allow multiple columns
to be pulled up into the outer SELECT's result list given the above
syntax. I'm less sure about allowing multiple rows though.

This suffers from the same problem if there can be more than one subselect in
the target list (if multiple rows is allowed).

Any thoughts?

Is it too ugly to allow:
select ... from (select mysrf(foo.a, foo.b) from foo) as t;

where the Berkley syntax is restricted to where both are true:
1. a single target -- the srf
2. in a FROM clause subselect

In this case we could still use the column reference syntax too:
select ... from (select mysrf(foo.a, foo.b) from foo) as t(f1 int, f2 text);

But not allow the Berkley syntax for multi-row, multi-column SRFs otherwise.

What do you think?

Joe

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#3)
Re: [Fwd: SETOF input parameters (was Re: [HACKERS] proposal:

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

I'm leaning more and more to the thought that we should reconsider the
Berkeley approach.

The problem with the Berkley approach is what to do if there are two SRFs in
the target list.

Agreed. The Berkeley code (or more accurately, the descendant code
that's in our source tree) generates the cross product of the rows
output by the SRFs, but I've never understood why that should be a good
approach to take. I could live with just rejecting multiple SRFs in the
same targetlist --- at least till someone comes up with a convincing
semantics for such a thing.

Is it too ugly to allow:
select ... from (select mysrf(foo.a, foo.b) from foo) as t;

where the Berkley syntax is restricted to where both are true:
1. a single target -- the srf
2. in a FROM clause subselect

Point 2 doesn't mean anything I think. Given your point 1 then the
select mysrf() ... is well-defined regardless of context.

regards, tom lane

#5Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#4)
Re: [Fwd: SETOF input parameters (was Re: [HACKERS] proposal:

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

The problem with the Berkley approach is what to do if there are two SRFs in
the target list.

Agreed. The Berkeley code (or more accurately, the descendant code
that's in our source tree) generates the cross product of the rows
output by the SRFs, but I've never understood why that should be a good
approach to take. I could live with just rejecting multiple SRFs in the
same targetlist --- at least till someone comes up with a convincing
semantics for such a thing.

I would like to start spending some time digging in to this. Any pointers or
thoughts on the best way to implement it? A little direction might save me
days of wheel spinning :-).

Thanks,

Joe

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#5)
Re: [Fwd: SETOF input parameters (was Re: [HACKERS] proposal:

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

Agreed. The Berkeley code (or more accurately, the descendant code
that's in our source tree) generates the cross product of the rows
output by the SRFs, but I've never understood why that should be a good
approach to take. I could live with just rejecting multiple SRFs in the
same targetlist --- at least till someone comes up with a convincing
semantics for such a thing.

I would like to start spending some time digging in to this. Any pointers or
thoughts on the best way to implement it? A little direction might save me
days of wheel spinning :-).

Implement what exactly?

The code that presently does the dirty work is in ExecTargetList(), if
that's what you're looking for...

regards, tom lane

#7Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#6)
Re: [Fwd: SETOF input parameters (was Re: [HACKERS] proposal:

Tom Lane wrote:

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

Agreed. The Berkeley code (or more accurately, the descendant code
that's in our source tree) generates the cross product of the rows
output by the SRFs, but I've never understood why that should be a good
approach to take. I could live with just rejecting multiple SRFs in the
same targetlist --- at least till someone comes up with a convincing
semantics for such a thing.

I would like to start spending some time digging in to this. Any pointers or
thoughts on the best way to implement it? A little direction might save me
days of wheel spinning :-).

Implement what exactly?

Well, I want to allow a single table function (or srf if you prefer) in the
target list as discussed above.

Currently, when you try it, record_out() gets called from printtup() when the
srf is encountered, which generates an ERROR. The behavior in 7.2.x is to
return a pointer when the composite type is output. I think that to make this
work as discussed, the target list needs to be "expanded" for the composite
type (similar to expanding a "*" I would think), so I was starting to look at
transformTargetList() and ExpandAllTables().

The code that presently does the dirty work is in ExecTargetList(), if
that's what you're looking for...

OK -- i'll check that out too.

Thanks,

Joe

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#7)
Re: [Fwd: SETOF input parameters (was Re: [HACKERS] proposal:

Joe Conway <mail@joeconway.com> writes:

Tom Lane wrote:

Implement what exactly?

Well, I want to allow a single table function (or srf if you prefer) in the
target list as discussed above.

Currently, when you try it, record_out() gets called from printtup() when the
srf is encountered, which generates an ERROR.

Oh, you're thinking about the multi-column aspect of it, not the
multi-row aspect. You really ought to keep those strictly separate;
their design and implementation problems are quite different IMHO.
I find it quite confusing to refer to both cases as "SRFs".

I think there once was support for multi-column function results in
SELECT; at least that's the best understanding I can muster of the old
"fjoin" code. But that's been broken since the Postgres95 SQL rewrite
(if not earlier), and I finally ripped out the last shreds of it just a
couple weeks ago. I don't think it could have been revived short of a
complete rewrite anyway, so I'm not shedding a tear. You could look at
old releases to see how it worked, though you might have to go back to
Postgres 4.2 or even before to find something that "worked".

My thought for implementation would not be to revive Fjoin as it was;
there are just too many places that deal with targetlists to make it
practical to have an alternative targetlist datastructure for this.
(The reason Fjoin has been too broken to contemplate reviving for all
these years is exactly that there were too many places that had never
coped with it.) I'd think more about adding a level of projection
(probably embedded in a Result plan node) that expands out the SRF tuple
result into individual columns.

Before that, though, you'd better put forward a workable user interface
for this; I'd wonder in particular what the names of the expanded-out
columns will be, and whether they'd be accessible from places that can
normally see output column names (such as ORDER BY). And what if a
multi-column function appears in the targetlist of a sub-SELECT?

On the implementation level, I think you will need to face up to the
problem of allowing a tuple value to be embedded as a column of a larger
tuple. It might be possible to avoid that, but I think it will take
some monstrous kluges to do so. (The existing pretend-a-TupleTableSlot-
pointer-is-a-Datum approach is already a monstrous kluge, not to mention
a source of memory leaks.) Once you've fixed that, the existing
FieldSelect expression node probably is all the run-time mechanism
needed to support the projection step I suggested above.

regards, tom lane

#9Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#8)
targetlist functions proposals (was SETOF input parameters)

(moved from PATCHES back to HACKERS)

Tom Lane wrote:

Oh, you're thinking about the multi-column aspect of it, not the
multi-row aspect. You really ought to keep those strictly separate;
their design and implementation problems are quite different IMHO.
I find it quite confusing to refer to both cases as "SRFs".

[...snip...]

Before that, though, you'd better put forward a workable user interface
for this; I'd wonder in particular what the names of the expanded-out
columns will be, and whether they'd be accessible from places that can
normally see output column names (such as ORDER BY). And what if a
multi-column function appears in the targetlist of a sub-SELECT?

I've put some thought into *two* proposals for how targetlist functions should
behave -- one for a function that returns multiple rows, and one for a
function that returns multiple columns. The need for this was highlighted
recently when I submitted a proposal for array utility functions; see:
http://archives.postgresql.org/pgsql-hackers/2002-12/msg00461.php

At this point I don't have a clear idea how the latter would be implemented
(or if it even *can be* implemented with reasonable effort), but I wanted to
try to get agreement on the interface behavior before getting too caught up in
how to make it work. I think the former is reasonably straightforward (but
could well be wrong).

This is fairly long, so if you're not interested please delete now and accept
my apologies :-)

Proposals are below. Thoughts?

Thanks,

Joe

=================================================================
User interface proposal for multi-row function targetlist entries
=================================================================
1. Only one targetlist entry may return a set.
2. Each targetlist item (other than the set returning one) is
repeated for each item in the returned set.

Examples illustrating the need (these work on cvs HEAD):

CREATE TABLE bar(f1 int, f2 text, f3 int);
INSERT INTO bar VALUES(1, 'Hello', 42);
INSERT INTO bar VALUES(2, 'Happy', 45);

CREATE TABLE foo(a int, b text);
INSERT INTO foo VALUES(42, 'World');
INSERT INTO foo VALUES(42, 'Everyone');
INSERT INTO foo VALUES(45, 'Birthday');
INSERT INTO foo VALUES(45, 'New Year');

CREATE OR REPLACE FUNCTION getfoo(int) RETURNS SETOF text AS '
SELECT b FROM foo WHERE a = $1
' language 'sql';

regression=# SELECT f1, f2, getfoo(f3) AS f4 FROM bar;
f1 | f2 | f4
----+-------+----------
1 | Hello | World
1 | Hello | Everyone
2 | Happy | Birthday
2 | Happy | New Year
(4 rows)

Note that this is exatly how things currently work, i.e. there
is no restriction to the number of set returning targetlist entries.
This lack of restriction leads to strange and unexpected results (at
least IMHO). Continuing the example:

CREATE TABLE foo2(a int, b text);
INSERT INTO foo2 VALUES(42, '!!!!');
INSERT INTO foo2 VALUES(42, '????');
INSERT INTO foo2 VALUES(42, '####');
INSERT INTO foo2 VALUES(45, '$$$$');

CREATE OR REPLACE FUNCTION getfoo2(int) RETURNS SETOF text AS '
SELECT b FROM foo2 WHERE a = $1
' language 'sql';

Now, what *should* the following return if we allow multiple set
returning functions in the targetlist? Here's what it currently
does:

regression=# SELECT f1, f2, getfoo(f3) AS f4, getfoo2(f3) AS f5 FROM bar;
f1 | f2 | f4 | f5
----+-------+----------+------
1 | Hello | World | !!!!
1 | Hello | Everyone | ????
1 | Hello | World | ####
1 | Hello | Everyone | !!!!
1 | Hello | World | ????
1 | Hello | Everyone | ####
2 | Happy | Birthday | $$$$
2 | Happy | New Year | $$$$
(8 rows)

Not very useful as there is no way to prevent the apparent cartesian
join. But now try:

TRUNCATE TABLE foo2;
INSERT INTO foo2 VALUES(42, '!!!!');
INSERT INTO foo2 VALUES(42, '????');
INSERT INTO foo2 VALUES(45, '####');
INSERT INTO foo2 VALUES(45, '$$$$');

regression=# SELECT f1, f2, getfoo(f3) AS f4, getfoo2(f3) AS f5 FROM bar;
f1 | f2 | f4 | f5
----+-------+----------+------
1 | Hello | World | !!!!
1 | Hello | Everyone | ????
2 | Happy | Birthday | ####
2 | Happy | New Year | $$$$
(4 rows)

Hmmm, what happened to that cartesian join?

Under the proposal the two previous scenarios are disallowed with an ERROR.

============================================================================
User interface proposal for multi-column function targetlist entries
============================================================================
1. One, or more, targetlist entries may be a multi-column (composite) type.
2. For functions declared to return a named composite type, the
column names and types are as prescribed by the type unless overridden
in an alias definition.
3. For functions declared to return a "record" type, a column
definition list would be required as an alias at runtime.
4. Any alias provided for a composite returning function must match
the number of columns returned, and types if provided.
5. The composite function column names would be accessible from
places that can normally see output column names (such as ORDER BY).
6. When a composite function appears in the targetlist of a sub-SELECT,
the function's columns should be available outside the sub-SELECT in
the same manner as the other targetlist entries in the sub-SELECT.

Examples (these are all contrived):

CREATE TABLE bar(f1 int, f2 int);
INSERT INTO bar VALUES(1, 2);
INSERT INTO bar VALUES(2, 3);

CREATE TABLE foo(a int, b text);
INSERT INTO foo VALUES(1, 'a');
INSERT INTO foo VALUES(2, 'b');
INSERT INTO foo VALUES(3, 'c');
INSERT INTO foo VALUES(4, 'd');

CREATE OR REPLACE FUNCTION getfoo(int) RETURNS SETOF foo AS '
SELECT * FROM foo WHERE a = $1
' language 'sql';

SELECT f1, getfoo(f2) AS f3 FROM bar;
ERROR: function getfoo alias does not match its declared type

SELECT f1, getfoo(f2) AS (f3, f4) FROM bar;
f1 | f3 | f4
----+----+-----
1 | 2 | b
2 | 3 | c
(2 rows)

SELECT f1, getfoo(f2) AS (f3 int, f4 text) FROM bar;
f1 | f3 | f4
----+----+-----
1 | 2 | b
2 | 3 | c
(2 rows)

SELECT f1, getfoo(f2) FROM bar;
f1 | a | b
----+---+-----
1 | 2 | b
2 | 3 | c
(2 rows)

DROP FUNCTION getfoo(int);
CREATE OR REPLACE FUNCTION getfoo(int) RETURNS SETOF record AS '
SELECT * FROM foo WHERE a = $1
' language 'sql';

SELECT f1, getfoo(f2) AS f3 FROM bar;
ERROR: A column definition list is required for functions returning RECORD

SELECT f1, getfoo(f2) AS (f3 int, f4 text) FROM bar;
f1 | f3 | f4
----+----+-----
1 | 2 | b
2 | 3 | c
(2 rows)

#10Joe Conway
mail@joeconway.com
In reply to: Joe Conway (#9)
1 attachment(s)
targetlist functions part 1 (was [HACKERS] targetlist functions proposals)

Joe Conway wrote:

=================================================================
User interface proposal for multi-row function targetlist entries
=================================================================
1. Only one targetlist entry may return a set.
2. Each targetlist item (other than the set returning one) is
repeated for each item in the returned set.

Having gotten no objections (actually, no response at all), I can only assume
no one had heartburn with this change. The attached patch covers the first of
the two proposals, i.e. restricting the target list to only one set returning
function.

It compiles cleanly, and passes all regression tests. If there are no
objections, please apply.

Any suggestions on where this should be documented (other than maybe sql-select)?

Thanks,

Joe

p.s. Here's what the previous example now looks like:
CREATE TABLE bar(f1 int, f2 text, f3 int);
INSERT INTO bar VALUES(1, 'Hello', 42);
INSERT INTO bar VALUES(2, 'Happy', 45);

CREATE TABLE foo(a int, b text);
INSERT INTO foo VALUES(42, 'World');
INSERT INTO foo VALUES(42, 'Everyone');
INSERT INTO foo VALUES(45, 'Birthday');
INSERT INTO foo VALUES(45, 'New Year');

CREATE TABLE foo2(a int, b text);
INSERT INTO foo2 VALUES(42, '!!!!');
INSERT INTO foo2 VALUES(42, '????');
INSERT INTO foo2 VALUES(42, '####');
INSERT INTO foo2 VALUES(45, '$$$$');

CREATE OR REPLACE FUNCTION getfoo(int) RETURNS SETOF text AS '
SELECT b FROM foo WHERE a = $1
' language 'sql';

CREATE OR REPLACE FUNCTION getfoo2(int) RETURNS SETOF text AS '
SELECT b FROM foo2 WHERE a = $1
' language 'sql';

regression=# SELECT f1, f2, getfoo(f3) AS f4 FROM bar;
f1 | f2 | f4
----+-------+----------
1 | Hello | World
1 | Hello | Everyone
2 | Happy | Birthday
2 | Happy | New Year
(4 rows)

regression=# SELECT f1, f2, getfoo(f3) AS f4, getfoo2(f3) AS f5 FROM bar;
ERROR: Only one target list entry may return a set result

Attachments:

targetlist-retset.1.patchtext/plain; name=targetlist-retset.1.patchDownload
Index: src/backend/parser/parse_clause.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/parser/parse_clause.c,v
retrieving revision 1.103
diff -c -r1.103 parse_clause.c
*** src/backend/parser/parse_clause.c	16 Dec 2002 18:39:22 -0000	1.103
--- src/backend/parser/parse_clause.c	12 Jan 2003 19:23:57 -0000
***************
*** 1121,1127 ****
  	 * the end of the target list.	This target is given resjunk = TRUE so
  	 * that it will not be projected into the final tuple.
  	 */
! 	target_result = transformTargetEntry(pstate, node, expr, NULL, true);
  	lappend(tlist, target_result);
  
  	return target_result;
--- 1121,1127 ----
  	 * the end of the target list.	This target is given resjunk = TRUE so
  	 * that it will not be projected into the final tuple.
  	 */
! 	target_result = transformTargetEntry(pstate, node, expr, NULL, true, NULL);
  	lappend(tlist, target_result);
  
  	return target_result;
Index: src/backend/parser/parse_target.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/parser/parse_target.c,v
retrieving revision 1.94
diff -c -r1.94 parse_target.c
*** src/backend/parser/parse_target.c	12 Dec 2002 20:35:13 -0000	1.94
--- src/backend/parser/parse_target.c	12 Jan 2003 19:25:16 -0000
***************
*** 42,54 ****
   * colname	the column name to be assigned, or NULL if none yet set.
   * resjunk	true if the target should be marked resjunk, ie, it is not
   *			wanted in the final projected tuple.
   */
  TargetEntry *
  transformTargetEntry(ParseState *pstate,
  					 Node *node,
  					 Node *expr,
  					 char *colname,
! 					 bool resjunk)
  {
  	Oid			type_id;
  	int32		type_mod;
--- 42,57 ----
   * colname	the column name to be assigned, or NULL if none yet set.
   * resjunk	true if the target should be marked resjunk, ie, it is not
   *			wanted in the final projected tuple.
+  * retset	if non-NULL, and the entry is a function expression, pass back
+  *			expr->funcretset
   */
  TargetEntry *
  transformTargetEntry(ParseState *pstate,
  					 Node *node,
  					 Node *expr,
  					 char *colname,
! 					 bool resjunk,
! 					 bool *retset)
  {
  	Oid			type_id;
  	int32		type_mod;
***************
*** 61,66 ****
--- 64,72 ----
  	if (IsA(expr, RangeVar))
  		elog(ERROR, "You can't use relation names alone in the target list, try relation.*.");
  
+ 	if (retset && IsA(expr, FuncExpr))
+ 		*retset = ((FuncExpr *) expr)->funcretset;
+ 
  	type_id = exprType(expr);
  	type_mod = exprTypmod(expr);
  
***************
*** 93,102 ****
--- 99,110 ----
  List *
  transformTargetList(ParseState *pstate, List *targetlist)
  {
+ 	bool		retset = false;
  	List	   *p_target = NIL;
  
  	while (targetlist != NIL)
  	{
+ 		bool		entry_retset = false;
  		ResTarget  *res = (ResTarget *) lfirst(targetlist);
  
  		if (IsA(res->val, ColumnRef))
***************
*** 173,179 ****
  														res->val,
  														NULL,
  														res->name,
! 														false));
  			}
  		}
  		else if (IsA(res->val, InsertDefault))
--- 181,188 ----
  														res->val,
  														NULL,
  														res->name,
! 														false,
! 														&entry_retset));
  			}
  		}
  		else if (IsA(res->val, InsertDefault))
***************
*** 194,201 ****
  													res->val,
  													NULL,
  													res->name,
! 													false));
  		}
  
  		targetlist = lnext(targetlist);
  	}
--- 203,217 ----
  													res->val,
  													NULL,
  													res->name,
! 													false,
! 													&entry_retset));
  		}
+ 
+ 		if (retset && entry_retset)
+ 			elog(ERROR, "Only one target list entry may return a set result");
+ 
+ 		if (entry_retset)
+ 			retset = true;
  
  		targetlist = lnext(targetlist);
  	}
Index: src/include/parser/parse_target.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/parser/parse_target.h,v
retrieving revision 1.27
diff -c -r1.27 parse_target.h
*** src/include/parser/parse_target.h	18 Sep 2002 21:35:24 -0000	1.27
--- src/include/parser/parse_target.h	12 Jan 2003 19:08:56 -0000
***************
*** 20,26 ****
  extern List *transformTargetList(ParseState *pstate, List *targetlist);
  extern TargetEntry *transformTargetEntry(ParseState *pstate,
  					 Node *node, Node *expr,
! 					 char *colname, bool resjunk);
  extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle,
  					  char *colname, int attrno,
  					  List *indirection);
--- 20,26 ----
  extern List *transformTargetList(ParseState *pstate, List *targetlist);
  extern TargetEntry *transformTargetEntry(ParseState *pstate,
  					 Node *node, Node *expr,
! 					 char *colname, bool resjunk, bool *retset);
  extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle,
  					  char *colname, int attrno,
  					  List *indirection);
#11Peter Eisentraut
peter_e@gmx.net
In reply to: Joe Conway (#10)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Joe Conway writes:

=================================================================
User interface proposal for multi-row function targetlist entries
=================================================================
1. Only one targetlist entry may return a set.
2. Each targetlist item (other than the set returning one) is
repeated for each item in the returned set.

Since we have set functions in the FROM list, what do set functions in the
target list give us?

--
Peter Eisentraut peter_e@gmx.net

#12Joe Conway
mail@joeconway.com
In reply to: Peter Eisentraut (#11)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Peter Eisentraut wrote:

Since we have set functions in the FROM list, what do set functions in the
target list give us?

The full discussion is a bit long, but it starts here:
http://archives.postgresql.org/pgsql-hackers/2002-12/msg00461.php
and picks up again here:
http://archives.postgresql.org/pgsql-patches/2002-12/msg00166.php

The short answer is we need a way to allow a "table function" to fire
multiple times given one or more columns from a table as input.

Joe

#13Peter Eisentraut
peter_e@gmx.net
In reply to: Joe Conway (#12)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Joe Conway writes:

The short answer is we need a way to allow a "table function" to fire
multiple times given one or more columns from a table as input.

Has there been an evaluation of the SQL standard and other databases how
this is handled? I can't believe we're operating in ground-breaking
territory here. What concerns me is that we're inserting table-generating
syntax elements into the select list, which is where they've never
belonged.

--
Peter Eisentraut peter_e@gmx.net

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#13)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Peter Eisentraut <peter_e@gmx.net> writes:

I can't believe we're operating in ground-breaking
territory here.

We're not. This functionality has always been in Postgres, right back
to the PostQUEL days. Joe is trying to clean it up to the point where
it has explainable, defensible semantics. But he's not adding something
that wasn't there before.

What concerns me is that we're inserting table-generating
syntax elements into the select list, which is where they've never
belonged.

Do you see another way to pass non-constant arguments to the
table-generating function?

regards, tom lane

#15Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#14)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Tom Lane writes:

What concerns me is that we're inserting table-generating
syntax elements into the select list, which is where they've never
belonged.

Do you see another way to pass non-constant arguments to the
table-generating function?

SELECT * FROM table1 AS t1, table2 AS t2, func1(t1.col, t2.col) ... ?

That's a syntax that would make sense to me.

With sufficiently blurred vision one might even find SQL99's clause

<collection derived table> ::=
UNNEST <left paren> <collection value expression> <right paren>

<table primary> ::=
<table or query name> [ [ AS ] <correlation name>
[ <left paren> <derived column list> <right paren> ] ]
| <derived table> [ AS ] <correlation name>
[ <left paren> <derived column list> <right paren> ]
| <lateral derived table> [ AS ] <correlation name>
[ <left paren> <derived column list> <right paren> ]
| <collection derived table> [ AS ] <correlation name>
[ <left paren> <derived column list> <right paren> ]
| <only spec>
[ [ AS ] <correlation name>
[ <left paren> <derived column list> <right paren> ] ]
| <left paren> <joined table> <right paren>

applicable. Or maybe not.

--
Peter Eisentraut peter_e@gmx.net

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#15)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane writes:

Do you see another way to pass non-constant arguments to the
table-generating function?

SELECT * FROM table1 AS t1, table2 AS t2, func1(t1.col, t2.col) ... ?

That's a syntax that would make sense to me.

That syntax makes no sense whatsoever to me. You are imputing a
causal connection between FROM elements that are at the same level,
which is just totally contrary to any sane understanding of SQL
semantics. Exactly which t1.col value(s) do you see the above syntax
as passing to the func()? Your answer had better not mention the
WHERE clause, because the input tables have to be determined before
WHERE has anything to operate on.

With sufficiently blurred vision one might even find SQL99's clause
<collection derived table> ::=
UNNEST <left paren> <collection value expression> <right paren>
applicable. Or maybe not.

Hm. I'm not sure what UNNEST does, but now that you bring SQL99 into
the picture, what about WITH? That might solve the problem, because
(I think) WITH tables are logically determined before the main SELECT
begins to execute.

regards, tom lane

#17Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#16)
Re: [PATCHES] targetlist functions part 1 (was targetlist

Tom Lane writes:

With sufficiently blurred vision one might even find SQL99's clause
<collection derived table> ::=
UNNEST <left paren> <collection value expression> <right paren>
applicable. Or maybe not.

Hm. I'm not sure what UNNEST does, but now that you bring SQL99 into
the picture, what about WITH? That might solve the problem, because
(I think) WITH tables are logically determined before the main SELECT
begins to execute.

The SQL 200x draft defines a new clause TABLE ( <collection value
expression> ) as a possible <table primary>, where the <collection value
expression> is required to be a function call. At the end this just boils
down to UNNEST, though. UNNEST is defined in terms of a hairy recursive
join subquery with a LATERAL( ) around it. LATERAL(subquery) is the same
as just (subquery) except that the scope clauses are different. So I
think this is probably what we ought to look at.

--
Peter Eisentraut peter_e@gmx.net

#18Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#17)
Re: [PATCHES] targetlist functions part 1 (was targetlist

I wrote:

The SQL 200x draft defines a new clause TABLE ( <collection value
expression> ) as a possible <table primary>, where the <collection value
expression> is required to be a function call. At the end this just boils
down to UNNEST, though. UNNEST is defined in terms of a hairy recursive
join subquery with a LATERAL( ) around it. LATERAL(subquery) is the same
as just (subquery) except that the scope clauses are different. So I
think this is probably what we ought to look at.

I have stared at this some more and it is indeed what we're looking for.
The hairy recursive join is only so that they can get the WITH ORDINALITY
feature (which basically adds a "line number" column to the output) in
there in a closed form. If you simplify it, the command

SELECT ... FROM TABLE( func(...) ) ...

resolves to

SELECT ... FROM table_generated_by_func ...

As for the question of where nonconstant argument values come from, this
is addressed as well. The general form of this feature is the lateral
derived table, for example

SELECT ... FROM table1, LATERAL(select ...), table2 ...

as opposed to simply

SELECT ... FROM table1, (select ...), table2 ...

In the second form the subquery cannot access outside values. In the
first form, the subquery can access range variables in FROM items defined
to its left.

The table function calls are a special case of this, meaning that in

SELECT .. FROM table1, TABLE( func(args) ), table2 ...

the "args" can refer to table1 but not to table2.

Oracle implements exactly this feature. (See
<http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96540/statements_103a.htm#2104992&gt;.)
If there are doubts about the semantics we could try it out there.

--
Peter Eisentraut peter_e@gmx.net

#19Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Joe Conway (#10)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

---------------------------------------------------------------------------

Joe Conway wrote:

Joe Conway wrote:

=================================================================
User interface proposal for multi-row function targetlist entries
=================================================================
1. Only one targetlist entry may return a set.
2. Each targetlist item (other than the set returning one) is
repeated for each item in the returned set.

Having gotten no objections (actually, no response at all), I can only assume
no one had heartburn with this change. The attached patch covers the first of
the two proposals, i.e. restricting the target list to only one set returning
function.

It compiles cleanly, and passes all regression tests. If there are no
objections, please apply.

Any suggestions on where this should be documented (other than maybe sql-select)?

Thanks,

Joe

p.s. Here's what the previous example now looks like:
CREATE TABLE bar(f1 int, f2 text, f3 int);
INSERT INTO bar VALUES(1, 'Hello', 42);
INSERT INTO bar VALUES(2, 'Happy', 45);

CREATE TABLE foo(a int, b text);
INSERT INTO foo VALUES(42, 'World');
INSERT INTO foo VALUES(42, 'Everyone');
INSERT INTO foo VALUES(45, 'Birthday');
INSERT INTO foo VALUES(45, 'New Year');

CREATE TABLE foo2(a int, b text);
INSERT INTO foo2 VALUES(42, '!!!!');
INSERT INTO foo2 VALUES(42, '????');
INSERT INTO foo2 VALUES(42, '####');
INSERT INTO foo2 VALUES(45, '$$$$');

CREATE OR REPLACE FUNCTION getfoo(int) RETURNS SETOF text AS '
SELECT b FROM foo WHERE a = $1
' language 'sql';

CREATE OR REPLACE FUNCTION getfoo2(int) RETURNS SETOF text AS '
SELECT b FROM foo2 WHERE a = $1
' language 'sql';

regression=# SELECT f1, f2, getfoo(f3) AS f4 FROM bar;
f1 | f2 | f4
----+-------+----------
1 | Hello | World
1 | Hello | Everyone
2 | Happy | Birthday
2 | Happy | New Year
(4 rows)

regression=# SELECT f1, f2, getfoo(f3) AS f4, getfoo2(f3) AS f5 FROM bar;
ERROR: Only one target list entry may return a set result

Index: src/backend/parser/parse_clause.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/parser/parse_clause.c,v
retrieving revision 1.103
diff -c -r1.103 parse_clause.c
*** src/backend/parser/parse_clause.c	16 Dec 2002 18:39:22 -0000	1.103
--- src/backend/parser/parse_clause.c	12 Jan 2003 19:23:57 -0000
***************
*** 1121,1127 ****
* the end of the target list.	This target is given resjunk = TRUE so
* that it will not be projected into the final tuple.
*/
! 	target_result = transformTargetEntry(pstate, node, expr, NULL, true);
lappend(tlist, target_result);
return target_result;
--- 1121,1127 ----
* the end of the target list.	This target is given resjunk = TRUE so
* that it will not be projected into the final tuple.
*/
! 	target_result = transformTargetEntry(pstate, node, expr, NULL, true, NULL);
lappend(tlist, target_result);
return target_result;
Index: src/backend/parser/parse_target.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/parser/parse_target.c,v
retrieving revision 1.94
diff -c -r1.94 parse_target.c
*** src/backend/parser/parse_target.c	12 Dec 2002 20:35:13 -0000	1.94
--- src/backend/parser/parse_target.c	12 Jan 2003 19:25:16 -0000
***************
*** 42,54 ****
* colname	the column name to be assigned, or NULL if none yet set.
* resjunk	true if the target should be marked resjunk, ie, it is not
*			wanted in the final projected tuple.
*/
TargetEntry *
transformTargetEntry(ParseState *pstate,
Node *node,
Node *expr,
char *colname,
! 					 bool resjunk)
{
Oid			type_id;
int32		type_mod;
--- 42,57 ----
* colname	the column name to be assigned, or NULL if none yet set.
* resjunk	true if the target should be marked resjunk, ie, it is not
*			wanted in the final projected tuple.
+  * retset	if non-NULL, and the entry is a function expression, pass back
+  *			expr->funcretset
*/
TargetEntry *
transformTargetEntry(ParseState *pstate,
Node *node,
Node *expr,
char *colname,
! 					 bool resjunk,
! 					 bool *retset)
{
Oid			type_id;
int32		type_mod;
***************
*** 61,66 ****
--- 64,72 ----
if (IsA(expr, RangeVar))
elog(ERROR, "You can't use relation names alone in the target list, try relation.*.");
+ 	if (retset && IsA(expr, FuncExpr))
+ 		*retset = ((FuncExpr *) expr)->funcretset;
+ 
type_id = exprType(expr);
type_mod = exprTypmod(expr);
***************
*** 93,102 ****
--- 99,110 ----
List *
transformTargetList(ParseState *pstate, List *targetlist)
{
+ 	bool		retset = false;
List	   *p_target = NIL;

while (targetlist != NIL)
{
+ bool entry_retset = false;
ResTarget *res = (ResTarget *) lfirst(targetlist);

if (IsA(res->val, ColumnRef))
***************
*** 173,179 ****
res->val,
NULL,
res->name,
! 														false));
}
}
else if (IsA(res->val, InsertDefault))
--- 181,188 ----
res->val,
NULL,
res->name,
! 														false,
! 														&entry_retset));
}
}
else if (IsA(res->val, InsertDefault))
***************
*** 194,201 ****
res->val,
NULL,
res->name,
! 													false));
}
targetlist = lnext(targetlist);
}
--- 203,217 ----
res->val,
NULL,
res->name,
! 													false,
! 													&entry_retset));
}
+ 
+ 		if (retset && entry_retset)
+ 			elog(ERROR, "Only one target list entry may return a set result");
+ 
+ 		if (entry_retset)
+ 			retset = true;
targetlist = lnext(targetlist);
}
Index: src/include/parser/parse_target.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/parser/parse_target.h,v
retrieving revision 1.27
diff -c -r1.27 parse_target.h
*** src/include/parser/parse_target.h	18 Sep 2002 21:35:24 -0000	1.27
--- src/include/parser/parse_target.h	12 Jan 2003 19:08:56 -0000
***************
*** 20,26 ****
extern List *transformTargetList(ParseState *pstate, List *targetlist);
extern TargetEntry *transformTargetEntry(ParseState *pstate,
Node *node, Node *expr,
! 					 char *colname, bool resjunk);
extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle,
char *colname, int attrno,
List *indirection);
--- 20,26 ----
extern List *transformTargetList(ParseState *pstate, List *targetlist);
extern TargetEntry *transformTargetEntry(ParseState *pstate,
Node *node, Node *expr,
! 					 char *colname, bool resjunk, bool *retset);
extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle,
char *colname, int attrno,
List *indirection);

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#19)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Your patch has been added to the PostgreSQL unapplied patches list at:

This patch was objected to by Peter, IIRC, and I think I agree with him.
We should look at whether we can't solve the problem via SQL99 features
before pumping new life into that crufty old Berkeley syntax.

regards, tom lane

Show quoted text

Joe Conway wrote:

=================================================================
User interface proposal for multi-row function targetlist entries
=================================================================

#21Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#20)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Your patch has been added to the PostgreSQL unapplied patches list at:

This patch was objected to by Peter, IIRC, and I think I agree with him.
We should look at whether we can't solve the problem via SQL99 features
before pumping new life into that crufty old Berkeley syntax.

I know I haven't had time to absorb Peter's suggestions and comment, but I
think the current behavior is broken, and this patch should be applied anyway
(this was only yhe first half of my proposal -- i.e. prevent more than one
targetlist srf). The only reason I can think to not apply it, is if you think
we should completely disallow targetlist set returning functions as part of
moving to SQL99.

Joe

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#21)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Joe Conway <mail@joeconway.com> writes:

The only reason I can think to not apply it, is if you think
we should completely disallow targetlist set returning functions as part of
moving to SQL99.

I would like to eventually get rid of targetlist SRF's altogether.
I believe the feature represents a nontrivial drag on executor
performance and reliability even when it's not being used. (Look at all
the isDone cruft in execQual, the TupFromTlist hoop-jumping, the places
that are missing TupFromTlist hoop-jumping and should have it, etc.)
Obviously we can't do that until we have a fully functional alternative,
which FROM-list SRF's aren't. But if there is a chance of getting rid
of them via SQL99 extensions then I'd like to pursue that direction.

In the meantime, I don't see any need to spend any effort on cleaning up
what we're likely to discard altogether later...

regards, tom lane

#23Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Joe Conway (#10)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Patch applied. Thanks.

---------------------------------------------------------------------------

Joe Conway wrote:

Joe Conway wrote:

=================================================================
User interface proposal for multi-row function targetlist entries
=================================================================
1. Only one targetlist entry may return a set.
2. Each targetlist item (other than the set returning one) is
repeated for each item in the returned set.

Having gotten no objections (actually, no response at all), I can only assume
no one had heartburn with this change. The attached patch covers the first of
the two proposals, i.e. restricting the target list to only one set returning
function.

It compiles cleanly, and passes all regression tests. If there are no
objections, please apply.

Any suggestions on where this should be documented (other than maybe sql-select)?

Thanks,

Joe

p.s. Here's what the previous example now looks like:
CREATE TABLE bar(f1 int, f2 text, f3 int);
INSERT INTO bar VALUES(1, 'Hello', 42);
INSERT INTO bar VALUES(2, 'Happy', 45);

CREATE TABLE foo(a int, b text);
INSERT INTO foo VALUES(42, 'World');
INSERT INTO foo VALUES(42, 'Everyone');
INSERT INTO foo VALUES(45, 'Birthday');
INSERT INTO foo VALUES(45, 'New Year');

CREATE TABLE foo2(a int, b text);
INSERT INTO foo2 VALUES(42, '!!!!');
INSERT INTO foo2 VALUES(42, '????');
INSERT INTO foo2 VALUES(42, '####');
INSERT INTO foo2 VALUES(45, '$$$$');

CREATE OR REPLACE FUNCTION getfoo(int) RETURNS SETOF text AS '
SELECT b FROM foo WHERE a = $1
' language 'sql';

CREATE OR REPLACE FUNCTION getfoo2(int) RETURNS SETOF text AS '
SELECT b FROM foo2 WHERE a = $1
' language 'sql';

regression=# SELECT f1, f2, getfoo(f3) AS f4 FROM bar;
f1 | f2 | f4
----+-------+----------
1 | Hello | World
1 | Hello | Everyone
2 | Happy | Birthday
2 | Happy | New Year
(4 rows)

regression=# SELECT f1, f2, getfoo(f3) AS f4, getfoo2(f3) AS f5 FROM bar;
ERROR: Only one target list entry may return a set result

Index: src/backend/parser/parse_clause.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/parser/parse_clause.c,v
retrieving revision 1.103
diff -c -r1.103 parse_clause.c
*** src/backend/parser/parse_clause.c	16 Dec 2002 18:39:22 -0000	1.103
--- src/backend/parser/parse_clause.c	12 Jan 2003 19:23:57 -0000
***************
*** 1121,1127 ****
* the end of the target list.	This target is given resjunk = TRUE so
* that it will not be projected into the final tuple.
*/
! 	target_result = transformTargetEntry(pstate, node, expr, NULL, true);
lappend(tlist, target_result);
return target_result;
--- 1121,1127 ----
* the end of the target list.	This target is given resjunk = TRUE so
* that it will not be projected into the final tuple.
*/
! 	target_result = transformTargetEntry(pstate, node, expr, NULL, true, NULL);
lappend(tlist, target_result);
return target_result;
Index: src/backend/parser/parse_target.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/parser/parse_target.c,v
retrieving revision 1.94
diff -c -r1.94 parse_target.c
*** src/backend/parser/parse_target.c	12 Dec 2002 20:35:13 -0000	1.94
--- src/backend/parser/parse_target.c	12 Jan 2003 19:25:16 -0000
***************
*** 42,54 ****
* colname	the column name to be assigned, or NULL if none yet set.
* resjunk	true if the target should be marked resjunk, ie, it is not
*			wanted in the final projected tuple.
*/
TargetEntry *
transformTargetEntry(ParseState *pstate,
Node *node,
Node *expr,
char *colname,
! 					 bool resjunk)
{
Oid			type_id;
int32		type_mod;
--- 42,57 ----
* colname	the column name to be assigned, or NULL if none yet set.
* resjunk	true if the target should be marked resjunk, ie, it is not
*			wanted in the final projected tuple.
+  * retset	if non-NULL, and the entry is a function expression, pass back
+  *			expr->funcretset
*/
TargetEntry *
transformTargetEntry(ParseState *pstate,
Node *node,
Node *expr,
char *colname,
! 					 bool resjunk,
! 					 bool *retset)
{
Oid			type_id;
int32		type_mod;
***************
*** 61,66 ****
--- 64,72 ----
if (IsA(expr, RangeVar))
elog(ERROR, "You can't use relation names alone in the target list, try relation.*.");
+ 	if (retset && IsA(expr, FuncExpr))
+ 		*retset = ((FuncExpr *) expr)->funcretset;
+ 
type_id = exprType(expr);
type_mod = exprTypmod(expr);
***************
*** 93,102 ****
--- 99,110 ----
List *
transformTargetList(ParseState *pstate, List *targetlist)
{
+ 	bool		retset = false;
List	   *p_target = NIL;

while (targetlist != NIL)
{
+ bool entry_retset = false;
ResTarget *res = (ResTarget *) lfirst(targetlist);

if (IsA(res->val, ColumnRef))
***************
*** 173,179 ****
res->val,
NULL,
res->name,
! 														false));
}
}
else if (IsA(res->val, InsertDefault))
--- 181,188 ----
res->val,
NULL,
res->name,
! 														false,
! 														&entry_retset));
}
}
else if (IsA(res->val, InsertDefault))
***************
*** 194,201 ****
res->val,
NULL,
res->name,
! 													false));
}
targetlist = lnext(targetlist);
}
--- 203,217 ----
res->val,
NULL,
res->name,
! 													false,
! 													&entry_retset));
}
+ 
+ 		if (retset && entry_retset)
+ 			elog(ERROR, "Only one target list entry may return a set result");
+ 
+ 		if (entry_retset)
+ 			retset = true;
targetlist = lnext(targetlist);
}
Index: src/include/parser/parse_target.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/parser/parse_target.h,v
retrieving revision 1.27
diff -c -r1.27 parse_target.h
*** src/include/parser/parse_target.h	18 Sep 2002 21:35:24 -0000	1.27
--- src/include/parser/parse_target.h	12 Jan 2003 19:08:56 -0000
***************
*** 20,26 ****
extern List *transformTargetList(ParseState *pstate, List *targetlist);
extern TargetEntry *transformTargetEntry(ParseState *pstate,
Node *node, Node *expr,
! 					 char *colname, bool resjunk);
extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle,
char *colname, int attrno,
List *indirection);
--- 20,26 ----
extern List *transformTargetList(ParseState *pstate, List *targetlist);
extern TargetEntry *transformTargetEntry(ParseState *pstate,
Node *node, Node *expr,
! 					 char *colname, bool resjunk, bool *retset);
extern void updateTargetListEntry(ParseState *pstate, TargetEntry *tle,
char *colname, int attrno,
List *indirection);

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#23)
Re: targetlist functions part 1 (was [HACKERS] targetlist

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Patch applied. Thanks.

This was *not* agreed to, please revert it.

regards, tom lane

#25Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#24)
Re: targetlist functions part 1 (was [HACKERS] targetlist

OK, patch reverted.

---------------------------------------------------------------------------

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Patch applied. Thanks.

This was *not* agreed to, please revert it.

regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073