[WIP] Caching constant stable expressions per execution

Started by Marti Raudseppover 14 years ago9 messages
#1Marti Raudsepp
marti@juffo.org
1 attachment(s)

Hi,

This is a proof of concept patch for recognizing stable function calls
with constant arguments and only calling them once per execution. I'm
posting it to the list to gather feedback whether this is a dead end
or not.

Last time when this was brought up on the list, Tom Lane commented
that doing the checks for constant folding probably outweigh the
gains. http://archives.postgresql.org/pgsql-hackers/2010-11/msg01641.php

Yesterday I was looking at evaluate_function(), where currently
function constant folding happens, and realized that most of the
knowledge needed to implement this for FuncExpr and OpExpr (and
combinations thereof) is already there. This would probably cover 90%
of the use cases with very little overhead. Other expression types
can't be supported very well with this approach, notably case
expressions and AND/OR/NOT.

However, the code isn't very pretty. It works by changing
evaluate_function() to recognize stable/immutable functions with
constant stable/immutable arguments that aren't eligible for constant
folding. It sets a new boolean attribute FuncExpr.stableconst which is
checked during the function's first execution. If true, the result is
stored in ExprState and evalfunc is replaced with a new one that
returns the cached result.

Most common use cases cover timestamptz expressions (because they are
never immutable) such as:
ts>=to_timestamp('2010-01-01', 'YYYY-MM-DD')
ts>=(now() - interval '10 days')

Typically an index on the column would marginalize performance lost by
repeatedly evaluating the function.

----

Test data (1,051,777 rows, 38MB):
create table ts as select generate_series(timestamptz '2001-01-01',
'2011-01-01', '5min') ts;

Test query (using pgbench -T 30, taking the best of 3 runs):
select * from ts where ts>=to_timestamp('2001-01-01', 'YYYY-MM-DD')
and ts<=to_timestamp('2001-01-01', 'YYYY-MM-DD');

Unpatched tps = 0.927458
Patched tps = 6.729378

There's even a measurable improvement from caching now() calls:
select * from ts where ts>now();

Unpatched tps = 8.106439
Patched tps = 9.401684

Regards,
Marti

Attachments:

const-stable-expressions.patchtext/x-patch; charset=US-ASCII; name=const-stable-expressions.patchDownload
diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 80f08d8..562e43d 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -40,6 +40,7 @@
 #include "access/tupconvert.h"
 #include "catalog/pg_type.h"
 #include "commands/typecmds.h"
+#include "executor/executor.h"
 #include "executor/execdebug.h"
 #include "executor/nodeSubplan.h"
 #include "funcapi.h"
@@ -95,6 +96,14 @@ static void ExecPrepareTuplestoreResult(FuncExprState *fcache,
 							Tuplestorestate *resultStore,
 							TupleDesc resultDesc);
 static void tupledesc_match(TupleDesc dst_tupdesc, TupleDesc src_tupdesc);
+static void cache_function_result(FuncExprState *fcache,
+					  ExprContext *econtext,
+					  bool *isNull,
+					  ExprDoneCond *isDone);
+static Datum ExecCachedFunctionResult(FuncExprState *fcache,
+					 	 ExprContext *econtext,
+					 	 bool *isNull,
+					 	 ExprDoneCond *isDone);
 static Datum ExecMakeFunctionResult(FuncExprState *fcache,
 					   ExprContext *econtext,
 					   bool *isNull,
@@ -1522,6 +1531,44 @@ tupledesc_match(TupleDesc dst_tupdesc, TupleDesc src_tupdesc)
 }
 
 /*
+ *		cache_function_result
+ */
+static void
+cache_function_result(FuncExprState *fcache,
+					  ExprContext *econtext,
+					  bool *isNull,
+					  ExprDoneCond *isDone)
+{
+	MemoryContext oldcontext;
+
+	/* This cache has to persist for the whole query */
+	oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+
+	fcache->cachedResult = ExecMakeFunctionResult(fcache, econtext, isNull, isDone);
+	fcache->cachedIsNull = *isNull;
+
+	/* Set-returning functions can't be cached */
+	Assert(!isDone || *isDone == ExprSingleResult);
+
+	MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ *		ExecCachedFunctionResult
+ */
+static Datum
+ExecCachedFunctionResult(FuncExprState *fcache,
+					 	 ExprContext *econtext,
+					 	 bool *isNull,
+					 	 ExprDoneCond *isDone)
+{
+	if(isDone)
+		*isDone = ExprSingleResult;
+	*isNull = fcache->cachedIsNull;
+	return fcache->cachedResult;
+}
+
+/*
  *		ExecMakeFunctionResult
  *
  * Evaluate the arguments to a function and then the function itself.
@@ -2257,10 +2304,18 @@ ExecEvalFunc(FuncExprState *fcache,
 	init_fcache(func->funcid, func->inputcollid, fcache,
 				econtext->ecxt_per_query_memory, true);
 
-	/* Go directly to ExecMakeFunctionResult on subsequent uses */
-	fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult;
+	if(func->stableconst)
+	{
+		cache_function_result(fcache, econtext, isNull, isDone);
+
+		/* XXX Since we only need function cache once, should we clean it up now? */
+		fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecCachedFunctionResult;
+	}
+	else
+		/* Go directly to ExecMakeFunctionResult on subsequent uses */
+		fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult;
 
-	return ExecMakeFunctionResult(fcache, econtext, isNull, isDone);
+	return ExecEvalExpr((ExprState *) fcache, econtext, isNull, isDone);
 }
 
 /* ----------------------------------------------------------------
@@ -2280,10 +2335,18 @@ ExecEvalOper(FuncExprState *fcache,
 	init_fcache(op->opfuncid, op->inputcollid, fcache,
 				econtext->ecxt_per_query_memory, true);
 
-	/* Go directly to ExecMakeFunctionResult on subsequent uses */
-	fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult;
+	if(op->stableconst)
+	{
+		cache_function_result(fcache, econtext, isNull, isDone);
+
+		/* XXX Since we only need function cache once, should we clean it up now? */
+		fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecCachedFunctionResult;
+	}
+	else
+		/* Go directly to ExecMakeFunctionResult on subsequent uses */
+		fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult;
 
-	return ExecMakeFunctionResult(fcache, econtext, isNull, isDone);
+	return ExecEvalExpr((ExprState *) fcache, econtext, isNull, isDone);
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 661a516..cb6a4e8 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1183,6 +1183,7 @@ _copyFuncExpr(FuncExpr *from)
 	COPY_SCALAR_FIELD(inputcollid);
 	COPY_NODE_FIELD(args);
 	COPY_LOCATION_FIELD(location);
+	COPY_LOCATION_FIELD(stableconst);
 
 	return newnode;
 }
@@ -1219,6 +1220,7 @@ _copyOpExpr(OpExpr *from)
 	COPY_SCALAR_FIELD(inputcollid);
 	COPY_NODE_FIELD(args);
 	COPY_LOCATION_FIELD(location);
+	COPY_LOCATION_FIELD(stableconst);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index 4d2eccf..4c4c909 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -465,6 +465,7 @@ makeFuncExpr(Oid funcid, Oid rettype, List *args,
 	funcexpr->inputcollid = inputcollid;
 	funcexpr->args = args;
 	funcexpr->location = -1;
+	funcexpr->stableconst = false;
 
 	return funcexpr;
 }
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index baa90fa..521bad2 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -120,9 +120,9 @@ static List *add_function_defaults(List *args, Oid result_type,
 static List *fetch_function_defaults(HeapTuple func_tuple);
 static void recheck_cast_function_args(List *args, Oid result_type,
 						   HeapTuple func_tuple);
-static Expr *evaluate_function(Oid funcid, Oid result_type, int32 result_typmod,
-				  Oid result_collid, Oid input_collid, List *args,
-				  HeapTuple func_tuple,
+static Expr *evaluate_function(Expr *oldexpr, Oid funcid, Oid result_type,
+				  int32 result_typmod, Oid result_collid, Oid input_collid,
+				  List *args, HeapTuple func_tuple,
 				  eval_const_expressions_context *context);
 static Expr *inline_function(Oid funcid, Oid result_type, Oid result_collid,
 				Oid input_collid, List *args,
@@ -3413,7 +3413,7 @@ simplify_function(Expr *oldexpr, Oid funcid,
 	Oid			transform;
 
 	/*
-	 * We have three strategies for simplification: execute the function to
+	 * XXX We have three strategies for simplification: execute the function to
 	 * deliver a constant result, use a transform function to generate a
 	 * substitute node tree, or expand in-line the body of the function
 	 * definition (which only works for simple SQL-language functions, but
@@ -3434,7 +3434,7 @@ simplify_function(Expr *oldexpr, Oid funcid,
 	else if (((Form_pg_proc) GETSTRUCT(func_tuple))->pronargs > list_length(*args))
 		*args = add_function_defaults(*args, result_type, func_tuple, context);
 
-	newexpr = evaluate_function(funcid, result_type, result_typmod,
+	newexpr = evaluate_function(oldexpr, funcid, result_type, result_typmod,
 								result_collid, input_collid, *args,
 								func_tuple, context);
 
@@ -3712,7 +3712,7 @@ recheck_cast_function_args(List *args, Oid result_type, HeapTuple func_tuple)
 /*
  * evaluate_function: try to pre-evaluate a function call
  *
- * We can do this if the function is strict and has any constant-null inputs
+ * XXX We can do this if the function is strict and has any constant-null inputs
  * (just return a null constant), or if the function is immutable and has all
  * constant inputs (call it and return the result as a Const node).  In
  * estimation mode we are willing to pre-evaluate stable functions too.
@@ -3720,15 +3720,17 @@ recheck_cast_function_args(List *args, Oid result_type, HeapTuple func_tuple)
  * Returns a simplified expression if successful, or NULL if cannot
  * simplify the function.
  */
+/* XXX rename this function */
 static Expr *
-evaluate_function(Oid funcid, Oid result_type, int32 result_typmod,
-				  Oid result_collid, Oid input_collid, List *args,
-				  HeapTuple func_tuple,
+evaluate_function(Expr *oldexpr, Oid funcid, Oid result_type,
+				  int32 result_typmod, Oid result_collid, Oid input_collid,
+				  List *args, HeapTuple func_tuple,
 				  eval_const_expressions_context *context)
 {
 	Form_pg_proc funcform = (Form_pg_proc) GETSTRUCT(func_tuple);
 	bool		has_nonconst_input = false;
 	bool		has_null_input = false;
+	bool		has_non_stableconst_input = false;
 	ListCell   *arg;
 	FuncExpr   *newexpr;
 
@@ -3757,10 +3759,21 @@ evaluate_function(Oid funcid, Oid result_type, int32 result_typmod,
 	 */
 	foreach(arg, args)
 	{
-		if (IsA(lfirst(arg), Const))
-			has_null_input |= ((Const *) lfirst(arg))->constisnull;
+		Expr	   *expr = (Expr *) lfirst(arg);
+
+		if (IsA(expr, Const))
+			has_null_input |= ((Const *) expr)->constisnull;
 		else
+		{
 			has_nonconst_input = true;
+
+			if (IsA(expr, FuncExpr))
+				has_non_stableconst_input |= !((FuncExpr *) expr)->stableconst;
+			else if (IsA(expr, OpExpr))
+				has_non_stableconst_input |= !((OpExpr *) expr)->stableconst;
+			else
+				has_non_stableconst_input = true;
+		}
 	}
 
 	/*
@@ -3778,37 +3791,69 @@ evaluate_function(Oid funcid, Oid result_type, int32 result_typmod,
 	 * non-strict function, constant NULL inputs are treated the same as
 	 * constant non-NULL inputs.)
 	 */
-	if (has_nonconst_input)
+	if (has_non_stableconst_input)
 		return NULL;
 
 	/*
-	 * Ordinarily we are only allowed to simplify immutable functions. But for
+	 * XXX Ordinarily we are only allowed to simplify immutable functions. But for
 	 * purposes of estimation, we consider it okay to simplify functions that
 	 * are merely stable; the risk that the result might change from planning
 	 * time to execution time is worth taking in preference to not being able
 	 * to estimate the value at all.
 	 */
-	if (funcform->provolatile == PROVOLATILE_IMMUTABLE)
-		 /* okay */ ;
-	else if (context->estimate && funcform->provolatile == PROVOLATILE_STABLE)
-		 /* okay */ ;
-	else
+	if (funcform->provolatile == PROVOLATILE_VOLATILE)
 		return NULL;
 
+	/* Arguments are not plan-time constants, but stable execution-time
+	 * constants -- or the function itself is stable
+	 */
+	if((!context->estimate) && (has_nonconst_input ||
+								funcform->provolatile == PROVOLATILE_STABLE))
+	{
+		Expr	   *expr;
+
+		/*
+		 * We want to keep the original expression type, because EXPLAIN will
+		 * show these expressions
+		 */
+		if(oldexpr)
+			expr = copyObject(oldexpr);
+		else
+			/*
+			 * XXX This can make EXPLAIN ANALYZE VERBOSE ugly!
+			 * explain analyze verbose select current_date;
+			 */
+			expr = makeFuncExpr(funcid, result_type, args, result_collid,
+								input_collid, COERCE_DONTCARE);
+
+		/*
+		 * XXX If we don't mutate 'args' here, we prevent redundant caching
+		 * of argument expressions. Does this have any downsides?
+		 */
+		if (IsA(expr, FuncExpr))
+		{
+			((FuncExpr *) expr)->stableconst = true;
+			((FuncExpr *) expr)->args = args;
+		}
+		else if(IsA(expr, OpExpr))
+		{
+			((OpExpr *) expr)->stableconst = true;
+			((OpExpr *) expr)->args = args;
+		}
+		else
+			Assert(false); /* XXX */
+
+		return (Expr *) expr;
+	}
+
 	/*
 	 * OK, looks like we can simplify this operator/function.
 	 *
 	 * Build a new FuncExpr node containing the already-simplified arguments.
 	 */
-	newexpr = makeNode(FuncExpr);
-	newexpr->funcid = funcid;
-	newexpr->funcresulttype = result_type;
-	newexpr->funcretset = false;
-	newexpr->funcformat = COERCE_DONTCARE;		/* doesn't matter */
-	newexpr->funccollid = result_collid;		/* doesn't matter */
-	newexpr->inputcollid = input_collid;
-	newexpr->args = args;
-	newexpr->location = -1;
+
+	newexpr = makeFuncExpr(funcid, result_type, args, result_collid,
+						   input_collid, COERCE_DONTCARE);
 
 	return evaluate_expr((Expr *) newexpr, result_type, result_typmod,
 						 result_collid);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index b3eed7d..8238464 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -624,6 +624,13 @@ typedef struct FuncExprState
 	FmgrInfo	func;
 
 	/*
+	 * These are used for holding the cached result of stable function calls
+	 * with constant arguments.
+	 */
+	Datum		cachedResult;
+	bool		cachedIsNull;
+
+	/*
 	 * For a set-returning function (SRF) that returns a tuplestore, we keep
 	 * the tuplestore here and dole out the result rows one at a time. The
 	 * slot holds the row currently being returned.
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index f1e20ef..af0c6a7 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -335,6 +335,7 @@ typedef struct FuncExpr
 	Oid			inputcollid;	/* OID of collation that function should use */
 	List	   *args;			/* arguments to the function */
 	int			location;		/* token location, or -1 if unknown */
+	bool		stableconst;	/* cache the result for each ExprState? */
 } FuncExpr;
 
 /*
@@ -380,6 +381,7 @@ typedef struct OpExpr
 	Oid			inputcollid;	/* OID of collation that operator should use */
 	List	   *args;			/* arguments to the operator (1 or 2) */
 	int			location;		/* token location, or -1 if unknown */
+	bool		stableconst;	/* cache the result for each ExprState? */
 } OpExpr;
 
 /*
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marti Raudsepp (#1)
Re: [WIP] Caching constant stable expressions per execution

Marti Raudsepp <marti@juffo.org> writes:

This is a proof of concept patch for recognizing stable function calls
with constant arguments and only calling them once per execution. I'm
posting it to the list to gather feedback whether this is a dead end
or not.

Hmm. This is an interesting hack, but I'm not sure it's more than a
hack.

Usually what people have meant when they ask for "function caching"
is that multiple executions of a given stable function with the same
argument values be folded to just one execution. In the general case
that would require some sort of hash table in which we memoize sets of
argument values and the corresponding result. The reason that I've
generally discouraged this line of thought is that I see no way that
such a hash table could be managed or searched with sufficient
efficiency to make it a win.

What you've done here is to solve the problem for the special case of
a single (textual) instance of the function called with
plan-time-constant arguments. While that's probably a common case,
I'm not sure it's common enough to justify extra planner and executor
complexity.

The patch as given has a bunch of implementation issues, but I think
it's close enough for crude performance testing, and your numbers do
show a potential performance benefit. The question that I think is
unresolved is whether the set of cases covered is wide enough to be
useful in practice. I have no data on that ...

regards, tom lane

#3Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#2)
Re: [WIP] Caching constant stable expressions per execution

The patch as given has a bunch of implementation issues, but I think
it's close enough for crude performance testing, and your numbers do
show a potential performance benefit. The question that I think is
unresolved is whether the set of cases covered is wide enough to be
useful in practice. I have no data on that ...

Well, I think its use for timestamp/interval/casting functions alone
covers a pretty substantial set of common user actions. Frankly, the
case of not re-evaluating now() alone would be a significant real-life
improvement.

If I understand the limitations correctly, though, what this would do is
cause functions to perform substantially differently if called with
expressions as arguments instead of text constants, no? Seems like that
would lead to some user confusion. Although, with stuff like now(), we
already have that.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#4Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#3)
Re: [WIP] Caching constant stable expressions per execution

If I understand the limitations correctly, though, what this would do is
cause functions to perform substantially differently if called with
expressions as arguments instead of text constants, no? Seems like that
would lead to some user confusion. Although, with stuff like now(), we
already have that.

After some discussion on IRC, I realized I misunderstood the limitations
of the patch. I'd say that this would be a significant improvement for
a large number of real-world cases. The more in-depth solution ... that
is, one which would accept column parameters ... would of course make
this patch obsolete, but AFAIK nobody is working on that yet.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#5Marti Raudsepp
marti@juffo.org
In reply to: Tom Lane (#2)
Re: [WIP] Caching constant stable expressions per execution

On Sun, Sep 11, 2011 at 01:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Usually what people have meant when they ask for "function caching"
is that multiple executions of a given stable function with the same
argument values be folded to just one execution. In the general case
that would require some sort of hash table

Right, this is more like execution-time constant folding. But I
decided not to call it that since I'm not "folding" the expression.

I know some people are writing queries like ts>=(SELECT now()-interval
'10 days') just to cache the single value from the expression.

What you've done here is to solve the problem for the special case of
a single (textual) instance of the function called with
plan-time-constant arguments.

With a small tweak it can also support functions whose arguments are
(prepared query) placeholder variables.

The patch as given has a bunch of implementation issues

This is my first patch that touches the more complicated internals of
Postgres. I'm sure I have a lot to learn. :)

Regards,
Marti

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marti Raudsepp (#5)
Re: [WIP] Caching constant stable expressions per execution

Marti Raudsepp <marti@juffo.org> writes:

On Sun, Sep 11, 2011 at 01:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The patch as given has a bunch of implementation issues

This is my first patch that touches the more complicated internals of
Postgres. I'm sure I have a lot to learn. :)

Well, people seem to think that this is worth pursuing, so here's a
couple of thoughts about what needs to be done to get to something
committable.

First off, there is one way in which you are cheating that does have
real performance implications, so you ought to fix that before trusting
your performance results too much. You're ensuring that the cached
datum lives long enough by doing this:

+	/* This cache has to persist for the whole query */
+	oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+
+	fcache->cachedResult = ExecMakeFunctionResult(fcache, econtext, isNull, isDone);
+	fcache->cachedIsNull = *isNull;
+
+	/* Set-returning functions can't be cached */
+	Assert(!isDone || *isDone == ExprSingleResult);
+
+	MemoryContextSwitchTo(oldcontext);

IMO this is no good because it means that every intermediate result
computed within the cacheable expression will be leaked into
per_query_memory. Yeah, you're only doing it once, but once could still
be too much. Consider for instance the case where the function
internally generates a lot of cruft over multiple operations, and it
thinks it's cleaning up by resetting ecxt_per_tuple_memory every so
often. If CurrentMemoryContext isn't pointing to ecxt_per_tuple_memory,
this loses. I think what you need to do is run the function in the
normal environment and then use datumCopy() to save the value into
per_query_memory. The reason this is performance-relevant is that the
datum copy step represents real added cycles. I think it probably
doesn't invalidate the idea, but it'd be good to fix it and recheck
your performance numbers before putting in more work. Assuming
that passes ...

The concept you're trying to encapsulate here is not really specific to
FuncExpr or OpExpr nodes. Rather, the idea you want to implement is
"let's cache the result of any expression tree that contains no Vars,
internal Params, or volatile functions". An example of this is that
the result of
CASE WHEN f1() > 42 THEN f2() ELSE NULL END
ought to be perfectly cacheable if f1 and f2 (and the > operator) are
stable or immutable. Now it doesn't seem like a good plan to me to
plaster "stableconst" flags on every expression node type, nor to
introduce logic for handling that into everything in execQual.c.
So what I suggest is that you should invent a new expression node
type CacheExpr (that's just the first name that came to mind, maybe
somebody has a better idea) that takes an expression node as input
and caches the result value. This makes things simple and clean in
the executor. The planner would have to figure out where to inject
CacheExpr nodes into expression trees --- ideally only the minimum
number of nodes would be added. I think you could persuade
eval_const_expressions to do that, but it would probably involve
bubbling additional information back up from each level of recursion.
I haven't thought through the details.

The other thing that is going to be an issue is that I'm fairly sure
this breaks plpgsql's handling of simple expressions. (If there's not
a regression test that the patch is failing, there ought to be ...)
The reason is that we build an execution tree for a given simple
expression only once per transaction and then re-use it. So for
example consider a plpgsql function containing

x := stablefunction();

I think your patch means that stablefunction() would be called only once
per transaction, and the value would be cached and returned in all later
executions. This would be wrong if the plpgsql function is called in
successive statements that have different snapshots, or contains a loop
around the assignment plus operations that change whatever state
stablefunction() looks at. It would be legitimate for stablefunction()
to have different values in the successive executions.

The quick and dirty solution to this would be for plpgsql to pass some
kind of planner flag that disables insertion of CacheExpr nodes, or
alternatively have it not believe that CacheExpr nodes are safe to have
in simple expressions. But that gives up all the advantage of the
concept for this use-case, which seems a bit disappointing. Maybe we
can think of a better answer.

regards, tom lane

#7Marti Raudsepp
marti@juffo.org
In reply to: Tom Lane (#6)
Re: [WIP] Caching constant stable expressions per execution

On Mon, Sep 12, 2011 at 00:22, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Well, people seem to think that this is worth pursuing, so here's a
couple of thoughts about what needs to be done to get to something
committable.

Thanks, that's exactly the kind of feedback I need.

IMO this is no good because it means that every intermediate result
computed within the cacheable expression will be leaked into
per_query_memory.

Agreed, that's not ideal.

type CacheExpr (that's just the first name that came to mind, maybe
somebody has a better idea)

StableConstExpr? But we can leave the bikeshedding for later :)

The planner would have to figure out where to inject
CacheExpr nodes into expression trees --- ideally only the minimum
number of nodes would be added.

Yeah, that occured to me, but seemed complicated at first, so I didn't
want to do it before having a confirmation from the list. However,
after looking at the expression tree walking code for a bit, it
doesn't seem that scary anymore.

The other thing that is going to be an issue is that I'm fairly sure
this breaks plpgsql's handling of simple expressions.

Oh, I would have never thought of that.

Regards,
Marti

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marti Raudsepp (#7)
Re: [WIP] Caching constant stable expressions per execution

Marti Raudsepp <marti@juffo.org> writes:

On Mon, Sep 12, 2011 at 00:22, Tom Lane <tgl@sss.pgh.pa.us> wrote:

type CacheExpr (that's just the first name that came to mind, maybe
somebody has a better idea)

StableConstExpr? But we can leave the bikeshedding for later :)

Well, FWIW, I found that terminology entirely detestable.

regards, tom lane

#9Marti Raudsepp
marti@juffo.org
In reply to: Tom Lane (#6)
Re: [WIP] Caching constant stable expressions per execution

On Mon, Sep 12, 2011 at 00:22, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The other thing that is going to be an issue is that I'm fairly sure
this breaks plpgsql's handling of simple expressions.

The quick and dirty solution to this would be for plpgsql to pass some
kind of planner flag that disables insertion of CacheExpr nodes, or
alternatively have it not believe that CacheExpr nodes are safe to have
in simple expressions.  But that gives up all the advantage of the
concept for this use-case, which seems a bit disappointing.  Maybe we
can think of a better answer.

I got around to this and I think there's a better way.

PL/pgSQL "simple expressions" are queries that return a single row, a
single column, don't reference any tables, have no WHERE clauses etc.

All expressions in such queries would be evaluated only once anyway,
so don't benefit from cache -- and indeed could potentially be hurt by
the added overheads.

It seems that we could pre-empt this in the planner by recognizing
potentially-simple queries right from the start and forbidding
CacheExpr. Maybe add a new boolean to PlannerInfo?

However, I got stuck because set-returning functions aren't
immediately obvious -- I'd have to walk the whole expression tree to
find out. A query like "SELECT now(), generate_series(1,10)" isn't a
simple expression and could still benefit from cache for the now()
call.

Or we could just let it slip and not cache anything if there's no
FROM/WHERE clause.

Thoughts?

Regards,
Marti