Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

Started by Daniel Migowski over 6 years ago (9 messages)
#1 Daniel Migowski
dmigowski@ikoffice.de
1 attachment(s)

Hello,

attached you find a patch that adds a new GUC:

prepared_statement_limit:

        Specifies the maximum amount of memory used in each session to
        cache parsed-and-rewritten queries and execution plans. This
        affects the maximum memory a backend will reserve when many
        prepared statements are used. The default value of 0 disables
        this setting, but it is recommended to set this value a bit
        lower than the maximum memory a backend should be allowed to
        reserve permanently.

If the GUC is configured, the function EnforcePreparedStatementLimit is
now called after each save of a CachedPlanSource and after creating a
CachedPlan from it. It checks the memory usage of the existing saved
CachedPlanSources and invalidates the query_list and the gplan (if
available) until the memory limit is met again.

CachedPlanSources are removed and re-added at the tail of the
saved_plan_list every time GetCachedPlan is called on them, so the list
can be used as an LRU list.

I also reworked ResetPlanCache, PlanCacheRelCallback and
PlanCacheObjectCallback a bit so that when a CachedPlanSource is
invalidated, the query_list is not only marked as invalid but also
fully released to free memory right away.

Regards,
Daniel Migowski

PS @Konstantin: This patch also includes the CachedPlanMemoryUsage
function you like; maybe you would like to review the patch for me?

Attachments:

limit_prep_statments_mem.patch (text/plain; charset=UTF-8)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cdc30fa..2da4ba8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1702,6 +1702,24 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-prepared-statement-limit" xreflabel="prepared_statement_limit">
+      <term><varname>prepared_statement_limit</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>prepared_statement_limit</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the maximum amount of memory used in each session to cache
+        parsed-and-rewritten queries and execution plans. This affects the maximum memory
+        a backend will reserve when many prepared statements are used.
+        The default value of 0 disables this setting, but it is recommended to set this
+        value a bit lower than the maximum memory a backend should be allowed to reserve
+        permanently.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-max-stack-depth" xreflabel="max_stack_depth">
       <term><varname>max_stack_depth</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index abc3062..dbeb5a2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -88,6 +88,9 @@
  * those that are in long-lived storage and are examined for sinval events).
  * We use a dlist instead of separate List cells so that we can guarantee
  * to save a CachedPlanSource without error.
+ *
+ * This list is used as an LRU list for prepared statements when a
+ * prepared_statement_limit is configured.
  */
 static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
 
@@ -97,6 +100,7 @@ static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
 static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_list);
 
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
+static void ReleaseQueryList(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
 								   QueryEnvironment *queryEnv);
 static bool CheckCachedPlan(CachedPlanSource *plansource);
@@ -114,6 +118,7 @@ static TupleDesc PlanCacheComputeResultDesc(List *stmt_list);
 static void PlanCacheRelCallback(Datum arg, Oid relid);
 static void PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue);
 static void PlanCacheSysCallback(Datum arg, int cacheid, uint32 hashvalue);
+static void EnforcePreparedStatementLimit(CachedPlanSource* ignoreThis);
 
 /* GUC parameter */
 int			plan_cache_mode;
@@ -482,6 +487,11 @@ SaveCachedPlan(CachedPlanSource *plansource)
 	dlist_push_tail(&saved_plan_list, &plansource->node);
 
 	plansource->is_saved = true;
+
+	if( prep_statement_limit > 0 ) {
+		/* Clean up statements when mem limit is hit */
+		EnforcePreparedStatementLimit(plansource);
+	}
 }
 
 /*
@@ -536,6 +546,31 @@ ReleaseGenericPlan(CachedPlanSource *plansource)
 }
 
 /*
+ * ReleaseQueryList: release a CachedPlanSource's query list, relationOids,
+ * invalItems and search_path.
+ */
+static void
+ReleaseQueryList(CachedPlanSource *plansource)
+{
+	if (plansource->query_list)
+	{
+		plansource->is_valid = false;
+		plansource->query_list = NIL;
+		plansource->relationOids = NIL;
+		plansource->invalItems = NIL;
+		plansource->search_path = NULL;
+
+		if (plansource->query_context)
+		{
+			MemoryContext qcxt = plansource->query_context;
+
+			plansource->query_context = NULL;
+			MemoryContextDelete(qcxt);
+		}
+	}
+}
+
+/*
  * RevalidateCachedQuery: ensure validity of analyzed-and-rewritten query tree.
  *
  * What we do here is re-acquire locks and redo parse analysis if necessary.
@@ -626,27 +661,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
 	/*
 	 * Discard the no-longer-useful query tree.  (Note: we don't want to do
 	 * this any earlier, else we'd not have been able to release locks
-	 * correctly in the race condition case.)
+	 * correctly in the race condition case.) Also discard query_context
 	 */
-	plansource->is_valid = false;
-	plansource->query_list = NIL;
-	plansource->relationOids = NIL;
-	plansource->invalItems = NIL;
-	plansource->search_path = NULL;
-
-	/*
-	 * Free the query_context.  We don't really expect MemoryContextDelete to
-	 * fail, but just in case, make sure the CachedPlanSource is left in a
-	 * reasonably sane state.  (The generic plan won't get unlinked yet, but
-	 * that's acceptable.)
-	 */
-	if (plansource->query_context)
-	{
-		MemoryContext qcxt = plansource->query_context;
-
-		plansource->query_context = NULL;
-		MemoryContextDelete(qcxt);
-	}
+	ReleaseQueryList(plansource);
 
 	/* Drop the generic plan reference if any */
 	ReleaseGenericPlan(plansource);
@@ -1141,6 +1158,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 	CachedPlan *plan = NULL;
 	List	   *qlist;
 	bool		customplan;
+	bool        newplan = false;
 
 	/* Assert caller is doing things in a sane order */
 	Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
@@ -1172,6 +1190,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 			/* Link the new generic plan into the plansource */
 			plansource->gplan = plan;
 			plan->refcount++;
+			newplan = true;
 			/* Immediately reparent into appropriate context */
 			if (plansource->is_saved)
 			{
@@ -1241,6 +1260,18 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 		plan->is_saved = true;
 	}
 
+	if( plansource->is_saved && prep_statement_limit > 0 ) {
+		/* move CachedPlanSource to tail of list (use it like a LRU list).
+		* Todo: Make this more efficient / a single op */
+		dlist_delete(&plansource->node);
+		dlist_push_tail(&saved_plan_list, &plansource->node);
+
+		if( newplan ) {
+			/* Clean up statements when mem limit is hit */
+			EnforcePreparedStatementLimit(plansource);
+		}
+	}
+
 	return plan;
 }
 
@@ -1530,6 +1561,181 @@ FreeCachedExpression(CachedExpression *cexpr)
 }
 
 /*
+ * CachedPlanMemoryUsage: Return memory used by the CachedPlanSource
+ *
+ * Returns the malloced memory used by the two MemoryContexts in
+ * CachedPlanSource and (if available) the MemoryContext in the generic plan.
+ * Does not care about the free memory in those MemoryContexts because it is
+ * very unlikely to be reused for anything else and can be considered dead
+ * memory anyway. Also the size of the CachedPlanSource struct is added.
+ *
+ * This function is also used for the pg_prepared_statements view to allow
+ * client applications to monitor memory used by prepared statements and to
+ * select candidates for eviction in memory-constrained environments with
+ * automatic preparation of often-called queries.
+ */
+Size
+CachedPlanMemoryUsage(CachedPlanSource *plan)
+{
+	MemoryContextCounters counters;
+	MemoryContext context;
+	counters.totalspace = 0;
+
+	context = plan->context;
+	context->methods->stats(context,NULL,NULL,&counters);
+
+	if( plan->query_context )
+	{
+		context = plan->query_context;
+		context->methods->stats(context,NULL,NULL,&counters);
+	}
+
+	if( plan->gplan )
+	{
+		context = plan->gplan->context;
+		context->methods->stats(context,NULL,NULL,&counters);
+	}
+
+	return counters.totalspace;
+}
+
+/*
+ * AllCachedPlansMemUsage: Return memory used by the saved CachedPlanSources
+ *
+ * If this exceeds a configurable threshold, other CachedPlans are invalidated
+ * until enough memory has been freed.
+ */
+static Size
+AllCachedPlansMemUsage(void)
+{
+
+	dlist_iter	iter;
+	Size        size = 0;
+
+	dlist_foreach(iter, &saved_plan_list)
+	{
+		CachedPlanSource *plansource = dlist_container(CachedPlanSource,
+													   node, iter.cur);
+		Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
+		size += CachedPlanMemoryUsage(plansource);
+	}
+
+	return size;
+}
+
+/*
+ * CachedPlanFreeableMemory: Return memory freed when query_list and gplan
+ * would be released.
+ */
+static Size
+CachedPlanFreeableMemory(CachedPlanSource *plan)
+{
+	MemoryContextCounters counters;
+	MemoryContext context;
+	counters.totalspace = 0;
+
+	if( plan->query_context )
+	{
+		context = plan->query_context;
+		context->methods->stats(plan->query_context,NULL,NULL,&counters);
+	}
+
+	if( plan->gplan )
+	{
+		context = plan->gplan->context;
+		context->methods->stats(context,NULL,NULL,&counters);
+	}
+
+	return counters.totalspace;
+}
+
+/*
+ * EnforcePreparedStatementLimit: try to enforce prepared_statement_limit.
+ * Does nothing when there is no limit configured.
+ */
+static void
+EnforcePreparedStatementLimit(CachedPlanSource* ignoreThis)
+{
+	Size currSize;
+	Size limit = prep_statement_limit<<10;
+	dlist_iter	iter;
+
+	// If GUC prepared_statement_limit is not set, there is nothing to do here.
+	if( prep_statement_limit <= 0 )
+		return;
+
+	// Check the current memory usage
+	currSize = AllCachedPlansMemUsage();
+
+	// If we use less than that we can exit
+	if( currSize <= limit )
+		return;
+
+	// Iterate over all cached plans and invalidate them until the limit is
+	// reached, but skip the plan (if any) that we want to use NOW.
+	dlist_foreach(iter, &saved_plan_list)
+	{
+		CachedPlanSource *plansource = dlist_container(CachedPlanSource,
+													   node, iter.cur);
+		ListCell   *lc;
+		Size       freeable;
+
+		Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
+
+		/* Skip one we might want to ignore */
+		if( ignoreThis == plansource )
+			continue;
+
+		/* No work if it's already invalidated */
+		if (!plansource->is_valid)
+			continue;
+
+		/*
+		 * We *must not* mark transaction control statements as invalid,
+		 * particularly not ROLLBACK, because they may need to be executed in
+		 * aborted transactions when we can't revalidate them (cf bug #5269).
+		 */
+		if (IsTransactionStmtPlan(plansource))
+			continue;
+
+		/* No work if we can't do anything about it */
+		freeable = CachedPlanFreeableMemory(plansource);
+		if (freeable == 0)
+			continue;
+
+		/*
+		 * In general there is no point in invalidating utility statements
+		 * since they have no plans anyway.  So invalidate it only if it
+		 * contains at least one non-utility statement, or contains a utility
+		 * statement that contains a pre-analyzed query (which could have
+		 * dependencies.)
+		 */
+		foreach(lc, plansource->query_list)
+		{
+			Query	   *query = lfirst_node(Query, lc);
+
+			if (query->commandType != CMD_UTILITY ||
+				UtilityContainsQuery(query->utilityStmt))
+			{
+				/* non-utility statement, so invalidate */
+				ReleaseQueryList(plansource);
+				/* no need to look further */
+				break;
+			}
+		}
+
+		/* Release generic plan if any */
+		ReleaseGenericPlan(plansource);
+
+		/* Stop when enough is released */
+		currSize -= freeable;
+		if( currSize <= limit )
+			break;
+	}
+}
+
+
+/*
  * QueryListGetPrimaryStmt
  *		Get the "primary" stmt within a list, ie, the one marked canSetTag.
  *
@@ -1787,9 +1993,9 @@ PlanCacheRelCallback(Datum arg, Oid relid)
 			list_member_oid(plansource->relationOids, relid))
 		{
 			/* Invalidate the querytree and generic plan */
-			plansource->is_valid = false;
-			if (plansource->gplan)
-				plansource->gplan->is_valid = false;
+			ReleaseQueryList(plansource);
+			/* Release generic plan if any */
+			ReleaseGenericPlan(plansource);
 		}
 
 		/*
@@ -1810,7 +2016,7 @@ PlanCacheRelCallback(Datum arg, Oid relid)
 					list_member_oid(plannedstmt->relationOids, relid))
 				{
 					/* Invalidate the generic plan only */
-					plansource->gplan->is_valid = false;
+					ReleaseGenericPlan(plansource);
 					break;		/* out of stmt_list scan */
 				}
 			}
@@ -1878,9 +2084,9 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
 				item->hashValue == hashvalue)
 			{
 				/* Invalidate the querytree and generic plan */
-				plansource->is_valid = false;
-				if (plansource->gplan)
-					plansource->gplan->is_valid = false;
+				ReleaseQueryList(plansource);
+				/* Release generic plan if any */
+				ReleaseGenericPlan(plansource);
 				break;
 			}
 		}
@@ -1908,11 +2114,11 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
 						item->hashValue == hashvalue)
 					{
 						/* Invalidate the generic plan only */
-						plansource->gplan->is_valid = false;
+						ReleaseGenericPlan(plansource);
 						break;	/* out of invalItems scan */
 					}
 				}
-				if (!plansource->gplan->is_valid)
+				if (!plansource->gplan)
 					break;		/* out of stmt_list scan */
 			}
 		}
@@ -2002,9 +2208,9 @@ ResetPlanCache(void)
 				UtilityContainsQuery(query->utilityStmt))
 			{
 				/* non-utility statement, so invalidate */
-				plansource->is_valid = false;
-				if (plansource->gplan)
-					plansource->gplan->is_valid = false;
+				ReleaseQueryList(plansource);
+				/* Release generic plan if any */
+				ReleaseGenericPlan(plansource);
 				/* no need to look further */
 				break;
 			}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 3bf96de..ce1257f 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -120,6 +120,7 @@ bool		enableFsync = true;
 bool		allowSystemTableMods = false;
 int			work_mem = 1024;
 int			maintenance_work_mem = 16384;
+int         prep_statement_limit = 0;
 int			max_parallel_maintenance_workers = 2;
 
 /*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index eb78522..7c7615a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2351,6 +2351,19 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"prepared_statement_limit", PGC_POSTMASTER, RESOURCES_MEM,
+			gettext_noop("Limits the memory used to cache plans of prepared statements, per backend."),
+			gettext_noop("This much memory can be used by prepared statements to cache "
+						 "parsed-and-rewritten queries and execution plans. If the limit is reached, "
+						 "unused prepared statements lose their plans until they are executed again."),
+			GUC_UNIT_KB
+		},
+		&prep_statement_limit,
+		0, 0, MAX_KILOBYTES,
+		NULL, NULL, NULL
+	},
+
 #ifdef LOCK_DEBUG
 	{
 		{"trace_lock_oidmin", PGC_SUSET, DEVELOPER_OPTIONS,
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 61a24c2..5819e32 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -244,6 +244,7 @@ extern bool enableFsync;
 extern PGDLLIMPORT bool allowSystemTableMods;
 extern PGDLLIMPORT int work_mem;
 extern PGDLLIMPORT int maintenance_work_mem;
+extern PGDLLIMPORT int prep_statement_limit;
 extern PGDLLIMPORT int max_parallel_maintenance_workers;
 
 extern int	VacuumCostPageHit;
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index de2555e..2d3fc34 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -219,6 +219,8 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 QueryEnvironment *queryEnv);
 extern void ReleaseCachedPlan(CachedPlan *plan, bool useResOwner);
 
+extern Size CachedPlanMemoryUsage(CachedPlanSource *plansource);
+
 extern CachedExpression *GetCachedExpression(Node *expr);
 extern void FreeCachedExpression(CachedExpression *cexpr);
 
#2 Ibrar Ahmed
ibrar.ahmad@gmail.com
In reply to: Daniel Migowski (#1)
Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

On Sat, Aug 17, 2019 at 6:58 PM Daniel Migowski <dmigowski@ikoffice.de>
wrote:

Hello,

attached you find a patch that adds a new GUC:

Quick questions before looking at the patch.

prepared_statement_limit:

- Do we have a consensus about the name of the GUC? I don't think it is
the right name for that.

- Is this a WIP patch or the final patch? Because I can see TODO and
non-standard comments in the patch.

Specifies the maximum amount of memory used in each session to cache
parsed-and-rewritten queries and execution plans. This affects the
maximum memory a backend will reserve when many prepared statements
are used. The default value of 0 disables this setting, but it is
recommended to set this value a bit lower than the maximum memory a
backend should be allowed to reserve permanently.

If the GUC is configured, the function EnforcePreparedStatementLimit is
now called after each save of a CachedPlanSource and after creating a
CachedPlan from it. It checks the memory usage of the existing saved
CachedPlanSources and invalidates the query_list and the gplan (if
available) until the memory limit is met again.

CachedPlanSources are removed and re-added at the tail of the
saved_plan_list every time GetCachedPlan is called on them, so the list
can be used as an LRU list.

I also reworked ResetPlanCache, PlanCacheRelCallback and
PlanCacheObjectCallback a bit so that when a CachedPlanSource is
invalidated, the query_list is not only marked as invalid but also
fully released to free memory right away.

Regards,
Daniel Migowski

PS @Konstantin: This patch also includes the CachedPlanMemoryUsage
function you like; maybe you would like to review the patch for me?

--
Ibrar Ahmed

#3 Daniel Migowski
dmigowski@ikoffice.de
In reply to: Ibrar Ahmed (#2)
Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

On 17.08.2019 at 19:10, Ibrar Ahmed wrote:

On Sat, Aug 17, 2019 at 6:58 PM Daniel Migowski <dmigowski@ikoffice.de> wrote:

attached you find a patch that adds a new GUC:

Quick questions before looking at the patch.

prepared_statement_limit:

 - Do we have a consensus about the name of the GUC? I don't think it is
the right name for that.

No, it is a proposal. It could also be named plancache_mem or
cachedplansource_maxmem or anything else. It was intended to keep
prepared statements from using up all my memory, but development has
shown that it could also be used for other CachedPlans, as long as they
are saved plans.

 - Is this a WIP patch or the final patch? Because I can see TODO and
non-standard comments in the patch.

Definitely work in progress! The current implementation seems to work
for me but might be improved; I wanted some input from the mailing list
before I optimize things.

The most important question is whether such a feature would find some
love here. Personally it is essential for me, because a single prepared
statement uses up to 45MB in my application and there were cases where
ORM-generated prepared statements would crash my server after some time.

Then I would like to know if the current implementation would at least
not crash (even if it might be a bit slow) or if I have to take more
care about locking in some places. I believe a backend is a single
thread of execution, but there were notes about invalidation messages
that seem to run asynchronously to the main thread. Could someone care
to explain the threading model of a single backend to me? Does it react
to signals, and what would those signals change? Can I assume that only
ResetPlanCache, PlanCacheRelCallback and PlanCacheObjectCallback will be
called asynchronously?

         Specifies the maximum amount of memory used in each session to
         cache parsed-and-rewritten queries and execution plans. This
         affects the maximum memory a backend will reserve when many
         prepared statements are used. The default value of 0 disables
         this setting, but it is recommended to set this value a bit
         lower than the maximum memory a backend should be allowed to
         reserve permanently.

If the GUC is configured, the function EnforcePreparedStatementLimit is
now called after each save of a CachedPlanSource and after creating a
CachedPlan from it. It checks the memory usage of the existing saved
CachedPlanSources and invalidates the query_list and the gplan (if
available) until the memory limit is met again.

CachedPlanSources are removed and re-added at the tail of the
saved_plan_list every time GetCachedPlan is called on them, so the list
can be used as an LRU list.

Could be a single move-to-tail function I would add.
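
A minimal sketch of such a helper on top of the lib/ilist.h API (just an
illustration, not part of the attached patch; the helper name is made up):

#include "lib/ilist.h"

/* Move a node to the tail of a dlist in a single call (sketch). */
static inline void
dlist_move_to_tail(dlist_head *head, dlist_node *node)
{
	/* unlink from the current position, then re-append at the tail */
	dlist_delete(node);
	dlist_push_tail(head, node);
}

GetCachedPlan would then call dlist_move_to_tail(&saved_plan_list,
&plansource->node) instead of the explicit delete/push pair.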

I also reworked ResetPlanCache, PlanCacheRelCallback and
PlanCacheObjectCallback a bit so that when a CachedPlanSource is
invalidated, the query_list is not only marked as invalid but also
fully released to free memory right away.

Because this seems to work I was able to reuse the new ReleaseQueryList
in my implementation.

Regards,
Daniel Migowski

PS @Konstantin: This patch also includes the CachedPlanMemoryUsage
function you like; maybe you would like to review the patch for me?

--
Ibrar Ahmed

Daniel Migowski

#4 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Daniel Migowski (#3)
Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

Hello,

At Sun, 18 Aug 2019 09:43:09 +0200, Daniel Migowski <dmigowski@ikoffice.de> wrote in <6e25ca12-9484-8994-a1ee-40fdbe6afa8b@ikoffice.de>

Am 17.08.2019 um 19:10 schrieb Ibrar Ahmed:

On Sat, Aug 17, 2019 at 6:58 PM Daniel Migowski <dmigowski@ikoffice.de> wrote:

attached you find a patch that adds a new GUC:

Quick questions before looking at the patch.

prepared_statement_limit:

- Do we have a consensus about the name of the GUC? I don't think it is
the right name for that.

Almost the same thing was proposed [1] as a part of the syscache-pruning
patch [2], but it was removed to concentrate on defining how to do that
for the first instance - the syscache. We have several mechanisms with
the same characteristics - they can bloat and there is no means to keep
them within a certain size. It is better that they are treated the same
way, or at least on the same principle.

[1]: /messages/by-id/20180315.141246.130742928.horiguchi.kyotaro@lab.ntt.co.jp
[2]: https://commitfest.postgresql.org/23/931/

Pruning plan caches by any means is valuable, but we haven't
reached a consensus on how to do that. My old patch does that
based on the number of entries, because precise memory accounting
of memory contexts is too expensive. I didn't look at this patch
closely, but it seems to use MemoryContext->methods->stats to count
memory usage, which would be too expensive for the purpose. We
currently use it only for debug output on critical errors like
OOM.

No, it is a proposal. It could also be named plancache_mem or
cachedplansource_maxmem or anything else. It was intended to keep
prepared statements from using up all my memory, but development has
shown that it could also be used for other CachedPlans, as long as they
are saved plans.

 - Is this a WIP patch or the final patch? Because I can see TODO and
non-standard comments in the patch.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#5 Daniel Migowski
dmigowski@ikoffice.de
In reply to: Kyotaro Horiguchi (#4)
Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

On 19.08.2019 at 05:57, Kyotaro Horiguchi wrote:

At Sun, 18 Aug 2019 09:43:09 +0200, Daniel Migowski <dmigowski@ikoffice.de> wrote in <6e25ca12-9484-8994-a1ee-40fdbe6afa8b@ikoffice.de>

On 17.08.2019 at 19:10, Ibrar Ahmed wrote:

On Sat, Aug 17, 2019 at 6:58 PM Daniel Migowski <dmigowski@ikoffice.de> wrote:

attached you find a patch that adds a new GUC:

Quick questions before looking at the patch.

prepared_statement_limit:

 - Do we have a consensus about the name of the GUC? I don't think it is
the right name for that.

Almost the same thing was proposed [1] as a part of the syscache-pruning
patch [2], but it was removed to concentrate on defining how to do that
for the first instance - the syscache. We have several mechanisms with
the same characteristics - they can bloat and there is no means to keep
them within a certain size. It is better that they are treated the same
way, or at least on the same principle.

Correct. However, I don't know the backend well enough to see how to
unify this. Also time based eviction isn't that important for me, and
I'd rather work with the memory used. I agree that memory leaks are all
bad, and a time based eviction for some small cache entries might
suffice, but CachedPlanSources take up up to 45MB EACH here, and looking
at the memory directly seems desirable in that case.

[1] /messages/by-id/20180315.141246.130742928.horiguchi.kyotaro@lab.ntt.co.jp
[2] https://commitfest.postgresql.org/23/931/

Pruning plan caches by any means is valuable, but we haven't
reached a consensus on how to do that. My old patch does that
based on the number of entries, because precise memory accounting
of memory contexts is too expensive. I didn't look at this patch
closely, but it seems to use MemoryContext->methods->stats to count
memory usage, which would be too expensive for the purpose. We
currently use it only for debug output on critical errors like
OOM.

Yes, this looks like a place to improve. I hadn't looked at the stats
function before, and wasn't aware it iterates over all chunks of the
context. We really don't need all the memory stats, just the total usage,
which can be calculated solely from the size of the blocks. Maybe we
should add such a counter (totalmemusage) to the MemoryContexts themselves
so that decisions based on memory usage can be made quickly. That
shouldn't be too slow, because blocks seem to be acquired much less often
than chunks, and when they are acquired an additional add to a variable in
memory wouldn't hurt. On the other hand, memory contexts should be as fast
as possible given how often they are used in the database, so that might
be bad.

One could also precompute the memory usage of a CachedPlanSource when it
is saved, when the query_list gets calculated (for plans about to be
saved), and when the generic plan is stored in it. In combination with a
fast stats function that only calculates the total usage, this looks
good to me. One could also store the sum in a session-global variable to
make checking for the need of a prune run faster.
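
A rough sketch of that combination (illustration only; the running total
and the total_size field are hypothetical and not part of the attached
patch):

#include "postgres.h"
#include "utils/plancache.h"

/* Hypothetical per-backend running total of saved plan cache memory. */
static Size saved_plans_total_size = 0;

/*
 * Re-measure one CachedPlanSource and fold the difference into the
 * running total.  Would be called from SaveCachedPlan, after the
 * query_list is rebuilt, and after a generic plan is stored.
 */
static void
UpdateCachedPlanSize(CachedPlanSource *plansource)
{
	Size		newsize = CachedPlanMemoryUsage(plansource);

	saved_plans_total_size -= plansource->total_size;	/* hypothetical field */
	plansource->total_size = newsize;
	saved_plans_total_size += newsize;
}

EnforcePreparedStatementLimit could then compare saved_plans_total_size
against the limit directly, without walking every memory context first.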

While time-based eviction also makes sense in a context where the
database is not alone on a server, constraining the memory used directly
attacks the problem one tries to solve with time-based eviction.

#6 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Daniel Migowski (#3)
Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

On 2019-Aug-18, Daniel Migowski wrote:

- Is this a WIP patch or the final patch? Because I can see TODO
and non-standard comments in the patch.

Definitely work in progress! The current implementation seems to work for
me but might be improved; I wanted some input from the mailing list
before I optimize things.

The most important question is whether such a feature would find some love
here. Personally it is essential for me, because a single prepared
statement uses up to 45MB in my application and there were cases where
ORM-generated prepared statements would crash my server after some time.

Then I would like to know if the current implementation would at least not
crash (even if it might be a bit slow) or if I have to take more care
about locking in some places.

On this patch, beyond the fact that it's causing a crash in the
regression tests as evidenced by the CFbot, we seem to be waiting on the
input of the larger community on whether it's a desired feature or not.
We have Kyotaro's vote for it, but it would be good to get more.

I'm switching it to Needs Review, so that others can chime in.

In the meantime, please do fix the code style: brace location and
whitespacing are not per style, as well as usage of //-comments.
Also please research and fix the crash.

Thanks

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#6)
1 attachment(s)
Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On this patch, beyond the fact that it's causing a crash in the
regression tests as evidenced by the CFbot, we seem to be waiting on the
input of the larger community on whether it's a desired feature or not.
We have Kyotaro's vote for it, but it would be good to get more.

I'm pretty dubious about this (just as I am with the somewhat-related
topic of syscache/relcache size limits). It's hard to tell where
performance is going to fall off a cliff if you limit the cache size.

Another point is that, because this only gets rid of the regeneratable
parts of a plancache entry, it isn't really going to move the needle
all that far in terms of total space consumption. As an experiment,
I instrumented GetCachedPlan right after where it fills in the gplan field
to see the relative sizes of the cache entry's contexts. Running that
over the core + PL regression tests, I get

14 GetCachedPlan: 1024 context, 1024 query_context, 1024 gplan
4 GetCachedPlan: 1024 context, 2048 query_context, 1024 gplan
1 GetCachedPlan: 1024 context, 2048 query_context, 2048 gplan
2109 GetCachedPlan: 2048 context, 2048 query_context, 2048 gplan
29 GetCachedPlan: 2048 context, 2048 query_context, 4096 gplan
6 GetCachedPlan: 2048 context, 4096 query_context, 2048 gplan
33 GetCachedPlan: 2048 context, 4096 query_context, 4096 gplan
2 GetCachedPlan: 2048 context, 4096 query_context, 8192 gplan
1 GetCachedPlan: 2048 context, 8192 query_context, 16384 gplan
4 GetCachedPlan: 2048 context, 8192 query_context, 4096 gplan
2 GetCachedPlan: 2048 context, 8192 query_context, 8192 gplan
8 GetCachedPlan: 3480 context, 8192 query_context, 8192 gplan
250 GetCachedPlan: 4096 context, 2048 query_context, 2048 gplan
107 GetCachedPlan: 4096 context, 2048 query_context, 4096 gplan
3 GetCachedPlan: 4096 context, 4096 query_context, 16384 gplan
1 GetCachedPlan: 4096 context, 4096 query_context, 2048 gplan
7 GetCachedPlan: 4096 context, 4096 query_context, 32768 gplan
190 GetCachedPlan: 4096 context, 4096 query_context, 4096 gplan
61 GetCachedPlan: 4096 context, 4096 query_context, 8192 gplan
11 GetCachedPlan: 4096 context, 8192 query_context, 4096 gplan
587 GetCachedPlan: 4096 context, 8192 query_context, 8192 gplan
1 GetCachedPlan: 4096 context, 16384 query_context, 8192 gplan
5 GetCachedPlan: 4096 context, 32768 query_context, 32768 gplan
1 GetCachedPlan: 4096 context, 65536 query_context, 65536 gplan
12 GetCachedPlan: 8192 context, 4096 query_context, 4096 gplan
2 GetCachedPlan: 8192 context, 4096 query_context, 8192 gplan
49 GetCachedPlan: 8192 context, 8192 query_context, 16384 gplan
46 GetCachedPlan: 8192 context, 8192 query_context, 8192 gplan
10 GetCachedPlan: 8192 context, 16384 query_context, 16384 gplan
1 GetCachedPlan: 8192 context, 16384 query_context, 32768 gplan
1 GetCachedPlan: 8192 context, 16384 query_context, 8192 gplan
1 GetCachedPlan: 8192 context, 32768 query_context, 32768 gplan
2 GetCachedPlan: 8192 context, 131072 query_context, 131072 gplan
3 GetCachedPlan: 16384 context, 8192 query_context, 16384 gplan
1 GetCachedPlan: 16384 context, 16384 query_context, 16384 gplan
2 GetCachedPlan: 16384 context, 16384 query_context, 17408 gplan
1 GetCachedPlan: 16384 context, 16384 query_context, 32768 gplan
1 GetCachedPlan: 16384 context, 16384 query_context, 65536 gplan

(The first column is the number of occurrences of the log entry;
I got this list from "grep|sort|uniq -c" on the postmaster log.)
Patch for this is attached, just in the interests of full disclosure.

So yeah, there are cases where you can save a whole lot by deleting
the query_context and/or gplan, but they're pretty few and far between;
more commonly, the nonreclaimable data in "context" accounts for a third
of the usage.

(BTW, it looks to me like the code tries to keep the *total* usage under
the GUC limit, not the freeable usage. Which means it won't be hard at all
to drive it to the worst case where it tries to free everything all the
time, if the nonreclaimable data is already over the limit.)

Admittedly, the regression tests might well not be representative of
common usage, but if you don't want to take them as a benchmark then
we need some other benchmark we can look at.

I also notice that this doesn't seem to be doing anything with
CachedExpressions, which are likely to be a pretty big factor in
the usage of e.g. plpgsql.

Now, I'd be the first to say that my thoughts about this are probably
biased by my time at Salesforce, where their cache-consumption problems
were driven by lots and lots and lots and lots (and lots) of plpgsql code.
Maybe with another usage pattern you'd come to a different conclusion,
but if I were trying to deal with that situation, what I'd look at
doing is reclaiming stuff starting at the plpgsql function cache level,
and then cascading down to the plancache entries referenced by a plpgsql
function body you've chosen to free. One major advantage of doing that
is that plpgsql has a pretty clear idea of when a given function cache
instance has gone to zero refcount, whereas plancache.c simply doesn't
know that.

As far as the crash issue is concerned, I notice that right now the
cfbot is showing green for this patch, but that seems to just be because
the behavior is unstable. I see crashes in "make installcheck-parallel"
about 50% of the time with this patch applied. Since, in fact,
the patch is not supposed to be doing anything at all with
prepared_statement_limit set to zero, that suggests sloppiness in the
refactoring that was done to separate out the resource-freeing code.

On the other hand, if I do ALTER SYSTEM SET prepared_statement_limit = 1
and then run "make installcheck-parallel", I see a different set of
failures. It rather looks like the patch is deleting plans out from
under plpgsql, which connects back to my point about plancache.c not
really knowing whether a plan is in use or not. Certainly,
EnforcePreparedStatementLimit as coded here has no idea about that.

(Speaking of ALTER SYSTEM, why the devil is the variable PGC_POSTMASTER?
That seems entirely silly.)

Aside from Alvaro's style points, I'm fairly concerned about the overhead
the patch will add when active. Running through the entire plancache
and collecting memory stats is likely to be quite an expensive
proposition when there are lots of entries, yet this is willing to do that
over again at the drop of a hat. It won't be hard to get this to expend
O(N^2) time with N entries. I think that for acceptable performance,
it'll be necessary to track total usage incrementally instead of doing
it this way.

regards, tom lane

Attachments:

instrumentation-patch.txt (text/plain; charset=us-ascii)
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index abc3062..210c049 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -431,6 +431,37 @@ CompleteCachedPlan(CachedPlanSource *plansource,
 
 	plansource->is_complete = true;
 	plansource->is_valid = true;
+
+	{
+		MemoryContextCounters counters;
+		MemoryContext context;
+		Size		csize,
+					qcsize = 0,
+					gpsize = 0;
+
+		memset(&counters, 0, sizeof(counters));
+		context = plansource->context;
+		context->methods->stats(context, NULL, NULL, &counters);
+		csize = counters.totalspace;
+
+		if (plansource->query_context)
+		{
+			memset(&counters, 0, sizeof(counters));
+			context = plansource->query_context;
+			context->methods->stats(context, NULL, NULL, &counters);
+			qcsize = counters.totalspace;
+		}
+
+		if (plansource->gplan)
+		{
+			memset(&counters, 0, sizeof(counters));
+			context = plansource->gplan->context;
+			context->methods->stats(context, NULL, NULL, &counters);
+			gpsize = counters.totalspace;
+		}
+		elog(LOG, "CompleteCachedPlan: %zu context, %zu query_context, %zu gplan",
+			 csize, qcsize, gpsize);
+	}
 }
 
 /*
@@ -1188,6 +1219,38 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
 			/* Update generic_cost whenever we make a new generic plan */
 			plansource->generic_cost = cached_plan_cost(plan, false);
 
+
+			{
+				MemoryContextCounters counters;
+				MemoryContext context;
+				Size		csize,
+							qcsize = 0,
+							gpsize = 0;
+
+				memset(&counters, 0, sizeof(counters));
+				context = plansource->context;
+				context->methods->stats(context, NULL, NULL, &counters);
+				csize = counters.totalspace;
+
+				if (plansource->query_context)
+				{
+					memset(&counters, 0, sizeof(counters));
+					context = plansource->query_context;
+					context->methods->stats(context, NULL, NULL, &counters);
+					qcsize = counters.totalspace;
+				}
+
+				if (plansource->gplan)
+				{
+					memset(&counters, 0, sizeof(counters));
+					context = plansource->gplan->context;
+					context->methods->stats(context, NULL, NULL, &counters);
+					gpsize = counters.totalspace;
+				}
+				elog(LOG, "GetCachedPlan: %zu context, %zu query_context, %zu gplan",
+					 csize, qcsize, gpsize);
+			}
+
 			/*
 			 * If, based on the now-known value of generic_cost, we'd not have
 			 * chosen to use a generic plan, then forget it and make a custom
#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

I wrote:

As far as the crash issue is concerned, I notice that right now the
cfbot is showing green for this patch, but that seems to just be because
the behavior is unstable. I see crashes in "make installcheck-parallel"
about 50% of the time with this patch applied. Since, in fact,
the patch is not supposed to be doing anything at all with
prepared_statement_limit set to zero, that suggests sloppiness in the
refactoring that was done to separate out the resource-freeing code.

Oh ... actually, I bet the problem is that the patch thinks it's okay
to immediately free space in PlanCacheRelCallback and friends, rather
than just marking invalid entries as invalid. Nope, you cannot do that.
You can't tell whether that data is being examined in an outer call
level.

In principle I think you could decrement-the-refcount-and-possibly-free
right away on the gplan, since any outside uses of that ought to have
their own refcounts. But the query_context data isn't refcounted.
Also, some of the failures I saw looked like premature deletion of a
plan, not query_context data, so the patch seems to be too aggressive
on that side too.
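
In concrete terms, the callbacks could treat the two parts differently,
roughly like this (sketch only, untested; the helper name is made up):

/*
 * Sketch: what an invalidation callback could safely do per entry.
 */
static void
InvalidateCachedPlanSourceEntry(CachedPlanSource *plansource)
{
	/*
	 * The generic plan is reference-counted; any outside user holds its
	 * own pin, so dropping the cache's reference right away is OK.
	 */
	ReleaseGenericPlan(plansource);

	/*
	 * The analyzed query trees are not refcounted and may still be
	 * examined by an outer call level, so only mark them invalid here and
	 * let RevalidateCachedQuery free and rebuild them later.
	 */
	plansource->is_valid = false;
}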

regards, tom lane

#9 Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#8)
Re: Patch: New GUC prepared_statement_limit to limit memory used by prepared statements

On Tue, Sep 03, 2019 at 06:30:32PM -0400, Tom Lane wrote:

Oh ... actually, I bet the problem is that the patch thinks it's okay
to immediately free space in PlanCacheRelCallback and friends, rather
than just marking invalid entries as invalid. Nope, you cannot do that.
You can't tell whether that data is being examined in an outer call
level.

In principle I think you could decrement-the-refcount-and-possibly-free
right away on the gplan, since any outside uses of that ought to have
their own refcounts. But the query_context data isn't refcounted.
Also, some of the failures I saw looked like premature deletion of a
plan, not query_context data, so the patch seems to be too aggressive
on that side too.

We are a couple of months after this update, and the patch has not
been updated, so I am marking it as returned with feedback.
--
Michael