CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

Started by Jeff Davisover 2 years ago42 messages

pgsql@j-davis.com

over 2 years ago

1 attachment(s)

The attached patch implements a new SEARCH clause for CREATE FUNCTION.
The SEARCH clause controls the search_path used when executing
functions that were created without a SET clause.

Background:

Controlling search_path is critical for the correctness and security of
functions. Right now, the author of a function without a SET clause has
little ability to control the function's behavior, because even basic
operators like "+" involve search_path. This is a big problem for, e.g.
functions used in expression indexes which are called by any user with
write privileges on the table.

Motivation:

I'd like to (eventually) get to safe-by-default behavior. In other
words, the simplest function declaration should be safe for the most
common use cases.

To get there, we need some way to explicitly specify the less common
cases. Right now there's no way for the function author to indicate
that a function intends to use the session's search path. We also need
an easier way to specify that the user wants a safe search_path ("SET
search_path = pg_catalog, pg_temp" is arcane).

And when we know more about the user's actual intent, then it will be
easier to either form a transition plan to push users into the safer
behavior, or at least warn strongly when the user is doing something
dangerous (i.e. using a function that depends on the session's search
path as part of an expression index).

Today, the only information we have about the user's intent is the
presence or absence of a "SET search_path" clause, which is not a
strong signal.

Proposal:

Add SEARCH { DEFAULT | SYSTEM | SESSION } clause to CREATE/ALTER
function.

* SEARCH DEFAULT is the same as no SEARCH clause at all, and ends up
stored in the catalog as prosearch='d'.
* SEARCH SYSTEM means that we switch to the safe search path of
"pg_catalog, pg_temp" when executing the function. Stored as
prosearch='y'.
* SEARCH SESSION means that we don't switch the search_path when
executing the function, and it's inherited from the session. Stored as
prosearch='e'.

Regardless of the SEARCH clause, a "SET search_path" clause will
override it. The SEARCH clause only matters when "SET search_path" is
not there.

Additionally provide a GUC, defaulting to false for compatibility, that
can interpret prosearch='d' as if it were prosearch='y'. It could help
provide a transition path. I know there's a strong reluctance to adding
these kinds of GUCs; I can remove it and I think the patch will still
be worthwhile. Perhaps there are alternatives that could help with
migration at pg_dump time instead?

Benefits:

1. The user can be more explicit about their actual intent. Do they
want safety and consistency? Or the flexibility of using the session's
search_path?

2. We can more accurately serve the user's intent. For instance, the
safe search_path of "pg_catalog, pg_temp" is arcane and seems to be
there just because we don't have a way to specify that pg_temp be
excluded entirely. But perhaps in the future we *do* want to exclude
pg_temp entirely. Knowing that the user just wants "SEARCH SYSTEM"
allows us some freedom to do that.

3. Users can be forward-compatible by specifying the functions that
really do need to use the session's search path as SEARCH SESSION, so
that they will never be broken in the future. That gives us a cleaner
path toward making the default behavior safe.

--
Jeff Davis
PostgreSQL Contributor Team - AWS

Attachments:

0001-CREATE-FUNCTION-.-SEARCH-DEFAULT-SYSTEM-SESSION.patchtext/x-patch; charset=UTF-8; name=0001-CREATE-FUNCTION-.-SEARCH-DEFAULT-SYSTEM-SESSION.patchDownload

From 0218b472bd1ae8152b8bc8d92de4b7cc13dbbbad Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Fri, 11 Aug 2023 14:33:36 -0700
Subject: [PATCH] CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }.

Control the search_path while executing functions created without an
explicit SET clause.

Allows users to be forward compatible by explicitly declaring that a
function depends on the session's search_path, or that it only depends
on the safe search_path of "pg_catalog, pg_temp".

Also adds a GUC that allows users to choose how to treat functions
that have not specified a SEARCH clause or a SET clause. The GUC is
false by default to preserve existing semantics, but eventually it
will default to true for greater safety.
---
 doc/src/sgml/config.sgml                      | 26 +++++++
 doc/src/sgml/ref/alter_function.sgml          | 15 ++++
 doc/src/sgml/ref/create_function.sgml         | 34 +++++++++
 src/backend/catalog/pg_aggregate.c            |  1 +
 src/backend/catalog/pg_proc.c                 | 12 ++++
 src/backend/commands/functioncmds.c           | 44 +++++++++++-
 src/backend/commands/typecmds.c               |  4 ++
 src/backend/optimizer/util/clauses.c          |  1 +
 src/backend/parser/gram.y                     | 15 ++++
 src/backend/utils/fmgr/fmgr.c                 | 24 +++++--
 src/backend/utils/misc/guc_tables.c           | 10 +++
 src/backend/utils/misc/postgresql.conf.sample |  2 +
 src/include/catalog/namespace.h               |  6 ++
 src/include/catalog/pg_proc.h                 | 15 ++++
 src/include/utils/guc.h                       |  1 +
 .../regress/expected/create_function_sql.out  | 69 +++++++++++++++++++
 src/test/regress/sql/create_function_sql.sql  | 35 ++++++++++
 17 files changed, 305 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 11251fa05e..f366cc274c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -10271,6 +10271,32 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-safe-function-search-path" xreflabel="safe_function_search_path">
+      <term><varname>safe_function_search_path</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>safe_function_search_path</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        If set to <literal>true</literal>, functions declared with
+        <literal>SEARCH DEFAULT</literal> (or no <literal>SEARCH</literal>
+        clause) behave as if <literal>SEARCH SYSTEM</literal> was
+        specified. This is the safest option because functions will have
+        consistent behavior regardless of the session's current setting of
+        <xref linkend="guc-search-path"/>.
+       </para>
+       <para>
+        If set to <literal>false</literal> (the default), functions declared
+        with <literal>SEARCH DEFAULT</literal> (or no
+        <literal>SEARCH</literal> clause) behave as if <literal>SEARCH
+        SESSION</literal> was specified.
+       </para>
+       <para>
+        See <xref linkend="sql-createfunction"/> for more information.
+       </para>
+      </listitem>
+     </varlistentry>
      <varlistentry id="guc-backslash-quote" xreflabel="backslash_quote">
       <term><varname>backslash_quote</varname> (<type>enum</type>)
       <indexterm><primary>strings</primary><secondary>backslash quotes</secondary></indexterm>
diff --git a/doc/src/sgml/ref/alter_function.sgml b/doc/src/sgml/ref/alter_function.sgml
index 8193b17f25..61d601b68f 100644
--- a/doc/src/sgml/ref/alter_function.sgml
+++ b/doc/src/sgml/ref/alter_function.sgml
@@ -37,6 +37,7 @@ ALTER FUNCTION <replaceable>name</replaceable> [ ( [ [ <replaceable class="param
     CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT
     IMMUTABLE | STABLE | VOLATILE
     [ NOT ] LEAKPROOF
+    SEARCH { DEFAULT | SYSTEM | SESSION }
     [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER
     PARALLEL { UNSAFE | RESTRICTED | SAFE }
     COST <replaceable class="parameter">execution_cost</replaceable>
@@ -198,6 +199,20 @@ ALTER FUNCTION <replaceable>name</replaceable> [ ( [ [ <replaceable class="param
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>SEARCH DEFAULT</literal></term>
+    <term><literal>SEARCH SYSTEM</literal></term>
+    <term><literal>SEARCH SESSION</literal></term>
+
+    <listitem>
+      <para>
+       Change the <literal>SEARCH</literal> behavior of the function to the
+       specified setting.  See <xref linkend="sql-createfunction"/> for
+       details.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal><optional> EXTERNAL </optional> SECURITY INVOKER</literal></term>
     <term><literal><optional> EXTERNAL </optional> SECURITY DEFINER</literal></term>
diff --git a/doc/src/sgml/ref/create_function.sgml b/doc/src/sgml/ref/create_function.sgml
index 863d99d1fc..82ca2955f4 100644
--- a/doc/src/sgml/ref/create_function.sgml
+++ b/doc/src/sgml/ref/create_function.sgml
@@ -31,6 +31,7 @@ CREATE [ OR REPLACE ] FUNCTION
     | { IMMUTABLE | STABLE | VOLATILE }
     | [ NOT ] LEAKPROOF
     | { CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT }
+    | SEARCH { DEFAULT | SYSTEM | SESSION }
     | { [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER }
     | PARALLEL { UNSAFE | RESTRICTED | SAFE }
     | COST <replaceable class="parameter">execution_cost</replaceable>
@@ -402,6 +403,39 @@ CREATE [ OR REPLACE ] FUNCTION
      </listitem>
     </varlistentry>
 
+    <varlistentry>
+     <term><literal>SEARCH DEFAULT</literal></term>
+     <term><literal>SEARCH SYSTEM</literal></term>
+     <term><literal>SEARCH SESSION</literal></term>
+
+     <listitem>
+      <para>
+       These settings control the function's <xref linkend="guc-search-path"/>
+       while executing if not set explicitly with a <literal>SET</literal>
+       clause in the <command>CREATE FUNCTION</command> command.
+      </para>
+      <para>
+       <literal>SEARCH SYSTEM</literal> indicates that the function will use
+       <literal>pg_catalog, pg_temp</literal>. This is the safest option
+       because the function will have consistent behavior regardless of the
+       session's setting.
+      </para>
+      <para>
+       <literal>SEARCH SESSION</literal> indicates that the function will
+       inherit session's current <literal>search_path</literal>
+       setting. Behavior of the function may change when the session's
+       <literal>search_path</literal> changes.
+      </para>
+      <para>
+       <literal>SEARCH DEFAULT</literal> is the same as not specifying any
+       <literal>SEARCH</literal> clause at all. If <xref
+       linkend="guc-safe-function-search-path"/> is set to
+       <literal>true</literal>, then the function will use
+       <literal>pg_catalog, pg_temp</literal>; otherwise it will inherit the
+       session's current <literal>search_path</literal> setting.
+      </para>
+     </listitem>
+    </varlistentry>
    <varlistentry>
     <term><literal><optional>EXTERNAL</optional> SECURITY INVOKER</literal></term>
     <term><literal><optional>EXTERNAL</optional> SECURITY DEFINER</literal></term>
diff --git a/src/backend/catalog/pg_aggregate.c b/src/backend/catalog/pg_aggregate.c
index ebc4454743..e4ec68ec43 100644
--- a/src/backend/catalog/pg_aggregate.c
+++ b/src/backend/catalog/pg_aggregate.c
@@ -624,6 +624,7 @@ AggregateCreate(const char *aggName,
 							 NULL,	/* probin */
 							 NULL,	/* prosqlbody */
 							 PROKIND_AGGREGATE,
+							 PROSEARCH_DEFAULT, /* no SEARCH clause for aggregates */
 							 false, /* security invoker (currently not
 									 * definable for agg) */
 							 false, /* isLeakProof */
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index b5fd364003..05917fd3df 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -81,6 +81,7 @@ ProcedureCreate(const char *procedureName,
 				const char *probin,
 				Node *prosqlbody,
 				char prokind,
+				char prosearch,
 				bool security_definer,
 				bool isLeakProof,
 				bool isStrict,
@@ -308,6 +309,7 @@ ProcedureCreate(const char *procedureName,
 	values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType);
 	values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(prosupport);
 	values[Anum_pg_proc_prokind - 1] = CharGetDatum(prokind);
+	values[Anum_pg_proc_prosearch - 1] = CharGetDatum(prosearch);
 	values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer);
 	values[Anum_pg_proc_proleakproof - 1] = BoolGetDatum(isLeakProof);
 	values[Anum_pg_proc_proisstrict - 1] = BoolGetDatum(isStrict);
@@ -988,6 +990,16 @@ sql_function_parse_error_callback(void *arg)
 	}
 }
 
+/*
+ * Check whether the search_path must be set while executing this function.
+ */
+bool
+prosearch_is_system(char prosearch)
+{
+	return (prosearch == PROSEARCH_SYSTEM ||
+			(prosearch == PROSEARCH_DEFAULT && safe_function_search_path));
+}
+
 /*
  * Adjust a syntax error occurring inside the function body of a CREATE
  * FUNCTION or DO command.  This can be used by any function validator or
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 7ba6a86ebe..7c2dfda01e 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -507,6 +507,7 @@ compute_common_attribute(ParseState *pstate,
 						 DefElem *defel,
 						 DefElem **volatility_item,
 						 DefElem **strict_item,
+						 DefElem **search_item,
 						 DefElem **security_item,
 						 DefElem **leakproof_item,
 						 List **set_items,
@@ -533,6 +534,13 @@ compute_common_attribute(ParseState *pstate,
 
 		*strict_item = defel;
 	}
+	else if (strcmp(defel->defname, "search") == 0)
+	{
+		if (*search_item)
+			errorConflictingDefElem(defel, pstate);
+
+		*search_item = defel;
+	}
 	else if (strcmp(defel->defname, "security") == 0)
 	{
 		if (*security_item)
@@ -603,6 +611,24 @@ procedure_error:
 	return false;
 }
 
+static char
+interpret_func_search(DefElem *defel)
+{
+	char	   *str = strVal(defel->arg);
+
+	if (strcmp(str, "default") == 0)
+		return PROSEARCH_DEFAULT;
+	else if (strcmp(str, "system") == 0)
+		return PROSEARCH_SYSTEM;
+	else if (strcmp(str, "session") == 0)
+		return PROSEARCH_SESSION;
+	else
+	{
+		elog(ERROR, "invalid volatility \"%s\"", str);
+		return 0;				/* keep compiler quiet */
+	}
+}
+
 static char
 interpret_func_volatility(DefElem *defel)
 {
@@ -725,6 +751,7 @@ compute_function_attributes(ParseState *pstate,
 							bool *windowfunc_p,
 							char *volatility_p,
 							bool *strict_p,
+							char *search_p,
 							bool *security_definer,
 							bool *leakproof_p,
 							ArrayType **proconfig,
@@ -740,6 +767,7 @@ compute_function_attributes(ParseState *pstate,
 	DefElem    *windowfunc_item = NULL;
 	DefElem    *volatility_item = NULL;
 	DefElem    *strict_item = NULL;
+	DefElem    *search_item = NULL;
 	DefElem    *security_item = NULL;
 	DefElem    *leakproof_item = NULL;
 	List	   *set_items = NIL;
@@ -786,6 +814,7 @@ compute_function_attributes(ParseState *pstate,
 										  defel,
 										  &volatility_item,
 										  &strict_item,
+										  &search_item,
 										  &security_item,
 										  &leakproof_item,
 										  &set_items,
@@ -814,6 +843,8 @@ compute_function_attributes(ParseState *pstate,
 		*volatility_p = interpret_func_volatility(volatility_item);
 	if (strict_item)
 		*strict_p = boolVal(strict_item->arg);
+	if (search_item)
+		*search_p = interpret_func_search(search_item);
 	if (security_item)
 		*security_definer = boolVal(security_item->arg);
 	if (leakproof_item)
@@ -1042,7 +1073,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 				isStrict,
 				security,
 				isLeakProof;
-	char		volatility;
+	char		prosearch,
+				volatility;
 	ArrayType  *proconfig;
 	float4		procost;
 	float4		prorows;
@@ -1067,6 +1099,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 	language = NULL;
 	isWindowFunc = false;
 	isStrict = false;
+	prosearch = PROSEARCH_DEFAULT;
 	security = false;
 	isLeakProof = false;
 	volatility = PROVOLATILE_VOLATILE;
@@ -1082,8 +1115,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 								stmt->options,
 								&as_clause, &language, &transformDefElem,
 								&isWindowFunc, &volatility,
-								&isStrict, &security, &isLeakProof,
-								&proconfig, &procost, &prorows,
+								&isStrict, &prosearch, &security,
+								&isLeakProof, &proconfig, &procost, &prorows,
 								&prosupport, &parallel);
 
 	if (!language)
@@ -1271,6 +1304,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 						   probin_str,	/* converted to text later */
 						   prosqlbody,
 						   stmt->is_procedure ? PROKIND_PROCEDURE : (isWindowFunc ? PROKIND_WINDOW : PROKIND_FUNCTION),
+						   prosearch,
 						   security,
 						   isLeakProof,
 						   isStrict,
@@ -1355,6 +1389,7 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt)
 	ListCell   *l;
 	DefElem    *volatility_item = NULL;
 	DefElem    *strict_item = NULL;
+	DefElem    *search_item = NULL;
 	DefElem    *security_def_item = NULL;
 	DefElem    *leakproof_item = NULL;
 	List	   *set_items = NIL;
@@ -1399,6 +1434,7 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt)
 									 defel,
 									 &volatility_item,
 									 &strict_item,
+									 &search_item,
 									 &security_def_item,
 									 &leakproof_item,
 									 &set_items,
@@ -1413,6 +1449,8 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt)
 		procForm->provolatile = interpret_func_volatility(volatility_item);
 	if (strict_item)
 		procForm->proisstrict = boolVal(strict_item->arg);
+	if (search_item)
+		procForm->prosearch = interpret_func_search(search_item);
 	if (security_def_item)
 		procForm->prosecdef = boolVal(security_def_item->arg);
 	if (leakproof_item)
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index 5e97606793..baa8dd706e 100644
--- a/src/backend/commands/typecmds.c
+++ b/src/backend/commands/typecmds.c
@@ -1763,6 +1763,7 @@ makeRangeConstructors(const char *name, Oid namespace,
 								 NULL,	/* probin */
 								 NULL,	/* prosqlbody */
 								 PROKIND_FUNCTION,
+								 PROSEARCH_DEFAULT,
 								 false, /* security_definer */
 								 false, /* leakproof */
 								 false, /* isStrict */
@@ -1828,6 +1829,7 @@ makeMultirangeConstructors(const char *name, Oid namespace,
 							 NULL,	/* probin */
 							 NULL,	/* prosqlbody */
 							 PROKIND_FUNCTION,
+							 PROSEARCH_DEFAULT,
 							 false, /* security_definer */
 							 false, /* leakproof */
 							 true,	/* isStrict */
@@ -1872,6 +1874,7 @@ makeMultirangeConstructors(const char *name, Oid namespace,
 							 NULL,	/* probin */
 							 NULL,	/* prosqlbody */
 							 PROKIND_FUNCTION,
+							 PROSEARCH_DEFAULT,
 							 false, /* security_definer */
 							 false, /* leakproof */
 							 true,	/* isStrict */
@@ -1910,6 +1913,7 @@ makeMultirangeConstructors(const char *name, Oid namespace,
 							 NULL,	/* probin */
 							 NULL,	/* prosqlbody */
 							 PROKIND_FUNCTION,
+							 PROSEARCH_DEFAULT,
 							 false, /* security_definer */
 							 false, /* leakproof */
 							 true,	/* isStrict */
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index da258968b8..ad93e0a28f 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -4474,6 +4474,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 	 */
 	if (funcform->prolang != SQLlanguageId ||
 		funcform->prokind != PROKIND_FUNCTION ||
+		prosearch_is_system(funcform->prosearch) ||
 		funcform->prosecdef ||
 		funcform->proretset ||
 		funcform->prorettype == RECORDOID ||
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 15ece871a0..e264fa536f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8528,6 +8528,21 @@ common_func_opt_item:
 				{
 					$$ = makeDefElem("support", (Node *) $2, @1);
 				}
+			| SEARCH DEFAULT
+				{
+					/* we abuse the normal content of a DefElem here */
+					$$ = makeDefElem("search", (Node *) makeString("default"), @1);
+				}
+			| SEARCH SYSTEM_P
+				{
+					/* we abuse the normal content of a DefElem here */
+					$$ = makeDefElem("search", (Node *) makeString("system"), @1);
+				}
+			| SEARCH SESSION
+				{
+					/* we abuse the normal content of a DefElem here */
+					$$ = makeDefElem("search", (Node *) makeString("session"), @1);
+				}
 			| FunctionSetResetClause
 				{
 					/* we abuse the normal content of a DefElem here */
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 9dfdf890c5..2dc7224940 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -16,6 +16,7 @@
 #include "postgres.h"
 
 #include "access/detoast.h"
+#include "catalog/namespace.h"
 #include "catalog/pg_language.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
@@ -110,6 +111,7 @@ fmgr_lookupByName(const char *name)
 	return NULL;
 }
 
+
 /*
  * This routine fills a FmgrInfo struct, given the OID
  * of the function to be called.
@@ -203,6 +205,7 @@ fmgr_info_cxt_security(Oid functionId, FmgrInfo *finfo, MemoryContext mcxt,
 	 */
 	if (!ignore_security &&
 		(procedureStruct->prosecdef ||
+		 prosearch_is_system(procedureStruct->prosearch) ||
 		 !heap_attisnull(procedureTuple, Anum_pg_proc_proconfig, NULL) ||
 		 FmgrHookIsNeeded(functionId)))
 	{
@@ -612,6 +615,7 @@ struct fmgr_security_definer_cache
 {
 	FmgrInfo	flinfo;			/* lookup info for target function */
 	Oid			userid;			/* userid to set, or InvalidOid */
+	char	   *searchPath;		/* from SEARCH clause, if specified */
 	List	   *configNames;	/* GUC names to set, or NIL */
 	List	   *configValues;	/* GUC values to set, or NIL */
 	Datum		arg;			/* passthrough argument for plugin modules */
@@ -630,6 +634,9 @@ struct fmgr_security_definer_cache
 extern Datum
 fmgr_security_definer(PG_FUNCTION_ARGS)
 {
+	GucContext	context = superuser() ? PGC_SUSET : PGC_USERSET;
+	GucSource	source = PGC_S_SESSION;
+	GucAction	action = GUC_ACTION_SAVE;
 	Datum		result;
 	struct fmgr_security_definer_cache *volatile fcache;
 	FmgrInfo   *save_flinfo;
@@ -662,6 +669,9 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 				 fcinfo->flinfo->fn_oid);
 		procedureStruct = (Form_pg_proc) GETSTRUCT(tuple);
 
+		if (prosearch_is_system(procedureStruct->prosearch))
+			fcache->searchPath = NAMESPACE_SAFE_SEARCH_PATH;
+
 		if (procedureStruct->prosecdef)
 			fcache->userid = procedureStruct->proowner;
 
@@ -687,20 +697,22 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	/* GetUserIdAndSecContext is cheap enough that no harm in a wasted call */
 	GetUserIdAndSecContext(&save_userid, &save_sec_context);
-	if (fcache->configNames != NIL) /* Need a new GUC nesting level */
+	if (fcache->searchPath != NULL || fcache->configNames != NIL) /* Need a new GUC nesting level */
 		save_nestlevel = NewGUCNestLevel();
 	else
-		save_nestlevel = 0;		/* keep compiler quiet */
+		save_nestlevel = 0;
 
 	if (OidIsValid(fcache->userid))
 		SetUserIdAndSecContext(fcache->userid,
 							   save_sec_context | SECURITY_LOCAL_USERID_CHANGE);
 
+	if (fcache->searchPath != NULL)
+		(void) set_config_option("search_path", fcache->searchPath,
+								 context, source,
+								 action, true, 0, false);
+
 	forboth(lc1, fcache->configNames, lc2, fcache->configValues)
 	{
-		GucContext	context = superuser() ? PGC_SUSET : PGC_USERSET;
-		GucSource	source = PGC_S_SESSION;
-		GucAction	action = GUC_ACTION_SAVE;
 		char	   *name = lfirst(lc1);
 		char	   *value = lfirst(lc2);
 
@@ -749,7 +761,7 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	fcinfo->flinfo = save_flinfo;
 
-	if (fcache->configNames != NIL)
+	if (save_nestlevel > 0)
 		AtEOXact_GUC(true, save_nestlevel);
 	if (OidIsValid(fcache->userid))
 		SetUserIdAndSecContext(save_userid, save_sec_context);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f9dba43b8c..7b32235d52 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -504,6 +504,7 @@ bool		log_btree_build_stats = false;
 char	   *event_source;
 
 bool		row_security;
+bool		safe_function_search_path = false;
 bool		check_function_bodies = true;
 
 /*
@@ -1593,6 +1594,15 @@ struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+	{
+		{"safe_function_search_path", PGC_SUSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Use safe search path for all functions, unless otherwise specified."),
+			NULL
+		},
+		&safe_function_search_path,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"check_function_bodies", PGC_USERSET, CLIENT_CONN_STATEMENT,
 			gettext_noop("Check routine bodies during CREATE FUNCTION and CREATE PROCEDURE."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c768af9a73..accbfb5609 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -688,6 +688,8 @@
 #default_toast_compression = 'pglz'	# 'pglz' or 'lz4'
 #temp_tablespaces = ''			# a list of tablespace names, '' uses
 					# only default tablespace
+#safe_function_search_path = off	# change to safe search_path while
+					# executing functions
 #check_function_bodies = on
 #default_transaction_isolation = 'read committed'
 #default_transaction_read_only = off
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index 49ef619e4b..5567c1268f 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -76,6 +76,12 @@ typedef enum RVROption
 typedef void (*RangeVarGetRelidCallback) (const RangeVar *relation, Oid relId,
 										  Oid oldRelId, void *callback_arg);
 
+/*
+ * Safe search path for cases where the session search path can't be trusted
+ * (e.g. for SECURITY DEFINER functions).
+ */
+#define NAMESPACE_SAFE_SEARCH_PATH		"pg_catalog, pg_temp"
+
 #define RangeVarGetRelid(relation, lockmode, missing_ok) \
 	RangeVarGetRelidExtended(relation, lockmode, \
 							 (missing_ok) ? RVR_MISSING_OK : 0, NULL, NULL)
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index e7abe0b497..b9627ad493 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -58,6 +58,9 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce
 	/* see PROKIND_ categories below */
 	char		prokind BKI_DEFAULT(f);
 
+	/* security definer */
+	char		prosearch BKI_DEFAULT(d);
+
 	/* security definer */
 	bool		prosecdef BKI_DEFAULT(f);
 
@@ -150,6 +153,15 @@ DECLARE_UNIQUE_INDEX(pg_proc_proname_args_nsp_index, 2691, ProcedureNameArgsNspI
 #define PROKIND_WINDOW 'w'
 #define PROKIND_PROCEDURE 'p'
 
+/*
+ * Symbolic values for prosearch column: these determine the search_path under
+ * which a function executes if not specified in a SET clause
+ * (proconfig). PROSEARCH_DEFAULT means that no SEARCH clause was specified.
+ */
+#define PROSEARCH_DEFAULT		'd'
+#define PROSEARCH_SYSTEM		'y'
+#define PROSEARCH_SESSION		'e'
+
 /*
  * Symbolic values for provolatile column: these indicate whether the result
  * of a function is dependent *only* on the values of its explicit arguments,
@@ -197,6 +209,7 @@ extern ObjectAddress ProcedureCreate(const char *procedureName,
 									 const char *probin,
 									 Node *prosqlbody,
 									 char prokind,
+									 char prosearch,
 									 bool security_definer,
 									 bool isLeakProof,
 									 bool isStrict,
@@ -213,6 +226,8 @@ extern ObjectAddress ProcedureCreate(const char *procedureName,
 									 float4 procost,
 									 float4 prorows);
 
+extern bool prosearch_is_system(char prosearch);
+
 extern bool function_parse_error_transpose(const char *prosrc);
 
 extern List *oid_array_to_list(Datum datum);
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index e89083ee0e..1241d095ba 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -249,6 +249,7 @@ extern PGDLLIMPORT bool log_executor_stats;
 extern PGDLLIMPORT bool log_statement_stats;
 extern PGDLLIMPORT bool log_btree_build_stats;
 
+extern PGDLLIMPORT bool safe_function_search_path;
 extern PGDLLIMPORT bool check_function_bodies;
 extern PGDLLIMPORT bool current_role_is_superuser;
 
diff --git a/src/test/regress/expected/create_function_sql.out b/src/test/regress/expected/create_function_sql.out
index 50aca5940f..72bbc02ee9 100644
--- a/src/test/regress/expected/create_function_sql.out
+++ b/src/test/regress/expected/create_function_sql.out
@@ -88,6 +88,75 @@ SELECT proname, provolatile FROM pg_proc
  functest_b_4 | v
 (4 rows)
 
+--
+-- SEARCH DEFAULT | SYSTEM | SESSION
+--
+CREATE FUNCTION show_search_path() RETURNS TEXT
+  LANGUAGE plpgsql AS
+$$
+  BEGIN
+    RETURN current_setting('search_path');
+  END;
+$$;
+SET safe_function_search_path = true;
+ALTER FUNCTION show_search_path() SEARCH DEFAULT;
+SELECT show_search_path();
+  show_search_path   
+---------------------
+ pg_catalog, pg_temp
+(1 row)
+
+ALTER FUNCTION show_search_path() SEARCH SYSTEM;
+SELECT show_search_path();
+  show_search_path   
+---------------------
+ pg_catalog, pg_temp
+(1 row)
+
+ALTER FUNCTION show_search_path() SEARCH SESSION;
+SELECT show_search_path();
+    show_search_path    
+------------------------
+ temp_func_test, public
+(1 row)
+
+RESET safe_function_search_path;
+ALTER FUNCTION show_search_path() SEARCH DEFAULT;
+SELECT show_search_path();
+    show_search_path    
+------------------------
+ temp_func_test, public
+(1 row)
+
+ALTER FUNCTION show_search_path() SEARCH SYSTEM;
+SELECT show_search_path();
+  show_search_path   
+---------------------
+ pg_catalog, pg_temp
+(1 row)
+
+ALTER FUNCTION show_search_path() SEARCH SESSION;
+SELECT show_search_path();
+    show_search_path    
+------------------------
+ temp_func_test, public
+(1 row)
+
+ALTER FUNCTION show_search_path() SEARCH SYSTEM SET search_path = test1;
+SELECT show_search_path();
+ show_search_path 
+------------------
+ test1
+(1 row)
+
+ALTER FUNCTION show_search_path() SEARCH SESSION SET search_path = test1;
+SELECT show_search_path();
+ show_search_path 
+------------------
+ test1
+(1 row)
+
+DROP FUNCTION show_search_path();
 --
 -- SECURITY DEFINER | INVOKER
 --
diff --git a/src/test/regress/sql/create_function_sql.sql b/src/test/regress/sql/create_function_sql.sql
index 89e9af3a49..b0756e1668 100644
--- a/src/test/regress/sql/create_function_sql.sql
+++ b/src/test/regress/sql/create_function_sql.sql
@@ -60,6 +60,41 @@ SELECT proname, provolatile FROM pg_proc
                      'functest_B_3'::regproc,
 		     'functest_B_4'::regproc) ORDER BY proname;
 
+--
+-- SEARCH DEFAULT | SYSTEM | SESSION
+--
+
+CREATE FUNCTION show_search_path() RETURNS TEXT
+  LANGUAGE plpgsql AS
+$$
+  BEGIN
+    RETURN current_setting('search_path');
+  END;
+$$;
+
+SET safe_function_search_path = true;
+ALTER FUNCTION show_search_path() SEARCH DEFAULT;
+SELECT show_search_path();
+ALTER FUNCTION show_search_path() SEARCH SYSTEM;
+SELECT show_search_path();
+ALTER FUNCTION show_search_path() SEARCH SESSION;
+SELECT show_search_path();
+
+RESET safe_function_search_path;
+ALTER FUNCTION show_search_path() SEARCH DEFAULT;
+SELECT show_search_path();
+ALTER FUNCTION show_search_path() SEARCH SYSTEM;
+SELECT show_search_path();
+ALTER FUNCTION show_search_path() SEARCH SESSION;
+SELECT show_search_path();
+
+ALTER FUNCTION show_search_path() SEARCH SYSTEM SET search_path = test1;
+SELECT show_search_path();
+ALTER FUNCTION show_search_path() SEARCH SESSION SET search_path = test1;
+SELECT show_search_path();
+
+DROP FUNCTION show_search_path();
+
 --
 -- SECURITY DEFINER | INVOKER
 --
-- 
2.34.1

Joe Conway

mail@joeconway.com

over 2 years ago

In reply to: Jeff Davis (#1)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On 8/11/23 22:35, Jeff Davis wrote:

The attached patch implements a new SEARCH clause for CREATE FUNCTION.
The SEARCH clause controls the search_path used when executing
functions that were created without a SET clause.

Background:

Controlling search_path is critical for the correctness and security of
functions. Right now, the author of a function without a SET clause has
little ability to control the function's behavior, because even basic
operators like "+" involve search_path. This is a big problem for, e.g.
functions used in expression indexes which are called by any user with
write privileges on the table.

Motivation:

I'd like to (eventually) get to safe-by-default behavior. In other
words, the simplest function declaration should be safe for the most
common use cases.

I agree with the general need.

Add SEARCH { DEFAULT | SYSTEM | SESSION } clause to CREATE/ALTER
function.

* SEARCH DEFAULT is the same as no SEARCH clause at all, and ends up
stored in the catalog as prosearch='d'.
* SEARCH SYSTEM means that we switch to the safe search path of
"pg_catalog, pg_temp" when executing the function. Stored as
prosearch='y'.
* SEARCH SESSION means that we don't switch the search_path when
executing the function, and it's inherited from the session. Stored as
prosearch='e'.

It isn't clear to me what is the precise difference between DEFAULT and
SESSION

2. We can more accurately serve the user's intent. For instance, the
safe search_path of "pg_catalog, pg_temp" is arcane and seems to be
there just because we don't have a way to specify that pg_temp be
excluded entirely. But perhaps in the future we *do* want to exclude
pg_temp entirely. Knowing that the user just wants "SEARCH SYSTEM"
allows us some freedom to do that.

Personally I think having pg_temp in the SYSTEM search path makes sense
for temp tables, but I find it easy to forget that functions can be
created by unprivileged users in pg_temp, and therefore having pg_temp
in the search path for functions is dangerous.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Joe Conway

mail@joeconway.com

over 2 years ago

In reply to: Joe Conway (#2)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On 8/12/23 09:15, Joe Conway wrote:

On 8/11/23 22:35, Jeff Davis wrote:

2. We can more accurately serve the user's intent. For instance, the
safe search_path of "pg_catalog, pg_temp" is arcane and seems to be
there just because we don't have a way to specify that pg_temp be
excluded entirely. But perhaps in the future we *do* want to exclude
pg_temp entirely. Knowing that the user just wants "SEARCH SYSTEM"
allows us some freedom to do that.

Personally I think having pg_temp in the SYSTEM search path makes sense
for temp tables, but I find it easy to forget that functions can be
created by unprivileged users in pg_temp, and therefore having pg_temp
in the search path for functions is dangerous.

Hmm, I guess I was too hasty -- seems we have some magic related to this
already.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Joe Conway (#2)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Sat, 2023-08-12 at 09:15 -0400, Joe Conway wrote:

It isn't clear to me what is the precise difference between DEFAULT
and
SESSION

The the current patch, SESSION always gets the search path from the
session, while DEFAULT is controlled by the GUC
safe_function_search_path. If the GUC is false (the default) then
DEFAULT and SESSION are the same. If the GUC is true, then DEFAULT and
SYSTEM are the same.

There are alternatives to using a GUC to differentiate them. The main
point of this patch is to capture what the user intends in a convenient
and forward-compatible way. If the user specifies nothing at all, they
get DEFAULT, and we could treat that specially in various ways to move
toward safety while minimizing breakage.

Personally I think having pg_temp in the SYSTEM search path makes
sense
for temp tables

The patch doesn't change this behavior -- SYSTEM (without any other
SET) gives you "pg_catalog, pg_temp" and there's no way to exclude
pg_temp entirely.

My point was that by capturing the user's intent with SEARCH SYSTEM, it
gives us a bit more freedom to have these kinds of discussions later.
And it's certainly easier for the user to specify SEARCH SYSTEM than
"SET search_path = pg_catalog, pg_temp".

Regards,
Jeff Davis

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Joe Conway (#3)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Sat, 2023-08-12 at 09:50 -0400, Joe Conway wrote:

Hmm, I guess I was too hasty -- seems we have some magic related to
this
already.

I was worried after your first email. But yes, the magic is in
FuncnameGetCandidates(), which simply ignores functions in the temp
namespace.

It would be better if we were obviously safe rather than magically
safe, though.

Regards,
Jeff Davis

Andres Freund

andres@anarazel.de

over 2 years ago

In reply to: Jeff Davis (#1)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

Hi,

On 2023-08-11 19:35:22 -0700, Jeff Davis wrote:

Controlling search_path is critical for the correctness and security of
functions. Right now, the author of a function without a SET clause has
little ability to control the function's behavior, because even basic
operators like "+" involve search_path. This is a big problem for, e.g.
functions used in expression indexes which are called by any user with
write privileges on the table.

Motivation:

I'd like to (eventually) get to safe-by-default behavior. In other
words, the simplest function declaration should be safe for the most
common use cases.

I'm not sure that anything based, directly or indirectly, on search_path
really is a realistic way to get there.

To get there, we need some way to explicitly specify the less common
cases. Right now there's no way for the function author to indicate
that a function intends to use the session's search path. We also need
an easier way to specify that the user wants a safe search_path ("SET
search_path = pg_catalog, pg_temp" is arcane).

No disagreement with that. Even if I don't yet agree that your proposal is a
convincing path to "easy security for PLs" - just making the search path stuff
less arcane is good.

And when we know more about the user's actual intent, then it will be
easier to either form a transition plan to push users into the safer
behavior, or at least warn strongly when the user is doing something
dangerous (i.e. using a function that depends on the session's search
path as part of an expression index).

I think that'd be pretty painful from a UX perspective. Having to write
e.g. operators as operator(schema, op) just sucks as an experience. And with
extensions plenty of operators will live outside of pg_catalog, so there is
plenty things that will need qualifying. And because of things like type
coercion search, which prefers "bettering fitting" coercions over search path
order, you can't just put "less important" things later in search path.

I wonder if we ought to work more on "fossilizing" the result of search path
resolutions at the time functions are created, rather than requiring the user
to do so explicitly. Most of the problem here comes down to the fact that if
a user creates a function like 'a + b' we'll not resolve the operator, the
potential type coercions etc, when the function is created - we do so when the
function is executed.

We can't just store the oids at the time, because that'd end up very fragile -
tables/functions/... might be dropped and recreated etc and thus change their
oid. But we could change the core PLs to rewrite all the queries (*) so that
they schema qualify absolutely everything, including operators and implicit
type casts.

That way objects referenced by functions can still be replaced, but search
path can't be used to "inject" objects in different schemas. Obviously it
could lead to errors on some schema changes - e.g. changing a column type
might mean that a relevant cast lives in a different place than with the old
type - but I think that'll be quite rare. Perhaps we could offer a ALTER
FUNCTION ... REFRESH REFERENCES; or such?

One obvious downside of such an approach is that it requires some work with
each PL. I'm not sure that's avoidable - and I suspect that most "security
sensitive" functions are written in just a few languages.

(*) Obviously the one thing that doesn't work for is use of EXECUTE in plpgsql
and similar constructs elsewhere. I'm not sure there's much that can be done
to make that safe, but it's worth thinking about more.

Greetings,

Andres Freund

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Andres Freund (#6)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Sat, 2023-08-12 at 11:25 -0700, Andres Freund wrote:

I'm not sure that anything based, directly or indirectly, on
search_path
really is a realistic way to get there.

Can you explain a little more? I see what you mean generally, that
search_path is an imprecise thing, and that it leaves room for
ambiguity and mistakes.

But I also think we can do a lot better than we're doing today and
still retain the basic concept of search_path, which is good because
it's deeply integrated into postgres, and it's not clear that we're
going to get away from it any time soon.

I think that'd be pretty painful from a UX perspective. Having to
write
e.g. operators as operator(schema, op) just sucks as an experience.

I'm not suggesting that the user fully-qualify everything; I'm
suggesting that the include a "SET search_path" clause if they depend
on anything other than pg_catalog.

And with
extensions plenty of operators will live outside of pg_catalog, so
there is
plenty things that will need qualifying.

In my proposal, that would still involve a "SET search_path TO
myextension, pg_catalog, pg_temp".

The main reason that's bad is that adding pg_temp at the end is painful
UX -- just something that the user needs to remember to do with little
obvious reason or observable impact; but it has important security
implications. Perhaps we should just not implicitly include pg_temp for
a function's search_path (at least for the case of CREATE FUNCTION ...
SEARCH SYSTEM)?

And because of things like type
coercion search, which prefers "bettering fitting" coercions over
search path
order, you can't just put "less important" things later in search
path.

I understand this introduces some ambiguity, but you just can't include
schemas in the search_path that you don't trust, for similar reasons as
$PATH. If you have a few objects you'd like to access in another user's
schema, fully-qualify them.

We can't just store the oids at the time, because that'd end up very
fragile -
tables/functions/... might be dropped and recreated etc and thus
change their
oid.

Robert suggested something along those lines[1]/messages/by-id/CA+Tgmobd=eFRGWHhfG4mG2cA+dsVuA4jpBvD8N1NS=Vc9eHFQg@mail.gmail.com. I won't rule it out,
but I agree that there are quite a few things left to figure out.

But we could change the core PLs to rewrite all the queries (*) so
that
they schema qualify absolutely everything, including operators and
implicit
type casts.

So not quite like "SET search_path FROM CURRENT": you resolve it to a
specific "schemaname.objectname", but stop just short of resolving to a
specific OID?

An interesting compromise, but I'm not sure what the benefit is vs. SET
search_path FROM CURRENT (or some defined search_path).

That way objects referenced by functions can still be replaced, but
search
path can't be used to "inject" objects in different schemas.
Obviously it
could lead to errors on some schema changes - e.g. changing a column
type
might mean that a relevant cast lives in a different place than with
the old
type - but I think that'll be quite rare. Perhaps we could offer a
ALTER
FUNCTION ... REFRESH REFERENCES; or such?

Hmm. I feel like that's making things more complicated. I'd find it
more straightforward to use something like Robert's approach of fully
parsing something, and then have the REFRESH command reparse it when
something needs updating. Or perhaps just create all of the dependency
entries more like a view query and then auto-refresh.

(*) Obviously the one thing that doesn't work for is use of EXECUTE
in plpgsql
and similar constructs elsewhere. I'm not sure there's much that can
be done
to make that safe, but it's worth thinking about more.

I think it would be really nice to have some better control over the
search_path regardless, because it still helps with cases like this. A
lot of C functions build queries, and I don't think it's reasonable to
constantly worry about the ambiguity of the schema for "=".

Regards,
Jeff Davis

[1]: /messages/by-id/CA+Tgmobd=eFRGWHhfG4mG2cA+dsVuA4jpBvD8N1NS=Vc9eHFQg@mail.gmail.com
/messages/by-id/CA+Tgmobd=eFRGWHhfG4mG2cA+dsVuA4jpBvD8N1NS=Vc9eHFQg@mail.gmail.com

Peter Eisentraut

peter@eisentraut.org

over 2 years ago

In reply to: Jeff Davis (#1)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On 12.08.23 04:35, Jeff Davis wrote:

The attached patch implements a new SEARCH clause for CREATE FUNCTION.
The SEARCH clause controls the search_path used when executing
functions that were created without a SET clause.

I don't understand this. This adds a new option for cases where the
existing option wasn't specified. Why not specify the existing option
then? Is it not good enough? Can we improve it?

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Peter Eisentraut (#8)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Wed, 2023-08-16 at 08:51 +0200, Peter Eisentraut wrote:

On 12.08.23 04:35, Jeff Davis wrote:

The attached patch implements a new SEARCH clause for CREATE
FUNCTION.
The SEARCH clause controls the search_path used when executing
functions that were created without a SET clause.

I don't understand this. This adds a new option for cases where the
existing option wasn't specified. Why not specify the existing
option
then? Is it not good enough? Can we improve it?

SET search_path = '...' not good enough in my opinion.

1. Not specifying a SET clause falls back to the session's search_path,
which is a bad default because it leads to all kinds of inconsistent
behavior and security concerns.

2. There's no way to explicitly request that you'd actually like to use
the session's search_path, so it makes it very hard to ever change the
default.

3. It's user-unfriendly. A safe search_path that would be suitable for
most functions is "SET search_path = pg_catalog, pg_temp", which is
arcane, and requires some explanation.

4. search_path for the session is conceptually different than for a
function. A session should be context-sensitive and the same query
should (quite reasonably) behave differently for different sessions and
users to sort out things like object name conflicts, etc. A function
should (ordinarily) be context-insensitive, especially when used in
something like an index expression or constraint. Having different
syntax helps separate those concepts.

5. There's no way to prevent pg_temp from being included in the
search_path. This is separately fixable, but having the proposed SEARCH
syntax is likely to make for a better user experience in the common
cases.

I'm open to suggestion about other ways to improve it, but SEARCH is
what I came up with.

Regards,
Jeff Davis

#10

Peter Eisentraut

peter@eisentraut.org

over 2 years ago

In reply to: Jeff Davis (#9)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On 16.08.23 19:44, Jeff Davis wrote:

On Wed, 2023-08-16 at 08:51 +0200, Peter Eisentraut wrote:

On 12.08.23 04:35, Jeff Davis wrote:

The attached patch implements a new SEARCH clause for CREATE
FUNCTION.
The SEARCH clause controls the search_path used when executing
functions that were created without a SET clause.

I don't understand this. This adds a new option for cases where the
existing option wasn't specified. Why not specify the existing
option
then? Is it not good enough? Can we improve it?

SET search_path = '...' not good enough in my opinion.

1. Not specifying a SET clause falls back to the session's search_path,
which is a bad default because it leads to all kinds of inconsistent
behavior and security concerns.

Not specifying SEARCH would have the same issue?

2. There's no way to explicitly request that you'd actually like to use
the session's search_path, so it makes it very hard to ever change the
default.

That sounds like something that should be fixed independently. I could
see this being useful for other GUC settings, like I want to run a
function explicitly with the session's work_mem.

3. It's user-unfriendly. A safe search_path that would be suitable for
most functions is "SET search_path = pg_catalog, pg_temp", which is
arcane, and requires some explanation.

True, but is that specific to functions? Maybe I want a safe
search_path just in general, for a session or something.

4. search_path for the session is conceptually different than for a
function. A session should be context-sensitive and the same query
should (quite reasonably) behave differently for different sessions and
users to sort out things like object name conflicts, etc. A function
should (ordinarily) be context-insensitive, especially when used in
something like an index expression or constraint. Having different
syntax helps separate those concepts.

I'm not sure I follow that. When you say a function should be
context-insensitive, you could also say, a function should be
context-sensitive, but have a separate context. Which is kind of how it
works now. Maybe not well enough.

5. There's no way to prevent pg_temp from being included in the
search_path. This is separately fixable, but having the proposed SEARCH
syntax is likely to make for a better user experience in the common
cases.

seems related to #3

I'm open to suggestion about other ways to improve it, but SEARCH is
what I came up with.

Some extensions of the current mechanism, like search_path = safe,
search_path = session, search_path = inherit, etc. might work.

#11

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Peter Eisentraut (#10)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Fri, 2023-08-18 at 14:25 +0200, Peter Eisentraut wrote:

Not specifying SEARCH would have the same issue?

Not specifying SEARCH is equivalent to SEARCH DEFAULT, and that gives
us some control over what happens. In the proposed patch, a GUC
determines whether it behaves like SEARCH SESSION (the default for
compatibility reasons) or SEARCH SYSTEM (safer).

2. There's no way to explicitly request that you'd actually like to
use
the session's search_path, so it makes it very hard to ever change
the
default.

That sounds like something that should be fixed independently. I
could
see this being useful for other GUC settings, like I want to run a
function explicitly with the session's work_mem.

I'm confused about how this would work. It doesn't make sense to set a
GUC to be the session value in postgresql.conf, because there's no
session yet. And it doesn't really make sense in a top-level session,
because it would just be a no-op (right?). It maybe makes sense in a
function, but I'm still not totally clear on what that would mean.

True, but is that specific to functions? Maybe I want a safe
search_path just in general, for a session or something.

I agree this is a somewhat orthogonal problem and we should have a way
to keep pg_temp out of the search_path entirely. We just need to agree
on a string representation of a search path that omits pg_temp. One
idea would be to have special identifiers "!pg_temp" and "!pg_catalog"
that would cause those to be excluded entirely.

I'm not sure I follow that. When you say a function should be
context-insensitive, you could also say, a function should be
context-sensitive, but have a separate context. Which is kind of how
it
works now. Maybe not well enough.

For functions called from index expressions or constraints, you want
the function's result to only depend on its arguments; otherwise you
can easily violate a constraint or cause an index to return wrong
results.

You're right that there is some other context, like the database
default collation, but (a) that's mostly nailed down; and (b) if it
changes unexpectedly that also causes problems.

I'm open to suggestion about other ways to improve it, but SEARCH
is
what I came up with.

Some extensions of the current mechanism, like search_path = safe,
search_path = session, search_path = inherit, etc. might work.

I had considered some new special names like this in search path, but I
didn't come up with a specific proposal that I liked. Do you have some
more details about how this would help get us to a safe default?

Regards,
Jeff Davis

#12

Andres Freund

andres@anarazel.de

over 2 years ago

In reply to: Jeff Davis (#7)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

Hi,

On 2023-08-14 12:25:30 -0700, Jeff Davis wrote:

On Sat, 2023-08-12 at 11:25 -0700, Andres Freund wrote:

I'm not sure that anything based, directly or indirectly, on
search_path
really is a realistic way to get there.

Can you explain a little more? I see what you mean generally, that
search_path is an imprecise thing, and that it leaves room for
ambiguity and mistakes.

It just doesn't seem to provide enough control and it's really painful for
users to manage. If you install a bunch of extensions into public - very very
common from what I have seen - you can't really remove public from the search
path. Which then basically makes all approaches of resolving any of the
security issues via search path pretty toothless.

I think that'd be pretty painful from a UX perspective. Having to
write
e.g. operators as operator(schema, op) just sucks as an experience.

I'm not suggesting that the user fully-qualify everything; I'm
suggesting that the include a "SET search_path" clause if they depend
on anything other than pg_catalog.

I don't think that really works in practice, due to the very common practice
of installing extensions into the same schema as the application. Then that
schema needs to be in search path (if you don't want to schema qualify
everything), which leaves you wide open.

And with
extensions plenty of operators will live outside of pg_catalog, so
there is
plenty things that will need qualifying.

In my proposal, that would still involve a "SET search_path TO
myextension, pg_catalog, pg_temp".

myextension is typically public. Which means that there's zero protection due
to such a search path.

ï¿½ And because of things like type
coercion search, which prefers "bettering fitting" coercions over
search path
order, you can't just put "less important" things later in search
path.

I understand this introduces some ambiguity, but you just can't include
schemas in the search_path that you don't trust, for similar reasons as
$PATH. If you have a few objects you'd like to access in another user's
schema, fully-qualify them.

I think the more common attack paths are things like tricking extension
scripts into evaluating arbitrary code, to gain "real superuser" privileges.

We can't just store the oids at the time, because that'd end up very
fragile -
tables/functions/... might be dropped and recreated etc and thus
change their
oid.

Robert suggested something along those lines[1]. I won't rule it out,
but I agree that there are quite a few things left to figure out.

But we could change the core PLs to rewrite all the queries (*) so
that
they schema qualify absolutely everything, including operators and
implicit
type casts.

So not quite like "SET search_path FROM CURRENT": you resolve it to a
specific "schemaname.objectname", but stop just short of resolving to a
specific OID?

An interesting compromise, but I'm not sure what the benefit is vs. SET
search_path FROM CURRENT (or some defined search_path).

Search path does not reliably protect things involving "type matching". If you
have a better fitting cast, or a function call with parameters that won't need
coercion, later in search path, they'll win, even if there's another fit
earlier on.

IOW, search path is a bandaid for this kind of thing, at best.

If we instead store something that avoids the need for such search, the
"better fitting cast" logic wouldn't add these kind of security issues
anymore.

That way objects referenced by functions can still be replaced, but
search
path can't be used to "inject" objects in different schemas.
Obviously it
could lead to errors on some schema changes - e.g. changing a column
type
might mean that a relevant cast lives in a different place than with
the old
type - but I think that'll be quite rare. Perhaps we could offer a
ALTER
FUNCTION ... REFRESH REFERENCES; or such?

Hmm. I feel like that's making things more complicated. I'd find it
more straightforward to use something like Robert's approach of fully
parsing something, and then have the REFRESH command reparse it when
something needs updating. Or perhaps just create all of the dependency
entries more like a view query and then auto-refresh.

Hm, I'm not quite sure I follow on what exactly you see as different here.

Greetings,

Andres Freund

#13

Robert Haas

robertmhaas@gmail.com

over 2 years ago

In reply to: Jeff Davis (#9)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Wed, Aug 16, 2023 at 1:45 PM Jeff Davis <pgsql@j-davis.com> wrote:

On Wed, 2023-08-16 at 08:51 +0200, Peter Eisentraut wrote:

On 12.08.23 04:35, Jeff Davis wrote:

The attached patch implements a new SEARCH clause for CREATE
FUNCTION.
The SEARCH clause controls the search_path used when executing
functions that were created without a SET clause.

I don't understand this. This adds a new option for cases where the
existing option wasn't specified. Why not specify the existing
option
then? Is it not good enough? Can we improve it?

SET search_path = '...' not good enough in my opinion.

1. Not specifying a SET clause falls back to the session's search_path,
which is a bad default because it leads to all kinds of inconsistent
behavior and security concerns.

2. There's no way to explicitly request that you'd actually like to use
the session's search_path, so it makes it very hard to ever change the
default.

3. It's user-unfriendly. A safe search_path that would be suitable for
most functions is "SET search_path = pg_catalog, pg_temp", which is
arcane, and requires some explanation.

4. search_path for the session is conceptually different than for a
function. A session should be context-sensitive and the same query
should (quite reasonably) behave differently for different sessions and
users to sort out things like object name conflicts, etc. A function
should (ordinarily) be context-insensitive, especially when used in
something like an index expression or constraint. Having different
syntax helps separate those concepts.

5. There's no way to prevent pg_temp from being included in the
search_path. This is separately fixable, but having the proposed SEARCH
syntax is likely to make for a better user experience in the common
cases.

I'm open to suggestion about other ways to improve it, but SEARCH is
what I came up with.

The one thing that I really like about your proposal is that you
explicitly included a way of specifying that the prevailing
search_path should be used. If we move to any kind of a system where
the default behavior is something other than that, then we need that
as an option. Another, related thing that I recently discovered would
be useful is a way to say "I'd like to switch the search_path to X,
but I'd also like to discover what the prevailing search_path was just
before entering this function." For example, if I have a function that
is SECURITY DEFINER which takes some executable code as an input, I
might want to arrange to eventually execute that code with the
caller's user ID and search_path, but I can't discover the caller's
search_path unless I don't set it, and that's a dangerous thing to do.

However, my overall concern here is that this feels like it's
reinventing the wheel. We already have a way of setting search_path;
this gives us a second one. If we had no existing mechanism for that,
I think this would definitely be an improvement, and quite possibly
better than the current mechanism. But given that we had a mechanism
already, if we added this, we'd then have two, which seems like the
wrong number.

I'm inclined to think that if there are semantics that we currently
lack, we should think of extending the current syntax to support them.
Right now you can SET search_path = 'specific value' or SET
search_path FROM CURRENT or leave it out. We could introduce a new way
of spelling "leave it out," like RESET search_path or whatever. We
could introduce a new setting that doesn't set the search_path at all
but reverts to the old value on function exit, like SET search_path
USING CALL or whatever. And we could think of making SET search_path
FROM CURRENT or any new semantics we introduce the default in a future
release, or even make the default behavior depend on an evil
behavior-changing GUC as you proposed. I'm not quite sure what we
should do here conceptually, but I don't see why having a completely
new syntax for doing it really helps.

--
Robert Haas
EDB: http://www.enterprisedb.com

#14

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Andres Freund (#12)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Sat, 2023-08-19 at 11:59 -0700, Andres Freund wrote:

If you install a bunch of extensions into public - very very
common from what I have seen - you can't really remove public from
the search
path. Which then basically makes all approaches of resolving any of
the
security issues via search path pretty toothless.

Toothless only if (a) untrusted users have CREATE privileges in the
public schema, which is no longer the default; and (b) you're writing a
function that accesses extension objects installed in the public
schema.

While those may be normal things to do, there are a lot of times when
those things aren't true. I speculate that it's far more common to
write functions that only use pg_catalog objects (e.g. the "+"
operator, some string manipulation, etc.) and basic control flow.

There's a lot of value in making those simple cases secure-by-default.
We are already moving users towards a readable-but-not-writable public
schema as a best practice, so if we also move to something like SEARCH
SYSTEM as a best practice, then that will help a LOT of users.

I don't think that really works in practice, due to the very common
practice
of installing extensions into the same schema as the application.
Then that
schema needs to be in search path (if you don't want to schema
qualify
everything), which leaves you wide open.

...

myextension is typically public. Which means that there's zero
protection due
to such a search path.

You mentioned this three times so I must be missing something. Why is
it "wide open" and "zero protection"? If the schema is not world-
writable, then aren't attacks a lot harder to pull off?

I think the more common attack paths are things like tricking
extension
scripts into evaluating arbitrary code, to gain "real superuser"
privileges.

Extension scripts are a separate beast. I do see some potential avenues
of attack, but I don't see how your approach of resolving schemas early
would help.

Search path does not reliably protect things involving "type
matching". If you
have a better fitting cast, or a function call with parameters that
won't need
coercion, later in search path, they'll win, even if there's another
fit
earlier on.

You need to trust the schemas in your search_path.

If we instead store something that avoids the need for such search,
the
"better fitting cast" logic wouldn't add these kind of security
issues
anymore.

I don't disagree, but I don't understand the approach in detail (i.e. I
couldn't write it up as a proposal). For instance, what would the
pg_dump output look like?

And even if we had that in place, I think we'd still want a better way
to control the search_path.

Hm, I'm not quite sure I follow on what exactly you see as different
here.

From what I understand, Robert's approach is to fully parse the
commands and resolve to specific OIDs (necessitating dependencies,
etc.); while your approach resolves to fully-qualified names but not
OIDs (and needing no dependencies).

I don't understand either proposal entirely, so perhaps I'm on the
wrong track here, but I feel like Robert's approach is more "normal"
and easy to document whereas your approach is more "creative" and
perhaps hard to document.

Both approaches (resolving to names and resolving to OIDs) seem pretty
far away, so I'm still very much inclined to nudge users toward safer
best practices with search_path. I think SEARCH SYSTEM is a good start
there and doable for 17.

Regards,
Jeff Davis

#15

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Robert Haas (#13)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, 2023-08-21 at 15:14 -0400, Robert Haas wrote:

Another, related thing that I recently discovered would
be useful is a way to say "I'd like to switch the search_path to X,
but I'd also like to discover what the prevailing search_path was
just
before entering this function."

Interesting, that could probably be accommodated one way or another.

However, my overall concern here is that this feels like it's
reinventing the wheel. We already have a way of setting search_path;
this gives us a second one.

In one sense, you are obviously right. We have a way to set search_path
for a function already, just like any other GUC.

But I don't look at the search_path as "just another GUC" when it comes
to executing a function. The source of the initial value of search_path
is more like the IMMUTABLE marker.

We can also do something with the knowledge the SEARCH marker gives us.
For instance, issue WARNINGs or ERRORs when someone uses a SEARCH
SESSION function in an index expression or constraint, or perhaps when
they try to declare a function IMMUTABLE in the first place.

In other words, the SEARCH clause tells us where search_path comes
from, not so much what it is specifically. I believe that tells us
something fundamental about the kind of function it is. If I tell you
nothing about a function except whether the search path comes from the
system or the session, you can imagine how it should be used (or not
used, as the case may be).

I'm inclined to think that if there are semantics that we currently
lack, we should think of extending the current syntax to support
them.
Right now you can SET search_path = 'specific value' or SET
search_path FROM CURRENT or leave it out. We could introduce a new
way
of spelling "leave it out," like RESET search_path or whatever.

The thought occurred to me but any way I looked at it was messier and
less user-friendly. It feels like generalizing from search_path to all
GUCs, and then needing to specialize for search_path anyway.

For instance, if we want the default search_path to be the safe value
'pg_catalog, pg_temp', where would that default value come from? Or
instead, we could say that the default would be FROM CURRENT, which
would seem to generalize; but then we immediately run into the problem
that we don't want most GUCs to default to FROM CURRENT (because that
would capture the entire GUC state, which seems bad for several
reasons), and again we'd need to specialize for search_path.

In other words, search_path really *is* special. I don't think it's
great to generalize from it as though it were just like every other
GUC.

I do recognize that "SEARCH SYSTEM ... SET search_path = '...'" is
redundant, and that's not great. I just see the other options as worse,
but if I've misunderstood your approach then please clarify.

Regards,
Jeff Davis

#16

Robert Haas

robertmhaas@gmail.com

over 2 years ago

In reply to: Jeff Davis (#15)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, Aug 21, 2023 at 5:32 PM Jeff Davis <pgsql@j-davis.com> wrote:

But I don't look at the search_path as "just another GUC" when it comes
to executing a function. The source of the initial value of search_path
is more like the IMMUTABLE marker.

I mean I agree and I disagree.

Philosophically, I agree. Most functions are written with some
particular search_path in mind; the author imagines that the function
will be executed with, well, probably whatever search path the author
typically uses themselves. Now and then, someone may write a function
that's intended to run with various different search paths, e.g.
anything of the form customer_XXXX, pg_catalog, pg_temp. I think that
is a real thing that people actually do, intentionally varying the
search_path with the idea of rebinding some references. However, cases
where somebody sincerely intends for the caller to be able to make +
or || mean something different from normal probably do not exist in
practice. So, if we were designing a system from scratch, then I would
recommend against making search_path a GUC, because it's clearly
shouldn't behave in the same way as a session property like
debug_print_plan or enable_seqscan, where you could want to run the
same code with various values.

But practically, I disagree. As things stand today, search_path *is* a
GUC that dynamically changes the run-time properties of a session, and
your proposed patch wouldn't change that. What it would do is layer
another mechanism on top of that which, IMHO, makes something that is
already complicated and error-prone even more complicated. If we
wanted to really make seach_path behave like a property of the code
rather than the session, I think we'd need to change quite a bit more
stuff, and the price of that in terms of backward-compatibility might
be higher than we'd be willing to pay, but if, hypothetically, we
decided to pay that price, then at the end of it search_path as a GUC
would be gone, and we'd have one way of managing sarch_path that is
different from the one we have now.

But with the patch as you have proposed it that's not what happens. We
just end up with two interconnected mechanisms for managing what,
right now, is managed by a single mechanism. That mechanism is (and I
think we probably mostly all agree on this) bad. Like really really
bad. But having more than one mechanism, to me, still seems worse.

--
Robert Haas
EDB: http://www.enterprisedb.com

#17

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Robert Haas (#16)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, 2023-09-18 at 12:01 -0400, Robert Haas wrote:

But with the patch as you have proposed it that's not what happens.
We
just end up with two interconnected mechanisms for managing what,
right now, is managed by a single mechanism. That mechanism is (and I
think we probably mostly all agree on this) bad. Like really really
bad. But having more than one mechanism, to me, still seems worse.

I don't want to make an argument of the form "the status quo is really
bad, and therefore my proposal is good". That line of argument is
suspect for good reason.

But if my proposal isn't good enough, and we don't have a clear
alternative, we need to think seriously about how much we've
collectively over-promised and under-delivered on the concept of
privilege separation.

Absent a better idea, we need to figure out a way to un-promise what we
can't do and somehow guide users towards safe practices. For instance,
don't grant the INSERT or UPDATE privilege if the table uses functions
in index expressions or constraints. Also don't touch any table unless
the onwer has SET ROLE privileges on your role already, or the
operation is part of a special carve out (logical replication or a
maintenance command). And don't use the predefined role
pg_write_all_data, because that's unsafe for most imaginable use cases.

Regards,
Jeff Davis

#18

Robert Haas

robertmhaas@gmail.com

over 2 years ago

In reply to: Jeff Davis (#17)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, Sep 18, 2023 at 4:51 PM Jeff Davis <pgsql@j-davis.com> wrote:

I don't want to make an argument of the form "the status quo is really
bad, and therefore my proposal is good". That line of argument is
suspect for good reason.

+1.

But if my proposal isn't good enough, and we don't have a clear
alternative, we need to think seriously about how much we've
collectively over-promised and under-delivered on the concept of
privilege separation.

Absent a better idea, we need to figure out a way to un-promise what we
can't do and somehow guide users towards safe practices. For instance,
don't grant the INSERT or UPDATE privilege if the table uses functions
in index expressions or constraints. Also don't touch any table unless
the onwer has SET ROLE privileges on your role already, or the
operation is part of a special carve out (logical replication or a
maintenance command). And don't use the predefined role
pg_write_all_data, because that's unsafe for most imaginable use cases.

I agree this is a mess, and that documenting the mess better would be
good. But instead of saying not to do something, we need to say what
will happen if you do the thing. I'm regularly annoyed when somebody
reports that "I tried to do X and it didn't work," instead of saying
what happened when they tried, and this situation is another form of
the same thing. "If you do X, then Y will or can occur" is much better
than "do not do X". And I think better documentation of this area
would be useful regardless of any other improvements that we may or
may not make. Indeed, really good documentation of this area might
facilitate making further improvements by highlighting some of the
problems so that they can more easily be understood by a wider
audience. I fear it will be hard to come up with something that is
clear, that highlights the severity of the problems, and that does not
veer off into useless vitriol against the status quo, but if we can
get there, that would be good.

But, leaving that to one side, what technical options do we have on
the table, supposing that we want to do something that is useful but
not this exact thing?

I think one option is to somehow change the behavior around
search_path but in a different way than you've proposed. The most
radical option would be to make it not be a GUC any more. I think the
backward-compatibility implications of that would likely be
unpalatable to many, and the details of what we'd actually do are also
not clear, at least to me. For a function, I think there is a
reasonable argument that you just make it a function property, like
IMMUTABLE, as you said before. But for code that goes directly into
the session, where's the search_path supposed to come from? It's got
to be configured somewhere, and somehow that somewhere feels a lot
like a GUC. That leads to a second idea, which is having it continue
to be a GUC but only affect directly-entered SQL, with all
indirectly-entered SQL either being stored as a node tree or having a
search_path property attached somewhere. Or, as a third idea, suppose
we leave it a GUC but start breaking semantics around where and how
that GUC gets set, e.g. by changing CREATE FUNCTION to capture the
prevailing search_path by default unless instructed otherwise.
Personally I feel like we'd need pretty broad consensus for any of
these kinds of changes because it would break a lot of stuff for a lot
of people, but if we could get that then I think we could maybe emerge
in a better spot once the pain of the compatibility break receded.

Another option is something around sandboxing and/or function trust.
The idea here is to not care too much about the search_path behavior
itself, and instead focus on the consequences, namely what code is
getting executed as which user and perhaps what kinds of operations
it's performing. To me, this seems like a possibly easier answer way
forward at least in the short to medium term, because I think it will
break fewer things for fewer people, and if somebody doesn't like the
new behavior they can just say "well I trust everyone completely" and
it all goes back to the way it was. That said, I think there are
problems with my previous proposals on the other thread so I believe
some adjustments would be needed there, and then there's the problem
of actually implementing anything. I'll try to respond to your
comments on that thread soon.

Are there any other categories of things we can do? More specific
kinds of things we can do in each category? I don't really see an
option other than (1) "change something in the system design so that
people use search_path wrongly less often" or (2) "make it so that it
doesn't matter as much if people using the wrong search_path" but
maybe I'm missing a clever idea.

--
Robert Haas
EDB: http://www.enterprisedb.com

#19

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Robert Haas (#18)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, 2023-09-19 at 11:41 -0400, Robert Haas wrote:

I agree this is a mess, and that documenting the mess better would be
good. But instead of saying not to do something, we need to say what
will happen if you do the thing. I'm regularly annoyed when somebody
reports that "I tried to do X and it didn't work," instead of saying
what happened when they tried, and this situation is another form of
the same thing. "If you do X, then Y will or can occur" is much
better
than "do not do X".

Good documentation includes some guidance. Sure, it should describe the
system behavior, but without anchoring it to some kind of expected use
case, it can be equally frustrating.

Allow me to pick on this example which came up in a recent thread:

"[password_required] Specifies whether connections to the publisher
made as a result of this subscription must use password authentication.
This setting is ignored when the subscription is owned by a superuser.
The default is true. Only superusers can set this value to false."
-- https://www.postgresql.org/docs/16/sql-createsubscription.html

Only superusers can set it, and it's ignored for superusers. That does
a good job of describing the actual behavior, but is a bit puzzling.

I guess what the user is supposed to do is one of:
1. Create a subscription as a superuser with the right connection
string (without a password) and password_required=false, then reassign
it to a non-superuser; or
2. Create a subscription as a non-superuser member of
pg_create_subscription using a bogus connection string, then a
superuser can alter it to set password_required=false, then alter the
connection string; or
3. Create a superuser, let the new superuser create a subscription
with password_required=false, and then remove their superuser status.

so why not just document one of those things as the expected thing to
do? Not a whole section or anything, but a sentence to suggest what
they should do or where else they should look.

I don't mean to set some major new standard in the documentation that
should apply to everything; but for the privilege system, even hackers
are having trouble keeping up (myself included). A bit of guidance
toward supported use cases helps a lot.

I fear it will be hard to come up with something that is
clear, that highlights the severity of the problems, and that does
not
veer off into useless vitriol against the status quo, but if we can
get there, that would be good.

I hope what I'm saying is not useless vitriol. I am offering the best
solutions I see in a bad situation. And I believe I've uncovered some
emergent behaviors that are not well-understood even among prominent
hackers.

That leads to a second idea, which is having it continue
to be a GUC but only affect directly-entered SQL, with all
indirectly-entered SQL either being stored as a node tree or having a
search_path property attached somewhere.

That's not too far from the proposed patch and I'd certainly be
interested to hear more and/or adapt my patch towards this idea.

Or, as a third idea, suppose
we leave it a GUC but start breaking semantics around where and how
that GUC gets set, e.g. by changing CREATE FUNCTION to capture the
prevailing search_path by default unless instructed otherwise.

How would one instruct otherwise?

Personally I feel like we'd need pretty broad consensus for any of
these kinds of changes

because it would break a lot of stuff for a lot
of people, but if we could get that then I think we could maybe
emerge
in a better spot once the pain of the compatibility break receded.

Are there ways we can soften this a bit? I know compatibility GUCs are
not to be added lightly, but perhaps one is justified here?

Another option is something around sandboxing and/or function trust.
The idea here is to not care too much about the search_path behavior
itself, and instead focus on the consequences, namely what code is
getting executed as which user and perhaps what kinds of operations
it's performing.

I'm open to discussing that further, and it certainly may solve some
problems, but it does not seem to solve the fundamental problem with
search_path: that the caller can (intentionally or unintentionally)
cause a function to do unexpected things.

Sometimes an unexpected thing is not a the kind of thing that would be
caught by a sandbox, e.g. just an unexpected function result. But if
that function is used in a constraint or expression index, that
unexpected result can lead to a violated constraint or a bad index
(that will later cause wrong results). The owner of the table might
reasonably consider that a privilege problem, if the user who causes
the trouble had only INSERT privileges.

Are there any other categories of things we can do? More specific
kinds of things we can do in each category? I don't really see an
option other than (1) "change something in the system design so that
people use search_path wrongly less often" or (2) "make it so that it
doesn't matter as much if people using the wrong search_path" but
maybe I'm missing a clever idea.

Perhaps there are some clever ideas about maintaining compatibility
within the approaches (1) or (2), which might make one of them more
appealing.

Regards,
Jeff Davis

#20

Maciek Sakrejda

m.sakrejda@gmail.com

over 2 years ago

In reply to: Jeff Davis (#19)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, Sep 19, 2023 at 5:56 PM Jeff Davis <pgsql@j-davis.com> wrote:

...
On Tue, 2023-09-19 at 11:41 -0400, Robert Haas wrote:

That leads to a second idea, which is having it continue
to be a GUC but only affect directly-entered SQL, with all
indirectly-entered SQL either being stored as a node tree or having a
search_path property attached somewhere.

That's not too far from the proposed patch and I'd certainly be
interested to hear more and/or adapt my patch towards this idea.

As an interested bystander, that's the same thing I was thinking when
reading this. I reread your original e-mail, Jeff, and I still think
that.

I wonder if something like CURRENT (i.e., the search path at function
creation time) might be a useful keyword addition. I can see some uses
(more forgiving than SYSTEM but not as loose as SESSION), but I don't
know if it would justify its presence.

Thanks for working on this.

Thanks,
Maciek

#21

Maciek Sakrejda

m.sakrejda@gmail.com

over 2 years ago

In reply to: Maciek Sakrejda (#20)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, Sep 19, 2023, 20:23 Maciek Sakrejda <m.sakrejda@gmail.com> wrote:

I wonder if something like CURRENT (i.e., the search path at function
creation time) might be a useful keyword addition. I can see some uses
(more forgiving than SYSTEM but not as loose as SESSION), but I don't
know if it would justify its presence.

I realize now this is exactly what SET search_path FROM CURRENT does. Sorry
for the noise.

Regarding extensions installed in the public schema throwing a wrench in
the works, is that still a problem if the public schema is not writable? I
know that that's a new default, but it won't be forever.

#22

Pavel Stehule

pavel.stehule@gmail.com

over 2 years ago

In reply to: Maciek Sakrejda (#20)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

st 20. 9. 2023 v 9:34 odesílatel Maciek Sakrejda <m.sakrejda@gmail.com>
napsal:

On Tue, Sep 19, 2023 at 5:56 PM Jeff Davis <pgsql@j-davis.com> wrote:

...
On Tue, 2023-09-19 at 11:41 -0400, Robert Haas wrote:

That leads to a second idea, which is having it continue
to be a GUC but only affect directly-entered SQL, with all
indirectly-entered SQL either being stored as a node tree or having a
search_path property attached somewhere.

That's not too far from the proposed patch and I'd certainly be
interested to hear more and/or adapt my patch towards this idea.

As an interested bystander, that's the same thing I was thinking when
reading this. I reread your original e-mail, Jeff, and I still think
that.

I wonder if something like CURRENT (i.e., the search path at function
creation time) might be a useful keyword addition. I can see some uses
(more forgiving than SYSTEM but not as loose as SESSION), but I don't
know if it would justify its presence.

Personally, I dislike this - because the value of the search path is hidden
in this case.

I agree so it can be comfortable, but it can be confusing for review,
migration, ...

Regards

Pavel

Show quoted text

Thanks for working on this.

Thanks,
Maciek

#23

Robert Haas

robertmhaas@gmail.com

over 2 years ago

In reply to: Jeff Davis (#19)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, Sep 19, 2023 at 7:56 PM Jeff Davis <pgsql@j-davis.com> wrote:

Good documentation includes some guidance. Sure, it should describe the
system behavior, but without anchoring it to some kind of expected use
case, it can be equally frustrating.

Fair.

I don't mean to set some major new standard in the documentation that
should apply to everything; but for the privilege system, even hackers
are having trouble keeping up (myself included). A bit of guidance
toward supported use cases helps a lot.

Yeah, this stuff is complicated, and I agree that it's hard even for
hackers to keep up with. I don't really have a strong view on the
concrete case you mentioned involving password_required. I always
worry that if there are three cases and we suggest one of them then
the others will be viewed negatively when really they're all equally
fine. On the other hand, that can often be addressed by starting the
sentence with "For example, you could...." or similar, so perhaps
there's no problem here at all. I generally agree with the idea that
examples can be useful for clarifying points that may otherwise be too
theoretical.

I hope what I'm saying is not useless vitriol. I am offering the best
solutions I see in a bad situation. And I believe I've uncovered some
emergent behaviors that are not well-understood even among prominent
hackers.

Yeah, I wasn't really intending to say that you were. I just get
nervous about statements like "don't ever do X!" because I find it
completely unconvincing. In my experience, when you tell people stuff
like that, some of them just go off and do it anyway and, especially
in a case like this, chances are very good that nothing bad will ever
happen to them, simply because most PostgreSQL installations don't
have malicious local users. When you tell them "but that's really bad"
they say "why?" and if the documentation doesn't have an answer to the
question well then that sucks.

because it would break a lot of stuff for a lot
of people, but if we could get that then I think we could maybe
emerge
in a better spot once the pain of the compatibility break receded.

Are there ways we can soften this a bit? I know compatibility GUCs are
not to be added lightly, but perhaps one is justified here?

I don't know. I'm skeptical. This behavior is so complicated and hard
to get right. Having it GUC-dependent makes it even more confusing
than it is already. But I guess it also depends on what the GUC does.

Let's say we make a rule that every function or procedure has to have
a search_path attached to it as a function property. That is, CREATE
FUNCTION .. SEARCH something sets pg_proc.prosearch = 'something'. If
you omit the SEARCH clause, one is implicitly supplied for you. If you
say SEARCH NULL, then the function is executed with the search_path
taken from the GUC; SEARCH 'anything_else' specified a literal
search_path to be used.

In such a world, I can imagine having a GUC that determines whether
the implicitly supplied SEARCH clause is SEARCH
${WHATEVER_THE_SEARCH_PATH_IS_RIGHT_NOW} or SEARCH NULL. Such a GUC
only takes effect at CREATE FUNCTION time. However, I cannot imagine
having a GUC that causes the SEARCH clauses attached to all functions
to be globally ignored at execution time. That seems like a choice we
would likely regret bitterly. The first thing is already painful, but
the second one is exponentially worse, because in the first world, you
have to be careful to get your functions defined correctly, but if you
do, you know they'll run OK on any PostgreSQL cluster anywhere,
whereas in the second world, there's no way to define a function that
behaves the same way on every PostgreSQL instance. Imagine being an
extension author, for example.

I am a little worried that this kind of design might end up reversing
the direction of some security problems that we have now. For
instance, right now, if you call a function with a SET search_path
clause, you know that it can't make any changes to search_path that
survive function exit. You'll get your old search_path back. With this
kind of design, it seems like it would be a lot easier to get back to
the SQL toplevel and find the search_path surprisingly changed under
you. I think we have that problem in some cases already, though. I'm
unclear how much worse this makes it.

Another option is something around sandboxing and/or function trust.
The idea here is to not care too much about the search_path behavior
itself, and instead focus on the consequences, namely what code is
getting executed as which user and perhaps what kinds of operations
it's performing.

I'm open to discussing that further, and it certainly may solve some
problems, but it does not seem to solve the fundamental problem with
search_path: that the caller can (intentionally or unintentionally)
cause a function to do unexpected things.

Well, I think it's meant to solve that problem. How effectively it
does so is a point worth debating.

Sometimes an unexpected thing is not a the kind of thing that would be
caught by a sandbox, e.g. just an unexpected function result. But if
that function is used in a constraint or expression index, that
unexpected result can lead to a violated constraint or a bad index
(that will later cause wrong results). The owner of the table might
reasonably consider that a privilege problem, if the user who causes
the trouble had only INSERT privileges.

That's an interesting example. Earlier versions of the function trust
proposal proposed to block *any* execution of code belonging to an
untrusted party. That could potentially block this attack. However, it
would also block a lot of other things. For instance, if Alice tries
to insert into Bob's table and Bob's table has a CHECK constraint or
an index expression, Alice has to trust Bob or she can't insert
anything at all. By trusting Bob just enough to allow him do things
like CHECK(LENGTH(foo) < 10) or whatever, Alice can operate on Bob's
table without a problem in normal cases, but is still protected if Bob
suddenly starts doing something sneaky. I think that's a significant
improvement, because a system that is so stringent that it blocks even
completely harmless things is likely to get disabled, at which point
it protects nobody from anything.

However, that analysis presumes that what we're trying to do is
protect Alice from Bob, and I think you're raising the question of how
we protect Bob from Alice. Suppose Bob has got a trigger function but
has failed to control search_path for that function. Alice can set
search_path so that Bob's trigger calls some function or operator that
she owns instead of the intended call to, say, a system function or
operator. Some sufficiently-rigid function trust system could catch
this: Bob doesn't trust Alice, and so the fact that his code is trying
to call some a function or operator owned by Alice is a red flag. On
the basis of the fact that Bob doesn't trust Alice, we should error
out to protect Bob. Had the search_path been set in the expected way,
Bob would have been trying to call a superuser-owned function, and Bob
must trust the superuser, so the operation is permitted.

I wouldn't have a problem with a function-trust proposal that
incorporated a mode that rigid as a configuration option. I find this
a convincing example of how that could be useful. But such a mode has
pretty serious downsides, too. It makes it very difficult for one user
to interact with another user's objects in any way without triggering
security errors.

Also, in a case like this, I don't think it's unreasonable to ask
whether, perhaps, Bob just needs to be a little more careful about
setting search_path. I think that there is a big difference between
(a) defining a SQL-language function that is accessible to multiple
users and (b) inserting a row into a table you don't own. When you
define a function, you know people are potentially going to call it.
Asking you, as the function author, to take some care to secure your
function against a malicious search_path doesn't seem like an
unsupportable burden. After all, you control the definition of that
function. The problem with inserting a row into a table you don't own
is that all of the objects involved -- the table itself, its indexes,
its triggers, its defaults, its constraints -- are owned by somebody
else, and that user controls those objects and can change any of them
at any time. You can't really be expected to verify that all code
reachable as a result of an INSERT into the table is safe enough
before every INSERT into that table. You can, I think, be expected to
check that functions you define have SET search_path attached.

Are there any other categories of things we can do? More specific
kinds of things we can do in each category? I don't really see an
option other than (1) "change something in the system design so that
people use search_path wrongly less often" or (2) "make it so that it
doesn't matter as much if people using the wrong search_path" but
maybe I'm missing a clever idea.

Perhaps there are some clever ideas about maintaining compatibility
within the approaches (1) or (2), which might make one of them more
appealing.

Indeed.

--
Robert Haas
EDB: http://www.enterprisedb.com

#24

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Maciek Sakrejda (#20)

1 attachment(s)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, 2023-09-19 at 20:23 -0700, Maciek Sakrejda wrote:

On Tue, Sep 19, 2023 at 5:56 PM Jeff Davis <pgsql@j-davis.com> wrote:

...
On Tue, 2023-09-19 at 11:41 -0400, Robert Haas wrote:

That leads to a second idea, which is having it continue
to be a GUC but only affect directly-entered SQL, with all
indirectly-entered SQL either being stored as a node tree or
having a
search_path property attached somewhere.

That's not too far from the proposed patch and I'd certainly be
interested to hear more and/or adapt my patch towards this idea.

As an interested bystander, that's the same thing I was thinking when
reading this. I reread your original e-mail, Jeff, and I still think
that.

I have attached an updated patch. Changes:

* Syntax is now: SEARCH FROM { DEFAULT | TRUSTED | SESSION }
- added "FROM" to suggest that it's the source, and only a starting
place, rather than a specific and final setting. I don't feel strongly
about the FROM one way or another, so I can take it out if it's not
helpful.
- changed "SYSTEM" to "TRUSTED", which better reflects the purpose,
and doesn't suggest any connection to ALTER SYSTEM.
* Removed GUC -- we can reconsider this kind of thing later.
* ERROR if IMMUTABLE is combined with SEARCH FROM SESSION
* pg_dump support. Emits "SEARCH FROM SESSION" or "SEARCH FROM
TRUSTED" only if explicitly specified; otherwise emits no SEARCH
clause. Differentiating the unspecified cases may be useful for
migration purposes later.
* psql support.
* Updated docs to try to better present the concept, and document
CREATE PROCEDURE as well.

The SEARCH clause declares a new property that will be useful to both
enforce safety and also to guide users to migrate in a safe direction
over time.

For instance, the current patch prohibits the combination of IMMUTABLE
and SEARCH FROM SESSION; but allows IMMUTABLE if no SEARCH clause is
specified at all (to avoid breaking anything). We could extend that
slowly over several releases ratchet up the pressure (with warnings or
changing defaults) until all IMMUTABLE functions require SEARCH FROM
TRUSTED. Perhaps IMMUTABLE would even imply SEARCH FROM TRUSTED.

The search property is consistent with other properties, like
IMMUTABLE, which is both a marker and also enforces some restrictions
(e.g. you can't CREATE TABLE). It's also a lot nicer to use than a SET
clause, and provides a nice place to document certain behaviors.

(Aside: the concept of IMMUTABLE is basically broken today, due to
search_path problems.)

SEARCH FROM DEFAULT is just a way to get an object back to the
"unspecified search clause" state. It has the same behavior as SEARCH
FROM SESSION, except that the former will cause a hard error when
combined with IMMUTABLE. I think it's worth differentiating the
unspecified search clause from the explicit SEARCH FROM SESSION clause
for the purposes of migration.

There were three main complaints:

Comaplaint A: That it creates a new mechanism[1]/messages/by-id/CA+TgmoaRPJJN=AOqC4b9t90vFQX81hKiXNPNhbxR0-Sm8F8nCA@mail.gmail.com.

The patch doesn't create a new internal mechanism, it almost entirely
reuses the existing SET clause mechanism. I think complaint A is really
about the user-facing mechanics, which is essentially the same as the
complaint B.

Complaint B: That it's overlapping in functionality with the SET
clause[2]/messages/by-id/CA+Tgmoah_bTjUFng-vZnivPQs0kQWUaSwAu49SU5M+zTxA+3Qw@mail.gmail.com[3]/messages/by-id/15464811-18fb-c7d4-4620-873366d367d6@eisentraut.org. In other words:

CREATE FUNCTION ... SEARCH FROM TRUSTED ...;
CREATE FUNCTION ... SET search_path = pg_catalog, pg_temp ...;

do similar things. But the latter is much worse:

* it's user-unfriendly (requiring pg_temp is highly unintuitive)
* it doesn't allow Postgres to warn if the function is being used in
an unsafe context
* there's no way to explicitly declare that you want the search path
to come from the session instead (either to be more clear about your
intentions, or to be forward-compatible)

In my opinion, the "SET search_path = ..." form should be used when you
actually want the search_path to contain some specific schema, not in
cases where you're just using built-in objects.

Complaint C: search_path is hopeless[4]/messages/by-id/20230812182559.d7plqwx3p65ys4i7@awork3.anarazel.de.

I think we can make meaningful improvements to the status quo, like
with the attached patch, that will reduce a lot of the surface area for
security risks. Right now our privilege model breaks down very quickly
even with trivial and typical use cases and we can do better.

--
Jeff Davis
PostgreSQL Contributor Team - AWS

[1]: /messages/by-id/CA+TgmoaRPJJN=AOqC4b9t90vFQX81hKiXNPNhbxR0-Sm8F8nCA@mail.gmail.com
/messages/by-id/CA+TgmoaRPJJN=AOqC4b9t90vFQX81hKiXNPNhbxR0-Sm8F8nCA@mail.gmail.com
[2]: /messages/by-id/CA+Tgmoah_bTjUFng-vZnivPQs0kQWUaSwAu49SU5M+zTxA+3Qw@mail.gmail.com
/messages/by-id/CA+Tgmoah_bTjUFng-vZnivPQs0kQWUaSwAu49SU5M+zTxA+3Qw@mail.gmail.com
[3]: /messages/by-id/15464811-18fb-c7d4-4620-873366d367d6@eisentraut.org
/messages/by-id/15464811-18fb-c7d4-4620-873366d367d6@eisentraut.org
[4]: /messages/by-id/20230812182559.d7plqwx3p65ys4i7@awork3.anarazel.de
/messages/by-id/20230812182559.d7plqwx3p65ys4i7@awork3.anarazel.de

Attachments:

v2-0001-CREATE-FUNCTION-.-SEARCH-FROM-DEFAULT-TRUSTED-SES.patchtext/x-patch; charset=UTF-8; name=v2-0001-CREATE-FUNCTION-.-SEARCH-FROM-DEFAULT-TRUSTED-SES.patchDownload

From 1cdb1e604535e7e0e1c3a1d8c9d38d99c9d42ba3 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Fri, 11 Aug 2023 14:33:36 -0700
Subject: [PATCH v2] CREATE FUNCTION ... SEARCH FROM { DEFAULT | TRUSTED |
 SESSION }.

Declare how search_path will be initialized before execution of a
function or procedure. SEARCH FROM TRUSTED causes the search_path to
be initialized to a safe value that includes built-in objects. SEARCH
FROM SESSION causes the search_path to be inherited from the current
setting in the session (or current setting in the calling function or
procedure).

The SEARCH clause specifies a new function property that may be used
in the future to check for safe usage of functions in, e.g., index
expressions.

Discussion: https://postgr.es/m/2710f56add351a1ed553efb677408e51b060e67c.camel%40j-davis.com
---
 doc/src/sgml/ref/alter_function.sgml          | 15 +++++
 doc/src/sgml/ref/alter_procedure.sgml         | 15 +++++
 doc/src/sgml/ref/alter_routine.sgml           |  1 +
 doc/src/sgml/ref/create_function.sgml         | 55 +++++++++++++++
 doc/src/sgml/ref/create_procedure.sgml        | 55 +++++++++++++++
 src/backend/catalog/pg_aggregate.c            |  1 +
 src/backend/catalog/pg_proc.c                 |  2 +
 src/backend/commands/functioncmds.c           | 56 +++++++++++++++-
 src/backend/commands/typecmds.c               |  4 ++
 src/backend/optimizer/util/clauses.c          |  1 +
 src/backend/parser/gram.y                     | 15 +++++
 src/backend/utils/fmgr/fmgr.c                 | 23 +++++--
 src/bin/pg_dump/pg_dump.c                     | 18 ++++-
 src/bin/psql/describe.c                       | 23 ++++++-
 src/include/catalog/namespace.h               |  6 ++
 src/include/catalog/pg_proc.h                 | 13 ++++
 .../regress/expected/create_function_sql.out  | 67 +++++++++++++++++++
 src/test/regress/expected/psql.out            | 12 ++--
 src/test/regress/sql/create_function_sql.sql  | 44 ++++++++++++
 19 files changed, 407 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/alter_function.sgml b/doc/src/sgml/ref/alter_function.sgml
index 8193b17f25..b4103c0c90 100644
--- a/doc/src/sgml/ref/alter_function.sgml
+++ b/doc/src/sgml/ref/alter_function.sgml
@@ -37,6 +37,7 @@ ALTER FUNCTION <replaceable>name</replaceable> [ ( [ [ <replaceable class="param
     CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT
     IMMUTABLE | STABLE | VOLATILE
     [ NOT ] LEAKPROOF
+    SEARCH FROM { DEFAULT | TRUSTED | SESSION }
     [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER
     PARALLEL { UNSAFE | RESTRICTED | SAFE }
     COST <replaceable class="parameter">execution_cost</replaceable>
@@ -198,6 +199,20 @@ ALTER FUNCTION <replaceable>name</replaceable> [ ( [ [ <replaceable class="param
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>SEARCH FROM DEFAULT</literal></term>
+    <term><literal>SEARCH FROM TRUSTED</literal></term>
+    <term><literal>SEARCH FROM SESSION</literal></term>
+
+    <listitem>
+      <para>
+       Change the <literal>SEARCH</literal> behavior of the function to the
+       specified setting.  See <xref linkend="sql-createfunction"/> for
+       details.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal><optional> EXTERNAL </optional> SECURITY INVOKER</literal></term>
     <term><literal><optional> EXTERNAL </optional> SECURITY DEFINER</literal></term>
diff --git a/doc/src/sgml/ref/alter_procedure.sgml b/doc/src/sgml/ref/alter_procedure.sgml
index a4737a3439..63bc6d695b 100644
--- a/doc/src/sgml/ref/alter_procedure.sgml
+++ b/doc/src/sgml/ref/alter_procedure.sgml
@@ -34,6 +34,7 @@ ALTER PROCEDURE <replaceable>name</replaceable> [ ( [ [ <replaceable class="para
 
 <phrase>where <replaceable class="parameter">action</replaceable> is one of:</phrase>
 
+    SEARCH FROM { DEFAULT | TRUSTED | SESSION }
     [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER
     SET <replaceable class="parameter">configuration_parameter</replaceable> { TO | = } { <replaceable class="parameter">value</replaceable> | DEFAULT }
     SET <replaceable class="parameter">configuration_parameter</replaceable> FROM CURRENT
@@ -158,6 +159,20 @@ ALTER PROCEDURE <replaceable>name</replaceable> [ ( [ [ <replaceable class="para
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>SEARCH FROM DEFAULT</literal></term>
+    <term><literal>SEARCH FROM TRUSTED</literal></term>
+    <term><literal>SEARCH FROM SESSION</literal></term>
+
+    <listitem>
+      <para>
+       Change the <literal>SEARCH</literal> behavior of the function to the
+       specified setting.  See <xref linkend="sql-createprocedure"/> for
+       details.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal><optional> EXTERNAL </optional> SECURITY INVOKER</literal></term>
     <term><literal><optional> EXTERNAL </optional> SECURITY DEFINER</literal></term>
diff --git a/doc/src/sgml/ref/alter_routine.sgml b/doc/src/sgml/ref/alter_routine.sgml
index d6c9dea2eb..f8575b6cd0 100644
--- a/doc/src/sgml/ref/alter_routine.sgml
+++ b/doc/src/sgml/ref/alter_routine.sgml
@@ -36,6 +36,7 @@ ALTER ROUTINE <replaceable>name</replaceable> [ ( [ [ <replaceable class="parame
 
     IMMUTABLE | STABLE | VOLATILE
     [ NOT ] LEAKPROOF
+    SEARCH FROM { DEFAULT | TRUSTED | SESSION }
     [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER
     PARALLEL { UNSAFE | RESTRICTED | SAFE }
     COST <replaceable class="parameter">execution_cost</replaceable>
diff --git a/doc/src/sgml/ref/create_function.sgml b/doc/src/sgml/ref/create_function.sgml
index 863d99d1fc..0909bd9c06 100644
--- a/doc/src/sgml/ref/create_function.sgml
+++ b/doc/src/sgml/ref/create_function.sgml
@@ -31,6 +31,7 @@ CREATE [ OR REPLACE ] FUNCTION
     | { IMMUTABLE | STABLE | VOLATILE }
     | [ NOT ] LEAKPROOF
     | { CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT }
+    | SEARCH FROM { DEFAULT | TRUSTED | SESSION }
     | { [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER }
     | PARALLEL { UNSAFE | RESTRICTED | SAFE }
     | COST <replaceable class="parameter">execution_cost</replaceable>
@@ -402,6 +403,60 @@ CREATE [ OR REPLACE ] FUNCTION
      </listitem>
     </varlistentry>
 
+    <varlistentry>
+     <term><literal>SEARCH FROM DEFAULT</literal></term>
+     <term><literal>SEARCH FROM TRUSTED</literal></term>
+     <term><literal>SEARCH FROM SESSION</literal></term>
+
+     <listitem>
+      <para>
+       Declare how <xref linkend="guc-search-path"/> will be initialized
+       before execution. <literal>SEARCH FROM TRUSTED</literal> causes the
+       search_path to be initialized to a safe value that includes built-in
+       objects. <literal>SEARCH FROM SESSION</literal> causes the search_path
+       to be inherited from the current setting in the session (or
+       current setting in the calling function or procedure).
+      </para>
+      <warning>
+       <para>
+        If <literal>search_path</literal> is inherited from the session, the
+        caller of the function or procedure may be able to manipulate its
+        behavior.
+       </para>
+      </warning>
+      <para>
+       Regardless of the <literal>SEARCH</literal> clause, the
+       <literal>search_path</literal> may be changed later just like any other
+       GUC, e.g. by using a <literal>SET</literal> clause in the declaration
+       or a <xref linkend="sql-set"/> command during execution.
+      </para>
+      <para>
+       If the function or procedure is specified with
+       <replaceable>sql_body</replaceable>, the <literal>SEARCH</literal>
+       clause does not affect the <literal>search_path</literal> used to find
+       objects referenced in the <replaceable>sql_body</replaceable> because
+       the it's parsed at definition time.
+      </para>
+      <para>
+       <literal>SEARCH FROM DEFAULT</literal> is the same as not specifying
+       any <literal>SEARCH</literal> clause at all, and the behavior is
+       equivalent to <literal>SEARCH FROM SESSION</literal>.
+      </para>
+      <note>
+       <para>
+        The <literal>search_path</literal> setting is important for functions
+        or procedures that parse statements at execution time (such as
+        PL/pgSQL, SQL specified with a <replaceable>definition</replaceable>,
+        or those that use the <link linkend="spi">Server Programming
+        Interface</link>), or those that call other functions or procedures
+        that parse statements at execution time.
+       </para>
+       <para>
+       </para>
+      </note>
+     </listitem>
+    </varlistentry>
+
    <varlistentry>
     <term><literal><optional>EXTERNAL</optional> SECURITY INVOKER</literal></term>
     <term><literal><optional>EXTERNAL</optional> SECURITY DEFINER</literal></term>
diff --git a/doc/src/sgml/ref/create_procedure.sgml b/doc/src/sgml/ref/create_procedure.sgml
index 03a14c8684..c1c63f0527 100644
--- a/doc/src/sgml/ref/create_procedure.sgml
+++ b/doc/src/sgml/ref/create_procedure.sgml
@@ -25,6 +25,7 @@ CREATE [ OR REPLACE ] PROCEDURE
     <replaceable class="parameter">name</replaceable> ( [ [ <replaceable class="parameter">argmode</replaceable> ] [ <replaceable class="parameter">argname</replaceable> ] <replaceable class="parameter">argtype</replaceable> [ { DEFAULT | = } <replaceable class="parameter">default_expr</replaceable> ] [, ...] ] )
   { LANGUAGE <replaceable class="parameter">lang_name</replaceable>
     | TRANSFORM { FOR TYPE <replaceable class="parameter">type_name</replaceable> } [, ... ]
+    | SEARCH FROM { DEFAULT | TRUSTED | SESSION }
     | [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER
     | SET <replaceable class="parameter">configuration_parameter</replaceable> { TO <replaceable class="parameter">value</replaceable> | = <replaceable class="parameter">value</replaceable> | FROM CURRENT }
     | AS '<replaceable class="parameter">definition</replaceable>'
@@ -193,6 +194,60 @@ CREATE [ OR REPLACE ] PROCEDURE
      </listitem>
     </varlistentry>
 
+    <varlistentry>
+     <term><literal>SEARCH FROM DEFAULT</literal></term>
+     <term><literal>SEARCH FROM TRUSTED</literal></term>
+     <term><literal>SEARCH FROM SESSION</literal></term>
+
+     <listitem>
+      <para>
+       Declare how <xref linkend="guc-search-path"/> will be initialized
+       before execution. <literal>SEARCH FROM TRUSTED</literal> causes the
+       search_path to be initialized to a safe value that includes built-in
+       objects. <literal>SEARCH FROM SESSION</literal> causes the search_path
+       to be inherited from the current setting in the session (or
+       current setting in the calling function or procedure).
+      </para>
+      <warning>
+       <para>
+        If <literal>search_path</literal> is inherited from the session, the
+        caller of the function or procedure may be able to manipulate its
+        behavior.
+       </para>
+      </warning>
+      <para>
+       Regardless of the <literal>SEARCH</literal> clause, the
+       <literal>search_path</literal> may be changed later just like any other
+       GUC, e.g. by using a <literal>SET</literal> clause in the declaration
+       or a <xref linkend="sql-set"/> command during execution.
+      </para>
+      <para>
+       If the function or procedure is specified with
+       <replaceable>sql_body</replaceable>, the <literal>SEARCH</literal>
+       clause does not affect the <literal>search_path</literal> used to find
+       objects referenced in the <replaceable>sql_body</replaceable> because
+       the it's parsed at definition time.
+      </para>
+      <para>
+       <literal>SEARCH FROM DEFAULT</literal> is the same as not specifying
+       any <literal>SEARCH</literal> clause at all, and the behavior is
+       equivalent to <literal>SEARCH FROM SESSION</literal>.
+      </para>
+      <note>
+       <para>
+        The <literal>search_path</literal> setting is important for functions
+        or procedures that parse statements at execution time (such as
+        PL/pgSQL, SQL specified with a <replaceable>definition</replaceable>,
+        or those that use the <link linkend="spi">Server Programming
+        Interface</link>), or those that call other functions or procedures
+        that parse statements at execution time.
+       </para>
+       <para>
+       </para>
+      </note>
+     </listitem>
+    </varlistentry>
+
    <varlistentry>
     <term><literal><optional>EXTERNAL</optional> SECURITY INVOKER</literal></term>
     <term><literal><optional>EXTERNAL</optional> SECURITY DEFINER</literal></term>
diff --git a/src/backend/catalog/pg_aggregate.c b/src/backend/catalog/pg_aggregate.c
index ebc4454743..e4ec68ec43 100644
--- a/src/backend/catalog/pg_aggregate.c
+++ b/src/backend/catalog/pg_aggregate.c
@@ -624,6 +624,7 @@ AggregateCreate(const char *aggName,
 							 NULL,	/* probin */
 							 NULL,	/* prosqlbody */
 							 PROKIND_AGGREGATE,
+							 PROSEARCH_DEFAULT, /* no SEARCH clause for aggregates */
 							 false, /* security invoker (currently not
 									 * definable for agg) */
 							 false, /* isLeakProof */
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index b5fd364003..8a252acd99 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -81,6 +81,7 @@ ProcedureCreate(const char *procedureName,
 				const char *probin,
 				Node *prosqlbody,
 				char prokind,
+				char prosearch,
 				bool security_definer,
 				bool isLeakProof,
 				bool isStrict,
@@ -308,6 +309,7 @@ ProcedureCreate(const char *procedureName,
 	values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType);
 	values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(prosupport);
 	values[Anum_pg_proc_prokind - 1] = CharGetDatum(prokind);
+	values[Anum_pg_proc_prosearch - 1] = CharGetDatum(prosearch);
 	values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer);
 	values[Anum_pg_proc_proleakproof - 1] = BoolGetDatum(isLeakProof);
 	values[Anum_pg_proc_proisstrict - 1] = BoolGetDatum(isStrict);
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 7ba6a86ebe..1f086b84a5 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -507,6 +507,7 @@ compute_common_attribute(ParseState *pstate,
 						 DefElem *defel,
 						 DefElem **volatility_item,
 						 DefElem **strict_item,
+						 DefElem **search_item,
 						 DefElem **security_item,
 						 DefElem **leakproof_item,
 						 List **set_items,
@@ -533,6 +534,13 @@ compute_common_attribute(ParseState *pstate,
 
 		*strict_item = defel;
 	}
+	else if (strcmp(defel->defname, "search") == 0)
+	{
+		if (*search_item)
+			errorConflictingDefElem(defel, pstate);
+
+		*search_item = defel;
+	}
 	else if (strcmp(defel->defname, "security") == 0)
 	{
 		if (*security_item)
@@ -603,6 +611,24 @@ procedure_error:
 	return false;
 }
 
+static char
+interpret_func_search(DefElem *defel)
+{
+	char	   *str = strVal(defel->arg);
+
+	if (strcmp(str, "default") == 0)
+		return PROSEARCH_DEFAULT;
+	else if (strcmp(str, "trusted") == 0)
+		return PROSEARCH_TRUSTED;
+	else if (strcmp(str, "session") == 0)
+		return PROSEARCH_SESSION;
+	else
+	{
+		elog(ERROR, "invalid search \"%s\"", str);
+		return 0;				/* keep compiler quiet */
+	}
+}
+
 static char
 interpret_func_volatility(DefElem *defel)
 {
@@ -725,6 +751,7 @@ compute_function_attributes(ParseState *pstate,
 							bool *windowfunc_p,
 							char *volatility_p,
 							bool *strict_p,
+							char *search_p,
 							bool *security_definer,
 							bool *leakproof_p,
 							ArrayType **proconfig,
@@ -740,6 +767,7 @@ compute_function_attributes(ParseState *pstate,
 	DefElem    *windowfunc_item = NULL;
 	DefElem    *volatility_item = NULL;
 	DefElem    *strict_item = NULL;
+	DefElem    *search_item = NULL;
 	DefElem    *security_item = NULL;
 	DefElem    *leakproof_item = NULL;
 	List	   *set_items = NIL;
@@ -786,6 +814,7 @@ compute_function_attributes(ParseState *pstate,
 										  defel,
 										  &volatility_item,
 										  &strict_item,
+										  &search_item,
 										  &security_item,
 										  &leakproof_item,
 										  &set_items,
@@ -814,6 +843,8 @@ compute_function_attributes(ParseState *pstate,
 		*volatility_p = interpret_func_volatility(volatility_item);
 	if (strict_item)
 		*strict_p = boolVal(strict_item->arg);
+	if (search_item)
+		*search_p = interpret_func_search(search_item);
 	if (security_item)
 		*security_definer = boolVal(security_item->arg);
 	if (leakproof_item)
@@ -1042,7 +1073,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 				isStrict,
 				security,
 				isLeakProof;
-	char		volatility;
+	char		prosearch,
+				volatility;
 	ArrayType  *proconfig;
 	float4		procost;
 	float4		prorows;
@@ -1067,6 +1099,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 	language = NULL;
 	isWindowFunc = false;
 	isStrict = false;
+	prosearch = PROSEARCH_DEFAULT;
 	security = false;
 	isLeakProof = false;
 	volatility = PROVOLATILE_VOLATILE;
@@ -1082,10 +1115,16 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 								stmt->options,
 								&as_clause, &language, &transformDefElem,
 								&isWindowFunc, &volatility,
-								&isStrict, &security, &isLeakProof,
-								&proconfig, &procost, &prorows,
+								&isStrict, &prosearch, &security,
+								&isLeakProof, &proconfig, &procost, &prorows,
 								&prosupport, &parallel);
 
+	if (volatility == PROVOLATILE_IMMUTABLE && prosearch == PROSEARCH_SESSION)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("immutable functions cannot be specified with SEARCH FROM SESSION"),
+				 errhint("Specify SEARCH FROM TRUSTED instead.")));
+
 	if (!language)
 	{
 		if (stmt->sql_body)
@@ -1271,6 +1310,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt)
 						   probin_str,	/* converted to text later */
 						   prosqlbody,
 						   stmt->is_procedure ? PROKIND_PROCEDURE : (isWindowFunc ? PROKIND_WINDOW : PROKIND_FUNCTION),
+						   prosearch,
 						   security,
 						   isLeakProof,
 						   isStrict,
@@ -1355,6 +1395,7 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt)
 	ListCell   *l;
 	DefElem    *volatility_item = NULL;
 	DefElem    *strict_item = NULL;
+	DefElem    *search_item = NULL;
 	DefElem    *security_def_item = NULL;
 	DefElem    *leakproof_item = NULL;
 	List	   *set_items = NIL;
@@ -1399,6 +1440,7 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt)
 									 defel,
 									 &volatility_item,
 									 &strict_item,
+									 &search_item,
 									 &security_def_item,
 									 &leakproof_item,
 									 &set_items,
@@ -1413,8 +1455,16 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt)
 		procForm->provolatile = interpret_func_volatility(volatility_item);
 	if (strict_item)
 		procForm->proisstrict = boolVal(strict_item->arg);
+	if (search_item)
+		procForm->prosearch = interpret_func_search(search_item);
 	if (security_def_item)
 		procForm->prosecdef = boolVal(security_def_item->arg);
+	if (procForm->provolatile == PROVOLATILE_IMMUTABLE &&
+		procForm->prosearch == PROSEARCH_SESSION)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("immutable functions cannot be specified with SEARCH FROM SESSION"),
+				 errhint("Specify SEARCH FROM TRUSTED instead.")));
 	if (leakproof_item)
 	{
 		procForm->proleakproof = boolVal(leakproof_item->arg);
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index 5e97606793..baa8dd706e 100644
--- a/src/backend/commands/typecmds.c
+++ b/src/backend/commands/typecmds.c
@@ -1763,6 +1763,7 @@ makeRangeConstructors(const char *name, Oid namespace,
 								 NULL,	/* probin */
 								 NULL,	/* prosqlbody */
 								 PROKIND_FUNCTION,
+								 PROSEARCH_DEFAULT,
 								 false, /* security_definer */
 								 false, /* leakproof */
 								 false, /* isStrict */
@@ -1828,6 +1829,7 @@ makeMultirangeConstructors(const char *name, Oid namespace,
 							 NULL,	/* probin */
 							 NULL,	/* prosqlbody */
 							 PROKIND_FUNCTION,
+							 PROSEARCH_DEFAULT,
 							 false, /* security_definer */
 							 false, /* leakproof */
 							 true,	/* isStrict */
@@ -1872,6 +1874,7 @@ makeMultirangeConstructors(const char *name, Oid namespace,
 							 NULL,	/* probin */
 							 NULL,	/* prosqlbody */
 							 PROKIND_FUNCTION,
+							 PROSEARCH_DEFAULT,
 							 false, /* security_definer */
 							 false, /* leakproof */
 							 true,	/* isStrict */
@@ -1910,6 +1913,7 @@ makeMultirangeConstructors(const char *name, Oid namespace,
 							 NULL,	/* probin */
 							 NULL,	/* prosqlbody */
 							 PROKIND_FUNCTION,
+							 PROSEARCH_DEFAULT,
 							 false, /* security_definer */
 							 false, /* leakproof */
 							 true,	/* isStrict */
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index da258968b8..5a843d71ae 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -4474,6 +4474,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 	 */
 	if (funcform->prolang != SQLlanguageId ||
 		funcform->prokind != PROKIND_FUNCTION ||
+		funcform->prosearch == PROSEARCH_TRUSTED ||
 		funcform->prosecdef ||
 		funcform->proretset ||
 		funcform->prorettype == RECORDOID ||
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7d2032885e..ad6d09cc47 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -8545,6 +8545,21 @@ common_func_opt_item:
 				{
 					$$ = makeDefElem("support", (Node *) $2, @1);
 				}
+			| SEARCH FROM DEFAULT
+				{
+					/* we abuse the normal content of a DefElem here */
+					$$ = makeDefElem("search", (Node *) makeString("default"), @1);
+				}
+			| SEARCH FROM TRUSTED
+				{
+					/* we abuse the normal content of a DefElem here */
+					$$ = makeDefElem("search", (Node *) makeString("trusted"), @1);
+				}
+			| SEARCH FROM SESSION
+				{
+					/* we abuse the normal content of a DefElem here */
+					$$ = makeDefElem("search", (Node *) makeString("session"), @1);
+				}
 			| FunctionSetResetClause
 				{
 					/* we abuse the normal content of a DefElem here */
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 9dfdf890c5..d65d88fc31 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -16,6 +16,7 @@
 #include "postgres.h"
 
 #include "access/detoast.h"
+#include "catalog/namespace.h"
 #include "catalog/pg_language.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
@@ -203,6 +204,7 @@ fmgr_info_cxt_security(Oid functionId, FmgrInfo *finfo, MemoryContext mcxt,
 	 */
 	if (!ignore_security &&
 		(procedureStruct->prosecdef ||
+		 procedureStruct->prosearch == PROSEARCH_TRUSTED ||
 		 !heap_attisnull(procedureTuple, Anum_pg_proc_proconfig, NULL) ||
 		 FmgrHookIsNeeded(functionId)))
 	{
@@ -612,6 +614,7 @@ struct fmgr_security_definer_cache
 {
 	FmgrInfo	flinfo;			/* lookup info for target function */
 	Oid			userid;			/* userid to set, or InvalidOid */
+	char	   *searchPath;		/* from SEARCH clause, if specified */
 	List	   *configNames;	/* GUC names to set, or NIL */
 	List	   *configValues;	/* GUC values to set, or NIL */
 	Datum		arg;			/* passthrough argument for plugin modules */
@@ -630,6 +633,9 @@ struct fmgr_security_definer_cache
 extern Datum
 fmgr_security_definer(PG_FUNCTION_ARGS)
 {
+	GucContext	context = superuser() ? PGC_SUSET : PGC_USERSET;
+	GucSource	source = PGC_S_SESSION;
+	GucAction	action = GUC_ACTION_SAVE;
 	Datum		result;
 	struct fmgr_security_definer_cache *volatile fcache;
 	FmgrInfo   *save_flinfo;
@@ -662,6 +668,9 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 				 fcinfo->flinfo->fn_oid);
 		procedureStruct = (Form_pg_proc) GETSTRUCT(tuple);
 
+		if (procedureStruct->prosearch == PROSEARCH_TRUSTED)
+			fcache->searchPath = NAMESPACE_TRUSTED_SEARCH_PATH;
+
 		if (procedureStruct->prosecdef)
 			fcache->userid = procedureStruct->proowner;
 
@@ -687,20 +696,22 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	/* GetUserIdAndSecContext is cheap enough that no harm in a wasted call */
 	GetUserIdAndSecContext(&save_userid, &save_sec_context);
-	if (fcache->configNames != NIL) /* Need a new GUC nesting level */
+	if (fcache->searchPath != NULL || fcache->configNames != NIL) /* Need a new GUC nesting level */
 		save_nestlevel = NewGUCNestLevel();
 	else
-		save_nestlevel = 0;		/* keep compiler quiet */
+		save_nestlevel = 0;
 
 	if (OidIsValid(fcache->userid))
 		SetUserIdAndSecContext(fcache->userid,
 							   save_sec_context | SECURITY_LOCAL_USERID_CHANGE);
 
+	if (fcache->searchPath != NULL)
+		(void) set_config_option("search_path", fcache->searchPath,
+								 context, source,
+								 action, true, 0, false);
+
 	forboth(lc1, fcache->configNames, lc2, fcache->configValues)
 	{
-		GucContext	context = superuser() ? PGC_SUSET : PGC_USERSET;
-		GucSource	source = PGC_S_SESSION;
-		GucAction	action = GUC_ACTION_SAVE;
 		char	   *name = lfirst(lc1);
 		char	   *value = lfirst(lc2);
 
@@ -749,7 +760,7 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	fcinfo->flinfo = save_flinfo;
 
-	if (fcache->configNames != NIL)
+	if (save_nestlevel > 0)
 		AtEOXact_GUC(true, save_nestlevel);
 	if (OidIsValid(fcache->userid))
 		SetUserIdAndSecContext(save_userid, save_sec_context);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f7b6176692..184508052f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -12005,6 +12005,7 @@ dumpFunc(Archive *fout, const FuncInfo *finfo)
 	char	   *prokind;
 	char	   *provolatile;
 	char	   *proisstrict;
+	char	   *prosearch;
 	char	   *prosecdef;
 	char	   *proleakproof;
 	char	   *proconfig;
@@ -12079,10 +12080,17 @@ dumpFunc(Archive *fout, const FuncInfo *finfo)
 
 		if (fout->remoteVersion >= 140000)
 			appendPQExpBufferStr(query,
-								 "pg_get_function_sqlbody(p.oid) AS prosqlbody\n");
+								 "pg_get_function_sqlbody(p.oid) AS prosqlbody,\n");
 		else
 			appendPQExpBufferStr(query,
-								 "NULL AS prosqlbody\n");
+								 "NULL AS prosqlbody,\n");
+
+		if (fout->remoteVersion >= 170000)
+			appendPQExpBufferStr(query,
+								 "prosearch\n");
+		else
+			appendPQExpBufferStr(query,
+								 "NULL AS prosearch\n");
 
 		appendPQExpBufferStr(query,
 							 "FROM pg_catalog.pg_proc p, pg_catalog.pg_language l\n"
@@ -12120,6 +12128,7 @@ dumpFunc(Archive *fout, const FuncInfo *finfo)
 	prokind = PQgetvalue(res, 0, PQfnumber(res, "prokind"));
 	provolatile = PQgetvalue(res, 0, PQfnumber(res, "provolatile"));
 	proisstrict = PQgetvalue(res, 0, PQfnumber(res, "proisstrict"));
+	prosearch = PQgetvalue(res, 0, PQfnumber(res, "prosearch"));
 	prosecdef = PQgetvalue(res, 0, PQfnumber(res, "prosecdef"));
 	proleakproof = PQgetvalue(res, 0, PQfnumber(res, "proleakproof"));
 	proconfig = PQgetvalue(res, 0, PQfnumber(res, "proconfig"));
@@ -12245,6 +12254,11 @@ dumpFunc(Archive *fout, const FuncInfo *finfo)
 	if (proisstrict[0] == 't')
 		appendPQExpBufferStr(q, " STRICT");
 
+	if (prosearch[0] == 't')
+		appendPQExpBufferStr(q, " SEARCH FROM TRUSTED");
+	else if (prosearch[0] == 's')
+		appendPQExpBufferStr(q, " SEARCH FROM SESSION");
+
 	if (prosecdef[0] == 't')
 		appendPQExpBufferStr(q, " SECURITY DEFINER");
 
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index bac94a338c..a2b950382b 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -301,7 +301,10 @@ describeFunctions(const char *functypes, const char *func_pattern,
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
-	static const bool translate_columns[] = {false, false, false, false, true, true, true, false, true, false, false, false, false};
+	static const bool translate_columns[] = {false, false, false, false, true, true, true, true, false, true, false, false, false, false};
+
+	/* No "Search" column before 17 */
+	static const bool translate_columns_pre_17[] = {false, false, false, false, true, true, true, false, true, false, false, false, false};
 
 	/* No "Parallel" column before 9.6 */
 	static const bool translate_columns_pre_96[] = {false, false, false, false, true, true, false, true, false, false, false, false};
@@ -377,6 +380,17 @@ describeFunctions(const char *functypes, const char *func_pattern,
 
 	if (verbose)
 	{
+		if (pset.sversion >= 170000)
+			appendPQExpBuffer(&buf,
+							  ",\n CASE\n"
+							  "  WHEN p.prosearch = 'd' THEN '%s'\n"
+							  "  WHEN p.prosearch = 't' THEN '%s'\n"
+							  "  WHEN p.prosearch = 's' THEN '%s'\n"
+							  " END as \"%s\"",
+							  gettext_noop("default"),
+							  gettext_noop("trusted"),
+							  gettext_noop("session"),
+							  gettext_noop("Search"));
 		appendPQExpBuffer(&buf,
 						  ",\n CASE\n"
 						  "  WHEN p.provolatile = 'i' THEN '%s'\n"
@@ -588,11 +602,16 @@ describeFunctions(const char *functypes, const char *func_pattern,
 	myopt.nullPrint = NULL;
 	myopt.title = _("List of functions");
 	myopt.translate_header = true;
-	if (pset.sversion >= 90600)
+	if (pset.sversion >= 170000)
 	{
 		myopt.translate_columns = translate_columns;
 		myopt.n_translate_columns = lengthof(translate_columns);
 	}
+	else if (pset.sversion >= 90600)
+	{
+		myopt.translate_columns = translate_columns_pre_17;
+		myopt.n_translate_columns = lengthof(translate_columns_pre_17);
+	}
 	else
 	{
 		myopt.translate_columns = translate_columns_pre_96;
diff --git a/src/include/catalog/namespace.h b/src/include/catalog/namespace.h
index e027940430..c1c313eec6 100644
--- a/src/include/catalog/namespace.h
+++ b/src/include/catalog/namespace.h
@@ -76,6 +76,12 @@ typedef enum RVROption
 typedef void (*RangeVarGetRelidCallback) (const RangeVar *relation, Oid relId,
 										  Oid oldRelId, void *callback_arg);
 
+/*
+ * Trusted search_path for cases where relying on the session search_path is
+ * unsafe (e.g. for SECURITY DEFINER functions).
+ */
+#define NAMESPACE_TRUSTED_SEARCH_PATH		"pg_catalog, pg_temp"
+
 #define RangeVarGetRelid(relation, lockmode, missing_ok) \
 	RangeVarGetRelidExtended(relation, lockmode, \
 							 (missing_ok) ? RVR_MISSING_OK : 0, NULL, NULL)
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index fdb39d4001..9b3763cc1b 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -58,6 +58,9 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce
 	/* see PROKIND_ categories below */
 	char		prokind BKI_DEFAULT(f);
 
+	/* security definer */
+	char		prosearch BKI_DEFAULT(d);
+
 	/* security definer */
 	bool		prosecdef BKI_DEFAULT(f);
 
@@ -150,6 +153,15 @@ DECLARE_UNIQUE_INDEX(pg_proc_proname_args_nsp_index, 2691, ProcedureNameArgsNspI
 #define PROKIND_WINDOW 'w'
 #define PROKIND_PROCEDURE 'p'
 
+/*
+ * Symbolic values for prosearch column: these determine how search_path is
+ * initialized when executing the function. PROSEARCH_DEFAULT means that no
+ * SEARCH clause was specified.
+ */
+#define PROSEARCH_DEFAULT		'd'
+#define PROSEARCH_TRUSTED		't'
+#define PROSEARCH_SESSION		's'
+
 /*
  * Symbolic values for provolatile column: these indicate whether the result
  * of a function is dependent *only* on the values of its explicit arguments,
@@ -197,6 +209,7 @@ extern ObjectAddress ProcedureCreate(const char *procedureName,
 									 const char *probin,
 									 Node *prosqlbody,
 									 char prokind,
+									 char prosearch,
 									 bool security_definer,
 									 bool isLeakProof,
 									 bool isStrict,
diff --git a/src/test/regress/expected/create_function_sql.out b/src/test/regress/expected/create_function_sql.out
index 50aca5940f..83c649f725 100644
--- a/src/test/regress/expected/create_function_sql.out
+++ b/src/test/regress/expected/create_function_sql.out
@@ -88,6 +88,73 @@ SELECT proname, provolatile FROM pg_proc
  functest_b_4 | v
 (4 rows)
 
+--
+-- SEARCH FROM DEFAULT | TRUSTED | SESSION
+--
+CREATE FUNCTION f_show_search_path() RETURNS TEXT
+  LANGUAGE plpgsql AS
+$$
+  BEGIN
+    RETURN current_setting('search_path');
+  END;
+$$;
+CREATE PROCEDURE p_show_search_path()
+  LANGUAGE plpgsql AS
+$$
+  BEGIN
+    RAISE NOTICE 'search_path: %', current_setting('search_path');
+  END;
+$$;
+SELECT f_show_search_path();
+   f_show_search_path   
+------------------------
+ temp_func_test, public
+(1 row)
+
+CALL p_show_search_path();
+NOTICE:  search_path: temp_func_test, public
+ALTER FUNCTION f_show_search_path() SEARCH FROM TRUSTED;
+SELECT f_show_search_path();
+ f_show_search_path  
+---------------------
+ pg_catalog, pg_temp
+(1 row)
+
+ALTER FUNCTION f_show_search_path() SEARCH FROM DEFAULT;
+SELECT f_show_search_path();
+   f_show_search_path   
+------------------------
+ temp_func_test, public
+(1 row)
+
+ALTER FUNCTION f_show_search_path() SEARCH FROM SESSION;
+SELECT f_show_search_path();
+   f_show_search_path   
+------------------------
+ temp_func_test, public
+(1 row)
+
+ALTER FUNCTION f_show_search_path() IMMUTABLE; -- fail
+ERROR:  immutable functions cannot be specified with SEARCH FROM SESSION
+HINT:  Specify SEARCH FROM TRUSTED instead.
+ALTER FUNCTION f_show_search_path() SEARCH FROM DEFAULT;
+ALTER FUNCTION f_show_search_path() IMMUTABLE;
+ALTER FUNCTION f_show_search_path() SEARCH FROM SESSION; -- fail
+ERROR:  immutable functions cannot be specified with SEARCH FROM SESSION
+HINT:  Specify SEARCH FROM TRUSTED instead.
+ALTER FUNCTION f_show_search_path() VOLATILE;
+ALTER ROUTINE p_show_search_path() SEARCH FROM TRUSTED SET search_path = test1;
+CALL p_show_search_path();
+NOTICE:  search_path: test1
+ALTER ROUTINE f_show_search_path() SEARCH FROM SESSION SET search_path = test1;
+SELECT f_show_search_path();
+ f_show_search_path 
+--------------------
+ test1
+(1 row)
+
+DROP FUNCTION f_show_search_path();
+DROP PROCEDURE p_show_search_path();
 --
 -- SECURITY DEFINER | INVOKER
 --
diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out
index 7cd0c27cca..d6293c4003 100644
--- a/src/test/regress/expected/psql.out
+++ b/src/test/regress/expected/psql.out
@@ -5277,12 +5277,12 @@ create function psql_df_plpgsql ()
   as $$ begin return; end; $$;
 comment on function psql_df_plpgsql () is 'some comment';
 \df+ psql_df_*
-                                                                                       List of functions
- Schema |       Name       | Result data type | Argument data types | Type | Volatility | Parallel |       Owner       | Security | Access privileges | Language | Internal name | Description  
---------+------------------+------------------+---------------------+------+------------+----------+-------------------+----------+-------------------+----------+---------------+--------------
- public | psql_df_internal | double precision | double precision    | func | immutable  | safe     | regress_psql_user | invoker  |                   | internal | dsin          | 
- public | psql_df_plpgsql  | void             |                     | func | volatile   | unsafe   | regress_psql_user | invoker  |                   | plpgsql  |               | some comment
- public | psql_df_sql      | integer          | x integer           | func | volatile   | unsafe   | regress_psql_user | definer  |                   | sql      |               | 
+                                                                                            List of functions
+ Schema |       Name       | Result data type | Argument data types | Type | Search  | Volatility | Parallel |       Owner       | Security | Access privileges | Language | Internal name | Description  
+--------+------------------+------------------+---------------------+------+---------+------------+----------+-------------------+----------+-------------------+----------+---------------+--------------
+ public | psql_df_internal | double precision | double precision    | func | default | immutable  | safe     | regress_psql_user | invoker  |                   | internal | dsin          | 
+ public | psql_df_plpgsql  | void             |                     | func | default | volatile   | unsafe   | regress_psql_user | invoker  |                   | plpgsql  |               | some comment
+ public | psql_df_sql      | integer          | x integer           | func | default | volatile   | unsafe   | regress_psql_user | definer  |                   | sql      |               | 
 (3 rows)
 
 rollback;
diff --git a/src/test/regress/sql/create_function_sql.sql b/src/test/regress/sql/create_function_sql.sql
index 89e9af3a49..383633cec9 100644
--- a/src/test/regress/sql/create_function_sql.sql
+++ b/src/test/regress/sql/create_function_sql.sql
@@ -60,6 +60,50 @@ SELECT proname, provolatile FROM pg_proc
                      'functest_B_3'::regproc,
 		     'functest_B_4'::regproc) ORDER BY proname;
 
+--
+-- SEARCH FROM DEFAULT | TRUSTED | SESSION
+--
+
+CREATE FUNCTION f_show_search_path() RETURNS TEXT
+  LANGUAGE plpgsql AS
+$$
+  BEGIN
+    RETURN current_setting('search_path');
+  END;
+$$;
+
+CREATE PROCEDURE p_show_search_path()
+  LANGUAGE plpgsql AS
+$$
+  BEGIN
+    RAISE NOTICE 'search_path: %', current_setting('search_path');
+  END;
+$$;
+
+SELECT f_show_search_path();
+CALL p_show_search_path();
+
+ALTER FUNCTION f_show_search_path() SEARCH FROM TRUSTED;
+SELECT f_show_search_path();
+ALTER FUNCTION f_show_search_path() SEARCH FROM DEFAULT;
+SELECT f_show_search_path();
+ALTER FUNCTION f_show_search_path() SEARCH FROM SESSION;
+SELECT f_show_search_path();
+
+ALTER FUNCTION f_show_search_path() IMMUTABLE; -- fail
+ALTER FUNCTION f_show_search_path() SEARCH FROM DEFAULT;
+ALTER FUNCTION f_show_search_path() IMMUTABLE;
+ALTER FUNCTION f_show_search_path() SEARCH FROM SESSION; -- fail
+ALTER FUNCTION f_show_search_path() VOLATILE;
+
+ALTER ROUTINE p_show_search_path() SEARCH FROM TRUSTED SET search_path = test1;
+CALL p_show_search_path();
+ALTER ROUTINE f_show_search_path() SEARCH FROM SESSION SET search_path = test1;
+SELECT f_show_search_path();
+
+DROP FUNCTION f_show_search_path();
+DROP PROCEDURE p_show_search_path();
+
 --
 -- SECURITY DEFINER | INVOKER
 --
-- 
2.34.1

#25

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Robert Haas (#23)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Thu, 2023-09-21 at 14:06 -0400, Robert Haas wrote:

Also, in a case like this, I don't think it's unreasonable to ask
whether, perhaps, Bob just needs to be a little more careful about
setting search_path.

That's what this whole thread is about: I wish it was reasonable, but I
don't think the tools we provide today make it reasonable. You expect
Bob to do something like:

CREATE FUNCTION ... SET search_path = pg_catalog, pg_temp ...

for all functions, not just SECURITY DEFINER functions, is that right?

Up until now, we've mostly treated search_path as a problem for
SECURITY DEFINER, and specifying something like that might be
reasonable for a small number of SECURITY DEFINER functions.

But as my example showed, search_path is actually a problem for
SECURITY INVOKER too: an index expression relies on the function
producing the correct results, and it's hard to control that without
controlling the search_path.

I think that there is a big difference between
(a) defining a SQL-language function that is accessible to multiple
users and (b) inserting a row into a table you don't own. When you
define a function, you know people are potentially going to call it.

It's a bit problematic that (a) is the default:

CREATE FUNCTION f(INT) RETURNS INT IMMUTABLE
LANGUAGE plpgsql
AS $$ BEGIN RETURN 42+$1; END; $$;
CREATE TABLE x(i INT);
CREATE INDEX x_idx ON x(f(i));
GRANT INSERT ON TABLE x TO u2;

It's not obvious that f() is directly callable by u2 (though it is
documented).

I'm not disagreeing with the principle behind what you say above. My
point is that "accessible to multiple users" is the ordinary default
case, so there's no cue for the user that they need to do something
special to secure function f().

Asking you, as the function author, to take some care to secure your
function against a malicious search_path doesn't seem like an
unsupportable burden.

What you are suggesting has been possible for quite some time. Do you
think users are taking care to do this today? If not, how can we
encourage them to do so?

You can, I think, be expected to
check that functions you define have SET search_path attached.

We've already established that even postgres hackers are having
difficulty keeping up with these nuances. Even though the SET clause
has been there for a long time, our documentation on the subject is
insufficient and misleading. And on top of that, it's extra typing and
noise for every schema file. Until we make some changes I don't think
we can expect users to do as you suggest.

Regards,
Jeff Davis

#26

Robert Haas

robertmhaas@gmail.com

over 2 years ago

In reply to: Jeff Davis (#25)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Fri, Sep 22, 2023 at 4:05 PM Jeff Davis <pgsql@j-davis.com> wrote:

On Thu, 2023-09-21 at 14:06 -0400, Robert Haas wrote:

Also, in a case like this, I don't think it's unreasonable to ask
whether, perhaps, Bob just needs to be a little more careful about
setting search_path.

That's what this whole thread is about: I wish it was reasonable, but I
don't think the tools we provide today make it reasonable. You expect
Bob to do something like:

CREATE FUNCTION ... SET search_path = pg_catalog, pg_temp ...

for all functions, not just SECURITY DEFINER functions, is that right?

Yes, I do. I think it's self-evident that a SQL function's behavior is
not guaranteed to be invariant under all possible values of
search_path. If you care about your function behaving the same way all
the time, you have to set the search_path.

TBH, I don't see any reasonable way around that requirement. We can
perhaps provide some safeguards that will make it less likely that you
will get completely hosed if your forget, and we could decide to make
SET search_path or some mostly-equivalent thing the default at the
price of pretty large compatibility break, but you can't have
functions that both resolve object references using the caller's
search path and also reliably do what the author intended.

You can, I think, be expected to
check that functions you define have SET search_path attached.

We've already established that even postgres hackers are having
difficulty keeping up with these nuances. Even though the SET clause
has been there for a long time, our documentation on the subject is
insufficient and misleading. And on top of that, it's extra typing and
noise for every schema file. Until we make some changes I don't think
we can expect users to do as you suggest.

Respectfully, I find this position unreasonable, to the point of
finding it difficult to take seriously. You said in another part of
your email that I didn't quote here that it's a problem that it's a
problem that functions and procedures are created with public execute
access by default -- but you can work around this by using a schema to
which other users don't have access, or by changing the default
permissions for functions on the schema where you are creating them,
or by adjusting permissions on the individual objects. If you don't do
any of that but don't trust the other users on your system then you at
least need to set search_path. If you neither understand how function
permissions work nor understand the importance of controlling
search_path, you cannot expect to have a secure system with multiple,
mutually untrusting users. That's just never going to work, regardless
of what the server behavior is.

I also disagree with the idea that setting the search_path should be
regarded as noise. I think it's quite the opposite. I don't believe
that people want to run their functions under a sanitized search_path
that only includes system schemas. That might work for some people,
but I think most people will define functions that call other
functions that they themselves defined, or access tables that they
themselves created. They will therefore need the search_path to
include the schemas in which they created those objects. There's no
way for the system to magically figure out what the user wants here.
*Perhaps* if the function is defined interactively the then-current
value could be captured, but in a pg_dump for example that won't work,
and the configured value, wherever it came from initially, is going to
have to be recorded so that it can be recreated when the dump is
restored.

Most of the problems that we're dealing with here have analogues in
the world of shell scripts. A sql or plpgsql function is like a shell
script. If it's setuid, i.e. SECURITY DEFINER, you have to worry about
the caller hijacking it by setting PATH or IFS or LD_something. Even
if it isn't, you have to either trust that the caller has set a
reasonable PATH, or set one yourself, else your script isn't always
going to work reliably. Nobody really expects to be able to make a
setuid shell script secure at all -- that typically requires a wrapper
executable -- but it definitely can't be done by someone who doesn't
understand the importance of setting their PATH and has no idea how to
use chmod.

One thing that is quite different between the shell script situation
and what we do inside PostgreSQL is that there's a lot more security
by default. Every user gets a home directory which by default is
accessible only to them, or at the very least writable only by them,
and system directories have tightly-controlled permissions. I think
UNIX had analogues of a lot of the problems we have today 40 years
ago, but they've tightened things up. We've started to move in that
direction by, for example, removing public execute access by default.
If we want to move further in the direction that UNIX has taken, we
should probably get rid of the public schema altogether, and
auto-create per-user schemas with permissions that allow only that
user to access them. But that's only making it easier to not
accidentally have users accessing each other's stuff. The core problem
that, if people do want to access each other's stuff, they either need
to trust each other or really be on point with all the
security-related stuff. That's equally true in the shell script case,
and I think that problem is fundamental. It's just really not possible
for people to call other people's code frequently without everyone
involved either being super-careful about security or just not caring.

--
Robert Haas
EDB: http://www.enterprisedb.com

#27

Joe Conway

mail@joeconway.com

over 2 years ago

In reply to: Robert Haas (#26)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On 9/25/23 11:30, Robert Haas wrote:

I don't believe that people want to run their functions under a
sanitized search_path that only includes system schemas. That might
work for some people, but I think most people will define functions
that call other functions that they themselves defined, or access
tables that they themselves created. They will therefore need the
search_path to include the schemas in which they created those
objects.

Without diving into all the detailed nuances of this discussion, this
particular paragraph made me wonder if at least part of the problem here
is that the same search_path is used to find "things that I want to
execute" (functions and operators) and "things I want to access"
(tables, etc).

I think many folks would be well served by only having system schemas in
the search_path for the former (augmented by explicit schema qualifying
of one's own functions), but agree that almost no one wants that for the
latter (needing to schema qualify every table reference).

Should there be a way to have a separate "execution" search_path?

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#28

Robert Haas

robertmhaas@gmail.com

over 2 years ago

In reply to: Joe Conway (#27)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, Sep 25, 2023 at 12:00 PM Joe Conway <mail@joeconway.com> wrote:

Should there be a way to have a separate "execution" search_path?

I have heard that idea proposed before, and I don't think it's a
terrible idea, but I also don't think it fixes anything terribly
fundamental. I think it's pretty normal for people to define functions
and procedures and then call them from other functions and procedures,
and if you do that, then you need that schema in your execution search
path. Of course, if somebody doesn't do that, or schema-qualifies all
such references, then this becomes useful for defense in depth. But I
find it hard to see it as anything more than defense in depth because
I think a lot of people will need to have use cases where they need to
put non-system schemas into the execution search path, and such people
wouldn't really benefit from the existence of this feature.

Slightly off-topic, but I wonder whether, if we do this, we ought to
do it by adding some kind of a marker to the existing search_path,
rather than by creating a new GUC. For example, maybe putting & before
a schema name means that it can be searched, but only for
non-executable things. Then you could set search_path = &jconway,
pg_catalog or something of that kind. It could potentially be more
powerful to have it be a completely separate setting, but if we do
that, everyone who currently needs to secure search_path properly
starts needing to also secure execution_search_path properly. This is
one of those cases where two is not better than one.

--
Robert Haas
EDB: http://www.enterprisedb.com

#29

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Robert Haas (#26)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, 2023-09-25 at 11:30 -0400, Robert Haas wrote:

On Fri, Sep 22, 2023 at 4:05 PM Jeff Davis <pgsql@j-davis.com> wrote:

You expect
Bob to do something like:

CREATE FUNCTION ... SET search_path = pg_catalog, pg_temp ...

for all functions, not just SECURITY DEFINER functions, is that
right?

Yes, I do.

Do users like Bob do that today? If not, what causes you to expect them
to do so in the future?

I think it's self-evident that a SQL function's behavior is
not guaranteed to be invariant under all possible values of
search_path.

It's certainly not self-evident in a literal sense. I think you mean
that it's "obvious" or something, and perhaps that narrow question is,
but it's also not terribly helpful.

If the important behaviors here were so obvious, how did we end up in
this mess in the first place?

We've already established that even postgres hackers are having
difficulty keeping up with these nuances. Even though the SET
clause
has been there for a long time, our documentation on the subject is
insufficient and misleading. And on top of that, it's extra typing
and
noise for every schema file. Until we make some changes I don't
think
we can expect users to do as you suggest.

Respectfully, I find this position unreasonable, to the point of
finding it difficult to take seriously.

Which part exactly is unreasonable?

* Hackers are having trouble keeping up with the nuances.
* Our documentation on the subject *is* insufficient and misleading.
* "pg_temp" is noise.

It seems like you think that users are already doing "SET search_path =
pg_catalog, pg_temp" in all the necessary places, and therefore no
change is required?

Most of the problems that we're dealing with here have analogues in
the world of shell scripts.

I think analogies to unix are what caused a lot of the problems we have
today, because postgres is a very different world. In unix-like
environments, a file is just a file; in postgres, we have all kinds of
code attached in interesting ways.

Regards,
Jeff Davis

#30

Jeff Davis

pgsql@j-davis.com

over 2 years ago

In reply to: Joe Conway (#27)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, 2023-09-25 at 12:00 -0400, Joe Conway wrote:

Should there be a way to have a separate "execution" search_path?

I hadn't considered that and I like that idea for a few reasons:

* a lot of the problem cases are for functions that don't need to
access tables at all, e.g., in an index expression.
* it avoids annoyances with pg_temp, because that's not searched for
functions/operators anyway
* perhaps we could force the object search_path to be empty for
IMMUTABLE functions?

I haven't thought it through in detail, but it seems like a promising
approach.

Regards,
Jeff Davis

#31

Joe Conway

mail@joeconway.com

over 2 years ago

In reply to: Jeff Davis (#30)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On 9/25/23 14:03, Jeff Davis wrote:

On Mon, 2023-09-25 at 12:00 -0400, Joe Conway wrote:

Should there be a way to have a separate "execution" search_path?

I hadn't considered that and I like that idea for a few reasons:

* a lot of the problem cases are for functions that don't need to
access tables at all, e.g., in an index expression.
* it avoids annoyances with pg_temp, because that's not searched for
functions/operators anyway
* perhaps we could force the object search_path to be empty for
IMMUTABLE functions?

I haven't thought it through in detail, but it seems like a promising
approach.

Related to this, it would be useful if you could grant create on schema
for only non-executable objects. You may want to allow a user to create
their own tables but not allow them to create their own functions, for
example. Right now "GRANT CREATE ON SCHEMA foo" gives the grantee the
ability to create "all the things".

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#32

Robert Haas

robertmhaas@gmail.com

over 2 years ago

In reply to: Jeff Davis (#29)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, Sep 25, 2023 at 1:56 PM Jeff Davis <pgsql@j-davis.com> wrote:

Do users like Bob do that today? If not, what causes you to expect them
to do so in the future?

What I would say is that if there's a reasonable way of securing your
stuff and you don't make use of it, that's your problem. If securing
your stuff is unreasonably difficult, that's a product problem. I
think that setting the search_path on your own functions is a basic
precaution that you should take if you are worried about multi-user
security. I do not believe it is realistic to eliminate that
requirement, and if people like Bob don't do that today and can't be
made to do that in the future, then I think it's just hopeless. In
contrast, I think that the precautions that you need to take when
doing anything to a table owned by another user are unreasonably
complex and not very realistic for anyone to take on a routine basis.
Even if you validate that there's nothing malicious before you access
the table, the table owner can change that at any time, so it's very
hard to reliably protect yourself.

In terms of whether people like Bob actually DO do that today, I'd say
probably some do and others don't. I think that the overwhelming
majority of PostgreSQL users simply aren't concerned about multi-user
security. They either have a single user account that is used for
everything, or say one account for the application and another for
interactive access, or they have a bunch of users but basically all of
those people are freely accessing each other's stuff and they're not
really concerned with firewalling them from each other. Among the
small percentage of users who are really concerned with making sure
that users can't get into each others accounts, I would expect that
knowing that you need to control search_path is fairly common, but
it's hard to say. I haven't actually met many such users.

I think it's self-evident that a SQL function's behavior is
not guaranteed to be invariant under all possible values of
search_path.

It's certainly not self-evident in a literal sense. I think you mean
that it's "obvious" or something, and perhaps that narrow question is,
but it's also not terribly helpful.

If the important behaviors here were so obvious, how did we end up in
this mess in the first place?

I feel like this isn't really responsive to the argument that I was
and am making, and I'm worried that we're going down a rat-hole here.

I wondered after reading this whether I had misused the term
self-evident, but when I did a Google search for "self-evident" the
definition that comes up is "not needing to be demonstrated or
explained; obvious."

I am not saying that everyone is going to realize that you probably
ought to be setting search_path on all of your functions in any kind
of multi-user environment, and maybe even in a single-user environment
just to avoid weird failures if you ever change your default
search_path. What I am saying is that if you stop to think about what
search_path does while looking at any SQL function you've ever
written, you should probably realize pretty quickly that the behavior
of your function in search_path-dependent, and indeed that the
behavior of every other SQL function you've ever written is probably
search_path-dependent, too. I think the problem here isn't really that
this is hard to understand, but that many people have not stopped to
think about it.

Maybe it is obvious to you what we ought to do about that, but it is
not obvious to me. As I have said, I think that changing the behavior
of CREATE FUNCTION or CREATE PROCEDURE so that some search_path
control is the default is worth considering. However, I think that
such a change inevitably breaks backward compatibility, and I don't
think we have enough people weighing in on this thread to think that
we can just go do that even if everyone agreed on precisely what was
to be done, and I think it is pretty clear that we do not have
unanimous agreement.

Respectfully, I find this position unreasonable, to the point of
finding it difficult to take seriously.

Which part exactly is unreasonable?

I interpreted you to be saying that we can't expect people to set
search_path on their functions. And I just don't buy that. We have
made mistakes in that area in PostgreSQL itself and had to fix them
later, and we may make more mistakes again in the future, so if you
think we need better documentation or better defaults, I think you
might be right. But if you think it's a crazy idea for people running
PostgreSQL in multi-user environments to understand that setting
search_path on all of their functions and procedures is essential, I
disagree. They've got to understand that, because it's not that
complicated, and there's no real alternative.

I think analogies to unix are what caused a lot of the problems we have
today, because postgres is a very different world. In unix-like
environments, a file is just a file; in postgres, we have all kinds of
code attached in interesting ways.

Yeah. That's another area where I find it very unclear how to do
better. From a security point of view, I think that the fact that
there are so many interesting places to attach code is completely
insane. It makes running a secure multi-user environment very
difficult, bordering on impossible. But is there any way we can really
fix that without just removing a whole bunch of functionality? I think
that some of the ideas that have been proposed here could help, but
I'm extremely doubtful that they or anything else are a complete
solution, and I'm pretty sure that there is no "easy button" here --
given the number of "interesting" ways to execute code, I think
security will always be tough to get right, regardless of what we
change.

My emails on this thread seem to have made you frustrated. For that, I am sorry.

--
Robert Haas
EDB: http://www.enterprisedb.com

#33

Jeff Davis

pgsql@j-davis.com

about 2 years ago

In reply to: Jeff Davis (#24)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Thu, 2023-09-21 at 14:33 -0700, Jeff Davis wrote:

I have attached an updated patch. Changes:

Withdrawing this from CF due to lack of consensus.

I'm happy to resume this discussion if someone sees a path forward to
make it easier to secure the search_path; or at least help warn users
when a function without a secured search_path is being used unsafely.

Regards,
Jeff Davis

#34

Jeff Davis

pgsql@j-davis.com

about 2 years ago

In reply to: Robert Haas (#26)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, 2023-09-25 at 11:30 -0400, Robert Haas wrote:

That's what this whole thread is about: I wish it was reasonable,
but I
don't think the tools we provide today make it reasonable. You
expect
Bob to do something like:

CREATE FUNCTION ... SET search_path = pg_catalog, pg_temp ...

for all functions, not just SECURITY DEFINER functions, is that
right?

Yes, I do. I think it's self-evident that a SQL function's behavior
is
not guaranteed to be invariant under all possible values of
search_path. If you care about your function behaving the same way
all
the time, you have to set the search_path.

After adding the search path cache (recent commit f26c2368dc) hopefully
that helps to make the above suggestion more reasonable performance-
wise. I think we can call that progress.

Regards,
Jeff Davis

#35

Robert Haas

robertmhaas@gmail.com

about 2 years ago

In reply to: Jeff Davis (#34)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, Nov 14, 2023 at 11:21 PM Jeff Davis <pgsql@j-davis.com> wrote:

After adding the search path cache (recent commit f26c2368dc) hopefully
that helps to make the above suggestion more reasonable performance-
wise. I think we can call that progress.

I agree. Not to burden you, but do you know what the overhead is now,
and do you have any plans to further reduce it? I don't believe that's
the only thing we ought to be doing here, necessarily, but it is one
thing that we definitely should be doing and probably the least
controversial.

--
Robert Haas
EDB: http://www.enterprisedb.com

#36

Jeff Davis

pgsql@j-davis.com

about 2 years ago

In reply to: Robert Haas (#35)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, 2023-11-20 at 15:52 -0500, Robert Haas wrote:

I agree. Not to burden you, but do you know what the overhead is now,
and do you have any plans to further reduce it? I don't believe
that's
the only thing we ought to be doing here, necessarily, but it is one
thing that we definitely should be doing and probably the least
controversial.

Running the simple test described here:

/messages/by-id/04c8592dbd694e4114a3ed87139a7a04e4363030.camel@j-davis.com

The baseline (no "SET search_path" clause on the function) is around
3800ms, and with the clause it shoots up to 8000ms. That's not good,
but it is down from about 12000ms before.

There are a few patches in the queue to bring it down further. Andres
and I were discussing some GUC hashtable optimizations here:

/messages/by-id/27a7a289d5b8f42e1b1e79b1bcaeef3a40583bd2.camel@j-davis.com

which will (if committed) bring it down into the mid 7s.

There are also a couple other patches I have here (and intend to commit
soon):

/messages/by-id/e6fded24cb8a2c53d4ef069d9f69cc7baaafe9ef.camel@j-davis.com

and those I think will get it into the mid 6s. I think a bit lower
combined with the GUC hash table optimizations above.

So we are still looking at around 50% overhead for a simple function if
all this stuff gets committed. Not great, but a lot better than before.

Of course I welcome others to profile and see what they can do. There's
a setjmp() call, and a couple allocations, and maybe some other stuff
to look at. There are also higher-level ideas, like avoiding calling
into guc.c in some cases, but that starts to get tricky as you pointed
out:

/messages/by-id/CA+Tgmoa8uKQgak5wH0=7sL-ukqbwnCPMXA2ZW7Ccdt7tdNGkzg@mail.gmail.com

It seems others are also interested in this problem, so I can put some
more effort in after this round of patches goes in. I don't have a
specific target other than "low enough overhead that we can reasonably
recommend it as a best practice in multi-user environments".

Regards,
Jeff Davis

#37

Robert Haas

robertmhaas@gmail.com

about 2 years ago

In reply to: Jeff Davis (#36)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Mon, Nov 20, 2023 at 5:27 PM Jeff Davis <pgsql@j-davis.com> wrote:

Of course I welcome others to profile and see what they can do. There's
a setjmp() call, and a couple allocations, and maybe some other stuff
to look at. There are also higher-level ideas, like avoiding calling
into guc.c in some cases, but that starts to get tricky as you pointed
out:

/messages/by-id/CA+Tgmoa8uKQgak5wH0=7sL-ukqbwnCPMXA2ZW7Ccdt7tdNGkzg@mail.gmail.com

It seems others are also interested in this problem, so I can put some
more effort in after this round of patches goes in. I don't have a
specific target other than "low enough overhead that we can reasonably
recommend it as a best practice in multi-user environments".

The two things that jump out at me are the setjmp() and the
hash_search() call inside find_option(). As to the first, could we
remove the setjmp() and instead have abort-time processing know
something about this? For example, imagine we just push something onto
a stack like we do for ErrorContextCallback, do whatever, and then pop
it off. But if an error is thrown then the abort path knows to look at
that variable and do whatever. As to the second, could we somehow come
up with an API for guc.c where you can ask for an opaque handle that
will later allow you to do a fast-SET of a GUC? The opaque handle
would basically be the hashtable entry, perhaps with some kind of
wrapping or decoration. Then fmgr_security_definer() could obtain the
opaque handles and cache them in fn_extra.

(I'm just spitballing here.)

--
Robert Haas
EDB: http://www.enterprisedb.com

#38

Jeff Davis

pgsql@j-davis.com

about 2 years ago

In reply to: Robert Haas (#37)

1 attachment(s)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, 2023-11-21 at 09:24 -0500, Robert Haas wrote:

As to the second, could we somehow come
up with an API for guc.c where you can ask for an opaque handle that
will later allow you to do a fast-SET of a GUC?

Yes, attached. That provides a significant speedup: my test goes from
around ~7300ms to ~6800ms.

This doesn't seem very controversial or complex, so I'll probably
commit this soon unless someone else has a comment.

--
Jeff Davis
PostgreSQL Contributor Team - AWS

Attachments:

v1-0001-Cache-opaque-handle-for-GUC-option-to-avoid-repea.patchtext/x-patch; charset=UTF-8; name=v1-0001-Cache-opaque-handle-for-GUC-option-to-avoid-repea.patchDownload

From 906cb1cdf42f92090d4a9acf296098ec3bfa53e0 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Mon, 4 Dec 2023 16:20:05 -0800
Subject: [PATCH v1] Cache opaque handle for GUC option to avoid repeasted
 lookups.

When setting GUCs from proconfig, performance is important, and hash
lookups in the GUC table are significant.

Per suggestion from Robert Haas.

Discussion: https://postgr.es/m/CA+TgmoYpKxhR3HOD9syK2XwcAUVPa0+ba0XPnwWBcYxtKLkyxA@mail.gmail.com
---
 src/backend/utils/fmgr/fmgr.c | 32 ++++++++++++-----
 src/backend/utils/misc/guc.c  | 67 ++++++++++++++++++++++++++++++++---
 src/include/utils/guc.h       |  8 +++++
 3 files changed, 94 insertions(+), 13 deletions(-)

diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 9dfdf890c5..6aac698e4b 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -612,7 +612,7 @@ struct fmgr_security_definer_cache
 {
 	FmgrInfo	flinfo;			/* lookup info for target function */
 	Oid			userid;			/* userid to set, or InvalidOid */
-	List	   *configNames;	/* GUC names to set, or NIL */
+	List	   *configHandles;	/* GUC handles to set, or NIL */
 	List	   *configValues;	/* GUC values to set, or NIL */
 	Datum		arg;			/* passthrough argument for plugin modules */
 };
@@ -670,11 +670,25 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 		if (!isnull)
 		{
 			ArrayType  *array;
+			ListCell   *lc;
+			List	   *names;
 
 			oldcxt = MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);
 			array = DatumGetArrayTypeP(datum);
-			TransformGUCArray(array, &fcache->configNames,
+			TransformGUCArray(array, &names,
 							  &fcache->configValues);
+
+			/* transform names to config handles to avoid lookup cost */
+			fcache->configHandles = NIL;
+			foreach(lc, names)
+			{
+				char	   *name = (char *) lfirst(lc);
+
+				fcache->configHandles = lappend(fcache->configHandles,
+												get_config_handle(name));
+			}
+			list_free_deep(names);
+
 			MemoryContextSwitchTo(oldcxt);
 		}
 
@@ -687,7 +701,7 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	/* GetUserIdAndSecContext is cheap enough that no harm in a wasted call */
 	GetUserIdAndSecContext(&save_userid, &save_sec_context);
-	if (fcache->configNames != NIL) /* Need a new GUC nesting level */
+	if (fcache->configHandles != NIL)	/* Need a new GUC nesting level */
 		save_nestlevel = NewGUCNestLevel();
 	else
 		save_nestlevel = 0;		/* keep compiler quiet */
@@ -696,17 +710,17 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 		SetUserIdAndSecContext(fcache->userid,
 							   save_sec_context | SECURITY_LOCAL_USERID_CHANGE);
 
-	forboth(lc1, fcache->configNames, lc2, fcache->configValues)
+	forboth(lc1, fcache->configHandles, lc2, fcache->configValues)
 	{
 		GucContext	context = superuser() ? PGC_SUSET : PGC_USERSET;
 		GucSource	source = PGC_S_SESSION;
 		GucAction	action = GUC_ACTION_SAVE;
-		char	   *name = lfirst(lc1);
+		config_handle *handle = lfirst(lc1);
 		char	   *value = lfirst(lc2);
 
-		(void) set_config_option(name, value,
-								 context, source,
-								 action, true, 0, false);
+		(void) set_config_with_handle(handle, value,
+									  context, source, GetUserId(),
+									  action, true, 0, false);
 	}
 
 	/* function manager hook */
@@ -749,7 +763,7 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	fcinfo->flinfo = save_flinfo;
 
-	if (fcache->configNames != NIL)
+	if (fcache->configHandles != NIL)
 		AtEOXact_GUC(true, save_nestlevel);
 	if (OidIsValid(fcache->userid))
 		SetUserIdAndSecContext(save_userid, save_sec_context);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e76c083003..641dc264df 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3357,6 +3357,52 @@ set_config_option_ext(const char *name, const char *value,
 					  bool is_reload)
 {
 	struct config_generic *record;
+
+	if (elevel == 0)
+	{
+		if (source == PGC_S_DEFAULT || source == PGC_S_FILE)
+		{
+			/*
+			 * To avoid cluttering the log, only the postmaster bleats loudly
+			 * about problems with the config file.
+			 */
+			elevel = IsUnderPostmaster ? DEBUG3 : LOG;
+		}
+		else if (source == PGC_S_GLOBAL ||
+				 source == PGC_S_DATABASE ||
+				 source == PGC_S_USER ||
+				 source == PGC_S_DATABASE_USER)
+			elevel = WARNING;
+		else
+			elevel = ERROR;
+	}
+
+	record = find_option(name, true, false, elevel);
+	if (record == NULL)
+		return 0;
+
+	return set_config_with_handle((config_handle *) record, value,
+								  context, source, srole,
+								  action, changeVal, elevel,
+								  is_reload);
+}
+
+
+/*
+ * set_config_option_ext: sets option with the given handle to the given
+ * value.
+ *
+ * This should be used by callers that repeatedly set the same config
+ * option(s), and want to avoid the overhead of a hash lookup each time.
+ */
+int
+set_config_with_handle(config_handle * handle, const char *value,
+					   GucContext context, GucSource source, Oid srole,
+					   GucAction action, bool changeVal, int elevel,
+					   bool is_reload)
+{
+	struct config_generic *record = (struct config_generic *) handle;
+	const char *name = record->name;
 	union config_var_val newval_union;
 	void	   *newextra = NULL;
 	bool		prohibitValueChange = false;
@@ -3395,10 +3441,6 @@ set_config_option_ext(const char *name, const char *value,
 				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 				 errmsg("cannot set parameters during a parallel operation")));
 
-	record = find_option(name, true, false, elevel);
-	if (record == NULL)
-		return 0;
-
 	/*
 	 * Check if the option can be set at this time. See guc.h for the precise
 	 * rules.
@@ -4166,6 +4208,23 @@ set_config_option_ext(const char *name, const char *value,
 }
 
 
+/*
+ * Retrieve a config_handle for the given name, suitable for calling
+ * set_config_with_handle().
+ */
+config_handle *
+get_config_handle(const char *name)
+{
+	struct config_generic *record;
+
+	record = find_option(name, false, false, 0);
+	if (record == NULL)
+		return 0;
+
+	return (config_handle *) record;
+}
+
+
 /*
  * Set the fields for source file and line number the setting came from.
  */
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 20fe13702b..5847776c39 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -144,6 +144,8 @@ typedef struct ConfigVariable
 	struct ConfigVariable *next;
 } ConfigVariable;
 
+typedef struct config_handle config_handle;
+
 extern bool ParseConfigFile(const char *config_file, bool strict,
 							const char *calling_file, int calling_lineno,
 							int depth, int elevel,
@@ -387,6 +389,12 @@ extern int	set_config_option_ext(const char *name, const char *value,
 								  Oid srole,
 								  GucAction action, bool changeVal, int elevel,
 								  bool is_reload);
+extern int	set_config_with_handle(config_handle * handle, const char *value,
+								   GucContext context, GucSource source,
+								   Oid srole,
+								   GucAction action, bool changeVal,
+								   int elevel, bool is_reload);
+extern config_handle * get_config_handle(const char *name);
 extern void AlterSystemSetConfigFile(AlterSystemStmt *altersysstmt);
 extern char *GetConfigOptionByName(const char *name, const char **varname,
 								   bool missing_ok);
-- 
2.34.1

#39

John Naylor

johncnaylorls@gmail.com

about 2 years ago

In reply to: Jeff Davis (#38)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, Dec 5, 2023 at 7:55 AM Jeff Davis <pgsql@j-davis.com> wrote:

On Tue, 2023-11-21 at 09:24 -0500, Robert Haas wrote:

As to the second, could we somehow come
up with an API for guc.c where you can ask for an opaque handle that
will later allow you to do a fast-SET of a GUC?

Yes, attached. That provides a significant speedup: my test goes from
around ~7300ms to ~6800ms.

This doesn't seem very controversial or complex, so I'll probably
commit this soon unless someone else has a comment.

+ * set_config_option_ext: sets option with the given handle to the given
+ * value.

Copy-paste-o of the other function name.

+config_handle *
+get_config_handle(const char *name)
+{
+ struct config_generic *record;
+
+ record = find_option(name, false, false, 0);
+ if (record == NULL)
+ return 0;

Part of this code this was copied from a function that returned int,
but this one returns a pointer.

#40

Jeff Davis

pgsql@j-davis.com

about 2 years ago

In reply to: John Naylor (#39)

1 attachment(s)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, 2023-12-05 at 23:22 +0700, John Naylor wrote:

Copy-paste-o of the other function name.

...

Part of this code this was copied from a function that returned int,
but this one returns a pointer.

Thank you, fixed.

Also, I forward-declared config_generic in guc.h to eliminate the cast.

Regards,
Jeff Davis

Attachments:

v2-0001-Cache-opaque-handle-for-GUC-option-to-avoid-repea.patchtext/x-patch; charset=UTF-8; name=v2-0001-Cache-opaque-handle-for-GUC-option-to-avoid-repea.patchDownload

From 72b00b1b094945845e4ea4d427e426eafd5650c2 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Mon, 4 Dec 2023 16:20:05 -0800
Subject: [PATCH v2] Cache opaque handle for GUC option to avoid repeasted
 lookups.

When setting GUCs from proconfig, performance is important, and hash
lookups in the GUC table are significant.

Per suggestion from Robert Haas.

Discussion: https://postgr.es/m/CA+TgmoYpKxhR3HOD9syK2XwcAUVPa0+ba0XPnwWBcYxtKLkyxA@mail.gmail.com
---
 src/backend/utils/fmgr/fmgr.c | 32 ++++++++++++------
 src/backend/utils/misc/guc.c  | 61 ++++++++++++++++++++++++++++++++---
 src/include/utils/guc.h       |  8 +++++
 3 files changed, 88 insertions(+), 13 deletions(-)

diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 9dfdf890c5..6aac698e4b 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -612,7 +612,7 @@ struct fmgr_security_definer_cache
 {
 	FmgrInfo	flinfo;			/* lookup info for target function */
 	Oid			userid;			/* userid to set, or InvalidOid */
-	List	   *configNames;	/* GUC names to set, or NIL */
+	List	   *configHandles;	/* GUC handles to set, or NIL */
 	List	   *configValues;	/* GUC values to set, or NIL */
 	Datum		arg;			/* passthrough argument for plugin modules */
 };
@@ -670,11 +670,25 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 		if (!isnull)
 		{
 			ArrayType  *array;
+			ListCell   *lc;
+			List	   *names;
 
 			oldcxt = MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);
 			array = DatumGetArrayTypeP(datum);
-			TransformGUCArray(array, &fcache->configNames,
+			TransformGUCArray(array, &names,
 							  &fcache->configValues);
+
+			/* transform names to config handles to avoid lookup cost */
+			fcache->configHandles = NIL;
+			foreach(lc, names)
+			{
+				char	   *name = (char *) lfirst(lc);
+
+				fcache->configHandles = lappend(fcache->configHandles,
+												get_config_handle(name));
+			}
+			list_free_deep(names);
+
 			MemoryContextSwitchTo(oldcxt);
 		}
 
@@ -687,7 +701,7 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	/* GetUserIdAndSecContext is cheap enough that no harm in a wasted call */
 	GetUserIdAndSecContext(&save_userid, &save_sec_context);
-	if (fcache->configNames != NIL) /* Need a new GUC nesting level */
+	if (fcache->configHandles != NIL)	/* Need a new GUC nesting level */
 		save_nestlevel = NewGUCNestLevel();
 	else
 		save_nestlevel = 0;		/* keep compiler quiet */
@@ -696,17 +710,17 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 		SetUserIdAndSecContext(fcache->userid,
 							   save_sec_context | SECURITY_LOCAL_USERID_CHANGE);
 
-	forboth(lc1, fcache->configNames, lc2, fcache->configValues)
+	forboth(lc1, fcache->configHandles, lc2, fcache->configValues)
 	{
 		GucContext	context = superuser() ? PGC_SUSET : PGC_USERSET;
 		GucSource	source = PGC_S_SESSION;
 		GucAction	action = GUC_ACTION_SAVE;
-		char	   *name = lfirst(lc1);
+		config_handle *handle = lfirst(lc1);
 		char	   *value = lfirst(lc2);
 
-		(void) set_config_option(name, value,
-								 context, source,
-								 action, true, 0, false);
+		(void) set_config_with_handle(handle, value,
+									  context, source, GetUserId(),
+									  action, true, 0, false);
 	}
 
 	/* function manager hook */
@@ -749,7 +763,7 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	fcinfo->flinfo = save_flinfo;
 
-	if (fcache->configNames != NIL)
+	if (fcache->configHandles != NIL)
 		AtEOXact_GUC(true, save_nestlevel);
 	if (OidIsValid(fcache->userid))
 		SetUserIdAndSecContext(save_userid, save_sec_context);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e76c083003..157e39d654 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3357,6 +3357,52 @@ set_config_option_ext(const char *name, const char *value,
 					  bool is_reload)
 {
 	struct config_generic *record;
+
+	if (elevel == 0)
+	{
+		if (source == PGC_S_DEFAULT || source == PGC_S_FILE)
+		{
+			/*
+			 * To avoid cluttering the log, only the postmaster bleats loudly
+			 * about problems with the config file.
+			 */
+			elevel = IsUnderPostmaster ? DEBUG3 : LOG;
+		}
+		else if (source == PGC_S_GLOBAL ||
+				 source == PGC_S_DATABASE ||
+				 source == PGC_S_USER ||
+				 source == PGC_S_DATABASE_USER)
+			elevel = WARNING;
+		else
+			elevel = ERROR;
+	}
+
+	record = find_option(name, true, false, elevel);
+	if (record == NULL)
+		return 0;
+
+	return set_config_with_handle(record, value,
+								  context, source, srole,
+								  action, changeVal, elevel,
+								  is_reload);
+}
+
+
+/*
+ * set_config_with_handle: sets option with the given handle to the given
+ * value.
+ *
+ * This should be used by callers that repeatedly set the same config
+ * option(s), and want to avoid the overhead of a hash lookup each time.
+ */
+int
+set_config_with_handle(config_handle * handle, const char *value,
+					   GucContext context, GucSource source, Oid srole,
+					   GucAction action, bool changeVal, int elevel,
+					   bool is_reload)
+{
+	struct config_generic *record = handle;
+	const char *name = record->name;
 	union config_var_val newval_union;
 	void	   *newextra = NULL;
 	bool		prohibitValueChange = false;
@@ -3395,10 +3441,6 @@ set_config_option_ext(const char *name, const char *value,
 				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 				 errmsg("cannot set parameters during a parallel operation")));
 
-	record = find_option(name, true, false, elevel);
-	if (record == NULL)
-		return 0;
-
 	/*
 	 * Check if the option can be set at this time. See guc.h for the precise
 	 * rules.
@@ -4166,6 +4208,17 @@ set_config_option_ext(const char *name, const char *value,
 }
 
 
+/*
+ * Retrieve a config_handle for the given name, suitable for calling
+ * set_config_with_handle().
+ */
+config_handle *
+get_config_handle(const char *name)
+{
+	return find_option(name, false, false, 0);
+}
+
+
 /*
  * Set the fields for source file and line number the setting came from.
  */
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 20fe13702b..a5a19ca6a1 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -144,6 +144,8 @@ typedef struct ConfigVariable
 	struct ConfigVariable *next;
 } ConfigVariable;
 
+typedef struct config_generic config_handle;
+
 extern bool ParseConfigFile(const char *config_file, bool strict,
 							const char *calling_file, int calling_lineno,
 							int depth, int elevel,
@@ -387,6 +389,12 @@ extern int	set_config_option_ext(const char *name, const char *value,
 								  Oid srole,
 								  GucAction action, bool changeVal, int elevel,
 								  bool is_reload);
+extern int	set_config_with_handle(config_handle * handle, const char *value,
+								   GucContext context, GucSource source,
+								   Oid srole,
+								   GucAction action, bool changeVal,
+								   int elevel, bool is_reload);
+extern config_handle * get_config_handle(const char *name);
 extern void AlterSystemSetConfigFile(AlterSystemStmt *altersysstmt);
 extern char *GetConfigOptionByName(const char *name, const char **varname,
 								   bool missing_ok);
-- 
2.34.1

#41

Jeff Davis

pgsql@j-davis.com

about 2 years ago

In reply to: Robert Haas (#37)

1 attachment(s)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, 2023-11-21 at 09:24 -0500, Robert Haas wrote:

As to the first, could we
remove the setjmp() and instead have abort-time processing know
something about this? For example, imagine we just push something
onto
a stack like we do for ErrorContextCallback, do whatever, and then
pop
it off. But if an error is thrown then the abort path knows to look
at
that variable and do whatever.

If I remove the TRY/CATCH entirely, it shows there's room for ~200ms
improvement in my test.

I attached a rough patch, which doesn't quite achieve that much, it's
more like ~100ms improvement and starts to fall within the noise. So
perhaps an improvement, but a bit disappointing. It's not a lot of
code, but it's not trivial either because the nesting level needs to be
tracked (so a subxact abort doesn't reset too much state).

Also, it's not quite as clean as it could be, because I went to some
effort to avoid an alloc/free by keeping the stack within the fcache. I
didn't pay a lot of attention to correctness in this particular patch;
I was mostly trying a few different formulations for performance
measurement.

I'm not inclined to commit this in its current form but if someone
thinks that it's a worthwhile direction, I can clean it up a bit and
reconsider.

Regards,
Jeff Davis

Attachments:

v1-0001-Eliminate-PG_TRY-CATCH-in-fmgr_security_definer.patchtext/x-patch; charset=UTF-8; name=v1-0001-Eliminate-PG_TRY-CATCH-in-fmgr_security_definer.patchDownload

From e48a54d9880ab65a1e5ad6d136b849bda2e4554e Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Thu, 7 Dec 2023 12:13:22 -0800
Subject: [PATCH v1] Eliminate PG_TRY/CATCH in fmgr_security_definer().

---
 src/backend/access/transam/xact.c |   4 ++
 src/backend/utils/fmgr/fmgr.c     | 107 +++++++++++++++++++-----------
 src/include/fmgr.h                |   2 +
 3 files changed, 73 insertions(+), 40 deletions(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 536edb3792..36e75cc657 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -196,6 +196,7 @@ typedef struct TransactionStateData
 	TBlockState blockState;		/* high-level state */
 	int			nestingLevel;	/* transaction nesting depth */
 	int			gucNestLevel;	/* GUC context nesting depth */
+	int			fmgrNestLevel;	/* fmgr wrapper nesting depth */
 	MemoryContext curTransactionContext;	/* my xact-lifetime context */
 	ResourceOwner curTransactionOwner;	/* my query resources */
 	TransactionId *childXids;	/* subcommitted child XIDs, in XID order */
@@ -2797,6 +2798,7 @@ AbortTransaction(void)
 	 * do abort processing
 	 */
 	AfterTriggerEndXact(false); /* 'false' means it's abort */
+	AtEOXact_Fmgr(false, 1);
 	AtAbort_Portals();
 	smgrDoPendingSyncs(false, is_parallel_worker);
 	AtEOXact_LargeObject(false);
@@ -5151,6 +5153,7 @@ AbortSubTransaction(void)
 	if (s->curTransactionOwner)
 	{
 		AfterTriggerEndSubXact(false);
+		AtEOXact_Fmgr(false, s->fmgrNestLevel);
 		AtSubAbort_Portals(s->subTransactionId,
 						   s->parent->subTransactionId,
 						   s->curTransactionOwner,
@@ -5294,6 +5297,7 @@ PushTransaction(void)
 	s->parent = p;
 	s->nestingLevel = p->nestingLevel + 1;
 	s->gucNestLevel = NewGUCNestLevel();
+	s->fmgrNestLevel = CurrentFmgrNestLevel();
 	s->savepointLevel = p->savepointLevel;
 	s->state = TRANS_DEFAULT;
 	s->blockState = TBLOCK_SUBBEGIN;
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 9dfdf890c5..76056d980b 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -610,13 +610,26 @@ fmgr_internal_function(const char *proname)
  */
 struct fmgr_security_definer_cache
 {
-	FmgrInfo	flinfo;			/* lookup info for target function */
+	FunctionCallInfo fcinfo;
+	FmgrInfo	flinfo_target;	/* lookup info for target function */
+	FmgrInfo   *flinfo_wrapper;	/* lookup info for target function */
 	Oid			userid;			/* userid to set, or InvalidOid */
+	int			fmgrNestLevel;
 	List	   *configNames;	/* GUC names to set, or NIL */
 	List	   *configValues;	/* GUC values to set, or NIL */
 	Datum		arg;			/* passthrough argument for plugin modules */
+	struct fmgr_security_definer_cache *previous;
 };
 
+static int FmgrNestLevel = 0;
+static struct fmgr_security_definer_cache *CurrentFmgrCache = NULL;
+
+int
+CurrentFmgrNestLevel()
+{
+	return FmgrNestLevel;
+}
+
 /*
  * Function handler for security-definer/proconfig/plugin-hooked functions.
  * We extract the OID of the actual function and do a fmgr lookup again.
@@ -632,12 +645,11 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 {
 	Datum		result;
 	struct fmgr_security_definer_cache *volatile fcache;
-	FmgrInfo   *save_flinfo;
 	Oid			save_userid;
 	int			save_sec_context;
 	ListCell   *lc1;
 	ListCell   *lc2;
-	volatile int save_nestlevel;
+	int			save_guc_nestlevel;
 	PgStat_FunctionCallUsage fcusage;
 
 	if (!fcinfo->flinfo->fn_extra)
@@ -651,10 +663,13 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 		fcache = MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
 										sizeof(*fcache));
 
-		fmgr_info_cxt_security(fcinfo->flinfo->fn_oid, &fcache->flinfo,
+		fcache->fcinfo = fcinfo;
+		fcache->flinfo_wrapper = fcinfo->flinfo;
+		fmgr_info_cxt_security(fcinfo->flinfo->fn_oid, &fcache->flinfo_target,
 							   fcinfo->flinfo->fn_mcxt, true);
-		fcache->flinfo.fn_expr = fcinfo->flinfo->fn_expr;
+		fcache->flinfo_target.fn_expr = fcinfo->flinfo->fn_expr;
 
+		fcache->fmgrNestLevel = ++FmgrNestLevel;
 		tuple = SearchSysCache1(PROCOID,
 								ObjectIdGetDatum(fcinfo->flinfo->fn_oid));
 		if (!HeapTupleIsValid(tuple))
@@ -688,9 +703,9 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 	/* GetUserIdAndSecContext is cheap enough that no harm in a wasted call */
 	GetUserIdAndSecContext(&save_userid, &save_sec_context);
 	if (fcache->configNames != NIL) /* Need a new GUC nesting level */
-		save_nestlevel = NewGUCNestLevel();
+		save_guc_nestlevel = NewGUCNestLevel();
 	else
-		save_nestlevel = 0;		/* keep compiler quiet */
+		save_guc_nestlevel = 0;		/* keep compiler quiet */
 
 	if (OidIsValid(fcache->userid))
 		SetUserIdAndSecContext(fcache->userid,
@@ -711,54 +726,66 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 
 	/* function manager hook */
 	if (fmgr_hook)
-		(*fmgr_hook) (FHET_START, &fcache->flinfo, &fcache->arg);
+		(*fmgr_hook) (FHET_START, &fcache->flinfo_target, &fcache->arg);
 
-	/*
-	 * We don't need to restore GUC or userid settings on error, because the
-	 * ensuing xact or subxact abort will do that.  The PG_TRY block is only
-	 * needed to clean up the flinfo link.
-	 */
-	save_flinfo = fcinfo->flinfo;
+	/* push onto stack */
+	fcache->previous = CurrentFmgrCache;
+	CurrentFmgrCache = fcache;
 
-	PG_TRY();
-	{
-		fcinfo->flinfo = &fcache->flinfo;
+	/* temporarily set to target function during call */
+	fcinfo->flinfo = &fcache->flinfo_target;
 
-		/* See notes in fmgr_info_cxt_security */
-		pgstat_init_function_usage(fcinfo, &fcusage);
+	/* See notes in fmgr_info_cxt_security */
+	pgstat_init_function_usage(fcinfo, &fcusage);
 
-		result = FunctionCallInvoke(fcinfo);
+	result = FunctionCallInvoke(fcinfo);
 
-		/*
-		 * We could be calling either a regular or a set-returning function,
-		 * so we have to test to see what finalize flag to use.
-		 */
-		pgstat_end_function_usage(&fcusage,
-								  (fcinfo->resultinfo == NULL ||
-								   !IsA(fcinfo->resultinfo, ReturnSetInfo) ||
-								   ((ReturnSetInfo *) fcinfo->resultinfo)->isDone != ExprMultipleResult));
-	}
-	PG_CATCH();
-	{
-		fcinfo->flinfo = save_flinfo;
-		if (fmgr_hook)
-			(*fmgr_hook) (FHET_ABORT, &fcache->flinfo, &fcache->arg);
-		PG_RE_THROW();
-	}
-	PG_END_TRY();
+	/*
+	 * We could be calling either a regular or a set-returning function,
+	 * so we have to test to see what finalize flag to use.
+	 */
+	pgstat_end_function_usage(&fcusage,
+							  (fcinfo->resultinfo == NULL ||
+							   !IsA(fcinfo->resultinfo, ReturnSetInfo) ||
+							   ((ReturnSetInfo *) fcinfo->resultinfo)->isDone != ExprMultipleResult));
 
-	fcinfo->flinfo = save_flinfo;
+	fcinfo->flinfo = fcache->flinfo_wrapper;
 
 	if (fcache->configNames != NIL)
-		AtEOXact_GUC(true, save_nestlevel);
+		AtEOXact_GUC(true, save_guc_nestlevel);
 	if (OidIsValid(fcache->userid))
 		SetUserIdAndSecContext(save_userid, save_sec_context);
 	if (fmgr_hook)
-		(*fmgr_hook) (FHET_END, &fcache->flinfo, &fcache->arg);
+		(*fmgr_hook) (FHET_END, &fcache->flinfo_target, &fcache->arg);
+
+	/* pop from stack */
+	CurrentFmgrCache = fcache->previous;
+	fcache->previous = NULL;
 
 	return result;
 }
 
+void
+AtEOXact_Fmgr(bool isCommit, int nestLevel)
+{
+	while (CurrentFmgrCache && CurrentFmgrCache->fmgrNestLevel >= nestLevel)
+	{
+		struct fmgr_security_definer_cache *fcache = CurrentFmgrCache;
+
+		/* reinstall wrapper */
+		fcache->fcinfo->flinfo = fcache->flinfo_wrapper;
+
+		if (fmgr_hook)
+			(*fmgr_hook) (isCommit ? FHET_END : FHET_ABORT,
+						  &fcache->flinfo_target,
+						  &fcache->arg);
+
+		CurrentFmgrCache = fcache->previous;
+		fcache->previous = NULL;
+	}
+
+	FmgrNestLevel = nestLevel - 1;
+}
 
 /*-------------------------------------------------------------------------
  *		Support routines for callers of fmgr-compatible functions
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index edf61e53f3..56e5018d23 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -735,6 +735,8 @@ extern bytea *get_fn_opclass_options(FmgrInfo *flinfo);
 extern bool has_fn_opclass_options(FmgrInfo *flinfo);
 extern void set_fn_opclass_options(FmgrInfo *flinfo, bytea *options);
 extern bool CheckFunctionValidatorAccess(Oid validatorOid, Oid functionOid);
+extern int CurrentFmgrNestLevel(void);
+extern void AtEOXact_Fmgr(bool isCommit, int nestLevel);
 
 /*
  * Routines in dfmgr.c
-- 
2.34.1

#42

Jeff Davis

pgsql@j-davis.com

about 2 years ago

In reply to: Jeff Davis (#40)

1 attachment(s)

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

On Tue, 2023-12-05 at 11:58 -0800, Jeff Davis wrote:

Also, I forward-declared config_generic in guc.h to eliminate the
cast.

Looking more closely, I fixed an issue related to placeholder configs.
We can't return a handle to a placeholder, because it's not stable, so
in that case it falls back to using the name.

My apologies for the churn on this (mostly) simple patch. I think this
version is correct; I intend to commit soon.

Regards,
Jeff Davis

Attachments:

v3-0001-Cache-opaque-handle-for-GUC-option-to-avoid-repea.patchtext/x-patch; charset=UTF-8; name=v3-0001-Cache-opaque-handle-for-GUC-option-to-avoid-repea.patchDownload

From 386100494f3bb977e5b9e52fa603898587b4d976 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Mon, 4 Dec 2023 16:20:05 -0800
Subject: [PATCH v3] Cache opaque handle for GUC option to avoid repeasted
 lookups.

When setting GUCs from proconfig, performance is important, and hash
lookups in the GUC table are significant.

Per suggestion from Robert Haas.

Discussion: https://postgr.es/m/CA+TgmoYpKxhR3HOD9syK2XwcAUVPa0+ba0XPnwWBcYxtKLkyxA@mail.gmail.com
Reviewed-by: John Naylor
---
 src/backend/utils/fmgr/fmgr.c | 39 ++++++++++++++++-----
 src/backend/utils/misc/guc.c  | 66 ++++++++++++++++++++++++++++++++---
 src/include/utils/guc.h       |  8 +++++
 3 files changed, 101 insertions(+), 12 deletions(-)

diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 9dfdf890c5..442d7b8269 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -613,6 +613,7 @@ struct fmgr_security_definer_cache
 	FmgrInfo	flinfo;			/* lookup info for target function */
 	Oid			userid;			/* userid to set, or InvalidOid */
 	List	   *configNames;	/* GUC names to set, or NIL */
+	List	   *configHandles;	/* GUC handles to set, or NIL */
 	List	   *configValues;	/* GUC values to set, or NIL */
 	Datum		arg;			/* passthrough argument for plugin modules */
 };
@@ -635,8 +636,9 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 	FmgrInfo   *save_flinfo;
 	Oid			save_userid;
 	int			save_sec_context;
-	ListCell   *lc1;
-	ListCell   *lc2;
+	ListCell   *lc1,
+			   *lc2,
+			   *lc3;
 	volatile int save_nestlevel;
 	PgStat_FunctionCallUsage fcusage;
 
@@ -670,11 +672,23 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 		if (!isnull)
 		{
 			ArrayType  *array;
+			ListCell   *lc;
 
 			oldcxt = MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);
 			array = DatumGetArrayTypeP(datum);
 			TransformGUCArray(array, &fcache->configNames,
 							  &fcache->configValues);
+
+			/* transform names to config handles to avoid lookup cost */
+			fcache->configHandles = NIL;
+			foreach(lc, fcache->configNames)
+			{
+				char	   *name = (char *) lfirst(lc);
+
+				fcache->configHandles = lappend(fcache->configHandles,
+												get_config_handle(name));
+			}
+
 			MemoryContextSwitchTo(oldcxt);
 		}
 
@@ -696,17 +710,26 @@ fmgr_security_definer(PG_FUNCTION_ARGS)
 		SetUserIdAndSecContext(fcache->userid,
 							   save_sec_context | SECURITY_LOCAL_USERID_CHANGE);
 
-	forboth(lc1, fcache->configNames, lc2, fcache->configValues)
+	forthree(lc1, fcache->configNames,
+			 lc2, fcache->configHandles,
+			 lc3, fcache->configValues)
 	{
 		GucContext	context = superuser() ? PGC_SUSET : PGC_USERSET;
 		GucSource	source = PGC_S_SESSION;
 		GucAction	action = GUC_ACTION_SAVE;
 		char	   *name = lfirst(lc1);
-		char	   *value = lfirst(lc2);
-
-		(void) set_config_option(name, value,
-								 context, source,
-								 action, true, 0, false);
+		config_handle *handle = lfirst(lc2);
+		char	   *value = lfirst(lc3);
+
+		/* handle not available if GUC is a placeholder */
+		if (handle)
+			(void) set_config_with_handle(handle, value,
+										  context, source, GetUserId(),
+										  action, true, 0, false);
+		else
+			(void) set_config_option(name, value,
+									 context, source,
+									 action, true, 0, false);
 	}
 
 	/* function manager hook */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e76c083003..4ba6ea6c83 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3357,6 +3357,52 @@ set_config_option_ext(const char *name, const char *value,
 					  bool is_reload)
 {
 	struct config_generic *record;
+
+	if (elevel == 0)
+	{
+		if (source == PGC_S_DEFAULT || source == PGC_S_FILE)
+		{
+			/*
+			 * To avoid cluttering the log, only the postmaster bleats loudly
+			 * about problems with the config file.
+			 */
+			elevel = IsUnderPostmaster ? DEBUG3 : LOG;
+		}
+		else if (source == PGC_S_GLOBAL ||
+				 source == PGC_S_DATABASE ||
+				 source == PGC_S_USER ||
+				 source == PGC_S_DATABASE_USER)
+			elevel = WARNING;
+		else
+			elevel = ERROR;
+	}
+
+	record = find_option(name, true, false, elevel);
+	if (record == NULL)
+		return 0;
+
+	return set_config_with_handle(record, value,
+								  context, source, srole,
+								  action, changeVal, elevel,
+								  is_reload);
+}
+
+
+/*
+ * set_config_with_handle: sets option with the given handle to the given
+ * value.
+ *
+ * This should be used by callers that repeatedly set the same config
+ * option(s), and want to avoid the overhead of a hash lookup each time.
+ */
+int
+set_config_with_handle(config_handle * handle, const char *value,
+					   GucContext context, GucSource source, Oid srole,
+					   GucAction action, bool changeVal, int elevel,
+					   bool is_reload)
+{
+	struct config_generic *record = handle;
+	const char *name = record->name;
 	union config_var_val newval_union;
 	void	   *newextra = NULL;
 	bool		prohibitValueChange = false;
@@ -3395,10 +3441,6 @@ set_config_option_ext(const char *name, const char *value,
 				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 				 errmsg("cannot set parameters during a parallel operation")));
 
-	record = find_option(name, true, false, elevel);
-	if (record == NULL)
-		return 0;
-
 	/*
 	 * Check if the option can be set at this time. See guc.h for the precise
 	 * rules.
@@ -4166,6 +4208,22 @@ set_config_option_ext(const char *name, const char *value,
 }
 
 
+/*
+ * Retrieve a config_handle for the given name, suitable for calling
+ * set_config_with_handle(). Only return handle to permanent GUC.
+ */
+config_handle *
+get_config_handle(const char *name)
+{
+	struct config_generic *gen = find_option(name, false, false, 0);
+
+	if (gen && ((gen->flags & GUC_CUSTOM_PLACEHOLDER) == 0))
+		return gen;
+
+	return NULL;
+}
+
+
 /*
  * Set the fields for source file and line number the setting came from.
  */
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 20fe13702b..a5a19ca6a1 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -144,6 +144,8 @@ typedef struct ConfigVariable
 	struct ConfigVariable *next;
 } ConfigVariable;
 
+typedef struct config_generic config_handle;
+
 extern bool ParseConfigFile(const char *config_file, bool strict,
 							const char *calling_file, int calling_lineno,
 							int depth, int elevel,
@@ -387,6 +389,12 @@ extern int	set_config_option_ext(const char *name, const char *value,
 								  Oid srole,
 								  GucAction action, bool changeVal, int elevel,
 								  bool is_reload);
+extern int	set_config_with_handle(config_handle * handle, const char *value,
+								   GucContext context, GucSource source,
+								   Oid srole,
+								   GucAction action, bool changeVal,
+								   int elevel, bool is_reload);
+extern config_handle * get_config_handle(const char *name);
 extern void AlterSystemSetConfigFile(AlterSystemStmt *altersysstmt);
 extern char *GetConfigOptionByName(const char *name, const char **varname,
 								   bool missing_ok);
-- 
2.34.1