our checks for read-only queries are not great

Started by Robert Haasabout 6 years ago37 messages
#1Robert Haas
robertmhaas@gmail.com
1 attachment(s)

I spent some time studying the question of how we classify queries as
either read-only or not, and our various definitions of read-only, and
found some bugs. Specifically:

1. check_xact_readonly() prohibits most kinds of DDL in read-only
transactions, but a bunch of recently-added commands were not added to
the list. The missing node types are T_CreatePolicyStmt,
T_AlterPolicyStmt, T_CreateAmStmt, T_CreateStatsStmt,
T_AlterStatsStmt, and T_AlterCollationStmt, which means you can run
these commands in a read-only transaction with no problem and even
attempt to run them on a standby. The ones I tested on a standby all
fail with random-ish error messages due to lower-level checks, but
that's not a great situation.

2. There are comments in utility.c which assert that certain commands
are "forbidden in parallel mode due to CommandIsReadOnly." That's
technically correct, but silly and misleading.These commands wouldn't
be running in parallel mode unless they were running inside of a
function or procedure or something, and, if they are,
CommandIsReadOnly() checks in spi.c or functions.c would prevent not
only these commands but, in fact, all utility commands, so calling out
those particular ones is just adding confusion. Also, the underlying
restriction is unnecessary, because there's no good reason to prevent
the use of things like SHOW and DO in parallel mode, yet we currently
do.

The problems mentioned under (1) are technically the fault of the
people who wrote, reviewed, and committed the patches which added
those command types, and the problems mentioned under (2) are
basically my fault, dating back to the original ParallelContext patch.
However, I think that all of them can be tracked back to a more
fundamental underlying cause, which is that the way that the various
restrictions on read-write queries are implemented is pretty
confusing. Attached is a patch I wrote to try to improve things. It
centralizes three decisions that are currently made in different
places in a single place: (a) can this be run in a read only
transaction? (b) can it run in parallel mode? (c) can it run on a
standby? -- and along the way, it fixes the problems mentioned above
and tries to supply slightly improved comments. Perhaps we should
back-patch fixes at least for (1) even if this gets committed, but I
guess my first question is what people think of this approach to the
problem.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0001-Fix-problems-with-read-only-query-checks-and-refacto.patchapplication/octet-stream; name=0001-Fix-problems-with-read-only-query-checks-and-refacto.patchDownload
From 54a87701f3dfa3e94aaeac76cda16a1c551fa4f3 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 8 Jan 2020 12:39:15 -0500
Subject: [PATCH] Fix problems with "read only query" checks, and refactor the
 code.

Previously, check_xact_readonly() was responsible for determining which
types of queries could not be run in a read-only transaction,
standard_ProcessUtility() was responsibility for prohibiting things
which were allowed in read only transactions but not in recovery, and
utility commands were basically prohibited in bulk in parallel mode by
calls to CommandIsReadOnly() in functions.c and spi.c.  This situation
was confusing and error-prone. Accordingly, move all the checks to a new
function ClassifyUtilityCommandAsReadOnly, which determines the degree
to which a given statement is read only.

In the old code, check_xact_readonly() inadvertently failed to handle
several statement types that actually should have been prohibited,
specifically T_CreatePolicyStmt, T_AlterPolicyStmt, T_CreateAmStmt,
T_CreateStatsStmt, T_AlterStatsStmt, and T_AlterCollationStmt.  As a
result, thes statements were erroneously allowed in read only
transactions, parallel queries, and standby operation. Generally, they
would fail anyway due to some lower-level error check, but we
shouldn't rely on that.  In the new code structure, future omissions
of this type should cause ClassifyUtilityCommandAsReadOnly to complain
about an unrecognized node type.

As a fringe benefit, this means we can allow certain types of utility
commands in parallel mode, where it's safe to do so. This allows
ALTER SYSTEM, CALL, DO, CHECKPOINT, COPY FROM, EXPLAIN, and SHOW.
It might be possible to allow additional commands with more work
and thought.
---
 src/backend/commands/copy.c      |   1 -
 src/backend/commands/lockcmds.c  |  11 -
 src/backend/executor/functions.c |   3 -
 src/backend/executor/spi.c       |  31 +--
 src/backend/tcop/utility.c       | 342 ++++++++++++++++++++++---------
 src/include/tcop/utility.h       |  35 ++++
 6 files changed, 288 insertions(+), 135 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index c93a788798..40a8ec1abd 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1064,7 +1064,6 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
 		/* check read-only transaction and parallel mode */
 		if (XactReadOnly && !rel->rd_islocaltemp)
 			PreventCommandIfReadOnly("COPY FROM");
-		PreventCommandIfParallelMode("COPY FROM");
 
 		cstate = BeginCopyFrom(pstate, rel, stmt->filename, stmt->is_program,
 							   NULL, stmt->attlist, stmt->options);
diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
index 75410f04de..329ab849c0 100644
--- a/src/backend/commands/lockcmds.c
+++ b/src/backend/commands/lockcmds.c
@@ -42,17 +42,6 @@ LockTableCommand(LockStmt *lockstmt)
 {
 	ListCell   *p;
 
-	/*---------
-	 * During recovery we only accept these variations:
-	 * LOCK TABLE foo IN ACCESS SHARE MODE
-	 * LOCK TABLE foo IN ROW SHARE MODE
-	 * LOCK TABLE foo IN ROW EXCLUSIVE MODE
-	 * This test must match the restrictions defined in LockAcquireExtended()
-	 *---------
-	 */
-	if (lockstmt->mode > RowExclusiveLock)
-		PreventCommandDuringRecovery("LOCK TABLE");
-
 	/*
 	 * Iterate over the list and process the named relations one at a time
 	 */
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e8e1957075..5cff6c4321 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -540,9 +540,6 @@ init_execution_state(List *queryTree_list,
 						 errmsg("%s is not allowed in a non-volatile function",
 								CreateCommandTag((Node *) stmt))));
 
-			if (IsInParallelMode() && !CommandIsReadOnly(stmt))
-				PreventCommandIfParallelMode(CreateCommandTag((Node *) stmt));
-
 			/* OK, build the execution_state for this query */
 			newes = (execution_state *) palloc(sizeof(execution_state));
 			if (preves)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 4a6e82b605..c46764bf42 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1450,14 +1450,13 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	portal->queryEnv = _SPI_current->queryEnv;
 
 	/*
-	 * If told to be read-only, or in parallel mode, verify that this query is
-	 * in fact read-only.  This can't be done earlier because we need to look
-	 * at the finished, planned queries.  (In particular, we don't want to do
-	 * it between GetCachedPlan and PortalDefineQuery, because throwing an
-	 * error between those steps would result in leaking our plancache
-	 * refcount.)
+	 * If told to be read-only, we'd better check for read-only queries. This
+	 * can't be done earlier because we need to look at the finished, planned
+	 * queries.  (In particular, we don't want to do it between GetCachedPlan
+	 * and PortalDefineQuery, because throwing an error between those steps
+	 * would result in leaking our plancache refcount.)
 	 */
-	if (read_only || IsInParallelMode())
+	if (read_only)
 	{
 		ListCell   *lc;
 
@@ -1466,16 +1465,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 			PlannedStmt *pstmt = lfirst_node(PlannedStmt, lc);
 
 			if (!CommandIsReadOnly(pstmt))
-			{
-				if (read_only)
-					ereport(ERROR,
-							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					/* translator: %s is a SQL statement name */
-							 errmsg("%s is not allowed in a non-volatile function",
-									CreateCommandTag((Node *) pstmt))));
-				else
-					PreventCommandIfParallelMode(CreateCommandTag((Node *) pstmt));
-			}
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				/* translator: %s is a SQL statement name */
+						 errmsg("%s is not allowed in a non-volatile function",
+								CreateCommandTag((Node *) pstmt))));
 		}
 	}
 
@@ -2263,9 +2257,6 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 						 errmsg("%s is not allowed in a non-volatile function",
 								CreateCommandTag((Node *) stmt))));
 
-			if (IsInParallelMode() && !CommandIsReadOnly(stmt))
-				PreventCommandIfParallelMode(CreateCommandTag((Node *) stmt));
-
 			/*
 			 * If not read-only mode, advance the command counter before each
 			 * command and update the snapshot.
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index b2c58bf862..55cfb9fb7b 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -75,6 +75,7 @@
 ProcessUtility_hook_type ProcessUtility_hook = NULL;
 
 /* local function declarations */
+static int ClassifyUtilityCommandAsReadOnly(Node *parsetree);
 static void ProcessUtilitySlow(ParseState *pstate,
 							   PlannedStmt *pstmt,
 							   const char *queryString,
@@ -124,106 +125,261 @@ CommandIsReadOnly(PlannedStmt *pstmt)
 }
 
 /*
- * check_xact_readonly: is a utility command read-only?
+ * Determine the degree to which a utility command is read only.
  *
- * Here we use the loose rules of XactReadOnly mode: no permanent effects
- * on the database are allowed.
+ * Some commands are not at all read-only, such as DDL statements that make
+ * catalog entries; and some are entirely read-only, such as fetching a row
+ * from a cursor. There's some middle ground: see src/include/utility/tcop.h
+ * for details.
  */
-static void
-check_xact_readonly(Node *parsetree)
+int
+ClassifyUtilityCommandAsReadOnly(Node *parsetree)
 {
-	/* Only perform the check if we have a reason to do so. */
-	if (!XactReadOnly && !IsInParallelMode())
-		return;
-
-	/*
-	 * Note: Commands that need to do more complicated checking are handled
-	 * elsewhere, in particular COPY and plannable statements do their own
-	 * checking.  However they should all call PreventCommandIfReadOnly or
-	 * PreventCommandIfParallelMode to actually throw the error.
-	 */
-
 	switch (nodeTag(parsetree))
 	{
-		case T_AlterDatabaseStmt:
+		case T_AlterCollationStmt:
 		case T_AlterDatabaseSetStmt:
+		case T_AlterDatabaseStmt:
+		case T_AlterDefaultPrivilegesStmt:
 		case T_AlterDomainStmt:
+		case T_AlterEnumStmt:
+		case T_AlterEventTrigStmt:
+		case T_AlterExtensionContentsStmt:
+		case T_AlterExtensionStmt:
+		case T_AlterFdwStmt:
+		case T_AlterForeignServerStmt:
 		case T_AlterFunctionStmt:
-		case T_AlterRoleStmt:
-		case T_AlterRoleSetStmt:
 		case T_AlterObjectDependsStmt:
 		case T_AlterObjectSchemaStmt:
-		case T_AlterOwnerStmt:
+		case T_AlterOpFamilyStmt:
 		case T_AlterOperatorStmt:
+		case T_AlterOwnerStmt:
+		case T_AlterPolicyStmt:
+		case T_AlterPublicationStmt:
+		case T_AlterRoleSetStmt:
+		case T_AlterRoleStmt:
 		case T_AlterSeqStmt:
+		case T_AlterStatsStmt:
+		case T_AlterSubscriptionStmt:
+		case T_AlterTSConfigurationStmt:
+		case T_AlterTSDictionaryStmt:
 		case T_AlterTableMoveAllStmt:
+		case T_AlterTableSpaceOptionsStmt:
 		case T_AlterTableStmt:
-		case T_RenameStmt:
+		case T_AlterUserMappingStmt:
 		case T_CommentStmt:
-		case T_DefineStmt:
+		case T_CompositeTypeStmt:
+		case T_CreateAmStmt:
 		case T_CreateCastStmt:
-		case T_CreateEventTrigStmt:
-		case T_AlterEventTrigStmt:
 		case T_CreateConversionStmt:
-		case T_CreatedbStmt:
 		case T_CreateDomainStmt:
+		case T_CreateEnumStmt:
+		case T_CreateEventTrigStmt:
+		case T_CreateExtensionStmt:
+		case T_CreateFdwStmt:
+		case T_CreateForeignServerStmt:
+		case T_CreateForeignTableStmt:
 		case T_CreateFunctionStmt:
-		case T_CreateRoleStmt:
-		case T_IndexStmt:
-		case T_CreatePLangStmt:
 		case T_CreateOpClassStmt:
 		case T_CreateOpFamilyStmt:
-		case T_AlterOpFamilyStmt:
-		case T_RuleStmt:
+		case T_CreatePLangStmt:
+		case T_CreatePolicyStmt:
+		case T_CreatePublicationStmt:
+		case T_CreateRangeStmt:
+		case T_CreateRoleStmt:
 		case T_CreateSchemaStmt:
 		case T_CreateSeqStmt:
+		case T_CreateStatsStmt:
 		case T_CreateStmt:
+		case T_CreateSubscriptionStmt:
 		case T_CreateTableAsStmt:
-		case T_RefreshMatViewStmt:
 		case T_CreateTableSpaceStmt:
 		case T_CreateTransformStmt:
 		case T_CreateTrigStmt:
-		case T_CompositeTypeStmt:
-		case T_CreateEnumStmt:
-		case T_CreateRangeStmt:
-		case T_AlterEnumStmt:
-		case T_ViewStmt:
+		case T_CreateUserMappingStmt:
+		case T_CreatedbStmt:
+		case T_DefineStmt:
+		case T_DropOwnedStmt:
+		case T_DropRoleStmt:
 		case T_DropStmt:
-		case T_DropdbStmt:
+		case T_DropSubscriptionStmt:
 		case T_DropTableSpaceStmt:
-		case T_DropRoleStmt:
-		case T_GrantStmt:
-		case T_GrantRoleStmt:
-		case T_AlterDefaultPrivilegesStmt:
-		case T_TruncateStmt:
-		case T_DropOwnedStmt:
-		case T_ReassignOwnedStmt:
-		case T_AlterTSDictionaryStmt:
-		case T_AlterTSConfigurationStmt:
-		case T_CreateExtensionStmt:
-		case T_AlterExtensionStmt:
-		case T_AlterExtensionContentsStmt:
-		case T_CreateFdwStmt:
-		case T_AlterFdwStmt:
-		case T_CreateForeignServerStmt:
-		case T_AlterForeignServerStmt:
-		case T_CreateUserMappingStmt:
-		case T_AlterUserMappingStmt:
 		case T_DropUserMappingStmt:
-		case T_AlterTableSpaceOptionsStmt:
-		case T_CreateForeignTableStmt:
+		case T_DropdbStmt:
+		case T_GrantRoleStmt:
+		case T_GrantStmt:
 		case T_ImportForeignSchemaStmt:
+		case T_IndexStmt:
+		case T_ReassignOwnedStmt:
+		case T_RefreshMatViewStmt:
+		case T_RenameStmt:
+		case T_RuleStmt:
 		case T_SecLabelStmt:
-		case T_CreatePublicationStmt:
-		case T_AlterPublicationStmt:
-		case T_CreateSubscriptionStmt:
-		case T_AlterSubscriptionStmt:
-		case T_DropSubscriptionStmt:
-			PreventCommandIfReadOnly(CreateCommandTag(parsetree));
-			PreventCommandIfParallelMode(CreateCommandTag(parsetree));
-			break;
+		case T_TruncateStmt:
+		case T_ViewStmt:
+			{
+				/* DDL is not read-only, and neither is TRUNCATE. */
+				return COMMAND_IS_NOT_READ_ONLY;
+			}
+
+		case T_AlterSystemStmt:
+			{
+				/*
+				 * It's sort of strange that we treat this as a read-only
+				 * command, but since postgresql.auto.conf isn't WAL-logged
+				 * or stored in shared buffers and is separately writeable on
+				 * the standby, it's fine.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_CallStmt:
+		case T_DoStmt:
+			{
+				/*
+				 * Commands inside the DO block or the called procedure might
+				 * not be read only, but they'll be checked separately when we
+				 * try to execute them.  Here we only need to worry about the
+				 * DO or CALL command itself.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_CheckPointStmt:
+			{
+				/*
+				 * You might think that this should not be permitted in
+				 * recovery, but we interpret a CHECKPOINT command during
+				 * recovery as a request for a restartpoint instead. We allow
+				 * this since it can be a useful way of reducing switchover
+				 * time when using various forms of replication.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ClosePortalStmt:
+		case T_ConstraintsSetStmt:
+		case T_DeallocateStmt:
+		case T_DeclareCursorStmt:
+		case T_DiscardStmt:
+		case T_ExecuteStmt:
+		case T_FetchStmt:
+		case T_LoadStmt:
+		case T_PrepareStmt:
+		case T_UnlistenStmt:
+		case T_VariableSetStmt:
+			{
+				/*
+				 * These modify only backend-local state, so they're OK to
+				 * run in a read-only transaction or on a standby. However,
+				 * they are disallowed in parallel mode, because they either
+				 * rely upon or modify backend-local state that might not be
+				 * synchronized among cooperating backends.
+				 */
+				return COMMAND_OK_IN_RECOVERY | COMMAND_OK_IN_READ_ONLY_TXN;
+			}
+
+		case T_ClusterStmt:
+		case T_ReindexStmt:
+		case T_VacuumStmt:
+			{
+				/*
+				 * These commands are not read-only in the strict sense: they
+				 * do modify database blocks and system catalog entries.
+				 * However, we choose to allow them during "read only"
+				 * transactions.
+				 */
+				return COMMAND_IS_WEAKLY_READ_ONLY;
+			}
+
+		case T_CopyStmt:
+			{
+				CopyStmt *stmt = (CopyStmt *) parsetree;
+
+				/*
+				 * You might think that COPY FROM is not at all read only,
+				 * but we have for many years permitted in a "read only"
+				 * transaction if the target is a temporary table. If the
+				 * target table turns out to be non-temporary, DoCopy itself
+				 * will call PreventCommandIfReadOnly.
+				 */
+				if (stmt->is_from)
+					return COMMAND_IS_WEAKLY_READ_ONLY;
+				else
+					return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ExplainStmt:
+		case T_VariableShowStmt:
+			{
+				/*
+				 * These commands don't modify any data and are safe to run
+				 * in a parallel worker.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ListenStmt:
+		case T_NotifyStmt:
+			{
+				/*
+				 * NOTIFY requires an XID assignment, so it can't be permitted
+				 * on a standby. Perhaps LISTEN could, since without NOTIFY
+				 * it would be OK to just do nothing, at least until promotion,
+				 * but we currently prohibit it lest the user get the wrong
+				 * idea.
+				 *
+				 * (We do allow T_UnlistenStmt on a standby, though, because
+				 * it's a no-op.)
+				 */
+				return COMMAND_IS_WEAKLY_READ_ONLY;
+			}
+
+		case T_LockStmt:
+			{
+				LockStmt *stmt = (LockStmt *) parsetree;
+
+				/*
+				 * Only weaker locker modes are allowed during recovery. The
+				 * restrictions here must match those in LockAcquireExtended().
+				 */
+				if (stmt->mode > RowExclusiveLock)
+					return COMMAND_IS_WEAKLY_READ_ONLY;
+				else
+					return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_TransactionStmt:
+			{
+				TransactionStmt *stmt = (TransactionStmt *) parsetree;
+
+				/*
+				 * We choose to allow PREPARE, COMMIT PREPARED, and ROLLBACK
+				 * PREPARED in read-only transactions, but all of them modify
+				 * persistent on disk state and so are not read-only in the
+				 * strict sense.
+				 */
+				switch (stmt->kind)
+				{
+					case TRANS_STMT_BEGIN:
+					case TRANS_STMT_START:
+					case TRANS_STMT_COMMIT:
+					case TRANS_STMT_ROLLBACK:
+					case TRANS_STMT_SAVEPOINT:
+					case TRANS_STMT_RELEASE:
+					case TRANS_STMT_ROLLBACK_TO:
+						return COMMAND_IS_STRICTLY_READ_ONLY;
+
+					case TRANS_STMT_PREPARE:
+					case TRANS_STMT_COMMIT_PREPARED:
+					case TRANS_STMT_ROLLBACK_PREPARED:
+						return COMMAND_IS_WEAKLY_READ_ONLY;
+				}
+			}
+
 		default:
-			/* do nothing */
+			elog(ERROR, "unrecognized node type: %d",
+				 (int) nodeTag(parsetree));
 			break;
 	}
 }
@@ -231,8 +387,8 @@ check_xact_readonly(Node *parsetree)
 /*
  * PreventCommandIfReadOnly: throw error if XactReadOnly
  *
- * This is useful mainly to ensure consistency of the error message wording;
- * most callers have checked XactReadOnly for themselves.
+ * This is useful partly to ensure consistency of the error message wording;
+ * some callers have checked XactReadOnly for themselves.
  */
 void
 PreventCommandIfReadOnly(const char *cmdname)
@@ -249,8 +405,8 @@ PreventCommandIfReadOnly(const char *cmdname)
  * PreventCommandIfParallelMode: throw error if current (sub)transaction is
  * in parallel mode.
  *
- * This is useful mainly to ensure consistency of the error message wording;
- * most callers have checked IsInParallelMode() for themselves.
+ * This is useful partly to ensure consistency of the error message wording;
+ * some callers have checked IsInParallelMode() for themselves.
  */
 void
 PreventCommandIfParallelMode(const char *cmdname)
@@ -385,11 +541,25 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 	bool		isTopLevel = (context == PROCESS_UTILITY_TOPLEVEL);
 	bool		isAtomicContext = (!(context == PROCESS_UTILITY_TOPLEVEL || context == PROCESS_UTILITY_QUERY_NONATOMIC) || IsTransactionBlock());
 	ParseState *pstate;
+	int			readonly_flags;
 
 	/* This can recurse, so check for excessive recursion */
 	check_stack_depth();
 
-	check_xact_readonly(parsetree);
+	/* Prohibit read/write commands in read-only states. */
+	readonly_flags = ClassifyUtilityCommandAsReadOnly(parsetree);
+	if (readonly_flags != COMMAND_IS_STRICTLY_READ_ONLY &&
+		(XactReadOnly || IsInParallelMode()))
+	{
+		const char *commandtag = CreateCommandTag(parsetree);
+
+		if ((readonly_flags & COMMAND_OK_IN_READ_ONLY_TXN) == 0)
+			PreventCommandIfReadOnly(commandtag);
+		if ((readonly_flags & COMMAND_OK_IN_PARALLEL_MODE) == 0)
+			PreventCommandIfParallelMode(commandtag);
+		if ((readonly_flags & COMMAND_OK_IN_RECOVERY) == 0)
+			PreventCommandDuringRecovery(commandtag);
+	}
 
 	if (completionTag)
 		completionTag[0] = '\0';
@@ -449,7 +619,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 						break;
 
 					case TRANS_STMT_PREPARE:
-						PreventCommandDuringRecovery("PREPARE TRANSACTION");
 						if (!PrepareTransactionBlock(stmt->gid))
 						{
 							/* report unsuccessful commit in completionTag */
@@ -460,13 +629,11 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 
 					case TRANS_STMT_COMMIT_PREPARED:
 						PreventInTransactionBlock(isTopLevel, "COMMIT PREPARED");
-						PreventCommandDuringRecovery("COMMIT PREPARED");
 						FinishPreparedTransaction(stmt->gid, true);
 						break;
 
 					case TRANS_STMT_ROLLBACK_PREPARED:
 						PreventInTransactionBlock(isTopLevel, "ROLLBACK PREPARED");
-						PreventCommandDuringRecovery("ROLLBACK PREPARED");
 						FinishPreparedTransaction(stmt->gid, false);
 						break;
 
@@ -607,7 +774,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				NotifyStmt *stmt = (NotifyStmt *) parsetree;
 
-				PreventCommandDuringRecovery("NOTIFY");
 				Async_Notify(stmt->conditionname, stmt->payload);
 			}
 			break;
@@ -616,7 +782,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ListenStmt *stmt = (ListenStmt *) parsetree;
 
-				PreventCommandDuringRecovery("LISTEN");
 				CheckRestrictedOperation("LISTEN");
 				Async_Listen(stmt->conditionname);
 			}
@@ -626,7 +791,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				UnlistenStmt *stmt = (UnlistenStmt *) parsetree;
 
-				/* we allow UNLISTEN during recovery, as it's a noop */
 				CheckRestrictedOperation("UNLISTEN");
 				if (stmt->conditionname)
 					Async_Unlisten(stmt->conditionname);
@@ -650,22 +814,11 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			break;
 
 		case T_ClusterStmt:
-			/* we choose to allow this during "read only" transactions */
-			PreventCommandDuringRecovery("CLUSTER");
-			/* forbidden in parallel mode due to CommandIsReadOnly */
 			cluster((ClusterStmt *) parsetree, isTopLevel);
 			break;
 
 		case T_VacuumStmt:
-			{
-				VacuumStmt *stmt = (VacuumStmt *) parsetree;
-
-				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery(stmt->is_vacuumcmd ?
-											 "VACUUM" : "ANALYZE");
-				/* forbidden in parallel mode due to CommandIsReadOnly */
-				ExecVacuum(pstate, stmt, isTopLevel);
-			}
+			ExecVacuum(pstate, (VacuumStmt *) parsetree, isTopLevel);
 			break;
 
 		case T_ExplainStmt:
@@ -740,7 +893,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			 * outside a transaction block is presumed to be user error.
 			 */
 			RequireTransactionBlock(isTopLevel, "LOCK TABLE");
-			/* forbidden in parallel mode due to CommandIsReadOnly */
 			LockTableCommand((LockStmt *) parsetree);
 			break;
 
@@ -755,13 +907,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 						(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 						 errmsg("must be superuser to do CHECKPOINT")));
 
-			/*
-			 * You might think we should have a PreventCommandDuringRecovery()
-			 * here, but we interpret a CHECKPOINT command during recovery as
-			 * a request for a restartpoint instead. We allow this since it
-			 * can be a useful way of reducing switchover time when using
-			 * various forms of replication.
-			 */
 			RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
 							  (RecoveryInProgress() ? 0 : CHECKPOINT_FORCE));
 			break;
@@ -774,9 +919,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 					PreventInTransactionBlock(isTopLevel,
 											  "REINDEX CONCURRENTLY");
 
-				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery("REINDEX");
-				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
diff --git a/src/include/tcop/utility.h b/src/include/tcop/utility.h
index 40551c45af..6cc407f793 100644
--- a/src/include/tcop/utility.h
+++ b/src/include/tcop/utility.h
@@ -25,6 +25,41 @@ typedef enum
 	PROCESS_UTILITY_SUBCOMMAND	/* a portion of a query */
 } ProcessUtilityContext;
 
+/*
+ * These constants are used to describe the extent to which a particular
+ * command is read-only.
+ *
+ * COMMAND_OK_IN_READ_ONLY_TXN means that the command is permissible even when
+ * XactReadOnly is set. Generally, this should be false for commands that
+ * modify tuples, including tuples in system tables; we sometimes permit it
+ * for utility commands that do not directly modify tuples (e.g. REINDEX,
+ * COMMIT PREPARED).
+ *
+ * COMMAND_OK_IN_PARALLEL_MODE means that the command is permissible even
+ * when in parallel mode. Writing tuples is forbidden, as is anything that
+ * might confuse cooperating processes.
+ *
+ * COMMAND_OK_IN_RECOVERY means that the command is permissible even when in
+ * recovery. It can't write WAL, nor can it do things that would imperil
+ * replay of future WAL received from the master.
+ */
+#define COMMAND_OK_IN_READ_ONLY_TXN	0x0001
+#define COMMAND_OK_IN_PARALLEL_MODE	0x0002
+#define	COMMAND_OK_IN_RECOVERY		0x0004
+
+/*
+ * We say that a command is strictly read-only if it is sufficiently read-only
+ * for all purposes, and weakly read-only if it is sufficiently read-only to
+ * be run in a read-only transaction but nothing more. Other commands are
+ * not read-only.
+ */
+#define COMMAND_IS_STRICTLY_READ_ONLY \
+	(COMMAND_OK_IN_READ_ONLY_TXN | COMMAND_OK_IN_RECOVERY | \
+	 COMMAND_OK_IN_PARALLEL_MODE)
+#define COMMAND_IS_WEAKLY_READ_ONLY \
+	COMMAND_OK_IN_READ_ONLY_TXN
+#define COMMAND_IS_NOT_READ_ONLY	0
+
 /* Hook for plugins to get control in ProcessUtility() */
 typedef void (*ProcessUtility_hook_type) (PlannedStmt *pstmt,
 										  const char *queryString, ProcessUtilityContext context,
-- 
2.17.2 (Apple Git-113)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#1)
Re: our checks for read-only queries are not great

Robert Haas <robertmhaas@gmail.com> writes:

I spent some time studying the question of how we classify queries as
either read-only or not, and our various definitions of read-only, and
found some bugs. ...
However, I think that all of them can be tracked back to a more
fundamental underlying cause, which is that the way that the various
restrictions on read-write queries are implemented is pretty
confusing. Attached is a patch I wrote to try to improve things.

Hmm. I like the idea of deciding this in one place and insisting that
that one place have a switch case for every statement type. That'll
address the root issue that people fail to think about this when adding
new statements.

I'm less enamored of some of the specific decisions here. Notably

* I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
than what it replaces. The test for LockStmt is an example --- the
comment talks about restricting locks during recovery, which is fine and
understandable, but then it's completely unobvious that the actual code
implements that behavior rather than some other one.

* ALTER SYSTEM SET is readonly? Say what? Even if that's how the current
code behaves, it's a damfool idea and we should change it. I think that
the semantics we are really trying to implement for read-only is "has no
effects visible outside the current session" --- this explains, for
example, why copying into a temp table is OK. ALTER SYSTEM SET certainly
isn't that.

I haven't read all of the code; those were just a couple points that
jumped out at me.

I think if we can sort out the notation for how the restrictions
are expressed, this'll be a good improvement.

regards, tom lane

#3Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: our checks for read-only queries are not great

On Wed, Jan 8, 2020 at 2:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm. I like the idea of deciding this in one place and insisting that
that one place have a switch case for every statement type. That'll
address the root issue that people fail to think about this when adding
new statements.

Right. Assuming they test their new command at least one time, they
should notice. :-)

I'm less enamored of some of the specific decisions here. Notably

* I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
than what it replaces. The test for LockStmt is an example --- the
comment talks about restricting locks during recovery, which is fine and
understandable, but then it's completely unobvious that the actual code
implements that behavior rather than some other one.

Uh, suggestions?

* ALTER SYSTEM SET is readonly? Say what? Even if that's how the current
code behaves, it's a damfool idea and we should change it. I think that
the semantics we are really trying to implement for read-only is "has no
effects visible outside the current session" --- this explains, for
example, why copying into a temp table is OK. ALTER SYSTEM SET certainly
isn't that.

It would be extremely lame and a huge usability regression to
arbitrary restrict ALTER SYSTEM SET on standby nodes for no reason.
Editing the postgresql.auto.conf file works just fine there, and is a
totally sensible thing to want to do. You could argue for restricting
it in parallel mode just out of general paranoia, but I don't favor
that approach.

I think if we can sort out the notation for how the restrictions
are expressed, this'll be a good improvement.

Thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: our checks for read-only queries are not great

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jan 8, 2020 at 2:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
than what it replaces. The test for LockStmt is an example --- the
comment talks about restricting locks during recovery, which is fine and
understandable, but then it's completely unobvious that the actual code
implements that behavior rather than some other one.

Uh, suggestions?

COMMAND_NOT_IN_RECOVERY, maybe?

* ALTER SYSTEM SET is readonly? Say what?

It would be extremely lame and a huge usability regression to
arbitrary restrict ALTER SYSTEM SET on standby nodes for no reason.

I didn't say that it shouldn't be allowed on standby nodes. I said
it shouldn't be allowed in transactions that have explicitly declared
themselves to be read-only. Maybe we need to disaggregate those
concepts a bit more --- a refactoring such as this would be a fine
time to do that.

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: our checks for read-only queries are not great

On Wed, Jan 8, 2020 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jan 8, 2020 at 2:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
than what it replaces. The test for LockStmt is an example --- the
comment talks about restricting locks during recovery, which is fine and
understandable, but then it's completely unobvious that the actual code
implements that behavior rather than some other one.

Uh, suggestions?

COMMAND_NOT_IN_RECOVERY, maybe?

Well, maybe I should just get rid of COMMAND_IS_WEAKLY_READ_ONLY and
return individual COMMAND_OK_IN_* flags for those cases.

* ALTER SYSTEM SET is readonly? Say what?

It would be extremely lame and a huge usability regression to
arbitrary restrict ALTER SYSTEM SET on standby nodes for no reason.

I didn't say that it shouldn't be allowed on standby nodes.

Oh, OK. I guess I misunderstood.

I said
it shouldn't be allowed in transactions that have explicitly declared
themselves to be read-only. Maybe we need to disaggregate those
concepts a bit more --- a refactoring such as this would be a fine
time to do that.

Yeah, the current rules are pretty weird. Aside from ALTER SYSTEM ..
SET, read-only transaction seem to allow writes to temporary relations
and sequences, plus CLUSTER, REINDEX, VACUUM, PREPARE, ROLLBACK
PREPARED, and COMMIT PREPARED, all of which sound a lot like writes to
me. They also allow LISTEN and SET which are have transactional
behavior in general but for some reason don't feel they need to
respect the R/O property. I worry that if we start whacking these
behaviors around we'll get complaints, so I'm cautious about doing
that. At the least, we would need to have a real clear definition, and
if there is such a definition that covers the current cases, I can't
guess what it is. Forget ALTER SYSTEM for a minute -- why is it OK to
rewrite a table via CLUSTER in a R/O transaction, but not OK to do an
ALTER TABLE that changes the clustering index? Why is it not OK to
LISTEN on a standby (and presumably not get any notifications until a
promotion occurs) but OK to UNLISTEN? Whatever reasons may have
justified the current choice of behaviors are probably lost in the
sands of time; they are for sure unknown to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#5)
Re: our checks for read-only queries are not great

Greetings,

(I'm also overall in favor of the direction this is going, so general +1
from me, and I took a quick look through the patch and didn't
particularly see anything I didn't like besides what's commented on
below.)

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Jan 8, 2020 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jan 8, 2020 at 2:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* I find COMMAND_IS_WEAKLY_READ_ONLY to be a more confusing concept
than what it replaces. The test for LockStmt is an example --- the
comment talks about restricting locks during recovery, which is fine and
understandable, but then it's completely unobvious that the actual code
implements that behavior rather than some other one.

Uh, suggestions?

COMMAND_NOT_IN_RECOVERY, maybe?

Well, maybe I should just get rid of COMMAND_IS_WEAKLY_READ_ONLY and
return individual COMMAND_OK_IN_* flags for those cases.

Yeah, I don't like the WEAKLY_READ_ONLY idea either- explicitly having
COMMAND_OK_IN_X seems cleaner.

* ALTER SYSTEM SET is readonly? Say what?

It would be extremely lame and a huge usability regression to
arbitrary restrict ALTER SYSTEM SET on standby nodes for no reason.

I didn't say that it shouldn't be allowed on standby nodes.

Oh, OK. I guess I misunderstood.

I agree that we want ALTER SYSTEM SET to work on standbys, but it seems
there isn't really disagreement there.

I said
it shouldn't be allowed in transactions that have explicitly declared
themselves to be read-only. Maybe we need to disaggregate those
concepts a bit more --- a refactoring such as this would be a fine
time to do that.

Yeah, the current rules are pretty weird. Aside from ALTER SYSTEM ..
SET, read-only transaction seem to allow writes to temporary relations
and sequences, plus CLUSTER, REINDEX, VACUUM, PREPARE, ROLLBACK
PREPARED, and COMMIT PREPARED, all of which sound a lot like writes to
me. They also allow LISTEN and SET which are have transactional
behavior in general but for some reason don't feel they need to
respect the R/O property. I worry that if we start whacking these
behaviors around we'll get complaints, so I'm cautious about doing
that. At the least, we would need to have a real clear definition, and
if there is such a definition that covers the current cases, I can't
guess what it is. Forget ALTER SYSTEM for a minute -- why is it OK to
rewrite a table via CLUSTER in a R/O transaction, but not OK to do an
ALTER TABLE that changes the clustering index? Why is it not OK to
LISTEN on a standby (and presumably not get any notifications until a
promotion occurs) but OK to UNLISTEN? Whatever reasons may have
justified the current choice of behaviors are probably lost in the
sands of time; they are for sure unknown to me.

That a 'read-only' transaction can call CLUSTER is definitely bizarre to
me. As relates to 'UN-SOMETHING', having those be allowed makes sense,
to me anyway, since connection poolers like to do those things and it
should be a no-op more-or-less by definition. SET isn't changing data
blocks, so that also seems ok for a read-only transaction.. but, yeah,
there's no real great hard-and-fast-rule we've been following.

Would we be able to make a rule of "can't change on-disk stuff, except
for things in temporary schemas" and have it stick without a lot of
complaints? Seems like that would address Tom's ALTER SYSTEM SET
concern, and would mean CLUSTER/REINDEX/VACUUM are disallowed in a
backwards-incompatible way (though I think I'm fine with that..), and
SET would still be allowed (which strikes me as correct too). I'm not
quite sure how I feel about LISTEN, but that it could possibly actually
be used post-promotion and doesn't change on-disk stuff makes me feel
like it actually probably should be allowed.

Just looking at what was mentioned here- if there's other cases where
this idea falls flat then let's discuss them.

Thanks,

Stephen

#7Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#6)
1 attachment(s)
Re: our checks for read-only queries are not great

On Wed, Jan 8, 2020 at 5:57 PM Stephen Frost <sfrost@snowman.net> wrote:

Yeah, I don't like the WEAKLY_READ_ONLY idea either- explicitly having
COMMAND_OK_IN_X seems cleaner.

Done that way. v2 attached.

Would we be able to make a rule of "can't change on-disk stuff, except
for things in temporary schemas" and have it stick without a lot of
complaints? Seems like that would address Tom's ALTER SYSTEM SET
concern, and would mean CLUSTER/REINDEX/VACUUM are disallowed in a
backwards-incompatible way (though I think I'm fine with that..), and
SET would still be allowed (which strikes me as correct too). I'm not
quite sure how I feel about LISTEN, but that it could possibly actually
be used post-promotion and doesn't change on-disk stuff makes me feel
like it actually probably should be allowed.

I think we can make any rule we like, but I think we should have some
measure of broad agreement on it. I'd like to go ahead with this for
now and then further changes can continue to be discussed and debated.
Hopefully we'll get a few more people to weigh in, too, because
deciding something like this based on what a handful of people doesn't
seem like a good idea to me.

I'd be really interested to hear if anyone knows the history behind
allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables.
It seems to have been that way for a long time. I wonder if it was a
deliberate choice or something that just happened semi-accidentally.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

v2-0001-Fix-problems-with-read-only-query-checks-and-refa.patchapplication/octet-stream; name=v2-0001-Fix-problems-with-read-only-query-checks-and-refa.patchDownload
From 2122d3fa00db74a1363d4f8a365ad88351480807 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Thu, 9 Jan 2020 13:40:28 -0500
Subject: [PATCH v2] Fix problems with "read only query" checks, and refactor
 the code.

Previously, check_xact_readonly() was responsible for determining which
types of queries could not be run in a read-only transaction,
standard_ProcessUtility() was responsibility for prohibiting things
which were allowed in read only transactions but not in recovery, and
utility commands were basically prohibited in bulk in parallel mode by
calls to CommandIsReadOnly() in functions.c and spi.c.  This situation
was confusing and error-prone. Accordingly, move all the checks to a new
function ClassifyUtilityCommandAsReadOnly, which determines the degree
to which a given statement is read only.

In the old code, check_xact_readonly() inadvertently failed to handle
several statement types that actually should have been prohibited,
specifically T_CreatePolicyStmt, T_AlterPolicyStmt, T_CreateAmStmt,
T_CreateStatsStmt, T_AlterStatsStmt, and T_AlterCollationStmt.  As a
result, thes statements were erroneously allowed in read only
transactions, parallel queries, and standby operation. Generally, they
would fail anyway due to some lower-level error check, but we
shouldn't rely on that.  In the new code structure, future omissions
of this type should cause ClassifyUtilityCommandAsReadOnly to complain
about an unrecognized node type.

As a fringe benefit, this means we can allow certain types of utility
commands in parallel mode, where it's safe to do so. This allows
ALTER SYSTEM, CALL, DO, CHECKPOINT, COPY FROM, EXPLAIN, and SHOW.
It might be possible to allow additional commands with more work
and thought.
---
 src/backend/commands/copy.c      |   1 -
 src/backend/commands/lockcmds.c  |  11 -
 src/backend/executor/functions.c |   3 -
 src/backend/executor/spi.c       |  31 +--
 src/backend/tcop/utility.c       | 342 ++++++++++++++++++++++---------
 src/include/tcop/utility.h       |  32 +++
 6 files changed, 285 insertions(+), 135 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index c93a788798..40a8ec1abd 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1064,7 +1064,6 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
 		/* check read-only transaction and parallel mode */
 		if (XactReadOnly && !rel->rd_islocaltemp)
 			PreventCommandIfReadOnly("COPY FROM");
-		PreventCommandIfParallelMode("COPY FROM");
 
 		cstate = BeginCopyFrom(pstate, rel, stmt->filename, stmt->is_program,
 							   NULL, stmt->attlist, stmt->options);
diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
index 75410f04de..329ab849c0 100644
--- a/src/backend/commands/lockcmds.c
+++ b/src/backend/commands/lockcmds.c
@@ -42,17 +42,6 @@ LockTableCommand(LockStmt *lockstmt)
 {
 	ListCell   *p;
 
-	/*---------
-	 * During recovery we only accept these variations:
-	 * LOCK TABLE foo IN ACCESS SHARE MODE
-	 * LOCK TABLE foo IN ROW SHARE MODE
-	 * LOCK TABLE foo IN ROW EXCLUSIVE MODE
-	 * This test must match the restrictions defined in LockAcquireExtended()
-	 *---------
-	 */
-	if (lockstmt->mode > RowExclusiveLock)
-		PreventCommandDuringRecovery("LOCK TABLE");
-
 	/*
 	 * Iterate over the list and process the named relations one at a time
 	 */
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e8e1957075..5cff6c4321 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -540,9 +540,6 @@ init_execution_state(List *queryTree_list,
 						 errmsg("%s is not allowed in a non-volatile function",
 								CreateCommandTag((Node *) stmt))));
 
-			if (IsInParallelMode() && !CommandIsReadOnly(stmt))
-				PreventCommandIfParallelMode(CreateCommandTag((Node *) stmt));
-
 			/* OK, build the execution_state for this query */
 			newes = (execution_state *) palloc(sizeof(execution_state));
 			if (preves)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 4a6e82b605..c46764bf42 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1450,14 +1450,13 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	portal->queryEnv = _SPI_current->queryEnv;
 
 	/*
-	 * If told to be read-only, or in parallel mode, verify that this query is
-	 * in fact read-only.  This can't be done earlier because we need to look
-	 * at the finished, planned queries.  (In particular, we don't want to do
-	 * it between GetCachedPlan and PortalDefineQuery, because throwing an
-	 * error between those steps would result in leaking our plancache
-	 * refcount.)
+	 * If told to be read-only, we'd better check for read-only queries. This
+	 * can't be done earlier because we need to look at the finished, planned
+	 * queries.  (In particular, we don't want to do it between GetCachedPlan
+	 * and PortalDefineQuery, because throwing an error between those steps
+	 * would result in leaking our plancache refcount.)
 	 */
-	if (read_only || IsInParallelMode())
+	if (read_only)
 	{
 		ListCell   *lc;
 
@@ -1466,16 +1465,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 			PlannedStmt *pstmt = lfirst_node(PlannedStmt, lc);
 
 			if (!CommandIsReadOnly(pstmt))
-			{
-				if (read_only)
-					ereport(ERROR,
-							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					/* translator: %s is a SQL statement name */
-							 errmsg("%s is not allowed in a non-volatile function",
-									CreateCommandTag((Node *) pstmt))));
-				else
-					PreventCommandIfParallelMode(CreateCommandTag((Node *) pstmt));
-			}
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				/* translator: %s is a SQL statement name */
+						 errmsg("%s is not allowed in a non-volatile function",
+								CreateCommandTag((Node *) pstmt))));
 		}
 	}
 
@@ -2263,9 +2257,6 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 						 errmsg("%s is not allowed in a non-volatile function",
 								CreateCommandTag((Node *) stmt))));
 
-			if (IsInParallelMode() && !CommandIsReadOnly(stmt))
-				PreventCommandIfParallelMode(CreateCommandTag((Node *) stmt));
-
 			/*
 			 * If not read-only mode, advance the command counter before each
 			 * command and update the snapshot.
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index b2c58bf862..5cdf540d49 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -75,6 +75,7 @@
 ProcessUtility_hook_type ProcessUtility_hook = NULL;
 
 /* local function declarations */
+static int ClassifyUtilityCommandAsReadOnly(Node *parsetree);
 static void ProcessUtilitySlow(ParseState *pstate,
 							   PlannedStmt *pstmt,
 							   const char *queryString,
@@ -124,106 +125,261 @@ CommandIsReadOnly(PlannedStmt *pstmt)
 }
 
 /*
- * check_xact_readonly: is a utility command read-only?
+ * Determine the degree to which a utility command is read only.
  *
- * Here we use the loose rules of XactReadOnly mode: no permanent effects
- * on the database are allowed.
+ * Some commands are not at all read-only, such as DDL statements that make
+ * catalog entries; and some are entirely read-only, such as fetching a row
+ * from a cursor. There's some middle ground: see src/include/utility/tcop.h
+ * for details.
  */
-static void
-check_xact_readonly(Node *parsetree)
+int
+ClassifyUtilityCommandAsReadOnly(Node *parsetree)
 {
-	/* Only perform the check if we have a reason to do so. */
-	if (!XactReadOnly && !IsInParallelMode())
-		return;
-
-	/*
-	 * Note: Commands that need to do more complicated checking are handled
-	 * elsewhere, in particular COPY and plannable statements do their own
-	 * checking.  However they should all call PreventCommandIfReadOnly or
-	 * PreventCommandIfParallelMode to actually throw the error.
-	 */
-
 	switch (nodeTag(parsetree))
 	{
-		case T_AlterDatabaseStmt:
+		case T_AlterCollationStmt:
 		case T_AlterDatabaseSetStmt:
+		case T_AlterDatabaseStmt:
+		case T_AlterDefaultPrivilegesStmt:
 		case T_AlterDomainStmt:
+		case T_AlterEnumStmt:
+		case T_AlterEventTrigStmt:
+		case T_AlterExtensionContentsStmt:
+		case T_AlterExtensionStmt:
+		case T_AlterFdwStmt:
+		case T_AlterForeignServerStmt:
 		case T_AlterFunctionStmt:
-		case T_AlterRoleStmt:
-		case T_AlterRoleSetStmt:
 		case T_AlterObjectDependsStmt:
 		case T_AlterObjectSchemaStmt:
-		case T_AlterOwnerStmt:
+		case T_AlterOpFamilyStmt:
 		case T_AlterOperatorStmt:
+		case T_AlterOwnerStmt:
+		case T_AlterPolicyStmt:
+		case T_AlterPublicationStmt:
+		case T_AlterRoleSetStmt:
+		case T_AlterRoleStmt:
 		case T_AlterSeqStmt:
+		case T_AlterStatsStmt:
+		case T_AlterSubscriptionStmt:
+		case T_AlterTSConfigurationStmt:
+		case T_AlterTSDictionaryStmt:
 		case T_AlterTableMoveAllStmt:
+		case T_AlterTableSpaceOptionsStmt:
 		case T_AlterTableStmt:
-		case T_RenameStmt:
+		case T_AlterUserMappingStmt:
 		case T_CommentStmt:
-		case T_DefineStmt:
+		case T_CompositeTypeStmt:
+		case T_CreateAmStmt:
 		case T_CreateCastStmt:
-		case T_CreateEventTrigStmt:
-		case T_AlterEventTrigStmt:
 		case T_CreateConversionStmt:
-		case T_CreatedbStmt:
 		case T_CreateDomainStmt:
+		case T_CreateEnumStmt:
+		case T_CreateEventTrigStmt:
+		case T_CreateExtensionStmt:
+		case T_CreateFdwStmt:
+		case T_CreateForeignServerStmt:
+		case T_CreateForeignTableStmt:
 		case T_CreateFunctionStmt:
-		case T_CreateRoleStmt:
-		case T_IndexStmt:
-		case T_CreatePLangStmt:
 		case T_CreateOpClassStmt:
 		case T_CreateOpFamilyStmt:
-		case T_AlterOpFamilyStmt:
-		case T_RuleStmt:
+		case T_CreatePLangStmt:
+		case T_CreatePolicyStmt:
+		case T_CreatePublicationStmt:
+		case T_CreateRangeStmt:
+		case T_CreateRoleStmt:
 		case T_CreateSchemaStmt:
 		case T_CreateSeqStmt:
+		case T_CreateStatsStmt:
 		case T_CreateStmt:
+		case T_CreateSubscriptionStmt:
 		case T_CreateTableAsStmt:
-		case T_RefreshMatViewStmt:
 		case T_CreateTableSpaceStmt:
 		case T_CreateTransformStmt:
 		case T_CreateTrigStmt:
-		case T_CompositeTypeStmt:
-		case T_CreateEnumStmt:
-		case T_CreateRangeStmt:
-		case T_AlterEnumStmt:
-		case T_ViewStmt:
+		case T_CreateUserMappingStmt:
+		case T_CreatedbStmt:
+		case T_DefineStmt:
+		case T_DropOwnedStmt:
+		case T_DropRoleStmt:
 		case T_DropStmt:
-		case T_DropdbStmt:
+		case T_DropSubscriptionStmt:
 		case T_DropTableSpaceStmt:
-		case T_DropRoleStmt:
-		case T_GrantStmt:
-		case T_GrantRoleStmt:
-		case T_AlterDefaultPrivilegesStmt:
-		case T_TruncateStmt:
-		case T_DropOwnedStmt:
-		case T_ReassignOwnedStmt:
-		case T_AlterTSDictionaryStmt:
-		case T_AlterTSConfigurationStmt:
-		case T_CreateExtensionStmt:
-		case T_AlterExtensionStmt:
-		case T_AlterExtensionContentsStmt:
-		case T_CreateFdwStmt:
-		case T_AlterFdwStmt:
-		case T_CreateForeignServerStmt:
-		case T_AlterForeignServerStmt:
-		case T_CreateUserMappingStmt:
-		case T_AlterUserMappingStmt:
 		case T_DropUserMappingStmt:
-		case T_AlterTableSpaceOptionsStmt:
-		case T_CreateForeignTableStmt:
+		case T_DropdbStmt:
+		case T_GrantRoleStmt:
+		case T_GrantStmt:
 		case T_ImportForeignSchemaStmt:
+		case T_IndexStmt:
+		case T_ReassignOwnedStmt:
+		case T_RefreshMatViewStmt:
+		case T_RenameStmt:
+		case T_RuleStmt:
 		case T_SecLabelStmt:
-		case T_CreatePublicationStmt:
-		case T_AlterPublicationStmt:
-		case T_CreateSubscriptionStmt:
-		case T_AlterSubscriptionStmt:
-		case T_DropSubscriptionStmt:
-			PreventCommandIfReadOnly(CreateCommandTag(parsetree));
-			PreventCommandIfParallelMode(CreateCommandTag(parsetree));
-			break;
+		case T_TruncateStmt:
+		case T_ViewStmt:
+			{
+				/* DDL is not read-only, and neither is TRUNCATE. */
+				return COMMAND_IS_NOT_READ_ONLY;
+			}
+
+		case T_AlterSystemStmt:
+			{
+				/*
+				 * It's sort of strange that we treat this as a read-only
+				 * command, but since postgresql.auto.conf isn't WAL-logged
+				 * or stored in shared buffers and is separately writeable on
+				 * the standby, it's fine.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_CallStmt:
+		case T_DoStmt:
+			{
+				/*
+				 * Commands inside the DO block or the called procedure might
+				 * not be read only, but they'll be checked separately when we
+				 * try to execute them.  Here we only need to worry about the
+				 * DO or CALL command itself.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_CheckPointStmt:
+			{
+				/*
+				 * You might think that this should not be permitted in
+				 * recovery, but we interpret a CHECKPOINT command during
+				 * recovery as a request for a restartpoint instead. We allow
+				 * this since it can be a useful way of reducing switchover
+				 * time when using various forms of replication.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ClosePortalStmt:
+		case T_ConstraintsSetStmt:
+		case T_DeallocateStmt:
+		case T_DeclareCursorStmt:
+		case T_DiscardStmt:
+		case T_ExecuteStmt:
+		case T_FetchStmt:
+		case T_LoadStmt:
+		case T_PrepareStmt:
+		case T_UnlistenStmt:
+		case T_VariableSetStmt:
+			{
+				/*
+				 * These modify only backend-local state, so they're OK to
+				 * run in a read-only transaction or on a standby. However,
+				 * they are disallowed in parallel mode, because they either
+				 * rely upon or modify backend-local state that might not be
+				 * synchronized among cooperating backends.
+				 */
+				return COMMAND_OK_IN_RECOVERY | COMMAND_OK_IN_READ_ONLY_TXN;
+			}
+
+		case T_ClusterStmt:
+		case T_ReindexStmt:
+		case T_VacuumStmt:
+			{
+				/*
+				 * These commands are not read-only in the strict sense: they
+				 * do modify database blocks and system catalog entries.
+				 * However, we choose to allow them during "read only"
+				 * transactions.
+				 */
+				return COMMAND_OK_IN_READ_ONLY_TXN;
+			}
+
+		case T_CopyStmt:
+			{
+				CopyStmt *stmt = (CopyStmt *) parsetree;
+
+				/*
+				 * You might think that COPY FROM is not at all read only,
+				 * but we have for many years permitted in a "read only"
+				 * transaction if the target is a temporary table. If the
+				 * target table turns out to be non-temporary, DoCopy itself
+				 * will call PreventCommandIfReadOnly.
+				 */
+				if (stmt->is_from)
+					return COMMAND_OK_IN_READ_ONLY_TXN;
+				else
+					return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ExplainStmt:
+		case T_VariableShowStmt:
+			{
+				/*
+				 * These commands don't modify any data and are safe to run
+				 * in a parallel worker.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ListenStmt:
+		case T_NotifyStmt:
+			{
+				/*
+				 * NOTIFY requires an XID assignment, so it can't be permitted
+				 * on a standby. Perhaps LISTEN could, since without NOTIFY
+				 * it would be OK to just do nothing, at least until promotion,
+				 * but we currently prohibit it lest the user get the wrong
+				 * idea.
+				 *
+				 * (We do allow T_UnlistenStmt on a standby, though, because
+				 * it's a no-op.)
+				 */
+				return COMMAND_OK_IN_READ_ONLY_TXN;
+			}
+
+		case T_LockStmt:
+			{
+				LockStmt *stmt = (LockStmt *) parsetree;
+
+				/*
+				 * Only weaker locker modes are allowed during recovery. The
+				 * restrictions here must match those in LockAcquireExtended().
+				 */
+				if (stmt->mode > RowExclusiveLock)
+					return COMMAND_OK_IN_READ_ONLY_TXN;
+				else
+					return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_TransactionStmt:
+			{
+				TransactionStmt *stmt = (TransactionStmt *) parsetree;
+
+				/*
+				 * We choose to allow PREPARE, COMMIT PREPARED, and ROLLBACK
+				 * PREPARED in read-only transactions, but all of them modify
+				 * persistent on disk state and so are not read-only in the
+				 * strict sense.
+				 */
+				switch (stmt->kind)
+				{
+					case TRANS_STMT_BEGIN:
+					case TRANS_STMT_START:
+					case TRANS_STMT_COMMIT:
+					case TRANS_STMT_ROLLBACK:
+					case TRANS_STMT_SAVEPOINT:
+					case TRANS_STMT_RELEASE:
+					case TRANS_STMT_ROLLBACK_TO:
+						return COMMAND_IS_STRICTLY_READ_ONLY;
+
+					case TRANS_STMT_PREPARE:
+					case TRANS_STMT_COMMIT_PREPARED:
+					case TRANS_STMT_ROLLBACK_PREPARED:
+						return COMMAND_OK_IN_READ_ONLY_TXN;
+				}
+			}
+
 		default:
-			/* do nothing */
+			elog(ERROR, "unrecognized node type: %d",
+				 (int) nodeTag(parsetree));
 			break;
 	}
 }
@@ -231,8 +387,8 @@ check_xact_readonly(Node *parsetree)
 /*
  * PreventCommandIfReadOnly: throw error if XactReadOnly
  *
- * This is useful mainly to ensure consistency of the error message wording;
- * most callers have checked XactReadOnly for themselves.
+ * This is useful partly to ensure consistency of the error message wording;
+ * some callers have checked XactReadOnly for themselves.
  */
 void
 PreventCommandIfReadOnly(const char *cmdname)
@@ -249,8 +405,8 @@ PreventCommandIfReadOnly(const char *cmdname)
  * PreventCommandIfParallelMode: throw error if current (sub)transaction is
  * in parallel mode.
  *
- * This is useful mainly to ensure consistency of the error message wording;
- * most callers have checked IsInParallelMode() for themselves.
+ * This is useful partly to ensure consistency of the error message wording;
+ * some callers have checked IsInParallelMode() for themselves.
  */
 void
 PreventCommandIfParallelMode(const char *cmdname)
@@ -385,11 +541,25 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 	bool		isTopLevel = (context == PROCESS_UTILITY_TOPLEVEL);
 	bool		isAtomicContext = (!(context == PROCESS_UTILITY_TOPLEVEL || context == PROCESS_UTILITY_QUERY_NONATOMIC) || IsTransactionBlock());
 	ParseState *pstate;
+	int			readonly_flags;
 
 	/* This can recurse, so check for excessive recursion */
 	check_stack_depth();
 
-	check_xact_readonly(parsetree);
+	/* Prohibit read/write commands in read-only states. */
+	readonly_flags = ClassifyUtilityCommandAsReadOnly(parsetree);
+	if (readonly_flags != COMMAND_IS_STRICTLY_READ_ONLY &&
+		(XactReadOnly || IsInParallelMode()))
+	{
+		const char *commandtag = CreateCommandTag(parsetree);
+
+		if ((readonly_flags & COMMAND_OK_IN_READ_ONLY_TXN) == 0)
+			PreventCommandIfReadOnly(commandtag);
+		if ((readonly_flags & COMMAND_OK_IN_PARALLEL_MODE) == 0)
+			PreventCommandIfParallelMode(commandtag);
+		if ((readonly_flags & COMMAND_OK_IN_RECOVERY) == 0)
+			PreventCommandDuringRecovery(commandtag);
+	}
 
 	if (completionTag)
 		completionTag[0] = '\0';
@@ -449,7 +619,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 						break;
 
 					case TRANS_STMT_PREPARE:
-						PreventCommandDuringRecovery("PREPARE TRANSACTION");
 						if (!PrepareTransactionBlock(stmt->gid))
 						{
 							/* report unsuccessful commit in completionTag */
@@ -460,13 +629,11 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 
 					case TRANS_STMT_COMMIT_PREPARED:
 						PreventInTransactionBlock(isTopLevel, "COMMIT PREPARED");
-						PreventCommandDuringRecovery("COMMIT PREPARED");
 						FinishPreparedTransaction(stmt->gid, true);
 						break;
 
 					case TRANS_STMT_ROLLBACK_PREPARED:
 						PreventInTransactionBlock(isTopLevel, "ROLLBACK PREPARED");
-						PreventCommandDuringRecovery("ROLLBACK PREPARED");
 						FinishPreparedTransaction(stmt->gid, false);
 						break;
 
@@ -607,7 +774,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				NotifyStmt *stmt = (NotifyStmt *) parsetree;
 
-				PreventCommandDuringRecovery("NOTIFY");
 				Async_Notify(stmt->conditionname, stmt->payload);
 			}
 			break;
@@ -616,7 +782,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ListenStmt *stmt = (ListenStmt *) parsetree;
 
-				PreventCommandDuringRecovery("LISTEN");
 				CheckRestrictedOperation("LISTEN");
 				Async_Listen(stmt->conditionname);
 			}
@@ -626,7 +791,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				UnlistenStmt *stmt = (UnlistenStmt *) parsetree;
 
-				/* we allow UNLISTEN during recovery, as it's a noop */
 				CheckRestrictedOperation("UNLISTEN");
 				if (stmt->conditionname)
 					Async_Unlisten(stmt->conditionname);
@@ -650,22 +814,11 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			break;
 
 		case T_ClusterStmt:
-			/* we choose to allow this during "read only" transactions */
-			PreventCommandDuringRecovery("CLUSTER");
-			/* forbidden in parallel mode due to CommandIsReadOnly */
 			cluster((ClusterStmt *) parsetree, isTopLevel);
 			break;
 
 		case T_VacuumStmt:
-			{
-				VacuumStmt *stmt = (VacuumStmt *) parsetree;
-
-				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery(stmt->is_vacuumcmd ?
-											 "VACUUM" : "ANALYZE");
-				/* forbidden in parallel mode due to CommandIsReadOnly */
-				ExecVacuum(pstate, stmt, isTopLevel);
-			}
+			ExecVacuum(pstate, (VacuumStmt *) parsetree, isTopLevel);
 			break;
 
 		case T_ExplainStmt:
@@ -740,7 +893,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			 * outside a transaction block is presumed to be user error.
 			 */
 			RequireTransactionBlock(isTopLevel, "LOCK TABLE");
-			/* forbidden in parallel mode due to CommandIsReadOnly */
 			LockTableCommand((LockStmt *) parsetree);
 			break;
 
@@ -755,13 +907,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 						(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 						 errmsg("must be superuser to do CHECKPOINT")));
 
-			/*
-			 * You might think we should have a PreventCommandDuringRecovery()
-			 * here, but we interpret a CHECKPOINT command during recovery as
-			 * a request for a restartpoint instead. We allow this since it
-			 * can be a useful way of reducing switchover time when using
-			 * various forms of replication.
-			 */
 			RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
 							  (RecoveryInProgress() ? 0 : CHECKPOINT_FORCE));
 			break;
@@ -774,9 +919,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 					PreventInTransactionBlock(isTopLevel,
 											  "REINDEX CONCURRENTLY");
 
-				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery("REINDEX");
-				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
diff --git a/src/include/tcop/utility.h b/src/include/tcop/utility.h
index 40551c45af..df9a8a165a 100644
--- a/src/include/tcop/utility.h
+++ b/src/include/tcop/utility.h
@@ -25,6 +25,38 @@ typedef enum
 	PROCESS_UTILITY_SUBCOMMAND	/* a portion of a query */
 } ProcessUtilityContext;
 
+/*
+ * These constants are used to describe the extent to which a particular
+ * command is read-only.
+ *
+ * COMMAND_OK_IN_READ_ONLY_TXN means that the command is permissible even when
+ * XactReadOnly is set. Generally, this should be false for commands that
+ * modify tuples, including tuples in system tables; we sometimes permit it
+ * for utility commands that do not directly modify tuples (e.g. REINDEX,
+ * COMMIT PREPARED).
+ *
+ * COMMAND_OK_IN_PARALLEL_MODE means that the command is permissible even
+ * when in parallel mode. Writing tuples is forbidden, as is anything that
+ * might confuse cooperating processes.
+ *
+ * COMMAND_OK_IN_RECOVERY means that the command is permissible even when in
+ * recovery. It can't write WAL, nor can it do things that would imperil
+ * replay of future WAL received from the master.
+ */
+#define COMMAND_OK_IN_READ_ONLY_TXN	0x0001
+#define COMMAND_OK_IN_PARALLEL_MODE	0x0002
+#define	COMMAND_OK_IN_RECOVERY		0x0004
+
+/*
+ * We say that a command is strictly read-only if it is sufficiently read-only
+ * for all purposes. For clarity, we also have a constant for commands that are
+ * in no way read-only.
+ */
+#define COMMAND_IS_STRICTLY_READ_ONLY \
+	(COMMAND_OK_IN_READ_ONLY_TXN | COMMAND_OK_IN_RECOVERY | \
+	 COMMAND_OK_IN_PARALLEL_MODE)
+#define COMMAND_IS_NOT_READ_ONLY	0
+
 /* Hook for plugins to get control in ProcessUtility() */
 typedef void (*ProcessUtility_hook_type) (PlannedStmt *pstmt,
 										  const char *queryString, ProcessUtilityContext context,
-- 
2.17.2 (Apple Git-113)

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#7)
Re: our checks for read-only queries are not great

Robert Haas <robertmhaas@gmail.com> writes:

I'd be really interested to hear if anyone knows the history behind
allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables.
It seems to have been that way for a long time. I wonder if it was a
deliberate choice or something that just happened semi-accidentally.

Within a "read-only" xact you mean? I believe that allowing DML writes
was intentional. As for the utility commands, I suspect that it was in
part accidental (error of omission?), and then if anyone thought hard
about it they decided that allowing DML writes to temp tables justifies
those operations too.

Have you tried excavating in our git history to see when the relevant
permission tests originated?

regards, tom lane

#9Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#8)
Re: our checks for read-only queries are not great

On Thu, Jan 9, 2020 at 2:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I'd be really interested to hear if anyone knows the history behind
allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables.
It seems to have been that way for a long time. I wonder if it was a
deliberate choice or something that just happened semi-accidentally.

Within a "read-only" xact you mean? I believe that allowing DML writes
was intentional. As for the utility commands, I suspect that it was in
part accidental (error of omission?), and then if anyone thought hard
about it they decided that allowing DML writes to temp tables justifies
those operations too.

Have you tried excavating in our git history to see when the relevant
permission tests originated?

check_xact_readonly() with a long list of command tags originated in
the same commit that added read-only transactions. CLUSTER, REINDEX,
and VACUUM weren't included in the list of prohibited operations then,
either, but it's unclear whether that was a deliberate omission or an
oversight. That commit also thought that COPY FROM - and queries -
should allow temp tables. But there's nothing in the commit that seems
to explain why, unless the commit message itself is a hint:

commit b65cd562402ed9d3206d501cc74dc38bc421b2ce
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Fri Jan 10 22:03:30 2003 +0000

Read-only transactions, as defined in SQL.

Maybe the SQL standard has something to say about this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#9)
Re: our checks for read-only queries are not great

Robert Haas <robertmhaas@gmail.com> writes:

Maybe the SQL standard has something to say about this?

[ pokes around ... ] Yeah, it does, and I'd say it's pretty clearly
in agreement with what Peter did, so far as DML ops go. For instance,
this bit from SQL99's description of DELETE:

1) If the access mode of the current SQL-transaction or the access
mode of the branch of the current SQL-transaction at the current
SQL-connection is read-only, and T is not a temporary table,
then an exception condition is raised: invalid transaction state
- read-only SQL-transaction.

UPDATE and INSERT say the same. (I didn't look at later spec versions,
since Peter's 2003 commit was probably based on SQL99.)

You could argue about exactly how to extend that to non-spec
utility commands, but for the most part allowing them seems
to make sense if DML is allowed.

regards, tom lane

#11Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#10)
Re: our checks for read-only queries are not great

On Thu, Jan 9, 2020 at 3:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Maybe the SQL standard has something to say about this?

[ pokes around ... ] Yeah, it does, and I'd say it's pretty clearly
in agreement with what Peter did, so far as DML ops go. For instance,
this bit from SQL99's description of DELETE:

1) If the access mode of the current SQL-transaction or the access
mode of the branch of the current SQL-transaction at the current
SQL-connection is read-only, and T is not a temporary table,
then an exception condition is raised: invalid transaction state
- read-only SQL-transaction.

UPDATE and INSERT say the same. (I didn't look at later spec versions,
since Peter's 2003 commit was probably based on SQL99.)

OK. That's good to know.

You could argue about exactly how to extend that to non-spec
utility commands, but for the most part allowing them seems
to make sense if DML is allowed.

But I think we allow them on all tables, not just temp tables, so I
don't think I understand this argument.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#11)
Re: our checks for read-only queries are not great

On Thu, Jan 9, 2020 at 3:37 PM Robert Haas <robertmhaas@gmail.com> wrote:

You could argue about exactly how to extend that to non-spec
utility commands, but for the most part allowing them seems
to make sense if DML is allowed.

But I think we allow them on all tables, not just temp tables, so I
don't think I understand this argument.

Oh, wait: I'm conflating two things. The current behavior extends the
spec behavior to COPY in a logical way.

But it also allows CLUSTER, REINDEX, and VACUUM on any table. The spec
presumably has no view on that, nor does the passage you quoted seem
to apply here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#11)
Re: our checks for read-only queries are not great

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Jan 9, 2020 at 3:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

You could argue about exactly how to extend that to non-spec
utility commands, but for the most part allowing them seems
to make sense if DML is allowed.

But I think we allow them on all tables, not just temp tables, so I
don't think I understand this argument.

Oh, I misunderstood your concern.

Peter might remember more clearly, but I have a feeling that we
concluded that the intent of the spec was for read-only-ness to
disallow globally-visible changes in the visible database contents.
VACUUM, for example, does not cause any visible change, so it
should be admissible. REINDEX ditto. (We ignore here the possibility
of such things causing, say, a change in the order in which rows are
returned, since that's beneath the spec's notice to begin with.)
ANALYZE ditto, except to the extent that if you look at pg_stats
you might see something different --- but again, system catalog
contents are outside the spec's purview.

You could extend this line of argument, perhaps, far enough to justify
ALTER SYSTEM SET as well. But I don't like that because some GUCs have
visible effects on the results that an ordinary query minding its own
business can get. Timezone is perhaps the poster child there, or
search_path. If we were to subdivide the GUCs into "affects
implementation details only" vs "can affect query semantics",
I'd hold still for allowing ALTER SYSTEM SET on the former group.
Doubt it's worth the trouble to distinguish, though.

regards, tom lane

#14Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#13)
Re: our checks for read-only queries are not great

On 2020-01-09 21:52, Tom Lane wrote:

Peter might remember more clearly, but I have a feeling that we
concluded that the intent of the spec was for read-only-ness to
disallow globally-visible changes in the visible database contents.

I don't really remember, but that was basically the opinion I had
arrived at as I was reading through this current thread. Roughly
speaking, anything that changes the database state (data or schema) in a
way that would be reflected in a pg_dump output is not read-only.

VACUUM, for example, does not cause any visible change, so it
should be admissible. REINDEX ditto. (We ignore here the possibility
of such things causing, say, a change in the order in which rows are
returned, since that's beneath the spec's notice to begin with.)

agreed

ANALYZE ditto, except to the extent that if you look at pg_stats
you might see something different --- but again, system catalog
contents are outside the spec's purview.

agreed

You could extend this line of argument, perhaps, far enough to justify
ALTER SYSTEM SET as well. But I don't like that because some GUCs have
visible effects on the results that an ordinary query minding its own
business can get. Timezone is perhaps the poster child there, or
search_path. If we were to subdivide the GUCs into "affects
implementation details only" vs "can affect query semantics",
I'd hold still for allowing ALTER SYSTEM SET on the former group.
Doubt it's worth the trouble to distinguish, though.

ALTER SYSTEM is read only in my mind.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#14)
1 attachment(s)
Re: our checks for read-only queries are not great

On Fri, Jan 10, 2020 at 7:23 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

I don't really remember, but that was basically the opinion I had
arrived at as I was reading through this current thread. Roughly
speaking, anything that changes the database state (data or schema) in a
way that would be reflected in a pg_dump output is not read-only.

This rule very nearly matches the current behavior: it explains why
temp table operations are allowed, and why ALTER SYSTEM is allowed,
and why REINDEX etc. are allowed. However, there's a notable
exception: PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED are allowed
in a read-only transaction. Under the "doesn't change pg_dump output"
criteria, the first and third ones should be permitted but COMMIT
PREPARED should be denied, except maybe if the prepared transaction
didn't do any writes (and in that case, why did we bother preparing
it?). Despite that, this rule does a way better job explaining the
current behavior than anything else suggested so far.

Accordingly, here's v3, with comments adjusted to match this new
explanation for the current behavior. This seems way better than what
I had before, because it actually explains why stuff is the way it is
rather than just appealing to history.

BTW, there's a pending patch that allows CLUSTER to change the
tablespace of an object while rewriting it. If we want to be strict
about it, that variant would need to be disallowed in read only mode,
under this definition. (I also think that it's lousy syntax and ought
to be spelled ALTER TABLE x SET TABLESPACE foo, CLUSTER or something
like that rather than anything beginning with CLUSTER, but I seem to
be losing that argument.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

v3-0001-Fix-problems-with-read-only-query-checks-and-refa.patchapplication/octet-stream; name=v3-0001-Fix-problems-with-read-only-query-checks-and-refa.patchDownload
From ab66db6ad0e3f7078133dd4035e36ff2ba2cd258 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Fri, 10 Jan 2020 08:36:44 -0500
Subject: [PATCH v3] Fix problems with "read only query" checks, and refactor
 the code.

Previously, check_xact_readonly() was responsible for determining which
types of queries could not be run in a read-only transaction,
standard_ProcessUtility() was responsibility for prohibiting things
which were allowed in read only transactions but not in recovery, and
utility commands were basically prohibited in bulk in parallel mode by
calls to CommandIsReadOnly() in functions.c and spi.c.  This situation
was confusing and error-prone. Accordingly, move all the checks to a new
function ClassifyUtilityCommandAsReadOnly, which determines the degree
to which a given statement is read only.

In the old code, check_xact_readonly() inadvertently failed to handle
several statement types that actually should have been prohibited,
specifically T_CreatePolicyStmt, T_AlterPolicyStmt, T_CreateAmStmt,
T_CreateStatsStmt, T_AlterStatsStmt, and T_AlterCollationStmt.  As a
result, thes statements were erroneously allowed in read only
transactions, parallel queries, and standby operation. Generally, they
would fail anyway due to some lower-level error check, but we
shouldn't rely on that.  In the new code structure, future omissions
of this type should cause ClassifyUtilityCommandAsReadOnly to complain
about an unrecognized node type.

As a fringe benefit, this means we can allow certain types of utility
commands in parallel mode, where it's safe to do so. This allows
ALTER SYSTEM, CALL, DO, CHECKPOINT, COPY FROM, EXPLAIN, and SHOW.
It might be possible to allow additional commands with more work
and thought.

Patch by me, reviewed by Tom Lane, Stephen Frost, and Peter
Eisentraut.

Discussion: http://postgr.es/m/CA+TgmoZ_rLqJt5sYkvh+JpQnfX0Y+B2R+qfi820xNih6x-FQOQ@mail.gmail.com
---
 src/backend/commands/copy.c      |   1 -
 src/backend/commands/lockcmds.c  |  11 -
 src/backend/executor/functions.c |   3 -
 src/backend/executor/spi.c       |  31 +--
 src/backend/tcop/utility.c       | 350 ++++++++++++++++++++++---------
 src/include/tcop/utility.h       |  31 +++
 6 files changed, 292 insertions(+), 135 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index c93a788798..40a8ec1abd 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1064,7 +1064,6 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
 		/* check read-only transaction and parallel mode */
 		if (XactReadOnly && !rel->rd_islocaltemp)
 			PreventCommandIfReadOnly("COPY FROM");
-		PreventCommandIfParallelMode("COPY FROM");
 
 		cstate = BeginCopyFrom(pstate, rel, stmt->filename, stmt->is_program,
 							   NULL, stmt->attlist, stmt->options);
diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
index 75410f04de..329ab849c0 100644
--- a/src/backend/commands/lockcmds.c
+++ b/src/backend/commands/lockcmds.c
@@ -42,17 +42,6 @@ LockTableCommand(LockStmt *lockstmt)
 {
 	ListCell   *p;
 
-	/*---------
-	 * During recovery we only accept these variations:
-	 * LOCK TABLE foo IN ACCESS SHARE MODE
-	 * LOCK TABLE foo IN ROW SHARE MODE
-	 * LOCK TABLE foo IN ROW EXCLUSIVE MODE
-	 * This test must match the restrictions defined in LockAcquireExtended()
-	 *---------
-	 */
-	if (lockstmt->mode > RowExclusiveLock)
-		PreventCommandDuringRecovery("LOCK TABLE");
-
 	/*
 	 * Iterate over the list and process the named relations one at a time
 	 */
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index e8e1957075..5cff6c4321 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -540,9 +540,6 @@ init_execution_state(List *queryTree_list,
 						 errmsg("%s is not allowed in a non-volatile function",
 								CreateCommandTag((Node *) stmt))));
 
-			if (IsInParallelMode() && !CommandIsReadOnly(stmt))
-				PreventCommandIfParallelMode(CreateCommandTag((Node *) stmt));
-
 			/* OK, build the execution_state for this query */
 			newes = (execution_state *) palloc(sizeof(execution_state));
 			if (preves)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 4a6e82b605..c46764bf42 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1450,14 +1450,13 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	portal->queryEnv = _SPI_current->queryEnv;
 
 	/*
-	 * If told to be read-only, or in parallel mode, verify that this query is
-	 * in fact read-only.  This can't be done earlier because we need to look
-	 * at the finished, planned queries.  (In particular, we don't want to do
-	 * it between GetCachedPlan and PortalDefineQuery, because throwing an
-	 * error between those steps would result in leaking our plancache
-	 * refcount.)
+	 * If told to be read-only, we'd better check for read-only queries. This
+	 * can't be done earlier because we need to look at the finished, planned
+	 * queries.  (In particular, we don't want to do it between GetCachedPlan
+	 * and PortalDefineQuery, because throwing an error between those steps
+	 * would result in leaking our plancache refcount.)
 	 */
-	if (read_only || IsInParallelMode())
+	if (read_only)
 	{
 		ListCell   *lc;
 
@@ -1466,16 +1465,11 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 			PlannedStmt *pstmt = lfirst_node(PlannedStmt, lc);
 
 			if (!CommandIsReadOnly(pstmt))
-			{
-				if (read_only)
-					ereport(ERROR,
-							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-					/* translator: %s is a SQL statement name */
-							 errmsg("%s is not allowed in a non-volatile function",
-									CreateCommandTag((Node *) pstmt))));
-				else
-					PreventCommandIfParallelMode(CreateCommandTag((Node *) pstmt));
-			}
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				/* translator: %s is a SQL statement name */
+						 errmsg("%s is not allowed in a non-volatile function",
+								CreateCommandTag((Node *) pstmt))));
 		}
 	}
 
@@ -2263,9 +2257,6 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 						 errmsg("%s is not allowed in a non-volatile function",
 								CreateCommandTag((Node *) stmt))));
 
-			if (IsInParallelMode() && !CommandIsReadOnly(stmt))
-				PreventCommandIfParallelMode(CreateCommandTag((Node *) stmt));
-
 			/*
 			 * If not read-only mode, advance the command counter before each
 			 * command and update the snapshot.
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index b2c58bf862..66e3af6631 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -75,6 +75,7 @@
 ProcessUtility_hook_type ProcessUtility_hook = NULL;
 
 /* local function declarations */
+static int ClassifyUtilityCommandAsReadOnly(Node *parsetree);
 static void ProcessUtilitySlow(ParseState *pstate,
 							   PlannedStmt *pstmt,
 							   const char *queryString,
@@ -124,106 +125,269 @@ CommandIsReadOnly(PlannedStmt *pstmt)
 }
 
 /*
- * check_xact_readonly: is a utility command read-only?
+ * Determine the degree to which a utility command is read only.
  *
- * Here we use the loose rules of XactReadOnly mode: no permanent effects
- * on the database are allowed.
+ * Note the definitions of the relevant flags in src/include/utility/tcop.h.
  */
-static void
-check_xact_readonly(Node *parsetree)
+int
+ClassifyUtilityCommandAsReadOnly(Node *parsetree)
 {
-	/* Only perform the check if we have a reason to do so. */
-	if (!XactReadOnly && !IsInParallelMode())
-		return;
-
-	/*
-	 * Note: Commands that need to do more complicated checking are handled
-	 * elsewhere, in particular COPY and plannable statements do their own
-	 * checking.  However they should all call PreventCommandIfReadOnly or
-	 * PreventCommandIfParallelMode to actually throw the error.
-	 */
-
 	switch (nodeTag(parsetree))
 	{
-		case T_AlterDatabaseStmt:
+		case T_AlterCollationStmt:
 		case T_AlterDatabaseSetStmt:
+		case T_AlterDatabaseStmt:
+		case T_AlterDefaultPrivilegesStmt:
 		case T_AlterDomainStmt:
+		case T_AlterEnumStmt:
+		case T_AlterEventTrigStmt:
+		case T_AlterExtensionContentsStmt:
+		case T_AlterExtensionStmt:
+		case T_AlterFdwStmt:
+		case T_AlterForeignServerStmt:
 		case T_AlterFunctionStmt:
-		case T_AlterRoleStmt:
-		case T_AlterRoleSetStmt:
 		case T_AlterObjectDependsStmt:
 		case T_AlterObjectSchemaStmt:
-		case T_AlterOwnerStmt:
+		case T_AlterOpFamilyStmt:
 		case T_AlterOperatorStmt:
+		case T_AlterOwnerStmt:
+		case T_AlterPolicyStmt:
+		case T_AlterPublicationStmt:
+		case T_AlterRoleSetStmt:
+		case T_AlterRoleStmt:
 		case T_AlterSeqStmt:
+		case T_AlterStatsStmt:
+		case T_AlterSubscriptionStmt:
+		case T_AlterTSConfigurationStmt:
+		case T_AlterTSDictionaryStmt:
 		case T_AlterTableMoveAllStmt:
+		case T_AlterTableSpaceOptionsStmt:
 		case T_AlterTableStmt:
-		case T_RenameStmt:
+		case T_AlterUserMappingStmt:
 		case T_CommentStmt:
-		case T_DefineStmt:
+		case T_CompositeTypeStmt:
+		case T_CreateAmStmt:
 		case T_CreateCastStmt:
-		case T_CreateEventTrigStmt:
-		case T_AlterEventTrigStmt:
 		case T_CreateConversionStmt:
-		case T_CreatedbStmt:
 		case T_CreateDomainStmt:
+		case T_CreateEnumStmt:
+		case T_CreateEventTrigStmt:
+		case T_CreateExtensionStmt:
+		case T_CreateFdwStmt:
+		case T_CreateForeignServerStmt:
+		case T_CreateForeignTableStmt:
 		case T_CreateFunctionStmt:
-		case T_CreateRoleStmt:
-		case T_IndexStmt:
-		case T_CreatePLangStmt:
 		case T_CreateOpClassStmt:
 		case T_CreateOpFamilyStmt:
-		case T_AlterOpFamilyStmt:
-		case T_RuleStmt:
+		case T_CreatePLangStmt:
+		case T_CreatePolicyStmt:
+		case T_CreatePublicationStmt:
+		case T_CreateRangeStmt:
+		case T_CreateRoleStmt:
 		case T_CreateSchemaStmt:
 		case T_CreateSeqStmt:
+		case T_CreateStatsStmt:
 		case T_CreateStmt:
+		case T_CreateSubscriptionStmt:
 		case T_CreateTableAsStmt:
-		case T_RefreshMatViewStmt:
 		case T_CreateTableSpaceStmt:
 		case T_CreateTransformStmt:
 		case T_CreateTrigStmt:
-		case T_CompositeTypeStmt:
-		case T_CreateEnumStmt:
-		case T_CreateRangeStmt:
-		case T_AlterEnumStmt:
-		case T_ViewStmt:
+		case T_CreateUserMappingStmt:
+		case T_CreatedbStmt:
+		case T_DefineStmt:
+		case T_DropOwnedStmt:
+		case T_DropRoleStmt:
 		case T_DropStmt:
-		case T_DropdbStmt:
+		case T_DropSubscriptionStmt:
 		case T_DropTableSpaceStmt:
-		case T_DropRoleStmt:
-		case T_GrantStmt:
-		case T_GrantRoleStmt:
-		case T_AlterDefaultPrivilegesStmt:
-		case T_TruncateStmt:
-		case T_DropOwnedStmt:
-		case T_ReassignOwnedStmt:
-		case T_AlterTSDictionaryStmt:
-		case T_AlterTSConfigurationStmt:
-		case T_CreateExtensionStmt:
-		case T_AlterExtensionStmt:
-		case T_AlterExtensionContentsStmt:
-		case T_CreateFdwStmt:
-		case T_AlterFdwStmt:
-		case T_CreateForeignServerStmt:
-		case T_AlterForeignServerStmt:
-		case T_CreateUserMappingStmt:
-		case T_AlterUserMappingStmt:
 		case T_DropUserMappingStmt:
-		case T_AlterTableSpaceOptionsStmt:
-		case T_CreateForeignTableStmt:
+		case T_DropdbStmt:
+		case T_GrantRoleStmt:
+		case T_GrantStmt:
 		case T_ImportForeignSchemaStmt:
+		case T_IndexStmt:
+		case T_ReassignOwnedStmt:
+		case T_RefreshMatViewStmt:
+		case T_RenameStmt:
+		case T_RuleStmt:
 		case T_SecLabelStmt:
-		case T_CreatePublicationStmt:
-		case T_AlterPublicationStmt:
-		case T_CreateSubscriptionStmt:
-		case T_AlterSubscriptionStmt:
-		case T_DropSubscriptionStmt:
-			PreventCommandIfReadOnly(CreateCommandTag(parsetree));
-			PreventCommandIfParallelMode(CreateCommandTag(parsetree));
-			break;
+		case T_TruncateStmt:
+		case T_ViewStmt:
+			{
+				/* DDL is not read-only, and neither is TRUNCATE. */
+				return COMMAND_IS_NOT_READ_ONLY;
+			}
+
+		case T_AlterSystemStmt:
+			{
+				/*
+				 * Surprisingly, ALTER SYSTEM meets all our definitions of
+				 * read-only: it changes nothing that affects the output of
+				 * pg_dump, it doesn't write WAL or imperil the application
+				 * of future WAL, and it doesn't depend on any state that needs
+				 * to be synchronized with parallel workers.
+				 *
+				 * So, despite the fact that it writes to a file, it's read
+				 * only!
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_CallStmt:
+		case T_DoStmt:
+			{
+				/*
+				 * Commands inside the DO block or the called procedure might
+				 * not be read only, but they'll be checked separately when we
+				 * try to execute them.  Here we only need to worry about the
+				 * DO or CALL command itself.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_CheckPointStmt:
+			{
+				/*
+				 * You might think that this should not be permitted in
+				 * recovery, but we interpret a CHECKPOINT command during
+				 * recovery as a request for a restartpoint instead. We allow
+				 * this since it can be a useful way of reducing switchover
+				 * time when using various forms of replication.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ClosePortalStmt:
+		case T_ConstraintsSetStmt:
+		case T_DeallocateStmt:
+		case T_DeclareCursorStmt:
+		case T_DiscardStmt:
+		case T_ExecuteStmt:
+		case T_FetchStmt:
+		case T_LoadStmt:
+		case T_PrepareStmt:
+		case T_UnlistenStmt:
+		case T_VariableSetStmt:
+			{
+				/*
+				 * These modify only backend-local state, so they're OK to
+				 * run in a read-only transaction or on a standby. However,
+				 * they are disallowed in parallel mode, because they either
+				 * rely upon or modify backend-local state that might not be
+				 * synchronized among cooperating backends.
+				 */
+				return COMMAND_OK_IN_RECOVERY | COMMAND_OK_IN_READ_ONLY_TXN;
+			}
+
+		case T_ClusterStmt:
+		case T_ReindexStmt:
+		case T_VacuumStmt:
+			{
+				/*
+				 * These commands write WAL, so they're not strictly read-only,
+				 * and running them in parallel workers isn't supported.
+				 *
+				 * However, they don't change the database state in a way that
+				 * would affect pg_dump output, so it's fine to run them in a
+				 * read-only transaction. (CLUSTER might change the order of
+				 * rows on disk, which could affect the ordering of pg_dump
+				 * output, but that's not semantically significant.)
+				 */
+				return COMMAND_OK_IN_READ_ONLY_TXN;
+			}
+
+		case T_CopyStmt:
+			{
+				CopyStmt *stmt = (CopyStmt *) parsetree;
+
+				/*
+				 * You might think that COPY FROM is not at all read only,
+				 * but it's OK to copy into a temporary table, because that
+				 * wouldn't change the output of pg_dump.  If the target table
+				 * turns out to be non-temporary, DoCopy itself will call
+				 * PreventCommandIfReadOnly.
+				 */
+				if (stmt->is_from)
+					return COMMAND_OK_IN_READ_ONLY_TXN;
+				else
+					return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ExplainStmt:
+		case T_VariableShowStmt:
+			{
+				/*
+				 * These commands don't modify any data and are safe to run
+				 * in a parallel worker.
+				 */
+				return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_ListenStmt:
+		case T_NotifyStmt:
+			{
+				/*
+				 * NOTIFY requires an XID assignment, so it can't be permitted
+				 * on a standby. Perhaps LISTEN could, since without NOTIFY
+				 * it would be OK to just do nothing, at least until promotion,
+				 * but we currently prohibit it lest the user get the wrong
+				 * idea.
+				 *
+				 * (We do allow T_UnlistenStmt on a standby, though, because
+				 * it's a no-op.)
+				 */
+				return COMMAND_OK_IN_READ_ONLY_TXN;
+			}
+
+		case T_LockStmt:
+			{
+				LockStmt *stmt = (LockStmt *) parsetree;
+
+				/*
+				 * Only weaker locker modes are allowed during recovery. The
+				 * restrictions here must match those in LockAcquireExtended().
+				 */
+				if (stmt->mode > RowExclusiveLock)
+					return COMMAND_OK_IN_READ_ONLY_TXN;
+				else
+					return COMMAND_IS_STRICTLY_READ_ONLY;
+			}
+
+		case T_TransactionStmt:
+			{
+				TransactionStmt *stmt = (TransactionStmt *) parsetree;
+
+				/*
+				 * PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED all change
+				 * write WAL, so they're not read-only in the strict sense;
+				 * but the first and third do not change pg_dump output, so
+				 * they're OK in a read-only transactions.
+				 *
+				 * We also consider COMMIT PREPARED to be OK in a read-only
+				 * transaction environment, by way of exception.
+				 */
+				switch (stmt->kind)
+				{
+					case TRANS_STMT_BEGIN:
+					case TRANS_STMT_START:
+					case TRANS_STMT_COMMIT:
+					case TRANS_STMT_ROLLBACK:
+					case TRANS_STMT_SAVEPOINT:
+					case TRANS_STMT_RELEASE:
+					case TRANS_STMT_ROLLBACK_TO:
+						return COMMAND_IS_STRICTLY_READ_ONLY;
+
+					case TRANS_STMT_PREPARE:
+					case TRANS_STMT_COMMIT_PREPARED:
+					case TRANS_STMT_ROLLBACK_PREPARED:
+						return COMMAND_OK_IN_READ_ONLY_TXN;
+				}
+			}
+
 		default:
-			/* do nothing */
+			elog(ERROR, "unrecognized node type: %d",
+				 (int) nodeTag(parsetree));
 			break;
 	}
 }
@@ -231,8 +395,8 @@ check_xact_readonly(Node *parsetree)
 /*
  * PreventCommandIfReadOnly: throw error if XactReadOnly
  *
- * This is useful mainly to ensure consistency of the error message wording;
- * most callers have checked XactReadOnly for themselves.
+ * This is useful partly to ensure consistency of the error message wording;
+ * some callers have checked XactReadOnly for themselves.
  */
 void
 PreventCommandIfReadOnly(const char *cmdname)
@@ -249,8 +413,8 @@ PreventCommandIfReadOnly(const char *cmdname)
  * PreventCommandIfParallelMode: throw error if current (sub)transaction is
  * in parallel mode.
  *
- * This is useful mainly to ensure consistency of the error message wording;
- * most callers have checked IsInParallelMode() for themselves.
+ * This is useful partly to ensure consistency of the error message wording;
+ * some callers have checked IsInParallelMode() for themselves.
  */
 void
 PreventCommandIfParallelMode(const char *cmdname)
@@ -385,11 +549,25 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 	bool		isTopLevel = (context == PROCESS_UTILITY_TOPLEVEL);
 	bool		isAtomicContext = (!(context == PROCESS_UTILITY_TOPLEVEL || context == PROCESS_UTILITY_QUERY_NONATOMIC) || IsTransactionBlock());
 	ParseState *pstate;
+	int			readonly_flags;
 
 	/* This can recurse, so check for excessive recursion */
 	check_stack_depth();
 
-	check_xact_readonly(parsetree);
+	/* Prohibit read/write commands in read-only states. */
+	readonly_flags = ClassifyUtilityCommandAsReadOnly(parsetree);
+	if (readonly_flags != COMMAND_IS_STRICTLY_READ_ONLY &&
+		(XactReadOnly || IsInParallelMode()))
+	{
+		const char *commandtag = CreateCommandTag(parsetree);
+
+		if ((readonly_flags & COMMAND_OK_IN_READ_ONLY_TXN) == 0)
+			PreventCommandIfReadOnly(commandtag);
+		if ((readonly_flags & COMMAND_OK_IN_PARALLEL_MODE) == 0)
+			PreventCommandIfParallelMode(commandtag);
+		if ((readonly_flags & COMMAND_OK_IN_RECOVERY) == 0)
+			PreventCommandDuringRecovery(commandtag);
+	}
 
 	if (completionTag)
 		completionTag[0] = '\0';
@@ -449,7 +627,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 						break;
 
 					case TRANS_STMT_PREPARE:
-						PreventCommandDuringRecovery("PREPARE TRANSACTION");
 						if (!PrepareTransactionBlock(stmt->gid))
 						{
 							/* report unsuccessful commit in completionTag */
@@ -460,13 +637,11 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 
 					case TRANS_STMT_COMMIT_PREPARED:
 						PreventInTransactionBlock(isTopLevel, "COMMIT PREPARED");
-						PreventCommandDuringRecovery("COMMIT PREPARED");
 						FinishPreparedTransaction(stmt->gid, true);
 						break;
 
 					case TRANS_STMT_ROLLBACK_PREPARED:
 						PreventInTransactionBlock(isTopLevel, "ROLLBACK PREPARED");
-						PreventCommandDuringRecovery("ROLLBACK PREPARED");
 						FinishPreparedTransaction(stmt->gid, false);
 						break;
 
@@ -607,7 +782,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				NotifyStmt *stmt = (NotifyStmt *) parsetree;
 
-				PreventCommandDuringRecovery("NOTIFY");
 				Async_Notify(stmt->conditionname, stmt->payload);
 			}
 			break;
@@ -616,7 +790,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				ListenStmt *stmt = (ListenStmt *) parsetree;
 
-				PreventCommandDuringRecovery("LISTEN");
 				CheckRestrictedOperation("LISTEN");
 				Async_Listen(stmt->conditionname);
 			}
@@ -626,7 +799,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			{
 				UnlistenStmt *stmt = (UnlistenStmt *) parsetree;
 
-				/* we allow UNLISTEN during recovery, as it's a noop */
 				CheckRestrictedOperation("UNLISTEN");
 				if (stmt->conditionname)
 					Async_Unlisten(stmt->conditionname);
@@ -650,22 +822,11 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			break;
 
 		case T_ClusterStmt:
-			/* we choose to allow this during "read only" transactions */
-			PreventCommandDuringRecovery("CLUSTER");
-			/* forbidden in parallel mode due to CommandIsReadOnly */
 			cluster((ClusterStmt *) parsetree, isTopLevel);
 			break;
 
 		case T_VacuumStmt:
-			{
-				VacuumStmt *stmt = (VacuumStmt *) parsetree;
-
-				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery(stmt->is_vacuumcmd ?
-											 "VACUUM" : "ANALYZE");
-				/* forbidden in parallel mode due to CommandIsReadOnly */
-				ExecVacuum(pstate, stmt, isTopLevel);
-			}
+			ExecVacuum(pstate, (VacuumStmt *) parsetree, isTopLevel);
 			break;
 
 		case T_ExplainStmt:
@@ -740,7 +901,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 			 * outside a transaction block is presumed to be user error.
 			 */
 			RequireTransactionBlock(isTopLevel, "LOCK TABLE");
-			/* forbidden in parallel mode due to CommandIsReadOnly */
 			LockTableCommand((LockStmt *) parsetree);
 			break;
 
@@ -755,13 +915,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 						(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 						 errmsg("must be superuser to do CHECKPOINT")));
 
-			/*
-			 * You might think we should have a PreventCommandDuringRecovery()
-			 * here, but we interpret a CHECKPOINT command during recovery as
-			 * a request for a restartpoint instead. We allow this since it
-			 * can be a useful way of reducing switchover time when using
-			 * various forms of replication.
-			 */
 			RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
 							  (RecoveryInProgress() ? 0 : CHECKPOINT_FORCE));
 			break;
@@ -774,9 +927,6 @@ standard_ProcessUtility(PlannedStmt *pstmt,
 					PreventInTransactionBlock(isTopLevel,
 											  "REINDEX CONCURRENTLY");
 
-				/* we choose to allow this during "read only" transactions */
-				PreventCommandDuringRecovery("REINDEX");
-				/* forbidden in parallel mode due to CommandIsReadOnly */
 				switch (stmt->kind)
 				{
 					case REINDEX_OBJECT_INDEX:
diff --git a/src/include/tcop/utility.h b/src/include/tcop/utility.h
index 40551c45af..ef7daf837e 100644
--- a/src/include/tcop/utility.h
+++ b/src/include/tcop/utility.h
@@ -25,6 +25,37 @@ typedef enum
 	PROCESS_UTILITY_SUBCOMMAND	/* a portion of a query */
 } ProcessUtilityContext;
 
+/*
+ * These constants are used to describe the extent to which a particular
+ * command is read-only.
+ *
+ * COMMAND_OK_IN_READ_ONLY_TXN means that the command is permissible even when
+ * XactReadOnly is set. This bit should be set for commands that don't change
+ * the state of the database (data or schema) in a way that would affect the
+ * output of pg_dump.
+ *
+ * COMMAND_OK_IN_PARALLEL_MODE means that the command is permissible even
+ * when in parallel mode. Writing tuples is forbidden, as is anything that
+ * might confuse cooperating processes.
+ *
+ * COMMAND_OK_IN_RECOVERY means that the command is permissible even when in
+ * recovery. It can't write WAL, nor can it do things that would imperil
+ * replay of future WAL received from the master.
+ */
+#define COMMAND_OK_IN_READ_ONLY_TXN	0x0001
+#define COMMAND_OK_IN_PARALLEL_MODE	0x0002
+#define	COMMAND_OK_IN_RECOVERY		0x0004
+
+/*
+ * We say that a command is strictly read-only if it is sufficiently read-only
+ * for all purposes. For clarity, we also have a constant for commands that are
+ * in no way read-only.
+ */
+#define COMMAND_IS_STRICTLY_READ_ONLY \
+	(COMMAND_OK_IN_READ_ONLY_TXN | COMMAND_OK_IN_RECOVERY | \
+	 COMMAND_OK_IN_PARALLEL_MODE)
+#define COMMAND_IS_NOT_READ_ONLY	0
+
 /* Hook for plugins to get control in ProcessUtility() */
 typedef void (*ProcessUtility_hook_type) (PlannedStmt *pstmt,
 										  const char *queryString, ProcessUtilityContext context,
-- 
2.17.2 (Apple Git-113)

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#14)
Re: our checks for read-only queries are not great

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

I don't really remember, but that was basically the opinion I had
arrived at as I was reading through this current thread. Roughly
speaking, anything that changes the database state (data or schema) in a
way that would be reflected in a pg_dump output is not read-only.

OK, although I'd put some emphasis on "roughly speaking".

ALTER SYSTEM is read only in my mind.

I'm still having trouble with this conclusion. I think it can only
be justified by a very narrow reading of "reflected in pg_dump"
that relies on the specific factorization we've chosen for upgrade
operations, ie that postgresql.conf mods have to be carried across
by hand. But that's mostly historical baggage, rather than a sane
basis for defining "read only". If somebody comes up with a patch
that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
will you then decide that ALTER SYSTEM SET is no longer read-only?
Or, perhaps, reject such a patch on the grounds that it breaks this
arbitrary definition of read-only-ness?

As another example, do we need to consider that replacing pg_hba.conf
via pg_write_file should be allowed in a "read only" transaction?

These conclusions seem obviously silly to me, but perhaps YMMV.

regards, tom lane

#17Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#16)
Re: our checks for read-only queries are not great

On Fri, Jan 10, 2020 at 9:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

If somebody comes up with a patch
that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
will you then decide that ALTER SYSTEM SET is no longer read-only?
Or, perhaps, reject such a patch on the grounds that it breaks this
arbitrary definition of read-only-ness?

I would vote to reject such a patch as a confused muddle. I mean,
generally, the expectation right now is that if you move your data
from the current cluster to a new one by pg_dump, pg_upgrade, or even
by promoting a standby, you're responsible for making sure that
postgresql.conf and postgresql.auto.conf get copied over separately.
In the last case, the backup that created the standby will have copied
the postgresql.conf from the master as it existed at that time, but
propagating any subsequent changes is up to you. Now, if we now decide
to shove ALTER SYSTEM SET commands into pg_dumpall output, then
suddenly you're changing that rule, and it's not very clear what the
new rule is.

Now, our current approach is fairly arguable. Given that GUCs on
databases, users, functions, etc. are stored in the catalogs and
subject to backup, restore, replication, etc., one might well expect
that global settings would be handled the same way. I tend to think
that would be nicer, though it would require solving the problem of
how to back out bad changes that make the database not start up.
Regardless of what you or anybody thinks about that, though, it's not
how it works today and would require some serious engineering if we
wanted to make it happen.

As another example, do we need to consider that replacing pg_hba.conf
via pg_write_file should be allowed in a "read only" transaction?

I don't really see what the problem with that is. It bothers me a lot
more that CLUSTER can be run in a read-only transaction -- which
actually changes stuff inside the database, even if not necessarily in
a user-visible way -- than it does that somebody might be able to use
the database to change something that isn't really part of the
database anyway. And pg_hba.conf, like postgresql.conf, is largely
treated as an input to the database rather than part of it.

Somebody could create a user-defined function that launches a
satellite into orbit and that is, I would argue, a write operation in
the truest sense. You have changed the state of the world in a lasting
way, and you cannot take it back. But, it's not writing to the
database, so as far as read-only transactions are concerned, who
cares?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#17)
Re: our checks for read-only queries are not great

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Fri, Jan 10, 2020 at 9:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

If somebody comes up with a patch
that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
will you then decide that ALTER SYSTEM SET is no longer read-only?
Or, perhaps, reject such a patch on the grounds that it breaks this
arbitrary definition of read-only-ness?

I would vote to reject such a patch as a confused muddle. I mean,
generally, the expectation right now is that if you move your data
from the current cluster to a new one by pg_dump, pg_upgrade, or even
by promoting a standby, you're responsible for making sure that
postgresql.conf and postgresql.auto.conf get copied over separately.

I really don't like that the ALTER SYSTEM SET/postgresql.auto.conf stuff
has to be handled in some way external to logical export/import, or
external to pg_upgrade (particularly in link mode), so I generally see
where Tom's coming from with that suggestion.

In general, I don't think people should be expected to hand-muck around
with anything in the data directory.

In the last case, the backup that created the standby will have copied
the postgresql.conf from the master as it existed at that time, but
propagating any subsequent changes is up to you. Now, if we now decide
to shove ALTER SYSTEM SET commands into pg_dumpall output, then
suddenly you're changing that rule, and it's not very clear what the
new rule is.

I'd like a rule of "users don't muck with the data directory", and we
are nearly there when you include sensible packaging such as what Debian
provides- by moving postrgesql.conf, log files, etc, outside of the data
directory. For things that can't be moved out of the data directory
though, like postgresql.auto.conf, we should be handling those
transparently to the user.

I agree that there are some interesting cases to consider here though-
like doing a pg_dumpall against a standby resulting in something
different than if you did it against the primary because the
postgresql.auto.conf is different between them (something that I'm
generally supportive of having, and it seems everyone else is too). I
think having an option to control if the postgresql.auto.conf settings
are included or not in the pg_dumpall output would be a reasonable way
to deal with that though.

Now, our current approach is fairly arguable. Given that GUCs on
databases, users, functions, etc. are stored in the catalogs and
subject to backup, restore, replication, etc., one might well expect
that global settings would be handled the same way. I tend to think
that would be nicer, though it would require solving the problem of
how to back out bad changes that make the database not start up.
Regardless of what you or anybody thinks about that, though, it's not
how it works today and would require some serious engineering if we
wanted to make it happen.

This sounds an awful lot like the arguments that I tried to make when
ALTER SYSTEM SET was first going in, but what's done is done and there's
not much to do but make the best of it as I can't imagine there'd be
much support for ripping it out.

I don't really agree about the need for 'some serious engineering'
either, but having an option for it, sure.

I do also tend to agree with Tom about making ALTER SYSTEM SET be
prohibited in explicitly read-only transactions, but still allowing it
to be run against replicas as that's a handy thing to be able to do.

As another example, do we need to consider that replacing pg_hba.conf
via pg_write_file should be allowed in a "read only" transaction?

I don't really see what the problem with that is. It bothers me a lot
more that CLUSTER can be run in a read-only transaction -- which
actually changes stuff inside the database, even if not necessarily in
a user-visible way -- than it does that somebody might be able to use
the database to change something that isn't really part of the
database anyway. And pg_hba.conf, like postgresql.conf, is largely
treated as an input to the database rather than part of it.

I don't like that CLUSTER can be run in a read-only transaction either
(though it seems like downthread maybe some people are fine with
that..). I'm also coming around to the idea that pg_write_file()
probably shouldn't be allowed either, and probably not COPY TO either
(except to stdout, since that's a result, not a change operation).

Somebody could create a user-defined function that launches a
satellite into orbit and that is, I would argue, a write operation in
the truest sense. You have changed the state of the world in a lasting
way, and you cannot take it back. But, it's not writing to the
database, so as far as read-only transactions are concerned, who
cares?

I suppose there's another thing to think about in this discussion, which
are FDWs, if the idea is that read-only means "I don't want to make
changes in THIS database". I don't really feel like that's what marking
a transaction as 'read only' is intended to mean though.

When I think of starting a read-only transaction, I feel like it's
usually with the idea of "I want to play it safe and I don't want this
transaction to make ANY changes". I'm feeling more inclined that we
should be going out of our way to make darn sure that we respect that
request of the user, no matter what it is they're running. We can't
prevent user-created C-level functions from launching satellites, but
that's an untrusted language and therefore it's up to the function
author to manage the transaction and privilege system properly anyway,
not ours.

Thanks,

Stephen

#19Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tom Lane (#16)
Re: our checks for read-only queries are not great

On Fri, 2020-01-10 at 09:29 -0500, Tom Lane wrote:

ALTER SYSTEM is read only in my mind.

I'm still having trouble with this conclusion. I think it can only
be justified by a very narrow reading of "reflected in pg_dump"
that relies on the specific factorization we've chosen for upgrade
operations, ie that postgresql.conf mods have to be carried across
by hand. But that's mostly historical baggage, rather than a sane
basis for defining "read only". If somebody comes up with a patch
that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
will you then decide that ALTER SYSTEM SET is no longer read-only?

I think that having ALTER SYSTEM commands in pg_dumpall output
would be a problem. It would cause all kinds of problems whenever
parameters change. Thinking of the transition "checkpoint_segments"
-> "max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

Or, perhaps, reject such a patch on the grounds that it breaks this
arbitrary definition of read-only-ness?

I agree with Robert that such a patch should be rejected on other
grounds.

Concerning the topic of the thread, I personally have come to think
that changing GUCs is *not* writing to the database. But that is based
on the fact that you can change GUCs on streaming replication standbys,
and it may be surprising to a newcomer.

Perhaps it would be good to consider this question:
Do we call something "read-only" if it changes nothing, or do we call it
"read-only" if it is allowed on a streaming replication standby?
The first would be more correct, but the second may be more convenient.

Yours,
Laurenz Albe

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Laurenz Albe (#19)
Re: our checks for read-only queries are not great

Laurenz Albe <laurenz.albe@cybertec.at> writes:

Perhaps it would be good to consider this question:
Do we call something "read-only" if it changes nothing, or do we call it
"read-only" if it is allowed on a streaming replication standby?
The first would be more correct, but the second may be more convenient.

Yeah, this is really the larger point at stake. I'm not sure that
"read-only" and "allowed on standby" should be identical, nor even
that one should be an exact subset of the other. They're certainly
by-and-large the same sets of operations, but there might be
exceptions that belong to only one set. "read-only" is driven by
(some reading of) the SQL standard, while "allowed on standby" is
driven by implementation limitations, so I think it'd be dangerous
to commit ourselves to those being identical.

regards, tom lane

#21Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Robert Haas (#15)
Re: our checks for read-only queries are not great

On 2020-01-10 14:41, Robert Haas wrote:

This rule very nearly matches the current behavior: it explains why
temp table operations are allowed, and why ALTER SYSTEM is allowed,
and why REINDEX etc. are allowed. However, there's a notable
exception: PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED are allowed
in a read-only transaction. Under the "doesn't change pg_dump output"
criteria, the first and third ones should be permitted but COMMIT
PREPARED should be denied, except maybe if the prepared transaction
didn't do any writes (and in that case, why did we bother preparing
it?). Despite that, this rule does a way better job explaining the
current behavior than anything else suggested so far.

I don't follow. Does pg_dump dump prepared transactions?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#22Stephen Frost
sfrost@snowman.net
In reply to: Laurenz Albe (#19)
Re: our checks for read-only queries are not great

Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:

On Fri, 2020-01-10 at 09:29 -0500, Tom Lane wrote:

ALTER SYSTEM is read only in my mind.

I'm still having trouble with this conclusion. I think it can only
be justified by a very narrow reading of "reflected in pg_dump"
that relies on the specific factorization we've chosen for upgrade
operations, ie that postgresql.conf mods have to be carried across
by hand. But that's mostly historical baggage, rather than a sane
basis for defining "read only". If somebody comes up with a patch
that causes "pg_dumpall -g" to include ALTER SYSTEM SET commands for
whatever is in postgresql.auto.conf (not an unreasonable idea BTW),
will you then decide that ALTER SYSTEM SET is no longer read-only?

I think that having ALTER SYSTEM commands in pg_dumpall output
would be a problem. It would cause all kinds of problems whenever
parameters change. Thinking of the transition "checkpoint_segments"
-> "max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

pg_dump already specifically has understanding of how to deal with old
options in other things when constructing a dump for a given version-
and we already have issues that a dump taken with pg_dump X has a good
chance of now being able to be restoreding into a PG X+1, that's why
it's recommended to use the pg_dump for the version of PG you're
intending to restore into, so I don't particularly agree with any of the
arguments presented above.

Or, perhaps, reject such a patch on the grounds that it breaks this
arbitrary definition of read-only-ness?

I agree with Robert that such a patch should be rejected on other
grounds.

Concerning the topic of the thread, I personally have come to think
that changing GUCs is *not* writing to the database. But that is based
on the fact that you can change GUCs on streaming replication standbys,
and it may be surprising to a newcomer.

Perhaps it would be good to consider this question:
Do we call something "read-only" if it changes nothing, or do we call it
"read-only" if it is allowed on a streaming replication standby?
The first would be more correct, but the second may be more convenient.

The two are distinct from each other and one doesn't imply the other. I
don't think we need to, or really want to, force them to be the same.

When we're talking about a "read-only" transaction that the user has
specifically asked be "read-only" then, imv anyway, we should be pretty
stringent regarding what that's allowed to do and shouldn't be allowing
that to change state in the system which other processes will see after
the transaction is over.

A transaction (on a primary or a replica) doesn't need to be started as
explicitly "read-only" and perhaps we should change the language when we
are starting up to say "database is ready to accept replica connections"
or something instead of "read-only" connections to clarify that.

Thanks,

Stephen

#23Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#21)
Re: our checks for read-only queries are not great

On Mon, Jan 13, 2020 at 5:57 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-01-10 14:41, Robert Haas wrote:

This rule very nearly matches the current behavior: it explains why
temp table operations are allowed, and why ALTER SYSTEM is allowed,
and why REINDEX etc. are allowed. However, there's a notable
exception: PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED are allowed
in a read-only transaction. Under the "doesn't change pg_dump output"
criteria, the first and third ones should be permitted but COMMIT
PREPARED should be denied, except maybe if the prepared transaction
didn't do any writes (and in that case, why did we bother preparing
it?). Despite that, this rule does a way better job explaining the
current behavior than anything else suggested so far.

I don't follow. Does pg_dump dump prepared transactions?

No, but committing one changes the database contents as seen by a
subsequent pg_dump.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#24Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Stephen Frost (#22)
Re: our checks for read-only queries are not great

On Mon, 2020-01-13 at 13:56 -0500, Stephen Frost wrote:

I think that having ALTER SYSTEM commands in pg_dumpall output
would be a problem. It would cause all kinds of problems whenever
parameters change. Thinking of the transition "checkpoint_segments"
-> "max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

pg_dump already specifically has understanding of how to deal with old
options in other things when constructing a dump for a given version-
and we already have issues that a dump taken with pg_dump X has a good
chance of now being able to be restoreding into a PG X+1, that's why
it's recommended to use the pg_dump for the version of PG you're
intending to restore into, so I don't particularly agree with any of the
arguments presented above.

Right.
But increasing the difficulty of restoring a version x pg_dump with
version x + 1 is still not a thing we should lightly do.

Not that the docs currently say "it is recommended to use pg_dumpall
from the newer version". They don't say "is is not supported".

Yours,
Laurenz Albe

#25Stephen Frost
sfrost@snowman.net
In reply to: Laurenz Albe (#24)
Re: our checks for read-only queries are not great

Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:

On Mon, 2020-01-13 at 13:56 -0500, Stephen Frost wrote:

I think that having ALTER SYSTEM commands in pg_dumpall output
would be a problem. It would cause all kinds of problems whenever
parameters change. Thinking of the transition "checkpoint_segments"
-> "max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

pg_dump already specifically has understanding of how to deal with old
options in other things when constructing a dump for a given version-
and we already have issues that a dump taken with pg_dump X has a good
chance of now being able to be restoreding into a PG X+1, that's why
it's recommended to use the pg_dump for the version of PG you're
intending to restore into, so I don't particularly agree with any of the
arguments presented above.

Right.
But increasing the difficulty of restoring a version x pg_dump with
version x + 1 is still not a thing we should lightly do.

I've never heard that and I don't agree with it being a justification
for blocking sensible progress.

Not that the docs currently say "it is recommended to use pg_dumpall
from the newer version". They don't say "is is not supported".

It's recommended due to exactly the reasons presented and no one is
saying that such isn't supported- but we don't and aren't going to
guarantee that it's going to work. We absolutely know of cases where it
just won't work, today. If that's justification for saying it's not
supported, then fine, let's change the language to say that.

Thanks,

Stephen

#26Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#25)
Re: our checks for read-only queries are not great

On Mon, Jan 13, 2020 at 3:00 PM Stephen Frost <sfrost@snowman.net> wrote:

I've never heard that and I don't agree with it being a justification
for blocking sensible progress.

Speaking of sensible progress, I think we've drifted off on a tangent
here about ALTER SYSTEM. As I understand it, nobody's opposed to the
most recent version (v3) of the proposed patch, which also makes no
definitional changes relative to the status quo, but does fix some
bugs, and makes things a little nicer for parallel query, too. So I'd
like to go ahead and commit that.

Discussing of what to do about ALTER SYSTEM can continue, although I
feel perhaps the current discussion isn't particularly productive. On
the one hand, the argument that it isn't read only because it writes
data someplace doesn't convince me: practically every command can
cause some kind of write some place, e.g. SELECT can write WAL for at
least 2 different reasons, and that does not make it not read-only,
nor does the fact that it updates the table statistics. The question
of what data is being written must be relevant. On the other hand, I'm
unpersuaded by the arguments so far that including ALTER SYSTEM
commands in pg_dump output would be anything other than a train wreck,
though doing it optionally and not by default might be OK. However,
the main thing for me is that messing around with the behavior of
ALTER SYSTEM in either of those ways or some other is not what this
patch is about. I'm just proposing to refactor the code to fix the
existing bugs and make it much less likely that future patches will
create new ones, and I think reclassifying or redesigning ALTER SYSTEM
ought to be done, if at all, separately.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#26)
Re: our checks for read-only queries are not great

Robert Haas <robertmhaas@gmail.com> writes:

Speaking of sensible progress, I think we've drifted off on a tangent
here about ALTER SYSTEM.

Agreed, that's not terribly relevant for the proposed patch.

regards, tom lane

#28Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#27)
Re: our checks for read-only queries are not great

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Speaking of sensible progress, I think we've drifted off on a tangent
here about ALTER SYSTEM.

Agreed, that's not terribly relevant for the proposed patch.

I agree that the proposed patch seems alright by itself, as the changes
it's making to existing behavior seem to all be bug-fixes and pretty
clear improvements not really related to 'read-only' transactions.

It's unfortunate that we haven't been able to work through to some kind
of agreement around what "SET TRANSACTION READ ONLY" means, so that
users of it can know what to expect.

Thanks,

Stephen

#29Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Robert Haas (#23)
Re: our checks for read-only queries are not great

On 2020-01-13 20:14, Robert Haas wrote:

On Mon, Jan 13, 2020 at 5:57 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 2020-01-10 14:41, Robert Haas wrote:

This rule very nearly matches the current behavior: it explains why
temp table operations are allowed, and why ALTER SYSTEM is allowed,
and why REINDEX etc. are allowed. However, there's a notable
exception: PREPARE, COMMIT PREPARED, and ROLLBACK PREPARED are allowed
in a read-only transaction. Under the "doesn't change pg_dump output"
criteria, the first and third ones should be permitted but COMMIT
PREPARED should be denied, except maybe if the prepared transaction
didn't do any writes (and in that case, why did we bother preparing
it?). Despite that, this rule does a way better job explaining the
current behavior than anything else suggested so far.

I don't follow. Does pg_dump dump prepared transactions?

No, but committing one changes the database contents as seen by a
subsequent pg_dump.

Well, if the transaction was declared read-only, then committing it
(directly or 2PC) shouldn't change anything. This appears to be a
circular argument.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#30Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#29)
Re: our checks for read-only queries are not great

On Wed, Jan 15, 2020 at 10:25 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Well, if the transaction was declared read-only, then committing it
(directly or 2PC) shouldn't change anything. This appears to be a
circular argument.

I don't think it's a circular argument. Suppose that someone decrees
that, as of 5pm Eastern time, no more read-write transactions are
permitted. And because the person issuing the decree has a lot of
power, everybody obeys. Now, every pg_dump taken after that time will
be semantically equivalent to every other pg_dump taken after that
time, with one tiny exception. That exception is that someone could
still do COMMIT PREPARED of a read-write transaction that was prepared
before 5pm. If the goal of the powerful person who issued the decree
was to make sure that the database couldn't change - e.g. so they
could COPY each table individually without keeping a snapshot open and
still get a consistent backup - they might fail to achieve it if, as
of the moment of the freeze, there are some prepared write
transactions.

I'm not saying we have to change the behavior or anything. I'm just
saying that there seems to be one, and only one, way to make the
apparent contents of the database change in a read-only transaction
right now. And that's a COMMIT PREPARED of a read-write transaction.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#31Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#30)
Re: our checks for read-only queries are not great

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Jan 15, 2020 at 10:25 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Well, if the transaction was declared read-only, then committing it
(directly or 2PC) shouldn't change anything. This appears to be a
circular argument.

I don't think it's a circular argument. Suppose that someone decrees
that, as of 5pm Eastern time, no more read-write transactions are
permitted. And because the person issuing the decree has a lot of
power, everybody obeys. Now, every pg_dump taken after that time will
be semantically equivalent to every other pg_dump taken after that
time, with one tiny exception. That exception is that someone could
still do COMMIT PREPARED of a read-write transaction that was prepared
before 5pm. If the goal of the powerful person who issued the decree
was to make sure that the database couldn't change - e.g. so they
could COPY each table individually without keeping a snapshot open and
still get a consistent backup - they might fail to achieve it if, as
of the moment of the freeze, there are some prepared write
transactions.

I'm not saying we have to change the behavior or anything. I'm just
saying that there seems to be one, and only one, way to make the
apparent contents of the database change in a read-only transaction
right now. And that's a COMMIT PREPARED of a read-write transaction.

Yeah, allowing a read-only transaction to start and then commit a
read-write transaction doesn't seem sensible. I'd be in favor of
changing that.

Thanks,

Stephen

#32Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#28)
Re: our checks for read-only queries are not great

On Tue, Jan 14, 2020 at 1:46 PM Stephen Frost <sfrost@snowman.net> wrote:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Speaking of sensible progress, I think we've drifted off on a tangent
here about ALTER SYSTEM.

Agreed, that's not terribly relevant for the proposed patch.

I agree that the proposed patch seems alright by itself, as the changes
it's making to existing behavior seem to all be bug-fixes and pretty
clear improvements not really related to 'read-only' transactions.

There seems to be no disagreement on this point, so I have committed the patch.

It's unfortunate that we haven't been able to work through to some kind
of agreement around what "SET TRANSACTION READ ONLY" means, so that
users of it can know what to expect.

I at least feel like we have a pretty good handle on what it was
intended to mean; that is, "doesn't cause semantically significant
changes to pg_dump output." I do hear some skepticism as to whether
that's the best definition, but it has pretty good explanatory power
relative to the current state of the code, which is something.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#33Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#32)
Re: our checks for read-only queries are not great

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Tue, Jan 14, 2020 at 1:46 PM Stephen Frost <sfrost@snowman.net> wrote:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Speaking of sensible progress, I think we've drifted off on a tangent
here about ALTER SYSTEM.

Agreed, that's not terribly relevant for the proposed patch.

I agree that the proposed patch seems alright by itself, as the changes
it's making to existing behavior seem to all be bug-fixes and pretty
clear improvements not really related to 'read-only' transactions.

There seems to be no disagreement on this point, so I have committed the patch.

Works for me.

It's unfortunate that we haven't been able to work through to some kind
of agreement around what "SET TRANSACTION READ ONLY" means, so that
users of it can know what to expect.

I at least feel like we have a pretty good handle on what it was
intended to mean; that is, "doesn't cause semantically significant
changes to pg_dump output." I do hear some skepticism as to whether
that's the best definition, but it has pretty good explanatory power
relative to the current state of the code, which is something.

I think I agree with you regarding the original intent, though even
there, as discussed elsewhere, it seems like there's perhaps either a
bug or a disagreement about the specifics of what that means when it
relates to committing a 2-phase transaction. Still, setting that aside
for the moment, do we feel like this is enough to be able to update our
documentation with?

Thanks,

Stephen

#34Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#33)
Re: our checks for read-only queries are not great

On Thu, Jan 16, 2020 at 12:22 PM Stephen Frost <sfrost@snowman.net> wrote:

I think I agree with you regarding the original intent, though even
there, as discussed elsewhere, it seems like there's perhaps either a
bug or a disagreement about the specifics of what that means when it
relates to committing a 2-phase transaction. Still, setting that aside
for the moment, do we feel like this is enough to be able to update our
documentation with?

I think that would be possible. What did you have in mind?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#35Bruce Momjian
bruce@momjian.us
In reply to: Stephen Frost (#22)
Re: our checks for read-only queries are not great

On Mon, Jan 13, 2020 at 01:56:30PM -0500, Stephen Frost wrote:

I think that having ALTER SYSTEM commands in pg_dumpall output
would be a problem. It would cause all kinds of problems whenever
parameters change. Thinking of the transition "checkpoint_segments"
-> "max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

pg_dump already specifically has understanding of how to deal with old
options in other things when constructing a dump for a given version-
and we already have issues that a dump taken with pg_dump X has a good
chance of now being able to be restoreding into a PG X+1, that's why
it's recommended to use the pg_dump for the version of PG you're
intending to restore into, so I don't particularly agree with any of the
arguments presented above.

One issue is that system table GUC settings (e.g., per-database,
per-user) cannot include postgresql.conf-only settings, like
max_wal_size, so system tables GUC settings are less likely to be
renamed than postgresql.conf.auto settings. FYI, we are more inclined
to allow postgresql.conf-only changes than others because there is less
impact on applications.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
#36Stephen Frost
sfrost@snowman.net
In reply to: Bruce Momjian (#35)
Re: our checks for read-only queries are not great

Greetings,

* Bruce Momjian (bruce@momjian.us) wrote:

On Mon, Jan 13, 2020 at 01:56:30PM -0500, Stephen Frost wrote:

I think that having ALTER SYSTEM commands in pg_dumpall output
would be a problem. It would cause all kinds of problems whenever
parameters change. Thinking of the transition "checkpoint_segments"
-> "max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

pg_dump already specifically has understanding of how to deal with old
options in other things when constructing a dump for a given version-
and we already have issues that a dump taken with pg_dump X has a good
chance of now being able to be restoreding into a PG X+1, that's why
it's recommended to use the pg_dump for the version of PG you're
intending to restore into, so I don't particularly agree with any of the
arguments presented above.

One issue is that system table GUC settings (e.g., per-database,
per-user) cannot include postgresql.conf-only settings, like
max_wal_size, so system tables GUC settings are less likely to be
renamed than postgresql.conf.auto settings. FYI, we are more inclined
to allow postgresql.conf-only changes than others because there is less
impact on applications.

I'm a bit unclear about what's being suggested here. When you are
talking about 'applications', are you referring specifically to pg_dump
and pg_restore, or are you talking about regular user applications?

If you're referring to pg_dump/restore, then what I'm understanding from
your comments is that if we made pg_dump/restore aware of ALTER SYSTEM
and were made to support it that we would then be less inclined to
change the names of postgresql.conf-only settings because, if we do so,
we have to update pg_dump/restore.

I can see some argument in that direction but my initial reaction is
that I don't feel like the bar would really be moved very far, and, if
we came up with some mapping from one to the other for those, I actually
think it'd be really helpful downstream for packagers and such who
routinely are dealing with updating from an older postgresql.conf file
to a newer one when an upgrade is done.

Thanks,

Stephen

#37Bruce Momjian
bruce@momjian.us
In reply to: Stephen Frost (#36)
Re: our checks for read-only queries are not great

On Mon, Jan 27, 2020 at 02:26:42PM -0500, Stephen Frost wrote:

Greetings,

* Bruce Momjian (bruce@momjian.us) wrote:

On Mon, Jan 13, 2020 at 01:56:30PM -0500, Stephen Frost wrote:

I think that having ALTER SYSTEM commands in pg_dumpall output
would be a problem. It would cause all kinds of problems whenever
parameters change. Thinking of the transition "checkpoint_segments"
-> "max_wal_size", you'd have to build some translation magic into pg_dump.
Besides, such a feature would make it harder to restore a dump taken
with version x into version x + n for n > 0.

pg_dump already specifically has understanding of how to deal with old
options in other things when constructing a dump for a given version-
and we already have issues that a dump taken with pg_dump X has a good
chance of now being able to be restoreding into a PG X+1, that's why
it's recommended to use the pg_dump for the version of PG you're
intending to restore into, so I don't particularly agree with any of the
arguments presented above.

One issue is that system table GUC settings (e.g., per-database,
per-user) cannot include postgresql.conf-only settings, like
max_wal_size, so system tables GUC settings are less likely to be
renamed than postgresql.conf.auto settings. FYI, we are more inclined
to allow postgresql.conf-only changes than others because there is less
impact on applications.

I'm a bit unclear about what's being suggested here. When you are
talking about 'applications', are you referring specifically to pg_dump
and pg_restore, or are you talking about regular user applications?

Sorry for the late reply. I meant all applications.

If you're referring to pg_dump/restore, then what I'm understanding from
your comments is that if we made pg_dump/restore aware of ALTER SYSTEM
and were made to support it that we would then be less inclined to
change the names of postgresql.conf-only settings because, if we do so,
we have to update pg_dump/restore.

I can see some argument in that direction but my initial reaction is
that I don't feel like the bar would really be moved very far, and, if
we came up with some mapping from one to the other for those, I actually
think it'd be really helpful downstream for packagers and such who
routinely are dealing with updating from an older postgresql.conf file
to a newer one when an upgrade is done.

I should have given more examples. Changing GUC variables like
search_path or work_mem, which can be set in per-database, per-user, and
per-session contexts, is more disruptive than changing GUCs that can
only can be set in postgresql.conf, like max_wal_size. My point is that
all GUC changes don't have the same level of disruption.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +