Improve behavior of concurrent TRUNCATE

Started by Michael Paquierover 7 years ago13 messages
#1Michael Paquier
michael@paquier.xyz
1 attachment(s)

Hi all,

This is a second thread I am spawning for the previous thread "Canceling
authentication due to timeout aka Denial of Service Attack":
/messages/by-id/152512087100.19803.12733865831237526317@wrigleys.postgresql.org

And this time the discussion is about TRUNCATE, as the former thread
discussed about a family of failures hence it is harder to catch the
proper attention. The problem is that when we look for the relation OID
of the relation to truncate, we don't use a callback with
RangeVarGetRelidExtended, causing a lock acquire attempt to happen
before checking the privileges on the relation for the user running the
command.

Attached is a patch I have been working on which refactors the code of
TRUNCATE in such a way that we check for privileges before trying to
acquire a lock, without any user-facing impact (I have reworked a couple
of comments compared to the last version). This includes a set of tests
showing the new behavior.

Like cbe24a6, perhaps we would not want to back-patch it? Based on the
past history (and the consensus being reached for the REINDEX case would
be to patch only HEAD), I would be actually incline to not back-patch
this stuff and qualify that as an improvement. That's also less work
for me at commit :)

Thoughts?
--
Michael

Attachments:

0001-Refactor-TRUNCATE-execution-to-avoid-too-early-lock-.patchtext/x-diff; charset=us-asciiDownload
From 9946e5d2025cd6b8ec5baa214dab06a32f3b20a8 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Mon, 6 Aug 2018 18:43:20 +0200
Subject: [PATCH] Refactor TRUNCATE execution to avoid too early lock lookups

A caller of TRUNCATE can perform early lookup obtention which can cause
other sessions to block on the request done, causing potentially DOS
attacks as even a non-privileged user can attempt a truncation of a
critical catalog table to block even all incoming connection attempts.

This commit fixes the case of TRUNCATE so as RangeVarGetRelidExtended is
used with a callback doing the necessary checks at an earlier stage,
avoiding locking issues at early stages, so as a failure happens for
unprivileged users instead of waiting on a lock.
---
 src/backend/commands/tablecmds.c              | 84 +++++++++++++++----
 .../isolation/expected/truncate-conflict.out  | 45 ++++++++++
 src/test/isolation/isolation_schedule         |  1 +
 .../isolation/specs/truncate-conflict.spec    | 27 ++++++
 4 files changed, 139 insertions(+), 18 deletions(-)
 create mode 100644 src/test/isolation/expected/truncate-conflict.out
 create mode 100644 src/test/isolation/specs/truncate-conflict.spec

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index eb2d33dd86..fd240cb635 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -300,7 +300,10 @@ struct DropRelationCallbackState
 #define child_dependency_type(child_is_partition)	\
 	((child_is_partition) ? DEPENDENCY_AUTO : DEPENDENCY_NORMAL)
 
-static void truncate_check_rel(Relation rel);
+static void truncate_check_rel(Oid relid, Form_pg_class reltuple);
+static void truncate_check_activity(Relation rel);
+static void RangeVarCallbackForTruncate(const RangeVar *relation,
+							Oid relId, Oid oldRelId, void *arg);
 static List *MergeAttributes(List *schema, List *supers, char relpersistence,
 				bool is_partition, List **supOids, List **supconstr,
 				int *supOidCount);
@@ -1336,15 +1339,25 @@ ExecuteTruncate(TruncateStmt *stmt)
 		bool		recurse = rv->inh;
 		Oid			myrelid;
 
-		rel = heap_openrv(rv, AccessExclusiveLock);
-		myrelid = RelationGetRelid(rel);
+		myrelid = RangeVarGetRelidExtended(rv, AccessExclusiveLock,
+										   0, RangeVarCallbackForTruncate,
+										   NULL);
+
+		/* open the relation, we already hold a lock on it */
+		rel = heap_open(myrelid, NoLock);
+
 		/* don't throw error for "TRUNCATE foo, foo" */
 		if (list_member_oid(relids, myrelid))
 		{
 			heap_close(rel, AccessExclusiveLock);
 			continue;
 		}
-		truncate_check_rel(rel);
+
+		/*
+		 * RangeVarGetRelidExtended has done some basic checks with its
+		 * callback, and only remain session-level checks.
+		 */
+		truncate_check_activity(rel);
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 		/* Log this relation only if needed for logical decoding */
@@ -1367,7 +1380,9 @@ ExecuteTruncate(TruncateStmt *stmt)
 
 				/* find_all_inheritors already got lock */
 				rel = heap_open(childrelid, NoLock);
-				truncate_check_rel(rel);
+				truncate_check_rel(RelationGetRelid(rel), rel->rd_rel);
+				truncate_check_activity(rel);
+
 				rels = lappend(rels, rel);
 				relids = lappend_oid(relids, childrelid);
 				/* Log this relation only if needed for logical decoding */
@@ -1450,7 +1465,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
 				ereport(NOTICE,
 						(errmsg("truncate cascades to table \"%s\"",
 								RelationGetRelationName(rel))));
-				truncate_check_rel(rel);
+				truncate_check_rel(relid, rel->rd_rel);
+				truncate_check_activity(rel);
 				rels = lappend(rels, rel);
 				relids = lappend_oid(relids, relid);
 				/* Log this relation only if needed for logical decoding */
@@ -1700,38 +1716,48 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
 }
 
 /*
- * Check that a given rel is safe to truncate.  Subroutine for ExecuteTruncate
+ * Check that a given relation is safe to truncate.  Subroutine for
+ * ExecuteTruncate and RangeVarCallbackForTruncate.
  */
 static void
-truncate_check_rel(Relation rel)
+truncate_check_rel(Oid relid, Form_pg_class reltuple)
 {
 	AclResult	aclresult;
+	char	   *relname = NameStr(reltuple->relname);
 
 	/*
 	 * Only allow truncate on regular tables and partitioned tables (although,
 	 * the latter are only being included here for the following checks; no
 	 * physical truncation will occur in their case.)
 	 */
-	if (rel->rd_rel->relkind != RELKIND_RELATION &&
-		rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+	if (reltuple->relkind != RELKIND_RELATION &&
+
+		reltuple->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("\"%s\" is not a table",
-						RelationGetRelationName(rel))));
+				 errmsg("\"%s\" is not a table", relname)));
 
 	/* Permissions checks */
-	aclresult = pg_class_aclcheck(RelationGetRelid(rel), GetUserId(),
-								  ACL_TRUNCATE);
+	aclresult = pg_class_aclcheck(relid, GetUserId(), ACL_TRUNCATE);
 	if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
-					   RelationGetRelationName(rel));
+		aclcheck_error(aclresult, get_relkind_objtype(reltuple->relkind),
+					   relname);
 
-	if (!allowSystemTableMods && IsSystemRelation(rel))
+	if (!allowSystemTableMods && IsSystemClass(relid, reltuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
-						RelationGetRelationName(rel))));
+						relname)));
+}
 
+/*
+ * Set of extra sanity checks to check if a given relation is safe to
+ * truncate.  This is split with truncate_check_rel as
+ * RangeVarCallbackForTruncate can only call the former.
+ */
+static void
+truncate_check_activity(Relation rel)
+{
 	/*
 	 * Don't allow truncate on temp tables of other backends ... their local
 	 * buffer manager is not going to cope.
@@ -13419,6 +13445,28 @@ RangeVarCallbackOwnsTable(const RangeVar *relation,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relId)), relation->relname);
 }
 
+/*
+ * Callback to RangeVarGetRelidExtended() for TRUNCATE processing.
+ */
+static void
+RangeVarCallbackForTruncate(const RangeVar *relation,
+							Oid relId, Oid oldRelId, void *arg)
+{
+	HeapTuple	tuple;
+
+	/* Nothing to do if the relation was not found. */
+	if (!OidIsValid(relId))
+		return;
+
+	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relId));
+	if (!HeapTupleIsValid(tuple))	/* should not happen */
+		elog(ERROR, "cache lookup failed for relation %u", relId);
+
+	truncate_check_rel(relId, (Form_pg_class) GETSTRUCT(tuple));
+
+	ReleaseSysCache(tuple);
+}
+
 /*
  * Callback to RangeVarGetRelidExtended(), similar to
  * RangeVarCallbackOwnsTable() but without checks on the type of the relation.
diff --git a/src/test/isolation/expected/truncate-conflict.out b/src/test/isolation/expected/truncate-conflict.out
new file mode 100644
index 0000000000..6938528380
--- /dev/null
+++ b/src/test/isolation/expected/truncate-conflict.out
@@ -0,0 +1,45 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_begin s1_catalog_lookup s2_auth s2_truncate s1_commit
+step s1_begin: BEGIN;
+step s1_catalog_lookup: SELECT count(*) >= 0 FROM pg_stat_activity;
+?column?       
+
+t              
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE pg_authid;
+ERROR:  permission denied for table pg_authid
+step s1_commit: COMMIT;
+
+starting permutation: s1_begin s2_auth s2_truncate s1_catalog_lookup s1_commit
+step s1_begin: BEGIN;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE pg_authid;
+ERROR:  permission denied for table pg_authid
+step s1_catalog_lookup: SELECT count(*) >= 0 FROM pg_stat_activity;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+
+starting permutation: s1_begin s2_auth s1_catalog_lookup s2_truncate s1_commit
+step s1_begin: BEGIN;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s1_catalog_lookup: SELECT count(*) >= 0 FROM pg_stat_activity;
+?column?       
+
+t              
+step s2_truncate: TRUNCATE pg_authid;
+ERROR:  permission denied for table pg_authid
+step s1_commit: COMMIT;
+
+starting permutation: s2_auth s2_truncate s1_begin s1_catalog_lookup s1_commit
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE pg_authid;
+ERROR:  permission denied for table pg_authid
+step s1_begin: BEGIN;
+step s1_catalog_lookup: SELECT count(*) >= 0 FROM pg_stat_activity;
+?column?       
+
+t              
+step s1_commit: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index d5594e80e2..48ae740739 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -76,3 +76,4 @@ test: partition-key-update-2
 test: partition-key-update-3
 test: partition-key-update-4
 test: plpgsql-toast
+test: truncate-conflict
diff --git a/src/test/isolation/specs/truncate-conflict.spec b/src/test/isolation/specs/truncate-conflict.spec
new file mode 100644
index 0000000000..f70aca9f69
--- /dev/null
+++ b/src/test/isolation/specs/truncate-conflict.spec
@@ -0,0 +1,27 @@
+# Test for lock lookup conflicts with TRUNCATE when working on unowned
+# relations, particularly catalogs should trigger an ERROR for all the
+# scenarios present here.
+
+setup
+{
+	CREATE ROLE regress_truncate_conflict;
+}
+
+teardown
+{
+	DROP ROLE regress_truncate_conflict;
+}
+
+session "s1"
+step "s1_begin"          { BEGIN; }
+step "s1_catalog_lookup" { SELECT count(*) >= 0 FROM pg_stat_activity; }
+step "s1_commit"         { COMMIT; }
+
+session "s2"
+step "s2_auth"           { SET ROLE regress_truncate_conflict; }
+step "s2_truncate"       { TRUNCATE pg_authid; }
+
+permutation "s1_begin" "s1_catalog_lookup" "s2_auth" "s2_truncate" "s1_commit"
+permutation "s1_begin" "s2_auth" "s2_truncate" "s1_catalog_lookup" "s1_commit"
+permutation "s1_begin" "s2_auth" "s1_catalog_lookup" "s2_truncate" "s1_commit"
+permutation "s2_auth" "s2_truncate" "s1_begin" "s1_catalog_lookup" "s1_commit"
-- 
2.18.0

#2Bossart, Nathan
bossartn@amazon.com
In reply to: Michael Paquier (#1)
Re: Improve behavior of concurrent TRUNCATE

Hi,

On 8/6/18, 11:59 AM, "Michael Paquier" <michael@paquier.xyz> wrote:

Attached is a patch I have been working on which refactors the code of
TRUNCATE in such a way that we check for privileges before trying to
acquire a lock, without any user-facing impact (I have reworked a couple
of comments compared to the last version). This includes a set of tests
showing the new behavior.

Here are some comments for the latest version of the patch.

-truncate_check_rel(Relation rel)
+truncate_check_rel(Oid relid, Form_pg_class reltuple)

Could we use HeapTupleGetOid(reltuple) instead of passing the OID
separately?

-	if (rel->rd_rel->relkind != RELKIND_RELATION &&
-		rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+	if (reltuple->relkind != RELKIND_RELATION &&
+
+		reltuple->relkind != RELKIND_PARTITIONED_TABLE)

There appears to be an extra empty line here.

+# Test for lock lookup conflicts with TRUNCATE when working on unowned
+# relations, particularly catalogs should trigger an ERROR for all the
+# scenarios present here.

If possible, it might be worth it to check that TRUNCATE still blocks
when a role has privileges, too.

Like cbe24a6, perhaps we would not want to back-patch it? Based on the
past history (and the consensus being reached for the REINDEX case would
be to patch only HEAD), I would be actually incline to not back-patch
this stuff and qualify that as an improvement. That's also less work
for me at commit :)

Since the only behavior this patch changes is the order of locking and
permission checks, my vote would be to back-patch. However, since the
REINDEX piece won't be back-patched, I can see why we might not here,
either.

Nathan

#3Michael Paquier
michael@paquier.xyz
In reply to: Bossart, Nathan (#2)
1 attachment(s)
Re: Improve behavior of concurrent TRUNCATE

On Wed, Aug 08, 2018 at 03:38:57PM +0000, Bossart, Nathan wrote:

Here are some comments for the latest version of the patch.

Thanks for the review, Nathan!

-truncate_check_rel(Relation rel)
+truncate_check_rel(Oid relid, Form_pg_class reltuple)

Could we use HeapTupleGetOid(reltuple) instead of passing the OID
separately?

If that was a HeapTuple. And it seems to me that we had better make
truncate_check_rel() depend directly on a Form_pg_class instead of
allowing the caller pass anything and deform the tuple within the
routine.

-	if (rel->rd_rel->relkind != RELKIND_RELATION &&
-		rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+	if (reltuple->relkind != RELKIND_RELATION &&
+
+		reltuple->relkind != RELKIND_PARTITIONED_TABLE)

There appears to be an extra empty line here.

Fixed. I don't know where this has come from.

+# Test for lock lookup conflicts with TRUNCATE when working on unowned
+# relations, particularly catalogs should trigger an ERROR for all the
+# scenarios present here.

If possible, it might be worth it to check that TRUNCATE still blocks
when a role has privileges, too.

Good idea. I have added more tests for that. Doing a truncate on
pg_authid for installcheck was a very bad idea anyway (even if it failed
all the time), so I have switched to a custom table, with a GRANT
command to control the access with a custom role.

Since the only behavior this patch changes is the order of locking and
permission checks, my vote would be to back-patch. However, since the
REINDEX piece won't be back-patched, I can see why we might not here,
either.

This patch is a bit more invasive than the REINDEX one to be honest, and
as this is getting qualified as an improvement, at least on this thread,
I would be inclined to be more restrictive and just patch HEAD, but not
v11.

Attached is an updated patch with all your suggestions added.
--
Michael

Attachments:

0001-Refactor-TRUNCATE-execution-to-avoid-too-early-lock-.patchtext/x-diff; charset=us-asciiDownload
From 4402dd2dd8b7b2672e43ebb2407013520898e8bc Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Thu, 9 Aug 2018 12:25:55 +0200
Subject: [PATCH] Refactor TRUNCATE execution to avoid too early lock lookups

A caller of TRUNCATE can perform early lookup obtention which can cause
other sessions to block on the request done, causing potentially DOS
attacks as even a non-privileged user can attempt a truncation of a
critical catalog table to block even all incoming connection attempts.

This commit fixes the case of TRUNCATE so as RangeVarGetRelidExtended is
used with a callback doing the necessary checks at an earlier stage,
avoiding locking issues at early stages, so as a failure happens for
unprivileged users instead of waiting on a lock.
---
 src/backend/commands/tablecmds.c              | 83 ++++++++++++----
 .../isolation/expected/truncate-conflict.out  | 99 +++++++++++++++++++
 src/test/isolation/isolation_schedule         |  1 +
 .../isolation/specs/truncate-conflict.spec    | 38 +++++++
 4 files changed, 203 insertions(+), 18 deletions(-)
 create mode 100644 src/test/isolation/expected/truncate-conflict.out
 create mode 100644 src/test/isolation/specs/truncate-conflict.spec

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index eb2d33dd86..c9f157d764 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -300,7 +300,10 @@ struct DropRelationCallbackState
 #define child_dependency_type(child_is_partition)	\
 	((child_is_partition) ? DEPENDENCY_AUTO : DEPENDENCY_NORMAL)
 
-static void truncate_check_rel(Relation rel);
+static void truncate_check_rel(Oid relid, Form_pg_class reltuple);
+static void truncate_check_activity(Relation rel);
+static void RangeVarCallbackForTruncate(const RangeVar *relation,
+							Oid relId, Oid oldRelId, void *arg);
 static List *MergeAttributes(List *schema, List *supers, char relpersistence,
 				bool is_partition, List **supOids, List **supconstr,
 				int *supOidCount);
@@ -1336,15 +1339,25 @@ ExecuteTruncate(TruncateStmt *stmt)
 		bool		recurse = rv->inh;
 		Oid			myrelid;
 
-		rel = heap_openrv(rv, AccessExclusiveLock);
-		myrelid = RelationGetRelid(rel);
+		myrelid = RangeVarGetRelidExtended(rv, AccessExclusiveLock,
+										   0, RangeVarCallbackForTruncate,
+										   NULL);
+
+		/* open the relation, we already hold a lock on it */
+		rel = heap_open(myrelid, NoLock);
+
 		/* don't throw error for "TRUNCATE foo, foo" */
 		if (list_member_oid(relids, myrelid))
 		{
 			heap_close(rel, AccessExclusiveLock);
 			continue;
 		}
-		truncate_check_rel(rel);
+
+		/*
+		 * RangeVarGetRelidExtended has done some basic checks with its
+		 * callback, and only remain session-level checks.
+		 */
+		truncate_check_activity(rel);
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 		/* Log this relation only if needed for logical decoding */
@@ -1367,7 +1380,9 @@ ExecuteTruncate(TruncateStmt *stmt)
 
 				/* find_all_inheritors already got lock */
 				rel = heap_open(childrelid, NoLock);
-				truncate_check_rel(rel);
+				truncate_check_rel(RelationGetRelid(rel), rel->rd_rel);
+				truncate_check_activity(rel);
+
 				rels = lappend(rels, rel);
 				relids = lappend_oid(relids, childrelid);
 				/* Log this relation only if needed for logical decoding */
@@ -1450,7 +1465,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
 				ereport(NOTICE,
 						(errmsg("truncate cascades to table \"%s\"",
 								RelationGetRelationName(rel))));
-				truncate_check_rel(rel);
+				truncate_check_rel(relid, rel->rd_rel);
+				truncate_check_activity(rel);
 				rels = lappend(rels, rel);
 				relids = lappend_oid(relids, relid);
 				/* Log this relation only if needed for logical decoding */
@@ -1700,38 +1716,47 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
 }
 
 /*
- * Check that a given rel is safe to truncate.  Subroutine for ExecuteTruncate
+ * Check that a given relation is safe to truncate.  Subroutine for
+ * ExecuteTruncate and RangeVarCallbackForTruncate.
  */
 static void
-truncate_check_rel(Relation rel)
+truncate_check_rel(Oid relid, Form_pg_class reltuple)
 {
 	AclResult	aclresult;
+	char	   *relname = NameStr(reltuple->relname);
 
 	/*
 	 * Only allow truncate on regular tables and partitioned tables (although,
 	 * the latter are only being included here for the following checks; no
 	 * physical truncation will occur in their case.)
 	 */
-	if (rel->rd_rel->relkind != RELKIND_RELATION &&
-		rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+	if (reltuple->relkind != RELKIND_RELATION &&
+		reltuple->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("\"%s\" is not a table",
-						RelationGetRelationName(rel))));
+				 errmsg("\"%s\" is not a table", relname)));
 
 	/* Permissions checks */
-	aclresult = pg_class_aclcheck(RelationGetRelid(rel), GetUserId(),
-								  ACL_TRUNCATE);
+	aclresult = pg_class_aclcheck(relid, GetUserId(), ACL_TRUNCATE);
 	if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
-					   RelationGetRelationName(rel));
+		aclcheck_error(aclresult, get_relkind_objtype(reltuple->relkind),
+					   relname);
 
-	if (!allowSystemTableMods && IsSystemRelation(rel))
+	if (!allowSystemTableMods && IsSystemClass(relid, reltuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
-						RelationGetRelationName(rel))));
+						relname)));
+}
 
+/*
+ * Set of extra sanity checks to check if a given relation is safe to
+ * truncate.  This is split with truncate_check_rel as
+ * RangeVarCallbackForTruncate can only call the former.
+ */
+static void
+truncate_check_activity(Relation rel)
+{
 	/*
 	 * Don't allow truncate on temp tables of other backends ... their local
 	 * buffer manager is not going to cope.
@@ -13419,6 +13444,28 @@ RangeVarCallbackOwnsTable(const RangeVar *relation,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relId)), relation->relname);
 }
 
+/*
+ * Callback to RangeVarGetRelidExtended() for TRUNCATE processing.
+ */
+static void
+RangeVarCallbackForTruncate(const RangeVar *relation,
+							Oid relId, Oid oldRelId, void *arg)
+{
+	HeapTuple	tuple;
+
+	/* Nothing to do if the relation was not found. */
+	if (!OidIsValid(relId))
+		return;
+
+	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relId));
+	if (!HeapTupleIsValid(tuple))	/* should not happen */
+		elog(ERROR, "cache lookup failed for relation %u", relId);
+
+	truncate_check_rel(relId, (Form_pg_class) GETSTRUCT(tuple));
+
+	ReleaseSysCache(tuple);
+}
+
 /*
  * Callback to RangeVarGetRelidExtended(), similar to
  * RangeVarCallbackOwnsTable() but without checks on the type of the relation.
diff --git a/src/test/isolation/expected/truncate-conflict.out b/src/test/isolation/expected/truncate-conflict.out
new file mode 100644
index 0000000000..2c10f8d40d
--- /dev/null
+++ b/src/test/isolation/expected/truncate-conflict.out
@@ -0,0 +1,99 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_begin s1_tab_lookup s2_auth s2_truncate s1_commit s2_reset
+step s1_begin: BEGIN;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+ERROR:  permission denied for table truncate_tab
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s2_auth s2_truncate s1_tab_lookup s1_commit s2_reset
+step s1_begin: BEGIN;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+ERROR:  permission denied for table truncate_tab
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s2_auth s1_tab_lookup s2_truncate s1_commit s2_reset
+step s1_begin: BEGIN;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s2_truncate: TRUNCATE truncate_tab;
+ERROR:  permission denied for table truncate_tab
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s2_auth s2_truncate s1_begin s1_tab_lookup s1_commit s2_reset
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+ERROR:  permission denied for table truncate_tab
+step s1_begin: BEGIN;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s1_tab_lookup s2_grant s2_auth s2_truncate s1_commit s2_reset
+step s1_begin: BEGIN;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s2_grant: GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab; <waiting ...>
+step s1_commit: COMMIT;
+step s2_truncate: <... completed>
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s2_grant s2_auth s2_truncate s1_tab_lookup s1_commit s2_reset
+step s1_begin: BEGIN;
+step s2_grant: GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s2_grant s2_auth s1_tab_lookup s2_truncate s1_commit s2_reset
+step s1_begin: BEGIN;
+step s2_grant: GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s2_truncate: TRUNCATE truncate_tab; <waiting ...>
+step s1_commit: COMMIT;
+step s2_truncate: <... completed>
+step s2_reset: RESET ROLE;
+
+starting permutation: s2_grant s2_auth s2_truncate s1_begin s1_tab_lookup s1_commit s2_reset
+step s2_grant: GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+step s1_begin: BEGIN;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index d5594e80e2..48ae740739 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -76,3 +76,4 @@ test: partition-key-update-2
 test: partition-key-update-3
 test: partition-key-update-4
 test: plpgsql-toast
+test: truncate-conflict
diff --git a/src/test/isolation/specs/truncate-conflict.spec b/src/test/isolation/specs/truncate-conflict.spec
new file mode 100644
index 0000000000..bbacd7aa3e
--- /dev/null
+++ b/src/test/isolation/specs/truncate-conflict.spec
@@ -0,0 +1,38 @@
+# Test for lock lookup conflicts with TRUNCATE when working on unowned
+# relations, particularly catalogs should trigger an ERROR for all the
+# scenarios present here.
+
+setup
+{
+	CREATE ROLE regress_truncate_conflict;
+	CREATE TABLE truncate_tab (a int);
+}
+
+teardown
+{
+	DROP TABLE truncate_tab;
+	DROP ROLE regress_truncate_conflict;
+}
+
+session "s1"
+step "s1_begin"          { BEGIN; }
+step "s1_tab_lookup"     { SELECT count(*) >= 0 FROM truncate_tab; }
+step "s1_commit"         { COMMIT; }
+
+session "s2"
+step "s2_grant"          { GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict; }
+step "s2_auth"           { SET ROLE regress_truncate_conflict; }
+step "s2_truncate"       { TRUNCATE truncate_tab; }
+step "s2_reset"          { RESET ROLE; }
+
+# TRUNCATE will directly fail
+permutation "s1_begin" "s1_tab_lookup" "s2_auth" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_auth" "s2_truncate" "s1_tab_lookup" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_auth" "s1_tab_lookup" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s2_auth" "s2_truncate" "s1_begin" "s1_tab_lookup" "s1_commit" "s2_reset"
+
+# TRUNCATE will block
+permutation "s1_begin" "s1_tab_lookup" "s2_grant" "s2_auth" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_grant" "s2_auth" "s2_truncate" "s1_tab_lookup" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_grant" "s2_auth" "s1_tab_lookup" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s2_grant" "s2_auth" "s2_truncate" "s1_begin" "s1_tab_lookup" "s1_commit" "s2_reset"
-- 
2.18.0

#4Bossart, Nathan
bossartn@amazon.com
In reply to: Michael Paquier (#3)
Re: Improve behavior of concurrent TRUNCATE

On 8/9/18, 5:29 AM, "Michael Paquier" <michael@paquier.xyz> wrote:

-truncate_check_rel(Relation rel)
+truncate_check_rel(Oid relid, Form_pg_class reltuple)

Could we use HeapTupleGetOid(reltuple) instead of passing the OID
separately?

If that was a HeapTuple. And it seems to me that we had better make
truncate_check_rel() depend directly on a Form_pg_class instead of
allowing the caller pass anything and deform the tuple within the
routine.

Got it. I agree, it makes sense to force the caller to provide a
Form_pg_class.

+# Test for lock lookup conflicts with TRUNCATE when working on unowned
+# relations, particularly catalogs should trigger an ERROR for all the
+# scenarios present here.

If possible, it might be worth it to check that TRUNCATE still blocks
when a role has privileges, too.

Good idea. I have added more tests for that. Doing a truncate on
pg_authid for installcheck was a very bad idea anyway (even if it failed
all the time), so I have switched to a custom table, with a GRANT
command to control the access with a custom role.

Good call. Accidentally truncating pg_authid might have led to some
interesting test results...

Since the only behavior this patch changes is the order of locking and
permission checks, my vote would be to back-patch. However, since the
REINDEX piece won't be back-patched, I can see why we might not here,
either.

This patch is a bit more invasive than the REINDEX one to be honest, and
as this is getting qualified as an improvement, at least on this thread,
I would be inclined to be more restrictive and just patch HEAD, but not
v11.

Alright.

Attached is an updated patch with all your suggestions added.

Thanks! This patch builds cleanly, the new tests pass, and my manual
testing hasn't uncovered any issues. Notably, I cannot reproduce the
originally reported authentication issue by using TRUNCATE after this
change. Beyond a few small comment wording suggestions below, it
looks good to me.

+		/*
+		 * RangeVarGetRelidExtended has done some basic checks with its
+		 * callback, and only remain session-level checks.
+		 */

This is definitely a nitpick, but I might rephrase this to "...so only
session-level checks remain" or to "...but we still need to do
session-level checks."

+/*
+ * Set of extra sanity checks to check if a given relation is safe to
+ * truncate.  This is split with truncate_check_rel as
+ * RangeVarCallbackForTruncate can only call the former.
+ */
+static void
+truncate_check_activity(Relation rel)

It might be worth mentioning why RangeVarCallbackForTruncate can only
call truncate_check_rel() (we haven't opened the relation yet).

+# Test for lock lookup conflicts with TRUNCATE when working on unowned
+# relations, particularly catalogs should trigger an ERROR for all the
+# scenarios present here.

Since we are testing a few more things now, perhaps this could just be
"Test for locking conflicts with TRUNCATE commands."

+# TRUNCATE will directly fail

Maybe we could say something more like this:
If the role doesn't have privileges to TRUNCATE the table,
TRUNCATE should immediately fail without waiting for a lock.

+# TRUNCATE will block
+permutation "s1_begin" "s1_tab_lookup" "s2_grant" "s2_auth" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_grant" "s2_auth" "s2_truncate" "s1_tab_lookup" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_grant" "s2_auth" "s1_tab_lookup" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s2_grant" "s2_auth" "s2_truncate" "s1_begin" "s1_tab_lookup" "s1_commit" "s2_reset"

The second and fourth tests don't seem to actually block. Perhaps we
could reword the comment to say something like this:
If the role has privileges to TRUNCATE the table, TRUNCATE
will block if another session holds a lock on the table.

Nathan

#5Michael Paquier
michael@paquier.xyz
In reply to: Bossart, Nathan (#4)
1 attachment(s)
Re: Improve behavior of concurrent TRUNCATE

On Thu, Aug 09, 2018 at 03:27:04PM +0000, Bossart, Nathan wrote:

Thanks! This patch builds cleanly, the new tests pass, and my manual
testing hasn't uncovered any issues. Notably, I cannot reproduce the
originally reported authentication issue by using TRUNCATE after this
change. Beyond a few small comment wording suggestions below, it
looks good to me.

Thanks, I have updated the patch as you suggested. Any more
improvements to it that you can foresee?

The second and fourth tests don't seem to actually block.

Yeah, that's intentional.
--
Michael

Attachments:

0001-Refactor-TRUNCATE-execution-to-avoid-too-early-lock-.patchtext/x-diff; charset=us-asciiDownload
From c19f8a362d4ddd72a7d5a5b0e12a1a45693a0aca Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Thu, 9 Aug 2018 18:27:49 +0200
Subject: [PATCH] Refactor TRUNCATE execution to avoid too early lock lookups

A caller of TRUNCATE can perform early lookup obtention which can cause
other sessions to block on the request done, causing potentially DOS
attacks as even a non-privileged user can attempt a truncation of a
critical catalog table to block even all incoming connection attempts.

This commit fixes the case of TRUNCATE so as RangeVarGetRelidExtended is
used with a callback doing the necessary checks at an earlier stage,
avoiding locking issues at early stages, so as a failure happens for
unprivileged users instead of waiting on a lock.
---
 src/backend/commands/tablecmds.c              | 84 ++++++++++++----
 .../isolation/expected/truncate-conflict.out  | 99 +++++++++++++++++++
 src/test/isolation/isolation_schedule         |  1 +
 .../isolation/specs/truncate-conflict.spec    | 38 +++++++
 4 files changed, 204 insertions(+), 18 deletions(-)
 create mode 100644 src/test/isolation/expected/truncate-conflict.out
 create mode 100644 src/test/isolation/specs/truncate-conflict.spec

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index eb2d33dd86..c327038a09 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -300,7 +300,10 @@ struct DropRelationCallbackState
 #define child_dependency_type(child_is_partition)	\
 	((child_is_partition) ? DEPENDENCY_AUTO : DEPENDENCY_NORMAL)
 
-static void truncate_check_rel(Relation rel);
+static void truncate_check_rel(Oid relid, Form_pg_class reltuple);
+static void truncate_check_activity(Relation rel);
+static void RangeVarCallbackForTruncate(const RangeVar *relation,
+							Oid relId, Oid oldRelId, void *arg);
 static List *MergeAttributes(List *schema, List *supers, char relpersistence,
 				bool is_partition, List **supOids, List **supconstr,
 				int *supOidCount);
@@ -1336,15 +1339,26 @@ ExecuteTruncate(TruncateStmt *stmt)
 		bool		recurse = rv->inh;
 		Oid			myrelid;
 
-		rel = heap_openrv(rv, AccessExclusiveLock);
-		myrelid = RelationGetRelid(rel);
+		myrelid = RangeVarGetRelidExtended(rv, AccessExclusiveLock,
+										   0, RangeVarCallbackForTruncate,
+										   NULL);
+
+		/* open the relation, we already hold a lock on it */
+		rel = heap_open(myrelid, NoLock);
+
 		/* don't throw error for "TRUNCATE foo, foo" */
 		if (list_member_oid(relids, myrelid))
 		{
 			heap_close(rel, AccessExclusiveLock);
 			continue;
 		}
-		truncate_check_rel(rel);
+
+		/*
+		 * RangeVarGetRelidExtended has done some basic checks with its
+		 * callback, but session-level checks remain.
+		 */
+		truncate_check_activity(rel);
+
 		rels = lappend(rels, rel);
 		relids = lappend_oid(relids, myrelid);
 		/* Log this relation only if needed for logical decoding */
@@ -1367,7 +1381,9 @@ ExecuteTruncate(TruncateStmt *stmt)
 
 				/* find_all_inheritors already got lock */
 				rel = heap_open(childrelid, NoLock);
-				truncate_check_rel(rel);
+				truncate_check_rel(RelationGetRelid(rel), rel->rd_rel);
+				truncate_check_activity(rel);
+
 				rels = lappend(rels, rel);
 				relids = lappend_oid(relids, childrelid);
 				/* Log this relation only if needed for logical decoding */
@@ -1450,7 +1466,8 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
 				ereport(NOTICE,
 						(errmsg("truncate cascades to table \"%s\"",
 								RelationGetRelationName(rel))));
-				truncate_check_rel(rel);
+				truncate_check_rel(relid, rel->rd_rel);
+				truncate_check_activity(rel);
 				rels = lappend(rels, rel);
 				relids = lappend_oid(relids, relid);
 				/* Log this relation only if needed for logical decoding */
@@ -1700,38 +1717,47 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
 }
 
 /*
- * Check that a given rel is safe to truncate.  Subroutine for ExecuteTruncate
+ * Check that a given relation is safe to truncate.  Subroutine for
+ * ExecuteTruncate and RangeVarCallbackForTruncate.
  */
 static void
-truncate_check_rel(Relation rel)
+truncate_check_rel(Oid relid, Form_pg_class reltuple)
 {
 	AclResult	aclresult;
+	char	   *relname = NameStr(reltuple->relname);
 
 	/*
 	 * Only allow truncate on regular tables and partitioned tables (although,
 	 * the latter are only being included here for the following checks; no
 	 * physical truncation will occur in their case.)
 	 */
-	if (rel->rd_rel->relkind != RELKIND_RELATION &&
-		rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+	if (reltuple->relkind != RELKIND_RELATION &&
+		reltuple->relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-				 errmsg("\"%s\" is not a table",
-						RelationGetRelationName(rel))));
+				 errmsg("\"%s\" is not a table", relname)));
 
 	/* Permissions checks */
-	aclresult = pg_class_aclcheck(RelationGetRelid(rel), GetUserId(),
-								  ACL_TRUNCATE);
+	aclresult = pg_class_aclcheck(relid, GetUserId(), ACL_TRUNCATE);
 	if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, get_relkind_objtype(rel->rd_rel->relkind),
-					   RelationGetRelationName(rel));
+		aclcheck_error(aclresult, get_relkind_objtype(reltuple->relkind),
+					   relname);
 
-	if (!allowSystemTableMods && IsSystemRelation(rel))
+	if (!allowSystemTableMods && IsSystemClass(relid, reltuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("permission denied: \"%s\" is a system catalog",
-						RelationGetRelationName(rel))));
+						relname)));
+}
 
+/*
+ * Set of extra sanity checks to check if a given relation is safe to
+ * truncate.  This is split with truncate_check_rel as
+ * RangeVarCallbackForTruncate cannot open a Relation yet.
+ */
+static void
+truncate_check_activity(Relation rel)
+{
 	/*
 	 * Don't allow truncate on temp tables of other backends ... their local
 	 * buffer manager is not going to cope.
@@ -13419,6 +13445,28 @@ RangeVarCallbackOwnsTable(const RangeVar *relation,
 		aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(get_rel_relkind(relId)), relation->relname);
 }
 
+/*
+ * Callback to RangeVarGetRelidExtended() for TRUNCATE processing.
+ */
+static void
+RangeVarCallbackForTruncate(const RangeVar *relation,
+							Oid relId, Oid oldRelId, void *arg)
+{
+	HeapTuple	tuple;
+
+	/* Nothing to do if the relation was not found. */
+	if (!OidIsValid(relId))
+		return;
+
+	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relId));
+	if (!HeapTupleIsValid(tuple))	/* should not happen */
+		elog(ERROR, "cache lookup failed for relation %u", relId);
+
+	truncate_check_rel(relId, (Form_pg_class) GETSTRUCT(tuple));
+
+	ReleaseSysCache(tuple);
+}
+
 /*
  * Callback to RangeVarGetRelidExtended(), similar to
  * RangeVarCallbackOwnsTable() but without checks on the type of the relation.
diff --git a/src/test/isolation/expected/truncate-conflict.out b/src/test/isolation/expected/truncate-conflict.out
new file mode 100644
index 0000000000..2c10f8d40d
--- /dev/null
+++ b/src/test/isolation/expected/truncate-conflict.out
@@ -0,0 +1,99 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_begin s1_tab_lookup s2_auth s2_truncate s1_commit s2_reset
+step s1_begin: BEGIN;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+ERROR:  permission denied for table truncate_tab
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s2_auth s2_truncate s1_tab_lookup s1_commit s2_reset
+step s1_begin: BEGIN;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+ERROR:  permission denied for table truncate_tab
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s2_auth s1_tab_lookup s2_truncate s1_commit s2_reset
+step s1_begin: BEGIN;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s2_truncate: TRUNCATE truncate_tab;
+ERROR:  permission denied for table truncate_tab
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s2_auth s2_truncate s1_begin s1_tab_lookup s1_commit s2_reset
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+ERROR:  permission denied for table truncate_tab
+step s1_begin: BEGIN;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s1_tab_lookup s2_grant s2_auth s2_truncate s1_commit s2_reset
+step s1_begin: BEGIN;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s2_grant: GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab; <waiting ...>
+step s1_commit: COMMIT;
+step s2_truncate: <... completed>
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s2_grant s2_auth s2_truncate s1_tab_lookup s1_commit s2_reset
+step s1_begin: BEGIN;
+step s2_grant: GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
+
+starting permutation: s1_begin s2_grant s2_auth s1_tab_lookup s2_truncate s1_commit s2_reset
+step s1_begin: BEGIN;
+step s2_grant: GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s2_truncate: TRUNCATE truncate_tab; <waiting ...>
+step s1_commit: COMMIT;
+step s2_truncate: <... completed>
+step s2_reset: RESET ROLE;
+
+starting permutation: s2_grant s2_auth s2_truncate s1_begin s1_tab_lookup s1_commit s2_reset
+step s2_grant: GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict;
+step s2_auth: SET ROLE regress_truncate_conflict;
+step s2_truncate: TRUNCATE truncate_tab;
+step s1_begin: BEGIN;
+step s1_tab_lookup: SELECT count(*) >= 0 FROM truncate_tab;
+?column?       
+
+t              
+step s1_commit: COMMIT;
+step s2_reset: RESET ROLE;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index d5594e80e2..48ae740739 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -76,3 +76,4 @@ test: partition-key-update-2
 test: partition-key-update-3
 test: partition-key-update-4
 test: plpgsql-toast
+test: truncate-conflict
diff --git a/src/test/isolation/specs/truncate-conflict.spec b/src/test/isolation/specs/truncate-conflict.spec
new file mode 100644
index 0000000000..3c1b1d1b34
--- /dev/null
+++ b/src/test/isolation/specs/truncate-conflict.spec
@@ -0,0 +1,38 @@
+# Tests for locking conflicts with TRUNCATE commands.
+
+setup
+{
+	CREATE ROLE regress_truncate_conflict;
+	CREATE TABLE truncate_tab (a int);
+}
+
+teardown
+{
+	DROP TABLE truncate_tab;
+	DROP ROLE regress_truncate_conflict;
+}
+
+session "s1"
+step "s1_begin"          { BEGIN; }
+step "s1_tab_lookup"     { SELECT count(*) >= 0 FROM truncate_tab; }
+step "s1_commit"         { COMMIT; }
+
+session "s2"
+step "s2_grant"          { GRANT TRUNCATE ON truncate_tab TO regress_truncate_conflict; }
+step "s2_auth"           { SET ROLE regress_truncate_conflict; }
+step "s2_truncate"       { TRUNCATE truncate_tab; }
+step "s2_reset"          { RESET ROLE; }
+
+# The role doesn't have privileges to truncate the table, so TRUNCATE should
+# immediately fail without waiting for a lock.
+permutation "s1_begin" "s1_tab_lookup" "s2_auth" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_auth" "s2_truncate" "s1_tab_lookup" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_auth" "s1_tab_lookup" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s2_auth" "s2_truncate" "s1_begin" "s1_tab_lookup" "s1_commit" "s2_reset"
+
+# The role has privileges to truncate the table, TRUNCATE will block if
+# another session holds a lock on the table and succeed in all cases.
+permutation "s1_begin" "s1_tab_lookup" "s2_grant" "s2_auth" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_grant" "s2_auth" "s2_truncate" "s1_tab_lookup" "s1_commit" "s2_reset"
+permutation "s1_begin" "s2_grant" "s2_auth" "s1_tab_lookup" "s2_truncate" "s1_commit" "s2_reset"
+permutation "s2_grant" "s2_auth" "s2_truncate" "s1_begin" "s1_tab_lookup" "s1_commit" "s2_reset"
-- 
2.18.0

#6Bossart, Nathan
bossartn@amazon.com
In reply to: Michael Paquier (#5)
Re: Improve behavior of concurrent TRUNCATE

On 8/9/18, 11:31 AM, "Michael Paquier" <michael@paquier.xyz> wrote:

Thanks, I have updated the patch as you suggested. Any more
improvements to it that you can foresee?

Looks good to me.

Nathan

#7Michael Paquier
michael@paquier.xyz
In reply to: Bossart, Nathan (#6)
Re: Improve behavior of concurrent TRUNCATE

On Thu, Aug 09, 2018 at 05:55:54PM +0000, Bossart, Nathan wrote:

On 8/9/18, 11:31 AM, "Michael Paquier" <michael@paquier.xyz> wrote:

Thanks, I have updated the patch as you suggested. Any more
improvements to it that you can foresee?

Looks good to me.

Okay, pushed to HEAD. Now remains the cases for VACUUM and we will be
good. I still need to look more deeply at the last patch sent..
--
Michael

#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#1)
Re: Improve behavior of concurrent TRUNCATE

On 2018-Aug-06, Michael Paquier wrote:

Attached is a patch I have been working on which refactors the code of
TRUNCATE in such a way that we check for privileges before trying to
acquire a lock, without any user-facing impact (I have reworked a couple
of comments compared to the last version). This includes a set of tests
showing the new behavior.

Like cbe24a6, perhaps we would not want to back-patch it? Based on the
past history (and the consensus being reached for the REINDEX case would
be to patch only HEAD), I would be actually incline to not back-patch
this stuff and qualify that as an improvement. That's also less work
for me at commit :)

I'm not sure I understand your arguments for not back-patching this.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#8)
Re: Improve behavior of concurrent TRUNCATE

On Fri, Aug 10, 2018 at 02:03:28PM -0400, Alvaro Herrera wrote:

On 2018-Aug-06, Michael Paquier wrote:

Like cbe24a6, perhaps we would not want to back-patch it? Based on the
past history (and the consensus being reached for the REINDEX case would
be to patch only HEAD), I would be actually incline to not back-patch
this stuff and qualify that as an improvement. That's also less work
for me at commit :)

I'm not sure I understand your arguments for not back-patching this.

Mainly consistency. Looking at the git history for such cases we have
not really bothered back-patching fixes and those have been qualified as
improvements. If we were to close all the holes mentioned in the
original DOS thread a back-patch to v11 could be thought as acceptable?
That's where the REINDEX fix has found its way after all, but that was
way less code churn, and we are post beta 3 for v11...
--
Michael

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#9)
Re: Improve behavior of concurrent TRUNCATE

On 2018-Aug-10, Michael Paquier wrote:

On Fri, Aug 10, 2018 at 02:03:28PM -0400, Alvaro Herrera wrote:

On 2018-Aug-06, Michael Paquier wrote:

Like cbe24a6, perhaps we would not want to back-patch it? Based on the
past history (and the consensus being reached for the REINDEX case would
be to patch only HEAD), I would be actually incline to not back-patch
this stuff and qualify that as an improvement. That's also less work
for me at commit :)

I'm not sure I understand your arguments for not back-patching this.

Mainly consistency. Looking at the git history for such cases we have
not really bothered back-patching fixes and those have been qualified as
improvements. If we were to close all the holes mentioned in the
original DOS thread a back-patch to v11 could be thought as acceptable?
That's where the REINDEX fix has found its way after all, but that was
way less code churn, and we are post beta 3 for v11...

I was actually thinking in applying to all back-branches, not just pg11,
considering it a fix for a pretty serious bug. But checking the
history, it seems that Robert patched this is 9.2 as new development
(2ad36c4e4, 1489e2f26, cbe24a6dd, 1da5c1195, 74a1d4fe7); holes remained,
but none was patched until 94da2a6a in pg10 -- took some time! And then
nobody cared about the ones you're patching now.

So I withdraw my argumentation, mostly because there's clearly not as
much interest in seeing this fixed as all that.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#10)
Re: Improve behavior of concurrent TRUNCATE

On Fri, Aug 10, 2018 at 5:05 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I was actually thinking in applying to all back-branches, not just pg11,
considering it a fix for a pretty serious bug. But checking the
history, it seems that Robert patched this is 9.2 as new development
(2ad36c4e4, 1489e2f26, cbe24a6dd, 1da5c1195, 74a1d4fe7); holes remained,
but none was patched until 94da2a6a in pg10 -- took some time! And then
nobody cared about the ones you're patching now.

So I withdraw my argumentation, mostly because there's clearly not as
much interest in seeing this fixed as all that.

The original patches would, I think, have been pretty scary to
back-patch, since the infrastructure didn't exist in older branches
and we were churning a fairly large amount of code. Now that most
places are fixed and things have had five years to bake, we could
conceivably back-patch the remaining fixes. However, I wonder if
we've really looked into how many instances of this problem remain.
If there's still ten more that we haven't bothered to fix,
back-patching one or two that we've gotten around to doing something
about doesn't make a lot of sense to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#11)
Re: Improve behavior of concurrent TRUNCATE

On Mon, Aug 13, 2018 at 01:39:06PM -0400, Robert Haas wrote:

The original patches would, I think, have been pretty scary to
back-patch, since the infrastructure didn't exist in older branches
and we were churning a fairly large amount of code. Now that most
places are fixed and things have had five years to bake, we could
conceivably back-patch the remaining fixes. However, I wonder if
we've really looked into how many instances of this problem remain.
If there's still ten more that we haven't bothered to fix,
back-patching one or two that we've gotten around to doing something
about doesn't make a lot of sense to me.

If we are confident enough to say that all the holes have been patched,
then we could only back-patch down to v11 in my opinion as REINDEX
needed a change of behavior for the handling of shared catalog. FWIW, I
have spent some time fixing all the issues reported on the original
thread, but I did not double-check all commands using an exclusive
lock, hence all the issues I have known of are:
- REINDEX with shared catalogs, fixed by 661dd23.
- TRUNCATE, with something commit only on HEAD with f841ceb2.
- VACUUM FULL, for which I have submitted a patch proposal:
/messages/by-id/20180812222142.GA6097@paquier.xyz
--
Michael

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#12)
Re: Improve behavior of concurrent TRUNCATE

On 2018-Aug-16, Michael Paquier wrote:

On Mon, Aug 13, 2018 at 01:39:06PM -0400, Robert Haas wrote:

The original patches would, I think, have been pretty scary to
back-patch, since the infrastructure didn't exist in older branches
and we were churning a fairly large amount of code. Now that most
places are fixed and things have had five years to bake, we could
conceivably back-patch the remaining fixes. However, I wonder if
we've really looked into how many instances of this problem remain.
If there's still ten more that we haven't bothered to fix,
back-patching one or two that we've gotten around to doing something
about doesn't make a lot of sense to me.

If we are confident enough to say that all the holes have been patched,
then we could only back-patch down to v11 in my opinion as REINDEX
needed a change of behavior for the handling of shared catalog.

I searched for uses of RangeVarGetRelid, as well as heap_openrv and
relation_openrv, and there are a couple that looks very suspicious. I
don't think we can claim yet that all holes are fixed.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services