[PATCH] Fix alter subscription concurrency errors

Started by Jelte Fennemaover 3 years ago10 messages
#1Jelte Fennema
Jelte.Fennema@microsoft.com
1 attachment(s)

Without this patch concurrent ALTER/DROP SUBSCRIPTION statements for
the same subscription could result in one of these statements returning the
following error:

ERROR: XX000: tuple concurrently updated

This patch fixes that by re-fetching the tuple after acquiring the lock on the
subscription. The included isolation test fails most of its permutations
without this patch, with the error shown above.

The loop to re-fetch the tuple is heavily based on the code from dbcommands.c

Attachments:

0001-Fix-errors-when-concurrently-altering-subscriptions.patchapplication/octet-stream; name=0001-Fix-errors-when-concurrently-altering-subscriptions.patchDownload
From 2d1448de67b9dbec7b033bc276111a1d9cd4ed89 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <github-tech@jeltef.nl>
Date: Thu, 25 Aug 2022 15:02:58 +0200
Subject: [PATCH] Fix errors when concurrently altering subscriptions

Without this patch concurrent ALTER SUBSCRIPTION statements for the same
subscription could result in one of these statements returning the
following error:

ERROR:  XX000: tuple concurrently updated

This fixes that by re-fetching the tuple after acquiring the lock on the
subscription.
---
 src/backend/commands/subscriptioncmds.c       | 141 ++++++++++++------
 .../isolation/expected/subscription-ddl.out   |  41 +++++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/subscription-ddl.spec     |  32 ++++
 4 files changed, 173 insertions(+), 42 deletions(-)
 create mode 100644 src/test/isolation/expected/subscription-ddl.out
 create mode 100644 src/test/isolation/specs/subscription-ddl.spec

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 670b219c8d..6fa17f93a2 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -95,6 +95,7 @@ static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
 static void check_duplicates_in_publist(List *publist, Datum *datums);
 static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
 static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
+static Oid	LockSubscription(Relation rel, const char *subname, bool missing_ok);
 
 
 /*
@@ -987,6 +988,86 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 		table_close(rel, NoLock);
 }
 
+static Oid
+LockSubscription(Relation rel, const char *subname, bool missing_ok)
+{
+	HeapTuple	tup;
+	Form_pg_subscription form;
+	Oid			subid;
+
+	/*
+	 * Loop covers the rare case where the subscription is renamed before we
+	 * can lock it. We try again just in case we can find a new one of the
+	 * same name.
+	 */
+	while (true)
+	{
+		tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+
+		if (!HeapTupleIsValid(tup))
+		{
+			if (!missing_ok)
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("subscription \"%s\" does not exist",
+								subname)));
+			else
+				ereport(NOTICE,
+						(errmsg("subscription \"%s\" does not exist, skipping",
+								subname)));
+
+			return InvalidOid;
+		}
+
+		form = (Form_pg_subscription) GETSTRUCT(tup);
+		subid = form->oid;
+
+		/* must be owner */
+		if (!pg_subscription_ownercheck(subid, GetUserId()))
+			aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_SUBSCRIPTION,
+						   subname);
+
+		/*
+		 * Lock the subscription so nobody else can do anything with it
+		 * (including the replication workers).
+		 */
+		LockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);
+
+		/*
+		 * And now, re-fetch the tuple by OID.  If it's still there and still
+		 * the same name, we win; else, drop the lock and loop back to try
+		 * again.
+		 */
+		ReleaseSysCache(tup);
+		tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
+		if (HeapTupleIsValid(tup))
+		{
+
+			form = (Form_pg_subscription) GETSTRUCT(tup);
+
+			/*
+			 * if the name changes we try te fetch a subscription with the
+			 * same name again, in case a new one was created.
+			 */
+			if (strcmp(subname, NameStr(form->subname)) == 0)
+			{
+				/* must still be owner */
+				if (!pg_subscription_ownercheck(subid, GetUserId()))
+					aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_SUBSCRIPTION,
+								   subname);
+				break;
+			}
+			/* can only get here if subscription was just renamed */
+			ReleaseSysCache(tup);
+		}
+		UnlockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);
+	}
+	ReleaseSysCache(tup);
+	return subid;
+}
+
+
 /*
  * Alter the existing subscription.
  */
@@ -1003,35 +1084,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 	Oid			subid;
 	bool		update_tuple = false;
 	Subscription *sub;
-	Form_pg_subscription form;
 	bits32		supported_opts;
 	SubOpts		opts = {0};
 
 	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+	subid = LockSubscription(rel, stmt->subname, false);
 
 	/* Fetch the existing tuple. */
-	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
-							  CStringGetDatum(stmt->subname));
+	tup = SearchSysCacheCopy1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
 
+	/*
+	 * This should never happen since we already locked the subscription, so
+	 * we know it exists.
+	 */
 	if (!HeapTupleIsValid(tup))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
 				 errmsg("subscription \"%s\" does not exist",
 						stmt->subname)));
 
-	form = (Form_pg_subscription) GETSTRUCT(tup);
-	subid = form->oid;
-
-	/* must be owner */
-	if (!pg_subscription_ownercheck(subid, GetUserId()))
-		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_SUBSCRIPTION,
-					   stmt->subname);
-
 	sub = GetSubscription(subid, false);
 
-	/* Lock the subscription so nobody else can do anything with it. */
-	LockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);
-
 	/* Form a new tuple. */
 	memset(values, 0, sizeof(values));
 	memset(nulls, false, sizeof(nulls));
@@ -1370,7 +1443,6 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	char		originname[NAMEDATALEN];
 	char	   *err = NULL;
 	WalReceiverConn *wrconn;
-	Form_pg_subscription form;
 	List	   *rstates;
 
 	/*
@@ -1379,43 +1451,28 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	 */
 	rel = table_open(SubscriptionRelationId, AccessExclusiveLock);
 
-	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
-						  CStringGetDatum(stmt->subname));
-
-	if (!HeapTupleIsValid(tup))
+	subid = LockSubscription(rel, stmt->subname, stmt->missing_ok);
+	if (subid == InvalidOid)
 	{
 		table_close(rel, NoLock);
-
-		if (!stmt->missing_ok)
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("subscription \"%s\" does not exist",
-							stmt->subname)));
-		else
-			ereport(NOTICE,
-					(errmsg("subscription \"%s\" does not exist, skipping",
-							stmt->subname)));
-
 		return;
 	}
 
-	form = (Form_pg_subscription) GETSTRUCT(tup);
-	subid = form->oid;
+	tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
 
-	/* must be owner */
-	if (!pg_subscription_ownercheck(subid, GetUserId()))
-		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_SUBSCRIPTION,
-					   stmt->subname);
+	/*
+	 * This should never happen since we already locked the subscription, so
+	 * we know it exists.
+	 */
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("subscription \"%s\" does not exist",
+						stmt->subname)));
 
 	/* DROP hook for the subscription being removed */
 	InvokeObjectDropHook(SubscriptionRelationId, subid, 0);
 
-	/*
-	 * Lock the subscription so nobody else can do anything with it (including
-	 * the replication workers).
-	 */
-	LockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);
-
 	/* Get subname */
 	datum = SysCacheGetAttr(SUBSCRIPTIONOID, tup,
 							Anum_pg_subscription_subname, &isnull);
diff --git a/src/test/isolation/expected/subscription-ddl.out b/src/test/isolation/expected/subscription-ddl.out
new file mode 100644
index 0000000000..e832856be2
--- /dev/null
+++ b/src/test/isolation/expected/subscription-ddl.out
@@ -0,0 +1,41 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1alter s2alter s1commit
+step s1alter: ALTER SUBSCRIPTION sub1 DISABLE;
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+
+starting permutation: s1drop s2alter s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1rename s2alter s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1alter s2drop s1commit
+step s1alter: ALTER SUBSCRIPTION sub1 DISABLE;
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+
+starting permutation: s1drop s2drop s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1rename s2drop s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+ERROR:  subscription "sub1" does not exist
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 529a4cbd4d..6e9144319a 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -84,6 +84,7 @@ test: alter-table-3
 test: alter-table-4
 test: create-trigger
 test: sequence-ddl
+test: subscription-ddl
 test: async-notify
 test: vacuum-no-cleanup-lock
 test: timeouts
diff --git a/src/test/isolation/specs/subscription-ddl.spec b/src/test/isolation/specs/subscription-ddl.spec
new file mode 100644
index 0000000000..24f4d2cff8
--- /dev/null
+++ b/src/test/isolation/specs/subscription-ddl.spec
@@ -0,0 +1,32 @@
+# Test sequence usage and concurrent sequence DDL
+
+setup
+{
+    CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+}
+
+teardown
+{
+    DROP SUBSCRIPTION IF EXISTS sub1;
+    DROP SUBSCRIPTION IF EXISTS sub2;
+}
+
+session s1
+setup             { BEGIN; }
+step s1alter      { ALTER SUBSCRIPTION sub1 DISABLE; }
+step s1drop       { DROP SUBSCRIPTION sub1; }
+step s1rename     { ALTER SUBSCRIPTION sub1 RENAME TO sub2; }
+step s1commit     { COMMIT; }
+
+session s2
+step s2alter    { ALTER SUBSCRIPTION sub1 DISABLE; }
+step s2drop    { DROP SUBSCRIPTION sub1; }
+
+
+permutation s1alter s2alter s1commit
+permutation s1drop s2alter s1commit
+permutation s1rename s2alter s1commit
+
+permutation s1alter s2drop s1commit
+permutation s1drop s2drop s1commit
+permutation s1rename s2drop s1commit
-- 
2.34.1

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Jelte Fennema (#1)
Re: [PATCH] Fix alter subscription concurrency errors

On Thu, Aug 25, 2022 at 8:17 PM Jelte Fennema
<Jelte.Fennema@microsoft.com> wrote:

Without this patch concurrent ALTER/DROP SUBSCRIPTION statements for
the same subscription could result in one of these statements returning the
following error:

ERROR: XX000: tuple concurrently updated

This patch fixes that by re-fetching the tuple after acquiring the lock on the
subscription. The included isolation test fails most of its permutations
without this patch, with the error shown above.

Won't the same thing can happen for similar publication commands? Why
is this unique to the subscription and not other Alter/Drop commands?

--
With Regards,
Amit Kapila.

#3Jelte Fennema
Jelte.Fennema@microsoft.com
In reply to: Amit Kapila (#2)
1 attachment(s)
Re: [PATCH] Fix alter subscription concurrency errors

Won't the same thing can happen for similar publication commands? Why
is this unique to the subscription and not other Alter/Drop commands?

I indeed don't think this problem is unique to subscriptions, but it seems
better to at least have this problem in a few places less (not making perfect
the enemy of good).

If someone has a more generic way of solving this for other commands too,
then that sounds great, but if not then slowly chipping away at these cases
seems better than keeping the status quo.

Attached is a new patch where ALTER SUBSCRIPTION ... OWNER TO ... can
now also be executed concurrently with the other subscription commands.

Attachments:

0001-Fix-errors-when-concurrently-altering-subscriptions.patchapplication/octet-stream; name=0001-Fix-errors-when-concurrently-altering-subscriptions.patchDownload
From f20b23d4e19769d385d7b45d7dd9ed994e83a3f7 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <github-tech@jeltef.nl>
Date: Thu, 25 Aug 2022 15:02:58 +0200
Subject: [PATCH] Fix errors when concurrently altering subscriptions

Without this patch concurrent ALTER SUBSCRIPTION statements for the same
subscription could result in one of these statements returning the
following error:

ERROR:  XX000: tuple concurrently updated

This fixes that by re-fetching the tuple after acquiring the lock on the
subscription.
---
 src/backend/commands/subscriptioncmds.c       | 150 +++++++++++-----
 .../isolation/expected/subscription-ddl.out   | 169 ++++++++++++++++++
 src/test/isolation/isolation_schedule         |   1 +
 .../isolation/specs/subscription-ddl.spec     |  59 ++++++
 4 files changed, 335 insertions(+), 44 deletions(-)
 create mode 100644 src/test/isolation/expected/subscription-ddl.out
 create mode 100644 src/test/isolation/specs/subscription-ddl.spec

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 670b219c8d..9a1eaeb200 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -95,6 +95,7 @@ static List *fetch_table_list(WalReceiverConn *wrconn, List *publications);
 static void check_duplicates_in_publist(List *publist, Datum *datums);
 static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
 static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
+static Oid	LockSubscription(Relation rel, const char *subname, bool missing_ok);
 
 
 /*
@@ -987,6 +988,86 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 		table_close(rel, NoLock);
 }
 
+static Oid
+LockSubscription(Relation rel, const char *subname, bool missing_ok)
+{
+	HeapTuple	tup;
+	Form_pg_subscription form;
+	Oid			subid;
+
+	/*
+	 * Loop covers the rare case where the subscription is renamed before we
+	 * can lock it. We try again just in case we can find a new one of the
+	 * same name.
+	 */
+	while (true)
+	{
+		tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+							  CStringGetDatum(subname));
+
+		if (!HeapTupleIsValid(tup))
+		{
+			if (!missing_ok)
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("subscription \"%s\" does not exist",
+								subname)));
+			else
+				ereport(NOTICE,
+						(errmsg("subscription \"%s\" does not exist, skipping",
+								subname)));
+
+			return InvalidOid;
+		}
+
+		form = (Form_pg_subscription) GETSTRUCT(tup);
+		subid = form->oid;
+
+		/* must be owner */
+		if (!pg_subscription_ownercheck(subid, GetUserId()))
+			aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_SUBSCRIPTION,
+						   subname);
+
+		/*
+		 * Lock the subscription so nobody else can do anything with it
+		 * (including the replication workers).
+		 */
+		LockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);
+
+		/*
+		 * And now, re-fetch the tuple by OID.  If it's still there and still
+		 * the same name, we win; else, drop the lock and loop back to try
+		 * again.
+		 */
+		ReleaseSysCache(tup);
+		tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
+		if (HeapTupleIsValid(tup))
+		{
+
+			form = (Form_pg_subscription) GETSTRUCT(tup);
+
+			/*
+			 * if the name changes we try te fetch a subscription with the
+			 * same name again, in case a new one was created.
+			 */
+			if (strcmp(subname, NameStr(form->subname)) == 0)
+			{
+				/* must still be owner */
+				if (!pg_subscription_ownercheck(subid, GetUserId()))
+					aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_SUBSCRIPTION,
+								   subname);
+				break;
+			}
+			/* can only get here if subscription was just renamed */
+			ReleaseSysCache(tup);
+		}
+		UnlockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);
+	}
+	ReleaseSysCache(tup);
+	return subid;
+}
+
+
 /*
  * Alter the existing subscription.
  */
@@ -1003,35 +1084,27 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 	Oid			subid;
 	bool		update_tuple = false;
 	Subscription *sub;
-	Form_pg_subscription form;
 	bits32		supported_opts;
 	SubOpts		opts = {0};
 
 	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+	subid = LockSubscription(rel, stmt->subname, false);
 
 	/* Fetch the existing tuple. */
-	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
-							  CStringGetDatum(stmt->subname));
+	tup = SearchSysCacheCopy1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
 
+	/*
+	 * This should never happen since we already locked the subscription, so
+	 * we know it exists.
+	 */
 	if (!HeapTupleIsValid(tup))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
 				 errmsg("subscription \"%s\" does not exist",
 						stmt->subname)));
 
-	form = (Form_pg_subscription) GETSTRUCT(tup);
-	subid = form->oid;
-
-	/* must be owner */
-	if (!pg_subscription_ownercheck(subid, GetUserId()))
-		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_SUBSCRIPTION,
-					   stmt->subname);
-
 	sub = GetSubscription(subid, false);
 
-	/* Lock the subscription so nobody else can do anything with it. */
-	LockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);
-
 	/* Form a new tuple. */
 	memset(values, 0, sizeof(values));
 	memset(nulls, false, sizeof(nulls));
@@ -1370,7 +1443,6 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	char		originname[NAMEDATALEN];
 	char	   *err = NULL;
 	WalReceiverConn *wrconn;
-	Form_pg_subscription form;
 	List	   *rstates;
 
 	/*
@@ -1379,43 +1451,28 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
 	 */
 	rel = table_open(SubscriptionRelationId, AccessExclusiveLock);
 
-	tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
-						  CStringGetDatum(stmt->subname));
-
-	if (!HeapTupleIsValid(tup))
+	subid = LockSubscription(rel, stmt->subname, stmt->missing_ok);
+	if (subid == InvalidOid)
 	{
 		table_close(rel, NoLock);
-
-		if (!stmt->missing_ok)
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("subscription \"%s\" does not exist",
-							stmt->subname)));
-		else
-			ereport(NOTICE,
-					(errmsg("subscription \"%s\" does not exist, skipping",
-							stmt->subname)));
-
 		return;
 	}
 
-	form = (Form_pg_subscription) GETSTRUCT(tup);
-	subid = form->oid;
+	tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
 
-	/* must be owner */
-	if (!pg_subscription_ownercheck(subid, GetUserId()))
-		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_SUBSCRIPTION,
-					   stmt->subname);
+	/*
+	 * This should never happen since we already locked the subscription, so
+	 * we know it exists.
+	 */
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("subscription \"%s\" does not exist",
+						stmt->subname)));
 
 	/* DROP hook for the subscription being removed */
 	InvokeObjectDropHook(SubscriptionRelationId, subid, 0);
 
-	/*
-	 * Lock the subscription so nobody else can do anything with it (including
-	 * the replication workers).
-	 */
-	LockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);
-
 	/* Get subname */
 	datum = SysCacheGetAttr(SUBSCRIPTIONOID, tup,
 							Anum_pg_subscription_subname, &isnull);
@@ -1734,9 +1791,14 @@ AlterSubscriptionOwner(const char *name, Oid newOwnerId)
 
 	rel = table_open(SubscriptionRelationId, RowExclusiveLock);
 
-	tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
-							  CStringGetDatum(name));
+	subid = LockSubscription(rel, name, false);
 
+	tup = SearchSysCacheCopy1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
+
+	/*
+	 * This should never happen since we already locked the subscription, so
+	 * we know it exists.
+	 */
 	if (!HeapTupleIsValid(tup))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
diff --git a/src/test/isolation/expected/subscription-ddl.out b/src/test/isolation/expected/subscription-ddl.out
new file mode 100644
index 0000000000..0f280cbad9
--- /dev/null
+++ b/src/test/isolation/expected/subscription-ddl.out
@@ -0,0 +1,169 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1alter s2alter s1commit
+step s1alter: ALTER SUBSCRIPTION sub1 DISABLE;
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+
+starting permutation: s1owner s2alter s1commit
+step s1owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser;
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+
+starting permutation: s1drop s2alter s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1rename s2alter s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1drop s1recreate s2alter s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+s1: WARNING:  tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+step s1recreate: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+
+starting permutation: s1rename s1recreate s2alter s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+s1: WARNING:  tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+step s1recreate: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+step s2alter: ALTER SUBSCRIPTION sub1 DISABLE; <waiting ...>
+step s1commit: COMMIT;
+step s2alter: <... completed>
+
+starting permutation: s1alter s2drop s1commit
+step s1alter: ALTER SUBSCRIPTION sub1 DISABLE;
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+
+starting permutation: s1owner s2drop s1commit
+step s1owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser;
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+
+starting permutation: s1drop s2drop s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1rename s2drop s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1drop s1recreate s2drop s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+s1: WARNING:  tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+step s1recreate: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+
+starting permutation: s1rename s1recreate s2drop s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+s1: WARNING:  tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+step s1recreate: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+step s2drop: DROP SUBSCRIPTION sub1; <waiting ...>
+step s1commit: COMMIT;
+step s2drop: <... completed>
+
+starting permutation: s1alter s2rename s1commit
+step s1alter: ALTER SUBSCRIPTION sub1 DISABLE;
+step s2rename: ALTER SUBSCRIPTION sub1 RENAME TO sub3; <waiting ...>
+step s1commit: COMMIT;
+step s2rename: <... completed>
+
+starting permutation: s1owner s2rename s1commit
+step s1owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser;
+step s2rename: ALTER SUBSCRIPTION sub1 RENAME TO sub3; <waiting ...>
+step s1commit: COMMIT;
+step s2rename: <... completed>
+
+starting permutation: s1drop s2rename s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+step s2rename: ALTER SUBSCRIPTION sub1 RENAME TO sub3; <waiting ...>
+step s1commit: COMMIT;
+step s2rename: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1rename s2rename s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+step s2rename: ALTER SUBSCRIPTION sub1 RENAME TO sub3; <waiting ...>
+step s1commit: COMMIT;
+step s2rename: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1drop s1recreate s2rename s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+s1: WARNING:  tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+step s1recreate: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+step s2rename: ALTER SUBSCRIPTION sub1 RENAME TO sub3; <waiting ...>
+step s1commit: COMMIT;
+step s2rename: <... completed>
+
+starting permutation: s1rename s1recreate s2rename s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+s1: WARNING:  tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+step s1recreate: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+step s2rename: ALTER SUBSCRIPTION sub1 RENAME TO sub3; <waiting ...>
+step s1commit: COMMIT;
+step s2rename: <... completed>
+
+starting permutation: s1alter s2owner s1commit
+step s1alter: ALTER SUBSCRIPTION sub1 DISABLE;
+step s2owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser; <waiting ...>
+step s1commit: COMMIT;
+step s2owner: <... completed>
+
+starting permutation: s1owner s2owner s1commit
+step s1owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser;
+step s2owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser; <waiting ...>
+step s1commit: COMMIT;
+step s2owner: <... completed>
+
+starting permutation: s1drop s2owner s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+step s2owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser; <waiting ...>
+step s1commit: COMMIT;
+step s2owner: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1rename s2owner s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+step s2owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser; <waiting ...>
+step s1commit: COMMIT;
+step s2owner: <... completed>
+ERROR:  subscription "sub1" does not exist
+
+starting permutation: s1drop s1recreate s2owner s1commit
+step s1drop: DROP SUBSCRIPTION sub1;
+s1: WARNING:  tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+step s1recreate: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+step s2owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser; <waiting ...>
+step s1commit: COMMIT;
+step s2owner: <... completed>
+
+starting permutation: s1rename s1recreate s2owner s1commit
+step s1rename: ALTER SUBSCRIPTION sub1 RENAME TO sub2;
+s1: WARNING:  tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe the tables
+step s1recreate: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+step s2owner: ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser; <waiting ...>
+step s1commit: COMMIT;
+step s2owner: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 529a4cbd4d..6e9144319a 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -84,6 +84,7 @@ test: alter-table-3
 test: alter-table-4
 test: create-trigger
 test: sequence-ddl
+test: subscription-ddl
 test: async-notify
 test: vacuum-no-cleanup-lock
 test: timeouts
diff --git a/src/test/isolation/specs/subscription-ddl.spec b/src/test/isolation/specs/subscription-ddl.spec
new file mode 100644
index 0000000000..b55c8f90ca
--- /dev/null
+++ b/src/test/isolation/specs/subscription-ddl.spec
@@ -0,0 +1,59 @@
+# Test sequence usage and concurrent sequence DDL
+
+setup
+{
+    CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE);
+    CREATE ROLE testsuperuser WITH SUPERUSER;
+}
+
+teardown
+{
+    DROP SUBSCRIPTION IF EXISTS sub1;
+    DROP SUBSCRIPTION IF EXISTS sub2;
+    DROP SUBSCRIPTION IF EXISTS sub3;
+    DROP ROLE testsuperuser;
+}
+
+session s1
+setup             { BEGIN; }
+step s1alter      { ALTER SUBSCRIPTION sub1 DISABLE; }
+step s1drop       { DROP SUBSCRIPTION sub1; }
+step s1rename     { ALTER SUBSCRIPTION sub1 RENAME TO sub2; }
+step s1owner      { ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser; }
+step s1recreate   { CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres' PUBLICATION pub1 WITH(connect=false, create_slot=false, slot_name = NONE); }
+step s1commit     { COMMIT; }
+
+session s2
+step s2alter    { ALTER SUBSCRIPTION sub1 DISABLE; }
+step s2drop     { DROP SUBSCRIPTION sub1; }
+step s2rename   { ALTER SUBSCRIPTION sub1 RENAME TO sub3; }
+step s2owner    { ALTER SUBSCRIPTION sub1 OWNER TO testsuperuser; }
+
+
+permutation s1alter s2alter s1commit
+permutation s1owner s2alter s1commit
+permutation s1drop s2alter s1commit
+permutation s1rename s2alter s1commit
+permutation s1drop s1recreate s2alter s1commit
+permutation s1rename s1recreate s2alter s1commit
+
+permutation s1alter s2drop s1commit
+permutation s1owner s2drop s1commit
+permutation s1drop s2drop s1commit
+permutation s1rename s2drop s1commit
+permutation s1drop s1recreate s2drop s1commit
+permutation s1rename s1recreate s2drop s1commit
+
+permutation s1alter s2rename s1commit
+permutation s1owner s2rename s1commit
+permutation s1drop s2rename s1commit
+permutation s1rename s2rename s1commit
+permutation s1drop s1recreate s2rename s1commit
+permutation s1rename s1recreate s2rename s1commit
+
+permutation s1alter s2owner s1commit
+permutation s1owner s2owner s1commit
+permutation s1drop s2owner s1commit
+permutation s1rename s2owner s1commit
+permutation s1drop s1recreate s2owner s1commit
+permutation s1rename s1recreate s2owner s1commit
-- 
2.34.1

#4Asif Rehman
asifr.rehman@gmail.com
In reply to: Jelte Fennema (#3)
Re: [PATCH] Fix alter subscription concurrency errors

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

The patch applies with few "Hunk succeeded, offset -3 lines" warnings. Tested against master '7d5852ca'.

+               if (!HeapTupleIsValid(tup))
+               {
+                       if (!missing_ok)
+                               ereport(ERROR,
+                                               (errcode(ERRCODE_UNDEFINED_OBJECT),
+                                                errmsg("subscription \"%s\" does not exist",
+                                                               subname)));
+                       else
+                               ereport(NOTICE,
+                                               (errmsg("subscription \"%s\" does not exist, skipping",
+                                                               subname)));
+
+                       return InvalidOid;
+               }
+

I think 'tup' should be released before returning, or break out of loop instead to release it.

The new status of this patch is: Waiting on Author

#5Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Jelte Fennema (#3)
Re: [PATCH] Fix alter subscription concurrency errors

On 2022-Aug-26, Jelte Fennema wrote:

I indeed don't think this problem is unique to subscriptions, but it seems
better to at least have this problem in a few places less (not making perfect
the enemy of good).

If someone has a more generic way of solving this for other commands too,
then that sounds great, but if not then slowly chipping away at these cases
seems better than keeping the status quo.

Attached is a new patch where ALTER SUBSCRIPTION ... OWNER TO ... can
now also be executed concurrently with the other subscription commands.

Would it work to use get_object_address() instead? That would save
having to write a lookup-and-lock function with a retry loop for each
object type.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/

#6Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#5)
Re: [PATCH] Fix alter subscription concurrency errors

On Fri, Sep 09, 2022 at 06:37:07PM +0200, Alvaro Herrera wrote:

Would it work to use get_object_address() instead? That would save
having to write a lookup-and-lock function with a retry loop for each
object type.

Jeite, this thread is waiting for your input. This is a bug fix, so I
have moved this patch to the next CF for now to keep track of it.
--
Michael

#7vignesh C
vignesh21@gmail.com
In reply to: Michael Paquier (#6)
Re: [PATCH] Fix alter subscription concurrency errors

On Wed, 12 Oct 2022 at 10:48, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Sep 09, 2022 at 06:37:07PM +0200, Alvaro Herrera wrote:

Would it work to use get_object_address() instead? That would save
having to write a lookup-and-lock function with a retry loop for each
object type.

Jeite, this thread is waiting for your input. This is a bug fix, so I
have moved this patch to the next CF for now to keep track of it.

Jeite, please post an updated version with the fixes. As CommitFest
2023-01 is currently underway, this would be an excellent time to
update the patch.

Regards,
Vignesh

#8Gregory Stark (as CFM)
stark.cfm@gmail.com
In reply to: vignesh C (#7)
Re: [PATCH] Fix alter subscription concurrency errors

On Mon, 16 Jan 2023 at 09:47, vignesh C <vignesh21@gmail.com> wrote:

Jeite, please post an updated version with the fixes. As CommitFest
2023-01 is currently underway, this would be an excellent time to
update the patch.

Hm. This patch is still waiting on updates. But it's a bug fix and it
would be good to get this in. Is someone else interested in finishing
this if Jeite isn't available?

--
Gregory Stark
As Commitfest Manager

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gregory Stark (as CFM) (#8)
Re: [PATCH] Fix alter subscription concurrency errors

"Gregory Stark (as CFM)" <stark.cfm@gmail.com> writes:

Hm. This patch is still waiting on updates. But it's a bug fix and it
would be good to get this in. Is someone else interested in finishing
this if Jeite isn't available?

I think the patch as-submitted is pretty uninteresting, mainly because the
design of adding bespoke lock code for subscription objects doesn't scale.
I'm not excited about doing this just for subscriptions without at least a
clear plan for working on other object types.

Alvaro suggested switching it to use get_object_address() instead, which'd
be better; but before going in that direction we might have to do more
work on get_object_address's error reporting (note the para in its header
comments saying it's pretty weak on that).

regards, tom lane

#10Jelte Fennema
postgres@jeltef.nl
In reply to: Tom Lane (#9)
Re: [PATCH] Fix alter subscription concurrency errors

"Gregory Stark (as CFM)" <stark.cfm@gmail.com> writes:

Hm. This patch is still waiting on updates. But it's a bug fix and it
would be good to get this in. Is someone else interested in finishing
this if Jeite isn't available?

I think the patch as-submitted is pretty uninteresting, mainly because the
design of adding bespoke lock code for subscription objects doesn't scale.
I'm not excited about doing this just for subscriptions without at least a
clear plan for working on other object types.

Alvaro suggested switching it to use get_object_address() instead, which'd
be better; but before going in that direction we might have to do more
work on get_object_address's error reporting (note the para in its header
comments saying it's pretty weak on that).

Sorry for not responding earlier in this thread. I'll be honest in
saying this was a small annoyance to me, so I ignored theresonses more
than I should have. It caused some test flakiness in the Citus test
suite, and it seemed that fixing the underlying issue in Postgres was
most appropriate. I addressed this in Citus its test suite by
disabling the relevant test (which was an edge case test anyway). So
my immidiate problem was fixed, and I stopped caring about this patch
very much. Definitely not enough to address this for all other DDLs
with the same issue.

All in all I'm having a hard time feeling motivated to work on a patch
that I don't care much about. Especially since I have two other
patches open for a few commit fests that I actually care about, but
those patches have received (imho) very little input. Which makes it
hard to justify to myself to spend time on this patch, given the
knowledge that if I would spend time on it, it might take away the
precious reviewer time from the patches I do care about.