Re: Assertion failure in SnapBuildInitialSnapshot()

Started by Pradeep Kumar, 3 months ago, 1 message

#1 Pradeep Kumar <spradeepkumar29@gmail.com>
1 attachment(s)

Hi All,
In this thread
</messages/by-id/CAA4eK1L8wYcyTPxNzPGkhuO52WBGoOZbT0A73Le=ZUWYAYmdfw@mail.gmail.com>
it was proposed that fix_concurrent_slot_xmin_update.patch would solve this
assert failure. After applying that patch, executing
pg_sync_replication_slots() (which calls SyncReplicationSlots() →
synchronize_slots() → synchronize_one_slot() →
ReplicationSlotsComputeRequiredXmin(true)) can still hit an assertion
failure in ReplicationSlotsComputeRequiredXmin(), because
ReplicationSlotControlLock is not held in that code path. By default
sync_replication_slots is off, so the background slot-sync worker is not
spawned; invoking the UDF directly exercises the path without the lock. I
have a small patch that acquires ReplicationSlotControlLock in the manual
sync path, which stops the assert.

Call Stack:

TRAP: failed Assert("!already_locked ||
(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_EXCLUSIVE) &&
LWLockHeldByMeInMode(ProcArrayLock, LW_EXCLUSIVE))"), File: "slot.c",
Line: 1061, PID: 67056

0   postgres   0x000000010104aad4   ExceptionalCondition + 216
1   postgres   0x0000000100d8718c   ReplicationSlotsComputeRequiredXmin + 180
2   postgres   0x0000000100d6fba8   synchronize_one_slot + 1488
3   postgres   0x0000000100d6e8cc   synchronize_slots + 1480
4   postgres   0x0000000100d6efe4   SyncReplicationSlots + 164
5   postgres   0x0000000100d8da84   pg_sync_replication_slots + 476
6   postgres   0x0000000100b34c58   ExecInterpExpr + 2388
7   postgres   0x0000000100b33ee8   ExecInterpExprStillValid + 76
8   postgres   0x00000001008acd5c   ExecEvalExprSwitchContext + 64
9   postgres   0x0000000100b54d48   ExecProject + 76
10  postgres   0x0000000100b925d4   ExecResult + 312
11  postgres   0x0000000100b5083c   ExecProcNodeFirst + 92
12  postgres   0x0000000100b48b88   ExecProcNode + 60
13  postgres   0x0000000100b44410   ExecutePlan + 184
14  postgres   0x0000000100b442dc   standard_ExecutorRun + 644
15  postgres   0x0000000100b44048   ExecutorRun + 104
16  postgres   0x0000000100e3053c   PortalRunSelect + 308
17  postgres   0x0000000100e2ff40   PortalRun + 736
18  postgres   0x0000000100e2b21c   exec_simple_query + 1368
19  postgres   0x0000000100e2a42c   PostgresMain + 2508
20  postgres   0x0000000100e22ce4   BackendInitialize + 0
21  postgres   0x0000000100d1fd4c   postmaster_child_launch + 304
22  postgres   0x0000000100d26d9c   BackendStartup + 448
23  postgres   0x0000000100d23f18   ServerLoop + 372
24  postgres   0x0000000100d22f18   PostmasterMain + 6396
25  postgres   0x0000000100bcffd4   init_locale + 0
26  dyld       0x0000000186d82b98   start + 6076

The assert is raised inside ReplicationSlotsComputeRequiredXmin() because
that function expects either that already_locked is false (in which case it
acquires what it needs itself), or that the caller already holds both
ReplicationSlotControlLock (exclusive) and ProcArrayLock (exclusive). In
the manual-sync path called by the UDF, ProcArrayLock is held but
ReplicationSlotControlLock is not, so the assertion trips.
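
To make that contract concrete, here is a rough sketch of the two calling
patterns the proposed assertion allows. This is an illustration assembled
from the backtrace and the diff below, not a verbatim excerpt from the
patch:

    /* Pattern 1: caller holds neither lock; with already_locked = false
     * the function acquires what it needs itself. */
    ReplicationSlotsComputeRequiredXmin(false);

    /* Pattern 2: caller already holds both locks exclusively, taking
     * ReplicationSlotControlLock first to preserve the lock order. */
    LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
    /* ... update the slot's xmin fields under the slot's spinlock ... */
    ReplicationSlotsComputeRequiredXmin(true);
    LWLockRelease(ProcArrayLock);
    LWLockRelease(ReplicationSlotControlLock);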

Why this happens:
The background slot-sync worker (spawned when sync_replication_slots = on)
acquires the necessary locks before calling the routines that
update/compute slot xmins, so the worker path is safe. The manual path
through the SQL-callable UDF does not take the same locks before calling
synchronize_slots()/synchronize_one_slot(). As a result, the invariant
assumed by ReplicationSlotsComputeRequiredXmin() can be violated, leading
to the assert.
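
For reference, the sequence reached from the UDF (abridged here from the
unpatched code visible as context lines in the diff below; error handling
and some slot-field updates are elided) takes only ProcArrayLock before
passing already_locked = true:

    /* Abridged, pre-patch synchronize_one_slot(). */
    reserve_wal_for_local_slot(remote_slot->restart_lsn);

    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
    xmin_horizon = GetOldestSafeDecodingTransactionId(true);
    SpinLockAcquire(&slot->mutex);
    /* ... record xmin_horizon in the slot (elided) ... */
    SpinLockRelease(&slot->mutex);
    ReplicationSlotsComputeRequiredXmin(true);  /* ReplicationSlotControlLock
                                                 * is not held here */
    LWLockRelease(ProcArrayLock);

With the proposed assertion in place, this is exactly the state that trips
it, which is why the attached patch acquires ReplicationSlotControlLock
around this block as well.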

Proposed fix:
In the slot-sync path used by
SyncReplicationSlots()/pg_sync_replication_slots(), acquire
ReplicationSlotControlLock (in synchronize_one_slot(), ahead of
ProcArrayLock) before any call that can end up in
ReplicationSlotsComputeRequiredXmin(true). The attached patch applies the
same lock ordering in CreateInitDecodingContext() and adjusts
ReplicationSlotsComputeRequiredXmin() itself accordingly.

Please refer to the attached updated patch.

Attachments:

fix_concurrent_slot_xmin_update_version2.patch (application/octet-stream)
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index f1eb798f3e9..e3a2a1ffdd4 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -405,11 +405,11 @@ CreateInitDecodingContext(const char *plugin,
 	 * without further interlock its return value might immediately be out of
 	 * date.
 	 *
-	 * So we have to acquire the ProcArrayLock to prevent computation of new
-	 * xmin horizons by other backends, get the safe decoding xid, and inform
-	 * the slot machinery about the new limit. Once that's done the
-	 * ProcArrayLock can be released as the slot machinery now is
-	 * protecting against vacuum.
+	 * So we have to acquire both the ReplicationSlotControlLock and the
+	 * ProcArrayLock to prevent concurrent computation and update of new xmin
+	 * horizons by other backends, get the safe decoding xid, and inform the
+	 * slot machinery about the new limit. Once that's done, both locks can
+	 * be released as the slot machinery is now protecting against vacuum.
 	 *
 	 * Note that, temporarily, the data, not just the catalog, xmin has to be
 	 * reserved if a data snapshot is to be exported.  Otherwise the initial
@@ -422,6 +422,7 @@ CreateInitDecodingContext(const char *plugin,
 	 *
 	 * ----
 	 */
+	LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
 	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
 
 	xmin_horizon = GetOldestSafeDecodingTransactionId(!need_full_snapshot);
@@ -436,6 +437,7 @@ CreateInitDecodingContext(const char *plugin,
 	ReplicationSlotsComputeRequiredXmin(true);
 
 	LWLockRelease(ProcArrayLock);
+	LWLockRelease(ReplicationSlotControlLock);
 
 	ReplicationSlotMarkDirty();
 	ReplicationSlotSave();
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 61b2e9396aa..fee5aad7574 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -775,7 +775,8 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 		SpinLockRelease(&slot->mutex);
 
 		reserve_wal_for_local_slot(remote_slot->restart_lsn);
 
+		LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
 		LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
 		xmin_horizon = GetOldestSafeDecodingTransactionId(true);
 		SpinLockAcquire(&slot->mutex);
@@ -784,6 +785,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
 		SpinLockRelease(&slot->mutex);
 		ReplicationSlotsComputeRequiredXmin(true);
 		LWLockRelease(ProcArrayLock);
+		LWLockRelease(ReplicationSlotControlLock);
 
 		update_and_persist_local_synced_slot(remote_slot, remote_dbid);
 
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 101157ed8c9..303a40e61ea 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1129,8 +1129,11 @@ ReplicationSlotPersist(void)
 /*
  * Compute the oldest xmin across all slots and store it in the ProcArray.
  *
- * If already_locked is true, ProcArrayLock has already been acquired
- * exclusively.
+ * If already_locked is true, both the ReplicationSlotControlLock and
+ * the ProcArrayLock have already been acquired exclusively.
+ *
+ * Note that the ReplicationSlotControlLock must be locked first to avoid
+ * deadlocks.
  */
 void
 ReplicationSlotsComputeRequiredXmin(bool already_locked)
@@ -1140,8 +1143,26 @@ ReplicationSlotsComputeRequiredXmin(bool already_locked)
 	TransactionId agg_catalog_xmin = InvalidTransactionId;
 
 	Assert(ReplicationSlotCtl != NULL);
+	Assert(!already_locked ||
+		   (LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_EXCLUSIVE) &&
+			LWLockHeldByMeInMode(ProcArrayLock, LW_EXCLUSIVE)));
 
-	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+	/*
+	 * Hold the ReplicationSlotControlLock exclusive until after updating
+	 * the slot xmin values, so no backend can compute and update the new
+	 * value concurrently.
+	 *
+	 * One might think that we can hold the ProcArrayLock exclusively,
+	 * compute the xmin values while holding the ReplicationSlotControlLock
+	 * in shared mode, and update the slot xmin values, but it could increase
+	 * lock contention on the ProcArrayLock, which is not great since this
+	 * function can be called at non-negligible frequency.
+	 *
+	 * We instead increase lock contention on the ReplicationSlotControlLock,
+	 * which should be less harmful.
+	 */
+	if (!already_locked)
+		LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
 
 	for (i = 0; i < max_replication_slots; i++)
 	{
@@ -1176,9 +1197,10 @@ ReplicationSlotsComputeRequiredXmin(bool already_locked)
 			agg_catalog_xmin = effective_catalog_xmin;
 	}
 
-	LWLockRelease(ReplicationSlotControlLock);
-
 	ProcArraySetReplicationSlotXmin(agg_xmin, agg_catalog_xmin, already_locked);
+
+	if (!already_locked)
+		LWLockRelease(ReplicationSlotControlLock);
 }
 
 /*