Some problems of recovery conflict wait events

Started by Masahiko Sawadaalmost 6 years ago32 messages

masahiko.sawada@2ndquadrant.com

almost 6 years ago

Hi all,

When recovery conflicts happen on the streaming replication standby,
the wait event of startup process is null when
max_standby_streaming_delay = 0 (to be exact, when the limit time
calculated by max_standby_streaming_delay is behind the last WAL data
receipt time is behind). Moreover the process title of waiting startup
process looks odd in the case of lock conflicts.

1. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | Lock | relation
(1 row)

* ps
42513 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

Looks good.

2. When max_standby_streaming_delay > 0 and the startup process
conflicts with a snapshot,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
44299 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

wait_event_type and wait_event are null in spite of waiting for
conflict resolution.

3. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
46510 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting waiting

wait_event_type and wait_event are null and the process title is
wrong; "waiting" appears twice.

The cause of the first problem, wait_event_type and wait_event are not
set, is that WaitExceedsMaxStandbyDelay which is called by
ResolveRecoveryConflictWithVirtualXIDs waits for other transactions
using pg_usleep rather than WaitLatch. I think we can change it so
that it uses WaitLatch and those caller passes wait event information.

For the second problem, wrong process title, the cause is also
relevant with ResolveRecoveryConflictWithVirtualXIDs; in case of lock
conflicts we add "waiting" to the process title in WaitOnLock but we
add it again in ResolveRecoveryConflictWithVirtualXIDs. I think we can
have WaitOnLock not set process title in recovery case.

This problem exists on 12, 11 and 10. I'll submit the patch.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Masahiko Sawada (#1)

2 attachment(s)

Re: Some problems of recovery conflict wait events

On Tue, 18 Feb 2020 at 17:58, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Hi all,

When recovery conflicts happen on the streaming replication standby,
the wait event of startup process is null when
max_standby_streaming_delay = 0 (to be exact, when the limit time
calculated by max_standby_streaming_delay is behind the last WAL data
receipt time is behind). Moreover the process title of waiting startup
process looks odd in the case of lock conflicts.

1. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | Lock | relation
(1 row)

* ps
42513 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

Looks good.

2. When max_standby_streaming_delay > 0 and the startup process
conflicts with a snapshot,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
44299 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

wait_event_type and wait_event are null in spite of waiting for
conflict resolution.

3. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
46510 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting waiting

wait_event_type and wait_event are null and the process title is
wrong; "waiting" appears twice.

The cause of the first problem, wait_event_type and wait_event are not
set, is that WaitExceedsMaxStandbyDelay which is called by
ResolveRecoveryConflictWithVirtualXIDs waits for other transactions
using pg_usleep rather than WaitLatch. I think we can change it so
that it uses WaitLatch and those caller passes wait event information.

For the second problem, wrong process title, the cause is also
relevant with ResolveRecoveryConflictWithVirtualXIDs; in case of lock
conflicts we add "waiting" to the process title in WaitOnLock but we
add it again in ResolveRecoveryConflictWithVirtualXIDs. I think we can
have WaitOnLock not set process title in recovery case.

This problem exists on 12, 11 and 10. I'll submit the patch.

I've attached patches that fix the above two issues.

0001 patch fixes the first problem. Currently there are 5 types of
recovery conflict resolution: snapshot, tablespace, lock, database and
buffer pin, and we set wait events to only 2 events out of 5: lock
(only when doing ProcWaitForSignal) and buffer pin. Therefore, users
cannot know that the startup process is waiting or not, and what
waiting for. This patch sets wait events to more 3 events: snapshot,
tablespace and lock. For wait events of those 3 events, I thought that
we can create a new more appropriate wait event type, say
RecoveryConflict, and set it for them. However, considering
back-patching to existing versions, adding new wait event type would
not be acceptable. So this patch sets existing wait events such as
PG_WAIT_LOCK to those 3 places and doesn't not set a wait event for
conflict resolution on dropping database because there is not an
appropriate existing one. I'll start a separate thread about
improvement on wait events of recovery conflict resolution for PG13 if
necessary.

0002 patch fixes the second problem. With this patch, the process
title is updated properly in all recovery conflict resolution cases.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Set-wait-events-for-recovery-conflict-resolution.patchapplication/octet-stream; name=0001-Set-wait-events-for-recovery-conflict-resolution.patchDownload

From e2ecf9c29fe47b52f5a867b1136a6e9f56d58490 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 26 Feb 2020 12:43:41 +0900
Subject: [PATCH 1/2] Set wait events for recovery conflict resolution

---
 src/backend/storage/ipc/standby.c | 45 +++++++++++++++----------------
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 3090e57fa4..c85def7317 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -43,7 +43,8 @@ int			max_standby_streaming_delay = 30 * 1000;
 static HTAB *RecoveryLockLists;
 
 static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-												   ProcSignalReason reason);
+												   ProcSignalReason reason,
+												   uint32 wait_event_info);
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
@@ -175,8 +176,7 @@ GetStandbyLimitTime(void)
 	}
 }
 
-#define STANDBY_INITIAL_WAIT_US  1000
-static int	standbyWait_us = STANDBY_INITIAL_WAIT_US;
+#define STANDBY_WAIT_MS  1000
 
 /*
  * Standby wait logic for ResolveRecoveryConflictWithVirtualXIDs.
@@ -184,7 +184,7 @@ static int	standbyWait_us = STANDBY_INITIAL_WAIT_US;
  * more then we return true, if we can wait some more return false.
  */
 static bool
-WaitExceedsMaxStandbyDelay(void)
+WaitExceedsMaxStandbyDelay(uint32 wait_event_info)
 {
 	TimestampTz ltime;
 
@@ -198,15 +198,11 @@ WaitExceedsMaxStandbyDelay(void)
 	/*
 	 * Sleep a bit (this is essential to avoid busy-waiting).
 	 */
-	pg_usleep(standbyWait_us);
-
-	/*
-	 * Progressively increase the sleep times, but not to more than 1s, since
-	 * pg_usleep isn't interruptible on some platforms.
-	 */
-	standbyWait_us *= 2;
-	if (standbyWait_us > 1000000)
-		standbyWait_us = 1000000;
+	WaitLatch(MyLatch,
+			  WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+			  STANDBY_WAIT_MS,
+			  wait_event_info);
+	ResetLatch(MyLatch);
 
 	return false;
 }
@@ -219,7 +215,8 @@ WaitExceedsMaxStandbyDelay(void)
  */
 static void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-									   ProcSignalReason reason)
+									   ProcSignalReason reason,
+									   uint32 wait_event_info)
 {
 	TimestampTz waitStart;
 	char	   *new_status;
@@ -233,9 +230,6 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 
 	while (VirtualTransactionIdIsValid(*waitlist))
 	{
-		/* reset standbyWait_us for each xact we wait for */
-		standbyWait_us = STANDBY_INITIAL_WAIT_US;
-
 		/* wait until the virtual xid is gone */
 		while (!VirtualXactLock(*waitlist, false))
 		{
@@ -259,7 +253,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 			}
 
 			/* Is it time to kill it? */
-			if (WaitExceedsMaxStandbyDelay())
+			if (WaitExceedsMaxStandbyDelay(wait_event_info))
 			{
 				pid_t		pid;
 
@@ -311,7 +305,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 										 node.dbNode);
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
-										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);
+										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
+										   PG_WAIT_LOCK | LOCKTAG_TRANSACTION);
 }
 
 void
@@ -339,7 +334,8 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 	temp_file_users = GetConflictingVirtualXIDs(InvalidTransactionId,
 												InvalidOid);
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
-										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE);
+										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
+										   PG_WAIT_LOCK | LOCKTAG_TRANSACTION);
 }
 
 void
@@ -403,7 +399,8 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 
 		backends = GetLockConflicts(&locktag, AccessExclusiveLock, NULL);
 		ResolveRecoveryConflictWithVirtualXIDs(backends,
-											   PROCSIG_RECOVERY_CONFLICT_LOCK);
+											   PROCSIG_RECOVERY_CONFLICT_LOCK,
+											   PG_WAIT_LOCK | locktag.locktag_type);
 	}
 	else
 	{
@@ -416,10 +413,10 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		timeouts[0].type = TMPARAM_AT;
 		timeouts[0].fin_time = ltime;
 		enable_timeouts(timeouts, 1);
-	}
 
-	/* Wait to be signaled by the release of the Relation Lock */
-	ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+		/* Wait to be signaled by the release of the Relation Lock */
+		ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+	}
 
 	/*
 	 * Clear any timeout requests established above.  We assume here that the
-- 
2.23.0

0002-Fix-process-title-update-during-recovery-conflicts.patchapplication/octet-stream; name=0002-Fix-process-title-update-during-recovery-conflicts.patchDownload

From 7c3c4cc47eb0916d7aa6334df952a9c284cab2fa Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 25 Feb 2020 16:58:56 +0900
Subject: [PATCH 2/2] Fix process title update during recovery conflicts

---
 src/backend/storage/ipc/standby.c | 105 +++++++++++++++++++++---------
 1 file changed, 73 insertions(+), 32 deletions(-)

diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index c85def7317..b45a83c54c 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -48,6 +48,7 @@ static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlis
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
+static char *set_process_title_waiting(void);
 
 /*
  * Keep track of all the locks owned by a given transaction.
@@ -218,40 +219,15 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 									   ProcSignalReason reason,
 									   uint32 wait_event_info)
 {
-	TimestampTz waitStart;
-	char	   *new_status;
-
 	/* Fast exit, to avoid a kernel call if there's no work to be done. */
 	if (!VirtualTransactionIdIsValid(*waitlist))
 		return;
 
-	waitStart = GetCurrentTimestamp();
-	new_status = NULL;			/* we haven't changed the ps display */
-
 	while (VirtualTransactionIdIsValid(*waitlist))
 	{
 		/* wait until the virtual xid is gone */
 		while (!VirtualXactLock(*waitlist, false))
 		{
-			/*
-			 * Report via ps if we have been waiting for more than 500 msec
-			 * (should that be configurable?)
-			 */
-			if (update_process_title && new_status == NULL &&
-				TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
-										   500))
-			{
-				const char *old_status;
-				int			len;
-
-				old_status = get_ps_display(&len);
-				new_status = (char *) palloc(len + 8 + 1);
-				memcpy(new_status, old_status, len);
-				strcpy(new_status + len, " waiting");
-				set_ps_display(new_status, false);
-				new_status[len] = '\0'; /* truncate off " waiting" */
-			}
-
 			/* Is it time to kill it? */
 			if (WaitExceedsMaxStandbyDelay(wait_event_info))
 			{
@@ -275,19 +251,13 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 		/* The virtual transaction is gone now, wait for the next one */
 		waitlist++;
 	}
-
-	/* Reset ps display if we changed it */
-	if (new_status)
-	{
-		set_ps_display(new_status, false);
-		pfree(new_status);
-	}
 }
 
 void
 ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
 {
 	VirtualTransactionId *backends;
+	char		*new_status = NULL;
 
 	/*
 	 * If we get passed InvalidTransactionId then we are a little surprised,
@@ -301,18 +271,32 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 	if (!TransactionIdIsValid(latestRemovedXid))
 		return;
 
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	backends = GetConflictingVirtualXIDs(latestRemovedXid,
 										 node.dbNode);
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
 										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
 										   PG_WAIT_LOCK | LOCKTAG_TRANSACTION);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 void
 ResolveRecoveryConflictWithTablespace(Oid tsid)
 {
 	VirtualTransactionId *temp_file_users;
+	char		*new_status = NULL;
+
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
 
 	/*
 	 * Standby users may be currently using this tablespace for their
@@ -336,11 +320,23 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
 										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
 										   PG_WAIT_LOCK | LOCKTAG_TRANSACTION);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 void
 ResolveRecoveryConflictWithDatabase(Oid dbid)
 {
+	char		*new_status = NULL;
+
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	/*
 	 * We don't do ResolveRecoveryConflictWithVirtualXIDs() here since that
 	 * only waits for transactions and completely idle sessions would block
@@ -362,6 +358,13 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
 		 */
 		pg_usleep(10000);
 	}
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 /*
@@ -380,6 +383,9 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
  *
  * Deadlocks involving the Startup process and an ordinary backend process
  * will be detected by the deadlock detector within the ordinary backend.
+ *
+ * Unlike other recovery conflict resolution functions, this function
+ * doesn't update the process title since we have already updated it.
  */
 void
 ResolveRecoveryConflictWithLock(LOCKTAG locktag)
@@ -458,9 +464,13 @@ void
 ResolveRecoveryConflictWithBufferPin(void)
 {
 	TimestampTz ltime;
+	char		*new_status = NULL;
 
 	Assert(InHotStandby);
 
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	ltime = GetStandbyLimitTime();
 
 	if (ltime == 0)
@@ -505,6 +515,13 @@ ResolveRecoveryConflictWithBufferPin(void)
 	 * individually, but that'd be slower.
 	 */
 	disable_all_timeouts(false);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 static void
@@ -1091,3 +1108,27 @@ LogStandbyInvalidations(int nmsgs, SharedInvalidationMessage *msgs,
 					 nmsgs * sizeof(SharedInvalidationMessage));
 	XLogInsert(RM_STANDBY_ID, XLOG_INVALIDATIONS);
 }
+
+/*
+ * Add " waiting" to the process title, and return palloc'd
+ * original process title.
+ */
+static char *
+set_process_title_waiting(void)
+{
+	const char	*old_status;
+	char		*ret_status;
+	int			len;
+
+	if (!update_process_title)
+		return NULL;
+
+	old_status = get_ps_display(&len);
+	ret_status = (char *) palloc(len + 8 + 1);
+	memcpy(ret_status, old_status, len);
+	strcpy(ret_status + len, " waiting");
+	set_ps_display(ret_status, false);
+	ret_status[len] = '\0'; /* truncate off " waiting" */
+
+	return ret_status;
+}
-- 
2.23.0

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Masahiko Sawada (#2)

1 attachment(s)

Re: Some problems of recovery conflict wait events

On Wed, 26 Feb 2020 at 16:19, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 18 Feb 2020 at 17:58, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Hi all,

When recovery conflicts happen on the streaming replication standby,
the wait event of startup process is null when
max_standby_streaming_delay = 0 (to be exact, when the limit time
calculated by max_standby_streaming_delay is behind the last WAL data
receipt time is behind). Moreover the process title of waiting startup
process looks odd in the case of lock conflicts.

1. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | Lock | relation
(1 row)

* ps
42513 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

Looks good.

2. When max_standby_streaming_delay > 0 and the startup process
conflicts with a snapshot,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
44299 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

wait_event_type and wait_event are null in spite of waiting for
conflict resolution.

3. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
46510 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting waiting

wait_event_type and wait_event are null and the process title is
wrong; "waiting" appears twice.

The cause of the first problem, wait_event_type and wait_event are not
set, is that WaitExceedsMaxStandbyDelay which is called by
ResolveRecoveryConflictWithVirtualXIDs waits for other transactions
using pg_usleep rather than WaitLatch. I think we can change it so
that it uses WaitLatch and those caller passes wait event information.

For the second problem, wrong process title, the cause is also
relevant with ResolveRecoveryConflictWithVirtualXIDs; in case of lock
conflicts we add "waiting" to the process title in WaitOnLock but we
add it again in ResolveRecoveryConflictWithVirtualXIDs. I think we can
have WaitOnLock not set process title in recovery case.

This problem exists on 12, 11 and 10. I'll submit the patch.

I've attached patches that fix the above two issues.

0001 patch fixes the first problem. Currently there are 5 types of
recovery conflict resolution: snapshot, tablespace, lock, database and
buffer pin, and we set wait events to only 2 events out of 5: lock
(only when doing ProcWaitForSignal) and buffer pin. Therefore, users
cannot know that the startup process is waiting or not, and what
waiting for. This patch sets wait events to more 3 events: snapshot,
tablespace and lock. For wait events of those 3 events, I thought that
we can create a new more appropriate wait event type, say
RecoveryConflict, and set it for them. However, considering
back-patching to existing versions, adding new wait event type would
not be acceptable. So this patch sets existing wait events such as
PG_WAIT_LOCK to those 3 places and doesn't not set a wait event for
conflict resolution on dropping database because there is not an
appropriate existing one. I'll start a separate thread about
improvement on wait events of recovery conflict resolution for PG13 if
necessary.

Attached a patch improves wait events of recovery conflict resolution.
It's for PG13. I added new RecoveryConflict wait_event_type and some
wait_event. This patch can be applied on top of two patches I already
proposed.

Regards,

[1]: /messages/by-id/CA+fd4k63ukOtdNx2f-fUZ2vuB3RgE=Po+xSnpmcPJbKqsJMtiA@mail.gmail.com

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0003-Improve-wait-events-of-recovery-conflict-resolution.patchapplication/octet-stream; name=0003-Improve-wait-events-of-recovery-conflict-resolution.patchDownload

From 3ef9afda4f4ed870d24029a344c431af023aa698 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 26 Feb 2020 14:44:09 +0900
Subject: [PATCH 3/3] Improve wait events of recovery conflict resolution

---
 doc/src/sgml/monitoring.sgml      | 28 ++++++++++++++++++
 src/backend/postmaster/pgstat.c   | 47 +++++++++++++++++++++++++++++++
 src/backend/storage/ipc/standby.c | 16 +++++++----
 src/include/pgstat.h              | 17 +++++++++++
 4 files changed, 102 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 87586a7b06..ea30fedcd1 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -781,6 +781,13 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
           <literal>wait_event</literal> will identify the specific wait point.
          </para>
         </listitem>
+        <listitem>
+         <para>
+          <literal>RecoveryConflict</literal>: The server process is waiting for a
+          recovery conflict resolution.  <literal>wait_event</literal> will identify
+          the specific wait point.
+         </para>
+        </listitem>
        </itemizedlist>
       </entry>
      </row>
@@ -1773,6 +1780,27 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
          <entry><literal>WALWrite</literal></entry>
          <entry>Waiting for a write to a WAL file.</entry>
         </row>
+        <row>
+         <entry morerows="5"><literal>RecoveryConflict</literal></entry>
+         <entry><literal>Snapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>Tablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
+        <row>
+         <entry><literal>Lock</literal></entry>
+         <entry>Waiting for recovery conflict resolution on acquiring a lock.</entry>
+        </row>
+        <row>
+         <entry><literal>BufferPin</literal></entry>
+         <entry>Waiting for recovery conflict resolution on acquiring a buffer pin.</entry>
+        </row>
+        <row>
+         <entry><literal>Database</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping a database.</entry>
+        </row>
       </tbody>
      </tgroup>
     </table>
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 462b4d7e06..5276300216 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -308,6 +308,7 @@ static const char *pgstat_get_wait_client(WaitEventClient w);
 static const char *pgstat_get_wait_ipc(WaitEventIPC w);
 static const char *pgstat_get_wait_timeout(WaitEventTimeout w);
 static const char *pgstat_get_wait_io(WaitEventIO w);
+static const char *pgstat_get_wait_recovery_conflict(WaitEventRecoveryConflict w);
 
 static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);
 static void pgstat_send(void *msg, int len);
@@ -3535,6 +3536,9 @@ pgstat_get_wait_event_type(uint32 wait_event_info)
 		case PG_WAIT_IO:
 			event_type = "IO";
 			break;
+		case PG_WAIT_RECOVERY_CONFLICT:
+			event_type = "RecoveryConflict";
+			break;
 		default:
 			event_type = "???";
 			break;
@@ -3612,6 +3616,14 @@ pgstat_get_wait_event(uint32 wait_event_info)
 				event_name = pgstat_get_wait_io(w);
 				break;
 			}
+		case PG_WAIT_RECOVERY_CONFLICT:
+			{
+				WaitEventRecoveryConflict w =
+					(WaitEventRecoveryConflict) wait_event_info;
+
+				event_name = pgstat_get_wait_recovery_conflict(w);
+				break;
+			}
 		default:
 			event_name = "unknown wait event";
 			break;
@@ -4112,6 +4124,41 @@ pgstat_get_wait_io(WaitEventIO w)
 	return event_name;
 }
 
+/* ----------
+ * pgstat_get_wait_recovery_conflict() -
+ *
+ * Convert WaitEventRecoveryConflict to string.
+ * ----------
+ */
+static const char *
+pgstat_get_wait_recovery_conflict(WaitEventRecoveryConflict w)
+{
+	const char *event_name = "unknown wait event";
+
+	switch (w)
+	{
+		case WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT:
+			event_name = "Snapshot";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE:
+			event_name = "Tablespace";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_LOCK:
+			event_name = "Lock";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN:
+			event_name = "BufferPin";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_DATABASE:
+			event_name = "Database";
+			break;
+		default:
+			event_name = "unknown wait event";
+			break;
+	}
+
+	return event_name;
+}
 
 /* ----------
  * pgstat_get_backend_current_activity() -
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index b45a83c54c..0bfe68b14a 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -279,7 +279,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
 										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
-										   PG_WAIT_LOCK | LOCKTAG_TRANSACTION);
+										   WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT);
 
 	/* Reset ps display if we changed it */
 	if (new_status)
@@ -319,7 +319,7 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 												InvalidOid);
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
 										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
-										   PG_WAIT_LOCK | LOCKTAG_TRANSACTION);
+										   WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE);
 
 	/* Reset ps display if we changed it */
 	if (new_status)
@@ -356,7 +356,11 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
 		 * Wait awhile for them to die so that we avoid flooding an
 		 * unresponsive backend when system is heavily loaded.
 		 */
-		pg_usleep(10000);
+		WaitLatch(MyLatch,
+				  WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+				  10,
+				  WAIT_EVENT_RECOVERY_CONFLICT_DATABASE);
+		ResetLatch(MyLatch);
 	}
 
 	/* Reset ps display if we changed it */
@@ -406,7 +410,7 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		backends = GetLockConflicts(&locktag, AccessExclusiveLock, NULL);
 		ResolveRecoveryConflictWithVirtualXIDs(backends,
 											   PROCSIG_RECOVERY_CONFLICT_LOCK,
-											   PG_WAIT_LOCK | locktag.locktag_type);
+											   WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
 	}
 	else
 	{
@@ -421,7 +425,7 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		enable_timeouts(timeouts, 1);
 
 		/* Wait to be signaled by the release of the Relation Lock */
-		ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+		ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
 	}
 
 	/*
@@ -506,7 +510,7 @@ ResolveRecoveryConflictWithBufferPin(void)
 	}
 
 	/* Wait to be signaled by UnpinBuffer() */
-	ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+	ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
 
 	/*
 	 * Clear any timeout requests established above.  We assume here that the
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 3a65a51696..67ab5ee13e 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -760,6 +760,7 @@ typedef enum BackendState
 #define PG_WAIT_IPC					0x08000000U
 #define PG_WAIT_TIMEOUT				0x09000000U
 #define PG_WAIT_IO					0x0A000000U
+#define PG_WAIT_RECOVERY_CONFLICT	0x0B000000U
 
 /* ----------
  * Wait Events - Activity
@@ -948,6 +949,22 @@ typedef enum
 	WAIT_EVENT_WAL_WRITE
 } WaitEventIO;
 
+/* ----------
+ * Wait Events - Recovery Conflict
+ *
+ * Use this category when a process is waiting for a recovery conflict
+ * resolution.
+ * ----------
+ */
+typedef enum
+{
+	WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT = PG_WAIT_RECOVERY_CONFLICT,
+	WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE,
+	WAIT_EVENT_RECOVERY_CONFLICT_LOCK,
+	WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN,
+	WAIT_EVENT_RECOVERY_CONFLICT_DATABASE
+} WaitEventRecoveryConflict;
+
 /* ----------
  * Command type for progress reporting purposes
  * ----------
-- 
2.23.0

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#3)

Re: Some problems of recovery conflict wait events

On 2020/02/29 12:36, Masahiko Sawada wrote:

On Wed, 26 Feb 2020 at 16:19, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 18 Feb 2020 at 17:58, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Hi all,

When recovery conflicts happen on the streaming replication standby,
the wait event of startup process is null when
max_standby_streaming_delay = 0 (to be exact, when the limit time
calculated by max_standby_streaming_delay is behind the last WAL data
receipt time is behind). Moreover the process title of waiting startup
process looks odd in the case of lock conflicts.

1. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | Lock | relation
(1 row)

* ps
42513 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

Looks good.

2. When max_standby_streaming_delay > 0 and the startup process
conflicts with a snapshot,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
44299 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

wait_event_type and wait_event are null in spite of waiting for
conflict resolution.

3. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
46510 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting waiting

wait_event_type and wait_event are null and the process title is
wrong; "waiting" appears twice.

The cause of the first problem, wait_event_type and wait_event are not
set, is that WaitExceedsMaxStandbyDelay which is called by
ResolveRecoveryConflictWithVirtualXIDs waits for other transactions
using pg_usleep rather than WaitLatch. I think we can change it so
that it uses WaitLatch and those caller passes wait event information.

For the second problem, wrong process title, the cause is also
relevant with ResolveRecoveryConflictWithVirtualXIDs; in case of lock
conflicts we add "waiting" to the process title in WaitOnLock but we
add it again in ResolveRecoveryConflictWithVirtualXIDs. I think we can
have WaitOnLock not set process title in recovery case.

This problem exists on 12, 11 and 10. I'll submit the patch.

I've attached patches that fix the above two issues.

0001 patch fixes the first problem. Currently there are 5 types of
recovery conflict resolution: snapshot, tablespace, lock, database and
buffer pin, and we set wait events to only 2 events out of 5: lock
(only when doing ProcWaitForSignal) and buffer pin.

+1 to add those new wait events in the master. But adding them sounds like
new feature rather than bug fix. So ISTM that it's not be back-patchable...

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#4)

Re: Some problems of recovery conflict wait events

On Wed, 4 Mar 2020 at 11:04, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/02/29 12:36, Masahiko Sawada wrote:

On Wed, 26 Feb 2020 at 16:19, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

On Tue, 18 Feb 2020 at 17:58, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:

Hi all,

When recovery conflicts happen on the streaming replication standby,
the wait event of startup process is null when
max_standby_streaming_delay = 0 (to be exact, when the limit time
calculated by max_standby_streaming_delay is behind the last WAL data
receipt time is behind). Moreover the process title of waiting startup
process looks odd in the case of lock conflicts.

1. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | Lock | relation
(1 row)

* ps
42513 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

Looks good.

2. When max_standby_streaming_delay > 0 and the startup process
conflicts with a snapshot,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
44299 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting

wait_event_type and wait_event are null in spite of waiting for
conflict resolution.

3. When max_standby_streaming_delay > 0 and the startup process
conflicts with a lock,

* wait event
backend_type | wait_event_type | wait_event
--------------+-----------------+------------
startup | |
(1 row)

* ps
46510 ?? Ss 0:00.05 postgres: startup recovering
000000010000000000000003 waiting waiting

wait_event_type and wait_event are null and the process title is
wrong; "waiting" appears twice.

The cause of the first problem, wait_event_type and wait_event are not
set, is that WaitExceedsMaxStandbyDelay which is called by
ResolveRecoveryConflictWithVirtualXIDs waits for other transactions
using pg_usleep rather than WaitLatch. I think we can change it so
that it uses WaitLatch and those caller passes wait event information.

For the second problem, wrong process title, the cause is also
relevant with ResolveRecoveryConflictWithVirtualXIDs; in case of lock
conflicts we add "waiting" to the process title in WaitOnLock but we
add it again in ResolveRecoveryConflictWithVirtualXIDs. I think we can
have WaitOnLock not set process title in recovery case.

This problem exists on 12, 11 and 10. I'll submit the patch.

I've attached patches that fix the above two issues.

0001 patch fixes the first problem. Currently there are 5 types of
recovery conflict resolution: snapshot, tablespace, lock, database and
buffer pin, and we set wait events to only 2 events out of 5: lock
(only when doing ProcWaitForSignal) and buffer pin.

+1 to add those new wait events in the master. But adding them sounds like
new feature rather than bug fix. So ISTM that it's not be back-patchable...

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Michael Paquier

michael@paquier.xyz

almost 6 years ago

In reply to: Masahiko Sawada (#5)

Re: Some problems of recovery conflict wait events

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.
--
Michael

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Michael Paquier (#6)

Re: Some problems of recovery conflict wait events

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#7)

2 attachment(s)

Re: Some problems of recovery conflict wait events

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patchapplication/octet-stream; name=v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patchDownload

From bb01996f2c91f615285eaa916984d3c136c1deb5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Wed, 26 Feb 2020 12:43:41 +0900
Subject: [PATCH v2 2/2] Improve wait events for recovery conflict resolution

---
 doc/src/sgml/monitoring.sgml      | 28 ++++++++++++++++
 src/backend/postmaster/pgstat.c   | 47 +++++++++++++++++++++++++++
 src/backend/storage/ipc/standby.c | 53 ++++++++++++++++---------------
 src/include/pgstat.h              | 17 ++++++++++
 4 files changed, 119 insertions(+), 26 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 987580d6df..479a9000cb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -789,6 +789,13 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
           <literal>wait_event</literal> will identify the specific wait point.
          </para>
         </listitem>
+        <listitem>
+         <para>
+          <literal>RecoveryConflict</literal>: The server process is waiting for a
+          recovery conflict resolution.  <literal>wait_event</literal> will identify
+          the specific wait point.
+         </para>
+        </listitem>
        </itemizedlist>
       </entry>
      </row>
@@ -1781,6 +1788,27 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
          <entry><literal>WALWrite</literal></entry>
          <entry>Waiting for a write to a WAL file.</entry>
         </row>
+        <row>
+         <entry morerows="5"><literal>RecoveryConflict</literal></entry>
+         <entry><literal>Snapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>Tablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
+        <row>
+         <entry><literal>Lock</literal></entry>
+         <entry>Waiting for recovery conflict resolution on acquiring a lock.</entry>
+        </row>
+        <row>
+         <entry><literal>BufferPin</literal></entry>
+         <entry>Waiting for recovery conflict resolution on acquiring a buffer pin.</entry>
+        </row>
+        <row>
+         <entry><literal>Database</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping a database.</entry>
+        </row>
       </tbody>
      </tgroup>
     </table>
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 462b4d7e06..5276300216 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -308,6 +308,7 @@ static const char *pgstat_get_wait_client(WaitEventClient w);
 static const char *pgstat_get_wait_ipc(WaitEventIPC w);
 static const char *pgstat_get_wait_timeout(WaitEventTimeout w);
 static const char *pgstat_get_wait_io(WaitEventIO w);
+static const char *pgstat_get_wait_recovery_conflict(WaitEventRecoveryConflict w);
 
 static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);
 static void pgstat_send(void *msg, int len);
@@ -3535,6 +3536,9 @@ pgstat_get_wait_event_type(uint32 wait_event_info)
 		case PG_WAIT_IO:
 			event_type = "IO";
 			break;
+		case PG_WAIT_RECOVERY_CONFLICT:
+			event_type = "RecoveryConflict";
+			break;
 		default:
 			event_type = "???";
 			break;
@@ -3612,6 +3616,14 @@ pgstat_get_wait_event(uint32 wait_event_info)
 				event_name = pgstat_get_wait_io(w);
 				break;
 			}
+		case PG_WAIT_RECOVERY_CONFLICT:
+			{
+				WaitEventRecoveryConflict w =
+					(WaitEventRecoveryConflict) wait_event_info;
+
+				event_name = pgstat_get_wait_recovery_conflict(w);
+				break;
+			}
 		default:
 			event_name = "unknown wait event";
 			break;
@@ -4112,6 +4124,41 @@ pgstat_get_wait_io(WaitEventIO w)
 	return event_name;
 }
 
+/* ----------
+ * pgstat_get_wait_recovery_conflict() -
+ *
+ * Convert WaitEventRecoveryConflict to string.
+ * ----------
+ */
+static const char *
+pgstat_get_wait_recovery_conflict(WaitEventRecoveryConflict w)
+{
+	const char *event_name = "unknown wait event";
+
+	switch (w)
+	{
+		case WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT:
+			event_name = "Snapshot";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE:
+			event_name = "Tablespace";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_LOCK:
+			event_name = "Lock";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN:
+			event_name = "BufferPin";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_DATABASE:
+			event_name = "Database";
+			break;
+		default:
+			event_name = "unknown wait event";
+			break;
+	}
+
+	return event_name;
+}
 
 /* ----------
  * pgstat_get_backend_current_activity() -
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 4f12f598ab..bd95598a41 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -43,7 +43,8 @@ int			max_standby_streaming_delay = 30 * 1000;
 static HTAB *RecoveryLockLists;
 
 static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-												   ProcSignalReason reason);
+												   ProcSignalReason reason,
+												   uint32 wait_event_info);
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
@@ -176,8 +177,7 @@ GetStandbyLimitTime(void)
 	}
 }
 
-#define STANDBY_INITIAL_WAIT_US  1000
-static int	standbyWait_us = STANDBY_INITIAL_WAIT_US;
+#define STANDBY_WAIT_MS  1000
 
 /*
  * Standby wait logic for ResolveRecoveryConflictWithVirtualXIDs.
@@ -185,7 +185,7 @@ static int	standbyWait_us = STANDBY_INITIAL_WAIT_US;
  * more then we return true, if we can wait some more return false.
  */
 static bool
-WaitExceedsMaxStandbyDelay(void)
+WaitExceedsMaxStandbyDelay(uint32 wait_event_info)
 {
 	TimestampTz ltime;
 
@@ -199,15 +199,11 @@ WaitExceedsMaxStandbyDelay(void)
 	/*
 	 * Sleep a bit (this is essential to avoid busy-waiting).
 	 */
-	pg_usleep(standbyWait_us);
-
-	/*
-	 * Progressively increase the sleep times, but not to more than 1s, since
-	 * pg_usleep isn't interruptible on some platforms.
-	 */
-	standbyWait_us *= 2;
-	if (standbyWait_us > 1000000)
-		standbyWait_us = 1000000;
+	WaitLatch(MyLatch,
+			  WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+			  STANDBY_WAIT_MS,
+			  wait_event_info);
+	ResetLatch(MyLatch);
 
 	return false;
 }
@@ -220,7 +216,8 @@ WaitExceedsMaxStandbyDelay(void)
  */
 static void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-									   ProcSignalReason reason)
+									   ProcSignalReason reason,
+									   uint32 wait_event_info)
 {
 	/* Fast exit, to avoid a kernel call if there's no work to be done. */
 	if (!VirtualTransactionIdIsValid(*waitlist))
@@ -228,14 +225,11 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 
 	while (VirtualTransactionIdIsValid(*waitlist))
 	{
-		/* reset standbyWait_us for each xact we wait for */
-		standbyWait_us = STANDBY_INITIAL_WAIT_US;
-
 		/* wait until the virtual xid is gone */
 		while (!VirtualXactLock(*waitlist, false))
 		{
 			/* Is it time to kill it? */
-			if (WaitExceedsMaxStandbyDelay())
+			if (WaitExceedsMaxStandbyDelay(wait_event_info))
 			{
 				pid_t		pid;
 
@@ -284,7 +278,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 										 node.dbNode);
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
-										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);
+										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
+										   WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT);
 
 	/* Reset ps display if we changed it */
 	if (new_status)
@@ -323,7 +318,8 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 	temp_file_users = GetConflictingVirtualXIDs(InvalidTransactionId,
 												InvalidOid);
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
-										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE);
+										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
+										   WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE);
 
 	/* Reset ps display if we changed it */
 	if (new_status)
@@ -360,7 +356,11 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
 		 * Wait awhile for them to die so that we avoid flooding an
 		 * unresponsive backend when system is heavily loaded.
 		 */
-		pg_usleep(10000);
+		WaitLatch(MyLatch,
+				  WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+				  10,
+				  WAIT_EVENT_RECOVERY_CONFLICT_DATABASE);
+		ResetLatch(MyLatch);
 	}
 
 	/* Reset ps display if we changed it */
@@ -410,7 +410,8 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 
 		backends = GetLockConflicts(&locktag, AccessExclusiveLock, NULL);
 		ResolveRecoveryConflictWithVirtualXIDs(backends,
-											   PROCSIG_RECOVERY_CONFLICT_LOCK);
+											   PROCSIG_RECOVERY_CONFLICT_LOCK,
+											   WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
 	}
 	else
 	{
@@ -423,10 +424,10 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		timeouts[0].type = TMPARAM_AT;
 		timeouts[0].fin_time = ltime;
 		enable_timeouts(timeouts, 1);
-	}
 
-	/* Wait to be signaled by the release of the Relation Lock */
-	ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+		/* Wait to be signaled by the release of the Relation Lock */
+		ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
+	}
 
 	/*
 	 * Clear any timeout requests established above.  We assume here that the
@@ -510,7 +511,7 @@ ResolveRecoveryConflictWithBufferPin(void)
 	}
 
 	/* Wait to be signaled by UnpinBuffer() */
-	ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+	ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
 
 	/*
 	 * Clear any timeout requests established above.  We assume here that the
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 7bc36c6583..b1eda8cc32 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -760,6 +760,7 @@ typedef enum BackendState
 #define PG_WAIT_IPC					0x08000000U
 #define PG_WAIT_TIMEOUT				0x09000000U
 #define PG_WAIT_IO					0x0A000000U
+#define PG_WAIT_RECOVERY_CONFLICT	0x0B000000U
 
 /* ----------
  * Wait Events - Activity
@@ -948,6 +949,22 @@ typedef enum
 	WAIT_EVENT_WAL_WRITE
 } WaitEventIO;
 
+/* ----------
+ * Wait Events - Recovery Conflict
+ *
+ * Use this category when a process is waiting for a recovery conflict
+ * resolution.
+ * ----------
+ */
+typedef enum
+{
+	WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT = PG_WAIT_RECOVERY_CONFLICT,
+	WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE,
+	WAIT_EVENT_RECOVERY_CONFLICT_LOCK,
+	WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN,
+	WAIT_EVENT_RECOVERY_CONFLICT_DATABASE
+} WaitEventRecoveryConflict;
+
 /* ----------
  * Command type for progress reporting purposes
  * ----------
-- 
2.23.0

v2-0001-Fix-process-title-update-during-recovery-conflict.patchapplication/octet-stream; name=v2-0001-Fix-process-title-update-during-recovery-conflict.patchDownload

From dc896964de1bdb2e6ad419fc53edb54360b3afb1 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 25 Feb 2020 16:58:56 +0900
Subject: [PATCH v2 1/2] Fix process title update during recovery conflicts

---
 src/backend/storage/ipc/standby.c | 106 +++++++++++++++++++++---------
 1 file changed, 74 insertions(+), 32 deletions(-)

diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 3090e57fa4..4f12f598ab 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -47,6 +47,7 @@ static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlis
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
+static char *set_process_title_waiting(void);
 
 /*
  * Keep track of all the locks owned by a given transaction.
@@ -221,16 +222,10 @@ static void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 									   ProcSignalReason reason)
 {
-	TimestampTz waitStart;
-	char	   *new_status;
-
 	/* Fast exit, to avoid a kernel call if there's no work to be done. */
 	if (!VirtualTransactionIdIsValid(*waitlist))
 		return;
 
-	waitStart = GetCurrentTimestamp();
-	new_status = NULL;			/* we haven't changed the ps display */
-
 	while (VirtualTransactionIdIsValid(*waitlist))
 	{
 		/* reset standbyWait_us for each xact we wait for */
@@ -239,25 +234,6 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 		/* wait until the virtual xid is gone */
 		while (!VirtualXactLock(*waitlist, false))
 		{
-			/*
-			 * Report via ps if we have been waiting for more than 500 msec
-			 * (should that be configurable?)
-			 */
-			if (update_process_title && new_status == NULL &&
-				TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
-										   500))
-			{
-				const char *old_status;
-				int			len;
-
-				old_status = get_ps_display(&len);
-				new_status = (char *) palloc(len + 8 + 1);
-				memcpy(new_status, old_status, len);
-				strcpy(new_status + len, " waiting");
-				set_ps_display(new_status, false);
-				new_status[len] = '\0'; /* truncate off " waiting" */
-			}
-
 			/* Is it time to kill it? */
 			if (WaitExceedsMaxStandbyDelay())
 			{
@@ -281,19 +257,13 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 		/* The virtual transaction is gone now, wait for the next one */
 		waitlist++;
 	}
-
-	/* Reset ps display if we changed it */
-	if (new_status)
-	{
-		set_ps_display(new_status, false);
-		pfree(new_status);
-	}
 }
 
 void
 ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
 {
 	VirtualTransactionId *backends;
+	char		*new_status = NULL;
 
 	/*
 	 * If we get passed InvalidTransactionId then we are a little surprised,
@@ -307,17 +277,31 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 	if (!TransactionIdIsValid(latestRemovedXid))
 		return;
 
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	backends = GetConflictingVirtualXIDs(latestRemovedXid,
 										 node.dbNode);
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
 										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 void
 ResolveRecoveryConflictWithTablespace(Oid tsid)
 {
 	VirtualTransactionId *temp_file_users;
+	char		*new_status = NULL;
+
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
 
 	/*
 	 * Standby users may be currently using this tablespace for their
@@ -340,11 +324,23 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 												InvalidOid);
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
 										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 void
 ResolveRecoveryConflictWithDatabase(Oid dbid)
 {
+	char		*new_status = NULL;
+
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	/*
 	 * We don't do ResolveRecoveryConflictWithVirtualXIDs() here since that
 	 * only waits for transactions and completely idle sessions would block
@@ -366,6 +362,13 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
 		 */
 		pg_usleep(10000);
 	}
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 /*
@@ -384,6 +387,10 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
  *
  * Deadlocks involving the Startup process and an ordinary backend process
  * will be detected by the deadlock detector within the ordinary backend.
+ *
+ * Unlike other recovery conflict resolution functions, this function
+ * doesn't update the process title since we have already updated it at
+ * WaitOnLock().
  */
 void
 ResolveRecoveryConflictWithLock(LOCKTAG locktag)
@@ -461,9 +468,13 @@ void
 ResolveRecoveryConflictWithBufferPin(void)
 {
 	TimestampTz ltime;
+	char		*new_status = NULL;
 
 	Assert(InHotStandby);
 
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	ltime = GetStandbyLimitTime();
 
 	if (ltime == 0)
@@ -508,6 +519,13 @@ ResolveRecoveryConflictWithBufferPin(void)
 	 * individually, but that'd be slower.
 	 */
 	disable_all_timeouts(false);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 static void
@@ -1094,3 +1112,27 @@ LogStandbyInvalidations(int nmsgs, SharedInvalidationMessage *msgs,
 					 nmsgs * sizeof(SharedInvalidationMessage));
 	XLogInsert(RM_STANDBY_ID, XLOG_INVALIDATIONS);
 }
+
+/*
+ * Append " waiting" to the process title, and return palloc'd
+ * original process title.
+ */
+static char *
+set_process_title_waiting(void)
+{
+	const char	*old_status;
+	char		*ret_status;
+	int			len;
+
+	if (!update_process_title)
+		return NULL;
+
+	old_status = get_ps_display(&len);
+	ret_status = (char *) palloc(len + 8 + 1);
+	memcpy(ret_status, old_status, len);
+	strcpy(ret_status + len, " waiting");
+	set_ps_display(ret_status, false);
+	ret_status[len] = '\0'; /* truncate off " waiting" */
+
+	return ret_status;
+}
-- 
2.23.0

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#8)

Re: Some problems of recovery conflict wait events

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

  ResolveRecoveryConflictWithDatabase(Oid dbid)
  {
+	char		*new_status = NULL;
+
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();

In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#10

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#9)

Re: Some problems of recovery conflict wait events

On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.

ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.

Isn't the startup process waiting for other backend to terminate?

Regards

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#10)

Re: Some problems of recovery conflict wait events

On 2020/03/05 16:58, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?

Yeah, you're right. I agree that "waiting" should be reported in this case.

Currently ResolveRecoveryConflictWithLock() and
ResolveRecoveryConflictWithBufferPin() don't call
ResolveRecoveryConflictWithVirtualXIDs and don't report "waiting"
in PS display. You changed them so that they report "waiting". I agree
to have this change. But this change is an improvement rather than
a bug fix, i.e., we should apply this change only in v13?

Of course, the other part in the patch, i.e., fixing the issue that
"waiting" is doubly reported, should be back-patched, I think,
though.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#12

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#11)

2 attachment(s)

Re: Some problems of recovery conflict wait events

On Thu, 5 Mar 2020 at 20:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.

Currently ResolveRecoveryConflictWithLock() and
ResolveRecoveryConflictWithBufferPin() don't call
ResolveRecoveryConflictWithVirtualXIDs and don't report "waiting"
in PS display. You changed them so that they report "waiting". I agree
to have this change. But this change is an improvement rather than
a bug fix, i.e., we should apply this change only in v13?

Did you mean ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin? In the current code as far as I
researched there are two cases where we don't add "waiting" and one
case where we doubly add "waiting".

ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin don't update the ps title.
Although the path where GetCurrentTimestamp() >= ltime is false in
ResolveRecoveryConflictWithLock also doesn't update the ps title, it's
already updated in WaitOnLock. On the other hand, the path where
GetCurrentTimestamp() >= ltime is true in
ResolveRecoveryConflictWithLock updates the ps title but it's wrong as
I reported.

I've split the patch into two patches: 0001 patch fixes the issue
about doubly updating ps title, 0002 patch makes the recovery conflict
resolution on database and buffer pin update the ps title.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v3-0002-Update-process-title-during-conflict-resolution-w.patchapplication/octet-stream; name=v3-0002-Update-process-title-during-conflict-resolution-w.patchDownload

From f47615dabfe62f5861c74868bde59cf4ed459edd Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Sun, 8 Mar 2020 13:09:51 +0900
Subject: [PATCH v3 2/2] Update process title during conflict resolution with
 buffer-pin and database

---
 src/backend/storage/ipc/standby.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 3c0d42d7b4..b7f58397b3 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -345,6 +345,8 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 void
 ResolveRecoveryConflictWithDatabase(Oid dbid)
 {
+	char		*new_status = NULL;
+
 	/*
 	 * We don't do ResolveRecoveryConflictWithVirtualXIDs() here since that
 	 * only waits for transactions and completely idle sessions would block
@@ -358,6 +360,10 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
 	 */
 	while (CountDBBackends(dbid) > 0)
 	{
+		/* Report via ps we are waiting */
+		if (!new_status)
+			new_status = set_process_title_waiting();
+
 		CancelDBBackends(dbid, PROCSIG_RECOVERY_CONFLICT_DATABASE, true);
 
 		/*
@@ -366,6 +372,13 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
 		 */
 		pg_usleep(10000);
 	}
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 /*
@@ -465,9 +478,13 @@ void
 ResolveRecoveryConflictWithBufferPin(void)
 {
 	TimestampTz ltime;
+	char		*new_status = NULL;
 
 	Assert(InHotStandby);
 
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	ltime = GetStandbyLimitTime();
 
 	if (ltime == 0)
@@ -512,6 +529,13 @@ ResolveRecoveryConflictWithBufferPin(void)
 	 * individually, but that'd be slower.
 	 */
 	disable_all_timeouts(false);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 static void
-- 
2.23.0

v3-0001-Fix-double-updating-ps-title-when-recovery-confli.patchapplication/octet-stream; name=v3-0001-Fix-double-updating-ps-title-when-recovery-confli.patchDownload

From 02eb9367f7a7757c5477d5034f5d7cda45d69318 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Sun, 8 Mar 2020 13:07:41 +0900
Subject: [PATCH v3 1/2] Fix double updating ps title when recovery conflict
 resolution with lock

---
 src/backend/storage/ipc/standby.c | 92 ++++++++++++++++++++-----------
 1 file changed, 60 insertions(+), 32 deletions(-)

diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 3090e57fa4..3c0d42d7b4 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -47,6 +47,7 @@ static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlis
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
+static char *set_process_title_waiting(void);
 
 /*
  * Keep track of all the locks owned by a given transaction.
@@ -221,16 +222,10 @@ static void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 									   ProcSignalReason reason)
 {
-	TimestampTz waitStart;
-	char	   *new_status;
-
 	/* Fast exit, to avoid a kernel call if there's no work to be done. */
 	if (!VirtualTransactionIdIsValid(*waitlist))
 		return;
 
-	waitStart = GetCurrentTimestamp();
-	new_status = NULL;			/* we haven't changed the ps display */
-
 	while (VirtualTransactionIdIsValid(*waitlist))
 	{
 		/* reset standbyWait_us for each xact we wait for */
@@ -239,25 +234,6 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 		/* wait until the virtual xid is gone */
 		while (!VirtualXactLock(*waitlist, false))
 		{
-			/*
-			 * Report via ps if we have been waiting for more than 500 msec
-			 * (should that be configurable?)
-			 */
-			if (update_process_title && new_status == NULL &&
-				TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
-										   500))
-			{
-				const char *old_status;
-				int			len;
-
-				old_status = get_ps_display(&len);
-				new_status = (char *) palloc(len + 8 + 1);
-				memcpy(new_status, old_status, len);
-				strcpy(new_status + len, " waiting");
-				set_ps_display(new_status, false);
-				new_status[len] = '\0'; /* truncate off " waiting" */
-			}
-
 			/* Is it time to kill it? */
 			if (WaitExceedsMaxStandbyDelay())
 			{
@@ -281,19 +257,13 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 		/* The virtual transaction is gone now, wait for the next one */
 		waitlist++;
 	}
-
-	/* Reset ps display if we changed it */
-	if (new_status)
-	{
-		set_ps_display(new_status, false);
-		pfree(new_status);
-	}
 }
 
 void
 ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node)
 {
 	VirtualTransactionId *backends;
+	char		*new_status = NULL;
 
 	/*
 	 * If we get passed InvalidTransactionId then we are a little surprised,
@@ -310,14 +280,29 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 	backends = GetConflictingVirtualXIDs(latestRemovedXid,
 										 node.dbNode);
 
+	/* quick exit if there is no conflict on the xid */
+	if (!VirtualTransactionIdIsValid(*backends))
+		return;
+
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
 										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 void
 ResolveRecoveryConflictWithTablespace(Oid tsid)
 {
 	VirtualTransactionId *temp_file_users;
+	char		*new_status = NULL;
 
 	/*
 	 * Standby users may be currently using this tablespace for their
@@ -338,8 +323,23 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 	 */
 	temp_file_users = GetConflictingVirtualXIDs(InvalidTransactionId,
 												InvalidOid);
+
+	/* quick exit if there is no conflict on the tablespace */
+	if (!VirtualTransactionIdIsValid(*temp_file_users))
+		return;
+
+	/* Report via ps we are waiting */
+	new_status = set_process_title_waiting();
+
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
 										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE);
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status, false);
+		pfree(new_status);
+	}
 }
 
 void
@@ -384,6 +384,10 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
  *
  * Deadlocks involving the Startup process and an ordinary backend process
  * will be detected by the deadlock detector within the ordinary backend.
+ *
+ * Unlike other recovery conflict resolution functions, this function
+ * doesn't update the process title since we have already updated it in
+ * WaitOnLock().
  */
 void
 ResolveRecoveryConflictWithLock(LOCKTAG locktag)
@@ -1094,3 +1098,27 @@ LogStandbyInvalidations(int nmsgs, SharedInvalidationMessage *msgs,
 					 nmsgs * sizeof(SharedInvalidationMessage));
 	XLogInsert(RM_STANDBY_ID, XLOG_INVALIDATIONS);
 }
+
+/*
+ * Append " waiting" to the process title, and return palloc'd
+ * original process title.
+ */
+static char *
+set_process_title_waiting(void)
+{
+	const char	*old_status;
+	char		*ret_status;
+	int			len;
+
+	if (!update_process_title)
+		return NULL;
+
+	old_status = get_ps_display(&len);
+	ret_status = (char *) palloc(len + 8 + 1);
+	memcpy(ret_status, old_status, len);
+	strcpy(ret_status + len, " waiting");
+	set_ps_display(ret_status, false);
+	ret_status[len] = '\0'; /* truncate off " waiting" */
+
+	return ret_status;
+}
-- 
2.23.0

#13

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#12)

1 attachment(s)

Re: Some problems of recovery conflict wait events

On 2020/03/08 13:52, Masahiko Sawada wrote:

On Thu, 5 Mar 2020 at 20:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.

Currently ResolveRecoveryConflictWithLock() and
ResolveRecoveryConflictWithBufferPin() don't call
ResolveRecoveryConflictWithVirtualXIDs and don't report "waiting"
in PS display. You changed them so that they report "waiting". I agree
to have this change. But this change is an improvement rather than
a bug fix, i.e., we should apply this change only in v13?
Did you mean ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin?

Yes! Sorry for my typo.

In the current code as far as I
researched there are two cases where we don't add "waiting" and one
case where we doubly add "waiting".

ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin don't update the ps title.
Although the path where GetCurrentTimestamp() >= ltime is false in
ResolveRecoveryConflictWithLock also doesn't update the ps title, it's
already updated in WaitOnLock. On the other hand, the path where
GetCurrentTimestamp() >= ltime is true in
ResolveRecoveryConflictWithLock updates the ps title but it's wrong as
I reported.

I've split the patch into two patches: 0001 patch fixes the issue
about doubly updating ps title, 0002 patch makes the recovery conflict
resolution on database and buffer pin update the ps title.

Thanks for splitting the patches. I think that 0001 patch can be back-patched.

Originally, "waiting" is reported in PS if we've been waiting for more than
500 msec, as the above does. But you got rid of those codes in the patch.
Did you confirm that it's safe to do that? If not, isn't it better to apply
the attached patch? The attached patch makes
ResolveRecoveryConflictWithVirtualXIDs() report "waiting" as it does now,
and allows its caller to choose whether to report that.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Attachments:

v3-0001-Fix-double-updating-ps-title-when-recovery-confli_fujii.patchtext/plain; charset=UTF-8; name=v3-0001-Fix-double-updating-ps-title-when-recovery-confli_fujii.patch; x-mac-creator=0; x-mac-type=0Download

diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 3090e57fa4..94d4a8cbff 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -43,7 +43,7 @@ int			max_standby_streaming_delay = 30 * 1000;
 static HTAB *RecoveryLockLists;
 
 static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-												   ProcSignalReason reason);
+												   ProcSignalReason reason, bool report_waiting);
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
@@ -216,10 +216,14 @@ WaitExceedsMaxStandbyDelay(void)
  * recovery processing. Judgement has already been passed on it within
  * a specific rmgr. Here we just issue the orders to the procs. The procs
  * then throw the required error as instructed.
+ *
+ * If report_waiting is true, "waiting" is reported in PS display if necessary.
+ * If the caller has already reported that, report_waiting should be false.
+ * Otherwise, "waiting" is reported twice unexpectedly.
  */
 static void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-									   ProcSignalReason reason)
+									   ProcSignalReason reason, bool report_waiting)
 {
 	TimestampTz waitStart;
 	char	   *new_status;
@@ -228,7 +232,8 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 	if (!VirtualTransactionIdIsValid(*waitlist))
 		return;
 
-	waitStart = GetCurrentTimestamp();
+	if (report_waiting)
+		waitStart = GetCurrentTimestamp();
 	new_status = NULL;			/* we haven't changed the ps display */
 
 	while (VirtualTransactionIdIsValid(*waitlist))
@@ -243,7 +248,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 			 * Report via ps if we have been waiting for more than 500 msec
 			 * (should that be configurable?)
 			 */
-			if (update_process_title && new_status == NULL &&
+			if (update_process_title && new_status == NULL && report_waiting &&
 				TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
 										   500))
 			{
@@ -311,7 +316,8 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 										 node.dbNode);
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
-										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);
+										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
+										   true);
 }
 
 void
@@ -339,7 +345,8 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 	temp_file_users = GetConflictingVirtualXIDs(InvalidTransactionId,
 												InvalidOid);
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
-										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE);
+										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
+										   true);
 }
 
 void
@@ -402,8 +409,15 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		VirtualTransactionId *backends;
 
 		backends = GetLockConflicts(&locktag, AccessExclusiveLock, NULL);
+
+		/*
+		 * Prevent ResolveRecoveryConflictWithVirtualXIDs() from reporting
+		 * "waiting" in PS display by disabling its argument report_waiting
+		 * because the caller, WaitOnLock(), has already reported that.
+		 */
 		ResolveRecoveryConflictWithVirtualXIDs(backends,
-											   PROCSIG_RECOVERY_CONFLICT_LOCK);
+											   PROCSIG_RECOVERY_CONFLICT_LOCK,
+											   false);
 	}
 	else
 	{

#14

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#13)

Re: Some problems of recovery conflict wait events

On Mon, 9 Mar 2020 at 13:24, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/08 13:52, Masahiko Sawada wrote:
On Thu, 5 Mar 2020 at 20:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.

Currently ResolveRecoveryConflictWithLock() and
ResolveRecoveryConflictWithBufferPin() don't call
ResolveRecoveryConflictWithVirtualXIDs and don't report "waiting"
in PS display. You changed them so that they report "waiting". I agree
to have this change. But this change is an improvement rather than
a bug fix, i.e., we should apply this change only in v13?
Did you mean ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin?
Yes! Sorry for my typo.

In the current code as far as I
researched there are two cases where we don't add "waiting" and one
case where we doubly add "waiting".

ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin don't update the ps title.
Although the path where GetCurrentTimestamp() >= ltime is false in
ResolveRecoveryConflictWithLock also doesn't update the ps title, it's
already updated in WaitOnLock. On the other hand, the path where
GetCurrentTimestamp() >= ltime is true in
ResolveRecoveryConflictWithLock updates the ps title but it's wrong as
I reported.

I've split the patch into two patches: 0001 patch fixes the issue
about doubly updating ps title, 0002 patch makes the recovery conflict
resolution on database and buffer pin update the ps title.

Thanks for splitting the patches. I think that 0001 patch can be back-patched.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

Originally, "waiting" is reported in PS if we've been waiting for more than
500 msec, as the above does. But you got rid of those codes in the patch.
Did you confirm that it's safe to do that? If not, isn't it better to apply
the attached patch?

In WaitOnLock() we update the ps title regardless of waiting time. So
I thought we can change it to make these behavior consistent. But
considering back-patch, your patch looks better than mine.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#14)

Re: Some problems of recovery conflict wait events

On 2020/03/09 14:10, Masahiko Sawada wrote:

On Mon, 9 Mar 2020 at 13:24, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/08 13:52, Masahiko Sawada wrote:
On Thu, 5 Mar 2020 at 20:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.

Currently ResolveRecoveryConflictWithLock() and
ResolveRecoveryConflictWithBufferPin() don't call
ResolveRecoveryConflictWithVirtualXIDs and don't report "waiting"
in PS display. You changed them so that they report "waiting". I agree
to have this change. But this change is an improvement rather than
a bug fix, i.e., we should apply this change only in v13?
Did you mean ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin?
Yes! Sorry for my typo.

In the current code as far as I
researched there are two cases where we don't add "waiting" and one
case where we doubly add "waiting".

ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin don't update the ps title.
Although the path where GetCurrentTimestamp() >= ltime is false in
ResolveRecoveryConflictWithLock also doesn't update the ps title, it's
already updated in WaitOnLock. On the other hand, the path where
GetCurrentTimestamp() >= ltime is true in
ResolveRecoveryConflictWithLock updates the ps title but it's wrong as
I reported.

I've split the patch into two patches: 0001 patch fixes the issue
about doubly updating ps title, 0002 patch makes the recovery conflict
resolution on database and buffer pin update the ps title.

Thanks for splitting the patches. I think that 0001 patch can be back-patched.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

Originally, "waiting" is reported in PS if we've been waiting for more than
500 msec, as the above does. But you got rid of those codes in the patch.
Did you confirm that it's safe to do that? If not, isn't it better to apply
the attached patch?
In WaitOnLock() we update the ps title regardless of waiting time. So
I thought we can change it to make these behavior consistent. But
considering back-patch, your patch looks better than mine.

Yeah, so I pushed the 0001 patch at first!
I will review the other patches later.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#16

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#15)

Re: Some problems of recovery conflict wait events

On Tue, 10 Mar 2020 at 00:57, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/09 14:10, Masahiko Sawada wrote:
On Mon, 9 Mar 2020 at 13:24, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/08 13:52, Masahiko Sawada wrote:
On Thu, 5 Mar 2020 at 20:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.

Currently ResolveRecoveryConflictWithLock() and
ResolveRecoveryConflictWithBufferPin() don't call
ResolveRecoveryConflictWithVirtualXIDs and don't report "waiting"
in PS display. You changed them so that they report "waiting". I agree
to have this change. But this change is an improvement rather than
a bug fix, i.e., we should apply this change only in v13?
Did you mean ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin?
Yes! Sorry for my typo.

In the current code as far as I
researched there are two cases where we don't add "waiting" and one
case where we doubly add "waiting".

ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin don't update the ps title.
Although the path where GetCurrentTimestamp() >= ltime is false in
ResolveRecoveryConflictWithLock also doesn't update the ps title, it's
already updated in WaitOnLock. On the other hand, the path where
GetCurrentTimestamp() >= ltime is true in
ResolveRecoveryConflictWithLock updates the ps title but it's wrong as
I reported.

I've split the patch into two patches: 0001 patch fixes the issue
about doubly updating ps title, 0002 patch makes the recovery conflict
resolution on database and buffer pin update the ps title.

Thanks for splitting the patches. I think that 0001 patch can be back-patched.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

Originally, "waiting" is reported in PS if we've been waiting for more than
500 msec, as the above does. But you got rid of those codes in the patch.
Did you confirm that it's safe to do that? If not, isn't it better to apply
the attached patch?
In WaitOnLock() we update the ps title regardless of waiting time. So
I thought we can change it to make these behavior consistent. But
considering back-patch, your patch looks better than mine.
Yeah, so I pushed the 0001 patch at first!
I will review the other patches later.

Thank you!

For 0002 patch which makes ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin update the ps title, I think
these are better to wait for 5ms before updating the ps title like
ResolveRecoveryConflictWithVirtualXIDs, for consistency among recovery
conflict resolution functions, but what do you think?

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#16)

Re: Some problems of recovery conflict wait events

On 2020/03/10 13:54, Masahiko Sawada wrote:

On Tue, 10 Mar 2020 at 00:57, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/09 14:10, Masahiko Sawada wrote:
On Mon, 9 Mar 2020 at 13:24, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/08 13:52, Masahiko Sawada wrote:
On Thu, 5 Mar 2020 at 20:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.

Currently ResolveRecoveryConflictWithLock() and
ResolveRecoveryConflictWithBufferPin() don't call
ResolveRecoveryConflictWithVirtualXIDs and don't report "waiting"
in PS display. You changed them so that they report "waiting". I agree
to have this change. But this change is an improvement rather than
a bug fix, i.e., we should apply this change only in v13?
Did you mean ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin?
Yes! Sorry for my typo.

In the current code as far as I
researched there are two cases where we don't add "waiting" and one
case where we doubly add "waiting".

ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin don't update the ps title.
Although the path where GetCurrentTimestamp() >= ltime is false in
ResolveRecoveryConflictWithLock also doesn't update the ps title, it's
already updated in WaitOnLock. On the other hand, the path where
GetCurrentTimestamp() >= ltime is true in
ResolveRecoveryConflictWithLock updates the ps title but it's wrong as
I reported.

I've split the patch into two patches: 0001 patch fixes the issue
about doubly updating ps title, 0002 patch makes the recovery conflict
resolution on database and buffer pin update the ps title.

Thanks for splitting the patches. I think that 0001 patch can be back-patched.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

Originally, "waiting" is reported in PS if we've been waiting for more than
500 msec, as the above does. But you got rid of those codes in the patch.
Did you confirm that it's safe to do that? If not, isn't it better to apply
the attached patch?
In WaitOnLock() we update the ps title regardless of waiting time. So
I thought we can change it to make these behavior consistent. But
considering back-patch, your patch looks better than mine.
Yeah, so I pushed the 0001 patch at first!
I will review the other patches later.
Thank you!

For 0002 patch which makes ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin update the ps title, I think
these are better to wait for 5ms before updating the ps title like
ResolveRecoveryConflictWithVirtualXIDs, for consistency among recovery
conflict resolution functions, but what do you think?

Maybe yes.

As another idea, for consistency, we can change all
ResolveRecoveryConflictWithXXX() so that they don't wait
at all before reporting "waiting". But if we don't do that,
"waiting" can be reported even when we can immediately
cancel or terminate the conflicting transactions (e.g., in
case of max_standby_streaming_delay=0). To avoid this
issue, I think it's better to wait for 500ms.

The 0002 patch changes ResolveRecoveryConflictWithBufferPin()
so that it updates PS every time. But this seems not good
because the update can happen very frequently. Thought?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#18

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#17)

1 attachment(s)

Re: Some problems of recovery conflict wait events

On Wed, 11 Mar 2020 at 16:42, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/10 13:54, Masahiko Sawada wrote:
On Tue, 10 Mar 2020 at 00:57, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/09 14:10, Masahiko Sawada wrote:
On Mon, 9 Mar 2020 at 13:24, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/08 13:52, Masahiko Sawada wrote:
On Thu, 5 Mar 2020 at 20:16, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.

Currently ResolveRecoveryConflictWithLock() and
ResolveRecoveryConflictWithBufferPin() don't call
ResolveRecoveryConflictWithVirtualXIDs and don't report "waiting"
in PS display. You changed them so that they report "waiting". I agree
to have this change. But this change is an improvement rather than
a bug fix, i.e., we should apply this change only in v13?
Did you mean ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin?
Yes! Sorry for my typo.

In the current code as far as I
researched there are two cases where we don't add "waiting" and one
case where we doubly add "waiting".

ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin don't update the ps title.
Although the path where GetCurrentTimestamp() >= ltime is false in
ResolveRecoveryConflictWithLock also doesn't update the ps title, it's
already updated in WaitOnLock. On the other hand, the path where
GetCurrentTimestamp() >= ltime is true in
ResolveRecoveryConflictWithLock updates the ps title but it's wrong as
I reported.

I've split the patch into two patches: 0001 patch fixes the issue
about doubly updating ps title, 0002 patch makes the recovery conflict
resolution on database and buffer pin update the ps title.

Thanks for splitting the patches. I think that 0001 patch can be back-patched.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

Originally, "waiting" is reported in PS if we've been waiting for more than
500 msec, as the above does. But you got rid of those codes in the patch.
Did you confirm that it's safe to do that? If not, isn't it better to apply
the attached patch?
In WaitOnLock() we update the ps title regardless of waiting time. So
I thought we can change it to make these behavior consistent. But
considering back-patch, your patch looks better than mine.
Yeah, so I pushed the 0001 patch at first!
I will review the other patches later.
Thank you!

For 0002 patch which makes ResolveRecoveryConflictWithDatabase and
ResolveRecoveryConflictWithBufferPin update the ps title, I think
these are better to wait for 5ms before updating the ps title like
ResolveRecoveryConflictWithVirtualXIDs, for consistency among recovery
conflict resolution functions, but what do you think?
Maybe yes.

As another idea, for consistency, we can change all
ResolveRecoveryConflictWithXXX() so that they don't wait
at all before reporting "waiting". But if we don't do that,
"waiting" can be reported even when we can immediately
cancel or terminate the conflicting transactions (e.g., in
case of max_standby_streaming_delay=0). To avoid this
issue, I think it's better to wait for 500ms.

Agreed.

The 0002 patch changes ResolveRecoveryConflictWithBufferPin()
so that it updates PS every time. But this seems not good
because the update can happen very frequently. Thought?

Agreed. In the updated version patch, I update the process title in
LockBufferForCleanup() only once when we've been waiting for more than
500 ms. This change also affects the primary server that is waiting
for buffer cleanup lock. I think it would not be bad but it's
different behaviour from LockBuffer().

I've attached the updated version patch. Please review it.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

update_process_title_v4.patchapplication/octet-stream; name=update_process_title_v4.patchDownload

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 5880054245..b1bcb477c6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -49,6 +49,7 @@
 #include "storage/proc.h"
 #include "storage/smgr.h"
 #include "storage/standby.h"
+#include "utils/ps_status.h"
 #include "utils/rel.h"
 #include "utils/resowner_private.h"
 #include "utils/timestamp.h"
@@ -3669,6 +3670,8 @@ void
 LockBufferForCleanup(Buffer buffer)
 {
 	BufferDesc *bufHdr;
+	TimestampTz waitStart = 0;
+	char		*new_status = NULL;
 
 	Assert(BufferIsValid(buffer));
 	Assert(PinCountWaitBuf == NULL);
@@ -3690,6 +3693,9 @@ LockBufferForCleanup(Buffer buffer)
 
 	bufHdr = GetBufferDescriptor(buffer - 1);
 
+	if (update_process_title)
+		waitStart = GetCurrentTimestamp();
+
 	for (;;)
 	{
 		uint32		buf_state;
@@ -3703,6 +3709,13 @@ LockBufferForCleanup(Buffer buffer)
 		{
 			/* Successfully acquired exclusive lock with pincount 1 */
 			UnlockBufHdr(bufHdr, buf_state);
+
+			/* Reset ps display if we changed it */
+			if (new_status)
+			{
+				set_ps_display(new_status);
+				pfree(new_status);
+			}
 			return;
 		}
 		/* Failed, so mark myself as waiting for pincount 1 */
@@ -3718,6 +3731,22 @@ LockBufferForCleanup(Buffer buffer)
 		UnlockBufHdr(bufHdr, buf_state);
 		LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 
+		/* Report via ps if we have been waiting for more than 500 msec */
+		if (update_process_title && new_status == NULL &&
+			TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
+									   500))
+		{
+			const char *old_status;
+			int			len;
+
+			old_status = get_ps_display(&len);
+			new_status = (char *) palloc(len + 8 + 1);
+			memcpy(new_status, old_status, len);
+			strcpy(new_status + len, " waiting");
+			set_ps_display(new_status);
+			new_status[len] = '\0'; /* truncate off " waiting" */
+		}
+
 		/* Wait to be signaled by UnpinBuffer() */
 		if (InHotStandby)
 		{
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 08f695a980..24fc6de521 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -47,6 +47,7 @@ static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlis
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
+static char *update_process_title_waiting(TimestampTz waitStart);
 
 /*
  * Keep track of all the locks owned by a given transaction.
@@ -244,24 +245,9 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 		/* wait until the virtual xid is gone */
 		while (!VirtualXactLock(*waitlist, false))
 		{
-			/*
-			 * Report via ps if we have been waiting for more than 500 msec
-			 * (should that be configurable?)
-			 */
-			if (update_process_title && new_status == NULL && report_waiting &&
-				TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
-										   500))
-			{
-				const char *old_status;
-				int			len;
-
-				old_status = get_ps_display(&len);
-				new_status = (char *) palloc(len + 8 + 1);
-				memcpy(new_status, old_status, len);
-				strcpy(new_status + len, " waiting");
-				set_ps_display(new_status);
-				new_status[len] = '\0'; /* truncate off " waiting" */
-			}
+			/* Report via ps we are waiting */
+			if (update_process_title && new_status == NULL && report_waiting)
+				new_status = update_process_title_waiting(waitStart);
 
 			/* Is it time to kill it? */
 			if (WaitExceedsMaxStandbyDelay())
@@ -352,6 +338,12 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 void
 ResolveRecoveryConflictWithDatabase(Oid dbid)
 {
+	char		*new_status = NULL;
+	TimestampTz waitStart = 0;
+
+	if (update_process_title)
+		waitStart = GetCurrentTimestamp();
+
 	/*
 	 * We don't do ResolveRecoveryConflictWithVirtualXIDs() here since that
 	 * only waits for transactions and completely idle sessions would block
@@ -365,6 +357,10 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
 	 */
 	while (CountDBBackends(dbid) > 0)
 	{
+		/* Report via ps we are waiting */
+		if (update_process_title && new_status == NULL)
+			new_status = update_process_title_waiting(waitStart);
+
 		CancelDBBackends(dbid, PROCSIG_RECOVERY_CONFLICT_DATABASE, true);
 
 		/*
@@ -373,6 +369,13 @@ ResolveRecoveryConflictWithDatabase(Oid dbid)
 		 */
 		pg_usleep(10000);
 	}
+
+	/* Reset ps display if we changed it */
+	if (new_status)
+	{
+		set_ps_display(new_status);
+		pfree(new_status);
+	}
 }
 
 /*
@@ -1108,3 +1111,33 @@ LogStandbyInvalidations(int nmsgs, SharedInvalidationMessage *msgs,
 					 nmsgs * sizeof(SharedInvalidationMessage));
 	XLogInsert(RM_STANDBY_ID, XLOG_INVALIDATIONS);
 }
+
+/*
+ * Append " waiting" to the process title if we have been waiting for
+ * more than 500 msec since waitStart, and return palloc'd original
+ * process title. (should that be configurable?)
+ */
+static char *
+update_process_title_waiting(TimestampTz waitStart)
+{
+	char	   *new_status = NULL;
+
+	if (!update_process_title)
+		return NULL;
+
+	if (TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
+								   500))
+	{
+		const char *old_status;
+		int			len;
+
+		old_status = get_ps_display(&len);
+		new_status = (char *) palloc(len + 8 + 1);
+		memcpy(new_status, old_status, len);
+		strcpy(new_status + len, " waiting");
+		set_ps_display(new_status);
+		new_status[len] = '\0'; /* truncate off " waiting" */
+	}
+
+	return new_status;
+}

#19

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Fujii Masao (#11)

Re: Some problems of recovery conflict wait events

On 2020/03/05 20:16, Fujii Masao wrote:

On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

-                       /*
-                        * Report via ps if we have been waiting for more than 500 msec
-                        * (should that be configurable?)
-                        */
-                       if (update_process_title && new_status == NULL &&
-                               TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
-                                                                                  500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
   ResolveRecoveryConflictWithDatabase(Oid dbid)
   {
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.

On second thought, in recovery conflict case, "waiting" should be reported
while waiting for the specified delay (e.g., by max_standby_streaming_delay)
until the conflict is resolved. So IMO reporting "waiting" in the case of
recovery conflict with buffer pin, snapshot, lock and tablespace seems valid,
because they are user-visible "expected" wait time.

However, in the case of recovery conflict with database, the recovery
basically doesn't wait at all and just terminates the conflicting sessions
immediately. Then the recovery waits for all those sessions to be terminated,
but that wait time is basically small and should not be the user-visible.
If that wait time becomes very long because of unresponsive backend, ISTM
that LOG or WARNING should be logged instead of reporting something in
PS display. I'm not sure if that logging is really necessary now, though.
Therefore, I'm thinking that "waiting" doesn't need to be reported in the case
of recovery conflict with database. Thought?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#20

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#19)

Re: Some problems of recovery conflict wait events

On Tue, 24 Mar 2020 at 17:04, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/05 20:16, Fujii Masao wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.
On second thought, in recovery conflict case, "waiting" should be reported
while waiting for the specified delay (e.g., by max_standby_streaming_delay)
until the conflict is resolved. So IMO reporting "waiting" in the case of
recovery conflict with buffer pin, snapshot, lock and tablespace seems valid,
because they are user-visible "expected" wait time.

However, in the case of recovery conflict with database, the recovery
basically doesn't wait at all and just terminates the conflicting sessions
immediately. Then the recovery waits for all those sessions to be terminated,
but that wait time is basically small and should not be the user-visible.
If that wait time becomes very long because of unresponsive backend, ISTM
that LOG or WARNING should be logged instead of reporting something in
PS display. I'm not sure if that logging is really necessary now, though.
Therefore, I'm thinking that "waiting" doesn't need to be reported in the case
of recovery conflict with database. Thought?

Fair enough. The longer wait time of conflicts with database is not
user-expected behaviour so logging would be better. I'd like to just
drop the change around ResolveRecoveryConflictWithDatabase() from the
patch. Maybe logging LOG or WARNING for recovery conflict on database
would be a separate patch and need more discussion.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#21

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#20)

1 attachment(s)

Re: Some problems of recovery conflict wait events

On 2020/03/26 14:33, Masahiko Sawada wrote:

On Tue, 24 Mar 2020 at 17:04, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 20:16, Fujii Masao wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.
On second thought, in recovery conflict case, "waiting" should be reported
while waiting for the specified delay (e.g., by max_standby_streaming_delay)
until the conflict is resolved. So IMO reporting "waiting" in the case of
recovery conflict with buffer pin, snapshot, lock and tablespace seems valid,
because they are user-visible "expected" wait time.

However, in the case of recovery conflict with database, the recovery
basically doesn't wait at all and just terminates the conflicting sessions
immediately. Then the recovery waits for all those sessions to be terminated,
but that wait time is basically small and should not be the user-visible.
If that wait time becomes very long because of unresponsive backend, ISTM
that LOG or WARNING should be logged instead of reporting something in
PS display. I'm not sure if that logging is really necessary now, though.
Therefore, I'm thinking that "waiting" doesn't need to be reported in the case
of recovery conflict with database. Thought?
Fair enough. The longer wait time of conflicts with database is not
user-expected behaviour so logging would be better. I'd like to just
drop the change around ResolveRecoveryConflictWithDatabase() from the
patch.

Make sense.

+       if (update_process_title)
+               waitStart = GetCurrentTimestamp();

Since LockBufferForCleanup() can be called very frequently,
I don't think that it's good thing to call GetCurrentTimestamp()
every time when LockBufferForCleanup() is called.

+               /* Report via ps if we have been waiting for more than 500 msec */
+               if (update_process_title && new_status == NULL &&
+                       TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
+                                                                          500))

Do we really want to see "waiting" in PS even in non hot standby mode?

If max_standby_streaming_delay and deadlock_timeout are set to large value,
ResolveRecoveryConflictWithBufferPin() can wait for a long time, e.g.,
more than 500ms. In that case, I'm afraid that "report if we've been
waiting for more than 500ms" logic doesn't work.

So I'm now thinking that it's better to report "waiting" immdiately before
ResolveRecoveryConflictWithBufferPin(). Of course, we can still use
"report if we've been waiting for more than 500ms" logic by teaching 500ms
to ResolveRecoveryConflictWithBufferPin() as the minimum wait time.
But this looks overkill. Thought?

Based on the above comments, I updated the patch. Attached. Right now
the patch looks very simple. Could you review this patch?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Attachments:

update_process_title_v5.patchtext/plain; charset=UTF-8; name=update_process_title_v5.patch; x-mac-creator=0; x-mac-type=0Download

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e05e2b3456..e72d607a23 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -49,6 +49,7 @@
 #include "storage/proc.h"
 #include "storage/smgr.h"
 #include "storage/standby.h"
+#include "utils/ps_status.h"
 #include "utils/rel.h"
 #include "utils/resowner_private.h"
 #include "utils/timestamp.h"
@@ -3616,6 +3617,7 @@ void
 LockBufferForCleanup(Buffer buffer)
 {
 	BufferDesc *bufHdr;
+	char	   *new_status = NULL;
 
 	Assert(BufferIsValid(buffer));
 	Assert(PinCountWaitBuf == NULL);
@@ -3650,6 +3652,13 @@ LockBufferForCleanup(Buffer buffer)
 		{
 			/* Successfully acquired exclusive lock with pincount 1 */
 			UnlockBufHdr(bufHdr, buf_state);
+
+			/* Report change to non-waiting status */
+			if (new_status)
+			{
+				set_ps_display(new_status);
+				pfree(new_status);
+			}
 			return;
 		}
 		/* Failed, so mark myself as waiting for pincount 1 */
@@ -3668,6 +3677,20 @@ LockBufferForCleanup(Buffer buffer)
 		/* Wait to be signaled by UnpinBuffer() */
 		if (InHotStandby)
 		{
+			/* Report change to waiting status */
+			if (update_process_title && new_status == NULL)
+			{
+				const char *old_status;
+				int			len;
+
+				old_status = get_ps_display(&len);
+				new_status = (char *) palloc(len + 8 + 1);
+				memcpy(new_status, old_status, len);
+				strcpy(new_status + len, " waiting");
+				set_ps_display(new_status);
+				new_status[len] = '\0'; /* truncate off " waiting" */
+			}
+
 			/* Publish the bufid that Startup process waits on */
 			SetStartupBufferPinWaitBufId(buffer - 1);
 			/* Set alarm and then wait to be signaled by UnpinBuffer() */

#22

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#21)

Re: Some problems of recovery conflict wait events

On Fri, 27 Mar 2020 at 10:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/26 14:33, Masahiko Sawada wrote:
On Tue, 24 Mar 2020 at 17:04, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 20:16, Fujii Masao wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.
On second thought, in recovery conflict case, "waiting" should be reported
while waiting for the specified delay (e.g., by max_standby_streaming_delay)
until the conflict is resolved. So IMO reporting "waiting" in the case of
recovery conflict with buffer pin, snapshot, lock and tablespace seems valid,
because they are user-visible "expected" wait time.

However, in the case of recovery conflict with database, the recovery
basically doesn't wait at all and just terminates the conflicting sessions
immediately. Then the recovery waits for all those sessions to be terminated,
but that wait time is basically small and should not be the user-visible.
If that wait time becomes very long because of unresponsive backend, ISTM
that LOG or WARNING should be logged instead of reporting something in
PS display. I'm not sure if that logging is really necessary now, though.
Therefore, I'm thinking that "waiting" doesn't need to be reported in the case
of recovery conflict with database. Thought?
Fair enough. The longer wait time of conflicts with database is not
user-expected behaviour so logging would be better. I'd like to just
drop the change around ResolveRecoveryConflictWithDatabase() from the
patch.
Make sense.
+       if (update_process_title)
+               waitStart = GetCurrentTimestamp();
Since LockBufferForCleanup() can be called very frequently,
I don't think that it's good thing to call GetCurrentTimestamp()
every time when LockBufferForCleanup() is called.
+               /* Report via ps if we have been waiting for more than 500 msec */
+               if (update_process_title && new_status == NULL &&
+                       TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
+                                                                          500))
Do we really want to see "waiting" in PS even in non hot standby mode?

If max_standby_streaming_delay and deadlock_timeout are set to large value,
ResolveRecoveryConflictWithBufferPin() can wait for a long time, e.g.,
more than 500ms. In that case, I'm afraid that "report if we've been
waiting for more than 500ms" logic doesn't work.

So I'm now thinking that it's better to report "waiting" immdiately before
ResolveRecoveryConflictWithBufferPin(). Of course, we can still use
"report if we've been waiting for more than 500ms" logic by teaching 500ms
to ResolveRecoveryConflictWithBufferPin() as the minimum wait time.
But this looks overkill. Thought?

Based on the above comments, I updated the patch. Attached. Right now
the patch looks very simple. Could you review this patch?

Thank you for the patch. I agree with you for all the points. Your
patch looks good to me.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#23

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#22)

Re: Some problems of recovery conflict wait events

On 2020/03/27 15:39, Masahiko Sawada wrote:

On Fri, 27 Mar 2020 at 10:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/26 14:33, Masahiko Sawada wrote:
On Tue, 24 Mar 2020 at 17:04, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/05 20:16, Fujii Masao wrote:
On 2020/03/05 16:58, Masahiko Sawada wrote:
On Wed, 4 Mar 2020 at 15:21, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

Thanks for updating the patches! I started reading 0001 patch.

Thank you for reviewing this patch.

- /*
- * Report via ps if we have been waiting for more than 500 msec
- * (should that be configurable?)
- */
- if (update_process_title && new_status == NULL &&
- TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
- 500))

The patch changes ResolveRecoveryConflictWithSnapshot() and
ResolveRecoveryConflictWithTablespace() so that they always add
"waiting" into the PS display, whether wait is really necessary or not.
But isn't it better to display "waiting" in PS basically when wait is
necessary, like originally ResolveRecoveryConflictWithVirtualXIDs()
does as the above?

You're right. Will fix it.
ResolveRecoveryConflictWithDatabase(Oid dbid)
{
+       char            *new_status = NULL;
+
+       /* Report via ps we are waiting */
+       new_status = set_process_title_waiting();
In ResolveRecoveryConflictWithDatabase(), there seems no need to
display "waiting" in PS because no wait occurs when recovery conflict
with database happens.
Isn't the startup process waiting for other backend to terminate?
Yeah, you're right. I agree that "waiting" should be reported in this case.
On second thought, in recovery conflict case, "waiting" should be reported
while waiting for the specified delay (e.g., by max_standby_streaming_delay)
until the conflict is resolved. So IMO reporting "waiting" in the case of
recovery conflict with buffer pin, snapshot, lock and tablespace seems valid,
because they are user-visible "expected" wait time.

However, in the case of recovery conflict with database, the recovery
basically doesn't wait at all and just terminates the conflicting sessions
immediately. Then the recovery waits for all those sessions to be terminated,
but that wait time is basically small and should not be the user-visible.
If that wait time becomes very long because of unresponsive backend, ISTM
that LOG or WARNING should be logged instead of reporting something in
PS display. I'm not sure if that logging is really necessary now, though.
Therefore, I'm thinking that "waiting" doesn't need to be reported in the case
of recovery conflict with database. Thought?
Fair enough. The longer wait time of conflicts with database is not
user-expected behaviour so logging would be better. I'd like to just
drop the change around ResolveRecoveryConflictWithDatabase() from the
patch.
Make sense.
+       if (update_process_title)
+               waitStart = GetCurrentTimestamp();
Since LockBufferForCleanup() can be called very frequently,
I don't think that it's good thing to call GetCurrentTimestamp()
every time when LockBufferForCleanup() is called.
+               /* Report via ps if we have been waiting for more than 500 msec */
+               if (update_process_title && new_status == NULL &&
+                       TimestampDifferenceExceeds(waitStart, GetCurrentTimestamp(),
+                                                                          500))
Do we really want to see "waiting" in PS even in non hot standby mode?

If max_standby_streaming_delay and deadlock_timeout are set to large value,
ResolveRecoveryConflictWithBufferPin() can wait for a long time, e.g.,
more than 500ms. In that case, I'm afraid that "report if we've been
waiting for more than 500ms" logic doesn't work.

So I'm now thinking that it's better to report "waiting" immdiately before
ResolveRecoveryConflictWithBufferPin(). Of course, we can still use
"report if we've been waiting for more than 500ms" logic by teaching 500ms
to ResolveRecoveryConflictWithBufferPin() as the minimum wait time.
But this looks overkill. Thought?

Based on the above comments, I updated the patch. Attached. Right now
the patch looks very simple. Could you review this patch?
Thank you for the patch. I agree with you for all the points. Your
patch looks good to me.

Thanks for the review! Barring any objections, I will commit the latest patch.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#24

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#8)

Re: Some problems of recovery conflict wait events

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.

-	ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+	ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);

Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?

-	/* Wait to be signaled by the release of the Relation Lock */
-	ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+		/* Wait to be signaled by the release of the Relation Lock */
+		ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);

Same as above. Isn't the existing wait event enough?

-	/*
-	 * Progressively increase the sleep times, but not to more than 1s, since
-	 * pg_usleep isn't interruptible on some platforms.
-	 */
-	standbyWait_us *= 2;
-	if (standbyWait_us > 1000000)
-		standbyWait_us = 1000000;
+	WaitLatch(MyLatch,
+			  WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+			  STANDBY_WAIT_MS,
+			  wait_event_info);
+	ResetLatch(MyLatch);

ResetLatch() should be called before WaitLatch()?

Could you tell me why you dropped the "increase-sleep-times" logic?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

#25

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#24)

1 attachment(s)

Re: Some problems of recovery conflict wait events

On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
-       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
Same as above. Isn't the existing wait event enough?

Yeah, we can use the existing wait events for buffer pin and lock.

-       /*
-        * Progressively increase the sleep times, but not to more than 1s, since
-        * pg_usleep isn't interruptible on some platforms.
-        */
-       standbyWait_us *= 2;
-       if (standbyWait_us > 1000000)
-               standbyWait_us = 1000000;
+       WaitLatch(MyLatch,
+                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+                         STANDBY_WAIT_MS,
+                         wait_event_info);
+       ResetLatch(MyLatch);

ResetLatch() should be called before WaitLatch()?

Fixed.

Could you tell me why you dropped the "increase-sleep-times" logic?

I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.

Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace. I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

recovery_conflict_wait_event_v3.patchapplication/octet-stream; name=recovery_conflict_wait_event_v3.patchDownload

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 220b8164c3..83534edc82 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -789,6 +789,13 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
           <literal>wait_event</literal> will identify the specific wait point.
          </para>
         </listitem>
+        <listitem>
+         <para>
+          <literal>RecoveryConflict</literal>: The server process is waiting for a
+          recovery conflict resolution.  <literal>wait_event</literal> will identify
+          the specific wait point.
+         </para>
+        </listitem>
        </itemizedlist>
       </entry>
      </row>
@@ -1797,6 +1804,15 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
          <entry><literal>WALWrite</literal></entry>
          <entry>Waiting for a write to a WAL file.</entry>
         </row>
+        <row>
+         <entry morerows="2"><literal>RecoveryConflict</literal></entry>
+         <entry><literal>Snapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>Tablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
       </tbody>
      </tgroup>
     </table>
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index ab42df7e1b..129ce4cc81 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -308,6 +308,7 @@ static const char *pgstat_get_wait_client(WaitEventClient w);
 static const char *pgstat_get_wait_ipc(WaitEventIPC w);
 static const char *pgstat_get_wait_timeout(WaitEventTimeout w);
 static const char *pgstat_get_wait_io(WaitEventIO w);
+static const char *pgstat_get_wait_recovery_conflict(WaitEventRecoveryConflict w);
 
 static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype);
 static void pgstat_send(void *msg, int len);
@@ -3480,6 +3481,9 @@ pgstat_get_wait_event_type(uint32 wait_event_info)
 		case PG_WAIT_IO:
 			event_type = "IO";
 			break;
+		case PG_WAIT_RECOVERY_CONFLICT:
+			event_type = "RecoveryConflict";
+			break;
 		default:
 			event_type = "???";
 			break;
@@ -3557,6 +3561,14 @@ pgstat_get_wait_event(uint32 wait_event_info)
 				event_name = pgstat_get_wait_io(w);
 				break;
 			}
+		case PG_WAIT_RECOVERY_CONFLICT:
+			{
+				WaitEventRecoveryConflict w =
+					(WaitEventRecoveryConflict) wait_event_info;
+
+				event_name = pgstat_get_wait_recovery_conflict(w);
+				break;
+			}
 		default:
 			event_name = "unknown wait event";
 			break;
@@ -4066,6 +4078,32 @@ pgstat_get_wait_io(WaitEventIO w)
 	return event_name;
 }
 
+/* ----------
+ * pgstat_get_wait_recovery_conflict() -
+ *
+ * Convert WaitEventRecoveryConflict to string.
+ * ----------
+ */
+static const char *
+pgstat_get_wait_recovery_conflict(WaitEventRecoveryConflict w)
+{
+	const char *event_name = "unknown wait event";
+
+	switch (w)
+	{
+		case WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT:
+			event_name = "Snapshot";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE:
+			event_name = "Tablespace";
+			break;
+		default:
+			event_name = "unknown wait event";
+			break;
+	}
+
+	return event_name;
+}
 
 /* ----------
  * pgstat_get_backend_current_activity() -
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 08f695a980..63dfa8f26a 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -43,7 +43,9 @@ int			max_standby_streaming_delay = 30 * 1000;
 static HTAB *RecoveryLockLists;
 
 static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-												   ProcSignalReason reason, bool report_waiting);
+												   ProcSignalReason reason,
+												   uint32 wait_event_info,
+												   bool report_waiting);
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
@@ -184,7 +186,7 @@ static int	standbyWait_us = STANDBY_INITIAL_WAIT_US;
  * more then we return true, if we can wait some more return false.
  */
 static bool
-WaitExceedsMaxStandbyDelay(void)
+WaitExceedsMaxStandbyDelay(uint32 wait_event_info)
 {
 	TimestampTz ltime;
 
@@ -198,11 +200,14 @@ WaitExceedsMaxStandbyDelay(void)
 	/*
 	 * Sleep a bit (this is essential to avoid busy-waiting).
 	 */
-	pg_usleep(standbyWait_us);
+	ResetLatch(MyLatch);
+	WaitLatch(MyLatch,
+			  WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+			  standbyWait_us,
+			  wait_event_info);
 
 	/*
-	 * Progressively increase the sleep times, but not to more than 1s, since
-	 * pg_usleep isn't interruptible on some platforms.
+	 * Progressively increase the sleep times, but not to more than 1s.
 	 */
 	standbyWait_us *= 2;
 	if (standbyWait_us > 1000000)
@@ -223,7 +228,8 @@ WaitExceedsMaxStandbyDelay(void)
  */
 static void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-									   ProcSignalReason reason, bool report_waiting)
+									   ProcSignalReason reason, uint32 wait_event_info,
+									   bool report_waiting)
 {
 	TimestampTz waitStart = 0;
 	char	   *new_status;
@@ -264,7 +270,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 			}
 
 			/* Is it time to kill it? */
-			if (WaitExceedsMaxStandbyDelay())
+			if (WaitExceedsMaxStandbyDelay(wait_event_info))
 			{
 				pid_t		pid;
 
@@ -317,6 +323,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
 										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
+										   WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT,
 										   true);
 }
 
@@ -346,6 +353,7 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 												InvalidOid);
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
 										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
+										   WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE,
 										   true);
 }
 
@@ -417,6 +425,7 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		 */
 		ResolveRecoveryConflictWithVirtualXIDs(backends,
 											   PROCSIG_RECOVERY_CONFLICT_LOCK,
+											   PG_WAIT_LOCK | locktag.locktag_type,
 											   false);
 	}
 	else
@@ -430,10 +439,10 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		timeouts[0].type = TMPARAM_AT;
 		timeouts[0].fin_time = ltime;
 		enable_timeouts(timeouts, 1);
-	}
 
-	/* Wait to be signaled by the release of the Relation Lock */
-	ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+		/* Wait to be signaled by the release of the Relation Lock */
+		ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+	}
 
 	/*
 	 * Clear any timeout requests established above.  We assume here that the
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 763c1ee2bd..6c1c8bff7c 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -743,6 +743,7 @@ typedef enum BackendState
 #define PG_WAIT_IPC					0x08000000U
 #define PG_WAIT_TIMEOUT				0x09000000U
 #define PG_WAIT_IO					0x0A000000U
+#define PG_WAIT_RECOVERY_CONFLICT	0x0B000000U
 
 /* ----------
  * Wait Events - Activity
@@ -934,6 +935,19 @@ typedef enum
 	WAIT_EVENT_WAL_WRITE
 } WaitEventIO;
 
+/* ----------
+ * Wait Events - Recovery Conflict
+ *
+ * Use this category when a process is waiting for a recovery conflict
+ * resolution.
+ * ----------
+ */
+typedef enum
+{
+	WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT = PG_WAIT_RECOVERY_CONFLICT,
+	WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE,
+} WaitEventRecoveryConflict;
+
 /* ----------
  * Command type for progress reporting purposes
  * ----------

#26

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#25)

Re: Some problems of recovery conflict wait events

On 2020/03/30 20:10, Masahiko Sawada wrote:

On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
-       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock.
-       /*
-        * Progressively increase the sleep times, but not to more than 1s, since
-        * pg_usleep isn't interruptible on some platforms.
-        */
-       standbyWait_us *= 2;
-       if (standbyWait_us > 1000000)
-               standbyWait_us = 1000000;
+       WaitLatch(MyLatch,
+                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+                         STANDBY_WAIT_MS,
+                         wait_event_info);
+       ResetLatch(MyLatch);
ResetLatch() should be called before WaitLatch()?
Fixed.

Could you tell me why you dropped the "increase-sleep-times" logic?

I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.

So we don't need to use WaitLatch() there, i.e., it's ok to keep using
pg_usleep()?

Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace.

Many thanks for updating the patch!

-	/* Wait to be signaled by the release of the Relation Lock */
-	ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+		/* Wait to be signaled by the release of the Relation Lock */
+		ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+	}

Is this change really valid? What happens if the latch is set during
ResolveRecoveryConflictWithVirtualXIDs()?
ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.

+		default:
+			event_name = "unknown wait event";
+			break;

Seems this default case should be removed. Please see other
pgstat_get_wait_xxx() function, so there is no such code.

I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.

Is it worth defining new wait event type RecoveryConflict only for
those two events? ISTM that it's ok to use IPC event type here.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#27

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#26)

1 attachment(s)

Re: Some problems of recovery conflict wait events

On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/30 20:10, Masahiko Sawada wrote:
On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
-       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock.
-       /*
-        * Progressively increase the sleep times, but not to more than 1s, since
-        * pg_usleep isn't interruptible on some platforms.
-        */
-       standbyWait_us *= 2;
-       if (standbyWait_us > 1000000)
-               standbyWait_us = 1000000;
+       WaitLatch(MyLatch,
+                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+                         STANDBY_WAIT_MS,
+                         wait_event_info);
+       ResetLatch(MyLatch);
ResetLatch() should be called before WaitLatch()?
Fixed.

Could you tell me why you dropped the "increase-sleep-times" logic?

I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.
So we don't need to use WaitLatch() there, i.e., it's ok to keep using
pg_usleep()?

Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace.

Many thanks for updating the patch!
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+       }
Is this change really valid? What happens if the latch is set during
ResolveRecoveryConflictWithVirtualXIDs()?
ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.

Thank you for reviewing the patch!

You're right. It's better to keep using pg_usleep() and set the wait
event by pgstat_report_wait_start().

+               default:
+                       event_name = "unknown wait event";
+                       break;
Seems this default case should be removed. Please see other
pgstat_get_wait_xxx() function, so there is no such code.

I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.

Is it worth defining new wait event type RecoveryConflict only for
those two events? ISTM that it's ok to use IPC event type here.

I dropped a new wait even type and added them to IPC wait event type.

I've attached the new version patch.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

recovery_conflict_wait_event_v4.patchapplication/octet-stream; name=recovery_conflict_wait_event_v4.patchDownload

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 28ceb04d33..dcf52447ce 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1486,6 +1486,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
          <entry><literal>RecoveryPause</literal></entry>
          <entry>Waiting for recovery to be resumed.</entry>
         </row>
+        <row>
+         <entry><literal>RecoveryConflictSnapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>RecoveryConflictTablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
         <row>
          <entry><literal>ReplicationOriginDrop</literal></entry>
          <entry>Waiting for a replication origin to become inactive to be dropped.</entry>
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 04274056ca..9ebde47dea 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3852,6 +3852,12 @@ pgstat_get_wait_ipc(WaitEventIPC w)
 		case WAIT_EVENT_PROMOTE:
 			event_name = "Promote";
 			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT:
+			event_name = "RecoveryConflictSnapshot";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE:
+			event_name = "RecoveryConflictTablespace";
+			break;
 		case WAIT_EVENT_RECOVERY_PAUSE:
 			event_name = "RecoveryPause";
 			break;
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 08f695a980..bdaf10a4b1 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -43,7 +43,9 @@ int			max_standby_streaming_delay = 30 * 1000;
 static HTAB *RecoveryLockLists;
 
 static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-												   ProcSignalReason reason, bool report_waiting);
+												   ProcSignalReason reason,
+												   uint32 wait_event_info,
+												   bool report_waiting);
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
@@ -184,7 +186,7 @@ static int	standbyWait_us = STANDBY_INITIAL_WAIT_US;
  * more then we return true, if we can wait some more return false.
  */
 static bool
-WaitExceedsMaxStandbyDelay(void)
+WaitExceedsMaxStandbyDelay(uint32 wait_event_info)
 {
 	TimestampTz ltime;
 
@@ -198,7 +200,9 @@ WaitExceedsMaxStandbyDelay(void)
 	/*
 	 * Sleep a bit (this is essential to avoid busy-waiting).
 	 */
+	pgstat_report_wait_start(wait_event_info);
 	pg_usleep(standbyWait_us);
+	pgstat_report_wait_end();
 
 	/*
 	 * Progressively increase the sleep times, but not to more than 1s, since
@@ -223,7 +227,8 @@ WaitExceedsMaxStandbyDelay(void)
  */
 static void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-									   ProcSignalReason reason, bool report_waiting)
+									   ProcSignalReason reason, uint32 wait_event_info,
+									   bool report_waiting)
 {
 	TimestampTz waitStart = 0;
 	char	   *new_status;
@@ -264,7 +269,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 			}
 
 			/* Is it time to kill it? */
-			if (WaitExceedsMaxStandbyDelay())
+			if (WaitExceedsMaxStandbyDelay(wait_event_info))
 			{
 				pid_t		pid;
 
@@ -317,6 +322,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
 										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
+										   WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT,
 										   true);
 }
 
@@ -346,6 +352,7 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 												InvalidOid);
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
 										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
+										   WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE,
 										   true);
 }
 
@@ -417,6 +424,7 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		 */
 		ResolveRecoveryConflictWithVirtualXIDs(backends,
 											   PROCSIG_RECOVERY_CONFLICT_LOCK,
+											   PG_WAIT_LOCK | locktag.locktag_type,
 											   false);
 	}
 	else
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9d351e7714..b8041d9988 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -881,6 +881,8 @@ typedef enum
 	WAIT_EVENT_PARALLEL_FINISH,
 	WAIT_EVENT_PROCARRAY_GROUP_UPDATE,
 	WAIT_EVENT_PROMOTE,
+	WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT,
+	WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE,
 	WAIT_EVENT_RECOVERY_PAUSE,
 	WAIT_EVENT_REPLICATION_ORIGIN_DROP,
 	WAIT_EVENT_REPLICATION_SLOT_DROP,

#28

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#27)

Re: Some problems of recovery conflict wait events

On 2020/04/02 14:25, Masahiko Sawada wrote:

On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/30 20:10, Masahiko Sawada wrote:
On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
-       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock.
-       /*
-        * Progressively increase the sleep times, but not to more than 1s, since
-        * pg_usleep isn't interruptible on some platforms.
-        */
-       standbyWait_us *= 2;
-       if (standbyWait_us > 1000000)
-               standbyWait_us = 1000000;
+       WaitLatch(MyLatch,
+                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+                         STANDBY_WAIT_MS,
+                         wait_event_info);
+       ResetLatch(MyLatch);
ResetLatch() should be called before WaitLatch()?
Fixed.

Could you tell me why you dropped the "increase-sleep-times" logic?

I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.
So we don't need to use WaitLatch() there, i.e., it's ok to keep using
pg_usleep()?

Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace.

Many thanks for updating the patch!
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+       }
Is this change really valid? What happens if the latch is set during
ResolveRecoveryConflictWithVirtualXIDs()?
ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
Thank you for reviewing the patch!

You're right. It's better to keep using pg_usleep() and set the wait
event by pgstat_report_wait_start().
+               default:
+                       event_name = "unknown wait event";
+                       break;
Seems this default case should be removed. Please see other
pgstat_get_wait_xxx() function, so there is no such code.

I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.

Is it worth defining new wait event type RecoveryConflict only for
those two events? ISTM that it's ok to use IPC event type here.
I dropped a new wait even type and added them to IPC wait event type.

I've attached the new version patch.

Thanks for updating the patch! The patch looks good to me except
the following mior things.

+        <row>
+         <entry><literal>RecoveryConflictSnapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>RecoveryConflictTablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>

You need to increment the value of "morerows" in
"<entry morerows="38"><literal>IPC</literal></entry>".

The descriptions of those two events should be placed in alphabetical order
for event name. That is, they should be placed above RecoveryPause.

"vacuum cleanup" is better than "physical cleanup"?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#29

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#28)

1 attachment(s)

Re: Some problems of recovery conflict wait events

On Thu, 2 Apr 2020 at 15:34, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/04/02 14:25, Masahiko Sawada wrote:
On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/30 20:10, Masahiko Sawada wrote:
On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
-       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock.
-       /*
-        * Progressively increase the sleep times, but not to more than 1s, since
-        * pg_usleep isn't interruptible on some platforms.
-        */
-       standbyWait_us *= 2;
-       if (standbyWait_us > 1000000)
-               standbyWait_us = 1000000;
+       WaitLatch(MyLatch,
+                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+                         STANDBY_WAIT_MS,
+                         wait_event_info);
+       ResetLatch(MyLatch);
ResetLatch() should be called before WaitLatch()?
Fixed.

Could you tell me why you dropped the "increase-sleep-times" logic?

I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.
So we don't need to use WaitLatch() there, i.e., it's ok to keep using
pg_usleep()?

Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace.

Many thanks for updating the patch!
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+       }
Is this change really valid? What happens if the latch is set during
ResolveRecoveryConflictWithVirtualXIDs()?
ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
Thank you for reviewing the patch!

You're right. It's better to keep using pg_usleep() and set the wait
event by pgstat_report_wait_start().
+               default:
+                       event_name = "unknown wait event";
+                       break;
Seems this default case should be removed. Please see other
pgstat_get_wait_xxx() function, so there is no such code.

I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.

Is it worth defining new wait event type RecoveryConflict only for
those two events? ISTM that it's ok to use IPC event type here.
I dropped a new wait even type and added them to IPC wait event type.

I've attached the new version patch.
Thanks for updating the patch! The patch looks good to me except
the following mior things.
+        <row>
+         <entry><literal>RecoveryConflictSnapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>RecoveryConflictTablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
You need to increment the value of "morerows" in
"<entry morerows="38"><literal>IPC</literal></entry>".

The descriptions of those two events should be placed in alphabetical order
for event name. That is, they should be placed above RecoveryPause.

"vacuum cleanup" is better than "physical cleanup"?

Agreed.

I've attached the updated version patch.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

recovery_conflict_wait_event_v5.patchapplication/octet-stream; name=recovery_conflict_wait_event_v5.patchDownload

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 28ceb04d33..a55b567783 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1346,7 +1346,7 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
          <entry>Waiting in an extension.</entry>
         </row>
         <row>
-         <entry morerows="38"><literal>IPC</literal></entry>
+         <entry morerows="40"><literal>IPC</literal></entry>
          <entry><literal>BackupWaitWalArchive</literal></entry>
          <entry>Waiting for WAL files required for the backup to be successfully archived.</entry>
         </row>
@@ -1482,6 +1482,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
          <entry><literal>Promote</literal></entry>
          <entry>Waiting for standby promotion.</entry>
         </row>
+        <row>
+         <entry><literal>RecoveryConflictSnapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a vacuum cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>RecoveryConflictTablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
         <row>
          <entry><literal>RecoveryPause</literal></entry>
          <entry>Waiting for recovery to be resumed.</entry>
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 04274056ca..9ebde47dea 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3852,6 +3852,12 @@ pgstat_get_wait_ipc(WaitEventIPC w)
 		case WAIT_EVENT_PROMOTE:
 			event_name = "Promote";
 			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT:
+			event_name = "RecoveryConflictSnapshot";
+			break;
+		case WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE:
+			event_name = "RecoveryConflictTablespace";
+			break;
 		case WAIT_EVENT_RECOVERY_PAUSE:
 			event_name = "RecoveryPause";
 			break;
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 08f695a980..bdaf10a4b1 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -43,7 +43,9 @@ int			max_standby_streaming_delay = 30 * 1000;
 static HTAB *RecoveryLockLists;
 
 static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-												   ProcSignalReason reason, bool report_waiting);
+												   ProcSignalReason reason,
+												   uint32 wait_event_info,
+												   bool report_waiting);
 static void SendRecoveryConflictWithBufferPin(ProcSignalReason reason);
 static XLogRecPtr LogCurrentRunningXacts(RunningTransactions CurrRunningXacts);
 static void LogAccessExclusiveLocks(int nlocks, xl_standby_lock *locks);
@@ -184,7 +186,7 @@ static int	standbyWait_us = STANDBY_INITIAL_WAIT_US;
  * more then we return true, if we can wait some more return false.
  */
 static bool
-WaitExceedsMaxStandbyDelay(void)
+WaitExceedsMaxStandbyDelay(uint32 wait_event_info)
 {
 	TimestampTz ltime;
 
@@ -198,7 +200,9 @@ WaitExceedsMaxStandbyDelay(void)
 	/*
 	 * Sleep a bit (this is essential to avoid busy-waiting).
 	 */
+	pgstat_report_wait_start(wait_event_info);
 	pg_usleep(standbyWait_us);
+	pgstat_report_wait_end();
 
 	/*
 	 * Progressively increase the sleep times, but not to more than 1s, since
@@ -223,7 +227,8 @@ WaitExceedsMaxStandbyDelay(void)
  */
 static void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
-									   ProcSignalReason reason, bool report_waiting)
+									   ProcSignalReason reason, uint32 wait_event_info,
+									   bool report_waiting)
 {
 	TimestampTz waitStart = 0;
 	char	   *new_status;
@@ -264,7 +269,7 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 			}
 
 			/* Is it time to kill it? */
-			if (WaitExceedsMaxStandbyDelay())
+			if (WaitExceedsMaxStandbyDelay(wait_event_info))
 			{
 				pid_t		pid;
 
@@ -317,6 +322,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode
 
 	ResolveRecoveryConflictWithVirtualXIDs(backends,
 										   PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
+										   WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT,
 										   true);
 }
 
@@ -346,6 +352,7 @@ ResolveRecoveryConflictWithTablespace(Oid tsid)
 												InvalidOid);
 	ResolveRecoveryConflictWithVirtualXIDs(temp_file_users,
 										   PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
+										   WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE,
 										   true);
 }
 
@@ -417,6 +424,7 @@ ResolveRecoveryConflictWithLock(LOCKTAG locktag)
 		 */
 		ResolveRecoveryConflictWithVirtualXIDs(backends,
 											   PROCSIG_RECOVERY_CONFLICT_LOCK,
+											   PG_WAIT_LOCK | locktag.locktag_type,
 											   false);
 	}
 	else
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9d351e7714..b8041d9988 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -881,6 +881,8 @@ typedef enum
 	WAIT_EVENT_PARALLEL_FINISH,
 	WAIT_EVENT_PROCARRAY_GROUP_UPDATE,
 	WAIT_EVENT_PROMOTE,
+	WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT,
+	WAIT_EVENT_RECOVERY_CONFLICT_TABLESPACE,
 	WAIT_EVENT_RECOVERY_PAUSE,
 	WAIT_EVENT_REPLICATION_ORIGIN_DROP,
 	WAIT_EVENT_REPLICATION_SLOT_DROP,

#30

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Masahiko Sawada (#29)

Re: Some problems of recovery conflict wait events

On 2020/04/02 15:54, Masahiko Sawada wrote:

On Thu, 2 Apr 2020 at 15:34, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/04/02 14:25, Masahiko Sawada wrote:
On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/30 20:10, Masahiko Sawada wrote:
On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
-       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock.
-       /*
-        * Progressively increase the sleep times, but not to more than 1s, since
-        * pg_usleep isn't interruptible on some platforms.
-        */
-       standbyWait_us *= 2;
-       if (standbyWait_us > 1000000)
-               standbyWait_us = 1000000;
+       WaitLatch(MyLatch,
+                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+                         STANDBY_WAIT_MS,
+                         wait_event_info);
+       ResetLatch(MyLatch);
ResetLatch() should be called before WaitLatch()?
Fixed.

Could you tell me why you dropped the "increase-sleep-times" logic?

I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.
So we don't need to use WaitLatch() there, i.e., it's ok to keep using
pg_usleep()?

Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace.

Many thanks for updating the patch!
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+       }
Is this change really valid? What happens if the latch is set during
ResolveRecoveryConflictWithVirtualXIDs()?
ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
Thank you for reviewing the patch!

You're right. It's better to keep using pg_usleep() and set the wait
event by pgstat_report_wait_start().
+               default:
+                       event_name = "unknown wait event";
+                       break;
Seems this default case should be removed. Please see other
pgstat_get_wait_xxx() function, so there is no such code.

I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.

Is it worth defining new wait event type RecoveryConflict only for
those two events? ISTM that it's ok to use IPC event type here.
I dropped a new wait even type and added them to IPC wait event type.

I've attached the new version patch.
Thanks for updating the patch! The patch looks good to me except
the following mior things.
+        <row>
+         <entry><literal>RecoveryConflictSnapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>RecoveryConflictTablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
You need to increment the value of "morerows" in
"<entry morerows="38"><literal>IPC</literal></entry>".

The descriptions of those two events should be placed in alphabetical order
for event name. That is, they should be placed above RecoveryPause.

"vacuum cleanup" is better than "physical cleanup"?
Agreed.

I've attached the updated version patch.

Thanks! Looks good to me. Barring any objection, I will commit this patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#31

Fujii Masao

masao.fujii@oss.nttdata.com

almost 6 years ago

In reply to: Fujii Masao (#30)

Re: Some problems of recovery conflict wait events

On 2020/04/02 16:12, Fujii Masao wrote:

On 2020/04/02 15:54, Masahiko Sawada wrote:
On Thu, 2 Apr 2020 at 15:34, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/04/02 14:25, Masahiko Sawada wrote:
On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/30 20:10, Masahiko Sawada wrote:
On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
-       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock.
-       /*
-        * Progressively increase the sleep times, but not to more than 1s, since
-        * pg_usleep isn't interruptible on some platforms.
-        */
-       standbyWait_us *= 2;
-       if (standbyWait_us > 1000000)
-               standbyWait_us = 1000000;
+       WaitLatch(MyLatch,
+                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+                         STANDBY_WAIT_MS,
+                         wait_event_info);
+       ResetLatch(MyLatch);
ResetLatch() should be called before WaitLatch()?
Fixed.

Could you tell me why you dropped the "increase-sleep-times" logic?

I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.
So we don't need to use WaitLatch() there, i.e., it's ok to keep using
pg_usleep()?

Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace.

Many thanks for updating the patch!
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+       }
Is this change really valid? What happens if the latch is set during
ResolveRecoveryConflictWithVirtualXIDs()?
ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
Thank you for reviewing the patch!

You're right. It's better to keep using pg_usleep() and set the wait
event by pgstat_report_wait_start().
+               default:
+                       event_name = "unknown wait event";
+                       break;
Seems this default case should be removed. Please see other
pgstat_get_wait_xxx() function, so there is no such code.

I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.

Is it worth defining new wait event type RecoveryConflict only for
those two events? ISTM that it's ok to use IPC event type here.
I dropped a new wait even type and added them to IPC wait event type.

I've attached the new version patch.
Thanks for updating the patch! The patch looks good to me except
the following mior things.
+        <row>
+         <entry><literal>RecoveryConflictSnapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>RecoveryConflictTablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
You need to increment the value of "morerows" in
"<entry morerows="38"><literal>IPC</literal></entry>".

The descriptions of those two events should be placed in alphabetical order
for event name. That is, they should be placed above RecoveryPause.

"vacuum cleanup" is better than "physical cleanup"?
Agreed.

I've attached the updated version patch.
Thanks! Looks good to me. Barring any objection, I will commit this patch.

Pushed! Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#32

Masahiko Sawada

masahiko.sawada@2ndquadrant.com

almost 6 years ago

In reply to: Fujii Masao (#31)

Re: Some problems of recovery conflict wait events

On Fri, 3 Apr 2020 at 12:28, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/04/02 16:12, Fujii Masao wrote:
On 2020/04/02 15:54, Masahiko Sawada wrote:
On Thu, 2 Apr 2020 at 15:34, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/04/02 14:25, Masahiko Sawada wrote:
On Wed, 1 Apr 2020 at 22:32, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/30 20:10, Masahiko Sawada wrote:
On Fri, 27 Mar 2020 at 17:54, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/03/04 14:31, Masahiko Sawada wrote:

On Wed, 4 Mar 2020 at 13:48, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2020/03/04 13:27, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 01:13:19PM +0900, Masahiko Sawada wrote:

Yeah, so 0001 patch sets existing wait events to recovery conflict
resolution. For instance, it sets (PG_WAIT_LOCK | LOCKTAG_TRANSACTION)
to the recovery conflict on a snapshot. 0003 patch improves these wait
events by adding the new type of wait event such as
WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT. Therefore 0001 (and 0002) patch
is the fix for existing versions and 0003 patch is an improvement for
only PG13. Did you mean even 0001 patch doesn't fit for back-patching?

Yes, it looks like a improvement rather than bug fix.

Okay, understand.

I got my eyes on this patch set. The full patch set is in my opinion
just a set of improvements, and not bug fixes, so I would refrain from
back-backpatching.

I think that the issue (i.e., "waiting" is reported twice needlessly
in PS display) that 0002 patch tries to fix is a bug. So it should be
fixed even in the back branches.

So we need only two patches: one fixes process title issue and another
improve wait event. I've attached updated patches.

I started reading v2-0002-Improve-wait-events-for-recovery-conflict-resolut.patch.
-       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
+       ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_BUFFER_PIN);
Currently the wait event indicating the wait for buffer pin has already
been reported. But the above change in the patch changes the name of
wait event for buffer pin only in the startup process. Is this really useful?
Isn't the existing wait event for buffer pin enough?
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(WAIT_EVENT_RECOVERY_CONFLICT_LOCK);
Same as above. Isn't the existing wait event enough?
Yeah, we can use the existing wait events for buffer pin and lock.
-       /*
-        * Progressively increase the sleep times, but not to more than 1s, since
-        * pg_usleep isn't interruptible on some platforms.
-        */
-       standbyWait_us *= 2;
-       if (standbyWait_us > 1000000)
-               standbyWait_us = 1000000;
+       WaitLatch(MyLatch,
+                         WL_LATCH_SET | WL_POSTMASTER_DEATH | WL_TIMEOUT,
+                         STANDBY_WAIT_MS,
+                         wait_event_info);
+       ResetLatch(MyLatch);
ResetLatch() should be called before WaitLatch()?
Fixed.

Could you tell me why you dropped the "increase-sleep-times" logic?

I thought we can remove it because WaitLatch is interruptible but my
observation was not correct. The waiting startup process is not
necessarily woken up by signal. I think it's still better to not wait
more than 1 sec even if it's an interruptible wait.
So we don't need to use WaitLatch() there, i.e., it's ok to keep using
pg_usleep()?

Attached patch fixes the above and introduces only two wait events of
conflict resolution: snapshot and tablespace.

Many thanks for updating the patch!
-       /* Wait to be signaled by the release of the Relation Lock */
-       ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+               /* Wait to be signaled by the release of the Relation Lock */
+               ProcWaitForSignal(PG_WAIT_LOCK | locktag.locktag_type);
+       }
Is this change really valid? What happens if the latch is set during
ResolveRecoveryConflictWithVirtualXIDs()?
ResolveRecoveryConflictWithVirtualXIDs() can return after the latch
is set but before WaitLatch() in WaitExceedsMaxStandbyDelay() is reached.
Thank you for reviewing the patch!

You're right. It's better to keep using pg_usleep() and set the wait
event by pgstat_report_wait_start().
+               default:
+                       event_name = "unknown wait event";
+                       break;
Seems this default case should be removed. Please see other
pgstat_get_wait_xxx() function, so there is no such code.

I also removed the wait
event of conflict resolution of database since it's unlikely to become
a user-visible and a long sleep as we discussed before.

Is it worth defining new wait event type RecoveryConflict only for
those two events? ISTM that it's ok to use IPC event type here.
I dropped a new wait even type and added them to IPC wait event type.

I've attached the new version patch.
Thanks for updating the patch! The patch looks good to me except
the following mior things.
+        <row>
+         <entry><literal>RecoveryConflictSnapshot</literal></entry>
+         <entry>Waiting for recovery conflict resolution on a physical cleanup.</entry>
+        </row>
+        <row>
+         <entry><literal>RecoveryConflictTablespace</literal></entry>
+         <entry>Waiting for recovery conflict resolution on dropping tablespace.</entry>
+        </row>
You need to increment the value of "morerows" in
"<entry morerows="38"><literal>IPC</literal></entry>".

The descriptions of those two events should be placed in alphabetical order
for event name. That is, they should be placed above RecoveryPause.

"vacuum cleanup" is better than "physical cleanup"?
Agreed.

I've attached the updated version patch.
Thanks! Looks good to me. Barring any objection, I will commit this patch.
Pushed! Thanks!

Thank you so much!

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services