Add ps display while waiting for wal in read_local_xlog_page_guts

Started by sirisha chamarthi, almost 3 years ago, 4 messages
#1 sirisha chamarthi
sirichamarthi22@gmail.com
1 attachment(s)

Hi,

pg_create_logical_replication_slot can take longer than usual on a standby
when there is no activity on the primary. We don't have enough information
in the pg_stat_activity or process title to debug why this is taking so
long. Attached a small patch to update the process title while waiting for
the wal in read_local_xlog_page_guts. Any thoughts on introducing a new
wait event too?

For example, in my setup, slot creation took 8 minutes 13 seconds. It only
succeeded after I ran select txid_current() on primary.

postgres=# select pg_create_logical_replication_slot('s1','test_decoding');

pg_create_logical_replication_slot
------------------------------------
(s1,0/C096D10)
(1 row)

Time: 493365.995 ms (08:13.366)

Thanks,
Sirisha

Attachments:

0001-set-ps-display_while-waiting-for-wal.patch (application/octet-stream)
From 490fdcf1fdd51ba21e454f0a0a0e8fac3c9e526f Mon Sep 17 00:00:00 2001
From: root <root@pgvm.rlsumirojk0etd4qpjbaa2afce.tx.internal.cloudapp.net>
Date: Wed, 12 Apr 2023 22:25:03 +0000
Subject: [PATCH] Add ps display while waiting for wal

---
 src/backend/access/transam/xlogutils.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index e174a2a891..c10e2c9e07 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -31,7 +31,7 @@
 #include "utils/guc.h"
 #include "utils/hsearch.h"
 #include "utils/rel.h"
-
+#include "utils/ps_status.h"
 
 /* GUC variable */
 bool		ignore_invalid_pages = false;
@@ -957,6 +957,10 @@ read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPagePtr,
 				break;
 			}
 
+			char		activitymsg[128];
+			snprintf(activitymsg, sizeof(activitymsg), "waiting for xlog to be available");
+			set_ps_display(activitymsg);
+
 			CHECK_FOR_INTERRUPTS();
 			pg_usleep(1000L);
 		}
@@ -987,6 +991,8 @@ read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPagePtr,
 		}
 	}
 
+	set_ps_display("");
+
 	if (targetPagePtr + XLOG_BLCKSZ <= read_upto)
 	{
 		/*
-- 
2.25.1

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: sirisha chamarthi (#1)
Re: Add ps display while waiting for wal in read_local_xlog_page_guts

sirisha chamarthi <sirichamarthi22@gmail.com> writes:

pg_create_logical_replication_slot can take longer than usual on a standby
when there is no activity on the primary. We don't have enough information
in the pg_stat_activity or process title to debug why this is taking so
long. Attached a small patch to update the process title while waiting for
the wal in read_local_xlog_page_guts. Any thoughts on introducing a new
wait event too?

set_ps_display is a fairly expensive operation on a lot of platforms,
so I'm concerned about the overhead this proposal would add. However,
getting rid of that pg_usleep in favor of a proper wait event seems
like a good idea.

regards, tom lane
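
The overhead concern follows from where the patch places the call: set_ps_display runs on every pass of a loop that sleeps for only 1 ms. Below is a minimal standalone sketch of one way to bound that cost, assuming the title is set once on the first wait iteration and reset only if it was changed. `fake_set_ps_display`, `wait_for_wal`, and the call counter are hypothetical stand-ins for illustration; none of this is PostgreSQL code or the approach the thread settled on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for set_ps_display(); it only counts calls. */
static int	ps_display_calls = 0;

static void
fake_set_ps_display(const char *msg)
{
	(void) msg;
	ps_display_calls++;
}

/*
 * Simulate a polling loop that needs `iterations` passes before the WAL
 * becomes available.  Returns the number of passes made.
 */
static int
wait_for_wal(int iterations)
{
	bool		title_changed = false;
	int			passes = 0;

	assert(iterations >= 0);

	while (passes < iterations)
	{
		/* Update the title on the first pass only, not on every pass. */
		if (!title_changed)
		{
			fake_set_ps_display("waiting for WAL to be available");
			title_changed = true;
		}

		/* The real loop would run CHECK_FOR_INTERRUPTS() and sleep here. */
		passes++;
	}

	/* Reset the title only if this function changed it (as the patch does, with ""). */
	if (title_changed)
		fake_set_ps_display("");

	return passes;
}
```

With this shape, a wait of 5000 passes costs two title updates instead of 5000, and a call that never waits costs none.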

#3 Drouvot, Bertrand
bertranddrouvot.pg@gmail.com
In reply to: sirisha chamarthi (#1)
Re: Add ps display while waiting for wal in read_local_xlog_page_guts

Hi,

On 4/13/23 12:43 AM, sirisha chamarthi wrote:

Hi,

pg_create_logical_replication_slot can take longer than usual on a standby when there is no activity on the primary. We don't have enough information in the pg_stat_activity or process title to debug why this is taking so long. Attached a small patch to update the process title while waiting for the wal in read_local_xlog_page_guts. Any thoughts on introducing a new wait event too?

For example, in my setup, slot creation took 8 minutes 13 seconds. It only succeeded after I ran select txid_current() on primary.

FWIW, this behavior is mentioned in commit 0fdab27ad6, and a new function, pg_log_standby_snapshot(), was created and documented to speed up slot creation on the standby.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#4 Drouvot, Bertrand
bertranddrouvot.pg@gmail.com
In reply to: Tom Lane (#2)
Re: Add ps display while waiting for wal in read_local_xlog_page_guts

Hi,

On 4/13/23 4:29 AM, Tom Lane wrote:

sirisha chamarthi <sirichamarthi22@gmail.com> writes:

pg_create_logical_replication_slot can take longer than usual on a standby
when there is no activity on the primary. We don't have enough information
in the pg_stat_activity or process title to debug why this is taking so
long. Attached a small patch to update the process title while waiting for
the wal in read_local_xlog_page_guts.

Thanks for the patch!

Any thoughts on introducing a new wait event too?

set_ps_display is a fairly expensive operation on a lot of platforms,
so I'm concerned about the overhead this proposal would add. However,
getting rid of that pg_usleep in favor of a proper wait event seems
like a good idea.

+1 for adding a proper wait event.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
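
For readers unfamiliar with the pattern both replies favor, here is a minimal standalone sketch of wait-event bracketing: a cheap integer store before the sleep and a clear after it, so a monitoring view can report what the backend is waiting on. In PostgreSQL the corresponding calls are pgstat_report_wait_start() and pgstat_report_wait_end(). Everything below (`report_wait_start`, `report_wait_end`, `fake_sleep`, `wait_once`, and the event value) is a hypothetical stand-in, not the real implementation and not a wait event that exists in PostgreSQL.

```c
#include <stdint.h>

/* Hypothetical event identifier, chosen only for this sketch. */
#define SKETCH_WAIT_EVENT_WAL_READ ((uint32_t) 0x0A000001)

/* Stand-in for the per-backend slot that a monitoring view would read. */
static volatile uint32_t my_wait_event = 0;

/* What an observer saw while the "sleep" was in progress. */
static uint32_t seen_during_sleep = 0;

static void
report_wait_start(uint32_t event)
{
	my_wait_event = event;		/* one integer store, no string formatting */
}

static void
report_wait_end(void)
{
	my_wait_event = 0;
}

/* Stand-in for the sleep; it records what is visible during the wait. */
static void
fake_sleep(void)
{
	seen_during_sleep = my_wait_event;
}

/* One wait cycle: advertise the event, sleep, then clear it. */
static void
wait_once(void)
{
	report_wait_start(SKETCH_WAIT_EVENT_WAL_READ);
	fake_sleep();
	report_wait_end();
}
```

The design point is that advertising the wait costs a single store per cycle, which is why it compares favorably with rewriting the process title on every iteration.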