Add ps display while waiting for wal in read_local_xlog_page_guts
Hi,
pg_create_logical_replication_slot can take longer than usual on a standby
when there is no activity on the primary. Neither pg_stat_activity nor the
process title gives enough information to debug why it is taking so long.
Attached is a small patch that updates the process title while waiting for
WAL in read_local_xlog_page_guts. Any thoughts on introducing a new
wait event too?
For example, in my setup, slot creation took 8 minutes 13 seconds. It only
succeeded after I ran select txid_current() on the primary.
postgres=# select pg_create_logical_replication_slot('s1','test_decoding');
pg_create_logical_replication_slot
------------------------------------
(s1,0/C096D10)
(1 row)
Time: 493365.995 ms (08:13.366)
Thanks,
Sirisha
Attachments:
0001-set-ps-display_while-waiting-for-wal.patch (application/octet-stream)
From 490fdcf1fdd51ba21e454f0a0a0e8fac3c9e526f Mon Sep 17 00:00:00 2001
From: root <root@pgvm.rlsumirojk0etd4qpjbaa2afce.tx.internal.cloudapp.net>
Date: Wed, 12 Apr 2023 22:25:03 +0000
Subject: [PATCH] Add ps display while waiting for wal
---
src/backend/access/transam/xlogutils.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index e174a2a891..c10e2c9e07 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -31,7 +31,7 @@
#include "utils/guc.h"
#include "utils/hsearch.h"
#include "utils/rel.h"
-
+#include "utils/ps_status.h"
/* GUC variable */
bool ignore_invalid_pages = false;
@@ -957,6 +957,10 @@ read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPagePtr,
break;
}
+ char activitymsg[128];
+ snprintf(activitymsg, sizeof(activitymsg), "waiting for xlog to be available");
+ set_ps_display(activitymsg);
+
CHECK_FOR_INTERRUPTS();
pg_usleep(1000L);
}
@@ -987,6 +991,8 @@ read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPagePtr,
}
}
+ set_ps_display("");
+
if (targetPagePtr + XLOG_BLCKSZ <= read_upto)
{
/*
--
2.25.1
sirisha chamarthi <sirichamarthi22@gmail.com> writes:
> pg_create_logical_replication_slot can take longer than usual on a standby
> when there is no activity on the primary. We don't have enough information
> in the pg_stat_activity or process title to debug why this is taking so
> long. Attached a small patch to update the process title while waiting for
> the wal in read_local_xlog_page_guts. Any thoughts on introducing a new
> wait event too?
set_ps_display is a fairly expensive operation on a lot of platforms,
so I'm concerned about the overhead this proposal would add. However,
getting rid of that pg_usleep in favor of a proper wait event seems
like a good idea.
regards, tom lane
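
To make the overhead concern concrete: in the posted patch, set_ps_display sits inside a loop that sleeps for 1 ms per iteration, so an 8-minute wait would update the title on every pass. The standalone sketch below is not PostgreSQL code. set_title, wal_available and wait_for_wal are hypothetical stand-ins used only to count calls. It shows one way to bound the cost, assuming the title is set once on the first failed poll and cleared only if it was set. A proper wait event would still be the cleaner fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; none of these are PostgreSQL APIs. */
static int title_updates = 0;          /* how many times the "expensive" call ran */
static const char *current_title = "";

/* Models an expensive process-title update such as set_ps_display(). */
static void set_title(const char *msg)
{
    current_title = msg;
    title_updates++;
}

static int polls_left;                 /* failed polls remaining before WAL "arrives" */

/* Returns false polls_left times, then true. */
static bool wal_available(void)
{
    return polls_left-- <= 0;
}

/*
 * Poll until WAL is available. The title is set on the first failed poll
 * and cleared once afterwards, so the number of title updates is at most
 * two no matter how long the wait lasts. Returns the number of iterations.
 */
static int wait_for_wal(int polls_needed)
{
    bool title_set = false;
    int iterations = 0;

    assert(polls_needed >= 0);
    polls_left = polls_needed;

    while (!wal_available())
    {
        if (!title_set)
        {
            set_title("waiting for WAL to be available");
            title_set = true;
        }
        /* The real loop would do CHECK_FOR_INTERRUPTS() and pg_usleep(1000L) here. */
        iterations++;
    }

    if (title_set)
        set_title("");                 /* restore the title only if we changed it */

    return iterations;
}
```

With this shape, a call such as wait_for_wal(5000) loops 5000 times but updates the title twice, and a call that never has to wait does not touch the title at all.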
Hi,
On 4/13/23 12:43 AM, sirisha chamarthi wrote:
> Hi,
> pg_create_logical_replication_slot can take longer than usual on a standby when there is no activity on the primary. We don't have enough information in the pg_stat_activity or process title to debug why this is taking so long. Attached a small patch to update the process title while waiting for the wal in read_local_xlog_page_guts. Any thoughts on introducing a new wait event too?
> For example, in my setup, slot creation took 8 minutes 13 seconds. It only succeeded after I ran select txid_current() on primary.
FWIW, this behavior is mentioned in commit 0fdab27ad6, and a new function, pg_log_standby_snapshot(), was added and documented to speed up slot creation on a standby.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On 4/13/23 4:29 AM, Tom Lane wrote:
> sirisha chamarthi <sirichamarthi22@gmail.com> writes:
>> pg_create_logical_replication_slot can take longer than usual on a standby
>> when there is no activity on the primary. We don't have enough information
>> in the pg_stat_activity or process title to debug why this is taking so
>> long. Attached a small patch to update the process title while waiting for
>> the wal in read_local_xlog_page_guts.
Thanks for the patch!
>> Any thoughts on introducing a new
>> wait event too?
> set_ps_display is a fairly expensive operation on a lot of platforms,
> so I'm concerned about the overhead this proposal would add. However,
> getting rid of that pg_usleep in favor of a proper wait event seems
> like a good idea.
+1 for adding a proper wait event.
Regards,
--
Bertrand Drouvot