Lockless exit path for ReplicationOriginExitCleanup
Hi,
While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1]diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c index 460e3dcc38..99bbe90f6c 100644 --- a/src/backend/replication/logical/origin.c +++ b/src/backend/replication/logical/origin.c @@ -1056,6 +1056,9 @@ ReplicationOriginExitCleanup(int code, Datum arg) { ConditionVariable *cv = NULL; similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?
[1]
diff --git a/src/backend/replication/logical/origin.c
b/src/backend/replication/logical/origin.c
index 460e3dcc38..99bbe90f6c 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -1056,6 +1056,9 @@ ReplicationOriginExitCleanup(int code, Datum arg)
{
ConditionVariable *cv = NULL;
+ if (session_replication_state == NULL)
+ return;
+
LWLockAcquire(ReplicationOriginLock, LW_EXCLUSIVE);
if (session_replication_state != NULL &&
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Nov 22, 2023 at 2:12 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1] similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?
I don't see any problem with such a check but not sure of the benefit
of doing so either.
--
With Regards,
Amit Kapila.
Hello,
On 2023-Nov-22, Bharath Rupireddy wrote:
While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1] similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?
I suppose we can do this on consistency grounds -- I'm pretty sure you'd
have a really hard time proving that this makes a performance difference --
but this patch is incomplete: just two lines below, we're still testing
session_replication_state for nullness, which would now be dead code.
Please repair.
The comment on session_replication_state is confusing also:
/*
* Backend-local, cached element from ReplicationState for use in a backend
* replaying remote commits, so we don't have to search ReplicationState for
* the backends current RepOriginId.
*/
My problem with it is that this is not a "cached element", but instead a
"cached pointer to [shared memory]". This is what makes testing that
pointer for null-ness doable, but access to each member therein
requiring lwlock acquisition.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Once again, thank you and all of the developers for your hard work on
PostgreSQL. This is by far the most pleasant management experience of
any database I've worked on." (Dan Harris)
http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php
On Wed, Nov 22, 2023 at 2:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 22, 2023 at 2:12 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1] similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?I don't see any problem with such a check but not sure of the benefit
of doing so either.
It avoids an unnecessary lock acquisition and release when
session_replication_state is already reset by
replorigin_session_reset() before reaching
ReplicationOriginExitCleanup(). A patch something like [1]diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c index 460e3dcc38..dd3824bd27 100644 --- a/src/backend/replication/logical/origin.c +++ b/src/backend/replication/logical/origin.c @@ -1056,6 +1056,11 @@ ReplicationOriginExitCleanup(int code, Datum arg) { ConditionVariable *cv = NULL; and a run
of subscription tests shows that 153 times the lock acquisition and
release can be avoided.
ubuntu:~/postgres/src/test/subscription$ grep -ir "with
session_replication_state NULL" . | wc -l
153
ubuntu:~/postgres/src/test/subscription$ grep -ir "with
session_replication_state not NULL" . | wc -l
174
[1]
diff --git a/src/backend/replication/logical/origin.c
b/src/backend/replication/logical/origin.c
index 460e3dcc38..dd3824bd27 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -1056,6 +1056,11 @@ ReplicationOriginExitCleanup(int code, Datum arg)
{
ConditionVariable *cv = NULL;
+ if (session_replication_state == NULL)
+ elog(LOG, "In ReplicationOriginExitCleanup() with
session_replication_state NULL");
+ else
+ elog(LOG, "In ReplicationOriginExitCleanup() with
session_replication_state not NULL");
+
LWLockAcquire(ReplicationOriginLock, LW_EXCLUSIVE);
if (session_replication_state != NULL &&
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Nov 22, 2023 at 3:06 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Hello,
On 2023-Nov-22, Bharath Rupireddy wrote:
While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1] similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?I suppose we can do this on consistency grounds -- I'm pretty sure you'd
have a really hard time proving that this makes a performance difference --
Yes, can't measure the perf gain, however, as said upthread
/messages/by-id/CALj2ACVVhPn7BVQZLGPVvBrLoDZNRaV0LS9rBpt5y+j=xRebWw@mail.gmail.com,
it avoids unnecessary lock acquisition and release.
but this patch is incomplete: just two lines below, we're still testing
session_replication_state for nullness, which would now be dead code.
Please repair.
Done.
The comment on session_replication_state is confusing also:
/*
* Backend-local, cached element from ReplicationState for use in a backend
* replaying remote commits, so we don't have to search ReplicationState for
* the backends current RepOriginId.
*/My problem with it is that this is not a "cached element", but instead a
"cached pointer to [shared memory]". This is what makes testing that
pointer for null-ness doable, but access to each member therein
requiring lwlock acquisition.
Right. I've reworded the comment a bit.
PSA v1 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-Introduce-lockless-exit-path-for-ReplicationOrigi.patchapplication/octet-stream; name=v1-0001-Introduce-lockless-exit-path-for-ReplicationOrigi.patchDownload
From f143c8d7475e1808274c19020a2f66706e895260 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 23 Nov 2023 04:32:57 +0000
Subject: [PATCH v1] Introduce lockless exit path for
ReplicationOriginExitCleanup
---
src/backend/replication/logical/origin.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 460e3dcc38..7b879d1861 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -172,9 +172,9 @@ static ReplicationState *replication_states;
static ReplicationStateCtl *replication_states_ctl;
/*
- * Backend-local, cached element from ReplicationState for use in a backend
- * replaying remote commits, so we don't have to search ReplicationState for
- * the backends current RepOriginId.
+ * A backend-local cached pointer to backend's ReplicationState in the shared
+ * memory array of replication states so that the backend while replaying
+ * remote commits doesn't have to search the array with current RepOriginId.
*/
static ReplicationState *session_replication_state = NULL;
@@ -1056,10 +1056,12 @@ ReplicationOriginExitCleanup(int code, Datum arg)
{
ConditionVariable *cv = NULL;
+ if (session_replication_state == NULL)
+ return;
+
LWLockAcquire(ReplicationOriginLock, LW_EXCLUSIVE);
- if (session_replication_state != NULL &&
- session_replication_state->acquired_by == MyProcPid)
+ if (session_replication_state->acquired_by == MyProcPid)
{
cv = &session_replication_state->origin_cv;
--
2.34.1
Thanks, pushed. I reworded the comment again, though.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/