Lockless exit path for ReplicationOriginExitCleanup

Started by Bharath Rupireddyabout 2 years ago6 messages
#1Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com

Hi,

While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1]diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c index 460e3dcc38..99bbe90f6c 100644 --- a/src/backend/replication/logical/origin.c +++ b/src/backend/replication/logical/origin.c @@ -1056,6 +1056,9 @@ ReplicationOriginExitCleanup(int code, Datum arg) { ConditionVariable *cv = NULL; similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?

[1]
diff --git a/src/backend/replication/logical/origin.c
b/src/backend/replication/logical/origin.c
index 460e3dcc38..99bbe90f6c 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -1056,6 +1056,9 @@ ReplicationOriginExitCleanup(int code, Datum arg)
 {
        ConditionVariable *cv = NULL;
+       if (session_replication_state == NULL)
+               return;
+
        LWLockAcquire(ReplicationOriginLock, LW_EXCLUSIVE);

if (session_replication_state != NULL &&

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Bharath Rupireddy (#1)
Re: Lockless exit path for ReplicationOriginExitCleanup

On Wed, Nov 22, 2023 at 2:12 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1] similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?

I don't see any problem with such a check but not sure of the benefit
of doing so either.

--
With Regards,
Amit Kapila.

#3Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Bharath Rupireddy (#1)
Re: Lockless exit path for ReplicationOriginExitCleanup

Hello,

On 2023-Nov-22, Bharath Rupireddy wrote:

While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1] similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?

I suppose we can do this on consistency grounds -- I'm pretty sure you'd
have a really hard time proving that this makes a performance difference --
but this patch is incomplete: just two lines below, we're still testing
session_replication_state for nullness, which would now be dead code.
Please repair.

The comment on session_replication_state is confusing also:

/*
* Backend-local, cached element from ReplicationState for use in a backend
* replaying remote commits, so we don't have to search ReplicationState for
* the backends current RepOriginId.
*/

My problem with it is that this is not a "cached element", but instead a
"cached pointer to [shared memory]". This is what makes testing that
pointer for null-ness doable, but access to each member therein
requiring lwlock acquisition.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Once again, thank you and all of the developers for your hard work on
PostgreSQL. This is by far the most pleasant management experience of
any database I've worked on." (Dan Harris)
http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php

#4Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Amit Kapila (#2)
Re: Lockless exit path for ReplicationOriginExitCleanup

On Wed, Nov 22, 2023 at 2:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Nov 22, 2023 at 2:12 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1] similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?

I don't see any problem with such a check but not sure of the benefit
of doing so either.

It avoids an unnecessary lock acquisition and release when
session_replication_state is already reset by
replorigin_session_reset() before reaching
ReplicationOriginExitCleanup(). A patch something like [1]diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c index 460e3dcc38..dd3824bd27 100644 --- a/src/backend/replication/logical/origin.c +++ b/src/backend/replication/logical/origin.c @@ -1056,6 +1056,11 @@ ReplicationOriginExitCleanup(int code, Datum arg) { ConditionVariable *cv = NULL; and a run
of subscription tests shows that 153 times the lock acquisition and
release can be avoided.

ubuntu:~/postgres/src/test/subscription$ grep -ir "with
session_replication_state NULL" . | wc -l
153

ubuntu:~/postgres/src/test/subscription$ grep -ir "with
session_replication_state not NULL" . | wc -l
174

[1]
diff --git a/src/backend/replication/logical/origin.c
b/src/backend/replication/logical/origin.c
index 460e3dcc38..dd3824bd27 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -1056,6 +1056,11 @@ ReplicationOriginExitCleanup(int code, Datum arg)
 {
        ConditionVariable *cv = NULL;
+       if (session_replication_state == NULL)
+               elog(LOG, "In ReplicationOriginExitCleanup() with
session_replication_state NULL");
+       else
+               elog(LOG, "In ReplicationOriginExitCleanup() with
session_replication_state not NULL");
+
        LWLockAcquire(ReplicationOriginLock, LW_EXCLUSIVE);

if (session_replication_state != NULL &&

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#5Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#3)
1 attachment(s)
Re: Lockless exit path for ReplicationOriginExitCleanup

On Wed, Nov 22, 2023 at 3:06 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Hello,

On 2023-Nov-22, Bharath Rupireddy wrote:

While looking at the use of session_replication_state, I noticed that
ReplicationOriginLock is acquired in ReplicationOriginExitCleanup()
even if session_replication_state is reset to NULL by
replorigin_session_reset(). Why can't there be a lockless exit path
something like [1] similar to
replorigin_session_reset() which checks session_replication_state ==
NULL without a lock?

I suppose we can do this on consistency grounds -- I'm pretty sure you'd
have a really hard time proving that this makes a performance difference --

Yes, can't measure the perf gain, however, as said upthread
/messages/by-id/CALj2ACVVhPn7BVQZLGPVvBrLoDZNRaV0LS9rBpt5y+j=xRebWw@mail.gmail.com,
it avoids unnecessary lock acquisition and release.

but this patch is incomplete: just two lines below, we're still testing
session_replication_state for nullness, which would now be dead code.
Please repair.

Done.

The comment on session_replication_state is confusing also:

/*
* Backend-local, cached element from ReplicationState for use in a backend
* replaying remote commits, so we don't have to search ReplicationState for
* the backends current RepOriginId.
*/

My problem with it is that this is not a "cached element", but instead a
"cached pointer to [shared memory]". This is what makes testing that
pointer for null-ness doable, but access to each member therein
requiring lwlock acquisition.

Right. I've reworded the comment a bit.

PSA v1 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Introduce-lockless-exit-path-for-ReplicationOrigi.patchapplication/octet-stream; name=v1-0001-Introduce-lockless-exit-path-for-ReplicationOrigi.patchDownload
From f143c8d7475e1808274c19020a2f66706e895260 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 23 Nov 2023 04:32:57 +0000
Subject: [PATCH v1] Introduce lockless exit path for
 ReplicationOriginExitCleanup

---
 src/backend/replication/logical/origin.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index 460e3dcc38..7b879d1861 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -172,9 +172,9 @@ static ReplicationState *replication_states;
 static ReplicationStateCtl *replication_states_ctl;
 
 /*
- * Backend-local, cached element from ReplicationState for use in a backend
- * replaying remote commits, so we don't have to search ReplicationState for
- * the backends current RepOriginId.
+ * A backend-local cached pointer to backend's ReplicationState in the shared
+ * memory array of replication states so that the backend while replaying
+ * remote commits doesn't have to search the array with current RepOriginId.
  */
 static ReplicationState *session_replication_state = NULL;
 
@@ -1056,10 +1056,12 @@ ReplicationOriginExitCleanup(int code, Datum arg)
 {
 	ConditionVariable *cv = NULL;
 
+	if (session_replication_state == NULL)
+		return;
+
 	LWLockAcquire(ReplicationOriginLock, LW_EXCLUSIVE);
 
-	if (session_replication_state != NULL &&
-		session_replication_state->acquired_by == MyProcPid)
+	if (session_replication_state->acquired_by == MyProcPid)
 	{
 		cv = &session_replication_state->origin_cv;
 
-- 
2.34.1

#6Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Bharath Rupireddy (#5)
Re: Lockless exit path for ReplicationOriginExitCleanup

Thanks, pushed. I reworded the comment again, though.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/