Atomic ops for unlogged LSN

Started by John Morris over 2 years ago (16 messages)
#1 John Morris
john.morris@crunchydata.com
1 attachment(s)

This is a short patch which cleans up code for unlogged LSNs. It replaces the existing spinlock with
atomic ops. It could provide a performance benefit for future uses of
unlogged LSNs, but for now it is simply a cleaner implementation.

Attachments:

0001-Use-atomic-ops-for-unloggedLSNs.patch (application/octet-stream)
From d9db17fb8ebaf1c57a51e2b5e47475fc855be04c Mon Sep 17 00:00:00 2001
From: John Morris <john.morris@crunchydata.com>
Date: Wed, 3 May 2023 18:28:39 -0700
Subject: [PATCH] Use atomic ops for unloggedLSNs

---
 src/backend/access/transam/xlog.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b2430f617c..9ed6cf4154 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -469,9 +469,8 @@ typedef struct XLogCtlData
 
 	XLogSegNo	lastRemovedSegNo;	/* latest removed/recycled XLOG segment */
 
-	/* Fake LSN counter, for unlogged relations. Protected by ulsn_lck. */
-	XLogRecPtr	unloggedLSN;
-	slock_t		ulsn_lck;
+	/* Fake LSN counter for unlogged relations. Updated atomically. */
+	pg_atomic_uint64 unloggedLSN;
 
 	/* Time and LSN of last xlog segment switch. Protected by WALWriteLock. */
 	pg_time_t	lastSegSwitchTime;
@@ -4215,14 +4214,7 @@ DataChecksumsEnabled(void)
 XLogRecPtr
 GetFakeLSNForUnloggedRel(void)
 {
-	XLogRecPtr	nextUnloggedLSN;
-
-	/* increment the unloggedLSN counter, need SpinLock */
-	SpinLockAcquire(&XLogCtl->ulsn_lck);
-	nextUnloggedLSN = XLogCtl->unloggedLSN++;
-	SpinLockRelease(&XLogCtl->ulsn_lck);
-
-	return nextUnloggedLSN;
+	return pg_atomic_fetch_add_u64(&XLogCtl->unloggedLSN, 1);
 }
 
 /*
@@ -4635,7 +4627,6 @@ XLOGShmemInit(void)
 
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
-	SpinLockInit(&XLogCtl->ulsn_lck);
 }
 
 /*
@@ -5240,9 +5231,9 @@ StartupXLOG(void)
 	 * the unlogged LSN counter can be reset too.
 	 */
 	if (ControlFile->state == DB_SHUTDOWNED)
-		XLogCtl->unloggedLSN = ControlFile->unloggedLSN;
+		pg_atomic_init_u64(&XLogCtl->unloggedLSN, ControlFile->unloggedLSN);
 	else
-		XLogCtl->unloggedLSN = FirstNormalUnloggedLSN;
+		pg_atomic_init_u64(&XLogCtl->unloggedLSN, FirstNormalUnloggedLSN);
 
 	/*
 	 * Copy any missing timeline history files between 'now' and the recovery
@@ -6784,9 +6775,7 @@ CreateCheckPoint(int flags)
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
 	 * for debugging purposes.
 	 */
-	SpinLockAcquire(&XLogCtl->ulsn_lck);
-	ControlFile->unloggedLSN = XLogCtl->unloggedLSN;
-	SpinLockRelease(&XLogCtl->ulsn_lck);
+	ControlFile->unloggedLSN = pg_atomic_read_u64(&XLogCtl->unloggedLSN);
 
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
-- 
2.33.0

#2 Michael Paquier
michael@paquier.xyz
In reply to: John Morris (#1)
Re: Atomic ops for unlogged LSN

On Tue, May 23, 2023 at 08:24:45PM +0000, John Morris wrote:

This is a short patch which cleans up code for unlogged LSNs. It
replaces the existing spinlock with atomic ops. It could provide a
performance benefit for future uses of unlogged LSNS, but for now
it is simply a cleaner implementation.

Seeing the code paths where gistGetFakeLSN() is called, I guess that
the answer will be no; still, are you seeing a measurable difference in
some cases?

-   /* increment the unloggedLSN counter, need SpinLock */
-   SpinLockAcquire(&XLogCtl->ulsn_lck);
-   nextUnloggedLSN = XLogCtl->unloggedLSN++;
-   SpinLockRelease(&XLogCtl->ulsn_lck);
-
-   return nextUnloggedLSN;
+   return pg_atomic_fetch_add_u64(&XLogCtl->unloggedLSN, 1);

Spinlocks provide a full memory barrier, which may not be the case with
add_u64() or read_u64(). Could memory ordering be a problem in these
code paths?
--
Michael

#3 Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#2)
Re: Atomic ops for unlogged LSN

On Tue, May 23, 2023 at 6:26 PM Michael Paquier <michael@paquier.xyz> wrote:

Spinlocks provide a full memory barrier, which may not be the case with
add_u64() or read_u64(). Could memory ordering be a problem in these
code paths?

It could be a huge problem if what you say were true, but unless I'm
missing something, the comments in atomics.h say that it isn't. The
comments for the 64-bit functions say to look at the 32-bit functions,
and there it says:

/*
* pg_atomic_add_fetch_u32 - atomically add to variable
*
* Returns the value of ptr after the arithmetic operation.
*
* Full barrier semantics.
*/

Which is probably a good thing, because I'm not sure what good it
would be to have an operation that both reads and modifies an atomic
variable but has no barrier semantics.

That's not to say that I entirely understand the point of this patch.
It seems like a generally reasonable thing to do AFAICT but somehow I
wonder if there's more to the story here, because it doesn't feel like
it would be easy to measure the benefit of this patch in isolation.
That's not a reason to reject it, though, just something that makes me
curious.

--
Robert Haas
EDB: http://www.enterprisedb.com

#4 Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#3)
Re: Atomic ops for unlogged LSN

Hi,

On 2023-05-24 08:22:08 -0400, Robert Haas wrote:

On Tue, May 23, 2023 at 6:26 PM Michael Paquier <michael@paquier.xyz> wrote:

Spinlocks provide a full memory barrier, which may not be the case with
add_u64() or read_u64(). Could memory ordering be a problem in these
code paths?

It could be a huge problem if what you say were true, but unless I'm
missing something, the comments in atomics.h say that it isn't. The
comments for the 64-bit functions say to look at the 32-bit functions,
and there it says:

/*
* pg_atomic_add_fetch_u32 - atomically add to variable
*
* Returns the value of ptr after the arithmetic operation.
*
* Full barrier semantics.
*/

Which is probably a good thing, because I'm not sure what good it
would be to have an operation that both reads and modifies an atomic
variable but has no barrier semantics.

I was a bit confused by Michael's comment as well, due to the section of code
quoted. But he does have a point: pg_atomic_read_u32() does indeed *not* have
barrier semantics (it'd be way more expensive), and the patch does contain
this hunk:

@@ -6784,9 +6775,7 @@ CreateCheckPoint(int flags)
* unused on non-shutdown checkpoints, but seems useful to store it always
* for debugging purposes.
*/
-	SpinLockAcquire(&XLogCtl->ulsn_lck);
-	ControlFile->unloggedLSN = XLogCtl->unloggedLSN;
-	SpinLockRelease(&XLogCtl->ulsn_lck);
+	ControlFile->unloggedLSN = pg_atomic_read_u64(&XLogCtl->unloggedLSN);

UpdateControlFile();
LWLockRelease(ControlFileLock);

So we indeed lose some "barrier strength" - but I think that's fine. For one,
it's a debugging-only value. But more importantly, I don't see what reordering
the barrier could prevent - a barrier is useful for things like sequencing two
memory accesses to happen in the intended order - but unloggedLSN doesn't
interact with another variable that's accessed within the ControlFileLock
afaict.
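
(To make the "sequencing two memory accesses" point concrete, here is a
minimal sketch of the classic pattern a barrier exists to protect. This is
illustration only, not code from the patch or the tree; "payload", "ready",
compute_value() and use_value() are invented names. unloggedLSN has no such
companion variable, which is why the weaker read looks acceptable here.)

	/* writer: publish the payload, then the flag announcing it */
	payload = compute_value();		/* plain shared variable */
	pg_write_barrier();				/* keep the payload store ahead of the flag */
	pg_atomic_write_u32(&ready, 1);

	/* reader: check the flag, then consume the payload */
	if (pg_atomic_read_u32(&ready) == 1)
	{
		pg_read_barrier();			/* don't load the payload before the flag */
		use_value(payload);
	}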

That's not to say that I entirely understand the point of this patch.
It seems like a generally reasonable thing to do AFAICT but somehow I
wonder if there's more to the story here, because it doesn't feel like
it would be easy to measure the benefit of this patch in isolation.
That's not a reason to reject it, though, just something that makes me
curious.

I doubt it's a meaningful, if even measurable, win. But removing atomic ops and
reducing shared memory space isn't a bad thing, even if there's no immediate
benefit...

Greetings,

Andres Freund

#5 Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#4)
Re: Atomic ops for unlogged LSN

On Wed, May 24, 2023 at 02:49:58PM -0700, Andres Freund wrote:

I was a bit confused by Michael's comment as well, due to the section of code
quoted. But he does have a point: pg_atomic_read_u32() does indeed *not* have
barrier semantics (it'd be way more expensive), and the patch does contain
this hunk:

Thanks for the correction. The part about _add was incorrect.

So we indeed lose some "barrier strength" - but I think that's fine. For one,
it's a debugging-only value. But more importantly, I don't see what reordering
the barrier could prevent - a barrier is useful for things like sequencing two
memory accesses to happen in the intended order - but unloggedLSN doesn't
interact with another variable that's accessed within the ControlFileLock
afaict.

This stuff is usually tricky enough that I am never completely sure
whether it is fine without reading README.barrier again, particularly
for this code path, where unloggedLSN is saved in the control file under its LWLock.
Something that I find confusing in the patch is that it does not
document the reason why this is OK.
--
Michael

#6 Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#5)
Re: Atomic ops for unlogged LSN

On Thu, May 25, 2023 at 07:41:21AM +0900, Michael Paquier wrote:

On Wed, May 24, 2023 at 02:49:58PM -0700, Andres Freund wrote:

So we indeed lose some "barrier strength" - but I think that's fine. For one,
it's a debugging-only value.

Is it? I see uses in GiST indexing (62401db), so it's not immediately
obvious to me how it is debugging-only. If it is, then I think this patch
ought to clearly document it so that nobody else tries to use it for
non-debugging-only stuff.

But more importantly, I don't see what reordering
the barrier could prevent - a barrier is useful for things like sequencing two
memory accesses to happen in the intended order - but unloggedLSN doesn't
interact with another variable that's accessed within the ControlFileLock
afaict.

This stuff is usually tricky enough that I am never completely sure
whether it is fine without reading again README.barrier, which is
where unloggedLSN is saved in the control file under its LWLock.
Something that I find confusing in the patch is that it does not
document the reason why this is OK.

My concern would be whether GetFakeLSNForUnloggedRel or CreateCheckPoint
might see an old value of unloggedLSN. From the following note in
README.barrier, it sounds like this would be a problem even if we ensured
full barrier semantics:

3. No ordering guarantees. While memory barriers ensure that any given
process performs loads and stores to shared memory in order, they don't
guarantee synchronization. In the queue example above, we can use memory
barriers to be sure that readers won't see garbage, but there's nothing to
say whether a given reader will run before or after a given writer. If this
matters in a given situation, some other mechanism must be used instead of
or in addition to memory barriers.

IIUC we know that shared memory accesses cannot be reordered to precede
acquisition or follow release of a spinlock (thanks to 0709b7e), which is
why this isn't a problem in the current implementation.
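
(Annotating the two variants may help; the code lines are taken from the
patch earlier in the thread, the comments on the guarantees are mine:)

	/* pre-patch: full fence on both sides of the copy */
	SpinLockAcquire(&XLogCtl->ulsn_lck);	/* later accesses cannot be hoisted above this */
	ControlFile->unloggedLSN = XLogCtl->unloggedLSN;
	SpinLockRelease(&XLogCtl->ulsn_lck);	/* earlier accesses cannot sink below this */

	/* patched: atomicity only (no torn read), no ordering with neighbouring accesses */
	ControlFile->unloggedLSN = pg_atomic_read_u64(&XLogCtl->unloggedLSN);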

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#7 Stephen Frost
sfrost@snowman.net
In reply to: Nathan Bossart (#6)
Re: Atomic ops for unlogged LSN

Greetings,

* Nathan Bossart (nathandbossart@gmail.com) wrote:

On Thu, May 25, 2023 at 07:41:21AM +0900, Michael Paquier wrote:

On Wed, May 24, 2023 at 02:49:58PM -0700, Andres Freund wrote:

So we indeed lose some "barrier strength" - but I think that's fine. For one,
it's a debugging-only value.

Is it? I see uses in GiST indexing (62401db), so it's not immediately
obvious to me how it is debugging-only. If it is, then I think this patch
ought to clearly document it so that nobody else tries to use it for
non-debugging-only stuff.

I don't see it as a debugging value. I'm not sure where that came
from...? We do use it in places and if anything, I expect it to be used
more, not less, in the future as a persistent, generally increasing
value (it could go backwards on a crash-restart, but in such a case all
unlogged tables are truncated).

But more importantly, I don't see what reordering
the barrier could prevent - a barrier is useful for things like sequencing two
memory accesses to happen in the intended order - but unloggedLSN doesn't
interact with another variable that's accessed within the ControlFileLock
afaict.

This stuff is usually tricky enough that I am never completely sure
whether it is fine without reading again README.barrier, which is
where unloggedLSN is saved in the control file under its LWLock.
Something that I find confusing in the patch is that it does not
document the reason why this is OK.

My concern would be whether GetFakeLSNForUnloggedRel or CreateCheckPoint
might see an old value of unloggedLSN. From the following note in
README.barrier, it sounds like this would be a problem even if we ensured
full barrier semantics:

3. No ordering guarantees. While memory barriers ensure that any given
process performs loads and stores to shared memory in order, they don't
guarantee synchronization. In the queue example above, we can use memory
barriers to be sure that readers won't see garbage, but there's nothing to
say whether a given reader will run before or after a given writer. If this
matters in a given situation, some other mechanism must be used instead of
or in addition to memory barriers.

IIUC we know that shared memory accesses cannot be reordered to precede
acquisition or follow release of a spinlock (thanks to 0709b7e), which is
why this isn't a problem in the current implementation.

The concern around unlogged LSN, imv anyway, is less about shared memory
access and more about making sure that all callers understand that it can move
backwards on a crash/restart. I don't think that's an issue for current
users but we just need to make sure to try and comment sufficiently
regarding that such that new users understand that about this particular
value.

Thanks,

Stephen

#8 Nathan Bossart
nathandbossart@gmail.com
In reply to: Stephen Frost (#7)
Re: Atomic ops for unlogged LSN

On Mon, Jun 12, 2023 at 07:24:18PM -0400, Stephen Frost wrote:

* Nathan Bossart (nathandbossart@gmail.com) wrote:

Is it? I see uses in GiST indexing (62401db), so it's not immediately
obvious to me how it is debugging-only. If it is, then I think this patch
ought to clearly document it so that nobody else tries to use it for
non-debugging-only stuff.

I don't see it as a debugging value. I'm not sure where that came
from...? We do use it in places and if anything, I expect it to be used
more, not less, in the future as a persistent, generally increasing
value (it could go backwards on a crash-restart, but in such a case all
unlogged tables are truncated).

This is my understanding as well.

My concern would be whether GetFakeLSNForUnloggedRel or CreateCheckPoint
might see an old value of unloggedLSN. From the following note in
README.barrier, it sounds like this would be a problem even if we ensured
full barrier semantics:

Never mind. I think I'm forgetting that the atomics support in Postgres
deals with cache coherency.

The concern around unlogged LSN, imv anyway, is less about shared memory
access and more about making sure that all callers understand that it can move
backwards on a crash/restart. I don't think that's an issue for current
users but we just need to make sure to try and comment sufficiently
regarding that such that new users understand that about this particular
value.

Right.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#9 Stephen Frost
sfrost@snowman.net
In reply to: Nathan Bossart (#8)
Re: Atomic ops for unlogged LSN

Greetings,

* Nathan Bossart (nathandbossart@gmail.com) wrote:

On Mon, Jun 12, 2023 at 07:24:18PM -0400, Stephen Frost wrote:

* Nathan Bossart (nathandbossart@gmail.com) wrote:

Is it? I see uses in GiST indexing (62401db), so it's not immediately
obvious to me how it is debugging-only. If it is, then I think this patch
ought to clearly document it so that nobody else tries to use it for
non-debugging-only stuff.

I don't see it as a debugging value. I'm not sure where that came
from...? We do use it in places and if anything, I expect it to be used
more, not less, in the future as a persistent, generally increasing
value (it could go backwards on a crash-restart, but in such a case all
unlogged tables are truncated).

This is my understanding as well.

My concern would be whether GetFakeLSNForUnloggedRel or CreateCheckPoint
might see an old value of unloggedLSN. From the following note in
README.barrier, it sounds like this would be a problem even if we ensured
full barrier semantics:

Never mind. I think I'm forgetting that the atomics support in Postgres
deals with cache coherency.

The concern around unlogged LSN, imv anyway, is less about shared memory
access and more about making sure that all callers understand that it can move
backwards on a crash/restart. I don't think that's an issue for current
users but we just need to make sure to try and comment sufficiently
regarding that such that new users understand that about this particular
value.

Right.

Awesome. Was there any other feedback on this change which basically is
just getting rid of a spinlock and replacing it with atomics?
It's still in needs-review status, but there have been a number of
comments/reviews (drive-by, at least) but without any real ask for any
changes to be made.

Thanks!

Stephen

#10 Nathan Bossart
nathandbossart@gmail.com
In reply to: Stephen Frost (#9)
Re: Atomic ops for unlogged LSN

On Mon, Jul 17, 2023 at 07:08:03PM -0400, Stephen Frost wrote:

Awesome. Was there any other feedback on this change which basically is
just getting rid of a spinlock and replacing it with atomics?
It's still in needs-review status, but there have been a number of
comments/reviews (drive-by, at least) but without any real ask for any
changes to be made.

LGTM

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#11 Andres Freund
andres@anarazel.de
In reply to: Nathan Bossart (#10)
Re: Atomic ops for unlogged LSN

Hi,

On 2023-07-17 16:15:52 -0700, Nathan Bossart wrote:

On Mon, Jul 17, 2023 at 07:08:03PM -0400, Stephen Frost wrote:

Awesome. Was there any other feedback on this change which basically is
just getting rid of a spinlock and replacing it with atomics?
It's still in needs-review status, but there have been a number of
comments/reviews (drive-by, at least) but without any real ask for any
changes to be made.

LGTM

Why don't we just use a barrier around reading the value? It's not like
CreateCheckPoint() is frequent?
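
(For concreteness, a sketch of that variant; this only illustrates the
suggestion and is not a proposed hunk:)

	/*
	 * Hypothetical alternative: keep the full fence by issuing an explicit
	 * barrier before copying the counter, so the read cannot be reordered
	 * with CreateCheckPoint()'s other shared-memory accesses.
	 */
	pg_memory_barrier();
	ControlFile->unloggedLSN = pg_atomic_read_u64(&XLogCtl->unloggedLSN);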

Greetings,

Andres Freund

#12 John Morris
john.morris@crunchydata.com
In reply to: Andres Freund (#11)
Re: Atomic ops for unlogged LSN

Why don't we just use a barrier around reading the value? It's not like
CreateCheckPoint() is frequent?

One reason is that a barrier isn’t needed, and adding unnecessary barriers can also be confusing.

With respect to the “debug only” comment in the original code, whichever value is written to the structure during a checkpoint will be reset when restarting. Nobody will ever see that value. We could just as easily write a zero.

Shutting down is a different story. It isn’t stated, but we require the unlogged LSN be quiescent before shutting down. This patch doesn’t change that requirement.

We could also argue that memory ordering doesn’t matter when filling in a conventional, unlocked structure. But the fact we have only two cases 1) checkpoint where the value is ignored, and 2) shutdown where it is quiescent, makes the additional argument unnecessary.

Would you be more comfortable if I added comments describing the two cases? My intent was to be minimalistic, but the original code could use better explanation.

#13 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: John Morris (#12)
Re: Atomic ops for unlogged LSN

On Thu, Jul 20, 2023 at 12:25 AM John Morris
<john.morris@crunchydata.com> wrote:

Why don't we just use a barrier around reading the value? It's not like
CreateCheckPoint() is frequent?

One reason is that a barrier isn’t needed, and adding unnecessary barriers can also be confusing.

With respect to the “debug only” comment in the original code, whichever value is written to the structure during a checkpoint will be reset when restarting. Nobody will ever see that value. We could just as easily write a zero.

Shutting down is a different story. It isn’t stated, but we require the unlogged LSN be quiescent before shutting down. This patch doesn’t change that requirement.

We could also argue that memory ordering doesn’t matter when filling in a conventional, unlocked structure. But the fact we have only two cases 1) checkpoint where the value is ignored, and 2) shutdown where it is quiescent, makes the additional argument unnecessary.

Would you be more comfortable if I added comments describing the two cases? My intent was to be minimalistic, but the original code could use better explanation.

Here, the case for unloggedLSN is that there can exist multiple
backends incrementing/writing it (via GetFakeLSNForUnloggedRel), and
there can exist one reader at a time in CreateCheckPoint with
LWLockAcquire(ControlFileLock, LW_EXCLUSIVE). Is incrementing
unloggedLSN atomically (with an implied memory barrier from
pg_atomic_fetch_add_u64) helping synchronize multiple writers and
readers? With a spinlock, writers-readers synchronization is
guaranteed. With an atomic variable, it is guaranteed that the readers
don't see a torn value, but no synchronization is provided.

If the above understanding is right, what happens if readers see an
old unloggedLSN value - the reader here stores the old unloggedLSN value
to the control file and the server restarts (clean exit). So, the old
value is loaded back to unloggedLSN upon restart and the callers of
GetFakeLSNForUnloggedRel() will see an old/repeated value too. Is it a
problem?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#14 John Morris
john.morris@crunchydata.com
In reply to: Bharath Rupireddy (#13)
Re: Atomic ops for unlogged LSN

what happens if … the reader here stores the old unloggedLSN value
to the control file and the server restarts (clean exit). So, the old
value is loaded back to unloggedLSN upon restart and the callers of
GetFakeLSNForUnloggedRel() will see an old/repeated value too. Is it a
problem?

First, a clarification. The value being saved is the “next” unlogged LSN,
not one which has already been used.
(we are doing “fetch and add”, not “add and fetch”)
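
(Illustration only; the starting value 1000 is made up:)

	/* suppose the shared counter currently holds 1000 */
	XLogRecPtr	lsn = pg_atomic_fetch_add_u64(&XLogCtl->unloggedLSN, 1);

	/*
	 * fetch-and-add: lsn == 1000, the value handed out, and the counter now
	 * holds 1001, the "next" LSN, which is what a clean shutdown persists.
	 * add-and-fetch (pg_atomic_add_fetch_u64) would instead return 1001,
	 * the same value left behind in shared memory.
	 */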

You have a good point about shutdown and startup. It is vital we
don’t repeat an unlogged LSN. This situation could easily happen
if other readers were active while we were shutting down.

With an atomic variable, it is guaranteed that the readers
don't see a torn value, but no synchronization is provided.

The atomic increment also ensures the sequence
of values is correct; specifically, we don't see
repeated values like we might with a conventional increment.
As a side effect, the instruction enforces a memory barrier, but we are not
relying on a barrier in this case.
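
(A sketch of the interleaving being ruled out; the values and timing are
invented for illustration:)

	/*
	 * With a plain, non-atomic increment, two backends could interleave:
	 *
	 *   backend A: reads unloggedLSN          -> 1000
	 *   backend B: reads unloggedLSN          -> 1000
	 *   backend A: writes back 1001, returns 1000
	 *   backend B: writes back 1001, returns 1000   (duplicate fake LSN)
	 *
	 * pg_atomic_fetch_add_u64() performs the read-modify-write as a single
	 * indivisible step, so each backend receives a distinct value.
	 */
	return pg_atomic_fetch_add_u64(&XLogCtl->unloggedLSN, 1);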

#15 John Morris
john.morris@crunchydata.com
In reply to: John Morris (#14)
1 attachment(s)
Re: Atomic ops for unlogged LSN

Based on your feedback, I’ve updated the patch with additional comments.

1. Explain the two cases when writing to the control file, and
2. Add a bit more emphasis on unloggedLSNs not being valid after a crash.

Thanks to y’all.

* John

Attachments:

v2-0001-Use-atomic-ops-for-unloggedLSNs.patch (application/octet-stream)
From 3696fbc8dd4352ba645ffd8b134f27ef79d95940 Mon Sep 17 00:00:00 2001
From: John Morris <john.morris@crunchydata.com>
Date: Wed, 3 May 2023 18:28:39 -0700
Subject: [PATCH] Use atomic ops for unloggedLSNs

---
 src/backend/access/transam/xlog.c | 48 +++++++++++++++----------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b2430f617c..3c38d1bf9d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -469,9 +469,8 @@ typedef struct XLogCtlData
 
 	XLogSegNo	lastRemovedSegNo;	/* latest removed/recycled XLOG segment */
 
-	/* Fake LSN counter, for unlogged relations. Protected by ulsn_lck. */
-	XLogRecPtr	unloggedLSN;
-	slock_t		ulsn_lck;
+	/* Fake LSN counter for unlogged relations. Updated atomically. */
+	pg_atomic_uint64 unloggedLSN;
 
 	/* Time and LSN of last xlog segment switch. Protected by WALWriteLock. */
 	pg_time_t	lastSegSwitchTime;
@@ -4208,21 +4207,16 @@ DataChecksumsEnabled(void)
  *
  * Each call generates an LSN that is greater than any previous value
  * returned. The current counter value is saved and restored across clean
- * shutdowns, but like unlogged relations, does not survive a crash. This can
- * be used in lieu of real LSN values returned by XLogInsert, if you need an
- * LSN-like increasing sequence of numbers without writing any WAL.
+ * shutdowns, but it is reset after a crash. Consequently, it should
+ * only be used for unlogged relations or for data which does not survive
+ * a crash. An unlogged LSN may be used in lieu of real LSN values
+ * returned by XLogInsert, if you need an LSN-like increasing sequence
+ * of numbers without writing any WAL.
  */
 XLogRecPtr
 GetFakeLSNForUnloggedRel(void)
 {
-	XLogRecPtr	nextUnloggedLSN;
-
-	/* increment the unloggedLSN counter, need SpinLock */
-	SpinLockAcquire(&XLogCtl->ulsn_lck);
-	nextUnloggedLSN = XLogCtl->unloggedLSN++;
-	SpinLockRelease(&XLogCtl->ulsn_lck);
-
-	return nextUnloggedLSN;
+	return pg_atomic_fetch_add_u64(&XLogCtl->unloggedLSN, 1);
 }
 
 /*
@@ -4635,7 +4629,6 @@ XLOGShmemInit(void)
 
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
-	SpinLockInit(&XLogCtl->ulsn_lck);
 }
 
 /*
@@ -5240,9 +5233,9 @@ StartupXLOG(void)
 	 * the unlogged LSN counter can be reset too.
 	 */
 	if (ControlFile->state == DB_SHUTDOWNED)
-		XLogCtl->unloggedLSN = ControlFile->unloggedLSN;
+		pg_atomic_init_u64(&XLogCtl->unloggedLSN, ControlFile->unloggedLSN);
 	else
-		XLogCtl->unloggedLSN = FirstNormalUnloggedLSN;
+		pg_atomic_init_u64(&XLogCtl->unloggedLSN, FirstNormalUnloggedLSN);
 
 	/*
 	 * Copy any missing timeline history files between 'now' and the recovery
@@ -6780,13 +6773,20 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPointTLI = 0;
 
 	/*
-	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
-	 * unused on non-shutdown checkpoints, but seems useful to store it always
-	 * for debugging purposes.
-	 */
-	SpinLockAcquire(&XLogCtl->ulsn_lck);
-	ControlFile->unloggedLSN = XLogCtl->unloggedLSN;
-	SpinLockRelease(&XLogCtl->ulsn_lck);
+	 * Persist the unloggedLSN.
+	 * We have two cases to consider:
+	 *   1) We are shutting down. All other backends have finished, and
+	 *      the unloggedLSN isn't being updated any more. There are no race
+	 *      conditions. We just save the current value and restore it
+	 *      when we are restarting.
+	 *   2) We are actively running, say during a checkpoint.
+	 *      The unloggedLSN is being updated by other backends,
+	 *      but the value we write out doesn't matter.
+	 *      The unloggedLSN will be reset on crash recovery.
+	 *      We could simply write a zero, but the current value
+	 *      might be useful for debugging.
+	 */
+	ControlFile->unloggedLSN = pg_atomic_read_u64(&XLogCtl->unloggedLSN);
 
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
-- 
2.33.0

#16 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: John Morris (#15)
Re: Atomic ops for unlogged LSN

On Fri, Jul 21, 2023 at 4:43 AM John Morris <john.morris@crunchydata.com> wrote:

Based on your feedback, I’ve updated the patch with additional comments.

Explain the two cases when writing to the control file, and
a bit more emphasis on unloggedLSNs not being valid after a crash.

Given that the callers already have to deal with unloggedLSN being
reset after a crash, meaning they can't expect an always-increasing
value from it, I think converting it to an atomic variable
seems a reasonable change. The other advantage we get here is that it
frees up the shared memory space used by the spinlock ulsn_lck.

The attached v2 patch LGTM; I have marked the CF entry as RfC -
https://commitfest.postgresql.org/43/4330/.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com