Spinlock is missing when updating two_phase of ReplicationSlot

Started by Masahiko Sawadaabout 3 years ago4 messages
#1Masahiko Sawada
sawada.mshk@gmail.com
1 attachment(s)

Hi,

I realized that in CreateDecodingContext() function, we update both
slot->data.two_phase and two_phase_at without acquiring the spinlock:

/* Mark slot to allow two_phase decoding if not already marked */
if (ctx->twophase && !slot->data.two_phase)
{
slot->data.two_phase = true;
slot->data.two_phase_at = start_lsn;
ReplicationSlotMarkDirty();
ReplicationSlotSave();
SnapBuildSetTwoPhaseAt(ctx->snapshot_builder, start_lsn);
}

I think we should acquire the spinlock when updating fields of the
replication slot even by its owner. Otherwise readers could see
inconsistent results. Looking at another place where we update
two_phase_at, we acquire the spinlock:

SpinLockAcquire(&slot->mutex);
slot->data.confirmed_flush = ctx->reader->EndRecPtr;
if (slot->data.two_phase)
slot->data.two_phase_at = ctx->reader->EndRecPtr;
SpinLockRelease(&slot->mutex);

It seems to me an oversight of commit a8fd13cab0b. I've attached the
small patch to fix it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

fix_spinlock.patchapplication/octet-stream; name=fix_spinlock.patchDownload
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 52d1fe6269..1a58dd7649 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -555,8 +555,10 @@ CreateDecodingContext(XLogRecPtr start_lsn,
 	/* Mark slot to allow two_phase decoding if not already marked */
 	if (ctx->twophase && !slot->data.two_phase)
 	{
+		SpinLockAcquire(&slot->mutex);
 		slot->data.two_phase = true;
 		slot->data.two_phase_at = start_lsn;
+		SpinLockRelease(&slot->mutex);
 		ReplicationSlotMarkDirty();
 		ReplicationSlotSave();
 		SnapBuildSetTwoPhaseAt(ctx->snapshot_builder, start_lsn);
#2Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#1)
Re: Spinlock is missing when updating two_phase of ReplicationSlot

On Wed, Jan 11, 2023 at 11:07:05AM +0900, Masahiko Sawada wrote:

I think we should acquire the spinlock when updating fields of the
replication slot even by its owner. Otherwise readers could see
inconsistent results. Looking at another place where we update
two_phase_at, we acquire the spinlock:

SpinLockAcquire(&slot->mutex);
slot->data.confirmed_flush = ctx->reader->EndRecPtr;
if (slot->data.two_phase)
slot->data.two_phase_at = ctx->reader->EndRecPtr;
SpinLockRelease(&slot->mutex);

It seems to me an oversight of commit a8fd13cab0b. I've attached the
small patch to fix it.

Looks right to me, the paths updating the data related to the slots
are careful about that, even when it comes to fetching a slot from
MyReplicationSlot. I have been looking around the slot code to see if
there are other inconsistencies, and did not notice anything standing
out. Will fix..
--
Michael

#3Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#2)
Re: Spinlock is missing when updating two_phase of ReplicationSlot

On Wed, Jan 11, 2023 at 02:36:17PM +0900, Michael Paquier wrote:

Looks right to me, the paths updating the data related to the slots
are careful about that, even when it comes to fetching a slot from
MyReplicationSlot. I have been looking around the slot code to see if
there are other inconsistencies, and did not notice anything standing
out. Will fix..

And done, thanks!
--
Michael

#4Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#3)
Re: Spinlock is missing when updating two_phase of ReplicationSlot

On Thu, Jan 12, 2023 at 1:42 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Jan 11, 2023 at 02:36:17PM +0900, Michael Paquier wrote:

Looks right to me, the paths updating the data related to the slots
are careful about that, even when it comes to fetching a slot from
MyReplicationSlot. I have been looking around the slot code to see if
there are other inconsistencies, and did not notice anything standing
out. Will fix..

And done, thanks!

Thank you!

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com