Shmem queue is not flushed if receiver is not yet attached
Hello,
While testing on the current PG master, I noticed a problem between
backends communicating over a shared memory queue. I think `shm_mq_sendv()`
fails to flush the queue, even if `force_flush` is set to true, if the
receiver is not yet attached to the queue. This simple fix solves
the problem for me.
On another note, `shm_mq.h` declares `shm_mq_flush()`, but I don't see it
being implemented. Maybe just a leftover from the previous work? Though it
seems useful to implement that API.
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB: https://www.enterprisedb..com
Attachments:
0001-Flush-the-queue-even-if-receiver-has-not-attached.patchapplication/octet-stream; name=0001-Flush-the-queue-even-if-receiver-has-not-attached.patchDownload
From 4d6158f7d9829dd5a909ee44748d7670e82b9983 Mon Sep 17 00:00:00 2001
From: Pavan Deolasee <pavan.deolasee@gmail.com>
Date: Thu, 17 Mar 2022 12:37:11 +0530
Subject: [PATCH 1/1] Flush the queue even if receiver has not attached.
The new API introduced in PG15 delays flushing shared memoey queue
unless the caller has requested a `force_flush`. But if the
receiver has not yet attached to the queue, we fail to enforce
`force_flush` and return with success without actually flushing the
queue. This commit fixes the bug.
---
src/backend/storage/ipc/shm_mq.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index 603cf9b0fa7..b0450f68e66 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -528,8 +528,6 @@ shm_mq_sendv(shm_mq_handle *mqh, shm_mq_iovec *iov, int iovcnt, bool nowait,
SpinLockAcquire(&mq->mq_mutex);
receiver = mq->mq_receiver;
SpinLockRelease(&mq->mq_mutex);
- if (receiver == NULL)
- return SHM_MQ_SUCCESS;
mqh->mqh_counterparty_attached = true;
}
@@ -541,7 +539,9 @@ shm_mq_sendv(shm_mq_handle *mqh, shm_mq_iovec *iov, int iovcnt, bool nowait,
if (force_flush || mqh->mqh_send_pending > (mq->mq_ring_size >> 2))
{
shm_mq_inc_bytes_written(mq, mqh->mqh_send_pending);
- SetLatch(&receiver->procLatch);
+ /* If the receiver is attached, signal it. */
+ if (receiver != NULL)
+ SetLatch(&receiver->procLatch);
mqh->mqh_send_pending = 0;
}
--
2.30.1 (Apple Git-130)
On Thu, Mar 17, 2022 at 3:13 AM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
While testing on the current PG master, I noticed a problem between backends communicating over a shared memory queue. I think `shm_mq_sendv()` fails to flush the queue, even if `force_flush` is set to true, if the receiver is not yet attached to the queue. This simple fix solves the problem for me.
On another note, `shm_mq.h` declares `shm_mq_flush()`, but I don't see it being implemented. Maybe just a leftover from the previous work? Though it seems useful to implement that API.
I think that this patch is basically correct, except that it's not
correct to set mqh_counterparty_attached when receiver is still NULL.
Here's a v2 where I've attempted to correct that while preserving the
essence of your proposed fix.
I'm not sure that we need a shm_mq_flush(), but we definitely don't
have one currently, so I've also adjusted your patch to remove the
dead prototype.
Please let me know your thoughts on the attached.
Thanks,
--
Robert Haas
EDB: http://www.enterprisedb.com
Attachments:
v2-0001-shm_mq_sendv-Fix-flushing-bug-when-receiver-not-y.patchapplication/octet-stream; name=v2-0001-shm_mq_sendv-Fix-flushing-bug-when-receiver-not-y.patchDownload
From d74a7e53be060c05990ec9e931242ea8b4eaac16 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 24 May 2022 10:55:01 -0400
Subject: [PATCH v2] shm_mq_sendv: Fix flushing bug when receiver not yet
attached.
With the old logic, when the reciever had not yet attached, we would
never call shm_mq_inc_bytes_written(), even if force_flush = true
was specified. That could result in a situation where data that the
sender believes it has sent is never received.
Along the way, remove a useless function prototype for a nonexistent
function from shm_mq.h.
Commit 46846433a03dff4f2e08c8a161e54a842da360d6 introduced these
problems.
Pavan Deolasee, with a few changes by me.
---
src/backend/storage/ipc/shm_mq.c | 11 +++++------
src/include/storage/shm_mq.h | 1 -
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index 6139c622e0..8ca24de8d6 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -518,8 +518,7 @@ shm_mq_sendv(shm_mq_handle *mqh, shm_mq_iovec *iov, int iovcnt, bool nowait,
/*
* If the counterparty is known to have attached, we can read mq_receiver
- * without acquiring the spinlock and assume it isn't NULL. Otherwise,
- * more caution is needed.
+ * without acquiring the spinlock. Otherwise, more caution is needed.
*/
if (mqh->mqh_counterparty_attached)
receiver = mq->mq_receiver;
@@ -528,9 +527,8 @@ shm_mq_sendv(shm_mq_handle *mqh, shm_mq_iovec *iov, int iovcnt, bool nowait,
SpinLockAcquire(&mq->mq_mutex);
receiver = mq->mq_receiver;
SpinLockRelease(&mq->mq_mutex);
- if (receiver == NULL)
- return SHM_MQ_SUCCESS;
- mqh->mqh_counterparty_attached = true;
+ if (receiver != NULL)
+ mqh->mqh_counterparty_attached = true;
}
/*
@@ -541,7 +539,8 @@ shm_mq_sendv(shm_mq_handle *mqh, shm_mq_iovec *iov, int iovcnt, bool nowait,
if (force_flush || mqh->mqh_send_pending > (mq->mq_ring_size >> 2))
{
shm_mq_inc_bytes_written(mq, mqh->mqh_send_pending);
- SetLatch(&receiver->procLatch);
+ if (receiver != NULL)
+ SetLatch(&receiver->procLatch);
mqh->mqh_send_pending = 0;
}
diff --git a/src/include/storage/shm_mq.h b/src/include/storage/shm_mq.h
index f5220baa78..b6fe68725d 100644
--- a/src/include/storage/shm_mq.h
+++ b/src/include/storage/shm_mq.h
@@ -76,7 +76,6 @@ extern shm_mq_result shm_mq_sendv(shm_mq_handle *mqh, shm_mq_iovec *iov,
int iovcnt, bool nowait, bool force_flush);
extern shm_mq_result shm_mq_receive(shm_mq_handle *mqh,
Size *nbytesp, void **datap, bool nowait);
-extern void shm_mq_flush(shm_mq_handle *mqh);
/* Wait for our counterparty to attach to the queue. */
extern shm_mq_result shm_mq_wait_for_attach(shm_mq_handle *mqh);
--
2.24.3 (Apple Git-128)
On Tue, 24 May 2022 at 23:05, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 17, 2022 at 3:13 AM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
While testing on the current PG master, I noticed a problem between backends communicating over a shared memory queue. I think `shm_mq_sendv()` fails to flush the queue, even if `force_flush` is set to true, if the receiver is not yet attached to the queue. This simple fix solves the problem for me.
On another note, `shm_mq.h` declares `shm_mq_flush()`, but I don't see it being implemented. Maybe just a leftover from the previous work? Though it seems useful to implement that API.
I think that this patch is basically correct, except that it's not
correct to set mqh_counterparty_attached when receiver is still NULL.
Here's a v2 where I've attempted to correct that while preserving the
essence of your proposed fix.I'm not sure that we need a shm_mq_flush(), but we definitely don't
have one currently, so I've also adjusted your patch to remove the
dead prototype.Please let me know your thoughts on the attached.
Thanks,
Hi,
I have a problem that is also related to shmem queue [1]/messages/by-id/MEYP282MB1669C8D88F0997354C2313C1B6CA9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM, however, I cannot
reproduce it. How did you reproduce this problem?
[1]: /messages/by-id/MEYP282MB1669C8D88F0997354C2313C1B6CA9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Hi Robert,
On Tue, May 24, 2022 at 8:35 PM Robert Haas <robertmhaas@gmail.com> wrote:
I think that this patch is basically correct, except that it's not
correct to set mqh_counterparty_attached when receiver is still NULL.
Here's a v2 where I've attempted to correct that while preserving the
essence of your proposed fix.
This looks good to me,
I'm not sure that we need a shm_mq_flush(), but we definitely don't
have one currently, so I've also adjusted your patch to remove the
dead prototype.
Makes sense to me.
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB: https://www.enterprisedb..com
On Wed, May 25, 2022 at 7:01 AM Japin Li <japinli@hotmail.com> wrote:
I have a problem that is also related to shmem queue [1], however, I cannot
reproduce it. How did you reproduce this problem?
I discovered this bug while working on an extension that makes use of the
shared memory queue facility. Not sure how useful that is for your purpose.
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB: https://www.enterprisedb..com
On Mon, May 30, 2022 at 3:06 AM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
I think that this patch is basically correct, except that it's not
correct to set mqh_counterparty_attached when receiver is still NULL.
Here's a v2 where I've attempted to correct that while preserving the
essence of your proposed fix.This looks good to me,
Thanks for checking. Committed.
--
Robert Haas
EDB: http://www.enterprisedb.com