Introduce timeout capability for ConditionVariableSleep

Started by Shawn Debnathalmost 7 years ago16 messages
#1Shawn Debnath
sdn@amazon.com
1 attachment(s)

Hello,

Postgres today doesn't support waiting for a condition variable with a
timeout, although the framework it relies upon, does. This change wraps
the existing ConditionVariableSleep functionality and introduces a new
API, ConditionVariableTimedSleep, to allow callers to specify a timeout
value.

A scenario that highlights this use case is a backend is waiting on
status update from multiple workers but needs to time out if that signal
doesn't arrive within a certain period. There was a workaround prior
to aced5a92, but with that change, the semantics are now different.

I chose to go with -1 instead of 0 for the return from
ConditionVariableTimedSleep to indicate timeout error as it seems
cleaner for this API. WaitEventSetWaitBlock returns -1 for timeout but
WaitEventSetWait treats timeout as 0 (to represent 0 events indicating
timeout).

If there's an alternative, cleaner way to achieve this outcome, I am all
ears.

Thanks.

--
Shawn Debnath
Amazon Web Services (AWS)

Attachments:

0001-Introduce-timeout-capability-for-ConditionVariableSleep-v1.patchtext/plain; charset=us-asciiDownload
From c1b332826070de4782b12bbb577d66406d439c8d Mon Sep 17 00:00:00 2001
From: Shawn Debnath <sdn@amazon.com>
Date: Tue, 12 Mar 2019 23:21:41 +0000
Subject: [PATCH] Introduce timeout capability for ConditionVariableSleep

This patch introduces ConditionVariableTimedSleep to enable timing out
of waiting for a condition variable to get signalled.
---
 src/backend/storage/lmgr/condition_variable.c | 41 +++++++++++++++++++++++++--
 src/include/storage/condition_variable.h      |  2 ++
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 58b7b51472..a0e14a5efa 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -121,9 +121,33 @@ ConditionVariablePrepareToSleep(ConditionVariable *cv)
  */
 void
 ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
+{
+	(void) ConditionVariableTimedSleep(cv, -1 /* no timeout */, wait_event_info);
+}
+
+/*
+ * Wait for the given condition variable to be signaled or till timeout.
+ * This should be called in a predicate loop that tests for a specific exit
+ * condition and otherwise sleeps, like so:
+ *
+ *	 ConditionVariablePrepareToSleep(cv);  // optional
+ *	 while (condition for which we are waiting is not true)
+ *		 ConditionVariableSleep(cv, wait_event_info);
+ *	 ConditionVariableCancelSleep();
+ *
+ * wait_event_info should be a value from one of the WaitEventXXX enums
+ * defined in pgstat.h.  This controls the contents of pg_stat_activity's
+ * wait_event_type and wait_event columns while waiting.
+ *
+ * Returns 0 or -1 if timed out.
+ */
+int
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+							uint32 wait_event_info)
 {
 	WaitEvent	event;
 	bool		done = false;
+	int			ret;
 
 	/*
 	 * If the caller didn't prepare to sleep explicitly, then do so now and
@@ -143,7 +167,7 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 	if (cv_sleep_target != cv)
 	{
 		ConditionVariablePrepareToSleep(cv);
-		return;
+		return 0;
 	}
 
 	do
@@ -154,12 +178,23 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 		 * Wait for latch to be set.  (If we're awakened for some other
 		 * reason, the code below will cope anyway.)
 		 */
-		(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+		ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
 								wait_event_info);
 
 		/* Reset latch before examining the state of the wait list. */
 		ResetLatch(MyLatch);
 
+		/* Timeout */
+		if (ret == 0 && timeout >= 0)
+		{
+			/*
+			 * In the event of a timeout, we simply return and the caller
+			 * calls ConditionVariableCancelSleep to remove themselves from the
+			 * wait queue
+			 */
+			return -1;
+		}
+
 		/*
 		 * If this process has been taken out of the wait list, then we know
 		 * that it has been signaled by ConditionVariableSignal (or
@@ -183,6 +218,8 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 		}
 		SpinLockRelease(&cv->mutex);
 	} while (!done);
+
+	return 0;
 }
 
 /*
diff --git a/src/include/storage/condition_variable.h b/src/include/storage/condition_variable.h
index 2a0249392c..f62164e483 100644
--- a/src/include/storage/condition_variable.h
+++ b/src/include/storage/condition_variable.h
@@ -43,6 +43,8 @@ extern void ConditionVariableInit(ConditionVariable *cv);
  * the condition variable.
  */
 extern void ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info);
+extern int ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+										uint32 wait_event_info);
 extern void ConditionVariableCancelSleep(void);
 
 /*
-- 
2.16.5

#2Thomas Munro
thomas.munro@gmail.com
In reply to: Shawn Debnath (#1)
Re: Introduce timeout capability for ConditionVariableSleep

Hi Shawn,

On Wed, Mar 13, 2019 at 12:25 PM Shawn Debnath <sdn@amazon.com> wrote:

Postgres today doesn't support waiting for a condition variable with a
timeout, although the framework it relies upon, does. This change wraps
the existing ConditionVariableSleep functionality and introduces a new
API, ConditionVariableTimedSleep, to allow callers to specify a timeout
value.

Seems reasonable, I think, and should be familiar to anyone used to
well known multithreading libraries.

+/*
+ * Wait for the given condition variable to be signaled or till timeout.
+ * This should be called in a predicate loop that tests for a specific exit
+ * condition and otherwise sleeps, like so:
+ *
+ *     ConditionVariablePrepareToSleep(cv);  // optional
+ *     while (condition for which we are waiting is not true)
+ *         ConditionVariableSleep(cv, wait_event_info);
+ *     ConditionVariableCancelSleep();
+ *
+ * wait_event_info should be a value from one of the WaitEventXXX enums
+ * defined in pgstat.h.  This controls the contents of pg_stat_activity's
+ * wait_event_type and wait_event columns while waiting.
+ *
+ * Returns 0 or -1 if timed out.
+ */
+int
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+                            uint32 wait_event_info)

Can we just refer to the other function's documentation for this? I
don't want two copies of this blurb (and this copy-paste already
failed to include "Timed" in the example function name).

One difference compared to pthread_cond_timedwait() is that pthread
uses an absolute time and here you use a relative time (as we do in
WaitEventSet API). The first question is which makes a better API,
and the second is what the semantics of a relative timeout should be:
start again every time we get a spurious wake-up, or track time
already waited? Let's see...

-        (void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+        ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
                                 wait_event_info);

Here you're restarting the timeout clock every time through the loop
without adjustment, and I think that's not a good choice: spurious
wake-ups cause bonus waiting.

--
Thomas Munro
https://enterprisedb.com

#3Shawn Debnath
sdn@amazon.com
In reply to: Thomas Munro (#2)
1 attachment(s)
Re: Introduce timeout capability for ConditionVariableSleep

Hi Thomas,

Thanks for reviewing!

On Wed, Mar 13, 2019 at 12:40:57PM +1300, Thomas Munro wrote:

Can we just refer to the other function's documentation for this? I
don't want two copies of this blurb (and this copy-paste already
failed to include "Timed" in the example function name).

Hah - point well taken. Fixed.

One difference compared to pthread_cond_timedwait() is that pthread
uses an absolute time and here you use a relative time (as we do in
WaitEventSet API). The first question is which makes a better API,
and the second is what the semantics of a relative timeout should be:
start again every time we get a spurious wake-up, or track time
already waited? Let's see...

-        (void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+        ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
wait_event_info);

Here you're restarting the timeout clock every time through the loop
without adjustment, and I think that's not a good choice: spurious
wake-ups cause bonus waiting.

Agree. In my testing WaitEventSetWait did the work for calculating the
right timeout remaining. It's a bummer that we have to repeat the same
pattern at the ConditionVariableTimedSleep() but I guess anyone who
loops in such cases will have to adjust their values accordingly.

BTW, I am curious why Andres in 98a64d0bd71 didn't just create an
artificial event with WL_TIMEOUT and return that from
WaitEventSetWait(). Would have made it cleaner than checking checking
return values for -1.

Updated v2 patch attached.

--
Shawn Debnath
Amazon Web Services (AWS)

Attachments:

0001-Introduce-timeout-capability-for-ConditionVariableSleep-v2.patchtext/plain; charset=us-asciiDownload
From 878462c4e05217a6c13107811a676948bb7f2a65 Mon Sep 17 00:00:00 2001
From: Shawn Debnath <sdn@amazon.com>
Date: Tue, 12 Mar 2019 23:21:41 +0000
Subject: [PATCH] Introduce timeout capability for ConditionVariableSleep

This patch introduces ConditionVariableTimedSleep to enable timing out
of waiting for a condition variable to get signalled.
---
 src/backend/storage/lmgr/condition_variable.c | 66 ++++++++++++++++++++++++++-
 src/include/storage/condition_variable.h      |  2 +
 2 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 58b7b51472..3afb726131 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -19,6 +19,7 @@
 #include "postgres.h"
 
 #include "miscadmin.h"
+#include "portability/instr_time.h"
 #include "storage/condition_variable.h"
 #include "storage/ipc.h"
 #include "storage/proc.h"
@@ -121,9 +122,31 @@ ConditionVariablePrepareToSleep(ConditionVariable *cv)
  */
 void
 ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
+{
+	(void) ConditionVariableTimedSleep(cv, -1 /* no timeout */, wait_event_info);
+}
+
+/*
+ * Wait for the given condition variable to be signaled or till timeout.
+ *
+ * In the event of a timeout, we simply return and the caller
+ * calls ConditionVariableCancelSleep to remove themselves from the
+ * wait queue. See ConditionVariableSleep() for notes on how to correctly check
+ * for the exit condition.
+ *
+ * Returns 0, or -1 if timed out.
+ */
+int
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+							uint32 wait_event_info)
 {
 	WaitEvent	event;
 	bool		done = false;
+	int			rc;
+	int			ret = 0;
+	long		rem_timeout = -1;
+	instr_time	start_time;
+	instr_time	cur_time;
 
 	/*
 	 * If the caller didn't prepare to sleep explicitly, then do so now and
@@ -143,7 +166,18 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 	if (cv_sleep_target != cv)
 	{
 		ConditionVariablePrepareToSleep(cv);
-		return;
+		return ret;
+	}
+
+	/*
+	 * Track the current time so that we can calculate the remaining timeout
+	 * if we are woken up spuriously.
+	 */
+	if (timeout >= 0)
+	{
+		INSTR_TIME_SET_CURRENT(start_time);
+		Assert(timeout >= 0 && timeout <= INT_MAX);
+		rem_timeout = timeout;
 	}
 
 	do
@@ -154,12 +188,19 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 		 * Wait for latch to be set.  (If we're awakened for some other
 		 * reason, the code below will cope anyway.)
 		 */
-		(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+		rc = WaitEventSetWait(cv_wait_event_set, rem_timeout, &event, 1,
 								wait_event_info);
 
 		/* Reset latch before examining the state of the wait list. */
 		ResetLatch(MyLatch);
 
+		/* Timed out */
+		if (rc == 0 && timeout >= 0)
+		{
+			ret = -1;	/* timeout */
+			break;
+		}
+
 		/*
 		 * If this process has been taken out of the wait list, then we know
 		 * that it has been signaled by ConditionVariableSignal (or
@@ -179,10 +220,31 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 		if (!proclist_contains(&cv->wakeup, MyProc->pgprocno, cvWaitLink))
 		{
 			done = true;
+			Assert(ret == 0);
 			proclist_push_tail(&cv->wakeup, MyProc->pgprocno, cvWaitLink);
 		}
 		SpinLockRelease(&cv->mutex);
+
+		/* If we're not done, update rem_timeout for next iteration */
+		if (!done && timeout >= 0)
+		{
+			INSTR_TIME_SET_CURRENT(cur_time);
+			INSTR_TIME_SUBTRACT(cur_time, start_time);
+			rem_timeout = timeout - (long) INSTR_TIME_GET_MILLISEC(cur_time);
+
+			/*
+			 * Check if we have reached the timeout threshold after calculating
+			 * the remaining timeout value.
+			 */
+			if (rem_timeout <= 0)
+			{
+				ret = -1;	/* timeout */
+				break;
+			}
+		}
 	} while (!done);
+
+	return ret;
 }
 
 /*
diff --git a/src/include/storage/condition_variable.h b/src/include/storage/condition_variable.h
index 2a0249392c..f62164e483 100644
--- a/src/include/storage/condition_variable.h
+++ b/src/include/storage/condition_variable.h
@@ -43,6 +43,8 @@ extern void ConditionVariableInit(ConditionVariable *cv);
  * the condition variable.
  */
 extern void ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info);
+extern int ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+										uint32 wait_event_info);
 extern void ConditionVariableCancelSleep(void);
 
 /*
-- 
2.16.5

#4Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Shawn Debnath (#3)
1 attachment(s)
Re: Introduce timeout capability for ConditionVariableSleep

Hello.

At Tue, 12 Mar 2019 17:53:43 -0700, Shawn Debnath <sdn@amazon.com> wrote in <20190313005342.GA8301@f01898859afd.ant.amazon.com>

Hi Thomas,

Thanks for reviewing!

On Wed, Mar 13, 2019 at 12:40:57PM +1300, Thomas Munro wrote:

Can we just refer to the other function's documentation for this? I
don't want two copies of this blurb (and this copy-paste already
failed to include "Timed" in the example function name).

Hah - point well taken. Fixed.

One difference compared to pthread_cond_timedwait() is that pthread
uses an absolute time and here you use a relative time (as we do in
WaitEventSet API). The first question is which makes a better API,
and the second is what the semantics of a relative timeout should be:
start again every time we get a spurious wake-up, or track time
already waited? Let's see...

-        (void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+        ret = WaitEventSetWait(cv_wait_event_set, timeout, &event, 1,
wait_event_info);

Here you're restarting the timeout clock every time through the loop
without adjustment, and I think that's not a good choice: spurious
wake-ups cause bonus waiting.

Agree. In my testing WaitEventSetWait did the work for calculating the
right timeout remaining. It's a bummer that we have to repeat the same
pattern at the ConditionVariableTimedSleep() but I guess anyone who
loops in such cases will have to adjust their values accordingly.

I think so, too. And actually the duplicate timeout calculation
doesn't seem good. We could eliminate the duplicate by allowing
WaitEventSetWait to exit by unwanted events, something like the
attached.

BTW, I am curious why Andres in 98a64d0bd71 didn't just create an
artificial event with WL_TIMEOUT and return that from
WaitEventSetWait(). Would have made it cleaner than checking checking
return values for -1.

Maybe because it is not a kind of WaitEvent, so it naturally
cannot be a part of occurred_events.

# By the way, you can obtain a short hash of a commit by git
# rev-parse --short <full hash>.

Updated v2 patch attached.

I'd like to comment on the patch.

+ * In the event of a timeout, we simply return and the caller
+ * calls ConditionVariableCancelSleep to remove themselves from the
+ * wait queue. See ConditionVariableSleep() for notes on how to correctly check
+ * for the exit condition.
+ *
+ * Returns 0, or -1 if timed out.

Maybe this could be more simpler, that like:

* ConditionVariableTimedSleep - allows us to specify timeout
*
* If timeout = =1, block until the condition is satisfied.
*
* Returns -1 when timeout expires, otherwise returns 0.
*
* See ConditionVariableSleep() for general behavior and usage.

+int
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,

Counldn't the two-state return value be a boolean?

+ int ret = 0;

As a general coding convention, we are not to give such a generic
name for a variable with such a long life if avoidable. In the
case the 'ret' could be 'timeout_fired' or something and it would
be far verbose.

+ if (rc == 0 && timeout >= 0)

WaitEventSetWait returns 0 only in the case of timeout
expiration, so the second term is useless. Just setting ret to
-1 and break seems to me almost the same with "goto". The reason
why the existing ConditionVariableSleep uses do {} while(done) is
that it is straightforwad. Timeout added everal exit point in the
loop so it's make the loop rather complex by going around with
the variable. Whole the loop could be in the following more flat
shape.

while (true)
{
CHECK_FOR_INTERRUPTS();
rc = WaitEventSetWait();
ResetLatch();

/* timeout expired, return */
if (rc == 0) return -1;
SpinLockAcquire();
if (!proclist...)
{
done = true;
}
SpinLockRelease();

/* condition satisfied, return */
if (done) return 0;

/* if we're here, we should wait for the remaining time */
INSTR_TIME_SET_CURRENT()
...
}

+ Assert(ret == 0);

I don't see a point in the assertion so much.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

xxxx_WaitEventSetWait_tweak.patchtext/x-patch; charset=us-asciiDownload
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 59fa917ae0..abead3294e 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -950,7 +950,7 @@ WaitEventAdjustWin32(WaitEventSet *set, WaitEvent *event)
 int
 WaitEventSetWait(WaitEventSet *set, long timeout,
 				 WaitEvent *occurred_events, int nevents,
-				 uint32 wait_event_info)
+				 uint32 wait_event_info, bool wait_for_timeout)
 {
 	int			returned_events = 0;
 	instr_time	start_time;
@@ -965,7 +965,8 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
 	 */
 	if (timeout >= 0)
 	{
-		INSTR_TIME_SET_CURRENT(start_time);
+		if (wait_for_timeout)
+			INSTR_TIME_SET_CURRENT(start_time);
 		Assert(timeout >= 0 && timeout <= INT_MAX);
 		cur_timeout = timeout;
 	}
@@ -1038,6 +1039,9 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
 		/* If we're not done, update cur_timeout for next iteration */
 		if (returned_events == 0 && timeout >= 0)
 		{
+			if (!wait_for_timeout)
+				return 0;
+
 			INSTR_TIME_SET_CURRENT(cur_time);
 			INSTR_TIME_SUBTRACT(cur_time, start_time);
 			cur_timeout = timeout - (long) INSTR_TIME_GET_MILLISEC(cur_time);
#5Shawn Debnath
sdn@amazon.com
In reply to: Kyotaro HORIGUCHI (#4)
1 attachment(s)
Re: Introduce timeout capability for ConditionVariableSleep

Thank you reviewing. Comments inline.

On Wed, Mar 13, 2019 at 05:24:15PM +0900, Kyotaro HORIGUCHI wrote:

Agree. In my testing WaitEventSetWait did the work for calculating
the right timeout remaining. It's a bummer that we have to repeat
the same pattern at the ConditionVariableTimedSleep() but I guess
anyone who loops in such cases will have to adjust their values
accordingly.

I think so, too. And actually the duplicate timeout calculation
doesn't seem good. We could eliminate the duplicate by allowing
WaitEventSetWait to exit by unwanted events, something like the
attached.

After thinking about this more, I see WaitEventSetWait()'s contract as
wait for an event or timeout if no events are received in that time
frame. Although ConditionVariableTimedSleep() is also using the same
word, I believe the semantics are different. The timeout period in
ConditionVariableTimedSleep() is intended to limit the time we wait
until removal from the wait queue. Whereas, in the case of
WaitEventSetWait, the timeout period is intended to limit the time the
caller waits till the first event.

Hence, I believe the code is correct as is and we shouldn't change the
contract for WaitEventSetWait.

BTW, I am curious why Andres in 98a64d0bd71 didn't just create an
artificial event with WL_TIMEOUT and return that from
WaitEventSetWait(). Would have made it cleaner than checking checking
return values for -1.

Maybe because it is not a kind of WaitEvent, so it naturally
cannot be a part of occurred_events.

Hmm, I don't agree with that completely. One could argue that the
backend is waiting for any event in order to be woken up, including a
timeout event.

# By the way, you can obtain a short hash of a commit by git
# rev-parse --short <full hash>.

Good to know! :-) Luckily git is smart enough to still match it to the
correct commit.

Updated v2 patch attached.

I'd like to comment on the patch.

+ * In the event of a timeout, we simply return and the caller
+ * calls ConditionVariableCancelSleep to remove themselves from the
+ * wait queue. See ConditionVariableSleep() for notes on how to correctly check
+ * for the exit condition.
+ *
+ * Returns 0, or -1 if timed out.

Maybe this could be more simpler, that like:

* ConditionVariableTimedSleep - allows us to specify timeout
*
* If timeout = =1, block until the condition is satisfied.
*
* Returns -1 when timeout expires, otherwise returns 0.
*
* See ConditionVariableSleep() for general behavior and usage.

Agree. Changed to:

* Wait for the given condition variable to be signaled or till timeout.
*
* Returns -1 when timeout expires, otherwise returns 0.
*
* See ConditionVariableSleep() for general usage.

+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,

Counldn't the two-state return value be a boolean?

I wanted to leave the option open to use the positive integers for other
purposes but you are correct, bool suffices for now. If needed, we can
change it in the future.

+ int ret = 0;

As a general coding convention, we are not to give such a generic
name for a variable with such a long life if avoidable. In the
case the 'ret' could be 'timeout_fired' or something and it would
be far verbose.

+ if (rc == 0 && timeout >= 0)

WaitEventSetWait returns 0 only in the case of timeout
expiration, so the second term is useless. Just setting ret to
-1 and break seems to me almost the same with "goto". The reason
why the existing ConditionVariableSleep uses do {} while(done) is
that it is straightforwad. Timeout added everal exit point in the
loop so it's make the loop rather complex by going around with
the variable. Whole the loop could be in the following more flat
shape.

while (true)
{
CHECK_FOR_INTERRUPTS();
rc = WaitEventSetWait();
ResetLatch();

/* timeout expired, return */
if (rc == 0) return -1;
SpinLockAcquire();
if (!proclist...)
{
done = true;
}
SpinLockRelease();

/* condition satisfied, return */
if (done) return 0;

/* if we're here, we should wait for the remaining time */
INSTR_TIME_SET_CURRENT()
...
}

Agree. The timeout did complicate the logic for a single variable to
track the exit condition. Adopted the approach above.

+ Assert(ret == 0);

I don't see a point in the assertion so much.

Being overly verbose. Removed.

--
Shawn Debnath
Amazon Web Services (AWS)

Attachments:

0001-Introduce-timeout-capability-for-ConditionVariableSleep-v3.patchtext/plain; charset=us-asciiDownload
From cca6894d4b070ffca2922e333cdc9bbf03c0d3c2 Mon Sep 17 00:00:00 2001
From: Shawn Debnath <sdn@amazon.com>
Date: Tue, 12 Mar 2019 23:21:41 +0000
Subject: [PATCH] Introduce timeout capability for ConditionVariableSleep

This patch introduces ConditionVariableTimedSleep to enable timing out
of waiting for a condition variable to get signalled.
---
 src/backend/storage/lmgr/condition_variable.c | 63 ++++++++++++++++++++++++---
 src/include/storage/condition_variable.h      |  2 +
 2 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 58b7b51472..57a8098d90 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -19,6 +19,7 @@
 #include "postgres.h"
 
 #include "miscadmin.h"
+#include "portability/instr_time.h"
 #include "storage/condition_variable.h"
 #include "storage/ipc.h"
 #include "storage/proc.h"
@@ -122,8 +123,23 @@ ConditionVariablePrepareToSleep(ConditionVariable *cv)
 void
 ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 {
-	WaitEvent	event;
-	bool		done = false;
+	(void) ConditionVariableTimedSleep(cv, -1 /* no timeout */, wait_event_info);
+}
+
+/*
+ * Wait for the given condition variable to be signaled or till timeout.
+ *
+ * Returns true when timeout expires, otherwise returns false.
+ *
+ * See ConditionVariableSleep() for general usage.
+ */
+bool
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+							uint32 wait_event_info)
+{
+	long		rem_timeout = -1;
+	instr_time	start_time;
+	instr_time	cur_time;
 
 	/*
 	 * If the caller didn't prepare to sleep explicitly, then do so now and
@@ -143,23 +159,42 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 	if (cv_sleep_target != cv)
 	{
 		ConditionVariablePrepareToSleep(cv);
-		return;
+		return false;
 	}
 
-	do
+	/*
+	 * Track the current time so that we can calculate the remaining timeout
+	 * if we are woken up spuriously.
+	 */
+	if (timeout >= 0)
 	{
+		INSTR_TIME_SET_CURRENT(start_time);
+		Assert(timeout >= 0 && timeout <= INT_MAX);
+		rem_timeout = timeout;
+	}
+
+	while (true)
+	{
+		WaitEvent	event;
+		bool		done = false;
+		int 		rc;
+
 		CHECK_FOR_INTERRUPTS();
 
 		/*
 		 * Wait for latch to be set.  (If we're awakened for some other
 		 * reason, the code below will cope anyway.)
 		 */
-		(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+		rc = WaitEventSetWait(cv_wait_event_set, rem_timeout, &event, 1,
 								wait_event_info);
 
 		/* Reset latch before examining the state of the wait list. */
 		ResetLatch(MyLatch);
 
+		/* Timed out */
+		if (rc == 0)
+			return true;
+
 		/*
 		 * If this process has been taken out of the wait list, then we know
 		 * that it has been signaled by ConditionVariableSignal (or
@@ -182,7 +217,23 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 			proclist_push_tail(&cv->wakeup, MyProc->pgprocno, cvWaitLink);
 		}
 		SpinLockRelease(&cv->mutex);
-	} while (!done);
+
+		/* We were signaled, so return */
+		if (done)
+			return false;
+
+		/* If we're not done, update rem_timeout for next iteration */
+		if (timeout >= 0)
+		{
+			INSTR_TIME_SET_CURRENT(cur_time);
+			INSTR_TIME_SUBTRACT(cur_time, start_time);
+			rem_timeout = timeout - (long) INSTR_TIME_GET_MILLISEC(cur_time);
+
+			/* Have we crossed the timeout threshold? */
+			if (rem_timeout <= 0)
+				return true;
+		}
+	}
 }
 
 /*
diff --git a/src/include/storage/condition_variable.h b/src/include/storage/condition_variable.h
index 2a0249392c..ee06e051ce 100644
--- a/src/include/storage/condition_variable.h
+++ b/src/include/storage/condition_variable.h
@@ -43,6 +43,8 @@ extern void ConditionVariableInit(ConditionVariable *cv);
  * the condition variable.
  */
 extern void ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info);
+extern bool ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+										uint32 wait_event_info);
 extern void ConditionVariableCancelSleep(void);
 
 /*
-- 
2.16.5

#6Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Shawn Debnath (#5)
Re: Introduce timeout capability for ConditionVariableSleep

Hello.

At Thu, 14 Mar 2019 17:26:11 -0700, Shawn Debnath <sdn@amazon.com> wrote in <20190315002611.GA1119@f01898859afd.ant.amazon.com>

Thank you reviewing. Comments inline.

On Wed, Mar 13, 2019 at 05:24:15PM +0900, Kyotaro HORIGUCHI wrote:

Agree. In my testing WaitEventSetWait did the work for calculating
the right timeout remaining. It's a bummer that we have to repeat
the same pattern at the ConditionVariableTimedSleep() but I guess
anyone who loops in such cases will have to adjust their values
accordingly.

I think so, too. And actually the duplicate timeout calculation
doesn't seem good. We could eliminate the duplicate by allowing
WaitEventSetWait to exit by unwanted events, something like the
attached.

After thinking about this more, I see WaitEventSetWait()'s contract as
wait for an event or timeout if no events are received in that time

Sure.

frame. Although ConditionVariableTimedSleep() is also using the same
word, I believe the semantics are different. The timeout period in
ConditionVariableTimedSleep() is intended to limit the time we wait
until removal from the wait queue. Whereas, in the case of
WaitEventSetWait, the timeout period is intended to limit the time the
caller waits till the first event.

Mmm. The two look the same to me.. Timeout means for both that
"Wait for one of these events or timeout expiration to
occur". Removal from waiting queue is just a subtask of exiting
from waiting state.

The "don't exit until timeout expires unless any expected events
occur" part is to be done in the uppermost layer so it is enough
that the lower layer does just "exit when something
happened". This is the behavior of WaitEventSetWaitBlock for
WaitEventSetWait. My proposal is giving callers capabliy to tell
WaitEventSetWait not to perform useless timeout contination.

Hence, I believe the code is correct as is and we shouldn't change the
contract for WaitEventSetWait.

BTW, I am curious why Andres in 98a64d0bd71 didn't just create an
artificial event with WL_TIMEOUT and return that from
WaitEventSetWait(). Would have made it cleaner than checking checking
return values for -1.

Maybe because it is not a kind of WaitEvent, so it naturally
cannot be a part of occurred_events.

Hmm, I don't agree with that completely. One could argue that the
backend is waiting for any event in order to be woken up, including a
timeout event.

Right, I understand that. I didn't mean that it is the right
design for everyone. Just meant that it is in that shape. (And I
rather like it.)

latch.h:127
#define WL_TIMEOUT (1 << 3) /* not for WaitEventSetWait() */

We can make it one of the events for WaitEventSetWait, but I
don't see such a big point on that, and also that doesn't this
patch better in any means.

# By the way, you can obtain a short hash of a commit by git
# rev-parse --short <full hash>.

Good to know! :-) Luckily git is smart enough to still match it to the
correct commit.

And too complex so that infrequent usage easily get out from my
brain:(

Updated v2 patch attached.

Thank you . It looks fine execpt the above point. But still I
have some questions on it. (the reason for they not being
comments is that they are about wordings..).

+     * Track the current time so that we can calculate the remaining timeout
+     * if we are woken up spuriously.

I think tha "track" means chasing a moving objects. So it might
be bettter that it is record or something?

* Wait for the given condition variable to be signaled or till timeout.
*
* Returns -1 when timeout expires, otherwise returns 0.
*
* See ConditionVariableSleep() for general usage.

+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,

Counldn't the two-state return value be a boolean?

I wanted to leave the option open to use the positive integers for other
purposes but you are correct, bool suffices for now. If needed, we can
change it in the future.

Yes, we can do that after we found it needed.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#7Shawn Debnath
sdn@amazon.com
In reply to: Kyotaro HORIGUCHI (#6)
Re: Introduce timeout capability for ConditionVariableSleep

On Fri, Mar 15, 2019 at 02:15:17PM +0900, Kyotaro HORIGUCHI wrote:

After thinking about this more, I see WaitEventSetWait()'s contract
as wait for an event or timeout if no events are received in that
time

Sure.

frame. Although ConditionVariableTimedSleep() is also using the same
word, I believe the semantics are different. The timeout period in
ConditionVariableTimedSleep() is intended to limit the time we wait
until removal from the wait queue. Whereas, in the case of
WaitEventSetWait, the timeout period is intended to limit the time the
caller waits till the first event.

Mmm. The two look the same to me.. Timeout means for both that
"Wait for one of these events or timeout expiration to
occur". Removal from waiting queue is just a subtask of exiting
from waiting state.

The "don't exit until timeout expires unless any expected events
occur" part is to be done in the uppermost layer so it is enough
that the lower layer does just "exit when something
happened".

Agree with the fact that lower layers should return and let the upper
layer determine and filter events as needed.

This is the behavior of WaitEventSetWaitBlock for
WaitEventSetWait. My proposal is giving callers capabliy to tell
WaitEventSetWait not to perform useless timeout contination.

This is where I disagree. WaitEventSetWait needs its own loop and
timeout calculation because WaitEventSetWaitBlock can return when EINTR
is received. This gets filtered in WaitEventSetWait and doesn't bubble
up by design. Since it's involved in filtering events, it now also has
to manage the timeout value. ConditionVariableTimedSleep being at a
higher level, and waitng for certain events that the lower layers are
unaware of, also shares the timeout management reponsibility.

Do note that there is no performance impact of having multiple timeout
loops. The current design allows for each layer to filter events and
hence per layer timeout management seems fine. If one would want to
avoid this, perhaps we need to introduce a non-static version of
WaitEventSetWaitBlock and call that directly. But that of course is
beyond this patch.

Thank you . It looks fine execpt the above point. But still I
have some questions on it. (the reason for they not being
comments is that they are about wordings..).

+     * Track the current time so that we can calculate the remaining timeout
+     * if we are woken up spuriously.

I think tha "track" means chasing a moving objects. So it might
be bettter that it is record or something?

* Wait for the given condition variable to be signaled or till timeout.
*
* Returns -1 when timeout expires, otherwise returns 0.
*
* See ConditionVariableSleep() for general usage.

+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,

Counldn't the two-state return value be a boolean?

I will change it to Record in the next iteration of the patch.

--
Shawn Debnath
Amazon Web Services (AWS)

#8Shawn Debnath
sdn@amazon.com
In reply to: Shawn Debnath (#7)
1 attachment(s)
Re: Introduce timeout capability for ConditionVariableSleep

On Sat, Mar 16, 2019 at 03:27:17PM -0700, Shawn Debnath wrote:

+     * Track the current time so that we can calculate the 
remaining timeout
+     * if we are woken up spuriously.

I think tha "track" means chasing a moving objects. So it might
be bettter that it is record or something?

* Wait for the given condition variable to be signaled or till timeout.
*
* Returns -1 when timeout expires, otherwise returns 0.
*
* See ConditionVariableSleep() for general usage.

+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,

Counldn't the two-state return value be a boolean?

I will change it to Record in the next iteration of the patch.

Posting rebased and updated patch. Changed the word 'Track' to 'Record'
and also changed variable name rem_timeout to cur_timeout to match
naming in other use cases.

--
Shawn Debnath
Amazon Web Services (AWS)

Attachments:

0001-Introduce-timeout-capability-for-ConditionVariableSleep-v4.patchtext/plain; charset=us-asciiDownload
From 537e879a66f7d59333ede62e8f7c42fec4006a31 Mon Sep 17 00:00:00 2001
From: Shawn Debnath <sdn@amazon.com>
Date: Tue, 12 Mar 2019 23:21:41 +0000
Subject: [PATCH] Introduce timeout capability for ConditionVariableSleep

This patch introduces ConditionVariableTimedSleep to enable timing out
of waiting for a condition variable to get signalled.
---
 src/backend/storage/lmgr/condition_variable.c | 63 ++++++++++++++++++++++++---
 src/include/storage/condition_variable.h      |  2 +
 2 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 58b7b51472..390072451c 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -19,6 +19,7 @@
 #include "postgres.h"
 
 #include "miscadmin.h"
+#include "portability/instr_time.h"
 #include "storage/condition_variable.h"
 #include "storage/ipc.h"
 #include "storage/proc.h"
@@ -122,8 +123,23 @@ ConditionVariablePrepareToSleep(ConditionVariable *cv)
 void
 ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 {
-	WaitEvent	event;
-	bool		done = false;
+	(void) ConditionVariableTimedSleep(cv, -1 /* no timeout */, wait_event_info);
+}
+
+/*
+ * Wait for the given condition variable to be signaled or till timeout.
+ *
+ * Returns true when timeout expires, otherwise returns false.
+ *
+ * See ConditionVariableSleep() for general usage.
+ */
+bool
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+							uint32 wait_event_info)
+{
+	long		cur_timeout = -1;
+	instr_time	start_time;
+	instr_time	cur_time;
 
 	/*
 	 * If the caller didn't prepare to sleep explicitly, then do so now and
@@ -143,23 +159,42 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 	if (cv_sleep_target != cv)
 	{
 		ConditionVariablePrepareToSleep(cv);
-		return;
+		return false;
 	}
 
-	do
+	/*
+	 * Record the current time so that we can calculate the remaining timeout
+	 * if we are woken up spuriously.
+	 */
+	if (timeout >= 0)
 	{
+		INSTR_TIME_SET_CURRENT(start_time);
+		Assert(timeout >= 0 && timeout <= INT_MAX);
+		cur_timeout = timeout;
+	}
+
+	while (true)
+	{
+		WaitEvent	event;
+		bool		done = false;
+		int 		rc;
+
 		CHECK_FOR_INTERRUPTS();
 
 		/*
 		 * Wait for latch to be set.  (If we're awakened for some other
 		 * reason, the code below will cope anyway.)
 		 */
-		(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
+		rc = WaitEventSetWait(cv_wait_event_set, cur_timeout, &event, 1,
 								wait_event_info);
 
 		/* Reset latch before examining the state of the wait list. */
 		ResetLatch(MyLatch);
 
+		/* Timed out */
+		if (rc == 0)
+			return true;
+
 		/*
 		 * If this process has been taken out of the wait list, then we know
 		 * that it has been signaled by ConditionVariableSignal (or
@@ -182,7 +217,23 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 			proclist_push_tail(&cv->wakeup, MyProc->pgprocno, cvWaitLink);
 		}
 		SpinLockRelease(&cv->mutex);
-	} while (!done);
+
+		/* We were signaled, so return */
+		if (done)
+			return false;
+
+		/* If we're not done, update cur_timeout for next iteration */
+		if (timeout >= 0)
+		{
+			INSTR_TIME_SET_CURRENT(cur_time);
+			INSTR_TIME_SUBTRACT(cur_time, start_time);
+			cur_timeout = timeout - (long) INSTR_TIME_GET_MILLISEC(cur_time);
+
+			/* Have we crossed the timeout threshold? */
+			if (cur_timeout <= 0)
+				return true;
+		}
+	}
 }
 
 /*
diff --git a/src/include/storage/condition_variable.h b/src/include/storage/condition_variable.h
index 2a0249392c..ee06e051ce 100644
--- a/src/include/storage/condition_variable.h
+++ b/src/include/storage/condition_variable.h
@@ -43,6 +43,8 @@ extern void ConditionVariableInit(ConditionVariable *cv);
  * the condition variable.
  */
 extern void ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info);
+extern bool ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+										uint32 wait_event_info);
 extern void ConditionVariableCancelSleep(void);
 
 /*
-- 
2.16.5

#9Thomas Munro
thomas.munro@gmail.com
In reply to: Shawn Debnath (#8)
2 attachment(s)
Re: Introduce timeout capability for ConditionVariableSleep

On Fri, Mar 22, 2019 at 7:21 AM Shawn Debnath <sdn@amazon.com> wrote:

On Sat, Mar 16, 2019 at 03:27:17PM -0700, Shawn Debnath wrote:

+     * Track the current time so that we can calculate the
remaining timeout
+     * if we are woken up spuriously.

I think tha "track" means chasing a moving objects. So it might
be bettter that it is record or something?

* Wait for the given condition variable to be signaled or till timeout.
*
* Returns -1 when timeout expires, otherwise returns 0.
*
* See ConditionVariableSleep() for general usage.

+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,

Counldn't the two-state return value be a boolean?

I will change it to Record in the next iteration of the patch.

Posting rebased and updated patch. Changed the word 'Track' to 'Record'
and also changed variable name rem_timeout to cur_timeout to match
naming in other use cases.

Hi Shawn,

I think this is looking pretty good and I'm planning to commit it
soon. The convention for CHECK_FOR_INTERRUPTS() in latch wait loops
seems to be to put it after the ResetLatch(), so I've moved it in the
attached version (though I don't think it was wrong where it was).
Also pgindented and with my proposed commit message. I've also
attached a throw-away test module that gives you CALL poke() and
SELECT wait_for_poke(timeout) using a CV.

Observations that I'm not planning to do anything about:
1. I don't really like the data type "long", but it's already
established that we use that for latches so maybe it's too late for me
to complain about that.
2. I don't really like the fact that we have to do floating point
stuff in INSTR_TIME_GET_MILLISEC(). That's not really your patch's
fault and you've copied the timeout adjustment code from latch.c,
which seems reasonable.

--
Thomas Munro
https://enterprisedb.com

Attachments:

0001-Introduce-timed-waits-for-condition-variables-v5.patchapplication/octet-stream; name=0001-Introduce-timed-waits-for-condition-variables-v5.patchDownload
From 33e1e77cf1726548f4303a611779adb43aef1759 Mon Sep 17 00:00:00 2001
From: Shawn Debnath <sdn@amazon.com>
Date: Tue, 12 Mar 2019 23:21:41 +0000
Subject: [PATCH 1/2] Introduce timed waits for condition variables.

Provide ConditionVariableTimedSleep(), like ConditionVariableSleep()
but with a timeout argument.

Author: Shawn Debnath
Reviewed-by: Kyotaro Horiguchi, Thomas Munro
Discussion: https://postgr.es/m/eeb06007ccfe46e399df6af18bfcd15a@EX13D05UWC002.ant.amazon.com
---
 src/backend/storage/lmgr/condition_variable.c | 68 ++++++++++++++++---
 src/include/storage/condition_variable.h      |  2 +
 2 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 58b7b51472..528d865267 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -19,6 +19,7 @@
 #include "postgres.h"
 
 #include "miscadmin.h"
+#include "portability/instr_time.h"
 #include "storage/condition_variable.h"
 #include "storage/ipc.h"
 #include "storage/proc.h"
@@ -122,8 +123,24 @@ ConditionVariablePrepareToSleep(ConditionVariable *cv)
 void
 ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 {
-	WaitEvent	event;
-	bool		done = false;
+	(void) ConditionVariableTimedSleep(cv, -1 /* no timeout */ ,
+									   wait_event_info);
+}
+
+/*
+ * Wait for a condition variable to be signaled or a timeout to be reached.
+ *
+ * Returns true when timeout expires, otherwise returns false.
+ *
+ * See ConditionVariableSleep() for general usage.
+ */
+bool
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+							uint32 wait_event_info)
+{
+	long		cur_timeout = -1;
+	instr_time	start_time;
+	instr_time	cur_time;
 
 	/*
 	 * If the caller didn't prepare to sleep explicitly, then do so now and
@@ -143,23 +160,42 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 	if (cv_sleep_target != cv)
 	{
 		ConditionVariablePrepareToSleep(cv);
-		return;
+		return false;
 	}
 
-	do
+	/*
+	 * Record the current time so that we can calculate the remaining timeout
+	 * if we are woken up spuriously.
+	 */
+	if (timeout >= 0)
 	{
-		CHECK_FOR_INTERRUPTS();
+		INSTR_TIME_SET_CURRENT(start_time);
+		Assert(timeout >= 0 && timeout <= INT_MAX);
+		cur_timeout = timeout;
+	}
+
+	while (true)
+	{
+		WaitEvent	event;
+		bool		done = false;
+		int			rc;
 
 		/*
 		 * Wait for latch to be set.  (If we're awakened for some other
 		 * reason, the code below will cope anyway.)
 		 */
-		(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
-								wait_event_info);
+		rc = WaitEventSetWait(cv_wait_event_set, cur_timeout, &event, 1,
+							  wait_event_info);
 
 		/* Reset latch before examining the state of the wait list. */
 		ResetLatch(MyLatch);
 
+		CHECK_FOR_INTERRUPTS();
+
+		/* Timed out */
+		if (rc == 0)
+			return true;
+
 		/*
 		 * If this process has been taken out of the wait list, then we know
 		 * that it has been signaled by ConditionVariableSignal (or
@@ -182,7 +218,23 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 			proclist_push_tail(&cv->wakeup, MyProc->pgprocno, cvWaitLink);
 		}
 		SpinLockRelease(&cv->mutex);
-	} while (!done);
+
+		/* We were signaled, so return */
+		if (done)
+			return false;
+
+		/* If we're not done, update cur_timeout for next iteration */
+		if (timeout >= 0)
+		{
+			INSTR_TIME_SET_CURRENT(cur_time);
+			INSTR_TIME_SUBTRACT(cur_time, start_time);
+			cur_timeout = timeout - (long) INSTR_TIME_GET_MILLISEC(cur_time);
+
+			/* Have we crossed the timeout threshold? */
+			if (cur_timeout <= 0)
+				return true;
+		}
+	}
 }
 
 /*
diff --git a/src/include/storage/condition_variable.h b/src/include/storage/condition_variable.h
index 2a0249392c..ee06e051ce 100644
--- a/src/include/storage/condition_variable.h
+++ b/src/include/storage/condition_variable.h
@@ -43,6 +43,8 @@ extern void ConditionVariableInit(ConditionVariable *cv);
  * the condition variable.
  */
 extern void ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info);
+extern bool ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+										uint32 wait_event_info);
 extern void ConditionVariableCancelSleep(void);
 
 /*
-- 
2.21.0

0002-Simple-module-for-waiting-for-other-sessions-v5.patchapplication/octet-stream; name=0002-Simple-module-for-waiting-for-other-sessions-v5.patchDownload
From 557767c4480e81be17c637d9bdef90c6f1a5c3f3 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 5 Jul 2019 12:30:24 +1200
Subject: [PATCH 2/2] Simple module for waiting for other sessions.

---
 contrib/poke/Makefile      | 18 ++++++++
 contrib/poke/poke--1.0.sql | 12 +++++
 contrib/poke/poke.c        | 93 ++++++++++++++++++++++++++++++++++++++
 contrib/poke/poke.control  |  5 ++
 4 files changed, 128 insertions(+)
 create mode 100644 contrib/poke/Makefile
 create mode 100644 contrib/poke/poke--1.0.sql
 create mode 100644 contrib/poke/poke.c
 create mode 100644 contrib/poke/poke.control

diff --git a/contrib/poke/Makefile b/contrib/poke/Makefile
new file mode 100644
index 0000000000..c56227a754
--- /dev/null
+++ b/contrib/poke/Makefile
@@ -0,0 +1,18 @@
+# contrib/poke/Makefile
+
+MODULES = poke
+
+EXTENSION = poke
+DATA = poke--1.0.sql
+PGFILEDESC = "Simple waiting mechanism."
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/poke
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/poke/poke--1.0.sql b/contrib/poke/poke--1.0.sql
new file mode 100644
index 0000000000..bb2273e517
--- /dev/null
+++ b/contrib/poke/poke--1.0.sql
@@ -0,0 +1,12 @@
+/* contrib/poke/poke--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION poke" to load this file. \quit
+
+CREATE PROCEDURE poke()
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+CREATE FUNCTION wait_for_poke(milliseconds int DEFAULT -1) RETURNS boolean
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
diff --git a/contrib/poke/poke.c b/contrib/poke/poke.c
new file mode 100644
index 0000000000..f707e140dc
--- /dev/null
+++ b/contrib/poke/poke.c
@@ -0,0 +1,93 @@
+#include "postgres.h"
+
+#include "access/xact.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "storage/condition_variable.h"
+#include "storage/ipc.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(poke);
+PG_FUNCTION_INFO_V1(wait_for_poke);
+extern void _PG_init(void);
+
+static ConditionVariable *my_cv;
+static bool poke_on_commit;
+static shmem_startup_hook_type prev_shmem_startup_hook;
+
+static void
+poke_xact_callback(XactEvent event, void *data)
+{
+	switch (event)
+	{
+	case XACT_EVENT_PRE_COMMIT:
+		/* ignore */
+		break;
+	case XACT_EVENT_COMMIT:
+		if (poke_on_commit)
+			ConditionVariableBroadcast(my_cv);
+		/* fall through */
+	default:
+		poke_on_commit = false;
+	}
+}
+
+static void
+poke_shmem_init(void)
+{
+	bool	found;
+
+	LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+	my_cv = ShmemInitStruct("poke", sizeof(ConditionVariable), &found);
+	if (!found)
+		ConditionVariableInit(my_cv);
+	LWLockRelease(AddinShmemInitLock);
+
+	RegisterXactCallback(poke_xact_callback, NULL);
+}
+
+static void
+poke_shmem_startup_hook(void)
+{
+	poke_shmem_init();
+	if (prev_shmem_startup_hook)
+		prev_shmem_startup_hook();
+}
+
+void
+_PG_init(void)
+{
+	if (process_shared_preload_libraries_in_progress)
+	{
+		/* If preloading, then politely request the space. */
+		prev_shmem_startup_hook = shmem_startup_hook;
+		shmem_startup_hook = poke_shmem_startup_hook;
+		RequestAddinShmemSpace(sizeof(ConditionVariable));
+		return;
+	}
+
+	poke_shmem_init();
+}
+
+Datum
+poke(PG_FUNCTION_ARGS)
+{
+	poke_on_commit = true;
+
+	PG_RETURN_VOID();
+}
+
+Datum
+wait_for_poke(PG_FUNCTION_ARGS)
+{
+	int		timeout = PG_GETARG_INT32(0);
+	bool	result;
+
+	ConditionVariablePrepareToSleep(my_cv);
+	result = ConditionVariableTimedSleep(my_cv, timeout, PG_WAIT_EXTENSION);
+	ConditionVariableCancelSleep();
+
+	PG_RETURN_BOOL(result);
+}
diff --git a/contrib/poke/poke.control b/contrib/poke/poke.control
new file mode 100644
index 0000000000..f4cc50f13b
--- /dev/null
+++ b/contrib/poke/poke.control
@@ -0,0 +1,5 @@
+# poke extension
+comment = 'simple IPC between sessions'
+default_version = '1.0'
+module_pathname = '$libdir/poke'
+relocatable = true
-- 
2.21.0

#10Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#9)
Re: Introduce timeout capability for ConditionVariableSleep

On Fri, Jul 5, 2019 at 1:40 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I think this is looking pretty good and I'm planning to commit it
soon. The convention for CHECK_FOR_INTERRUPTS() in latch wait loops
seems to be to put it after the ResetLatch(), so I've moved it in the
attached version (though I don't think it was wrong where it was).
Also pgindented and with my proposed commit message. I've also
attached a throw-away test module that gives you CALL poke() and
SELECT wait_for_poke(timeout) using a CV.

I thought of one small problem with the current coding. Suppose there
are two processes A and B waiting on a CV, and another process C calls
ConditionVariableSignal() to signal one process. Around the same
time, A times out and exits via this code path:

+               /* Timed out */
+               if (rc == 0)
+                       return true;

Suppose ConditionVariableSignal() set A's latch immediately after
WaitEventSetWait() returned 0 in A. Now A won't report the CV signal
to the caller, and B is still waiting, so effectively nobody has
received the message and yet C thinks it has signalled a waiter if
there is one. My first thought is that we could simply remove the
above-quoted hunk and fall through to the second timeout-detecting
code. That'd mean that if we've been signalled AND timed out as of
that point in the code, we'll prefer to report the signal, and it also
reduces the complexity of the function to have only one "return true"
path.

That still leaves the danger that the CV can be signalled some time
after ConditionVariableTimedSleep() returns. So now I'm wondering if
ConditionVariableCancelSleep() should signal the CV if it discovers
that this process is not in the proclist, on the basis that that must
indicate that we've been signalled even though we're not interested in
the message anymore, and yet some other process else might be
interested, and that might have been the only signal that is ever
going to be delivered by ConditionVariableSignal().

--
Thomas Munro
https://enterprisedb.com

#11Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#10)
2 attachment(s)
Re: Introduce timeout capability for ConditionVariableSleep

On Sun, Jul 7, 2019 at 3:09 PM Thomas Munro <thomas.munro@gmail.com> wrote:

+               /* Timed out */
+               if (rc == 0)
+                       return true;

Here's a version without that bit, because I don't think we need it.

That still leaves the danger that the CV can be signalled some time
after ConditionVariableTimedSleep() returns. So now I'm wondering if
ConditionVariableCancelSleep() should signal the CV if it discovers
that this process is not in the proclist, on the basis that that must
indicate that we've been signalled even though we're not interested in
the message anymore, and yet some other process else might be
interested, and that might have been the only signal that is ever
going to be delivered by ConditionVariableSignal().

And a separate patch to do that. Thoughts?

--
Thomas Munro
https://enterprisedb.com

Attachments:

0001-Introduce-timed-waits-for-condition-variables-v6.patchapplication/octet-stream; name=0001-Introduce-timed-waits-for-condition-variables-v6.patchDownload
From a3f819ade8dd99ec43b1709fbf291b303daa085d Mon Sep 17 00:00:00 2001
From: Shawn Debnath <sdn@amazon.com>
Date: Tue, 12 Mar 2019 23:21:41 +0000
Subject: [PATCH 1/3] Introduce timed waits for condition variables.

Provide ConditionVariableTimedSleep(), like ConditionVariableSleep()
but with a timeout argument.

Author: Shawn Debnath
Reviewed-by: Kyotaro Horiguchi, Thomas Munro
Discussion: https://postgr.es/m/eeb06007ccfe46e399df6af18bfcd15a@EX13D05UWC002.ant.amazon.com
---
 src/backend/storage/lmgr/condition_variable.c | 64 ++++++++++++++++---
 src/include/storage/condition_variable.h      |  2 +
 2 files changed, 58 insertions(+), 8 deletions(-)

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 58b7b514720..951039da551 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -19,6 +19,7 @@
 #include "postgres.h"
 
 #include "miscadmin.h"
+#include "portability/instr_time.h"
 #include "storage/condition_variable.h"
 #include "storage/ipc.h"
 #include "storage/proc.h"
@@ -122,8 +123,24 @@ ConditionVariablePrepareToSleep(ConditionVariable *cv)
 void
 ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 {
-	WaitEvent	event;
-	bool		done = false;
+	(void) ConditionVariableTimedSleep(cv, -1 /* no timeout */ ,
+									   wait_event_info);
+}
+
+/*
+ * Wait for a condition variable to be signaled or a timeout to be reached.
+ *
+ * Returns true when timeout expires, otherwise returns false.
+ *
+ * See ConditionVariableSleep() for general usage.
+ */
+bool
+ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+							uint32 wait_event_info)
+{
+	long		cur_timeout = -1;
+	instr_time	start_time;
+	instr_time	cur_time;
 
 	/*
 	 * If the caller didn't prepare to sleep explicitly, then do so now and
@@ -143,23 +160,38 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 	if (cv_sleep_target != cv)
 	{
 		ConditionVariablePrepareToSleep(cv);
-		return;
+		return false;
 	}
 
-	do
+	/*
+	 * Record the current time so that we can calculate the remaining timeout
+	 * if we are woken up spuriously.
+	 */
+	if (timeout >= 0)
 	{
-		CHECK_FOR_INTERRUPTS();
+		INSTR_TIME_SET_CURRENT(start_time);
+		Assert(timeout >= 0 && timeout <= INT_MAX);
+		cur_timeout = timeout;
+	}
+
+	while (true)
+	{
+		WaitEvent	event;
+		bool		done = false;
+		int			rc;
 
 		/*
 		 * Wait for latch to be set.  (If we're awakened for some other
 		 * reason, the code below will cope anyway.)
 		 */
-		(void) WaitEventSetWait(cv_wait_event_set, -1, &event, 1,
-								wait_event_info);
+		rc = WaitEventSetWait(cv_wait_event_set, cur_timeout, &event, 1,
+							  wait_event_info);
 
 		/* Reset latch before examining the state of the wait list. */
 		ResetLatch(MyLatch);
 
+		CHECK_FOR_INTERRUPTS();
+
 		/*
 		 * If this process has been taken out of the wait list, then we know
 		 * that it has been signaled by ConditionVariableSignal (or
@@ -182,7 +214,23 @@ ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info)
 			proclist_push_tail(&cv->wakeup, MyProc->pgprocno, cvWaitLink);
 		}
 		SpinLockRelease(&cv->mutex);
-	} while (!done);
+
+		/* We were signaled, so return */
+		if (done)
+			return false;
+
+		/* If we're not done, update cur_timeout for next iteration */
+		if (timeout >= 0)
+		{
+			INSTR_TIME_SET_CURRENT(cur_time);
+			INSTR_TIME_SUBTRACT(cur_time, start_time);
+			cur_timeout = timeout - (long) INSTR_TIME_GET_MILLISEC(cur_time);
+
+			/* Have we crossed the timeout threshold? */
+			if (cur_timeout <= 0)
+				return true;
+		}
+	}
 }
 
 /*
diff --git a/src/include/storage/condition_variable.h b/src/include/storage/condition_variable.h
index 2a0249392cc..ee06e051ce1 100644
--- a/src/include/storage/condition_variable.h
+++ b/src/include/storage/condition_variable.h
@@ -43,6 +43,8 @@ extern void ConditionVariableInit(ConditionVariable *cv);
  * the condition variable.
  */
 extern void ConditionVariableSleep(ConditionVariable *cv, uint32 wait_event_info);
+extern bool ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
+										uint32 wait_event_info);
 extern void ConditionVariableCancelSleep(void);
 
 /*
-- 
2.21.0

0002-Forward-received-condition-variable-signals-on-ca-v6.patchapplication/octet-stream; name=0002-Forward-received-condition-variable-signals-on-ca-v6.patchDownload
From 327fe5536a508963a68e8aded6dbb83effd68933 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 9 Jul 2019 20:48:06 +1200
Subject: [PATCH 2/3] Forward received condition variable signals on cancel.

After a process decides not to wait for a condition variable, it can
still consume a signal before it reaches ConditionVariableSleep().  In
that case, pass the signal on to another waiter if possible, so that
a signal doesn't go missing when there is another process waiting that
should receive it.

Author: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKGLQ_RW%2BXs8znDn36e-%2Bmq2--zrPemBqTQ8eKT-VO1OF4Q%40mail.gmail.com
---
 src/backend/storage/lmgr/condition_variable.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 951039da551..c22522beeb8 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -246,6 +246,7 @@ void
 ConditionVariableCancelSleep(void)
 {
 	ConditionVariable *cv = cv_sleep_target;
+	bool		signaled = false;
 
 	if (cv == NULL)
 		return;
@@ -253,8 +254,18 @@ ConditionVariableCancelSleep(void)
 	SpinLockAcquire(&cv->mutex);
 	if (proclist_contains(&cv->wakeup, MyProc->pgprocno, cvWaitLink))
 		proclist_delete(&cv->wakeup, MyProc->pgprocno, cvWaitLink);
+	else
+		signaled = true;
 	SpinLockRelease(&cv->mutex);
 
+	/*
+	 * If we've received a signal, pass it on to another waiting process, if
+	 * there is one.  Otherwise a call to ConditionVariableSignal() might get
+	 * lost, despite there being another process ready to handle it.
+	 */
+	if (signaled)
+		ConditionVariableSignal(cv);
+
 	cv_sleep_target = NULL;
 }
 
-- 
2.21.0

#12Shawn Debnath
sdn@amazon.com
In reply to: Thomas Munro (#11)
Re: Introduce timeout capability for ConditionVariableSleep

On Tue, Jul 09, 2019 at 11:03:18PM +1200, Thomas Munro wrote:

On Sun, Jul 7, 2019 at 3:09 PM Thomas Munro <thomas.munro@gmail.com> wrote:

+               /* Timed out */
+               if (rc == 0)
+                       return true;

Here's a version without that bit, because I don't think we need it.

This works. Agree that letting it fall through covers the first gap.

That still leaves the danger that the CV can be signalled some time
after ConditionVariableTimedSleep() returns. So now I'm wondering if
ConditionVariableCancelSleep() should signal the CV if it discovers
that this process is not in the proclist, on the basis that that must
indicate that we've been signalled even though we're not interested in
the message anymore, and yet some other process else might be
interested, and that might have been the only signal that is ever
going to be delivered by ConditionVariableSignal().

And a separate patch to do that. Thoughts?

I like it. This covers the gap all the way till cancel is invoked and it
manipulates the list to remove itself or realizes that it needs to
forward the signal to some other process.

Thanks Thomas!

--
Shawn Debnath
Amazon Web Services (AWS)

#13Thomas Munro
thomas.munro@gmail.com
In reply to: Shawn Debnath (#12)
Re: Introduce timeout capability for ConditionVariableSleep

On Fri, Jul 12, 2019 at 6:08 PM Shawn Debnath <sdn@amazon.com> wrote:

On Tue, Jul 09, 2019 at 11:03:18PM +1200, Thomas Munro wrote:

On Sun, Jul 7, 2019 at 3:09 PM Thomas Munro <thomas.munro@gmail.com> wrote:

+               /* Timed out */
+               if (rc == 0)
+                       return true;

Here's a version without that bit, because I don't think we need it.

This works. Agree that letting it fall through covers the first gap.

Pushed, like that (with the now unused rc variable also removed).
Thanks for the patch!

That still leaves the danger that the CV can be signalled some time
after ConditionVariableTimedSleep() returns. So now I'm wondering if
ConditionVariableCancelSleep() should signal the CV if it discovers
that this process is not in the proclist, on the basis that that must
indicate that we've been signalled even though we're not interested in
the message anymore, and yet some other process else might be
interested, and that might have been the only signal that is ever
going to be delivered by ConditionVariableSignal().

And a separate patch to do that. Thoughts?

I like it. This covers the gap all the way till cancel is invoked and it
manipulates the list to remove itself or realizes that it needs to
forward the signal to some other process.

I pushed this too. It's a separate commit, because I think there is
at least a theoretical argument that it should be back-patched. I'm
not going to do that today though, because I doubt anyone is relying
on ConditionVariableSignal() working that reliably yet, and it's
really with timeouts that it becomes a likely problem.

I thought about this edge case because I have long wanted to propose a
pair of functions that provide a simplified payloadless blocking
alternative to NOTIFY, that would allow for just the right number of
waiting sessions to wake up to handle SKIP LOCKED-style job queues.
Otherwise you sometimes get thundering herds of wakeups fighting over
crumbs. That made me think about the case where a worker session
decides to time out and shut down due to being idle for too long, but
eats a wakeup on its way out. Another question that comes up in that
use case is whether CV wakeup queues should be LIFO or FIFO. I think
the answer is LIFO, to support class worker pool designs that
stabilise at the right size using a simple idle timeout rule. They're
currently FIFO (proclist_pop_head_node() to wake up, but
proclist_push_tail() to sleep). I understand why Robert didn't care
about that last time I mentioned it: all our uses of CVs today are
"broadcast" wakeups. But a productised version of the "poke" hack I
showed earlier that supports poking just one waiter would care about
the thing this patch fixed, and also the wakeup queue order.

--
Thomas Munro
https://enterprisedb.com

#14Shawn Debnath
sdn@amazon.com
In reply to: Thomas Munro (#13)
Re: Introduce timeout capability for ConditionVariableSleep

On Sat, Jul 13, 2019 at 03:02:25PM +1200, Thomas Munro wrote:

Pushed, like that (with the now unused rc variable also removed).
Thanks for the patch!

Awesome - thank you!

--
Shawn Debnath
Amazon Web Services (AWS)

#15Robert Haas
robertmhaas@gmail.com
In reply to: Thomas Munro (#13)
Re: Introduce timeout capability for ConditionVariableSleep

On Fri, Jul 12, 2019 at 11:03 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I pushed this too. It's a separate commit, because I think there is
at least a theoretical argument that it should be back-patched. I'm
not going to do that today though, because I doubt anyone is relying
on ConditionVariableSignal() working that reliably yet, and it's
really with timeouts that it becomes a likely problem.

To make it work reliably, you'd need to be sure that a process can't
ERROR or FATAL after getting signaled and before doing whatever the
associated work is (or that if it does, it will first pass on the
signal). Since that seems impossible, I'm not sure I see the point of
trying to do anything at all.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16Thomas Munro
thomas.munro@gmail.com
In reply to: Robert Haas (#15)
Re: Introduce timeout capability for ConditionVariableSleep

On Tue, Jul 16, 2019 at 1:11 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Jul 12, 2019 at 11:03 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I pushed this too. It's a separate commit, because I think there is
at least a theoretical argument that it should be back-patched. I'm
not going to do that today though, because I doubt anyone is relying
on ConditionVariableSignal() working that reliably yet, and it's
really with timeouts that it becomes a likely problem.

To make it work reliably, you'd need to be sure that a process can't
ERROR or FATAL after getting signaled and before doing whatever the
associated work is (or that if it does, it will first pass on the
signal). Since that seems impossible, I'm not sure I see the point of
trying to do anything at all.

I agree that that on its own doesn't fix problems in <some
non-existent client of this facility>, but that doesn't mean we
shouldn't try to make this API as reliable as possible. Unlike
typical CV implementations, our wait primitive is not atomic. When we
invented two-step wait, we created a way for ConditionVariableSignal()
to have no effect due to bad timing. Surely that's a bug.

--
Thomas Munro
https://enterprisedb.com