Fix pgbench --progress report under (very) low rate

Started by Fabien COELHOalmost 11 years ago5 messages
#1Fabien COELHO
coelho@cri.ensmp.fr
1 attachment(s)

When running with low rate, the --progress is only printed when there is
some activity, which makes it quite irregular, including some catching up
with stupid tps figures.

Shame on me for this "feature" (aka bug) in the first place.

This patch fixes this behavior by considering the next report time as a
target to meet as well as transaction schedule times.

Before the patch:

sh> ./pgbench -R 0.5 -T 10 -P 1 -S
progress: 1.7 s, 0.6 tps, lat 6.028 ms stddev -nan, lag 1.883 ms
progress: 2.2 s, 2.3 tps, lat 2.059 ms stddev -nan, lag 0.530 ms
progress: 7.3 s, 0.4 tps, lat 2.944 ms stddev 1.192, lag 2.624 ms
progress: 7.3 s, 1402.5 tps, lat 5.115 ms stddev 0.000, lag 0.000 ms
progress: 7.3 s, 0.0 tps, lat -nan ms stddev -nan, lag inf ms
progress: 7.3 s, 335.2 tps, lat 3.106 ms stddev 0.000, lag 0.000 ms
progress: 8.8 s, 0.0 tps, lat -nan ms stddev -nan, lag inf ms
progress: 8.8 s, 307.6 tps, lat 4.855 ms stddev -nan, lag 0.000 ms
progress: 10.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms

After the patch:

sh> ./pgbench -R 0.5 -T 10 -P 1 -S
progress: 1.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
progress: 2.0 s, 1.0 tps, lat 5.980 ms stddev 0.000, lag 0.733 ms
progress: 3.0 s, 1.0 tps, lat 1.905 ms stddev 0.000, lag 0.935 ms
progress: 4.0 s, 1.0 tps, lat 3.966 ms stddev 0.000, lag 0.623 ms
progress: 5.0 s, 2.0 tps, lat 2.527 ms stddev 1.579, lag 0.512 ms
progress: 6.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
progress: 7.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
progress: 8.0 s, 1.0 tps, lat 1.750 ms stddev 0.000, lag 0.767 ms
progress: 9.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
progress: 10.0 s, 2.0 tps, lat 2.403 ms stddev 1.386, lag 0.357 ms

To answer a question before it is asked: I run low rates because I'm
looking at latency (rather than throughput) under different conditions.
For instance with the above tests, the latency is about 3 ms, but it
varies with the tps: (0.5 tps => 3 ms, 10 tps => 1 ms, 50 tps => 0.8 ms,
100 tps => 0.5 ms, 200 tps => 0.75 ms, 1000 tps => 0.5 ms...).

--
Fabien.

Attachments:

pgbench-sleeping-progress-1.patchtext/x-diff; name=pgbench-sleeping-progress-1.patchDownload
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 706fdf5..3536782 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -3584,6 +3584,28 @@ threadRun(void *arg)
 				maxsock = sock;
 		}
 
+		/* also meet the next progress report time if needed */
+		if (progress && min_usec > 0
+#ifndef PTHREAD_FORK_EMULATION
+			&& thread->tid == 0
+#endif /* !PTHREAD_FORK_EMULATION */
+			)
+		{
+			/* get current time if nedded */
+			if (now_usec == 0)
+			{
+				instr_time	now;
+
+				INSTR_TIME_SET_CURRENT(now);
+				now_usec = INSTR_TIME_GET_MICROSEC(now);
+			}
+
+			if (now_usec >= next_report)
+				min_usec = 0;
+			else if ((next_report - now_usec) < min_usec)
+				min_usec = next_report - now_usec;
+		}
+
 		if (min_usec > 0 && maxsock != -1)
 		{
 			int			nsocks; /* return from select(2) */
#2Robert Haas
robertmhaas@gmail.com
In reply to: Fabien COELHO (#1)
Re: Fix pgbench --progress report under (very) low rate

On Sun, Mar 22, 2015 at 6:12 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

When running with low rate, the --progress is only printed when there is
some activity, which makes it quite irregular, including some catching up
with stupid tps figures.

Shame on me for this "feature" (aka bug) in the first place.

This patch fixes this behavior by considering the next report time as a
target to meet as well as transaction schedule times.

Before the patch:

sh> ./pgbench -R 0.5 -T 10 -P 1 -S
progress: 1.7 s, 0.6 tps, lat 6.028 ms stddev -nan, lag 1.883 ms
progress: 2.2 s, 2.3 tps, lat 2.059 ms stddev -nan, lag 0.530 ms
progress: 7.3 s, 0.4 tps, lat 2.944 ms stddev 1.192, lag 2.624 ms
progress: 7.3 s, 1402.5 tps, lat 5.115 ms stddev 0.000, lag 0.000 ms
progress: 7.3 s, 0.0 tps, lat -nan ms stddev -nan, lag inf ms
progress: 7.3 s, 335.2 tps, lat 3.106 ms stddev 0.000, lag 0.000 ms
progress: 8.8 s, 0.0 tps, lat -nan ms stddev -nan, lag inf ms
progress: 8.8 s, 307.6 tps, lat 4.855 ms stddev -nan, lag 0.000 ms
progress: 10.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms

After the patch:

sh> ./pgbench -R 0.5 -T 10 -P 1 -S
progress: 1.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
progress: 2.0 s, 1.0 tps, lat 5.980 ms stddev 0.000, lag 0.733 ms
progress: 3.0 s, 1.0 tps, lat 1.905 ms stddev 0.000, lag 0.935 ms
progress: 4.0 s, 1.0 tps, lat 3.966 ms stddev 0.000, lag 0.623 ms
progress: 5.0 s, 2.0 tps, lat 2.527 ms stddev 1.579, lag 0.512 ms
progress: 6.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
progress: 7.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
progress: 8.0 s, 1.0 tps, lat 1.750 ms stddev 0.000, lag 0.767 ms
progress: 9.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
progress: 10.0 s, 2.0 tps, lat 2.403 ms stddev 1.386, lag 0.357 ms

I haven't had time to really review the code here (except to notice
that you have a typo: "nedded") but the idea of it seems good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Robert Haas (#2)
1 attachment(s)
Re: Fix pgbench --progress report under (very) low rate

I haven't had time to really review the code here (except to notice
that you have a typo: "nedded") but the idea of it seems good.

v3 rebase (after pgbench moved to src/bin) and minor style tweaking.

--
Fabien.

Attachments:

pgbench-sleeping-progress-3.patchtext/x-diff; name=pgbench-sleeping-progress-3.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 06a4dfd..38dc4a5 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3607,6 +3607,28 @@ threadRun(void *arg)
 				maxsock = sock;
 		}
 
+		/* also meet the next progress report time if needed */
+		if (progress && min_usec > 0
+#if !defined(PTHREAD_FORK_EMULATION)
+			&& thread->tid == 0
+#endif /* !PTHREAD_FORK_EMULATION */
+			)
+		{
+			/* get current time if needed */
+			if (now_usec == 0)
+			{
+				instr_time	now;
+
+				INSTR_TIME_SET_CURRENT(now);
+				now_usec = INSTR_TIME_GET_MICROSEC(now);
+			}
+
+			if (now_usec >= next_report)
+				min_usec = 0;
+			else if ((next_report - now_usec) < min_usec)
+				min_usec = next_report - now_usec;
+		}
+
 		if (min_usec > 0 && maxsock != -1)
 		{
 			int			nsocks; /* return from select(2) */
#4Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Fabien COELHO (#3)
1 attachment(s)
Re: Fix pgbench --progress report under (very) low rate

v3 rebase (after pgbench moved to src/bin) and minor style tweaking.

v4 adds a fix to another progress timing issue:

Currently if pgbench/postgres get stuck somewhere, the report catches up
by repeating progresses several time in a row, which looks like that:

progress: 10.0 s ...
progress: 11.0 s ... stuck...
progress: 14.2 s catchup for 11.0 -> 14.2
progress: 14.2 s stupid data
progress: 14.2 s stupid data
progress: 15.0 s ...
progress: 16.0 s ...

The correction removes the "stupid data" lines which compute a reports on
a very short time, including absurd tps figures.

Yet again, shame on me in the first place for this behavior.

--
Fabien.

Attachments:

pgbench-sleeping-progress-4.patchtext/x-diff; name=pgbench-sleeping-progress-4.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 6f35db4..0d71173 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3639,6 +3639,28 @@ threadRun(void *arg)
 				maxsock = sock;
 		}
 
+		/* also meet the next progress report time if needed */
+		if (progress && min_usec > 0
+#if !defined(PTHREAD_FORK_EMULATION)
+			&& thread->tid == 0
+#endif /* !PTHREAD_FORK_EMULATION */
+			)
+		{
+			/* get current time if needed */
+			if (now_usec == 0)
+			{
+				instr_time	now;
+
+				INSTR_TIME_SET_CURRENT(now);
+				now_usec = INSTR_TIME_GET_MICROSEC(now);
+			}
+
+			if (now_usec >= next_report)
+				min_usec = 0;
+			else if ((next_report - now_usec) < min_usec)
+				min_usec = next_report - now_usec;
+		}
+
 		if (min_usec > 0 && maxsock != -1)
 		{
 			int			nsocks; /* return from select(2) */
@@ -3744,7 +3766,13 @@ threadRun(void *arg)
 				last_lags = lags;
 				last_report = now;
 				last_skipped = thread->throttle_latency_skipped;
-				next_report += (int64) progress *1000000;
+
+				/* Ensure that the next report is in the future, in case
+				 * pgbench/postgres got stuck somewhere...
+				 */
+				do {
+					next_report += (int64) progress * 1000000;
+				} while (now >= next_report);
 			}
 		}
 #else
@@ -3808,7 +3836,13 @@ threadRun(void *arg)
 				last_lags = lags;
 				last_report = now;
 				last_skipped = thread->throttle_latency_skipped;
-				next_report += (int64) progress *1000000;
+
+				/* Ensure that the next report is in the future, in case
+				 * pgbench/postgres got stuck somewhere...
+				 */
+				do {
+					next_report += (int64) progress * 1000000;
+				} while (now >= next_report);
 			}
 		}
 #endif   /* PTHREAD_FORK_EMULATION */
#5Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Fabien COELHO (#4)
Re: Fix pgbench --progress report under (very) low rate

On 06/15/2015 09:12 PM, Fabien COELHO wrote:

v3 rebase (after pgbench moved to src/bin) and minor style tweaking.

v4 adds a fix to another progress timing issue:

Currently if pgbench/postgres get stuck somewhere, the report catches up
by repeating progresses several time in a row, which looks like that:

progress: 10.0 s ...
progress: 11.0 s ... stuck...
progress: 14.2 s catchup for 11.0 -> 14.2
progress: 14.2 s stupid data
progress: 14.2 s stupid data
progress: 15.0 s ...
progress: 16.0 s ...

The correction removes the "stupid data" lines which compute a reports on
a very short time, including absurd tps figures.

Yet again, shame on me in the first place for this behavior.

Thanks, applied. I chose to also backpatch this, although arguably this
is a change in behaviour that would not be good to change in a minor
version. However, progress reports are a very user-facing feature, it's
not going to break anyone's scripts.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers