pgbench: allow to exit immediately when any client is aborted
Hi,
I would like to propose to add an option to pgbench so that benchmark
can quit immediately when any client is aborted. Currently, when a
client is aborted due to some error, for example, network trouble,
other clients continue their run until a certain number of transactions
specified -t is reached or the time specified by -T is expired. At the
end, the results are printed, but they are not useful, as the message
"Run was aborted; the above results are incomplete" shows.
For precise benchmark purpose, we would not want to wait to get such
incomplete results, rather we would like to know some trouble happened
to allow a quick retry. Therefore, it would be nice to add an option to
make pgbench exit instead of continuing run in other clients when any
client is aborted. I think adding the optional is better than whole
behavioural change because some users that use pgbench just in order
to stress on backends for testing purpose rather than benchmark might
not want to stop pgbench even a client is aborted.
Attached is the patch to add the option --exit-on-abort.
If this option is specified, when any client is aborted, pgbench
immediately quit by calling exit(2).
What do you think about it?
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
pgbench_exit_on_abort.patchtext/x-diff; name=pgbench_exit_on_abort.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 539c2795e2..6fae5a43e1 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -765,6 +765,8 @@ static int64 total_weight = 0;
static bool verbose_errors = false; /* print verbose messages of all errors */
+static bool exit_on_abort = false; /* exit when any client is aborted */
+
/* Builtin test scripts */
typedef struct BuiltinScript
{
@@ -911,6 +913,7 @@ usage(void)
" -T, --time=NUM duration of benchmark test in seconds\n"
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
+ " --exit-on-abort exit when any client is aborted\n"
" --failures-detailed report the failures grouped by basic types\n"
" --log-prefix=PREFIX prefix for transaction time log file\n"
" (default: \"pgbench_log\")\n"
@@ -6612,6 +6615,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"exit-on-abort", no_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6945,6 +6949,10 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16: /* exit-on-abort */
+ benchmarking_option_set = true;
+ exit_on_abort = true;
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7553,11 +7561,13 @@ threadRun(void *arg)
advanceConnectionState(thread, st, &aggs);
+ if (exit_on_abort && st->state == CSTATE_ABORTED)
+ goto done;
/*
* If advanceConnectionState changed client to finished state,
* that's one fewer client that remains.
*/
- if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
+ else if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
remains--;
}
@@ -7592,6 +7602,15 @@ threadRun(void *arg)
done:
disconnect_all(state, nstate);
+ for (int i = 0; i < nstate; i++)
+ {
+ if (state[i].state != CSTATE_FINISHED)
+ {
+ pg_log_error("Run was aborted due to an error in thread %d", thread->tid);
+ exit(2);
+ }
+ }
+
if (thread->logfile)
{
if (agg_interval > 0)
Hi,
I would like to propose to add an option to pgbench so that benchmark
can quit immediately when any client is aborted. Currently, when a
client is aborted due to some error, for example, network trouble,
other clients continue their run until a certain number of transactions
specified -t is reached or the time specified by -T is expired. At the
end, the results are printed, but they are not useful, as the message
"Run was aborted; the above results are incomplete" shows.
Sounds like a good idea. It's a waste of resources waiting for
unusable benchmark results until t/T expired. If we graze on the
screen, then it's easy to cancel the pgbench run. But I frequently let
pgbench run without sitting in front of the screen especially when t/T
is large (it's recommended that running pgbench with large enough t/T
to get usable results).
For precise benchmark purpose, we would not want to wait to get such
incomplete results, rather we would like to know some trouble happened
to allow a quick retry. Therefore, it would be nice to add an option to
make pgbench exit instead of continuing run in other clients when any
client is aborted. I think adding the optional is better than whole
behavioural change because some users that use pgbench just in order
to stress on backends for testing purpose rather than benchmark might
not want to stop pgbench even a client is aborted.Attached is the patch to add the option --exit-on-abort.
If this option is specified, when any client is aborted, pgbench
immediately quit by calling exit(2).What do you think about it?
I think aborting pgbench by calling exit(2) is enough. It's not worth
the trouble to add more codes for this purpose.
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
On Sat, 05 Aug 2023 12:16:11 +0900 (JST)
Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
Hi,
I would like to propose to add an option to pgbench so that benchmark
can quit immediately when any client is aborted. Currently, when a
client is aborted due to some error, for example, network trouble,
other clients continue their run until a certain number of transactions
specified -t is reached or the time specified by -T is expired. At the
end, the results are printed, but they are not useful, as the message
"Run was aborted; the above results are incomplete" shows.Sounds like a good idea. It's a waste of resources waiting for
unusable benchmark results until t/T expired. If we graze on the
screen, then it's easy to cancel the pgbench run. But I frequently let
pgbench run without sitting in front of the screen especially when t/T
is large (it's recommended that running pgbench with large enough t/T
to get usable results).
Thank you for your agreement.
For precise benchmark purpose, we would not want to wait to get such
incomplete results, rather we would like to know some trouble happened
to allow a quick retry. Therefore, it would be nice to add an option to
make pgbench exit instead of continuing run in other clients when any
client is aborted. I think adding the optional is better than whole
behavioural change because some users that use pgbench just in order
to stress on backends for testing purpose rather than benchmark might
not want to stop pgbench even a client is aborted.Attached is the patch to add the option --exit-on-abort.
If this option is specified, when any client is aborted, pgbench
immediately quit by calling exit(2).What do you think about it?
I think aborting pgbench by calling exit(2) is enough. It's not worth
the trouble to add more codes for this purpose.
In order to stop pgbench more gracefully, it might be better to make
each thread exit at more proper timing after some cleaning-up like
connection close. However, pgbench code doesn't provide such functions
for inter-threads communication. If we would try to make this, both
pthread and Windows versions would be needed. I don't think it is necessary
to make such effort for --exit-on-abort option, as you said.
Regards,
Yugo Nagata
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
--
Yugo NAGATA <nagata@sraoss.co.jp>
On Mon, 7 Aug 2023 11:02:48 +0900
Yugo NAGATA <nagata@sraoss.co.jp> wrote:
On Sat, 05 Aug 2023 12:16:11 +0900 (JST)
Tatsuo Ishii <ishii@sraoss.co.jp> wrote:Hi,
I would like to propose to add an option to pgbench so that benchmark
can quit immediately when any client is aborted. Currently, when a
client is aborted due to some error, for example, network trouble,
other clients continue their run until a certain number of transactions
specified -t is reached or the time specified by -T is expired. At the
end, the results are printed, but they are not useful, as the message
"Run was aborted; the above results are incomplete" shows.Sounds like a good idea. It's a waste of resources waiting for
unusable benchmark results until t/T expired. If we graze on the
screen, then it's easy to cancel the pgbench run. But I frequently let
pgbench run without sitting in front of the screen especially when t/T
is large (it's recommended that running pgbench with large enough t/T
to get usable results).Thank you for your agreement.
For precise benchmark purpose, we would not want to wait to get such
incomplete results, rather we would like to know some trouble happened
to allow a quick retry. Therefore, it would be nice to add an option to
make pgbench exit instead of continuing run in other clients when any
client is aborted. I think adding the optional is better than whole
behavioural change because some users that use pgbench just in order
to stress on backends for testing purpose rather than benchmark might
not want to stop pgbench even a client is aborted.Attached is the patch to add the option --exit-on-abort.
If this option is specified, when any client is aborted, pgbench
immediately quit by calling exit(2).
I attached v2 patch including the documentation and some comments
in the code.
Regards,
Yugo Nagata
What do you think about it?
I think aborting pgbench by calling exit(2) is enough. It's not worth
the trouble to add more codes for this purpose.In order to stop pgbench more gracefully, it might be better to make
each thread exit at more proper timing after some cleaning-up like
connection close. However, pgbench code doesn't provide such functions
for inter-threads communication. If we would try to make this, both
pthread and Windows versions would be needed. I don't think it is necessary
to make such effort for --exit-on-abort option, as you said.Regards,
Yugo NagataBest reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp--
Yugo NAGATA <nagata@sraoss.co.jp>
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
v2-pgbench_exit_on_abort.patchtext/x-diff; name=v2-pgbench_exit_on_abort.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 6c5c8afa6d..83fe01c33f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -787,6 +787,19 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
</listitem>
</varlistentry>
+ <varlistentry id="pgbench-option-exit-on-abort">
+ <term><option>--exit-on-abort</option></term>
+ <listitem>
+ <para>
+ Exit immediately when any client is aborted due to some error. Without
+ this option, even when a client is aborted, other clients could continue
+ their run as specified by <option>-t</option> or <option>-T</option> option,
+ and <application>pgbench</application> will print an incomplete results
+ in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="pgbench-option-log-prefix">
<term><option>--log-prefix=<replaceable>prefix</replaceable></option></term>
<listitem>
@@ -985,7 +998,8 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
benchmark such as initial connection failures also exit with status 1.
Errors during the run such as database errors or problems in the script
will result in exit status 2. In the latter case,
- <application>pgbench</application> will print partial results.
+ <application>pgbench</application> will print partial results if
+ <option>--exit-on-abort</option> option is not specified.
</para>
</refsect1>
@@ -2801,14 +2815,17 @@ statement latencies in milliseconds, failures and retries:
start a connection to the database server / the socket for connecting
the client to the database server has become invalid). In such cases
all clients of this thread stop while other threads continue to work.
+ However, <option>--exit-on-abort</option> is specified, all of the
+ threads stop immediately in this case.
</para>
</listitem>
<listitem>
<para>
Direct client errors. They lead to immediate exit from
<application>pgbench</application> with the corresponding error message
- only in the case of an internal <application>pgbench</application>
- error (which are supposed to never occur...). Otherwise in the worst
+ in the case of an internal <application>pgbench</application>
+ error (which are supposed to never occur...) or when
+ <option>--exit-on-abort</option> is specified . Otherwise in the worst
case they only lead to the abortion of the failed client while other
clients continue their run (but some client errors are handled without
an abortion of the client and reported separately, see below). Later in
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 539c2795e2..75ed76182b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -765,6 +765,8 @@ static int64 total_weight = 0;
static bool verbose_errors = false; /* print verbose messages of all errors */
+static bool exit_on_abort = false; /* exit when any client is aborted */
+
/* Builtin test scripts */
typedef struct BuiltinScript
{
@@ -911,6 +913,7 @@ usage(void)
" -T, --time=NUM duration of benchmark test in seconds\n"
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
+ " --exit-on-abort exit when any client is aborted\n"
" --failures-detailed report the failures grouped by basic types\n"
" --log-prefix=PREFIX prefix for transaction time log file\n"
" (default: \"pgbench_log\")\n"
@@ -6612,6 +6615,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"exit-on-abort", no_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6945,6 +6949,10 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16: /* exit-on-abort */
+ benchmarking_option_set = true;
+ exit_on_abort = true;
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7553,11 +7561,17 @@ threadRun(void *arg)
advanceConnectionState(thread, st, &aggs);
+ /*
+ * If --exit-on-abort is used, the whole thread is going to exist
+ * when any client is aborted.
+ */
+ if (exit_on_abort && st->state == CSTATE_ABORTED)
+ goto done;
/*
* If advanceConnectionState changed client to finished state,
* that's one fewer client that remains.
*/
- if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
+ else if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
remains--;
}
@@ -7592,6 +7606,19 @@ threadRun(void *arg)
done:
disconnect_all(state, nstate);
+ if (exit_on_abort)
+ {
+ for (int i = 0; i < nstate; i++)
+ {
+ /* If any client is aborted, exit immediately. */
+ if (state[i].state != CSTATE_FINISHED)
+ {
+ pg_log_error("Run was aborted due to an error in thread %d", thread->tid);
+ exit(2);
+ }
+ }
+ }
+
if (thread->logfile)
{
if (agg_interval > 0)
Hello Yugo-san,
I attached v2 patch including the documentation and some comments
in the code.
I've looked at this patch.
I'm unclear whether it does what it says: "exit immediately on abort", I
would expect a cold call to "exit" (with a clear message obviously) when
the abort occurs.
Currently it skips to "done" which starts by closing this particular
thread client connections, then it will call "exit" later, so it is not
"immediate".
I do not think that this cleanup is necessary, because anyway all other
threads will be brutally killed by the exit called by the aborted thread,
so why bothering to disconnect only some connections?
/* If any client is aborted, exit immediately. */
if (state[i].state != CSTATE_FINISHED)
For this comment, I would prefer "if ( ... == CSTATE_ABORTED)" rather that
implying that not finished means aborted, and if you follow my previous
remark then this code can be removed.
Typo: "going to exist" -> "going to exit". Note that it is not "the whole
thread" but "the program" which is exiting.
There is no test.
--
Fabien.
Hello Fabien,
On Mon, 7 Aug 2023 12:17:38 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Hello Yugo-san,
I attached v2 patch including the documentation and some comments
in the code.I've looked at this patch.
Thank you for your review!
I'm unclear whether it does what it says: "exit immediately on abort", I
would expect a cold call to "exit" (with a clear message obviously) when
the abort occurs.Currently it skips to "done" which starts by closing this particular
thread client connections, then it will call "exit" later, so it is not
"immediate".
There are cases where "goto done" is used where some error like
"invalid socket: ..." happens. I would like to make pgbench exit in
such cases, too, so I chose to handle errors below the "done:" label.
However, we can change it to call "exit" instead of "goo done" at each
place. Do you think this is better?
I do not think that this cleanup is necessary, because anyway all other
threads will be brutally killed by the exit called by the aborted thread,
so why bothering to disconnect only some connections?
Agreed. This disconnection is not necessary anyway even when we would like
to handle it below "done".
/* If any client is aborted, exit immediately. */
if (state[i].state != CSTATE_FINISHED)For this comment, I would prefer "if ( ... == CSTATE_ABORTED)" rather that
implying that not finished means aborted, and if you follow my previous
remark then this code can be removed.
Ok. If we handle errors like "invalid socket:" (mentioned above) after
skipping to "done", we should set the status to CSTATE_ABORTED before the jump.
Otherwise, if we choose to call "exit" immediately at each error instead of
skipping to "done", we can remove this as you says.
Typo: "going to exist" -> "going to exit". Note that it is not "the whole
thread" but "the program" which is exiting.
I'll fix.
There is no test.
I'll add an test.
Regards,
Yugo Nagata
--
Fabien.
--
Yugo NAGATA <nagata@sraoss.co.jp>
Hello Yugo-san,
There are cases where "goto done" is used where some error like
"invalid socket: ..." happens. I would like to make pgbench exit in
such cases, too, so I chose to handle errors below the "done:" label.
However, we can change it to call "exit" instead of "goo done" at each
place. Do you think this is better?
Good point.
Now I understand the "!= FINISHED", because indeed in these cases the done
is reached with unfinished but not necessarily ABORTED clients, and the
comment was somehow misleading.
On reflection, there should be only one exit() call, thus I'd say to keep
the "goto done" as you did, but to move the checking loop *before* the
disconnect_all, and the overall section comment could be something like
"possibly abort if any client is not finished, meaning some error
occured", which is consistent with the "!= FINISHED" condition.
--
Fabien.
On Wed, 9 Aug 2023 02:15:01 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Hello Yugo-san,
There are cases where "goto done" is used where some error like
"invalid socket: ..." happens. I would like to make pgbench exit in
such cases, too, so I chose to handle errors below the "done:" label.
However, we can change it to call "exit" instead of "goo done" at each
place. Do you think this is better?Good point.
Now I understand the "!= FINISHED", because indeed in these cases the done
is reached with unfinished but not necessarily ABORTED clients, and the
comment was somehow misleading.On reflection, there should be only one exit() call, thus I'd say to keep
the "goto done" as you did, but to move the checking loop *before* the
disconnect_all, and the overall section comment could be something like
"possibly abort if any client is not finished, meaning some error
occured", which is consistent with the "!= FINISHED" condition.
Thank you for your suggestion.
I'll fix as above and submit a updated patch soon.
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
On Wed, 9 Aug 2023 10:46:38 +0900
Yugo NAGATA <nagata@sraoss.co.jp> wrote:
On Wed, 9 Aug 2023 02:15:01 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:Hello Yugo-san,
There are cases where "goto done" is used where some error like
"invalid socket: ..." happens. I would like to make pgbench exit in
such cases, too, so I chose to handle errors below the "done:" label.
However, we can change it to call "exit" instead of "goo done" at each
place. Do you think this is better?Good point.
Now I understand the "!= FINISHED", because indeed in these cases the done
is reached with unfinished but not necessarily ABORTED clients, and the
comment was somehow misleading.On reflection, there should be only one exit() call, thus I'd say to keep
the "goto done" as you did, but to move the checking loop *before* the
disconnect_all, and the overall section comment could be something like
"possibly abort if any client is not finished, meaning some error
occured", which is consistent with the "!= FINISHED" condition.Thank you for your suggestion.
I'll fix as above and submit a updated patch soon.
I attached the updated patch v3 including changes above, a test,
and fix of the typo you pointed out.
Regards,
Yugo Nagata
Regards,
Yugo Nagata--
Yugo NAGATA <nagata@sraoss.co.jp>
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
v3-pgbench_exit_on_abort.patchtext/x-diff; name=v3-pgbench_exit_on_abort.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 6c5c8afa6d..83fe01c33f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -787,6 +787,19 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
</listitem>
</varlistentry>
+ <varlistentry id="pgbench-option-exit-on-abort">
+ <term><option>--exit-on-abort</option></term>
+ <listitem>
+ <para>
+ Exit immediately when any client is aborted due to some error. Without
+ this option, even when a client is aborted, other clients could continue
+ their run as specified by <option>-t</option> or <option>-T</option> option,
+ and <application>pgbench</application> will print an incomplete results
+ in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="pgbench-option-log-prefix">
<term><option>--log-prefix=<replaceable>prefix</replaceable></option></term>
<listitem>
@@ -985,7 +998,8 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
benchmark such as initial connection failures also exit with status 1.
Errors during the run such as database errors or problems in the script
will result in exit status 2. In the latter case,
- <application>pgbench</application> will print partial results.
+ <application>pgbench</application> will print partial results if
+ <option>--exit-on-abort</option> option is not specified.
</para>
</refsect1>
@@ -2801,14 +2815,17 @@ statement latencies in milliseconds, failures and retries:
start a connection to the database server / the socket for connecting
the client to the database server has become invalid). In such cases
all clients of this thread stop while other threads continue to work.
+ However, <option>--exit-on-abort</option> is specified, all of the
+ threads stop immediately in this case.
</para>
</listitem>
<listitem>
<para>
Direct client errors. They lead to immediate exit from
<application>pgbench</application> with the corresponding error message
- only in the case of an internal <application>pgbench</application>
- error (which are supposed to never occur...). Otherwise in the worst
+ in the case of an internal <application>pgbench</application>
+ error (which are supposed to never occur...) or when
+ <option>--exit-on-abort</option> is specified . Otherwise in the worst
case they only lead to the abortion of the failed client while other
clients continue their run (but some client errors are handled without
an abortion of the client and reported separately, see below). Later in
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 539c2795e2..eca55d0230 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -765,6 +765,8 @@ static int64 total_weight = 0;
static bool verbose_errors = false; /* print verbose messages of all errors */
+static bool exit_on_abort = false; /* exit when any client is aborted */
+
/* Builtin test scripts */
typedef struct BuiltinScript
{
@@ -911,6 +913,7 @@ usage(void)
" -T, --time=NUM duration of benchmark test in seconds\n"
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
+ " --exit-on-abort exit when any client is aborted\n"
" --failures-detailed report the failures grouped by basic types\n"
" --log-prefix=PREFIX prefix for transaction time log file\n"
" (default: \"pgbench_log\")\n"
@@ -6612,6 +6615,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"exit-on-abort", no_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6945,6 +6949,10 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16: /* exit-on-abort */
+ benchmarking_option_set = true;
+ exit_on_abort = true;
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7553,11 +7561,17 @@ threadRun(void *arg)
advanceConnectionState(thread, st, &aggs);
+ /*
+ * If --exit-on-abort is used, the program is going to exit
+ * when any client is aborted.
+ */
+ if (exit_on_abort && st->state == CSTATE_ABORTED)
+ goto done;
/*
* If advanceConnectionState changed client to finished state,
* that's one fewer client that remains.
*/
- if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
+ else if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
remains--;
}
@@ -7590,6 +7604,19 @@ threadRun(void *arg)
}
done:
+ if (exit_on_abort)
+ {
+ /* Abort if any client is not finished, meaning some error occurred. */
+ for (int i = 0; i < nstate; i++)
+ {
+ if (state[i].state != CSTATE_FINISHED)
+ {
+ pg_log_error("Run was aborted due to an error in thread %d", thread->tid);
+ exit(2);
+ }
+ }
+ }
+
disconnect_all(state, nstate);
if (thread->logfile)
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 142f966300..0fe100232e 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1475,6 +1475,30 @@ SELECT pg_advisory_unlock_all();
# Clean up
$node->safe_psql('postgres', 'DROP TABLE first_client_table, xy;');
+# Test --exit-on-abort
+$node->safe_psql('postgres',
+ 'CREATE TABLE counter(i int);'
+);
+
+$node->pgbench(
+ '-t 10 -c 4 -j 4 --exit-on-abort',
+ 2,
+ [],
+ [
+ qr{Run was aborted due to an error in thread}
+ ],
+ 'test --exit-on-abort',
+ {
+ '001_exit_on_abort' => q{
+update x set i = i+1 returning i \gset
+\if :i = 5
+\set y 1/0
+\endif
+}
+ });
+
+# Clean up
+$node->safe_psql('postgres', 'DROP TABLE counter;');
# done
$node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');
On Wed, 9 Aug 2023 13:59:21 +0900
Yugo NAGATA <nagata@sraoss.co.jp> wrote:
On Wed, 9 Aug 2023 10:46:38 +0900
Yugo NAGATA <nagata@sraoss.co.jp> wrote:On Wed, 9 Aug 2023 02:15:01 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:Hello Yugo-san,
There are cases where "goto done" is used where some error like
"invalid socket: ..." happens. I would like to make pgbench exit in
such cases, too, so I chose to handle errors below the "done:" label.
However, we can change it to call "exit" instead of "goo done" at each
place. Do you think this is better?Good point.
Now I understand the "!= FINISHED", because indeed in these cases the done
is reached with unfinished but not necessarily ABORTED clients, and the
comment was somehow misleading.On reflection, there should be only one exit() call, thus I'd say to keep
the "goto done" as you did, but to move the checking loop *before* the
disconnect_all, and the overall section comment could be something like
"possibly abort if any client is not finished, meaning some error
occured", which is consistent with the "!= FINISHED" condition.Thank you for your suggestion.
I'll fix as above and submit a updated patch soon.I attached the updated patch v3 including changes above, a test,
and fix of the typo you pointed out.
I'm sorry but the test in the previous patch was incorrect.
I attached the correct one.
Regards,
Yugo Nagata
Regards,
Yugo NagataRegards,
Yugo Nagata--
Yugo NAGATA <nagata@sraoss.co.jp>--
Yugo NAGATA <nagata@sraoss.co.jp>
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
v3-pgbench_exit_on_abort.patchtext/x-diff; name=v3-pgbench_exit_on_abort.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 6c5c8afa6d..83fe01c33f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -787,6 +787,19 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
</listitem>
</varlistentry>
+ <varlistentry id="pgbench-option-exit-on-abort">
+ <term><option>--exit-on-abort</option></term>
+ <listitem>
+ <para>
+ Exit immediately when any client is aborted due to some error. Without
+ this option, even when a client is aborted, other clients could continue
+ their run as specified by <option>-t</option> or <option>-T</option> option,
+ and <application>pgbench</application> will print an incomplete results
+ in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="pgbench-option-log-prefix">
<term><option>--log-prefix=<replaceable>prefix</replaceable></option></term>
<listitem>
@@ -985,7 +998,8 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
benchmark such as initial connection failures also exit with status 1.
Errors during the run such as database errors or problems in the script
will result in exit status 2. In the latter case,
- <application>pgbench</application> will print partial results.
+ <application>pgbench</application> will print partial results if
+ <option>--exit-on-abort</option> option is not specified.
</para>
</refsect1>
@@ -2801,14 +2815,17 @@ statement latencies in milliseconds, failures and retries:
start a connection to the database server / the socket for connecting
the client to the database server has become invalid). In such cases
all clients of this thread stop while other threads continue to work.
+ However, <option>--exit-on-abort</option> is specified, all of the
+ threads stop immediately in this case.
</para>
</listitem>
<listitem>
<para>
Direct client errors. They lead to immediate exit from
<application>pgbench</application> with the corresponding error message
- only in the case of an internal <application>pgbench</application>
- error (which are supposed to never occur...). Otherwise in the worst
+ in the case of an internal <application>pgbench</application>
+ error (which are supposed to never occur...) or when
+ <option>--exit-on-abort</option> is specified . Otherwise in the worst
case they only lead to the abortion of the failed client while other
clients continue their run (but some client errors are handled without
an abortion of the client and reported separately, see below). Later in
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 539c2795e2..eca55d0230 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -765,6 +765,8 @@ static int64 total_weight = 0;
static bool verbose_errors = false; /* print verbose messages of all errors */
+static bool exit_on_abort = false; /* exit when any client is aborted */
+
/* Builtin test scripts */
typedef struct BuiltinScript
{
@@ -911,6 +913,7 @@ usage(void)
" -T, --time=NUM duration of benchmark test in seconds\n"
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
+ " --exit-on-abort exit when any client is aborted\n"
" --failures-detailed report the failures grouped by basic types\n"
" --log-prefix=PREFIX prefix for transaction time log file\n"
" (default: \"pgbench_log\")\n"
@@ -6612,6 +6615,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"exit-on-abort", no_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6945,6 +6949,10 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16: /* exit-on-abort */
+ benchmarking_option_set = true;
+ exit_on_abort = true;
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7553,11 +7561,17 @@ threadRun(void *arg)
advanceConnectionState(thread, st, &aggs);
+ /*
+ * If --exit-on-abort is used, the program is going to exit
+ * when any client is aborted.
+ */
+ if (exit_on_abort && st->state == CSTATE_ABORTED)
+ goto done;
/*
* If advanceConnectionState changed client to finished state,
* that's one fewer client that remains.
*/
- if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
+ else if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
remains--;
}
@@ -7590,6 +7604,19 @@ threadRun(void *arg)
}
done:
+ if (exit_on_abort)
+ {
+ /* Abort if any client is not finished, meaning some error occurred. */
+ for (int i = 0; i < nstate; i++)
+ {
+ if (state[i].state != CSTATE_FINISHED)
+ {
+ pg_log_error("Run was aborted due to an error in thread %d", thread->tid);
+ exit(2);
+ }
+ }
+ }
+
disconnect_all(state, nstate);
if (thread->logfile)
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 142f966300..7e44bcea71 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1475,6 +1475,32 @@ SELECT pg_advisory_unlock_all();
# Clean up
$node->safe_psql('postgres', 'DROP TABLE first_client_table, xy;');
+# Test --exit-on-abort
+$node->safe_psql('postgres',
+ 'CREATE TABLE counter(i int); '.
+ 'INSERT INTO counter VALUES (0);'
+);
+
+$node->pgbench(
+ '-T 10 -c 4 -j 4 --exit-on-abort',
+ 2,
+ [],
+ [
+ qr{division by zero},
+ qr{Run was aborted due to an error in thread}
+ ],
+ 'test --exit-on-abort',
+ {
+ '001_exit_on_abort' => q{
+update counter set i = i+1 returning i \gset
+\if :i = 5
+\set y 1/0
+\endif
+}
+ });
+
+# Clean up
+$node->safe_psql('postgres', 'DROP TABLE counter;');
# done
$node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');
Hello Yugo-san,
I attached the updated patch v3 including changes above, a test,
and fix of the typo you pointed out.I'm sorry but the test in the previous patch was incorrect.
I attached the correct one.
About pgbench exit on abort v3:
Patch does not "git apply", but is ok with "patch" although there are some
minor warnings.
Compiles. Code is ok. Tests are ok.
About Test:
The code is simple to get an error quickly but after a few transactions,
good. I'll do a minimal "-c 2 -j 2 -t 10" instead of "-c 4 -j 4 -T 10".
--
Fabien.
On Sun, 13 Aug 2023 11:27:55 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Hello Yugo-san,
I attached the updated patch v3 including changes above, a test,
and fix of the typo you pointed out.I'm sorry but the test in the previous patch was incorrect.
I attached the correct one.About pgbench exit on abort v3:
Patch does not "git apply", but is ok with "patch" although there are some
minor warnings.
In my environment, the patch can be applied to the master branch without
any warnings...
Compiles. Code is ok. Tests are ok.
About Test:
The code is simple to get an error quickly but after a few transactions,
good. I'll do a minimal "-c 2 -j 2 -t 10" instead of "-c 4 -j 4 -T 10".
I fixed the test and attached the updated patch, v4.
Regards,
Yugo Nagata
--
Fabien.
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
v4-pgbench_exit_on_abort.patchtext/x-diff; name=v4-pgbench_exit_on_abort.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 6c5c8afa6d..83fe01c33f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -787,6 +787,19 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
</listitem>
</varlistentry>
+ <varlistentry id="pgbench-option-exit-on-abort">
+ <term><option>--exit-on-abort</option></term>
+ <listitem>
+ <para>
+ Exit immediately when any client is aborted due to some error. Without
+ this option, even when a client is aborted, other clients could continue
+ their run as specified by <option>-t</option> or <option>-T</option> option,
+ and <application>pgbench</application> will print an incomplete results
+ in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="pgbench-option-log-prefix">
<term><option>--log-prefix=<replaceable>prefix</replaceable></option></term>
<listitem>
@@ -985,7 +998,8 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
benchmark such as initial connection failures also exit with status 1.
Errors during the run such as database errors or problems in the script
will result in exit status 2. In the latter case,
- <application>pgbench</application> will print partial results.
+ <application>pgbench</application> will print partial results if
+ <option>--exit-on-abort</option> option is not specified.
</para>
</refsect1>
@@ -2801,14 +2815,17 @@ statement latencies in milliseconds, failures and retries:
start a connection to the database server / the socket for connecting
the client to the database server has become invalid). In such cases
all clients of this thread stop while other threads continue to work.
+ However, <option>--exit-on-abort</option> is specified, all of the
+ threads stop immediately in this case.
</para>
</listitem>
<listitem>
<para>
Direct client errors. They lead to immediate exit from
<application>pgbench</application> with the corresponding error message
- only in the case of an internal <application>pgbench</application>
- error (which are supposed to never occur...). Otherwise in the worst
+ in the case of an internal <application>pgbench</application>
+ error (which are supposed to never occur...) or when
+ <option>--exit-on-abort</option> is specified . Otherwise in the worst
case they only lead to the abortion of the failed client while other
clients continue their run (but some client errors are handled without
an abortion of the client and reported separately, see below). Later in
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 763c4b946a..4e64a60a63 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -765,6 +765,8 @@ static int64 total_weight = 0;
static bool verbose_errors = false; /* print verbose messages of all errors */
+static bool exit_on_abort = false; /* exit when any client is aborted */
+
/* Builtin test scripts */
typedef struct BuiltinScript
{
@@ -911,6 +913,7 @@ usage(void)
" -T, --time=NUM duration of benchmark test in seconds\n"
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
+ " --exit-on-abort exit when any client is aborted\n"
" --failures-detailed report the failures grouped by basic types\n"
" --log-prefix=PREFIX prefix for transaction time log file\n"
" (default: \"pgbench_log\")\n"
@@ -6617,6 +6620,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"exit-on-abort", no_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6950,6 +6954,10 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16: /* exit-on-abort */
+ benchmarking_option_set = true;
+ exit_on_abort = true;
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7558,11 +7566,17 @@ threadRun(void *arg)
advanceConnectionState(thread, st, &aggs);
+ /*
+ * If --exit-on-abort is used, the program is going to exit
+ * when any client is aborted.
+ */
+ if (exit_on_abort && st->state == CSTATE_ABORTED)
+ goto done;
/*
* If advanceConnectionState changed client to finished state,
* that's one fewer client that remains.
*/
- if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
+ else if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
remains--;
}
@@ -7595,6 +7609,19 @@ threadRun(void *arg)
}
done:
+ if (exit_on_abort)
+ {
+ /* Abort if any client is not finished, meaning some error occurred. */
+ for (int i = 0; i < nstate; i++)
+ {
+ if (state[i].state != CSTATE_FINISHED)
+ {
+ pg_log_error("Run was aborted due to an error in thread %d", thread->tid);
+ exit(2);
+ }
+ }
+ }
+
disconnect_all(state, nstate);
if (thread->logfile)
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 142f966300..96be529d6b 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1475,6 +1475,32 @@ SELECT pg_advisory_unlock_all();
# Clean up
$node->safe_psql('postgres', 'DROP TABLE first_client_table, xy;');
+# Test --exit-on-abort
+$node->safe_psql('postgres',
+ 'CREATE TABLE counter(i int); '.
+ 'INSERT INTO counter VALUES (0);'
+);
+
+$node->pgbench(
+ '-t 10 -c 2 -j 2 --exit-on-abort',
+ 2,
+ [],
+ [
+ qr{division by zero},
+ qr{Run was aborted due to an error in thread}
+ ],
+ 'test --exit-on-abort',
+ {
+ '001_exit_on_abort' => q{
+update counter set i = i+1 returning i \gset
+\if :i = 5
+\set y 1/0
+\endif
+}
+ });
+
+# Clean up
+$node->safe_psql('postgres', 'DROP TABLE counter;');
# done
$node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');
About pgbench exit on abort v4:
Patch applies cleanly, compiles, local make check ok, doc looks ok.
This looks ok to me.
--
Fabien.
About pgbench exit on abort v4:
Patch applies cleanly, compiles, local make check ok, doc looks ok.
This looks ok to me.
I have tested the v4 patch with default_transaction_isolation =
'repeatable read'.
pgbench --exit-on-abort -N -p 11002 -c 10 -T 30 test
pgbench (17devel, server 15.3)
starting vacuum...end.
transaction type: <builtin: simple update>
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
maximum number of tries: 1
duration: 30 s
number of transactions actually processed: 64854
number of failed transactions: 4 (0.006%)
latency average = 4.628 ms (including failures)
initial connection time = 49.526 ms
tps = 2160.827556 (without initial connection time)
Depite the 4 failed transactions (could not serialize access due to
concurrent update) pgbench ran normally, rather than aborting the
test. Is this an expected behavior?
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
I have tested the v4 patch with default_transaction_isolation =
'repeatable read'.pgbench --exit-on-abort -N -p 11002 -c 10 -T 30 test
pgbench (17devel, server 15.3)
starting vacuum...end.
transaction type: <builtin: simple update>
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
maximum number of tries: 1
duration: 30 s
number of transactions actually processed: 64854
number of failed transactions: 4 (0.006%)
latency average = 4.628 ms (including failures)
initial connection time = 49.526 ms
tps = 2160.827556 (without initial connection time)Depite the 4 failed transactions (could not serialize access due to
concurrent update) pgbench ran normally, rather than aborting the
test. Is this an expected behavior?
I start to think this behavior is ok and consistent with previous
behavior of pgbench because serialization (and dealock) errors have
been treated specially from other types of errors, such as accessing
non existing tables. However, I suggest to add more sentences to the
explanation of this option so that users are not confused by this.
+ <varlistentry id="pgbench-option-exit-on-abort">
+ <term><option>--exit-on-abort</option></term>
+ <listitem>
+ <para>
+ Exit immediately when any client is aborted due to some error. Without
+ this option, even when a client is aborted, other clients could continue
+ their run as specified by <option>-t</option> or <option>-T</option> option,
+ and <application>pgbench</application> will print an incomplete results
+ in this case.
+ </para>
+ </listitem>
+ </varlistentry>
+
What about inserting "Note that serialization failures or deadlock
failures will not abort client. See <xref
linkend="failures-and-retries"/> for more information." into the end
of this paragraph?
BTW, I think:
Exit immediately when any client is aborted due to some error. Without
should be:
Exit immediately when any client is aborted due to some errors. Without
(error vs. erros)
Also:
+ <option>--exit-on-abort</option> is specified . Otherwise in the worst
There is an extra space between "specified" and ".".
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
On Sat, 19 Aug 2023 19:51:56 +0900 (JST)
Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
I have tested the v4 patch with default_transaction_isolation =
'repeatable read'.pgbench --exit-on-abort -N -p 11002 -c 10 -T 30 test
pgbench (17devel, server 15.3)
starting vacuum...end.
transaction type: <builtin: simple update>
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
maximum number of tries: 1
duration: 30 s
number of transactions actually processed: 64854
number of failed transactions: 4 (0.006%)
latency average = 4.628 ms (including failures)
initial connection time = 49.526 ms
tps = 2160.827556 (without initial connection time)Depite the 4 failed transactions (could not serialize access due to
concurrent update) pgbench ran normally, rather than aborting the
test. Is this an expected behavior?
Yes. --exit-on-abort changes a behaviour when a client is **aborted**
due to an error, and serialization errors do not cause abort, so
it is not affected.
I start to think this behavior is ok and consistent with previous
behavior of pgbench because serialization (and dealock) errors have
been treated specially from other types of errors, such as accessing
non existing tables. However, I suggest to add more sentences to the
explanation of this option so that users are not confused by this.+ <varlistentry id="pgbench-option-exit-on-abort"> + <term><option>--exit-on-abort</option></term> + <listitem> + <para> + Exit immediately when any client is aborted due to some error. Without + this option, even when a client is aborted, other clients could continue + their run as specified by <option>-t</option> or <option>-T</option> option, + and <application>pgbench</application> will print an incomplete results + in this case. + </para> + </listitem> + </varlistentry> +What about inserting "Note that serialization failures or deadlock
failures will not abort client. See <xref
linkend="failures-and-retries"/> for more information." into the end
of this paragraph?
--exit-on-abort is related to "abort" of a client instead of error or
failure itself, so rather I wonder a bit that mentioning serialization/deadlock
failures might be confusing. However, if users may think of such failures from
"abort", it could be beneficial to add the sentences with some modification as
below.
"Note that serialization failures or deadlock failures does not abort the
client, so they are not affected by this option.
See <xref linkend="failures-and-retries"/> for more information."
BTW, I think:
Exit immediately when any client is aborted due to some error. Withoutshould be:
Exit immediately when any client is aborted due to some errors. Without(error vs. erros)
Well, I chose "some" to mean "unknown or unspecified", not "an unspecified amount
or number of", so singular form "error" is used.
Instead, should we use "due to a error"?
Also:
+ <option>--exit-on-abort</option> is specified . Otherwise in the worstThere is an extra space between "specified" and ".".
Fixed.
Also, I fixed the place of the description in the documentation
to alphabetical order
Attached is the updated patch.
Regards,
Yugo Nagata
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
v5-pgbench_exit_on_abort.patchtext/x-diff; name=v5-pgbench_exit_on_abort.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 6c5c8afa6d..4175cf4ccd 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -768,6 +768,24 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
</listitem>
</varlistentry>
+ <varlistentry id="pgbench-option-exit-on-abort">
+ <term><option>--exit-on-abort</option></term>
+ <listitem>
+ <para>
+ Exit immediately when any client is aborted due to some error. Without
+ this option, even when a client is aborted, other clients could continue
+ their run as specified by <option>-t</option> or <option>-T</option> option,
+ and <application>pgbench</application> will print an incomplete results
+ in this case.
+ </para>
+ <para>
+ Note that serialization failures or deadlock failures does not abort the
+ client, so they are not affected by this option.
+ See <xref linkend="failures-and-retries"/> for more information.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="pgbench-option-failures-detailed">
<term><option>--failures-detailed</option></term>
<listitem>
@@ -985,7 +1003,8 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
benchmark such as initial connection failures also exit with status 1.
Errors during the run such as database errors or problems in the script
will result in exit status 2. In the latter case,
- <application>pgbench</application> will print partial results.
+ <application>pgbench</application> will print partial results if
+ <option>--exit-on-abort</option> option is not specified.
</para>
</refsect1>
@@ -2801,14 +2820,17 @@ statement latencies in milliseconds, failures and retries:
start a connection to the database server / the socket for connecting
the client to the database server has become invalid). In such cases
all clients of this thread stop while other threads continue to work.
+ However, <option>--exit-on-abort</option> is specified, all of the
+ threads stop immediately in this case.
</para>
</listitem>
<listitem>
<para>
Direct client errors. They lead to immediate exit from
<application>pgbench</application> with the corresponding error message
- only in the case of an internal <application>pgbench</application>
- error (which are supposed to never occur...). Otherwise in the worst
+ in the case of an internal <application>pgbench</application>
+ error (which are supposed to never occur...) or when
+ <option>--exit-on-abort</option> is specified. Otherwise in the worst
case they only lead to the abortion of the failed client while other
clients continue their run (but some client errors are handled without
an abortion of the client and reported separately, see below). Later in
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 763c4b946a..4e64a60a63 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -765,6 +765,8 @@ static int64 total_weight = 0;
static bool verbose_errors = false; /* print verbose messages of all errors */
+static bool exit_on_abort = false; /* exit when any client is aborted */
+
/* Builtin test scripts */
typedef struct BuiltinScript
{
@@ -911,6 +913,7 @@ usage(void)
" -T, --time=NUM duration of benchmark test in seconds\n"
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
+ " --exit-on-abort exit when any client is aborted\n"
" --failures-detailed report the failures grouped by basic types\n"
" --log-prefix=PREFIX prefix for transaction time log file\n"
" (default: \"pgbench_log\")\n"
@@ -6617,6 +6620,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"exit-on-abort", no_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6950,6 +6954,10 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16: /* exit-on-abort */
+ benchmarking_option_set = true;
+ exit_on_abort = true;
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7558,11 +7566,17 @@ threadRun(void *arg)
advanceConnectionState(thread, st, &aggs);
+ /*
+ * If --exit-on-abort is used, the program is going to exit
+ * when any client is aborted.
+ */
+ if (exit_on_abort && st->state == CSTATE_ABORTED)
+ goto done;
/*
* If advanceConnectionState changed client to finished state,
* that's one fewer client that remains.
*/
- if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
+ else if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
remains--;
}
@@ -7595,6 +7609,19 @@ threadRun(void *arg)
}
done:
+ if (exit_on_abort)
+ {
+ /* Abort if any client is not finished, meaning some error occurred. */
+ for (int i = 0; i < nstate; i++)
+ {
+ if (state[i].state != CSTATE_FINISHED)
+ {
+ pg_log_error("Run was aborted due to an error in thread %d", thread->tid);
+ exit(2);
+ }
+ }
+ }
+
disconnect_all(state, nstate);
if (thread->logfile)
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 142f966300..96be529d6b 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1475,6 +1475,32 @@ SELECT pg_advisory_unlock_all();
# Clean up
$node->safe_psql('postgres', 'DROP TABLE first_client_table, xy;');
+# Test --exit-on-abort
+$node->safe_psql('postgres',
+ 'CREATE TABLE counter(i int); '.
+ 'INSERT INTO counter VALUES (0);'
+);
+
+$node->pgbench(
+ '-t 10 -c 2 -j 2 --exit-on-abort',
+ 2,
+ [],
+ [
+ qr{division by zero},
+ qr{Run was aborted due to an error in thread}
+ ],
+ 'test --exit-on-abort',
+ {
+ '001_exit_on_abort' => q{
+update counter set i = i+1 returning i \gset
+\if :i = 5
+\set y 1/0
+\endif
+}
+ });
+
+# Clean up
+$node->safe_psql('postgres', 'DROP TABLE counter;');
# done
$node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');
I start to think this behavior is ok and consistent with previous
behavior of pgbench because serialization (and dealock) errors have
been treated specially from other types of errors, such as accessing
non existing tables. However, I suggest to add more sentences to the
explanation of this option so that users are not confused by this.+ <varlistentry id="pgbench-option-exit-on-abort"> + <term><option>--exit-on-abort</option></term> + <listitem> + <para> + Exit immediately when any client is aborted due to some error. Without + this option, even when a client is aborted, other clients could continue + their run as specified by <option>-t</option> or <option>-T</option> option, + and <application>pgbench</application> will print an incomplete results + in this case. + </para> + </listitem> + </varlistentry> +What about inserting "Note that serialization failures or deadlock
failures will not abort client. See <xref
linkend="failures-and-retries"/> for more information." into the end
of this paragraph?--exit-on-abort is related to "abort" of a client instead of error or
failure itself, so rather I wonder a bit that mentioning serialization/deadlock
failures might be confusing. However, if users may think of such failures from
"abort", it could be beneficial to add the sentences with some modification as
below.
I myself confused by this and believe that adding extra paragraph is
beneficial to users.
"Note that serialization failures or deadlock failures does not abort the
client, so they are not affected by this option.
See <xref linkend="failures-and-retries"/> for more information."
"does not" --> "do not".
BTW, I think:
Exit immediately when any client is aborted due to some error. Withoutshould be:
Exit immediately when any client is aborted due to some errors. Without(error vs. erros)
Well, I chose "some" to mean "unknown or unspecified", not "an unspecified amount
or number of", so singular form "error" is used.
Ok.
Instead, should we use "due to a error"?
I don't think so.
Also:
+ <option>--exit-on-abort</option> is specified . Otherwise in the worstThere is an extra space between "specified" and ".".
Fixed.
Also, I fixed the place of the description in the documentation
to alphabetical orderAttached is the updated patch.
Looks good to me. If there's no objection, I will commit this next week.
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
On Thu, 24 Aug 2023 09:15:51 +0900 (JST)
Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
I start to think this behavior is ok and consistent with previous
behavior of pgbench because serialization (and dealock) errors have
been treated specially from other types of errors, such as accessing
non existing tables. However, I suggest to add more sentences to the
explanation of this option so that users are not confused by this.+ <varlistentry id="pgbench-option-exit-on-abort"> + <term><option>--exit-on-abort</option></term> + <listitem> + <para> + Exit immediately when any client is aborted due to some error. Without + this option, even when a client is aborted, other clients could continue + their run as specified by <option>-t</option> or <option>-T</option> option, + and <application>pgbench</application> will print an incomplete results + in this case. + </para> + </listitem> + </varlistentry> +What about inserting "Note that serialization failures or deadlock
failures will not abort client. See <xref
linkend="failures-and-retries"/> for more information." into the end
of this paragraph?--exit-on-abort is related to "abort" of a client instead of error or
failure itself, so rather I wonder a bit that mentioning serialization/deadlock
failures might be confusing. However, if users may think of such failures from
"abort", it could be beneficial to add the sentences with some modification as
below.I myself confused by this and believe that adding extra paragraph is
beneficial to users.
Ok.
"Note that serialization failures or deadlock failures does not abort the
client, so they are not affected by this option.
See <xref linkend="failures-and-retries"/> for more information.""does not" --> "do not".
Oops. I attached the updated patch.
BTW, I think:
Exit immediately when any client is aborted due to some error. Withoutshould be:
Exit immediately when any client is aborted due to some errors. Without(error vs. erros)
Well, I chose "some" to mean "unknown or unspecified", not "an unspecified amount
or number of", so singular form "error" is used.Ok.
Instead, should we use "due to a error"?
I don't think so.
Also:
+ <option>--exit-on-abort</option> is specified . Otherwise in the worstThere is an extra space between "specified" and ".".
Fixed.
Also, I fixed the place of the description in the documentation
to alphabetical orderAttached is the updated patch.
Looks good to me. If there's no objection, I will commit this next week.
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
v6-pgbench_exit_on_abort.patchtext/x-diff; name=v6-pgbench_exit_on_abort.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 6c5c8afa6d..95cc9027c3 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -768,6 +768,24 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
</listitem>
</varlistentry>
+ <varlistentry id="pgbench-option-exit-on-abort">
+ <term><option>--exit-on-abort</option></term>
+ <listitem>
+ <para>
+ Exit immediately when any client is aborted due to some error. Without
+ this option, even when a client is aborted, other clients could continue
+ their run as specified by <option>-t</option> or <option>-T</option> option,
+ and <application>pgbench</application> will print an incomplete results
+ in this case.
+ </para>
+ <para>
+ Note that serialization failures or deadlock failures do not abort the
+ client, so they are not affected by this option.
+ See <xref linkend="failures-and-retries"/> for more information.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="pgbench-option-failures-detailed">
<term><option>--failures-detailed</option></term>
<listitem>
@@ -985,7 +1003,8 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
benchmark such as initial connection failures also exit with status 1.
Errors during the run such as database errors or problems in the script
will result in exit status 2. In the latter case,
- <application>pgbench</application> will print partial results.
+ <application>pgbench</application> will print partial results if
+ <option>--exit-on-abort</option> option is not specified.
</para>
</refsect1>
@@ -2801,14 +2820,17 @@ statement latencies in milliseconds, failures and retries:
start a connection to the database server / the socket for connecting
the client to the database server has become invalid). In such cases
all clients of this thread stop while other threads continue to work.
+ However, <option>--exit-on-abort</option> is specified, all of the
+ threads stop immediately in this case.
</para>
</listitem>
<listitem>
<para>
Direct client errors. They lead to immediate exit from
<application>pgbench</application> with the corresponding error message
- only in the case of an internal <application>pgbench</application>
- error (which are supposed to never occur...). Otherwise in the worst
+ in the case of an internal <application>pgbench</application>
+ error (which are supposed to never occur...) or when
+ <option>--exit-on-abort</option> is specified. Otherwise in the worst
case they only lead to the abortion of the failed client while other
clients continue their run (but some client errors are handled without
an abortion of the client and reported separately, see below). Later in
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 763c4b946a..4e64a60a63 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -765,6 +765,8 @@ static int64 total_weight = 0;
static bool verbose_errors = false; /* print verbose messages of all errors */
+static bool exit_on_abort = false; /* exit when any client is aborted */
+
/* Builtin test scripts */
typedef struct BuiltinScript
{
@@ -911,6 +913,7 @@ usage(void)
" -T, --time=NUM duration of benchmark test in seconds\n"
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
+ " --exit-on-abort exit when any client is aborted\n"
" --failures-detailed report the failures grouped by basic types\n"
" --log-prefix=PREFIX prefix for transaction time log file\n"
" (default: \"pgbench_log\")\n"
@@ -6617,6 +6620,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"exit-on-abort", no_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6950,6 +6954,10 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16: /* exit-on-abort */
+ benchmarking_option_set = true;
+ exit_on_abort = true;
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7558,11 +7566,17 @@ threadRun(void *arg)
advanceConnectionState(thread, st, &aggs);
+ /*
+ * If --exit-on-abort is used, the program is going to exit
+ * when any client is aborted.
+ */
+ if (exit_on_abort && st->state == CSTATE_ABORTED)
+ goto done;
/*
* If advanceConnectionState changed client to finished state,
* that's one fewer client that remains.
*/
- if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
+ else if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
remains--;
}
@@ -7595,6 +7609,19 @@ threadRun(void *arg)
}
done:
+ if (exit_on_abort)
+ {
+ /* Abort if any client is not finished, meaning some error occurred. */
+ for (int i = 0; i < nstate; i++)
+ {
+ if (state[i].state != CSTATE_FINISHED)
+ {
+ pg_log_error("Run was aborted due to an error in thread %d", thread->tid);
+ exit(2);
+ }
+ }
+ }
+
disconnect_all(state, nstate);
if (thread->logfile)
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 142f966300..96be529d6b 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -1475,6 +1475,32 @@ SELECT pg_advisory_unlock_all();
# Clean up
$node->safe_psql('postgres', 'DROP TABLE first_client_table, xy;');
+# Test --exit-on-abort
+$node->safe_psql('postgres',
+ 'CREATE TABLE counter(i int); '.
+ 'INSERT INTO counter VALUES (0);'
+);
+
+$node->pgbench(
+ '-t 10 -c 2 -j 2 --exit-on-abort',
+ 2,
+ [],
+ [
+ qr{division by zero},
+ qr{Run was aborted due to an error in thread}
+ ],
+ 'test --exit-on-abort',
+ {
+ '001_exit_on_abort' => q{
+update counter set i = i+1 returning i \gset
+\if :i = 5
+\set y 1/0
+\endif
+}
+ });
+
+# Clean up
+$node->safe_psql('postgres', 'DROP TABLE counter;');
# done
$node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');
Yugo,
Fabien,
I start to think this behavior is ok and consistent with previous
behavior of pgbench because serialization (and dealock) errors have
been treated specially from other types of errors, such as accessing
non existing tables. However, I suggest to add more sentences to the
explanation of this option so that users are not confused by this.+ <varlistentry id="pgbench-option-exit-on-abort"> + <term><option>--exit-on-abort</option></term> + <listitem> + <para> + Exit immediately when any client is aborted due to some error. Without + this option, even when a client is aborted, other clients could continue + their run as specified by <option>-t</option> or <option>-T</option> option, + and <application>pgbench</application> will print an incomplete results + in this case. + </para> + </listitem> + </varlistentry> +What about inserting "Note that serialization failures or deadlock
failures will not abort client. See <xref
linkend="failures-and-retries"/> for more information." into the end
of this paragraph?--exit-on-abort is related to "abort" of a client instead of error or
failure itself, so rather I wonder a bit that mentioning serialization/deadlock
failures might be confusing. However, if users may think of such failures from
"abort", it could be beneficial to add the sentences with some modification as
below.I myself confused by this and believe that adding extra paragraph is
beneficial to users."Note that serialization failures or deadlock failures does not abort the
client, so they are not affected by this option.
See <xref linkend="failures-and-retries"/> for more information.""does not" --> "do not".
BTW, I think:
Exit immediately when any client is aborted due to some error. Withoutshould be:
Exit immediately when any client is aborted due to some errors. Without(error vs. erros)
Well, I chose "some" to mean "unknown or unspecified", not "an unspecified amount
or number of", so singular form "error" is used.Ok.
Instead, should we use "due to a error"?
I don't think so.
Also:
+ <option>--exit-on-abort</option> is specified . Otherwise in the worstThere is an extra space between "specified" and ".".
Fixed.
Also, I fixed the place of the description in the documentation
to alphabetical orderAttached is the updated patch.
Looks good to me. If there's no objection, I will commit this next week.
I have pushed the patch. Thank you for the conributions!
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
On Wed, 30 Aug 2023 10:11:10 +0900 (JST)
Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
Yugo,
Fabien,I start to think this behavior is ok and consistent with previous
behavior of pgbench because serialization (and dealock) errors have
been treated specially from other types of errors, such as accessing
non existing tables. However, I suggest to add more sentences to the
explanation of this option so that users are not confused by this.+ <varlistentry id="pgbench-option-exit-on-abort"> + <term><option>--exit-on-abort</option></term> + <listitem> + <para> + Exit immediately when any client is aborted due to some error. Without + this option, even when a client is aborted, other clients could continue + their run as specified by <option>-t</option> or <option>-T</option> option, + and <application>pgbench</application> will print an incomplete results + in this case. + </para> + </listitem> + </varlistentry> +What about inserting "Note that serialization failures or deadlock
failures will not abort client. See <xref
linkend="failures-and-retries"/> for more information." into the end
of this paragraph?--exit-on-abort is related to "abort" of a client instead of error or
failure itself, so rather I wonder a bit that mentioning serialization/deadlock
failures might be confusing. However, if users may think of such failures from
"abort", it could be beneficial to add the sentences with some modification as
below.I myself confused by this and believe that adding extra paragraph is
beneficial to users."Note that serialization failures or deadlock failures does not abort the
client, so they are not affected by this option.
See <xref linkend="failures-and-retries"/> for more information.""does not" --> "do not".
BTW, I think:
Exit immediately when any client is aborted due to some error. Withoutshould be:
Exit immediately when any client is aborted due to some errors. Without(error vs. erros)
Well, I chose "some" to mean "unknown or unspecified", not "an unspecified amount
or number of", so singular form "error" is used.Ok.
Instead, should we use "due to a error"?
I don't think so.
Also:
+ <option>--exit-on-abort</option> is specified . Otherwise in the worstThere is an extra space between "specified" and ".".
Fixed.
Also, I fixed the place of the description in the documentation
to alphabetical orderAttached is the updated patch.
Looks good to me. If there's no objection, I will commit this next week.
I have pushed the patch. Thank you for the conributions!
Thank you!
Regards,
Yugo Nagata
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
--
Yugo NAGATA <nagata@sraoss.co.jp>