Some code cleanup for pgbench and pg_verifybackup

Started by Michael Paquierover 4 years ago13 messages

michael@paquier.xyz

over 4 years ago

1 attachment(s)

Hi all,

While looking at the code areas of $subject, I got surprised about a
couple of things:
- pgbench has its own parsing routines for int64 and double, with
an option to skip errors. That's not surprising in itself, but, for
strtodouble(), errorOK is always true, meaning that the error strings
are dead. For strtoint64(), errorOK is false only when parsing a
Variable, where a second error string is generated. I don't really
think that we need to be that pedantic about the type of errors
generated in those code paths when failing to parse a variable, so I'd
like to propose a simplification of the code where we reuse the same
error message as for double, cutting a bit the number of translatable
strings.
- pgbench and pg_verifybackup make use of pg_log_fatal() to report
some failures in code paths dedicated to command-line options, which
is inconsistent with all the other tools that use pg_log_error().

Please find attached a patch to clean up all those inconsistencies.

Thoughts?
--
Michael

Attachments:

pgbench-verifybackup-cleanups.patchtext/x-diff; charset=us-asciiDownload

diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index f5ebd57a47..5f0469373f 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -261,7 +261,7 @@ main(int argc, char **argv)
 	/* Get backup directory name */
 	if (optind >= argc)
 	{
-		pg_log_fatal("no backup directory specified");
+		pg_log_error("no backup directory specified");
 		fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
 				progname);
 		exit(1);
@@ -272,7 +272,7 @@ main(int argc, char **argv)
 	/* Complain if any arguments remain */
 	if (optind < argc)
 	{
-		pg_log_fatal("too many command-line arguments (first is \"%s\")",
+		pg_log_error("too many command-line arguments (first is \"%s\")",
 					 argv[optind]);
 		fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
 				progname);
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 75432cedc6..b3322f01f7 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -205,19 +205,19 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (!strtoint64(yytext, &yylval->ival))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
 				}
 {digit}+(\.{digit}*)?([eE][-+]?{digit}+)?	{
-					if (!strtodouble(yytext, true, &yylval->dval))
+					if (!strtodouble(yytext, &yylval->dval))
 						expr_yyerror_more(yyscanner, "double constant overflow",
 										  strdup(yytext));
 					return DOUBLE_CONST;
 				}
 \.{digit}+([eE][-+]?{digit}+)?	{
-					if (!strtodouble(yytext, true, &yylval->dval))
+					if (!strtodouble(yytext, &yylval->dval))
 						expr_yyerror_more(yyscanner, "double constant overflow",
 										  strdup(yytext));
 					return DOUBLE_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index c51ebb8e31..87f3b9f3b6 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -803,7 +803,7 @@ is_an_int(const char *str)
  * If not errorOK, an error message is also printed out on errors.
  */
 bool
-strtoint64(const char *str, bool errorOK, int64 *result)
+strtoint64(const char *str, int64 *result)
 {
 	const char *ptr = str;
 	int64		tmp = 0;
@@ -862,19 +862,15 @@ strtoint64(const char *str, bool errorOK, int64 *result)
 	return true;
 
 out_of_range:
-	if (!errorOK)
-		pg_log_error("value \"%s\" is out of range for type bigint", str);
 	return false;
 
 invalid_syntax:
-	if (!errorOK)
-		pg_log_error("invalid input syntax for type bigint: \"%s\"", str);
 	return false;
 }
 
 /* convert string to double, detecting overflows/underflows */
 bool
-strtodouble(const char *str, bool errorOK, double *dv)
+strtodouble(const char *str, double *dv)
 {
 	char	   *end;
 
@@ -882,18 +878,9 @@ strtodouble(const char *str, bool errorOK, double *dv)
 	*dv = strtod(str, &end);
 
 	if (unlikely(errno != 0))
-	{
-		if (!errorOK)
-			pg_log_error("value \"%s\" is out of range for type double", str);
 		return false;
-	}
-
 	if (unlikely(end == str || *end != '\0'))
-	{
-		if (!errorOK)
-			pg_log_error("invalid input syntax for type double: \"%s\"", str);
 		return false;
-	}
 	return true;
 }
 
@@ -1525,16 +1512,19 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64		iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!strtoint64(var->svalue, &iv))
+		{
+			pg_log_error("malformed variable \"%s\" value: \"%s\"",
+						 var->name, var->svalue);
 			return false;
-
+		}
 		setIntValue(&var->value, iv);
 	}
 	else						/* type should be double */
 	{
 		double		dv;
 
-		if (!strtodouble(var->svalue, true, &dv))
+		if (!strtodouble(var->svalue, &dv))
 		{
 			pg_log_error("malformed variable \"%s\" value: \"%s\"",
 						 var->name, var->svalue);
@@ -5900,12 +5890,12 @@ main(int argc, char **argv)
 				if (getrlimit(RLIMIT_OFILE, &rlim) == -1)
 #endif							/* RLIMIT_NOFILE */
 				{
-					pg_log_fatal("getrlimit failed: %m");
+					pg_log_error("getrlimit failed: %m");
 					exit(1);
 				}
 				if (rlim.rlim_cur < nclients + 3)
 				{
-					pg_log_fatal("need at least %d open files, but system limit is %ld",
+					pg_log_error("need at least %d open files, but system limit is %ld",
 								 nclients + 3, (long) rlim.rlim_cur);
 					pg_log_info("Reduce number of clients, or use limit/ulimit to increase the system limit.");
 					exit(1);
@@ -5922,7 +5912,7 @@ main(int argc, char **argv)
 #ifndef ENABLE_THREAD_SAFETY
 				if (nthreads != 1)
 				{
-					pg_log_fatal("threads are not supported on this platform; use -j1");
+					pg_log_error("threads are not supported on this platform; use -j1");
 					exit(1);
 				}
 #endif							/* !ENABLE_THREAD_SAFETY */
@@ -5998,7 +5988,7 @@ main(int argc, char **argv)
 
 					if ((p = strchr(optarg, '=')) == NULL || p == optarg || *(p + 1) == '\0')
 					{
-						pg_log_fatal("invalid variable definition: \"%s\"", optarg);
+						pg_log_error("invalid variable definition: \"%s\"", optarg);
 						exit(1);
 					}
 
@@ -6020,7 +6010,7 @@ main(int argc, char **argv)
 						break;
 				if (querymode >= NUM_QUERYMODE)
 				{
-					pg_log_fatal("invalid query mode (-M): \"%s\"", optarg);
+					pg_log_error("invalid query mode (-M): \"%s\"", optarg);
 					exit(1);
 				}
 				break;
@@ -6039,7 +6029,7 @@ main(int argc, char **argv)
 
 					if (throttle_value <= 0.0)
 					{
-						pg_log_fatal("invalid rate limit: \"%s\"", optarg);
+						pg_log_error("invalid rate limit: \"%s\"", optarg);
 						exit(1);
 					}
 					/* Invert rate limit into per-transaction delay in usec */
@@ -6052,7 +6042,7 @@ main(int argc, char **argv)
 
 					if (limit_ms <= 0.0)
 					{
-						pg_log_fatal("invalid latency limit: \"%s\"", optarg);
+						pg_log_error("invalid latency limit: \"%s\"", optarg);
 						exit(1);
 					}
 					benchmarking_option_set = true;
@@ -6076,7 +6066,7 @@ main(int argc, char **argv)
 				sample_rate = atof(optarg);
 				if (sample_rate <= 0.0 || sample_rate > 1.0)
 				{
-					pg_log_fatal("invalid sampling rate: \"%s\"", optarg);
+					pg_log_error("invalid sampling rate: \"%s\"", optarg);
 					exit(1);
 				}
 				break;
@@ -6102,7 +6092,7 @@ main(int argc, char **argv)
 				benchmarking_option_set = true;
 				if (!set_random_seed(optarg))
 				{
-					pg_log_fatal("error while setting random seed from --random-seed option");
+					pg_log_error("error while setting random seed from --random-seed option");
 					exit(1);
 				}
 				break;
@@ -6128,7 +6118,7 @@ main(int argc, char **argv)
 					partition_method = PART_HASH;
 				else
 				{
-					pg_log_fatal("invalid partition method, expecting \"range\" or \"hash\", got: \"%s\"",
+					pg_log_error("invalid partition method, expecting \"range\" or \"hash\", got: \"%s\"",
 								 optarg);
 					exit(1);
 				}
@@ -6163,7 +6153,7 @@ main(int argc, char **argv)
 
 	if (total_weight == 0 && !is_init_mode)
 	{
-		pg_log_fatal("total script weight must not be zero");
+		pg_log_error("total script weight must not be zero");
 		exit(1);
 	}
 
@@ -6200,7 +6190,7 @@ main(int argc, char **argv)
 
 	if (optind < argc)
 	{
-		pg_log_fatal("too many command-line arguments (first is \"%s\")",
+		pg_log_error("too many command-line arguments (first is \"%s\")",
 					 argv[optind]);
 		fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 		exit(1);
@@ -6210,13 +6200,13 @@ main(int argc, char **argv)
 	{
 		if (benchmarking_option_set)
 		{
-			pg_log_fatal("some of the specified options cannot be used in initialization (-i) mode");
+			pg_log_error("some of the specified options cannot be used in initialization (-i) mode");
 			exit(1);
 		}
 
 		if (partitions == 0 && partition_method != PART_NONE)
 		{
-			pg_log_fatal("--partition-method requires greater than zero --partitions");
+			pg_log_error("--partition-method requires greater than zero --partitions");
 			exit(1);
 		}
 
@@ -6255,14 +6245,14 @@ main(int argc, char **argv)
 	{
 		if (initialization_option_set)
 		{
-			pg_log_fatal("some of the specified options cannot be used in benchmarking mode");
+			pg_log_error("some of the specified options cannot be used in benchmarking mode");
 			exit(1);
 		}
 	}
 
 	if (nxacts > 0 && duration > 0)
 	{
-		pg_log_fatal("specify either a number of transactions (-t) or a duration (-T), not both");
+		pg_log_error("specify either a number of transactions (-t) or a duration (-T), not both");
 		exit(1);
 	}
 
@@ -6273,44 +6263,44 @@ main(int argc, char **argv)
 	/* --sampling-rate may be used only with -l */
 	if (sample_rate > 0.0 && !use_log)
 	{
-		pg_log_fatal("log sampling (--sampling-rate) is allowed only when logging transactions (-l)");
+		pg_log_error("log sampling (--sampling-rate) is allowed only when logging transactions (-l)");
 		exit(1);
 	}
 
 	/* --sampling-rate may not be used with --aggregate-interval */
 	if (sample_rate > 0.0 && agg_interval > 0)
 	{
-		pg_log_fatal("log sampling (--sampling-rate) and aggregation (--aggregate-interval) cannot be used at the same time");
+		pg_log_error("log sampling (--sampling-rate) and aggregation (--aggregate-interval) cannot be used at the same time");
 		exit(1);
 	}
 
 	if (agg_interval > 0 && !use_log)
 	{
-		pg_log_fatal("log aggregation is allowed only when actually logging transactions");
+		pg_log_error("log aggregation is allowed only when actually logging transactions");
 		exit(1);
 	}
 
 	if (!use_log && logfile_prefix)
 	{
-		pg_log_fatal("log file prefix (--log-prefix) is allowed only when logging transactions (-l)");
+		pg_log_error("log file prefix (--log-prefix) is allowed only when logging transactions (-l)");
 		exit(1);
 	}
 
 	if (duration > 0 && agg_interval > duration)
 	{
-		pg_log_fatal("number of seconds for aggregation (%d) must not be higher than test duration (%d)", agg_interval, duration);
+		pg_log_error("number of seconds for aggregation (%d) must not be higher than test duration (%d)", agg_interval, duration);
 		exit(1);
 	}
 
 	if (duration > 0 && agg_interval > 0 && duration % agg_interval != 0)
 	{
-		pg_log_fatal("duration (%d) must be a multiple of aggregation interval (%d)", duration, agg_interval);
+		pg_log_error("duration (%d) must be a multiple of aggregation interval (%d)", duration, agg_interval);
 		exit(1);
 	}
 
 	if (progress_timestamp && progress == 0)
 	{
-		pg_log_fatal("--progress-timestamp is allowed only under --progress");
+		pg_log_error("--progress-timestamp is allowed only under --progress");
 		exit(1);
 	}
 
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6ce1c98649..82adde9b2c 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -161,7 +161,7 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
-extern bool strtodouble(const char *str, bool errorOK, double *pd);
+extern bool strtoint64(const char *str, int64 *pi);
+extern bool strtodouble(const char *str, double *pd);
 
 #endif							/* PGBENCH_H */

Fabien COELHO

coelho@cri.ensmp.fr

over 4 years ago

In reply to: Michael Paquier (#1)

Re: Some code cleanup for pgbench and pg_verifybackup

Bonjour Michaï¿œl,

My 0.02ï¿œ:

- pgbench has its own parsing routines for int64 and double, with
an option to skip errors. That's not surprising in itself, but, for
strtodouble(), errorOK is always true, meaning that the error strings
are dead.

Indeed. However, there are "atof" calls for option parsing which should
rather use strtodouble, and that may or may not call it with errorOk as
true or false, it may depend.

For strtoint64(), errorOK is false only when parsing a Variable, where a
second error string is generated.

ISTM that it just returns false, there is no message about the parsing
error, hence the message is generated in the function.

I don't really think that we need to be that pedantic about the type of
errors generated in those code paths when failing to parse a variable,
so I'd like to propose a simplification of the code where we reuse the
same error message as for double, cutting a bit the number of
translatable strings.

ISTM that point is that errors from the parser are handled differently (by
calling some "yyerror" function which do different things), so they need a
special call for that.

For other cases we would not to have to replicate generating an error
messages for each caller, so it is best done directly in the function. Ok,
currently there is only one call, but there can be more, eg I have a
not-yet submitted patch to add a new option which will need to parse an
int64.

- pgbench and pg_verifybackup make use of pg_log_fatal() to report
some failures in code paths dedicated to command-line options, which
is inconsistent with all the other tools that use pg_log_error().

The semantics for fatal vs error is that an error is somehow handled while
a fatal is not. If the log message is followed by an cold exit, ISTM that
fatal is the right call, and I cannot help if other commands do not do
that. ISTM more logical to align other commands to fatal when appropriate.

Thoughts?

I'd be in favor of letting it as it is.

--
Fabien.

Alvaro Herrera

alvherre@alvh.no-ip.org

over 4 years ago

In reply to: Fabien COELHO (#2)

Re: Some code cleanup for pgbench and pg_verifybackup

On 2021-Jul-26, Fabien COELHO wrote:

- pgbench and pg_verifybackup make use of pg_log_fatal() to report
some failures in code paths dedicated to command-line options, which
is inconsistent with all the other tools that use pg_log_error().

The semantics for fatal vs error is that an error is somehow handled while a
fatal is not. If the log message is followed by an cold exit, ISTM that
fatal is the right call, and I cannot help if other commands do not do that.
ISTM more logical to align other commands to fatal when appropriate.

I was surprised to discover a few weeks ago that pg_log_fatal() did not
terminate the program, which was my expectation. If every single call
to pg_log_fatal() is followed by exit(1), why not make pg_log_fatal()
itself exit? Apparently this coding pattern confuses many people -- for
example pg_verifybackup.c lines 291ff fail to exit(1) after "throwing" a
fatal error, while the block at lines 275 does the right thing.

(In reality we cannot literally just exit(1) in pg_log_fatal(), because
there are quite a few places that do some other thing after the log
call and before exit(1), or terminate the program in some other way than
exit().)

--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Alvaro Herrera (#3)

Re: Some code cleanup for pgbench and pg_verifybackup

On Mon, Jul 26, 2021 at 03:35:29PM -0400, Alvaro Herrera wrote:

On 2021-Jul-26, Fabien COELHO wrote:

The semantics for fatal vs error is that an error is somehow handled while a
fatal is not. If the log message is followed by an cold exit, ISTM that
fatal is the right call, and I cannot help if other commands do not do that.
ISTM more logical to align other commands to fatal when appropriate.

I disagree. pgbench is an outlier here. There are 71 calls to
pg_log_fatal() in src/bin/, and pgbench counts for 54 of them. It
would be more consistent to align pgbench with the others.

I was surprised to discover a few weeks ago that pg_log_fatal() did not
terminate the program, which was my expectation. If every single call
to pg_log_fatal() is followed by exit(1), why not make pg_log_fatal()
itself exit? Apparently this coding pattern confuses many people -- for
example pg_verifybackup.c lines 291ff fail to exit(1) after "throwing" a
fatal error, while the block at lines 275 does the right thing.

I remember having the same reaction when those logging APIs got
introduced (I may be wrong here), and I also recall that this point
has been discussed, where the conclusion was that the logging should
never exit() directly.

(In reality we cannot literally just exit(1) in pg_log_fatal(), because
there are quite a few places that do some other thing after the log
call and before exit(1), or terminate the program in some other way than
exit().)

Yes, that's obviously wrong. I am wondering if there is more of
that.
--
Michael

Fabien COELHO

coelho@cri.ensmp.fr

over 4 years ago

In reply to: Michael Paquier (#4)

Re: Some code cleanup for pgbench and pg_verifybackup

Bonjour Michaï¿œl-san,

The semantics for fatal vs error is that an error is somehow handled while a
fatal is not. If the log message is followed by an cold exit, ISTM that
fatal is the right call, and I cannot help if other commands do not do that.
ISTM more logical to align other commands to fatal when appropriate.

I disagree. pgbench is an outlier here. There are 71 calls to
pg_log_fatal() in src/bin/, and pgbench counts for 54 of them. It
would be more consistent to align pgbench with the others.

I do not understand your disagreement. Do you disagree about the expected
semantics of fatal? Then why provide fatal if it should not be used?
What is the expected usage of fatal?

I do not dispute that pgbench is a statistical outlier. However, Pgbench
is somehow special because it does not handle a lot of errors, hence a lot
of "fatal & exit" pattern is used, and the user has to restart. There are
76 calls to "exit" from pgbench, but only 23 from psql which is much
larger. ISTM that most "normal" pg programs won't do that because they are
nice and try to handle errors.

So for me "fatal" is the right choice before exiting with a non zero
status, but if "error" is called instead I do not think it matters much,
you do as you please.

I was surprised to discover a few weeks ago that pg_log_fatal() did not
terminate the program, which was my expectation.

Mine too.

--
Fabien.

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Fabien COELHO (#5)

Re: Some code cleanup for pgbench and pg_verifybackup

On Tue, Jul 27, 2021 at 06:36:15AM +0200, Fabien COELHO wrote:

I do not understand your disagreement. Do you disagree about the expected
semantics of fatal? Then why provide fatal if it should not be used?
What is the expected usage of fatal?

I disagree about the fact that pgbench uses pg_log_fatal() in ways
that other binaries don't do. For example, other things use
pg_log_error() followed by an exit(), but not this code. I am not
going to fight hard on that, though.

That's a set of inconsistences I bumped into while plugging in
option_parse_int() within pgbench.
--
Michael

Fabien COELHO

coelho@cri.ensmp.fr

over 4 years ago

In reply to: Michael Paquier (#6)

Re: Some code cleanup for pgbench and pg_verifybackup

Hello,

I do not understand your disagreement. Do you disagree about the
expected>> semantics of fatal? Then why provide fatal if it should not
be used? What is the expected usage of fatal?

I disagree about the fact that pgbench uses pg_log_fatal() in ways
that other binaries don't do.

Sure. Then what should be the expected usage of fatal? Doc says:

* Severe errors that cause program termination. (One-shot programs may
* chose to label even fatal errors as merely "errors". The distinction
* is up to the program.)

pgbench is consistent with the doc. I prefer fatal for this purpose to
distinguish these clearly from recoverable errors, i.e. the programs goes
on despite the error, or at least for some time. I think it is good to
have such a distinction, and bgpench has many errors and many fatals,
although maybe some error should be fatal and some fatal should be error…

For example, other things use pg_log_error() followed by an exit(), but
not this code.

Sure.

I am not going to fight hard on that, though.

Me neither.

That's a set of inconsistences I bumped into while plugging in
option_parse_int()

Which is a very good thing! I have already been bitten by atoi.

--
Fabien.

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Fabien COELHO (#7)

Re: Some code cleanup for pgbench and pg_verifybackup

On Tue, Jul 27, 2021 at 11:45:07AM +0200, Fabien COELHO wrote:

Sure. Then what should be the expected usage of fatal? Doc says:

* Severe errors that cause program termination. (One-shot programs may
* chose to label even fatal errors as merely "errors". The distinction
* is up to the program.)

pgbench is consistent with the doc. I prefer fatal for this purpose to
distinguish these clearly from recoverable errors, i.e. the programs goes on
despite the error, or at least for some time. I think it is good to have
such a distinction, and bgpench has many errors and many fatals, although
maybe some error should be fatal and some fatal should be error.

Hm? Incorrect option values are recoverable errors, no? The root
cause of the problem is the user. One can note that pg_log_fatal() vs
pg_log_error() results in a score of 54 vs 50 in src/bin/pgbench/, so
I am not quite sure your last statement is true.

That's a set of inconsistences I bumped into while plugging in
option_parse_int()

Which is a very good thing! I have already been bitten by atoi.

By the way, if you can write a patch that makes use of strtodouble()
for the float option parsing in pgbench with the way you see things,
I'd welcome that. This is a local change as only pgbench needs to
care about that.
--
Michael

Daniel Gustafsson

daniel@yesql.se

over 4 years ago

In reply to: Michael Paquier (#4)

Re: Some code cleanup for pgbench and pg_verifybackup

On 27 Jul 2021, at 01:53, Michael Paquier <michael@paquier.xyz> wrote:

..and I also recall that this point has been discussed, where the conclusion
was that the logging should never exit() directly.

That's a characteristic of the API which still holds IMO. If we want
something, it's better to have an explicit exit function which takes a log
string than a log function that exits.

--
Daniel Gustafsson https://vmware.com/

#10

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Michael Paquier (#4)

1 attachment(s)

Re: Some code cleanup for pgbench and pg_verifybackup

On Tue, Jul 27, 2021 at 08:53:52AM +0900, Michael Paquier wrote:

On Mon, Jul 26, 2021 at 03:35:29PM -0400, Alvaro Herrera wrote:

(In reality we cannot literally just exit(1) in pg_log_fatal(), because
there are quite a few places that do some other thing after the log
call and before exit(1), or terminate the program in some other way than
exit().)

Yes, that's obviously wrong. I am wondering if there is more of
that.

I have been looking at that. Here are some findings:
- In pg_archivecleanup, CleanupPriorWALFiles() does not bother
exit()'ing with an error code. Shouldn't we worry about reporting
that correctly? It seems strange to not inform users about paths that
would be broken, as that could bloat the archives without one knowing
about it.
- In pg_basebackup.c, ReceiveTarAndUnpackCopyChunk() would not
exit() when failing to change permissions for non-WIN32.
- pg_recvlogical is missing a failure handling for fstat(), as of
5c0de38.
- pg_verifybackup is in the wrong, as mentioned upthread.

Thoughts? All that does not seem to enter into the category of things
worth back-patching, except for pg_verifybackup.
--
Michael

Attachments:

log-fatal-exit.patchtext/x-diff; charset=us-asciiDownload

diff --git a/src/bin/pg_archivecleanup/pg_archivecleanup.c b/src/bin/pg_archivecleanup/pg_archivecleanup.c
index 12338e3bb2..6c3e7f4e01 100644
--- a/src/bin/pg_archivecleanup/pg_archivecleanup.c
+++ b/src/bin/pg_archivecleanup/pg_archivecleanup.c
@@ -151,21 +151,30 @@ CleanupPriorWALFiles(void)
 				{
 					pg_log_error("could not remove file \"%s\": %m",
 								 WALFilePath);
-					break;
+					exit(1);
 				}
 			}
 		}
 
 		if (errno)
+		{
 			pg_log_error("could not read archive location \"%s\": %m",
 						 archiveLocation);
+			exit(1);
+		}
 		if (closedir(xldir))
+		{
 			pg_log_error("could not close archive location \"%s\": %m",
 						 archiveLocation);
+			exit(1);
+		}
 	}
 	else
+	{
 		pg_log_error("could not open archive location \"%s\": %m",
 					 archiveLocation);
+		exit(1);
+	}
 }
 
 /*
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 8f69c57380..7296eb97d0 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1626,8 +1626,11 @@ ReceiveTarAndUnpackCopyChunk(size_t r, char *copybuf, void *callback_data)
 				}
 #ifndef WIN32
 				if (chmod(state->filename, (mode_t) filemode))
+				{
 					pg_log_error("could not set permissions on directory \"%s\": %m",
 								 state->filename);
+					exit(1);
+				}
 #endif
 			}
 			else if (copybuf[156] == '2')
@@ -1676,8 +1679,11 @@ ReceiveTarAndUnpackCopyChunk(size_t r, char *copybuf, void *callback_data)
 
 #ifndef WIN32
 		if (chmod(state->filename, (mode_t) filemode))
+		{
 			pg_log_error("could not set permissions on file \"%s\": %m",
 						 state->filename);
+			exit(1);
+		}
 #endif
 
 		if (state->current_len_left == 0)
diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index 1d59bf3744..ebeb12d497 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -341,7 +341,10 @@ StreamLogicalLog(void)
 			}
 
 			if (fstat(outfd, &statbuf) != 0)
+			{
 				pg_log_error("could not stat file \"%s\": %m", outfile);
+				goto error;
+			}
 
 			output_isfile = S_ISREG(statbuf.st_mode) && !isatty(outfd);
 		}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index f5ebd57a47..bb93b43093 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -304,6 +304,7 @@ main(int argc, char **argv)
 							 "but was not the same version as %s.\n"
 							 "Check your installation.",
 							 "pg_waldump", full_path, "pg_verifybackup");
+			exit(1);
 		}
 	}
 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index c51ebb8e31..55d14604c0 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -6469,7 +6469,10 @@ main(int argc, char **argv)
 
 	errno = THREAD_BARRIER_INIT(&barrier, nthreads);
 	if (errno != 0)
+	{
 		pg_log_fatal("could not initialize barrier: %m");
+		exit(1);
+	}
 
 #ifdef ENABLE_THREAD_SAFETY
 	/* start all threads but thread 0 which is executed directly later */

#11

Fabien COELHO

coelho@cri.ensmp.fr

over 4 years ago

In reply to: Michael Paquier (#10)

Re: Some code cleanup for pgbench and pg_verifybackup

[...] Thoughts?

For pgbench it is definitely ok to add the exit. For others the added
exits look reasonable, but I do not know them intimately enough to be sure
that it is the right thing to do in all cases.

All that does not seem to enter into the category of things worth
back-patching, except for pg_verifybackup.

Yes.

--
Fabien.

#12

Robert Haas

robertmhaas@gmail.com

over 4 years ago

In reply to: Michael Paquier (#10)

Re: Some code cleanup for pgbench and pg_verifybackup

On Tue, Jul 27, 2021 at 11:18 PM Michael Paquier <michael@paquier.xyz> wrote:

I have been looking at that. Here are some findings:
- In pg_archivecleanup, CleanupPriorWALFiles() does not bother
exit()'ing with an error code. Shouldn't we worry about reporting
that correctly? It seems strange to not inform users about paths that
would be broken, as that could bloat the archives without one knowing
about it.
- In pg_basebackup.c, ReceiveTarAndUnpackCopyChunk() would not
exit() when failing to change permissions for non-WIN32.
- pg_recvlogical is missing a failure handling for fstat(), as of
5c0de38.
- pg_verifybackup is in the wrong, as mentioned upthread.

Thoughts? All that does not seem to enter into the category of things
worth back-patching, except for pg_verifybackup.

I think all of these are reasonable fixes. In the case of
pg_basebackup, a chmod() failure doesn't necessarily oblige us to give
up and exit; we could presumably still write the data. But it may be
best to give up and exit. The other cases appear to be clear bugs.

--
Robert Haas
EDB: http://www.enterprisedb.com

#13

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Robert Haas (#12)

Re: Some code cleanup for pgbench and pg_verifybackup

On Wed, Jul 28, 2021 at 10:28:13AM -0400, Robert Haas wrote:

I think all of these are reasonable fixes. In the case of
pg_basebackup, a chmod() failure doesn't necessarily oblige us to give
up and exit; we could presumably still write the data. But it may be
best to give up and exit. The other cases appear to be clear bugs.

Yeah, there are advantages in both positions, still it is more natural
to me to not ignore this kind of failures. Note the inconsistency
with initdb for example. So, done.
--
Michael