pgbench's expression parsing & negative numbers

Started by Andres Freundabout 8 years ago17 messages

andres@anarazel.de

about 8 years ago

Hi,

working on overflow correctness in pg I noticed that pgbench isn't quite
there. I assume it won't matter terribly often, but the way it parses
integers makes it incorrect for, at least, the negativemost number.

It directly parses the integer input using:
{digit}+ {
yylval->ival = strtoint64(yytext);
return INTEGER_CONST;
}

but that unfortunately means that the sign is no included in the call to
strtoint64. Which in turn means you can't correctly parse PG_INT64_MIN,
because that's not representable as a positive number on a two's
complement machine (i.e. everywhere). Preliminary testing shows that
that can relatively easily fixed by just prefixing [-+]? to the relevant
exprscan.l rules. But it might also not be a bad idea to simply defer
parsing of the number until exprparse.y has its hand on the number?

There's plenty other things wrong with overflow handling too, especially
evalFunc() basically just doesn't care. I'll come up with a patch for
that sometime soon.

A third complaint I have is that the tap tests are pretty hard to parse
for humans.

Greetings,

Andres Freund

Fabien COELHO

coelho@cri.ensmp.fr

about 8 years ago

In reply to: Andres Freund (#1)

Re: pgbench's expression parsing & negative numbers

Hello Andres,

working on overflow correctness in pg I noticed that pgbench isn't quite
there. I assume it won't matter terribly often, but the way it parses
integers makes it incorrect for, at least, the negativemost number.

It directly parses the integer input using:
{digit}+ {
yylval->ival = strtoint64(yytext);
return INTEGER_CONST;
}

but that unfortunately means that the sign is no included in the call to
strtoint64. Which in turn means you can't correctly parse PG_INT64_MIN,

Indeed. For a benchmarking script which generates a pseudo random load I
do not see as a big issue, but maybe I'm missing some use case.

because that's not representable as a positive number on a two's
complement machine (i.e. everywhere). Preliminary testing shows that
that can relatively easily fixed by just prefixing [-+]? to the relevant
exprscan.l rules.

Beware that it must handle the unary/binary plus/minus case consistently:

"-123" vs "a -123" vs "a + -123"

But it might also not be a bad idea to simply defer parsing of the
number until exprparse.y has its hand on the number?

It is usually simpler to let the lexer handle constants.

There's plenty other things wrong with overflow handling too, especially
evalFunc() basically just doesn't care.

Sure.

There are some overflow checking with div and double to int cast, which
were added because of previous complaints, but which are not very useful
to me.

ISTM that it is a voluntary feature, which handles int64 operations
with a modulo overflow like C. Generating exceptions on such overflows
does not look very attractive to me.

I'll come up with a patch for that sometime soon.

I wish you could provide some motivation: why does it matter much to a
benchmarking script to handle overflows with more case?

A third complaint I have is that the tap tests are pretty hard to parse
for humans.

Probably.

Perl is pretty hard to humans to start with:-)

There is a lot of code factorization to cram many tests together, so that
one test is more or less one line and there are hundreds of them. Not sure
it would help if it was expanded.

There are a lot of regular expressions to check outputs, which does not
help.

I wanted to have the pgbench scripts outside but this has been rejected,
so the tested scripts themselves are in the middle of the perl code, which
I think has degraded the readability significantly.

--
Fabien.

Andres Freund

andres@anarazel.de

about 8 years ago

In reply to: Fabien COELHO (#2)

Re: pgbench's expression parsing & negative numbers

On 2017-12-12 07:21:21 +0100, Fabien COELHO wrote:

Hello Andres,

working on overflow correctness in pg I noticed that pgbench isn't quite
there. I assume it won't matter terribly often, but the way it parses
integers makes it incorrect for, at least, the negativemost number.

It directly parses the integer input using:
{digit}+ {
yylval->ival = strtoint64(yytext);
return INTEGER_CONST;
}

but that unfortunately means that the sign is no included in the call to
strtoint64. Which in turn means you can't correctly parse PG_INT64_MIN,

Indeed. For a benchmarking script which generates a pseudo random load I do
not see as a big issue, but maybe I'm missing some use case.

IDK, behaving correctly seems like a good idea. Also far from impossible
that that code gets propagated further. Doesn't seem like an entirely
crazy idea that somebody actually wants to benchmark boundary cases,
which might be slower and such.

because that's not representable as a positive number on a two's
complement machine (i.e. everywhere). Preliminary testing shows that
that can relatively easily fixed by just prefixing [-+]? to the relevant
exprscan.l rules.

Beware that it must handle the unary/binary plus/minus case consistently:

"-123" vs "a -123" vs "a + -123"

Which is a lot easier to solve on the parser side of things...

But it might also not be a bad idea to simply defer parsing of the
number until exprparse.y has its hand on the number?

It is usually simpler to let the lexer handle constants.

There's plenty other things wrong with overflow handling too, especially
evalFunc() basically just doesn't care.

Sure.

There are some overflow checking with div and double to int cast, which were
added because of previous complaints, but which are not very useful to me.

I think handling it inconsistently is the worst of all worlds.

ISTM that it is a voluntary feature, which handles int64 operations with a
modulo overflow like C. Generating exceptions on such overflows does not
look very attractive to me.

No, it's not really voluntary. Signed overflow isn't actually defined
behaviour in C. We kinda make it so on some platforms, but that's not
really a good idea.

I'll come up with a patch for that sometime soon.

I wish you could provide some motivation: why does it matter much to a
benchmarking script to handle overflows with more case?

a) I want to be able to compile without -fwrapv in the near
future. Which means you can't have signed overflows, because they're
undefined behaviour in C.
b) I want to be able to compile with -ftrapv /
-fsanitize=signed-integer-overflow, to be sure code is actually
correct. Currently pgbench crashes with that.
c) Randomly handling things in some but not all places is a bad idea.

A third complaint I have is that the tap tests are pretty hard to parse
for humans.

Probably.

Perl is pretty hard to humans to start with:-)

Meh.

There is a lot of code factorization to cram many tests together, so that
one test is more or less one line and there are hundreds of them. Not sure
it would help if it was expanded.

I don't think expanding it is really a problem, I think they're just
largely not well documented, formatted. With some effort the tests could
be a lot easier to read.

Greetings,

Andres Freund

Fabien COELHO

coelho@cri.ensmp.fr

about 8 years ago

In reply to: Andres Freund (#3)

Re: pgbench's expression parsing & negative numbers

Hello Andres,

There are some overflow checking with div and double to int cast, which were
added because of previous complaints, but which are not very useful to me.

I think handling it inconsistently is the worst of all worlds.

Hmmm... I cannot say that inconsistency is a good thing, that would not
be consistent:-)

My 0.02ï¿œ:

- I do not think that updating pgbench arithmetic for managing integer
overflows is worth Andres Freund time. My guess is that most
script would not trigger client-side overflows, so the change would
be a no-op in practice.

- I think that pgbench has more important defects/missing features, to
be fixed more urgently. Given the time patches spend in the cf queue,
obviously committers disagree on this one:-)

Now ISTM that it does not harm anything to add such a feature, so fine
with me. Maybe the global compiler option removal is worth the effort.

--
Fabien.

Andres Freund

andres@anarazel.de

about 8 years ago

In reply to: Fabien COELHO (#4)

Re: pgbench's expression parsing & negative numbers

Hi,

On 2017-12-14 10:41:04 +0100, Fabien COELHO wrote:

- I do not think that updating pgbench arithmetic for managing integer
overflows is worth Andres Freund time. My guess is that most
script would not trigger client-side overflows, so the change would
be a no-op in practice.

It might not be if you view it in isolation (although I'm not
convinced). The problem is that it has cost beyond pgbench. Due to
pgbench's overflow handling I can't run make check on a build that has
-ftrapv, which found several bugs already.

I'd be happy if somebody else would tackle the issue, but I don't quite
see it happening...

Greetings,

Andres Freund

Fabien COELHO

coelho@cri.ensmp.fr

about 8 years ago

In reply to: Andres Freund (#5)

Re: pgbench's expression parsing & negative numbers

Hello,

- I do not think that updating pgbench arithmetic for managing integer
overflows is worth Andres Freund time. My guess is that most
script would not trigger client-side overflows, so the change would
be a no-op in practice.

It might not be if you view it in isolation (although I'm not
convinced). The problem is that it has cost beyond pgbench. Due to
pgbench's overflow handling

Lack of?

I can't run make check on a build that has -ftrapv, which found several
bugs already.

Hmmm. You suggest that integer overflows do occur when running pgbench.

Indeed, this tap test case: "\set maxint debug(:minint - 1)"

Otherwise, some stat counters may overflow on very long runs? Although
overflowing a int64 takes some time...

I'd be happy if somebody else would tackle the issue, but I don't quite
see it happening...

I must admit that it is not very high on my may-do list. I'll review it if
it appears, though.

--
Fabien.

Fabien COELHO

coelho@cri.ensmp.fr

almost 8 years ago

In reply to: Andres Freund (#1)

Re: pgbench's expression parsing & negative numbers

Hello Andres,

working on overflow correctness in pg I noticed that pgbench isn't quite
there.

Indeed.

I assume it won't matter terribly often, but the way it parses integers
makes it incorrect for, at least, the negativemost number. [...] but
that unfortunately means that the sign is no included in the call to
strtoint64.

The strtoint64 function is indeed a mess. It does not really report errors
(more precisely, an error message is printed out, but the execution goes
on nevertheless...).

Which in turn means you can't correctly parse PG_INT64_MIN,
because that's not representable as a positive number on a two's
complement machine (i.e. everywhere). Preliminary testing shows that
that can relatively easily fixed by just prefixing [-+]? to the relevant
exprscan.l rules.

I'm not sure I get it, because then "1 -2" would be scanned as "int
signed_int", which requires to add some fine rules to the parser to handle
that as an addition, and I would be unclear about the priority handling,
eg "1 -2 * 3". But I agree that it can be made to work with transfering
the conversion to the parser and maybe some careful tweaking there.

But it might also not be a bad idea to simply defer
parsing of the number until exprparse.y has its hand on the number?

There's plenty other things wrong with overflow handling too, especially
evalFunc() basically just doesn't care.

Indeed.

Some overflow issues are easy to fix with the existing pg_*_s64_overflow
macros that you provided. It should also handle double2int64 overflows.

I have tried to trigger errors with a -fsanitize=signed-integer-overflow
compilation, without much success, but ISTM that you suggested that
pgbench does not work with that in another thread. Could you be more
precise?

I'll come up with a patch for that sometime soon.

ISTM that you have not sent any patch on the subject, otherwise I would
have reviewed it. Maybe I could do one some time later, unless you think
that I should not.

--
Fabien.

Fabien COELHO

coelho@cri.ensmp.fr

over 7 years ago

In reply to: Fabien COELHO (#7)

1 attachment(s)

Re: pgbench's expression parsing & negative numbers

Hello Andres,

I'll come up with a patch for that sometime soon.

ISTM that you have not sent any patch on the subject, otherwise I would
have reviewed it. Maybe I could do one some time later, unless you think
that I should not.

Here is a patch which detects pgbench overflows on int & double constants,
and on integer operators.

--
Fabien.

Attachments:

pgbench-overflow-2.patchtext/plain; name=pgbench-overflow-2.patchDownload

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..8c464c9d42 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -989,6 +989,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
       are <literal>FALSE</literal>.
      </para>
 
+     <para>
+      Too large or small integer and double constants, as well as
+      integer arithmetic operators (<literal>+</literal>,
+      <literal>-</literal>, <literal>*</literal> and <literal>/</literal>)
+      raise errors on overflows.
+     </para>
+
      <para>
       When no final <token>ELSE</token> clause is provided to a
       <token>CASE</token>, the default value is <literal>NULL</literal>.
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 5c1bd88128..3e82437bc8 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -192,15 +192,21 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return BOOLEAN_CONST;
 				}
 {digit}+		{
-					yylval->ival = strtoint64(yytext);
+					if (!strtoint64(yytext, true, &yylval->ival))
+						expr_yyerror_more(yyscanner, "bigint constant overflow",
+										  strdup(yytext));
 					return INTEGER_CONST;
 				}
 {digit}+(\.{digit}*)?([eE][-+]?{digit}+)?	{
-					yylval->dval = atof(yytext);
+					if (!strtodouble(yytext, true, &yylval->dval))
+						expr_yyerror_more(yyscanner, "double constant overflow",
+										  strdup(yytext));
 					return DOUBLE_CONST;
 				}
 \.{digit}+([eE][-+]?{digit}+)?	{
-					yylval->dval = atof(yytext);
+					if (!strtodouble(yytext, true, &yylval->dval))
+						expr_yyerror_more(yyscanner, "double constant overflow",
+										  strdup(yytext));
 					return DOUBLE_CONST;
 				}
 {alpha}{alnum}*	{
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f0c5149523..3f1392fc03 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,8 @@
 #include "libpq-fe.h"
 #include "portability/instr_time.h"
 
+#include "common/int.h" /* integer overflow checks */
+
 #include <ctype.h>
 #include <float.h>
 #include <limits.h>
@@ -627,19 +629,24 @@ is_an_int(const char *str)
 /*
  * strtoint64 -- convert a string to 64-bit integer
  *
- * This function is a modified version of scanint8() from
+ * This function is a slightly modified version of scanint8() from
  * src/backend/utils/adt/int8.c.
+ *
+ * If not errorOK, an error message is printed out.
  */
-int64
-strtoint64(const char *str)
+bool
+strtoint64(const char *str, bool errorOK, int64 *result)
 {
 	const char *ptr = str;
-	int64		result = 0;
-	int			sign = 1;
+	int64		tmp = 0;
+	bool		neg = false;
 
 	/*
 	 * Do our own scan, rather than relying on sscanf which might be broken
 	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
 	 */
 
 	/* skip leading spaces */
@@ -650,46 +657,80 @@ strtoint64(const char *str)
 	if (*ptr == '-')
 	{
 		ptr++;
-
-		/*
-		 * Do an explicit check for INT64_MIN.  Ugly though this is, it's
-		 * cleaner than trying to get the loop below to handle it portably.
-		 */
-		if (strncmp(ptr, "9223372036854775808", 19) == 0)
-		{
-			result = PG_INT64_MIN;
-			ptr += 19;
-			goto gotdigits;
-		}
-		sign = -1;
+		neg = true;
 	}
 	else if (*ptr == '+')
 		ptr++;
 
 	/* require at least one digit */
-	if (!isdigit((unsigned char) *ptr))
-		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		goto invalid_syntax;
 
 	/* process digits */
 	while (*ptr && isdigit((unsigned char) *ptr))
 	{
-		int64		tmp = result * 10 + (*ptr++ - '0');
+		int8		digit = (*ptr++ - '0');
 
-		if ((tmp / 10) != result)	/* overflow? */
-			fprintf(stderr, "value \"%s\" is out of range for type bigint\n", str);
-		result = tmp;
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			goto out_of_range;
 	}
 
-gotdigits:
-
 	/* allow trailing whitespace, but not other trailing chars */
 	while (*ptr != '\0' && isspace((unsigned char) *ptr))
 		ptr++;
 
-	if (*ptr != '\0')
-		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+	if (unlikely(*ptr != '\0'))
+		goto invalid_syntax;
 
-	return ((sign < 0) ? -result : result);
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			goto out_of_range;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return true;
+
+out_of_range:
+	if (!errorOK)
+		fprintf(stderr,
+				"value \"%s\" is out of range for type bigint\n", str);
+	return false;
+
+invalid_syntax:
+	/* this can't happen here, function called on int-looking strings. */
+	if (!errorOK)
+		fprintf(stderr,
+				"invalid input syntax for type bigint: \"%s\"\n",str);
+	return false;
+}
+
+/* convert string to double, detecting overflows/underflows */
+bool
+strtodouble(const char *str, bool errorOK, double *dv)
+{
+	char *end;
+
+	*dv = strtod(str, &end);
+
+	if (unlikely(errno != 0))
+	{
+		if (!errorOK)
+			fprintf(stderr,
+					"value \"%s\" is out of range for type double\n", str);
+		return false;
+	}
+
+	if (unlikely(end == str || *end != '\0'))
+	{
+		if (!errorOK)
+			fprintf(stderr,
+					"invalid input syntax for type double: \"%s\"\n",str);
+		return false;
+	}
+	return true;
 }
 
 /* random number generator: uniform distribution from min to max inclusive */
@@ -1282,20 +1323,26 @@ makeVariableValue(Variable *var)
 	}
 	else if (is_an_int(var->svalue))
 	{
-		setIntValue(&var->value, strtoint64(var->svalue));
+		/* if it looks like an int, it must be an int without overflow */
+		int64 iv;
+
+		if (!strtoint64(var->svalue, false, &iv))
+			return false;
+
+		setIntValue(&var->value, iv);
 	}
 	else						/* type should be double */
 	{
 		double		dv;
-		char		xs;
 
-		if (sscanf(var->svalue, "%lf%c", &dv, &xs) != 1)
+		if (!strtodouble(var->svalue, true, &dv))
 		{
 			fprintf(stderr,
 					"malformed variable \"%s\" value: \"%s\"\n",
 					var->name, var->svalue);
 			return false;
 		}
+
 		setDoubleValue(&var->value, dv);
 	}
 	return true;
@@ -1905,7 +1952,8 @@ evalStandardFunc(TState *thread, CState *st,
 				else			/* we have integer operands, or % */
 				{
 					int64		li,
-								ri;
+								ri,
+								res;
 
 					if (!coerceToInt(lval, &li) ||
 						!coerceToInt(rval, &ri))
@@ -1914,14 +1962,29 @@ evalStandardFunc(TState *thread, CState *st,
 					switch (func)
 					{
 						case PGBENCH_ADD:
-							setIntValue(retval, li + ri);
+							if (unlikely(pg_add_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint add out of range\n");
+								return false;
+							}
+							setIntValue(retval, res);
 							return true;
 
 						case PGBENCH_SUB:
-							setIntValue(retval, li - ri);
+							if (unlikely(pg_sub_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint sub out of range\n");
+								return false;
+							}
+							setIntValue(retval, res);
 							return true;
 
 						case PGBENCH_MUL:
+							if (unlikely(pg_mul_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint mul out of range\n");
+								return false;
+							}
 							setIntValue(retval, li * ri);
 							return true;
 
@@ -1954,9 +2017,9 @@ evalStandardFunc(TState *thread, CState *st,
 								if (func == PGBENCH_DIV)
 								{
 									/* overflow check (needed for INT64_MIN) */
-									if (li == PG_INT64_MIN)
+									if (unlikely(li == PG_INT64_MIN))
 									{
-										fprintf(stderr, "bigint out of range\n");
+										fprintf(stderr, "bigint div out of range\n");
 										return false;
 									}
 									else
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6983865b92..de50340434 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,6 +160,7 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 			 const char *cmd, const char *msg,
 			 const char *more, int col) pg_attribute_noreturn();
 
-extern int64 strtoint64(const char *str);
+extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
+extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2fc021dde7..87ac919e0a 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -255,7 +255,7 @@ COMMIT;
 # test expressions
 # command 1..3 and 23 depend on random seed which is used to call srandom.
 pgbench(
-	'--random-seed=5432 -t 1 -Dfoo=-10.1 -Dbla=false -Di=+3 -Dminint=-9223372036854775808 -Dn=null -Dt=t -Df=of -Dd=1.0',
+	'--random-seed=5432 -t 1 -Dfoo=-10.1 -Dbla=false -Di=+3 -Dn=null -Dt=t -Df=of -Dd=1.0',
 	0,
 	[ qr{type: .*/001_pgbench_expressions}, qr{processed: 1/1} ],
 	[
@@ -278,7 +278,6 @@ pgbench(
 		qr{command=15.: double 15\b},
 		qr{command=16.: double 16\b},
 		qr{command=17.: double 17\b},
-		qr{command=18.: int 9223372036854775807\b},
 		qr{command=20.: int \d\b},    # zipfian random: 1 on linux
 		qr{command=21.: double -27\b},
 		qr{command=22.: double 1024\b},
@@ -345,10 +344,9 @@ pgbench(
 \set pi debug(pi() * 4.9)
 \set d4 debug(greatest(4, 2, -1.17) * 4.0 * Ln(Exp(1.0)))
 \set d5 debug(least(-5.18, .0E0, 1.0/0) * -3.3)
--- forced overflow
-\set maxint debug(:minint - 1)
--- reset a variable
+-- reset variables
 \set i1 0
+\set d1 false
 -- yet another integer function
 \set id debug(random_zipfian(1, 9, 1.3))
 --- pow and power
@@ -601,16 +599,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		[qr{invalid variable name}], q{\set . 1}
 	],
 	[
-		'set int overflow',                   0,
-		[qr{double to int overflow for 100}], q{\set i int(1E32)}
+		'set division by zero', 0,
+		[qr{division by zero}], q{\set i 1/0}
 	],
-	[ 'set division by zero', 0, [qr{division by zero}], q{\set i 1/0} ],
-	[
-		'set bigint out of range', 0,
-		[qr{bigint out of range}], q{\set i 9223372036854775808 / -1}
-	],
-	[
-		'set undefined variable',
+	[   'set undefined variable',
 		0,
 		[qr{undefined variable "nosuchvariable"}],
 		q{\set i :nosuchvariable}
@@ -630,7 +622,7 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		'set random range too large',
 		0,
 		[qr{random range is too large}],
-		q{\set i random(-9223372036854775808, 9223372036854775807)}
+		q{\set i random(-:maxint - 1, :maxint)}
 	],
 	[
 		'set gaussian param too small',
@@ -693,6 +685,18 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		[qr{at least one argument expected}], q{\set i greatest())}
 	],
 
+	# SET: ARITHMETIC OVERFLOW DETECTION
+	[ 'set double to int overflow',                   0,
+		[ qr{double to int overflow for 100} ], q{\set i int(1E32)} ],
+	[ 'set bigint add overflow', 0,
+		[ qr{int add out} ], q{\set i (1<<62) + (1<<62)} ],
+	[ 'set bigint sub overflow', 0,
+		[ qr{int sub out} ], q{\set i 0 - (1<<62) - (1<<62) - (1<<62)} ],
+	[ 'set bigint mul overflow', 0,
+		[ qr{int mul out} ], q{\set i 2 * (1<<62)} ],
+	[ 'set bigint div out of range', 0,
+		[ qr{bigint div out of range} ], q{\set i (-:maxint - 1) / -1} ],
+
 	# SETSHELL
 	[
 		'setshell not an int',                0,
@@ -740,7 +744,7 @@ for my $e (@errors)
 	my $n = '001_pgbench_error_' . $name;
 	$n =~ s/ /_/g;
 	pgbench(
-		'-n -t 1 -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 -Dbadtrue=trueXXX -M prepared',
+		'-n -t 1 -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 -Dbadtrue=trueXXX -Dmaxint=9223372036854775807 -M prepared',
 		$status,
 		[ $status ? qr{^$} : qr{processed: 0/1} ],
 		$re,
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index a2845a583b..11e6828b32 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -270,6 +270,22 @@ my @script_tests = (
 		'endif syntax error',
 		[qr{unexpected argument in command "endif"}],
 		{ 'endif-bad.sql' => "\\if 0\n\\endif BAD\n" }
+	],
+	# overflow
+	[
+		'bigint overflow',
+		[qr{bigint constant overflow}],
+		{ 'overflow-1.sql' => "\\set i 100000000000000000000\n" }
+	],
+	[
+		'double overflow 1',
+		[qr{double constant overflow}],
+		{ 'overflow-2.sql' => "\\set d 1.0E309\n" }
+	],
+	[
+		'double overflow 2',
+		[qr{double constant overflow}],
+		{ 'overflow-2.sql' => "\\set d .1E310\n" }
 	],);
 
 for my $t (@script_tests)

Fabien COELHO

coelho@cri.ensmp.fr

over 7 years ago

In reply to: Fabien COELHO (#8)

1 attachment(s)

Re: pgbench's expression parsing & negative numbers

I'll come up with a patch for that sometime soon.

ISTM that you have not sent any patch on the subject, otherwise I would
have reviewed it. Maybe I could do one some time later, unless you think
that I should not.

Here is a patch which detects pgbench overflows on int & double constants,
and on integer operators.

... it but forgot to handle parsing min int, which was the initial focus
of this thread.

This patch does that as well by handling it as the special case between
lexer & parser (the issue being that 9223372036854775808 cannot be lexed
as an standard integer, as it is too large, and -9223372036854775808 is
really two tokens, so must be managed from the parser).

--
Fabien.

Attachments:

pgbench-overflow-3.patchtext/plain; name=pgbench-overflow-3.patchDownload

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..8c464c9d42 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -989,6 +989,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
       are <literal>FALSE</literal>.
      </para>
 
+     <para>
+      Too large or small integer and double constants, as well as
+      integer arithmetic operators (<literal>+</literal>,
+      <literal>-</literal>, <literal>*</literal> and <literal>/</literal>)
+      raise errors on overflows.
+     </para>
+
      <para>
       When no final <token>ELSE</token> clause is provided to a
       <token>CASE</token>, the default value is <literal>NULL</literal>.
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 8447e14d14..55585f0fde 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -61,7 +61,8 @@ static PgBenchExpr *make_case(yyscan_t yyscanner, PgBenchExprList *when_then_lis
 %type <bval> BOOLEAN_CONST
 %type <str> VARIABLE FUNCTION
 
-%token NULL_CONST INTEGER_CONST DOUBLE_CONST BOOLEAN_CONST VARIABLE FUNCTION
+%token NULL_CONST INTEGER_CONST MAXINT_PLUS_ONE_CONST DOUBLE_CONST
+%token BOOLEAN_CONST VARIABLE FUNCTION
 %token AND_OP OR_OP NOT_OP NE_OP LE_OP GE_OP LS_OP RS_OP IS_OP
 %token CASE_KW WHEN_KW THEN_KW ELSE_KW END_KW
 
@@ -90,6 +91,9 @@ expr: '(' expr ')'			{ $$ = $2; }
 	/* unary minus "-x" implemented as "0 - x" */
 	| '-' expr %prec UNARY	{ $$ = make_op(yyscanner, "-",
 										   make_integer_constant(0), $2); }
+	/* special minint handling, only after a unary minus */
+	| '-' MAXINT_PLUS_ONE_CONST %prec UNARY
+							{ $$ = make_integer_constant(PG_INT64_MIN); }
 	/* binary ones complement "~x" implemented as 0xffff... xor x" */
 	| '~' expr				{ $$ = make_op(yyscanner, "#",
 										   make_integer_constant(~INT64CONST(0)), $2); }
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 5c1bd88128..e8faea3857 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -191,16 +191,26 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					yylval->bval = false;
 					return BOOLEAN_CONST;
 				}
+"9223372036854775808" {
+					/* yuk: special handling for minint */
+					return MAXINT_PLUS_ONE_CONST;
+				}
 {digit}+		{
-					yylval->ival = strtoint64(yytext);
+					if (!strtoint64(yytext, true, &yylval->ival))
+						expr_yyerror_more(yyscanner, "bigint constant overflow",
+										  strdup(yytext));
 					return INTEGER_CONST;
 				}
 {digit}+(\.{digit}*)?([eE][-+]?{digit}+)?	{
-					yylval->dval = atof(yytext);
+					if (!strtodouble(yytext, true, &yylval->dval))
+						expr_yyerror_more(yyscanner, "double constant overflow",
+										  strdup(yytext));
 					return DOUBLE_CONST;
 				}
 \.{digit}+([eE][-+]?{digit}+)?	{
-					yylval->dval = atof(yytext);
+					if (!strtodouble(yytext, true, &yylval->dval))
+						expr_yyerror_more(yyscanner, "double constant overflow",
+										  strdup(yytext));
 					return DOUBLE_CONST;
 				}
 {alpha}{alnum}*	{
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c089..39cf429224 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,8 @@
 #include "libpq-fe.h"
 #include "portability/instr_time.h"
 
+#include "common/int.h" /* integer overflow checks */
+
 #include <ctype.h>
 #include <float.h>
 #include <limits.h>
@@ -627,19 +629,24 @@ is_an_int(const char *str)
 /*
  * strtoint64 -- convert a string to 64-bit integer
  *
- * This function is a modified version of scanint8() from
+ * This function is a slightly modified version of scanint8() from
  * src/backend/utils/adt/int8.c.
+ *
+ * If not errorOK, an error message is printed out.
  */
-int64
-strtoint64(const char *str)
+bool
+strtoint64(const char *str, bool errorOK, int64 *result)
 {
 	const char *ptr = str;
-	int64		result = 0;
-	int			sign = 1;
+	int64		tmp = 0;
+	bool		neg = false;
 
 	/*
 	 * Do our own scan, rather than relying on sscanf which might be broken
 	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
 	 */
 
 	/* skip leading spaces */
@@ -650,46 +657,80 @@ strtoint64(const char *str)
 	if (*ptr == '-')
 	{
 		ptr++;
-
-		/*
-		 * Do an explicit check for INT64_MIN.  Ugly though this is, it's
-		 * cleaner than trying to get the loop below to handle it portably.
-		 */
-		if (strncmp(ptr, "9223372036854775808", 19) == 0)
-		{
-			result = PG_INT64_MIN;
-			ptr += 19;
-			goto gotdigits;
-		}
-		sign = -1;
+		neg = true;
 	}
 	else if (*ptr == '+')
 		ptr++;
 
 	/* require at least one digit */
-	if (!isdigit((unsigned char) *ptr))
-		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		goto invalid_syntax;
 
 	/* process digits */
 	while (*ptr && isdigit((unsigned char) *ptr))
 	{
-		int64		tmp = result * 10 + (*ptr++ - '0');
+		int8		digit = (*ptr++ - '0');
 
-		if ((tmp / 10) != result)	/* overflow? */
-			fprintf(stderr, "value \"%s\" is out of range for type bigint\n", str);
-		result = tmp;
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			goto out_of_range;
 	}
 
-gotdigits:
-
 	/* allow trailing whitespace, but not other trailing chars */
 	while (*ptr != '\0' && isspace((unsigned char) *ptr))
 		ptr++;
 
-	if (*ptr != '\0')
-		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+	if (unlikely(*ptr != '\0'))
+		goto invalid_syntax;
 
-	return ((sign < 0) ? -result : result);
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			goto out_of_range;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return true;
+
+out_of_range:
+	if (!errorOK)
+		fprintf(stderr,
+				"value \"%s\" is out of range for type bigint\n", str);
+	return false;
+
+invalid_syntax:
+	/* this can't happen here, function called on int-looking strings. */
+	if (!errorOK)
+		fprintf(stderr,
+				"invalid input syntax for type bigint: \"%s\"\n",str);
+	return false;
+}
+
+/* convert string to double, detecting overflows/underflows */
+bool
+strtodouble(const char *str, bool errorOK, double *dv)
+{
+	char *end;
+
+	*dv = strtod(str, &end);
+
+	if (unlikely(errno != 0))
+	{
+		if (!errorOK)
+			fprintf(stderr,
+					"value \"%s\" is out of range for type double\n", str);
+		return false;
+	}
+
+	if (unlikely(end == str || *end != '\0'))
+	{
+		if (!errorOK)
+			fprintf(stderr,
+					"invalid input syntax for type double: \"%s\"\n",str);
+		return false;
+	}
+	return true;
 }
 
 /* random number generator: uniform distribution from min to max inclusive */
@@ -1282,20 +1323,26 @@ makeVariableValue(Variable *var)
 	}
 	else if (is_an_int(var->svalue))
 	{
-		setIntValue(&var->value, strtoint64(var->svalue));
+		/* if it looks like an int, it must be an int without overflow */
+		int64 iv;
+
+		if (!strtoint64(var->svalue, false, &iv))
+			return false;
+
+		setIntValue(&var->value, iv);
 	}
 	else						/* type should be double */
 	{
 		double		dv;
-		char		xs;
 
-		if (sscanf(var->svalue, "%lf%c", &dv, &xs) != 1)
+		if (!strtodouble(var->svalue, true, &dv))
 		{
 			fprintf(stderr,
 					"malformed variable \"%s\" value: \"%s\"\n",
 					var->name, var->svalue);
 			return false;
 		}
+
 		setDoubleValue(&var->value, dv);
 	}
 	return true;
@@ -1905,7 +1952,8 @@ evalStandardFunc(TState *thread, CState *st,
 				else			/* we have integer operands, or % */
 				{
 					int64		li,
-								ri;
+								ri,
+								res;
 
 					if (!coerceToInt(lval, &li) ||
 						!coerceToInt(rval, &ri))
@@ -1914,14 +1962,29 @@ evalStandardFunc(TState *thread, CState *st,
 					switch (func)
 					{
 						case PGBENCH_ADD:
-							setIntValue(retval, li + ri);
+							if (unlikely(pg_add_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint add out of range\n");
+								return false;
+							}
+							setIntValue(retval, res);
 							return true;
 
 						case PGBENCH_SUB:
-							setIntValue(retval, li - ri);
+							if (unlikely(pg_sub_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint sub out of range\n");
+								return false;
+							}
+							setIntValue(retval, res);
 							return true;
 
 						case PGBENCH_MUL:
+							if (unlikely(pg_mul_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint mul out of range\n");
+								return false;
+							}
 							setIntValue(retval, li * ri);
 							return true;
 
@@ -1954,9 +2017,9 @@ evalStandardFunc(TState *thread, CState *st,
 								if (func == PGBENCH_DIV)
 								{
 									/* overflow check (needed for INT64_MIN) */
-									if (li == PG_INT64_MIN)
+									if (unlikely(li == PG_INT64_MIN))
 									{
-										fprintf(stderr, "bigint out of range\n");
+										fprintf(stderr, "bigint div out of range\n");
 										return false;
 									}
 									else
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6983865b92..de50340434 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,6 +160,7 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 			 const char *cmd, const char *msg,
 			 const char *more, int col) pg_attribute_noreturn();
 
-extern int64 strtoint64(const char *str);
+extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
+extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2fc021dde7..d972903f2a 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -255,7 +255,7 @@ COMMIT;
 # test expressions
 # command 1..3 and 23 depend on random seed which is used to call srandom.
 pgbench(
-	'--random-seed=5432 -t 1 -Dfoo=-10.1 -Dbla=false -Di=+3 -Dminint=-9223372036854775808 -Dn=null -Dt=t -Df=of -Dd=1.0',
+	'--random-seed=5432 -t 1 -Dfoo=-10.1 -Dbla=false -Di=+3 -Dn=null -Dt=t -Df=of -Dd=1.0',
 	0,
 	[ qr{type: .*/001_pgbench_expressions}, qr{processed: 1/1} ],
 	[
@@ -278,7 +278,6 @@ pgbench(
 		qr{command=15.: double 15\b},
 		qr{command=16.: double 16\b},
 		qr{command=17.: double 17\b},
-		qr{command=18.: int 9223372036854775807\b},
 		qr{command=20.: int \d\b},    # zipfian random: 1 on linux
 		qr{command=21.: double -27\b},
 		qr{command=22.: double 1024\b},
@@ -322,6 +321,8 @@ pgbench(
 		qr{command=96.: int 1\b},       # :scale
 		qr{command=97.: int 0\b},       # :client_id
 		qr{command=98.: int 5432\b},    # :random_seed
+		qr{command=99.: int -9223372036854775808\b},    # min int
+		qr{command=100.: int 9223372036854775807\b},    # max int
 	],
 	'pgbench expressions',
 	{
@@ -345,10 +346,9 @@ pgbench(
 \set pi debug(pi() * 4.9)
 \set d4 debug(greatest(4, 2, -1.17) * 4.0 * Ln(Exp(1.0)))
 \set d5 debug(least(-5.18, .0E0, 1.0/0) * -3.3)
--- forced overflow
-\set maxint debug(:minint - 1)
--- reset a variable
+-- reset variables
 \set i1 0
+\set d1 false
 -- yet another integer function
 \set id debug(random_zipfian(1, 9, 1.3))
 --- pow and power
@@ -447,6 +447,9 @@ SELECT :v0, :v1, :v2, :v3;
 \set sc debug(:scale)
 \set ci debug(:client_id)
 \set rs debug(:random_seed)
+-- minint constant parsing
+\set min debug(-9223372036854775808)
+\set max debug(-(:min + 1))
 }
 	});
 
@@ -601,16 +604,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		[qr{invalid variable name}], q{\set . 1}
 	],
 	[
-		'set int overflow',                   0,
-		[qr{double to int overflow for 100}], q{\set i int(1E32)}
+		'set division by zero', 0,
+		[qr{division by zero}], q{\set i 1/0}
 	],
-	[ 'set division by zero', 0, [qr{division by zero}], q{\set i 1/0} ],
-	[
-		'set bigint out of range', 0,
-		[qr{bigint out of range}], q{\set i 9223372036854775808 / -1}
-	],
-	[
-		'set undefined variable',
+	[   'set undefined variable',
 		0,
 		[qr{undefined variable "nosuchvariable"}],
 		q{\set i :nosuchvariable}
@@ -630,7 +627,7 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		'set random range too large',
 		0,
 		[qr{random range is too large}],
-		q{\set i random(-9223372036854775808, 9223372036854775807)}
+		q{\set i random(:minint, :maxint)}
 	],
 	[
 		'set gaussian param too small',
@@ -693,6 +690,18 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		[qr{at least one argument expected}], q{\set i greatest())}
 	],
 
+	# SET: ARITHMETIC OVERFLOW DETECTION
+	[ 'set double to int overflow',                   0,
+		[ qr{double to int overflow for 100} ], q{\set i int(1E32)} ],
+	[ 'set bigint add overflow', 0,
+		[ qr{int add out} ], q{\set i (1<<62) + (1<<62)} ],
+	[ 'set bigint sub overflow', 0,
+		[ qr{int sub out} ], q{\set i 0 - (1<<62) - (1<<62) - (1<<62)} ],
+	[ 'set bigint mul overflow', 0,
+		[ qr{int mul out} ], q{\set i 2 * (1<<62)} ],
+	[ 'set bigint div out of range', 0,
+		[ qr{bigint div out of range} ], q{\set i :minint / -1} ],
+
 	# SETSHELL
 	[
 		'setshell not an int',                0,
@@ -740,7 +749,8 @@ for my $e (@errors)
 	my $n = '001_pgbench_error_' . $name;
 	$n =~ s/ /_/g;
 	pgbench(
-		'-n -t 1 -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 -Dbadtrue=trueXXX -M prepared',
+		'-n -t 1 -M prepared -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 ' .
+		'-Dbadtrue=trueXXX -Dmaxint=9223372036854775807 -Dminint=-9223372036854775808',
 		$status,
 		[ $status ? qr{^$} : qr{processed: 0/1} ],
 		$re,
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index a2845a583b..679c1de6b7 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -270,6 +270,22 @@ my @script_tests = (
 		'endif syntax error',
 		[qr{unexpected argument in command "endif"}],
 		{ 'endif-bad.sql' => "\\if 0\n\\endif BAD\n" }
+	],
+	# overflow
+	[
+		'bigint overflow 1',
+		[qr{bigint constant overflow}],
+		{ 'overflow-1.sql' => "\\set i 100000000000000000000\n" }
+	],
+	[
+		'double overflow 2',
+		[qr{double constant overflow}],
+		{ 'overflow-2.sql' => "\\set d 1.0E309\n" }
+	],
+	[
+		'double overflow 3',
+		[qr{double constant overflow}],
+		{ 'overflow-3.sql' => "\\set d .1E310\n" }
 	],);
 
 for my $t (@script_tests)

#10

Ibrar Ahmed

ibrar.ahmad@gmail.com

over 7 years ago

In reply to: Fabien COELHO (#9)

Re: pgbench's expression parsing & negative numbers

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested

Patch does not apply cleanly on the master branch, anyways I managed that. Patch work according to specs, and no issue found.
The only minor nit is that you can keep the full comments of function strtoint64

/*
* If not errorOK, an error message is printed out.
* If errorOK is true, just return "false" for bad input.
*/

#11

Fabien COELHO

coelho@cri.ensmp.fr

over 7 years ago

In reply to: Ibrar Ahmed (#10)

1 attachment(s)

Re: pgbench's expression parsing & negative numbers

Hello,

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested

Patch does not apply cleanly on the master branch, anyways I managed that. Patch work according to specs, and no issue found.
The only minor nit is that you can keep the full comments of function strtoint64

/*
* If not errorOK, an error message is printed out.
* If errorOK is true, just return "false" for bad input.
*/

Thanks for the review.

Attached is a v4, with improved comments on strtoint64 as you requested.
I also added 2 "unlikely" compiler directives.

--
Fabien.

Attachments:

pgbench-overflow-4.patchtext/plain; name=pgbench-overflow-4.patchDownload

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..8c464c9d42 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -989,6 +989,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
       are <literal>FALSE</literal>.
      </para>
 
+     <para>
+      Too large or small integer and double constants, as well as
+      integer arithmetic operators (<literal>+</literal>,
+      <literal>-</literal>, <literal>*</literal> and <literal>/</literal>)
+      raise errors on overflows.
+     </para>
+
      <para>
       When no final <token>ELSE</token> clause is provided to a
       <token>CASE</token>, the default value is <literal>NULL</literal>.
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index f7c56cc6a3..bf2509f19b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -61,7 +61,8 @@ static PgBenchExpr *make_case(yyscan_t yyscanner, PgBenchExprList *when_then_lis
 %type <bval> BOOLEAN_CONST
 %type <str> VARIABLE FUNCTION
 
-%token NULL_CONST INTEGER_CONST DOUBLE_CONST BOOLEAN_CONST VARIABLE FUNCTION
+%token NULL_CONST INTEGER_CONST MAXINT_PLUS_ONE_CONST DOUBLE_CONST
+%token BOOLEAN_CONST VARIABLE FUNCTION
 %token AND_OP OR_OP NOT_OP NE_OP LE_OP GE_OP LS_OP RS_OP IS_OP
 %token CASE_KW WHEN_KW THEN_KW ELSE_KW END_KW
 
@@ -90,6 +91,9 @@ expr: '(' expr ')'			{ $$ = $2; }
 	/* unary minus "-x" implemented as "0 - x" */
 	| '-' expr %prec UNARY	{ $$ = make_op(yyscanner, "-",
 										   make_integer_constant(0), $2); }
+	/* special minint handling, only after a unary minus */
+	| '-' MAXINT_PLUS_ONE_CONST %prec UNARY
+							{ $$ = make_integer_constant(PG_INT64_MIN); }
 	/* binary ones complement "~x" implemented as 0xffff... xor x" */
 	| '~' expr				{ $$ = make_op(yyscanner, "#",
 										   make_integer_constant(~INT64CONST(0)), $2); }
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 5c1bd88128..e8faea3857 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -191,16 +191,26 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					yylval->bval = false;
 					return BOOLEAN_CONST;
 				}
+"9223372036854775808" {
+					/* yuk: special handling for minint */
+					return MAXINT_PLUS_ONE_CONST;
+				}
 {digit}+		{
-					yylval->ival = strtoint64(yytext);
+					if (!strtoint64(yytext, true, &yylval->ival))
+						expr_yyerror_more(yyscanner, "bigint constant overflow",
+										  strdup(yytext));
 					return INTEGER_CONST;
 				}
 {digit}+(\.{digit}*)?([eE][-+]?{digit}+)?	{
-					yylval->dval = atof(yytext);
+					if (!strtodouble(yytext, true, &yylval->dval))
+						expr_yyerror_more(yyscanner, "double constant overflow",
+										  strdup(yytext));
 					return DOUBLE_CONST;
 				}
 \.{digit}+([eE][-+]?{digit}+)?	{
-					yylval->dval = atof(yytext);
+					if (!strtodouble(yytext, true, &yylval->dval))
+						expr_yyerror_more(yyscanner, "double constant overflow",
+										  strdup(yytext));
 					return DOUBLE_CONST;
 				}
 {alpha}{alnum}*	{
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c089..bbcd541110 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,8 @@
 #include "libpq-fe.h"
 #include "portability/instr_time.h"
 
+#include "common/int.h" /* integer overflow checks */
+
 #include <ctype.h>
 #include <float.h>
 #include <limits.h>
@@ -627,19 +629,27 @@ is_an_int(const char *str)
 /*
  * strtoint64 -- convert a string to 64-bit integer
  *
- * This function is a modified version of scanint8() from
+ * This function is a slightly modified version of scanint8() from
  * src/backend/utils/adt/int8.c.
+ *
+ * The function returns whether the conversion worked, and if so
+ * "*result" is set to the result.
+ *
+ * If not errorOK, an error message is also printed out on errors.
  */
-int64
-strtoint64(const char *str)
+bool
+strtoint64(const char *str, bool errorOK, int64 *result)
 {
 	const char *ptr = str;
-	int64		result = 0;
-	int			sign = 1;
+	int64		tmp = 0;
+	bool		neg = false;
 
 	/*
 	 * Do our own scan, rather than relying on sscanf which might be broken
 	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
 	 */
 
 	/* skip leading spaces */
@@ -650,46 +660,80 @@ strtoint64(const char *str)
 	if (*ptr == '-')
 	{
 		ptr++;
-
-		/*
-		 * Do an explicit check for INT64_MIN.  Ugly though this is, it's
-		 * cleaner than trying to get the loop below to handle it portably.
-		 */
-		if (strncmp(ptr, "9223372036854775808", 19) == 0)
-		{
-			result = PG_INT64_MIN;
-			ptr += 19;
-			goto gotdigits;
-		}
-		sign = -1;
+		neg = true;
 	}
 	else if (*ptr == '+')
 		ptr++;
 
 	/* require at least one digit */
-	if (!isdigit((unsigned char) *ptr))
-		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		goto invalid_syntax;
 
 	/* process digits */
 	while (*ptr && isdigit((unsigned char) *ptr))
 	{
-		int64		tmp = result * 10 + (*ptr++ - '0');
+		int8		digit = (*ptr++ - '0');
 
-		if ((tmp / 10) != result)	/* overflow? */
-			fprintf(stderr, "value \"%s\" is out of range for type bigint\n", str);
-		result = tmp;
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			goto out_of_range;
 	}
 
-gotdigits:
-
 	/* allow trailing whitespace, but not other trailing chars */
 	while (*ptr != '\0' && isspace((unsigned char) *ptr))
 		ptr++;
 
-	if (*ptr != '\0')
-		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+	if (unlikely(*ptr != '\0'))
+		goto invalid_syntax;
 
-	return ((sign < 0) ? -result : result);
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			goto out_of_range;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return true;
+
+out_of_range:
+	if (!errorOK)
+		fprintf(stderr,
+				"value \"%s\" is out of range for type bigint\n", str);
+	return false;
+
+invalid_syntax:
+	/* this can't happen here, function called on int-looking strings. */
+	if (!errorOK)
+		fprintf(stderr,
+				"invalid input syntax for type bigint: \"%s\"\n",str);
+	return false;
+}
+
+/* convert string to double, detecting overflows/underflows */
+bool
+strtodouble(const char *str, bool errorOK, double *dv)
+{
+	char *end;
+
+	*dv = strtod(str, &end);
+
+	if (unlikely(errno != 0))
+	{
+		if (!errorOK)
+			fprintf(stderr,
+					"value \"%s\" is out of range for type double\n", str);
+		return false;
+	}
+
+	if (unlikely(end == str || *end != '\0'))
+	{
+		if (!errorOK)
+			fprintf(stderr,
+					"invalid input syntax for type double: \"%s\"\n",str);
+		return false;
+	}
+	return true;
 }
 
 /* random number generator: uniform distribution from min to max inclusive */
@@ -1282,20 +1326,26 @@ makeVariableValue(Variable *var)
 	}
 	else if (is_an_int(var->svalue))
 	{
-		setIntValue(&var->value, strtoint64(var->svalue));
+		/* if it looks like an int, it must be an int without overflow */
+		int64 iv;
+
+		if (unlikely(!strtoint64(var->svalue, false, &iv)))
+			return false;
+
+		setIntValue(&var->value, iv);
 	}
 	else						/* type should be double */
 	{
 		double		dv;
-		char		xs;
 
-		if (sscanf(var->svalue, "%lf%c", &dv, &xs) != 1)
+		if (unlikely(!strtodouble(var->svalue, true, &dv)))
 		{
 			fprintf(stderr,
 					"malformed variable \"%s\" value: \"%s\"\n",
 					var->name, var->svalue);
 			return false;
 		}
+
 		setDoubleValue(&var->value, dv);
 	}
 	return true;
@@ -1905,7 +1955,8 @@ evalStandardFunc(TState *thread, CState *st,
 				else			/* we have integer operands, or % */
 				{
 					int64		li,
-								ri;
+								ri,
+								res;
 
 					if (!coerceToInt(lval, &li) ||
 						!coerceToInt(rval, &ri))
@@ -1914,14 +1965,29 @@ evalStandardFunc(TState *thread, CState *st,
 					switch (func)
 					{
 						case PGBENCH_ADD:
-							setIntValue(retval, li + ri);
+							if (unlikely(pg_add_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint add out of range\n");
+								return false;
+							}
+							setIntValue(retval, res);
 							return true;
 
 						case PGBENCH_SUB:
-							setIntValue(retval, li - ri);
+							if (unlikely(pg_sub_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint sub out of range\n");
+								return false;
+							}
+							setIntValue(retval, res);
 							return true;
 
 						case PGBENCH_MUL:
+							if (unlikely(pg_mul_s64_overflow(li, ri, &res)))
+							{
+								fprintf(stderr, "bigint mul out of range\n");
+								return false;
+							}
 							setIntValue(retval, li * ri);
 							return true;
 
@@ -1954,9 +2020,9 @@ evalStandardFunc(TState *thread, CState *st,
 								if (func == PGBENCH_DIV)
 								{
 									/* overflow check (needed for INT64_MIN) */
-									if (li == PG_INT64_MIN)
+									if (unlikely(li == PG_INT64_MIN))
 									{
-										fprintf(stderr, "bigint out of range\n");
+										fprintf(stderr, "bigint div out of range\n");
 										return false;
 									}
 									else
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6983865b92..de50340434 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,6 +160,7 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 			 const char *cmd, const char *msg,
 			 const char *more, int col) pg_attribute_noreturn();
 
-extern int64 strtoint64(const char *str);
+extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
+extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2fc021dde7..d972903f2a 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -255,7 +255,7 @@ COMMIT;
 # test expressions
 # command 1..3 and 23 depend on random seed which is used to call srandom.
 pgbench(
-	'--random-seed=5432 -t 1 -Dfoo=-10.1 -Dbla=false -Di=+3 -Dminint=-9223372036854775808 -Dn=null -Dt=t -Df=of -Dd=1.0',
+	'--random-seed=5432 -t 1 -Dfoo=-10.1 -Dbla=false -Di=+3 -Dn=null -Dt=t -Df=of -Dd=1.0',
 	0,
 	[ qr{type: .*/001_pgbench_expressions}, qr{processed: 1/1} ],
 	[
@@ -278,7 +278,6 @@ pgbench(
 		qr{command=15.: double 15\b},
 		qr{command=16.: double 16\b},
 		qr{command=17.: double 17\b},
-		qr{command=18.: int 9223372036854775807\b},
 		qr{command=20.: int \d\b},    # zipfian random: 1 on linux
 		qr{command=21.: double -27\b},
 		qr{command=22.: double 1024\b},
@@ -322,6 +321,8 @@ pgbench(
 		qr{command=96.: int 1\b},       # :scale
 		qr{command=97.: int 0\b},       # :client_id
 		qr{command=98.: int 5432\b},    # :random_seed
+		qr{command=99.: int -9223372036854775808\b},    # min int
+		qr{command=100.: int 9223372036854775807\b},    # max int
 	],
 	'pgbench expressions',
 	{
@@ -345,10 +346,9 @@ pgbench(
 \set pi debug(pi() * 4.9)
 \set d4 debug(greatest(4, 2, -1.17) * 4.0 * Ln(Exp(1.0)))
 \set d5 debug(least(-5.18, .0E0, 1.0/0) * -3.3)
--- forced overflow
-\set maxint debug(:minint - 1)
--- reset a variable
+-- reset variables
 \set i1 0
+\set d1 false
 -- yet another integer function
 \set id debug(random_zipfian(1, 9, 1.3))
 --- pow and power
@@ -447,6 +447,9 @@ SELECT :v0, :v1, :v2, :v3;
 \set sc debug(:scale)
 \set ci debug(:client_id)
 \set rs debug(:random_seed)
+-- minint constant parsing
+\set min debug(-9223372036854775808)
+\set max debug(-(:min + 1))
 }
 	});
 
@@ -601,16 +604,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		[qr{invalid variable name}], q{\set . 1}
 	],
 	[
-		'set int overflow',                   0,
-		[qr{double to int overflow for 100}], q{\set i int(1E32)}
+		'set division by zero', 0,
+		[qr{division by zero}], q{\set i 1/0}
 	],
-	[ 'set division by zero', 0, [qr{division by zero}], q{\set i 1/0} ],
-	[
-		'set bigint out of range', 0,
-		[qr{bigint out of range}], q{\set i 9223372036854775808 / -1}
-	],
-	[
-		'set undefined variable',
+	[   'set undefined variable',
 		0,
 		[qr{undefined variable "nosuchvariable"}],
 		q{\set i :nosuchvariable}
@@ -630,7 +627,7 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		'set random range too large',
 		0,
 		[qr{random range is too large}],
-		q{\set i random(-9223372036854775808, 9223372036854775807)}
+		q{\set i random(:minint, :maxint)}
 	],
 	[
 		'set gaussian param too small',
@@ -693,6 +690,18 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
 		[qr{at least one argument expected}], q{\set i greatest())}
 	],
 
+	# SET: ARITHMETIC OVERFLOW DETECTION
+	[ 'set double to int overflow',                   0,
+		[ qr{double to int overflow for 100} ], q{\set i int(1E32)} ],
+	[ 'set bigint add overflow', 0,
+		[ qr{int add out} ], q{\set i (1<<62) + (1<<62)} ],
+	[ 'set bigint sub overflow', 0,
+		[ qr{int sub out} ], q{\set i 0 - (1<<62) - (1<<62) - (1<<62)} ],
+	[ 'set bigint mul overflow', 0,
+		[ qr{int mul out} ], q{\set i 2 * (1<<62)} ],
+	[ 'set bigint div out of range', 0,
+		[ qr{bigint div out of range} ], q{\set i :minint / -1} ],
+
 	# SETSHELL
 	[
 		'setshell not an int',                0,
@@ -740,7 +749,8 @@ for my $e (@errors)
 	my $n = '001_pgbench_error_' . $name;
 	$n =~ s/ /_/g;
 	pgbench(
-		'-n -t 1 -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 -Dbadtrue=trueXXX -M prepared',
+		'-n -t 1 -M prepared -Dfoo=bla -Dnull=null -Dtrue=true -Done=1 -Dzero=0.0 ' .
+		'-Dbadtrue=trueXXX -Dmaxint=9223372036854775807 -Dminint=-9223372036854775808',
 		$status,
 		[ $status ? qr{^$} : qr{processed: 0/1} ],
 		$re,
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index c1c2c1e3d4..696c378edc 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -290,6 +290,22 @@ my @script_tests = (
 		'too many arguments for hash',
 		[qr{unexpected number of arguments \(hash\)}],
 		{ 'bad-hash-2.sql' => "\\set i hash(1,2,3)\n" }
+	],
+	# overflow
+	[
+		'bigint overflow 1',
+		[qr{bigint constant overflow}],
+		{ 'overflow-1.sql' => "\\set i 100000000000000000000\n" }
+	],
+	[
+		'double overflow 2',
+		[qr{double constant overflow}],
+		{ 'overflow-2.sql' => "\\set d 1.0E309\n" }
+	],
+	[
+		'double overflow 3',
+		[qr{double constant overflow}],
+		{ 'overflow-3.sql' => "\\set d .1E310\n" }
 	],);
 
 for my $t (@script_tests)

#12

Ibrar Ahmed

ibrar.ahmad@gmail.com

over 7 years ago

In reply to: Fabien COELHO (#11)

Re: pgbench's expression parsing & negative numbers

Patch is good to go from my side.

The new status of this patch is: Ready for Committer

#13

Fabien COELHO

coelho@cri.ensmp.fr

over 7 years ago

In reply to: Ibrar Ahmed (#12)

Re: pgbench's expression parsing & negative numbers

Hello Ibrar,

The new status of this patch is: Ready for Committer

Ok, thanks. Let's see what committers think about it.

Andres, are you still interested in overflow detection and handling?

--
Fabien.

#14

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Fabien COELHO (#13)

Re: pgbench's expression parsing & negative numbers

Hi,

On 2018-08-16 17:28:06 +0200, Fabien COELHO wrote:

The new status of this patch is: Ready for Committer

Ok, thanks. Let's see what committers think about it.

Andres, are you still interested in overflow detection and handling?

Yes. I'll try to look at it at some point not too far away.

- Andres

#15

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Fabien COELHO (#11)

Re: pgbench's expression parsing & negative numbers

Hi,

On 2018-08-10 10:24:29 +0200, Fabien COELHO wrote:

diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 5c1bd88128..e8faea3857 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -191,16 +191,26 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
yylval->bval = false;
return BOOLEAN_CONST;
}
+"9223372036854775808" {
+					/* yuk: special handling for minint */
+					return MAXINT_PLUS_ONE_CONST;
+				}

Yikes, do we really need this? If we dealt with - in the lexer, we
shouldn't need it, no?

/*
* strtoint64 -- convert a string to 64-bit integer
*
- * This function is a modified version of scanint8() from
+ * This function is a slightly modified version of scanint8() from
* src/backend/utils/adt/int8.c.
+ *
+ * The function returns whether the conversion worked, and if so
+ * "*result" is set to the result.
+ *
+ * If not errorOK, an error message is also printed out on errors.
*/
-int64
-strtoint64(const char *str)
+bool
+strtoint64(const char *str, bool errorOK, int64 *result)
{
const char *ptr = str;
-	int64		result = 0;
-	int			sign = 1;
+	int64		tmp = 0;
+	bool		neg = false;

/*
* Do our own scan, rather than relying on sscanf which might be broken
* for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
*/

/* skip leading spaces */
@@ -650,46 +660,80 @@ strtoint64(const char *str)
if (*ptr == '-')
{
ptr++;
-
-		/*
-		 * Do an explicit check for INT64_MIN.  Ugly though this is, it's
-		 * cleaner than trying to get the loop below to handle it portably.
-		 */
-		if (strncmp(ptr, "9223372036854775808", 19) == 0)
-		{
-			result = PG_INT64_MIN;
-			ptr += 19;
-			goto gotdigits;
-		}
-		sign = -1;
+		neg = true;
}
else if (*ptr == '+')
ptr++;

/* require at least one digit */
-	if (!isdigit((unsigned char) *ptr))
-		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		goto invalid_syntax;

/* process digits */
while (*ptr && isdigit((unsigned char) *ptr))
{
-		int64		tmp = result * 10 + (*ptr++ - '0');
+		int8		digit = (*ptr++ - '0');

-		if ((tmp / 10) != result)	/* overflow? */
-			fprintf(stderr, "value \"%s\" is out of range for type bigint\n", str);
-		result = tmp;
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			goto out_of_range;
}

-gotdigits:
-
/* allow trailing whitespace, but not other trailing chars */
while (*ptr != '\0' && isspace((unsigned char) *ptr))
ptr++;

-	if (*ptr != '\0')
-		fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
+	if (unlikely(*ptr != '\0'))
+		goto invalid_syntax;

-	return ((sign < 0) ? -result : result);
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			goto out_of_range;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return true;
+
+out_of_range:
+	if (!errorOK)
+		fprintf(stderr,
+				"value \"%s\" is out of range for type bigint\n", str);
+	return false;
+
+invalid_syntax:
+	/* this can't happen here, function called on int-looking strings. */

This comment doesn't strike me as a good idea, it's almost guaranteed to
be outdated at some point.

+	if (!errorOK)
+		fprintf(stderr,
+				"invalid input syntax for type bigint: \"%s\"\n",str);
+	return false;
+}

Duplicating these routines is pretty ugly. I really wish we had more
infrastructure to avoid this (in particularly a portable way to report
an error and jump out). But probably out of scope for this patches?

+/* convert string to double, detecting overflows/underflows */
+bool
+strtodouble(const char *str, bool errorOK, double *dv)
+{
+	char *end;
+
+	*dv = strtod(str, &end);
+
+	if (unlikely(errno != 0))
+	{
+		if (!errorOK)
+			fprintf(stderr,
+					"value \"%s\" is out of range for type double\n", str);
+		return false;
+	}
+
+	if (unlikely(end == str || *end != '\0'))
+	{
+		if (!errorOK)
+			fprintf(stderr,
+					"invalid input syntax for type double: \"%s\"\n",str);
+		return false;
+	}
+	return true;
}

Not sure I see much point in the unlikelys here, contrasting to the ones
in the backend (and the copy for the frontend) it's extremely unlikley
anything like this will ever be close to a bottleneck. In general, I'd
strongly suggest not placing unlikelys in non-critical codepaths -
they're way too often wrongly estimated.

Greetings,

Andres Freund

#16

Fabien COELHO

coelho@cri.ensmp.fr

over 7 years ago

In reply to: Andres Freund (#15)

Re: pgbench's expression parsing & negative numbers

Hello Andres,

+"9223372036854775808" {
+					/* yuk: special handling for minint */
+					return MAXINT_PLUS_ONE_CONST;
+				}
Yikes, do we really need this? If we dealt with - in the lexer, we
shouldn't need it, no?

I'm not sure how to handle unary minus constant in the lexer, given that
the same '-' character is also used as minus binary operator.

The proposed solution is functional and has a reduce overall impact (one
lexer and one parser rules added), so it looks good to me.

Probably other approaches are possible.

+ /* this can't happen here, function called on int-looking strings. */

This comment doesn't strike me as a good idea, it's almost guaranteed to
be outdated at some point.

It is valid now, and it can be removed anyway.

[...] Duplicating these routines is pretty ugly.

Sure.

I really wish we had more infrastructure to avoid this (in particularly
a portable way to report an error and jump out). But probably out of
scope for this patches?

Yes.

Devising an appropriate client-side error handling/reporting
infrastructure is a non trivial tasks, and would cost more than a few
duplicate functions. "fprintf(stderr + return/exit" currently does the job
with minimal hassle. I do not think that this patch is the right one to
change how error are handle in postgres client applications.

+ if (unlikely(end == str || *end != '\0'))

Not sure I see much point in the unlikelys here, contrasting to the ones
in the backend (and the copy for the frontend) it's extremely unlikley
anything like this will ever be close to a bottleneck. In general, I'd
strongly suggest not placing unlikelys in non-critical codepaths -
they're way too often wrongly estimated.

I put "unlikely" where I really thought it made sense. I do not know when
they would have an actual performance impact, but I appreciate that they
convey information to the reader of the code, telling that this path is
expected not to be taken.

It can be removed anyway if you do not like it.

--
Fabien.

#17

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Fabien COELHO (#16)

Re: pgbench's expression parsing & negative numbers

Hi,

On 2018-09-26 22:38:41 +0200, Fabien COELHO wrote:

+"9223372036854775808" {
+					/* yuk: special handling for minint */
+					return MAXINT_PLUS_ONE_CONST;
+				}
Yikes, do we really need this? If we dealt with - in the lexer, we
shouldn't need it, no?
I'm not sure how to handle unary minus constant in the lexer, given that the
same '-' character is also used as minus binary operator.

True. I think the nicer fix than what you've done here is to instead
simply always store the number, as coming from the lexer, as the
negative number. We do similarly in a number of places. I've gone with
yours for now.

+ /* this can't happen here, function called on int-looking strings. */

This comment doesn't strike me as a good idea, it's almost guaranteed to
be outdated at some point.

It is valid now, and it can be removed anyway.

Removed

+ if (unlikely(end == str || *end != '\0'))

Not sure I see much point in the unlikelys here, contrasting to the ones
in the backend (and the copy for the frontend) it's extremely unlikley
anything like this will ever be close to a bottleneck. In general, I'd
strongly suggest not placing unlikelys in non-critical codepaths -
they're way too often wrongly estimated.

I put "unlikely" where I really thought it made sense. I do not know when
they would have an actual performance impact, but I appreciate that they
convey information to the reader of the code, telling that this path is
expected not to be taken.

I removed them.

Pushed, thanks for the patch! I only did some very minor comment
changes, reset errno to 0 before strtod, removed a redundant
multiplication in PGBENCH_MUL.

FWIW, after this, and fixing the prerequisite general code paths, the
pgbench tests pass without -fsanitize=signed-integer-overflow errors.

Greetings,

Andres Freund