refactoring - share str2*int64 functions

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Fabien COELHO (#1)

1 attachment(s)

Re: refactoring - share str2*int64 functions

As usual, it is better with the attachement. Sorry for the noise.

Yep, but ISTM that it is down to 32 bits,

Only on 32-bit-long machines, which are a dwindling minority (except
for Windows, which I don't really care about).

So the third short is now always 0. Hmmm. I'll propose another option over
the week-end.

I suppose we could put pg_strtouint64 somewhere where pgbench can use it,
but TBH I don't think it's worth the trouble. The set of people using
the --random-seed=int option at all is darn near empty, I suspect,
and the documentation only says you can write an int there.

Although I agree it is not worth a lot of trouble, and even if I don't do
Windows, I think it valuable that the behavior is the same on all platform.
The attached match shares pg_str2*int64 functions between frontend and
backend by moving them to "common/", which avoids some code duplication.

This is more refactoring, and it fixes the behavior change on 32 bit
architectures.

--
Fabien.

Attachments:

share-strtoX64-functions-1.patchtext/x-diff; name=share-strtoX64-functions-1.patchDownload

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 22d0fe5ac4..2040d11bba 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 3b96492b36..9b40fcde78 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 3a7a858e3a..8a6d031d7c 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -498,7 +498,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), true, &val64))
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index bf64c8e4a4..fe4a25bec2 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "replication/logical.h"
@@ -20,7 +21,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -111,7 +111,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (!pg_strtoint64(strVal(defel->arg), true, &parsed))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index c92e9d5046..618660f372 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index df1db7bc9f..1121601adc 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..98e7b52ce9 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,89 +47,6 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
@@ -138,7 +55,7 @@ int8in(PG_FUNCTION_ARGS)
 	char	   *str = PG_GETARG_CSTRING(0);
 	int64		result;
 
-	(void) scanint8(str, false, &result);
+	(void) pg_strtoint64(str, false, &result);
 	PG_RETURN_INT64(result);
 }
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index fbc2863622..04cfc15177 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index c7df630c3c..a5f6d797d8 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index 72c450c70e..1729e150e6 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..9666a7855d 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (!pg_strtoint64(yytext, true, &yylval->ival))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index a03ab281a5..c06134630b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -33,6 +33,7 @@
 
 #include "postgres_fe.h"
 #include "common/int.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "fe_utils/logging.h"
 #include "getopt_long.h"
@@ -669,7 +670,7 @@ is_an_int(const char *str)
 {
 	const char *ptr = str;
 
-	/* skip leading spaces; cast is consistent with strtoint64 */
+	/* skip leading spaces; cast is consistent with pg_strtoint64 */
 	while (*ptr && isspace((unsigned char) *ptr))
 		ptr++;
 
@@ -689,90 +690,6 @@ is_an_int(const char *str)
 	return *ptr == '\0';
 }
 
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1324,7 +1241,7 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64 iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!pg_strtoint64(var->svalue, false, &iv))
 			return false;
 
 		setIntValue(&var->value, iv);
@@ -5020,18 +4937,15 @@ set_random_seed(const char *seed)
 	else
 	{
 		/* parse unsigned-int seed value */
-		unsigned long ulseed;
-		char		garbage;
-
-		/* Don't try to use UINT64_FORMAT here; it might not work for sscanf */
-		if (sscanf(seed, "%lu%c", &ulseed, &garbage) != 1)
+		char		*garbage;
+		iseed = pg_strtouint64(seed, &garbage, 10);
+		if (*garbage != '\0')
 		{
 			fprintf(stderr,
 					"unrecognized random seed option \"%s\": expecting an unsigned integer, \"time\" or \"rand\"\n",
 					seed);
 			return false;
 		}
-		iseed = (uint64) ulseed;
 	}
 
 	if (seed != NULL)
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index dc557d416c..a4e78c16ad 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 			 const char *cmd, const char *msg,
 			 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index b01a56ceaa..4449576e35 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,123 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns whether the conversion worked, and if so
+ * "*result" is set to the result.
+ *
+ * If not errorOK, an error message is also printed out on errors.
+ */
+bool
+pg_strtoint64(const char *str, bool errorOK, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (*ptr && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		goto invalid_syntax;
+
+	/* process digits */
+	while (*ptr && isdigit((unsigned char) *ptr))
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			goto out_of_range;
+	}
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (*ptr != '\0' && isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		goto invalid_syntax;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			goto out_of_range;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return true;
+
+out_of_range:
+	if (!errorOK)
+#ifdef FRONTEND
+		fprintf(stderr,
+				"value \"%s\" is out of range for type bigint\n", str);
+#else
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("value \"%s\" is out of range for type %s",
+						str, "bigint")));
+#endif
+	return false;
+
+invalid_syntax:
+	if (!errorOK)
+#ifdef FRONTEND
+		fprintf(stderr,
+				"invalid input syntax for type bigint: \"%s\"\n", str);
+#else
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for type %s: \"%s\"",
+						"bigint", str)));
+#endif
+
+	return false;
+}
+
+/*
+ * pg_strtouint64
+ *		Converts 'str' into an unsigned 64-bit integer.
+ *
+ * This has the identical API to strtoul(3), except that it will handle
+ * 64-bit ints even where "long" is narrower than that.
+ *
+ * For the moment it seems sufficient to assume that the platform has
+ * such a function somewhere; let's not roll our own.
+ */
+uint64
+pg_strtouint64(const char *str, char **endptr, int base)
+{
+#ifdef _MSC_VER					/* MSVC only */
+	return _strtoui64(str, endptr, base);
+#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
+	return strtoull(str, endptr, base);
+#else
+	return strtoul(str, endptr, base);
+#endif
+}
+
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/string.h b/src/include/common/string.h
index f3467f0ee4..ea88f3ce15 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -13,6 +13,9 @@
 extern bool pg_str_endswith(const char *str, const char *end);
 extern int strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 		 int base);
+extern bool pg_strtoint64(const char *str, bool errorOK, int64 *result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 extern void pg_clean_ascii(char *str);
 
+
 #endif							/* COMMON_STRING_H */
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 03d0ee2766..0f5d241fc7 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Fabien COELHO (#2)

1 attachment(s)

Re: refactoring - share str2*int64 functions

Although I agree it is not worth a lot of trouble, and even if I don't do
Windows, I think it valuable that the behavior is the same on all platform.
The attached match shares pg_str2*int64 functions between frontend and
backend by moving them to "common/", which avoids some code duplication.

This is more refactoring, and it fixes the behavior change on 32 bit
architectures.

V2 is a rebase.

--
Fabien.

Attachments:

share-strtoX64-functions-2.patchtext/x-diff; name=share-strtoX64-functions-2.patchDownload

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 8eedb613a1..e8744f054c 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 6c2626ee62..3735268e3a 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 3a7a858e3a..8a6d031d7c 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -498,7 +498,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), true, &val64))
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index d317fd7006..70681588a1 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "replication/logical.h"
@@ -20,7 +21,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -111,7 +111,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (!pg_strtoint64(strVal(defel->arg), true, &parsed))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index c92e9d5046..618660f372 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 206576d4bd..1f17987e85 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..98e7b52ce9 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,89 +47,6 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
@@ -138,7 +55,7 @@ int8in(PG_FUNCTION_ARGS)
 	char	   *str = PG_GETARG_CSTRING(0);
 	int64		result;
 
-	(void) scanint8(str, false, &result);
+	(void) pg_strtoint64(str, false, &result);
 	PG_RETURN_INT64(result);
 }
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 201784bbf6..02bdf278c2 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f5d488c569 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index cb16d701d8..3941cb82b5 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..9666a7855d 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (!pg_strtoint64(yytext, true, &yylval->ival))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8b84658ccd..e18c11b89d 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -669,7 +670,7 @@ is_an_int(const char *str)
 {
 	const char *ptr = str;
 
-	/* skip leading spaces; cast is consistent with strtoint64 */
+	/* skip leading spaces; cast is consistent with pg_strtoint64 */
 	while (*ptr && isspace((unsigned char) *ptr))
 		ptr++;
 
@@ -689,90 +690,6 @@ is_an_int(const char *str)
 	return *ptr == '\0';
 }
 
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1324,7 +1241,7 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64		iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!pg_strtoint64(var->svalue, false, &iv))
 			return false;
 
 		setIntValue(&var->value, iv);
@@ -5020,18 +4937,15 @@ set_random_seed(const char *seed)
 	else
 	{
 		/* parse unsigned-int seed value */
-		unsigned long ulseed;
-		char		garbage;
-
-		/* Don't try to use UINT64_FORMAT here; it might not work for sscanf */
-		if (sscanf(seed, "%lu%c", &ulseed, &garbage) != 1)
+		char		*garbage;
+		iseed = pg_strtouint64(seed, &garbage, 10);
+		if (*garbage != '\0')
 		{
 			fprintf(stderr,
 					"unrecognized random seed option \"%s\": expecting an unsigned integer, \"time\" or \"rand\"\n",
 					seed);
 			return false;
 		}
-		iseed = (uint64) ulseed;
 	}
 
 	if (seed != NULL)
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index b01a56ceaa..4449576e35 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,123 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns whether the conversion worked, and if so
+ * "*result" is set to the result.
+ *
+ * If not errorOK, an error message is also printed out on errors.
+ */
+bool
+pg_strtoint64(const char *str, bool errorOK, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (*ptr && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		goto invalid_syntax;
+
+	/* process digits */
+	while (*ptr && isdigit((unsigned char) *ptr))
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			goto out_of_range;
+	}
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (*ptr != '\0' && isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		goto invalid_syntax;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			goto out_of_range;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return true;
+
+out_of_range:
+	if (!errorOK)
+#ifdef FRONTEND
+		fprintf(stderr,
+				"value \"%s\" is out of range for type bigint\n", str);
+#else
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("value \"%s\" is out of range for type %s",
+						str, "bigint")));
+#endif
+	return false;
+
+invalid_syntax:
+	if (!errorOK)
+#ifdef FRONTEND
+		fprintf(stderr,
+				"invalid input syntax for type bigint: \"%s\"\n", str);
+#else
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for type %s: \"%s\"",
+						"bigint", str)));
+#endif
+
+	return false;
+}
+
+/*
+ * pg_strtouint64
+ *		Converts 'str' into an unsigned 64-bit integer.
+ *
+ * This has the identical API to strtoul(3), except that it will handle
+ * 64-bit ints even where "long" is narrower than that.
+ *
+ * For the moment it seems sufficient to assume that the platform has
+ * such a function somewhere; let's not roll our own.
+ */
+uint64
+pg_strtouint64(const char *str, char **endptr, int base)
+{
+#ifdef _MSC_VER					/* MSVC only */
+	return _strtoui64(str, endptr, base);
+#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
+	return strtoull(str, endptr, base);
+#else
+	return strtoul(str, endptr, base);
+#endif
+}
+
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 77f31337ca..ef760e8d25 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -13,6 +13,9 @@
 extern bool pg_str_endswith(const char *str, const char *end);
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+extern bool pg_strtoint64(const char *str, bool errorOK, int64 *result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 extern void pg_clean_ascii(char *str);
 
+
 #endif							/* COMMON_STRING_H */
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index a3cd7d26fa..e014a45705 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

thomas.munro@gmail.com

over 6 years ago

In reply to: Fabien COELHO (#3)

Re: refactoring - share str2*int64 functions

On Fri, May 24, 2019 at 3:23 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Although I agree it is not worth a lot of trouble, and even if I don't do
Windows, I think it valuable that the behavior is the same on all platform.
The attached match shares pg_str2*int64 functions between frontend and
backend by moving them to "common/", which avoids some code duplication.

This is more refactoring, and it fixes the behavior change on 32 bit
architectures.

V2 is a rebase.

Hi Fabien,

Here's some semi-automated feedback, noted while going through
failures on cfbot.cputube.org. You have a stray editor file
src/backend/parser/parse_node.c.~1~. Something is failing to compile
while doing the temp-install in make check-world, which probably
indicates that some test or contrib module is using the interface you
changed?

--
Thomas Munro
https://enterprisedb.com

thomas.munro@gmail.com

over 6 years ago

In reply to: Thomas Munro (#4)

Re: refactoring - share str2*int64 functions

On Mon, Jul 8, 2019 at 3:22 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Here's some semi-automated feedback, noted while going through
failures on cfbot.cputube.org. You have a stray editor file
src/backend/parser/parse_node.c.~1~. Something is failing to compile
while doing the temp-install in make check-world, which probably
indicates that some test or contrib module is using the interface you
changed?

Please disregard the the comment about the ".~1~" file, my mistake.
As for the check-world failure, it's here:

pg_stat_statements.c:1024:11: error: implicit declaration of function
'pg_strtouint64' is invalid in C99
[-Werror,-Wimplicit-function-declaration]
rows = pg_strtouint64(completionTag + 5, NULL, 10);
^
Apparently it needs to include common/string.h.

--
Thomas Munro
https://enterprisedb.com

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Thomas Munro (#5)

1 attachment(s)

Re: refactoring - share str2*int64 functions

Hello Thomas,

pg_stat_statements.c:1024:11: error: implicit declaration of function
'pg_strtouint64' is invalid in C99
[-Werror,-Wimplicit-function-declaration]
rows = pg_strtouint64(completionTag + 5, NULL, 10);
^
Apparently it needs to include common/string.h.

Yep, I gathered that as well, but did not act promptly on your help.
Thanks for it!

Here is the updated patch on which I checked "make check-world".

--
Fabien.

Attachments:

share-strtoX64-functions-3.patchtext/x-diff; name=share-strtoX64-functions-3.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 00cae04eea..8f40245ac4 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -79,6 +79,7 @@
 #include "utils/builtins.h"
 #include "utils/hashutils.h"
 #include "utils/memutils.h"
+#include "common/string.h"
 
 PG_MODULE_MAGIC;
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 8eedb613a1..e8744f054c 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 6c2626ee62..3735268e3a 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..53b89ece39 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), true, &val64))
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index d317fd7006..70681588a1 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "replication/logical.h"
@@ -20,7 +21,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -111,7 +111,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (!pg_strtoint64(strVal(defel->arg), true, &parsed))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index c92e9d5046..618660f372 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 206576d4bd..1f17987e85 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..98e7b52ce9 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,89 +47,6 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
@@ -138,7 +55,7 @@ int8in(PG_FUNCTION_ARGS)
 	char	   *str = PG_GETARG_CSTRING(0);
 	int64		result;
 
-	(void) scanint8(str, false, &result);
+	(void) pg_strtoint64(str, false, &result);
 	PG_RETURN_INT64(result);
 }
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 201784bbf6..02bdf278c2 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f5d488c569 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index cb16d701d8..3941cb82b5 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..9666a7855d 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (!pg_strtoint64(yytext, true, &yylval->ival))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8b84658ccd..e18c11b89d 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -669,7 +670,7 @@ is_an_int(const char *str)
 {
 	const char *ptr = str;
 
-	/* skip leading spaces; cast is consistent with strtoint64 */
+	/* skip leading spaces; cast is consistent with pg_strtoint64 */
 	while (*ptr && isspace((unsigned char) *ptr))
 		ptr++;
 
@@ -689,90 +690,6 @@ is_an_int(const char *str)
 	return *ptr == '\0';
 }
 
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1324,7 +1241,7 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64		iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!pg_strtoint64(var->svalue, false, &iv))
 			return false;
 
 		setIntValue(&var->value, iv);
@@ -5020,18 +4937,15 @@ set_random_seed(const char *seed)
 	else
 	{
 		/* parse unsigned-int seed value */
-		unsigned long ulseed;
-		char		garbage;
-
-		/* Don't try to use UINT64_FORMAT here; it might not work for sscanf */
-		if (sscanf(seed, "%lu%c", &ulseed, &garbage) != 1)
+		char		*garbage;
+		iseed = pg_strtouint64(seed, &garbage, 10);
+		if (*garbage != '\0')
 		{
 			fprintf(stderr,
 					"unrecognized random seed option \"%s\": expecting an unsigned integer, \"time\" or \"rand\"\n",
 					seed);
 			return false;
 		}
-		iseed = (uint64) ulseed;
 	}
 
 	if (seed != NULL)
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index b01a56ceaa..4449576e35 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,123 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns whether the conversion worked, and if so
+ * "*result" is set to the result.
+ *
+ * If not errorOK, an error message is also printed out on errors.
+ */
+bool
+pg_strtoint64(const char *str, bool errorOK, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (*ptr && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		goto invalid_syntax;
+
+	/* process digits */
+	while (*ptr && isdigit((unsigned char) *ptr))
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			goto out_of_range;
+	}
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (*ptr != '\0' && isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		goto invalid_syntax;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			goto out_of_range;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return true;
+
+out_of_range:
+	if (!errorOK)
+#ifdef FRONTEND
+		fprintf(stderr,
+				"value \"%s\" is out of range for type bigint\n", str);
+#else
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("value \"%s\" is out of range for type %s",
+						str, "bigint")));
+#endif
+	return false;
+
+invalid_syntax:
+	if (!errorOK)
+#ifdef FRONTEND
+		fprintf(stderr,
+				"invalid input syntax for type bigint: \"%s\"\n", str);
+#else
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for type %s: \"%s\"",
+						"bigint", str)));
+#endif
+
+	return false;
+}
+
+/*
+ * pg_strtouint64
+ *		Converts 'str' into an unsigned 64-bit integer.
+ *
+ * This has the identical API to strtoul(3), except that it will handle
+ * 64-bit ints even where "long" is narrower than that.
+ *
+ * For the moment it seems sufficient to assume that the platform has
+ * such a function somewhere; let's not roll our own.
+ */
+uint64
+pg_strtouint64(const char *str, char **endptr, int base)
+{
+#ifdef _MSC_VER					/* MSVC only */
+	return _strtoui64(str, endptr, base);
+#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
+	return strtoull(str, endptr, base);
+#else
+	return strtoul(str, endptr, base);
+#endif
+}
+
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 77f31337ca..ef760e8d25 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -13,6 +13,9 @@
 extern bool pg_str_endswith(const char *str, const char *end);
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+extern bool pg_strtoint64(const char *str, bool errorOK, int64 *result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 extern void pg_clean_ascii(char *str);
 
+
 #endif							/* COMMON_STRING_H */
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index a3cd7d26fa..e014a45705 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

[1]: /messages/by-id/CAEepm=2KeC8xDbEWgDTDObXGqPHFW4kcD7BZXR6NMfiHjjnKhQ@mail.gmail.com

thomas.munro@gmail.com

over 6 years ago

In reply to: Fabien COELHO (#6)

Re: refactoring - share str2*int64 functions

On Sun, Jul 14, 2019 at 3:22 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Here is the updated patch on which I checked "make check-world".

Thanks! So, we're moving pg_strtouint64() to a place where frontend
code can use it, and getting rid of some duplication. I like it. I
wanted this once before myself[1]/messages/by-id/CAEepm=2KeC8xDbEWgDTDObXGqPHFW4kcD7BZXR6NMfiHjjnKhQ@mail.gmail.com.

+extern bool pg_strtoint64(const char *str, bool errorOK, int64 *result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int base);

One of these things is not like the other. Let's see... the int64
version is used only by pgbench and is being promoted to common where
it can be used by more code. With a name like that, wouldn't it make
sense to bring it into line with the uint64 interface, and then move
pgbench's error reporting stuff back into pgbench? The uint64 one
derives its shape from the family of standard functions like strtol()
so I think it wins.

--
Thomas Munro
https://enterprisedb.com

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Thomas Munro (#7)

1 attachment(s)

Re: refactoring - share str2*int64 functions

Hello Thomas,

+extern bool pg_strtoint64(const char *str, bool errorOK, int64 *result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int base);

One of these things is not like the other.

Indeed.

I agree that it is unfortunate, and it was bothering me a little as well.

Let's see... the int64 version is used only by pgbench and is being
promoted to common where it can be used by more code.

No and yes.

The pgbench code was a copy of server-side internal "scanint8", so it is
used both by pgbench and the server-side handling of "int8", it is used
significantly, taking advantage of its versatile error reporting feature
on both sides.

With a name like that, wouldn't it make sense to bring it into line with
the uint64 interface, and then move pgbench's error reporting stuff back
into pgbench?

That would need moving the server-side error handling as well, which I
would not really be happy with.

The uint64 one derives its shape from the family of standard functions
like strtol() so I think it wins.

Yep, it cannot be changed either.

I do not think that changing the error handling capability is appropriate,
it is really a feature of the function. The function could try to use an
internal pg_strtoint64 which would look like the other unsigned version,
but then it would not differentiate the various error conditions (out of
range vs syntax error).

The compromise I can offer is to change the name of the first one, say to
"pg_scanint8" to reflect its former backend name. Attached a v4 which does
a renaming so as to avoid the name similarity but signature difference. I
also made both error messages identical.

--
Fabien.

Attachments:

share-strtoX64-functions-4.patchtext/x-diff; name=share-strtoX64-functions-4.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index ba57628f6f..5c3536b117 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -79,6 +79,7 @@
 #include "utils/builtins.h"
 #include "utils/hashutils.h"
 #include "utils/memutils.h"
+#include "common/string.h"
 
 PG_MODULE_MAGIC;
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 8eedb613a1..e8744f054c 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 6c2626ee62..3735268e3a 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..9a66ad41bd 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_scanint8(strVal(value), true, &val64))
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index d317fd7006..673f25a014 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "replication/logical.h"
@@ -20,7 +21,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -111,7 +111,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (!pg_scanint8(strVal(defel->arg), true, &parsed))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index c92e9d5046..618660f372 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 206576d4bd..1f17987e85 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..5992f4d425 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,89 +47,6 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
@@ -138,7 +55,7 @@ int8in(PG_FUNCTION_ARGS)
 	char	   *str = PG_GETARG_CSTRING(0);
 	int64		result;
 
-	(void) scanint8(str, false, &result);
+	(void) pg_scanint8(str, false, &result);
 	PG_RETURN_INT64(result);
 }
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 201784bbf6..02bdf278c2 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f5d488c569 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index cb16d701d8..3941cb82b5 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..bfd3800318 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (!pg_scanint8(yytext, true, &yylval->ival))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8b84658ccd..ebc1964957 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -669,7 +670,7 @@ is_an_int(const char *str)
 {
 	const char *ptr = str;
 
-	/* skip leading spaces; cast is consistent with strtoint64 */
+	/* skip leading spaces; cast is consistent with pg_scanint8 */
 	while (*ptr && isspace((unsigned char) *ptr))
 		ptr++;
 
@@ -689,90 +690,6 @@ is_an_int(const char *str)
 	return *ptr == '\0';
 }
 
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1324,7 +1241,7 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64		iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!pg_scanint8(var->svalue, false, &iv))
 			return false;
 
 		setIntValue(&var->value, iv);
@@ -5020,18 +4937,15 @@ set_random_seed(const char *seed)
 	else
 	{
 		/* parse unsigned-int seed value */
-		unsigned long ulseed;
-		char		garbage;
-
-		/* Don't try to use UINT64_FORMAT here; it might not work for sscanf */
-		if (sscanf(seed, "%lu%c", &ulseed, &garbage) != 1)
+		char		*garbage;
+		iseed = pg_strtouint64(seed, &garbage, 10);
+		if (*garbage != '\0')
 		{
 			fprintf(stderr,
 					"unrecognized random seed option \"%s\": expecting an unsigned integer, \"time\" or \"rand\"\n",
 					seed);
 			return false;
 		}
-		iseed = (uint64) ulseed;
 	}
 
 	if (seed != NULL)
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index b01a56ceaa..b51641038e 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,123 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_scanint8 -- convert a string to 64-bit integer
+ *
+ * The function returns whether the conversion worked, and if so
+ * "*result" is set to the result.
+ *
+ * If not errorOK, an error message is also printed out on errors.
+ */
+bool
+pg_scanint8(const char *str, bool errorOK, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (*ptr && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		goto invalid_syntax;
+
+	/* process digits */
+	while (*ptr && isdigit((unsigned char) *ptr))
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			goto out_of_range;
+	}
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (*ptr != '\0' && isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		goto invalid_syntax;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			goto out_of_range;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return true;
+
+out_of_range:
+	if (!errorOK)
+#ifdef FRONTEND
+		fprintf(stderr,
+				"value \"%s\" is out of range for type %s\n", str, "bigint");
+#else
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("value \"%s\" is out of range for type %s",
+						str, "bigint")));
+#endif
+	return false;
+
+invalid_syntax:
+	if (!errorOK)
+#ifdef FRONTEND
+		fprintf(stderr,
+				"invalid input syntax for type %s: \"%s\"\n", "bigint", str);
+#else
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for type %s: \"%s\"",
+						"bigint", str)));
+#endif
+
+	return false;
+}
+
+/*
+ * pg_strtouint64
+ *		Converts 'str' into an unsigned 64-bit integer.
+ *
+ * This has the identical API to strtoul(3), except that it will handle
+ * 64-bit ints even where "long" is narrower than that.
+ *
+ * For the moment it seems sufficient to assume that the platform has
+ * such a function somewhere; let's not roll our own.
+ */
+uint64
+pg_strtouint64(const char *str, char **endptr, int base)
+{
+#ifdef _MSC_VER					/* MSVC only */
+	return _strtoui64(str, endptr, base);
+#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
+	return strtoull(str, endptr, base);
+#else
+	return strtoul(str, endptr, base);
+#endif
+}
+
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 77f31337ca..3d551d2934 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -13,6 +13,9 @@
 extern bool pg_str_endswith(const char *str, const char *end);
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+extern bool pg_scanint8(const char *str, bool errorOK, int64 *result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 extern void pg_clean_ascii(char *str);
 
+
 #endif							/* COMMON_STRING_H */
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index a3cd7d26fa..e014a45705 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

thomas.munro@gmail.com

over 6 years ago

In reply to: Fabien COELHO (#8)

Re: refactoring - share str2*int64 functions

On Mon, Jul 15, 2019 at 5:08 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

The compromise I can offer is to change the name of the first one, say to
"pg_scanint8" to reflect its former backend name. Attached a v4 which does
a renaming so as to avoid the name similarity but signature difference. I
also made both error messages identical.

Cool. I'm not exactly sure when we should include 'pg_' in identifier
names. It seems to be used for functions/macros that wrap or replace
something else with a similar name, like pg_pwrite(),
pg_attribute_noreturn(), ... In this case it's just our own code that
we're moving, so I'm wondering if we should just call it scanint8().

If you agree, I think this is ready to commit.

--
Thomas Munro
https://enterprisedb.com

#10

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Thomas Munro (#9)

Re: refactoring - share str2*int64 functions

On 2019-Jul-16, Thomas Munro wrote:

On Mon, Jul 15, 2019 at 5:08 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

The compromise I can offer is to change the name of the first one, say to
"pg_scanint8" to reflect its former backend name. Attached a v4 which does
a renaming so as to avoid the name similarity but signature difference. I
also made both error messages identical.

Cool. I'm not exactly sure when we should include 'pg_' in identifier
names. It seems to be used for functions/macros that wrap or replace
something else with a similar name, like pg_pwrite(),
pg_attribute_noreturn(), ... In this case it's just our own code that
we're moving, so I'm wondering if we should just call it scanint8().

Isn't it annoying that pg_strtouint64() has an implementation that
suggests that it ought to be in src/port? The fact that the signatures
are so different suggests to me that we should indeed put them separate.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11

michael@paquier.xyz

over 6 years ago

In reply to: Thomas Munro (#9)

Re: refactoring - share str2*int64 functions

On Tue, Jul 16, 2019 at 11:16:31AM +1200, Thomas Munro wrote:

Cool. I'm not exactly sure when we should include 'pg_' in identifier
names. It seems to be used for functions/macros that wrap or replace
something else with a similar name, like pg_pwrite(),
pg_attribute_noreturn(), ... In this case it's just our own code that
we're moving, so I'm wondering if we should just call it scanint8().

FWIW, I was looking forward to putting my hands on this patch and try
to get it merged so as we can get rid of those duplications. Here are
some comments.

+#ifdef FRONTEND
+       fprintf(stderr,
+               "invalid input syntax for type %s: \"%s\"\n",
"bigint", str);
+#else
+       ereport(ERROR,
+               (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                errmsg("invalid input syntax for type %s: \"%s\"",
+                       "bigint", str)));
+#endif
Have you looked at using the wrapper pg_log_error() here?

+extern bool pg_scanint8(const char *str, bool errorOK, int64
*result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int
base);
Hmm.  With this patch we have strtoint and pg_strtouint64, which makes
the whole set inconsistent.

+
#endif /* COMMON_STRING_H */
Noise diff..

numutils.c also has pg_strtoint16 and pg_strtoint32, so the locations
become rather inconsistent with inconsistent APIs for the manipulation
of int2 and int4 fields, and scanint8 is just a derivative of the same
logic. We have two categories of routines here:
- The wrappers on top of strtol and strtoul & co, which are named
respectively strtoint and pg_strtouint64 with the patch. The naming
part is inconsistent, and we only handle uint64 and int32. We don't
need to worry about int64 and uint32 because they are not used?
That's fine by me, but at least let's have a consistent naming.
Prefixing the functions with pg_* is a better practice in my opinion
as we will unlikely run into conflicts this way.
- The str->integer conversion routines, which actually have very
similar characteristics to the strtol families as they remove trailing
whitespaces first, check for a sign, etc, except that they work only
on base 10. And here we get into a state where pg_scanint8 should be
actually called pg_strtoint64, with an interface inconsistent with its
int32/int16 relatives now only in the backend. Could we consider more
consolidation here please? Like moving the whole set to src/common/?

If you agree, I think this is ready to commit.

Thomas, are you planning to look at this patch as committer? I had it
in my agenda, and was planning to look at it sooner than later. Now
if you are on it, I won't step on your toes.
--
Michael

#12

thomas.munro@gmail.com

over 6 years ago

In reply to: Michael Paquier (#11)

Re: refactoring - share str2*int64 functions

On Tue, Jul 16, 2019 at 7:11 PM Michael Paquier <michael@paquier.xyz> wrote:

Thomas, are you planning to look at this patch as committer? I had it
in my agenda, and was planning to look at it sooner than later. Now
if you are on it, I won't step on your toes.

Hi Michael, please go ahead, it's all yours.

--
Thomas Munro
https://enterprisedb.com

#13

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Thomas Munro (#9)

Re: refactoring - share str2*int64 functions

Hello Thomas,

On Mon, Jul 15, 2019 at 5:08 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

The compromise I can offer is to change the name of the first one, say to
"pg_scanint8" to reflect its former backend name. Attached a v4 which does
a renaming so as to avoid the name similarity but signature difference. I
also made both error messages identical.

Cool. I'm not exactly sure when we should include 'pg_' in identifier
names. It seems to be used for functions/macros that wrap or replace
something else with a similar name, like pg_pwrite(),
pg_attribute_noreturn(), ... In this case it's just our own code that
we're moving, so I'm wondering if we should just call it scanint8().

I added the pg_ prefix as a poor man's namespace because the function can
be used by external tools (eg contribs), so as to avoid potential name
conflicts.

I agree that such conflicts are less probable if the name does not replace
something existing.

If you agree, I think this is ready to commit.

It can be removed, or not. So you do as you feel.

--
Fabien.

#14

Robert Haas

robertmhaas@gmail.com

over 6 years ago

In reply to: Fabien COELHO (#13)

Re: refactoring - share str2*int64 functions

On Jul 16, 2019, at 3:30 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Cool. I'm not exactly sure when we should include 'pg_' in identifier
names. It seems to be used for functions/macros that wrap or replace
something else with a similar name, like pg_pwrite(),
pg_attribute_noreturn(), ... In this case it's just our own code that
we're moving, so I'm wondering if we should just call it scanint8().

I added the pg_ prefix as a poor man's namespace because the function can be used by external tools (eg contribs), so as to avoid potential name conflicts.

Yeah, I think if we are going to expose it to front end code there is a good argument for some kind of prefix that makes it sound PostgreSQL-related.

...Robert

#15

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Robert Haas (#14)

Re: refactoring - share str2*int64 functions

Robert Haas <robertmhaas@gmail.com> writes:

On Jul 16, 2019, at 3:30 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Cool. I'm not exactly sure when we should include 'pg_' in identifier
names.

I added the pg_ prefix as a poor man's namespace because the function can be used by external tools (eg contribs), so as to avoid potential name conflicts.

Yeah, I think if we are going to expose it to front end code there is a good argument for some kind of prefix that makes it sound PostgreSQL-related.

Yeah, I'd tend to err in favor of including "pg_". We might get away
without that as long as the name is never exposed to non-PG code, but
for stuff that's going into src/common/ or src/port/ I think that's
a risky assumption to make.

I'm also in agreement with Michael's comments in
<20190716071144.GF1439@paquier.xyz> that this would be a good time
to bring some consistency to the naming of related functions.

regards, tom lane

#16

andres@anarazel.de

over 6 years ago

In reply to: Fabien COELHO (#8)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-15 07:08:42 +0200, Fabien COELHO wrote:

I do not think that changing the error handling capability is appropriate,
it is really a feature of the function. The function could try to use an
internal pg_strtoint64 which would look like the other unsigned version, but
then it would not differentiate the various error conditions (out of range
vs syntax error).

The compromise I can offer is to change the name of the first one, say to
"pg_scanint8" to reflect its former backend name. Attached a v4 which does a
renaming so as to avoid the name similarity but signature difference. I also
made both error messages identical.

I think the interface of that function is not that good, and the "scan"
in the name isn't great for discoverability (for one it's a different
naming than pg_strtoint16 etc), and the *8 meaning 64bit is confusing
enough in the backend, we definitely shouldn't extend that to frontend
code.

Referencing "bigint" and "input syntax" from frontend code imo doesn't
make a lot of sense. And int8in is the only caller that uses
errorOK=False anyway, so there's currently no need for the frontend
error strings afaict.

ISTM that something like

typedef enum StrToIntConversion
{
STRTOINT_OK = 0,
STRTOINT_SYNTAX_ERROR = 1,
STRTOINT_OUT_OF_RANGE = 2
} StrToIntConversion;
StrToIntConversion pg_strtoint64(const char *str, int64 *result);

would make more sense.

There is the issue that there already is pg_strtoint16 and
pg_strtoint32, which do not have the option to not raise an error. I'd
probably name the non-error throwing ones something like pg_strtointNN_e
(for extended, or error handling), and have pg_strtointNN wrappers that
just handle the errors, or reverse the naming (which'd cause a bit of
churn, but not that much).

That'd also make the code for pg_strtointNN a bit nicer, because we'd
not need the gotos anymore, they're just there to avoid redundant error
messages - which'd not be an issue if the error handling were just a
switch in a separate function. E.g.

int32
pg_strtoint32(const char *s)
{
int32 result;

switch (pg_strtoint32_e(s, &result))
{
case STRTOINT_OK:
return result;

case STRTOINT_SYNTAX_ERROR:
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("value \"%s\" is out of range for type %s",
s, "integer")));

case STRTOINT_OUT_OF_RANGE:
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
errmsg("invalid input syntax for type %s: \"%s\"",
"integer", s)));

}

return 0; /* keep compiler quiet */
}

which does seem nicer than what we have right now.

Greetings,

Andres Freund

#17

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#11)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-16 16:11:44 +0900, Michael Paquier wrote:

numutils.c also has pg_strtoint16 and pg_strtoint32, so the locations
become rather inconsistent with inconsistent APIs for the manipulation
of int2 and int4 fields, and scanint8 is just a derivative of the same
logic. We have two categories of routines here:

- The wrappers on top of strtol and strtoul & co, which are named
respectively strtoint and pg_strtouint64 with the patch. The naming
part is inconsistent, and we only handle uint64 and int32. We don't
need to worry about int64 and uint32 because they are not used?
That's fine by me, but at least let's have a consistent naming.

Yea, consistent naming seems like a strong requirement
here. Additionally I think we should just provide a consistent set
rather than what's needed just now. That'll just lead to people
inventing their own again down the line.

Prefixing the functions with pg_* is a better practice in my opinion
as we will unlikely run into conflicts this way.

- The str->integer conversion routines, which actually have very
similar characteristics to the strtol families as they remove trailing
whitespaces first, check for a sign, etc, except that they work only
on base 10.

There's afaict neither a caller that needs the base argument at the
moment, nor one in the tree previously. I'd argue for just making
pg_strtouint64's API consistent.

I'd probably also just use the implementation we have for signed
integers (minus the relevant negation and overflow checks, obviously) -
it's a lot faster, and I think there's value in keeping the
implementations in sync.

Greetings,

Andres Freund

#18

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#17)

Re: refactoring - share str2*int64 functions

On Tue, Jul 16, 2019 at 01:18:38PM -0700, Andres Freund wrote:

Hi,

On 2019-07-16 16:11:44 +0900, Michael Paquier wrote:
Yea, consistent naming seems like a strong requirement
here. Additionally I think we should just provide a consistent set
rather than what's needed just now. That'll just lead to people
inventing their own again down the line.

Agreed. The first versions of pg_rewind in the tree have been using
copy_file_range(), which has been introduced in Linux.

- The str->integer conversion routines, which actually have very
similar characteristics to the strtol families as they remove trailing
whitespaces first, check for a sign, etc, except that they work only
on base 10.

There's afaict neither a caller that needs the base argument at the
moment, nor one in the tree previously. I'd argue for just making
pg_strtouint64's API consistent.

Good point, indeed, this could be much more simplified. I have not
paid attention at that part.

I'd probably also just use the implementation we have for signed
integers (minus the relevant negation and overflow checks, obviously) -
it's a lot faster, and I think there's value in keeping the
implementations in sync.

You mean that it is much faster than the set of wrappers for strtol
than we have? Is that because we don't care about the base?
--
Michael

#19

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#16)

Re: refactoring - share str2*int64 functions

On Tue, Jul 16, 2019 at 01:04:38PM -0700, Andres Freund wrote:

There is the issue that there already is pg_strtoint16 and
pg_strtoint32, which do not have the option to not raise an error. I'd
probably name the non-error throwing ones something like pg_strtointNN_e
(for extended, or error handling), and have pg_strtointNN wrappers that
just handle the errors, or reverse the naming (which'd cause a bit of
churn, but not that much).

That'd also make the code for pg_strtointNN a bit nicer, because we'd
not need the gotos anymore, they're just there to avoid redundant error
messages - which'd not be an issue if the error handling were just a
switch in a separate function. E.g.

Agreed on that. I am wondering if we should use a common wrapper for
all the internal functions taking in input a set of bits16 flags to
control its behavior and put all that into common/script.c:
- Issue an error.
- Check for signedness.
- Base length: 16, 32 or 64.
This would have the advantage to move the error string generation, the
trailing whitespace check, sign handling and such in a single place.
We could have the internal routine return uint64 which is casted
afterwards to a proper result depending on what we use. (Perhaps
that's what you actually meant?)

I would also rather not touch the strtol wrappers that we have able to
handle the base. There is nothing in the tree using it, but likely
there are extensions relying on it. Switching all the existing
callers in the tree to the new routines sounds good to me of course.

Consolidating all that still needs more work, so for now I am
switching the patch as waiting on author.
--
Michael

#20

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#11)

Re: refactoring - share str2*int64 functions

Bonjour Michaï¿œl,

FWIW, I was looking forward to putting my hands on this patch and try
to get it merged so as we can get rid of those duplications. Here are
some comments.

+#ifdef FRONTEND
+       fprintf(stderr,
+               "invalid input syntax for type %s: \"%s\"\n",
"bigint", str);
+#else
+       ereport(ERROR,
+               (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                errmsg("invalid input syntax for type %s: \"%s\"",
+                       "bigint", str)));
+#endif
Have you looked at using the wrapper pg_log_error() here?

I have not.

I have simply merged the two implementations (pgbench & backend) as they
were.

+extern bool pg_scanint8(const char *str, bool errorOK, int64 *result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int base);

Hmm. With this patch we have strtoint and pg_strtouint64, which makes
the whole set inconsistent.

I understand that you mean bits vs bytes? Indeed it can bite!

+
#endif /* COMMON_STRING_H */
Noise diff..

Indeed.

numutils.c also has pg_strtoint16 and pg_strtoint32, so the locations
become rather inconsistent with inconsistent APIs for the manipulation
of int2 and int4 fields, and scanint8 is just a derivative of the same
logic. We have two categories of routines here:

Yep, but the int2/4 functions are not used elsewhere.

- The wrappers on top of strtol and strtoul & co, which are named
respectively strtoint and pg_strtouint64 with the patch. The naming
part is inconsistent, and we only handle uint64 and int32. We don't
need to worry about int64 and uint32 because they are not used?

Indeed, it seems that they are not needed/used by client code, AFAICT.

That's fine by me, but at least let's have a consistent naming.

Ok.

Prefixing the functions with pg_* is a better practice in my opinion
as we will unlikely run into conflicts this way.

Ok.

- The str->integer conversion routines, which actually have very
similar characteristics to the strtol families as they remove trailing
whitespaces first, check for a sign, etc, except that they work only
on base 10. And here we get into a state where pg_scanint8 should be
actually called pg_strtoint64,

I just removed that:-)

ISTM that the issue is that the error handling of these functions is
pretty different.

with an interface inconsistent with its int32/int16 relatives now only
in the backend.

We can, but I'm not at ease with changing the error handling approach.

Could we consider more consolidation here please? Like moving the whole
set to src/common/?

My initial plan was simply to remove direct code duplications between
front-end and back-end, not to re-engineer the full set of string to int
conversion functions:-)

On the re-engineering front: Given the various points on the thread, ISTM
that there should probably be two behaviors for str to signed/unsigned
int{16,32,64}, and having only one kind of signature for all types would
be definitely better.

One low-level one that does the conversion or return an error.

Another higher-level one which possibly adds an error message (stderr for
front-end, log for back-end).

One choice is whether there are two functions (the higher one calling the
lower one and adding the messages) or just one with a boolean to trigger
the messages. I do not have a strong opinion. Maybe one function would be
simpler. Andres boolean-compatible enum return looks like a good option.

Overall, this leads to something like:

enum { STRTOINT_OK, STRTOINT_OVERFLOW_ERROR, STRTOINT_SYNTAX_ERROR }
pg_strto{,u}int{8?,16,32,64}
(const char * string, const TYPE * result, const bool verbose);

--
Fabien.

#21

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#20)

Re: refactoring - share str2*int64 functions

On Wed, Jul 17, 2019 at 07:55:39AM +0000, Fabien COELHO wrote:

numutils.c also has pg_strtoint16 and pg_strtoint32, so the locations
become rather inconsistent with inconsistent APIs for the manipulation
of int2 and int4 fields, and scanint8 is just a derivative of the same
logic. We have two categories of routines here:

Yep, but the int2/4 functions are not used elsewhere.

The worry is more about having people invent the same stuff all over
again. If we can get a clean interface, that would ease adoption.
Hopefully.

Overall, this leads to something like:

enum { STRTOINT_OK, STRTOINT_OVERFLOW_ERROR, STRTOINT_SYNTAX_ERROR }
pg_strto{,u}int{8?,16,32,64}
(const char * string, const TYPE * result, const bool verbose);

Something like that. "verbose" may mean "error_ok" though. Not
having 6 times the same trailing whitespace checks and such would be
nice.

Actually, one thing which may be a problem is that we lack currently
the equivalents of pg_mul_s16_overflow and such for unsigned
integers. The point of contention comes from pgbench's
set_random_seed() in this case as we can expect an unsigned seed as
the docs say. But if we give up on the signedness of the random seed
which remains with 8 bytes, then we could let pg_strtouint64 as
backend-only and only worry about porting this set of APIs for signed
integers.
--
Michael

#22

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#19)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-17 12:18:19 +0900, Michael Paquier wrote:

On Tue, Jul 16, 2019 at 01:04:38PM -0700, Andres Freund wrote:

There is the issue that there already is pg_strtoint16 and
pg_strtoint32, which do not have the option to not raise an error. I'd
probably name the non-error throwing ones something like pg_strtointNN_e
(for extended, or error handling), and have pg_strtointNN wrappers that
just handle the errors, or reverse the naming (which'd cause a bit of
churn, but not that much).

That'd also make the code for pg_strtointNN a bit nicer, because we'd
not need the gotos anymore, they're just there to avoid redundant error
messages - which'd not be an issue if the error handling were just a
switch in a separate function. E.g.

Agreed on that. I am wondering if we should use a common wrapper for
all the internal functions taking in input a set of bits16 flags to
control its behavior and put all that into common/script.c:
- Issue an error.
- Check for signedness.
- Base length: 16, 32 or 64.

That'd be considerably slower, so I'm *strongly* against that. These
conversion routines are *really* hot in a number of workloads,
e.g. bulk-loading with COPY. Check e.g.
/messages/by-id/20171208214437.qgn6zdltyq5hmjpk@alap3.anarazel.de

I would also rather not touch the strtol wrappers that we have able to
handle the base. There is nothing in the tree using it, but likely
there are extensions relying on it.

I doubt it - it's not of that long-standing vintage (23a27b039d9,
2016-03-12), and if so they are very likely to use base 10. We shouldn't
keep some barely tested function around, just for the hypothetical
scenario that some extension uses it. Especially if that function is
considerably slower than the potential replacement.

Greetings,

Andres Freund

#23

Yes: /messages/by-id/20171208214437.qgn6zdltyq5hmjpk@alap3.anarazel.de

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#18)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-17 12:04:32 +0900, Michael Paquier wrote:

On Tue, Jul 16, 2019 at 01:18:38PM -0700, Andres Freund wrote:

I'd probably also just use the implementation we have for signed
integers (minus the relevant negation and overflow checks, obviously) -
it's a lot faster, and I think there's value in keeping the
implementations in sync.

You mean that it is much faster than the set of wrappers for strtol
than we have? Is that because we don't care about the base?

Not caring about the base is one significant part, that removes a fair
bit of branches and more importantly allows the compiler to replace
divisions with much faster code (glibc tries to avoid the division too,
with lookup tables, but that's still expensive). Additionally there's
also some locale awareness in strtoll etc that we don't need. It's also
plainly not that well implemented at least in glibc and musl.

Having an implementation that reliably works the same across all
platforms is also advantageous.

Greetings,

Andres Freund

#24

andres@anarazel.de

over 6 years ago

In reply to: Fabien COELHO (#20)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-17 07:55:39 +0000, Fabien COELHO wrote:

- The str->integer conversion routines, which actually have very
similar characteristics to the strtol families as they remove trailing
whitespaces first, check for a sign, etc, except that they work only
on base 10. And here we get into a state where pg_scanint8 should be
actually called pg_strtoint64,

I just removed that:-)

What do you mean by that?

with an interface inconsistent with its int32/int16 relatives now only
in the backend.

We can, but I'm not at ease with changing the error handling approach.

Why?

Could we consider more consolidation here please? Like moving the whole
set to src/common/?

My initial plan was simply to remove direct code duplications between
front-end and back-end, not to re-engineer the full set of string to int
conversion functions:-)

Well, if you expose functions to more places - e.g. now the whole
frontend - it becomes more important that they're reasonably designed.

On the re-engineering front: Given the various points on the thread, ISTM
that there should probably be two behaviors for str to signed/unsigned
int{16,32,64}, and having only one kind of signature for all types would be
definitely better.

I don't understand why we'd want to have different behaviours for
signed/unsigned? Maybe I'm mis-understanding your sentence, and you just
mean that there should be one that throws, and one that returns an
errorcode?

Another higher-level one which possibly adds an error message (stderr for
front-end, log for back-end).

Is there actually any need for a non-backend one that has an
error-message? I'm not convinced that in the frontend it's very useful
to have such a function that exits - in the backend we have a much more
complete way to handle that, including pointing to the right location in
the query strings etc.

One choice is whether there are two functions (the higher one calling the
lower one and adding the messages) or just one with a boolean to trigger the
messages. I do not have a strong opinion. Maybe one function would be
simpler. Andres boolean-compatible enum return looks like a good option.

The boolean makes the calling code harder to understand, the function
slower, and the code harder to grep.

Overall, this leads to something like:

enum { STRTOINT_OK, STRTOINT_OVERFLOW_ERROR, STRTOINT_SYNTAX_ERROR }
pg_strto{,u}int{8?,16,32,64}
(const char * string, const TYPE * result, const bool verbose);

What's with hthe const for the result? I assume that was just a copy&pasto?

Greetings,

Andres Freund

#25

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#21)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-17 17:29:58 +0900, Michael Paquier wrote:

Actually, one thing which may be a problem is that we lack currently
the equivalents of pg_mul_s16_overflow and such for unsigned
integers.

It's much simpler to implement them for unsigned than for signed,
because unsigned overflow is well-defined. So I'd not be particularly
worried about just adding them. E.g. comparing the "slow" version of
pg_mul_s64_overflow() with an untested implementation of
pg_mul_u64_overflow():

pg_mul_s64_overflow:
/*
* Overflow can only happen if at least one value is outside the range
* sqrt(min)..sqrt(max) so check that first as the division can be quite a
* bit more expensive than the multiplication.
*
* Multiplying by 0 or 1 can't overflow of course and checking for 0
* separately avoids any risk of dividing by 0. Be careful about dividing
* INT_MIN by -1 also, note reversing the a and b to ensure we're always
* dividing it by a positive value.
*
*/
if ((a > PG_INT32_MAX || a < PG_INT32_MIN ||
b > PG_INT32_MAX || b < PG_INT32_MIN) &&
a != 0 && a != 1 && b != 0 && b != 1 &&
((a > 0 && b > 0 && a > PG_INT64_MAX / b) ||
(a > 0 && b < 0 && b < PG_INT64_MIN / a) ||
(a < 0 && b > 0 && a < PG_INT64_MIN / b) ||
(a < 0 && b < 0 && a < PG_INT64_MAX / b)))
{
*result = 0x5EED; /* to avoid spurious warnings */
return true;
}
*result = a * b;
return false;

pg_mul_s64_overflow:

/*
* Checking for unsigned overflow is simple, just check
* if reversing the multiplication indicates that the
* multiplication overflowed.
*/
int64 res = a * b;
if (a != 0 && b != res / a)
{
*result = 0x5EED; /* to avoid spurious warnings */
return true;
}
*result = res;
return false;

The cases for addition/subtraction are even easier:
addition:
res = a + b;
if (res < a)
/* overflow */

subtraction:
if (a < b)
/* underflow */
res = a - b;

Greetings,

Andres Freund

#26

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Andres Freund (#24)

Re: refactoring - share str2*int64 functions

- The str->integer conversion routines, which actually have very
similar characteristics to the strtol families as they remove trailing
whitespaces first, check for a sign, etc, except that they work only
on base 10. And here we get into a state where pg_scanint8 should be
actually called pg_strtoint64,

I just removed that:-)

What do you mean by that?

That I renamed something from a previous patch version and someone is
complaining that I did.

with an interface inconsistent with its int32/int16 relatives now only
in the backend.

We can, but I'm not at ease with changing the error handling approach.

Why?

If a function reports an error to log, it should keep on doing it,
otherwise there would be a regression.

Could we consider more consolidation here please? Like moving the whole
set to src/common/?

My initial plan was simply to remove direct code duplications between
front-end and back-end, not to re-engineer the full set of string to int
conversion functions:-)

Well, if you expose functions to more places - e.g. now the whole
frontend - it becomes more important that they're reasonably designed.

I can somehow only agree with that. Note that the contraposite assumption
that badly designed functions would be okay for backend seems doubtful.

On the re-engineering front: Given the various points on the thread,
ISTM that there should probably be two behaviors for str to
signed/unsigned int{16,32,64}, and having only one kind of signature
for all types would be definitely better.

I don't understand why we'd want to have different behaviours for
signed/unsigned? Maybe I'm mis-understanding your sentence, and you just
mean that there should be one that throws, and one that returns an
errorcode?

Yep for the backend (if reporting an error generates a longjump), for the
frontend there is no exception mechanism, so it is showing the error or
not to stderr, and returning whether it was ok.

Another higher-level one which possibly adds an error message (stderr for
front-end, log for back-end).

Is there actually any need for a non-backend one that has an
error-message?

Pgbench uses it. If the function is shared and one is loging something, it
looks ok to send to stderr for front-end?

I'm not convinced that in the frontend it's very useful to have such a
function that exits - in the backend we have a much more complete way to
handle that, including pointing to the right location in the query
strings etc.

Sure. There is not exit though, just messages to stderr and return false.

One choice is whether there are two functions (the higher one calling the
lower one and adding the messages) or just one with a boolean to trigger the
messages. I do not have a strong opinion. Maybe one function would be
simpler. Andres boolean-compatible enum return looks like a good option.

The boolean makes the calling code harder to understand, the function
slower,

Hmmm. So I understand that you would prefer 2 functions, one raw (fast)
one and the other with the other with the better error reporting facity,
and the user must chose the one they like. I'm fine with that as well.

and the code harder to grep.

Overall, this leads to something like:

enum { STRTOINT_OK, STRTOINT_OVERFLOW_ERROR, STRTOINT_SYNTAX_ERROR }
pg_strto{,u}int{8?,16,32,64}
(const char * string, const TYPE * result, const bool verbose);

What's with hthe const for the result? I assume that was just a copy&pasto?

Yep. The pointer is constant, not the value pointed, maybe it should be
"TYPE * const result" or something like that. Or no const at all on
result.

--
Fabien.

#27

andres@anarazel.de

over 6 years ago

In reply to: Fabien COELHO (#26)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-17 22:59:01 +0000, Fabien COELHO wrote:

with an interface inconsistent with its int32/int16 relatives now only
in the backend.

We can, but I'm not at ease with changing the error handling approach.

Why?

If a function reports an error to log, it should keep on doing it, otherwise
there would be a regression.

Err, huh. Especially if we change the signature, I fail to see how it's
a regression if we change the behaviour.

Another higher-level one which possibly adds an error message (stderr for
front-end, log for back-end).

Is there actually any need for a non-backend one that has an
error-message?

Pgbench uses it. If the function is shared and one is loging something, it
looks ok to send to stderr for front-end?

I'm not convinced that in the frontend it's very useful to have such a
function that exits - in the backend we have a much more complete way to
handle that, including pointing to the right location in the query
strings etc.

Sure. There is not exit though, just messages to stderr and return false.

I think it's a seriously bad idea to have a function that returns
depending in the error case depending on whether it's frontend or
backend code. We shouldn't do stuff like that, it just leads to bugs.

One choice is whether there are two functions (the higher one calling the
lower one and adding the messages) or just one with a boolean to trigger the
messages. I do not have a strong opinion. Maybe one function would be
simpler. Andres boolean-compatible enum return looks like a good option.

The boolean makes the calling code harder to understand, the function
slower,

Hmmm. So I understand that you would prefer 2 functions, one raw (fast) one
and the other with the other with the better error reporting facity, and the
user must chose the one they like. I'm fine with that as well.

Well, the one with error reporting would use the former.

Greetings,

Andres Freund

#28

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#22)

Re: refactoring - share str2*int64 functions

On Wed, Jul 17, 2019 at 11:14:28AM -0700, Andres Freund wrote:

That'd be considerably slower, so I'm *strongly* against that. These
conversion routines are *really* hot in a number of workloads,
e.g. bulk-loading with COPY. Check e.g.
/messages/by-id/20171208214437.qgn6zdltyq5hmjpk@alap3.anarazel.de

Thanks for the link. That makes sense! So stacking more function
calls could also be an issue. Even if using static inline for the
inner wrapper? That may sound like a stupid question but you have
likely more experience than me regarding that with profiling.

I doubt it - it's not of that long-standing vintage (23a27b039d9,
2016-03-12), and if so they are very likely to use base 10. We shouldn't
keep some barely tested function around, just for the hypothetical
scenario that some extension uses it. Especially if that function is
considerably slower than the potential replacement.

Okay, I won't fight hard on that either.
--
Michael

#29

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#28)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-18 09:28:28 +0900, Michael Paquier wrote:

On Wed, Jul 17, 2019 at 11:14:28AM -0700, Andres Freund wrote:

That'd be considerably slower, so I'm *strongly* against that. These
conversion routines are *really* hot in a number of workloads,
e.g. bulk-loading with COPY. Check e.g.
/messages/by-id/20171208214437.qgn6zdltyq5hmjpk@alap3.anarazel.de

Thanks for the link. That makes sense! So stacking more function
calls could also be an issue. Even if using static inline for the
inner wrapper? That may sound like a stupid question but you have
likely more experience than me regarding that with profiling.

A static inline would be fine, depending on how you do that. I'm not
quite sure what you mean with "inner wrapper" - isn't a wrapper normally
outside?

I'd probably do something like

static inline int64
strtoint64(const char *str)
{
int64 res;

e = strtoint64_e(str, &res);
if (likely(e == STRTOINT_OK))
return res;
else
{
report_strtoint_error(str, e, "int64");
return 0; /* pacify compiler */
}
}

and then have one non-inline report_strtoint_error() shared across the
various functions. Even leaving code-duplication aside, not having the
elog call itself in the inline function is nice, as that's quite a few
instructions.

Greetings,

Andres Freund

#30

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Andres Freund (#27)

Re: refactoring - share str2*int64 functions

Hello Andres,

If a function reports an error to log, it should keep on doing it, otherwise
there would be a regression.

Err, huh. Especially if we change the signature, I fail to see how it's
a regression if we change the behaviour.

ISTM that we do not understand one another, because I'm only trying to
state the obvious. Currently error messages on overflow or syntax error
are displayed. I just mean that such messages must keep on being emitted
somehow otherwise there would be a regression.

sql> SELECT INT8 '12345678901234567890';
ERROR: value "12345678901234567890" is out of range for type bigint
LINE 1: SELECT INT8 '12345678901234567890';

Sure. There is not exit though, just messages to stderr and return false.

I think it's a seriously bad idea to have a function that returns
depending in the error case depending on whether it's frontend or
backend code. We shouldn't do stuff like that, it just leads to bugs.

Ok, you really want two functions and getting read of the errorOK boolean.
I got that. The scanint8 already exists with its ereport ERROR vs return
based on a boolean, so the point is to get read of this kind of signature.

The boolean makes the calling code harder to understand, the function
slower,

Hmmm. So I understand that you would prefer 2 functions, one raw (fast) one
and the other with the other with the better error reporting facity, and the
user must chose the one they like. I'm fine with that as well.

Well, the one with error reporting would use the former.

Yep, possibly two functions, no boolean.

Having a common function will not escape the fact that there is no
exception mechanism for front-end, and none seems desirable just for
parsing ints. So if you want NOT to have a same function with return
something vs raises an error, that would make three functions.

One basic int parsing in the fe/be common part.

typedef enum { STRTOINT_OK, STRTOINT_OVERFLOW, STRTOINT_SYNTAX_ERROR }
strtoint_error;

strtoint_error pg_strtoint64(const char * str, int64 * result);

Then for backend, probably in backend somewhere:

// raises an exception on errors
void pg_strtoint64_log(const char * str, int64 * result) {
switch (pg_strtoint64) {
case STRTOINT_OK: return;
case STRTOINT...: ereport(ERROR, ...);
}
}

And for frontend, only in frontend somewhere:

// print to stderr and return false on error
bool pg_strtoint64_err(const char * str, int64 * result);

I'm unhappy with the function names though, feel free to improve.

--
Fabien.

#31

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#30)

Re: refactoring - share str2*int64 functions

On Thu, Jul 18, 2019 at 07:57:41AM +0000, Fabien COELHO wrote:

I'm unhappy with the function names though, feel free to improve.

I would have something rather close to what you are suggesting, still
not exactly that because we just don't care about the error strings
generated for the frontend. And my bet is that each frontend would
like to have their own error message depending on the error case.

FWIW, I had a similar experience with pg_strong_random() not so long
ago, which required a backend-specific handling because the fallback
random implementation needed some tweaks at postmaster startup (that's
why we have an alias pg_backend_random in include/port.h). So I would
recommend the following, roughly:
- One set of functions in src/port/ which return the status code for
the error handling, without any error reporting in it to avoid any
ifdef FRONTEND business, which have a generic name pg_strto[u]intXX
(XX = {16,32,64}). And have all that in a new, separate file, say
strtoint.c?
- One set of functions for the backend, called pg_stro[u]intXX_backend
or pg_backend_stro[u]intXX which can take as extra argument error_ok,
calling the portions in src/port/, and move those functions in a new
file prefixed with "backend_" in src/backend/utils/misc/ with a name
consistent with the one in src/port/ (with the previous naming that
would be backend_strtoint.c)
- We also need the unsigned-specific equivalents of
pg_mul_s64_overflow and such, so I would suggest putting that in a new
header, simply uint.h. If I finish by committing this stuff, I would
handle that in a separate commit.
--
Michael

#32

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#31)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-19 12:21:27 +0900, Michael Paquier wrote:

On Thu, Jul 18, 2019 at 07:57:41AM +0000, Fabien COELHO wrote:

I'm unhappy with the function names though, feel free to improve.

I would have something rather close to what you are suggesting, still
not exactly that because we just don't care about the error strings
generated for the frontend. And my bet is that each frontend would
like to have their own error message depending on the error case.

Yea, the error messages pgbench is currently generating, for example,
don't make a lot of sense.

FWIW, I had a similar experience with pg_strong_random() not so long
ago, which required a backend-specific handling because the fallback
random implementation needed some tweaks at postmaster startup (that's
why we have an alias pg_backend_random in include/port.h). So I would
recommend the following, roughly:
- One set of functions in src/port/ which return the status code for
the error handling, without any error reporting in it to avoid any
ifdef FRONTEND business, which have a generic name pg_strto[u]intXX
(XX = {16,32,64}). And have all that in a new, separate file, say
strtoint.c?

Why not common? It's not a platform dependent bit. Could even be put
into the already existing string.c.

- One set of functions for the backend, called pg_stro[u]intXX_backend
or pg_backend_stro[u]intXX which can take as extra argument error_ok,
calling the portions in src/port/, and move those functions in a new
file prefixed with "backend_" in src/backend/utils/misc/ with a name
consistent with the one in src/port/ (with the previous naming that
would be backend_strtoint.c)

I'm not following. What would be the point of any of this? The error_ok
bit is unnecessary, because the function is exactly the same as the
generic function. And the backend_ prefix would be pretty darn weird,
given that that's already below src/backend.

- We also need the unsigned-specific equivalents of
pg_mul_s64_overflow and such, so I would suggest putting that in a new
header, simply uint.h. If I finish by committing this stuff, I would
handle that in a separate commit.

Why not the same header? I fail to see what we'd gain by splitting it
up.

Greetings,

Andres Freund

#33

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#32)

Re: refactoring - share str2*int64 functions

On Thu, Jul 18, 2019 at 09:16:22PM -0700, Andres Freund wrote:

On 2019-07-19 12:21:27 +0900, Michael Paquier wrote:
Why not common? It's not a platform dependent bit. Could even be put
into the already existing string.c.

That would be fine to me, it is not like this file is bloated now.

- One set of functions for the backend, called pg_stro[u]intXX_backend
or pg_backend_stro[u]intXX which can take as extra argument error_ok,
calling the portions in src/port/, and move those functions in a new
file prefixed with "backend_" in src/backend/utils/misc/ with a name
consistent with the one in src/port/ (with the previous naming that
would be backend_strtoint.c)

I'm not following. What would be the point of any of this? The error_ok
bit is unnecessary, because the function is exactly the same as the
generic function. And the backend_ prefix would be pretty darn weird,
given that that's already below src/backend.

Do you have a better idea of name for those functions?

- We also need the unsigned-specific equivalents of
pg_mul_s64_overflow and such, so I would suggest putting that in a new
header, simply uint.h. If I finish by committing this stuff, I would
handle that in a separate commit.

Why not the same header? I fail to see what we'd gain by splitting it
up.

No objections to that at the end.

Fabien, are you planning to send an updated patch? This stuff has
value.
--
Michael

#34

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#33)

Re: refactoring - share str2*int64 functions

Bonjour Michaï¿œl,

Fabien, are you planning to send an updated patch? This stuff has
value.

Yep, I will try for this week.

--
Fabien.

#35

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#33)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-07-29 10:48:41 +0900, Michael Paquier wrote:

On Thu, Jul 18, 2019 at 09:16:22PM -0700, Andres Freund wrote:

- One set of functions for the backend, called pg_stro[u]intXX_backend
or pg_backend_stro[u]intXX which can take as extra argument error_ok,
calling the portions in src/port/, and move those functions in a new
file prefixed with "backend_" in src/backend/utils/misc/ with a name
consistent with the one in src/port/ (with the previous naming that
would be backend_strtoint.c)

I'm not following. What would be the point of any of this? The error_ok
bit is unnecessary, because the function is exactly the same as the
generic function. And the backend_ prefix would be pretty darn weird,
given that that's already below src/backend.

Do you have a better idea of name for those functions?

I don't understand why they need any prefix. pg_strto[u]int{32,64}{,_checked}.
The unchecked variant would be for both frontend backend. The checked one
either for both frontend/backend, or just for backend. I also could live with
_raises, _throws or such instead of _checked. Implement all of them in one
file in /common/, possibly hidin gthe ones not currently implemented for the
frontend.

Imo if _checked is implemented for both frontend/backend they'd need
different error messages. In my opinion
out_of_range:
if (!errorOK)
fprintf(stderr,
"value \"%s\" is out of range for type bigint\n", str);
return false;

invalid_syntax:
if (!errorOK)
fprintf(stderr,
"invalid input syntax for type bigint: \"%s\"\n", str);

is unsuitable for generic code. In fact, I'm doubtful that it's applicable for
any use, except for int8in(), which makes me think it better ought to use the
a non-checked variant, and include the errors directly. If we still want to
have _checked - which is reasonable imo - it should refer to 64bit integers or somesuch.

Greetings,

Andres Freund

#36

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#34)

Re: refactoring - share str2*int64 functions

On Mon, Jul 29, 2019 at 07:04:09AM +0200, Fabien COELHO wrote:

Bonjour Michaël,
Yep, I will try for this week.

Please note that for now I have marked the patch as returned with
feedback as the CF is ending.
--
Michael

#37

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#36)

Re: refactoring - share str2*int64 functions

Bonjour Michaï¿½l,

Yep, I will try for this week.

Please note that for now I have marked the patch as returned with
feedback as the CF is ending.

Ok.

I have looked quickly at it, but I'm not sure that there is an agreement
about what should be done precisely, so the feedback is not clearly
actionable.

--
Fabien.

#38

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#37)

Re: refactoring - share str2*int64 functions

On Thu, Aug 01, 2019 at 09:00:41AM +0200, Fabien COELHO wrote:

I have looked quickly at it, but I'm not sure that there is an agreement
about what should be done precisely, so the feedback is not clearly
actionable.

Per the latest trends, it seems that the input of Andres was kind of
the most interesting pieces. If you don't have room for it, I would
not mind doing the legwork myself.
--
Michael

#39

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#38)

Re: refactoring - share str2*int64 functions

Michaï¿œl-san,

I have looked quickly at it, but I'm not sure that there is an agreement
about what should be done precisely, so the feedback is not clearly
actionable.

Per the latest trends, it seems that the input of Andres was kind of
the most interesting pieces.

Yes, definitely. I understood that we want in "string.h" something like
(just the spirit):

typedef enum {
STRTOINT_OK, STRTOINT_RANGE_ERROR, STRTOINT_SYNTAX_ERROR
} strtoint_status;

strtoint_status pg_strtoint64(const char * str, int64 * result);

However there is a contrary objective to have a unified interface,
but there also exists a:

extern uint64 pg_strtouint64(const char *str, char **endptr, int base);

called 3 times, always with base == 10. We have a similar name but a
totally different interface, so basically it would have to be replaced
by something like the first interface.

If you don't have room for it, I would not mind doing the legwork
myself.

I think that it would be quick if the what is clear enough, so I can do
it.

--
Fabien.

#40

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#39)

Re: refactoring - share str2*int64 functions

On Thu, Aug 01, 2019 at 11:34:34AM +0200, Fabien COELHO wrote:

However there is a contrary objective to have a unified interface,
but there also exists a:

extern uint64 pg_strtouint64(const char *str, char **endptr, int base);

called 3 times, always with base == 10. We have a similar name but a totally
different interface, so basically it would have to be replaced
by something like the first interface.

My understanding on this one was to nuke the base argument and unify
the interface with our own, faster routines:
/messages/by-id/20190716201838.rwrd7xzbrybq7dop@alap3.anarazel.de
--
Michael

#41

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#40)

Re: refactoring - share str2*int64 functions

extern uint64 pg_strtouint64(const char *str, char **endptr, int base);

called 3 times, always with base == 10. We have a similar name but a totally
different interface, so basically it would have to be replaced
by something like the first interface.

My understanding on this one was to nuke the base argument and unify
the interface with our own, faster routines:
/messages/by-id/20190716201838.rwrd7xzbrybq7dop@alap3.anarazel.de

Ok, so there is an agreement on reworking the unsigned function. I missed
this bit.

So I'll set out to write and use "pg_strtou?int64", i.e. 2 functions, and
then possibly generalize to lower sizes, 32, 16, depending on what is
actually needed.

--
Fabien.

#42

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#41)

Re: refactoring - share str2*int64 functions

Hi Fabien,

On Thu, Aug 01, 2019 at 04:48:35PM +0200, Fabien COELHO wrote:

Ok, so there is an agreement on reworking the unsigned function. I missed
this bit.

So I'll set out to write and use "pg_strtou?int64", i.e. 2 functions, and
then possibly generalize to lower sizes, 32, 16, depending on what is
actually needed.

I am interested in this patch, and the next commit fest is close by.
Are you working on an updated version? If not, do you mind if I work
on it and post a new version by the beginning of next week based on
all the feedback gathered?
--
Michael

#43

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#42)

Re: refactoring - share str2*int64 functions

Bonjour Michaï¿œl,

So I'll set out to write and use "pg_strtou?int64", i.e. 2 functions, and
then possibly generalize to lower sizes, 32, 16, depending on what is
actually needed.

I am interested in this patch, and the next commit fest is close by.
Are you working on an updated version? If not, do you mind if I work
on it and post a new version by the beginning of next week based on
all the feedback gathered?

I have started to do something, and I can spend some time on that this
week, but I'm pretty unclear about what exactly should be done.

The error returning stuff is simple enough, but I'm unclear about what to
do with pg_uint64, which has a totally different signature. Should it be
aligned?

--
Fabien Coelho - CRI, MINES ParisTech

#44

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#43)

Re: refactoring - share str2*int64 functions

On Mon, Aug 26, 2019 at 11:05:55AM +0200, Fabien COELHO wrote:

I have started to do something, and I can spend some time on that this week,
but I'm pretty unclear about what exactly should be done.

Thanks.

The error returning stuff is simple enough, but I'm unclear about what to do
with pg_uint64, which has a totally different signature. Should it be
aligned?

I am not sure what you mean with aligned here. If you mean
consistent, getting into a state where we have all functions for all
three sizes, signed and unsigned, would be nice.
--
Michael

#45

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#44)

Re: refactoring - share str2*int64 functions

Bonjour Michaï¿œl,

The error returning stuff is simple enough, but I'm unclear about what to do
with pg_uint64, which has a totally different signature. Should it be
aligned?

I am not sure what you mean with aligned here.

I meant same signature.

If you mean consistent, getting into a state where we have all functions
for all three sizes, signed and unsigned, would be nice.

Ok, I look into it.

--
Fabien.

#46

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#42)

1 attachment(s)

Re: refactoring - share str2*int64 functions

Bonjour Michaï¿œl,

So I'll set out to write and use "pg_strtou?int64", i.e. 2 functions, and
then possibly generalize to lower sizes, 32, 16, depending on what is
actually needed.

I am interested in this patch, and the next commit fest is close by.
Are you working on an updated version? If not, do you mind if I work
on it and post a new version by the beginning of next week based on
all the feedback gathered?

Here is an updated patch for the u?int64 conversion functions.

I have taken the liberty to optimize the existing int64 function by
removing spurious tests. I have not created uint64 specific inlined
overflow functions.

If it looks ok, a separate patch could address the 32 & 16 versions.

--
Fabien.

Attachments:

str2int-5.patchtext/x-diff; name=str2int-5.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..28891ba337 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -79,6 +79,7 @@
 #include "utils/builtins.h"
 #include "utils/hashutils.h"
 #include "utils/memutils.h"
+#include "common/string.h"
 
 PG_MODULE_MAGIC;
 
@@ -1022,7 +1023,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			(void) pg_strtouint64(completionTag + 5, &rows);
 		else
 			rows = 0;
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
 					else
 					{
 						/*
@@ -2361,8 +2361,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					(void) pg_strtouint64(completionTag + 5, &_SPI_current->processed);
 				}
 			}
 
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..17cde42b4d 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..4595c35875 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == strtoint_ok)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..b364a41fb5 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "fmgr.h"
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (unlikely(pg_strtoint64(strVal(defel->arg), &parsed) != strtoint_ok))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b3115e4bea..05bbd2a006 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..5782ea7cab 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,87 +47,32 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
+static bool
+scanint8(const char * str, int64 * result)
 {
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
+	strtoint_status stat = pg_strtoint64(str, result);
 
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
+	if (likely(stat == strtoint_ok))
+		return true;
+	else if (stat == strtoint_range_error)
 	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("value \"%s\" is out of range for type %s",
 						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
+		return false;
+	}
+	else if (stat == strtoint_syntax_error)
+	{
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 				 errmsg("invalid input syntax for type %s: \"%s\"",
 						"bigint", str)));
-	return false;
+		return false;
+	}
+	else
+		/* cannot get here */
+		Assert(0);
 }
 
 /* int8in()
@@ -138,7 +83,7 @@ int8in(PG_FUNCTION_ARGS)
 	char	   *str = PG_GETARG_CSTRING(0);
 	int64		result;
 
-	(void) scanint8(str, false, &result);
+	(void) scanint8(str, &result);
 	PG_RETURN_INT64(result);
 }
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f5d488c569 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..11e211f294 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (unlikely(pg_strtoint64(yytext, &yylval->ival) != strtoint_ok))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 570cf3306a..d1cfa07e87 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -670,8 +671,8 @@ is_an_int(const char *str)
 {
 	const char *ptr = str;
 
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
+	/* skip leading spaces; cast is consistent with pg_strtoint64 */
+	while (isspace((unsigned char) *ptr))
 		ptr++;
 
 	/* skip sign */
@@ -679,101 +680,21 @@ is_an_int(const char *str)
 		ptr++;
 
 	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
+	if (!isdigit((unsigned char) *ptr++))
 		return false;
 
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
+	/* eat all other digits */
+	while (isdigit((unsigned char) *ptr))
+		ptr++;
+
+	/* allow trailing spaces */
+	while (isspace((unsigned char) *ptr))
 		ptr++;
 
 	/* must have reached end of string */
 	return *ptr == '\0';
 }
 
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1283,6 +1204,22 @@ getVariable(CState *st, char *name)
 	return var->svalue;
 }
 
+static bool
+str2int64(const char *str, int64 *val)
+{
+	strtoint_status		stat = pg_strtoint64(str, val);
+
+	if (likely(stat == strtoint_ok))
+		return true;
+
+	if (stat == strtoint_range_error)
+		fprintf(stderr, "value \"%s\" is out of range for type %s\n", str, "int64");
+	else if (stat == strtoint_syntax_error)
+		fprintf(stderr, "invalid input syntax for type %s: \"%s\"", "int64", str);
+
+	return false;
+}
+
 /* Try to convert variable to a value; return false on failure */
 static bool
 makeVariableValue(Variable *var)
@@ -1325,7 +1262,7 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64		iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!str2int64(var->svalue, &iv))
 			return false;
 
 		setIntValue(&var->value, iv);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..3385106f60 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,132 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return strtoint_syntax_error;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return strtoint_range_error;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return strtoint_syntax_error;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return strtoint_range_error;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return strtoint_ok;
+}
+
+/*
+ * pg_strtouint64 -- convert a string to unsigned 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return strtoint_syntax_error;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return strtoint_syntax_error;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		uint64		next;
+
+		/* should we add pg_{mul,add}_u64_overflow variants? */
+
+		/* catch mul overflow: MAX_UINT64 / 10 == 1844674407370955161 */
+		if (unlikely(tmp > 1844674407370955161))
+			return strtoint_range_error;
+
+		next = 10 * tmp + (*ptr++ - '0');;
+
+		/* catch add overflow */
+		if (unlikely(next < tmp))
+			return strtoint_range_error;
+
+		tmp = next;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return strtoint_syntax_error;
+
+	*result = tmp;
+	return strtoint_ok;
+}
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..9b58305319 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,19 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
+
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+
+typedef enum {
+	strtoint_ok,
+	strtoint_syntax_error,
+	strtoint_range_error
+} strtoint_status;
+
+extern strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..8a0c2a5c48 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#47

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#46)

Re: refactoring - share str2*int64 functions

On Wed, Aug 28, 2019 at 08:51:29AM +0200, Fabien COELHO wrote:

Here is an updated patch for the u?int64 conversion functions.

Thanks!

I have taken the liberty to optimize the existing int64 function by removing
spurious tests.

Which are?

I have not created uint64 specific inlined overflow functions.

Why? There is a comment below ;p

If it looks ok, a separate patch could address the 32 & 16 versions.

I am surprised to see a negative diff actually just by doing that
(adding the 32 and 16 parts will add much more code of course). At
quick glance, I think that this is on the right track. Some comments
I have on the top of my mind:
- It would me good to have the unsigned equivalents of
pg_mul_s64_overflow, etc. These are simple enough, and per the
feedback from Andres they could live in common/int.h.
- It is more consistent to use upper-case statuses in the enum
strtoint_status. Could it be renamed to pg_strtoint_status?
--
Michael

#48

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#47)

1 attachment(s)

Re: refactoring - share str2*int64 functions

Michaël,

I have taken the liberty to optimize the existing int64 function by removing
spurious tests.

Which are?

- *ptr && WHATEVER(*ptr)
*ptr is redundant, WHATEVER yields false on '\0', and it costs on each
char but at the end. It might be debatable in some places, e.g. it is
likely that there are no spaces in the string, but likely that there are
more than one digit.

If you want all/some *ptr added back, no problem.

- isdigit repeated on if and following while, used if/do-while instead.

I have not created uint64 specific inlined overflow functions.

Why? There is a comment below ;p

See comment about comment below:-)

If it looks ok, a separate patch could address the 32 & 16 versions.

I am surprised to see a negative diff

Is it? Long duplicate functions are factored out (this was my initial
intent), one file is removed…

actually just by doing that (adding the 32 and 16 parts will add much
more code of course). At quick glance, I think that this is on the
right track. Some comments I have on the top of my mind:

- It would me good to have the unsigned equivalents of
pg_mul_s64_overflow, etc. These are simple enough,

Hmmm. Have you looked at the fallback cases when the corresponding
builtins are not available?

I'm unsure of a reliable way to detect a generic unsigned int overflow
without expensive dividing back and having to care about zero…

So I was pretty happy with my two discreet, small and efficient tests.

and per the feedback from Andres they could live in common/int.h.

Could be, however "string.c" already contains a string to int conversion
function, so I put them together. Probably this function should be
removed in the end, though.

- It is more consistent to use upper-case statuses in the enum
strtoint_status.

I thought of that, but first enum I found used lower case, so it did not
seem obvious that pg style was to use upper case. Indeed, some enum
constants use upper cases.

Could it be renamed to pg_strtoint_status?

Sure. I also prefixed the enum constants for consistency.

Attached patch uses a prefix and uppers constants. Waiting for further
input about other comments.

--
Fabien.

Attachments:

str2int-6.patchtext/x-diff; name=str2int-6.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..28891ba337 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -79,6 +79,7 @@
 #include "utils/builtins.h"
 #include "utils/hashutils.h"
 #include "utils/memutils.h"
+#include "common/string.h"
 
 PG_MODULE_MAGIC;
 
@@ -1022,7 +1023,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			(void) pg_strtouint64(completionTag + 5, &rows);
 		else
 			rows = 0;
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
 					else
 					{
 						/*
@@ -2361,8 +2361,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					(void) pg_strtouint64(completionTag + 5, &_SPI_current->processed);
 				}
 			}
 
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..17cde42b4d 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..8af5e083ee 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..e58dbe2fd0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "fmgr.h"
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (unlikely(pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b3115e4bea..05bbd2a006 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..0152fc9dbd 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,87 +47,32 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
+static bool
+scanint8(const char * str, int64 * result)
 {
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
+	pg_strtoint_status stat = pg_strtoint64(str, result);
 
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
+	if (likely(stat == PG_STRTOINT_OK))
+		return true;
+	else if (stat == PG_STRTOINT_RANGE_ERROR)
 	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("value \"%s\" is out of range for type %s",
 						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
+		return false;
+	}
+	else if (stat == PG_STRTOINT_SYNTAX_ERROR)
+	{
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 				 errmsg("invalid input syntax for type %s: \"%s\"",
 						"bigint", str)));
-	return false;
+		return false;
+	}
+	else
+		/* cannot get here */
+		Assert(0);
 }
 
 /* int8in()
@@ -138,7 +83,7 @@ int8in(PG_FUNCTION_ARGS)
 	char	   *str = PG_GETARG_CSTRING(0);
 	int64		result;
 
-	(void) scanint8(str, false, &result);
+	(void) scanint8(str, &result);
 	PG_RETURN_INT64(result);
 }
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f5d488c569 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..99db6c72eb 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (unlikely(pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 570cf3306a..c51e101b54 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -670,8 +671,8 @@ is_an_int(const char *str)
 {
 	const char *ptr = str;
 
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
+	/* skip leading spaces; cast is consistent with pg_strtoint64 */
+	while (isspace((unsigned char) *ptr))
 		ptr++;
 
 	/* skip sign */
@@ -679,101 +680,21 @@ is_an_int(const char *str)
 		ptr++;
 
 	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
+	if (!isdigit((unsigned char) *ptr++))
 		return false;
 
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
+	/* eat all other digits */
+	while (isdigit((unsigned char) *ptr))
+		ptr++;
+
+	/* allow trailing spaces */
+	while (isspace((unsigned char) *ptr))
 		ptr++;
 
 	/* must have reached end of string */
 	return *ptr == '\0';
 }
 
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1283,6 +1204,22 @@ getVariable(CState *st, char *name)
 	return var->svalue;
 }
 
+static bool
+str2int64(const char *str, int64 *val)
+{
+	pg_strtoint_status		stat = pg_strtoint64(str, val);
+
+	if (likely(stat == PG_STRTOINT_OK))
+		return true;
+
+	if (stat == PG_STRTOINT_RANGE_ERROR)
+		fprintf(stderr, "value \"%s\" is out of range for type %s\n", str, "int64");
+	else if (stat == PG_STRTOINT_SYNTAX_ERROR)
+		fprintf(stderr, "invalid input syntax for type %s: \"%s\"", "int64", str);
+
+	return false;
+}
+
 /* Try to convert variable to a value; return false on failure */
 static bool
 makeVariableValue(Variable *var)
@@ -1325,7 +1262,7 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64		iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!str2int64(var->svalue, &iv))
 			return false;
 
 		setIntValue(&var->value, iv);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..da0c5ee43d 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,132 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64 -- convert a string to unsigned 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		uint64		next;
+
+		/* should we add pg_{mul,add}_u64_overflow variants? */
+
+		/* catch mul overflow: MAX_UINT64 / 10 == 1844674407370955161 */
+		if (unlikely(tmp > 1844674407370955161))
+			return PG_STRTOINT_RANGE_ERROR;
+
+		next = 10 * tmp + (*ptr++ - '0');;
+
+		/* catch add overflow */
+		if (unlikely(next < tmp))
+			return PG_STRTOINT_RANGE_ERROR;
+
+		tmp = next;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..9a8277a670 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,19 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
+
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+
+typedef enum {
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..8a0c2a5c48 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#49

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#48)

Re: refactoring - share str2*int64 functions

On Wed, Aug 28, 2019 at 09:50:44AM +0200, Fabien COELHO wrote:

- *ptr && WHATEVER(*ptr)
*ptr is redundant, WHATEVER yields false on '\0', and it costs on each
char but at the end. It might be debatable in some places, e.g. it is
likely that there are no spaces in the string, but likely that there are
more than one digit.

Still this makes the checks less robust?

If you want all/some *ptr added back, no problem.

- isdigit repeated on if and following while, used if/do-while instead.

I see, you don't check twice if the first character is a digit this
way.

Hmmm. Have you looked at the fallback cases when the corresponding builtins
are not available?

I'm unsure of a reliable way to detect a generic unsigned int overflow
without expensive dividing back and having to care about zero…

Mr Freund has mentioned that here:
/messages/by-id/20190717184820.iqz7schxdbucmdmu@alap3.anarazel.de

So I was pretty happy with my two discreet, small and efficient tests.

That's also a matter of code and interface consistency IMHO.
--
Michael

#50

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#49)

1 attachment(s)

Re: refactoring - share str2*int64 functions

Bonjour Michaël,

- *ptr && WHATEVER(*ptr)
*ptr is redundant, WHATEVER yields false on '\0', and it costs on each
char but at the end. It might be debatable in some places, e.g. it is
likely that there are no spaces in the string, but likely that there are
more than one digit.

Still this makes the checks less robust?

I do not see any downside, but maybe I lack imagination.

On my laptop isWHATEVER is implementd through an array mapping characters
to a bitfield saying whether each char is WHATEVER, depending on the bit.
This array is well defined for index 0 ('\0').

If an implementation is based on comparisons, for isdigit it would be:

c >= '0' && c <= '9'

Then checking c != '\0' is redundant with c >= '0'.

Given the way the character checks are used in the function, we do not go
beyond the end of the string because we only proceed further when a
character is actually recognized, else we return.

So I cannot see any robustness issue, just a few cycles saved.

Hmmm. Have you looked at the fallback cases when the corresponding builtins
are not available?

I'm unsure of a reliable way to detect a generic unsigned int overflow
without expensive dividing back and having to care about zero…

Mr Freund has mentioned that here:
/messages/by-id/20190717184820.iqz7schxdbucmdmu@alap3.anarazel.de

Yep, that is what I mean by expensive: there is an integer division, which
is avoided if b is known to be 10, hence is not zero and the limit value
can be checked directly on the input without having to perform a division
each time.

So I was pretty happy with my two discreet, small and efficient tests.

That's also a matter of code and interface consistency IMHO.

Possibly.

ISTM that part of the motivation is to reduce costs for heavily used
conversions, eg on COPY. Function "scanf" is overly expensive because it
has to interpret its format, and we have to check for overflows…

Anyway, if we assume that the builtins exist and rely on efficient
hardware check, maybe we do not care about the fallback cases which would
just be slow but never executed.

Note that all this is moot, as all instances of string to uint64
conversion do not check for errors.

Attached v7 does create uint64 overflow inline functions. The stuff yet is
not moved to "common/int.c", a file which does not exists, even if
"common/int.h" does.

--
Fabien.

Attachments:

str2int-7.patchtext/x-diff; name=str2int-7.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..28891ba337 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -79,6 +79,7 @@
 #include "utils/builtins.h"
 #include "utils/hashutils.h"
 #include "utils/memutils.h"
+#include "common/string.h"
 
 PG_MODULE_MAGIC;
 
@@ -1022,7 +1023,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			(void) pg_strtouint64(completionTag + 5, &rows);
 		else
 			rows = 0;
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
 					else
 					{
 						/*
@@ -2361,8 +2361,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					(void) pg_strtouint64(completionTag + 5, &_SPI_current->processed);
 				}
 			}
 
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..17cde42b4d 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..8af5e083ee 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..e58dbe2fd0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "fmgr.h"
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (unlikely(pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b3115e4bea..05bbd2a006 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..0152fc9dbd 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,87 +47,32 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
+static bool
+scanint8(const char * str, int64 * result)
 {
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
+	pg_strtoint_status stat = pg_strtoint64(str, result);
 
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
+	if (likely(stat == PG_STRTOINT_OK))
+		return true;
+	else if (stat == PG_STRTOINT_RANGE_ERROR)
 	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("value \"%s\" is out of range for type %s",
 						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
+		return false;
+	}
+	else if (stat == PG_STRTOINT_SYNTAX_ERROR)
+	{
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 				 errmsg("invalid input syntax for type %s: \"%s\"",
 						"bigint", str)));
-	return false;
+		return false;
+	}
+	else
+		/* cannot get here */
+		Assert(0);
 }
 
 /* int8in()
@@ -138,7 +83,7 @@ int8in(PG_FUNCTION_ARGS)
 	char	   *str = PG_GETARG_CSTRING(0);
 	int64		result;
 
-	(void) scanint8(str, false, &result);
+	(void) scanint8(str, &result);
 	PG_RETURN_INT64(result);
 }
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f5d488c569 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..99db6c72eb 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (unlikely(pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 570cf3306a..a6fe596893 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -670,8 +671,8 @@ is_an_int(const char *str)
 {
 	const char *ptr = str;
 
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
+	/* skip leading spaces; cast is consistent with pg_strtoint64 */
+	while (isspace((unsigned char) *ptr))
 		ptr++;
 
 	/* skip sign */
@@ -679,101 +680,21 @@ is_an_int(const char *str)
 		ptr++;
 
 	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
+	if (!isdigit((unsigned char) *ptr++))
 		return false;
 
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
+	/* eat all other digits */
+	while (isdigit((unsigned char) *ptr))
+		ptr++;
+
+	/* allow trailing spaces */
+	while (isspace((unsigned char) *ptr))
 		ptr++;
 
 	/* must have reached end of string */
 	return *ptr == '\0';
 }
 
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1283,6 +1204,24 @@ getVariable(CState *st, char *name)
 	return var->svalue;
 }
 
+static bool
+str2int64(const char *str, int64 *val)
+{
+	pg_strtoint_status		stat = pg_strtoint64(str, val);
+
+	if (likely(stat == PG_STRTOINT_OK))
+		return true;
+
+	if (stat == PG_STRTOINT_RANGE_ERROR)
+		fprintf(stderr, "value \"%s\" is out of range for type %s\n", str, "int64");
+	else if (stat == PG_STRTOINT_SYNTAX_ERROR)
+		fprintf(stderr, "invalid input syntax for type %s: \"%s\"", "int64", str);
+	else
+		Assert(0); /* cannot happen */
+
+	return false;
+}
+
 /* Try to convert variable to a value; return false on failure */
 static bool
 makeVariableValue(Variable *var)
@@ -1325,7 +1264,7 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64		iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!str2int64(var->svalue, &iv))
 			return false;
 
 		setIntValue(&var->value, iv);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..5487dfbb8b 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,122 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64 -- convert a string to unsigned 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		uint64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/int.h b/src/include/common/int.h
index d754798543..c52b446e98 100644
--- a/src/include/common/int.h
+++ b/src/include/common/int.h
@@ -270,4 +270,94 @@ pg_mul_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
+/*
+ * If a + b overflows, return true, otherwise store the result of a + b into
+ * *result. The content of *result is implementation defined in case of
+ * overflow.
+ */
+static inline bool
+pg_add_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_add_overflow(a, b, result);
+#elif defined(HAVE_INT128)
+	int128		res = (int128) a + (int128) b;
+
+	if (unlikely(res > PG_UINT64_MAX))
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint64) res;
+	return false;
+#else
+	uint64		res = a + b;
+
+	if (unlikely(res < a))
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+/*
+ * If a - b overflows, return true, otherwise store the result of a - b into
+ * *result. The content of *result is implementation defined in case of
+ * overflow.
+ */
+static inline bool
+pg_sub_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_sub_overflow(a, b, result);
+#else
+	/* check for underflow, really */
+	if (b > a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = a - b;
+	return false;
+#endif
+}
+
+/*
+ * If a * b overflows, return true, otherwise store the result of a * b into
+ * *result. The content of *result is implementation defined in case of
+ * overflow.
+ */
+static inline bool
+pg_mul_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_mul_overflow(a, b, result);
+#elif defined(HAVE_INT128)
+	int128		res = (int128) a * (int128) b;
+
+	/* res < 0 is triggered if we overflow int128 mul */
+	if (unlikely(res > PG_UINT64_MAX || res < 0))
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint64) res;
+	return false;
+#else
+	/* hopefully this quite expensive version is never used */
+	uint64		res = a * b;
+
+	if (unlikely(b != 0 && a != res / b))
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
 #endif							/* COMMON_INT_H */
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..9a8277a670 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,19 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
+
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+
+typedef enum {
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..8a0c2a5c48 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#51

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#50)

2 attachment(s)

Re: refactoring - share str2*int64 functions

On Thu, Aug 29, 2019 at 08:14:54AM +0200, Fabien COELHO wrote:

Attached v7 does create uint64 overflow inline functions. The stuff yet is
not moved to "common/int.c", a file which does not exists, even if
"common/int.h" does.

Thanks for the new version. I have begun reviewing your patch, and
attached is a first cut that I would like to commit separately which
adds all the compatibility overflow routines to int.h for uint16,
uint32 and uint64 with all the fallback implementations (int128-based
method added as well if available). I have also grouped at the top of
the file the comments about each routine's return policy to avoid
duplication. For the fallback implementations of uint64 using int128,
I think that it is cleaner to use uint128 so as there is no need to
check after negative results for multiplications, and I have made the
various expressions consistent for each size.

Attached is a small module called "overflow" with various regression
tests that I used to check each implementation. I don't propose that
for commit as I am not sure if that's worth the extra CPU, so let's
consider it as a toy for now.

What do you think?
--
Michael

Attachments:

overflow.tar.gzapplication/gzipDownload

str2int-overflows-8.patchtext/x-diff; charset=us-asciiDownload

diff --git a/src/include/common/int.h b/src/include/common/int.h
index d754798543..ccef1b4cc9 100644
--- a/src/include/common/int.h
+++ b/src/include/common/int.h
@@ -20,11 +20,26 @@
 #ifndef COMMON_INT_H
 #define COMMON_INT_H
 
-/*
- * If a + b overflows, return true, otherwise store the result of a + b into
- * *result. The content of *result is implementation defined in case of
+
+/*---------
+ * The following guidelines apply to all the routines:
+ * - If a + b overflows, return true, otherwise store the result of a + b
+ * into *result. The content of *result is implementation defined in case of
  * overflow.
+ * - If a - b overflows, return true, otherwise store the result of a - b
+ * into *result. The content of *result is implementation defined in case of
+ * overflow.
+ * - If a * b overflows, return true, otherwise store the result of a * b
+ * into *result. The content of *result is implementation defined in case of
+ * overflow.
+ *---------
  */
+
+/*------------------------------------------------------------------------
+ * Overflow routines for signed integers
+ *------------------------------------------------------------------------
+ */
+
 static inline bool
 pg_add_s16_overflow(int16 a, int16 b, int16 *result)
 {
@@ -43,11 +58,6 @@ pg_add_s16_overflow(int16 a, int16 b, int16 *result)
 #endif
 }
 
-/*
- * If a - b overflows, return true, otherwise store the result of a - b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_sub_s16_overflow(int16 a, int16 b, int16 *result)
 {
@@ -66,11 +76,6 @@ pg_sub_s16_overflow(int16 a, int16 b, int16 *result)
 #endif
 }
 
-/*
- * If a * b overflows, return true, otherwise store the result of a * b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_mul_s16_overflow(int16 a, int16 b, int16 *result)
 {
@@ -89,11 +94,6 @@ pg_mul_s16_overflow(int16 a, int16 b, int16 *result)
 #endif
 }
 
-/*
- * If a + b overflows, return true, otherwise store the result of a + b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_add_s32_overflow(int32 a, int32 b, int32 *result)
 {
@@ -112,11 +112,6 @@ pg_add_s32_overflow(int32 a, int32 b, int32 *result)
 #endif
 }
 
-/*
- * If a - b overflows, return true, otherwise store the result of a - b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_sub_s32_overflow(int32 a, int32 b, int32 *result)
 {
@@ -135,11 +130,6 @@ pg_sub_s32_overflow(int32 a, int32 b, int32 *result)
 #endif
 }
 
-/*
- * If a * b overflows, return true, otherwise store the result of a * b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_mul_s32_overflow(int32 a, int32 b, int32 *result)
 {
@@ -158,11 +148,6 @@ pg_mul_s32_overflow(int32 a, int32 b, int32 *result)
 #endif
 }
 
-/*
- * If a + b overflows, return true, otherwise store the result of a + b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_add_s64_overflow(int64 a, int64 b, int64 *result)
 {
@@ -190,11 +175,6 @@ pg_add_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
-/*
- * If a - b overflows, return true, otherwise store the result of a - b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_sub_s64_overflow(int64 a, int64 b, int64 *result)
 {
@@ -222,11 +202,6 @@ pg_sub_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
-/*
- * If a * b overflows, return true, otherwise store the result of a * b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_mul_s64_overflow(int64 a, int64 b, int64 *result)
 {
@@ -270,4 +245,195 @@ pg_mul_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
+/*------------------------------------------------------------------------
+ * Overflow routines for unsigned integers
+ *------------------------------------------------------------------------
+ */
+
+static inline bool
+pg_add_u16_overflow(uint16 a, uint16 b, uint16 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_add_overflow(a, b, result);
+#else
+	uint16		res = a + b;
+
+	if (res < a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+static inline bool
+pg_sub_u16_overflow(uint16 a, uint16 b, uint16 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_sub_overflow(a, b, result);
+#else
+	if (b > a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = a - b;
+	return false;
+#endif
+}
+
+static inline bool
+pg_mul_u16_overflow(uint16 a, uint16 b, uint16 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_mul_overflow(a, b, result);
+#else
+	uint16		res = a * b;
+
+	if (a != 0 && b != res / a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+static inline bool
+pg_add_u32_overflow(uint32 a, uint32 b, uint32 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_add_overflow(a, b, result);
+#else
+	uint32		res = a + b;
+
+	if (res < a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+static inline bool
+pg_sub_u32_overflow(uint32 a, uint32 b, uint32 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_sub_overflow(a, b, result);
+#else
+	if (b > a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = a - b;
+	return false;
+#endif
+}
+
+static inline bool
+pg_mul_u32_overflow(uint32 a, uint32 b, uint32 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_mul_overflow(a, b, result);
+#else
+	uint32		res = a * b;
+
+	if (a != 0 && b != res / a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+static inline bool
+pg_add_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_add_overflow(a, b, result);
+#elif defined(HAVE_INT128)
+	uint128		res = (uint128) a + (uint128) b;
+
+	if (res > PG_UINT64_MAX)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint64) res;
+	return false;
+#else
+	uint64		res = a + b;
+
+	if (res < a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+static inline bool
+pg_sub_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_sub_overflow(a, b, result);
+#elif defined(HAVE_INT128)
+	uint128		res = (uint128) a - (uint128) b;
+
+	if (res > PG_UINT64_MAX)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint64) res;
+	return false;
+#else
+	if (b > a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = a - b;
+	return false;
+#endif
+}
+
+static inline bool
+pg_mul_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_mul_overflow(a, b, result);
+#elif defined(HAVE_INT128)
+	uint128		res = (uint128) a * (uint128) b;
+
+	if (res > PG_UINT64_MAX)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint64) res;
+	return false;
+#else
+	uint64		res = a * b;
+
+	if (a != 0 && b != res / a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
 #endif							/* COMMON_INT_H */

#52

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#51)

Re: refactoring - share str2*int64 functions

Michaël,

attached is a first cut that I would like to commit separately which
adds all the compatibility overflow routines to int.h for uint16, uint32
and uint64 with all the fallback implementations (int128-based method
added as well if available). I have also grouped at the top of the file
the comments about each routine's return policy to avoid duplication.
For the fallback implementations of uint64 using int128, I think that it
is cleaner to use uint128 so as there is no need to check after negative
results for multiplications, and I have made the various expressions
consistent for each size.

Patch applies cleanly, compiles, "make check" ok, but the added functions
are not used (yet).

I think that factoring out comments is a good move.

For symmetry and efficiency, ISTM that uint16 mul overflow could use
uint32 and uint32 could use uint64, and the division-based method be
dropped in these cases.

Maybe I would add a comment before each new section telling about the
type, eg:

/*
* UINT16
*/
add/sub/mul uint16 functions…

I would tend to commit working solutions per type rather than by
installment, so that at least all added functions are actually used
somewhere, but it does not matter much, really.

I was unsure that having int128 implies uint128 availability, so I did not
use it.

The checking extension is funny, but ISTM that these checks should be (are
already?) included in some standard sql test, which will test the macros
from direct SQL operations:

fabien=# SELECT INT2 '1234512434343';
ERROR: value "1234512434343" is out of range for type smallint

Well, a quick look at "src/test/regress/sql/int2.sql" suggests that
there the existing tests should be extended… :-(

--
Fabien.

#53

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#52)

Re: refactoring - share str2*int64 functions

On Fri, Aug 30, 2019 at 10:06:11AM +0200, Fabien COELHO wrote:

Patch applies cleanly, compiles, "make check" ok, but the added functions
are not used (yet).

Thanks.

I think that factoring out comments is a good move.

For symmetry and efficiency, ISTM that uint16 mul overflow could use uint32
and uint32 could use uint64, and the division-based method be dropped in
these cases.

Yes, the division would be worse than the other. What do you think
about using the previous module I sent and measure how long it takes
to evaluate the overflows in some specific cases N times in loops?

Maybe I would add a comment before each new section telling about the type,
eg:

/*
* UINT16
*/
add/sub/mul uint16 functions.

Let's do as you suggest here.

I would tend to commit working solutions per type rather than by
installment, so that at least all added functions are actually used
somewhere, but it does not matter much, really.

I prefer by section, with testing dedicated to each part properly
done so as we can move to the next parts.

I was unsure that having int128 implies uint128 availability, so I did not
use it.

The recent Ryu-floating point work has begun using them (see f2s.c and
d2s.c).

The checking extension is funny, but ISTM that these checks should be (are
already?) included in some standard sql test, which will test the macros
from direct SQL operations:

Sure. But not for the unsigned part as there are no equivalent
in-core data types, still it is possible to trick things with signed
integer arguments. I found my toy useful to check test all
implementations consistently.

fabien=# SELECT INT2 '1234512434343';
ERROR: value "1234512434343" is out of range for type smallint

Well, a quick look at "src/test/regress/sql/int2.sql" suggests that
there the existing tests should be extended… :-(

We can tackle that separately. -32768 is perfectly legal for
smallint, and the test is wrong here.
--
Michael

#54

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#53)

Re: refactoring - share str2*int64 functions

Michaël,

For symmetry and efficiency, ISTM that uint16 mul overflow could use uint32
and uint32 could use uint64, and the division-based method be dropped in
these cases.

Yes, the division would be worse than the other. What do you think
about using the previous module I sent and measure how long it takes
to evaluate the overflows in some specific cases N times in loops?

Given the overheads of the SQL interpreter, I'm unsure about what you
would measure at the SQL level. You could just write a small standalone C
program to test perf and accuracy. Maybe this is what you have in mind.

I would tend to commit working solutions per type rather than by
installment, so that at least all added functions are actually used
somewhere, but it does not matter much, really.

I prefer by section, with testing dedicated to each part properly
done so as we can move to the next parts.

This suggests that you will test twice: once when adding the inlined
functions, once when calling from SQL.

The checking extension is funny, but ISTM that these checks should be (are
already?) included in some standard sql test, which will test the macros
from direct SQL operations:

Sure. But not for the unsigned part as there are no equivalent
in-core data types,

Yep, it bothered me sometimes, but not enough to propose to add them.

still it is possible to trick things with signed integer arguments.

Is it?

I found my toy useful to check test all implementations consistently.

Ok.

fabien=# SELECT INT2 '1234512434343';
ERROR: value "1234512434343" is out of range for type smallint

Well, a quick look at "src/test/regress/sql/int2.sql" suggests that
there the existing tests should be extended… :-(

We can tackle that separately. -32768 is perfectly legal for
smallint, and the test is wrong here.

Do you mean:

sql> SELECT -32768::INT2;
ERROR: smallint out of range

This is not a negative constant but the reverse of a positive, which is
indeed out of range, although the error message could help more.

sql> SELECT (-32768)::INT2;
-32768 # ok

sql> SELECT INT2 '-32768';
-32768 # ok

--
Fabien.

#55

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#54)

2 attachment(s)

Re: refactoring - share str2*int64 functions

On Fri, Aug 30, 2019 at 04:50:21PM +0200, Fabien COELHO wrote:

Given the overheads of the SQL interpreter, I'm unsure about what you would
measure at the SQL level. You could just write a small standalone C program
to test perf and accuracy. Maybe this is what you have in mind.

After a certain threshold, you can see the difference anyway by paying
once the overhead of the function. See for example the updated module
attached that I used for my tests.

I have been testing the various implementations, and doing 2B
iterations leads to roughly the following with a non-assert, -O2
build using mul_u32:
- __builtin_sub_overflow => 5s
- cast to uint64 => 5.9s
- division => 8s
You are right as well that having symmetry with the signed methods is
much better. In order to see the difference, you can just do that
with the extension attached, after of course hijacking int.h with some
undefs and recompiling the backend and the module:
select pg_overflow_check(10000, 10000, 2000000000, 'uint32', 'mul');

still it is possible to trick things with signed integer arguments.

Is it?

If you pass -1 and then you can fall back to the maximum of each 16,
32 or 64 bits for the unsigned (see the regression tests I added with
the module).

Do you mean:

sql> SELECT -32768::INT2;
ERROR: smallint out of range

You are incorrect here, as the minus sign is ignored by the cast.
This works though:
=# SELECT (-32768)::INT2;
int2
--------
-32768
(1 row)

If you look at int2.sql, we do that:
-- largest and smallest values
INSERT INTO INT2_TBL(f1) VALUES ('32767');
INSERT INTO INT2_TBL(f1) VALUES ('-32767');
That's the part I mean is wrong, as the minimum is actually -32768,
but the test fails to consider that. I'll go fix that after
double-checking other similar tests for int4 and int8.

Attached is an updated patch to complete the work for common/int.h,
with the following changes:
- Changed the multiplication methods for uint16 and uint32 to not be
division-based. uint64 can use that only if int128 exists.
- Added comments on top of each sub-sections for the types checked.

Attached is also an updated version of the module I used to validate
this stuff. Fabien, any thoughts?
--
Michael

Attachments:

overflow.tar.gzapplication/gzipDownload

str2int-overflows-9.patchtext/x-diff; charset=us-asciiDownload

diff --git a/src/include/common/int.h b/src/include/common/int.h
index d754798543..4e1aec5018 100644
--- a/src/include/common/int.h
+++ b/src/include/common/int.h
@@ -20,10 +20,28 @@
 #ifndef COMMON_INT_H
 #define COMMON_INT_H
 
-/*
- * If a + b overflows, return true, otherwise store the result of a + b into
- * *result. The content of *result is implementation defined in case of
+
+/*---------
+ * The following guidelines apply to all the routines:
+ * - If a + b overflows, return true, otherwise store the result of a + b
+ * into *result. The content of *result is implementation defined in case of
  * overflow.
+ * - If a - b overflows, return true, otherwise store the result of a - b
+ * into *result. The content of *result is implementation defined in case of
+ * overflow.
+ * - If a * b overflows, return true, otherwise store the result of a * b
+ * into *result. The content of *result is implementation defined in case of
+ * overflow.
+ *---------
+ */
+
+/*------------------------------------------------------------------------
+ * Overflow routines for signed integers
+ *------------------------------------------------------------------------
+ */
+
+/*
+ * INT16
  */
 static inline bool
 pg_add_s16_overflow(int16 a, int16 b, int16 *result)
@@ -43,11 +61,6 @@ pg_add_s16_overflow(int16 a, int16 b, int16 *result)
 #endif
 }
 
-/*
- * If a - b overflows, return true, otherwise store the result of a - b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_sub_s16_overflow(int16 a, int16 b, int16 *result)
 {
@@ -66,11 +79,6 @@ pg_sub_s16_overflow(int16 a, int16 b, int16 *result)
 #endif
 }
 
-/*
- * If a * b overflows, return true, otherwise store the result of a * b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_mul_s16_overflow(int16 a, int16 b, int16 *result)
 {
@@ -90,9 +98,7 @@ pg_mul_s16_overflow(int16 a, int16 b, int16 *result)
 }
 
 /*
- * If a + b overflows, return true, otherwise store the result of a + b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
+ * INT32
  */
 static inline bool
 pg_add_s32_overflow(int32 a, int32 b, int32 *result)
@@ -112,11 +118,6 @@ pg_add_s32_overflow(int32 a, int32 b, int32 *result)
 #endif
 }
 
-/*
- * If a - b overflows, return true, otherwise store the result of a - b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_sub_s32_overflow(int32 a, int32 b, int32 *result)
 {
@@ -135,11 +136,6 @@ pg_sub_s32_overflow(int32 a, int32 b, int32 *result)
 #endif
 }
 
-/*
- * If a * b overflows, return true, otherwise store the result of a * b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_mul_s32_overflow(int32 a, int32 b, int32 *result)
 {
@@ -159,9 +155,7 @@ pg_mul_s32_overflow(int32 a, int32 b, int32 *result)
 }
 
 /*
- * If a + b overflows, return true, otherwise store the result of a + b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
+ * INT64
  */
 static inline bool
 pg_add_s64_overflow(int64 a, int64 b, int64 *result)
@@ -190,11 +184,6 @@ pg_add_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
-/*
- * If a - b overflows, return true, otherwise store the result of a - b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_sub_s64_overflow(int64 a, int64 b, int64 *result)
 {
@@ -222,11 +211,6 @@ pg_sub_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
-/*
- * If a * b overflows, return true, otherwise store the result of a * b into
- * *result. The content of *result is implementation defined in case of
- * overflow.
- */
 static inline bool
 pg_mul_s64_overflow(int64 a, int64 b, int64 *result)
 {
@@ -270,4 +254,204 @@ pg_mul_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
+/*------------------------------------------------------------------------
+ * Overflow routines for unsigned integers
+ *------------------------------------------------------------------------
+ */
+
+/*
+ * UINT16
+ */
+static inline bool
+pg_add_u16_overflow(uint16 a, uint16 b, uint16 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_add_overflow(a, b, result);
+#else
+	uint16		res = a + b;
+
+	if (res < a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+static inline bool
+pg_sub_u16_overflow(uint16 a, uint16 b, uint16 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_sub_overflow(a, b, result);
+#else
+	if (b > a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = a - b;
+	return false;
+#endif
+}
+
+static inline bool
+pg_mul_u16_overflow(uint16 a, uint16 b, uint16 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_mul_overflow(a, b, result);
+#else
+	uint32		res = (uint32) a * (uint32) b;
+
+	if (res > PG_UINT16_MAX)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint16) res;
+	return false;
+#endif
+}
+
+/*
+ * INT32
+ */
+static inline bool
+pg_add_u32_overflow(uint32 a, uint32 b, uint32 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_add_overflow(a, b, result);
+#else
+	uint32		res = a + b;
+
+	if (res < a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+static inline bool
+pg_sub_u32_overflow(uint32 a, uint32 b, uint32 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_sub_overflow(a, b, result);
+#else
+	if (b > a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = a - b;
+	return false;
+#endif
+}
+
+static inline bool
+pg_mul_u32_overflow(uint32 a, uint32 b, uint32 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_mul_overflow(a, b, result);
+#else
+	uint64		res = (uint64) a * (uint64) b;
+
+	if (res > PG_UINT32_MAX)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint32) res;
+	return false;
+#endif
+}
+
+/*
+ * UINT64
+ */
+static inline bool
+pg_add_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_add_overflow(a, b, result);
+#elif defined(HAVE_INT128)
+	uint128		res = (uint128) a + (uint128) b;
+
+	if (res > PG_UINT64_MAX)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint64) res;
+	return false;
+#else
+	uint64		res = a + b;
+
+	if (res < a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
+static inline bool
+pg_sub_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_sub_overflow(a, b, result);
+#elif defined(HAVE_INT128)
+	uint128		res = (uint128) a - (uint128) b;
+
+	if (res > PG_UINT64_MAX)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint64) res;
+	return false;
+#else
+	if (b > a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = a - b;
+	return false;
+#endif
+}
+
+static inline bool
+pg_mul_u64_overflow(uint64 a, uint64 b, uint64 *result)
+{
+#if defined(HAVE__BUILTIN_OP_OVERFLOW)
+	return __builtin_mul_overflow(a, b, result);
+#elif defined(HAVE_INT128)
+	uint128		res = (uint128) a * (uint128) b;
+
+	if (res > PG_UINT64_MAX)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = (uint64) res;
+	return false;
+#else
+	uint64		res = a * b;
+
+	if (a != 0 && b != res / a)
+	{
+		*result = 0x5EED;		/* to avoid spurious warnings */
+		return true;
+	}
+	*result = res;
+	return false;
+#endif
+}
+
 #endif							/* COMMON_INT_H */

#56

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#55)

Re: refactoring - share str2*int64 functions

Bonjour Michaël,

You are right as well that having symmetry with the signed methods is
much better. In order to see the difference, you can just do that with
the extension attached, after of course hijacking int.h with some undefs
and recompiling the backend and the module: select
pg_overflow_check(10000, 10000, 2000000000, 'uint32', 'mul');

Ok.

still it is possible to trick things with signed integer arguments.

Is it?

If you pass -1 and then you can fall back to the maximum of each 16,
32 or 64 bits for the unsigned (see the regression tests I added with
the module).

Attached is also an updated version of the module I used to validate
this stuff. Fabien, any thoughts?

Patch apply cleanly, compiles, "make check" ok (although changes
are untested).

I would put back unlikely() on overflow tests, as there are indeed
unlikely to occur and it may help some compilers, and cannot be harmful.
It also helps the code reader to know that these path are not expected to
be taken often.

On reflection, I'm not sure that add_u64 and sub_u64 overflow with uint128
are useful. The res < a or b > a tricks should suffice, just like for u16
and u32 cases, and it may cost a little less anyway.

I would suggest keep the overflow extension as "contrib/overflow_test".
For mul tests, I'd suggest not to try only min/max values like add/sub,
but also "standard" multiplications that overflow or not. It would be good
if "make check" could be made to work", for some reason it requires
"installcheck".

I could not test performance directly, loops are optimized out by the
compiler. I added "volatile" on input value declarations to work around
that. On 2B iterations I got on my laptop:

int16: mul = 2770 ms, add = 1830 ms, sub = 1826 ms
int32: mul = 1838 ms, add = 1835 ms, sub = 1840 ms
int64: mul = 1836 ms, add = 1834 ms, sub = 1833 ms

uint16: mul = 3670 ms, add = 1830 ms, sub = 2148 ms
uint32: mul = 2438 ms, add = 1834 ms, sub = 1831 ms
uint64: mul = 2139 ms, add = 1841 ms, sub = 1882 ms

Why int16 mul, uint* mul and uint16 sub are bad is unclear.

With fallback code triggered with:

#undef HAVE__BUILTIN_OP_OVERFLOW

int16: mul = 1433 ms, add = 1424 ms, sub = 1254 ms
int32: mul = 1433 ms, add = 1425 ms, sub = 1443 ms
int64: mul = 1430 ms, add = 1429 ms, sub = 1441 ms

uint16: mul = 1445 ms, add = 1291 ms, sub = 1265 ms
uint32: mul = 1419 ms, add = 1434 ms, sub = 1493 ms
uint64: mul = 1266 ms, add = 1430 ms, sub = 1440 ms

For some unclear reason, 4 tests are significantly faster.

Forcing further down fallback code with:

#undef HAVE_INT128

int64: mul = 1424 ms, add = 1429 ms, sub = 1440 ms
uint64: mul = 24145 ms, add = 1434 ms, sub = 1435 ms

There is no doubt that dividing 64 bits integers is a very bad idea, at
least on my architecture!

Note that checks depends on value, so actual performance may vary
depending on actual val1 and val2 passed. I used 10000 10000 like your
example.

These results are definitely depressing because the fallback code is
nearly twice as fast as the builtin overflow detection version. For the
record: gcc 7.4.0 on ubuntu 18.04 LTS. Not sure what to advise, relying on
the builtin should be the better idea…

--
Fabien.

#57

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#56)

Re: refactoring - share str2*int64 functions

On Sun, Sep 01, 2019 at 01:57:06PM +0200, Fabien COELHO wrote:

I would put back unlikely() on overflow tests, as there are indeed unlikely
to occur and it may help some compilers, and cannot be harmful. It also
helps the code reader to know that these path are not expected to be taken
often.

Hm. I don't agree about that part, and the original signed portions
don't do that. I think that we should let the callers of the routines
decide if a problem is unlikely to happen or not as we do now.

On reflection, I'm not sure that add_u64 and sub_u64 overflow with uint128
are useful. The res < a or b > a tricks should suffice, just like for u16
and u32 cases, and it may cost a little less anyway.

Actually, I agree and this is something I can see as well with some
extra measurements. mul_u64 without int128 is twice slower, while
add_u64 and sub_u64 are 15~20% faster.

I would suggest keep the overflow extension as "contrib/overflow_test". For
mul tests, I'd suggest not to try only min/max values like add/sub, but also
"standard" multiplications that overflow or not. It would be good if "make
check" could be made to work", for some reason it requires "installcheck".

Any extensions out of core can only work with "installcheck", and
check is not supported (see pgxs.mk). I am still not convinced that
this module is worth the extra cycles to justify its existence
though.

There is no doubt that dividing 64 bits integers is a very bad idea, at
least on my architecture!

That's surprising. I cannot reproduce that. Are you sure that you
didn't just undefine HAVE_INT128? This would cause
HAVE__BUILTIN_OP_OVERFLOW to still be active in all the code paths.
Here are a couple of results from my side with this query, FWIW, and
some numbers for all the compile flags (-O2 used):
select pg_overflow_check(10000, 10000, 2000000000, 'XXX', 'XXX');
1) uint16:
1-1) mul:
- built-in: 5.5s
- fallback: 5.5s
1-2) sub:
- built-in: 5.3s
- fallback: 5.4s
1-3) add:
- built-in: 5.3s
- fallback: 6.2s
2) uint32:
2-1) mul:
- built-in: 5.1s
- fallback: 5.9s
2-2) sub:
- built-in: 5.2s
- fallback: 5.4s
2-3) add:
- built-in: 5.2s
- fallback: 6.2s
2) uint64:
2-1) mul:
- built-in: 5.1s
- fallback (with uint128): 8.0s
- fallback (without uint128): 18.1s
2-2) sub:
- built-in: 5.2s
- fallback (with uint128): 7.1s
- fallback (without uint128): 5.5s
2-3) add:
- built-in: 5.2s
- fallback (with uint128): 7.1s
- fallback (without uint128): 6.3s

So, the built-in option is always faster, and keeping the int128 path
if available for the multiplication makes sense, but not for the
subtraction and the addition. I am wondering if we should review
further the signed part for add and sub, but I'd rather not touch it
in this patch.

Note that checks depends on value, so actual performance may vary depending
on actual val1 and val2 passed. I used 10000 10000 like your example.

Sure. Still that offers helpful hints as we do the same operations
for all code paths the same number of times.

If you have done any changes on my previous patch, or if you have a
script to share I could use to attempt to reproduce your results, I
would be happy to do so.

So, do you have more comments?
--
Michael

#58

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#57)

1 attachment(s)

Re: refactoring - share str2*int64 functions

Michaï¿œl,

I would put back unlikely() on overflow tests, as there are indeed unlikely
to occur and it may help some compilers, and cannot be harmful. It also
helps the code reader to know that these path are not expected to be taken
often.

Hm. I don't agree about that part, and the original signed portions
don't do that. I think that we should let the callers of the routines
decide if a problem is unlikely to happen or not as we do now.

Hmmm. Maybe inlining propates them, but otherwise they make sense for a
compiler perspective.

I am still not convinced that this module is worth the extra cycles to
justify its existence though.

They allow to quickly do performance tests, for me it is useful to keep it
around, but you are the committer, you do as you feel.

[...]
There is no doubt that dividing 64 bits integers is a very bad idea, at
least on my architecture!

That's surprising. I cannot reproduce that.

It seems to me that somehow you can, you have a 5 to 18 seconds drop
below. There are actual reasons why some processors are more expensive
than others, it is not just marketing:-)

2-1) mul:
- built-in: 5.1s
- fallback (with uint128): 8.0s
- fallback (without uint128): 18.1s

So, the built-in option is always faster, and keeping the int128 path
if available for the multiplication makes sense, but not for the
subtraction and the addition.

Yep.

I am wondering if we should review further the signed part for add and
sub, but I'd rather not touch it in this patch.

The signed overflows are trickier even, I have not paid attention to the
fallback code. I agree that it is better left untouched for know.

If you have done any changes on my previous patch, or if you have a
script to share I could use to attempt to reproduce your results, I
would be happy to do so.

Hmmm. I did manual tests really. Attached a psql script replicating them.

# with builtin overflow detection
sh> psql < oc.sql
NOTICE: int 16 mul: 00:00:02.747269 # slow
NOTICE: int 16 add: 00:00:01.83281
NOTICE: int 16 sub: 00:00:01.8501
NOTICE: uint 16 mul: 00:00:03.68362 # slower
NOTICE: uint 16 add: 00:00:01.835294
NOTICE: uint 16 sub: 00:00:02.136895 # slow
NOTICE: int 32 mul: 00:00:01.828065
NOTICE: int 32 add: 00:00:01.840269
NOTICE: int 32 sub: 00:00:01.843557
NOTICE: uint 32 mul: 00:00:02.447052 # slow
NOTICE: uint 32 add: 00:00:01.849899
NOTICE: uint 32 sub: 00:00:01.840773
NOTICE: int 64 mul: 00:00:01.839051
NOTICE: int 64 add: 00:00:01.839065
NOTICE: int 64 sub: 00:00:01.838599
NOTICE: uint 64 mul: 00:00:02.161346 # slow
NOTICE: uint 64 add: 00:00:01.839404
NOTICE: uint 64 sub: 00:00:01.838549
DO

So, do you have more comments?

No other comments.

--
Fabien.

#59

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#58)

Re: refactoring - share str2*int64 functions

On Sun, Sep 01, 2019 at 08:07:06PM +0200, Fabien COELHO wrote:

They allow to quickly do performance tests, for me it is useful to keep it
around, but you are the committer, you do as you feel.

If there are more voices for having that in core, we could consider
it. For now I have added that into my own plugin repository with all
the functions discussed on this thread:
https://github.com/michaelpq/pg_plugins/

The signed overflows are trickier even, I have not paid attention to the
fallback code. I agree that it is better left untouched for know.

Thanks.

Hmmm. I did manual tests really. Attached a psql script replicating them.

# with builtin overflow detection
sh> psql < oc.sql
NOTICE: int 16 mul: 00:00:02.747269 # slow
NOTICE: int 16 add: 00:00:01.83281
NOTICE: int 16 sub: 00:00:01.8501
NOTICE: uint 16 mul: 00:00:03.68362 # slower
NOTICE: uint 16 add: 00:00:01.835294
NOTICE: uint 16 sub: 00:00:02.136895 # slow
NOTICE: int 32 mul: 00:00:01.828065
NOTICE: int 32 add: 00:00:01.840269
NOTICE: int 32 sub: 00:00:01.843557
NOTICE: uint 32 mul: 00:00:02.447052 # slow
NOTICE: uint 32 add: 00:00:01.849899
NOTICE: uint 32 sub: 00:00:01.840773
NOTICE: int 64 mul: 00:00:01.839051
NOTICE: int 64 add: 00:00:01.839065
NOTICE: int 64 sub: 00:00:01.838599
NOTICE: uint 64 mul: 00:00:02.161346 # slow
NOTICE: uint 64 add: 00:00:01.839404
NOTICE: uint 64 sub: 00:00:01.838549

Actually that's much faster than a single core on my debian SID with
gcc 9.2.1.

Here are more results from me:
Built-in undef Built-in
int16 mul 00:00:05.425207 00:00:05.634417
int16 add 00:00:05.389738 00:00:06.38885
int16 sub 00:00:05.446529 00:00:06.39569
uint16 mul 00:00:05.499066 00:00:05.541617
uint16 add 00:00:05.281622 00:00:06.252511
uint16 sub 00:00:05.366424 00:00:05.457148
int32 mul 00:00:05.121209 00:00:06.154989
int32 add 00:00:05.228722 00:00:06.344721
int32 sub 00:00:05.237594 00:00:06.323543
uint32 mul 00:00:05.126339 00:00:05.921738
uint32 add 00:00:05.212085 00:00:06.183031
uint32 sub 00:00:05.201884 00:00:05.363667
int64 mul 00:00:05.136129 00:00:06.148101
int64 add 00:00:05.200201 00:00:06.329091
int64 sub 00:00:05.218028 00:00:06.313114
uint64 mul 00:00:05.444733 00:00:08.089742
uint64 add 00:00:05.603978 00:00:06.377753
uint64 sub 00:00:05.544838 00:00:05.490873

This part has been committed, now let's move to the next parts of your
patch.
--
Michael

#60

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#59)

1 attachment(s)

Re: refactoring - share str2*int64 functions

This part has been committed, now let's move to the next parts of your
patch.

Attached a rebased version which implements the int64/uint64 stuff. It is
basically the previous patch without the overflow inlined functions.

--
Fabien Coelho - CRI, MINES ParisTech

Attachments:

str2int-10.patchtext/x-diff; name=str2int-10.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..28891ba337 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -79,6 +79,7 @@
 #include "utils/builtins.h"
 #include "utils/hashutils.h"
 #include "utils/memutils.h"
+#include "common/string.h"
 
 PG_MODULE_MAGIC;
 
@@ -1022,7 +1023,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			(void) pg_strtouint64(completionTag + 5, &rows);
 		else
 			rows = 0;
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
 					else
 					{
 						/*
@@ -2361,8 +2361,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					(void) pg_strtouint64(completionTag + 5, &_SPI_current->processed);
 				}
 			}
 
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..17cde42b4d 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..8af5e083ee 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..e58dbe2fd0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "fmgr.h"
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (unlikely(pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b3115e4bea..05bbd2a006 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..0152fc9dbd 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,87 +47,32 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
+static bool
+scanint8(const char * str, int64 * result)
 {
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
+	pg_strtoint_status stat = pg_strtoint64(str, result);
 
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
+	if (likely(stat == PG_STRTOINT_OK))
+		return true;
+	else if (stat == PG_STRTOINT_RANGE_ERROR)
 	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("value \"%s\" is out of range for type %s",
 						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
+		return false;
+	}
+	else if (stat == PG_STRTOINT_SYNTAX_ERROR)
+	{
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 				 errmsg("invalid input syntax for type %s: \"%s\"",
 						"bigint", str)));
-	return false;
+		return false;
+	}
+	else
+		/* cannot get here */
+		Assert(0);
 }
 
 /* int8in()
@@ -138,7 +83,7 @@ int8in(PG_FUNCTION_ARGS)
 	char	   *str = PG_GETARG_CSTRING(0);
 	int64		result;
 
-	(void) scanint8(str, false, &result);
+	(void) scanint8(str, &result);
 	PG_RETURN_INT64(result);
 }
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f5d488c569 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..99db6c72eb 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (unlikely(pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..086d75437c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -670,8 +671,8 @@ is_an_int(const char *str)
 {
 	const char *ptr = str;
 
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
+	/* skip leading spaces; cast is consistent with pg_strtoint64 */
+	while (isspace((unsigned char) *ptr))
 		ptr++;
 
 	/* skip sign */
@@ -679,101 +680,21 @@ is_an_int(const char *str)
 		ptr++;
 
 	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
+	if (!isdigit((unsigned char) *ptr++))
 		return false;
 
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
+	/* eat all other digits */
+	while (isdigit((unsigned char) *ptr))
+		ptr++;
+
+	/* allow trailing spaces */
+	while (isspace((unsigned char) *ptr))
 		ptr++;
 
 	/* must have reached end of string */
 	return *ptr == '\0';
 }
 
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1283,6 +1204,24 @@ getVariable(CState *st, char *name)
 	return var->svalue;
 }
 
+static bool
+str2int64(const char *str, int64 *val)
+{
+	pg_strtoint_status		stat = pg_strtoint64(str, val);
+
+	if (likely(stat == PG_STRTOINT_OK))
+		return true;
+
+	if (stat == PG_STRTOINT_RANGE_ERROR)
+		fprintf(stderr, "value \"%s\" is out of range for type %s\n", str, "int64");
+	else if (stat == PG_STRTOINT_SYNTAX_ERROR)
+		fprintf(stderr, "invalid input syntax for type %s: \"%s\"", "int64", str);
+	else
+		Assert(0); /* cannot happen */
+
+	return false;
+}
+
 /* Try to convert variable to a value; return false on failure */
 static bool
 makeVariableValue(Variable *var)
@@ -1325,7 +1264,7 @@ makeVariableValue(Variable *var)
 		/* if it looks like an int, it must be an int without overflow */
 		int64		iv;
 
-		if (!strtoint64(var->svalue, false, &iv))
+		if (!str2int64(var->svalue, &iv))
 			return false;
 
 		setIntValue(&var->value, iv);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..5487dfbb8b 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,122 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64 -- convert a string to unsigned 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		uint64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..9a8277a670 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,19 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
+
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+
+typedef enum {
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..8a0c2a5c48 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#61

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#60)

Re: refactoring - share str2*int64 functions

On Tue, Sep 03, 2019 at 08:10:37PM +0200, Fabien COELHO wrote:

Attached a rebased version which implements the int64/uint64 stuff. It is
basically the previous patch without the overflow inlined functions.

-     if (!strtoint64(yytext, true, &yylval->ival))
+     if (unlikely(pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK))
It seems to me that we should have unlikely() only within
pg_strtoint64(), pg_strtouint64(), etc.

-   /* skip leading spaces; cast is consistent with strtoint64 */
-   while (*ptr && isspace((unsigned char) *ptr))
+   /* skip leading spaces; cast is consistent with pg_strtoint64 */
+   while (isspace((unsigned char) *ptr))
        ptr++;
What do you think about splitting this part in two?  I would suggest
to do the refactoring in a first patch, and the consider all the
optimizations for the routines you have in mind afterwards.

I think that we don't actually need is_an_int() and str2int64() at all
in pgbench.c as we could just check for the return code of
pg_strtoint64() and switch to the double parsing only if we don't have
PG_STRTOINT_OK.

scanint8() only has one caller in the backend with your patch, and we
don't check after its return result in int8.c, so I would suggest to
move the error handling directly in int8in() and to remove scanint8().

I think that we should also introduce the [u]int[16|32] flavors and
expand them in the code in a single patch, in a way consistent with
what you have done for int64/uint64 as the state that we reach with
the patch is kind of weird as there are existing versions numutils.c.

Have you done some performance testing of the patch? The COPY
bulkload is a good example of that:
/messages/by-id/20190717181428.krqpmduejbqr4m6g@alap3.anarazel.de
--
Michael

#62

andres@anarazel.de

over 6 years ago

In reply to: Fabien COELHO (#60)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-09-03 20:10:37 +0200, Fabien COELHO wrote:

@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
errmsg("conflicting or redundant options")));
protocol_version_given = true;
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (unlikely(pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid proto_version")));

Unexcited about adding unlikely() to any place that's not a hot path -
which this certainly is not.

But I also wonder if we shouldn't just put the branch likelihood of
pg_strtoint64 not failing into one central location. E.g. something like

static inline pg_strtoint_status
pg_strtoint64(const char *str, int64 *result)
{
pg_strtoint_status status;

status = pg_strtoint64_impl(str, result);

likely(status == PG_STRTOINT_OK);

return status;
}

I've not tested this, but IIRC that should be sufficient to propagate
that knowledge up to callers.

+	if (likely(stat == PG_STRTOINT_OK))
+		return true;
+	else if (stat == PG_STRTOINT_RANGE_ERROR)
{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
errmsg("value \"%s\" is out of range for type %s",
str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
+		return false;
+	}
+	else if (stat == PG_STRTOINT_SYNTAX_ERROR)
+	{
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
errmsg("invalid input syntax for type %s: \"%s\"",
"bigint", str)));
-	return false;
+		return false;
+	}
+	else
+		/* cannot get here */
+		Assert(0);

I'd write this as a switch over the enum - that way we'll get a
compile-time warning if we're not handling a value.

+static bool
+str2int64(const char *str, int64 *val)
+{
+	pg_strtoint_status		stat = pg_strtoint64(str, val);
+

I find it weird to have a wrapper that's named 'str2...' that then calls
'strto' to implement itself.

+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));

Hm. If we're concerned about the cost of isdigit etc - and I think
that's reasonable, after looking at their implementation in glibc (it
performs a lookup in a global table to handle potential locale changes)
- I think we ought to just not use the provided isdigit, and probably
not isspace either. This code effectively relies on the input being
ascii anyway, so we don't need any locale specific behaviour afaict
(we'd e.g. get wrong results if isdigit() returned true for something
other than the ascii chars).

To me the generated code looks considerably better if I use something
like

static inline bool
pg_isdigit(char c)
{
return c >= '0' && c <= '9';
}

static inline bool
pg_isspace(char c)
{
return c == ' ';
}

(if we were to actually go for this, we'd probably want to add some docs
that we don't expect EOF, or make the code safe against that).

I've not benchmarked that, but I'd be surprised if it didn't improve
matters.

And once coded using the above, there's no point in the do/while
conversion imo, as any compiler can trivially optimize redundant checks.

+/*
+ * pg_strtouint64 -- convert a string to unsigned 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;

Hm. Seems like that should return PG_STRTOINT_RANGE_ERROR?

+typedef enum {

Please don't define anonyous types (the enum itself, which now is only
reachable via the typedef). Also, there's a missing newline here).

Greetings,

Andres Freund

#63

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#61)

1 attachment(s)

Re: refactoring - share str2*int64 functions

Bonjour Michaï¿œl,

Attached a rebased version which implements the int64/uint64 stuff. It is
basically the previous patch without the overflow inlined functions.
-     if (!strtoint64(yytext, true, &yylval->ival))
+     if (unlikely(pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK))

It seems to me that we should have unlikely() only within
pg_strtoint64(), pg_strtouint64(), etc.

From a compiler perspective, the (un)likely tip is potentially useful on
any test. We know when parsing a that it is very unlikely that the string
conversion would fail, so we can tell that, so that the compiler knows
which branch it should optimize first.

You can argue against that if the functions are inlined, because maybe the
compiler would propagate the information, but for distinct functions
compiled separately the information is useful at each level.

- /* skip leading spaces; cast is consistent with strtoint64 */
- while (*ptr && isspace((unsigned char) *ptr))

+   /* skip leading spaces; cast is consistent with pg_strtoint64 */
+   while (isspace((unsigned char) *ptr))
ptr++;

What do you think about splitting this part in two? I would suggest
to do the refactoring in a first patch, and the consider all the
optimizations for the routines you have in mind afterwards.

I would not bother.

I think that we don't actually need is_an_int() and str2int64() at all
in pgbench.c as we could just check for the return code of
pg_strtoint64() and switch to the double parsing only if we don't have
PG_STRTOINT_OK.

Yep, you are right, now that the conversion functions does not error out a
message, its failure can be used as a test.

The version attached changes slightly the semantics, because on int
overflows a double conversion will be attempted instead of erroring. I do
not think that it is worth the effort of preserving the previous semantic
of erroring.

scanint8() only has one caller in the backend with your patch, and we
don't check after its return result in int8.c, so I would suggest to
move the error handling directly in int8in() and to remove scanint8().

Ok.

I think that we should also introduce the [u]int[16|32] flavors and
expand them in the code in a single patch, in a way consistent with
what you have done for int64/uint64 as the state that we reach with
the patch is kind of weird as there are existing versions numutils.c.

Before dealing with the 16/32 versions, which involve quite a significant
amount of changes, I would want a clear message that "the 64 bit approach"
is the model to follow.

Moreover, I'm unsure how to rename the existing pg_strtoint32 and others
which call ereport, if the name is used for the common error returning
version.

Have you done some performance testing of the patch? The COPY
bulkload is a good example of that:
/messages/by-id/20190717181428.krqpmduejbqr4m6g@alap3.anarazel.de

I have done no such thing for now.

I would not expect any significant performance difference when loading
int8 things because basically scanint8 has just been renamed pg_strtoint64
and made global, and that is more or less all. It might be a little bit
slower because possible the compiler cannot inline the conversion, but on
the other hand, the *likely hints and removed tests may marginaly help
performance. I think that the only way to test performance significantly
would be to write a specific program that loops over a conversion.

--
Fabien.

Attachments:

str2int-11.patchtext/x-diff; name=str2int-11.patchDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..28891ba337 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -79,6 +79,7 @@
 #include "utils/builtins.h"
 #include "utils/hashutils.h"
 #include "utils/memutils.h"
+#include "common/string.h"
 
 PG_MODULE_MAGIC;
 
@@ -1022,7 +1023,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			(void) pg_strtouint64(completionTag + 5, &rows);
 		else
 			rows = 0;
 
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
 					else
 					{
 						/*
@@ -2361,8 +2361,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					(void) pg_strtouint64(completionTag + 5, &_SPI_current->processed);
 				}
 			}
 
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..17cde42b4d 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..8af5e083ee 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..e58dbe2fd0 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "fmgr.h"
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (unlikely(pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK))
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 755ca6e277..f8cc5bc48a 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..d1bc30d9f4 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -18,12 +18,12 @@
 #include <math.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,99 +47,29 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
+/* int8in()
  */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
+Datum
+int8in(PG_FUNCTION_ARGS)
 {
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
+	char			   *str = PG_GETARG_CSTRING(0);
+	int64				result;
+	pg_strtoint_status	stat = pg_strtoint64(str, &result);
 
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
+	if (likely(stat == PG_STRTOINT_OK))
+		PG_RETURN_INT64(result);
+	else if (stat == PG_STRTOINT_RANGE_ERROR)
 		ereport(ERROR,
 				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 				 errmsg("value \"%s\" is out of range for type %s",
 						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
+	else if (stat == PG_STRTOINT_SYNTAX_ERROR)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 				 errmsg("invalid input syntax for type %s: \"%s\"",
 						"bigint", str)));
-	return false;
-}
-
-/* int8in()
- */
-Datum
-int8in(PG_FUNCTION_ARGS)
-{
-	char	   *str = PG_GETARG_CSTRING(0);
-	int64		result;
-
-	(void) scanint8(str, false, &result);
-	PG_RETURN_INT64(result);
+	else
+		Assert(0); /* cannot get here */
 }
 
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f5d488c569 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -543,25 +543,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..99db6c72eb 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (unlikely(pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..6581dc4de4 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -664,116 +665,6 @@ usage(void)
 		   progname, progname);
 }
 
-/* return whether str matches "^\s*[-+]?[0-9]+$" */
-static bool
-is_an_int(const char *str)
-{
-	const char *ptr = str;
-
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* skip sign */
-	if (*ptr == '+' || *ptr == '-')
-		ptr++;
-
-	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
-		return false;
-
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-		ptr++;
-
-	/* must have reached end of string */
-	return *ptr == '\0';
-}
-
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1283,11 +1174,13 @@ getVariable(CState *st, char *name)
 	return var->svalue;
 }
 
+
 /* Try to convert variable to a value; return false on failure */
 static bool
 makeVariableValue(Variable *var)
 {
 	size_t		slen;
+	int64		ival;
 
 	if (var->value.type != PGBT_NO_VALUE)
 		return true;			/* no work */
@@ -1320,15 +1213,16 @@ makeVariableValue(Variable *var)
 	{
 		setBoolValue(&var->value, false);
 	}
-	else if (is_an_int(var->svalue))
+
+	/*
+	 * Attempt an int64 conversion.
+	 *
+	 * A too-large but syntactically correct int64 will be parsed as a double
+	 * below.
+	 */
+	else if (pg_strtoint64(var->svalue, &ival) == PG_STRTOINT_OK)
 	{
-		/* if it looks like an int, it must be an int without overflow */
-		int64		iv;
-
-		if (!strtoint64(var->svalue, false, &iv))
-			return false;
-
-		setIntValue(&var->value, iv);
+		setIntValue(&var->value, ival);
 	}
 	else						/* type should be double */
 	{
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..5487dfbb8b 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -57,6 +58,122 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 	return (int) val;
 }
 
+/*
+ * pg_strtoint64 -- convert a string to 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64 -- convert a string to unsigned 64-bit integer
+ *
+ * The function returns if the conversion failed, or
+ * "*result" is set to the result.
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead */
+	do
+	{
+		uint64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
 
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..9a8277a670 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,19 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
+
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+
+typedef enum {
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..8a0c2a5c48 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -50,7 +50,6 @@ extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#64

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#62)

Re: refactoring - share str2*int64 functions

On Wed, Sep 04, 2019 at 02:08:39AM -0700, Andres Freund wrote:

+static bool
+str2int64(const char *str, int64 *val)
+{
+	pg_strtoint_status		stat = pg_strtoint64(str, val);
+
I find it weird to have a wrapper that's named 'str2...' that then calls
'strto' to implement itself.

It happens that this wrapper in pgbench.c is not actually needed.

Hm. If we're concerned about the cost of isdigit etc - and I think
that's reasonable, after looking at their implementation in glibc (it
performs a lookup in a global table to handle potential locale changes)
- I think we ought to just not use the provided isdigit, and probably
not isspace either. This code effectively relies on the input being
ascii anyway, so we don't need any locale specific behaviour afaict
(we'd e.g. get wrong results if isdigit() returned true for something
other than the ascii chars).

Yeah. It seems to me that we have more optimizations that could come
in line here, and actually we have perhaps more refactoring at hand
with each one of the 6 functions we'd like to add at the end. I had
in mind about first shaping the refactoring patch, consolidating all
the interfaces, and then evaluate the improvements we can come up with
as after the refactoring we'd need to update only common/string.c.

I've not benchmarked that, but I'd be surprised if it didn't improve
matters.

I think that you are right here, there is something to gain. Looking
at their stuff this makes use of __isctype as told by ctype/ctype.h.
--
Michael

#65

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#63)

Re: refactoring - share str2*int64 functions

On Wed, Sep 04, 2019 at 12:49:17PM +0200, Fabien COELHO wrote:

From a compiler perspective, the (un)likely tip is potentially useful on any
test. We know when parsing a that it is very unlikely that the string
conversion would fail, so we can tell that, so that the compiler knows which
branch it should optimize first.

You can argue against that if the functions are inlined, because maybe the
compiler would propagate the information, but for distinct functions
compiled separately the information is useful at each level.

Hmm. There has been an extra recent argument on the matter here:
/messages/by-id/20190904090839.stp3madovtynq3px@alap3.anarazel.de

I am not sure that we should tackle that as part of the first
refactoring though, as what we'd want is first to put all the
interfaces in a single place we can deal with afterwards.

Yep, you are right, now that the conversion functions does not error out a
message, its failure can be used as a test.

The version attached changes slightly the semantics, because on int
overflows a double conversion will be attempted instead of erroring. I do
not think that it is worth the effort of preserving the previous semantic of
erroring.

Yes. I would move things in this direction. I may reconsider this
part again with more testing but switching from one to the other is
simple enough so let's keep the code as you suggest for now.

scanint8() only has one caller in the backend with your patch, and we
don't check after its return result in int8.c, so I would suggest to
move the error handling directly in int8in() and to remove scanint8().

Ok.

As per the extra comments of upthread, this should use a switch
without a default.

Before dealing with the 16/32 versions, which involve quite a significant
amount of changes, I would want a clear message that "the 64 bit approach"
is the model to follow.

Moreover, I'm unsure how to rename the existing pg_strtoint32 and others
which call ereport, if the name is used for the common error returning
version.

Right, there was this part. This brings also the point of having one
interface for the backend as all the error messages for the backend
are actually the same, with the most simple name being that:
pg_strtoint(value, size, error_ok).

This then calls all the sub-routines we have in src/common/. There
were more suggestions here:
/messages/by-id/20190729050539.d7mbjabcrlv7bxc3@alap3.anarazel.de

I would not expect any significant performance difference when loading int8
things because basically scanint8 has just been renamed pg_strtoint64 and
made global, and that is more or less all. It might be a little bit slower
because possible the compiler cannot inline the conversion, but on the other
hand, the *likely hints and removed tests may marginaly help performance. I
think that the only way to test performance significantly would be to write
a specific program that loops over a conversion.

I would suspect a change for pg_strtouint64().
--
Michael

#66

michael@paquier.xyz

over 6 years ago

In reply to: Michael Paquier (#65)

2 attachment(s)

Re: refactoring - share str2*int64 functions

On Thu, Sep 05, 2019 at 03:52:48PM +0900, Michael Paquier wrote:

Right, there was this part. This brings also the point of having one
interface for the backend as all the error messages for the backend
are actually the same, with the most simple name being that:
pg_strtoint(value, size, error_ok).

I have been looking at that for the last couple of days. First, I
have consolidated all the error strings in a single routine like this
one, except that error_ok is not really needed if you take this
approach: callers that don't care about failures could just call the
set of routines in common/string.c and be done with it.

Attached is an update of my little module that I used to check that
the refactoring was done correctly (regression tests attached), it
also includes a couple of routines to check the performance difference
between one approach and the other, with focus on two things:
- Is pg_strtouint64 really improved?
- How much do we lose by moving to a common interface in the backend
with pg_strtoint?

The good news is that removing strtol from pg_strtouint64 really
improves the performance as already reported, and with one billion
calls in a tight loop you see a clear difference:
=# select pg_strtouint64_old_check('10000', 1000000000);
pg_strtouint64_old_check
--------------------------
10000
(1 row)
Time: 15576.539 ms (00:15.577)
=# select pg_strtouint64_new_check('10000', 1000000000);
pg_strtouint64_new_check
--------------------------
10000
(1 row)
Time: 8820.544 ms (00:08.821)

So the new implementation is more than 40% faster with this
micro-benchmark on my Debian box.

The bad news is that a pg_strtoint() wrapper is not a good idea:
=# select pg_strtoint_check('10000', 1000000000, 4);
pg_strtoint_check
-------------------
10000
(1 row)
Time: 11178.101 ms (00:11.178)
=# select pg_strtoint32_check('10000', 1000000000);
pg_strtoint32_check
---------------------
10000
(1 row)
Time: 9252.894 ms (00:09.253)
So trying to consolidate all error messages leads to a 15% hit with
this test, which sucks.

So, it seems to me that if we want to have a consolidation of
everything, we basically need to have the generic error messages for
the backend directly into common/string.c where the routines are
refactored, and I think that we should do the following based on the
patch attached:
- Just remove pg_strtoint(), and move the error generation logic in
common/string.c.
- Add an error_ok flag for all the pg_strto[u]int{16|32|64} routines,
which generates errors only for FRONTEND is not defined.
- Use pg_log_error() for the frontend errors.

If those errors are added everywhere, we would basically have no code
paths in the frontend or the backend (for the unsigned part only)
which use them yet. Another possibility would be to ignore the
error_ok flag in those cases, but that's inconsistent. My take would
be here to have a more generic error message, like that:
"value \"%s\" is out of range for [unsigned] {16|32|64}-bit integer"
"invalid input syntax for [unsigned] {16|32|64}-bit integer: \"%s\"\n"

I do not suggest to change the messages for the backend for signed
entries though, as these are linked to the data types used.

Attached is an update of my toy module, and an updated patch with what
I have done up to now. This stuff already does a lot, so for now I
have not worked on the removal strtoint() in string.c yet. We could
just do that with a follow-up patch and make it use the new conversion
routines once we are sure that they are in a correct shape. As
strtoint() makes use of strtol(), switching to the new routines will
be much faster anyway...

Fabien, any thoughts?
--
Michael

Attachments:

str2int-12.patchtext/x-diff; charset=us-asciiDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..28891ba337 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -79,6 +79,7 @@
 #include "utils/builtins.h"
 #include "utils/hashutils.h"
 #include "utils/memutils.h"
+#include "common/string.h"
 
 PG_MODULE_MAGIC;
 
@@ -1022,7 +1023,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			(void) pg_strtouint64(completionTag + 5, &rows);
 		else
 			rows = 0;
 
diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index adf0490f85..9583845d7e 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -306,7 +306,7 @@ check_foreign_key(PG_FUNCTION_ARGS)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: too short %d (< 5) list of arguments", nargs);
 
-	nrefs = pg_strtoint32(args[0]);
+	nrefs = pg_strtoint(args[0], 4);
 	if (nrefs < 1)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: %d (< 1) number of references specified", nrefs);
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
 					else
 					{
 						/*
@@ -2361,8 +2361,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					(void) pg_strtouint64(completionTag + 5, &_SPI_current->processed);
 				}
 			}
 
diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
index a9bd47d937..593a5ef54e 100644
--- a/src/backend/libpq/pqmq.c
+++ b/src/backend/libpq/pqmq.c
@@ -286,10 +286,10 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->hint = pstrdup(value);
 				break;
 			case PG_DIAG_STATEMENT_POSITION:
-				edata->cursorpos = pg_strtoint32(value);
+				edata->cursorpos = pg_strtoint(value, 4);
 				break;
 			case PG_DIAG_INTERNAL_POSITION:
-				edata->internalpos = pg_strtoint32(value);
+				edata->internalpos = pg_strtoint(value, 4);
 				break;
 			case PG_DIAG_INTERNAL_QUERY:
 				edata->internalquery = pstrdup(value);
@@ -316,7 +316,7 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->filename = pstrdup(value);
 				break;
 			case PG_DIAG_SOURCE_LINE:
-				edata->lineno = pg_strtoint32(value);
+				edata->lineno = pg_strtoint(value, 4);
 				break;
 			case PG_DIAG_SOURCE_FUNCTION:
 				edata->funcname = pstrdup(value);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..17cde42b4d 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -34,6 +34,7 @@
 
 #include "fmgr.h"
 #include "miscadmin.h"
+#include "common/string.h"
 #include "nodes/extensible.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..8af5e083ee 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -25,10 +25,10 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
+#include "common/string.h"
 
 
 static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a920..de52f538a7 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -341,7 +341,7 @@ libpqrcv_identify_system(WalReceiverConn *conn, TimeLineID *primary_tli)
 						   ntuples, nfields, 3, 1)));
 	}
 	primary_sysid = pstrdup(PQgetvalue(res, 0, 0));
-	*primary_tli = pg_strtoint32(PQgetvalue(res, 0, 1));
+	*primary_tli = pg_strtoint(PQgetvalue(res, 0, 1), 4);
 	PQclear(res);
 
 	return primary_sysid;
@@ -487,7 +487,7 @@ libpqrcv_endstreaming(WalReceiverConn *conn, TimeLineID *next_tli)
 		if (PQnfields(res) < 2 || PQntuples(res) != 1)
 			ereport(ERROR,
 					(errmsg("unexpected result set after end-of-streaming")));
-		*next_tli = pg_strtoint32(PQgetvalue(res, 0, 0));
+		*next_tli = pg_strtoint(PQgetvalue(res, 0, 0), 4);
 		PQclear(res);
 
 		/* the result set should be followed by CommandComplete */
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..fa41edaa96 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -12,6 +12,7 @@
  */
 #include "postgres.h"
 
+#include "common/string.h"
 #include "catalog/pg_publication.h"
 
 #include "fmgr.h"
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK)
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/tsearch/wparser_def.c b/src/backend/tsearch/wparser_def.c
index f18e761af8..a7b80a83d4 100644
--- a/src/backend/tsearch/wparser_def.c
+++ b/src/backend/tsearch/wparser_def.c
@@ -2460,13 +2460,13 @@ prsd_headline(PG_FUNCTION_ARGS)
 		char	   *val = defGetString(defel);
 
 		if (pg_strcasecmp(defel->defname, "MaxWords") == 0)
-			max_words = pg_strtoint32(val);
+			max_words = pg_strtoint(val, 4);
 		else if (pg_strcasecmp(defel->defname, "MinWords") == 0)
-			min_words = pg_strtoint32(val);
+			min_words = pg_strtoint(val, 4);
 		else if (pg_strcasecmp(defel->defname, "ShortWord") == 0)
-			shortword = pg_strtoint32(val);
+			shortword = pg_strtoint(val, 4);
 		else if (pg_strcasecmp(defel->defname, "MaxFragments") == 0)
-			max_fragments = pg_strtoint32(val);
+			max_fragments = pg_strtoint(val, 4);
 		else if (pg_strcasecmp(defel->defname, "StartSel") == 0)
 			prs->startsel = pstrdup(val);
 		else if (pg_strcasecmp(defel->defname, "StopSel") == 0)
diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 6170e35ac2..60557402dc 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -226,7 +226,7 @@ ArrayGetIntegerTypmods(ArrayType *arr, int *n)
 	result = (int32 *) palloc(*n * sizeof(int32));
 
 	for (i = 0; i < *n; i++)
-		result[i] = pg_strtoint32(DatumGetCString(elem_values[i]));
+		result[i] = pg_strtoint(DatumGetCString(elem_values[i]), 4);
 
 	pfree(elem_values);
 
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 755ca6e277..f8cc5bc48a 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 04825fc77d..8007cc5839 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -64,7 +64,7 @@ int2in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
 
-	PG_RETURN_INT16(pg_strtoint16(num));
+	PG_RETURN_INT16(pg_strtoint(num, 2));
 }
 
 /*
@@ -269,7 +269,7 @@ int4in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
 
-	PG_RETURN_INT32(pg_strtoint32(num));
+	PG_RETURN_INT32(pg_strtoint(num, 4));
 }
 
 /*
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..d7e1a04c7d 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -23,7 +23,6 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,99 +46,14 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
 int8in(PG_FUNCTION_ARGS)
 {
-	char	   *str = PG_GETARG_CSTRING(0);
-	int64		result;
+	char			   *str = PG_GETARG_CSTRING(0);
 
-	(void) scanint8(str, false, &result);
-	PG_RETURN_INT64(result);
+	PG_RETURN_INT64(pg_strtoint(str, 8));
 }
 
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..1d8b6be9bc 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -19,6 +19,7 @@
 #include <ctype.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "utils/builtins.h"
 
 /*
@@ -110,153 +111,86 @@ pg_atoi(const char *s, int size, int c)
 }
 
 /*
- * Convert input string to a signed 16 bit integer.
+ * pg_strtoint
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Convert input string to a signed integer of the wanted size.
  *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
+ * 'size' is the sizeof() the desired integral result (2, 4, or 8 bytes).
+ *
+ * This throws ereport() upon bad input format or overflow.
  */
-int16
-pg_strtoint16(const char *s)
+int64
+pg_strtoint(const char *s, int size)
 {
-	const char *ptr = s;
-	int16		tmp = 0;
-	bool		neg = false;
+	pg_strtoint_status status = PG_STRTOINT_OK;
+	int64		result;
+	char	   *type;
 
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
+	switch (size)
 	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
+		case 2:
+		{
+			int16		res16;
 
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
+			status = pg_strtoint16(s, &res16);
+			result = (int64) res16;
+			break;
+		}
+		case 4:
+		{
+			int32		res32;
 
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
+			status = pg_strtoint32(s, &res32);
+			result = (int64) res32;
+			break;
+		}
+		case 8:
+			status = pg_strtoint64(s, &result);
+			break;
+		default:
+			elog(ERROR, "unsupported result size: %d", size);
+			break;
 	}
 
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
+	if (likely(status == PG_STRTOINT_OK))
+		return result;
 
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
+	switch (size)
 	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT16_MIN))
-			goto out_of_range;
-		tmp = -tmp;
+		case 2:
+			type = "smallint";
+			break;
+		case 4:
+			type = "integer";
+			break;
+		case 8:
+			type = "bigint";
+			break;
+		default:
+			Assert(false);
+			break;
 	}
 
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "smallint")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"smallint", s)));
-
-	return 0;					/* keep compiler quiet */
-}
-
-/*
- * Convert input string to a signed 32 bit integer.
- *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
- *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
- */
-int32
-pg_strtoint32(const char *s)
-{
-	const char *ptr = s;
-	int32		tmp = 0;
-	bool		neg = false;
-
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
+	switch (status)
 	{
-		ptr++;
-		neg = true;
+		case PG_STRTOINT_SYNTAX_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+					 errmsg("invalid input syntax for type %s: \"%s\"",
+							type, s)));
+			break;
+
+		case PG_STRTOINT_RANGE_ERROR:
+			ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("value \"%s\" is out of range for type %s",
+						s, type)));
+			break;
+
+		case PG_STRTOINT_OK:
+			Assert(false);
+			break;
 	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT32_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "integer")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"integer", s)));
 
 	return 0;					/* keep compiler quiet */
 }
@@ -543,25 +477,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d36156f4e4..bdb6ab5e84 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5626,8 +5626,8 @@ text_format(PG_FUNCTION_ARGS)
 
 				str = OutputFunctionCall(&typoutputinfo_width, value);
 
-				/* pg_strtoint32 will complain about bad data or overflow */
-				width = pg_strtoint32(str);
+				/* pg_strtoint will complain about bad data or overflow */
+				width = (int) pg_strtoint(str, 4);
 
 				pfree(str);
 			}
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..84a9d39795 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK)
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..6581dc4de4 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -664,116 +665,6 @@ usage(void)
 		   progname, progname);
 }
 
-/* return whether str matches "^\s*[-+]?[0-9]+$" */
-static bool
-is_an_int(const char *str)
-{
-	const char *ptr = str;
-
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* skip sign */
-	if (*ptr == '+' || *ptr == '-')
-		ptr++;
-
-	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
-		return false;
-
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-		ptr++;
-
-	/* must have reached end of string */
-	return *ptr == '\0';
-}
-
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1283,11 +1174,13 @@ getVariable(CState *st, char *name)
 	return var->svalue;
 }
 
+
 /* Try to convert variable to a value; return false on failure */
 static bool
 makeVariableValue(Variable *var)
 {
 	size_t		slen;
+	int64		ival;
 
 	if (var->value.type != PGBT_NO_VALUE)
 		return true;			/* no work */
@@ -1320,15 +1213,16 @@ makeVariableValue(Variable *var)
 	{
 		setBoolValue(&var->value, false);
 	}
-	else if (is_an_int(var->svalue))
+
+	/*
+	 * Attempt an int64 conversion.
+	 *
+	 * A too-large but syntactically correct int64 will be parsed as a double
+	 * below.
+	 */
+	else if (pg_strtoint64(var->svalue, &ival) == PG_STRTOINT_OK)
 	{
-		/* if it looks like an int, it must be an int without overflow */
-		int64		iv;
-
-		if (!strtoint64(var->svalue, false, &iv))
-			return false;
-
-		setIntValue(&var->value, iv);
+		setIntValue(&var->value, ival);
 	}
 	else						/* type should be double */
 	{
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..788ea19b8e 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -22,6 +22,7 @@
 #endif
 
 #include "common/string.h"
+#include "common/int.h"
 
 
 /*
@@ -58,6 +59,376 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 }
 
 
+/*
+ * pg_strtoint16
+ *
+ * Convert input string to a signed 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint16(const char *s, int16 *result)
+{
+	const char *ptr = s;
+	int16		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT16_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint32
+ *
+ * Convert input string to a signed 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint32(const char *s, int32 *result)
+{
+	const char *ptr = s;
+	int32		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT32_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint64
+ *
+ * Convert input string to a signed 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint16
+ *
+ * Convert input string to an unsigned 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint16(const char *str, uint16 *result)
+{
+	const char *ptr = str;
+	uint16		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint32
+ *
+ * Convert input string to an unsigned 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint32(const char *str, uint32 *result)
+{
+	const char *ptr = str;
+	uint32		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64
+ *
+ * Convert input string to an unsigned 64-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
  *
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..eb0f273d82 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,26 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
-extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
-					 int base);
+extern int strtoint(const char *pg_restrict str, char **pg_restrict endptr,
+					int base);
+
+/*
+ * Set of routines for conversion from string to various integer types.
+ */
+typedef enum pg_strtoint_status
+{
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint16(const char *str, int16 *result);
+extern pg_strtoint_status pg_strtoint32(const char *str, int32 *result);
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint16(const char *str, uint16 *result);
+extern pg_strtoint_status pg_strtouint32(const char *str, uint32 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..472a97c8b9 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -43,14 +43,12 @@ extern int	namestrcmp(Name name, const char *str);
 
 /* numutils.c */
 extern int32 pg_atoi(const char *s, int size, int c);
-extern int16 pg_strtoint16(const char *s);
-extern int32 pg_strtoint32(const char *s);
+extern int64 pg_strtoint(const char *s, int size);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

overflow.tar.gzapplication/gzipDownload

#67

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#66)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-09-09 14:28:14 +0900, Michael Paquier wrote:

On Thu, Sep 05, 2019 at 03:52:48PM +0900, Michael Paquier wrote:

Right, there was this part. This brings also the point of having one
interface for the backend as all the error messages for the backend
are actually the same, with the most simple name being that:
pg_strtoint(value, size, error_ok).

I *VEHEMENTLY* oppose the introduction of any future pseudo-generic
routines taking the type width as a parameter. They're weakening the
type system and they're unnecessarily inefficient.

I don't really buy that saving a few copies of a strings is worth that
much. But if you really want to do that, the right approach imo would be
to move the error reporting into a separate function. I.e. something

void pg_attribute_noreturn()
pg_strtoint_error(pg_strtoint_status status, const char *input, const char *type)

that you'd call in small wrappers. Something like

static inline int32
pg_strtoint32_check(const char* s)
{
int32 result;
pg_strtoint_status status = pg_strtoint32(s, &result);

if (unlikely(status == PG_STRTOINT_OK))
pg_strtoint_error(status, s, "int32");
return result;
}

So, it seems to me that if we want to have a consolidation of
everything, we basically need to have the generic error messages for
the backend directly into common/string.c where the routines are
refactored, and I think that we should do the following based on the
patch attached:

- Just remove pg_strtoint(), and move the error generation logic in
common/string.c.

I'm not quite sure what you mean by moving the "error generation logic"?

- Add an error_ok flag for all the pg_strto[u]int{16|32|64} routines,
which generates errors only for FRONTEND is not defined.

I think this is a bad idea.

- Use pg_log_error() for the frontend errors.

If those errors are added everywhere, we would basically have no code
paths in the frontend or the backend (for the unsigned part only)
which use them yet. Another possibility would be to ignore the
error_ok flag in those cases, but that's inconsistent.

Yea, ignoring it would be terrible idea.

diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
index a9bd47d937..593a5ef54e 100644
--- a/src/backend/libpq/pqmq.c
+++ b/src/backend/libpq/pqmq.c
@@ -286,10 +286,10 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
edata->hint = pstrdup(value);
break;
case PG_DIAG_STATEMENT_POSITION:
-				edata->cursorpos = pg_strtoint32(value);
+				edata->cursorpos = pg_strtoint(value, 4);
break;

I'd be really upset if this type of change went in.

#include "fmgr.h"
#include "miscadmin.h"
+#include "common/string.h"
#include "nodes/extensible.h"
#include "nodes/parsenodes.h"
#include "nodes/plannodes.h"
@@ -80,7 +81,7 @@
#define READ_UINT64_FIELD(fldname) \
token = pg_strtok(&length);		/* skip :fldname */ \
token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)

Seems like these actually could just ought to use the error-checked
variants. And I think it ought to change all of
READ_{INT,UINT,LONG,UINT64,OID}_FIELD, rather than just redirecting one
of them to the new routines.

static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
case T_Float:
/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
{
/*
* It might actually fit in int32. Probably only INT_MIN can

Not for this change, but we really ought to move away from this crappy
logic. It's really bonkers to have T_Float represent large integers and
floats.

+/*
+ * pg_strtoint16
+ *
+ * Convert input string to a signed 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint16(const char *s, int16 *result)
+{
+	const char *ptr = s;
+	int16		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;

+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;

Wonder if there's an argument for moving this behaviour to someplace
else - in most cases we really don't expect whitespace, and checking for
it is unnecessary overhead.

Greetings,

Andres Freund

#68

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#67)

Re: refactoring - share str2*int64 functions

On Mon, Sep 09, 2019 at 03:17:38AM -0700, Andres Freund wrote:

On 2019-09-09 14:28:14 +0900, Michael Paquier wrote:
I *VEHEMENTLY* oppose the introduction of any future pseudo-generic
routines taking the type width as a parameter. They're weakening the
type system and they're unnecessarily inefficient.

I saw that, using the previous wrapper increases the time of one call
from roughly 0.9ms to 1.1ns.

I don't really buy that saving a few copies of a strings is worth that
much. But if you really want to do that, the right approach imo would be
to move the error reporting into a separate function. I.e. something

void pg_attribute_noreturn()
pg_strtoint_error(pg_strtoint_status status, const char *input, const char *type)

that you'd call in small wrappers. Something like

I am not completely sure if we should do that either anyway. Another
approach would be to try to make the callers of the routines generate
their own error messages. The errors we have now are really linked to
the data types we have in core for signed integers (smallint, int,
bigint). In most cases do they really make sense (varlena.c, pqmq.c,
etc.)? And for errors which should never happen we could just use
elog(). For the input functions of int2/4/8 we still need the
existing errors of course.

So, it seems to me that if we want to have a consolidation of
everything, we basically need to have the generic error messages for
the backend directly into common/string.c where the routines are
refactored, and I think that we should do the following based on the
patch attached:

- Just remove pg_strtoint(), and move the error generation logic in
common/string.c.

I'm not quite sure what you mean by moving the "error generation logic"?

I was referring to the error messages we have on HEAD in scanint8()
and pg_strtoint16() for bad inputs and overflows.

Seems like these actually could just ought to use the error-checked
variants. And I think it ought to change all of
READ_{INT,UINT,LONG,UINT64,OID}_FIELD, rather than just redirecting one

Right.

static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
case T_Float:
/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
{
/*
* It might actually fit in int32. Probably only INT_MIN can
Not for this change, but we really ought to move away from this crappy
logic. It's really bonkers to have T_Float represent large integers and
floats.

I am not sure but what are you suggesting here?

+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
Wonder if there's an argument for moving this behaviour to someplace
else - in most cases we really don't expect whitespace, and checking for
it is unnecessary overhead.

Not sure about that. I would keep the scope of the patch simple as of
now, where we make sure that we have the right interface for
everything. There are a couple of extra improvements which could be
done afterwards, and if we move everything in the same place that
should be easier to move on with more improvements. Hopefully.
--
Michael

#69

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#68)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-09-09 20:57:46 +0900, Michael Paquier wrote:

I don't really buy that saving a few copies of a strings is worth that
much. But if you really want to do that, the right approach imo would be
to move the error reporting into a separate function. I.e. something

void pg_attribute_noreturn()
pg_strtoint_error(pg_strtoint_status status, const char *input, const char *type)

that you'd call in small wrappers. Something like

I am not completely sure if we should do that either anyway. Another
approach would be to try to make the callers of the routines generate
their own error messages. The errors we have now are really linked to
the data types we have in core for signed integers (smallint, int,
bigint). In most cases do they really make sense (varlena.c, pqmq.c,
etc.)?

I think there's plenty places that ought to use the checked functions,
even if they currently don't. And typically calling in the caller will
actually be slightly less efficient, than an out-of-line function like I
was proposing above, because it'll be in-line code for infrequent code.

But ISTM all of them ought to just use the C types, rather than the SQL
types however. Since in the above proposal the caller determines the
type names, if you want a different type - like the SQL input routines -
can just invoke pg_strtoint_error() themselves (or just have it open
coded).

And for errors which should never happen we could just use
elog(). For the input functions of int2/4/8 we still need the
existing errors of course.

Right, there it makes sense to continue to refer the SQL level types.

So, it seems to me that if we want to have a consolidation of
everything, we basically need to have the generic error messages for
the backend directly into common/string.c where the routines are
refactored, and I think that we should do the following based on the
patch attached:

- Just remove pg_strtoint(), and move the error generation logic in
common/string.c.

I'm not quite sure what you mean by moving the "error generation logic"?

I was referring to the error messages we have on HEAD in scanint8()
and pg_strtoint16() for bad inputs and overflows.

Not the right direction imo.

static void pcb_error_callback(void *arg);
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
case T_Float:
/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
{
/*
* It might actually fit in int32. Probably only INT_MIN can
Not for this change, but we really ought to move away from this crappy
logic. It's really bonkers to have T_Float represent large integers and
floats.
I am not sure but what are you suggesting here?

I'm saying that we shouldn't need the whole logic of trying to parse the
string as an int, and then fail to float if it's not that. But that it's
not this patchset's task to fix this.

+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
Wonder if there's an argument for moving this behaviour to someplace
else - in most cases we really don't expect whitespace, and checking for
it is unnecessary overhead.
Not sure about that. I would keep the scope of the patch simple as of
now, where we make sure that we have the right interface for
everything. There are a couple of extra improvements which could be
done afterwards, and if we move everything in the same place that
should be easier to move on with more improvements. Hopefully.

The only reason for thinking about it now is that we'd then avoid
changing the API twice.

Greetings,

Andres Freund

#70

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#67)

Re: refactoring - share str2*int64 functions

On Mon, Sep 09, 2019 at 03:17:38AM -0700, Andres Freund wrote:

On 2019-09-09 14:28:14 +0900, Michael Paquier wrote:
@@ -80,7 +81,7 @@
#define READ_UINT64_FIELD(fldname) \
token = pg_strtok(&length);		/* skip :fldname */ \
token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
Seems like these actually could just ought to use the error-checked
variants. And I think it ought to change all of
READ_{INT,UINT,LONG,UINT64,OID}_FIELD, rather than just redirecting one
of them to the new routines.

Okay for these changes, except for READ_INT_FIELD where we have short
variables using it as well (for example StrategyNumber) so this
generates a justified warning. I think that a correct solution
here would be to add a new READ_SHORT_FIELD which uses pg_strtoint16.
I am not adding that for now.
--
Michael

#71

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#69)

1 attachment(s)

Re: refactoring - share str2*int64 functions

On Mon, Sep 09, 2019 at 05:27:04AM -0700, Andres Freund wrote:

On 2019-09-09 20:57:46 +0900, Michael Paquier wrote:
But ISTM all of them ought to just use the C types, rather than the SQL
types however. Since in the above proposal the caller determines the
type names, if you want a different type - like the SQL input routines -
can just invoke pg_strtoint_error() themselves (or just have it open
coded).

Yep, that was my line of thoughts.

And for errors which should never happen we could just use
elog(). For the input functions of int2/4/8 we still need the
existing errors of course.

Right, there it makes sense to continue to refer the SQL level types.

Actually, I found your suggestion of using a noreturn function for the
error reporting to be a very clean alternative. I didn't know though
that gcc is not able to detect that a function does not return if you
don't have a default in the switch for all the status codes. And this
even if all the values of the enum for the switch are listed.

I'm saying that we shouldn't need the whole logic of trying to parse the
string as an int, and then fail to float if it's not that. But that it's
not this patchset's task to fix this.

Ah, sure. Agreed.

Not sure about that. I would keep the scope of the patch simple as of
now, where we make sure that we have the right interface for
everything. There are a couple of extra improvements which could be
done afterwards, and if we move everything in the same place that
should be easier to move on with more improvements. Hopefully.

The only reason for thinking about it now is that we'd then avoid
changing the API twice.

What I think we would be looking for here is an extra argument for the
low-level routines to control the behavior of the function in an
extensible way, say a bits16 for a set of flags, with one flag to
ignore checks for trailing and leading whitespace. This feels a bit
over-engineered though for this purpose.

Attached is an updated patch? How does it look? I have left the
parts of readfuncs.c for now as there are more issues behind that than
doing a single switch, short reads are one, long reads a second. And
the patch already does a lot. There could be also an argument for
having extra _check wrappers for the unsigned portions but these would
be mostly unused in the backend code, so I have left that out on
purpose.

After all that stuff, there are still some issues which need more
care, in short:
- the T_Float conversion.
- removal of strtoint()
- the part for readfuncs.c
--
Michael

Attachments:

str2int-13.patchtext/x-diff; charset=us-asciiDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..1ed03aae02 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -62,6 +62,7 @@
 #include <unistd.h>
 
 #include "catalog/pg_authid.h"
+#include "common/string.h"
 #include "executor/instrument.h"
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
@@ -1022,7 +1023,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			(void) pg_strtouint64(completionTag + 5, &rows);
 		else
 			rows = 0;
 
diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index adf0490f85..b063f1e122 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -306,7 +306,7 @@ check_foreign_key(PG_FUNCTION_ARGS)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: too short %d (< 5) list of arguments", nargs);
 
-	nrefs = pg_strtoint32(args[0]);
+	nrefs = pg_strtoint32_check(args[0]);
 	if (nrefs < 1)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: %d (< 1) number of references specified", nrefs);
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
 					else
 					{
 						/*
@@ -2361,8 +2361,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					(void) pg_strtouint64(completionTag + 5, &_SPI_current->processed);
 				}
 			}
 
diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
index a9bd47d937..ac2bf2f2c4 100644
--- a/src/backend/libpq/pqmq.c
+++ b/src/backend/libpq/pqmq.c
@@ -286,10 +286,10 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->hint = pstrdup(value);
 				break;
 			case PG_DIAG_STATEMENT_POSITION:
-				edata->cursorpos = pg_strtoint32(value);
+				edata->cursorpos = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_INTERNAL_POSITION:
-				edata->internalpos = pg_strtoint32(value);
+				edata->internalpos = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_INTERNAL_QUERY:
 				edata->internalquery = pstrdup(value);
@@ -316,7 +316,7 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->filename = pstrdup(value);
 				break;
 			case PG_DIAG_SOURCE_LINE:
-				edata->lineno = pg_strtoint32(value);
+				edata->lineno = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_SOURCE_FUNCTION:
 				edata->funcname = pstrdup(value);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..2a1e9be3eb 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -32,6 +32,7 @@
 
 #include <math.h>
 
+#include "common/string.h"
 #include "fmgr.h"
 #include "miscadmin.h"
 #include "nodes/extensible.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..cc846a71f1 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -17,6 +17,7 @@
 #include "access/htup_details.h"
 #include "access/table.h"
 #include "catalog/pg_type.h"
+#include "common/string.h"
 #include "mb/pg_wchar.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -25,7 +26,6 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a920..1ca6542c2e 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -341,7 +341,7 @@ libpqrcv_identify_system(WalReceiverConn *conn, TimeLineID *primary_tli)
 						   ntuples, nfields, 3, 1)));
 	}
 	primary_sysid = pstrdup(PQgetvalue(res, 0, 0));
-	*primary_tli = pg_strtoint32(PQgetvalue(res, 0, 1));
+	*primary_tli = pg_strtoint32_check(PQgetvalue(res, 0, 1));
 	PQclear(res);
 
 	return primary_sysid;
@@ -487,7 +487,7 @@ libpqrcv_endstreaming(WalReceiverConn *conn, TimeLineID *next_tli)
 		if (PQnfields(res) < 2 || PQntuples(res) != 1)
 			ereport(ERROR,
 					(errmsg("unexpected result set after end-of-streaming")));
-		*next_tli = pg_strtoint32(PQgetvalue(res, 0, 0));
+		*next_tli = pg_strtoint32_check(PQgetvalue(res, 0, 0));
 		PQclear(res);
 
 		/* the result set should be followed by CommandComplete */
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..d8a2c4979f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
 #include "postgres.h"
 
 #include "catalog/pg_publication.h"
+#include "common/string.h"
 
 #include "fmgr.h"
 
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK)
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/tsearch/wparser_def.c b/src/backend/tsearch/wparser_def.c
index f18e761af8..abde3fdf6d 100644
--- a/src/backend/tsearch/wparser_def.c
+++ b/src/backend/tsearch/wparser_def.c
@@ -2460,13 +2460,13 @@ prsd_headline(PG_FUNCTION_ARGS)
 		char	   *val = defGetString(defel);
 
 		if (pg_strcasecmp(defel->defname, "MaxWords") == 0)
-			max_words = pg_strtoint32(val);
+			max_words = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "MinWords") == 0)
-			min_words = pg_strtoint32(val);
+			min_words = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "ShortWord") == 0)
-			shortword = pg_strtoint32(val);
+			shortword = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "MaxFragments") == 0)
-			max_fragments = pg_strtoint32(val);
+			max_fragments = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "StartSel") == 0)
 			prs->startsel = pstrdup(val);
 		else if (pg_strcasecmp(defel->defname, "StopSel") == 0)
diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 6170e35ac2..29d9784f41 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -226,7 +226,7 @@ ArrayGetIntegerTypmods(ArrayType *arr, int *n)
 	result = (int32 *) palloc(*n * sizeof(int32));
 
 	for (i = 0; i < *n; i++)
-		result[i] = pg_strtoint32(DatumGetCString(elem_values[i]));
+		result[i] = pg_strtoint32_check(DatumGetCString(elem_values[i]));
 
 	pfree(elem_values);
 
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 755ca6e277..f8cc5bc48a 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 04825fc77d..c038a47f8c 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -63,8 +63,16 @@ Datum
 int2in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
+	int16		res;
+	pg_strtoint_status status;
 
-	PG_RETURN_INT16(pg_strtoint16(num));
+	/* Use a custom set of error messages here adapted to the data type */
+	status = pg_strtoint16(num, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, num, "smallint");
+
+	PG_RETURN_INT16(res);
 }
 
 /*
@@ -268,8 +276,16 @@ Datum
 int4in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
+	int32		res;
+	pg_strtoint_status status;
 
-	PG_RETURN_INT32(pg_strtoint32(num));
+	/* Use a custom set of error messages here adapted to the data type */
+	status = pg_strtoint32(num, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, num, "integer");
+
+	PG_RETURN_INT32(res);
 }
 
 /*
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..9933bfa4b4 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -23,7 +23,6 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,99 +46,22 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
 int8in(PG_FUNCTION_ARGS)
 {
-	char	   *str = PG_GETARG_CSTRING(0);
-	int64		result;
+	char			   *str = PG_GETARG_CSTRING(0);
+	int64				res;
+	pg_strtoint_status	status;
 
-	(void) scanint8(str, false, &result);
-	PG_RETURN_INT64(result);
+	/* Use a custom set of error messages here adapted to the data type */
+	status = pg_strtoint64(str, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, str, "bigint");
+
+	PG_RETURN_INT64(res);
 }
 
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..845c3bfe0a 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -19,6 +19,7 @@
 #include <ctype.h>
 
 #include "common/int.h"
+#include "common/string.h"
 #include "utils/builtins.h"
 
 /*
@@ -110,157 +111,92 @@ pg_atoi(const char *s, int size, int c)
 }
 
 /*
- * Convert input string to a signed 16 bit integer.
+ * pg_strtoint_error
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
- *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
+ * utility wrapper to report an error with string-to-integer conversion
+ * functions.
  */
-int16
-pg_strtoint16(const char *s)
+void
+pg_strtoint_error(pg_strtoint_status status, const char *input,
+				  const char *type)
 {
-	const char *ptr = s;
-	int16		tmp = 0;
-	bool		neg = false;
-
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
+	switch (status)
 	{
-		ptr++;
-		neg = true;
+		case PG_STRTOINT_SYNTAX_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+					 errmsg("invalid input syntax for type %s: \"%s\"",
+							type, input)));
+			break;
+
+		case PG_STRTOINT_RANGE_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+					 errmsg("value \"%s\" is out of range for type %s",
+							input, type)));
+			break;
+
+		default:
+			elog(ERROR, "unexpected status for string-to-integer conversion");
+			break;
 	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT16_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "smallint")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"smallint", s)));
-
-	return 0;					/* keep compiler quiet */
 }
 
 /*
- * Convert input string to a signed 32 bit integer.
+ * pg_strtoint16_check
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
+ * Convert input string to a signed 16-bit integer.
  *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
+ * This throws ereport() upon bad input format or overflow.
+ */
+int16
+pg_strtoint16_check(const char *s)
+{
+	int16		result;
+	pg_strtoint_status status = pg_strtoint16(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int16");
+	return result;
+}
+
+/*
+ * pg_strtoint32_check
+ *
+ * Convert input string to a signed 32-bit integer.
+ *
+ * This throws ereport() upon bad input format or overflow.
  */
 int32
-pg_strtoint32(const char *s)
+pg_strtoint32_check(const char *s)
 {
-	const char *ptr = s;
-	int32		tmp = 0;
-	bool		neg = false;
+	int32		result;
+	pg_strtoint_status status = pg_strtoint32(s, &result);
 
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT32_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "integer")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"integer", s)));
-
-	return 0;					/* keep compiler quiet */
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int32");
+	return result;
 }
 
+/*
+ * pg_strtoint64_check
+ *
+ * Convert input string to a signed 64-bit integer.
+ *
+ * This throws ereport() upon bad input format or overflow.
+ */
+int64
+pg_strtoint64_check(const char *s)
+{
+	int64		result;
+	pg_strtoint_status status = pg_strtoint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int64");
+	return result;
+}
+
+
 /*
  * pg_itoa: converts a signed 16-bit integer to its string representation
  *
@@ -543,25 +479,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d36156f4e4..8c8c98bc3f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5626,8 +5626,11 @@ text_format(PG_FUNCTION_ARGS)
 
 				str = OutputFunctionCall(&typoutputinfo_width, value);
 
-				/* pg_strtoint32 will complain about bad data or overflow */
-				width = pg_strtoint32(str);
+				/*
+				 * pg_strtoint32_check will complain about bad data or
+				 * overflow.
+				 */
+				width = pg_strtoint32_check(str);
 
 				pfree(str);
 			}
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..84a9d39795 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK)
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..600f1deb71 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -664,116 +665,6 @@ usage(void)
 		   progname, progname);
 }
 
-/* return whether str matches "^\s*[-+]?[0-9]+$" */
-static bool
-is_an_int(const char *str)
-{
-	const char *ptr = str;
-
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* skip sign */
-	if (*ptr == '+' || *ptr == '-')
-		ptr++;
-
-	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
-		return false;
-
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-		ptr++;
-
-	/* must have reached end of string */
-	return *ptr == '\0';
-}
-
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1288,6 +1179,7 @@ static bool
 makeVariableValue(Variable *var)
 {
 	size_t		slen;
+	int64		ival;
 
 	if (var->value.type != PGBT_NO_VALUE)
 		return true;			/* no work */
@@ -1320,15 +1212,16 @@ makeVariableValue(Variable *var)
 	{
 		setBoolValue(&var->value, false);
 	}
-	else if (is_an_int(var->svalue))
+
+	/*
+	 * Attempt an int64 conversion.
+	 *
+	 * A too-large but syntactically correct int64 will be parsed as a double
+	 * below.
+	 */
+	else if (pg_strtoint64(var->svalue, &ival) == PG_STRTOINT_OK)
 	{
-		/* if it looks like an int, it must be an int without overflow */
-		int64		iv;
-
-		if (!strtoint64(var->svalue, false, &iv))
-			return false;
-
-		setIntValue(&var->value, iv);
+		setIntValue(&var->value, ival);
 	}
 	else						/* type should be double */
 	{
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..82cbee4c3d 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -21,6 +21,7 @@
 #include "postgres_fe.h"
 #endif
 
+#include "common/int.h"
 #include "common/string.h"
 
 
@@ -58,6 +59,376 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 }
 
 
+/*
+ * pg_strtoint16
+ *
+ * Convert input string to a signed 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint16(const char *s, int16 *result)
+{
+	const char *ptr = s;
+	int16		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT16_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint32
+ *
+ * Convert input string to a signed 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint32(const char *s, int32 *result)
+{
+	const char *ptr = s;
+	int32		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT32_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint64
+ *
+ * Convert input string to a signed 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint16
+ *
+ * Convert input string to an unsigned 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint16(const char *str, uint16 *result)
+{
+	const char *ptr = str;
+	uint16		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint32
+ *
+ * Convert input string to an unsigned 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint32(const char *str, uint32 *result)
+{
+	const char *ptr = str;
+	uint32		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64
+ *
+ * Convert input string to an unsigned 64-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
  *
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..eb0f273d82 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,26 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
-extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
-					 int base);
+extern int strtoint(const char *pg_restrict str, char **pg_restrict endptr,
+					int base);
+
+/*
+ * Set of routines for conversion from string to various integer types.
+ */
+typedef enum pg_strtoint_status
+{
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint16(const char *str, int16 *result);
+extern pg_strtoint_status pg_strtoint32(const char *str, int32 *result);
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint16(const char *str, uint16 *result);
+extern pg_strtoint_status pg_strtouint32(const char *str, uint32 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..02e1c9ba03 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -15,6 +15,7 @@
 #define BUILTINS_H
 
 #include "fmgr.h"
+#include "common/string.h"
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
@@ -43,14 +44,17 @@ extern int	namestrcmp(Name name, const char *str);
 
 /* numutils.c */
 extern int32 pg_atoi(const char *s, int size, int c);
-extern int16 pg_strtoint16(const char *s);
-extern int32 pg_strtoint32(const char *s);
+extern void pg_strtoint_error(pg_strtoint_status status,
+							  const char *input,
+							  const char *type)	pg_attribute_noreturn();
+extern int16 pg_strtoint16_check(const char *s);
+extern int32 pg_strtoint32_check(const char *s);
+extern int64 pg_strtoint64_check(const char *s);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#72

michael@paquier.xyz

over 6 years ago

In reply to: Michael Paquier (#71)

Re: refactoring - share str2*int64 functions

On Tue, Sep 10, 2019 at 12:05:25PM +0900, Michael Paquier wrote:

Attached is an updated patch? How does it look? I have left the
parts of readfuncs.c for now as there are more issues behind that than
doing a single switch, short reads are one, long reads a second. And
the patch already does a lot. There could be also an argument for
having extra _check wrappers for the unsigned portions but these would
be mostly unused in the backend code, so I have left that out on
purpose.

I have looked at this patch again today after letting it aside a
couple of days, and I quite like the resulting shape of the routines.
Does anybody else have any comments? Would it make sense to extend
more the string-to-int conversion routines with a set of control flags
to bypass the removal of leading and trailing whitespaces?
--
Michael

#73

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#71)

Re: refactoring - share str2*int64 functions

On 2019-09-10 12:05:25 +0900, Michael Paquier wrote:

On Mon, Sep 09, 2019 at 05:27:04AM -0700, Andres Freund wrote:

On 2019-09-09 20:57:46 +0900, Michael Paquier wrote:
But ISTM all of them ought to just use the C types, rather than the SQL
types however. Since in the above proposal the caller determines the
type names, if you want a different type - like the SQL input routines -
can just invoke pg_strtoint_error() themselves (or just have it open
coded).

Yep, that was my line of thoughts.

And for errors which should never happen we could just use
elog(). For the input functions of int2/4/8 we still need the
existing errors of course.

Right, there it makes sense to continue to refer the SQL level types.

Actually, I found your suggestion of using a noreturn function for the
error reporting to be a very clean alternative. I didn't know though
that gcc is not able to detect that a function does not return if you
don't have a default in the switch for all the status codes. And this
even if all the values of the enum for the switch are listed.

As I proposed they'd be in different translation units, so the compiler
wouldn't see the definition of the function, just the declaration.

Not sure about that. I would keep the scope of the patch simple as of
now, where we make sure that we have the right interface for
everything. There are a couple of extra improvements which could be
done afterwards, and if we move everything in the same place that
should be easier to move on with more improvements. Hopefully.

The only reason for thinking about it now is that we'd then avoid
changing the API twice.

What I think we would be looking for here is an extra argument for the
low-level routines to control the behavior of the function in an
extensible way, say a bits16 for a set of flags, with one flag to
ignore checks for trailing and leading whitespace.

That'd probably be a bad idea, for performance reasons.

Attached is an updated patch? How does it look? I have left the
parts of readfuncs.c for now as there are more issues behind that than
doing a single switch, short reads are one, long reads a second.

Hm? I don't know what you mean by those issues.

And the patch already does a lot. There could be also an argument for
having extra _check wrappers for the unsigned portions but these would
be mostly unused in the backend code, so I have left that out on
purpose.

I'd value consistency higher here.

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
#include "catalog/heap.h"
#include "catalog/pg_type.h"
#include "commands/trigger.h"
+#include "common/string.h"
#include "executor/executor.h"
#include "executor/spi_priv.h"
#include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;

if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);

I'd just use the checked version here, seems like a good thing to check
for, and I can't imagine it matters performance wise.

@@ -63,8 +63,16 @@ Datum
int2in(PG_FUNCTION_ARGS)
{
char	   *num = PG_GETARG_CSTRING(0);
+	int16		res;
+	pg_strtoint_status status;

-	PG_RETURN_INT16(pg_strtoint16(num));
+	/* Use a custom set of error messages here adapted to the data type */
+	status = pg_strtoint16(num, &res);

I don't know what that comment is supposed to mean?

+/*
+ * pg_strtoint64_check
+ *
+ * Convert input string to a signed 64-bit integer.
+ *
+ * This throws ereport() upon bad input format or overflow.
+ */
+int64
+pg_strtoint64_check(const char *s)
+{
+	int64		result;
+	pg_strtoint_status status = pg_strtoint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int64");
+	return result;
+}

I think I'd just put these as inlines in the header.

Greetings,

Andres Freund

#74

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#73)

1 attachment(s)

Re: refactoring - share str2*int64 functions

On Fri, Sep 13, 2019 at 06:38:31PM -0700, Andres Freund wrote:

On 2019-09-10 12:05:25 +0900, Michael Paquier wrote:

On Mon, Sep 09, 2019 at 05:27:04AM -0700, Andres Freund wrote:
Attached is an updated patch? How does it look? I have left the
parts of readfuncs.c for now as there are more issues behind that than
doing a single switch, short reads are one, long reads a second.

Hm? I don't know what you mean by those issues.

I think that we have more issues than it looks. For example:
- Switching UINT to use pg_strtouint32() causes an incompatibility
issue compared to atoui().
- Switching INT to use pg_strtoint32() causes a set of warnings as for
example with AttrNumber:
72 | (void) pg_strtoint32(token, &local_node->fldname)
| ^~~~~~~~~~~~~~~~~~~~~
| |
| AttrNumber * {aka short int *}
And it is not like we should use a cast either, as we could hide real
issues. Hence it seems to me that we need to have a new routine
definition for shorter integers and switch more flags to that.
- Switching LONG to use pg_strtoint64() leads to another set of
issues, particularly one could see an assertion failure related to Agg
nodes. I am not sure either that we should use int64 here as the size
can be at least 32b.
- Switching OID to use pg_strtoint32 causes a failure with initdb.

So while I agree with you that a switch should be doable, there is a
large set of issues to ponder about here, and the patch already does a
lot, so I really think that we had better do a closer lookup at those
issues separately, once the basics are in place, and consider them if
they actually make sense. There is much more than just doing a direct
switch in this area with the family of ato*() system calls.

And the patch already does a lot. There could be also an argument for
having extra _check wrappers for the unsigned portions but these would
be mostly unused in the backend code, so I have left that out on
purpose.

I'd value consistency higher here.

Okay, no objections to that.

@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
I'd just use the checked version here, seems like a good thing to check
for, and I can't imagine it matters performance wise.

Yeah, makes sense. I don't think it matters either for
pg_stat_statements in the same context. So changed that part as
well.

@@ -63,8 +63,16 @@ Datum
int2in(PG_FUNCTION_ARGS)
{
char	   *num = PG_GETARG_CSTRING(0);
+	int16		res;
+	pg_strtoint_status status;

-	PG_RETURN_INT16(pg_strtoint16(num));
+	/* Use a custom set of error messages here adapted to the data type */
+	status = pg_strtoint16(num, &res);

I don't know what that comment is supposed to mean?

I mean here that the _check equivalent cannot be used as any error
messages generated need to be consistent with the SQL data type. I
have updated the comment, does it look better now?

+/*
+ * pg_strtoint64_check
+ *
+ * Convert input string to a signed 64-bit integer.
+ *
+ * This throws ereport() upon bad input format or overflow.
+ */
+int64
+pg_strtoint64_check(const char *s)
+{
+	int64		result;
+	pg_strtoint_status status = pg_strtoint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int64");
+	return result;
+}

I think I'd just put these as inlines in the header.

I have not considered that. This bloats a bit more builtins.h. We
could live with that, or just move that into a separate header in
include/utils/, say int.h? Even if common/int.h exists?

Attached is an updated patch. Perhaps you have something else in
mind?
--
Michael

Attachments:

str2int-14.patchtext/x-diff; charset=us-asciiDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..7af05fc009 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1022,7 +1022,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			rows = pg_strtouint64_check(completionTag + 5);
 		else
 			rows = 0;
 
diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index adf0490f85..b063f1e122 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -306,7 +306,7 @@ check_foreign_key(PG_FUNCTION_ARGS)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: too short %d (< 5) list of arguments", nargs);
 
-	nrefs = pg_strtoint32(args[0]);
+	nrefs = pg_strtoint32_check(args[0]);
 	if (nrefs < 1)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: %d (< 1) number of references specified", nrefs);
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..1272b4ee04 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2338,8 +2338,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						_SPI_current->processed = pg_strtouint64_check(completionTag + 7);
 					else
 					{
 						/*
@@ -2361,8 +2360,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					_SPI_current->processed = pg_strtouint64_check(completionTag + 5);
 				}
 			}
 
diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
index a9bd47d937..ac2bf2f2c4 100644
--- a/src/backend/libpq/pqmq.c
+++ b/src/backend/libpq/pqmq.c
@@ -286,10 +286,10 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->hint = pstrdup(value);
 				break;
 			case PG_DIAG_STATEMENT_POSITION:
-				edata->cursorpos = pg_strtoint32(value);
+				edata->cursorpos = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_INTERNAL_POSITION:
-				edata->internalpos = pg_strtoint32(value);
+				edata->internalpos = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_INTERNAL_QUERY:
 				edata->internalquery = pstrdup(value);
@@ -316,7 +316,7 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->filename = pstrdup(value);
 				break;
 			case PG_DIAG_SOURCE_LINE:
-				edata->lineno = pg_strtoint32(value);
+				edata->lineno = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_SOURCE_FUNCTION:
 				edata->funcname = pstrdup(value);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..2a1e9be3eb 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -32,6 +32,7 @@
 
 #include <math.h>
 
+#include "common/string.h"
 #include "fmgr.h"
 #include "miscadmin.h"
 #include "nodes/extensible.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..cc846a71f1 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -17,6 +17,7 @@
 #include "access/htup_details.h"
 #include "access/table.h"
 #include "catalog/pg_type.h"
+#include "common/string.h"
 #include "mb/pg_wchar.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -25,7 +26,6 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a920..1ca6542c2e 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -341,7 +341,7 @@ libpqrcv_identify_system(WalReceiverConn *conn, TimeLineID *primary_tli)
 						   ntuples, nfields, 3, 1)));
 	}
 	primary_sysid = pstrdup(PQgetvalue(res, 0, 0));
-	*primary_tli = pg_strtoint32(PQgetvalue(res, 0, 1));
+	*primary_tli = pg_strtoint32_check(PQgetvalue(res, 0, 1));
 	PQclear(res);
 
 	return primary_sysid;
@@ -487,7 +487,7 @@ libpqrcv_endstreaming(WalReceiverConn *conn, TimeLineID *next_tli)
 		if (PQnfields(res) < 2 || PQntuples(res) != 1)
 			ereport(ERROR,
 					(errmsg("unexpected result set after end-of-streaming")));
-		*next_tli = pg_strtoint32(PQgetvalue(res, 0, 0));
+		*next_tli = pg_strtoint32_check(PQgetvalue(res, 0, 0));
 		PQclear(res);
 
 		/* the result set should be followed by CommandComplete */
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..d8a2c4979f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
 #include "postgres.h"
 
 #include "catalog/pg_publication.h"
+#include "common/string.h"
 
 #include "fmgr.h"
 
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK)
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/tsearch/wparser_def.c b/src/backend/tsearch/wparser_def.c
index f18e761af8..abde3fdf6d 100644
--- a/src/backend/tsearch/wparser_def.c
+++ b/src/backend/tsearch/wparser_def.c
@@ -2460,13 +2460,13 @@ prsd_headline(PG_FUNCTION_ARGS)
 		char	   *val = defGetString(defel);
 
 		if (pg_strcasecmp(defel->defname, "MaxWords") == 0)
-			max_words = pg_strtoint32(val);
+			max_words = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "MinWords") == 0)
-			min_words = pg_strtoint32(val);
+			min_words = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "ShortWord") == 0)
-			shortword = pg_strtoint32(val);
+			shortword = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "MaxFragments") == 0)
-			max_fragments = pg_strtoint32(val);
+			max_fragments = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "StartSel") == 0)
 			prs->startsel = pstrdup(val);
 		else if (pg_strcasecmp(defel->defname, "StopSel") == 0)
diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 6170e35ac2..29d9784f41 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -226,7 +226,7 @@ ArrayGetIntegerTypmods(ArrayType *arr, int *n)
 	result = (int32 *) palloc(*n * sizeof(int32));
 
 	for (i = 0; i < *n; i++)
-		result[i] = pg_strtoint32(DatumGetCString(elem_values[i]));
+		result[i] = pg_strtoint32_check(DatumGetCString(elem_values[i]));
 
 	pfree(elem_values);
 
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 755ca6e277..f8cc5bc48a 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 04825fc77d..3bf8438942 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -63,8 +63,19 @@ Datum
 int2in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
+	int16		res;
+	pg_strtoint_status status;
 
-	PG_RETURN_INT16(pg_strtoint16(num));
+	/*
+	 * pg_strtoint16_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint16(num, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, num, "smallint");
+
+	PG_RETURN_INT16(res);
 }
 
 /*
@@ -268,8 +279,19 @@ Datum
 int4in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
+	int32		res;
+	pg_strtoint_status status;
 
-	PG_RETURN_INT32(pg_strtoint32(num));
+	/*
+	 * pg_strtoint32_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint32(num, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, num, "integer");
+
+	PG_RETURN_INT32(res);
 }
 
 /*
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..3ef0a01ea0 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -23,7 +23,6 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,99 +46,25 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
 int8in(PG_FUNCTION_ARGS)
 {
-	char	   *str = PG_GETARG_CSTRING(0);
-	int64		result;
+	char			   *str = PG_GETARG_CSTRING(0);
+	int64				res;
+	pg_strtoint_status	status;
 
-	(void) scanint8(str, false, &result);
-	PG_RETURN_INT64(result);
+	/*
+	 * pg_strtoint64_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint64(str, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, str, "bigint");
+
+	PG_RETURN_INT64(res);
 }
 
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f27148f101 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -109,157 +109,6 @@ pg_atoi(const char *s, int size, int c)
 	return (int32) l;
 }
 
-/*
- * Convert input string to a signed 16 bit integer.
- *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
- *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
- */
-int16
-pg_strtoint16(const char *s)
-{
-	const char *ptr = s;
-	int16		tmp = 0;
-	bool		neg = false;
-
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT16_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "smallint")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"smallint", s)));
-
-	return 0;					/* keep compiler quiet */
-}
-
-/*
- * Convert input string to a signed 32 bit integer.
- *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
- *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
- */
-int32
-pg_strtoint32(const char *s)
-{
-	const char *ptr = s;
-	int32		tmp = 0;
-	bool		neg = false;
-
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT32_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "integer")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"integer", s)));
-
-	return 0;					/* keep compiler quiet */
-}
 
 /*
  * pg_itoa: converts a signed 16-bit integer to its string representation
@@ -543,25 +392,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d36156f4e4..8c8c98bc3f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5626,8 +5626,11 @@ text_format(PG_FUNCTION_ARGS)
 
 				str = OutputFunctionCall(&typoutputinfo_width, value);
 
-				/* pg_strtoint32 will complain about bad data or overflow */
-				width = pg_strtoint32(str);
+				/*
+				 * pg_strtoint32_check will complain about bad data or
+				 * overflow.
+				 */
+				width = pg_strtoint32_check(str);
 
 				pfree(str);
 			}
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..84a9d39795 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK)
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..600f1deb71 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -664,116 +665,6 @@ usage(void)
 		   progname, progname);
 }
 
-/* return whether str matches "^\s*[-+]?[0-9]+$" */
-static bool
-is_an_int(const char *str)
-{
-	const char *ptr = str;
-
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* skip sign */
-	if (*ptr == '+' || *ptr == '-')
-		ptr++;
-
-	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
-		return false;
-
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-		ptr++;
-
-	/* must have reached end of string */
-	return *ptr == '\0';
-}
-
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1288,6 +1179,7 @@ static bool
 makeVariableValue(Variable *var)
 {
 	size_t		slen;
+	int64		ival;
 
 	if (var->value.type != PGBT_NO_VALUE)
 		return true;			/* no work */
@@ -1320,15 +1212,16 @@ makeVariableValue(Variable *var)
 	{
 		setBoolValue(&var->value, false);
 	}
-	else if (is_an_int(var->svalue))
+
+	/*
+	 * Attempt an int64 conversion.
+	 *
+	 * A too-large but syntactically correct int64 will be parsed as a double
+	 * below.
+	 */
+	else if (pg_strtoint64(var->svalue, &ival) == PG_STRTOINT_OK)
 	{
-		/* if it looks like an int, it must be an int without overflow */
-		int64		iv;
-
-		if (!strtoint64(var->svalue, false, &iv))
-			return false;
-
-		setIntValue(&var->value, iv);
+		setIntValue(&var->value, ival);
 	}
 	else						/* type should be double */
 	{
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..82cbee4c3d 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -21,6 +21,7 @@
 #include "postgres_fe.h"
 #endif
 
+#include "common/int.h"
 #include "common/string.h"
 
 
@@ -58,6 +59,376 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 }
 
 
+/*
+ * pg_strtoint16
+ *
+ * Convert input string to a signed 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint16(const char *s, int16 *result)
+{
+	const char *ptr = s;
+	int16		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT16_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint32
+ *
+ * Convert input string to a signed 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint32(const char *s, int32 *result)
+{
+	const char *ptr = s;
+	int32		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (likely(*ptr) && isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT32_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint64
+ *
+ * Convert input string to a signed 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint16
+ *
+ * Convert input string to an unsigned 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint16(const char *str, uint16 *result)
+{
+	const char *ptr = str;
+	uint16		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint32
+ *
+ * Convert input string to an unsigned 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint32(const char *str, uint32 *result)
+{
+	const char *ptr = str;
+	uint32		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64
+ *
+ * Convert input string to an unsigned 64-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (!unlikely(isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
  *
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..eb0f273d82 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,26 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
-extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
-					 int base);
+extern int strtoint(const char *pg_restrict str, char **pg_restrict endptr,
+					int base);
+
+/*
+ * Set of routines for conversion from string to various integer types.
+ */
+typedef enum pg_strtoint_status
+{
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint16(const char *str, int16 *result);
+extern pg_strtoint_status pg_strtoint32(const char *str, int32 *result);
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint16(const char *str, uint16 *result);
+extern pg_strtoint_status pg_strtouint32(const char *str, uint32 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..691dac84a0 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -15,6 +15,7 @@
 #define BUILTINS_H
 
 #include "fmgr.h"
+#include "common/string.h"
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
@@ -43,14 +44,145 @@ extern int	namestrcmp(Name name, const char *str);
 
 /* numutils.c */
 extern int32 pg_atoi(const char *s, int size, int c);
-extern int16 pg_strtoint16(const char *s);
-extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
+
+/*
+ * pg_strtoint_error
+ *
+ * utility wrapper to report an error with string-to-integer conversion
+ * functions.
+ */
+static inline void
+pg_strtoint_error(pg_strtoint_status status, const char *input,
+				  const char *type)
+{
+	switch (status)
+	{
+		case PG_STRTOINT_SYNTAX_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+					 errmsg("invalid input syntax for type %s: \"%s\"",
+							type, input)));
+			break;
+
+		case PG_STRTOINT_RANGE_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+					 errmsg("value \"%s\" is out of range for type %s",
+							input, type)));
+			break;
+
+		default:
+			elog(ERROR, "unexpected status for string-to-integer conversion");
+			break;
+	}
+}
+
+/*
+ * pg_strtoint16_check
+ *
+ * Convert input string to a signed 16-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline int16
+pg_strtoint16_check(const char *s)
+{
+	int16		result;
+	pg_strtoint_status status = pg_strtoint16(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int16");
+	return result;
+}
+
+/*
+ * pg_strtoint32_check
+ *
+ * Convert input string to a signed 32-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline int32
+pg_strtoint32_check(const char *s)
+{
+	int32		result;
+	pg_strtoint_status status = pg_strtoint32(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int32");
+	return result;
+}
+
+/*
+ * pg_strtoint64_check
+ *
+ * Convert input string to a signed 64-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline int64
+pg_strtoint64_check(const char *s)
+{
+	int64		result;
+	pg_strtoint_status status = pg_strtoint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int64");
+	return result;
+}
+
+/*
+ * pg_strtouint16_check
+ *
+ * Convert input string to a unsigned 16-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline uint16
+pg_strtouint16_check(const char *s)
+{
+	uint16		result;
+	pg_strtoint_status status = pg_strtouint16(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint16");
+	return result;
+}
+
+/*
+ * pg_strtouint32_check
+ *
+ * Convert input string to a signed 32-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline uint32
+pg_strtouint32_check(const char *s)
+{
+	uint32		result;
+	pg_strtoint_status status = pg_strtouint32(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint32");
+	return result;
+}
+
+/*
+ * pg_strtouint64_check
+ *
+ * Convert input string to a signed 64-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline uint64
+pg_strtouint64_check(const char *s)
+{
+	uint64		result;
+	pg_strtoint_status status = pg_strtouint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint64");
+	return result;
+}
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#75

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#74)

Re: refactoring - share str2*int64 functions

Bonjour Michaël,

- Switching INT to use pg_strtoint32() causes a set of warnings as for
example with AttrNumber:
72 | (void) pg_strtoint32(token, &local_node->fldname)
| ^~~~~~~~~~~~~~~~~~~~~
| |
| AttrNumber * {aka short int *}

And it is not like we should use a cast either, as we could hide real
issues.

It should rather call pg_strtoint16? And possibly switch the "short int"
declaration to int16?

About batch v14: applies cleanly and compiles without warnings. "make
check" ok.

I do not think that "pg_strtoint_error" should be inlinable. The function
is unlikely to be called, so it is not performance critical to inline it,
and would enlarge the executable needlessly. However, the
"pg_strto*_check" variants should be inlinable, as you have done.

About the code, on several instances of:

/* skip leading spaces */
while (likely(*ptr) && isspace((unsigned char) *ptr)) …

I would drop the "likely(*ptr)".

And on several instances of:

!unlikely(isdigit((unsigned char) *ptr)))

ISTM that we want "unlikely(!isdigit((unsigned char) *ptr)))". Parsing
!unlikely leads to false conclusion and a headache:-)

Otherwise, this batch of changes looks ok to me.

--
Fabien.

#76

michael@paquier.xyz

over 6 years ago

In reply to: Fabien COELHO (#75)

1 attachment(s)

Re: refactoring - share str2*int64 functions

On Sat, Sep 14, 2019 at 10:24:10AM +0200, Fabien COELHO wrote:

It should rather call pg_strtoint16? And possibly switch the "short int"
declaration to int16?

Sure, but you get into other problems if using the 16-bit version for
some other fields, which is why it seems to me that we should add an
extra routine for shorts. So for now I prefer discarding this part.

I do not think that "pg_strtoint_error" should be inlinable. The function is
unlikely to be called, so it is not performance critical to inline it, and
would enlarge the executable needlessly. However, the "pg_strto*_check"
variants should be inlinable, as you have done.

Makes sense.

About the code, on several instances of:

/* skip leading spaces */
while (likely(*ptr) && isspace((unsigned char) *ptr)) …

I would drop the "likely(*ptr)".

Right as well. There were two places out of six with that pattern.

And on several instances of:

!unlikely(isdigit((unsigned char) *ptr)))

ISTM that we want "unlikely(!isdigit((unsigned char) *ptr)))". Parsing
!unlikely leads to false conclusion and a headache:-)

That part was actually inconsistent with the rest.

Otherwise, this batch of changes looks ok to me.

Thanks.
--
Michael

Attachments:

str2int-15.patchtext/x-diff; charset=us-asciiDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..7af05fc009 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1022,7 +1022,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			rows = pg_strtouint64_check(completionTag + 5);
 		else
 			rows = 0;
 
diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index adf0490f85..b063f1e122 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -306,7 +306,7 @@ check_foreign_key(PG_FUNCTION_ARGS)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: too short %d (< 5) list of arguments", nargs);
 
-	nrefs = pg_strtoint32(args[0]);
+	nrefs = pg_strtoint32_check(args[0]);
 	if (nrefs < 1)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: %d (< 1) number of references specified", nrefs);
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..1272b4ee04 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2338,8 +2338,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						_SPI_current->processed = pg_strtouint64_check(completionTag + 7);
 					else
 					{
 						/*
@@ -2361,8 +2360,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					_SPI_current->processed = pg_strtouint64_check(completionTag + 5);
 				}
 			}
 
diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
index a9bd47d937..ac2bf2f2c4 100644
--- a/src/backend/libpq/pqmq.c
+++ b/src/backend/libpq/pqmq.c
@@ -286,10 +286,10 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->hint = pstrdup(value);
 				break;
 			case PG_DIAG_STATEMENT_POSITION:
-				edata->cursorpos = pg_strtoint32(value);
+				edata->cursorpos = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_INTERNAL_POSITION:
-				edata->internalpos = pg_strtoint32(value);
+				edata->internalpos = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_INTERNAL_QUERY:
 				edata->internalquery = pstrdup(value);
@@ -316,7 +316,7 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->filename = pstrdup(value);
 				break;
 			case PG_DIAG_SOURCE_LINE:
-				edata->lineno = pg_strtoint32(value);
+				edata->lineno = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_SOURCE_FUNCTION:
 				edata->funcname = pstrdup(value);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..2a1e9be3eb 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -32,6 +32,7 @@
 
 #include <math.h>
 
+#include "common/string.h"
 #include "fmgr.h"
 #include "miscadmin.h"
 #include "nodes/extensible.h"
@@ -80,7 +81,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	(void) pg_strtouint64(token, &local_node->fldname)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..cc846a71f1 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -17,6 +17,7 @@
 #include "access/htup_details.h"
 #include "access/table.h"
 #include "catalog/pg_type.h"
+#include "common/string.h"
 #include "mb/pg_wchar.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -25,7 +26,6 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a920..1ca6542c2e 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -341,7 +341,7 @@ libpqrcv_identify_system(WalReceiverConn *conn, TimeLineID *primary_tli)
 						   ntuples, nfields, 3, 1)));
 	}
 	primary_sysid = pstrdup(PQgetvalue(res, 0, 0));
-	*primary_tli = pg_strtoint32(PQgetvalue(res, 0, 1));
+	*primary_tli = pg_strtoint32_check(PQgetvalue(res, 0, 1));
 	PQclear(res);
 
 	return primary_sysid;
@@ -487,7 +487,7 @@ libpqrcv_endstreaming(WalReceiverConn *conn, TimeLineID *next_tli)
 		if (PQnfields(res) < 2 || PQntuples(res) != 1)
 			ereport(ERROR,
 					(errmsg("unexpected result set after end-of-streaming")));
-		*next_tli = pg_strtoint32(PQgetvalue(res, 0, 0));
+		*next_tli = pg_strtoint32_check(PQgetvalue(res, 0, 0));
 		PQclear(res);
 
 		/* the result set should be followed by CommandComplete */
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..d8a2c4979f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
 #include "postgres.h"
 
 #include "catalog/pg_publication.h"
+#include "common/string.h"
 
 #include "fmgr.h"
 
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK)
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/tsearch/wparser_def.c b/src/backend/tsearch/wparser_def.c
index f18e761af8..abde3fdf6d 100644
--- a/src/backend/tsearch/wparser_def.c
+++ b/src/backend/tsearch/wparser_def.c
@@ -2460,13 +2460,13 @@ prsd_headline(PG_FUNCTION_ARGS)
 		char	   *val = defGetString(defel);
 
 		if (pg_strcasecmp(defel->defname, "MaxWords") == 0)
-			max_words = pg_strtoint32(val);
+			max_words = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "MinWords") == 0)
-			min_words = pg_strtoint32(val);
+			min_words = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "ShortWord") == 0)
-			shortword = pg_strtoint32(val);
+			shortword = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "MaxFragments") == 0)
-			max_fragments = pg_strtoint32(val);
+			max_fragments = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "StartSel") == 0)
 			prs->startsel = pstrdup(val);
 		else if (pg_strcasecmp(defel->defname, "StopSel") == 0)
diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 6170e35ac2..29d9784f41 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -226,7 +226,7 @@ ArrayGetIntegerTypmods(ArrayType *arr, int *n)
 	result = (int32 *) palloc(*n * sizeof(int32));
 
 	for (i = 0; i < *n; i++)
-		result[i] = pg_strtoint32(DatumGetCString(elem_values[i]));
+		result[i] = pg_strtoint32_check(DatumGetCString(elem_values[i]));
 
 	pfree(elem_values);
 
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 755ca6e277..f8cc5bc48a 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -92,7 +92,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 04825fc77d..3bf8438942 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -63,8 +63,19 @@ Datum
 int2in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
+	int16		res;
+	pg_strtoint_status status;
 
-	PG_RETURN_INT16(pg_strtoint16(num));
+	/*
+	 * pg_strtoint16_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint16(num, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, num, "smallint");
+
+	PG_RETURN_INT16(res);
 }
 
 /*
@@ -268,8 +279,19 @@ Datum
 int4in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
+	int32		res;
+	pg_strtoint_status status;
 
-	PG_RETURN_INT32(pg_strtoint32(num));
+	/*
+	 * pg_strtoint32_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint32(num, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, num, "integer");
+
+	PG_RETURN_INT32(res);
 }
 
 /*
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..3ef0a01ea0 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -23,7 +23,6 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,99 +46,25 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
 int8in(PG_FUNCTION_ARGS)
 {
-	char	   *str = PG_GETARG_CSTRING(0);
-	int64		result;
+	char			   *str = PG_GETARG_CSTRING(0);
+	int64				res;
+	pg_strtoint_status	status;
 
-	(void) scanint8(str, false, &result);
-	PG_RETURN_INT64(result);
+	/*
+	 * pg_strtoint64_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint64(str, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, str, "bigint");
+
+	PG_RETURN_INT64(res);
 }
 
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..03fc5bd220 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -109,158 +109,40 @@ pg_atoi(const char *s, int size, int c)
 	return (int32) l;
 }
 
-/*
- * Convert input string to a signed 16 bit integer.
- *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
- *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
- */
-int16
-pg_strtoint16(const char *s)
-{
-	const char *ptr = s;
-	int16		tmp = 0;
-	bool		neg = false;
-
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT16_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "smallint")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"smallint", s)));
-
-	return 0;					/* keep compiler quiet */
-}
 
 /*
- * Convert input string to a signed 32 bit integer.
+ * pg_strtoint_error
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
- *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
+ * utility wrapper to report an error with string-to-integer conversion
+ * functions.
  */
-int32
-pg_strtoint32(const char *s)
+void
+pg_strtoint_error(pg_strtoint_status status, const char *input,
+				  const char *type)
 {
-	const char *ptr = s;
-	int32		tmp = 0;
-	bool		neg = false;
-
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
+	switch (status)
 	{
-		ptr++;
-		neg = true;
+		case PG_STRTOINT_SYNTAX_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+					 errmsg("invalid input syntax for type %s: \"%s\"",
+							type, input)));
+			break;
+
+		case PG_STRTOINT_RANGE_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+					 errmsg("value \"%s\" is out of range for type %s",
+							input, type)));
+			break;
+
+		default:
+			elog(ERROR, "unexpected status for string-to-integer conversion");
+			break;
 	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT32_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "integer")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"integer", s)));
-
-	return 0;					/* keep compiler quiet */
 }
 
+
 /*
  * pg_itoa: converts a signed 16-bit integer to its string representation
  *
@@ -543,25 +425,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d36156f4e4..8c8c98bc3f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5626,8 +5626,11 @@ text_format(PG_FUNCTION_ARGS)
 
 				str = OutputFunctionCall(&typoutputinfo_width, value);
 
-				/* pg_strtoint32 will complain about bad data or overflow */
-				width = pg_strtoint32(str);
+				/*
+				 * pg_strtoint32_check will complain about bad data or
+				 * overflow.
+				 */
+				width = pg_strtoint32_check(str);
 
 				pfree(str);
 			}
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..84a9d39795 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK)
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..600f1deb71 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -664,116 +665,6 @@ usage(void)
 		   progname, progname);
 }
 
-/* return whether str matches "^\s*[-+]?[0-9]+$" */
-static bool
-is_an_int(const char *str)
-{
-	const char *ptr = str;
-
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* skip sign */
-	if (*ptr == '+' || *ptr == '-')
-		ptr++;
-
-	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
-		return false;
-
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-		ptr++;
-
-	/* must have reached end of string */
-	return *ptr == '\0';
-}
-
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1288,6 +1179,7 @@ static bool
 makeVariableValue(Variable *var)
 {
 	size_t		slen;
+	int64		ival;
 
 	if (var->value.type != PGBT_NO_VALUE)
 		return true;			/* no work */
@@ -1320,15 +1212,16 @@ makeVariableValue(Variable *var)
 	{
 		setBoolValue(&var->value, false);
 	}
-	else if (is_an_int(var->svalue))
+
+	/*
+	 * Attempt an int64 conversion.
+	 *
+	 * A too-large but syntactically correct int64 will be parsed as a double
+	 * below.
+	 */
+	else if (pg_strtoint64(var->svalue, &ival) == PG_STRTOINT_OK)
 	{
-		/* if it looks like an int, it must be an int without overflow */
-		int64		iv;
-
-		if (!strtoint64(var->svalue, false, &iv))
-			return false;
-
-		setIntValue(&var->value, iv);
+		setIntValue(&var->value, ival);
 	}
 	else						/* type should be double */
 	{
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..aaeb6da648 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -21,6 +21,7 @@
 #include "postgres_fe.h"
 #endif
 
+#include "common/int.h"
 #include "common/string.h"
 
 
@@ -58,6 +59,376 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 }
 
 
+/*
+ * pg_strtoint16
+ *
+ * Convert input string to a signed 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint16(const char *s, int16 *result)
+{
+	const char *ptr = s;
+	int16		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT16_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint32
+ *
+ * Convert input string to a signed 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint32(const char *s, int32 *result)
+{
+	const char *ptr = s;
+	int32		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT32_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint64
+ *
+ * Convert input string to a signed 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint16
+ *
+ * Convert input string to an unsigned 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint16(const char *str, uint16 *result)
+{
+	const char *ptr = str;
+	uint16		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint32
+ *
+ * Convert input string to an unsigned 32-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint32(const char *str, uint32 *result)
+{
+	const char *ptr = str;
+	uint32		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64
+ *
+ * Convert input string to an unsigned 64-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
  *
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..eb0f273d82 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -11,8 +11,26 @@
 #define COMMON_STRING_H
 
 extern bool pg_str_endswith(const char *str, const char *end);
-extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
-					 int base);
+extern int strtoint(const char *pg_restrict str, char **pg_restrict endptr,
+					int base);
+
+/*
+ * Set of routines for conversion from string to various integer types.
+ */
+typedef enum pg_strtoint_status
+{
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint16(const char *str, int16 *result);
+extern pg_strtoint_status pg_strtoint32(const char *str, int32 *result);
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint16(const char *str, uint16 *result);
+extern pg_strtoint_status pg_strtouint32(const char *str, uint32 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..1e327c4c8d 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -15,6 +15,7 @@
 #define BUILTINS_H
 
 #include "fmgr.h"
+#include "common/string.h"
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
@@ -43,14 +44,116 @@ extern int	namestrcmp(Name name, const char *str);
 
 /* numutils.c */
 extern int32 pg_atoi(const char *s, int size, int c);
-extern int16 pg_strtoint16(const char *s);
-extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
+extern void pg_strtoint_error(pg_strtoint_status status,
+							  const char *input,
+							  const char *type) pg_attribute_noreturn();
+
+/*
+ * pg_strtoint16_check
+ *
+ * Convert input string to a signed 16-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline int16
+pg_strtoint16_check(const char *s)
+{
+	int16		result;
+	pg_strtoint_status status = pg_strtoint16(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int16");
+	return result;
+}
+
+/*
+ * pg_strtoint32_check
+ *
+ * Convert input string to a signed 32-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline int32
+pg_strtoint32_check(const char *s)
+{
+	int32		result;
+	pg_strtoint_status status = pg_strtoint32(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int32");
+	return result;
+}
+
+/*
+ * pg_strtoint64_check
+ *
+ * Convert input string to a signed 64-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline int64
+pg_strtoint64_check(const char *s)
+{
+	int64		result;
+	pg_strtoint_status status = pg_strtoint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int64");
+	return result;
+}
+
+/*
+ * pg_strtouint16_check
+ *
+ * Convert input string to a unsigned 16-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline uint16
+pg_strtouint16_check(const char *s)
+{
+	uint16		result;
+	pg_strtoint_status status = pg_strtouint16(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint16");
+	return result;
+}
+
+/*
+ * pg_strtouint32_check
+ *
+ * Convert input string to a signed 32-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline uint32
+pg_strtouint32_check(const char *s)
+{
+	uint32		result;
+	pg_strtoint_status status = pg_strtouint32(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint32");
+	return result;
+}
+
+/*
+ * pg_strtouint64_check
+ *
+ * Convert input string to a signed 64-bit integer.  This throws ereport()
+ * upon bad input format or overflow.
+ */
+static inline uint64
+pg_strtouint64_check(const char *s)
+{
+	uint64		result;
+	pg_strtoint_status status = pg_strtouint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint64");
+	return result;
+}
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#77

andres@anarazel.de

over 6 years ago

In reply to: Michael Paquier (#74)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-09-14 15:02:36 +0900, Michael Paquier wrote:

On Fri, Sep 13, 2019 at 06:38:31PM -0700, Andres Freund wrote:

On 2019-09-10 12:05:25 +0900, Michael Paquier wrote:

On Mon, Sep 09, 2019 at 05:27:04AM -0700, Andres Freund wrote:
Attached is an updated patch? How does it look? I have left the
parts of readfuncs.c for now as there are more issues behind that than
doing a single switch, short reads are one, long reads a second.

Hm? I don't know what you mean by those issues.

I think that we have more issues than it looks. For example:
- Switching UINT to use pg_strtouint32() causes an incompatibility
issue compared to atoui().

"An incompatibility" is, uh, vague.

- Switching INT to use pg_strtoint32() causes a set of warnings as for
example with AttrNumber:
72 | (void) pg_strtoint32(token, &local_node->fldname)
| ^~~~~~~~~~~~~~~~~~~~~
| |
| AttrNumber * {aka short int *}
And it is not like we should use a cast either, as we could hide real
issues. Hence it seems to me that we need to have a new routine
definition for shorter integers and switch more flags to that.

Yea.

- Switching LONG to use pg_strtoint64() leads to another set of
issues, particularly one could see an assertion failure related to Agg
nodes. I am not sure either that we should use int64 here as the size
can be at least 32b.

That seems pretty clearly something that needs to be debugged before
applying this series. If there's such a failure, it indicates that
there's either a problem in this patchset, or a pre-existing problem in
readfuncs.

- Switching OID to use pg_strtoint32 causes a failure with initdb.

Needs to be debugged too. Although I suspect this might just be that you
need to use unsigned variant.

So while I agree with you that a switch should be doable, there is a
large set of issues to ponder about here, and the patch already does a
lot, so I really think that we had better do a closer lookup at those
issues separately, once the basics are in place, and consider them if
they actually make sense. There is much more than just doing a direct
switch in this area with the family of ato*() system calls.

I have no problme applying this separately, but I really don't think
it's wise to apply this before these problems have been debugged.

Greetings,

Andres Freund

#78

coelho@cri.ensmp.fr

over 6 years ago

In reply to: Michael Paquier (#76)

Re: refactoring - share str2*int64 functions

Bonjour Michaël,

Otherwise, this batch of changes looks ok to me.

Thanks.

About v15: applies cleanly, compiles, "make check" ok.

While re-reading the patch, there are bit of repetitions on pg_strtou?int*
comments. I'm wondering whether it would make sense to write a global
comments before each 3-type series to avoid that.

--
Fabien.

#79

michael@paquier.xyz

over 6 years ago

In reply to: Andres Freund (#77)

Re: refactoring - share str2*int64 functions

On Mon, Sep 16, 2019 at 10:08:19AM -0700, Andres Freund wrote:

On 2019-09-14 15:02:36 +0900, Michael Paquier wrote:

- Switching OID to use pg_strtoint32 causes a failure with initdb.

Needs to be debugged too. Although I suspect this might just be that you
need to use unsigned variant.

No, that's not it. My last message had a typo as I used the unsigned
variant. Anyway, by switching the routines of readfuncs.c to use the
new _check wrappers it is easy to see what the actual issue is. And
here we go with one example:
"FATAL: invalid input syntax for type uint32: "12089 :relkind v"

So, the root of the problem is that the optimized conversion routines
would complain if the end of the string includes incorrect characters
when converting a node from text, which is not something that strtoXX
complains about. So our wrappers atooid() and atoui() accept more
types of strings in input as they rely on the system's strtol(). And
that counts for the failures with UINT and OID. atoi() also is more
flexible on that, which explains the failures for INT, as well as
atol() for LONG (this shows a failure in the regression tests, not at
initdb time though).

One may think that this restriction does not actually apply to
READ_UINT64_FIELD because the routine involves no other things than
queryId. However once I enable -DWRITE_READ_PARSE_PLAN_TREES
-DWRITE_READ_PARSE_PLAN_TREES -DCOPY_PARSE_PLAN_TREES in the builds,
queryId parsing also complains with the patch. So except if we
redesign the node reads we are bound to keep around the wrapper of
strtoXX on HEAD called pg_strtouint64() to avoid an incompatibility
when parsing the 64-bit query ID. We could keep that isolated in
readfuncs.c close to the declarations of atoui() and strtobool()
though. This also points out that pg_strtouint64 of HEAD is
inconsistent with its signed relatives in the treatment of the input
string.

The code paths of the patch calling pg_strtouint64_check to parse
completion tags (spi.c and pg_stat_statements.c) should complain in
those cases as the format of the tags for SELECT and COPY is fixed.

In order to unify the parsing interface and put all the conversion
routines in a single place, I still think that the patch has value so
I would still keep it (with a fix for the queryId parsing of course),
but there is much more to it.

So while I agree with you that a switch should be doable, there is a
large set of issues to ponder about here, and the patch already does a
lot, so I really think that we had better do a closer lookup at those
issues separately, once the basics are in place, and consider them if
they actually make sense. There is much more than just doing a direct
switch in this area with the family of ato*() system calls.

I have no problem applying this separately, but I really don't think
it's wise to apply this before these problems have been debugged.

Sure. No problem with that line of reasoning.
--
Michael

#80

michael@paquier.xyz

over 6 years ago

In reply to: Michael Paquier (#79)

Re: refactoring - share str2*int64 functions

On Tue, Sep 17, 2019 at 11:29:13AM +0900, Michael Paquier wrote:

The code paths of the patch calling pg_strtouint64_check to parse
completion tags (spi.c and pg_stat_statements.c) should complain in
those cases as the format of the tags for SELECT and COPY is fixed.

In order to unify the parsing interface and put all the conversion
routines in a single place, I still think that the patch has value so
I would still keep it (with a fix for the queryId parsing of course),
but there is much more to it.

Forgot to mention that another angle of attack would be of course to
add some control flags in the optimized parsing functions to make them
more permissive regarding the handling of the trailing characters, by
not considering as a syntax error the case where the last character is
not a zero-termination.
--
Michael

#81

michael@paquier.xyz

over 6 years ago

In reply to: Michael Paquier (#79)

1 attachment(s)

Re: refactoring - share str2*int64 functions

On Tue, Sep 17, 2019 at 11:29:13AM +0900, Michael Paquier wrote:

In order to unify the parsing interface and put all the conversion
routines in a single place, I still think that the patch has value so
I would still keep it (with a fix for the queryId parsing of course),
but there is much more to it.

As of now, here is an updated patch which takes the path to not
complicate the refactored APIs and fixes the issue with queryID in
readfuncs.c. Thoughts?
--
Michael

Attachments:

str2int-16.patchtext/x-diff; charset=us-asciiDownload

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..7af05fc009 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1022,7 +1022,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			rows = pg_strtouint64_check(completionTag + 5);
 		else
 			rows = 0;
 
diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index adf0490f85..b063f1e122 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -306,7 +306,7 @@ check_foreign_key(PG_FUNCTION_ARGS)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: too short %d (< 5) list of arguments", nargs);
 
-	nrefs = pg_strtoint32(args[0]);
+	nrefs = pg_strtoint32_check(args[0]);
 	if (nrefs < 1)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: %d (< 1) number of references specified", nrefs);
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..1272b4ee04 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2338,8 +2338,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 					CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 					if (strncmp(completionTag, "SELECT ", 7) == 0)
-						_SPI_current->processed =
-							pg_strtouint64(completionTag + 7, NULL, 10);
+						_SPI_current->processed = pg_strtouint64_check(completionTag + 7);
 					else
 					{
 						/*
@@ -2361,8 +2360,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				else if (IsA(stmt->utilityStmt, CopyStmt))
 				{
 					Assert(strncmp(completionTag, "COPY ", 5) == 0);
-					_SPI_current->processed = pg_strtouint64(completionTag + 5,
-															 NULL, 10);
+					_SPI_current->processed = pg_strtouint64_check(completionTag + 5);
 				}
 			}
 
diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
index a9bd47d937..ac2bf2f2c4 100644
--- a/src/backend/libpq/pqmq.c
+++ b/src/backend/libpq/pqmq.c
@@ -286,10 +286,10 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->hint = pstrdup(value);
 				break;
 			case PG_DIAG_STATEMENT_POSITION:
-				edata->cursorpos = pg_strtoint32(value);
+				edata->cursorpos = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_INTERNAL_POSITION:
-				edata->internalpos = pg_strtoint32(value);
+				edata->internalpos = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_INTERNAL_QUERY:
 				edata->internalquery = pstrdup(value);
@@ -316,7 +316,7 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
 				edata->filename = pstrdup(value);
 				break;
 			case PG_DIAG_SOURCE_LINE:
-				edata->lineno = pg_strtoint32(value);
+				edata->lineno = pg_strtoint32_check(value);
 				break;
 			case PG_DIAG_SOURCE_FUNCTION:
 				edata->funcname = pstrdup(value);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 764e3bb90c..4ed69f3d2b 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -80,7 +80,7 @@
 #define READ_UINT64_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = pg_strtouint64(token, NULL, 10)
+	local_node->fldname = strtouint64(token)
 
 /* Read a long integer field (anything written as ":fldname %ld") */
 #define READ_LONG_FIELD(fldname) \
@@ -189,6 +189,27 @@
 #define nullable_string(token,length)  \
 	((length) == 0 ? NULL : debackslash(token, length))
 
+/*
+ * pg_strtouint64
+ *	  Converts 'str' into an unsigned 64-bit integer.
+ *
+ * This has an identical API to strtoul(3) with base 10, except that
+ * it will handle 64-bit ints even where "long" is narrower than that.
+ *
+ * This has the advantage to not consider as a syntax failure any characters
+ * present in the string after the conversion is done.
+ */
+static uint64
+strtouint64(const char *str)
+{
+#ifdef _MSC_VER                    /* MSVC only */
+   return _strtoui64(str, NULL, 10);
+#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
+   return strtoull(str, NULL, 10);
+#else
+   return strtoul(str, NULL, 10);
+#endif
+}
 
 /*
  * _readBitmapset
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1baf7ef31f..cc846a71f1 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -17,6 +17,7 @@
 #include "access/htup_details.h"
 #include "access/table.h"
 #include "catalog/pg_type.h"
+#include "common/string.h"
 #include "mb/pg_wchar.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -25,7 +26,6 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_relation.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
@@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
 
 		case T_Float:
 			/* could be an oversize integer as well as a float ... */
-			if (scanint8(strVal(value), true, &val64))
+			if (pg_strtoint64(strVal(value), &val64) == PG_STRTOINT_OK)
 			{
 				/*
 				 * It might actually fit in int32. Probably only INT_MIN can
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 6eba08a920..1ca6542c2e 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -341,7 +341,7 @@ libpqrcv_identify_system(WalReceiverConn *conn, TimeLineID *primary_tli)
 						   ntuples, nfields, 3, 1)));
 	}
 	primary_sysid = pstrdup(PQgetvalue(res, 0, 0));
-	*primary_tli = pg_strtoint32(PQgetvalue(res, 0, 1));
+	*primary_tli = pg_strtoint32_check(PQgetvalue(res, 0, 1));
 	PQclear(res);
 
 	return primary_sysid;
@@ -487,7 +487,7 @@ libpqrcv_endstreaming(WalReceiverConn *conn, TimeLineID *next_tli)
 		if (PQnfields(res) < 2 || PQntuples(res) != 1)
 			ereport(ERROR,
 					(errmsg("unexpected result set after end-of-streaming")));
-		*next_tli = pg_strtoint32(PQgetvalue(res, 0, 0));
+		*next_tli = pg_strtoint32_check(PQgetvalue(res, 0, 0));
 		PQclear(res);
 
 		/* the result set should be followed by CommandComplete */
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 9c08757fca..d8a2c4979f 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -13,6 +13,7 @@
 #include "postgres.h"
 
 #include "catalog/pg_publication.h"
+#include "common/string.h"
 
 #include "fmgr.h"
 
@@ -22,7 +23,6 @@
 #include "replication/pgoutput.h"
 
 #include "utils/inval.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
@@ -113,7 +113,7 @@ parse_output_parameters(List *options, uint32 *protocol_version,
 						 errmsg("conflicting or redundant options")));
 			protocol_version_given = true;
 
-			if (!scanint8(strVal(defel->arg), true, &parsed))
+			if (pg_strtoint64(strVal(defel->arg), &parsed) != PG_STRTOINT_OK)
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 						 errmsg("invalid proto_version")));
diff --git a/src/backend/tsearch/wparser_def.c b/src/backend/tsearch/wparser_def.c
index f18e761af8..abde3fdf6d 100644
--- a/src/backend/tsearch/wparser_def.c
+++ b/src/backend/tsearch/wparser_def.c
@@ -2460,13 +2460,13 @@ prsd_headline(PG_FUNCTION_ARGS)
 		char	   *val = defGetString(defel);
 
 		if (pg_strcasecmp(defel->defname, "MaxWords") == 0)
-			max_words = pg_strtoint32(val);
+			max_words = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "MinWords") == 0)
-			min_words = pg_strtoint32(val);
+			min_words = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "ShortWord") == 0)
-			shortword = pg_strtoint32(val);
+			shortword = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "MaxFragments") == 0)
-			max_fragments = pg_strtoint32(val);
+			max_fragments = pg_strtoint32_check(val);
 		else if (pg_strcasecmp(defel->defname, "StartSel") == 0)
 			prs->startsel = pstrdup(val);
 		else if (pg_strcasecmp(defel->defname, "StopSel") == 0)
diff --git a/src/backend/utils/adt/arrayutils.c b/src/backend/utils/adt/arrayutils.c
index 6170e35ac2..29d9784f41 100644
--- a/src/backend/utils/adt/arrayutils.c
+++ b/src/backend/utils/adt/arrayutils.c
@@ -226,7 +226,7 @@ ArrayGetIntegerTypmods(ArrayType *arr, int *n)
 	result = (int32 *) palloc(*n * sizeof(int32));
 
 	for (i = 0; i < *n; i++)
-		result[i] = pg_strtoint32(DatumGetCString(elem_values[i]));
+		result[i] = pg_strtoint32_check(DatumGetCString(elem_values[i]));
 
 	pfree(elem_values);
 
diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index 6515fc8ec6..15f3babc05 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -26,7 +26,6 @@
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 #include "utils/cash.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 053affa5ce..706e3b4258 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -93,7 +93,6 @@
 #include "utils/datetime.h"
 #include "utils/float.h"
 #include "utils/formatting.h"
-#include "utils/int8.h"
 #include "utils/memutils.h"
 #include "utils/numeric.h"
 #include "utils/pg_locale.h"
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 04825fc77d..3bf8438942 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -63,8 +63,19 @@ Datum
 int2in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
+	int16		res;
+	pg_strtoint_status status;
 
-	PG_RETURN_INT16(pg_strtoint16(num));
+	/*
+	 * pg_strtoint16_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint16(num, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, num, "smallint");
+
+	PG_RETURN_INT16(res);
 }
 
 /*
@@ -268,8 +279,19 @@ Datum
 int4in(PG_FUNCTION_ARGS)
 {
 	char	   *num = PG_GETARG_CSTRING(0);
+	int32		res;
+	pg_strtoint_status status;
 
-	PG_RETURN_INT32(pg_strtoint32(num));
+	/*
+	 * pg_strtoint32_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint32(num, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, num, "integer");
+
+	PG_RETURN_INT32(res);
 }
 
 /*
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..d94185c2b4 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -23,7 +23,6 @@
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
-#include "utils/int8.h"
 #include "utils/builtins.h"
 
 
@@ -47,99 +46,25 @@ typedef struct
  * Formatting and conversion routines.
  *---------------------------------------------------------*/
 
-/*
- * scanint8 --- try to parse a string into an int8.
- *
- * If errorOK is false, ereport a useful error message if the string is bad.
- * If errorOK is true, just return "false" for bad input.
- */
-bool
-scanint8(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value \"%s\" is out of range for type %s",
-						str, "bigint")));
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-				 errmsg("invalid input syntax for type %s: \"%s\"",
-						"bigint", str)));
-	return false;
-}
-
 /* int8in()
  */
 Datum
 int8in(PG_FUNCTION_ARGS)
 {
 	char	   *str = PG_GETARG_CSTRING(0);
-	int64		result;
+	int64		res;
+	pg_strtoint_status status;
 
-	(void) scanint8(str, false, &result);
-	PG_RETURN_INT64(result);
+	/*
+	 * pg_strtoint64_check is not used here as any error message generated
+	 * should be consistent with the SQL data type used.
+	 */
+	status = pg_strtoint64(str, &res);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, str, "bigint");
+
+	PG_RETURN_INT64(res);
 }
 
 
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..bc02958950 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -39,7 +39,6 @@
 #include "utils/float.h"
 #include "utils/guc.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/numeric.h"
 #include "utils/sortsupport.h"
 
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..03fc5bd220 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -109,158 +109,40 @@ pg_atoi(const char *s, int size, int c)
 	return (int32) l;
 }
 
-/*
- * Convert input string to a signed 16 bit integer.
- *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
- *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
- */
-int16
-pg_strtoint16(const char *s)
-{
-	const char *ptr = s;
-	int16		tmp = 0;
-	bool		neg = false;
-
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT16_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "smallint")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"smallint", s)));
-
-	return 0;					/* keep compiler quiet */
-}
 
 /*
- * Convert input string to a signed 32 bit integer.
+ * pg_strtoint_error
  *
- * Allows any number of leading or trailing whitespace characters. Will throw
- * ereport() upon bad input format or overflow.
- *
- * NB: Accumulate input as a negative number, to deal with two's complement
- * representation of the most negative number, which can't be represented as a
- * positive number.
+ * utility wrapper to report an error with string-to-integer conversion
+ * functions.
  */
-int32
-pg_strtoint32(const char *s)
+void
+pg_strtoint_error(pg_strtoint_status status, const char *input,
+				  const char *type)
 {
-	const char *ptr = s;
-	int32		tmp = 0;
-	bool		neg = false;
-
-	/* skip leading spaces */
-	while (likely(*ptr) && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
+	switch (status)
 	{
-		ptr++;
-		neg = true;
+		case PG_STRTOINT_SYNTAX_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+					 errmsg("invalid input syntax for type %s: \"%s\"",
+							type, input)));
+			break;
+
+		case PG_STRTOINT_RANGE_ERROR:
+			ereport(ERROR,
+					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+					 errmsg("value \"%s\" is out of range for type %s",
+							input, type)));
+			break;
+
+		default:
+			elog(ERROR, "unexpected status for string-to-integer conversion");
+			break;
 	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		/* could fail if input is most negative number */
-		if (unlikely(tmp == PG_INT32_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	return tmp;
-
-out_of_range:
-	ereport(ERROR,
-			(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-			 errmsg("value \"%s\" is out of range for type %s",
-					s, "integer")));
-
-invalid_syntax:
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
-			 errmsg("invalid input syntax for type %s: \"%s\"",
-					"integer", s)));
-
-	return 0;					/* keep compiler quiet */
 }
 
+
 /*
  * pg_itoa: converts a signed 16-bit integer to its string representation
  *
@@ -543,25 +425,3 @@ pg_ltostr(char *str, int32 value)
 
 	return end;
 }
-
-/*
- * pg_strtouint64
- *		Converts 'str' into an unsigned 64-bit integer.
- *
- * This has the identical API to strtoul(3), except that it will handle
- * 64-bit ints even where "long" is narrower than that.
- *
- * For the moment it seems sufficient to assume that the platform has
- * such a function somewhere; let's not roll our own.
- */
-uint64
-pg_strtouint64(const char *str, char **endptr, int base)
-{
-#ifdef _MSC_VER					/* MSVC only */
-	return _strtoui64(str, endptr, base);
-#elif defined(HAVE_STRTOULL) && SIZEOF_LONG < 8
-	return strtoull(str, endptr, base);
-#else
-	return strtoul(str, endptr, base);
-#endif
-}
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index e5c7e5c7ee..749fc44623 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -37,7 +37,6 @@
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/hashutils.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index d36156f4e4..8c8c98bc3f 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5626,8 +5626,11 @@ text_format(PG_FUNCTION_ARGS)
 
 				str = OutputFunctionCall(&typoutputinfo_width, value);
 
-				/* pg_strtoint32 will complain about bad data or overflow */
-				width = pg_strtoint32(str);
+				/*
+				 * pg_strtoint32_check will complain about bad data or
+				 * overflow.
+				 */
+				width = pg_strtoint32_check(str);
 
 				pfree(str);
 			}
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index e9020ad231..84a9d39795 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -23,6 +23,7 @@
  *-------------------------------------------------------------------------
  */
 
+#include "common/string.h"
 #include "fe_utils/psqlscan_int.h"
 
 /* context information for reporting errors in expressions */
@@ -205,7 +206,7 @@ notnull			[Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 					return MAXINT_PLUS_ONE_CONST;
 				}
 {digit}+		{
-					if (!strtoint64(yytext, true, &yylval->ival))
+					if (pg_strtoint64(yytext, &yylval->ival) != PG_STRTOINT_OK)
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
 										  strdup(yytext));
 					return INTEGER_CONST;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..600f1deb71 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -34,6 +34,7 @@
 #include "postgres_fe.h"
 #include "common/int.h"
 #include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/conditional.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -664,116 +665,6 @@ usage(void)
 		   progname, progname);
 }
 
-/* return whether str matches "^\s*[-+]?[0-9]+$" */
-static bool
-is_an_int(const char *str)
-{
-	const char *ptr = str;
-
-	/* skip leading spaces; cast is consistent with strtoint64 */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* skip sign */
-	if (*ptr == '+' || *ptr == '-')
-		ptr++;
-
-	/* at least one digit */
-	if (*ptr && !isdigit((unsigned char) *ptr))
-		return false;
-
-	/* eat all digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-		ptr++;
-
-	/* must have reached end of string */
-	return *ptr == '\0';
-}
-
-
-/*
- * strtoint64 -- convert a string to 64-bit integer
- *
- * This function is a slightly modified version of scanint8() from
- * src/backend/utils/adt/int8.c.
- *
- * The function returns whether the conversion worked, and if so
- * "*result" is set to the result.
- *
- * If not errorOK, an error message is also printed out on errors.
- */
-bool
-strtoint64(const char *str, bool errorOK, int64 *result)
-{
-	const char *ptr = str;
-	int64		tmp = 0;
-	bool		neg = false;
-
-	/*
-	 * Do our own scan, rather than relying on sscanf which might be broken
-	 * for long long.
-	 *
-	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
-	 * value as a negative number.
-	 */
-
-	/* skip leading spaces */
-	while (*ptr && isspace((unsigned char) *ptr))
-		ptr++;
-
-	/* handle sign */
-	if (*ptr == '-')
-	{
-		ptr++;
-		neg = true;
-	}
-	else if (*ptr == '+')
-		ptr++;
-
-	/* require at least one digit */
-	if (unlikely(!isdigit((unsigned char) *ptr)))
-		goto invalid_syntax;
-
-	/* process digits */
-	while (*ptr && isdigit((unsigned char) *ptr))
-	{
-		int8		digit = (*ptr++ - '0');
-
-		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
-			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
-			goto out_of_range;
-	}
-
-	/* allow trailing whitespace, but not other trailing chars */
-	while (*ptr != '\0' && isspace((unsigned char) *ptr))
-		ptr++;
-
-	if (unlikely(*ptr != '\0'))
-		goto invalid_syntax;
-
-	if (!neg)
-	{
-		if (unlikely(tmp == PG_INT64_MIN))
-			goto out_of_range;
-		tmp = -tmp;
-	}
-
-	*result = tmp;
-	return true;
-
-out_of_range:
-	if (!errorOK)
-		fprintf(stderr,
-				"value \"%s\" is out of range for type bigint\n", str);
-	return false;
-
-invalid_syntax:
-	if (!errorOK)
-		fprintf(stderr,
-				"invalid input syntax for type bigint: \"%s\"\n", str);
-	return false;
-}
-
 /* convert string to double, detecting overflows/underflows */
 bool
 strtodouble(const char *str, bool errorOK, double *dv)
@@ -1288,6 +1179,7 @@ static bool
 makeVariableValue(Variable *var)
 {
 	size_t		slen;
+	int64		ival;
 
 	if (var->value.type != PGBT_NO_VALUE)
 		return true;			/* no work */
@@ -1320,15 +1212,16 @@ makeVariableValue(Variable *var)
 	{
 		setBoolValue(&var->value, false);
 	}
-	else if (is_an_int(var->svalue))
+
+	/*
+	 * Attempt an int64 conversion.
+	 *
+	 * A too-large but syntactically correct int64 will be parsed as a double
+	 * below.
+	 */
+	else if (pg_strtoint64(var->svalue, &ival) == PG_STRTOINT_OK)
 	{
-		/* if it looks like an int, it must be an int without overflow */
-		int64		iv;
-
-		if (!strtoint64(var->svalue, false, &iv))
-			return false;
-
-		setIntValue(&var->value, iv);
+		setIntValue(&var->value, ival);
 	}
 	else						/* type should be double */
 	{
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..da3f8fa055 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -160,7 +160,6 @@ extern void syntax_error(const char *source, int lineno, const char *line,
 						 const char *cmd, const char *msg,
 						 const char *more, int col) pg_attribute_noreturn();
 
-extern bool strtoint64(const char *str, bool errorOK, int64 *pi);
 extern bool strtodouble(const char *str, bool errorOK, double *pd);
 
 #endif							/* PGBENCH_H */
diff --git a/src/common/string.c b/src/common/string.c
index c9b8482cb0..c0811bf0d6 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -21,6 +21,7 @@
 #include "postgres_fe.h"
 #endif
 
+#include "common/int.h"
 #include "common/string.h"
 
 
@@ -58,6 +59,348 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 }
 
 
+/*
+ * pg_strtoint16
+ *
+ * Convert input string to a signed 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * NB: Accumulate input as a negative number, to deal with two's complement
+ * representation of the most negative number, which can't be represented as a
+ * positive number.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtoint16(const char *s, int16 *result)
+{
+	const char *ptr = s;
+	int16		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT16_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint32
+ *
+ * Same as previously, for 32-bit signed integer.
+ */
+pg_strtoint_status
+pg_strtoint32(const char *s, int32 *result)
+{
+	const char *ptr = s;
+	int32		tmp = 0;
+	bool		neg = false;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		/* could fail if input is most negative number */
+		if (unlikely(tmp == PG_INT32_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtoint64
+ *
+ * Same as previously, for 64-bit signed integer.
+ */
+pg_strtoint_status
+pg_strtoint64(const char *str, int64 *result)
+{
+	const char *ptr = str;
+	int64		tmp = 0;
+	bool		neg = false;
+
+	/*
+	 * Do our own scan, rather than relying on sscanf which might be broken
+	 * for long long.
+	 *
+	 * As INT64_MIN can't be stored as a positive 64 bit integer, accumulate
+	 * value as a negative number.
+	 */
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '-')
+	{
+		ptr++;
+		neg = true;
+	}
+	else if (*ptr == '+')
+		ptr++;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		int64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_s64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_sub_s64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace, but not other trailing chars */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	if (!neg)
+	{
+		if (unlikely(tmp == PG_INT64_MIN))
+			return PG_STRTOINT_RANGE_ERROR;
+		tmp = -tmp;
+	}
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint16
+ *
+ * Convert input string to an unsigned 16-bit integer.  Allows any number of
+ * leading or trailing whitespace characters.
+ *
+ * The function returns immediately if the conversion failed with a status
+ * value to let the caller handle the error.  On success, the result is
+ * stored in "*result".
+ */
+pg_strtoint_status
+pg_strtouint16(const char *str, uint16 *result)
+{
+	const char *ptr = str;
+	uint16		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u16_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u16_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint32
+ *
+ * Same as previously, for 32-bit unsigned integer.
+ */
+pg_strtoint_status
+pg_strtouint32(const char *str, uint32 *result)
+{
+	const char *ptr = str;
+	uint32		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint8		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u32_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u32_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
+/*
+ * pg_strtouint64
+ *
+ * Same as previously, for 64-bit unsigned integer.
+ */
+pg_strtoint_status
+pg_strtouint64(const char *str, uint64 *result)
+{
+	const char *ptr = str;
+	uint64		tmp = 0;
+
+	/* skip leading spaces */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* handle sign */
+	if (*ptr == '+')
+		ptr++;
+	else if (unlikely(*ptr == '-'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* require at least one digit */
+	if (unlikely(!isdigit((unsigned char) *ptr)))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	/* process digits, we know that we have one ahead per the last check */
+	do
+	{
+		uint64		digit = (*ptr++ - '0');
+
+		if (unlikely(pg_mul_u64_overflow(tmp, 10, &tmp)) ||
+			unlikely(pg_add_u64_overflow(tmp, digit, &tmp)))
+			return PG_STRTOINT_RANGE_ERROR;
+	}
+	while (isdigit((unsigned char) *ptr));
+
+	/* allow trailing whitespace */
+	while (isspace((unsigned char) *ptr))
+		ptr++;
+
+	/* but not other trailing chars */
+	if (unlikely(*ptr != '\0'))
+		return PG_STRTOINT_SYNTAX_ERROR;
+
+	*result = tmp;
+	return PG_STRTOINT_OK;
+}
+
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a '?' char
  *
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 94f653fdd7..ae5d56356a 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -13,6 +13,24 @@
 extern bool pg_str_endswith(const char *str, const char *end);
 extern int	strtoint(const char *pg_restrict str, char **pg_restrict endptr,
 					 int base);
+
+/*
+ * Set of routines for conversion from string to various integer types.
+ */
+typedef enum pg_strtoint_status
+{
+	PG_STRTOINT_OK,
+	PG_STRTOINT_SYNTAX_ERROR,
+	PG_STRTOINT_RANGE_ERROR
+} pg_strtoint_status;
+
+extern pg_strtoint_status pg_strtoint16(const char *str, int16 *result);
+extern pg_strtoint_status pg_strtoint32(const char *str, int32 *result);
+extern pg_strtoint_status pg_strtoint64(const char *str, int64 *result);
+extern pg_strtoint_status pg_strtouint16(const char *str, uint16 *result);
+extern pg_strtoint_status pg_strtouint32(const char *str, uint32 *result);
+extern pg_strtoint_status pg_strtouint64(const char *str, uint64 *result);
+
 extern void pg_clean_ascii(char *str);
 extern int	pg_strip_crlf(char *str);
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..755c5ac5b8 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -15,6 +15,7 @@
 #define BUILTINS_H
 
 #include "fmgr.h"
+#include "common/string.h"
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
@@ -43,14 +44,85 @@ extern int	namestrcmp(Name name, const char *str);
 
 /* numutils.c */
 extern int32 pg_atoi(const char *s, int size, int c);
-extern int16 pg_strtoint16(const char *s);
-extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
-extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
+extern void pg_strtoint_error(pg_strtoint_status status,
+							  const char *input,
+							  const char *type) pg_attribute_noreturn();
+
+/*
+ * Set of routines to convert an input string to a signed or unsigned
+ * 16-bit, 32-bit or 64-bit integer.  Those routines throw ereport()
+ * upon bad input format or overflow.
+ */
+static inline int16
+pg_strtoint16_check(const char *s)
+{
+	int16		result;
+	pg_strtoint_status status = pg_strtoint16(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int16");
+	return result;
+}
+
+static inline int32
+pg_strtoint32_check(const char *s)
+{
+	int32		result;
+	pg_strtoint_status status = pg_strtoint32(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int32");
+	return result;
+}
+
+static inline int64
+pg_strtoint64_check(const char *s)
+{
+	int64		result;
+	pg_strtoint_status status = pg_strtoint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "int64");
+	return result;
+}
+
+static inline uint16
+pg_strtouint16_check(const char *s)
+{
+	uint16		result;
+	pg_strtoint_status status = pg_strtouint16(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint16");
+	return result;
+}
+
+static inline uint32
+pg_strtouint32_check(const char *s)
+{
+	uint32		result;
+	pg_strtoint_status status = pg_strtouint32(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint32");
+	return result;
+}
+
+static inline uint64
+pg_strtouint64_check(const char *s)
+{
+	uint64		result;
+	pg_strtoint_status status = pg_strtouint64(s, &result);
+
+	if (unlikely(status != PG_STRTOINT_OK))
+		pg_strtoint_error(status, s, "uint64");
+	return result;
+}
 
 /* oid.c */
 extern oidvector *buildoidvector(const Oid *oids, int n);
diff --git a/src/include/utils/int8.h b/src/include/utils/int8.h
deleted file mode 100644
index 4836095d93..0000000000
--- a/src/include/utils/int8.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * int8.h
- *	  Declarations for operations on 64-bit integers.
- *
- *
- * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/int8.h
- *
- * NOTES
- * These data types are supported on all 64-bit architectures, and may
- *	be supported through libraries on some 32-bit machines. If your machine
- *	is not currently supported, then please try to make it so, then post
- *	patches to the postgresql.org hackers mailing list.
- *
- *-------------------------------------------------------------------------
- */
-#ifndef INT8_H
-#define INT8_H
-
-extern bool scanint8(const char *str, bool errorOK, int64 *result);
-
-#endif							/* INT8_H */

#82

Ashutosh Sharma

ashu.coek88@gmail.com

over 6 years ago

In reply to: Michael Paquier (#81)

Re: refactoring - share str2*int64 functions

Is there any specific reason for hard coding the *base* of a number
representing the string in strtouint64(). I understand that currently
strtouint64() is being used just to convert an input string to decimal
unsigned value but what if we want it to be used for hexadecimal
values or may be some other values, in that case it can't be used.
Further, the function name is strtouint64() but the comments atop it's
definition says it's pg_strtouint64(). That needs to be corrected.

At few places, I could see that the function call to
pg_strtoint32_check() is followed by an error handling. Isn't that
already being done in pg_strtoint32_check function itself. For e.g. in
refint.c the function call to pg_strtoint32_check is followed by a if
condition that checks for an error which I assume shouldn't be there
as it is already being done by pg_strtoint32_check.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

Show quoted text

On Wed, Sep 18, 2019 at 6:43 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Sep 17, 2019 at 11:29:13AM +0900, Michael Paquier wrote:

In order to unify the parsing interface and put all the conversion
routines in a single place, I still think that the patch has value so
I would still keep it (with a fix for the queryId parsing of course),
but there is much more to it.

As of now, here is an updated patch which takes the path to not
complicate the refactored APIs and fixes the issue with queryID in
readfuncs.c. Thoughts?
--
Michael

#83

andres@anarazel.de

over 6 years ago

In reply to: Ashutosh Sharma (#82)

Re: refactoring - share str2*int64 functions

Hi,

On 2019-10-04 14:27:44 +0530, Ashutosh Sharma wrote:

Is there any specific reason for hard coding the *base* of a number
representing the string in strtouint64(). I understand that currently
strtouint64() is being used just to convert an input string to decimal
unsigned value but what if we want it to be used for hexadecimal
values or may be some other values, in that case it can't be used.

It's a lot slower if the base is variable, because the compiler cannot
replace the division by shifts.

Greetings,

Andres Freund

#84

Ashutosh Sharma

ashu.coek88@gmail.com

over 6 years ago

In reply to: Andres Freund (#83)

Re: refactoring - share str2*int64 functions

On Fri, Oct 4, 2019 at 8:58 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-10-04 14:27:44 +0530, Ashutosh Sharma wrote:

Is there any specific reason for hard coding the *base* of a number
representing the string in strtouint64(). I understand that currently
strtouint64() is being used just to convert an input string to decimal
unsigned value but what if we want it to be used for hexadecimal
values or may be some other values, in that case it can't be used.

It's a lot slower if the base is variable, because the compiler cannot
replace the division by shifts.

Thanks Andres for the reply. I didn't know that the compiler won't be
able to replace division with shifts operator if the base is variable
and it's true that it would make the things a lot slower.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

#85

michael@paquier.xyz

over 6 years ago

In reply to: Ashutosh Sharma (#82)

Re: refactoring - share str2*int64 functions

On Fri, Oct 04, 2019 at 02:27:44PM +0530, Ashutosh Sharma wrote:

Is there any specific reason for hard coding the *base* of a number
representing the string in strtouint64(). I understand that currently
strtouint64() is being used just to convert an input string to decimal
unsigned value but what if we want it to be used for hexadecimal
values or may be some other values, in that case it can't be used.
Further, the function name is strtouint64() but the comments atop it's
definition says it's pg_strtouint64(). That needs to be corrected.

Performance, as Andres has already stated upthread. Moving away from
strtol gives roughly a 40% improvement with a call-to-call comparison:
/messages/by-id/20190909052814.GA26605@paquier.xyz

At few places, I could see that the function call to
pg_strtoint32_check() is followed by an error handling. Isn't that
already being done in pg_strtoint32_check function itself. For e.g. in
refint.c the function call to pg_strtoint32_check is followed by a if
condition that checks for an error which I assume shouldn't be there
as it is already being done by pg_strtoint32_check.

pg_strtoint32_check is used for a signed integer, so it would complain
about incorrect input syntax, but not when the parsed integer is less
or equal than 0, which is what refint.c complains about.
--
Michael

#86