Efficient output for integer types

Started by David Fetterover 6 years ago23 messages

david@fetter.org

over 6 years ago

2 attachment(s)

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Thanks to Andrew Gierth for lots of patient help.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v1-0001-Output-digits-two-at-a-time-in-sprintf.c.patchtext/x-diff; charset=us-asciiDownload

From 6e8136ece5b01ca9cd16bdb974c4d54e939c92cf Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Tue, 10 Sep 2019 02:06:31 -0700
Subject: [PATCH v1 1/2] Output digits two at a time in sprintf.c
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


diff --git a/src/port/snprintf.c b/src/port/snprintf.c
index 8fd997553e..fd9d384144 100644
--- a/src/port/snprintf.c
+++ b/src/port/snprintf.c
@@ -1014,9 +1014,60 @@ fmtint(long long value, char type, int forcesign, int leftjust,
 	   PrintfTarget *target)
 {
 	unsigned long long base;
+	unsigned long long square;
 	unsigned long long uvalue;
 	int			dosign;
-	const char *cvt = "0123456789abcdef";
+	/* Maps for octal, decimal, and two flavors of hexadecimal */
+	const char *digits;
+	const char	decimal_digits[200] =
+					/* 10^2 * 2 decimal digits */
+					"0001020304050607080910111213141516171819"
+					"2021222324252627282930313233343536373839"
+					"4041424344454647484950515253545556575859"
+					"6061626364656667686970717273747576777879"
+					"8081828384858687888990919293949596979899";
+	const char	octal_digits[128] =
+					/* 8^2 * 2 octal digits */
+					"00010203040506071011121314151617"
+					"20212223242526273031323334353637"
+					"40414243444546475051525354555657"
+					"60616263646566677071727374757677";
+					/* 16^2 * 2 hex digits */
+	const char	hex_lower_digits[512] =
+					"000102030405060708090a0b0c0d0e0f"
+					"101112131415161718191a1b1c1d1e1f"
+					"202122232425262728292a2b2c2d2e2f"
+					"303132333435363738393a3b3c3d3e3f"
+					"404142434445464748494a4b4c4d4e4f"
+					"505152535455565758595a5b5c5d5e5f"
+					"606162636465666768696a6b6c6d6e6f"
+					"707172737475767778797a7b7c7d7e7f"
+					"808182838485868788898a8b8c8d8e8f"
+					"909192939495969798999a9b9c9d9e9f"
+					"a0a1a2a3a4a5a6a7a8a9aaabacadaeaf"
+					"b0b1b2b3b4b5b6b7b8b9babbbcbdbebf"
+					"c0c1c2c3c4c5c6c7c8c9cacbcccdcecf"
+					"d0d1d2d3d4d5d6d7d8d9dadbdcdddedf"
+					"e0e1e2e3e4e5e6e7e8e9eaebecedeeef"
+					"f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff";
+	const char	hex_upper_digits[512] =
+					/* 16^2 * 2 HEX DIGITS */
+					"000102030405060708090A0B0C0D0E0F"
+					"101112131415161718191A1B1C1D1E1F"
+					"202122232425262728292A2B2C2D2E2F"
+					"303132333435363738393A3B3C3D3E3F"
+					"404142434445464748494A4B4C4D4E4F"
+					"505152535455565758595A5B5C5D5E5F"
+					"606162636465666768696A6B6C6D6E6F"
+					"707172737475767778797A7B7C7D7E7F"
+					"808182838485868788898A8B8C8D8E8F"
+					"909192939495969798999A9B9C9D9E9F"
+					"A0A1A2A3A4A5A6A7A8A9AAABACADAEAF"
+					"B0B1B2B3B4B5B6B7B8B9BABBBCBDBEBF"
+					"C0C1C2C3C4C5C6C7C8C9CACBCCCDCECF"
+					"D0D1D2D3D4D5D6D7D8D9DADBDCDDDEDF"
+					"E0E1E2E3E4E5E6E7E8E9EAEBECEDEEEF"
+					"F0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF";
 	int			signvalue = 0;
 	char		convert[64];
 	int			vallen = 0;
@@ -1027,23 +1078,27 @@ fmtint(long long value, char type, int forcesign, int leftjust,
 	{
 		case 'd':
 		case 'i':
+			digits = decimal_digits;
 			base = 10;
 			dosign = 1;
 			break;
 		case 'o':
+			digits = octal_digits;
 			base = 8;
 			dosign = 0;
 			break;
 		case 'u':
+			digits = decimal_digits;
 			base = 10;
 			dosign = 0;
 			break;
 		case 'x':
+			digits = hex_lower_digits;
 			base = 16;
 			dosign = 0;
 			break;
 		case 'X':
-			cvt = "0123456789ABCDEF";
+			digits = hex_upper_digits;
 			base = 16;
 			dosign = 0;
 			break;
@@ -1051,6 +1106,8 @@ fmtint(long long value, char type, int forcesign, int leftjust,
 			return;				/* keep compiler quiet */
 	}
 
+	square = base * base;
+
 	/* disable MSVC warning about applying unary minus to an unsigned value */
 #if _MSC_VER
 #pragma warning(push)
@@ -1073,12 +1130,20 @@ fmtint(long long value, char type, int forcesign, int leftjust,
 		vallen = 0;
 	else
 	{
-		/* make integer string */
-		do
+		/* make integer string, two digits at a time */
+		while(uvalue >= base)
 		{
-			convert[sizeof(convert) - (++vallen)] = cvt[uvalue % base];
-			uvalue = uvalue / base;
-		} while (uvalue);
+			const int i = (uvalue % square) * 2;
+			uvalue /= square;
+			vallen += 2;
+			memcpy(convert + sizeof(convert) - vallen, digits + i, 2);
+		}
+		/* Account for single digit */
+		if (uvalue > 0 || vallen == 0)
+		{
+			vallen++;
+			memcpy(convert + sizeof(convert) - vallen, digits + uvalue * 2 + 1, 1);
+		}
 	}
 
 	zeropad = Max(0, precision - vallen);

--------------2.21.0--

v1-0002-Made-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From f4a3729900484292cce5066be6a6f183f489ae8c Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v1 2/2] Made int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in an int8 efficiently

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..17ca533b87 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 580043233b..3818dbaa85 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -39,6 +39,8 @@ jsonpath_scan.c: FLEX_NO_BACKUP=yes
 # jsonpath_scan is compiled as part of jsonpath_gram
 jsonpath_gram.o: jsonpath_scan.c
 
+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
+
 # jsonpath_gram.c and jsonpath_scan.c are in the distribution tarball,
 # so they are not cleaned here.
 clean distclean maintainer-clean:
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..f75faa9255 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,44 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] = {
+	'0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
+	'1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9',
+	'2', '0', '2', '1', '2', '2', '2', '3', '2', '4', '2', '5', '2', '6', '2', '7', '2', '8', '2', '9',
+	'3', '0', '3', '1', '3', '2', '3', '3', '3', '4', '3', '5', '3', '6', '3', '7', '3', '8', '3', '9',
+	'4', '0', '4', '1', '4', '2', '4', '3', '4', '4', '4', '5', '4', '6', '4', '7', '4', '8', '4', '9',
+	'5', '0', '5', '1', '5', '2', '5', '3', '5', '4', '5', '5', '5', '6', '5', '7', '5', '8', '5', '9',
+	'6', '0', '6', '1', '6', '2', '6', '3', '6', '4', '6', '5', '6', '6', '6', '7', '6', '8', '6', '9',
+	'7', '0', '7', '1', '7', '2', '7', '3', '7', '4', '7', '5', '7', '6', '7', '7', '7', '8', '7', '9',
+	'8', '0', '8', '1', '8', '2', '8', '3', '8', '4', '8', '5', '8', '6', '8', '7', '8', '8', '8', '9',
+	'9', '0', '9', '1', '9', '2', '9', '3', '9', '4', '9', '5', '9', '6', '9', '7', '9', '8', '9', '9'
+};
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -333,13 +371,13 @@ pg_ltoa(int32 value, char *a)
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -350,37 +388,83 @@ pg_lltoa(int64 value, char *a)
 		memcpy(a, "-9223372036854775808", 21);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		value = q;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	uint32		value2 = (uint32) value;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+	while (value2 >= 10000)
+	{
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+
+		value2 /= 10000;
+
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		value2 /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
 	{
-		char		swap = *start;
+		const uint32 c = value2 << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
 	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..9e8392741e 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,7 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+#define MAXINT8LEN 21
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);

--------------2.21.0--

Andrey Borodin

x4mmm@yandex-team.ru

over 6 years ago

In reply to: David Fetter (#1)

Re: Efficient output for integer types

15 сент. 2019 г., в 12:18, David Fetter <david@fetter.org> написал(а):

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Hi! Looks cool.

Just curious if for any fixed base and square here

+		while(uvalue >= base)
 		{
+			const int i = (uvalue % square) * 2;
+			uvalue /= square;
+			vallen += 2;
+			memcpy(convert + sizeof(convert) - vallen, digits + i, 2);
+		}

compiler will have a chance to avoid idiv instruction?
Maybe few specialized functions could work better than generic algorithm?

Best regards, Andrey Borodin.

David Fetter

david@fetter.org

over 6 years ago

In reply to: Andrey Borodin (#2)

Re: Efficient output for integer types

On Sun, Sep 15, 2019 at 02:06:29PM +0500, Andrey Borodin wrote:

15 сент. 2019 г., в 12:18, David Fetter <david@fetter.org> написал(а):

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Hi! Looks cool.

Just curious if for any fixed base and square here
+		while(uvalue >= base)
{
+			const int i = (uvalue % square) * 2;
+			uvalue /= square;
+			vallen += 2;
+			memcpy(convert + sizeof(convert) - vallen, digits + i, 2);
+		}
compiler will have a chance to avoid idiv instruction?

That could very well be. I took the idea (and most of the code) from
the Ryū implementation Andrew Gierth committed for 12.

Maybe few specialized functions could work better than generic
algorithm?

Could be. What do you have in mind? I'm guessing that the ones for
decimals, that being both the most common case and the least obvious
as to how to optimize, would give the most benefit.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#1)

1 attachment(s)

Re: Efficient output for integer types

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v2-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From 3297c8ab91e0b4b70e51cbe171d361a7b18d465d Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v2] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..17ca533b87 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 580043233b..3818dbaa85 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -39,6 +39,8 @@ jsonpath_scan.c: FLEX_NO_BACKUP=yes
 # jsonpath_scan is compiled as part of jsonpath_gram
 jsonpath_gram.o: jsonpath_scan.c
 
+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
+
 # jsonpath_gram.c and jsonpath_scan.c are in the distribution tarball,
 # so they are not cleaned here.
 clean distclean maintainer-clean:
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..9da2535c5a 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,58 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] = {
+	'0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
+	'1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9',
+	'2', '0', '2', '1', '2', '2', '2', '3', '2', '4', '2', '5', '2', '6', '2', '7', '2', '8', '2', '9',
+	'3', '0', '3', '1', '3', '2', '3', '3', '3', '4', '3', '5', '3', '6', '3', '7', '3', '8', '3', '9',
+	'4', '0', '4', '1', '4', '2', '4', '3', '4', '4', '4', '5', '4', '6', '4', '7', '4', '8', '4', '9',
+	'5', '0', '5', '1', '5', '2', '5', '3', '5', '4', '5', '5', '5', '6', '5', '7', '5', '8', '5', '9',
+	'6', '0', '6', '1', '6', '2', '6', '3', '6', '4', '6', '5', '6', '6', '6', '7', '6', '8', '6', '9',
+	'7', '0', '7', '1', '7', '2', '7', '3', '7', '4', '7', '5', '7', '6', '7', '7', '7', '8', '7', '9',
+	'8', '0', '8', '1', '8', '2', '8', '3', '8', '4', '8', '5', '8', '6', '8', '7', '8', '8', '8', '9',
+	'9', '0', '9', '1', '9', '2', '9', '3', '9', '4', '9', '5', '9', '6', '9', '7', '9', '8', '9', '9'
+};
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -284,8 +336,8 @@ pg_itoa(int16 i, char *a)
 void
 pg_ltoa(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -296,50 +348,91 @@ pg_ltoa(int32 value, char *a)
 		memcpy(a, "-2147483648", 12);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	while (value >= 10000)
+	{
+		const	uint32 c = value - 10000 * (value / 10000);
+
+		value /= 10000;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value >= 100)
 	{
-		char		swap = *start;
+		const uint32 c = (value % 100) << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		value /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
 	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+	}
+
+	a[olength] = '\0';
 }
-
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -350,37 +443,82 @@ pg_lltoa(int64 value, char *a)
 		memcpy(a, "-9223372036854775808", 21);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	uint32		value2 = (uint32) value;
+
+	while (value2 >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+
+		value2 /= 10000;
+
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+
+		value2 /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
+	{
+		const uint32 c = value2 << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..9e8392741e 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,7 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+#define MAXINT8LEN 21
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);

--------------2.21.0--

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#4)

1 attachment(s)

Re: Efficient output for integer types

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v3-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From e6cc7e79c0c36ce8b0307356820d4dea1f07f2cb Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v3] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..17ca533b87 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 580043233b..3818dbaa85 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -39,6 +39,8 @@ jsonpath_scan.c: FLEX_NO_BACKUP=yes
 # jsonpath_scan is compiled as part of jsonpath_gram
 jsonpath_gram.o: jsonpath_scan.c
 
+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
+
 # jsonpath_gram.c and jsonpath_scan.c are in the distribution tarball,
 # so they are not cleaned here.
 clean distclean maintainer-clean:
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..2807deab5d 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,58 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] = {
+	'0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
+	'1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9',
+	'2', '0', '2', '1', '2', '2', '2', '3', '2', '4', '2', '5', '2', '6', '2', '7', '2', '8', '2', '9',
+	'3', '0', '3', '1', '3', '2', '3', '3', '3', '4', '3', '5', '3', '6', '3', '7', '3', '8', '3', '9',
+	'4', '0', '4', '1', '4', '2', '4', '3', '4', '4', '4', '5', '4', '6', '4', '7', '4', '8', '4', '9',
+	'5', '0', '5', '1', '5', '2', '5', '3', '5', '4', '5', '5', '5', '6', '5', '7', '5', '8', '5', '9',
+	'6', '0', '6', '1', '6', '2', '6', '3', '6', '4', '6', '5', '6', '6', '6', '7', '6', '8', '6', '9',
+	'7', '0', '7', '1', '7', '2', '7', '3', '7', '4', '7', '5', '7', '6', '7', '7', '7', '8', '7', '9',
+	'8', '0', '8', '1', '8', '2', '8', '3', '8', '4', '8', '5', '8', '6', '8', '7', '8', '8', '8', '9',
+	'9', '0', '9', '1', '9', '2', '9', '3', '9', '4', '9', '5', '9', '6', '9', '7', '9', '8', '9', '9'
+};
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -284,8 +336,8 @@ pg_itoa(int16 i, char *a)
 void
 pg_ltoa(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -296,50 +348,91 @@ pg_ltoa(int32 value, char *a)
 		memcpy(a, "-2147483648", 12);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	if (value >= 10000)
+	{
+		const	uint32 c = value - 10000 * (value / 10000);
+
+		value /= 10000;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value >= 100)
 	{
-		char		swap = *start;
+		const uint32 c = (value % 100) << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		value /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
 	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+	}
+
+	a[olength] = '\0';
 }
-
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -350,37 +443,82 @@ pg_lltoa(int64 value, char *a)
 		memcpy(a, "-9223372036854775808", 21);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	uint32		value2 = (uint32) value;
+
+	if (value2 >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+
+		value2 /= 10000;
+
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+
+		value2 /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
+	{
+		const uint32 c = value2 << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..9e8392741e 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,7 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+#define MAXINT8LEN 21
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);

--------------2.21.0--

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#5)

1 attachment(s)

Re: Efficient output for integer types

On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Factored out some inefficient functions and made the guts use the more
efficient function.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v4-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From 8ba1ed73a3a96b947e591c9edd22acb89bfa096b Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v4] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..17ca533b87 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 580043233b..3818dbaa85 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -39,6 +39,8 @@ jsonpath_scan.c: FLEX_NO_BACKUP=yes
 # jsonpath_scan is compiled as part of jsonpath_gram
 jsonpath_gram.o: jsonpath_scan.c
 
+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
+
 # jsonpath_gram.c and jsonpath_scan.c are in the distribution tarball,
 # so they are not cleaned here.
 clean distclean maintainer-clean:
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..47280b68d6 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,58 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] = {
+	'0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
+	'1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9',
+	'2', '0', '2', '1', '2', '2', '2', '3', '2', '4', '2', '5', '2', '6', '2', '7', '2', '8', '2', '9',
+	'3', '0', '3', '1', '3', '2', '3', '3', '3', '4', '3', '5', '3', '6', '3', '7', '3', '8', '3', '9',
+	'4', '0', '4', '1', '4', '2', '4', '3', '4', '4', '4', '5', '4', '6', '4', '7', '4', '8', '4', '9',
+	'5', '0', '5', '1', '5', '2', '5', '3', '5', '4', '5', '5', '5', '6', '5', '7', '5', '8', '5', '9',
+	'6', '0', '6', '1', '6', '2', '6', '3', '6', '4', '6', '5', '6', '6', '6', '7', '6', '8', '6', '9',
+	'7', '0', '7', '1', '7', '2', '7', '3', '7', '4', '7', '5', '7', '6', '7', '7', '7', '8', '7', '9',
+	'8', '0', '8', '1', '8', '2', '8', '3', '8', '4', '8', '5', '8', '6', '8', '7', '8', '8', '8', '9',
+	'9', '0', '9', '1', '9', '2', '9', '3', '9', '4', '9', '5', '9', '6', '9', '7', '9', '8', '9', '9'
+};
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,16 +328,17 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ltoa_n: converts a signed 32-bit integer to its string representation, not
+ * NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 11 bytes, counting a leading sign).
  */
-void
-pg_ltoa(int32 value, char *a)
+static int32
+pg_ltoa_n(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0, adjust = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -293,53 +346,112 @@ pg_ltoa(int32 value, char *a)
 	 */
 	if (value == PG_INT32_MIN)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
+		memcpy(a, "-2147483648", 11);
+		return 11;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 1);
+		return 1;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+		adjust++;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	if (value >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value - 10000 * (value / 10000);
+
+		value /= 10000;
 
-		*start++ = *a;
-		*a-- = swap;
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		value /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+		i++;
+	}
+
+	i += adjust;
+	return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ltoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
+ */
+void
+pg_ltoa(int32 value, char *a)
+{
+	int32 len = pg_ltoa_n(value, a);
+	a[len] = '\0';
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -350,37 +462,82 @@ pg_lltoa(int64 value, char *a)
 		memcpy(a, "-9223372036854775808", 21);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	uint32		value2 = (uint32) value;
+
+	if (value2 >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+
+		value2 /= 10000;
+
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+
+		value2 /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
+	{
+		const uint32 c = value2 << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
@@ -409,60 +566,44 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int32		len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
-		minwidth--;
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			const uint32 c = value << 1;
+			memcpy(str, DIGIT_TABLE + c, 2);
+			return str + 2;
+		}
 
+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
+		for(int i = 0; i < minwidth - len; i++)
+		{
+			memcpy(str + i, DIGIT_TABLE, 1);
+		}
+		return str + minwidth;
+	}
+	else
+	{
 		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
+		 * Changing this number's sign would overflow PG_INT32_MAX,
+		 * so special-case it.
 		 */
-		while (minwidth--)
+		if (value == PG_INT32_MIN)
 		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
+			memcpy(str, "-2147483648", 11);
+			return str + 11;
 		}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);
 	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
-	}
-
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
-
-	/* Otherwise, return last output character + 1 */
-	return end;
 }
 
 /*
@@ -486,62 +627,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 char *
 pg_ltostr(char *str, int32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int32	len = pg_ltoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..628fe73573 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,7 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+#define MAXINT8LEN 21
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,6 +47,7 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+static int32 pg_ltoa_n(int32 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);

--------------2.21.0--

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#6)

1 attachment(s)

Re: Efficient output for integer types

On Wed, Sep 18, 2019 at 05:42:01AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Factored out some inefficient functions and made the guts use the more
efficient function.

Fix copy-paste-o that introduced some unneeded 64-bit math.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v5-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From b9b2e2dac6f5c6a15cf4161ff135d201ea52a207 Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v5] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..17ca533b87 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 580043233b..3818dbaa85 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -39,6 +39,8 @@ jsonpath_scan.c: FLEX_NO_BACKUP=yes
 # jsonpath_scan is compiled as part of jsonpath_gram
 jsonpath_gram.o: jsonpath_scan.c
 
+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
+
 # jsonpath_gram.c and jsonpath_scan.c are in the distribution tarball,
 # so they are not cleaned here.
 clean distclean maintainer-clean:
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..8ef9fac717 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,58 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] = {
+	'0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
+	'1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9',
+	'2', '0', '2', '1', '2', '2', '2', '3', '2', '4', '2', '5', '2', '6', '2', '7', '2', '8', '2', '9',
+	'3', '0', '3', '1', '3', '2', '3', '3', '3', '4', '3', '5', '3', '6', '3', '7', '3', '8', '3', '9',
+	'4', '0', '4', '1', '4', '2', '4', '3', '4', '4', '4', '5', '4', '6', '4', '7', '4', '8', '4', '9',
+	'5', '0', '5', '1', '5', '2', '5', '3', '5', '4', '5', '5', '5', '6', '5', '7', '5', '8', '5', '9',
+	'6', '0', '6', '1', '6', '2', '6', '3', '6', '4', '6', '5', '6', '6', '6', '7', '6', '8', '6', '9',
+	'7', '0', '7', '1', '7', '2', '7', '3', '7', '4', '7', '5', '7', '6', '7', '7', '7', '8', '7', '9',
+	'8', '0', '8', '1', '8', '2', '8', '3', '8', '4', '8', '5', '8', '6', '8', '7', '8', '8', '8', '9',
+	'9', '0', '9', '1', '9', '2', '9', '3', '9', '4', '9', '5', '9', '6', '9', '7', '9', '8', '9', '9'
+};
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,16 +328,17 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ltoa_n: converts a signed 32-bit integer to its string representation, not
+ * NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 11 bytes, counting a leading sign).
  */
-void
-pg_ltoa(int32 value, char *a)
+static int32
+pg_ltoa_n(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0, adjust = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -293,53 +346,111 @@ pg_ltoa(int32 value, char *a)
 	 */
 	if (value == PG_INT32_MIN)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
+		memcpy(a, "-2147483648", 11);
+		return 11;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 1);
+		return 1;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+		adjust++;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint32 value2 = value % 100000000;
+
+		value /= 100000000;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	if (value >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value - 10000 * (value / 10000);
+
+		value /= 10000;
 
-		*start++ = *a;
-		*a-- = swap;
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		value /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+		i++;
+	}
+
+	i += adjust;
+	return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ltoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
+ */
+void
+pg_ltoa(int32 value, char *a)
+{
+	int32 len = pg_ltoa_n(value, a);
+	a[len] = '\0';
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -350,37 +461,82 @@ pg_lltoa(int64 value, char *a)
 		memcpy(a, "-9223372036854775808", 21);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	uint32		value2 = (uint32) value;
+
+	if (value2 >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+
+		value2 /= 10000;
+
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+
+		value2 /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
+	{
+		const uint32 c = value2 << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
@@ -409,60 +565,44 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int32		len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
-		minwidth--;
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			const uint32 c = value << 1;
+			memcpy(str, DIGIT_TABLE + c, 2);
+			return str + 2;
+		}
 
+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
+		for(int i = 0; i < minwidth - len; i++)
+		{
+			memcpy(str + i, DIGIT_TABLE, 1);
+		}
+		return str + minwidth;
+	}
+	else
+	{
 		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
+		 * Changing this number's sign would overflow PG_INT32_MAX,
+		 * so special-case it.
 		 */
-		while (minwidth--)
+		if (value == PG_INT32_MIN)
 		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
+			memcpy(str, "-2147483648", 11);
+			return str + 11;
 		}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);
 	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
-	}
-
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
-
-	/* Otherwise, return last output character + 1 */
-	return end;
 }
 
 /*
@@ -486,62 +626,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 char *
 pg_ltostr(char *str, int32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int32	len = pg_ltoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..628fe73573 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,7 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+#define MAXINT8LEN 21
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,6 +47,7 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+static int32 pg_ltoa_n(int32 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);

--------------2.21.0--

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#7)

1 attachment(s)

Re: Efficient output for integer types

On Wed, Sep 18, 2019 at 07:51:42AM +0200, David Fetter wrote:

On Wed, Sep 18, 2019 at 05:42:01AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Factored out some inefficient functions and made the guts use the more
efficient function.

Fix copy-paste-o that introduced some unneeded 64-bit math.

Removed static annotation that shouldn't have been present.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v6-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From ba1b57b230b08582bb3cc9ec91baca5177e63ff5 Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v6] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..17ca533b87 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 580043233b..3818dbaa85 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -39,6 +39,8 @@ jsonpath_scan.c: FLEX_NO_BACKUP=yes
 # jsonpath_scan is compiled as part of jsonpath_gram
 jsonpath_gram.o: jsonpath_scan.c
 
+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)
+
 # jsonpath_gram.c and jsonpath_scan.c are in the distribution tarball,
 # so they are not cleaned here.
 clean distclean maintainer-clean:
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..2c6f39cbc7 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,58 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] = {
+	'0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
+	'1', '0', '1', '1', '1', '2', '1', '3', '1', '4', '1', '5', '1', '6', '1', '7', '1', '8', '1', '9',
+	'2', '0', '2', '1', '2', '2', '2', '3', '2', '4', '2', '5', '2', '6', '2', '7', '2', '8', '2', '9',
+	'3', '0', '3', '1', '3', '2', '3', '3', '3', '4', '3', '5', '3', '6', '3', '7', '3', '8', '3', '9',
+	'4', '0', '4', '1', '4', '2', '4', '3', '4', '4', '4', '5', '4', '6', '4', '7', '4', '8', '4', '9',
+	'5', '0', '5', '1', '5', '2', '5', '3', '5', '4', '5', '5', '5', '6', '5', '7', '5', '8', '5', '9',
+	'6', '0', '6', '1', '6', '2', '6', '3', '6', '4', '6', '5', '6', '6', '6', '7', '6', '8', '6', '9',
+	'7', '0', '7', '1', '7', '2', '7', '3', '7', '4', '7', '5', '7', '6', '7', '7', '7', '8', '7', '9',
+	'8', '0', '8', '1', '8', '2', '8', '3', '8', '4', '8', '5', '8', '6', '8', '7', '8', '8', '8', '9',
+	'9', '0', '9', '1', '9', '2', '9', '3', '9', '4', '9', '5', '9', '6', '9', '7', '9', '8', '9', '9'
+};
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,16 +328,17 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ltoa_n: converts a signed 32-bit integer to its string representation, not
+ * NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 11 bytes, counting a leading sign).
  */
-void
-pg_ltoa(int32 value, char *a)
+int32
+pg_ltoa_n(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0, adjust = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -293,53 +346,111 @@ pg_ltoa(int32 value, char *a)
 	 */
 	if (value == PG_INT32_MIN)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
+		memcpy(a, "-2147483648", 11);
+		return 11;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 1);
+		return 1;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+		adjust++;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint32 value2 = value % 100000000;
+
+		value /= 100000000;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	if (value >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value - 10000 * (value / 10000);
+
+		value /= 10000;
 
-		*start++ = *a;
-		*a-- = swap;
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		value /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+		i++;
+	}
+
+	i += adjust;
+	return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ltoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
+ */
+void
+pg_ltoa(int32 value, char *a)
+{
+	int32 len = pg_ltoa_n(value, a);
+	a[len] = '\0';
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -350,37 +461,82 @@ pg_lltoa(int64 value, char *a)
 		memcpy(a, "-9223372036854775808", 21);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		/* Expensive 64-bit division. Optimize? */
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		memcpy(a + olength - i - 6, DIGIT_TABLE + d0, 2);
+		memcpy(a + olength - i - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	uint32		value2 = (uint32) value;
+
+	if (value2 >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+
+		value2 /= 10000;
+
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);
+		memcpy(a + olength - i - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+
+		value2 /= 100;
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
+	{
+		const uint32 c = value2 << 1;
+
+		memcpy(a + olength - i - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
@@ -409,60 +565,44 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int32		len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
-		minwidth--;
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			const uint32 c = value << 1;
+			memcpy(str, DIGIT_TABLE + c, 2);
+			return str + 2;
+		}
 
+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
+		for(int i = 0; i < minwidth - len; i++)
+		{
+			memcpy(str + i, DIGIT_TABLE, 1);
+		}
+		return str + minwidth;
+	}
+	else
+	{
 		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
+		 * Changing this number's sign would overflow PG_INT32_MAX,
+		 * so special-case it.
 		 */
-		while (minwidth--)
+		if (value == PG_INT32_MIN)
 		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
+			memcpy(str, "-2147483648", 11);
+			return str + 11;
 		}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);
 	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
-	}
-
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
-
-	/* Otherwise, return last output character + 1 */
-	return end;
 }
 
 /*
@@ -486,62 +626,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 char *
 pg_ltostr(char *str, int32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int32	len = pg_ltoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..5033b38980 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,7 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+#define MAXINT8LEN 21
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,6 +47,7 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+int32 pg_ltoa_n(int32 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);

--------------2.21.0--

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 6 years ago

In reply to: David Fetter (#6)

Re: Efficient output for integer types

Hello.

At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org>

On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Factored out some inefficient functions and made the guts use the more
efficient function.

I'm not sure this is on the KISS principle, but looked it and
have several random comments.

+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)

I don't think that we are allowing that as project coding
policy. It seems to have been introduced only to accept external
code as-is.

-                    char        str[23];    /* sign, 21 digits and '\0' */
+                    char        str[MAXINT8LEN];

It's uneasy that MAXINT8LEN contains tailling NUL. MAXINT8BUFLEN
can be so. I think MAXINT8LEN should be 20 and the definition
should be str[MAXINT8LEN + 1].

+static const char DIGIT_TABLE[200] = {
+	'0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',

Wouldn't it be simpler if it were defined as a constant string?

static const char DIGIT_TABLE[201] =
"000102030405....19"
"202122232425....39"
..

+pg_ltoa_n(int32 value, char *a)
...
+	/* Compute the result string. */
+	while (value >= 100000000)

We have only two degits above the value. Isn't the stuff inside
the while a waste of cycles?

+ /* Expensive 64-bit division. Optimize? */

I believe compiler treats such trivial optimizations. (concretely
converts into shifts and subtractons if faster.)

+ memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);

Maybe it'd be easy to read if 'a + olength - i' is a single variable.

+ i += adjust;
+ return i;

If 'a + olength - i' is replaced with a variable, the return
statement is replacable with "return olength + adjust;".

+ return t + (v >= PowersOfTen[t]);

I think it's better that if it were 't - (v < POT[t]) + 1; /*
log10(v) + 1 */'. At least we need an explanation of the
difference. (I'didn't checked the algorithm is truely right,
though.)

void
pg_lltoa(int64 value, char *a)
{

memcpy(a, "-9223372036854775808", 21);

memcpy(a, "0", 2);

The lines need a comment like "/* length contains trailing '\0'
*/"

+	if (value >= 0)
...
+	else
+ {
+		if (value == PG_INT32_MIN)
+			memcpy(str, "-2147483648", 11);
+			return str + 11;

}

+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);

If then block of the if statement were (values < 0), we won't
need to reenter the functaion.

+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);

If the function had the parameters str with the room only for two
digits and a NUL, 2 as minwidth but 1000 as value, the function
would overrun the buffer. The original function just ignores
overflowing digits.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#10

David Fetter

david@fetter.org

over 6 years ago

In reply to: Kyotaro Horiguchi (#9)

1 attachment(s)

Re: Efficient output for integer types

On Wed, Sep 18, 2019 at 04:27:46PM +0900, Kyotaro Horiguchi wrote:

Hello.

At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org>

On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Factored out some inefficient functions and made the guts use the more
efficient function.

I'm not sure this is on the KISS principle, but looked it and
have several random comments.

+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)

I don't think that we are allowing that as project coding
policy. It seems to have been introduced only to accept external
code as-is.

Changed to fit current policy.

-                    char        str[23];    /* sign, 21 digits and '\0' */
+                    char        str[MAXINT8LEN];
It's uneasy that MAXINT8LEN contains tailling NUL. MAXINT8BUFLEN
can be so. I think MAXINT8LEN should be 20 and the definition
should be str[MAXINT8LEN + 1].

Done.

+static const char DIGIT_TABLE[200] = {
+	'0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
Wouldn't it be simpler if it were defined as a constant string?

static const char DIGIT_TABLE[201] =
"000102030405....19"
"202122232425....39"
..

I thought this might be even clearer:

"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";

+pg_ltoa_n(int32 value, char *a)
...
+	/* Compute the result string. */
+	while (value >= 100000000)
We have only two degits above the value. Isn't the stuff inside
the while a waste of cycles?

Changed the while to an if.

+ /* Expensive 64-bit division. Optimize? */

I believe compiler treats such trivial optimizations. (concretely
converts into shifts and subtractons if faster.)

Comments removed.

+ memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2);

Maybe it'd be easy to read if 'a + olength - i' is a single variable.

Done.

+ i += adjust;
+ return i;

If 'a + olength - i' is replaced with a variable, the return
statement is replacable with "return olength + adjust;".

I'm not sure I understand this.

+ return t + (v >= PowersOfTen[t]);

I think it's better that if it were 't - (v < POT[t]) + 1; /*
log10(v) + 1 */'. At least we need an explanation of the
difference. (I'didn't checked the algorithm is truely right,
though.)

Comments added.

void
pg_lltoa(int64 value, char *a)
{

..

memcpy(a, "-9223372036854775808", 21);

..

memcpy(a, "0", 2);

The lines need a comment like "/* length contains trailing '\0'
*/"

Comments added.

+	if (value >= 0)
...
+	else
+ {
+		if (value == PG_INT32_MIN)
+			memcpy(str, "-2147483648", 11);
+			return str + 11;
}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);
If then block of the if statement were (values < 0), we won't
need to reenter the functaion.

This is a tail-call recursion, so it's probably optimized already.

+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
If the function had the parameters str with the room only for two
digits and a NUL, 2 as minwidth but 1000 as value, the function
would overrun the buffer. The original function just ignores
overflowing digits.

I believe the original was incorrect.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v7-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From 30d8a2b1bd2f6094a5cb4dd8acb9cc36c837802a Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v7] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN + 1];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..fef6da672b 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,64 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,16 +334,17 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ltoa_n: converts a signed 32-bit integer to its string representation, not
+ * NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 11 bytes, counting a leading sign).
  */
-void
-pg_ltoa(int32 value, char *a)
+int32
+pg_ltoa_n(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0, adjust = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -293,53 +352,114 @@ pg_ltoa(int32 value, char *a)
 	 */
 	if (value == PG_INT32_MIN)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
+		memcpy(a, "-2147483648", 11);
+		return 11;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 1);
+		return 1;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+		adjust++;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	if (value >= 100000000)
+	{
+		const	uint32 value2 = value % 100000000;
+
+		value /= 100000000;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	if (value >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value - 10000 * (value / 10000);
+
+		value /= 10000;
 
-		*start++ = *a;
-		*a-- = swap;
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		value /= 100;
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+		i++;
+	}
+
+	i += adjust;
+	return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ltoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
+ */
+void
+pg_ltoa(int32 value, char *a)
+{
+	int32 len = pg_ltoa_n(value, a);
+	a[len] = '\0';
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -347,40 +467,91 @@ pg_lltoa(int64 value, char *a)
 	 */
 	if (value == PG_INT64_MIN)
 	{
-		memcpy(a, "-9223372036854775808", 21);
+		/* Include terminating NUL */
+		memcpy(a, "-9223372036854775808", MAXINT8LEN + 1);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		/* Include terminating NUL */
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		const	uint64 q = value / 100000000; /* Most-significant digits */
+		uint32	value2 = (uint32) (value - 100000000 * q); /* Least-significant digits */
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value = q;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	uint32		value2 = (uint32) value;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+	if (value2 >= 10000)
+	{
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+
+		value2 /= 10000;
+
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		value2 /= 100;
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
 	{
-		char		swap = *start;
+		const uint32 c = value2 << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
 	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
@@ -409,60 +580,44 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int32		len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
-		minwidth--;
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			const uint32 c = value << 1;
+			memcpy(str, DIGIT_TABLE + c, 2);
+			return str + 2;
+		}
 
+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
+		for(int i = 0; i < minwidth - len; i++)
+		{
+			memcpy(str + i, DIGIT_TABLE, 1);
+		}
+		return str + minwidth;
+	}
+	else
+	{
 		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
+		 * Changing this number's sign would overflow PG_INT32_MAX,
+		 * so special-case it.
 		 */
-		while (minwidth--)
+		if (value == PG_INT32_MIN)
 		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
+			memcpy(str, "-2147483648", 11);
+			return str + 11;
 		}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);
 	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
-	}
-
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
-
-	/* Otherwise, return last output character + 1 */
-	return end;
 }
 
 /*
@@ -486,62 +641,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 char *
 pg_ltostr(char *str, int32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int32	len = pg_ltoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..cd13536cf4 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,6 +48,7 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+int32 pg_ltoa_n(int32 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);

--------------2.21.0--

#11

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#10)

1 attachment(s)

Re: Efficient output for integer types

On Fri, Sep 20, 2019 at 09:14:51PM +0200, David Fetter wrote:

On Wed, Sep 18, 2019 at 04:27:46PM +0900, Kyotaro Horiguchi wrote:

Hello.

At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org>

On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Factored out some inefficient functions and made the guts use the more
efficient function.

I'm not sure this is on the KISS principle, but looked it and
have several random comments.

+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)

Oops. Missed a few.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v8-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From f7f9cb42ce9d50e5a4746fe80208960e29ffd348 Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v8] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN + 1];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..fef6da672b 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,64 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,16 +334,17 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ltoa_n: converts a signed 32-bit integer to its string representation, not
+ * NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 11 bytes, counting a leading sign).
  */
-void
-pg_ltoa(int32 value, char *a)
+int32
+pg_ltoa_n(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0, adjust = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -293,53 +352,114 @@ pg_ltoa(int32 value, char *a)
 	 */
 	if (value == PG_INT32_MIN)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
+		memcpy(a, "-2147483648", 11);
+		return 11;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 1);
+		return 1;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+		adjust++;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	if (value >= 100000000)
+	{
+		const	uint32 value2 = value % 100000000;
+
+		value /= 100000000;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	if (value >= 10000)
 	{
-		char		swap = *start;
+		const	uint32 c = value - 10000 * (value / 10000);
+
+		value /= 10000;
 
-		*start++ = *a;
-		*a-- = swap;
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		value /= 100;
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+		i++;
+	}
+
+	i += adjust;
+	return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ltoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
+ */
+void
+pg_ltoa(int32 value, char *a)
+{
+	int32 len = pg_ltoa_n(value, a);
+	a[len] = '\0';
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -347,40 +467,91 @@ pg_lltoa(int64 value, char *a)
 	 */
 	if (value == PG_INT64_MIN)
 	{
-		memcpy(a, "-9223372036854775808", 21);
+		/* Include terminating NUL */
+		memcpy(a, "-9223372036854775808", MAXINT8LEN + 1);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		/* Include terminating NUL */
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		const	uint64 q = value / 100000000; /* Most-significant digits */
+		uint32	value2 = (uint32) (value - 100000000 * q); /* Least-significant digits */
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value = q;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	uint32		value2 = (uint32) value;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+	if (value2 >= 10000)
+	{
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+
+		value2 /= 10000;
+
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
 
-	/* Reverse string. */
-	while (start < a)
+		value2 /= 100;
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
 	{
-		char		swap = *start;
+		const uint32 c = value2 << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
 	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
@@ -409,60 +580,44 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int32		len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
-		minwidth--;
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			const uint32 c = value << 1;
+			memcpy(str, DIGIT_TABLE + c, 2);
+			return str + 2;
+		}
 
+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
+		for(int i = 0; i < minwidth - len; i++)
+		{
+			memcpy(str + i, DIGIT_TABLE, 1);
+		}
+		return str + minwidth;
+	}
+	else
+	{
 		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
+		 * Changing this number's sign would overflow PG_INT32_MAX,
+		 * so special-case it.
 		 */
-		while (minwidth--)
+		if (value == PG_INT32_MIN)
 		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
+			memcpy(str, "-2147483648", 11);
+			return str + 11;
 		}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);
 	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
-	}
-
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
-
-	/* Otherwise, return last output character + 1 */
-	return end;
 }
 
 /*
@@ -486,62 +641,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 char *
 pg_ltostr(char *str, int32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int32	len = pg_ltoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..cd13536cf4 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,6 +48,7 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+int32 pg_ltoa_n(int32 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);

--------------2.21.0--

#12

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#11)

1 attachment(s)

Re: Efficient output for integer types

On Fri, Sep 20, 2019 at 11:09:16PM +0200, David Fetter wrote:

On Fri, Sep 20, 2019 at 09:14:51PM +0200, David Fetter wrote:

On Wed, Sep 18, 2019 at 04:27:46PM +0900, Kyotaro Horiguchi wrote:

Hello.

At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org>

On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote:

On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote:

On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote:

Folks,

Please find attached a couple of patches intended to $subject.

This patch set cut the time to copy ten million rows of randomly sized
int8s (10 of them) by about a third, so at least for that case, it's
pretty decent.

Added int4 output, removed the sprintf stuff, as it didn't seem to
help in any cases I was testing.

Found a couple of "whiles" that should have been "ifs."

Factored out some inefficient functions and made the guts use the more
efficient function.

I'm not sure this is on the KISS principle, but looked it and
have several random comments.

+numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT)

Oops. Missed a few.

D'oh! Wrong patch.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v9-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From 51d793143daf44cc0f01a00d6aedeaf3fdd88230 Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v9] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN + 1];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..630aafd168 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,64 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64 v)
+{
+	uint32			t;
+	static uint64	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000,       10000000000,       100000000000,
+	 1000000000000,    10000000000000,    100000000000000,
+	 1000000000000000, 10000000000000000, 100000000000000000,
+	 1000000000000000000};
+
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,16 +334,17 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ltoa_n: converts a signed 32-bit integer to its string representation, not
+ * NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 11 bytes, counting a leading sign).
  */
-void
-pg_ltoa(int32 value, char *a)
+int32
+pg_ltoa_n(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength;
+	uint32	i = 0, adjust = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -293,53 +352,117 @@ pg_ltoa(int32 value, char *a)
 	 */
 	if (value == PG_INT32_MIN)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
+		memcpy(a, "-2147483648", 11);
+		return 11;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		memcpy(a, "0", 1);
+		return 1;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+		adjust++;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	if (value >= 100000000)
+	{
+		const	uint32 value2 = value % 100000000;
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 100000000;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	if (value >= 10000)
+	{
+		const	uint32 c = value - 10000 * (value / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		char *pos = a + olength - i;
 
-	/* Reverse string. */
-	while (start < a)
+		value /= 10000;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value >= 100)
 	{
-		char		swap = *start;
+		const uint32 c = (value % 100) << 1;
+
+		char *pos = a + olength - i;
 
-		*start++ = *a;
-		*a-- = swap;
+		value /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
 	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+		i++;
+	}
+
+	i += adjust;
+	return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ltoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
+ */
+void
+pg_ltoa(int32 value, char *a)
+{
+	int32 len = pg_ltoa_n(value, a);
+	a[len] = '\0';
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	uint32	olength, value2;
+	uint32	i = 0;
 
 	/*
 	 * Avoid problems with the most negative integer not being representable
@@ -347,40 +470,92 @@ pg_lltoa(int64 value, char *a)
 	 */
 	if (value == PG_INT64_MIN)
 	{
-		memcpy(a, "-9223372036854775808", 21);
+		/* Include terminating NUL */
+		memcpy(a, "-9223372036854775808", MAXINT8LEN + 1);
 		return;
 	}
-	else if (value < 0)
+
+	/* Might as well handle this case, too */
+	if (value == 0)
+	{
+		/* Include terminating NUL */
+		memcpy(a, "0", 2);
+		return;
+	}
+
+	if (value < 0)
 	{
 		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
+	}
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+	olength = decimalLength64(value);
 
-	/* Reverse string. */
-	while (start < a)
+	/* Compute the result string. */
+	while (value >= 100000000)
 	{
-		char		swap = *start;
+		const	uint64 q = value / 100000000; /* Most-significant digits */
+		uint32	value2 = (uint32) (value - 100000000 * q); /* Least-significant digits */
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value = q;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	value2 = (uint32) value;
+
+	if (value2 >= 10000)
+	{
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value2 /= 10000;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
 	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+		char *pos = a + olength - i;
+
+		value2 /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
+	{
+		const uint32 c = value2 << 1;
+		char *pos = a + olength - i;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value2);
+	}
+
+	a[olength] = '\0';
 }
 
 
@@ -409,60 +584,44 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int32		len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
-		minwidth--;
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			const uint32 c = value << 1;
+			memcpy(str, DIGIT_TABLE + c, 2);
+			return str + 2;
+		}
 
+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
+		for(int i = 0; i < minwidth - len; i++)
+		{
+			memcpy(str + i, DIGIT_TABLE, 1);
+		}
+		return str + minwidth;
+	}
+	else
+	{
 		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
+		 * Changing this number's sign would overflow PG_INT32_MAX,
+		 * so special-case it.
 		 */
-		while (minwidth--)
+		if (value == PG_INT32_MIN)
 		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
+			memcpy(str, "-2147483648", 11);
+			return str + 11;
 		}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);
 	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
-	}
-
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
-
-	/* Otherwise, return last output character + 1 */
-	return end;
 }
 
 /*
@@ -486,62 +645,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 char *
 pg_ltostr(char *str, int32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int32	len = pg_ltoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..cd13536cf4 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,6 +48,7 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+int32 pg_ltoa_n(int32 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);

--------------2.21.0--

#13

Andrew Gierth

andrew@tao11.riddles.org.uk

over 6 years ago

In reply to: David Fetter (#12)

Re: Efficient output for integer types

"David" == David Fetter <david@fetter.org> writes:

David> + /* Compute the result string. */
David> + if (value >= 100000000)
David> + {
David> + const uint32 value2 = value % 100000000;
David> +
David> + const uint32 c = value2 % 10000;
David> + const uint32 d = value2 / 10000;
David> + const uint32 c0 = (c % 100) << 1;
David> + const uint32 c1 = (c / 100) << 1;
David> + const uint32 d0 = (d % 100) << 1;
David> + const uint32 d1 = (d / 100) << 1;
David> +
David> + char *pos = a + olength - i;
David> +
David> + value /= 100000000;
David> +
David> + memcpy(pos - 2, DIGIT_TABLE + c0, 2);
David> + memcpy(pos - 4, DIGIT_TABLE + c1, 2);
David> + memcpy(pos - 6, DIGIT_TABLE + d0, 2);
David> + memcpy(pos - 8, DIGIT_TABLE + d1, 2);
David> + i += 8;
David> + }

For the 32-bit case, there's no point in doing an 8-digit divide
specially, it doesn't save any time. It's sufficient to just change

David> + if (value >= 10000)

to while(value >= 10000)

in order to process 4 digits at a time.

David> + for(int i = 0; i < minwidth - len; i++)
David> + {
David> + memcpy(str + i, DIGIT_TABLE, 1);
David> + }

Should be:
memset(str, '0', minwidth-len);

--
Andrew (irc:RhodiumToad)

#14

David Fetter

david@fetter.org

over 6 years ago

In reply to: Andrew Gierth (#13)

1 attachment(s)

Re: Efficient output for integer types

On Sat, Sep 21, 2019 at 03:36:21AM +0100, Andrew Gierth wrote:

"David" == David Fetter <david@fetter.org> writes:

David> + /* Compute the result string. */
David> + if (value >= 100000000)
David> + {
David> + const uint32 value2 = value % 100000000;
David> +
David> + const uint32 c = value2 % 10000;
David> + const uint32 d = value2 / 10000;
David> + const uint32 c0 = (c % 100) << 1;
David> + const uint32 c1 = (c / 100) << 1;
David> + const uint32 d0 = (d % 100) << 1;
David> + const uint32 d1 = (d / 100) << 1;
David> +
David> + char *pos = a + olength - i;
David> +
David> + value /= 100000000;
David> +
David> + memcpy(pos - 2, DIGIT_TABLE + c0, 2);
David> + memcpy(pos - 4, DIGIT_TABLE + c1, 2);
David> + memcpy(pos - 6, DIGIT_TABLE + d0, 2);
David> + memcpy(pos - 8, DIGIT_TABLE + d1, 2);
David> + i += 8;
David> + }

For the 32-bit case, there's no point in doing an 8-digit divide
specially, it doesn't save any time. It's sufficient to just change

David> + if (value >= 10000)

to while(value >= 10000)

Done.

in order to process 4 digits at a time.

David> + for(int i = 0; i < minwidth - len; i++)
David> + {
David> + memcpy(str + i, DIGIT_TABLE, 1);
David> + }

Should be:
memset(str, '0', minwidth-len);

Done.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v10-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From 4407b0771cd53890acf41ffee8908d1a701c52c0 Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v10] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN + 1];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..babc6f6166 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,68 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32			t;
+	static uint32	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64_t v)
+{
+	uint32			t;
+	static uint64_t	PowersOfTen[] = {
+		UINT64CONST(1),                   UINT64CONST(10),
+		UINT64CONST(100),                 UINT64CONST(1000),
+		UINT64CONST(10000),               UINT64CONST(100000),
+		UINT64CONST(1000000),             UINT64CONST(10000000),
+		UINT64CONST(100000000),           UINT64CONST(1000000000),
+		UINT64CONST(10000000000),         UINT64CONST(100000000000),
+		UINT64CONST(1000000000000),       UINT64CONST(10000000000000),
+		UINT64CONST(100000000000000),     UINT64CONST(1000000000000000),
+		UINT64CONST(10000000000000000),   UINT64CONST(100000000000000000),
+		UINT64CONST(1000000000000000000), UINT64CONST(10000000000000000000)
+	};
+
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,111 +338,191 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ltoa_n: converts a signed 32-bit integer to its string representation, not
+ * NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 11 bytes, counting a leading sign).
+ */
+int32
+pg_ltoa_n(uint32 value, char *a)
+{
+	uint32	olength;
+	uint32	i = 0, adjust = 0;
+
+	/* Might as well handle this case */
+	if (value == 0)
+	{
+		memcpy(a, "0", 1);
+		return 1;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 10000)
+	{
+		const	uint32 c = value - 10000 * (value / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 10000;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+		i++;
+	}
+
+	i += adjust;
+	return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ltoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
  */
 void
 pg_ltoa(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
 
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT32_MIN)
+	uint32_t	uvalue = (uint32_t)value;
+	int32		len;
+	if (value < 0)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
+		uvalue = (uint32_t)0 - (uint32_t)value;
 		*a++ = '-';
+	}
+	len = pg_ltoa_n(uvalue, a);
+	a[len] = '\0';
+}
+
+int32
+pg_lltoa_n(uint64_t value, char *a)
+{
+	uint32	olength, value2;
+	uint32	i = 0;
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		memcpy(a, "0", 1);
+		return 1;
+	}
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+	olength = decimalLength64(value);
 
-	/* Reverse string. */
-	while (start < a)
+	/* Compute the result string. */
+	while (value >= 100000000)
 	{
-		char		swap = *start;
+		const	uint64_t q = value / 100000000; /* Most-significant digits */
+		uint32	value2 = (uint32) (value - 100000000 * q); /* Least-significant digits */
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
 
-		*start++ = *a;
-		*a-- = swap;
+		char *pos = a + olength - i;
+
+		value = q;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
 	}
+
+	/* Switch to 32-bit for speed */
+	value2 = (uint32) value;
+
+	if (value2 >= 10000)
+	{
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value2 /= 10000;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+		char *pos = a + olength - i;
+
+		value2 /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
+	{
+		const uint32 c = value2 << 1;
+		char *pos = a + olength - i;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+		*a = (char) ('0' + value2);
+
+	return olength;
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
-
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT64_MIN)
+	int len;
+	uint64_t uvalue = value;
+	if (value < 0)
 	{
-		memcpy(a, "-9223372036854775808", 21);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
-
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
-
-	/* Reverse string. */
-	while (start < a)
-	{
-		char		swap = *start;
-
-		*start++ = *a;
-		*a-- = swap;
+		uvalue = (uint64_t)0 - uvalue;
 	}
+	len = pg_lltoa_n(uvalue, a);
+	a[len] = 0;
 }
 
 
@@ -409,60 +551,41 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int32		len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
-		minwidth--;
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			const uint32 c = value << 1;
+			memcpy(str, DIGIT_TABLE + c, 2);
+			return str + 2;
+		}
 
+		len = pg_ltoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
+		memset(str, '0', minwidth-len);
+		return str + minwidth;
+	}
+	else
+	{
 		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
+		 * Changing this number's sign would overflow PG_INT32_MAX,
+		 * so special-case it.
 		 */
-		while (minwidth--)
+		if (value == PG_INT32_MIN)
 		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
+			memcpy(str, "-2147483648", 11);
+			return str + 11;
 		}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, -value, minwidth - 1);
 	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
-	}
-
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
-
-	/* Otherwise, return last output character + 1 */
-	return end;
 }
 
 /*
@@ -486,62 +609,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 char *
 pg_ltostr(char *str, int32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int32	len = pg_ltoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..8d2a5b81dc 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,8 +48,10 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
-extern void pg_ltoa(int32 l, char *a);
-extern void pg_lltoa(int64 ll, char *a);
+int32 pg_ltoa_n(uint32_t l, char *a);
+int32 pg_lltoa_n(uint64_t l, char *a);
+extern void pg_ltoa(int32_t l, char *a);
+extern void pg_lltoa(int64_t ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
 extern char *pg_ltostr(char *str, int32 value);
 extern uint64 pg_strtouint64(const char *str, char **endptr, int base);

--------------2.21.0--

#15

Andrew Gierth

andrew@tao11.riddles.org.uk

over 6 years ago

In reply to: David Fetter (#14)

Re: Efficient output for integer types

"David" == David Fetter <david@fetter.org> writes:

David> +static inline uint32
David> +decimalLength64(const uint64_t v)

Should be uint64, not uint64_t.

Also return an int, not a uint32.

For int vs. int32, my own inclination is to use "int" where the value is
just a (smallish) number, especially one that will be used as an index
or loop count, and use "int32" when it actually matters that it's 32
bits rather than some other size. Other opinions may differ.

David> +{
David> + uint32 t;
David> + static uint64_t PowersOfTen[] = {

uint64 not uint64_t here too.

David> +int32
David> +pg_ltoa_n(uint32 value, char *a)

If this is going to handle only unsigned values, it should probably be
named pg_ultoa_n.

David> + uint32 i = 0, adjust = 0;

"adjust" is not assigned anywhere else. Presumably that's from previous
handling of negative numbers?

David> + memcpy(a, "0", 1);

*a = '0'; would suffice.

David> + i += adjust;

Superfluous?

David> + uint32_t uvalue = (uint32_t)value;

uint32 not uint32_t.

David> + int32 len;

See above re. int vs. int32.

David> + uvalue = (uint32_t)0 - (uint32_t)value;

Should be uint32 not uint32_t again.

For anyone wondering, I suggested this to David in place of the ugly
special casing of INT32_MIN. This method avoids the UB of doing (-value)
where value==INT32_MIN, and is nevertheless required to produce the
correct result:

1. If value < 0, then ((uint32)value) is (value + UINT32_MAX + 1)
2. (uint32)0 - (uint32)value
becomes (UINT32_MAX+1)-(value+UINT32_MAX+1)
which is (-value) as required

David> +int32
David> +pg_lltoa_n(uint64_t value, char *a)

Again, if this is doing unsigned, then it should be named pg_ulltoa_n

David> + if (value == PG_INT32_MIN)

This being inconsistent with the others is not nice.

--
Andrew (irc:RhodiumToad)

#16

David Fetter

david@fetter.org

over 6 years ago

In reply to: Andrew Gierth (#15)

1 attachment(s)

Re: Efficient output for integer types

On Sat, Sep 21, 2019 at 07:29:25AM +0100, Andrew Gierth wrote:

"David" == David Fetter <david@fetter.org> writes:

David> +static inline uint32
David> +decimalLength64(const uint64_t v)

Should be uint64, not uint64_t.

Fixed.

Also return an int, not a uint32.

Fixed.

For int vs. int32, my own inclination is to use "int" where the value is
just a (smallish) number, especially one that will be used as an index
or loop count, and use "int32" when it actually matters that it's 32
bits rather than some other size. Other opinions may differ.

Done with int.

David> +{
David> + uint32 t;
David> + static uint64_t PowersOfTen[] = {

uint64 not uint64_t here too.

Fixed.

David> +int32
David> +pg_ltoa_n(uint32 value, char *a)

If this is going to handle only unsigned values, it should probably be
named pg_ultoa_n.

It does signed values now.

David> + uint32 i = 0, adjust = 0;

"adjust" is not assigned anywhere else. Presumably that's from previous
handling of negative numbers?

It was, and now it's gone.

David> + memcpy(a, "0", 1);

*a = '0'; would suffice.

Fixed.

David> + i += adjust;

Superfluous?

Yep. Gone.

David> + uint32_t uvalue = (uint32_t)value;

uint32 not uint32_t.

Fixed.

David> + int32 len;

See above re. int vs. int32.

Done that way.

David> + uvalue = (uint32_t)0 - (uint32_t)value;

Should be uint32 not uint32_t again.

Done.

For anyone wondering, I suggested this to David in place of the ugly
special casing of INT32_MIN. This method avoids the UB of doing (-value)
where value==INT32_MIN, and is nevertheless required to produce the
correct result:

1. If value < 0, then ((uint32)value) is (value + UINT32_MAX + 1)
2. (uint32)0 - (uint32)value
becomes (UINT32_MAX+1)-(value+UINT32_MAX+1)
which is (-value) as required

David> +int32
David> +pg_lltoa_n(uint64_t value, char *a)

Again, if this is doing unsigned, then it should be named pg_ulltoa_n

Renamed to allow the uint64s that de-special-casing INT32_MIN/INT64_MIN requires.

David> + if (value == PG_INT32_MIN)

This being inconsistent with the others is not nice.

Fixed.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v11-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From 8045adc25343314e089d83fe9fbb91dbd1fb71e1 Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v11] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN + 1];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..1fdf9caadd 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,68 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline int
+decimalLength32(const uint32 v)
+{
+	int			t;
+	static uint32	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline int
+decimalLength64(const uint64 v)
+{
+	int			t;
+	static uint64	PowersOfTen[] = {
+		UINT64CONST(1),                   UINT64CONST(10),
+		UINT64CONST(100),                 UINT64CONST(1000),
+		UINT64CONST(10000),               UINT64CONST(100000),
+		UINT64CONST(1000000),             UINT64CONST(10000000),
+		UINT64CONST(100000000),           UINT64CONST(1000000000),
+		UINT64CONST(10000000000),         UINT64CONST(100000000000),
+		UINT64CONST(1000000000000),       UINT64CONST(10000000000000),
+		UINT64CONST(100000000000000),     UINT64CONST(1000000000000000),
+		UINT64CONST(10000000000000000),   UINT64CONST(100000000000000000),
+		UINT64CONST(1000000000000000000), UINT64CONST(10000000000000000000)
+	};
+
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,111 +338,194 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ultoa_n: converts an unsigned 32-bit integer to its string representation,
+ * not NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 10 bytes)
+ */
+int
+pg_ultoa_n(uint32 value, char *a)
+{
+	int		olength, i = 0;
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		*a = '0';
+		return 1;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 10000)
+	{
+		const	uint32 c = value - 10000 * (value / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 10000;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+		i++;
+	}
+
+	return i;
+}
+
+/*
+ * NUL-terminate the output of pg_ultoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
  */
 void
 pg_ltoa(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
 
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT32_MIN)
+	uint32	uvalue = (uint32)value;
+	int		len;
+	if (value < 0)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
+		uvalue = (uint32)0 - uvalue;
 		*a++ = '-';
+	}
+	len = pg_ultoa_n(uvalue, a);
+	a[len] = '\0';
+}
+
+/*
+ * Get the decimal representation, not NUL-terminated, and return the length of
+ * same.  Caller must ensure that a points to at least MAXINT8LEN bytes.
+ */
+int
+pg_ulltoa_n(uint64 value, char *a)
+{
+	int	olength, i = 0;
+	uint32 value2;
+
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		*a = '0';
+		return 1;
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	value2 = (uint32) value;
+
+	if (value2 >= 10000)
+	{
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value2 /= 10000;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+		char *pos = a + olength - i;
+
+		value2 /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
 	{
-		char		swap = *start;
+		const uint32 c = value2 << 1;
+		char *pos = a + olength - i;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
 	}
+	else
+		*a = (char) ('0' + value2);
+
+	return olength;
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
-
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT64_MIN)
+	int len;
+	uint64 uvalue = value;
+	if (value < 0)
 	{
-		memcpy(a, "-9223372036854775808", 21);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
-
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
-
-	/* Reverse string. */
-	while (start < a)
-	{
-		char		swap = *start;
-
-		*start++ = *a;
-		*a-- = swap;
+		uvalue = (uint64)0 - uvalue;
 	}
+	len = pg_ulltoa_n(uvalue, a);
+	a[len] = 0;
 }
 
 
@@ -409,60 +554,36 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int32		len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
-		minwidth--;
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			const uint32 c = value << 1;
+			memcpy(str, DIGIT_TABLE + c, 2);
+			return str + 2;
+		}
 
+		len = pg_ultoa_n(value, str);
+		if (minwidth <= len)
+			return str + len;
+
+		memmove(str + minwidth - len, str, len);
+		memset(str, '0', minwidth-len);
+		return str + minwidth;
+	}
+	else
+	{
 		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
+		 * Changing this number's sign would overflow PG_INT32_MAX,
+		 * so special-case it.
 		 */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
-		}
+		*str++ = '-';
+		return pg_ltostr_zeropad(str, (uint32)0 - (uint32)value, minwidth - 1);
 	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
-	}
-
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
-
-	/* Otherwise, return last output character + 1 */
-	return end;
 }
 
 /*
@@ -486,62 +607,8 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 char *
 pg_ltostr(char *str, int32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int32	len = pg_ultoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..47ec1e8017 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,6 +48,8 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+int pg_ultoa_n(uint32 l, char *a);
+int pg_ulltoa_n(uint64 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);

--------------2.21.0--

#17

Tels

nospam-pg-abuse@bloodgate.com

over 6 years ago

In reply to: David Fetter (#16)

Re: Efficient output for integer types

Moin,

On 2019-09-22 23:58, David Fetter wrote:

On Sat, Sep 21, 2019 at 07:29:25AM +0100, Andrew Gierth wrote:

"David" == David Fetter <david@fetter.org> writes:

Fixed.

Good work, more performance is sure nice :)

Noticed one more thing in the patch:

-		*start++ = *a;
-		*a-- = swap;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
}
+	else
+		*a = (char) ('0' + value2);
+
+	return olength;
}

The line "i += 2;" modifies i, but i is never used again nor returned.

Best regards,

Tels

#18

Andrew Gierth

andrew@tao11.riddles.org.uk

over 6 years ago

In reply to: David Fetter (#16)

Re: Efficient output for integer types

"David" == David Fetter <david@fetter.org> writes:

David> + return pg_ltostr_zeropad(str, (uint32)0 - (uint32)value, minwidth - 1);

No, this is just reintroducing the undefined behavior again. Once the
value has been converted to unsigned you can't cast it back to signed or
pass it to a function expecting a signed value, since it will overflow
in the INT_MIN case. (and in this example would probably output '-'
signs until it ran off the end of memory).

Here's how I would do it:

char *
pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
{
int32 len;
uint32 uvalue = value;

Assert(minwidth > 0);

if (value >= 0)
{
if (value < 100 && minwidth == 2) /* Short cut for common case */
{
memcpy(str, DIGIT_TABLE + value*2, 2);
return str + 2;
}
}
else
{
*str++ = '-';
minwidth -= 1;
uvalue = (uint32)0 - uvalue;
}

len = pg_ultoa_n(uvalue, str);
if (len >= minwidth)
return str + len;

memmove(str + minwidth - len, str, len);
memset(str, '0', minwidth - len);
return str + minwidth;
}

David> pg_ltostr(char *str, int32 value)
David> + int32 len = pg_ultoa_n(value, str);
David> + return str + len;

This seems to have lost its handling of negative numbers entirely (which
doesn't say much for the regression test coverage)

--
Andrew (irc:RhodiumToad)

#19

David Fetter

david@fetter.org

over 6 years ago

In reply to: Tels (#17)

Re: Efficient output for integer types

On Mon, Sep 23, 2019 at 10:28:09AM +0200, Tels wrote:

Moin,

On 2019-09-22 23:58, David Fetter wrote:

On Sat, Sep 21, 2019 at 07:29:25AM +0100, Andrew Gierth wrote:

"David" == David Fetter <david@fetter.org> writes:

Fixed.

Good work, more performance is sure nice :)

Noticed one more thing in the patch:
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
}
+	else
+		*a = (char) ('0' + value2);
+
+	return olength;
}
The line "i += 2;" modifies i, but i is never used again nor returned.

I found a similar one in a similar function, and removed it, too.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#20

David Fetter

david@fetter.org

over 6 years ago

In reply to: Andrew Gierth (#18)

1 attachment(s)

Re: Efficient output for integer types

On Mon, Sep 23, 2019 at 01:16:36PM +0100, Andrew Gierth wrote:

"David" == David Fetter <david@fetter.org> writes:

David> + return pg_ltostr_zeropad(str, (uint32)0 - (uint32)value, minwidth - 1);

No, this is just reintroducing the undefined behavior again. Once the
value has been converted to unsigned you can't cast it back to signed or
pass it to a function expecting a signed value, since it will overflow
in the INT_MIN case. (and in this example would probably output '-'
signs until it ran off the end of memory).

Here's how I would do it:

char *
pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
{
int32 len;
uint32 uvalue = value;

Assert(minwidth > 0);

if (value >= 0)
{
if (value < 100 && minwidth == 2) /* Short cut for common case */
{
memcpy(str, DIGIT_TABLE + value*2, 2);
return str + 2;
}
}
else
{
*str++ = '-';
minwidth -= 1;
uvalue = (uint32)0 - uvalue;
}

len = pg_ultoa_n(uvalue, str);
if (len >= minwidth)
return str + len;

memmove(str + minwidth - len, str, len);
memset(str, '0', minwidth - len);
return str + minwidth;
}

Done pretty much that way.

David> pg_ltostr(char *str, int32 value)
David> + int32 len = pg_ultoa_n(value, str);
David> + return str + len;

This seems to have lost its handling of negative numbers entirely

Given the comment that precedes it and all the use cases in the code,
I changed the signature to take an unsigned integer instead. It's
pretty clear that the intent was to add digits and only digits to the
passed-in string.

(which doesn't say much for the regression test coverage)

I didn't see any obvious way to test functions not surfaced to SQL.
Should we have one?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v12-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From a0eecd74d1dd3f6a4bb1a07f41740c4f6d34ceac Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v12] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN + 1];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..5dfbe8dfcd 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,68 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline int
+decimalLength32(const uint32 v)
+{
+	int			t;
+	static uint32	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline int
+decimalLength64(const uint64 v)
+{
+	int			t;
+	static uint64	PowersOfTen[] = {
+		UINT64CONST(1),                   UINT64CONST(10),
+		UINT64CONST(100),                 UINT64CONST(1000),
+		UINT64CONST(10000),               UINT64CONST(100000),
+		UINT64CONST(1000000),             UINT64CONST(10000000),
+		UINT64CONST(100000000),           UINT64CONST(1000000000),
+		UINT64CONST(10000000000),         UINT64CONST(100000000000),
+		UINT64CONST(1000000000000),       UINT64CONST(10000000000000),
+		UINT64CONST(100000000000000),     UINT64CONST(1000000000000000),
+		UINT64CONST(10000000000000000),   UINT64CONST(100000000000000000),
+		UINT64CONST(1000000000000000000), UINT64CONST(10000000000000000000)
+	};
+
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,111 +338,191 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ultoa_n: converts an unsigned 32-bit integer to its string representation,
+ * not NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 10 bytes)
+ */
+int
+pg_ultoa_n(uint32 value, char *a)
+{
+	int		olength, i = 0;
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		*a = '0';
+		return 1;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 10000)
+	{
+		const	uint32 c = value - 10000 * (value / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 10000;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+	}
+
+	return olength;
+}
+
+/*
+ * NUL-terminate the output of pg_ultoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
  */
 void
 pg_ltoa(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
 
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT32_MIN)
+	uint32	uvalue = (uint32)value;
+	int		len;
+	if (value < 0)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
+		uvalue = (uint32)0 - uvalue;
 		*a++ = '-';
+	}
+	len = pg_ultoa_n(uvalue, a);
+	a[len] = '\0';
+}
+
+/*
+ * Get the decimal representation, not NUL-terminated, and return the length of
+ * same.  Caller must ensure that a points to at least MAXINT8LEN bytes.
+ */
+int
+pg_ulltoa_n(uint64 value, char *a)
+{
+	int	olength, i = 0;
+	uint32 value2;
+
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		*a = '0';
+		return 1;
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	value2 = (uint32) value;
+
+	if (value2 >= 10000)
+	{
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value2 /= 10000;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+		char *pos = a + olength - i;
+
+		value2 /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
 	{
-		char		swap = *start;
+		const uint32 c = value2 << 1;
+		char *pos = a + olength - i;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
 	}
+	else
+		*a = (char) ('0' + value2);
+
+	return olength;
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
-
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT64_MIN)
+	int len;
+	uint64 uvalue = value;
+	if (value < 0)
 	{
-		memcpy(a, "-9223372036854775808", 21);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
-
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
-
-	/* Reverse string. */
-	while (start < a)
-	{
-		char		swap = *start;
-
-		*start++ = *a;
-		*a-- = swap;
+		uvalue = (uint64)0 - uvalue;
 	}
+	len = pg_ulltoa_n(uvalue, a);
+	a[len] = 0;
 }
 
 
@@ -409,60 +551,33 @@ pg_lltoa(int64 value, char *a)
 char *
 pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int			len;
+	uint32		uvalue = value;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value >= 0)
 	{
-		*start++ = '-';
+		if (value < 100 && minwidth == 2) /* Short cut for common case */
+		{
+			memcpy(str, DIGIT_TABLE + value * 2, 2);
+			return str + 2;
+		}
+	}
+	else
+	{
+		*str++ = '-';
 		minwidth--;
-
-		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
-		 */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
-		}
-	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
+		uvalue = (uint32)0 - uvalue;
 	}
 
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
+	len = pg_ultoa_n(value, str);
+	if (len >= minwidth)
+		return str + len;
 
-	/* Otherwise, return last output character + 1 */
-	return end;
+	memmove(str + minwidth - len, str, len);
+	memset(str, '0', minwidth-len);
+	return str + minwidth;
 }
 
 /*
@@ -484,64 +599,10 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
  * result.
  */
 char *
-pg_ltostr(char *str, int32 value)
+pg_ltostr(char *str, uint32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int		len = pg_ultoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..7f27cdd80f 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,10 +48,12 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+int pg_ultoa_n(uint32 l, char *a);
+int pg_ulltoa_n(uint64 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
 extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
-extern char *pg_ltostr(char *str, int32 value);
+extern char *pg_ltostr(char *str, uint32 value);
 extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */

--------------2.21.0--

#21

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#20)

1 attachment(s)

Re: Efficient output for integer types

On Mon, Sep 23, 2019 at 11:35:07PM +0200, David Fetter wrote:

On Mon, Sep 23, 2019 at 01:16:36PM +0100, Andrew Gierth wrote:

Per discussion on IRC, change some functions to take only unsigned
integer types so as not to branch for the case of negative numbers
they're never actually called with.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v13-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From 714a7b91a3d109d380473462de0f68e53ebc5d15 Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v13] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Add code which determines the number of decimal digits in an int4/int8 efficiently
- Split off pg_ultoa_n from pg_ltoa
- Use same to make other functions shorter
- Change some of the functions to take only unsigned so they don't have to branch for negative numbers

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN + 1];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index e38bd93054..9fb03d4fbc 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -391,9 +391,9 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 	Assert(precision >= 0);
 
 	if (fillzeros)
-		cp = pg_ltostr_zeropad(cp, Abs(sec), 2);
+		cp = pg_ultostr_zeropad(cp, Abs(sec), 2);
 	else
-		cp = pg_ltostr(cp, Abs(sec));
+		cp = pg_ultostr(cp, Abs(sec));
 
 	/* fsec_t is just an int32 */
 	if (fsec != 0)
@@ -433,7 +433,7 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 		 * which will generate a correct answer in the minimum valid width.
 		 */
 		if (value)
-			return pg_ltostr(cp, Abs(fsec));
+			return pg_ultostr(cp, Abs(fsec));
 
 		return end;
 	}
@@ -3834,20 +3834,20 @@ EncodeTimezone(char *str, int tz, int style)
 
 	if (sec != 0)
 	{
-		str = pg_ltostr_zeropad(str, hour, 2);
+		str = pg_ultostr_zeropad(str, hour, 2);
 		*str++ = ':';
-		str = pg_ltostr_zeropad(str, min, 2);
+		str = pg_ultostr_zeropad(str, min, 2);
 		*str++ = ':';
-		str = pg_ltostr_zeropad(str, sec, 2);
+		str = pg_ultostr_zeropad(str, sec, 2);
 	}
 	else if (min != 0 || style == USE_XSD_DATES)
 	{
-		str = pg_ltostr_zeropad(str, hour, 2);
+		str = pg_ultostr_zeropad(str, hour, 2);
 		*str++ = ':';
-		str = pg_ltostr_zeropad(str, min, 2);
+		str = pg_ultostr_zeropad(str, min, 2);
 	}
 	else
-		str = pg_ltostr_zeropad(str, hour, 2);
+		str = pg_ultostr_zeropad(str, hour, 2);
 	return str;
 }
 
@@ -3864,40 +3864,40 @@ EncodeDateOnly(struct pg_tm *tm, int style, char *str)
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* compatible with ISO date formats */
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			break;
 
 		case USE_SQL_DATES:
 			/* compatible with Oracle/Ingres date formats */
 			if (DateOrder == DATEORDER_DMY)
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 				*str++ = '/';
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			}
 			else
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 				*str++ = '/';
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			}
 			*str++ = '/';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German-style date format */
-			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			*str++ = '.';
-			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			*str++ = '.';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			break;
 
@@ -3906,18 +3906,18 @@ EncodeDateOnly(struct pg_tm *tm, int style, char *str)
 			/* traditional date-only style for Postgres */
 			if (DateOrder == DATEORDER_DMY)
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 				*str++ = '-';
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			}
 			else
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 				*str++ = '-';
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			}
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			break;
 	}
@@ -3942,9 +3942,9 @@ EncodeDateOnly(struct pg_tm *tm, int style, char *str)
 void
 EncodeTimeOnly(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, int style, char *str)
 {
-	str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+	str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 	*str++ = ':';
-	str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+	str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 	*str++ = ':';
 	str = AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
 	if (print_tz)
@@ -3987,16 +3987,16 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* Compatible with ISO-8601 date formats */
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			*str++ = (style == USE_ISO_DATES) ? ' ' : 'T';
-			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 			*str++ = ':';
-			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 			*str++ = ':';
 			str = AppendTimestampSeconds(str, tm, fsec);
 			if (print_tz)
@@ -4007,23 +4007,23 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 			/* Compatible with Oracle/Ingres date formats */
 			if (DateOrder == DATEORDER_DMY)
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 				*str++ = '/';
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			}
 			else
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 				*str++ = '/';
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			}
 			*str++ = '/';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			*str++ = ' ';
-			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 			*str++ = ':';
-			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 			*str++ = ':';
 			str = AppendTimestampSeconds(str, tm, fsec);
 
@@ -4046,16 +4046,16 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 
 		case USE_GERMAN_DATES:
 			/* German variant on European style */
-			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			*str++ = '.';
-			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			*str++ = '.';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			*str++ = ' ';
-			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 			*str++ = ':';
-			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 			*str++ = ':';
 			str = AppendTimestampSeconds(str, tm, fsec);
 
@@ -4081,7 +4081,7 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 			*str++ = ' ';
 			if (DateOrder == DATEORDER_DMY)
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 				*str++ = ' ';
 				memcpy(str, months[tm->tm_mon - 1], 3);
 				str += 3;
@@ -4091,16 +4091,16 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 				memcpy(str, months[tm->tm_mon - 1], 3);
 				str += 3;
 				*str++ = ' ';
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			}
 			*str++ = ' ';
-			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 			*str++ = ':';
-			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 			*str++ = ':';
 			str = AppendTimestampSeconds(str, tm, fsec);
 			*str++ = ' ';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 
 			if (print_tz)
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..27d6f3aacb 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,68 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline int
+decimalLength32(const uint32 v)
+{
+	int			t;
+	static uint32	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline int
+decimalLength64(const uint64 v)
+{
+	int			t;
+	static uint64	PowersOfTen[] = {
+		UINT64CONST(1),                   UINT64CONST(10),
+		UINT64CONST(100),                 UINT64CONST(1000),
+		UINT64CONST(10000),               UINT64CONST(100000),
+		UINT64CONST(1000000),             UINT64CONST(10000000),
+		UINT64CONST(100000000),           UINT64CONST(1000000000),
+		UINT64CONST(10000000000),         UINT64CONST(100000000000),
+		UINT64CONST(1000000000000),       UINT64CONST(10000000000000),
+		UINT64CONST(100000000000000),     UINT64CONST(1000000000000000),
+		UINT64CONST(10000000000000000),   UINT64CONST(100000000000000000),
+		UINT64CONST(1000000000000000000), UINT64CONST(10000000000000000000)
+	};
+
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos64(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,116 +338,196 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ultoa_n: converts an unsigned 32-bit integer to its string representation,
+ * not NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 10 bytes)
+ */
+int
+pg_ultoa_n(uint32 value, char *a)
+{
+	int		olength, i = 0;
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		*a = '0';
+		return 1;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 10000)
+	{
+		const	uint32 c = value - 10000 * (value / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 10000;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		char *pos = a + olength - i;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+	}
+
+	return olength;
+}
+
+/*
+ * NUL-terminate the output of pg_ultoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
  */
 void
 pg_ltoa(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
 
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT32_MIN)
+	uint32	uvalue = (uint32)value;
+	int		len;
+	if (value < 0)
 	{
-		memcpy(a, "-2147483648", 12);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
+		uvalue = (uint32)0 - uvalue;
 		*a++ = '-';
+	}
+	len = pg_ultoa_n(uvalue, a);
+	a[len] = '\0';
+}
+
+/*
+ * Get the decimal representation, not NUL-terminated, and return the length of
+ * same.  Caller must ensure that a points to at least MAXINT8LEN bytes.
+ */
+int
+pg_ulltoa_n(uint64 value, char *a)
+{
+	int	olength, i = 0;
+	uint32 value2;
+
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		*a = '0';
+		return 1;
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		const	uint64 q = value / 100000000;
+		uint32	value2 = (uint32) (value - 100000000 * q);
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	value2 = (uint32) value;
+
+	if (value2 >= 10000)
+	{
+		const	uint32 c = value2 - 10000 * (value2 / 10000);
+		const	uint32 c0 = (c % 100) << 1;
+		const	uint32 c1 = (c / 100) << 1;
+
+		char *pos = a + olength - i;
+
+		value2 /= 10000;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+		char *pos = a + olength - i;
+
+		value2 /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
 	{
-		char		swap = *start;
+		const uint32 c = value2 << 1;
+		char *pos = a + olength - i;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
 	}
+	else
+		*a = (char) ('0' + value2);
+
+	return olength;
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
-
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT64_MIN)
+	int len;
+	uint64 uvalue = value;
+	if (value < 0)
 	{
-		memcpy(a, "-9223372036854775808", 21);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
-
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
-
-	/* Reverse string. */
-	while (start < a)
-	{
-		char		swap = *start;
-
-		*start++ = *a;
-		*a-- = swap;
+		uvalue = (uint64)0 - uvalue;
 	}
+	len = pg_ulltoa_n(uvalue, a);
+	a[len] = 0;
 }
 
 
 /*
- * pg_ltostr_zeropad
+ * pg_ultostr_zeropad
  *		Converts 'value' into a decimal string representation stored at 'str'.
  *		'minwidth' specifies the minimum width of the result; any extra space
  *		is filled up by prefixing the number with zeros.
@@ -407,62 +549,25 @@ pg_lltoa(int64 value, char *a)
  * result.
  */
 char *
-pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
+pg_ultostr_zeropad(char *str, uint32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int			len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value < 100 && minwidth == 2) /* Short cut for common case */
 	{
-		*start++ = '-';
-		minwidth--;
-
-		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
-		 */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
-		}
-	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
+		memcpy(str, DIGIT_TABLE + value * 2, 2);
+		return str + 2;
 	}
 
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
+	len = pg_ultoa_n(value, str);
+	if (len >= minwidth)
+		return str + len;
 
-	/* Otherwise, return last output character + 1 */
-	return end;
+	memmove(str + minwidth - len, str, len);
+	memset(str, '0', minwidth-len);
+	return str + minwidth;
 }
 
 /*
@@ -484,64 +589,10 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
  * result.
  */
 char *
-pg_ltostr(char *str, int32 value)
+pg_ultostr(char *str, uint32 value)
 {
-	char	   *start;
-	char	   *end;
-
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	int		len = pg_ultoa_n(value, str);
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..f5cba27db6 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,10 +48,12 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+int pg_ultoa_n(uint32 l, char *a);
+int pg_ulltoa_n(uint64 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
-extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
-extern char *pg_ltostr(char *str, int32 value);
+extern char *pg_ultostr_zeropad(char *str, uint32 value, int32 minwidth);
+extern char *pg_ultostr(char *str, uint32 value);
 extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */

--------------2.21.0--

#22

David Fetter

david@fetter.org

over 6 years ago

In reply to: David Fetter (#21)

1 attachment(s)

Re: Efficient output for integer types

On Tue, Sep 24, 2019 at 06:30:18AM +0200, David Fetter wrote:

On Mon, Sep 23, 2019 at 11:35:07PM +0200, David Fetter wrote:

On Mon, Sep 23, 2019 at 01:16:36PM +0100, Andrew Gierth wrote:

Per discussion on IRC, change some functions to take only unsigned
integer types so as not to branch for the case of negative numbers
they're never actually called with.

Best,
David.

...and part of a pgindent run

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachments:

v14-0001-Make-int4-and-int8-operations-more-efficent.patchtext/x-diff; charset=us-asciiDownload

From 094c5d9efb1611920de87e9c41e473b39cad442f Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v14] Make int4 and int8 operations more efficent
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.21.0"

This is a multi-part message in MIME format.
--------------2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


- Output routines now do more digits per iteration, and
- Add code which determines the number of decimal digits in an int4/int8 efficiently
- Split off pg_ultoa_n from pg_ltoa
- Use same to make other functions shorter
- Change some of the functions to take only unsigned so they don't have to branch for negative numbers

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 			case INT8OID:
 				{
 					int64		num = DatumGetInt64(value);
-					char		str[23];	/* sign, 21 digits and '\0' */
+					char		str[MAXINT8LEN + 1];
 
 					pg_lltoa(num, str);
 					pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index e38bd93054..9fb03d4fbc 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -391,9 +391,9 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 	Assert(precision >= 0);
 
 	if (fillzeros)
-		cp = pg_ltostr_zeropad(cp, Abs(sec), 2);
+		cp = pg_ultostr_zeropad(cp, Abs(sec), 2);
 	else
-		cp = pg_ltostr(cp, Abs(sec));
+		cp = pg_ultostr(cp, Abs(sec));
 
 	/* fsec_t is just an int32 */
 	if (fsec != 0)
@@ -433,7 +433,7 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 		 * which will generate a correct answer in the minimum valid width.
 		 */
 		if (value)
-			return pg_ltostr(cp, Abs(fsec));
+			return pg_ultostr(cp, Abs(fsec));
 
 		return end;
 	}
@@ -3834,20 +3834,20 @@ EncodeTimezone(char *str, int tz, int style)
 
 	if (sec != 0)
 	{
-		str = pg_ltostr_zeropad(str, hour, 2);
+		str = pg_ultostr_zeropad(str, hour, 2);
 		*str++ = ':';
-		str = pg_ltostr_zeropad(str, min, 2);
+		str = pg_ultostr_zeropad(str, min, 2);
 		*str++ = ':';
-		str = pg_ltostr_zeropad(str, sec, 2);
+		str = pg_ultostr_zeropad(str, sec, 2);
 	}
 	else if (min != 0 || style == USE_XSD_DATES)
 	{
-		str = pg_ltostr_zeropad(str, hour, 2);
+		str = pg_ultostr_zeropad(str, hour, 2);
 		*str++ = ':';
-		str = pg_ltostr_zeropad(str, min, 2);
+		str = pg_ultostr_zeropad(str, min, 2);
 	}
 	else
-		str = pg_ltostr_zeropad(str, hour, 2);
+		str = pg_ultostr_zeropad(str, hour, 2);
 	return str;
 }
 
@@ -3864,40 +3864,40 @@ EncodeDateOnly(struct pg_tm *tm, int style, char *str)
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* compatible with ISO date formats */
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			break;
 
 		case USE_SQL_DATES:
 			/* compatible with Oracle/Ingres date formats */
 			if (DateOrder == DATEORDER_DMY)
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 				*str++ = '/';
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			}
 			else
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 				*str++ = '/';
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			}
 			*str++ = '/';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German-style date format */
-			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			*str++ = '.';
-			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			*str++ = '.';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			break;
 
@@ -3906,18 +3906,18 @@ EncodeDateOnly(struct pg_tm *tm, int style, char *str)
 			/* traditional date-only style for Postgres */
 			if (DateOrder == DATEORDER_DMY)
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 				*str++ = '-';
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			}
 			else
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 				*str++ = '-';
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			}
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			break;
 	}
@@ -3942,9 +3942,9 @@ EncodeDateOnly(struct pg_tm *tm, int style, char *str)
 void
 EncodeTimeOnly(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, int style, char *str)
 {
-	str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+	str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 	*str++ = ':';
-	str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+	str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 	*str++ = ':';
 	str = AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
 	if (print_tz)
@@ -3987,16 +3987,16 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* Compatible with ISO-8601 date formats */
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			*str++ = '-';
-			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			*str++ = (style == USE_ISO_DATES) ? ' ' : 'T';
-			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 			*str++ = ':';
-			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 			*str++ = ':';
 			str = AppendTimestampSeconds(str, tm, fsec);
 			if (print_tz)
@@ -4007,23 +4007,23 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 			/* Compatible with Oracle/Ingres date formats */
 			if (DateOrder == DATEORDER_DMY)
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 				*str++ = '/';
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			}
 			else
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 				*str++ = '/';
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			}
 			*str++ = '/';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			*str++ = ' ';
-			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 			*str++ = ':';
-			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 			*str++ = ':';
 			str = AppendTimestampSeconds(str, tm, fsec);
 
@@ -4046,16 +4046,16 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 
 		case USE_GERMAN_DATES:
 			/* German variant on European style */
-			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			*str++ = '.';
-			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_mon, 2);
 			*str++ = '.';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 			*str++ = ' ';
-			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 			*str++ = ':';
-			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 			*str++ = ':';
 			str = AppendTimestampSeconds(str, tm, fsec);
 
@@ -4081,7 +4081,7 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 			*str++ = ' ';
 			if (DateOrder == DATEORDER_DMY)
 			{
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 				*str++ = ' ';
 				memcpy(str, months[tm->tm_mon - 1], 3);
 				str += 3;
@@ -4091,16 +4091,16 @@ EncodeDateTime(struct pg_tm *tm, fsec_t fsec, bool print_tz, int tz, const char
 				memcpy(str, months[tm->tm_mon - 1], 3);
 				str += 3;
 				*str++ = ' ';
-				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				str = pg_ultostr_zeropad(str, tm->tm_mday, 2);
 			}
 			*str++ = ' ';
-			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_hour, 2);
 			*str++ = ':';
-			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			str = pg_ultostr_zeropad(str, tm->tm_min, 2);
 			*str++ = ':';
 			str = AppendTimestampSeconds(str, tm, fsec);
 			*str++ = ' ';
-			str = pg_ltostr_zeropad(str,
+			str = pg_ultostr_zeropad(str,
 									(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 
 			if (print_tz)
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"
 
 
-#define MAXINT8LEN		25
-
 typedef struct
 {
 	int64		current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..9439508aca 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,68 @@
 
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline int
+decimalLength32(const uint32 v)
+{
+	int			t;
+	static uint32	PowersOfTen[] =
+	{1,                10,                100,
+	 1000,             10000,             100000,
+	 1000000,          10000000,          100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm by a
+	 * good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1) * 1233 / 4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline int
+decimalLength64(const uint64 v)
+{
+	int			t;
+	static uint64	PowersOfTen[] = {
+		UINT64CONST(1),                   UINT64CONST(10),
+		UINT64CONST(100),                 UINT64CONST(1000),
+		UINT64CONST(10000),               UINT64CONST(100000),
+		UINT64CONST(1000000),             UINT64CONST(10000000),
+		UINT64CONST(100000000),           UINT64CONST(1000000000),
+		UINT64CONST(10000000000),         UINT64CONST(100000000000),
+		UINT64CONST(1000000000000),       UINT64CONST(10000000000000),
+		UINT64CONST(100000000000000),     UINT64CONST(1000000000000000),
+		UINT64CONST(10000000000000000),   UINT64CONST(100000000000000000),
+		UINT64CONST(1000000000000000000), UINT64CONST(10000000000000000000)
+	};
+
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm by a
+	 * good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos64(v) + 1) * 1233 / 4096;
+	return t + (v >= PowersOfTen[t]);
+}
 
 /*
  * pg_atoi: convert string to integer
@@ -276,116 +338,201 @@ pg_itoa(int16 i, char *a)
 }
 
 /*
- * pg_ltoa: converts a signed 32-bit integer to its string representation
+ * pg_ultoa_n: converts an unsigned 32-bit integer to its string representation,
+ * not NUL-terminated, and returns the length of that string representation
  *
- * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Caller must ensure that 'a' points to enough memory to hold the result (at
+ * least 10 bytes)
+ */
+int
+pg_ultoa_n(uint32 value, char *a)
+{
+	int			olength,
+				i = 0;
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		*a = '0';
+		return 1;
+	}
+
+	olength = decimalLength32(value);
+
+	/* Compute the result string. */
+	while (value >= 10000)
+	{
+		const uint32 c = value - 10000 * (value / 10000);
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+
+		char	   *pos = a + olength - i;
+
+		value /= 10000;
+
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value >= 100)
+	{
+		const uint32 c = (value % 100) << 1;
+
+		char	   *pos = a + olength - i;
+
+		value /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value >= 10)
+	{
+		const uint32 c = value << 1;
+
+		char	   *pos = a + olength - i;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+	}
+	else
+	{
+		*a = (char) ('0' + value);
+	}
+
+	return olength;
+}
+
+/*
+ * NUL-terminate the output of pg_ultoa_n.
+ *
+ * It is the caller's responsibility to ensure that a is at least 12 bytes long,
+ * which is enough room to hold a minus sign, a maximally long int32, and the
+ * above terminating NUL.
  */
 void
 pg_ltoa(int32 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
 
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT32_MIN)
-	{
-		memcpy(a, "-2147483648", 12);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
+	uint32		uvalue = (uint32) value;
+	int			len;
 
-	/* Compute the result string backwards. */
-	do
+	if (value < 0)
 	{
-		int32		remainder;
-		int32		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
+		uvalue = (uint32) 0 - uvalue;
 		*a++ = '-';
+	}
+	len = pg_ultoa_n(uvalue, a);
+	a[len] = '\0';
+}
+
+/*
+ * Get the decimal representation, not NUL-terminated, and return the length of
+ * same.  Caller must ensure that a points to at least MAXINT8LEN bytes.
+ */
+int
+pg_ulltoa_n(uint64 value, char *a)
+{
+	int			olength,
+				i = 0;
+	uint32		value2;
+
+
+	/* Degenerate case */
+	if (value == 0)
+	{
+		*a = '0';
+		return 1;
+	}
+
+	olength = decimalLength64(value);
+
+	/* Compute the result string. */
+	while (value >= 100000000)
+	{
+		const uint64 q = value / 100000000;
+		uint32		value2 = (uint32) (value - 100000000 * q);
+
+		const uint32 c = value2 % 10000;
+		const uint32 d = value2 / 10000;
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+		const uint32 d0 = (d % 100) << 1;
+		const uint32 d1 = (d / 100) << 1;
+
+		char	   *pos = a + olength - i;
+
+		value = q;
 
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		memcpy(pos - 6, DIGIT_TABLE + d0, 2);
+		memcpy(pos - 8, DIGIT_TABLE + d1, 2);
+		i += 8;
+	}
+
+	/* Switch to 32-bit for speed */
+	value2 = (uint32) value;
+
+	if (value2 >= 10000)
+	{
+		const uint32 c = value2 - 10000 * (value2 / 10000);
+		const uint32 c0 = (c % 100) << 1;
+		const uint32 c1 = (c / 100) << 1;
+
+		char	   *pos = a + olength - i;
+
+		value2 /= 10000;
 
-	/* Reverse string. */
-	while (start < a)
+		memcpy(pos - 2, DIGIT_TABLE + c0, 2);
+		memcpy(pos - 4, DIGIT_TABLE + c1, 2);
+		i += 4;
+	}
+	if (value2 >= 100)
+	{
+		const uint32 c = (value2 % 100) << 1;
+		char	   *pos = a + olength - i;
+
+		value2 /= 100;
+
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
+		i += 2;
+	}
+	if (value2 >= 10)
 	{
-		char		swap = *start;
+		const uint32 c = value2 << 1;
+		char	   *pos = a + olength - i;
 
-		*start++ = *a;
-		*a-- = swap;
+		memcpy(pos - 2, DIGIT_TABLE + c, 2);
 	}
+	else
+		*a = (char) ('0' + value2);
+
+	return olength;
 }
 
 /*
  * pg_lltoa: convert a signed 64-bit integer to its string representation
  *
  * Caller must ensure that 'a' points to enough memory to hold the result
- * (at least MAXINT8LEN+1 bytes, counting a leading sign and trailing NUL).
+ * (at least MAXINT8LEN + 1 bytes, counting a leading sign and trailing NUL).
  */
 void
 pg_lltoa(int64 value, char *a)
 {
-	char	   *start = a;
-	bool		neg = false;
+	int			len;
+	uint64		uvalue = value;
 
-	/*
-	 * Avoid problems with the most negative integer not being representable
-	 * as a positive integer.
-	 */
-	if (value == PG_INT64_MIN)
+	if (value < 0)
 	{
-		memcpy(a, "-9223372036854775808", 21);
-		return;
-	}
-	else if (value < 0)
-	{
-		value = -value;
-		neg = true;
-	}
-
-	/* Compute the result string backwards. */
-	do
-	{
-		int64		remainder;
-		int64		oldval = value;
-
-		value /= 10;
-		remainder = oldval - value * 10;
-		*a++ = '0' + remainder;
-	} while (value != 0);
-
-	if (neg)
 		*a++ = '-';
-
-	/* Add trailing NUL byte, and back up 'a' to the last character. */
-	*a-- = '\0';
-
-	/* Reverse string. */
-	while (start < a)
-	{
-		char		swap = *start;
-
-		*start++ = *a;
-		*a-- = swap;
+		uvalue = (uint64) 0 - uvalue;
 	}
+	len = pg_ulltoa_n(uvalue, a);
+	a[len] = 0;
 }
 
 
 /*
- * pg_ltostr_zeropad
+ * pg_ultostr_zeropad
  *		Converts 'value' into a decimal string representation stored at 'str'.
  *		'minwidth' specifies the minimum width of the result; any extra space
  *		is filled up by prefixing the number with zeros.
@@ -407,62 +554,25 @@ pg_lltoa(int64 value, char *a)
  * result.
  */
 char *
-pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
+pg_ultostr_zeropad(char *str, uint32 value, int32 minwidth)
 {
-	char	   *start = str;
-	char	   *end = &str[minwidth];
-	int32		num = value;
+	int			len;
 
 	Assert(minwidth > 0);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (num < 0)
+	if (value < 100 && minwidth == 2)	/* Short cut for common case */
 	{
-		*start++ = '-';
-		minwidth--;
-
-		/*
-		 * Build the number starting at the last digit.  Here remainder will
-		 * be a negative number, so we must reverse the sign before adding '0'
-		 * in order to get the correct ASCII digit.
-		 */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' - remainder;
-		}
-	}
-	else
-	{
-		/* Build the number starting at the last digit */
-		while (minwidth--)
-		{
-			int32		oldval = num;
-			int32		remainder;
-
-			num /= 10;
-			remainder = oldval - num * 10;
-			start[minwidth] = '0' + remainder;
-		}
+		memcpy(str, DIGIT_TABLE + value * 2, 2);
+		return str + 2;
 	}
 
-	/*
-	 * If minwidth was not high enough to fit the number then num won't have
-	 * been divided down to zero.  We punt the problem to pg_ltostr(), which
-	 * will generate a correct answer in the minimum valid width.
-	 */
-	if (num != 0)
-		return pg_ltostr(str, value);
+	len = pg_ultoa_n(value, str);
+	if (len >= minwidth)
+		return str + len;
 
-	/* Otherwise, return last output character + 1 */
-	return end;
+	memmove(str + minwidth - len, str, len);
+	memset(str, '0', minwidth - len);
+	return str + minwidth;
 }
 
 /*
@@ -484,64 +594,11 @@ pg_ltostr_zeropad(char *str, int32 value, int32 minwidth)
  * result.
  */
 char *
-pg_ltostr(char *str, int32 value)
+pg_ultostr(char *str, uint32 value)
 {
-	char	   *start;
-	char	   *end;
+	int			len = pg_ultoa_n(value, str);
 
-	/*
-	 * Handle negative numbers in a special way.  We can't just write a '-'
-	 * prefix and reverse the sign as that would overflow for INT32_MIN.
-	 */
-	if (value < 0)
-	{
-		*str++ = '-';
-
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			/* As above, we expect remainder to be negative. */
-			*str++ = '0' - remainder;
-		} while (value != 0);
-	}
-	else
-	{
-		/* Mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		oldval = value;
-			int32		remainder;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
-
-	/* Remember the end+1 and back up 'str' to the last character. */
-	end = str--;
-
-	/* Reverse string. */
-	while (start < str)
-	{
-		char		swap = *start;
-
-		*start++ = *str;
-		*str-- = swap;
-	}
-
-	return end;
+	return str + len;
 }
 
 /*
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 937ddb7ef0..a01232c6af 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -18,6 +18,8 @@
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
 
+/* Sign + the most decimal digits an 8-byte number could have */
+#define MAXINT8LEN 20
 
 /* bool.c */
 extern bool parse_bool(const char *value, bool *result);
@@ -46,10 +48,12 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern int16 pg_strtoint16(const char *s);
 extern int32 pg_strtoint32(const char *s);
 extern void pg_itoa(int16 i, char *a);
+int			pg_ultoa_n(uint32 l, char *a);
+int			pg_ulltoa_n(uint64 l, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
-extern char *pg_ltostr_zeropad(char *str, int32 value, int32 minwidth);
-extern char *pg_ltostr(char *str, int32 value);
+extern char *pg_ultostr_zeropad(char *str, uint32 value, int32 minwidth);
+extern char *pg_ultostr(char *str, uint32 value);
 extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
 
 /* oid.c */

--------------2.21.0--

#23

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 6 years ago

In reply to: David Fetter (#22)

Re: Efficient output for integer types

Hi,

Any plans regarding committing this patch? I see the thread is silent
since September 24, when the last patch version was posted. The patch is
already marked as RFC since December, when David changed the status. I
don't have any opinion whether the patch is RFC or not (it might well
be), but IMHO it should have been mentioned in this thread.

I did a quick test to see how much more efficient this is, and for a
table with 10 bigint columns and 5M random rows the COPY to /dev/null
went from 3000 ms to ~2700 ms. That's not the 30% speedup mentioned by
David in the first message, but 10% is still pretty nice.

Of course, for real-world use cases the speedup will be lower because of
using other data types too, I/O etc. But it's still nice.

So, is anyone opposed to pushing this? If not, who'll to do that? I see
Andrew Gierth was involved in the discussions on IRC and it's related to
the Ryu patch, so maybe he want's to take care of this. Andrew?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services