pgbench - add pseudo-random permutation function
Hello,
This patch adds a pseudo-random permutation function to pgbench. It allows
mixing non-uniform random keys to avoid trivial correlations between
neighboring values, and hence between pages.
The function is a simplistic form of encryption adapted to any size, using
a few iterations of scramble and scatter phases. The result is not
cryptographically convincing, nor even statistically, but it is quite
inexpensive and achieves the desired result. A computation costs 0.22 µs
per call on my laptop, about three times the cost of a simple function.
Alternative designs, such as iterating over an actual encryption function
or using some sbox, would lead to much more costly solutions and complex
code.
I also attach a few scripts I used for testing.
--
Fabien.
Attachments:
pgbench-prp-func-1.patch (text/plain)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..31ef39bf10 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -917,7 +917,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1370,6 +1370,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1531,6 +1538,21 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not correlated.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index f7c56cc6a3..762a62959b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -366,6 +367,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -478,6 +482,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c089..7d21644c77 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -986,6 +986,119 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/* 27-29 bits mega primes from https://primes.utm.edu/lists/small/millions/ */
+static int64 primes[PRP_PRIMES] = {
+ INT64CONST(122949829),
+ INT64CONST(141650963),
+ INT64CONST(160481219),
+ INT64CONST(179424691),
+ INT64CONST(198491329),
+ INT64CONST(217645199),
+ INT64CONST(236887699),
+ INT64CONST(256203221),
+ INT64CONST(275604547),
+ INT64CONST(295075153),
+ INT64CONST(314606891),
+ INT64CONST(334214467),
+ INT64CONST(353868019),
+ INT64CONST(373587911),
+ INT64CONST(393342743),
+ INT64CONST(413158523)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return largest mask in 0 .. n-1 */
+static uint64 compute_prp_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n >> 1;
+}
+
+/* Donald Knuth linear congruential generator */
+#define DK_LCG_MUL INT64CONST(6364136223846793005)
+#define DK_LCG_INC INT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * PLEASE DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 key = (uint64) seed;
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_prp_mask(size-1);
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on power-or-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2319,6 +2432,26 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6983865b92..665c450c2c 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2fc021dde7..0aec384eca 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -322,6 +322,14 @@ pgbench(
qr{command=96.: int 1\b}, # :scale
qr{command=97.: int 0\b}, # :client_id
qr{command=98.: int 5432\b}, # :random_seed
+ qr{command=99.: boolean true\b},
+ qr{command=100.: boolean true\b},
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=107.: boolean true\b},
+ qr{command=108.: boolean true\b},
+ qr{command=109.: boolean true\b},
],
'pgbench expressions',
{
@@ -447,6 +455,24 @@ SELECT :v0, :v1, :v2, :v3;
\set sc debug(:scale)
\set ci debug(:client_id)
\set rs debug(:random_seed)
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
}
});
@@ -731,6 +757,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 0,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 0,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index c1c2c1e3d4..ff02cfb46b 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -290,6 +290,16 @@ my @script_tests = (
'too many arguments for hash',
[qr{unexpected number of arguments \(hash\)}],
{ 'bad-hash-2.sql' => "\\set i hash(1,2,3)\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Hi Fabien,
I reviewed `pgbench-prp-func-1.patch` and found that the implementation
is incomplete.
In the pseudorandom_perm function, I confirmed that the scramble and
scatter operations are mathematically bijections. Therefore, the
function is mathematically correct.
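The bijectivity of the scatter step can also be checked by brute force on small sizes. The sketch below is not part of the patch: `scatter` and `is_permutation` are hypothetical helper names, and only the affine map v -> (p * v + k) % size is exercised, with sizes small enough that the product cannot overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the scatter step: v -> (p * v + k) % size. */
static uint64_t scatter(uint64_t v, uint64_t p, uint64_t k, uint64_t size)
{
    assert(size > 0);
    /* no overflow protection: only meant for tiny sizes */
    return (p * v + k) % size;
}

/* Return 1 if scatter maps [0, size) onto [0, size) without collision. */
static int is_permutation(uint64_t p, uint64_t k, uint64_t size)
{
    unsigned char *seen = calloc((size_t) size, 1);
    uint64_t v;
    int ok = 1;

    if (seen == NULL)
        return 0;

    for (v = 0; v < size; v++)
    {
        uint64_t r = scatter(v, p, k, size);

        if (seen[r])
        {
            ok = 0;         /* two inputs share the same output */
            break;
        }
        seen[r] = 1;
    }

    free(seen);
    return ok;
}
```

Calling `is_permutation(122949829, 12345, 1000)` uses the first prime of the patch's table; a multiplier that shares a factor with the size, such as 10 with size 1000, is expected to fail the check.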
However, the implementation of the scatter operation in this patch
overflows in many cases when size is a 38-bit integer or larger, because
a value in [0, size) is multiplied by an element of the primes[] array,
which stores 27-29 bit integers. When overflow occurs, the scatter
operation is no longer bijective.
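To make the bit-count argument concrete: a 29-bit prime multiplied by a 38-bit value can need up to 67 bits, which does not fit in a 64-bit unsigned integer. A minimal sketch of a portable overflow test follows; `mul_overflows` and `checked_mul` are hypothetical names and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if x * y would not fit in 64 unsigned bits. */
static int mul_overflows(uint64_t x, uint64_t y)
{
    return x != 0 && y > UINT64_MAX / x;
}

/* Multiply, aborting through assert if the result would wrap around. */
static uint64_t checked_mul(uint64_t x, uint64_t y)
{
    assert(!mul_overflows(x, y));
    return x * y;
}
```

With the largest prime of the table, 413158523, the test reports no overflow for a value of 2^35 but does report one for 2^38.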
I did a sampling inspection with size set to 1099511627773 (a 40-bit
integer) and seed set to 5432. Even with very few samples, I found a
huge number of collisions, as shown below:
pr_perm(409749816, 1099511627773, 5432)
= pr_perm(491041141, 1099511627773, 5432)
= pr_perm(729171766, 1099511627773, 5432)
= pr_perm(849775914, 1099511627773, 5432)
= 22445482629,
pr_perm(45609644, 1099511627773, 5432)
= pr_perm(174005122, 1099511627773, 5432)
= pr_perm(811754941, 1099511627773, 5432)
= pr_perm(1131176891, 1099511627773, 5432)
= 10626826534,
and so on.
There are two ways to resolve this issue. The first is to reduce the
maximum value that can be set for size. The second is to add a modular
multiplication function that avoids overflow. I made a patch that
includes a function computing the modular multiplication without
overflow, and attached it to this mail.
(I also attached a simple test suite for the function I added.)
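An overflow-free modular product can be sketched with a double-and-add loop over the bits of one operand, which is the idea behind the interleaved algorithm used in the attached patch. The version below is an independent illustration rather than the patch's code; `addmod64` and `mulmod64` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* (a + b) % m without overflow, assuming a < m and b < m. */
static uint64_t addmod64(uint64_t a, uint64_t b, uint64_t m)
{
    /* a + b >= m is equivalent to a >= m - b, which cannot overflow */
    return (a >= m - b) ? a - (m - b) : a + b;
}

/* (x * y) % m for any 64-bit x and y, and m >= 1. */
static uint64_t mulmod64(uint64_t x, uint64_t y, uint64_t m)
{
    uint64_t r = 0;

    assert(m >= 1);
    x %= m;
    y %= m;

    /* process y from its lowest bit upwards */
    while (y > 0)
    {
        if (y & 1)
            r = addmod64(r, x, m);  /* r = (r + x) % m */
        x = addmod64(x, x, m);      /* x = (2 * x) % m */
        y >>= 1;
    }
    return r;
}
```

The loop runs once per significant bit of y, so it costs up to 64 iterations, which is why a fast path for operands whose plain product fits in 64 bits is worthwhile.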
For the other parts of the original patch, I could apply the patch and
run the tests without trouble; the documentation and comments are fine,
except for one comment in the function, shown below:
(1) scramble: partial xors on power-or-2 subsets
I could not understand its meaning when I first read it.
Best regards,
On 2018/07/28 15:03, Fabien COELHO wrote:
Hello,
This patch adds a pseudo-random permutation function to pgbench. It
allows to mix non uniform random keys to avoid trivial correlations
between neighboring values, hence between pages.
The function is a simplistic form of encryption adapted to any size,
using a few iterations of scramble and scatter phases. The result is not
cryptographically convincing, nor even statistically, but it is quite
inexpensive and achieves the desired result. A computation costs 0.22 µs
per call on my laptop, about three times the cost of a simple function.
Alternative designs, such as iterating over an actual encryption
function or using some sbox, would lead to much more costly solutions
and complex code.
I also join a few scripts I used for testing.
Attachments:
pgbench-prp-func-2.patch (text/plain; charset=UTF-8)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3..31ef39b 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -917,7 +917,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1371,6 +1371,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>1024.0</literal></entry>
</row>
<row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
+ <row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
<entry>uniformly-distributed random integer in <literal>[lb, ub]</literal></entry>
@@ -1532,6 +1539,21 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</para>
<para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not correlated.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ </para>
+
+ <para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index f7c56cc..762a629 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -366,6 +367,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -478,6 +482,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c..763bf6f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -986,6 +986,215 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/* 27-29 bits mega primes from https://primes.utm.edu/lists/small/millions/ */
+static int64 primes[PRP_PRIMES] = {
+ INT64CONST(122949829),
+ INT64CONST(141650963),
+ INT64CONST(160481219),
+ INT64CONST(179424691),
+ INT64CONST(198491329),
+ INT64CONST(217645199),
+ INT64CONST(236887699),
+ INT64CONST(256203221),
+ INT64CONST(275604547),
+ INT64CONST(295075153),
+ INT64CONST(314606891),
+ INT64CONST(334214467),
+ INT64CONST(353868019),
+ INT64CONST(373587911),
+ INT64CONST(393342743),
+ INT64CONST(413158523)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return largest mask in 0 .. n-1 */
+static uint64 compute_prp_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n >> 1;
+}
+
+/*
+ * Calculate (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * If x or y is greater than 2^32, improved interleaved modular
+ * multiplication algorithm is used to avoid overflow.
+ */
+static uint64 modular_multiplicate(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r = 0;
+
+ Assert(1 <= m);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplicated without overflow. */
+ if ((x | y) < (0xffffffff))
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm [1]
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead of a modular operation (R %= M).
+ *
+ * Reference
+ * [1] D.N. Amanor, et al, "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in Field Programmable
+ * Logic and Apllications, 2005. International Conference on,
+ * Aug 2005, pp. 539-542.
+ */
+
+ bits = 64;
+ while (bits > 0 && (y >> (64 - bits) | 0x1) == 0)
+ bits--;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > 0x7fffffffffffffff)
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ /* Calculate (r + x) without overflow using same
+ * transformations described in the above comment.
+ */
+ if (m > 0x7fffffffffffffff)
+ r = ((m - r) > x) ? r + x : r + x - m;
+ else
+ r = (r > m) ? r - m + x : r + x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/* Donald Knuth linear congruential generator */
+#define DK_LCG_MUL INT64CONST(6364136223846793005)
+#define DK_LCG_INC INT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * PLEASE DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 key = (uint64) seed;
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_prp_mask(size-1);
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on power-or-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ v = (modular_multiplicate((uint64)primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2319,6 +2528,26 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6983865..665c450 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2fc021d..0aec384 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -322,6 +322,14 @@ pgbench(
qr{command=96.: int 1\b}, # :scale
qr{command=97.: int 0\b}, # :client_id
qr{command=98.: int 5432\b}, # :random_seed
+ qr{command=99.: boolean true\b},
+ qr{command=100.: boolean true\b},
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=107.: boolean true\b},
+ qr{command=108.: boolean true\b},
+ qr{command=109.: boolean true\b},
],
'pgbench expressions',
{
@@ -447,6 +455,24 @@ SELECT :v0, :v1, :v2, :v3;
\set sc debug(:scale)
\set ci debug(:client_id)
\set rs debug(:random_seed)
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
}
});
@@ -731,6 +757,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 0,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 0,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index c1c2c1e..ff02cfb 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -290,6 +290,16 @@ my @script_tests = (
'too many arguments for hash',
[qr{unexpected number of arguments \(hash\)}],
{ 'bad-hash-2.sql' => "\\set i hash(1,2,3)\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Hello Hironobu-san,
I reviewed `pgbench-prp-func-1.patch` and found an incomplete implementation.
Indeed, thanks a lot for the debug, and the proposed fix!
I'm going to work a little bit more on the patch based on your proposed
changes, and submit a v3, hopefully soon.
--
Fabien.
Hello Hironobu-san,
However, the implementation of the scatter operation in this patch overflows
in many cases if the variable:size is 38 bit integer or greater. Because the
variable:size and the item of the array:primes[] which stores 27-29 bit
integers are multiplicated. If overflow occurs, the scatter operation does
not satisfy bijective.
Indeed. Again, thanks for the debugging! As you contributed some code, I added
you as a co-author in the CF entry.
Attached is a v3, based on your fix, plus some additional changes:
- explicitly declare unsigned variables where appropriate, to avoid casts
- use smaller 24-bit primes instead of 27-29 bit ones
- add a shortcut for a multiplier below 24 bits and a y value below 40 bits,
which should avoid the manually implemented multiplication in most
practical cases (tables with over 2^40 rows are pretty rare...).
- change the existing shortcut to look at the number of bits instead of
using 32-bit limits.
- add a test for minimal code coverage with sizes over 40 bits
- attempt to improve the documentation
- some comments were updated, hopefully for the better
The main idea behind the smaller primes is to avoid the expensive modmul
implementation in most realistic cases.
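The shortcut rests on a simple bound: if x has at most a bits and y has at most b bits with a + b <= 64, then x * y < 2^64 and the plain product cannot overflow. A minimal sketch under that assumption follows; `bit_length`, `product_fits` and `mulmod_small` are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of significant bits in n (0 for n == 0). */
static int bit_length(uint64_t n)
{
    int bits = 0;

    while (n != 0)
    {
        bits++;
        n >>= 1;
    }
    return bits;
}

/* Return 1 when x * y is guaranteed to fit in 64 bits. */
static int product_fits(uint64_t x, uint64_t y)
{
    return bit_length(x) + bit_length(y) <= 64;
}

/* (x * y) % m with the plain product; only valid when product_fits(). */
static uint64_t mulmod_small(uint64_t x, uint64_t y, uint64_t m)
{
    assert(m >= 1 && product_fits(x, y));
    return (x * y) % m;
}
```

The bound is sufficient but conservative: some products of a 24-bit and a 41-bit value still fit, yet a check of this kind would send them to the slower path. With 24-bit primes, any value below 2^40 passes the check.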
--
Fabien.
Attachments:
pgbench-prp-func-3.patch (text/x-diff)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..9b8e90e26f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -917,7 +917,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1370,6 +1370,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1531,6 +1538,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index f7c56cc6a3..762a62959b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -366,6 +367,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -478,6 +482,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c089..895e424452 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -986,6 +986,249 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return largest mask in 0 .. n-1 */
+static uint64 compute_prp_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n >> 1;
+}
+
+/*
+ * Count bits in integer representation
+ *
+ * There are SSE instructions & some compiler builtins for that,
+ * but they are not portable.
+ */
+static int popcount64(uint64 n)
+{
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+}
+
+/* length of n binary representation */
+static int nbits(uint64 n)
+{
+ /* set all lower bits to 1 */
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ /* then count them */
+ return popcount64(n);
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64 modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Performance shortcut for our 24 bit primes, ok for m up to 2^40 (~10^12) */
+ if ((x & UINT64CONST(0xffffff)) == x && (y & UINT64CONST(0xffffffffff)) == y)
+ return (x * y) % m;
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in International Conference on
+ * Field Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ /* Compute (r + x) without overflow using same
+ * transformations described in the above comment.
+ */
+ if (m > UINT64CONST(0x7fffffffffffffff))
+ r = ((m - r) > x) ? r + x : r + x - m;
+ else
+ r = (r > m) ? r - m + x : r + x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/* Donald Knuth linear congruential generator */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * PLEASE DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_prp_mask(size-1);
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2319,6 +2562,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6983865b92..665c450c2c 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2fc021dde7..7ea4914bf2 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -322,6 +322,15 @@ pgbench(
qr{command=96.: int 1\b}, # :scale
qr{command=97.: int 0\b}, # :client_id
qr{command=98.: int 5432\b}, # :random_seed
+ qr{command=99.: boolean true\b},
+ qr{command=100.: boolean true\b},
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=107.: boolean true\b},
+ qr{command=108.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
],
'pgbench expressions',
{
@@ -447,6 +456,28 @@ SELECT :v0, :v1, :v2, :v3;
\set sc debug(:scale)
\set ci debug(:client_id)
\set rs debug(:random_seed)
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
}
});
@@ -731,6 +762,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 0,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 0,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index c1c2c1e3d4..ff02cfb46b 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -290,6 +290,16 @@ my @script_tests = (
'too many arguments for hash',
[qr{unexpected number of arguments \(hash\)}],
{ 'bad-hash-2.sql' => "\\set i hash(1,2,3)\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Hello Hironobu-san,
Here is a v4, based on our out-of-list discussion:
- the mask function is factored out
- the popcount builtin is used if available
Attached a v3, based on your fix, plus some additional changes:
- explicitly declare unsigned variables where appropriate, to avoid casts
- use smaller 24 bits primes instead of 27-29 bits
- add a shortcut for multiplier below 24 bits and y value below 40 bits,
which should avoid the manually implemented multiplication in most
practical cases (tables with over 2^40 rows are pretty rare...).
- change the existing shortcut to look at the number of bits instead of
using 32-bit limits.
- add a test for minimal code coverage with over 40 bits sizes
- attempt to improve the documentation
- some comments were updated, hopefully for the better
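To illustrate the 24-bit / 40-bit shortcut listed above, here is a minimal sketch, not the patch code itself. The helper names fits_fast_path and mulmod_fast are hypothetical. The idea is simply that a value below 2^24 times a value below 2^40 stays below 2^64, so the plain 64-bit product cannot wrap.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: true when x fits in 24 bits
 * and y fits in 40 bits, so that x * y < 2^64 and cannot wrap around.
 */
static int
fits_fast_path(uint64_t x, uint64_t y)
{
    return (x >> 24) == 0 && (y >> 40) == 0;
}

/*
 * (x * y) % m using the plain 64-bit product.
 * The caller is expected to have checked fits_fast_path(x, y) first.
 */
static uint64_t
mulmod_fast(uint64_t x, uint64_t y, uint64_t m)
{
    return (x * y) % m;
}
```

With the 24-bit primes as x and a table row number as y, the slow path would only be reached for sizes above 2^40.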
--
Fabien.
Attachments:
pgbench-prp-func-4.patchtext/x-diff; name=pgbench-prp-func-4.patchDownload
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index eedaf12d69..d1bc983798 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -319,6 +319,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_POPCOUNTLL
+# -------------------------
+# Check if the C compiler understands __builtin_popcountll(),
+# and define HAVE__BUILTIN_POPCOUNTLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_POPCOUNTLL],
+[AC_CACHE_CHECK(for __builtin_popcountll, pgac_cv__builtin_popcountll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_popcountll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_popcountll=yes],
+[pgac_cv__builtin_popcountll=no])])
+if test x"$pgac_cv__builtin_popcountll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_POPCOUNTLL, 1,
+ [Define to 1 if your compiler understands __builtin_popcountll.])
+fi])# PGAC_C_BUILTIN_POPCOUNTLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index c6a44a9078..18850975db 100755
--- a/configure
+++ b/configure
@@ -13881,6 +13881,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_popcountll" >&5
+$as_echo_n "checking for __builtin_popcountll... " >&6; }
+if ${pgac_cv__builtin_popcountll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_popcountll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_popcountll=yes
+else
+ pgac_cv__builtin_popcountll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_popcountll" >&5
+$as_echo "$pgac_cv__builtin_popcountll" >&6; }
+if test x"$pgac_cv__builtin_popcountll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_POPCOUNTLL 1" >>confdefs.h
+
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
diff --git a/configure.in b/configure.in
index 3ada48b5f9..086c80e7ec 100644
--- a/configure.in
+++ b/configure.in
@@ -1431,6 +1431,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_POPCOUNTLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..9b8e90e26f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -917,7 +917,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1370,6 +1370,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1531,6 +1538,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows mixing the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index f7c56cc6a3..762a62959b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -366,6 +367,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -478,6 +482,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 41b756c089..3d3f3d0805 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -986,6 +986,237 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64 compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int nbits(uint64 n)
+{
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+#ifdef HAVE__BUILTIN_POPCOUNTLL
+ return __builtin_popcountll(n);
+#else /* use clever no branching bitwise operator version */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif /* HAVE__BUILTIN_POPCOUNTLL */
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64 modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Performance shortcut for our 24 bit primes, ok for m up to 2^40 (~10^12) */
+ if ((x & UINT64CONST(0xffffff)) == x && (y & UINT64CONST(0xffffffffff)) == y)
+ return (x * y) % m;
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in International Conference on
+ * Field Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ /* Compute (r + x) without overflow using same
+ * transformations described in the above comment.
+ */
+ if (m > UINT64CONST(0x7fffffffffffffff))
+ r = ((m - r) > x) ? r + x : r + x - m;
+ else
+ r = (r > m) ? r - m + x : r + x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/* Donald Knuth linear congruential generator */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size-1) >> 1;
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2319,6 +2550,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6983865b92..665c450c2c 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2fc021dde7..7ea4914bf2 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -322,6 +322,15 @@ pgbench(
qr{command=96.: int 1\b}, # :scale
qr{command=97.: int 0\b}, # :client_id
qr{command=98.: int 5432\b}, # :random_seed
+ qr{command=99.: boolean true\b},
+ qr{command=100.: boolean true\b},
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=107.: boolean true\b},
+ qr{command=108.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
],
'pgbench expressions',
{
@@ -447,6 +456,28 @@ SELECT :v0, :v1, :v2, :v3;
\set sc debug(:scale)
\set ci debug(:client_id)
\set rs debug(:random_seed)
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
}
});
@@ -731,6 +762,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 0,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 0,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index c1c2c1e3d4..ff02cfb46b 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -290,6 +290,16 @@ my @script_tests = (
'too many arguments for hash',
[qr{unexpected number of arguments \(hash\)}],
{ 'bad-hash-2.sql' => "\\set i hash(1,2,3)\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 4094e22776..d395b6ee32 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -745,6 +745,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_popcountll. */
+#undef HAVE__BUILTIN_POPCOUNTLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
Hi Fabien-san,
I reviewed 'pgbench-prp-func/pgbench-prp-func-4.patch'.
I could apply it and ran the TAP tests successfully. I could not find
any typos in the documentation or comments.
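The TAP tests only check sums of outputs and a few fixed values. As a complementary illustration, here is a rough sketch of an exhaustive bijection check on a single simplified scramble-and-scatter round. It is not the patch's pseudorandom_perm: the names smear, toy_round and is_permutation are hypothetical, only one round is applied, and sizes are assumed to be in [1, 8388617) so that the prime is coprime with the size and the product cannot overflow.

```c
#include <stdint.h>
#include <stdlib.h>

/* set every bit below the highest set bit of n */
static uint64_t
smear(uint64_t n)
{
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;
    return n;
}

/*
 * One simplified round (assumption: 1 <= size < 8388617):
 * xor the values that fit under a power-of-2 mask, then apply an affine
 * map modulo size with a prime multiplier.
 */
static uint64_t
toy_round(uint64_t v, uint64_t size, uint64_t key)
{
    uint64_t mask = smear(size - 1) >> 1;

    if (v <= mask)
        v ^= key & mask;        /* stays within 0 .. mask */
    return (v * UINT64_C(8388617) % size + key % size) % size;
}

/* return 1 if toy_round maps [0, size) onto itself without collision */
static int
is_permutation(uint64_t size, uint64_t key)
{
    unsigned char *seen = calloc(size, 1);
    uint64_t v;
    int ok = (seen != NULL);

    for (v = 0; ok && v < size; v++)
    {
        uint64_t out = toy_round(v, size, key);

        if (seen[out])          /* out < size is guaranteed by the modulo */
            ok = 0;
        seen[out] = 1;
    }
    free(seen);
    return ok;
}
```

Each step is invertible on [0, size), so their composition is too; the check merely confirms this by brute force on small sizes.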
To make sure, I checked the new routine which contains the
__builtin_popcountll() function on Linux + gcc 7.3.1 and I confirmed
that it works well.
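For reference, the kind of check described above could look like the following sketch. It compares the bitwise fallback from the patch with the compiler builtin; the function names popcount64_portable and popcount_agrees are hypothetical, and the builtin is assumed to be available only under GCC or Clang.

```c
#include <stdint.h>

/* portable bit count, using the same bit tricks as the patch's fallback */
static int
popcount64_portable(uint64_t n)
{
    n -= (n >> 1) & UINT64_C(0x5555555555555555);
    n = (n & UINT64_C(0x3333333333333333)) +
        ((n >> 2) & UINT64_C(0x3333333333333333));
    n = (n + (n >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    return (int) ((n * UINT64_C(0x0101010101010101)) >> 56);
}

/* hypothetical check: 1 if the fallback and the builtin agree on n */
static int
popcount_agrees(uint64_t n)
{
#if defined(__GNUC__) || defined(__clang__)
    return popcount64_portable(n) == __builtin_popcountll(n);
#else
    (void) n;
    return 1;                   /* no builtin to compare against */
#endif
}
```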
I think this patch is fine.
Best regards,
On 2018/09/16 21:05, Fabien COELHO wrote:
Hello Hironobu-san,
Here is a v4, based on our out-of-list discussion:
- the mask function is factored out
- the popcount builtin is used if available
Attached a v3, based on your fix, plus some additional changes:
- explicitly declare unsigned variables where appropriate, to avoid casts
- use smaller 24 bits primes instead of 27-29 bits
- add a shortcut for multiplier below 24 bits and y value below 40 bits,
which should avoid the manually implemented multiplication in most
practical cases (tables with over 2^40 rows are pretty rare...).
- change the existing shortcut to look at the number of bits instead of
using 32-bit limits.
- add a test for minimal code coverage with over 40 bits sizes
- attempt to improve the documentation
- some comments were updated, hopefully for the better
I reviewed 'pgbench-prp-func/pgbench-prp-func-4.patch'. [...] I think
this patch is fine.
Thanks!
You may consider turning it "ready" in the CF app, so as to see whether
some committer agrees with you.
--
Fabien.
On Wed, Sep 19, 2018 at 7:07 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
I reviewed 'pgbench-prp-func/pgbench-prp-func-4.patch'. [...] I think
this patch is fine.
modular_multiply() is an interesting device. I will leave this to
committers with a stronger mathematical background than me, but I have
a small comment in passing:
+#ifdef HAVE__BUILTIN_POPCOUNTLL
+ return __builtin_popcountll(n);
+#else /* use clever no branching bitwise operator version */
I think it is not enough for the compiler to have
__builtin_popcountll(). The CPU that this is eventually executed on
must actually have the POPCNT instruction[1] (or equivalent on ARM,
POWER etc), or the program will die with SIGILL. There are CPUs in
circulation produced in this decade that don't have it. I have
previously considered something like this[2], but realised you would
therefore need a runtime check. There are a couple of ways to do
that: see commit a7a73875 for one example, also
__builtin_cpu_supports(), and direct CPU ID bit checks (some
platforms). There is also the GCC "multiversion" system that takes
care of that for you but that worked only for C++ last time I checked.
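As a rough sketch of the runtime-check approach mentioned above, assuming GCC or Clang on x86: one variant is compiled with the POPCNT instruction enabled through the target attribute, and __builtin_cpu_supports() selects it at run time. The function names are hypothetical, and on other compilers or architectures the sketch simply falls back to the portable code.

```c
#include <stdint.h>

/* portable bit count, safe on any CPU */
static int
popcount64_portable(uint64_t n)
{
    n -= (n >> 1) & UINT64_C(0x5555555555555555);
    n = (n & UINT64_C(0x3333333333333333)) +
        ((n >> 2) & UINT64_C(0x3333333333333333));
    n = (n + (n >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    return (int) ((n * UINT64_C(0x0101010101010101)) >> 56);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_HW_POPCOUNT_VARIANT 1

/* only this function is allowed to use the POPCNT instruction */
__attribute__((target("popcnt")))
static int
popcount64_hw(uint64_t n)
{
    return __builtin_popcountll(n);
}
#endif

/* pick the hardware variant only when the running CPU reports POPCNT */
static int
popcount64_dispatch(uint64_t n)
{
#ifdef HAVE_HW_POPCOUNT_VARIANT
    if (__builtin_cpu_supports("popcnt"))
        return popcount64_hw(n);
#endif
    return popcount64_portable(n);
}
```

A real implementation would probably cache the CPU check in a function pointer rather than testing it on every call.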
[1]: https://en.wikipedia.org/wiki/SSE4#POPCNT_and_LZCNT
[2]: /messages/by-id/CAEepm=3g1_fjJGp38QGv+38BC2HHVkzUq6s69nk3mWLgPHqC3A@mail.gmail.com
--
Thomas Munro
http://www.enterprisedb.com
Hello Thomas,
modular_multiply() is an interesting device. I will leave this to
committers with a stronger mathematical background than me, but I have
a small comment in passing:
For testing this function, I have manually commented out the various
shortcuts so that the manual multiplication code is always used, and the
tests passed. I just did it again.
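Another way to cross-check a shift-and-add modular multiplication is to compare it with an exact 128-bit product. The sketch below is not the patch's modular_multiply, just a simplified interleaved variant under the assumption that the compiler provides unsigned __int128 (GCC and Clang on 64-bit targets); the names mulmod_ref, addmod and mulmod_interleaved are hypothetical.

```c
#include <stdint.h>

/* reference: exact (x * y) % m through a 128-bit product (compiler extension) */
static uint64_t
mulmod_ref(uint64_t x, uint64_t y, uint64_t m)
{
    return (uint64_t) (((unsigned __int128) x * y) % m);
}

/* (a + b) % m without overflow, requires a < m and b < m */
static uint64_t
addmod(uint64_t a, uint64_t b, uint64_t m)
{
    return (a >= m - b) ? a - (m - b) : a + b;
}

/* interleaved multiplication: consume one bit of y per step, high bit first */
static uint64_t
mulmod_interleaved(uint64_t x, uint64_t y, uint64_t m)
{
    uint64_t r = 0;
    int i;

    x %= m;
    for (i = 63; i >= 0; i--)
    {
        r = addmod(r, r, m);        /* r = (2 * r) % m */
        if ((y >> i) & 1)
            r = addmod(r, x, m);    /* r = (r + x) % m */
    }
    return r;
}
```

Comparing both functions on values close to 2^64 exercises exactly the cases where a naive (x * y) % m would wrap.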
+#ifdef HAVE__BUILTIN_POPCOUNTLL
+ return __builtin_popcountll(n);
+#else /* use clever no branching bitwise operator version */
I think it is not enough for the compiler to have
__builtin_popcountll(). The CPU that this is eventually executed on
must actually have the POPCNT instruction[1] (or equivalent on ARM,
POWER etc), or the program will die with SIGILL.
Hmmm, I'd be pretty surprised: The point of the builtin is to delegate to
the compiler the hassle of selecting the best option available, depending
on target hardware class: whether it issues a cpu/sse4 instruction is
not/should not be our problem.
There are CPUs in circulation produced in this decade that don't have
it.
Then the compiler, when generating code that is expected to run for a
large class of hardware which includes such old ones, should not use a
possibly unavailable instruction, or the runtime should take
responsibility for dynamically selecting a viable option.
My understanding is that it should always be safe to call __builtin_XYZ
functions when available. Then if you compile saying that you want code
specific to cpu X and then run it on cpu Y and it fails, basically you
just shot yourself in the foot.
I have previously considered something like this[2], but realised you
would therefore need a runtime check. There are a couple of ways to do
that: see commit a7a73875 for one example, also
__builtin_cpu_supports(), and direct CPU ID bit checks (some platforms).
There is also the GCC "multiversion" system that takes care of that for
you but that worked only for C++ last time I checked.
ISTM that the purpose of a dynamic check would be to have the better
hardware benefit even when compiling for a generic class of hardware which
may or may not have the better option.
This approach is fine for reaching a better performance/portability
compromise, but I do not think that it is worth going to this level of
optimization for pgbench, unless someone else takes care of it, hence the
compiler builtin.
An interesting side effect of your mail is that while researching a bit on
the subject I noticed __builtin_clzll() which helps improve the nbits code
compared to using popcount. The attached patch uses CLZ instead of POPCOUNT.
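To make the difference concrete, here is a small sketch of the two ways of computing the bit length of a value. The names fill_low_bits, nbits_popcount and nbits_clz are illustrative and not taken from the patch, and both builtins are assumed to be available (GCC or Clang). Note that __builtin_clzll(0) is undefined, hence the explicit test for zero.

```c
#include <stdint.h>

/* Set every bit below the highest set bit of n. */
static uint64_t
fill_low_bits(uint64_t n)
{
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;
    return n;
}

/* Bit length through popcount: fill the low bits first, then count them. */
static int
nbits_popcount(uint64_t n)
{
    return __builtin_popcountll(fill_low_bits(n));
}

/* Bit length through count-leading-zeros: a single builtin call. */
static int
nbits_clz(uint64_t n)
{
    return (n != 0) ? 64 - __builtin_clzll(n) : 0;
}
```

Both functions return the same value for every input; the CLZ variant just skips the six shift-and-or steps.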
--
Fabien.
Attachments:
pgbench-prp-func-5.patchtext/x-diff; name=pgbench-prp-func-5.patchDownload
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index eedaf12d69..bca110ed0c 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -319,6 +319,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index 23ebfa8f3d..13ba5bc093 100755
--- a/configure
+++ b/configure
@@ -13923,6 +13923,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
diff --git a/configure.in b/configure.in
index 530f275993..b9de9346a1 100644
--- a/configure.in
+++ b/configure.in
@@ -1458,6 +1458,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 88cf8b3933..9b8e90e26f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -917,7 +917,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1370,6 +1370,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1531,6 +1538,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows mixing the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index f7c56cc6a3..762a62959b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -366,6 +367,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -478,6 +482,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 7576e4cfae..d9738008f4 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1024,6 +1024,237 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64 compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif /* HAVE__BUILTIN_CLZLL */
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64 modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Performance shortcut for our 24 bit primes, ok for m up to ~10E12 */
+ if ((x & UINT64CONST(0xffffff)) == x && (y & UINT64CONST(0xffffffffff)) == y)
+ return (x * y) % m;
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in International Conference on
+ * Field Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ /* Compute (r + x) without overflow using same
+ * transformations described in the above comment.
+ */
+ if (m > UINT64CONST(0x7fffffffffffffff))
+ r = ((m - r) > x) ? r + x : r + x - m;
+ else
+ r = (r > m) ? r - m + x : r + x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/* Donald Knuth linear congruential generator */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size-1) >> 1;
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2357,6 +2588,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 6983865b92..665c450c2c 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2fc021dde7..7ea4914bf2 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -322,6 +322,15 @@ pgbench(
qr{command=96.: int 1\b}, # :scale
qr{command=97.: int 0\b}, # :client_id
qr{command=98.: int 5432\b}, # :random_seed
+ qr{command=99.: boolean true\b},
+ qr{command=100.: boolean true\b},
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=107.: boolean true\b},
+ qr{command=108.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
],
'pgbench expressions',
{
@@ -447,6 +456,28 @@ SELECT :v0, :v1, :v2, :v3;
\set sc debug(:scale)
\set ci debug(:client_id)
\set rs debug(:random_seed)
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
}
});
@@ -731,6 +762,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 0,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 0,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index c1c2c1e3d4..ff02cfb46b 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -290,6 +290,16 @@ my @script_tests = (
'too many arguments for hash',
[qr{unexpected number of arguments \(hash\)}],
{ 'bad-hash-2.sql' => "\\set i hash(1,2,3)\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 5d4079609c..c0c26386cc 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -748,6 +748,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
On Wed, Sep 26, 2018 at 8:26 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
modular_multiply() is an interesting device. I will leave this to
committers with a stronger mathematical background than me, but I have
a small comment in passing:
For testing this function, I have manually commented out the various
shortcuts so that the manual multiplication code is always used, and the
tests passed. I just did it again.
+#ifdef HAVE__BUILTIN_POPCOUNTLL
+ return __builtin_popcountll(n);
+#else /* use clever no branching bitwise operator version */
I think it is not enough for the compiler to have
__builtin_popcountll(). The CPU that this is eventually executed on
must actually have the POPCNT instruction[1] (or equivalent on ARM,
POWER etc), or the program will die with SIGILL.
Hmmm, I'd be pretty surprised: The point of the builtin is to delegate to
the compiler the hassle of selecting the best option available, depending
on target hardware class: whether it issues a cpu/sse4 instruction is
not/should not be our problem.
True, it selects based on what you tell it:
$ cat x.c
#include <stdio.h>
#include <stdlib.h>
int
main(int argc, char *argv[])
{
    printf("%d\n", __builtin_popcount(atoi(argv[1])));
}
$ gdb -q a.out
...
(gdb) disass main
...
0x00000000004005a4 <+39>: callq 0x4005c0 <__popcountdi2>
...
$ gcc -mpopcnt x.c
$ gdb -q a.out
...
(gdb) disass main
...
0x000000000040059f <+34>: popcnt %eax,%eax
...
I'd forgotten one detail... we figure out whether we need to tell it
explicitly that we have SSE4.2 instructions in the configure script:
checking which CRC-32C implementation to use... SSE 4.2 with runtime check
We enable -msse4.2 just on the one file that needs it, in src/port/Makefile:
pg_crc32c_sse42.o: CFLAGS+=$(CFLAGS_SSE42)
On this CentOS machine I see:
gcc ... -msse4.2 ... -c pg_crc32c_sse42.c -o pg_crc32c_sse42_srv.o
That is necessary because most people consume PostgreSQL through
packages from distributions that have to work on an Athlon II or
whatever, so we can't just use -msse4.2 for every translation unit.
So it becomes our job to isolate small bits of code that use newer
instructions, if it's really worth the effort to do that, and supply
our own runtime checks and provide a fallback.
There are CPUs in circulation produced in this decade that don't have
it.
Then the compiler, when generating code that is expected to run for a
large class of hardware which includes such old ones, should not use a
possibly unavailable instruction, or the runtime should take
responsibility for dynamically selecting a viable option.
Right. Our problem is that if we didn't do our own runtime testing,
users of (say) Debian and RHEL packages (= most users of PostgreSQL)
would effectively never be able to use things like CRC32 instructions.
None of that seems worth it for something like this.
An interesting side effect of your mail is that while researching a bit on
the subject I noticed __builtin_clzll() which helps improve the nbits code
compared to using popcount. The attached patch uses CLZ instead of POPCOUNT.
It's annoying that fls() didn't make it into POSIX along with ffs().
On my system it compiles to a BSR instruction, just like
__builtin_clz().
--
Thomas Munro
http://www.enterprisedb.com
Hello,
That is necessary because most people consume PostgreSQL through
packages from distributions that have to work on an Athlon II or
whatever, so we can't just use -msse4.2 for every translation unit.
So it becomes our job to isolate small bits of code that use newer
instructions, if it's really worth the effort to do that, and supply
our own runtime checks and provide a fallback.
Ok. That was my understanding so as to improve the portability/performance
compromise. I do not think that pgbench is worth the effort on this
particular point.
[...] None of that seems worth it for something like this.
Indeed.
So, am I right in deducing that you are satisfied with the current status
of the patch, with the nbits implementation either based on popcount (v4)
or clz (v5) compiler intrinsics? I think that the clz option is better.
--
Fabien.
On Wed, Sep 26, 2018 at 01:20:49PM +0200, Fabien COELHO wrote:
So, am I right in deducing that you are satisfied with the current status of
the patch, with the nbits implementation either based on popcount (v4) or
clz (v5) compiler intrinsics? I think that the clz option is better.
Fabien, please note that v5 does not apply, so I moved this patch to
next CF, waiting on author.
--
Michael
PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: pgbench - add pseudo-random permutation function
On Wed, Sep 26, 2018 at 01:20:49PM +0200, Fabien COELHO wrote:
So, am I right in deducing that you are satisfied with the current status of
the patch, with the nbits implementation either based on popcount (v4) or
clz (v5) compiler intrinsics? I think that the clz option is better.
Fabien, please note that v5 does not apply,
Indeed, tests interact with 92a0342 committed on Thursday.
Here is a rebase of the latest version based on CLZ. According to the basic
tests I made, the underlying hardware instruction seems to be more widely
available.
so I moved this patch to next CF, waiting on author.
I'm going to change its state to "Needs review".
--
Fabien.
Attachments:
pgbench-prp-func-6.patchtext/x-diff; name=pgbench-prp-func-6.patchDownload
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index fb58c94d4b..e3cff88bc2 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -312,6 +312,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index 6414ec1ea6..0f6a549a41 100755
--- a/configure
+++ b/configure
@@ -13921,6 +13921,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
diff --git a/configure.in b/configure.in
index 158d5a1ac8..231cf27cfb 100644
--- a/configure.in
+++ b/configure.in
@@ -1458,6 +1458,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 8c464c9d42..265a91df1d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -917,7 +917,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1377,6 +1377,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1538,6 +1545,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows mixing the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index bab6f8a95c..94fcebb77a 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 436764b9c9..4febde1b93 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1066,6 +1066,237 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64 compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif /* HAVE__BUILTIN_CLZLL */
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64 modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Performance shortcut for our 24 bit primes, ok for m up to ~10E12 */
+ if ((x & UINT64CONST(0xffffff)) == x && (y & UINT64CONST(0xffffffffff)) == y)
+ return (x * y) % m;
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in International Conference on
+ * Field Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ /* Compute (r + x) without overflow using same
+ * transformations described in the above comment.
+ */
+ if (m > UINT64CONST(0x7fffffffffffffff))
+ r = ((m - r) > x) ? r + x : r + x - m;
+ else
+ r = (r > m) ? r - m + x : r + x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/* Donald Knuth linear congruential generator */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size-1) >> 1;
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-or-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2420,6 +2651,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index de50340434..12d5c2946b 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index d972903f2a..e141224c80 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -323,6 +323,13 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
],
'pgbench expressions',
{
@@ -450,6 +457,28 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
}
});
@@ -740,6 +769,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 0,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 0,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 696c378edc..7c03ef2bbc 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 90dda8ea05..39b97863d1 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -731,6 +731,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
Hi,
I reviewed pgbench-prp-func-6.patch.
(1) modular_multiply()
In modular_multiply(), the number of digits of the inputs is checked to
decide whether to use the interleaved modular multiplication algorithm.
However, that check depends entirely on the implementation of
pseudorandom_perm(), even though modular_multiply() is a general-purpose
function.
I think it is better for pseudorandom_perm() to check the condition
directly: the check can then be done efficiently there, and
modular_multiply() can be used for general purposes.
(2) pseudorandom_perm() and 001_pgbench_with_server.pl
Reading the discussion of the __builtin_xxx functions, I remembered another
overflow issue.
pseudorandom_perm() uses Donald Knuth's linear congruential generator
to obtain pseudo-random numbers. This method, like all linear
congruential generators, always overflows.
If we did not need to guarantee the portability of this code, we would not
need to care about the result of this method, because we only use *pseudo*
random numbers. (In fact, I ignored it before.) However, since we have to
guarantee portability, we have to calculate it accurately. I
therefore implemented the function dk_lcg() and rewrote pseudorandom_perm().
Just to be sure, I wrote a Python script to check the algorithm of
pseudorandom_perm() and ran it.
Fortunately, my Python script and the pseudorandom_perm() I rewrote
returned the same answers, so I updated the expected values in
001_pgbench_with_server.pl.
I attached the new patch `pgbench-prp-func-7.patch`, the Python script
`pseudorandom_perm.py`, and the pr_perm check script
`pr_perm_check.sql`.
Best regards,
On 2018/10/01 12:19, Fabien COELHO wrote:
PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: pgbench - add pseudo-random permutation function
On Wed, Sep 26, 2018 at 01:20:49PM +0200, Fabien COELHO wrote:
So, am I right in deducing that you are satisfied with the current
status of the patch, with the nbits implementation either based on
popcount (v4) or clz (v5) compiler intrinsics? I think that the clz
option is better.
Fabien, please note that v5 does not apply,
Indeed, tests interact with 92a0342 committed on Thursday.
Here is a rebase of the latest version based on CLZ. According to a basic
test I made, the underlying hardware instruction seems to be more often
available.
so I moved this patch to next CF, waiting on author.
I'm going to change its state to "Needs review".
Attachments:
pgbench-prp-func-7.patch (text/plain; charset=UTF-8)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index af2dea1..5b09f73 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -327,6 +327,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index b7250d7..a6dc740 100755
--- a/configure
+++ b/configure
@@ -13951,6 +13951,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
+fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
if ${pgac_cv__builtin_constant_p+:} false; then :
diff --git a/configure.in b/configure.in
index de5f777..e872875 100644
--- a/configure.in
+++ b/configure.in
@@ -1485,6 +1485,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 8c464c9..265a91d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -917,7 +917,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1378,6 +1378,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>1024.0</literal></entry>
</row>
<row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
+ <row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
<entry>uniformly-distributed random integer in <literal>[lb, ub]</literal></entry>
@@ -1539,6 +1546,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</para>
<para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
+ <para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index bab6f8a..94fcebb 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 436764b..28bde41 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1066,6 +1066,254 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64 compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif /* HAVE__BUILTIN_CLZLL */
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64 modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in International Conference on
+ * Field Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ /* Compute (r + x) without overflow using same
+ * transformations described in the above comment.
+ */
+ if (m > UINT64CONST(0x7fffffffffffffff))
+ r = ((m - r) > x) ? r + x : r + x - m;
+ else
+ r = (r > m) ? r + x - m : r + x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/* Donald Knuth linear congruential generator */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* Calculate (key * DK_LCG_MUL + DK_LCG_INC) % 2^64 without overflow.
+ */
+static uint64 dk_lcg(uint64 key)
+{
+ uint64 r = modular_multiply(key, DK_LCG_MUL, UINT64CONST(0xffffffffffffffff));
+
+ if ((UINT64CONST(0xffffffffffffffff) - r) > DK_LCG_INC)
+ r += DK_LCG_INC;
+ else
+ r -= UINT64CONST(0xffffffffffffffff) - DK_LCG_INC;
+
+ return r;
+}
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size-1) >> 1;
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-or-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = dk_lcg(key);
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = dk_lcg(key);
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = dk_lcg(key);
+ /* To avoid overflow in `primes[p] * v`, we check the number of
+ * digits of v and decide whether to use the modular_multiply function.
+ * Note that primes[p] is 24 bit.
+ */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2420,6 +2668,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index de50340..12d5c29 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index d972903..5fdf08f 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -323,6 +323,13 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
],
'pgbench expressions',
{
@@ -450,6 +457,28 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 204851330149484 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 96596601529320 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 108116869098154)
}
});
@@ -740,6 +769,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 0,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 0,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 696c378..7c03ef2 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9798bd2..4cf3750 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -734,6 +734,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
Hello Hironobu-san,
I reviewed pgbench-prp-func-6.patch.
Thanks.
(1) modular_multiply()
In modular_multiply(), the numbers of digits of inputs are checked in order
to determine using the interleaved modular multiplication algorithm or not.
However, the check condition absolutely depends on the implementation of
pseudorandom_perm() even though modular_multiply() is a general function.
I think it is better that pseudorandom_perm() directly checks the condition
because it can be worked efficiently and modular_multiply() can be used in
general purpose.
You moved the shortcut to the caller. Why not, it removes one modulo
operation and avoids the call altogether.
(2) pseudorandom_perm() and 001_pgbench_with_server.pl
Reading the discussion of __builtin_xxx functions, I remembered another
overflow issue.
pseudorandom_perm() uses the Donald Knuth's linear congruential generator
method to obtain pseudo-random numbers. This method, not only this but also
all linear congruential generator methods, always overflows.
If we do not need to guarantee the portability of this code,
ISTM that we do not need such changes: the code is already portable
because standard C unsigned operations overflow consistently, and this is
what is really expected for the linear congruential generator.
If an alternate implementation were to be provided, given the heavy cost of
the modular multiplication function, it should be only for those
architectures which fail, detected at configure time. I would not go into
this without an actual failing architecture & C compiler.
Also, the alternate implementation should not change the result, so
something looks amiss in your version. Moreover, it is unclear to me how to
implement an overflowing multiply with the safe no-overflow version.
Here is a v8 which:
- moves the shortcut to the caller
- changes the r formula as you did
- does nothing about Knuth's formula, as nothing should be needed
- fixes tests after Peter's exit status changes
--
Fabien.
Attachments:
pgbench-prp-func-8.patch (text/x-diff)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index af2dea1c2a..5b09f73d26 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -327,6 +327,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index b7250d7f5b..a6dc740fee 100755
--- a/configure
+++ b/configure
@@ -13950,6 +13950,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
diff --git a/configure.in b/configure.in
index de5f777333..e872875293 100644
--- a/configure.in
+++ b/configure.in
@@ -1485,6 +1485,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62a33..77cbfcd097 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -929,7 +929,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1389,6 +1389,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1550,6 +1557,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows mixing the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index bab6f8a95c..94fcebb77a 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 81bc6d8a6e..e216e041f2 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1066,6 +1066,243 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64 compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif /* HAVE__BUILTIN_CLZLL */
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64 modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in International Conference on
+ * Field Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ /* Compute (r + x) without overflow using same
+ * transformations described in the above comment.
+ */
+ if (m > UINT64CONST(0x7fffffffffffffff))
+ r = ((m - r) > x) ? r + x : r + x - m;
+ else
+ r = (r > m) ? r + x - m : r + x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size-1) >> 1;
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for our 24 bit primes, ok for size up to ~10E12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2420,6 +2657,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index de50340434..12d5c2946b 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 0081989026..6ebbcb73cf 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -323,6 +323,13 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
],
'pgbench expressions',
{
@@ -450,6 +457,28 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
}
});
@@ -740,6 +769,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 696c378edc..7c03ef2bbc 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9798bd24b4..4cf375085d 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -734,6 +734,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
Hello,
Also, the alternate implementation should not change the result, so
something looks amiss in your version. Moreover, I'm unclear how to
implement an overflowing multiply with the safe, non-overflowing version.
(snip)
I made an honest mistake: I had assumed that the modulus of Knuth's LCG
was (2^64 - 1).
BTW, I found another overflow issue.
In pseudorandom_perm(), `modular_multiply() + (key >> LCG_SHIFT)` may
overflow if the result of modular_multiply() is large. Therefore, I've
improved it.
Also, I've simplified Step 5 in modular_multiply().
I attached pgbench-prp-func-9.patch.
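For reference, a generic way to add two values modulo m without ever forming
a sum that could wrap is sketched below. It is an alternative formulation
for illustration, not the code in the attached patch, and add_mod is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* (a + b) % m without intermediate overflow; requires m >= 1. */
static uint64_t
add_mod(uint64_t a, uint64_t b, uint64_t m)
{
	assert(m >= 1);
	a %= m;
	b %= m;
	/* a + b >= m exactly when a >= m - b, so subtract before adding */
	return (a >= m - b) ? a - (m - b) : a + b;
}
```

Because both operands are reduced first, neither branch can exceed m - 1,
whatever the magnitude of the inputs.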
Best regards,
Attachments:
pgbench-prp-func-9.patch (text/plain; charset=UTF-8)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index af2dea1..5b09f73 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -327,6 +327,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index 43ae8c8..c455505 100755
--- a/configure
+++ b/configure
@@ -13952,6 +13952,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
+fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
if ${pgac_cv__builtin_constant_p+:} false; then :
diff --git a/configure.in b/configure.in
index 519ecd5..f7eb78a 100644
--- a/configure.in
+++ b/configure.in
@@ -1486,6 +1486,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62..77cbfcd 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -929,7 +929,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1390,6 +1390,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>1024.0</literal></entry>
</row>
<row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
+ <row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
<entry>uniformly-distributed random integer in <literal>[lb, ub]</literal></entry>
@@ -1551,6 +1558,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</para>
<para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows mixing the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
+ <para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index bab6f8a..94fcebb 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 81bc6d8..7322291 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1066,6 +1066,253 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64 compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif /* HAVE__BUILTIN_CLZLL */
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64 modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in International Conference on
+ * Field Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /* To calculate (r + x) without overflow,
+ * transform to (r + x) % m, and further
+ * transform to (r + x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size-1) >> 1;
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for our 24 bit primes, ok for size up to ~10E12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ {
+ v = modular_multiply(primes[p], v, size);
+ if (v > UINT64CONST(0xffffffffffffffff) - (key >> LCG_SHIFT))
+ /* Even if `size` is subtracted for avoiding overflow,
+ * the result is the same because of modulo operation.
+ */
+ v = (v + (key >> LCG_SHIFT) - size) % size;
+ else
+ v = (v + (key >> LCG_SHIFT)) % size;
+ }
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2420,6 +2667,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index de50340..12d5c29 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 0081989..6ebbcb7 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -323,6 +323,13 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
],
'pgbench expressions',
{
@@ -450,6 +457,28 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
}
});
@@ -740,6 +769,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 696c378..7c03ef2 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9798bd2..4cf3750 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -734,6 +734,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
Hello Hironobu-san,
In pseudorandom_perm(), `modular_multiply() + (key >> LCG_SHIFT)` may
overflow if the result of modular_multiply() is large. Therefore, I've
improved it.
Also, I've simplified Step 5 in modular_multiply().
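For readers following the thread, the interleaved (double-and-add) scheme under discussion can be sketched roughly as below. This is a standalone illustration, not the code from the patch: the name `mulmod_sketch` is hypothetical, and it assumes m < 2^63 (which holds when the modulus comes from a positive signed 64-bit value), so that neither the doubling nor the addition can wrap around.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: compute (x * y) % m with
 * the double-and-add scheme, scanning the bits of y from high to low.
 *
 * Assumption: 1 <= m < 2^63.  Since r < m and x < m at every step,
 * both (r << 1) and (r + x) stay below 2^64 and cannot overflow.
 */
static uint64_t
mulmod_sketch(uint64_t x, uint64_t y, uint64_t m)
{
	uint64_t	r = 0;
	int			i;

	assert(m >= 1 && m <= (uint64_t) INT64_MAX);

	x %= m;
	y %= m;

	for (i = 63; i >= 0; i--)
	{
		/* step 3 of the algorithm: R = 2 * R, reduced at once */
		r = (r << 1) % m;

		/* steps 4-5: add X when bit i of Y is set, then reduce */
		if ((y >> i) & 1)
			r = (r + x) % m;
	}

	return r;
}
```

With a modulus that may use all 64 bits, as modular_multiply() allows, the two intermediate values need the extra overflow handling discussed in this thread.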
Attached is a v10, where I have:
- updated some comments
- put back the simple formula and added a comment about it: the +
  cannot overflow because size is taken from a signed int, and the
  added value is small thanks to the shift
- added a few test cases, and fixed the associated checks
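
The overflow argument can be checked with a few lines of C. This is only a sketch of the worst case: LCG_SHIFT mirrors the constant in the patch, while the helper name `worst_case_sum_fits` is made up for the illustration.

```c
#include <stdint.h>

#define LCG_SHIFT 13

/*
 * Hypothetical check, for illustration only: does the largest possible
 * value of modular_multiply(...) + (key >> LCG_SHIFT) fit in a uint64?
 */
static int
worst_case_sum_fits(void)
{
	/* the product is reduced modulo size, and size <= INT64_MAX */
	uint64_t	mmv_max = (uint64_t) INT64_MAX - 1;

	/* key >> 13 keeps at most 51 bits, so it is below 2^51 */
	uint64_t	ks_max = UINT64_MAX >> LCG_SHIFT;

	/* a + b wraps around iff a > UINT64_MAX - b */
	return mmv_max <= UINT64_MAX - ks_max;
}
```

The function returns 1: the sum stays below 2^63 + 2^51, far from 2^64.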
--
Fabien.
Attachments:
pgbench-prp-func-10.patch (text/x-diff)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index af2dea1c2a..5b09f73d26 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -327,6 +327,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index 43ae8c869d..c455505ccf 100755
--- a/configure
+++ b/configure
@@ -13951,6 +13951,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
diff --git a/configure.in b/configure.in
index 519ecd5e1e..f7eb78aef9 100644
--- a/configure.in
+++ b/configure.in
@@ -1486,6 +1486,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62a33..77cbfcd097 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -929,7 +929,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1389,6 +1389,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1550,6 +1557,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index bab6f8a95c..94fcebb77a 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 81bc6d8a6e..747565ac54 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1066,6 +1066,249 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64 compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif /* HAVE__BUILTIN_CLZLL */
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64 modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i, bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+ x = y;
+ y = tmp;
+ }
+
+ /* Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for
+ * modular multiplication on FPGAs", in International Conference on
+ * Field Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit
+ * design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /* To avoid overflow, transform from (2 * r) to
+ * (2 * r) % m, and further transform to
+ * mathematically equivalent form shown below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /* To compute (r + x) without overflow,
+ * transform to (r + x) % m, and further
+ * transform to (r + x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size-1) >> 1;
+
+ unsigned int i, p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /* apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for our 24 bit primes, ok for size up to ~10E12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /* note: the + below cannot overflow because size is under 63 bits
+ * as mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1 << 63) - 1
+ * and ks = key >> LCG_SHIFTS <= 2^51
+ * then mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2420,6 +2663,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index de50340434..12d5c2946b 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 0081989026..30c415c43f 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -323,6 +323,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -450,6 +461,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -740,6 +782,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 696c378edc..7c03ef2bbc 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9798bd24b4..4cf375085d 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -734,6 +734,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
Hi Fabien-san,
I reviewed 'pgbench-prp-func/pgbench-prp-func-10.patch'.
On 2018/10/24 12:55, Fabien COELHO wrote:
Hello Hironobu-san,
In pseudorandom_perm(), `modular_multiply() + (key >> LCG_SHIFT)` may
overflow if the result of modular_multiply() is large. Therefore, I've
improved it. Also, I've simplified Step 5 in modular_multiply().
Attached is a v10, where I have:
- updated some comments
- the + cannot overflow because size is taken from a signed int
  and the added value is small thanks to the shift.
  I have put back the simple formula and added a comment about it.
- added a few test cases, and fix the associated checks
I agree with your explanation above.
I checked the tests you added in this patch and confirmed that there
is no problem.
I think this patch is fine.
Best regards,
I think this patch is fine.
Thanks!
Hopefully some committer will pick it up at some point.
--
Fabien.
Can you please pgindent this?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Alvaro,
Can you please pgindent this?
Hmmm. After some investigation, I installed "pg_bsd_indent" and ran
the "pgindent" script, which reindented far more than the patch... So I
picked out the patch-related changes and integrated them manually, except
for the comment changes that would break the layout of the algorithm
descriptions. I have not found how to tell pgindent to leave comment
indentation alone.
Here is the result for the code, and for part of the comments.
--
Fabien.
Attachments:
pgbench-prp-func-11.patch (text/x-diff)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index af2dea1c2a..5b09f73d26 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -327,6 +327,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index 43ae8c869d..c455505ccf 100755
--- a/configure
+++ b/configure
@@ -13951,6 +13951,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
diff --git a/configure.in b/configure.in
index 519ecd5e1e..f7eb78aef9 100644
--- a/configure.in
+++ b/configure.in
@@ -1486,6 +1486,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62a33..77cbfcd097 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -929,7 +929,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1389,6 +1389,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1550,6 +1557,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index bab6f8a95c..94fcebb77a 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 81bc6d8a6e..7fcb2e4334 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1066,6 +1066,260 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else
+ /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form shown
+ * below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*
+ * To compute (r + x) without overflow:
+ * transform to (r + x) % m, and further
+ * transform to (r + x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: e.g.
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1: ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ unsigned int i,
+ p;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*
+ * Apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok while v fits in 40 bits (~1E12) */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*
+ * Note: the + below cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2420,6 +2674,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index de50340434..12d5c2946b 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 0081989026..30c415c43f 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -323,6 +323,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -450,6 +461,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -740,6 +782,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 696c378edc..7c03ef2bbc 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9798bd24b4..4cf375085d 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -734,6 +734,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
On 2018-Oct-24, Fabien COELHO wrote:
Hello Alvaro,
Can you please pgindent this?
Hmmm. After some investigation, I installed some "pg_bsd_indent" and ran the
"pgindent" script, which reindented far more than the patch... So I picked
up the patch-related changes and integrated them manually,
Cool, thanks.
although not comment changes which break the logic of the algorithm
descriptions. I have not found how to tell pgindent to let comments
indentation alone.
You can use /*----- for such comments.
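For illustration, here is a minimal sketch of what such a comment might look like. The function name nbits_sketch and the table are made up for this example and are not taken from the patch. The dashes after the opening marker ask pgindent to leave the line breaks and indentation of an indented comment block as written:

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical helper, for illustration only: length of n in bits */
static int
nbits_sketch(uint64_t n)
{
    int bits = 0;

    /*-----
     * Without the dashes, pgindent would reflow this table:
     *
     *   n      binary      result
     *   0      0           0
     *   5      101         3
     *   255    11111111    8
     */
    while (n != 0)
    {
        bits++;
        n >>= 1;
    }

    /* a 64-bit value never needs more than 64 bits */
    assert(bits >= 0 && bits <= 64);
    return bits;
}
```

Comment blocks starting in column 1 are, as far as I know, left alone by pgindent anyway, so the marker mostly matters for comments inside function bodies.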
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Oct 24, 2018 at 06:00:08PM -0300, Alvaro Herrera wrote:
On 2018-Oct-24, Fabien COELHO wrote:
although not comment changes which break the logic of the algorithm
descriptions. I have not found how to tell pgindent to let comments
indentation alone.
You can use /*----- for such comments.
A recent example of that is what d55241af has done in
pg_verify_checksums.c.
--
Michael
Hello Alvaro,
although not comment changes which break the logic of the algorithm
descriptions. I have not found how to tell pgindent to let comments
indentation alone.
You can use /*----- for such comments.
Thanks for the hint. Here is an updated patch using this marker.
I noticed that some instances use a closing "*-----" sequence, but it does
not seem useful, so I left it out.
I have also tried to improve a few comments, and moved a declaration into
a loop because I did not like the pg-indented version much.
--
Fabien.
Attachments:
pgbench-prp-func-12.patch (text/x-diff)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index af2dea1c2a..5b09f73d26 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -327,6 +327,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index 43ae8c869d..c455505ccf 100755
--- a/configure
+++ b/configure
@@ -13951,6 +13951,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
diff --git a/configure.in b/configure.in
index 519ecd5e1e..f7eb78aef9 100644
--- a/configure.in
+++ b/configure.in
@@ -1486,6 +1486,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index b5e3a62a33..77cbfcd097 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -929,7 +929,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1389,6 +1389,13 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1550,6 +1557,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows mixing the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ but beware that it is not at all cryptographically secure.
+ Compared to the <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are neither collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function raises an error if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index bab6f8a95c..94fcebb77a 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 81bc6d8a6e..220cb2ce16 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1066,6 +1066,255 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else
+ /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Reduce operands first: (x * y) % m = ((x % m) * (y % m)) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*-----
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided.
+ * Steps 6 and 7 can instead be a single modular operation (R %= M).
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form below:
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*-----
+ * To compute (r + x) without overflow: transform to
+ * (r + x) % m, and then to (r + x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: e.g.
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok while v fits in 40 bits (~1E12) */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2420,6 +2669,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index de50340434..12d5c2946b 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 0081989026..30c415c43f 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -323,6 +323,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -450,6 +461,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -740,6 +782,10 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
+ ],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
],);
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 696c378edc..7c03ef2bbc 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 9798bd24b4..4cf375085d 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -734,6 +734,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
On Thu, Oct 25, 2018 at 08:21:27AM +0200, Fabien COELHO wrote:
Thanks for the hint. Here is an updated patch using this marker.
I noticed that some instances use a closing "*-----" sequence, but it does
not seem useful, so I left it out.
I have also tried to improve a few comments, and moved a declaration into a
loop because I did not like the pg-indented version much.
This patch has been marked as ready for committer for some time now, and it
has rotted, so I am moving it to the next CF, waiting on author.
--
Michael
I updated the patch. I also added some explanations and simple
examples to the modular_multiply function.
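As a rough illustration of the interleaved technique, the loop can be sketched as follows. This is not the patch's code: addmod_sketch and mulmod_sketch are names invented here, and the fast path for small operands is left out. The sketch processes one bit of y per iteration and keeps r below m at every step, so neither the doubling nor the addition can overflow:

```c
#include <assert.h>
#include <stdint.h>

/* (a + b) % m without overflow, assuming a < m and b < m */
static uint64_t
addmod_sketch(uint64_t a, uint64_t b, uint64_t m)
{
    /* a + b >= m  <=>  a >= m - b; m - b cannot underflow since b < m */
    return (a >= m - b) ? a - (m - b) : a + b;
}

/* (x * y) % m for any 64-bit x and y, and m >= 1 */
static uint64_t
mulmod_sketch(uint64_t x, uint64_t y, uint64_t m)
{
    uint64_t r = 0;

    assert(m >= 1);

    /* reduce the operands so that every intermediate value stays below m */
    x %= m;
    y %= m;

    /* scan the bits of y from the most significant one down */
    for (int i = 63; i >= 0; i--)
    {
        r = addmod_sketch(r, r, m);     /* R = 2 * R mod M */
        if ((y >> i) & 1)
            r = addmod_sketch(r, x, m); /* R = R + X mod M */
    }

    return r;
}
```

For instance, mulmod_sketch(UINT64_MAX - 1, UINT64_MAX - 1, UINT64_MAX) should give 1, since UINT64_MAX - 1 is congruent to -1 modulo UINT64_MAX.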
Fabien-san,
To make it easier to understand, I think it would be a good idea to add a
brief outline of the pseudorandom_perm function and of the bijective
functions/transformations it applies. What do you think?
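To make the question concrete, here is a simplified sketch of a single round. The names one_round_sketch and mask_of_sketch and the k1/k2/k3 parameters are hypothetical; the patch derives its keys from an LCG and applies several rounds. Each of the three steps maps [0, size) onto itself one-to-one, so their composition is a permutation. The sketch assumes 2 <= size < 8388617, so that size is coprime with the prime and the multiplication cannot overflow:

```c
#include <assert.h>
#include <stdint.h>

/* smallest all-ones mask covering n, same idea as compute_mask() */
static uint64_t
mask_of_sketch(uint64_t n)
{
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;
    return n;
}

/* one illustrative round on [0, size); k1, k2, k3 are arbitrary keys */
static uint64_t
one_round_sketch(uint64_t v, uint64_t size,
                 uint64_t k1, uint64_t k2, uint64_t k3)
{
    const uint64_t prime = 8388617;     /* first entry of the primes table */
    uint64_t mask;
    uint64_t t;

    /* assumption of this sketch only: small size, coprime with prime */
    assert(size >= 2 && size < prime);
    assert(v < size);

    mask = mask_of_sketch(size - 1) >> 1;

    /* xor permutes the low range 0 .. mask, other values are unchanged */
    if (v <= mask)
        v ^= k1 & mask;

    /* same xor, applied to the high range size-1-mask .. size-1 */
    t = size - 1 - v;
    if (t <= mask)
    {
        t ^= k2 & mask;
        v = size - 1 - t;
    }

    /* multiplication by a number coprime with size is invertible mod size */
    return (prime * v + k3 % size) % size;
}
```

Calling it for v = 0 .. 9 with size 10 and any fixed keys should yield each of 0 .. 9 exactly once.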
Best regards,
On 2019/02/02 1:16, Michael Paquier wrote:
On Thu, Oct 25, 2018 at 08:21:27AM +0200, Fabien COELHO wrote:
Thanks for the hint. Here is an updated patch using this marker.
I noticed that some instances use a closing "*-----" sequence, but it does
not seem useful, so I left it out.
I have also tried to improve a few comments, and moved a declaration into a
loop because I did not like the pg-indented version much.
This patch has been marked as ready for committer for some time now, and it
has rotted, so I am moving it to the next CF, waiting on author.
--
Michael
Attachments:
pgbench-prp-func-13.patch (text/plain; charset=UTF-8)
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index af2dea1..5b09f73 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -327,6 +327,26 @@ fi])# PGAC_C_BUILTIN_BSWAP64
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
+
+
+
+
# PGAC_C_BUILTIN_CONSTANT_P
# -------------------------
# Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index 5ef947c..735dc0e 100755
--- a/configure
+++ b/configure
@@ -14025,6 +14025,30 @@ if test x"$pgac_cv__builtin_bswap64" = xyes ; then
$as_echo "#define HAVE__BUILTIN_BSWAP64 1" >>confdefs.h
fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_clzll" >&5
+$as_echo_n "checking for __builtin_clzll... " >&6; }
+if ${pgac_cv__builtin_clzll+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);
+
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+ pgac_cv__builtin_clzll=yes
+else
+ pgac_cv__builtin_clzll=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_clzll" >&5
+$as_echo "$pgac_cv__builtin_clzll" >&6; }
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_CLZLL 1" >>confdefs.h
+
+fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
$as_echo_n "checking for __builtin_constant_p... " >&6; }
if ${pgac_cv__builtin_constant_p+:} false; then :
diff --git a/configure.in b/configure.in
index 96118b5..c7f5c01 100644
--- a/configure.in
+++ b/configure.in
@@ -1481,6 +1481,7 @@ PGAC_C_TYPES_COMPATIBLE
PGAC_C_BUILTIN_BSWAP16
PGAC_C_BUILTIN_BSWAP32
PGAC_C_BUILTIN_BSWAP64
+PGAC_C_BUILTIN_CLZLL
PGAC_C_BUILTIN_CONSTANT_P
PGAC_C_BUILTIN_UNREACHABLE
PGAC_C_COMPUTED_GOTO
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 9d18524..14a1ae7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -936,7 +936,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1482,6 +1482,13 @@ SELECT 2 AS two, 3 AS three \gset p_
<entry><literal>1024.0</literal></entry>
</row>
<row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
+ <row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
<entry>uniformly-distributed random integer in <literal>[lb, ub]</literal></entry>
@@ -1652,6 +1659,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</para>
<para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
+ <para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 17c9ec3..1341078 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 19532cf..23ad1b4 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1157,6 +1157,296 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+#ifdef HAVE__BUILTIN_CLZLL
+ return 64 - (n != 0 ? __builtin_clzll(n) : 64);
+#else
+ /* use clever no branching bitwise operator version */
+ /* set all lower bits to 1 */
+ n = compute_mask(n);
+ /* then count them */
+ n -= (n >> 1) & UINT64CONST(0x5555555555555555);
+ n = (n & UINT64CONST(0x3333333333333333)) + ((n >> 2) & UINT64CONST(0x3333333333333333));
+ n = (n + (n >> 4)) & UINT64CONST(0x0F0F0F0F0F0F0F0F);
+ return (n * UINT64CONST(0x0101010101010101)) >> 56;
+#endif
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*-----
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided. The details are explained
+ * in each step.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ *
+ * Note that, in this implementation, the distributive property of modular
+ * operation shown in below is used whenever it can be applied.
+ * (X * Y) % M = X % M * Y % M, (X + Y) % M = X % M + Y % M
+ *
+ * For ease of understanding, an example is shown.
+ * Example: (11 * 5) % 7
+ *
+ * [Notation] ":=" means assignment; values are written in binary (0b prefix).
+ *
+ * [initial value]
+ * R := 0b0000
+ *
+ * [i=2]
+ * R := (0b0000 * 2) % 0b0111 = 0b0000
+ * R := (R + 0b1011) % 0b0111 = 0b0100
+ *
+ * [i=1]
+ * R := (0b0100 * 2) % 0b0111 = 0b1000 % 0b0111 = 0b0001
+ *
+ * [i=0]
+ * R := (0b0001 * 2) % 0b0111 = 0b0010
+ * R := (R + 0b1011) % 0b0111 = 0b1101 % 0b0111 = 0b0110
+ *
+ * [result]
+ * R = 6
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*-----
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form below:
+ *
+ * For ease of understanding, an example using one digit decimal number,
+ * where r = 6 and m = 7, is shown.
+ * (6 * 2) % 7 = (7 - 1) * 2 % 7 = (7 % 7 - 1 % 7) * 2 % 7
+ * = (0 - 1) * 2 % 7 = -2 % 7 = 7 - 2 = 5.
+ * Generally, if (r * 2) overflows,
+ * (r * 2) % m = m - (m - r) * 2.
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*-----
+ * To compute (r + x) without overflow: transform to
+ * (r + x) % m, and then to (r + x - m).
+ *
+ * An example using one digit decimal number, where r = 6, x = 5 and
+ * m = 7, is shown.
+ * (6 + 5) % 7 = (6 + (5 - 7 + 7)) % 7 = 6 % 7 + (5 - 7) % 7 + 7 % 7
+ * = 6 + (5 - 7) + 0 = 4.
+ * Generally, if (r + x) overflows,
+ * (r + x) % m = r + (x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations:
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * (2) scatter: linear modulo
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for v below 2^40, i.e. ~1E12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2515,6 +2805,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index dc557d4..f9f6b38 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index ad15ba6..0ecbf07 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -321,6 +321,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -448,6 +459,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -779,6 +821,8 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[ 'misc empty script', 1, [qr{empty command list for script}], q{} ],
[ 'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true} ],
+ [ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}],
# GSET & CSET
[ 'gset no row', 2,
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 69a6d03..0acbb7f 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 82547f3..6e389ea 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -745,6 +745,9 @@
/* Define to 1 if your compiler understands __builtin_bswap64. */
#undef HAVE__BUILTIN_BSWAP64
+/* Define to 1 if your compiler understands __builtin_clzll. */
+#undef HAVE__BUILTIN_CLZLL
+
/* Define to 1 if your compiler understands __builtin_constant_p. */
#undef HAVE__BUILTIN_CONSTANT_P
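The interleaved modular multiplication described in the modular_multiply comment of the patch can be tried out in isolation. Below is a minimal standalone sketch, not the patch's code: the name modmul_sketch is hypothetical, and it avoids overflow by comparing against m before each doubling or addition rather than by the wrap-around tricks used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch only, not the patch's modular_multiply(): compute (x * y) % m
 * for any 64-bit operands by scanning y from its top bit down and
 * keeping r < m at every step, so that nothing can overflow.
 */
static uint64_t
modmul_sketch(uint64_t x, uint64_t y, uint64_t m)
{
	uint64_t	r = 0;

	assert(m >= 1);
	x %= m;
	y %= m;

	for (int i = 63; i >= 0; i--)
	{
		/* r = (2 * r) % m; since r < m, "r >= m - r" means 2 * r >= m */
		r = (r >= m - r) ? r - (m - r) : r + r;

		/* if bit i of y is set, r = (r + x) % m; both r and x are < m */
		if ((y >> i) & 1)
			r = (r >= m - x) ? r - (m - x) : r + x;
	}
	return r;
}
```

With the example from the comment, modmul_sketch(11, 5, 7) goes through the same intermediate values (4, 1, then 2 and 6) and returns 6. Unlike the patch, this sketch always loops over all 64 bits and has no fast path for small operands.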
Hi,
On 2019-02-10 17:46:15 +0000, Hironobu SUZUKI wrote:
I updated the patch. And also I added some explanations and simple examples
in the modular_multiply function.
It'd be good to update the commitfest entry to say 'needs review' the
next time.
+# PGAC_C_BUILTIN_CLZLL
+# -------------------------
+# Check if the C compiler understands __builtin_clzll(),
+# and define HAVE__BUILTIN_CLZLL if so.
+# Both GCC & CLANG seem to have one.
+AC_DEFUN([PGAC_C_BUILTIN_CLZLL],
+[AC_CACHE_CHECK(for __builtin_clzll, pgac_cv__builtin_clzll,
+[AC_COMPILE_IFELSE([AC_LANG_SOURCE(
+[static unsigned long int x = __builtin_clzll(0xaabbccddeeff0011);]
+)],
+[pgac_cv__builtin_clzll=yes],
+[pgac_cv__builtin_clzll=no])])
+if test x"$pgac_cv__builtin_clzll" = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_CLZLL, 1,
+ [Define to 1 if your compiler understands __builtin_clzll.])
+fi])# PGAC_C_BUILTIN_CLZLL
I think this has been partially superseded by
commit 711bab1e4d19b5c9967328315a542d93386b1ac5
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: 2019-02-13 16:10:06 -0300
Add basic support for using the POPCNT and SSE4.2s LZCNT opcodes
Could you make sure it's integrated appropriately?
<para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
This doesn't really explain why we want this in pgbench.
Greetings,
Andres Freund
Hello Andres,
+# PGAC_C_BUILTIN_CLZLL
I think this has been partially superseded by
commit 711bab1e4d19b5c9967328315a542d93386b1ac5
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: 2019-02-13 16:10:06 -0300
Indeed, the patch needs a rebase & conflict resolution. I'll do it. Later.
<para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
This doesn't really explain why we want this in pgbench.
Who is "we"?
If someone runs non uniform tests, i.e. with random_exp/zipf/gauss, close
values are drawn with a similar frequency and are thus correlated. This
induces an undeserved correlation at the page level (e.g. for reads) and
better performance than would be the case if relative frequencies were not
correlated to key values.
So the function allows having more realistic non uniform tests, whereas
currently we can only have non uniform tests with very unrealistically
correlated values at the key level and possibly at the page level, meaning
unrepresentative performance because of this induced bias.
This is under the assumption that pgbench should allow more realistic
performance test scenarios, which I believe is a desirable purpose. If
someone disagrees with this purpose, then they would consider both non
uniform random functions and this proposed pseudo-random permutation
function as useless, as probably most other additions to pgbench.
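The page-level effect can be illustrated with a small standalone sketch. It is an illustration under stated assumptions only: affine_perm_sketch is a hypothetical stand-in for pr_perm (a single "scatter" step, bijective when the multiplier is coprime to size), the "hot" keys are simply taken to be the lowest ones as a skewed distribution would favor, and a page is modeled as a fixed number of consecutive keys.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pr_perm: v -> (mult * v + shift) % size. */
static uint64_t
affine_perm_sketch(uint64_t v, uint64_t size, uint64_t mult, uint64_t shift)
{
	/* keep size within 32 bits so the product below cannot overflow */
	assert(size >= 1 && size <= UINT32_MAX);
	return ((mult % size) * (v % size) + shift % size) % size;
}

/*
 * Count the distinct pages touched by the "hot" lowest keys once they are
 * mapped through the permutation; mult = 1 gives the identity mapping.
 */
static int
pages_touched_sketch(uint64_t hot, uint64_t size, uint64_t per_page,
					 uint64_t mult)
{
	uint64_t	seen = 0;		/* one bit per page */
	int			count = 0;

	assert(per_page >= 1 && (size + per_page - 1) / per_page <= 64);

	for (uint64_t v = 0; v < hot && v < size; v++)
	{
		uint64_t	key = affine_perm_sketch(v, size, mult, 0);

		seen |= UINT64_C(1) << (key / per_page);
	}

	/* count the bits set in the page bitmap */
	for (; seen != 0; seen &= seen - 1)
		count++;
	return count;
}
```

With 1000 keys and 100 keys per page, the 10 hottest keys all sit on one page under the identity mapping (mult = 1), whereas with mult = 617 they are spread over 9 of the 10 pages. The real function applies several rounds and a seed, so the actual spread will differ.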
--
Fabien.
Indeed, the patch needs a rebase & conflict resolution. I'll do it. Later.
Here is an update:
- take advantage of pg_bitutils (although I noted that the "slow"
popcount there could be sped up and shortened with a bitwise operator
implementation that I just removed from pgbench).
- add comments about the bijective transformations in the code.
As already stated, this function makes sense for people who want to test
performance with pgbench using non uniform rands. If you do not want to do
that, you will probably find the function pretty useless. I can't help it.
Non uniform rands are also a way to test pg lock contention behavior.
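For reference, the bit-length computation that now goes through pg_bitutils can be sketched standalone as follows. This is only an approximation of the idea: popcount64_sketch is a plain loop standing in for pg_popcount64, and all the *_sketch names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* smallest all-ones mask covering n: copy the top set bit downwards */
static uint64_t
mask_sketch(uint64_t n)
{
	n |= n >> 1;
	n |= n >> 2;
	n |= n >> 4;
	n |= n >> 8;
	n |= n >> 16;
	n |= n >> 32;
	return n;
}

/* portable bit count, clearing the lowest set bit at each iteration */
static int
popcount64_sketch(uint64_t n)
{
	int			count = 0;

	for (; n != 0; n &= n - 1)
		count++;
	return count;
}

/* length of the binary representation of n, 0 for n == 0 */
static int
nbits_sketch(uint64_t n)
{
	return popcount64_sketch(mask_sketch(n));
}

/* true if x * y is guaranteed to fit in 64 bits */
static int
product_fits_sketch(uint64_t x, uint64_t y)
{
	return nbits_sketch(x) + nbits_sketch(y) <= 64;
}
```

The last helper mirrors the test used before the plain (x * y) % m shortcut: an a-bit value times a b-bit value is below 2^(a+b), so the product fits whenever a + b <= 64.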
--
Fabien.
Attachments:
pgbench-prp-func-14.patchtext/x-diff; name=pgbench-prp-func-14.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 9d18524834..14a1ae7b75 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -936,7 +936,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1481,6 +1481,13 @@ SELECT 2 AS two, 3 AS three \gset p_
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1651,6 +1658,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 17c9ec32c5..13410787d4 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 19532cfb54..bdbb9c252b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -37,6 +37,7 @@
#include "getopt_long.h"
#include "libpq-fe.h"
#include "portability/instr_time.h"
+#include "port/pg_bitutils.h"
#include <ctype.h>
#include <float.h>
@@ -1157,6 +1158,301 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*-----
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided. The details are explained
+ * in each step.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ *
+ * Note that, in this implementation, the distributive property of modular
+ * operation shown in below is used whenever it can be applied.
+ * (X * Y) % M = X % M * Y % M, (X + Y) % M = X % M + Y % M
+ *
+ * For ease of understanding, an example is shown.
+ * Example: (11 * 5) % 7
+ *
+ * [Notation] ":=" means assignment; values are written in binary (0b prefix).
+ *
+ * [initial value]
+ * R := 0b0000
+ *
+ * [i=2]
+ * R := (0b0000 * 2) % 0b0111 = 0b0000
+ * R := (R + 0b1011) % 0b0111 = 0b0100
+ *
+ * [i=1]
+ * R := (0b0100 * 2) % 0b0111 = 0b1000 % 0b0111 = 0b0001
+ *
+ * [i=0]
+ * R := (0b0001 * 2) % 0b0111 = 0b0010
+ * R := (R + 0b1011) % 0b0111 = 0b1101 % 0b0111 = 0b0110
+ *
+ * [result]
+ * R = 6
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*-----
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form below:
+ *
+ * For ease of understanding, an example using one digit decimal number,
+ * where r = 6 and m = 7, is shown.
+ * (6 * 2) % 7 = (7 - 1) * 2 % 7 = (7 % 7 - 1 % 7) * 2 % 7
+ * = (0 - 1) * 2 % 7 = -2 % 7 = 7 - 2 = 5.
+ * Generally, if (r * 2) overflows,
+ * (r * 2) % m = m - (m - r) * 2.
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*-----
+ * To compute (r + x) without overflow: transform to
+ * (r + x) % m, and then to (r + x - m).
+ *
+ * An example using one digit decimal number, where r = 6, x = 5 and
+ * m = 7, is shown.
+ * (6 + 5) % 7 = (6 + (5 - 7 + 7)) % 7 = 6 % 7 + (5 - 7) % 7 + 7 % 7
+ * = 6 + (5 - 7) + 0 = 4.
+ * Generally, if (r + x) overflows,
+ * (r + x) % m = r + (x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which skips any prime that
+ * divides size.
+ *
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for v below 2^40 (~1E12) */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT < 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2515,6 +2811,27 @@ evalStandardFunc(TState *thread, CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index dc557d416c..f9f6b38662 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index ad15ba66ea..0ecbf07c6d 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -321,6 +321,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -448,6 +459,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -779,6 +821,8 @@ SELECT LEAST(:i, :i, :i, :i, :i, :i, :i, :i, :i, :i, :i);
[ 'misc empty script', 1, [qr{empty command list for script}], q{} ],
[ 'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true} ],
+ [ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}],
# GSET & CSET
[ 'gset no row', 2,
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 69a6d03bb3..0acbb7fbbb 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Hi Hironobu,
On 3/3/19 12:55 PM, Fabien COELHO wrote:
Indeed, the patch needs a rebase & conflict resolution. I'll do it. Later.
Here is an update:
 - take advantage of pg_bitutils (although I noted that the "slow"
   popcount there could be sped up and shortened with a bitwise operator
   implementation that I just removed from pgbench).
 - add comments about the bijective transformations in the code.
As already stated, this function makes sense for people who want to test
performance with pgbench using non uniform rands. If you do not want to
do that, you will probably find the function pretty useless. I can't
help it.
Also, non uniform rands are a way to test pg lock contention behavior.
You have signed up as a reviewer for this patch. Do you know when
you'll have time to do the review?
Regards,
--
-David
david@pgmasters.net
On 2019/03/21 17:27, David Steele wrote:
Hi Hironobu,
Sorry for the late reply. I reviewed this patch.
Function nbits(), which was previously discussed, has been simplified by
using the function pg_popcount64().
By adding the mathematical explanation, it has been easier to understand
the operation of this function.
I believe that these improvements will have a positive impact on
maintenance.
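For readers following the review, the bijectivity argument for the scatter step, v = (v * p + k) % size, can be checked by brute force on a small domain. The following is only a sketch and not code from the patch: the helper names scatter and scatter_is_bijective are hypothetical, and it assumes size is small enough that p * v cannot overflow 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One scatter step on a small domain; p * v stays far below 2^64 here. */
static uint64_t
scatter(uint64_t v, uint64_t p, uint64_t k, uint64_t size)
{
	return (p * v + k) % size;
}

/* Returns true if scatter() hits every value in [0, size) exactly once. */
static bool
scatter_is_bijective(uint64_t p, uint64_t k, uint64_t size)
{
	bool	   *seen = calloc(size, sizeof(bool));
	bool		ok = (seen != NULL);

	for (uint64_t v = 0; ok && v < size; v++)
	{
		uint64_t	w = scatter(v, p, k, size);

		if (seen[w])
			ok = false;		/* collision: two inputs share an output */
		seen[w] = true;
	}
	free(seen);
	return ok;
}
```

With a multiplier that shares a factor with size (for instance p = 6 and size = 15) the check fails, which is why the patch skips primes that divide size.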
The patch could be applied successfully and the tests passed without
problems.
So, I think the latest patch is fine.
Best regards,
Hello Hironobu-san,
Here is a v15 which is a rebase, plus a large simplification of the modmul
function if an int128 type is available, which is probably always…
Could you have a look and possibly revalidate?
--
Fabien Coelho - CRI, MINES ParisTech
Attachments:
pgbench-prp-func-15.patchtext/x-diff; name=pgbench-prp-func-15.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ef22a484e7..24a2c147cc 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -938,7 +938,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1434,6 +1434,13 @@ SELECT 2 AS two, 3 AS three \gset p_
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1599,6 +1606,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows to mix the output of non uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function errors if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 17c9ec32c5..13410787d4 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8b84658ccd..c532978398 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,7 @@
#include "getopt_long.h"
#include "libpq-fe.h"
#include "portability/instr_time.h"
+#include "port/pg_bitutils.h"
#include <ctype.h>
#include <float.h>
@@ -1039,6 +1040,307 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+#if !defined(PG_INT128_TYPE)
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
+#endif
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+#if defined(PG_INT128_TYPE)
+ return (PG_INT128_TYPE) x * (PG_INT128_TYPE) y % (PG_INT128_TYPE) m;
+#else
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*-----
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided. The details are explained
+ * in each step.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ *
+ * Note that, in this implementation, the distributive property of modular
+ * operation shown below is used whenever it can be applied.
+ * (X * Y) % M = ((X % M) * (Y % M)) % M,
+ * (X + Y) % M = ((X % M) + (Y % M)) % M
+ *
+ * For ease of understanding, an example is shown.
+ * Example: (11 * 5) % 7
+ *
+ * [Notation] ":=" means substitution; numbers prefixed with 0b are binary.
+ *
+ * [initial value]
+ * R := 0b0000
+ *
+ * [i=2]
+ * R := (0b0000 * 2) % 0b0111 = 0b0000
+ * R := (R + 0b1011) % 0b0111 = 0b0100
+ *
+ * [i=1]
+ * R := (0b0100 * 2) % 0b0111 = 0b1000 % 0b0111 = 0b0001
+ *
+ * [i=0]
+ * R := (0b0001 * 2) % 0b0111 = 0b0010
+ * R := (R + 0b1011) % 0b0111 = 0b1101 % 0b0111 = 0b0110
+ *
+ * [result]
+ * R = 6
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*-----
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form below:
+ *
+ * For ease of understanding, an example using one digit decimal number,
+ * where r = 6 and m = 7, is shown.
+ * (6 * 2) % 7 = (7 - 1) * 2 % 7 = (7 % 7 - 1 % 7) * 2 % 7
+ * = (0 - 1) * 2 % 7 = -2 % 7 = 7 - 2 = 5.
+ * Generally, if (r * 2) overflows,
+ * (r * 2) % m = m - (m - r) * 2.
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*-----
+ * To compute (r + x) without overflow: transform to
+ * (r + x) % m, and then to (r + x - m).
+ *
+ * An example using one digit decimal number, where r = 6, x = 5 and
+ * m = 7, is shown.
+ * (6 + 5) % 7 = (6 + (5 - 7 + 7)) % 7 = 6 % 7 + (5 - 7) % 7 + 7 % 7
+ * = 6 + (5 - 7) + 0 = 4.
+ * Generally, if (r + x) overflows,
+ * (r + x) % m = r + (x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+#endif
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which skips any prime that
+ * divides size.
+ *
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for v below 2^40 (~1E12) */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT < 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2383,6 +2685,27 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..573a67ec69 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index dc2c72fa92..5547ee4466 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -331,6 +331,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -458,6 +469,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -771,9 +813,13 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
],
[ 'misc empty script', 1, [qr{empty command list for script}], q{} ],
[
- 'bad boolean', 2,
+ 'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 69a6d03bb3..0acbb7fbbb 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -306,6 +306,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
On Fri, May 24, 2019 at 2:46 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Here is a v15 which is a rebase, plus a large simplification of the modmul
function if an int128 type is available, which is probably always…
Function nbits(), which was previously discussed, has been simplified by
using the function pg_popcount64().
Hi Fabien, Suzuki-san,
I am not smart enough to commit this or judge its value for
benchmarking, but I have a few trivial comments on the language:
+ It allows to mix the output of non uniform random functions so that
"It allows the output of non-uniform random functions to be mixed so that"
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
"neither collisions nor holes", or "no collisions or holes"
+ The function errors if size is not positive.
"raises an error"
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
"24 bit mega primes"
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
I suppose you could use n == 0 ? 0 : pg_leftmost_one_pos64(n) + 1, and then...
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
... here you could use (1 << nbits(n)) - 1. I have no idea if doing it
that way around is better, it's just a thought and removes a few lines
of bit-swizzling code.
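To make the suggestion concrete, here is a rough sketch comparing the two approaches. It is not taken from the patch: the function names are hypothetical, and __builtin_clzll (a GCC/Clang builtin) is used as a stand-in for pg_leftmost_one_pos64(), which is an assumption about what an equivalent would look like outside the PostgreSQL tree.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-smearing version, as in the patch: sets every bit below the top set bit. */
static uint64_t
mask_by_smearing(uint64_t n)
{
	n |= n >> 1;
	n |= n >> 2;
	n |= n >> 4;
	n |= n >> 8;
	n |= n >> 16;
	n |= n >> 32;
	return n;
}

/*
 * Length of n's binary representation.  __builtin_clzll is undefined
 * for 0, hence the explicit test for zero.
 */
static int
nbits_by_clz(uint64_t n)
{
	return n == 0 ? 0 : 64 - __builtin_clzll(n);
}

/* Mask derived from the bit length; a shift by 64 would be undefined. */
static uint64_t
mask_by_nbits(uint64_t n)
{
	int			bits = nbits_by_clz(n);

	return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}
```

Note that the second form needs two special cases (n == 0 and a 64-bit length), whereas the smearing form has none.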
--
Thomas Munro
https://enterprisedb.com
Hello Thomas,
Function nbits(), which was previously discussed, has been simplified by
using the function pg_popcount64().
Hi Fabien, Suzuki-san,
I am not smart enough to commit this
I'm in no hurry:-)
or judge its value for benchmarking,
I think that it is valuable given that we have non uniform random
generators in pgbench.
I'm wondering about the manual modular_multiply implementation, which adds
quite a few lines of non trivial code. If int128 is available on all/most
platforms, the manual version could be removed and the function marked as
not supported on platforms lacking int128, although that would create an
issue with the regression tests.
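As an illustration of the trade-off, the sketch below puts a 128-bit modular multiplication next to a portable 64-bit fallback. It is not the patch's code: the names are hypothetical, unsigned __int128 is assumed to be available (GCC/Clang extension), and the fallback is a plain right-to-left double-and-add, which is a simpler variant than the interleaved algorithm used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit path: unsigned __int128 is a GCC/Clang extension (assumption). */
static uint64_t
mulmod_wide(uint64_t x, uint64_t y, uint64_t m)
{
	return (uint64_t) ((unsigned __int128) x * y % m);
}

/* (a + b) % m for a, b < m, without overflowing 64 bits. */
static uint64_t
addmod(uint64_t a, uint64_t b, uint64_t m)
{
	return (a >= m - b) ? a - (m - b) : a + b;
}

/* Portable path: double-and-add over the bits of y, requires m >= 1. */
static uint64_t
mulmod_portable(uint64_t x, uint64_t y, uint64_t m)
{
	uint64_t	r = 0;

	x %= m;
	y %= m;
	while (y != 0)
	{
		if (y & 1)
			r = addmod(r, x, m);	/* add the current multiple of x */
		x = addmod(x, x, m);		/* x := 2 * x mod m */
		y >>= 1;
	}
	return r;
}
```

Both functions should agree on every input; the portable one costs up to 64 loop iterations, which gives an idea of what the int128 path saves.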
but I have a few trivial comments on the language:
+ It allows to mix the output of non uniform random functions so that
"It allows the output of non-uniform random functions to be mixed so that"
Fixed.
+ ensures that a perfect permutation is applied: there are no collisions
+ nor holes in the output values.
"neither collisions nor holes", or "no collisions or holes"
I choose the first.
+ The function errors if size is not positive.
"raises an error"
Fixed.
+ * 24 bits mega primes from https://primes.utm.edu/lists/small/millions/
"24 bit mega primes"
Fixed.
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
I suppose you could use n == 0 ? 0 : pg_leftmost_one_pos64(n) + 1, and then...
It would create a branch, that I'm trying to avoid.
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
... here you could use (1 << nbits(n)) - 1. I have no idea if doing it
that way around is better, it's just a thought and removes a few lines
of bit-swizzling code.
This would create an infinite recursion, as nbits currently uses
compute_mask. The 6 bitwise operations above are pretty efficient; I'd
leave it at that.
Attached a v16.
--
Fabien.
Attachments:
pgbench-prp-func-16.patchtext/x-diff; name=pgbench-prp-func-16.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 816f9cc4c7..1747a9f66a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -947,7 +947,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1443,6 +1443,13 @@ SELECT 2 AS two, 3 AS three \gset p_
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1608,6 +1615,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</programlisting>
</para>
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation.
+ It allows the output of non uniform random functions to be mixed so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are neither collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function raises an error if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+ For a given size and seed, the function is fully deterministic: if two
+ permutations on the same size must not be correlated, use distinct seeds
+ as outlined in the previous example about hash functions.
+ </para>
+
<para>
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 17c9ec32c5..13410787d4 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 570cf3306a..41ca17c26b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,7 @@
#include "getopt_long.h"
#include "libpq-fe.h"
#include "portability/instr_time.h"
+#include "port/pg_bitutils.h"
#include <ctype.h>
#include <float.h>
@@ -1040,6 +1041,307 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bit mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+#if !defined(PG_INT128_TYPE)
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
+#endif
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+#if defined(PG_INT128_TYPE)
+ return (PG_INT128_TYPE) x * (PG_INT128_TYPE) y % (PG_INT128_TYPE) m;
+#else
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*-----
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided. The details are explained
+ * in each step.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ *
+ * Note that, in this implementation, the distributive property of the
+ * modulo operation shown below is used whenever it can be applied.
+ * (X * Y) % M = (X % M * Y % M) % M, (X + Y) % M = (X % M + Y % M) % M
+ *
+ * For ease of understanding, an example is shown.
+ * Example: (11 * 5) % 7
+ *
+ * [Notation] ":=" means substitution.
+ *
+ * [initial value]
+ * R := 0b0000
+ *
+ * [i=2]
+ * R := (0b0000 * 2) % 0b0111 = 0b0000
+ * R := (R + 0b1011) % 0b0111 = 0b0100
+ *
+ * [i=1]
+ * R := (0b0100 * 2) % 0b0111 = 0b1000 % 0b0111 = 0b0001
+ *
+ * [i=0]
+ * R := (0b0001 * 2) % 0b0111 = 0b0010
+ * R := (R + 0b1011) % 0b0111 = 0b1101 % 0b0111 = 0b0110
+ *
+ * [result]
+ * R = 6
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*-----
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form below:
+ *
+ * For ease of understanding, an example using one digit decimal number,
+ * where r = 6 and m = 7, is shown.
+ * (6 * 2) % 7 = (7 - 1) * 2 % 7 = (7 % 7 - 1 % 7) * 2 % 7
+ * = (0 - 1) * 2 % 7 = -2 % 7 = 7 - 2 = 5.
+ * Generally, if (r * 2) overflows,
+ * (r * 2) % m = m - (m - r) * 2.
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*-----
+ * To compute (r + x) without overflow: transform to
+ * (r + x) % m, and then to (r + x - m).
+ *
+ * An example using one digit decimal number, where r = 6, x = 5 and
+ * m = 7, is shown.
+ * (6 + 5) % 7 = (6 + (5 - 7 + 7)) % 7 = 6 % 7 + (5 - 7) % 7 + 7 % 7
+ * = 6 + (5 - 7) + 0 = 4.
+ * Generally, if (r + x) overflows,
+ * (r + x) % m = r + (x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+#endif
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: e.g.
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p and size are coprime, which is
+ * ensured in the code by the while loop which discards primes when
+ * size is a multiple of it.
+ *
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for size up to ~1E12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2384,6 +2686,27 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index c4a1e298e0..573a67ec69 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 5a2fdb9acb..f7f1dfa055 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -326,6 +326,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -453,6 +464,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -766,9 +808,13 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
],
[ 'misc empty script', 1, [qr{empty command list for script}], q{} ],
[
- 'bad boolean', 2,
+ 'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index f7fa18418b..714618fc40 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -315,6 +315,16 @@ my @script_tests = (
'double overflow 3',
[qr{double constant overflow}],
{ 'overflow-3.sql' => "\\set d .1E310\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
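
For readers following the interleaved modular multiplication described in the patch comments above, the following standalone sketch shows the same double-and-add idea in a simplified form. It is not the patch's code: the name mulmod_interleaved is hypothetical, it always loops over all 64 bits rather than only the bits of y, and it avoids overflow by comparing against m - r and m - x before adding, which is a different way to reach the same result.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute (x * y) % m without a 128-bit type (illustrative sketch).
 *
 * Invariant: r < m at all times, so "m - r" and "m - x" never underflow
 * and every addition below is done only when it cannot reach m.
 */
static uint64_t
mulmod_interleaved(uint64_t x, uint64_t y, uint64_t m)
{
    uint64_t r = 0;

    assert(m >= 1);
    x %= m;
    y %= m;

    for (int i = 63; i >= 0; i--)
    {
        /* r = (2 * r) % m: subtract m up front if doubling would reach it */
        r = (r >= m - r) ? r - (m - r) : r + r;

        /* r = (r + x) % m when bit i of y is set */
        if ((y >> i) & 1)
            r = (r >= m - x) ? r - (m - x) : r + x;
    }
    return r;
}
```

With the example from the comment, mulmod_interleaved(11, 5, 7) gives 6. The patch's version additionally shortcuts the cases where the plain product fits in 64 bits, which this sketch leaves out for brevity.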
Hi,
This patch was marked as RFC on 2019-03-30, but since then there have
been a couple more issues pointed out in a review by Thomas Munro, and
it went through 2019-09 and 2019-11 without any attention. Is the RFC
status still appropriate?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
This patch was marked as RFC on 2019-03-30, but since then there have
been a couple more issues pointed out in a review by Thomas Munro, and
it went through 2019-09 and 2019-11 without any attention. Is the RFC
status still appropriate?
Thomas's review was about comment and documentation wording and asked for
explanations, which I think I addressed, and the code did not actually
change, so I'm not sure that "needs review" is really needed, but do
as you see fit.
--
Fabien
On 2020-01-05 10:02, Fabien COELHO wrote:
This patch was marked as RFC on 2019-03-30, but since then there have
been a couple more issues pointed out in a review by Thomas Munro, and
it went through 2019-09 and 2019-11 without any attention. Is the RFC
status still appropriate?
Thomas review was about comments/documentation wording and asking for
explanations, which I think I addressed, and the code did not actually
change, so I'm not sure that the "needs review" is really needed, but do
as you feel.
I read the whole thread, I still don't know what this patch is supposed
to do. I know what the words in the subject line mean, but I don't know
how this helps a pgbench user run better benchmarks. I feel this is
also the sentiment expressed by others earlier in the thread. You
indicated that this functionality makes sense to those who want this
functionality, but so far only two people, namely the patch author and
the reviewer, have participated in the discussion on the substance of
this patch. So either the feature is extremely niche, or nobody
understands it. I think you ought to take about three steps back and
explain this in more basic terms, even just in email at first so that we
can then discuss what to put into the documentation.
--
Peter Eisentraut http://www.2ndQuadrant.com/
On 2020-Jan-30, Peter Eisentraut wrote:
I read the whole thread, I still don't know what this patch is supposed to
do. I know what the words in the subject line mean, but I don't know how
this helps a pgbench user run better benchmarks. I feel this is also the
sentiment expressed by others earlier in the thread. You indicated that
this functionality makes sense to those who want this functionality, but so
far only two people, namely the patch author and the reviewer, have
participated in the discussion on the substance of this patch. So either
the feature is extremely niche, or nobody understands it. I think you ought
to take about three steps back and explain this in more basic terms, even
just in email at first so that we can then discuss what to put into the
documentation.
After re-reading one more time, it dawned on me that the point of this
is similar in spirit to this one:
https://wiki.postgresql.org/wiki/Pseudo_encrypt
The idea seems to be to map the int4 domain into itself, so you can use
a sequence to generate numbers that will not look like a sequence,
allowing the user to hide some properties (such as the generation rate)
that might be useful to an eavesdropper/attacker. In terms of writing
benchmarks, it seems useful to destroy all locality of access, which
changes the benchmark completely. (I'm not sure if this is something
benchmark writers really want to have.)
If I'm right, then I agree that the documentation provided with the
patch does a pretty bad job at explaining it, because until now I didn't
at all realize this is what it was.
--
Álvaro Herrera https://www.2ndQuadrant.com/
Hello Peter,
This patch was marked as RFC on 2019-03-30, but since then there have
been a couple more issues pointed out in a review by Thomas Munro, and
it went through 2019-09 and 2019-11 without any attention. Is the RFC
status still appropriate?
Thomas review was about comments/documentation wording and asking for
explanations, which I think I addressed, and the code did not actually
change, so I'm not sure that the "needs review" is really needed, but do
as you feel.
I read the whole thread, I still don't know what this patch is supposed to
do. I know what the words in the subject line mean, but I don't know how
this helps a pgbench user run better benchmarks. I feel this is also the
sentiment expressed by others earlier in the thread. You indicated that this
functionality makes sense to those who want this functionality, but so far
only two people, namely the patch author and the reviewer, have participated
in the discussion on the substance of this patch. So either the feature is
extremely niche, or nobody understands it. I think you ought to take about
three steps back and explain this in more basic terms, even just in email at
first so that we can then discuss what to put into the documentation.
Here is the motivation for this feature:
When you design a benchmark to test performance, you want to avoid
unwanted correlation which may impact performance unduly, one way or
another. For the default pgbench benchmarks, accounts are chosen
uniformly, no problem. However, if you want to test a non uniform
distribution, which is what most people would encounter in practice,
things are different.
For instance, you would replace random with random_exponential in the
default benchmarks. If you do that, then all accounts with lower ids come
out more often. However, this creates an unwanted correlation and
performance effect: why would frequent accounts just happen to be those
with small ids, which just happen to reside together in the first few
pages of the table? To avoid these effects, you need to shuffle
the chosen account ids, so that frequent accounts are not specifically
those at the beginning of the table.
Basically, as soon as a non-uniform random generator selects rows, you
want to add some shuffling, otherwise you are going to skew your
benchmark measurements. Nobody should use a non-uniform generator for
selecting rows without some shuffling, ever. I have no doubt that many
people do so without realizing that they are skewing their performance data.
Once you realize that, you can try to invent your own shuffling, but
frankly it is not as easy as it looks to come up with something non-trivial
that does not generate unwanted correlations. I went through a lot of (very)
bad designs before coming up with the one in the patch: you want something
fast and you want good steering, which are contradictory objectives. There is
also the question of being able to change the shuffling for a given size,
for instance to check that there are no unwanted effects, hence the
seeding. Good seeded shuffling is what an encryption algorithm does, but for
these it is somewhat simpler because they usually work on power-of-two
sizes.
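
As a rough illustration of what a seeded shuffle on an arbitrary size can look like, here is a standalone sketch of a single "scatter" step of the form v = (v * p + k) % size, which is a bijection on [0, size) whenever p and size are coprime. The names toy_scatter, gcd_u64 and toy_scatter_is_permutation are hypothetical, and the sketch assumes size and p below 2^32 so that the product cannot overflow; it is not the patch's function, which combines several rounds and handles full 63-bit sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor, used to check that p and size are coprime. */
static uint64_t
gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0)
    {
        uint64_t t = a % b;

        a = b;
        b = t;
    }
    return a;
}

/*
 * Hypothetical toy "scatter" step: v -> (v * p + k) % size.
 * Restricted to size and p below 2^32 so the product fits in 64 bits.
 */
static uint64_t
toy_scatter(uint64_t v, uint64_t size, uint64_t p, uint64_t k)
{
    assert(size >= 1 && size <= UINT32_MAX);
    assert(p <= UINT32_MAX);
    assert(gcd_u64(p, size) == 1);  /* required for bijectivity */

    return ((v % size) * p + k % size) % size;
}

/* Return 1 if toy_scatter hits every value of [0, size) exactly once. */
static int
toy_scatter_is_permutation(uint64_t size, uint64_t p, uint64_t k)
{
    unsigned char seen[1024] = {0};

    assert(size <= 1024);
    for (uint64_t v = 0; v < size; v++)
    {
        uint64_t out = toy_scatter(v, size, p, k);

        if (seen[out])
            return 0;
        seen[out] = 1;
    }
    return 1;
}
```

For example, with size = 10, p = 3 and k = 4, the inputs 0..9 map to 4, 7, 0, 3, 6, 9, 2, 5, 8, 1: no collisions and no holes, and a different k (the seed) gives a different arrangement. A single step like this keeps a lot of regular structure, which is presumably why one step alone is not enough.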
In conclusion, a seeded shuffle step is needed as soon as you start
using non-uniform random generators. Providing one in pgbench, a tool
designed to run possibly non-uniform benchmarks against postgres, looks like
common courtesy. Not providing one helps people design bad benchmarks,
possibly without realizing it, which is outright thoughtlessness.
I have no doubt that the documentation should be improved to explain that
in a concise but clear way.
--
Fabien.
Hello Alvaro,
I read the whole thread, I still don't know what this patch is supposed to
do. I know what the words in the subject line mean, but I don't know how
this helps a pgbench user run better benchmarks. I feel this is also the
sentiment expressed by others earlier in the thread. You indicated that
this functionality makes sense to those who want this functionality, but so
far only two people, namely the patch author and the reviewer, have
participated in the discussion on the substance of this patch. So either
the feature is extremely niche, or nobody understands it. I think you ought
to take about three steps back and explain this in more basic terms, even
just in email at first so that we can then discuss what to put into the
documentation.
After re-reading one more time, it dawned on me that the point of this
is similar in spirit to this one:
https://wiki.postgresql.org/wiki/Pseudo_encrypt
Indeed. The one in the wiki is of no use here because it applies to all
integers, whereas in a benchmark you want it for a given size and you want
seeding, but otherwise the same correlation-avoidance problem is addressed.
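
To illustrate how a permutation can be built for a given size that is not a power of two, here is a standalone sketch of the "scramble" idea described in the patch comments: xor with a key on the lowest power-of-two-sized block, then the same on the highest block, the two possibly overlapping. The names smear_mask, toy_scramble and toy_scramble_is_permutation are hypothetical, and the two keys are passed in directly instead of being derived from a seed, which is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Set every bit below the highest set bit of n. */
static uint64_t
smear_mask(uint64_t n)
{
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;
    return n;
}

/*
 * Hypothetical toy scramble of v in [0, size), size >= 2.
 * Each xor stays inside its block of mask + 1 values, so each half-step
 * is a bijection and so is their composition.
 */
static uint64_t
toy_scramble(uint64_t v, uint64_t size, uint64_t k1, uint64_t k2)
{
    uint64_t mask;
    uint64_t t;

    assert(size >= 2 && v < size);
    mask = smear_mask(size - 1) >> 1;

    /* low block: v in 0 .. mask */
    if (v <= mask)
        v ^= k1 & mask;

    /* high block, counted down from size - 1 */
    t = size - 1 - v;
    if (t <= mask)
    {
        t ^= k2 & mask;
        v = size - 1 - t;
    }
    return v;
}

/* Return 1 if toy_scramble hits every value of [0, size) exactly once. */
static int
toy_scramble_is_permutation(uint64_t size, uint64_t k1, uint64_t k2)
{
    unsigned char seen[1024] = {0};

    assert(size <= 1024);
    for (uint64_t v = 0; v < size; v++)
    {
        uint64_t out = toy_scramble(v, size, k1, k2);

        if (seen[out])
            return 0;
        seen[out] = 1;
    }
    return 1;
}
```

With size = 15 the mask is 7, so the low block is 0..7 and the high block is 7..14; the value 7 sits in both and may be moved twice. A power-of-two xor alone would not cover sizes such as 15, which is the part the wiki function does not address.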
The idea seems to be to map the int4 domain into itself, so you can use
a sequence to generate numbers that will not look like a sequence,
allowing the user to hide some properties (such as the generation rate)
that might be useful to an eavesdropper/attacker. In terms of writing
benchmarks, it seems useful to destroy all locality of access, which
changes the benchmark completely.
Yes.
(I'm not sure if this is something benchmark writers really want to
have.)
I do not get this sentence. I'm sure that a benchmark writer should really
want to avoid unrealistic correlations that have a performance impact.
If I'm right, then I agree that the documentation provided with the
patch does a pretty bad job at explaining it, because until now I didn't
at all realize this is what it was.
The documentation is improvable, no doubt.
Attached is an attempt at improving things. I have added an explicit note
and hijacked an existing example to better illustrate the purpose of the
function.
--
Fabien.
Attachments:
pgbench-prp-func-18.patchtext/x-diff; name=pgbench-prp-func-18.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 4c48a58ed2..d2344b029b 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -998,7 +998,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1494,6 +1494,13 @@ SELECT 2 AS two, 3 AS three \gset p_
<entry><literal>pow(2.0, 10)</literal>, <literal>power(2.0, 10)</literal></entry>
<entry><literal>1024.0</literal></entry>
</row>
+ <row>
+ <entry><literal><function>pr_perm(<replaceable>i</replaceable>, <replaceable>size</replaceable> [, <replaceable>seed</replaceable> ] )</function></literal></entry>
+ <entry>integer</entry>
+ <entry>pseudo-random permutation in [0,size)</entry>
+ <entry><literal>pr_perm(0, 4)</literal></entry>
+ <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal> or <literal>3</literal></entry>
+ </row>
<row>
<entry><literal><function>random(<replaceable>lb</replaceable>, <replaceable>ub</replaceable>)</function></literal></entry>
<entry>integer</entry>
@@ -1545,6 +1552,20 @@ SELECT 2 AS two, 3 AS three \gset p_
shape of the distribution.
</para>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that using <application>pgbench</application> non-uniform random functions
+ directly leads to undue correlations: rows with nearby ids come out with
+ similar probability, skewing performance measurements because they also
+ reside in nearby pages.
+ </para>
+ <para>
+ You must use <function>pr_perm</function> to shuffle the selected ids, or
+ some other additional step with similar effect, to avoid these skewing impacts.
+ </para>
+ </note>
+
<itemizedlist>
<listitem>
<para>
@@ -1639,24 +1660,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation
+ on an arbitrary size.
+ It allows shuffling the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are neither collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function raises an error if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+
+ For instance, the following <application>pgbench</application> script
+ simulates possible real world workload typical for social media and
blogging platforms where few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + pr_perm(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
with each other and this is when implicit seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + pr_perm(:r, :size, :default_seed + 123)
+\set k2 1 + pr_perm(:r, :size, :default_seed + 321)
</programlisting>
+
+ A similar behavior can also be approximated with
+ <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
+</programlisting>
+
+ As this hash-modulo version generates collisions, some
+ <literal>id</literal> values would not be reachable and others would come
+ up more often than in the target distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 85d61caa9f..0ea968c94b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 39c1a243d5..3df863e95f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -32,6 +32,13 @@
#endif
#include "postgres_fe.h"
+#include "common/int.h"
+#include "common/logging.h"
+#include "fe_utils/conditional.h"
+#include "getopt_long.h"
+#include "libpq-fe.h"
+#include "portability/instr_time.h"
+#include "port/pg_bitutils.h"
#include <ctype.h>
#include <float.h>
@@ -1057,6 +1064,319 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bit mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+#if !defined(PG_INT128_TYPE)
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
+#endif
+
+#define HIBITS UINT64CONST(0xffffffff00000000)
+#define LOBITS UINT64CONST(0x00000000ffffffff)
+
+static uint64
+modular_multiply_2(const uint64 x, const uint64 y, const uint64 m)
+{
+ /* x = x1 2^32 + x0 with x0 and x1 32 bits */
+ uint64 x1 = (x & HIBITS) >> 32, x0 = x & LOBITS;
+ uint64 y1 = (y & HIBITS) >> 32, y0 = y & LOBITS;
+
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+#if defined(PG_INT128_TYPE)
+ return (PG_INT128_TYPE) x * (PG_INT128_TYPE) y % (PG_INT128_TYPE) m;
+#else
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*-----
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided. The details are explained
+ * in each step.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ *
+ * Note that, in this implementation, the distributive property of modular
+ * operation shown in below is used whenever it can be applied.
+ * (X * Y) % M = X % M * Y % M, (X + Y) % M = X % M + Y % M
+ *
+ * For ease of understanding, an example is shown.
+ * Example: (11 * 5) % 7
+ *
+ * [Notation] ":=" means substitution.
+ *
+ * [initial value]
+ * R := 0b0000
+ *
+ * [i=2]
+ * R := (0b0000 * 2) % 0b0111 = 0b0000
+ * R := (R + 0b1011) % 0b0111 = 0b0100
+ *
+ * [i=1]
+ * R := (0b0100 * 2) % 0b0111 = 0b1000 % 0b0111 = 0b0001
+ *
+ * [i=0]
+ * R := (0b0001 * 2) % 0b0111 = 0b0010
+ * R := (R + 0b1011) % 0b0111 = 0b1101 % 0b0111 = 0b0110
+ *
+ * [result]
+ * R = 6
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*-----
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form below:
+ *
+ * For ease of understanding, an example using one digit decimal number,
+ * where r = 6 and m = 7, is shown.
+ * (6 * 2) % 7 = (7 - 1) * 2 % 7 = (7 % 7 - 1 % 7) * 2 % 7
+ * = (0 - 1) * 2 % 7 = -2 % 7 = 7 - 2 = 5.
+ * Generally, if (r * 2) overflows,
+ * (r * 2) % m = m - (m - r) * 2.
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*-----
+ * To compute (r + x) without overflow: transform to
+ * (r + x) % m, and then to (r + x - m).
+ *
+ * An example using one digit decimal number, where r = 6, x = 5 and
+ * m = 7, is shown.
+ * (6 + 5) % 7 = (6 + (5 - 7 + 7)) % 7 = 6 % 7 + (5 - 7) % 7 + 7 % 7
+ * = 6 + (5 - 7) + 0 = 4.
+ * Generally, if (r + x) overflows,
+ * (r + x) % m = r + (x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+#endif
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which discards primes when
+ * size is a multiple of it.
+ *
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for size up to ~10^12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2394,6 +2714,27 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index fb2c34f512..a95095a27d 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 25ea17f7d1..c4d57221b7 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -384,6 +384,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -511,6 +522,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -824,9 +866,13 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
],
[ 'misc empty script', 1, [qr{empty command list for script}], q{} ],
[
- 'bad boolean', 2,
+ 'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index 66b1bd6ff6..afd40b11f3 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -328,6 +328,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Hi Fabien,
On 2/1/20 5:12 AM, Fabien COELHO wrote:
Attached is an attempt at improving things. I have added an explicit note
and hijacked an existing example to better illustrate the purpose of the
function.
This patch does not build on Linux due to some unused functions and
variables: http://cfbot.cputube.org/patch_27_1712.log
The CF entry has been updated to Waiting on Author.
A few committers have expressed doubts about whether this patch is
needed and it doesn't make sense to keep moving it from CF to CF.
I'm planning to mark this patch RwF on April 8 and I suggest you
resubmit if you are able to get some consensus.
Regards,
--
-David
david@pgmasters.net
Hello David,
Attached is an attempt at improving things. I have added an explicit note
and hijacked an existing example to better illustrate the purpose of the
function.
This patch does not build on Linux due to some unused functions and
variables: http://cfbot.cputube.org/patch_27_1712.log
This link is about some other patch, but indeed there is something amiss
in v18. Attached a v19 which fixes that.
The CF entry has been updated to Waiting on Author.
I put it back to "needs review".
A few committers have expressed doubts about whether this patch is needed
Yep.
The key point is that if you (think that you) do not need it, it is
by definition useless.
If you finally figure out that you need it (IMHO you must for any
benchmark with non-uniform randoms, otherwise performance results are
biased and thus invalid) and it is not available, then you are just stuck.
and it doesn't make sense to keep moving it from CF to CF.
You do as you feel. IMO such a feature is useful and consistent with
providing non-uniform random functions.
I'm planning to mark this patch RwF on April 8 and I suggest you resubmit if
you are able to get some consensus.
People interested in non-uniform benchmarks would see the point. Why many
people would be happy with uniform benchmarks only while life is not
uniform at all fails me.
--
Fabien.
On 2020-Apr-02, Fabien COELHO wrote:
I'm planning to mark this patch RwF on April 8 and I suggest you
resubmit if you are able to get some consensus.
People interested in non-uniform benchmarks would see the point. Why many
people would be happy with uniform benchmarks only while life is not uniform
at all fails me.
I don't think we should boot this patch. I don't think I would be able
to get this over the commit line in this CF, but let's not discard it.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attached is an attempt at improving things. I have added an explicit note and
hijacked an existing example to better illustrate the purpose of the
function.
A significant part of the complexity of the patch is the overflow-handling
implementation of (a * b % c) for 64-bit integers.
However this implementation is probably never used, because either 128-bit
integers are available and the one-line implementation takes precedence, or
the size is small enough (less than 48 bits) so that there are no overflows
with the primes involved, thus the operation is done directly on 64-bit
integers.
I could remove the implementation and replace it with a "not available on
this architecture" message, which would seldom be triggered: you would
have to use a 32-bit arch (?) and test against a table with about 262
Trows, which I guess does not exist anywhere. This approach would remove
about 40% of the code & comments.
Thoughts?
--
Fabien.
On 4/2/20 3:01 AM, Alvaro Herrera wrote:
On 2020-Apr-02, Fabien COELHO wrote:
I'm planning to mark this patch RwF on April 8 and I suggest you
resubmit if you are able to get some consensus.
People interested in non-uniform benchmarks would see the point. Why many
people would be happy with uniform benchmarks only while life is not uniform
at all fails me.
I don't think we should boot this patch. I don't think I would be able
to get this over the commit line in this CF, but let's not discard it.
Understood. I have moved the patch to the 2020-07 CF in Needs Review state.
Regards,
--
-David
david@pgmasters.net
I don't think we should boot this patch. I don't think I would be able
to get this over the commit line in this CF, but let's not discard it.
Understood. I have moved the patch to the 2020-07 CF in Needs Review state.
Attached v19 is a rebase, per cfbot.
--
Fabien.
Attachments:
pgbench-prp-func-19.patch (text/x-diff)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 9f3bb5fce6..ae4da2b2af 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1033,7 +1033,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1850,6 +1850,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <function>pr_perm</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ pseudo-random permutation in <literal>[0,size)</literal>
+ </para>
+ <para>
+ <literal>pr_perm(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<function>random</function> ( <parameter>lb</parameter>, <parameter>ub</parameter> )
@@ -1936,6 +1950,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
shape of the distribution.
</para>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that using <application>pgbench</application> non-uniform random functions
+ directly leads to undue correlations: rows with close ids come out with
+ similar probability, skewing performance measures because they also reside
+ in close pages.
+ </para>
+ <para>
+ You must use <function>pr_perm</function> to shuffle the selected ids, or
+ some other additional step with similar effect, to avoid these skewing impacts.
+ </para>
+ </note>
+
<itemizedlist>
<listitem>
<para>
@@ -2030,24 +2058,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation
+ on an arbitrary size.
+ It allows shuffling the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are neither collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function raises an error if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+
+ For instance, the following <application>pgbench</application> script
+ simulates possible real world workload typical for social media and
blogging platforms where few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + pr_perm(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
with each other and this is when implicit seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + pr_perm(:r, :size, :default_seed + 123)
+\set k2 1 + pr_perm(:r, :size, :default_seed + 321)
</programlisting>
+
+ A similar behavior can also be approximated with
+ <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
+</programlisting>
+
+ As this hash-modulo version generates collisions, some
+ <literal>id</literal> values would not be reachable and others would come up
+ more often than the target distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 85d61caa9f..0ea968c94b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 08a5947a9e..287e95078f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -32,6 +32,13 @@
#endif
#include "postgres_fe.h"
+#include "common/int.h"
+#include "common/logging.h"
+#include "fe_utils/conditional.h"
+#include "getopt_long.h"
+#include "libpq-fe.h"
+#include "portability/instr_time.h"
+#include "port/pg_bitutils.h"
#include <ctype.h>
#include <float.h>
@@ -1061,6 +1068,307 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bit mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+#if !defined(PG_INT128_TYPE)
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
+#endif
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+#if defined(PG_INT128_TYPE)
+ return (PG_INT128_TYPE) x * (PG_INT128_TYPE) y % (PG_INT128_TYPE) m;
+#else
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*-----
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided. The details are explained
+ * in each step.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ *
+ * Note that, in this implementation, the distributive property of modular
+ * operation shown in below is used whenever it can be applied.
+ * (X * Y) % M = X % M * Y % M, (X + Y) % M = X % M + Y % M
+ *
+ * For ease of understanding, an example is shown.
+ * Example: (11 * 5) % 7
+ *
+ * [Notation] ":=" means substitution.
+ *
+ * [initial value]
+ * R := 0b0000
+ *
+ * [i=2]
+ * R := (0b0000 * 2) % 0b0111 = 0b0000
+ * R := (R + 0b1011) % 0b0111 = 0b0100
+ *
+ * [i=1]
+ * R := (0b0100 * 2) % 0b0111 = 0b1000 % 0b0111 = 0b0001
+ *
+ * [i=0]
+ * R := (0b0001 * 2) % 0b0111 = 0b0010
+ * R := (R + 0b1011) % 0b0111 = 0b1101 % 0b0111 = 0b0110
+ *
+ * [result]
+ * R = 6
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*-----
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form below:
+ *
+ * For ease of understanding, an example using one digit decimal number,
+ * where r = 6 and m = 7, is shown.
+ * (6 * 2) % 7 = (7 - 1) * 2 % 7 = (7 % 7 - 1 % 7) * 2 % 7
+ * = (0 - 1) * 2 % 7 = -2 % 7 = 7 - 2 = 5.
+ * Generally, if (r * 2) overflows,
+ * (r * 2) % m = m - (m - r) * 2.
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*-----
+ * To compute (r + x) without overflow: transform to
+ * (r + x) % m, and then to (r + x - m).
+ *
+ * An example using one digit decimal number, where r = 6, x = 5 and
+ * m = 7, is shown.
+ * (6 + 5) % 7 = (6 + (5 - 7 + 7)) % 7 = 6 % 7 + (5 - 7) % 7 + 7 % 7
+ * = 6 + (5 - 7) + 0 = 4.
+ * Generally, if (r + x) overflows,
+ * (r + x) % m = r + (x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+#endif
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which discards primes when
+ * size is a multiple of it.
+ *
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for size up to ~10^12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2398,6 +2706,27 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index fb2c34f512..a95095a27d 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 52009c3524..7d4c51ab21 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -467,6 +467,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -594,6 +605,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -955,6 +997,10 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index e38c7d77d1..c1a9f7f403 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Attached v19 is a rebase, per cfbot.
Attached v20 fixes a doc xml typo, per cfbot again.
--
Fabien.
Attachments:
pgbench-prp-func-20.patchtext/x-diff; name=pgbench-prp-func-20.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 9f3bb5fce6..d4a604c6fa 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1033,7 +1033,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1850,6 +1850,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <function>pr_perm</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ pseudo-random permutation in <literal>[0,size)</literal>
+ </para>
+ <para>
+ <literal>pr_perm(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<function>random</function> ( <parameter>lb</parameter>, <parameter>ub</parameter> )
@@ -1936,6 +1950,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
shape of the distribution.
</para>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that using <application>pgbench</application> non-uniform random functions
+ directly leads to undue correlations: rows with close ids come out with
+ similar probability, skewing performance measures because they also reside
+ in close pages.
+ </para>
+ <para>
+ You must use <function>pr_perm</function> to shuffle the selected ids, or
+ some other additional step with similar effect, to avoid these skewing impacts.
+ </para>
+ </note>
+
<itemizedlist>
<listitem>
<para>
@@ -2030,24 +2058,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation
+ on an arbitrary size.
+ It allows shuffling the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are neither collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function raises an error if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+
+ For instance, the following <application>pgbench</application> script
+ simulates a possible real-world workload typical of social media and
 blogging platforms, where a few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + pr_perm(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
with each other and this is when implicit seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + pr_perm(:r, :size, :default_seed + 123)
+\set k2 1 + pr_perm(:r, :size, :default_seed + 321)
</programlisting>
+
+ A similar behavior can also be approximated with
+ <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
+</programlisting>
+
+ As this hash-modulo version generates collisions, some
+ <literal>id</literal> would not be reachable and others would come more often
+ than the target distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 85d61caa9f..0ea968c94b 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 08a5947a9e..287e95078f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -32,6 +32,13 @@
#endif
#include "postgres_fe.h"
+#include "common/int.h"
+#include "common/logging.h"
+#include "fe_utils/conditional.h"
+#include "getopt_long.h"
+#include "libpq-fe.h"
+#include "portability/instr_time.h"
+#include "port/pg_bitutils.h"
#include <ctype.h>
#include <float.h>
@@ -1061,6 +1068,307 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bit mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+#if !defined(PG_INT128_TYPE)
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
+#endif
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ *
+ * Use improved interleaved modular multiplication algorithm to avoid
+ * overflows when necessary.
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+#if defined(PG_INT128_TYPE)
+ return (PG_INT128_TYPE) x * (PG_INT128_TYPE) y % (PG_INT128_TYPE) m;
+#else
+ int i,
+ bits;
+ uint64 r;
+
+ Assert(m >= 1);
+
+ /* Because of (x * y) % m = (x % m * y % m) % m */
+ if (x >= m)
+ x %= m;
+ if (y >= m)
+ y %= m;
+
+ /* Return the trivial result. */
+ if (x == 0 || y == 0 || m == 1)
+ return 0;
+
+ /* Return the result if (x * y) can be multiplied without overflow. */
+ if (nbits(x) + nbits(y) <= 64)
+ return (x * y) % m;
+
+ /* To reduce the for loop in the algorithm below, ensure y <= x. */
+ if (x < y)
+ {
+ uint64 tmp = x;
+
+ x = y;
+ y = tmp;
+ }
+
+ /*-----
+ * Interleaved modular multiplication algorithm from:
+ *
+ * D.N. Amanor et al., "Efficient hardware architecture for modular
+ * multiplication on FPGAs", in International Conference on Field
+ * Programmable Logic and Applications, Aug 2005, pp. 539-542.
+ *
+ * This algorithm is usually used in the field of digital circuit design.
+ *
+ * Input: X, Y, M; 0 <= X, Y <= M;
+ * Output: R = X * Y mod M;
+ * bits: number of bits of Y
+ * Y[i]: i th bit of Y
+ *
+ * 1. R = 0;
+ * 2. for (i = bits - 1; i >= 0; i--) {
+ * 3. R = 2 * R;
+ * 4. if (Y[i] == 0x1)
+ * 5. R += X;
+ * 6. if (R >= M) R -= M;
+ * 7. if (R >= M) R -= M;
+ * }
+ *
+ * In Steps 3 and 5, overflow should be avoided. The details are explained
+ * in each step.
+ * Steps 6 and 7 can be instead a modular operation (R %= M).
+ *
+ * Note that, in this implementation, the distributive property of the
+ * modular operation shown below is used whenever it can be applied.
+ * (X * Y) % M = ((X % M) * (Y % M)) % M,
+ * (X + Y) % M = ((X % M) + (Y % M)) % M
+ *
+ * For ease of understanding, an example is shown.
+ * Example: (11 * 5) % 7
+ *
+ * [Notation] ":=" means substitution.
+ *
+ * [initial value]
+ * R := 0b0000
+ *
+ * [i=2]
+ * R := (0b0000 * 2) % 0b0111 = 0b0000
+ * R := (R + 0b1011) % 0b0111 = 0b0100
+ *
+ * [i=1]
+ * R := (0b0100 * 2) % 0b0111 = 0b1000 % 0b0111 = 0b0001
+ *
+ * [i=0]
+ * R := (0b0001 * 2) % 0b0111 = 0b0010
+ * R := (R + 0b1011) % 0b0111 = 0b1101 % 0b0111 = 0b0110
+ *
+ * [result]
+ * R = 6
+ */
+
+ bits = nbits(y);
+ r = 0;
+
+ for (i = bits - 1; i >= 0; i--)
+ {
+ if (r > UINT64CONST(0x7fffffffffffffff))
+ /*-----
+ * To avoid overflow, transform from (2 * r) to (2 * r) % m, and
+ * further transform to mathematically equivalent form below:
+ *
+ * For ease of understanding, an example using one digit decimal number,
+ * where r = 6 and m = 7, is shown.
+ * (6 * 2) % 7 = (7 - 1) * 2 % 7 = (7 % 7 - 1 % 7) * 2 % 7
+ * = (0 - 1) * 2 % 7 = -2 % 7 = 7 - 2 = 5.
+ * Generally, if (r * 2) overflows,
+ * (r * 2) % m = m - (m - r) * 2.
+ */
+ r = m - ((m - r) << 1);
+ else
+ r <<= 1;
+
+ if ((y >> i) & 0x1)
+ {
+ if (r > UINT64CONST(0xffffffffffffffff) - x)
+ /*-----
+ * To compute (r + x) without overflow: transform to
+ * (r + x) % m, and then to (r + x - m).
+ *
+ * An example using one digit decimal number, where r = 6, x = 5 and
+ * m = 7, is shown.
+ * (6 + 5) % 7 = (6 + (5 - 7 + 7)) % 7 = 6 % 7 + (5 - 7) % 7 + 7 % 7
+ * = 6 + (5 - 7) + 0 = 4.
+ * Generally, if (r + x) overflows,
+ * (r + x) % m = r + (x - m).
+ */
+ r += x - m;
+ else
+ r += x;
+ }
+
+ r %= m;
+ }
+
+ return r;
+#endif
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p and size are coprime, which
+ * is ensured in the code by the while loop, which skips any prime
+ * that divides size.
+ *
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for v below 2^40 (~1E12) */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2398,6 +2706,27 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index fb2c34f512..a95095a27d 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 52009c3524..7d4c51ab21 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -467,6 +467,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -594,6 +605,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -955,6 +997,10 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index e38c7d77d1..c1a9f7f403 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
There are a few "Stripping trailing CRs from the patch" warnings and one "Hunk succeeded offset -2 lines" warning.
Other than that, the patch applies successfully and works as it claims.
The new status of this patch is: Ready for Committer
Hi Álvaro,
On 4/2/20 3:01 AM, Alvaro Herrera wrote:
On 2020-Apr-02, Fabien COELHO wrote:
I'm planning to mark this patch RwF on April 8 and I suggest you
resubmit if you are able to get some consensus.
People interested in non-uniform benchmarks would see the point. Why many
people would be happy with uniform benchmarks only while life is not uniform
at all fails me.
I don't think we should boot this patch. I don't think I would be able
to get this over the commit line in this CF, but let's not discard it.
As far as I can see you are the only committer who has shown real
interest in this patch. It's been sitting idle for the last year.
What are your current thoughts?
Regards,
--
-David
david@pgmasters.net
Hi David
On 2021-Mar-05, David Steele wrote:
On 4/2/20 3:01 AM, Alvaro Herrera wrote:
On 2020-Apr-02, Fabien COELHO wrote:
I'm planning to mark this patch RwF on April 8 and I suggest you
resubmit if you are able to get some consensus.
People interested in non-uniform benchmarks would see the point. Why many
people would be happy with uniform benchmarks only while life is not uniform
at all fails me.
I don't think we should boot this patch. I don't think I would be able
to get this over the commit line in this CF, but let's not discard it.
As far as I can see you are the only committer who has shown real interest
in this patch. It's been sitting idle for the last year.
What are your current thoughts?
Thanks for prodding. I still think it's a useful feature. However I
don't think I'll have the time to get it done on the current commitfest.
I suggest to let it sit in the commitfest to see if somebody else will
pick it up -- and if not, we move it to the next one, with apologies to
author and reviewers.
I may have time to become familiar or at least semi-comfortable with all
that weird math in it by then.
--
Álvaro Herrera 39°49'30"S 73°17'W
"At least to kernel hackers, who really are human, despite occasional
rumors to the contrary" (LWN.net)
What are your current thoughts?
Thanks for prodding. I still think it's a useful feature. However I
don't think I'll have the time to get it done on the current commitfest.
I suggest to let it sit in the commitfest to see if somebody else will
pick it up -- and if not, we move it to the next one, with apologies to
author and reviewers.
I may have time to become familiar or at least semi-comfortable with all
that weird math in it by then.
Yep.
Generating parametric, good-quality, low-cost (but not
cryptographically secure) pseudo-random permutations on arbitrary sizes
(not just power-of-two sizes) is not a trivial task. I had to be quite
creative to achieve it, hence the "weird" maths. I had a lot of bad,
not-really-working ideas before reaching the current state of the patch.
The code could be simplified if we assume that PG_INT128_TYPE will be
available on all relevant architectures, and accept that the feature is
unavailable otherwise.
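For illustration only, here is a minimal sketch of what the multiply-modulo step reduces to when a 128-bit integer type is available. It assumes the compiler provides `unsigned __int128` (GCC and Clang do on 64-bit targets), and the name `mulmod128` is hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute (x * y) % m without overflow by widening the product to 128 bits.
 * Assumption: unsigned __int128 is supported by the compiler.
 */
static uint64_t
mulmod128(uint64_t x, uint64_t y, uint64_t m)
{
	assert(m >= 1);
	return (uint64_t) (((unsigned __int128) x * y) % m);
}
```

With this, the long interleaved fallback is only needed on platforms lacking a 128-bit type.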
--
Fabien.
On Mon, Mar 8, 2021 at 11:50:43AM +0100, Fabien COELHO wrote:
What are your current thoughts?
Thanks for prodding. I still think it's a useful feature. However I
don't think I'll have the time to get it done on the current commitfest.
I suggest to let it sit in the commitfest to see if somebody else will
pick it up -- and if not, we move it to the next one, with apologies to
author and reviewers.
I may have time to become familiar or at least semi-comfortable with all
that weird math in it by then.
Yep.
Generating parametric, good-quality, low-cost (but not
cryptographically secure) pseudo-random permutations on arbitrary sizes
(not just power-of-two sizes) is not a trivial task. I had to be quite
creative to achieve it, hence the "weird" maths. I had a lot of bad,
not-really-working ideas before reaching the current state of the patch.
The code could be simplified if we assume that PG_INT128_TYPE will be
available on all relevant architectures, and accept that the feature is
unavailable otherwise.
Maybe Dean Rasheed can help because of his math background --- CC'ing him.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Thu, 11 Mar 2021 at 00:58, Bruce Momjian <bruce@momjian.us> wrote:
Maybe Dean Rasheed can help because of his math background --- CC'ing him.
Reading the thread I can see how such a function might be useful to
scatter non-uniformly random values.
The implementation looks plausible too, though it adds quite a large
amount of new code. The main thing that concerns me is justifying the
code. With this kind of thing, it's all too easy to overlook corner
cases and end up with trivial sequences in certain special cases. I'd
feel better about that if we were implementing a standard algorithm
with known pedigree.
Thinking about the use case for this, it seems that it's basically
designed to turn a set of non-uniform random numbers (produced by
random_exponential() et al.) into another set of non-uniform random
numbers, where the non-uniformity is scattered so that the more/less
common values aren't all clumped together.
I'm wondering if that's something that can't be done more simply by
passing the non-uniform random numbers through the uniform random
number generator to scatter them uniformly across some range -- e.g.,
given an integer n, return the n'th value from the sequence produced
by random(), starting from some initial seed -- i.e., implement
nth_random(lb, ub, seed, n). That would actually be pretty
straightforward to implement using O(log(n)) time to execute (see the
attached python example), though it wouldn't generate a permutation,
so it'd need a bit of thought to see if it met the requirements.
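As a rough sketch of this idea (not the attached example), the n'th value of a linear congruential sequence can be reached in O(log(n)) steps by composing the affine map x -> a*x + c with itself by repeated squaring. The generator below is an assumption: a 64-bit LCG with Knuth's constants, not pgbench's actual random(); the names `lcg_step`, `lcg_nth` and the body of `nth_random` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LCG_MUL UINT64_C(6364136223846793005)
#define LCG_INC UINT64_C(1442695040888963407)

/* One step of the assumed generator; arithmetic wraps modulo 2^64. */
static uint64_t
lcg_step(uint64_t x)
{
	return x * LCG_MUL + LCG_INC;
}

/* State after n steps from seed, in O(log n) multiplications. */
static uint64_t
lcg_nth(uint64_t seed, uint64_t n)
{
	uint64_t	cur_mul = LCG_MUL,	/* map for 2^k steps: x -> cur_mul*x + cur_inc */
				cur_inc = LCG_INC;
	uint64_t	acc_mul = 1,		/* accumulated map, starts as identity */
				acc_inc = 0;

	while (n > 0)
	{
		if (n & 1)
		{
			/* acc = cur o acc */
			acc_mul = acc_mul * cur_mul;
			acc_inc = acc_inc * cur_mul + cur_inc;
		}
		/* cur = cur o cur; the increment must use the old multiplier */
		cur_inc = (cur_mul + 1) * cur_inc;
		cur_mul = cur_mul * cur_mul;
		n >>= 1;
	}
	return acc_mul * seed + acc_inc;
}

/* Map the n'th state into [lb, ub]; the modulo is slightly biased. */
static int64_t
nth_random(int64_t lb, int64_t ub, uint64_t seed, uint64_t n)
{
	uint64_t	range = (uint64_t) ub - (uint64_t) lb + 1;

	assert(lb <= ub && range != 0);
	return (int64_t) ((uint64_t) lb + (lcg_nth(seed, n) >> 13) % range);
}
```

Two distinct n may map to the same output, so this scatters values but is not a permutation.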
Regards,
Dean
Attachments:
The algorithm for generating a random permutation with a uniform
distribution across all permutations is easy:
for (i = 0; i < n; i++) {
    swap a[n-1-i] with a[rand(n-i)]
}
where 0 <= rand(x) < x and a[i] is initially i (see Knuth, Section 3.4.2,
Algorithm P)
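A minimal 0-indexed C sketch of this shuffle follows. The generator `rand_below` is a stand-in assumption (a small LCG with a slightly biased modulo), and `shuffle_identity` is a hypothetical name; any generator returning a value in [0, x) could be substituted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t rng_state = UINT64_C(5432);

/* Return a value in [0, x); slightly biased by the modulo, fine for a sketch. */
static uint64_t
rand_below(uint64_t x)
{
	assert(x > 0);
	rng_state = rng_state * UINT64_C(6364136223846793005)
		+ UINT64_C(1442695040888963407);
	return (rng_state >> 33) % x;
}

/* Fill a[0..n-1] with 0..n-1, then shuffle it in place. */
static void
shuffle_identity(int64_t *a, size_t n)
{
	for (size_t i = 0; i < n; i++)
		a[i] = (int64_t) i;

	/* walk from the last slot down, swapping with a random earlier slot */
	for (size_t i = n; i > 1; i--)
	{
		size_t		j = (size_t) rand_below(i);
		int64_t		tmp = a[i - 1];

		a[i - 1] = a[j];
		a[j] = tmp;
	}
}
```

The whole array has to be materialized first, so memory and setup cost grow linearly with n.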
David Bowen
On Thu, 11 Mar 2021 at 19:06, David Bowen <dmb0317@gmail.com> wrote:
The algorithm for generating a random permutation with a uniform distribution across all permutations is easy:
for (i = 0; i < n; i++) {
    swap a[n-1-i] with a[rand(n-i)]
}
where 0 <= rand(x) < x and a[i] is initially i (see Knuth, Section 3.4.2, Algorithm P)
True, but n can be very large, so that might use a lot of memory and
involve a lot of pre-processing.
Regards,
Dean
Hello Dean,
The implementation looks plausible too, though it adds quite a large
amount of new code.
A significant part of this new code is the multiply-modulo
implementation, which can be dropped if we assume that the target has
int128 available, and accept that the feature is not available otherwise.
Also, there are quite a lot of comments which add to the code length.
The main thing that concerns me is justifying the code. With this kind
of thing, it's all too easy to overlook corner cases and end up with
trivial sequences in certain special cases. I'd feel better about that
if we were implementing a standard algorithm with known pedigree.
Yep. I did not find anything convincing with the requirements: generate a
permutation, can be parametric, low constant cost, good quality, work on
arbitrary sizes…
Thinking about the use case for this, it seems that it's basically
designed to turn a set of non-uniform random numbers (produced by
random_exponential() et al.) into another set of non-uniform random
numbers, where the non-uniformity is scattered so that the more/less
common values aren't all clumped together.
Yes.
I'm wondering if that's something that can't be done more simply by
passing the non-uniform random numbers through the uniform random
number generator to scatter them uniformly across some range -- e.g.,
given an integer n, return the n'th value from the sequence produced
by random(), starting from some initial seed -- i.e., implement
nth_random(lb, ub, seed, n). That would actually be pretty
straightforward to implement using O(log(n)) time to execute (see the
attached python example), though it wouldn't generate a permutation,
so it'd need a bit of thought to see if it met the requirements.
Indeed, this violates two requirements: constant cost & permutation.
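For illustration, here is a minimal sketch of the nth_random(lb, ub, seed, n) idea under discussion. It is not the Python attachment mentioned above, which is not reproduced in this thread. It assumes a 64-bit linear congruential generator with Knuth's constants as the underlying random(), and the helper name `lcg_state_after` is hypothetical. Skipping ahead n steps takes O(log n) by repeatedly squaring the affine map x -> a*x + c.

```python
MASK64 = (1 << 64) - 1
# Knuth's 64-bit LCG constants (assumed here as the underlying generator)
LCG_MUL = 6364136223846793005
LCG_INC = 1442695040888963407


def lcg_state_after(seed, n):
    """State reached after n LCG steps from seed, in O(log n) operations."""
    acc_mul, acc_inc = 1, 0                     # identity map: x -> x
    cur_mul, cur_inc = LCG_MUL, LCG_INC         # one step: x -> a*x + c
    while n > 0:
        if n & 1:
            # compose the accumulated map with the current power of the step
            acc_mul, acc_inc = ((acc_mul * cur_mul) & MASK64,
                                (acc_inc * cur_mul + cur_inc) & MASK64)
        # square the current map (apply it twice)
        cur_mul, cur_inc = ((cur_mul * cur_mul) & MASK64,
                            (cur_inc * cur_mul + cur_inc) & MASK64)
        n >>= 1
    return (acc_mul * seed + acc_inc) & MASK64


def nth_random(lb, ub, seed, n):
    """n-th value in [lb, ub] of the sequence started at seed."""
    state = lcg_state_after(seed & MASK64, n)
    # keep the high bits, which behave better than the low bits of an LCG
    return lb + (state >> 32) % (ub - lb + 1)
```

The cost grows with the number of bits of n rather than being constant, and nothing prevents two distinct values of n from mapping to the same output, so the result is not a permutation.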
--
Fabien.
On Mon, Mar 8, 2021 at 11:50 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
I may have time to become familiar or at least semi-comfortable with all
that weird math in it by then.
Yep.
Generating parametric, good-quality, low-cost (but not
cryptographically secure) pseudo-random permutations on arbitrary sizes
(not just power-of-two sizes) is not a trivial task. I had to be quite
creative to achieve it, hence the "weird" maths. I had a lot of bad
not-really-working ideas before the current status of the patch.
The code could be simplified if we assume that PG_INT128_TYPE will be
available on all relevant architectures, and accept that the feature is
not available otherwise.
That doesn't sound like a bad option to me, if it makes this much
simpler. The main modern system without it seems to be MSVC. The
Linux, BSD, Apple, illumos, AIX systems using Clang/GCC with
Intel/AMD/ARM/PowerPC CPUs have it, and the Windows systems using open
source compilers have it.
On 2021-Mar-13, Thomas Munro wrote:
That doesn't sound like a bad option to me, if it makes this much
simpler. The main modern system without it seems to be MSVC. The
Linux, BSD, Apple, illumos, AIX systems using Clang/GCC with
Intel/AMD/ARM/PowerPC CPUs have it, and the Windows systems using open
source compilers have it.
Hmm. Can we go a bit further, and say that if you don't have 128-bit
ints, then you can use pr_perm but only to a maximum of 32-bit ints?
Then you can do the calculations in 64-bit ints. That's much less bad
than desupporting the feature altogether.
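The fallback suggested above amounts to keeping every intermediate value within 64 bits. The sketch below is only an illustration: `modmul_no128` is a hypothetical name, the asserts stand in for the overflow a C uint64 would suffer (Python integers are unbounded), and the shift-and-add branch is a generic alternative for larger moduli, not something taken from the patch.

```python
U64_MAX = (1 << 64) - 1


def modmul_no128(x, y, m):
    """(x * y) % m with every intermediate value kept within 64 bits.

    Assumes 1 <= m <= 2**63.
    """
    assert 1 <= m <= 1 << 63
    x %= m
    y %= m
    if m <= 1 << 32:
        # both operands are below 2**32, so the product fits in 64 bits
        product = x * y
        assert product <= U64_MAX
        return product % m
    # larger modulus: shift-and-add, consuming one bit of y per iteration
    result = 0
    while y > 0:
        if y & 1:
            total = result + x      # < 2 * m <= 2**64
            assert total <= U64_MAX
            result = total % m
        doubled = x + x             # < 2 * m <= 2**64
        assert doubled <= U64_MAX
        x = doubled % m
        y >>= 1
    return result
```

The first branch is a single multiplication, which is why restricting sizes to 32 bits is attractive. The second branch needs up to 64 iterations per call.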
--
Álvaro Herrera 39°49'30"S 73°17'W
"No necesitamos banderas
No reconocemos fronteras" (Jorge González)
Hello Alvaro,
That doesn't sound like a bad option to me, if it makes this much
simpler. The main modern system without it seems to be MSVC. The
Linux, BSD, Apple, illumos, AIX systems using Clang/GCC with
Intel/AMD/ARM/PowerPC CPUs have it, and the Windows systems using open
source compilers have it.
Hmm. Can we go a bit further, and say that if you don't have 128-bit
ints, then you can use pr_perm but only to a maximum of 32-bit ints?
Then you can do the calculations in 64-bit ints. That's much less bad
than desupporting the feature altogether.
This looks like a good compromise, which can even be a little better
because we still have 64-bit ints.
As there is already a performance shortcut in the code relying on the fact
that x is a 24-bit prime and that no overflow occurs if y is at most 40 bits
(the mask tested in the code; we are using unsigned ints), the
simplification is just to abort if int128 is not available, because the
function is only called when it is really needed.
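The bound behind that shortcut can be checked with a few lines of arithmetic. This is a sketch under the assumption that the constants match the attached patch: the largest entry of the primes[] table, the 0xffffffffff mask tested on v, and LCG_SHIFT set to 13. The function names are illustrative.

```python
U64_MAX = (1 << 64) - 1
LARGEST_PRIME = 16252967          # last entry of the primes[] table
SHORTCUT_MASK = 0xFFFFFFFFFF      # 40 bits: the shortcut applies when v fits
LCG_SHIFT = 13


def shortcut_worst_case():
    """Largest value that primes[p] * v + (key >> LCG_SHIFT) can reach."""
    largest_v = SHORTCUT_MASK
    largest_key_part = U64_MAX >> LCG_SHIFT     # 2**51 - 1
    return LARGEST_PRIME * largest_v + largest_key_part


def shortcut_is_safe():
    """True if the worst case still fits in an unsigned 64-bit integer."""
    return shortcut_worst_case() <= U64_MAX
```

Since every prime in the table is below 2^24, the product with a 40-bit v stays below 2^64 with enough headroom left for the added key bits, so the plain 64-bit expression is exact in that case.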
Attached a simplified patch which does that, removing 1/3 of the code.
What do you think?
Note that this creates one issue though: tests now fail if 128-bit ints
are not available. I'm not sure how to detect & skip that from the TAP
tests. I'm not keen on removing the tests either. I will give it some
thought if you want to proceed in that direction.
--
Fabien.
Attachments:
pgbench-prp-func-21.patch (text/x-diff)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 299d93b241..3fe06b1fe3 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1874,6 +1874,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <function>pr_perm</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ pseudo-random permutation in <literal>[0,size)</literal>
+ </para>
+ <para>
+ <literal>pr_perm(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<function>random</function> ( <parameter>lb</parameter>, <parameter>ub</parameter> )
@@ -1960,6 +1974,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
shape of the distribution.
</para>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that using <application>pgbench</application> non-uniform random functions
+ directly leads to undue correlations: rows with close ids come out with
+ similar probability, skewing performance measures because they also reside
+ in close pages.
+ </para>
+ <para>
+ You must use <function>pr_perm</function> to shuffle the selected ids, or
+ some other additional step with similar effect, to avoid these skewing impacts.
+ </para>
+ </note>
+
<itemizedlist>
<listitem>
<para>
@@ -2054,24 +2082,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ Function <literal>pr_perm</literal> implements a pseudo-random permutation
+ on an arbitrary size.
+ It allows shuffling the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are neither collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function raises an error if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+
+ For instance, the following <application>pgbench</application> script
+ simulates possible real world workload typical for social media and
blogging platforms where few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + pr_perm(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
with each other and this is when implicit seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + pr_perm(:r, :size, :default_seed + 123)
+\set k2 1 + pr_perm(:r, :size, :default_seed + 321)
</programlisting>
+
+ A similar behavior can also be approximated with
+ <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
+</programlisting>
+
+ As this hash-modulo version generates collisions, some
+ <literal>id</literal> would not be reachable and others would come more often
+ than the target distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 4d529ea550..7c1870a375 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PRPERM (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "pr_perm", PGBENCH_NARGS_PRPERM, PGBENCH_PRPERM
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PRPERM:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f6a214669c..9624f77f91 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -32,6 +32,13 @@
#endif
#include "postgres_fe.h"
+#include "common/int.h"
+#include "common/logging.h"
+#include "fe_utils/conditional.h"
+#include "getopt_long.h"
+#include "libpq-fe.h"
+#include "portability/instr_time.h"
+#include "port/pg_bitutils.h"
#include <ctype.h>
#include <float.h>
@@ -1124,6 +1131,180 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* pseudo-random permutation */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PRP_PRIMES 16
+/*
+ * 24 bit mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 15], is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PRP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PRP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+#if !defined(PG_INT128_TYPE)
+/* length of n binary representation */
+static int
+nbits(uint64 n)
+{
+ /* set lower bits to 1 and count them */
+ return pg_popcount64(compute_mask(n));
+}
+#endif
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+#if defined(PG_INT128_TYPE)
+ return (PG_INT128_TYPE) x * (PG_INT128_TYPE) y % (PG_INT128_TYPE) m;
+#else
+ pg_log_fatal("modular_multiply not implemented on this platform");
+ pg_log_info("128 bits integer arithmetic not available");
+ exit(1);
+#endif
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * PRP: parametric pseudo-random permutation
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+pseudorandom_perm(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) scramble: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which discards primes when
+ * size is a multiple of it.
+ *
+ */
+ for (unsigned int i = 0, p = key % PRP_PRIMES; i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PRP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for v up to 2^40 (~1.1E12) */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT < 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2472,6 +2653,27 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PRPERM:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "pr_perm size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, pseudorandom_perm(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 3a9d89e6f1..7898502c47 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PRPERM
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index daffc18e52..684d299ca1 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -467,6 +467,18 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ # PRP tests
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: boolean true\b},
+ qr{command=113.: int 9223372036854775797\b},
+ qr{command=114.: boolean true\b},
],
'pgbench expressions',
{
@@ -594,6 +606,37 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- pseudo-random permutation
+\set t debug(pr_perm(0, 2) + pr_perm(1, 2) = 1)
+\set t debug(pr_perm(0, 3) + pr_perm(1, 3) + pr_perm(2, 3) = 3)
+\set t debug(pr_perm(0, 4) + pr_perm(1, 4) + pr_perm(2, 4) + pr_perm(3, 4) = 6)
+\set t debug(pr_perm(0, 5) + pr_perm(1, 5) + pr_perm(2, 5) + pr_perm(3, 5) + pr_perm(4, 5) = 10)
+\set t debug(pr_perm(0, 16) + pr_perm(1, 16) + pr_perm(2, 16) + pr_perm(3, 16) + \
+ pr_perm(4, 16) + pr_perm(5, 16) + pr_perm(6, 16) + pr_perm(7, 16) + \
+ pr_perm(8, 16) + pr_perm(9, 16) + pr_perm(10, 16) + pr_perm(11, 16) + \
+ pr_perm(12, 16) + pr_perm(13, 16) + pr_perm(14, 16) + pr_perm(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p pr_perm(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = pr_perm(:v + :size, :size) and :p <> pr_perm(:v + 1, :size))
+-- actual values
+\set t debug(pr_perm(:v, 1) = 0)
+\set t debug(pr_perm(0, 2, 5432) = 0 and pr_perm(1, 2, 5432) = 1 and \
+ pr_perm(0, 2, 5431) = 1 and pr_perm(1, 2, 5431) = 0)
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(pr_perm(0, 1000000000000000, 54321) = 9406454989259 and \
+ pr_perm(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ pr_perm(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(pr_perm(:size-1, :size, 5432) = 7214172101785397543 and \
+ pr_perm(:size-2, :size, 5432) = 4060085360303920649 and \
+ pr_perm(:size-3, :size, 5432) = 919477102797071521 and \
+ pr_perm(:size-4, :size, 5432) = 7588404289602340497 and \
+ pr_perm(:size-5, :size, 5432) = 5568354808723584469 and \
+ pr_perm(:size-6, :size, 5432) = 2410458883211907565 and \
+ pr_perm(:size-7, :size, 5432) = 1738667467693569814)
}
});
@@ -955,6 +998,10 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid pr_perm size', 2,
+ [qr{pr_perm size parameter must be >= 1}], q{\set i pr_perm(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index e38c7d77d1..c1a9f7f403 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-1.sql' => "\\set i pr_perm(1)\n" }
+ ],
+ [
+ 'too many arguments for pr_perm',
+ [qr{unexpected number of arguments \(pr_perm\)}],
+ { 'bad-pr_perm-2.sql' => "\\set i pr_perm(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Hello,
On 2021-Mar-13, Fabien COELHO wrote:
This looks like a good compromise, which can even be a little better because
we still have 64 bits ints.
Yeah, we rely on those being available elsewhere.
Attached a simplified patch which does that, removing 1/3 of the code. What
do you think?
Clearly we got rid of a lot of code. About the tests, maybe the easiest
thing to do is "skip ... if $Config{'osname'} eq 'MSWin32'".
One comment in pseudorandom_perm talks about "whitening" while the other
talks about "scramble". I think they should both use the same term to
ease understanding.
You kept routine nbits() which is unneeded now.
pgbench.c gains too many includes for my taste. Can we put
pseudorandom_perm() in a separate file?
The function name pr_perm() is much too short; I think we should use a
more descriptive name; maybe \prand_perm is sufficient? Though I admit
the abbreviation "perm" evokes "permission" more than "permutation" to
me, so maybe \prand_permutation is better? (If you're inclined to
abbreviate permutation, maybe \prand_permut?)
I think the reference docs are not clear enough. What do you think of
something like this?
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <function>pr_perm</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ i'th pseudo-random permutation in <literal>[0,size)</literal>
+ <!-- "The i'th value of the pseudo-random permutation in the interval [0,size)"? -->
+ </para>
+ <para>
+ <literal>pr_perm(2, 4)</literal>
+ <returnvalue>the 2nd integer between 0 and (4-1)</returnvalue>
+ </para></entry>
+ </row>
--
Álvaro Herrera Valdivia, Chile
Hello Alvaro,
Clearly we got rid of a lot of code. About the tests, maybe the easiest
thing to do is "skip ... if $Config{'osname'} eq 'MSWin32'".
I tried that.
One comment in pseudorandom_perm talks about "whitening" while the other
talks about "scramble". I think they should both use the same term to
ease understanding.
Good point.
You kept routine nbits() which is unneeded now.
Indeed.
pgbench.c gains too many includes for my taste. Can we put
pseudorandom_perm() in a separate file?
The refactoring I was thinking about for some time now is to separate
expression evaluation entirely, not just one function, and possibly other
parts as well. I suggest that this should wait for another submission.
The function name pr_perm() is much too short; I think we should use a
more descriptive name; maybe \prand_perm is sufficient? Though I admit
the abbreviation "perm" evokes "permission" more than "permutation" to
me, so maybe \prand_permutation is better? (If you're inclined to
abbreviate permutation, maybe \prand_permut?)
What about simply "permute"? Pgbench is unlikely to get another permute
function anyway.
I think the reference docs are not clear enough. What do you think of
something like this?
I agree that the documentation is not clear enough. The proposal would not
help me understand what the function does, though. I've tried to improve the
explanation based on Wikipedia's description of permutations.
See attached v22.
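To make the documented behaviour concrete, the rounds of the patch can be transliterated to Python as below. This is a hedged sketch: the name `permute_sketch` is illustrative, the constants are copied from the attached patch, and the output has not been compared against a compiled pgbench. What it does show is that each round (two overlapping xor "whitening" halves, then a multiply-and-add by a prime modulo size) is bijective on [0, size).

```python
MASK64 = (1 << 64) - 1
PRIMES = (
    8388617, 8912921, 9437189, 9961487, 10485767, 11010059, 11534351,
    12058679, 12582917, 13107229, 13631489, 14155777, 14680067, 15204391,
    15728681, 16252967,
)
ROUNDS = 4
LCG_MUL = 6364136223846793005
LCG_INC = 1442695040888963407
LCG_SHIFT = 13


def next_key(key):
    """One step of the 64-bit LCG used to refresh the key."""
    return (key * LCG_MUL + LCG_INC) & MASK64


def permute_sketch(data, size, seed):
    """Map data to its position in a seed-dependent permutation of [0, size)."""
    if size < 1:
        raise ValueError("size must be >= 1")
    if size == 1:
        return 0
    v = (data & MASK64) % size          # mimic the C cast to uint64
    key = seed & MASK64
    # all-ones mask covering size - 1, halved: two possibly overlapping halves
    mask = ((1 << (size - 1).bit_length()) - 1) >> 1
    p = key % len(PRIMES)
    for _ in range(ROUNDS):
        # whiten the lower half: v in 0 .. mask
        key = next_key(key)
        if v <= mask:
            v ^= (key >> LCG_SHIFT) & mask
        # whiten the upper half: size-1-v in 0 .. mask
        key = next_key(key)
        t = size - 1 - v
        if t <= mask:
            t ^= (key >> LCG_SHIFT) & mask
            v = size - 1 - t
        # skip primes dividing size so the multiplier stays coprime with it
        while size % PRIMES[p] == 0:
            p = (p + 1) % len(PRIMES)
        # scatter: affine map modulo size (exact here, Python ints are unbounded)
        key = next_key(key)
        v = (PRIMES[p] * v + (key >> LCG_SHIFT)) % size
        p = (p + 1) % len(PRIMES)
    return v
```

For a given size and seed, feeding every i in 0 .. size-1 yields each value in [0, size) exactly once, which is the "neither collisions nor holes" property the documentation describes.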
--
Fabien.
Attachments:
pgbench-prp-func-22.patch (text/x-diff)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 299d93b241..c46272fd50 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1842,6 +1842,22 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <function>permute</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Permuted value of <parameter>i</parameter>, in <literal>[0,size)</literal>.
+ This is the new position of <parameter>i</parameter> in a pseudo-random rearrangement of
+ <parameter>size</parameter> first integers parameterized by <parameter>seed</parameter>.
+ </para>
+ <para>
+ <literal>permute(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<function>pi</function> ()
@@ -1960,6 +1976,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
shape of the distribution.
</para>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that using <application>pgbench</application> non-uniform random functions
+ directly leads to undue correlations: rows with close ids come out with
+ similar probability, skewing performance measures because they also reside
+ in close pages.
+ </para>
+ <para>
+ You must use <function>permute</function> to shuffle the selected ids, or
+ some other additional step with similar effect, to avoid these skewing impacts.
+ </para>
+ </note>
+
<itemizedlist>
<listitem>
<para>
@@ -2054,24 +2084,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ Function <literal>permute</literal> implements a pseudo-random permutation
+ on an arbitrary size.
+ It allows shuffling the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are neither collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function raises an error if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+
+ For instance, the following <application>pgbench</application> script
+ simulates possible real world workload typical for social media and
blogging platforms where few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + permute(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
with each other and this is when implicit seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + permute(:r, :size, :default_seed + 123)
+\set k2 1 + permute(:r, :size, :default_seed + 321)
</programlisting>
+
+ A similar behavior can also be approximated with
+ <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
+</programlisting>
+
+ As this hash-modulo version generates collisions, some
+ <literal>id</literal> would not be reachable and others would come more often
+ than the target distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 4d529ea550..73514a0a47 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PERMUTE (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PERMUTE:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f6a214669c..8f092d3118 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -32,6 +32,13 @@
#endif
#include "postgres_fe.h"
+#include "common/int.h"
+#include "common/logging.h"
+#include "fe_utils/conditional.h"
+#include "getopt_long.h"
+#include "libpq-fe.h"
+#include "portability/instr_time.h"
+#include "port/pg_bitutils.h"
#include <ctype.h>
#include <float.h>
@@ -1124,6 +1131,172 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/*
+ * Parametric Pseudo-random Permutation
+ */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PPP_PRIMES 16
+/*
+ * 24 bit mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 16), is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PPP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PPP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+#if defined(PG_INT128_TYPE)
+ return (PG_INT128_TYPE) x * (PG_INT128_TYPE) y % (PG_INT128_TYPE) m;
+#else
+ pg_log_fatal("modular_multiply not implemented on this platform");
+ pg_log_info("128 bits integer arithmetic not available");
+ exit(1);
+#endif
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * parametric pseudo-random permutation function on int64
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+permute(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) whiten: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which discards primes when
+ * size is a multiple of it.
+ *
+ */
+ for (unsigned int i = 0, p = key % PPP_PRIMES; i < PPP_ROUNDS; i++, p = (p + 1) % PPP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PPP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for size up to ~10^12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2472,6 +2645,27 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PERMUTE:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "permute size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, permute(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 3a9d89e6f1..6ce1c98649 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PERMUTE
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index daffc18e52..2dd1860e71 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,6 +4,7 @@ use warnings;
use PostgresNode;
use TestLib;
use Test::More;
+use Config;
# start a pgbench specific server
my $node = get_new_node('main');
@@ -467,6 +468,15 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ # PRP tests
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
],
'pgbench expressions',
{
@@ -594,9 +604,58 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- parametric pseudo-random permutation
+\set t debug(permute(0, 2) + permute(1, 2) = 1)
+\set t debug(permute(0, 3) + permute(1, 3) + permute(2, 3) = 3)
+\set t debug(permute(0, 4) + permute(1, 4) + permute(2, 4) + permute(3, 4) = 6)
+\set t debug(permute(0, 5) + permute(1, 5) + permute(2, 5) + permute(3, 5) + permute(4, 5) = 10)
+\set t debug(permute(0, 16) + permute(1, 16) + permute(2, 16) + permute(3, 16) + \
+ permute(4, 16) + permute(5, 16) + permute(6, 16) + permute(7, 16) + \
+ permute(8, 16) + permute(9, 16) + permute(10, 16) + permute(11, 16) + \
+ permute(12, 16) + permute(13, 16) + permute(14, 16) + permute(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p permute(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = permute(:v + :size, :size) and :p <> permute(:v + 1, :size))
+-- actual values
+\set t debug(permute(:v, 1) = 0)
+\set t debug(permute(0, 2, 5432) = 0 and permute(1, 2, 5432) = 1 and \
+ permute(0, 2, 5431) = 1 and permute(1, 2, 5431) = 0)
}
});
+# some PPP tests require 128-bit int support.
+pgbench(
+ '-t 1',
+ 0,
+ [ qr{type: .*/001_pgbench_permute}, qr{processed: 1/1} ],
+ [
+ qr{command=3.: boolean true\b},
+ qr{command=4.: int 9223372036854775797\b},
+ qr{command=5.: boolean true\b},
+ ],
+ 'pgbench permute',
+ {
+ '001_pgbench_permute' => q{
+\set min debug(-9223372036854775808)
+\set max debug(-(:min + 1))
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(permute(0, 1000000000000000, 54321) = 9406454989259 and \
+ permute(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ permute(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(permute(:size-1, :size, 5432) = 7214172101785397543 and \
+ permute(:size-2, :size, 5432) = 4060085360303920649 and \
+ permute(:size-3, :size, 5432) = 919477102797071521 and \
+ permute(:size-4, :size, 5432) = 7588404289602340497 and \
+ permute(:size-5, :size, 5432) = 5568354808723584469 and \
+ permute(:size-6, :size, 5432) = 2410458883211907565 and \
+ permute(:size-7, :size, 5432) = 1738667467693569814)
+}
+ }) unless $Config{'osname'} eq 'MSWin32';
+
# random determinism when seeded
$node->safe_psql('postgres',
'CREATE UNLOGGED TABLE seeded_random(seed INT8 NOT NULL, rand TEXT NOT NULL, val INTEGER NOT NULL);'
@@ -955,6 +1014,10 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid permute size', 2,
+ [qr{permute size parameter must be >= 1}], q{\set i permute(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index e38c7d77d1..4027e68dfa 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-1.sql' => "\\set i permute(1)\n" }
+ ],
+ [
+ 'too many arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-2.sql' => "\\set i permute(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
See attached v22.
v23:
- replaces remaining occurrences of "pr_perm" in the documentation
- removes duplicated includes
--
Fabien.
Attachments:
pgbench-prp-func-23.patchtext/x-diff; name=pgbench-prp-func-23.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 299d93b241..9f49a6a78d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudo-random permutation functions by default</entry>
</row>
<row>
@@ -1842,6 +1842,22 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <function>permute</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Permuted value of <parameter>i</parameter>, in <literal>[0,size)</literal>.
+ This is the new position of <parameter>i</parameter> in a pseudo-random rearrangement of
+ the first <parameter>size</parameter> integers, parameterized by <parameter>seed</parameter>.
+ </para>
+ <para>
+ <literal>permute(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<function>pi</function> ()
@@ -1960,6 +1976,20 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
shape of the distribution.
</para>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that using <application>pgbench</application> non-uniform random functions
+ directly leads to undue correlations: rows with close ids come out with
+ similar probability, skewing performance measures because they also reside
+ in close pages.
+ </para>
+ <para>
+ You must use <function>permute</function> to shuffle the selected ids, or
+ some other additional step with similar effect, to avoid these skewing impacts.
+ </para>
+ </note>
+
<itemizedlist>
<listitem>
<para>
@@ -2054,24 +2084,54 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ Function <literal>permute</literal> implements a parametric pseudo-random permutation
+ on an arbitrary size.
+ It allows shuffling the output of non-uniform random functions so that
+ values drawn more often are not trivially correlated.
+ It permutes integers in [0, size) using a seed by applying rounds of
+ simple invertible functions, similarly to an encryption function,
+ although beware that it is not at all cryptographically secure.
+ Compared to <literal>hash</literal> functions discussed above, the function
+ ensures that a perfect permutation is applied: there are neither collisions
+ nor holes in the output values.
+ Values outside the interval are interpreted modulo the size.
+ The function raises an error if size is not positive.
+ If no seed is provided, <literal>:default_seed</literal> is used.
+
+ For instance, the following <application>pgbench</application> script
+ simulates possible real world workload typical for social media and
blogging platforms where few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + permute(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
with each other and this is when implicit seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + permute(:r, :size, :default_seed + 123)
+\set k2 1 + permute(:r, :size, :default_seed + 321)
</programlisting>
+
+ A similar behavior can also be approximated with
+ <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
+</programlisting>
+
+ As this hash-modulo version generates collisions, some
+ <literal>id</literal> would not be reachable and others would come more often
+ than the target distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 4d529ea550..73514a0a47 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PERMUTE (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudo-random permutation function with optional seed argument */
+ case PGBENCH_NARGS_PERMUTE:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f6a214669c..2f74b44d4a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1124,6 +1124,172 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/*
+ * Parametric Pseudo-random Permutation
+ */
+
+/* 16 so that % 16 can be optimized to & 0x0f */
+#define PPP_PRIMES 16
+/*
+ * 24 bit mega primes from https://primes.utm.edu/lists/small/millions/
+ * the i-th prime, i \in [0, 16), is the first prime above 2^23 + i * 2^19
+ */
+static uint64 primes[PPP_PRIMES] = {
+ UINT64CONST(8388617),
+ UINT64CONST(8912921),
+ UINT64CONST(9437189),
+ UINT64CONST(9961487),
+ UINT64CONST(10485767),
+ UINT64CONST(11010059),
+ UINT64CONST(11534351),
+ UINT64CONST(12058679),
+ UINT64CONST(12582917),
+ UINT64CONST(13107229),
+ UINT64CONST(13631489),
+ UINT64CONST(14155777),
+ UINT64CONST(14680067),
+ UINT64CONST(15204391),
+ UINT64CONST(15728681),
+ UINT64CONST(16252967)
+};
+
+/* how many "encryption" rounds to apply */
+#define PPP_ROUNDS 4
+
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/*
+ * Compute (x * y) % m, where x and y in [0, 2^64), m in [1, 2^64).
+ */
+static uint64
+modular_multiply(uint64 x, uint64 y, const uint64 m)
+{
+#if defined(PG_INT128_TYPE)
+ return (PG_INT128_TYPE) x * (PG_INT128_TYPE) y % (PG_INT128_TYPE) m;
+#else
+ pg_log_fatal("modular_multiply not implemented on this platform");
+ pg_log_info("128 bits integer arithmetic not available");
+ exit(1);
+#endif
+}
+
+/*
+ * Donald Knuth's linear congruential generator
+ *
+ * Relying on multiplication overflows is part of the design
+ * of this simple pseudo random number generator.
+ */
+#define DK_LCG_MUL UINT64CONST(6364136223846793005)
+#define DK_LCG_INC UINT64CONST(1442695040888963407)
+
+/* do not use all small bits */
+#define LCG_SHIFT 13
+
+/*
+ * parametric pseudo-random permutation function on int64
+ *
+ * Result in [0, size) is a permutation for inputs in the same set.
+ *
+ * Note that this function does not pass statistical tests: eg
+ * permutations of 2, 3, 4 or 5 ints are not strictly equiprobable.
+ * Things worsen for large sizes as there are many more permutations
+ * (size!) than seeds to select them (2^64 < 21!).
+ * However it is inexpensive compared to an actual encryption function,
+ * and the quality is good enough to avoid trivial correlations on
+ * large sizes, which is the expected use case.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+permute(const int64 data, const int64 isize, const int64 seed)
+{
+ /* computations are performed on unsigned values */
+ uint64 size = (uint64) isize;
+ uint64 v = (uint64) data % size;
+ uint64 key = (uint64) seed;
+ /* size-1 ensures 2 possibly overlapping halves */
+ uint64 mask = compute_mask(size - 1) >> 1;
+
+ /* nothing to permute */
+ if (isize == 1)
+ return 0;
+
+ Assert(isize >= 2);
+
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) whiten: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which discards primes when
+ * size is a multiple of it.
+ *
+ */
+ for (unsigned int i = 0, p = key % PPP_PRIMES; i < PPP_ROUNDS; i++, p = (p + 1) % PPP_PRIMES)
+ {
+ uint64 t;
+
+ /* first "half" whitening, for v in 0 .. mask */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ if (v <= mask)
+ v ^= (key >> LCG_SHIFT) & mask;
+
+ /* second (possibly overlapping) "half" whitening */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t ^= (key >> LCG_SHIFT) & mask;
+ v = size - 1 - t;
+ }
+
+ /* at most 2 primes are skipped for a given size */
+ while (unlikely(size % primes[p] == 0))
+ p = (p + 1) % PPP_PRIMES;
+
+ /* scatter values with a prime multiplication */
+ key = key * DK_LCG_MUL + DK_LCG_INC;
+
+ /* Performance shortcut for 24 bit primes, ok for size up to ~10^12 */
+ if ((v & UINT64CONST(0xffffffffff)) == v)
+ v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
+ else
+ /*-----
+ * Note: the add cannot overflow as size is under 63 bits:
+ * mmv = mm(prime, v, size) < size <= 0x7fffffffffffffff = (1<<63)-1
+ * ks = key >> LCG_SHIFT <= 2^51
+ * => mmv + ks < (1<<64) - 1
+ */
+ v = (modular_multiply(primes[p], v, size) + (key >> LCG_SHIFT)) % size;
+ }
+
+ /* back to signed */
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2472,6 +2638,27 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PERMUTE:
+ {
+ int64 val, size, seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size < 1)
+ {
+ fprintf(stderr, "permute size parameter must be >= 1\n");
+ return false;
+ }
+
+ setIntValue(retval, permute(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 3a9d89e6f1..6ce1c98649 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PERMUTE
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index daffc18e52..2dd1860e71 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,6 +4,7 @@ use warnings;
use PostgresNode;
use TestLib;
use Test::More;
+use Config;
# start a pgbench specific server
my $node = get_new_node('main');
@@ -467,6 +468,15 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ # PRP tests
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
],
'pgbench expressions',
{
@@ -594,9 +604,58 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- parametric pseudo-random permutation
+\set t debug(permute(0, 2) + permute(1, 2) = 1)
+\set t debug(permute(0, 3) + permute(1, 3) + permute(2, 3) = 3)
+\set t debug(permute(0, 4) + permute(1, 4) + permute(2, 4) + permute(3, 4) = 6)
+\set t debug(permute(0, 5) + permute(1, 5) + permute(2, 5) + permute(3, 5) + permute(4, 5) = 10)
+\set t debug(permute(0, 16) + permute(1, 16) + permute(2, 16) + permute(3, 16) + \
+ permute(4, 16) + permute(5, 16) + permute(6, 16) + permute(7, 16) + \
+ permute(8, 16) + permute(9, 16) + permute(10, 16) + permute(11, 16) + \
+ permute(12, 16) + permute(13, 16) + permute(14, 16) + permute(15, 16) = 120)
+-- random sanity check
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p permute(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = permute(:v + :size, :size) and :p <> permute(:v + 1, :size))
+-- actual values
+\set t debug(permute(:v, 1) = 0)
+\set t debug(permute(0, 2, 5432) = 0 and permute(1, 2, 5432) = 1 and \
+ permute(0, 2, 5431) = 1 and permute(1, 2, 5431) = 0)
}
});
+# some PPP tests require 128-bit int support.
+pgbench(
+ '-t 1',
+ 0,
+ [ qr{type: .*/001_pgbench_permute}, qr{processed: 1/1} ],
+ [
+ qr{command=3.: boolean true\b},
+ qr{command=4.: int 9223372036854775797\b},
+ qr{command=5.: boolean true\b},
+ ],
+ 'pgbench permute',
+ {
+ '001_pgbench_permute' => q{
+\set min debug(-9223372036854775808)
+\set max debug(-(:min + 1))
+-- ~50 bits test to exercise full modular multiplication
+\set t debug(permute(0, 1000000000000000, 54321) = 9406454989259 and \
+ permute(1999999999999999, 1000000000000000, 54321) = 54570773902028 and \
+ permute(2500000000000000, 1000000000000000, 54321) = 771082311307468)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set ok debug(permute(:size-1, :size, 5432) = 7214172101785397543 and \
+ permute(:size-2, :size, 5432) = 4060085360303920649 and \
+ permute(:size-3, :size, 5432) = 919477102797071521 and \
+ permute(:size-4, :size, 5432) = 7588404289602340497 and \
+ permute(:size-5, :size, 5432) = 5568354808723584469 and \
+ permute(:size-6, :size, 5432) = 2410458883211907565 and \
+ permute(:size-7, :size, 5432) = 1738667467693569814)
+}
+ }) unless $Config{'osname'} eq 'MSWin32';
+
# random determinism when seeded
$node->safe_psql('postgres',
'CREATE UNLOGGED TABLE seeded_random(seed INT8 NOT NULL, rand TEXT NOT NULL, val INTEGER NOT NULL);'
@@ -955,6 +1014,10 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid permute size', 2,
+ [qr{permute size parameter must be >= 1}], q{\set i permute(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index e38c7d77d1..4027e68dfa 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-1.sql' => "\\set i permute(1)\n" }
+ ],
+ [
+ 'too many arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-2.sql' => "\\set i permute(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
On 2021-Mar-14, Fabien COELHO wrote:
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) whiten: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which discards primes when
+ * size is a multiple of it.
+ *
+ */
My main question on this now is, do you have a scholar reference for
this algorithm?
--
Álvaro Herrera Valdivia, Chile
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude." (Brian Kernighan)
Hello Alvaro,
+ /*-----
+ * Apply 4 rounds of bijective transformations using key updated
+ * at each stage:
+ *
+ * (1) whiten: partial xors on overlapping power-of-2 subsets
+ * for instance with v in 0 .. 14 (i.e. with size == 15):
+ * if v is in 0 .. 7 do v = (v ^ k) % 8
+ * if v is in 7 .. 14 do v = 14 - ((14-v) ^ k) % 8
+ * note that because of the overlap (here 7), v may be changed twice.
+ * this transformation is bijective because the condition to apply it
+ * is still true after applying it, and xor itself is bijective on a
+ * power-of-2 size.
+ *
+ * (2) scatter: linear modulo
+ * v = (v * p + k) % size
+ * this transformation is bijective if p & size are coprime, which is
+ * ensured in the code by the while loop which discards primes when
+ * size is a multiple of it.
+ *
+ */
My main question on this now is, do you have a scholar reference for
this algorithm?
Nope, otherwise I would have put a reference. I'm a scholar though, if
it helps:-)
I could not find any algorithm that fitted the bill. The usual approach
(eg benchmarking designs) is to use some hash function and accept that it
is not a permutation, too bad.
Basically the algorithm mimics a few rounds of cryptographic encryption
adapted to any size and simple operators, whereas encryption functions are
restricted to power-of-two blocks (eg the Feistel network). The structure
is the same as AES: AddRoundKey is the xor-ing stage (adapted to non
power-of-two sizes in "whiten" above), MixColumns does the scattering, and
for key expansion I used Donald Knuth's generator. Basically one could say
that permute is an inexpensive and insecure AES:-)
We could add a reference to AES for the structure of the algorithm itself,
but otherwise I just iterated designs till I was satisfied with the
result (again: inexpensive and constant cost, any size, permutation…).
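To make the structure concrete, here is a minimal sketch of a single whiten-then-scatter round on small sizes, together with a brute-force bijectivity check. It is an illustration under stated assumptions, not the patch code: `toy_mask`, `toy_round` and `toy_round_is_bijective` are hypothetical names, the round keys and the multiplier are passed in directly instead of being derived from the LCG and the prime table, and size is assumed to be at most 64 so that a one-word bitmap suffices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Smear the highest set bit of n downwards: smallest all-ones mask holding n. */
uint64_t
toy_mask(uint64_t n)
{
	n |= n >> 1;
	n |= n >> 2;
	n |= n >> 4;
	n |= n >> 8;
	n |= n >> 16;
	n |= n >> 32;
	return n;
}

/*
 * One whiten-then-scatter round on [0, size), for small sizes only.
 * k1 and k2 are the whitening keys, mult and add the scatter parameters
 * (hypothetical interface: the patch computes these internally).
 */
uint64_t
toy_round(uint64_t v, uint64_t size, uint64_t k1, uint64_t k2,
		  uint64_t mult, uint64_t add)
{
	uint64_t	mask = toy_mask(size - 1) >> 1;
	uint64_t	t;

	assert(size >= 2 && size <= 64 && v < size);

	/* whiten the low "half": v <= mask still holds after the xor */
	if (v <= mask)
		v ^= k1 & mask;

	/* whiten the high "half" by mirroring it onto 0 .. mask */
	t = size - 1 - v;
	if (t <= mask)
	{
		t ^= k2 & mask;
		v = size - 1 - t;
	}

	/* scatter: a bijection only when gcd(mult, size) == 1 */
	return ((mult % size) * v + add % size) % size;
}

/* Brute-force check that toy_round maps [0, size) onto itself. */
bool
toy_round_is_bijective(uint64_t size, uint64_t k1, uint64_t k2,
					   uint64_t mult, uint64_t add)
{
	uint64_t	seen = 0;	/* bit i set once output i has been produced */

	for (uint64_t v = 0; v < size; v++)
	{
		uint64_t	out = toy_round(v, size, k1, k2, mult, add);

		if (out >= size || (seen & (UINT64_C(1) << out)))
			return false;
		seen |= UINT64_C(1) << out;
	}
	return true;
}
```

With size 15, a multiplier of 7 keeps the round bijective, while a multiplier of 5 (which divides 15) breaks it. This is why the patch skips any prime that divides size.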
--
Fabien.
On Sun, 14 Mar 2021 at 16:08, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
My main question on this now is, do you have a scholar reference for
this algorithm?
Nope, otherwise I would have put a reference. I'm a scholar though, if
it helps:-)
I could not find any algorithm that fitted the bill. The usual approach
(eg benchmarking designs) is to use some hash function and accept that it
is not a permutation, too bad.
Basically the algorithm mimics a few rounds of cryptographic encryption
adapted to any size and simple operators, whereas encryption functions are
restricted to power-of-two blocks (eg the Feistel network). The structure
is the same as AES: AddRoundKey is the xor-ing stage (adapted to non
power-of-two sizes in "whiten" above), MixColumns does the scattering, and
for key expansion I used Donald Knuth's generator. Basically one could say
that permute is an inexpensive and insecure AES:-)
I spent a little time playing around with this, trying to come up with
a reasonable way to test it. It's apparent from the code and
associated comments that the transformation is bijective and so will
produce a permutation, but it's less obvious how random that
permutation will be. Obviously the Fisher-Yates algorithm would give
the most random permutation, but that's not really practical in this
context. Any other algorithm will, in some sense, be less random than
that, so I think it's really just a question of testing that it's
random enough to satisfy the intended use case.
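For comparison, a Fisher-Yates shuffle can be sketched as below. This is only an illustration of why it is impractical here: the function names are hypothetical, the random source is a plain Knuth LCG rather than anything pgbench provides, and the small modulo bias is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Knuth's 64-bit LCG step; returns the upper 32 bits of the new state. */
uint64_t
lcg_next(uint64_t *state)
{
	*state = *state * UINT64_C(6364136223846793005)
		+ UINT64_C(1442695040888963407);
	return *state >> 32;
}

/*
 * Fill perm[0 .. size-1] with a shuffled copy of 0 .. size-1.
 * The whole array must be materialized before any single value can be
 * looked up, so both time and memory are O(size) per seed.
 */
void
fisher_yates(uint64_t *perm, uint64_t size, uint64_t seed)
{
	uint64_t	state = seed;

	for (uint64_t i = 0; i < size; i++)
		perm[i] = i;

	for (uint64_t i = size; i > 1; i--)
	{
		/* pick j in [0, i); fine for sizes far below 2^32 */
		uint64_t	j = lcg_next(&state) % i;
		uint64_t	tmp = perm[i - 1];

		perm[i - 1] = perm[j];
		perm[j] = tmp;
	}
}
```

For a table of 10^9 rows this would need roughly 8 GB per seed, whereas permute() computes each value on demand in constant time and space.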
The approach to testing I tried was to use the Kolmogorov-Smirnov test
to test how uniformly random a couple of quantities are:
1). For a given size and fixed input value, and a large number of
randomly chosen seed values, is the output uniformly random?
2). For a given size and a pair of nearby input values, are the two
output values uniformly randomly spaced apart?
This second test seems desirable to ensure sufficient shuffling so
that inputs coming from some non-uniform random distribution are
scattered sufficiently to distribute the maxima/minima (x -> x + rand
mod size would pass test 1, so that by itself is insufficient).
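As a rough sketch of the statistic behind both tests (not the attached program; `ks_uniform` is a hypothetical helper), the one-sample K-S distance from the uniform distribution on [0, 1) can be computed like this. It assumes the outputs have already been scaled, for instance u = (out + 0.5) / size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for doubles */
static int
cmp_double(const void *a, const void *b)
{
	double		x = *(const double *) a;
	double		y = *(const double *) b;

	return (x > y) - (x < y);
}

/*
 * One-sample Kolmogorov-Smirnov statistic D_n of u[0 .. n-1] against the
 * uniform distribution on [0, 1).  Sorts u[] in place.
 */
double
ks_uniform(double *u, size_t n)
{
	double		d = 0.0;

	assert(n > 0);
	qsort(u, n, sizeof(double), cmp_double);

	for (size_t i = 0; i < n; i++)
	{
		/* empirical CDF just after u[i] minus the uniform CDF */
		double		above = (double) (i + 1) / n - u[i];
		/* uniform CDF minus the empirical CDF just before u[i] */
		double		below = u[i] - (double) i / n;

		if (above > d)
			d = above;
		if (below > d)
			d = below;
	}
	return d;
}
```

For large n, uniformity is usually rejected at about the 5% level when D_n * sqrt(n) exceeds roughly 1.36. For test 1 the samples would be permute(x, size, seed) over many seeds. For test 2 they would be the distance between the outputs for two nearby inputs, taken modulo size.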
I tested this using the attached (somewhat ugly) stand-alone test C
program, and for the most part these 2 tests seemed to pass. The
program also tests that the output really is a permutation, just to be
sure, and this was confirmed in all cases.
However, I noticed that the results are less good when size is a power
of 2. In this case, the results seem significantly less uniform,
suggesting that for such sizes, there is an increased chance that the
permuted output might still be skewed.
Based on the above, I also experimented with a different permutation
algorithm (permute2() in the test), which attempts to inject more
randomness into the bijection, and between pairs of inputs, to make
the output less predictable and less likely to be accidentally
non-uniform. It's possible that it suffers from other problems, but it
did do better in these 2 tests.
Maybe some other tests might be useful, but I really don't know. As
noted above, any algorithm is likely to lead to some pattern in the
output (e.g., it looks like both these algorithms cause adjacent
inputs to always end up separated by an odd number), so a judgement
call will be required to decide what is random enough.
Regards,
Dean
Attachments:
On Mon, 22 Mar 2021 at 13:43, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Sun, 14 Mar 2021 at 16:08, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
My main question on this now is, do you have a scholar reference for
this algorithm?
Nope, otherwise I would have put a reference. I'm a scholar though, if
it helps:-)
I could not find any algorithm that fitted the bill. The usual approach
(eg benchmarking designs) is to use some hash function and accept that it
is not a permutation, too bad.
Basically the algorithm mimics a few rounds of cryptographic encryption
adapted to any size and simple operators, whereas encryption functions are
restricted to power-of-two blocks (eg the Feistel network). The structure
is the same as AES: AddRoundKey is the xor-ing stage (adapted to non
power-of-two sizes in "whiten" above), MixColumns does the scattering, and
for key expansion I used Donald Knuth's generator. Basically one could say
that permute is an inexpensive and insecure AES:-)
I spent a little time playing around with this, trying to come up with
a reasonable way to test it.
I spent more time testing this -- the previous testing was mostly for
large values of size, so I decided to also test it for small sizes.
The theoretical number of possible permutations grows rapidly with
size (size!), and the actual number of permutations observed grew
almost as quickly:
size      size!   distinct perms
   2          2                2
   3          6                6
   4         24               24
   5        120              120
   6        720              720
   7       5040             5040
   8      40320            24382
   9     362880           181440
  10    3628800          3627355
My test script stopped once the first permutation had been seen 100
times, so it might have seen more permutations had it kept going for
longer.
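A tally like this needs a compact key for each observed permutation. One way to get one is the Lehmer-code rank, which maps every permutation of 0..n-1 to a unique integer in [0, n!). The sketch below is only an illustration under that assumption; perm_rank() is a hypothetical helper, and the attached test programs may well count permutations differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Lexicographic rank (Lehmer code) of a permutation p[0..n-1] of the
 * integers 0..n-1.  Hypothetical helper for illustration only; n is
 * limited to 20 so that n! fits in 64 bits.
 */
static uint64_t
perm_rank(const int *p, int n)
{
    uint64_t    rank = 0;

    assert(n >= 1 && n <= 20);

    for (int i = 0; i < n; i++)
    {
        uint64_t    smaller = 0;

        /* count the later elements that are smaller than p[i] */
        for (int j = i + 1; j < n; j++)
            if (p[j] < p[i])
                smaller++;

        /* Horner-style accumulation in the factorial number system */
        rank = rank * (uint64_t) (n - i) + smaller;
    }
    return rank;
}
```

Indexing an array of n! counters by this rank gives the number of times each permutation was produced, which is practical for the sizes 2 to 10 discussed here.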
However, looking at the actual output, there is a very significant
non-uniformity in the probability of particular permutations being
chosen, especially at certain sizes like 8 and 10. For example, at
size = 8, the identity permutation is significantly more likely than
almost any other permutation (roughly 13 times more likely than it
would be by random chance). Additionally, the signs are that this
non-uniformity tends to increase with size. At size = 10, the average
number of occurrences of each permutation in the test was 148, but
there were some that didn't appear at all, many that only appeared 2
or 3 times, and then some that appeared nearly 5000 times.
Also, there appears to be an unfortunate dependence on the seed -- if
the seed is even and the size is a power of 2, it looks like the
number of distinct permutations produced is just size/2, which is a
tiny fraction of size!.
This, together with the results from the previous K-S uniformity tests
at larger sizes, suggests that the current algorithm may not be random
enough to scatter values and remove correlations from a non-uniform
distribution.
In an attempt to do better, I tweaked the algorithm in the attached
patch, which makes use of erand48() to inject more randomness into the
permutation choice. Repeating the tests with the updated algorithm
shows that it has a number of advantages:
1). For small sizes (2-10), each of the size! possible permutations is
now produced with roughly equal probability. No single permutation was
much more likely than expected.
2). The loss of randomness for even seeds is gone.
3). For any given input, the output is more uniformly randomly
distributed for different seeds.
4). For any pair of nearby inputs, the corresponding outputs are more
uniformly randomly separated from one another.
5). The updated algorithm no longer uses modular_multiply(), so it
works the same on all platforms.
I still feel slightly uneasy about inventing our own algorithm here,
but I wasn't able to find anything else suitable, and I've now tested
this quite extensively. All the indications are that this updated
algorithm works well at all sizes and produces sufficiently random
results, though if anyone else can think of other approaches to
testing the core algorithm, that would be useful. For reference, I
attach 2 standalone test C programs I used for testing, which include
copies of the old and new algorithms.
I also did a quick copy editing pass over the docs, and I think the
patch is in pretty good shape. Barring objections, I'll see about
committing it later this week.
Regards,
Dean
Attachments:
pgbench-prp-func-24.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
new file mode 100644
index 50cf22b..84d9566
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options<
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudorandom permutation functions by default</entry>
</row>
<row>
@@ -1866,6 +1866,24 @@ SELECT 4 AS four \; SELECT 5 AS five \as
<row>
<entry role="func_table_entry"><para role="func_signature">
+ <function>permute</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Permuted value of <parameter>i</parameter>, in the range
+ <literal>[0, size)</literal>. This is the new position of
+ <parameter>i</parameter> (modulo <parameter>size</parameter>) in a
+ pseudorandom permutation of the integers <literal>0...size-1</literal>,
+ parameterized by <parameter>seed</parameter>.
+ </para>
+ <para>
+ <literal>permute(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
<function>pi</function> ()
<returnvalue>double</returnvalue>
</para>
@@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) /
</listitem>
</itemizedlist>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that the rows chosen may be correlated with other data such as IDs from
+ a sequence or the physical row ordering, which may skew performance
+ measurements.
+ </para>
+ <para>
+ To avoid this, you may wish to use the <function>permute</function>
+ function, or some other additional step with similar effect, to shuffle
+ the selected rows and remove such correlations.
+ </para>
+ </note>
+
<para>
Hash functions <literal>hash</literal>, <literal>hash_murmur2</literal> and
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
- blogging platforms where few accounts generate excessive load:
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ <literal>permute</literal> accepts an input value, a size, and an optional
+ seed parameter. It generates a pseudorandom permutation of integers in
+ the range <literal>[0, size)</literal>, and returns the index of the input
+ value in the permuted values. The permutation chosen is parameterized by
+ the seed, which defaults to <literal>:default_seed</literal>, if not
+ specified. Unlike the hash functions, <literal>permute</literal> ensures
+ that there are no collisions or holes in the output values. Input values
+ outside the interval are interpreted modulo the size. The function raises
+ an error if the size is not positive. <function>permute</function> can be
+ used to scatter the distribution of non-uniform random functions such as
+ <literal>random_zipfian</literal> or <literal>random_exponential</literal>
+ so that values drawn more often are not trivially correlated. For
+ instance, the following <application>pgbench</application> script
+ simulates a possible real world workload typical for social media and
+ blogging platforms where a few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + permute(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
- with each other and this is when implicit seed parameter comes in handy:
+ with each other and this is when the optional seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + permute(:r, :size, :default_seed + 123)
+\set k2 1 + permute(:r, :size, :default_seed + 321)
+</programlisting>
+
+ A similar behavior can also be approximated with <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
</programlisting>
+
+ However, since <function>hash</function> generates collisions, some values
+ will not be reachable and others will be more frequent than expected from
+ the original distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
new file mode 100644
index 4d529ea..56f75cc
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PERMUTE (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -479,6 +483,19 @@ make_func(yyscan_t yyscanner, int fnumbe
{
PgBenchExpr *var = make_variable("default_seed");
args = make_elist(var, args);
+ }
+ break;
+
+ /* pseudorandom permutation function with optional seed argument */
+ case PGBENCH_NARGS_PERMUTE:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
}
break;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
new file mode 100644
index 48ce171..e1ac78e
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1127,6 +1127,127 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/* return smallest mask holding n */
+static uint64
+compute_mask(uint64 n)
+{
+ n |= n >> 1;
+ n |= n >> 2;
+ n |= n >> 4;
+ n |= n >> 8;
+ n |= n >> 16;
+ n |= n >> 32;
+ return n;
+}
+
+/*
+ * Pseudorandom permutation function
+ *
+ * For small sizes, this generates each of the (size!) possible permutations
+ * of integers in the range [0, size) with roughly equal probability. Once
+ * the size is larger than 16, the number of possible permutations exceeds the
+ * number of distinct states of the internal pseudorandom number generator,
+ * and so not all possible permutations can be generated, but the permutations
+ * chosen should continue to give the appearance of being random.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+permute(const int64 val, const int64 isize, const int64 seed)
+{
+ RandomState random_state;
+ uint64 size,
+ v,
+ mask,
+ top_bit;
+ int i;
+
+ if (isize < 2)
+ return 0; /* nothing to permute */
+
+ /* Initialize random state with low-order bits of seed */
+ random_state.xseed[0] = seed & 0xFFFF;
+ random_state.xseed[1] = (seed >> 16) & 0xFFFF;
+ random_state.xseed[2] = (seed >> 32) & 0xFFFF;
+
+ /* Computations are performed on unsigned values */
+ size = (uint64) isize;
+ v = (uint64) val % size;
+
+ /* Mask to work modulo largest power of 2 less than or equal to size */
+ mask = compute_mask(size - 1);
+ if (mask >= size)
+ mask >>= 1;
+
+ /* Topmost bit of mask */
+ top_bit = (mask + 1) >> 1;
+
+ /*
+ * Permute the input value by applying 4 rounds of pseudorandom bijective
+ * transformations. The intention here is to distribute each input
+ * uniformly randomly across the range, and separate adjacent inputs
+ * approximately uniformly randomly from each other, leading to a fairly
+ * random overall choice of permutation.
+ *
+ * To separate adjacent inputs, we multiply by a random number modulo
+ * (mask + 1), which is a power of 2. For this to be a bijection, the
+ * multiplier must be odd. Since this is known to lead to less randomness
+ * in the lower bits, we also apply a rotation that shifts the topmost bit
+ * into the least significant bit. In the special cases where size <= 3,
+ * mask = 1 and each of these operations is actually a no-op, so we also
+ * XOR with a different random number to inject additional randomness.
+ * Since the size is generally not a power of 2, we apply this bijection
+ * on overlapping upper and lower halves of the input.
+ *
+ * To distribute the inputs uniformly across the range, we then also apply
+ * a random offset modulo the full range.
+ *
+ * Taken together, these operations resemble a modified linear
+ * congruential generator, as is commonly used in pseudorandom number
+ * generators. Empirically, it is found that for small sizes it selects
+ * each of the (size!) possible permutations with roughly equal
+ * probability. For larger sizes, not all permutations can be generated,
+ * but the intended random spread is still produced.
+ */
+ for (i = 0; i < 4; i++)
+ {
+ uint64 m,
+ r,
+ t;
+
+ /* Random multiply (by an odd number), XOR and rotate of lower half */
+ m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
+ r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
+ if (v <= mask)
+ {
+ v = ((v * m) ^ r) & mask;
+ v = ((v << 1) & mask) | (v & top_bit ? 1 : 0);
+ }
+
+ /* Random offset */
+ r = (uint64) (pg_erand48(random_state.xseed) * size);
+ v = (v + r) % size;
+
+ /* Random multiply (by an odd number), XOR and rotate of upper half */
+ m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
+ r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t = ((t * m) ^ r) & mask;
+ t = ((t << 1) & mask) | (t & top_bit ? 1 : 0);
+ v = size - 1 - t;
+ }
+
+ /* Random offset */
+ r = (uint64) (pg_erand48(random_state.xseed) * size);
+ v = (v + r) % size;
+ }
+
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2475,6 +2596,29 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PERMUTE:
+ {
+ int64 val,
+ size,
+ seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size <= 0)
+ {
+ pg_log_error("permute size parameter must be greater than zero");
+ return false;
+ }
+
+ setIntValue(retval, permute(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
new file mode 100644
index 3a9d89e..6ce1c98
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PERMUTE
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
new file mode 100644
index 82a46c7..a86a62d
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,6 +4,7 @@ use warnings;
use PostgresNode;
use TestLib;
use Test::More;
+use Config;
# start a pgbench specific server
my $node = get_new_node('main');
@@ -483,6 +484,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ # pseudorandom permutation tests
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: int 9223372036854775797\b},
+ qr{command=113.: boolean true\b},
],
'pgbench expressions',
{
@@ -610,6 +622,33 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- parametric pseudorandom permutation function
+\set t debug(permute(0, 2) + permute(1, 2) = 1)
+\set t debug(permute(0, 3) + permute(1, 3) + permute(2, 3) = 3)
+\set t debug(permute(0, 4) + permute(1, 4) + permute(2, 4) + permute(3, 4) = 6)
+\set t debug(permute(0, 5) + permute(1, 5) + permute(2, 5) + permute(3, 5) + permute(4, 5) = 10)
+\set t debug(permute(0, 16) + permute(1, 16) + permute(2, 16) + permute(3, 16) + \
+ permute(4, 16) + permute(5, 16) + permute(6, 16) + permute(7, 16) + \
+ permute(8, 16) + permute(9, 16) + permute(10, 16) + permute(11, 16) + \
+ permute(12, 16) + permute(13, 16) + permute(14, 16) + permute(15, 16) = 120)
+-- random sanity checks
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p permute(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = permute(:v + :size, :size) and :p <> permute(:v + 1, :size))
+-- actual values
+\set t debug(permute(:v, 1) = 0)
+\set t debug(permute(0, 2, 123456) = 0 and permute(1, 2, 123456) = 1 and \
+ permute(0, 2, 123457) = 1 and permute(1, 2, 123457) = 0)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set t debug(permute(:size-1, :size, 5432) = 7012042409040117905 and \
+ permute(:size-2, :size, 5432) = 1103575893259321449 and \
+ permute(:size-3, :size, 5432) = 222497678256930915 and \
+ permute(:size-4, :size, 5432) = 8505436630458204228 and \
+ permute(:size-5, :size, 5432) = 1126572291920756793 and \
+ permute(:size-6, :size, 5432) = 7300129793825407053 and \
+ permute(:size-7, :size, 5432) = 2511815335807189023)
}
});
@@ -1048,6 +1087,10 @@ SELECT LEAST(} . join(', ', (':i') x 256
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid permute size', 2,
+ [qr{permute size parameter must be greater than zero}], q{\set i permute(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
new file mode 100644
index e38c7d7..4027e68
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-1.sql' => "\\set i permute(1)\n" }
+ ],
+ [
+ 'too many arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-2.sql' => "\\set i permute(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Hello Dean,
Thanks a lot for this work. This version looks much better than the
previous one you sent, and has significant advantages over the one I sent,
in particular avoiding the prime number stuff and large modular multiply.
So this looks good!
I'm happy that halves-xoring is back because it was an essential part of
the design. ISTM that the previous version you sent only had linear/affine
transforms which was a bad idea.
The linear transform is moved inside the halves, why not, and the trick
that any odd multiplier is coprime with a power of two is neat on
these.
However I still have some reservations:
First, I have a thing against erand48. The erand 48 bits design looks like
something that made sense when computer architectures were 16/32 bits and
large integers were 32 bits, so a 16 bits margin looked good enough. Using
an int16[3] type now is really depressing and probably costly on modern
architectures. With 64 bits architecture, and given that we are
manipulating 64 bits integers (well 63 really), ISTM that the minimal
state size for a PRNG should be at least 64 bits. Is there a good reason
NOT to use a 64 bits state prn generator? I took Knuth's because it is
cheap and 64 bits. I'd accept anything which is at least 64 bits. I looked
at xor-shift things, but there was some controversy around these so I kept
it simple and just shifted small values to get rid of the obvious cycles
on Knuth's generator.
Second, I have a significant reservation about the very structure of the
transformation in this version:
loop 4 times :
  // FIRST HALF STEER
  m/r = pseudo randoms
  if v in first "half"
    v = ((v * m) ^ r) & mask;
    rotate1(v)
  // FULL SHIFT 1
  r = pseudo random
  v = (v + r) % size
  // SECOND HALF STEER
  m/r = pseudo randoms
  if v in second "half"
    same as previous on second half
  // FULL SHIFT 2
  r = pseudo random
  v = (v + r) % size
I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
values are kept out of STEERING. Whole chunks of values could be kept
unshuffled because they would only have SHIFTS apply to them and each time
fall in the not steered half. It should be an essential part of the design
that at least one steer is applied on a value at each round, and if two
are applied then fine, but certainly not zero. So basically I think that
the design would be significantly improved by removing "FULL SHIFT 1".
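To make the proposal concrete, here is a minimal standalone sketch of the round structure with "FULL SHIFT 1" removed. It is not the patch code: permute_sketch(), steer() and next_random() are hypothetical names, and next_random() is splitmix64 used purely as a stand-in for the draws that the patch takes from pg_erand48(). Because mask + 1 is more than half of size, the two halves overlap and cover the whole range, so every value gets at least one steer per round.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in PRNG (splitmix64); the patch itself draws from pg_erand48(). */
static uint64_t
next_random(uint64_t *state)
{
    uint64_t    z = (*state += UINT64_C(0x9E3779B97F4A7C15));

    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}

/* Multiply by an odd number, XOR, then rotate left one bit within mask. */
static uint64_t
steer(uint64_t v, uint64_t m, uint64_t r, uint64_t mask, uint64_t top_bit)
{
    v = ((v * m) ^ r) & mask;
    return ((v << 1) & mask) | ((v & top_bit) ? 1 : 0);
}

/* Sketch only: same shape as the patch, minus the first full shift. */
static uint64_t
permute_sketch(uint64_t val, uint64_t size, uint64_t seed)
{
    uint64_t    state = seed;
    uint64_t    pow2 = 1;
    uint64_t    v, mask, top_bit;

    assert(size >= 1 && size <= (uint64_t) INT64_MAX);
    if (size < 2)
        return 0;               /* nothing to permute */

    v = val % size;

    /* pow2 becomes the largest power of 2 that is <= size */
    while (pow2 <= size / 2)
        pow2 <<= 1;
    mask = pow2 - 1;
    top_bit = pow2 >> 1;

    for (int i = 0; i < 4; i++)
    {
        uint64_t    m, r, t;

        /* FIRST HALF STEER: applies to values in [0, mask] */
        m = (next_random(&state) & mask) | 1;
        r = next_random(&state) & mask;
        if (v <= mask)
            v = steer(v, m, r, mask, top_bit);

        /* SECOND HALF STEER: applies to values in [size-1-mask, size-1] */
        m = (next_random(&state) & mask) | 1;
        r = next_random(&state) & mask;
        t = size - 1 - v;
        if (t <= mask)
            v = size - 1 - steer(t, m, r, mask, top_bit);

        /* the single remaining FULL SHIFT of the round */
        r = next_random(&state) % size;
        v = (v + r) % size;
    }
    return v;
}
```

Each step is a bijection on [0, size), so the composition is still a permutation; only the quality of the scattering changes when the first shift is dropped.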
Third, I think that the rotate code can be simplified, in particular the
?: should be avoided because it may induce branches quite damaging to
processor performance.
I'll give it some more thoughts.
--
Fabien.
On Tue, 30 Mar 2021 at 19:26, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
First, I have a thing against erand48.
Yeah, that's probably a fair point. However, all the existing pgbench
random functions are using it, so I think it's fair enough for
permute() to do the same (and actually 2^48 is pretty huge). Switching
to a 64-bit PRNG might not be a bad idea, but I think that's something
we'd want to do across the board, and so I think it should be out of
scope for this patch.
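For a sense of scale, a 48-bit state can select at most 2^48 different permutations, so it is easy to check up to which size all size! permutations could in principle be reached. The helper below is a hypothetical illustration of that arithmetic, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Largest n such that n! <= limit; limit must be at least 1. */
static int
max_full_coverage_size(uint64_t limit)
{
    uint64_t    fact = 1;       /* invariant: fact == n! */
    int         n = 1;

    assert(limit >= 1);

    /* advance only while (n + 1)! still fits under limit */
    while (fact <= limit / (uint64_t) (n + 1))
    {
        n++;
        fact *= (uint64_t) n;
    }
    return n;
}
```

With limit = 2^48 this returns 16, since 16! = 20922789888000 is below 2^48 = 281474976710656 while 17! is not. That agrees with the "larger than 16" threshold stated in the comment on permute() in the patch.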
Second, I have a significant reservation about the very structure of the
transformation in this version:
loop 4 times :
  // FIRST HALF STEER
  m/r = pseudo randoms
  if v in first "half"
    v = ((v * m) ^ r) & mask;
    rotate1(v)
  // FULL SHIFT 1
  r = pseudo random
  v = (v + r) % size
  // SECOND HALF STEER
  m/r = pseudo randoms
  if v in second "half"
    same as previous on second half
  // FULL SHIFT 2
  r = pseudo random
  v = (v + r) % size
I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
values are kept out of STEERING. Whole chunks of values could be kept
unshuffled because they would only have SHIFTS apply to them and each time
fall in the not steered half. It should be an essential part of the design
that at least one steer is applied on a value at each round, and if two
are applied then fine, but certainly not zero. So basically I think that
the design would be significantly improved by removing "FULL SHIFT 1".
Ah, that's a good point. Something else that also concerned me there
was that it might lead to 2 consecutive full shifts with nothing in
between, which would lead to less uniform randomness (like the
Irwin-Hall distribution).
I just did a quick test without the first full shift, and the results
do appear to be better, so removing that looks like a good idea.
Third, I think that the rotate code can be simplified, in particular the
?: should be avoided because it may induce branches quite damaging to
processor performance.
Yeah, I wondered about that. Perhaps there's a "trick" that can be
used to simplify it. Pre-computing the number of bits in the mask
would probably help. I'll give it some thought.
Regards,
Dean
On Tue, 30 Mar 2021 at 20:31, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
Yeah, that's probably a fair point. However, all the existing pgbench
random functions are using it, so I think it's fair enough for
permute() to do the same (and actually 2^48 is pretty huge). Switching
to a 64-bit PRNG might not be a bad idea, but I think that's something
we'd want to do across the board, and so I think it should be out of
scope for this patch.
Of course the immediate counter-argument to changing the existing
random functions would be that doing so would break lots of people's
tests, and no one would thank us for that. Still, I think that, since
the existing random functions use a 48-bit PRNG, it's not unreasonable
for permute() to do the same.
Regards,
Dean
Hello Dean,
First, I have a thing against erand48.
Yeah, that's probably a fair point. However, all the existing pgbench
random functions are using it, so I think it's fair enough for permute()
to do the same (and actually 2^48 is pretty huge). Switching to a 64-bit
PRNG might not be a bad idea, but I think that's something we'd want to
do across the board, and so I think it should be out of scope for this
patch.
But less likely to pass, whereas here we have an internal function that
we can set as we want.
Also, there is a 64 bits seed provided to the function which instantly
ignores 16 of them, which looks pretty silly to me.
Also, the function is named everywhere erand48 with its hardcoded int16[3]
state, which makes a poor abstraction.
At least, I suggest that two 48-bits prng could be initialized with parts
of the seed and used in different places, eg for r & m.
Also, the seed could be used to adjust the rotation, maybe.
I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
values are kept out of STEERING. [...]
Ah, that's a good point. Something else that also concerned me there was
that it might lead to 2 consecutive full shifts with nothing in between,
which would lead to less uniform randomness (like the Irwin-Hall
distribution). I just did a quick test without the first full shift, and
the results do appear to be better,
Indeed, it makes sense to me.
so removing that looks like a good idea.
Third, I think that the rotate code can be simplified, in particular
the ?: should be avoided because it may induce branches quite damaging
to processor performance.
Yeah, I wondered about that. Perhaps there's a "trick" that can be
used to simplify it. Pre-computing the number of bits in the mask
would probably help.
See pg_popcount64().
--
Fabien.
On Wed, 31 Mar 2021 at 09:02, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
First, I have a thing against erand48.
Also, there is a 64 bits seed provided to the function which instantly
ignores 16 of them, which looks pretty silly to me.
Yeah, that was copied from set_random_seed().
At least, I suggest that two 48-bits prng could be initialized with parts
of the seed and used in different places, eg for r & m.
That could work. I'd certainly feel better about that than
implementing a whole new PRNG.
Also, the seed could be used to adjust the rotation, maybe.
Perhaps. I'm not sure it's really necessary though.
I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
values are kept out of STEERING. [...]
Ah, that's a good point. Something else that also concerned me there was
that it might lead to 2 consecutive full shifts with nothing in between,
which would lead to less uniform randomness (like the Irwin-Hall
distribution). I just did a quick test without the first full shift, and
the results do appear to be better,
Indeed, it makes sense to me.
OK, attached is an update making this change and simplifying the
rotate code, which hopefully just leaves the question of what (if
anything) to do with pg_erand48().
Third, I think that the rotate code can be simplified, in particular
the ?: should be avoided because it may induce branches quite damaging
to processor performance.
Yeah, I wondered about that. Perhaps there's a "trick" that can be
used to simplify it. Pre-computing the number of bits in the mask
would probably help.
See pg_popcount64().
Actually, I used pg_leftmost_one_pos64() to calculate the mask length,
allowing the mask to be computed from that, so there is no longer a
need for compute_mask(), which seems like a neat little
simplification.
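As a rough standalone illustration of that idea (the attached patch is the authority on what the code really does), the mask can be derived from the position of the highest set bit of size, and the same bit count then gives a rotate with no ?: in it. leftmost_one_pos() below is a portable stand-in for pg_leftmost_one_pos64(), and mask_for_size() and rotate1() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * 0-based position of the most significant set bit.  Portable stand-in
 * for PostgreSQL's pg_leftmost_one_pos64(); word must be nonzero.
 */
static int
leftmost_one_pos(uint64_t word)
{
    int         pos = 0;

    assert(word != 0);
    while ((word >>= 1) != 0)
        pos++;
    return pos;
}

/*
 * Mask for working modulo the largest power of 2 that is <= size.
 * Requires size >= 2, so that the mask has at least one bit.
 */
static uint64_t
mask_for_size(uint64_t size, int *masklen)
{
    assert(size >= 2);
    *masklen = leftmost_one_pos(size);
    return (UINT64_C(1) << *masklen) - 1;
}

/*
 * Rotate v left by one bit within a masklen-bit field.  Since v <= mask,
 * v >> (masklen - 1) is exactly the top bit (0 or 1), so no branch or
 * ternary is needed.
 */
static uint64_t
rotate1(uint64_t v, int masklen, uint64_t mask)
{
    assert(masklen >= 1 && v <= mask);
    return ((v << 1) & mask) | (v >> (masklen - 1));
}
```

For size <= 3 this gives masklen = 1 and mask = 1, and rotate1() degenerates to a no-op, matching the special case noted in the comments of the earlier patch version.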
Regards,
Dean
Attachments:
pgbench-prp-func-25.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
new file mode 100644
index 50cf22b..84d9566
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options<
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudorandom permutation functions by default</entry>
</row>
<row>
@@ -1866,6 +1866,24 @@ SELECT 4 AS four \; SELECT 5 AS five \as
<row>
<entry role="func_table_entry"><para role="func_signature">
+ <function>permute</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Permuted value of <parameter>i</parameter>, in the range
+ <literal>[0, size)</literal>. This is the new position of
+ <parameter>i</parameter> (modulo <parameter>size</parameter>) in a
+ pseudorandom permutation of the integers <literal>0...size-1</literal>,
+ parameterized by <parameter>seed</parameter>.
+ </para>
+ <para>
+ <literal>permute(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
<function>pi</function> ()
<returnvalue>double</returnvalue>
</para>
@@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) /
</listitem>
</itemizedlist>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that the rows chosen may be correlated with other data such as IDs from
+ a sequence or the physical row ordering, which may skew performance
+ measurements.
+ </para>
+ <para>
+ To avoid this, you may wish to use the <function>permute</function>
+ function, or some other additional step with similar effect, to shuffle
+ the selected rows and remove such correlations.
+ </para>
+ </note>
+
<para>
Hash functions <literal>hash</literal>, <literal>hash_murmur2</literal> and
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
- blogging platforms where few accounts generate excessive load:
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ <literal>permute</literal> accepts an input value, a size, and an optional
+ seed parameter. It generates a pseudorandom permutation of integers in
+ the range <literal>[0, size)</literal>, and returns the index of the input
+ value in the permuted values. The permutation chosen is parameterized by
+ the seed, which defaults to <literal>:default_seed</literal>, if not
+ specified. Unlike the hash functions, <literal>permute</literal> ensures
+ that there are no collisions or holes in the output values. Input values
+ outside the interval are interpreted modulo the size. The function raises
+ an error if the size is not positive. <function>permute</function> can be
+ used to scatter the distribution of non-uniform random functions such as
+ <literal>random_zipfian</literal> or <literal>random_exponential</literal>
+ so that values drawn more often are not trivially correlated. For
+ instance, the following <application>pgbench</application> script
+ simulates a possible real world workload typical for social media and
+ blogging platforms where a few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + permute(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
- with each other and this is when implicit seed parameter comes in handy:
+ with each other and this is when the optional seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + permute(:r, :size, :default_seed + 123)
+\set k2 1 + permute(:r, :size, :default_seed + 321)
+</programlisting>
+
+ A similar behavior can also be approximated with <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
</programlisting>
+
+ However, since <function>hash</function> generates collisions, some values
+ will not be reachable and others will be more frequent than expected from
+ the original distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
new file mode 100644
index 4d529ea..56f75cc
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PERMUTE (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -479,6 +483,19 @@ make_func(yyscan_t yyscanner, int fnumbe
{
PgBenchExpr *var = make_variable("default_seed");
args = make_elist(var, args);
+ }
+ break;
+
+ /* pseudorandom permutation function with optional seed argument */
+ case PGBENCH_NARGS_PERMUTE:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
}
break;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
new file mode 100644
index 48ce171..a90b01d
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -66,6 +66,7 @@
#include "getopt_long.h"
#include "libpq-fe.h"
#include "pgbench.h"
+#include "port/pg_bitutils.h"
#include "portability/instr_time.h"
#ifndef M_PI
@@ -1128,6 +1129,106 @@ getHashMurmur2(int64 val, uint64 seed)
}
/*
+ * Pseudorandom permutation function
+ *
+ * For small sizes, this generates each of the (size!) possible permutations
+ * of integers in the range [0, size) with roughly equal probability. Once
+ * the size is larger than 16, the number of possible permutations exceeds the
+ * number of distinct states of the internal pseudorandom number generator,
+ * and so not all possible permutations can be generated, but the permutations
+ * chosen should continue to give the appearance of being random.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+permute(const int64 val, const int64 isize, const int64 seed)
+{
+ RandomState random_state;
+ uint64 size;
+ uint64 v;
+ int masklen;
+ uint64 mask;
+ int i;
+
+ if (isize < 2)
+ return 0; /* nothing to permute */
+
+ /* Initialize random state with low-order bits of seed */
+ random_state.xseed[0] = seed & 0xFFFF;
+ random_state.xseed[1] = (seed >> 16) & 0xFFFF;
+ random_state.xseed[2] = (seed >> 32) & 0xFFFF;
+
+ /* Computations are performed on unsigned values */
+ size = (uint64) isize;
+ v = (uint64) val % size;
+
+ /* Mask to work modulo largest power of 2 less than or equal to size */
+ masklen = pg_leftmost_one_pos64(size);
+ mask = (((uint64) 1) << masklen) - 1;
+
+ /*
+ * Permute the input value by applying 4 rounds of pseudorandom bijective
+ * transformations. The intention here is to distribute each input
+ * uniformly randomly across the range, and separate adjacent inputs
+ * approximately uniformly randomly from each other, leading to a fairly
+ * random overall choice of permutation.
+ *
+ * To separate adjacent inputs, we multiply by a random number modulo
+ * (mask + 1), which is a power of 2. For this to be a bijection, the
+ * multiplier must be odd. Since this is known to lead to less randomness
+ * in the lower bits, we also apply a rotation that shifts the topmost bit
+ * into the least significant bit. In the special cases where size <= 3,
+ * mask = 1 and each of these operations is actually a no-op, so we also
+ * XOR with a different random number to inject additional randomness.
+ * Since the size is generally not a power of 2, we apply this bijection
+ * on overlapping upper and lower halves of the input.
+ *
+ * To distribute the inputs uniformly across the range, we then also apply
+ * a random offset modulo the full range.
+ *
+ * Taken together, these operations resemble a modified linear
+ * congruential generator, as is commonly used in pseudorandom number
+ * generators. Empirically, it is found that for small sizes it selects
+ * each of the (size!) possible permutations with roughly equal
+ * probability. For larger sizes, not all permutations can be generated,
+ * but the intended random spread is still produced.
+ */
+ for (i = 0; i < 4; i++)
+ {
+ uint64 m,
+ r,
+ t;
+
+ /* Random multiply (by an odd number), XOR and rotate of lower half */
+ m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
+ r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
+ if (v <= mask)
+ {
+ v = ((v * m) ^ r) & mask;
+ v = ((v << 1) & mask) | (v >> (masklen - 1));
+ }
+
+ /* Random multiply (by an odd number), XOR and rotate of upper half */
+ m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
+ r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t = ((t * m) ^ r) & mask;
+ t = ((t << 1) & mask) | (t >> (masklen - 1));
+ v = size - 1 - t;
+ }
+
+ /* Random offset */
+ r = (uint64) (pg_erand48(random_state.xseed) * size);
+ v = (v + r) % size;
+ }
+
+ return (int64) v;
+}
+
+/*
* Initialize the given SimpleStats struct to all zeroes
*/
static void
@@ -2475,6 +2576,29 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PERMUTE:
+ {
+ int64 val,
+ size,
+ seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size <= 0)
+ {
+ pg_log_error("permute size parameter must be greater than zero");
+ return false;
+ }
+
+ setIntValue(retval, permute(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
new file mode 100644
index 3a9d89e..6ce1c98
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PERMUTE
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
new file mode 100644
index 82a46c7..cf50975
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,6 +4,7 @@ use warnings;
use PostgresNode;
use TestLib;
use Test::More;
+use Config;
# start a pgbench specific server
my $node = get_new_node('main');
@@ -483,6 +484,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ # pseudorandom permutation tests
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: int 9223372036854775797\b},
+ qr{command=113.: boolean true\b},
],
'pgbench expressions',
{
@@ -610,6 +622,33 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- parametric pseudorandom permutation function
+\set t debug(permute(0, 2) + permute(1, 2) = 1)
+\set t debug(permute(0, 3) + permute(1, 3) + permute(2, 3) = 3)
+\set t debug(permute(0, 4) + permute(1, 4) + permute(2, 4) + permute(3, 4) = 6)
+\set t debug(permute(0, 5) + permute(1, 5) + permute(2, 5) + permute(3, 5) + permute(4, 5) = 10)
+\set t debug(permute(0, 16) + permute(1, 16) + permute(2, 16) + permute(3, 16) + \
+ permute(4, 16) + permute(5, 16) + permute(6, 16) + permute(7, 16) + \
+ permute(8, 16) + permute(9, 16) + permute(10, 16) + permute(11, 16) + \
+ permute(12, 16) + permute(13, 16) + permute(14, 16) + permute(15, 16) = 120)
+-- random sanity checks
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p permute(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = permute(:v + :size, :size) and :p <> permute(:v + 1, :size))
+-- actual values
+\set t debug(permute(:v, 1) = 0)
+\set t debug(permute(0, 2, 5432) = 0 and permute(1, 2, 5432) = 1 and \
+ permute(0, 2, 5433) = 1 and permute(1, 2, 5433) = 0)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set t debug(permute(:size-1, :size, 5432) = 5453903545070026760 and \
+ permute(:size-2, :size, 5432) = 3852510547464191995 and \
+ permute(:size-3, :size, 5432) = 6519944788385497068 and \
+ permute(:size-4, :size, 5432) = 2393897006749810651 and \
+ permute(:size-5, :size, 5432) = 435256285874192331 and \
+ permute(:size-6, :size, 5432) = 5260571010402451383 and \
+ permute(:size-7, :size, 5432) = 5245374096631496614)
}
});
@@ -1048,6 +1087,10 @@ SELECT LEAST(} . join(', ', (':i') x 256
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid permute size', 2,
+ [qr{permute size parameter must be greater than zero}], q{\set i permute(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
new file mode 100644
index e38c7d7..4027e68
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-1.sql' => "\\set i permute(1)\n" }
+ ],
+ [
+ 'too many arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-2.sql' => "\\set i permute(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
Hello Dean,
OK, attached is an update making this change and simplifying the rotate
code, which hopefully just leaves the question of what (if anything) to
do with pg_erand48().
Yep. While looking at it, I have some doubts on this part:
m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
r = (uint64) (pg_erand48(random_state.xseed) * size);
I do not understand why the random values are multiplied by anything in
the first place…
This one looks like a no-op :
r = (uint64) (pg_erand48(random_state.xseed) * size);
v = (v + r) % size;
v = (v + r) % size
= (v + rand * size) % size
=? (v % size + rand * size % size) % size
=? (v % size + 0) % size
= v % size
= v
I'm also skeptical about this one:
r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
if (v <= mask)
v = ((v * m) ^ r) & mask;
v = ((v * m) ^ r) & mask
= ((v * m) ^ r) % (mask+1)
= ((v * m) ^ (rand * (mask+1))) % (mask+1)
=? ((v * m) % (mask+1)) ^ (rand * (mask+1) % (mask+1))
=? ((v * m) % (mask+1)) ^ (0)
= (v * m) & mask
Or possibly I'm missing something obvious and I'm wrong with my
arithmetic?
--
Fabien.
On Wed, 31 Mar 2021 at 18:53, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
While looking at it, I have some doubts on this part:
m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
r = (uint64) (pg_erand48(random_state.xseed) * size);
I do not understand why the random values are multiplied by anything in
the first place…
These are just random integers in the range [0,mask] and [0,size-1],
formed in exactly the same way as getrand().
This one looks like a no-op :
r = (uint64) (pg_erand48(random_state.xseed) * size);
v = (v + r) % size;
v = (v + r) % size
= (v + rand * size) % size
=? (v % size + rand * size % size) % size
=? (v % size + 0) % size
= v % size
= v
rand * size % size is not zero because rand is a floating point number
in the range [0,1), so actually rand * size % size = rand * size.
Similarly in the other case, you're forgetting that rand is not an
integer.
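To make this concrete, here is a minimal C sketch of the scaling step, not taken from the patch: the helper names scale_to_range and apply_offset are invented for illustration, and the double u stands in for a value returned by pg_erand48(). Because u lies in [0,1), the truncated product is an integer in [0,size-1], so adding it modulo size is a real shift rather than a no-op.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: map a double u in [0, 1)
 * to an integer in [0, size - 1] by scaling and truncating.
 */
static uint64_t
scale_to_range(double u, uint64_t size)
{
    assert(u >= 0.0 && u < 1.0);
    return (uint64_t) (u * size);
}

/*
 * The "random offset" step: shift v by the scaled value, modulo size.
 * Assumes v < size, as is the case inside permute().
 */
static uint64_t
apply_offset(uint64_t v, double u, uint64_t size)
{
    return (v + scale_to_range(u, size)) % size;
}
```

For example, with size = 10 and u = 0.5 the offset is 5, so v = 7 becomes 2; the offset is only zero when u < 1/size.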
Thinking more about our use of erand48(), the only real impact it has
is to limit the number of possible permutations produced, and actually
2^48 is so huge (roughly 281 million million) that I can't ever see
that being an issue in practice. (In a quick dummy test, replacing
erand48() with a silly "erand8()" function that only returned one of
256 distinct values, permute() still worked fine at any size, except
for the fact that only up to 256 distinct permutations were produced.
In other words, limitations on the source of randomness don't prevent
it from producing permutations of any size, they just limit the number
of distinct permutations possible. And since 2^48 is so big, that
shouldn't be an issue.)
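One way to see why a weak random source still yields valid permutations is that each multiply/XOR/rotate step is a bijection for any odd multiplier and any XOR value, whatever produced them. The following brute-force check is only a sketch: mix_step mirrors the lower-half transformation in the patch, while is_bijection and both function names are invented here, and the check is limited to at most 16 bits to keep the table small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * One multiply/XOR/rotate step on masklen-bit values, mirroring the
 * lower-half transformation in the patch.  Expects v < 2^masklen.
 */
static uint64_t
mix_step(uint64_t v, uint64_t m, uint64_t r, int masklen)
{
    uint64_t mask = (((uint64_t) 1) << masklen) - 1;

    v = ((v * m) ^ r) & mask;
    /* rotate left by one bit within masklen bits */
    return ((v << 1) & mask) | (v >> (masklen - 1));
}

/*
 * Return true if mix_step maps [0, 2^masklen) onto itself without
 * collisions for the given m and r.  Illustration only: 1 <= masklen <= 16.
 */
static bool
is_bijection(uint64_t m, uint64_t r, int masklen)
{
    bool     seen[1 << 16] = {false};
    uint64_t n;
    uint64_t v;

    assert(masklen >= 1 && masklen <= 16);
    n = ((uint64_t) 1) << masklen;

    for (v = 0; v < n; v++)
    {
        uint64_t out = mix_step(v, m, r, masklen);

        if (seen[out])
            return false;   /* two inputs collided */
        seen[out] = true;
    }
    return true;
}
```

With an odd multiplier such as m = 5 the check passes for any r, whereas an even multiplier such as m = 2 with masklen = 4 collides (inputs 0 and 8 both map to 0), which is why the patch forces the low bit of m to 1.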
Also, I think the source of the input seed is most likely to be either
manually hand-picked integers or pgbench's own random() function, so
the only real issue I can see is that by ignoring the upper 16 bits,
there's a very small chance of always using the same random sequence
if some hand-picked numbers only vary in the top 16 bits, though I
think that's highly unlikely in practice.
Nonetheless, it's not much more effort to make another random state
and use those remaining bits of the seed and get more internal random
states, so here's an update doing that. I intentionally chose to reuse
the lower 16 bits of the seed in the second random function (in a
different slot of the random state), since those are probably the ones
most likely to vary in practice.
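As a standalone sketch of that seed split (the SeedWords type and split_seed name are stand-ins invented here; the patch itself fills two RandomState structs inline), the following shows that two seeds differing only in their top 16 bits give the same first state but different second states:

```c
#include <stdint.h>

/* Stand-in for pgbench's RandomState: three 16-bit words of rand48 state */
typedef struct
{
    unsigned short xseed[3];
} SeedWords;

/* Split a 64-bit seed across two 48-bit states, as the updated patch does */
static void
split_seed(int64_t seed, SeedWords *s1, SeedWords *s2)
{
    /* first state: the low-order 48 bits of the seed */
    s1->xseed[0] = seed & 0xFFFF;
    s1->xseed[1] = (seed >> 16) & 0xFFFF;
    s1->xseed[2] = (seed >> 32) & 0xFFFF;

    /*
     * second state: the top 16 bits, followed by the low-order words
     * reused in different slots
     */
    s2->xseed[0] = (((uint64_t) seed) >> 48) & 0xFFFF;
    s2->xseed[1] = seed & 0xFFFF;
    s2->xseed[2] = (seed >> 16) & 0xFFFF;
}
```

For instance, the seeds 5432 and 5432 + 2^48 produce identical words in the first state, while the first word of the second state is 0 for one and 1 for the other.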
This doesn't actually make any measurable difference to any of the
tests, but it closes that potential loophole of ignoring part of the
seed. In all my tests, the biggest improvement was between v23 and v24
of the patch. By comparison, the later versions have been relatively
small improvements, and it's probably now "random enough" for the
intended purposes.
Regards,
Dean
Attachments:
pgbench-prp-func-26.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
new file mode 100644
index 50cf22b..84d9566
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options<
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudorandom permutation functions by default</entry>
</row>
<row>
@@ -1866,6 +1866,24 @@ SELECT 4 AS four \; SELECT 5 AS five \as
<row>
<entry role="func_table_entry"><para role="func_signature">
+ <function>permute</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Permuted value of <parameter>i</parameter>, in the range
+ <literal>[0, size)</literal>. This is the new position of
+ <parameter>i</parameter> (modulo <parameter>size</parameter>) in a
+ pseudorandom permutation of the integers <literal>0...size-1</literal>,
+ parameterized by <parameter>seed</parameter>.
+ </para>
+ <para>
+ <literal>permute(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
<function>pi</function> ()
<returnvalue>double</returnvalue>
</para>
@@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) /
</listitem>
</itemizedlist>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that the rows chosen may be correlated with other data such as IDs from
+ a sequence or the physical row ordering, which may skew performance
+ measurements.
+ </para>
+ <para>
+ To avoid this, you may wish to use the <function>permute</function>
+ function, or some other additional step with similar effect, to shuffle
+ the selected rows and remove such correlations.
+ </para>
+ </note>
+
<para>
Hash functions <literal>hash</literal>, <literal>hash_murmur2</literal> and
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
- blogging platforms where few accounts generate excessive load:
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ <literal>permute</literal> accepts an input value, a size, and an optional
+ seed parameter. It generates a pseudorandom permutation of integers in
+ the range <literal>[0, size)</literal>, and returns the index of the input
+ value in the permuted values. The permutation chosen is parameterized by
+ the seed, which defaults to <literal>:default_seed</literal>, if not
+ specified. Unlike the hash functions, <literal>permute</literal> ensures
+ that there are no collisions or holes in the output values. Input values
+ outside the interval are interpreted modulo the size. The function raises
+ an error if the size is not positive. <function>permute</function> can be
+ used to scatter the distribution of non-uniform random functions such as
+ <literal>random_zipfian</literal> or <literal>random_exponential</literal>
+ so that values drawn more often are not trivially correlated. For
+ instance, the following <application>pgbench</application> script
+ simulates a possible real world workload typical for social media and
+ blogging platforms where a few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + permute(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
- with each other and this is when implicit seed parameter comes in handy:
+ with each other and this is when the optional seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + permute(:r, :size, :default_seed + 123)
+\set k2 1 + permute(:r, :size, :default_seed + 321)
+</programlisting>
+
+ A similar behavior can also be approximated with <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
</programlisting>
+
+ However, since <function>hash</function> generates collisions, some values
+ will not be reachable and others will be more frequent than expected from
+ the original distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
new file mode 100644
index 4d529ea..56f75cc
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PERMUTE (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -479,6 +483,19 @@ make_func(yyscan_t yyscanner, int fnumbe
{
PgBenchExpr *var = make_variable("default_seed");
args = make_elist(var, args);
+ }
+ break;
+
+ /* pseudorandom permutation function with optional seed argument */
+ case PGBENCH_NARGS_PERMUTE:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
}
break;
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
new file mode 100644
index 48ce171..61fb9e1
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -66,6 +66,7 @@
#include "getopt_long.h"
#include "libpq-fe.h"
#include "pgbench.h"
+#include "port/pg_bitutils.h"
#include "portability/instr_time.h"
#ifndef M_PI
@@ -1128,6 +1129,111 @@ getHashMurmur2(int64 val, uint64 seed)
}
/*
+ * Pseudorandom permutation function
+ *
+ * For small sizes, this generates each of the (size!) possible permutations
+ * of integers in the range [0, size) with roughly equal probability. Once
+ * the size is larger than 16, the number of possible permutations exceeds the
+ * number of distinct states of the internal pseudorandom number generator,
+ * and so not all possible permutations can be generated, but the permutations
+ * chosen should continue to give the appearance of being random.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+permute(const int64 val, const int64 isize, const int64 seed)
+{
+ RandomState random_state1;
+ RandomState random_state2;
+ uint64 size;
+ uint64 v;
+ int masklen;
+ uint64 mask;
+ int i;
+
+ if (isize < 2)
+ return 0; /* nothing to permute */
+
+ /* Initialize a pair of random states using the seed */
+ random_state1.xseed[0] = seed & 0xFFFF;
+ random_state1.xseed[1] = (seed >> 16) & 0xFFFF;
+ random_state1.xseed[2] = (seed >> 32) & 0xFFFF;
+
+ random_state2.xseed[0] = (((uint64) seed) >> 48) & 0xFFFF;
+ random_state2.xseed[1] = seed & 0xFFFF;
+ random_state2.xseed[2] = (seed >> 16) & 0xFFFF;
+
+ /* Computations are performed on unsigned values */
+ size = (uint64) isize;
+ v = (uint64) val % size;
+
+ /* Mask to work modulo largest power of 2 less than or equal to size */
+ masklen = pg_leftmost_one_pos64(size);
+ mask = (((uint64) 1) << masklen) - 1;
+
+ /*
+ * Permute the input value by applying 4 rounds of pseudorandom bijective
+ * transformations. The intention here is to distribute each input
+ * uniformly randomly across the range, and separate adjacent inputs
+ * approximately uniformly randomly from each other, leading to a fairly
+ * random overall choice of permutation.
+ *
+ * To separate adjacent inputs, we multiply by a random number modulo
+ * (mask + 1), which is a power of 2. For this to be a bijection, the
+ * multiplier must be odd. Since this is known to lead to less randomness
+ * in the lower bits, we also apply a rotation that shifts the topmost bit
+ * into the least significant bit. In the special cases where size <= 3,
+ * mask = 1 and each of these operations is actually a no-op, so we also
+ * XOR with a different random number to inject additional randomness.
+ * Since the size is generally not a power of 2, we apply this bijection
+ * on overlapping upper and lower halves of the input.
+ *
+ * To distribute the inputs uniformly across the range, we then also apply
+ * a random offset modulo the full range.
+ *
+ * Taken together, these operations resemble a modified linear
+ * congruential generator, as is commonly used in pseudorandom number
+ * generators. Empirically, it is found that for small sizes it selects
+ * each of the (size!) possible permutations with roughly equal
+ * probability. For larger sizes, not all permutations can be generated,
+ * but the intended random spread is still produced.
+ */
+ for (i = 0; i < 4; i++)
+ {
+ uint64 m,
+ r,
+ t;
+
+ /* Random multiply (by an odd number), XOR and rotate of lower half */
+ m = (uint64) (pg_erand48(random_state1.xseed) * (mask + 1)) | 1;
+ r = (uint64) (pg_erand48(random_state2.xseed) * (mask + 1));
+ if (v <= mask)
+ {
+ v = ((v * m) ^ r) & mask;
+ v = ((v << 1) & mask) | (v >> (masklen - 1));
+ }
+
+ /* Random multiply (by an odd number), XOR and rotate of upper half */
+ m = (uint64) (pg_erand48(random_state1.xseed) * (mask + 1)) | 1;
+ r = (uint64) (pg_erand48(random_state2.xseed) * (mask + 1));
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t = ((t * m) ^ r) & mask;
+ t = ((t << 1) & mask) | (t >> (masklen - 1));
+ v = size - 1 - t;
+ }
+
+ /* Random offset */
+ r = (uint64) (pg_erand48(random_state2.xseed) * size);
+ v = (v + r) % size;
+ }
+
+ return (int64) v;
+}
+
+/*
* Initialize the given SimpleStats struct to all zeroes
*/
static void
@@ -2475,6 +2581,29 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PERMUTE:
+ {
+ int64 val,
+ size,
+ seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size <= 0)
+ {
+ pg_log_error("permute size parameter must be greater than zero");
+ return false;
+ }
+
+ setIntValue(retval, permute(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
new file mode 100644
index 3a9d89e..6ce1c98
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PERMUTE
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
new file mode 100644
index 82a46c7..f06151e
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,6 +4,7 @@ use warnings;
use PostgresNode;
use TestLib;
use Test::More;
+use Config;
# start a pgbench specific server
my $node = get_new_node('main');
@@ -483,6 +484,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ # pseudorandom permutation tests
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: int 9223372036854775797\b},
+ qr{command=113.: boolean true\b},
],
'pgbench expressions',
{
@@ -610,6 +622,33 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- parametric pseudorandom permutation function
+\set t debug(permute(0, 2) + permute(1, 2) = 1)
+\set t debug(permute(0, 3) + permute(1, 3) + permute(2, 3) = 3)
+\set t debug(permute(0, 4) + permute(1, 4) + permute(2, 4) + permute(3, 4) = 6)
+\set t debug(permute(0, 5) + permute(1, 5) + permute(2, 5) + permute(3, 5) + permute(4, 5) = 10)
+\set t debug(permute(0, 16) + permute(1, 16) + permute(2, 16) + permute(3, 16) + \
+ permute(4, 16) + permute(5, 16) + permute(6, 16) + permute(7, 16) + \
+ permute(8, 16) + permute(9, 16) + permute(10, 16) + permute(11, 16) + \
+ permute(12, 16) + permute(13, 16) + permute(14, 16) + permute(15, 16) = 120)
+-- random sanity checks
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p permute(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = permute(:v + :size, :size) and :p <> permute(:v + 1, :size))
+-- actual values
+\set t debug(permute(:v, 1) = 0)
+\set t debug(permute(0, 2, 5432) = 1 and permute(1, 2, 5432) = 0 and \
+ permute(0, 2, 5433) = 0 and permute(1, 2, 5433) = 1)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set t debug(permute(:size-1, :size, 5432) = 4190403159373676543 and \
+ permute(:size-2, :size, 5432) = 6531117427252953058 and \
+ permute(:size-3, :size, 5432) = 817680950359556067 and \
+ permute(:size-4, :size, 5432) = 4488021121584496620 and \
+ permute(:size-5, :size, 5432) = 889630397817880559 and \
+ permute(:size-6, :size, 5432) = 2304246419350781925 and \
+ permute(:size-7, :size, 5432) = 6067877766654295977)
}
});
@@ -1048,6 +1087,10 @@ SELECT LEAST(} . join(', ', (':i') x 256
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid permute size', 2,
+ [qr{permute size parameter must be greater than zero}], q{\set i permute(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
new file mode 100644
index e38c7d7..4027e68
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-1.sql' => "\\set i permute(1)\n" }
+ ],
+ [
+ 'too many arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-2.sql' => "\\set i permute(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
r = (uint64) (pg_erand48(random_state.xseed) * size);
I do not understand why the random values are multiplied by anything in
the first place…
These are just random integers in the range [0,mask] and [0,size-1],
formed in exactly the same way as getrand().
Indeed, erand returns a double; this was the part I was missing. I did not
realize that you had switched to doubles in your approach.
I think that permute should only use integer operations. I'd suggest
using one of the integer variants instead of going through a double
computation and casting back to int. The internal state is based on
integers, and I do not see the added value of going through floats, possibly
incurring floating point issues (underflow, rounding, normalization,
whatever) on the way, whereas from start to finish we just need ints.
See attached v27 proposal.
I still think that *rand48 is a poor (relatively small state) and
inefficient (the implementation includes packing and unpacking 16-bit
ints to build a 64-bit int) choice.
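To illustrate both points, here is an integer-only sketch of a rand48-style step. It is not the v27 code: rand48_step and rand48_below are names invented here, the constants are the standard rand48 multiplier and increment, and the word order (xseed[0] as the low-order word) is an assumption matching the usual rand48 layout. The packing and unpacking around the single multiply-add is the overhead mentioned above.

```c
#include <stdint.h>

#define RAND48_MULT UINT64_C(0x5DEECE66D)
#define RAND48_ADD  UINT64_C(0xB)
#define RAND48_MASK UINT64_C(0xFFFFFFFFFFFF)    /* low 48 bits */

/*
 * Advance a 48-bit linear congruential state stored as three 16-bit
 * words and return the new 48-bit value, using integer operations only.
 */
static uint64_t
rand48_step(unsigned short xseed[3])
{
    /* pack the three 16-bit words into one 64-bit integer */
    uint64_t x = (uint64_t) xseed[0] |
                 ((uint64_t) xseed[1] << 16) |
                 ((uint64_t) xseed[2] << 32);

    /* X' = (a * X + c) mod 2^48; overflow past 64 bits is harmless */
    x = (x * RAND48_MULT + RAND48_ADD) & RAND48_MASK;

    /* unpack the result back into the 16-bit words */
    xseed[0] = x & 0xFFFF;
    xseed[1] = (x >> 16) & 0xFFFF;
    xseed[2] = (x >> 32) & 0xFFFF;

    return x;
}

/*
 * Hypothetical integer-only draw in [0, range), range > 0.  The modulo
 * is slightly biased and relies on the weaker low-order bits, and only
 * 48 bits are available, so this is an illustration rather than a
 * recommendation.
 */
static uint64_t
rand48_below(unsigned short xseed[3], uint64_t range)
{
    return rand48_step(xseed) % range;
}
```

Starting from an all-zero state, the first step returns 11 (the increment) and the second returns 277363943098, with no floating point involved at any stage.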
--
Fabien.
See attached v27 proposal.
As usual, it is easier to see with the actual attachment :-)
--
Fabien.
Attachments:
pgbench-prp-func-27.patch (text/x-diff)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 50cf22ba6b..84d9566f49 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudorandom permutation functions by default</entry>
</row>
<row>
@@ -1864,6 +1864,24 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <function>permute</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Permuted value of <parameter>i</parameter>, in the range
+ <literal>[0, size)</literal>. This is the new position of
+ <parameter>i</parameter> (modulo <parameter>size</parameter>) in a
+ pseudorandom permutation of the integers <literal>0...size-1</literal>,
+ parameterized by <parameter>seed</parameter>.
+ </para>
+ <para>
+ <literal>permute(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<function>pi</function> ()
@@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</listitem>
</itemizedlist>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that the rows chosen may be correlated with other data such as IDs from
+ a sequence or the physical row ordering, which may skew performance
+ measurements.
+ </para>
+ <para>
+ To avoid this, you may wish to use the <function>permute</function>
+ function, or some other additional step with similar effect, to shuffle
+ the selected rows and remove such correlations.
+ </para>
+ </note>
+
<para>
Hash functions <literal>hash</literal>, <literal>hash_murmur2</literal> and
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
- blogging platforms where few accounts generate excessive load:
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ <literal>permute</literal> accepts an input value, a size, and an optional
+ seed parameter. It generates a pseudorandom permutation of integers in
+ the range <literal>[0, size)</literal>, and returns the index of the input
+ value in the permuted values. The permutation chosen is parameterized by
+ the seed, which defaults to <literal>:default_seed</literal>, if not
+ specified. Unlike the hash functions, <literal>permute</literal> ensures
+ that there are no collisions or holes in the output values. Input values
+ outside the interval are interpreted modulo the size. The function raises
+ an error if the size is not positive. <function>permute</function> can be
+ used to scatter the distribution of non-uniform random functions such as
+ <literal>random_zipfian</literal> or <literal>random_exponential</literal>
+ so that values drawn more often are not trivially correlated. For
+ instance, the following <application>pgbench</application> script
+ simulates a possible real world workload typical for social media and
+ blogging platforms where a few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + permute(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
- with each other and this is when implicit seed parameter comes in handy:
+ with each other and this is when the optional seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + permute(:r, :size, :default_seed + 123)
+\set k2 1 + permute(:r, :size, :default_seed + 321)
</programlisting>
+
+ A similar behavior can also be approximated with <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
+</programlisting>
+
+ However, since <function>hash</function> generates collisions, some values
+ will not be reachable and others will be more frequent than expected from
+ the original distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 4d529ea550..56f75ccd25 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PERMUTE (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudorandom permutation function with optional seed argument */
+ case PGBENCH_NARGS_PERMUTE:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 48ce1712cc..63a4b24e54 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -66,6 +66,7 @@
#include "getopt_long.h"
#include "libpq-fe.h"
#include "pgbench.h"
+#include "port/pg_bitutils.h"
#include "portability/instr_time.h"
#ifndef M_PI
@@ -1127,6 +1128,120 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+static uint64
+randu64(RandomState * state)
+{
+ uint64 r1 = pg_jrand48((*state).xseed),
+ r2 = pg_jrand48((*state).xseed);
+ return r1 << 51 | r2 << 13 | r1 >> 13;
+}
+
+
+/*
+ * Pseudorandom permutation function
+ *
+ * For small sizes, this generates each of the (size!) possible permutations
+ * of integers in the range [0, size) with roughly equal probability. Once
+ * the size is larger than 16, the number of possible permutations exceeds the
+ * number of distinct states of the internal pseudorandom number generator,
+ * and so not all possible permutations can be generated, but the permutations
+ * chosen should continue to give the appearance of being random.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+permute(const int64 val, const int64 isize, const int64 seed)
+{
+ RandomState random_state1;
+ RandomState random_state2;
+ uint64 size;
+ uint64 v;
+ int masklen;
+ uint64 mask;
+ int i;
+
+ if (isize < 2)
+ return 0; /* nothing to permute */
+
+ /* Initialize a pair of random states using the seed */
+ random_state1.xseed[0] = seed & 0xFFFF;
+ random_state1.xseed[1] = (seed >> 16) & 0xFFFF;
+ random_state1.xseed[2] = (seed >> 32) & 0xFFFF;
+
+ random_state2.xseed[0] = (((uint64) seed) >> 48) & 0xFFFF;
+ random_state2.xseed[1] = seed & 0xFFFF;
+ random_state2.xseed[2] = (seed >> 16) & 0xFFFF;
+
+ /* Computations are performed on unsigned values */
+ size = (uint64) isize;
+ v = (uint64) val % size;
+
+ /* Mask to work modulo largest power of 2 less than or equal to size */
+ masklen = pg_leftmost_one_pos64(size);
+ mask = (((uint64) 1) << masklen) - 1;
+
+ /*
+ * Permute the input value by applying 4 rounds of pseudorandom bijective
+ * transformations. The intention here is to distribute each input
+ * uniformly randomly across the range, and separate adjacent inputs
+ * approximately uniformly randomly from each other, leading to a fairly
+ * random overall choice of permutation.
+ *
+ * To separate adjacent inputs, we multiply by a random number modulo
+ * (mask + 1), which is a power of 2. For this to be a bijection, the
+ * multiplier must be odd. Since this is known to lead to less randomness
+ * in the lower bits, we also apply a rotation that shifts the topmost bit
+ * into the least significant bit. In the special cases where size <= 3,
+ * mask = 1 and each of these operations is actually a no-op, so we also
+ * XOR with a different random number to inject additional randomness.
+ * Since the size is generally not a power of 2, we apply this bijection
+ * on overlapping upper and lower halves of the input.
+ *
+ * To distribute the inputs uniformly across the range, we then also apply
+ * a random offset modulo the full range.
+ *
+ * Taken together, these operations resemble a modified linear
+ * congruential generator, as is commonly used in pseudorandom number
+ * generators. Empirically, it is found that for small sizes it selects
+ * each of the (size!) possible permutations with roughly equal
+ * probability. For larger sizes, not all permutations can be generated,
+ * but the intended random spread is still produced.
+ */
+ for (i = 0; i < 4; i++)
+ {
+ uint64 m,
+ r,
+ t;
+
+ /* Random multiply (by an odd number), XOR and rotate of lower half */
+ m = randu64(&random_state1) | 1;
+ r = randu64(&random_state2);
+ if (v <= mask)
+ {
+ v = ((v * m) ^ r) & mask;
+ v = ((v << 1) & mask) | (v >> (masklen - 1));
+ }
+
+ /* Random multiply (by an odd number), XOR and rotate of upper half */
+ m = randu64(&random_state1) | 1;
+ r = randu64(&random_state2);
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ t = ((t * m) ^ r) & mask;
+ t = ((t << 1) & mask) | (t >> (masklen - 1));
+ v = size - 1 - t;
+ }
+
+ /* Random offset */
+ r = randu64(&random_state2);
+ v = (v + r) % size;
+ }
+
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2475,6 +2590,29 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PERMUTE:
+ {
+ int64 val,
+ size,
+ seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size <= 0)
+ {
+ pg_log_error("permute size parameter must be greater than zero");
+ return false;
+ }
+
+ setIntValue(retval, permute(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 3a9d89e6f1..6ce1c98649 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PERMUTE
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 82a46c72b6..d2b7f2efcb 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,6 +4,7 @@ use warnings;
use PostgresNode;
use TestLib;
use Test::More;
+use Config;
# start a pgbench specific server
my $node = get_new_node('main');
@@ -483,6 +484,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ # pseudorandom permutation tests
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: int 9223372036854775797\b},
+ qr{command=113.: boolean true\b},
],
'pgbench expressions',
{
@@ -610,6 +622,33 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- parametric pseudorandom permutation function
+\set t debug(permute(0, 2) + permute(1, 2) = 1)
+\set t debug(permute(0, 3) + permute(1, 3) + permute(2, 3) = 3)
+\set t debug(permute(0, 4) + permute(1, 4) + permute(2, 4) + permute(3, 4) = 6)
+\set t debug(permute(0, 5) + permute(1, 5) + permute(2, 5) + permute(3, 5) + permute(4, 5) = 10)
+\set t debug(permute(0, 16) + permute(1, 16) + permute(2, 16) + permute(3, 16) + \
+ permute(4, 16) + permute(5, 16) + permute(6, 16) + permute(7, 16) + \
+ permute(8, 16) + permute(9, 16) + permute(10, 16) + permute(11, 16) + \
+ permute(12, 16) + permute(13, 16) + permute(14, 16) + permute(15, 16) = 120)
+-- random sanity checks
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p permute(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = permute(:v + :size, :size) and :p <> permute(:v + 1, :size))
+-- actual values
+\set t debug(permute(:v, 1) = 0)
+\set t debug(permute(0, 2, 5432) = 0 and permute(1, 2, 5432) = 1 and \
+ permute(0, 2, 5435) = 1 and permute(1, 2, 5435) = 0)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set t debug(permute(:size-1, :size, 5432) = 6027738441054679063 and \
+ permute(:size-2, :size, 5432) = 3883756639104291588 and \
+ permute(:size-3, :size, 5432) = 3407620656580533607 and \
+ permute(:size-4, :size, 5432) = 4439146987379709618 and \
+ permute(:size-5, :size, 5432) = 2736158554213998904 and \
+ permute(:size-6, :size, 5432) = 1448350938097189462 and \
+ permute(:size-7, :size, 5432) = 3955131470409280934)
}
});
@@ -1048,6 +1087,10 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid permute size', 2,
+ [qr{permute size parameter must be greater than zero}], q{\set i permute(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index e38c7d77d1..4027e68dfa 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-1.sql' => "\\set i permute(1)\n" }
+ ],
+ [
+ 'too many arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-2.sql' => "\\set i permute(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
On Fri, 2 Apr 2021 at 06:38, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
r = (uint64) (pg_erand48(random_state.xseed) * size);
I do not understand why the random values are multiplied by anything in
the first place…
These are just random integers in the range [0,mask] and [0,size-1],
formed in exactly the same way as getrand().
Indeed, erand returns a double, this was the part I was missing. I did not
realize that you had switched to doubles in your approach.
I think that permute should only use integer operations. I'd suggest to
use one of the integer variants instead of going through a double
computation and casting back to int. The internal state is based on
integers, I do not see the added value of going through floats, possibly
enduring floating point issues (underflow, rounding, normalization,
whatever) on the way, whereas from start to finish we just need ints.
This is the already-established coding pattern used in getrand() to
pick a random number uniformly in some range that's not necessarily a
power of 2.
Floating point underflow and normalisation issues are not possible
because erand48() takes a 48-bit integer N and uses ldexp() to store
N/2^48 in a double, which is an exact operation since IEEE doubles
have something like 56-bit mantissas. This is then turned back into an
integer in the required range by multiplying by the desired maximum
value, so there's never any risk of underflow or normalisation issues.
If you didn't do it that way, you'd need to rely on some larger
integer datatype, such as 128-bit integers.
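As an illustration only, the scaling pattern described above can be sketched in isolation. This is not pgbench code: scale48 is a hypothetical helper, the 48-bit input stands in for the generator state, and dividing by 2^48 is used in place of the ldexp() call (both are exact for a 48-bit integer).

```c
#include <assert.h>
#include <stdint.h>

/* 2^48 as a double; the constant is exactly representable. */
#define TWO_POW_48 281474976710656.0

/*
 * Hypothetical helper, not part of pgbench: map a 48-bit integer n to
 * [0, size) by way of a double in [0, 1), following the pattern described
 * for erand48() and getrand().
 */
static uint64_t
scale48(uint64_t n, uint64_t size)
{
    double u;

    assert(n < ((uint64_t) 1 << 48));

    /* n fits in a double's significand, and dividing by 2^48 is exact */
    u = (double) n / TWO_POW_48;

    /* this multiplication is where rounding can enter for large sizes */
    return (uint64_t) (u * (double) size);
}
```

With size = 100, the smallest input maps to 0 and the largest 48-bit input maps to 99, so the result stays inside [0, size).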
I guess that there may be rounding variations once the required
maximum value exceeds something like 2^56 (although the comment in
getrand() is much more conservative than that), so it's possible that
a pgbench script calling random() with (ub-lb) larger than that might
give different results on different platforms. For the non-uniform
random functions, that effect might well kick in sooner. I'm not aware
of any field complaints about that though, possibly because real-world
data sizes are significantly smaller than that.
In practice, permute() is likely to take its input from one of the
non-uniform random functions, so it won't be permute() that first
introduces rounding issues.
See attached v27 proposal.
This update has a number of flaws. For example, this:
+static uint64
+randu64(RandomState * state)
+{
+ uint64 r1 = pg_jrand48((*state).xseed),
+ r2 = pg_jrand48((*state).xseed);
+ return r1 << 51 | r2 << 13 | r1 >> 13;
+}
It still uses a 48-bit RandomState, so it doesn't improve on getrand()
in that respect.
It replaces a single erand48() call with 2 jrand48() calls, which
comes at a cost in terms of performance. (In fact, changing the number
of rounds in the previous version of permute() from 4 to 6 has a
smaller performance impact than this -- more about that below.)
jrand48() returns a signed 32-bit integer, which has a 50% chance of
being negative, so when that is cast to a uint64, there is a 50%
chance that the 32 most significant bits will be 1. When the various
parts are OR'ed together, that will then mask out any randomness from
the other parts. For example, 50% of the time, the jrand48() value
used for r1 will be negative, and so 32 bits in the middle of the
final result will all be set.
There is essentially no randomness at all in bits 45..50, and the r1
and r2 values overlap in bits 13..18, giving them a 75% chance of
being set.
So overall, the results will be highly non-uniform, with less
randomness and poorer performance than erand48().
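The sign-extension problem can be reproduced without pgbench. The sketch below is a model, not the patch itself: combine_v27 is a hypothetical name, and the two jrand48() results are passed in as signed 32-bit values, which is an assumption made so that no RandomState is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 19..50: the 32 middle bits that r1 >> 13 fills when r1 is negative. */
#define MIDDLE_32_BITS (UINT64_C(0xFFFFFFFF) << 19)

/*
 * Model of the v27 randu64() combination step.  j1 and j2 stand in for the
 * two jrand48() results (assumed to be signed 32-bit values).
 */
static uint64_t
combine_v27(int32_t j1, int32_t j2)
{
    /* converting a negative value to uint64 sets the upper 32 bits */
    uint64_t r1 = (uint64_t) (int64_t) j1;
    uint64_t r2 = (uint64_t) (int64_t) j2;

    return r1 << 51 | r2 << 13 | r1 >> 13;
}

/* Check the claim for one draw whose first value is negative. */
static void
check_negative_first_draw(int32_t j1, int32_t j2)
{
    assert(j1 < 0);
    assert((combine_v27(j1, j2) & MIDDLE_32_BITS) == MIDDLE_32_BITS);
}
```

Calling check_negative_first_draw(-1, 0) or check_negative_first_draw(INT32_MIN, 12345) passes: whenever the first draw is negative, all of bits 19..50 are set regardless of the second draw.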
In addition, it returns a result in the range [0,2^64), which is not
really what's wanted. For example:
+ /* Random offset */
+ r = randu64(&random_state2);
+ v = (v + r) % size;
The previous code used r = getrand(0, size-1), which gave a uniformly
random offset. However, the new code risks introducing additional
non-uniformity when size is not a power of 2.
Finally, worst of all, this random offset is no longer bijective, due
to 64-bit integer wrap-around. For example, suppose that size=100 and
r=(2^64-10), then the following 2 values both map to the same result:
v = 20 -> (v + r) % size
= (20 + (2^64 - 10)) % 100
= (2^64 + 10) % 100
= (10) % 100
= 10
v = 4 -> (v + r) % size
= (4 + (2^64 - 10)) % 100
= (2^64 - 6) % 100
= (18446744073709551610) % 100
= 10
So not only are the results no longer uniformly random, they might not
even form a permutation.
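The loss of bijectivity can be checked mechanically for small sizes. The following is a sketch with a hypothetical helper (distinct_after_offset, limited to sizes up to an arbitrary MAX_SIZE); it counts how many distinct results the v27 offset step produces over all inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SIZE 1024           /* arbitrary limit for this sketch */

/*
 * Count the distinct results of the v27 offset step, (v + r) % size, over
 * all inputs v in [0, size).  A bijection would give exactly size distinct
 * results.
 */
static uint64_t
distinct_after_offset(uint64_t size, uint64_t r)
{
    bool     seen[MAX_SIZE] = {false};
    uint64_t count = 0;

    assert(size > 0 && size <= MAX_SIZE);

    for (uint64_t v = 0; v < size; v++)
    {
        uint64_t out = (v + r) % size;  /* v + r may wrap modulo 2^64 */

        if (!seen[out])
        {
            seen[out] = true;
            count++;
        }
    }
    return count;
}
```

With size = 100 and r = 2^64 - 10 (UINT64_MAX - 9), only 90 distinct results come out: inputs 0..9 land on 6..15, which are also reached from inputs 16..25. When size is a power of 2 the wrap-around is harmless, which is why the problem only shows up for other sizes.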
I did some more testing of the previous version (v26), this time
looking at much larger sizes, all the way up to the maximum, which is
2^63-1 since it comes from a signed int64. In general, the results
were very good, though I did notice some slight non-uniformity in the
way adjacent inputs were separated from one another when the size was just
under a power of 2. I think that's the hardest case for this
algorithm, because there's very little overlap between the 2 halves.
Increasing the number of rounds from 4 to 6 ironed out that
non-uniformity (and as mentioned above, is still cheaper than using
randu64() with 4 rounds), so I think we should go with that.
I still think that *rand48 is a poor (relatively small state) and
inefficient (the implementation includes packing and unpacking 16 bits
ints to build a 64 bits int) choice.
I can somewhat understand your dislike of *rand48(), but in the
context of pgbench I think that it is perfectly adequate. Since we now
use 2 RandomStates, I don't think the state size is an issue anymore,
if it ever really was. Using erand48() gives better performance than
jrand48() because it returns 48 bits rather than 32, so fewer calls
are needed, which allows more rounds for the same cost. Additionally,
following the same pattern as existing code reduces the risk of bugs,
and builds on proven, tested code.
You may wish to submit a separate patch to replace pgbench's use of
*rand48() with something else, and that would be discussed on its own
merits, but I don't see why that should hold up adding permute().
Regards,
Dean
Hello Dean,
I think that permute should only use integer operations. I'd suggest to
use one of the integer variants instead of going through a double
computation and casting back to int. The internal state is based on
integers, I do not see the added value of going through floats, possibly
enduring floating point issues (underflow, rounding, normalization,
whatever) on the way, whereas from start to finish we just need ints.
This is the already-established coding pattern used in getrand() to
pick a random number uniformly in some range that's not necessarily a
power of 2.
Indeed. I'm not particularly happy with that one either.
Floating point underflow and normalisation issues are not possible
because erand48() takes a 48-bit integer N and uses ldexp() to store
N/2^48 in a double, which is an exact operation since IEEE doubles
have something like 56-bit mantissas.
Double mantissa size is 52 bits.
This is then turned back into an integer in the required range by
multiplying by the desired maximum value, so there's never any risk of
underflow or normalisation issues.
ISTM that there are significant issues when multiplying with an integer,
because the integer is cast to a double before multiplying, so if the int
is over 52 bits then it is coldly truncated and some values are just lost
in the process and will never be drawn. Probably not too many of them, but
some of them anyway.
I guess that there may be rounding variations once the required
maximum value exceeds something like 2^56 (although the comment in
getrand() is much more conservative than that), so it's possible that
a pgbench script calling random() with (ub-lb) larger than that might
give different results on different platforms.
Dunno. This may be the same issue I'm pointing out above.
For the non-uniform random functions, that effect might well kick in
sooner. I'm not aware of any field complaints about that though,
possibly because real-world data sizes are significantly smaller than
that.
In practice, permute() is likely to take its input from one of the
non-uniform random functions, so it won't be permute() that first
introduces rounding issues.
Sure. I'd like permute to be immune to that.
See attached v27 proposal.
This update has a number of flaws. For example, this:
Indeed:-)
+static uint64
+randu64(RandomState * state)
+{
+ uint64 r1 = pg_jrand48((*state).xseed),
+ r2 = pg_jrand48((*state).xseed);
+ return r1 << 51 | r2 << 13 | r1 >> 13;
+}
It still uses a 48-bit RandomState, so it doesn't improve on getrand()
in that respect.
Sure. I'm pretty unhappy with that one, but I was not trying to address
that. My idea was that randu64 would be replaced with something better at
some point. My intention was "64-bits pseudo-random", my implementation
does not work, ok:-)
It replaces a single erand48() call with 2 jrand48() calls, which
comes at a cost in terms of performance. (In fact, changing the number
of rounds in the previous version of permute() from 4 to 6 has a
smaller performance impact than this -- more about that below.)
Sure, same remark as above, I was not trying to address that point.
jrand48() returns a signed 32-bit integer, which has a 50% chance of
being negative, so when that is cast to a uint64, there is a 50%
chance that the 32 most significant bits will be 1.
Argh.
When the various parts are OR'ed together, that will then mask out any
randomness from the other parts. For example, 50% of the time, the
jrand48() value used for r1 will be negative, and so 32 bits in the
middle of the final result will all be set.
Argh. I hesitated to use xor. I should not have:-)
So overall, the results will be highly non-uniform, with less
randomness and poorer performance than erand48().
Indeed, bad choice. I wanted to use the unsigned version but it is not
implemented, and switched to the signed version without thinking of some
of the implications.
In addition, it returns a result in the range [0,2^64), which is not
really what's wanted. For example:
+ /* Random offset */
+ r = randu64(&random_state2);
+ v = (v + r) % size;
The previous code used r = getrand(0, size-1), which gave a uniformly
random offset. However, the new code risks introducing additional
non-uniformity when size is not a power of 2.
ISTM that the overall non uniformity is worse with the float approach as
opposed to the int approach.
Conceptually, the same kind of bias is expected whether you get through
floats or through ints, because the underlying power-of-two maths is the
same, so what makes the difference in reducing non-uniformity is using
more bits. Basically, when enough bits are used the same number of values
should appear n vs n+1 times.
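The "n vs n+1 times" remark can be made concrete for the integer case. The sketch below uses a hypothetical helper, count_of, which reduces every integer in [0, 2^bits) modulo size and counts how often a given result occurs; the bit widths are kept small only so that exhaustive counting is cheap.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many integers x in [0, 2^bits) satisfy x % size == residue.
 * Exhaustive, so only intended for small bit widths.
 */
static uint64_t
count_of(unsigned bits, uint64_t size, uint64_t residue)
{
    uint64_t n = (uint64_t) 1 << bits;
    uint64_t count = 0;

    assert(bits <= 24 && size > 0 && residue < size);

    for (uint64_t x = 0; x < n; x++)
    {
        if (x % size == residue)
            count++;
    }
    return count;
}
```

With 8 bits and size = 100, results occur either 2 or 3 times; with 16 bits they occur 655 or 656 times. The absolute difference is still one, but the relative bias shrinks as more bits are used.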
When not enough bits are provided, things get ugly: for instance, with
size = 2^53, even if the floats carried a fully pseudo-random 52-bit
mantissa (they really carry 48 bits with erand48), only even numbers would
be produced, whereas with ints all numbers are produced. With erand48,
when size is above 48 bits ISTM that the last bits are always zeros
with the double approach. I'm not counting lost values because of size
truncation when converting it to double.
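A small sketch of the low-bits effect, under the assumption that the double carries exactly 48 random bits as with erand48. The helper scale_via_double is hypothetical and only mimics the multiply-and-truncate pattern; it is not the pgbench code.

```c
#include <assert.h>
#include <stdint.h>

/* 2^48 as a double; the constant is exactly representable. */
#define TWO_POW_48 281474976710656.0

/*
 * Hypothetical helper: scale a 48-bit integer n to [0, size) through a
 * double in [0, 1), mimicking the erand48-based approach.
 */
static uint64_t
scale_via_double(uint64_t n, uint64_t size)
{
    assert(n < ((uint64_t) 1 << 48));

    return (uint64_t) (((double) n / TWO_POW_48) * (double) size);
}
```

With size = 2^53, every result is n * 32, so the five low bits of the output are always zero and only one value in 32 can ever be drawn.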
Finally, worst of all, this random offset is no longer bijective, due
to 64-bit integer wrap-around. For example, suppose that size=100 and
r=(2^64-10), then the following 2 values both map to the same result:
v = 20 -> (v + r) % size
= (20 + (2^64 - 10)) % 100
= (2^64 + 10) % 100
= (10) % 100
= 10
v = 4 -> (v + r) % size
= (4 + (2^64 - 10)) % 100
= (2^64 - 6) % 100
= (18446744073709551610) % 100
= 10
So not only are the results no longer uniformly random, they might not
even form a permutation.
Indeed, this one is pretty fun! Probably the right formula for this
approach is "(v + r % size) % size", which is kind of a mouthful.
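A quick check of that formula, as a sketch only: offset_reduced and offset_is_bijective are hypothetical names, and the exhaustive test is limited to sizes up to an arbitrary MAX_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SIZE 1024           /* arbitrary limit for this sketch */

/*
 * Offset step with r reduced first.  Both terms are below size, so the sum
 * is below 2 * size and cannot wrap as long as size <= 2^63.
 */
static uint64_t
offset_reduced(uint64_t v, uint64_t r, uint64_t size)
{
    return (v + r % size) % size;
}

/* True if offset_reduced() maps [0, size) onto itself without collisions. */
static bool
offset_is_bijective(uint64_t size, uint64_t r)
{
    bool seen[MAX_SIZE] = {false};

    assert(size > 0 && size <= MAX_SIZE);

    for (uint64_t v = 0; v < size; v++)
    {
        uint64_t out = offset_reduced(v, r, size);

        if (seen[out])
            return false;
        seen[out] = true;
    }
    return true;
}
```

For size = 100 and r = 2^64 - 10, r % size is 6, so the inputs 20 and 4 from the example now map to 26 and 10 respectively, and the whole range is covered without collisions.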
I fully agree that my v27 implementation is botched on many dimensions,
some of them intentional and temporary (use jrand48 twice) and some of
them accidental (not considering int overflows, being optimistic on signed
to unsigned casts…).
I still disagree though that going through floating point is the right
thing to do, because of some of the issues I outlined above (eg truncation
and rounding for above 48/52-bits sizes). Basically I think that an
algorithm dealing with integers should not have to resort to floating
point computations unless it is actually required. This is not the case
for permute, where v26 is using doubles as glorified 48-bit integers, that
could be extended to 52-bit integers, but no more. The only benefit I see
is using implicitly the internal 104-bit rounding by truncation on
multiply, but I do not think that implicitly reducing the practical int
values to 52 bits is worth it, and that the same quality (bias) can be
achieved for 63 bits integers by keeping them as ints and writing the
right formula, which I fully failed to demonstrate in v27.
I did some more testing of the previous version (v26), this time
looking at much larger sizes, all the way up to the maximum, which is
2^63-1 since it comes from a signed int64. In general, the results
were very good, though I did notice some slight non-uniformity in the
way adjacent inputs were separated from one another when the size was just
under a power of 2. I think that's the hardest case for this
algorithm, because there's very little overlap between the 2 halves.
Yes, fewer values are steered twice per round. However, as for adjacent
values for large sizes, I'm wondering whether this may have more to do
with the 48 bit limitations, so that lower bits are not really xored for
instance. Not sure.
Increasing the number of rounds from 4 to 6 ironed out that
non-uniformity (and as mentioned above, is still cheaper than using
randu64() with 4 rounds), so I think we should go with that.
There is a quality-cost tradeoff. With the previous version I convinced
myself that 4 rounds were a good compromise (not perfect, but ok for
keeping the cost low on practical sizes).
With this version, I'll admit that I do not have an opinion.
You may wish to submit a separate patch to replace pgbench's use of
*rand48() with something else, and that would be discussed on its own
merits, but I don't see why that should hold up adding permute().
I'll see.
Attached a v28 which I hope fixes the many above issues and stays with
ints. The randu64 is still some kind of a joke, I artificially reduced the
cost by calling jrand48 once and extending it to 64 bits, so it could give
an idea of the cost endured if a 64-bit prng was used.
Now you are the committer, you can do as you please, I'm just stating my
(mathematical) opinions about using floating point computations for that.
I think that apart from this point of principle/philosophy the permute
performance and implementation are reasonable, and better than my initial
version because it avoids int128 computations and the large prime number
business.
--
Fabien.
Attachments:
pgbench-prp-func-28.patch (text/x-diff)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 50cf22ba6b..84d9566f49 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<row>
<entry> <literal>default_seed</literal> </entry>
- <entry>seed used in hash functions by default</entry>
+ <entry>seed used in hash and pseudorandom permutation functions by default</entry>
</row>
<row>
@@ -1864,6 +1864,24 @@ SELECT 4 AS four \; SELECT 5 AS five \aset
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <function>permute</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Permuted value of <parameter>i</parameter>, in the range
+ <literal>[0, size)</literal>. This is the new position of
+ <parameter>i</parameter> (modulo <parameter>size</parameter>) in a
+ pseudorandom permutation of the integers <literal>0...size-1</literal>,
+ parameterized by <parameter>seed</parameter>.
+ </para>
+ <para>
+ <literal>permute(0, 4)</literal>
+ <returnvalue>an integer between 0 and 3</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<function>pi</function> ()
@@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
</listitem>
</itemizedlist>
+ <note>
+ <para>
+ When designing a benchmark which selects rows non-uniformly, be aware
+ that the rows chosen may be correlated with other data such as IDs from
+ a sequence or the physical row ordering, which may skew performance
+ measurements.
+ </para>
+ <para>
+ To avoid this, you may wish to use the <function>permute</function>
+ function, or some other additional step with similar effect, to shuffle
+ the selected rows and remove such correlations.
+ </para>
+ </note>
+
<para>
Hash functions <literal>hash</literal>, <literal>hash_murmur2</literal> and
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
In case the seed isn't provided the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
- <literal>-D</literal> option. Hash functions can be used to scatter the
- distribution of random functions such as <literal>random_zipfian</literal> or
- <literal>random_exponential</literal>. For instance, the following pgbench
- script simulates possible real world workload typical for social media and
- blogging platforms where few accounts generate excessive load:
+ <literal>-D</literal> option.
+ </para>
+
+ <para>
+ <literal>permute</literal> accepts an input value, a size, and an optional
+ seed parameter. It generates a pseudorandom permutation of integers in
+ the range <literal>[0, size)</literal>, and returns the index of the input
+ value in the permuted values. The permutation chosen is parameterized by
+ the seed, which defaults to <literal>:default_seed</literal>, if not
+ specified. Unlike the hash functions, <literal>permute</literal> ensures
+ that there are no collisions or holes in the output values. Input values
+ outside the interval are interpreted modulo the size. The function raises
+ an error if the size is not positive. <function>permute</function> can be
+ used to scatter the distribution of non-uniform random functions such as
+ <literal>random_zipfian</literal> or <literal>random_exponential</literal>
+ so that values drawn more often are not trivially correlated. For
+ instance, the following <application>pgbench</application> script
+ simulates a possible real world workload typical for social media and
+ blogging platforms where a few accounts generate excessive load:
<programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + permute(:r, :size)
</programlisting>
In some cases several distinct distributions are needed which don't correlate
- with each other and this is when implicit seed parameter comes in handy:
+ with each other and this is when the optional seed parameter comes in handy:
<programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + permute(:r, :size, :default_seed + 123)
+\set k2 1 + permute(:r, :size, :default_seed + 321)
</programlisting>
+
+ A similar behavior can also be approximated with <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
+</programlisting>
+
+ However, since <function>hash</function> generates collisions, some values
+ will not be reachable and others will be more frequent than expected from
+ the original distribution.
</para>
<para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 4d529ea550..56f75ccd25 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
#define PGBENCH_NARGS_VARIABLE (-1)
#define PGBENCH_NARGS_CASE (-2)
#define PGBENCH_NARGS_HASH (-3)
+#define PGBENCH_NARGS_PERMUTE (-4)
PgBenchExpr *expr_parse_result;
@@ -370,6 +371,9 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
+ {
+ "permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE
+ },
/* keep as last array element */
{
NULL, 0, 0
@@ -482,6 +486,19 @@ make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *args)
}
break;
+ /* pseudorandom permutation function with optional seed argument */
+ case PGBENCH_NARGS_PERMUTE:
+ if (len < 2 || len > 3)
+ expr_yyerror_more(yyscanner, "unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
+
+ if (len == 2)
+ {
+ PgBenchExpr *var = make_variable("default_seed");
+ args = make_elist(var, args);
+ }
+ break;
+
/* common case: positive arguments number */
default:
Assert(PGBENCH_FUNCTIONS[fnumber].nargs >= 0);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 48ce1712cc..68a78a3d53 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -66,6 +66,7 @@
#include "getopt_long.h"
#include "libpq-fe.h"
#include "pgbench.h"
+#include "port/pg_bitutils.h"
#include "portability/instr_time.h"
#ifndef M_PI
@@ -1127,6 +1128,127 @@ getHashMurmur2(int64 val, uint64 seed)
return (int64) result;
}
+/*
+ * Return a pseudo-random unsigned 64-bit integer.
+ *
+ * This should really be one call to an actual 64-bit pseudo-random generator.
+ * This is just for a demonstration which poorly extends a 32-bit int.
+ */
+static uint64
+randu64(RandomState * state)
+{
+ uint64 r = pg_jrand48((*state).xseed);
+ return (r << 51) ^ ((r ^ 0xdeadbeef) << 13) ^ (r >> 13);
+}
+
+
+/*
+ * Pseudorandom permutation function
+ *
+ * For small sizes, this generates each of the (size!) possible permutations
+ * of integers in the range [0, size) with roughly equal probability. Once
+ * the size is larger than 16, the number of possible permutations exceeds the
+ * number of distinct states of the internal pseudorandom number generator,
+ * and so not all possible permutations can be generated, but the permutations
+ * chosen should continue to give the appearance of being random.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+permute(const int64 val, const int64 isize, const int64 seed)
+{
+ RandomState random_state1;
+ RandomState random_state2;
+ uint64 size;
+ uint64 v;
+ int masklen;
+ uint64 mask;
+ int i;
+
+ if (isize < 2)
+ return 0; /* nothing to permute */
+
+ /* Initialize a pair of random states using the seed */
+ random_state1.xseed[0] = seed & 0xFFFF;
+ random_state1.xseed[1] = (seed >> 16) & 0xFFFF;
+ random_state1.xseed[2] = (seed >> 32) & 0xFFFF;
+
+ random_state2.xseed[0] = (((uint64) seed) >> 48) & 0xFFFF;
+ random_state2.xseed[1] = seed & 0xFFFF;
+ random_state2.xseed[2] = (seed >> 16) & 0xFFFF;
+
+ /* Computations are performed on unsigned values, size is 63 bits */
+ size = (uint64) isize;
+ v = (uint64) val % size;
+
+ /* Mask to work modulo largest power of 2 less than or equal to size */
+ masklen = pg_leftmost_one_pos64(size);
+ mask = (((uint64) 1) << masklen) - 1;
+
+ /*
+ * Permute the input value by applying 4 rounds of pseudorandom bijective
+ * transformations. The intention here is to distribute each input
+ * uniformly randomly across the range, and separate adjacent inputs
+ * approximately uniformly randomly from each other, leading to a fairly
+ * random overall choice of permutation.
+ *
+ * To separate adjacent inputs, we multiply by a random number modulo
+ * (mask + 1), which is a power of 2. For this to be a bijection, the
+ * multiplier must be odd. Since this is known to lead to less randomness
+ * in the lower bits, we also apply a rotation that shifts the topmost bit
+ * into the least significant bit. In the special cases where size <= 3,
+ * mask = 1 and each of these operations is actually a no-op, so we also
+ * XOR with a different random number to inject additional randomness.
+ * Since the size is generally not a power of 2, we apply this bijection
+ * on overlapping upper and lower halves of the input.
+ *
+ * To distribute the inputs uniformly across the range, we then also apply
+ * a random offset modulo the full range.
+ *
+ * Taken together, these operations resemble a modified linear
+ * congruential generator, as is commonly used in pseudorandom number
+ * generators. Empirically, it is found that for small sizes it selects
+ * each of the (size!) possible permutations with roughly equal
+ * probability. For larger sizes, not all permutations can be generated,
+ * but the intended random spread is still produced.
+ */
+ for (i = 0; i < 4; i++)
+ {
+ uint64 m,
+ r,
+ t;
+
+ /* pseudo-random multiply (by an odd number), XOR and rotate of lower half */
+ m = randu64(&random_state1) | 1;
+ r = randu64(&random_state2);
+ if (v <= mask)
+ {
+ /* multiply overflow is okay */
+ v = ((v * m) ^ r) & mask;
+ v = ((v << 1) & mask) | (v >> (masklen - 1));
+ }
+
+ /* pseudo-random multiply (by an odd number), XOR and rotate of upper half */
+ m = randu64(&random_state1) | 1;
+ r = randu64(&random_state2);
+ t = size - 1 - v;
+ if (t <= mask)
+ {
+ /* multiply overflow is okay */
+ t = ((t * m) ^ r) & mask;
+ t = ((t << 1) & mask) | (t >> (masklen - 1));
+ v = size - 1 - t;
+ }
+
+ /* pseudo-random offset */
+ r = randu64(&random_state2) % size;
+ v = (v + r) % size;
+ }
+
+ return (int64) v;
+}
+
/*
* Initialize the given SimpleStats struct to all zeroes
*/
@@ -2475,6 +2597,29 @@ evalStandardFunc(CState *st,
return true;
}
+ case PGBENCH_PERMUTE:
+ {
+ int64 val,
+ size,
+ seed;
+
+ Assert(nargs == 3);
+
+ if (!coerceToInt(&vargs[0], &val) ||
+ !coerceToInt(&vargs[1], &size) ||
+ !coerceToInt(&vargs[2], &seed))
+ return false;
+
+ if (size <= 0)
+ {
+ pg_log_error("permute size parameter must be greater than zero");
+ return false;
+ }
+
+ setIntValue(retval, permute(val, size, seed));
+ return true;
+ }
+
default:
/* cannot get here */
Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 3a9d89e6f1..6ce1c98649 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
PGBENCH_IS,
PGBENCH_CASE,
PGBENCH_HASH_FNV1A,
- PGBENCH_HASH_MURMUR2
+ PGBENCH_HASH_MURMUR2,
+ PGBENCH_PERMUTE
} PgBenchFunction;
typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 82a46c72b6..8f80e157b5 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,6 +4,7 @@ use warnings;
use PostgresNode;
use TestLib;
use Test::More;
+use Config;
# start a pgbench specific server
my $node = get_new_node('main');
@@ -483,6 +484,17 @@ pgbench(
qr{command=98.: int 5432\b}, # :random_seed
qr{command=99.: int -9223372036854775808\b}, # min int
qr{command=100.: int 9223372036854775807\b}, # max int
+ # pseudorandom permutation tests
+ qr{command=101.: boolean true\b},
+ qr{command=102.: boolean true\b},
+ qr{command=103.: boolean true\b},
+ qr{command=104.: boolean true\b},
+ qr{command=105.: boolean true\b},
+ qr{command=109.: boolean true\b},
+ qr{command=110.: boolean true\b},
+ qr{command=111.: boolean true\b},
+ qr{command=112.: int 9223372036854775797\b},
+ qr{command=113.: boolean true\b},
],
'pgbench expressions',
{
@@ -610,6 +622,33 @@ SELECT :v0, :v1, :v2, :v3;
-- minint constant parsing
\set min debug(-9223372036854775808)
\set max debug(-(:min + 1))
+-- parametric pseudorandom permutation function
+\set t debug(permute(0, 2) + permute(1, 2) = 1)
+\set t debug(permute(0, 3) + permute(1, 3) + permute(2, 3) = 3)
+\set t debug(permute(0, 4) + permute(1, 4) + permute(2, 4) + permute(3, 4) = 6)
+\set t debug(permute(0, 5) + permute(1, 5) + permute(2, 5) + permute(3, 5) + permute(4, 5) = 10)
+\set t debug(permute(0, 16) + permute(1, 16) + permute(2, 16) + permute(3, 16) + \
+ permute(4, 16) + permute(5, 16) + permute(6, 16) + permute(7, 16) + \
+ permute(8, 16) + permute(9, 16) + permute(10, 16) + permute(11, 16) + \
+ permute(12, 16) + permute(13, 16) + permute(14, 16) + permute(15, 16) = 120)
+-- random sanity checks
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p permute(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = permute(:v + :size, :size) and :p <> permute(:v + 1, :size))
+-- actual values
+\set t debug(permute(:v, 1) = 0)
+\set t debug(permute(0, 2, 5432) = 0 and permute(1, 2, 5432) = 1 and \
+ permute(0, 2, 5435) = 1 and permute(1, 2, 5435) = 0)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set t debug(permute(:size-1, :size, 5432) = 1225654122160942995 and \
+ permute(:size-2, :size, 5432) = 8874742469157307278 and \
+ permute(:size-3, :size, 5432) = 2898638033959878781 and \
+ permute(:size-4, :size, 5432) = 6791412537301503153 and \
+ permute(:size-5, :size, 5432) = 4942524564272279459 and \
+ permute(:size-6, :size, 5432) = 5514957334643765238 and \
+ permute(:size-7, :size, 5432) = 6151510276933039910)
}
});
@@ -1048,6 +1087,10 @@ SELECT LEAST(} . join(', ', (':i') x 256) . q{)}
'bad boolean', 2,
[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
],
+ [
+ 'invalid permute size', 2,
+ [qr{permute size parameter must be greater than zero}], q{\set i permute(0, 0)}
+ ],
# GSET
[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
index e38c7d77d1..4027e68dfa 100644
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
'set i',
[ qr{set i 1 }, qr{\^ error found here} ],
{ 'set_i_op' => "\\set i 1 +\n" }
+ ],
+ [
+ 'not enough arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-1.sql' => "\\set i permute(1)\n" }
+ ],
+ [
+ 'too many arguments to permute',
+ [qr{unexpected number of arguments \(permute\)}],
+ { 'bad-permute-2.sql' => "\\set i permute(1, 2, 3, 4)\n" }
],);
for my $t (@script_tests)
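
For readers who want to check the "no collisions or holes" property outside pgbench, here is a minimal standalone sketch of the four-round loop from the patch above. It is an illustration under stated assumptions, not the patch's code: splitmix64() is a stand-in generator picked for this sketch (the patch uses randu64() on top of pg_jrand48), the two generator states are seeded differently from the patch, and leftmost_one_pos64(), permute_sketch() and is_bijection() are hypothetical names. Because the generator differs, the returned values will not match those expected by the TAP tests; only the bijection property carries over.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in 64-bit generator (splitmix64); an assumption, NOT what pgbench uses. */
static uint64_t
splitmix64(uint64_t *state)
{
	uint64_t	z = (*state += UINT64_C(0x9E3779B97F4A7C15));

	z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
	z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
	return z ^ (z >> 31);
}

/* Position of the most significant set bit; x must be non-zero. */
static int
leftmost_one_pos64(uint64_t x)
{
	int			pos = 0;

	while (x >>= 1)
		pos++;
	return pos;
}

/* Same round structure as permute() in the patch, with the stand-in PRNG. */
static int64_t
permute_sketch(int64_t val, int64_t isize, int64_t seed)
{
	uint64_t	s1 = (uint64_t) seed;		/* hypothetical seeding scheme */
	uint64_t	s2 = ~(uint64_t) seed;
	uint64_t	size, v, mask;
	int			masklen, i;

	if (isize < 2)
		return 0;				/* nothing to permute */

	size = (uint64_t) isize;
	v = (uint64_t) val % size;

	/* mask + 1 is the largest power of 2 less than or equal to size */
	masklen = leftmost_one_pos64(size);
	mask = (UINT64_C(1) << masklen) - 1;

	for (i = 0; i < 4; i++)
	{
		uint64_t	m, r, t;

		/* odd multiply, XOR and 1-bit rotate on the lower range [0, mask] */
		m = splitmix64(&s1) | 1;
		r = splitmix64(&s2);
		if (v <= mask)
		{
			v = ((v * m) ^ r) & mask;
			v = ((v << 1) & mask) | (v >> (masklen - 1));
		}

		/* same transformation on the upper range, counted from size - 1 */
		m = splitmix64(&s1) | 1;
		r = splitmix64(&s2);
		t = size - 1 - v;
		if (t <= mask)
		{
			t = ((t * m) ^ r) & mask;
			t = ((t << 1) & mask) | (t >> (masklen - 1));
			v = size - 1 - t;
		}

		/* offset modulo the full range */
		r = splitmix64(&s2) % size;
		v = (v + r) % size;
	}
	return (int64_t) v;
}

/* Return 1 if every value in [0, size) is produced exactly once. */
static int
is_bijection(int64_t size, int64_t seed)
{
	unsigned char *seen;
	int			ok = 1;
	int64_t		i;

	assert(size > 0);
	seen = calloc((size_t) size, 1);
	if (seen == NULL)
		return 0;
	for (i = 0; i < size; i++)
	{
		int64_t		p = permute_sketch(i, size, seed);

		if (p < 0 || p >= size || seen[p])
			ok = 0;				/* out of range or collision */
		else
			seen[p] = 1;
	}
	free(seen);
	return ok;
}
```

Each step (odd multiply modulo a power of two, XOR, one-bit rotation, offset modulo size) is a bijection on its own domain, so is_bijection() is expected to return 1 for any size that fits in memory, for instance is_bijection(1000, 5432).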
On Mon, 5 Apr 2021 at 13:07, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Attached a v28 which I hope fixes the many above issues and stays with
> ints. The randu64 is still some kind of a joke, I artificially reduced the
> cost by calling jrand48 once and extending it to 64 bits, so it could give
> an idea of the cost endured if a 64-bit prng was used.
>
> Now you are the committer, you can do as you please, I'm just stating my
> (mathematical) opinions about using floating point computations for that.
> I think that apart from this point of principle/philosophy the permute
> performance and implementation are reasonable, and better than my initial
> version because it avoids int128 computations and the large prime number
> business.
Pushed.
I decided not to go with the "joke" randu64() function, but instead
used getrand() directly. This at least removes any *direct* use of
doubles in permute() (though of course they're still used indirectly),
and means that if getrand() is improved in the future, permute() will
benefit too.
Regards,
Dean
Hello Dean,
> Pushed.
> I decided not to go with the "joke" randu64() function, but instead
> used getrand() directly. This at least removes any *direct* use of
> doubles in permute() (though of course they're still used indirectly),
> and means that if getrand() is improved in the future, permute() will
> benefit too.
Good idea, at least it is hidden and in one place.
Thanks for the push!
--
Fabien.