PSA: we lack TAP test coverage on NetBSD and OpenBSD

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Tom Lane (#1)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Hello Tom,

Although we've got a few NetBSD and OpenBSD buildfarm critters,
none of them are doing --enable-tap-tests. If they were, we'd
have noticed the pgbench regression tests falling over:

[...]

I am, TBH, inclined to fix this by removing that test case rather
than teaching it another spelling to accept. I think it's very
hard to make the case that tests like this one are anything but
a waste of developer and buildfarm time. When they are also a
portability hazard, it's time to cut our losses. (I also note
that this test has caused us problems before, cf 869aa40a2 and
933851033.)

I'd rather keep it by simply adding the "|unknown" alternative. 30 years
of programming have taught me that testing limit & error cases is useful,
although you never know when it will be proven so.

Client application coverage is currently abysmal, especially "psql"
despite the many script used for testing (39% of lines, 42% of
functions!), pgbench is under 90%. Generally we really need more tests,
not less. TAP tests are a good compromise because they are not always
run, and ISTM sometimes (i.e. you asked for it) is enough.

I agree that some tests can be useless, but I do not think that it applies
to this one. This test also checks that under a bad option pgbench stops
with an appropriate 1 exit status. Recently a patch updated the exit
status of pgbench in many cases to distinguish between different kind
errors, thus having non-regression in this area was shown to be a good
idea. Moreover, knowing that the exit status on getopt errors is
consistent across platform has value, and knowing that there is some
variability is not uninteresting.

--
Fabien.

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Tom Lane (#1)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-17 06:04, Tom Lane wrote:

Although we've got a few NetBSD and OpenBSD buildfarm critters,
none of them are doing --enable-tap-tests. If they were, we'd
have noticed the pgbench regression tests falling over:

For what it's worth I've enabled tap-tests for my OpenBSD 5.9 (curculio)
and NetBSD 7 (sidewinder) animals now.

/Mikael

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Mikael Kjellström (#3)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@mksoft.nu> writes:

On 2019-01-17 06:04, Tom Lane wrote:

Although we've got a few NetBSD and OpenBSD buildfarm critters,
none of them are doing --enable-tap-tests. If they were, we'd
have noticed the pgbench regression tests falling over:

For what it's worth I've enabled tap-tests for my OpenBSD 5.9 (curculio)
and NetBSD 7 (sidewinder) animals now.

Oh, thanks! I'm guessing they'll fail their next runs, but I'll
wait to see confirmation of that before I do anything about the
test bug.

regards, tom lane

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Tom Lane (#4)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-17 22:16, Tom Lane wrote:

For what it's worth I've enabled tap-tests for my OpenBSD 5.9 (curculio)
and NetBSD 7 (sidewinder) animals now.

Oh, thanks! I'm guessing they'll fail their next runs, but I'll
wait to see confirmation of that before I do anything about the
test bug.

They should run the next time within the hour or hour and a half so I
guess we will find out soon enough.

/Mikael

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Mikael Kjellström (#5)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-17 22:19, Mikael Kjellström wrote:

On 2019-01-17 22:16, Tom Lane wrote:

For what it's worth I've enabled tap-tests for my OpenBSD 5.9 (curculio)
and NetBSD 7 (sidewinder) animals now.

Oh, thanks! I'm guessing they'll fail their next runs, but I'll
wait to see confirmation of that before I do anything about the
test bug.

They should run the next time within the hour or hour and a half so I
guess we will find out soon enough.

Hm, that didn't go so well.

It says:

configure: error: Additional Perl modules are required to run TAP tests

so how do I find out with Perl modules that are required?

/Mikael

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Mikael Kjellström (#6)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@mksoft.nu> writes:

It says:
configure: error: Additional Perl modules are required to run TAP tests
so how do I find out with Perl modules that are required?

If you look into the configure log it should say just above that,
but I'm betting you just need p5-IPC-Run.

regards, tom lane

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Tom Lane (#7)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-17 22:42, Tom Lane wrote:

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@mksoft.nu> writes:

It says:
configure: error: Additional Perl modules are required to run TAP tests
so how do I find out with Perl modules that are required?

If you look into the configure log it should say just above that,
but I'm betting you just need p5-IPC-Run.

Yes it seems to be IPC::Run that is missing.

I've installed it manually through CPAN.

Let's see if it works better this time.

/Mikael

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Mikael Kjellström (#8)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-17 22:47, Mikael Kjellström wrote:

On 2019-01-17 22:42, Tom Lane wrote:

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@mksoft.nu> writes:

It says:
configure: error: Additional Perl modules are required to run TAP tests
so how do I find out with Perl modules that are required?

If you look into the configure log it should say just above that,
but I'm betting you just need p5-IPC-Run.

Yes it seems to be IPC::Run that is missing.

I've installed it manually through CPAN.

Let's see if it works better this time.

Hmmm, nope:

==================
pgsql.build/src/bin/pg_ctl/tmp_check/log/003_promote_standby.log
===================
2019-01-17 23:09:20.343 CET [9129] LOG: listening on Unix socket
"/tmp/g66P1fpMFK/.s.PGSQL.64980"
2019-01-17 23:09:20.343 CET [9129] FATAL: could not create semaphores:
No space left on device
2019-01-17 23:09:20.343 CET [9129] DETAIL: Failed system call was
semget(64980002, 17, 03600).
2019-01-17 23:09:20.343 CET [9129] HINT: This error does *not* mean
that you have run out of disk space. It occurs when either the system
limit for the maximum number of semaphore sets (SEMMNI), or the system
wide maximum number of semaphores (SEMMNS), would be exceeded. You need
to raise the respective kernel parameter. Alternatively, reduce
PostgreSQL's consumption of semaphores by reducing its max_connections
parameter.
The PostgreSQL documentation contains more information about
configuring your system for PostgreSQL.
2019-01-17 23:09:20.345 CET [9129] LOG: database system is shut down

will try and increase SEMMNI and see if that helps.

/Mikael

#10

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Mikael Kjellström (#9)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@mksoft.nu> writes:

Let's see if it works better this time.

Hmmm, nope:

2019-01-17 23:09:20.343 CET [9129] FATAL: could not create semaphores:
No space left on device

Yeah, you might've been able to get by with OpenBSD/NetBSD's default
semaphore settings before, but they really only let one postmaster
run at a time; and the TAP tests want to start more than one.
For me it seems to work to append this to /etc/sysctl.conf:

kern.seminfo.semmni=100
kern.seminfo.semmns=2000

and either reboot, or install those settings manually with sysctl.

regards, tom lane

#11

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Tom Lane (#10)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-17 23:23, Tom Lane wrote:

Yeah, you might've been able to get by with OpenBSD/NetBSD's default
semaphore settings before, but they really only let one postmaster
run at a time; and the TAP tests want to start more than one.
For me it seems to work to append this to /etc/sysctl.conf:

kern.seminfo.semmni=100
kern.seminfo.semmns=2000

and either reboot, or install those settings manually with sysctl.

Looks that way.

I've increased the values and rebooted the machines.

Let's hope 5th time is the charm :-)

/Mikael

#12

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Mikael Kjellström (#11)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-17 23:37, Mikael Kjellström wrote:

On 2019-01-17 23:23, Tom Lane wrote:

Yeah, you might've been able to get by with OpenBSD/NetBSD's default
semaphore settings before, but they really only let one postmaster
run at a time; and the TAP tests want to start more than one.
For me it seems to work to append this to /etc/sysctl.conf:

kern.seminfo.semmni=100
kern.seminfo.semmns=2000

and either reboot, or install those settings manually with sysctl.

Looks that way.

I've increased the values and rebooted the machines.

Let's hope 5th time is the charm :-)

Nope!

But it looks like in NetBSD the options are called:

netbsd7-pgbf# sysctl -a | grep semmn
kern.ipc.semmni = 10
kern.ipc.semmns = 60
kern.ipc.semmnu = 30

so I will try and set that in /etc/sysctl.conf and reboot and see what
happens.

/Mikael

#13

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Mikael Kjellström (#12)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-17 23:54, Mikael Kjellström wrote:

But it looks like in NetBSD the options are called:

netbsd7-pgbf# sysctl -a | grep semmn
kern.ipc.semmni = 10
kern.ipc.semmns = 60
kern.ipc.semmnu = 30

so I will try and set that in /etc/sysctl.conf and reboot and see what
happens.

That seems to have done the trick:

netbsd7-pgbf# sysctl -a | grep semmn
kern.ipc.semmni = 100
kern.ipc.semmns = 2000
kern.ipc.semmnu = 30

I just started another run on sidewinder (NetBSD 7), let's see how that
goes.

but the OpenBSD machine went further and now fails on:

pgbenchCheck instead.

Is that the failure you expected to get?

/Mikael

#14

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2019-01-17%2022%3A57%3A14

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Mikael Kjellström (#13)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-18 00:00, Mikael Kjellström wrote:

I just started another run on sidewinder (NetBSD 7), let's see how that
goes.

but the OpenBSD machine went further and now fails on:

pgbenchCheck instead.

Is that the failure you expected to get?

And now also the NetBSD machine failed on pgbenchCheck.

should I leave it as it is for now?

/Mikael

#15

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Mikael Kjellström (#13)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@mksoft.nu> writes:

But it looks like in NetBSD the options are called:

Sorry about that, I copied-and-pasted from the openbsd machine I was
looking at without remembering that netbsd is just a shade different.

but the OpenBSD machine went further and now fails on:
pgbenchCheck instead.
Is that the failure you expected to get?

Yup, sure is. Thanks!

regards, tom lane

#16

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Mikael Kjellström (#14)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@mksoft.nu> writes:

And now also the NetBSD machine failed on pgbenchCheck.

Indeed, as expected.

should I leave it as it is for now?

Please. I'll push a fix for the broken test case in a bit --- I
just wanted to confirm that somebody else's machines agreed that
it's broken.

regards, tom lane

#17

mikael.kjellstrom@mksoft.nu

almost 7 years ago

In reply to: Tom Lane (#16)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On 2019-01-18 00:31, Tom Lane wrote:

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@mksoft.nu> writes:

And now also the NetBSD machine failed on pgbenchCheck.

Indeed, as expected.

Ok.

should I leave it as it is for now?

Please. I'll push a fix for the broken test case in a bit --- I
just wanted to confirm that somebody else's machines agreed that
it's broken.

Ok, I will leave it on then.

/Mikael

#18

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Fabien COELHO (#2)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Fabien COELHO <coelho@cri.ensmp.fr> writes:

I am, TBH, inclined to fix this by removing that test case rather
than teaching it another spelling to accept. I think it's very
hard to make the case that tests like this one are anything but
a waste of developer and buildfarm time. When they are also a
portability hazard, it's time to cut our losses. (I also note
that this test has caused us problems before, cf 869aa40a2 and
933851033.)

I'd rather keep it by simply adding the "|unknown" alternative. 30 years
of programming have taught me that testing limit & error cases is useful,
although you never know when it will be proven so.

Sorry, I don't buy this line of argument. Reasonable test design requires
making cost/benefit tradeoffs: the cost to run the test over and over,
and the cost to maintain the test itself (e.g. fix portability issues in
it) have to be balanced against the probability of it finding something
useful. I judge that the chance of this particular test finding something
is small, and I've had quite enough of the maintenance costs.

Just to point up that we're still not clearly done with the maintenance
costs of supposing that we know how every version of getopt_long will
spell this error message, I note that my Linux box seems to have two
variants of it:

$ pgbench -z
pgbench: invalid option -- 'z'
Try "pgbench --help" for more information.
$ pgbench --z
pgbench: unrecognized option '--z'
Try "pgbench --help" for more information.

of which the "invalid" alternative is also not in our list right now.
Who's to say how many more versions of getopt_long are out there,
or what the maintainers thereof might do in the future?

I agree that some tests can be useless, but I do not think that it applies
to this one. This test also checks that under a bad option pgbench stops
with an appropriate 1 exit status.

It's possible that it's worth the trouble to check for exit status 1,
but I entirely fail to see the point of checking exactly what is the
spelling of a message that is issued by code not under our control.

Looking closer at the test case:

[
'bad option',
'-h home -p 5432 -U calvin -d --bad-option',
[ qr{(unrecognized|illegal) option}, qr{--help.*more information} ]
],

ISTM that just removing the first qr{} pattern, and checking only that
we get the text that *is* under our control, is a reasonable compromise
here.

regards, tom lane

#19

Michael Paquier

michael@paquier.xyz

almost 7 years ago

In reply to: Tom Lane (#18)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

On Thu, Jan 17, 2019 at 07:21:08PM -0500, Tom Lane wrote:

Sorry, I don't buy this line of argument. Reasonable test design requires
making cost/benefit tradeoffs: the cost to run the test over and over,
and the cost to maintain the test itself (e.g. fix portability issues in
it) have to be balanced against the probability of it finding something
useful. I judge that the chance of this particular test finding something
is small, and I've had quite enough of the maintenance costs.

Yes, I agree with Tom's line of thoughts here. It seems to me that
just dropping this part of the test is just but fine.
--
Michael

#20

https://man.openbsd.org/srandom

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Tom Lane (#18)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

BTW, if you're wondering why curculio is still failing the pgbench
test, all is explained here:

Or at least most is explained there. While curculio is unsurprisingly
failing all four seeded_random tests, when I try it locally on an
OpenBSD 6.4 installation, only the uniform, exponential, and gaussian
cases reliably "fail". zipfian usually doesn't. It looks like the
zipfian code almost always produces 4000 regardless of the seed value,
though occasionally it produces 4001. Bad parameters for that
algorithm, perhaps?

regards, tom lane

#21

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Tom Lane (#18)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

I'd rather keep it by simply adding the "|unknown" alternative. 30 years
of programming have taught me that testing limit & error cases is useful,
although you never know when it will be proven so.

Sorry, I don't buy this line of argument.

Reasonable test design requires making cost/benefit tradeoffs: the cost
to run the test over and over, and the cost to maintain the test itself
(e.g. fix portability issues in it) have to be balanced against the
probability of it finding something useful. I judge that the chance of
this particular test finding something is small, and I've had quite
enough of the maintenance costs.

Just to point up that we're still not clearly done with the maintenance
costs of supposing that we know how every version of getopt_long will
spell this error message, I note that my Linux box seems to have two
variants of it:

$ pgbench -z
pgbench: invalid option -- 'z'
Try "pgbench --help" for more information.
$ pgbench --z
pgbench: unrecognized option '--z'
Try "pgbench --help" for more information.

of which the "invalid" alternative is also not in our list right now.
Who's to say how many more versions of getopt_long are out there,
or what the maintainers thereof might do in the future?

ISTM that the getopt implementers imagination should run out in the end:-)
invalid, unknown, unrecognized, unexpected, incorrect... Ok English has
many words:-)

I agree that some tests can be useless, but I do not think that it applies
to this one. This test also checks that under a bad option pgbench stops
with an appropriate 1 exit status.

It's possible that it's worth the trouble to check for exit status 1,
but I entirely fail to see the point of checking exactly what is the
spelling of a message that is issued by code not under our control.

Looking closer at the test case:

[
'bad option',
'-h home -p 5432 -U calvin -d --bad-option',
[ qr{(unrecognized|illegal) option}, qr{--help.*more information} ]
],

ISTM that just removing the first qr{} pattern, and checking only that
we get the text that *is* under our control, is a reasonable compromise
here.

Possibly. I'd be a little happier if it checks for a non-empty error
message, eg qr{...} or qr{option} (the message should say something about
the option).

--
Fabien.

#22

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Tom Lane (#20)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

BTW, if you're wondering why curculio is still failing the pgbench
test,

Hmmm, that is interesting! It shows that at least some TAP tests are
useful.

all is explained here:

https://man.openbsd.org/srandom

Or at least most is explained there.

Yep. They try to be more serious than other systems about PRNG, which is
not bad in itself.

While curculio is unsurprisingly failing all four seeded_random tests,
when I try it locally on an OpenBSD 6.4 installation, only the uniform,
exponential, and gaussian cases reliably "fail". zipfian usually
doesn't.

It looks like the zipfian code almost always produces 4000 regardless of
the seed value, though occasionally it produces 4001. Bad parameters
for that algorithm, perhaps?

Welcome to the zipfian highly skewed distribution! I'll check the
parameters used in the test, maybe it should use something less extreme.

srandom is only used for initializing the state of various internal rand48
LCG PRNG for pgbench.

Maybe on OpenBSD pg should switch srandom to srandom_deterministic?

--
Fabien.

#23

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Fabien COELHO (#22)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Fabien COELHO <coelho@cri.ensmp.fr> writes:

all is explained here:
https://man.openbsd.org/srandom
Or at least most is explained there.

Yep. They try to be more serious than other systems about PRNG, which is
not bad in itself.

Maybe on OpenBSD pg should switch srandom to srandom_deterministic?

Dunno. I'm fairly annoyed by their idea that they're smarter than POSIX.
However, for most of our uses of srandom, this behavior isn't awful;
it's only pgbench that has an expectation that the platform random()
can be made to behave deterministically. And TBH I think that's just
an expectation that's going to bite us.

I'd suggest that maybe we should get rid of the use of both random()
and srandom() in pgbench, and go over to letting set_random_seed()
fill the pg_erand48 state directly. In the integer-seed case you
could use something equivalent to pg_srand48. (In the other cases
probably you could do better, certainly the strong-random case could
just fill all 6 bytes directly.) That would get us to a place where
the behavior of --random-seed=N is not only deterministic but
platform-independent, which seems like an improvement.

regards, tom lane

#24

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Tom Lane (#23)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Maybe on OpenBSD pg should switch srandom to srandom_deterministic?

Dunno. I'm fairly annoyed by their idea that they're smarter than POSIX.
However, for most of our uses of srandom, this behavior isn't awful;
it's only pgbench that has an expectation that the platform random()
can be made to behave deterministically. And TBH I think that's just
an expectation that's going to bite us.

I'd suggest that maybe we should get rid of the use of both random()
and srandom() in pgbench, and go over to letting set_random_seed()
fill the pg_erand48 state directly. In the integer-seed case you
could use something equivalent to pg_srand48. (In the other cases
probably you could do better, certainly the strong-random case could
just fill all 6 bytes directly.) That would get us to a place where
the behavior of --random-seed=N is not only deterministic but
platform-independent, which seems like an improvement.

That's a point. Althought I'm not found of round48, indeed having
something platform independent for testing makes definite sense.

I'll look into it.

--
Fabien.

#25

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Fabien COELHO (#24)

1 attachment(s)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Hello Tom,

Maybe on OpenBSD pg should switch srandom to srandom_deterministic?

Dunno. I'm fairly annoyed by their idea that they're smarter than POSIX.

Hmmm. I'm afraid that is not that hard.

However, for most of our uses of srandom, this behavior isn't awful;
it's only pgbench that has an expectation that the platform random()
can be made to behave deterministically. And TBH I think that's just
an expectation that's going to bite us.

I'd suggest that maybe we should get rid of the use of both random()
and srandom() in pgbench, and go over to letting set_random_seed()
fill the pg_erand48 state directly. In the integer-seed case you
could use something equivalent to pg_srand48. (In the other cases
probably you could do better, certainly the strong-random case could
just fill all 6 bytes directly.) That would get us to a place where
the behavior of --random-seed=N is not only deterministic but
platform-independent, which seems like an improvement.

That's a point. Althought I'm not found of round48, indeed having something
platform independent for testing makes definite sense.

I'll look into it.

Here is a POC which defines an internal interface for a PRNG, and use it
within pgbench, with several possible implementations which default to
rand48.

I must admit that I have a grudge against standard rand48:

- it is a known poor PRNG which was designed at a time when LCG where
basically the only low cost PRNG available. Newer designs were very
recent when the standard was set.
- it is a LCG, i.e. its low bits cycle quickly, so should not be used.
- so the 48 bit state size is relevant for generating 32 bits ints
and floats.
- however it eis used to generate more bits...
- the double function uses all 48 bits, whereas 52 need to be filled...
- and it is used to generate integers, which means that for large range
some values are inaccessible.
- 3 * 16 bits integers state looks silly on 32/64 bit architectures.
- ...

Given that postgres needs doubles (52 bits mantissa) and possibly 64 bits
integers, IMO the internal state should be 64 bits as a bare minimum,
which anyway is also the minimal bite on 64 bit architectures, which is
what is encoutered in practice.

--
Fabien.

Attachments:

pgbench-prng-1.patchtext/x-diff; name=pgbench-prng-1.patchDownload

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b5c75ce1c6..06f1083d5a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -185,7 +185,7 @@ int64		latency_limit = 0;
 char	   *tablespace = NULL;
 char	   *index_tablespace = NULL;
 
-/* random seed used when calling srandom() */
+/* random seed used to initialize the PRNG */
 int64		random_seed = -1;
 
 /*
@@ -279,14 +279,6 @@ typedef struct StatsData
 	SimpleStats lag;
 } StatsData;
 
-/*
- * Struct to keep random state.
- */
-typedef struct RandomState
-{
-	unsigned short xseed[3];
-} RandomState;
-
 /*
  * Connection state machine states.
  */
@@ -383,7 +375,7 @@ typedef struct
 	 * Separate randomness for each client. This is used for random functions
 	 * PGBENCH_RANDOM_* during the execution of the script.
 	 */
-	RandomState cs_func_rs;
+	pg_prng_state cs_func_rs;
 
 	int			use_file;		/* index in sql_script for this client */
 	int			command;		/* command number in script */
@@ -450,9 +442,9 @@ typedef struct
 	 * random state to make all of them independent of each other and therefore
 	 * deterministic at the thread level.
 	 */
-	RandomState ts_choose_rs;	/* random state for selecting a script */
-	RandomState ts_throttle_rs;	/* random state for transaction throttling */
-	RandomState ts_sample_rs;	/* random state for log sampling */
+	pg_prng_state ts_choose_rs;	/* random state for selecting a script */
+	pg_prng_state ts_throttle_rs;	/* random state for transaction throttling */
+	pg_prng_state ts_sample_rs;	/* random state for log sampling */
 
 	int64		throttle_trigger;	/* previous/next throttling (us) */
 	FILE	   *logfile;		/* where to log, or NULL */
@@ -832,30 +824,34 @@ strtodouble(const char *str, bool errorOK, double *dv)
 }
 
 /*
- * Initialize a random state struct.
+ * Initialize a pseudo-random state struct from main PRNG.
+ * This done at the beginning of process allows to have deterministic
+ * pgbench runs.
  */
 static void
-initRandomState(RandomState *random_state)
+pg_prng_state_init(pg_prng_state *rs)
 {
-	random_state->xseed[0] = random();
-	random_state->xseed[1] = random();
-	random_state->xseed[2] = random();
+	pg_prng_setstate_r(rs, pg_prng_u64());
 }
 
 /* random number generator: uniform distribution from min to max inclusive */
 static int64
-getrand(RandomState *random_state, int64 min, int64 max)
+getrand(pg_prng_state *random_state, int64 min, int64 max)
 {
 	/*
 	 * Odd coding is so that min and max have approximately the same chance of
 	 * being selected as do numbers between them.
 	 *
-	 * pg_erand48() is thread-safe and concurrent, which is why we use it
+	 * pg_prng_double_r() is thread-safe and concurrent, which is why we use it
 	 * rather than random(), which in glibc is non-reentrant, and therefore
 	 * protected by a mutex, and therefore a bottleneck on machines with many
 	 * CPUs.
+	 *
+	 * Note: pg_prng_double returns 52 bits (and pg_erand48 only 48 bits),
+	 * so this does not really work if more is needed between min & max.
+	 * Should rather use a modulo? Would be biased a little close to 64 bits.
 	 */
-	return min + (int64) ((max - min + 1) * pg_erand48(random_state->xseed));
+	return min + (int64) ((max - min + 1) * pg_prng_double_r(random_state));
 }
 
 /*
@@ -864,7 +860,7 @@ getrand(RandomState *random_state, int64 min, int64 max)
  * value is exp(-parameter).
  */
 static int64
-getExponentialRand(RandomState *random_state, int64 min, int64 max,
+getExponentialRand(pg_prng_state *random_state, int64 min, int64 max,
 				   double parameter)
 {
 	double		cut,
@@ -875,7 +871,7 @@ getExponentialRand(RandomState *random_state, int64 min, int64 max,
 	Assert(parameter > 0.0);
 	cut = exp(-parameter);
 	/* erand in [0, 1), uniform in (0, 1] */
-	uniform = 1.0 - pg_erand48(random_state->xseed);
+	uniform = 1.0 - pg_prng_double_r(random_state);
 
 	/*
 	 * inner expression in (cut, 1] (if parameter > 0), rand in [0, 1)
@@ -888,7 +884,7 @@ getExponentialRand(RandomState *random_state, int64 min, int64 max,
 
 /* random number generator: gaussian distribution from min to max inclusive */
 static int64
-getGaussianRand(RandomState *random_state, int64 min, int64 max,
+getGaussianRand(pg_prng_state *random_state, int64 min, int64 max,
 				double parameter)
 {
 	double		stdev;
@@ -912,13 +908,13 @@ getGaussianRand(RandomState *random_state, int64 min, int64 max,
 	do
 	{
 		/*
-		 * pg_erand48 generates [0,1), but for the basic version of the
+		 * pseudo-random output in [0,1), but for the basic version of the
 		 * Box-Muller transform the two uniformly distributed random numbers
 		 * are expected in (0, 1] (see
 		 * https://en.wikipedia.org/wiki/Box-Muller_transform)
 		 */
-		double		rand1 = 1.0 - pg_erand48(random_state->xseed);
-		double		rand2 = 1.0 - pg_erand48(random_state->xseed);
+		double		rand1 = 1.0 - pg_prng_double_r(random_state);
+		double		rand2 = 1.0 - pg_prng_double_r(random_state);
 
 		/* Box-Muller basic form transform */
 		double		var_sqrt = sqrt(-2.0 * log(rand1));
@@ -948,7 +944,7 @@ getGaussianRand(RandomState *random_state, int64 min, int64 max,
  * not be one.
  */
 static int64
-getPoissonRand(RandomState *random_state, double center)
+getPoissonRand(pg_prng_state *random_state, double center)
 {
 	/*
 	 * Use inverse transform sampling to generate a value > 0, such that the
@@ -957,7 +953,7 @@ getPoissonRand(RandomState *random_state, double center)
 	double		uniform;
 
 	/* erand in [0, 1), uniform in (0, 1] */
-	uniform = 1.0 - pg_erand48(random_state->xseed);
+	uniform = 1.0 - pg_prng_double_r(random_state);
 
 	return (int64) (-log(uniform) * center + 0.5);
 }
@@ -1035,7 +1031,7 @@ zipfFindOrCreateCacheCell(ZipfCache *cache, int64 n, double s)
  * Luc Devroye, p. 550-551, Springer 1986.
  */
 static int64
-computeIterativeZipfian(RandomState *random_state, int64 n, double s)
+computeIterativeZipfian(pg_prng_state *random_state, int64 n, double s)
 {
 	double		b = pow(2.0, s - 1.0);
 	double		x,
@@ -1046,8 +1042,8 @@ computeIterativeZipfian(RandomState *random_state, int64 n, double s)
 	while (true)
 	{
 		/* random variates */
-		u = pg_erand48(random_state->xseed);
-		v = pg_erand48(random_state->xseed);
+		u = pg_prng_double_r(random_state);
+		v = pg_prng_double_r(random_state);
 
 		x = floor(pow(u, -1.0 / (s - 1.0)));
 
@@ -1065,11 +1061,11 @@ computeIterativeZipfian(RandomState *random_state, int64 n, double s)
  * Jim Gray et al, SIGMOD 1994
  */
 static int64
-computeHarmonicZipfian(ZipfCache *zipf_cache, RandomState *random_state,
+computeHarmonicZipfian(ZipfCache *zipf_cache, pg_prng_state *random_state,
 					   int64 n, double s)
 {
 	ZipfCell   *cell = zipfFindOrCreateCacheCell(zipf_cache, n, s);
-	double		uniform = pg_erand48(random_state->xseed);
+	double		uniform = pg_prng_double_r(random_state);
 	double		uz = uniform * cell->harmonicn;
 
 	if (uz < 1.0)
@@ -1081,7 +1077,7 @@ computeHarmonicZipfian(ZipfCache *zipf_cache, RandomState *random_state,
 
 /* random number generator: zipfian distribution from min to max inclusive */
 static int64
-getZipfianRand(ZipfCache *zipf_cache, RandomState *random_state, int64 min,
+getZipfianRand(ZipfCache *zipf_cache, pg_prng_state *random_state, int64 min,
 			   int64 max, double s)
 {
 	int64		n = max - min + 1;
@@ -3565,7 +3561,7 @@ doLog(TState *thread, CState *st,
 	 * to the random sample.
 	 */
 	if (sample_rate != 0.0 &&
-		pg_erand48(thread->ts_sample_rs.xseed) > sample_rate)
+		pg_prng_double_r(&thread->ts_sample_rs) > sample_rate)
 		return;
 
 	/* should we aggregate the results or not? */
@@ -5126,12 +5122,11 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	}
 }
 
-/* call srandom based on some seed. NULL triggers the default behavior. */
+/* init prng based on some seed. NULL triggers the default behavior. */
 static bool
 set_random_seed(const char *seed)
 {
-	/* srandom expects an unsigned int */
-	unsigned int iseed;
+	uint64 iseed;
 
 	if (seed == NULL || strcmp(seed, "time") == 0)
 	{
@@ -5139,7 +5134,7 @@ set_random_seed(const char *seed)
 		instr_time	now;
 
 		INSTR_TIME_SET_CURRENT(now);
-		iseed = (unsigned int) INSTR_TIME_GET_MICROSEC(now);
+		iseed = INSTR_TIME_GET_MICROSEC(now);
 	}
 	else if (strcmp(seed, "rand") == 0)
 	{
@@ -5155,7 +5150,7 @@ set_random_seed(const char *seed)
 		/* parse seed unsigned int value */
 		char		garbage;
 
-		if (sscanf(seed, "%u%c", &iseed, &garbage) != 1)
+		if (sscanf(seed, UINT64_FORMAT "%c", &iseed, &garbage) != 1)
 		{
 			fprintf(stderr,
 					"unrecognized random seed option \"%s\": expecting an unsigned integer, \"time\" or \"rand\"\n",
@@ -5165,9 +5160,12 @@ set_random_seed(const char *seed)
 	}
 
 	if (seed != NULL)
-		fprintf(stderr, "setting random seed to %u\n", iseed);
-	srandom(iseed);
-	/* no precision loss: 32 bit unsigned int cast to 64 bit int */
+		fprintf(stderr, "setting random seed to " UINT64_FORMAT "\n", iseed);
+
+	/* initialize main PRNG */
+	pg_prng_setstate(iseed);
+
+	/* keep value, casted as signed */
 	random_seed = iseed;
 	return true;
 }
@@ -5775,7 +5773,7 @@ main(int argc, char **argv)
 	for (i = 0; i < nclients; i++)
 	{
 		state[i].cstack = conditional_stack_create();
-		initRandomState(&state[i].cs_func_rs);
+		pg_prng_state_init(&state[i].cs_func_rs);
 	}
 
 	if (debug)
@@ -5862,10 +5860,7 @@ main(int argc, char **argv)
 	/* set default seed for hash functions */
 	if (lookupVariable(&state[0], "default_seed") == NULL)
 	{
-		uint64		seed = ((uint64) (random() & 0xFFFF) << 48) |
-		((uint64) (random() & 0xFFFF) << 32) |
-		((uint64) (random() & 0xFFFF) << 16) |
-		(uint64) (random() & 0xFFFF);
+		uint64		seed = pg_prng_u64();
 
 		for (i = 0; i < nclients; i++)
 			if (!putVariableInt(&state[i], "startup", "default_seed", (int64) seed))
@@ -5909,9 +5904,9 @@ main(int argc, char **argv)
 		thread->state = &state[nclients_dealt];
 		thread->nstate =
 			(nclients - nclients_dealt + nthreads - i - 1) / (nthreads - i);
-		initRandomState(&thread->ts_choose_rs);
-		initRandomState(&thread->ts_throttle_rs);
-		initRandomState(&thread->ts_sample_rs);
+		pg_prng_state_init(&thread->ts_choose_rs);
+		pg_prng_state_init(&thread->ts_throttle_rs);
+		pg_prng_state_init(&thread->ts_sample_rs);
 		thread->logfile = NULL; /* filled in later */
 		thread->latency_late = 0;
 		thread->zipf_cache.nb_cells = 0;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 17323e6ca5..6e5e2cf997 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -260,10 +260,10 @@ pgbench(
 		qr{setting random seed to 5432\b},
 
 		# After explicit seeding, the four * random checks (1-3,20) should be
-		# deterministic, but not necessarily portable.
-		qr{command=1.: int 1\d\b},        # uniform random: 12 on linux
-		qr{command=2.: int 1\d\d\b},      # exponential random: 106 on linux
-		qr{command=3.: int 1\d\d\d\b},    # gaussian random: 1462 on linux
+		# deterministic on all platforms
+		qr{command=1.: int 18\b},     # uniform random
+		qr{command=2.: int 101\b},    # exponential random
+		qr{command=3.: int 1509\b},   # gaussian random
 		qr{command=4.: int 4\b},
 		qr{command=5.: int 5\b},
 		qr{command=6.: int 6\b},
@@ -276,7 +276,7 @@ pgbench(
 		qr{command=15.: double 15\b},
 		qr{command=16.: double 16\b},
 		qr{command=17.: double 17\b},
-		qr{command=20.: int \d\b},    # zipfian random: 1 on linux
+		qr{command=20.: int 3\b},     # zipfian random
 		qr{command=21.: double -27\b},
 		qr{command=22.: double 1024\b},
 		qr{command=23.: double 1\b},
diff --git a/src/include/port.h b/src/include/port.h
index a55c473262..7597f1a78b 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -348,6 +348,57 @@ extern long pg_lrand48(void);
 extern long pg_jrand48(unsigned short xseed[3]);
 extern void pg_srand48(long seed);
 
+/*----
+ * Postgres internal pseudo-random number generator (PRNG)
+ *
+ * 0: rand48 (pg historical linear congruential generator)
+ * 1: Knuth's MMIX LCG
+ * 2: splitmix64
+ * 3: xoshiro128+
+ * 4: xoshiro256** (default)
+ */
+#ifndef PG_PRNG_ALGORITHM
+#define PG_PRNG_ALGORITHM 0
+#endif
+
+/* set PRNG state byte size depending on algorithm */
+#if PG_PRNG_ALGORITHM == 0
+#define PG_PRNG_STATE_SIZE 6
+#elif PG_PRNG_ALGORITHM == 1 || PG_PRNG_ALGORITHM == 2
+#define PG_PRNG_STATE_SIZE 8
+#elif PG_PRNG_ALGORITHM == 3
+#define PG_PRNG_STATE_SIZE 16
+#elif PG_PRNG_ALGORITHM == 4
+#define PG_PRNG_STATE_SIZE 32
+#else
+#error "unexpected PG_PRNG_ALGORITHM value, must be in 0..4"
+#endif
+
+/* closed type seen as a byte array */
+typedef struct {
+	unsigned char data[PG_PRNG_STATE_SIZE];
+} pg_prng_state;
+
+/* interface with a reference to an external state */
+extern double pg_prng_double_r(pg_prng_state *state);
+extern int32 pg_prng_u31_r(pg_prng_state *state);
+extern int32 pg_prng_s32_r(pg_prng_state *state);
+extern uint32 pg_prng_u32_r(pg_prng_state *state);
+extern int64 pg_prng_s64_r(pg_prng_state *state);
+extern uint64 pg_prng_u64_r(pg_prng_state *state);
+extern void pg_prng_setstate_r(pg_prng_state *state, uint64 u);
+extern bool pg_prng_strong_setstate_r(pg_prng_state *state);
+
+/* interface with a shared internal state */
+extern double pg_prng_double(void);
+extern int32 pg_prng_u31(void);
+extern int32 pg_prng_s32(void);
+extern uint32 pg_prng_u32(void);
+extern int64 pg_prng_s64(void);
+extern uint64 pg_prng_u64(void);
+extern void pg_prng_setstate(uint64 u);
+extern bool pg_prng_strong_setstate(void);
+
 #ifndef HAVE_FLS
 extern int	fls(int mask);
 #endif
diff --git a/src/port/Makefile b/src/port/Makefile
index 9cfc0f9279..afa4161dde 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -37,7 +37,7 @@ LIBS += $(PTHREAD_LIBS)
 
 OBJS = $(LIBOBJS) $(PG_CRC32C_OBJS) chklocale.o erand48.o inet_net_ntop.o \
 	noblock.o path.o pgcheckdir.o pgmkdirp.o pgsleep.o \
-	pg_strong_random.o pgstrcasecmp.o pgstrsignal.o pqsignal.o \
+	pg_strong_random.o pgstrcasecmp.o pgstrsignal.o pqsignal.o prng.o \
 	qsort.o qsort_arg.o quotes.o snprintf.o sprompt.o strerror.o \
 	tar.o thread.o
 
diff --git a/src/port/prng.c b/src/port/prng.c
new file mode 100644
index 0000000000..4b8002270e
--- /dev/null
+++ b/src/port/prng.c
@@ -0,0 +1,438 @@
+/*-------------------------------------------------------------------------
+ *
+ * prng.c
+ *
+ * Postgres internal PRNG (pseudo-random number generator).
+ *
+ * This file supplies several implementations of a PRNG: POSIX-standard
+ * rand48, splitmix64, xoroshiro128+ and xoshiro256**, all wrapped into
+ * the same abstract interface so that changing the function only requires
+ * recompiling, without further changes. The variants with a reference to
+ * the state (*_r) are thread-safe if used with a thread-specific state.
+ *
+ * None of these functions should apply to anything which needs cryptographic
+ * security, even if seeded from strong random data: beware that the internal
+ * state can probably be recovered from a few outputs. If the state is too
+ * small or initialized from too small (random) data, they may be subject to a
+ * state enumeration attack. Most of these functions pass statistical tests,
+ * although they fail under some chosen transformation.
+ *
+ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
+ *
+ * Portions Copyright (c) 1993 Martin Birgmeier
+ * All rights reserved.
+ *
+ * You may redistribute unmodified or modified versions of this source
+ * code provided that the above copyright notice and this and the
+ * following conditions are retained.
+ *
+ * This software is provided ``as is'', and comes with no warranties
+ * of any kind. I shall in no event be liable for anything that happens
+ * to anyone/anything when using this software.
+ *
+ * IDENTIFICATION
+ *	  src/port/prng.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "c.h"
+#include "port.h"
+
+#include <math.h>
+
+/*
+ * Left rotation, decent compilers should generate hardware instructions
+ */
+#define ROTL(x, n) ((x << n) | (x >> (64 - n)))
+
+/*
+ * PG PRNG ALGORITHM
+ */
+const char * pg_prng_algorithm =
+#if PG_PRNG_ALGORITHM == 0
+	/* POSIX standard */
+	"rand48"
+#elif PG_PRNG_ALGORITHM == 1
+	/* Donald Knuth's MMIX LCG */
+	"knuth-mmix-lcg"
+#elif PG_PRNG_ALGORITHM == 2
+	/*----
+	 * Basic 8 bytes state PRNG from Java 8,
+	 * see http://xoshiro.di.unimi.it/splitmix64.c
+	 */
+	"splitmix64"
+#elif PG_PRNG_ALGORITHM == 3
+	/*----
+	 * Algorithm by David Blackman and Sebastiano Vigna,
+	 * see http://xoshiro.di.unimi.it/xoroshiro128plus.c
+	 */
+	"xoroshiro128+"
+#elif PG_PRNG_ALGORITHM == 4
+	/*----
+	 * Algorithm by David Blackman and Sebastiano Vigna,
+	 * see http://xoshiro.di.unimi.it/xoshiro256starstar.c
+	 */
+	"xoshiro256**"
+#endif
+	;
+
+#if PG_PRNG_ALGORITHM >= 2
+/* also used to initialize large states */
+static uint64 _splitmix64(uint64 *state)
+{
+	uint64	v;
+	*state += UINT64CONST(0x9e3779b97f4a7c15);
+	v = *state;
+	v = (v ^ (v >> 30)) * UINT64CONST(0xbf58476d1ce4e5b9);
+	v = (v ^ (v >> 27)) * UINT64CONST(0x94d049bb133111eb);
+	return v ^ (v >> 31);
+}
+#endif
+
+/*
+ * INTERNAL STATE TYPE
+ */
+typedef struct {
+#if PG_PRNG_STATE_SIZE == 6
+	uint16	data[3];
+#else
+	uint64 data[PG_PRNG_STATE_SIZE / 8];
+#endif
+} _pg_prng_state;
+
+/*
+ * INTERNAL STATE DEFAULT
+ *
+ * pi in hexa is 3.243F6A8885 A308D31319 8A2E037073 44A4093822
+ * 299F31D008 2EFA98EC4E 6C89452821 E638D01377...
+ */
+static _pg_prng_state _prng_state = {
+	{
+
+#if PG_PRNG_STATE_SIZE == 6
+
+/* POSIX specifies 0x330e's use in srand48, but the other bits are arbitrary */
+#define RAND48_SEED_0	(0x330E)
+#define RAND48_SEED_1	(0x243F)
+#define RAND48_SEED_2	(0x6A88)
+
+		RAND48_SEED_0,
+		RAND48_SEED_1,
+		RAND48_SEED_2
+
+#elif PG_PRNG_STATE_SIZE == 8
+
+		UINT64CONST(0x243F6A8885A308D3)
+
+#elif PG_PRNG_STATE_SIZE == 16
+
+		UINT64CONST(0x243F6A8885A308D3),
+		UINT64CONST(0x13198A2E03707344)
+
+#else /* PG_PRNG_STATE_SIZE == 32 */
+
+		UINT64CONST(0x243F6A8885A308D3),
+		UINT64CONST(0x13198A2E03707344),
+		UINT64CONST(0xA4093822299F31D0),
+		UINT64CONST(0x082EFA98EC4E6C89)
+
+#endif
+
+	}
+};
+
+/*
+ * NEXT: advance internal state and return 64 bit value
+ */
+static uint64
+_next(_pg_prng_state *state)
+{
+#if PG_PRNG_ALGORITHM == 0
+
+/* These values are specified by POSIX */
+#define RAND48_MUL	UINT64CONST(0x0005deece66d)
+#define RAND48_ADD	UINT64CONST(0x000b)
+
+	/*
+	 * We do the arithmetic in uint64; any type wider than 48 bits would work.
+	 */
+	uint64		in;
+	uint64		out;
+
+	in = (uint64) state->data[2] << 32 |
+		(uint64) state->data[1] << 16 |
+		(uint64) state->data[0];
+
+	/* this is really 64 bits */
+	out = in * RAND48_MUL + RAND48_ADD;
+
+	/* truncated back to 48 bits */
+	state->data[0] = out & 0xFFFF;
+	state->data[1] = (out >> 16) & 0xFFFF;
+	state->data[2] = (out >> 32) & 0xFFFF;
+
+	return out;
+
+#elif PG_PRNG_ALGORITHM == 1
+
+	state->data[0] = UINT64CONST(6364136223846793005) * state->data[0] +
+					 UINT64CONST(1442695040888963407);
+
+	return state->data[0];
+
+#elif PG_PRNG_ALGORITHM == 2
+
+	return _splitmix64(state->data);
+
+#elif PG_PRNG_ALGORITHM == 3
+
+	const uint64 s0 = state->data[0];
+	uint64 s1 = state->data[1];
+	const uint64 next = s0 + s1;
+
+	s1 ^= s0;
+
+	state->data[0] = ROTL(s0, 24) ^ s1 ^ (s1 << 16);
+	state->data[1] = ROTL(s1, 37);
+
+	return next;
+
+#else /* PG_PRNG_ALGORITHM == 4 */
+
+	const uint64 v = state->data[1] * 5;
+	const uint64 next = ROTL(v, 7) * 9;
+	const uint64 t = state->data[1] << 17;
+
+	state->data[2] ^= state->data[0];
+	state->data[3] ^= state->data[1];
+	state->data[1] ^= state->data[2];
+	state->data[0] ^= state->data[3];
+
+	state->data[2] ^= t;
+	state->data[3] = ROTL(state->data[3], 45);
+
+	return next;
+
+#endif
+}
+
+/*----
+ * Initialize the internal state from a 64-bit seed.
+ *
+ * This makes sense for determinism, eg needed under debug, where the user
+ * can supply an integer for this purpose. If this is not needed, consider
+ * using the strong variant.
+ */
+void
+pg_prng_setstate_r(pg_prng_state *state, uint64 u)
+{
+	_pg_prng_state * s = (_pg_prng_state *) state;
+
+	Assert(sizeof(pg_prng_state) == sizeof(_pg_prng_state));
+
+#if PG_PRNG_STATE_SIZE == 6
+
+	/*
+	 * Per POSIX, this uses only 32 bits from "seed" even if "long" is wider.
+	 * Hence, the set of possible seed values is smaller than it could be.
+	 * Better practice is to use caller-supplied state and initialize it with
+	 * random bits obtained from a high-quality source of random bits.
+	 *
+	 * Note: POSIX specifies a function seed48() that allows all 48 bits
+	 * of the internal state to be set, but we don't currently support that.
+	 */
+	s->data[0] = RAND48_SEED_0;
+	s->data[1] = (uint16) u;
+	s->data[2] = (uint16) (u >> 16);
+
+#elif PG_PRNG_STATE_SIZE == 8
+
+	s->data[0] = u;
+
+#elif PG_PRNG_STATE_SIZE == 16
+
+	s->data[0] = u;
+	s->data[1] = _splitmix64(&u);
+
+#else /* PG_PRNG_STATE_SIZE == 32 */
+
+	s->data[0] = u;
+	s->data[1] = _splitmix64(&u);
+	s->data[2] = _splitmix64(&u);
+	s->data[3] = _splitmix64(&u);
+
+#endif
+}
+
+/*------------------------------------------------------------------------------
+ * Common pseudo-random data extraction functions
+ */
+
+/*----
+ * Generate a pseudo-random floating-point value using caller-supplied state.
+ * Values are uniformly distributed over the interval [0.0, 1.0).
+ */
+double
+pg_prng_double_r(pg_prng_state *state)
+{
+	uint64		x = _next((_pg_prng_state *) state);
+
+#if PG_PRNG_STATE_SIZE == 6
+	/* not enough bits to fill the mantissa?
+	 * in fact we have but that would depart from erand48
+	 */
+	return ldexp((double) (x & UINT64CONST(0x0000FFFFFFFFFFFF)), -48);
+#else /* sizes 8, 16 and 32 are ok, prefer higher order bits though */
+	return ldexp((double) (x >> 12), -52);
+#endif
+}
+
+/* Generate a pseudo-random double in [0.0, 1.0) from internal state. */
+double
+pg_prng_double(void)
+{
+	return pg_prng_double_r((pg_prng_state *) &_prng_state);
+}
+
+/*----
+ * Generate a pseudo-random non-negative integral value using caller-supplied
+ * state. Values are uniformly distributed over the interval [0, 2^31).
+ */
+int32
+pg_prng_u31_r(pg_prng_state *state)
+{
+	uint64		x = _next((_pg_prng_state *) state);
+
+#if PG_PRNG_STATE_SIZE == 6
+	/* LCG must avoid low-weight bits */
+	return (x >> 17) & UINT64CONST(0x7FFFFFFF);
+#else /* combine available 64 bits */
+	return (x + (x >> 32)) & UINT64CONST(0x7FFFFFFF);
+#endif
+}
+
+/* Generate pseudo-random non-negative integral from internal state. */
+int32
+pg_prng_u31(void)
+{
+	return pg_prng_u31_r((pg_prng_state *) &_prng_state);
+}
+
+/*---
+ * Generate a pseudo-random signed integral value using caller-supplied state.
+ * Values are uniformly distributed over the interval [-2^31, 2^31).
+ */
+int32
+pg_prng_s32_r(pg_prng_state *state)
+{
+	uint64		x = _next((_pg_prng_state *) state);
+
+#if PG_PRNG_STATE_SIZE == 6
+	/* LCG must avoid low-weight bits */
+	return (int32) ((x >> 16) & UINT64CONST(0xFFFFFFFF));
+#else
+	/* combine available 64 bits */
+	return (int32) ((x + (x >> 32)) & UINT64CONST(0xFFFFFFFF));
+#endif
+}
+
+/* Generate pseudo-random signed integral from internal state. */
+int32
+pg_prng_s32(void)
+{
+	return pg_prng_s32_r((pg_prng_state *) &_prng_state);
+}
+
+/*---
+ * Generate a pseudo-random unsigned integral value using caller-supplied state.
+ * Values are uniformly distributed over the interval [0, 2^32).
+ */
+uint32
+pg_prng_u32_r(pg_prng_state *state)
+{
+	uint64		x = _next((_pg_prng_state *) state);
+
+#if PG_PRNG_STATE_SIZE == 6
+	/* LCG must avoid low-weight bits */
+	return (x >> 16) & UINT64CONST(0xFFFFFFFF);
+#else
+	/* combine available 64 bits */
+	return ((x + (x >> 32)) & UINT64CONST(0xFFFFFFFF));
+#endif
+}
+
+/* Generate pseudo-random signed integral from internal state. */
+uint32
+pg_prng_u32(void)
+{
+	return pg_prng_u32_r((pg_prng_state *) &_prng_state);
+}
+
+/*
+ * Generate a pseudo-random non-negative integral value using caller-supplied
+ * state. Values are uniformly distributed over the interval [-2^63, 2^63).
+ * (note: not really for low-end LCG)
+ */
+int64
+pg_prng_s64_r(pg_prng_state *state)
+{
+#if PG_PRNG_STATE_SIZE == 6 /* rand48 */
+	/* not enough bits, call _next twice and drop some low bits */
+	return (int64) (_next((_pg_prng_state *) state) << 16  ^
+					_next((_pg_prng_state *) state) >> 20);
+#else
+	return (int64) _next((_pg_prng_state *) state);
+#endif
+}
+
+/* Generate a pseudo-random int64 from internal state */
+int64
+pg_prng_s64(void)
+{
+	return pg_prng_s64_r((pg_prng_state *) &_prng_state);
+}
+
+/*
+ * Generate a pseudo-random non-negative integral value using caller-supplied
+ * state. Values are uniformly distributed over the interval [0, 2^64).
+ */
+uint64
+pg_prng_u64_r(pg_prng_state *state)
+{
+#if PG_PRNG_STATE_SIZE == 6 /* rand48 */
+	/* not enough bits, call _next twice and drop some low bits */
+	return _next((_pg_prng_state *) state) << 16  ^
+		   _next((_pg_prng_state *) state) >> 20;
+#else
+	return _next((_pg_prng_state *) state);
+#endif
+}
+
+/* Generate a pseudo-random uint64 from internal state */
+uint64
+pg_prng_u64(void)
+{
+	return pg_prng_u64_r((pg_prng_state *) &_prng_state);
+}
+
+/* seed internal state from 64 bit integer */
+void
+pg_prng_setstate(uint64 u)
+{
+	pg_prng_setstate_r((pg_prng_state *) &_prng_state, u);
+}
+
+/* try to initialize prng state from a strong random source */
+bool
+pg_prng_strong_setstate_r(pg_prng_state *state)
+{
+	return pg_strong_random(&state->data, PG_PRNG_STATE_SIZE);
+}
+
+/* try to initialize the internal prng state from a strong random source */
+bool
+pg_prng_strong_setstate(void)
+{
+	return pg_prng_strong_setstate_r((pg_prng_state *) &_prng_state);
+}

#26

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Fabien COELHO (#25)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Fabien COELHO <coelho@cri.ensmp.fr> writes:

I'd suggest that maybe we should get rid of the use of both random()
and srandom() in pgbench, and go over to letting set_random_seed()
fill the pg_erand48 state directly.

Here is a POC which defines an internal interface for a PRNG, and use it
within pgbench, with several possible implementations which default to
rand48.

I seriously dislike this patch. pgbench's random support is quite
overengineered already IMO, and this proposes to add a whole batch of
new code and new APIs to fix a very small bug.

I must admit that I have a grudge against standard rand48:

I think this is nonsense, particularly the claim that anything in PG
cares about the lowest-order bits of random doubles. I'm aware that
there are applications where that does matter, but people aren't doing
high-precision weather simulations in pgbench.

BTW, did you look at the question of the range of zipfian? I confirmed
here that as used in the test case, it's generating a range way smaller
than the other ones: repeating the insertion snippet 1000x produces stats
like this:

regression=# select seed,rand,min(val),max(val),count(distinct val) from seeded_random group by 1,2 order by 2,1;
seed | rand | min | max | count
------------+-------------+------+------+-------
1957482663 | exponential | 2000 | 2993 | 586
1958556409 | exponential | 2000 | 2995 | 569
1959867462 | exponential | 2000 | 2997 | 569
1957482663 | gaussian | 3009 | 3997 | 493
1958556409 | gaussian | 3027 | 3956 | 501
1959867462 | gaussian | 3018 | 3960 | 511
1957482663 | uniform | 1001 | 1999 | 625
1958556409 | uniform | 1000 | 1999 | 642
1959867462 | uniform | 1001 | 1999 | 630
1957482663 | zipfian | 4000 | 4081 | 19
1958556409 | zipfian | 4000 | 4022 | 18
1959867462 | zipfian | 4000 | 4156 | 23

I have no idea whether that indicates an actual bug, or just poor
choice of parameter in the test's call. But the very small number
of distinct outputs is disheartening at least.

regards, tom lane

#27

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Tom Lane (#26)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Hello Tom,

Here is a POC which defines an internal interface for a PRNG, and use it
within pgbench, with several possible implementations which default to
rand48.

I seriously dislike this patch. pgbench's random support is quite
overengineered already IMO, and this proposes to add a whole batch of
new code and new APIs to fix a very small bug.

My intention is rather to discuss postgres' PRNG, in passing. Full success
on this point:-)

I must admit that I have a grudge against standard rand48:

I think this is nonsense, particularly the claim that anything in PG
cares about the lowest-order bits of random doubles. I'm aware that
there are applications where that does matter, but people aren't doing
high-precision weather simulations in pgbench.

Sure. My point is not that it is an actual issue for pgbench, but as the
same PRNG is used more or less everywhere in postgres, I think that it
should be a good one rather than a known bad one.

Eg, about double:

\set i debug(random(1, POWER(2,49)) % 2)

Always return 1 because of the 48 bit precision, i.e. the output is never
even.

\set i debug(random(1, POWER(2,48)) % 2)

Return 0 1 0 1 0 1 0 1 0 1 0 1 0 1 ... because it is a LCG.

\set i debug(random(1, POWER(2,48)) % 4)

Cycles over (3 2 1 0)*

\set i debug(random(1, power(2, 47)) % 4)

Cycles over (0 0 1 1 2 2 3 3)*, and so on.

BTW, did you look at the question of the range of zipfian?

Yep.

I confirmed here that as used in the test case, it's generating a range
way smaller than the other ones: repeating the insertion snippet 1000x
produces stats like this: [...]

I have no idea whether that indicates an actual bug, or just poor
choice of parameter in the test's call. But the very small number
of distinct outputs is disheartening at least.

Zipf distribution is highly skewed, somehow close to an exponential. To
reduce the decreasing probability the parameter must be closer to 1, eg
1.05 or something. However as far as the test is concerned I do not see
this as a significant issue. I was rather planning to submit a
documentation improvement to provide more precise hints about how the
distribution behaves depending on the parameter, and possibly reduce the
parameter used in the test in passing, but I see this as not very urgent.

--
Fabien.

#28

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Fabien COELHO (#27)

5 attachment(s)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Hello Tom,

BTW, did you look at the question of the range of zipfian?

Yep.

I confirmed here that as used in the test case, it's generating a range way
smaller than the other ones: repeating the insertion snippet 1000x produces
stats like this: [...]

I have no idea whether that indicates an actual bug, or just poor
choice of parameter in the test's call. But the very small number
of distinct outputs is disheartening at least.

Zipf distribution is highly skewed, somehow close to an exponential. To
reduce the decreasing probability the parameter must be closer to 1, eg 1.05
or something. However as far as the test is concerned I do not see this as a
significant issue. I was rather planning to submit a documentation
improvement to provide more precise hints about how the distribution behaves
depending on the parameter, and possibly reduce the parameter used in the
test in passing, but I see this as not very urgent.

Attached a documentation patch and a scripts to check the distribution
(here for N = 10 & s = 2.5), the kind of thing I used when checking the
initial patch:

sh> psql < zipf_init.sql
sh> pgbench -t 500000 -c 2 -M prepared -f zipf_test.sql -P 1
-- close to 29000 tps on my laptop
sh> psql < zipf_end.sql
┌────┬────────┬────────────────────┬────────────────────────┐
│ i │ cnt │ ratio │ expected │
├────┼────────┼────────────────────┼────────────────────────┤
│ 1 │ 756371 │ • │ • │
│ 2 │ 133431 │ 5.6686302283577280 │ 5.65685424949238019521 │
│ 3 │ 48661 │ 2.7420521567579787 │ 2.7556759606310754 │
│ 4 │ 23677 │ 2.0552012501583816 │ 2.0528009571186693 │
│ 5 │ 13534 │ 1.7494458401063987 │ 1.7469281074217107 │
│ 6 │ 8773 │ 1.5426877920893651 │ 1.5774409656148784 │
│ 7 │ 5709 │ 1.5366964442108951 │ 1.4701680288054869 │
│ 8 │ 4247 │ 1.3442429950553332 │ 1.3963036312159316 │
│ 9 │ 3147 │ 1.3495392437241818 │ 1.3423980299088363 │
│ 10 │ 2450 │ 1.2844897959183673 │ 1.3013488313450120 │
└────┴────────┴────────────────────┴────────────────────────┘
sh> psql < zipf_clean.sql

Given these results, I do not think that it is useful to change
random_zipfian TAP test parameter from 2.5 to something else.

--
Fabien.

Attachments:

pgbench-zipf-doc-1.patchtext/x-diff; name=pgbench-zipf-doc-1.patchDownload

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 15ee7c0f2b..10285d655b 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1613,6 +1613,14 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
       frequently values to the beginning of the interval are drawn.
       The closer to 0 <replaceable>parameter</replaceable> is,
       the flatter (more uniform) the access distribution.
+      The distribution is such that, assuming the range starts from 1,
+      the ratio of probability of drawing <replaceable>k</replaceable> versus
+      drawing <replaceable>k+1</replaceable> is
+      <literal>((<replaceable>k</replaceable>+1)/<replaceable>k</replaceable>)**<replaceable>parameter</replaceable></literal>.
+      For instance <literal>random_zipfian(1, ..., 2.5)</literal> draws
+      value <literal>1</literal> about <literal>(2/1)**2.5 = 5.66</literal> times more frequently
+      than <literal>2</literal>, which itself is drawn <literal>(3/2)*2.5 = 2.76</literal> times more
+      frequently than <literal>3</literal>, and so on.
      </para>
     </listitem>
    </itemizedlist>

zipf_init.sqlapplication/x-sql; name=zipf_init.sqlDownload

zipf_test.sqlapplication/x-sql; name=zipf_test.sqlDownload

zipf_end.sqlapplication/x-sql; name=zipf_end.sqlDownload

zipf_clean.sqlapplication/x-sql; name=zipf_clean.sqlDownload

#29

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Fabien COELHO (#28)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Fabien COELHO <coelho@cri.ensmp.fr> writes:

Given these results, I do not think that it is useful to change
random_zipfian TAP test parameter from 2.5 to something else.

I'm not following this argument. The test case is basically useless
for its intended purpose with that parameter, because it's highly
likely that the failure mode it's supposedly checking for will be
masked by the "random" function's tendency to spit out the same
value all the time. We might as well drop zipfian from the test
altogether and save ourselves some buildfarm cycles.

regards, tom lane

#30

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Tom Lane (#29)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Hello Tom,

Given these results, I do not think that it is useful to change
random_zipfian TAP test parameter from 2.5 to something else.

I'm not following this argument. The test case is basically useless
for its intended purpose with that parameter, because it's highly
likely that the failure mode it's supposedly checking for will be
masked by the "random" function's tendency to spit out the same
value all the time.

The first value is taken about 75% of the time for N=1000 and s=2.5, which
means that a non deterministic implementation would succeed about 0.75ï¿œ ~
56% of the time on that one. Then there is other lower probability random
successes. ISTM that if a test fails every three run it would be detected,
so the purpose of testing random_zipfian determinism is somehow served.

Also, the drawing procedure is less efficient when the parameter is close
to 1 because it is more likely to loop, and there are other values tested,
0.5 and 1.3 (note that the code has two methods, depending on whether the
parameter is below or above 1), so I think that having something different
is better.

If you want something more drastic, using 1.5 instead of 2.5 would reduce
the probability of accidentaly passing the test by chance to about 20%, so
it would fail 80% of the time.

We might as well drop zipfian from the test altogether and save
ourselves some buildfarm cycles.

All 4 random functions are tested together on the same run, removing a
particular one does not seem desirable to me.

--
Fabien.

#31

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Fabien COELHO (#27)

1 attachment(s)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Fabien COELHO <coelho@cri.ensmp.fr> writes:

Here is a POC which defines an internal interface for a PRNG, and use it
within pgbench, with several possible implementations which default to
rand48.

I seriously dislike this patch. pgbench's random support is quite
overengineered already IMO, and this proposes to add a whole batch of
new code and new APIs to fix a very small bug.

My intention is rather to discuss postgres' PRNG, in passing. Full success
on this point:-)

Our immediate problem is to fix a portability failure, which we need to
back-patch into at least one released branch, ergo conservatism is
warranted. I had in mind something more like the attached.

regards, tom lane

Attachments:

dont-use-srandom-in-pgbench-1.patchtext/x-diff; charset=us-ascii; name=dont-use-srandom-in-pgbench-1.patchDownload

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b5c75ce..d5b543f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -185,7 +185,7 @@ int64		latency_limit = 0;
 char	   *tablespace = NULL;
 char	   *index_tablespace = NULL;
 
-/* random seed used when calling srandom() */
+/* random seed used to initialize base_random_sequence */
 int64		random_seed = -1;
 
 /*
@@ -287,6 +287,9 @@ typedef struct RandomState
 	unsigned short xseed[3];
 } RandomState;
 
+/* Various random sequences are initialized from this one. */
+static RandomState base_random_sequence;
+
 /*
  * Connection state machine states.
  */
@@ -598,6 +601,8 @@ static const BuiltinScript builtin_script[] =
 
 
 /* Function prototypes */
+static void initRandomState(RandomState *random_state);
+static int64 getrand(RandomState *random_state, int64 min, int64 max);
 static void setNullValue(PgBenchValue *pv);
 static void setBoolValue(PgBenchValue *pv, bool bval);
 static void setIntValue(PgBenchValue *pv, int64 ival);
@@ -833,16 +838,28 @@ strtodouble(const char *str, bool errorOK, double *dv)
 
 /*
  * Initialize a random state struct.
+ *
+ * We derive the seed from base_random_sequence, which must be set up already.
  */
 static void
 initRandomState(RandomState *random_state)
 {
-	random_state->xseed[0] = random();
-	random_state->xseed[1] = random();
-	random_state->xseed[2] = random();
+	random_state->xseed[0] = (unsigned short)
+		getrand(&base_random_sequence, 0, 0xFFFF);
+	random_state->xseed[1] = (unsigned short)
+		getrand(&base_random_sequence, 0, 0xFFFF);
+	random_state->xseed[2] = (unsigned short)
+		getrand(&base_random_sequence, 0, 0xFFFF);
 }
 
-/* random number generator: uniform distribution from min to max inclusive */
+/*
+ * Random number generator: uniform distribution from min to max inclusive.
+ *
+ * Although the limits are expressed as int64, you can't generate the full
+ * int64 range in one call, because the difference of the limits mustn't
+ * overflow int64.  In practice it's unwise to ask for more than an int32
+ * range, because of the limited precision of pg_erand48().
+ */
 static int64
 getrand(RandomState *random_state, int64 min, int64 max)
 {
@@ -5126,12 +5143,14 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	}
 }
 
-/* call srandom based on some seed. NULL triggers the default behavior. */
+/*
+ * Set up a random seed according to seed parameter (NULL means default),
+ * and initialize base_random_sequence for use in initializing other sequences.
+ */
 static bool
 set_random_seed(const char *seed)
 {
-	/* srandom expects an unsigned int */
-	unsigned int iseed;
+	uint64		iseed;
 
 	if (seed == NULL || strcmp(seed, "time") == 0)
 	{
@@ -5139,7 +5158,7 @@ set_random_seed(const char *seed)
 		instr_time	now;
 
 		INSTR_TIME_SET_CURRENT(now);
-		iseed = (unsigned int) INSTR_TIME_GET_MICROSEC(now);
+		iseed = (uint64) INSTR_TIME_GET_MICROSEC(now);
 	}
 	else if (strcmp(seed, "rand") == 0)
 	{
@@ -5155,7 +5174,7 @@ set_random_seed(const char *seed)
 		/* parse seed unsigned int value */
 		char		garbage;
 
-		if (sscanf(seed, "%u%c", &iseed, &garbage) != 1)
+		if (sscanf(seed, UINT64_FORMAT "%c", &iseed, &garbage) != 1)
 		{
 			fprintf(stderr,
 					"unrecognized random seed option \"%s\": expecting an unsigned integer, \"time\" or \"rand\"\n",
@@ -5165,10 +5184,14 @@ set_random_seed(const char *seed)
 	}
 
 	if (seed != NULL)
-		fprintf(stderr, "setting random seed to %u\n", iseed);
-	srandom(iseed);
-	/* no precision loss: 32 bit unsigned int cast to 64 bit int */
+		fprintf(stderr, "setting random seed to " UINT64_FORMAT "\n", iseed);
 	random_seed = iseed;
+
+	/* Fill base_random_sequence with low-order bits of seed */
+	base_random_sequence.xseed[0] = iseed & 0xFFFF;
+	base_random_sequence.xseed[1] = (iseed >> 16) & 0xFFFF;
+	base_random_sequence.xseed[2] = (iseed >> 32) & 0xFFFF;
+
 	return true;
 }
 
@@ -5862,10 +5885,9 @@ main(int argc, char **argv)
 	/* set default seed for hash functions */
 	if (lookupVariable(&state[0], "default_seed") == NULL)
 	{
-		uint64		seed = ((uint64) (random() & 0xFFFF) << 48) |
-		((uint64) (random() & 0xFFFF) << 32) |
-		((uint64) (random() & 0xFFFF) << 16) |
-		(uint64) (random() & 0xFFFF);
+		uint64		seed =
+		((uint64) getrand(&base_random_sequence, 0, 0xFFFFFFFF)) |
+		(((uint64) getrand(&base_random_sequence, 0, 0xFFFFFFFF)) << 32);
 
 		for (i = 0; i < nclients; i++)
 			if (!putVariableInt(&state[i], "startup", "default_seed", (int64) seed))

#32

tgl@sss.pgh.pa.us

almost 7 years ago

In reply to: Fabien COELHO (#30)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Fabien COELHO <coelho@cri.ensmp.fr> writes:

I'm not following this argument. The test case is basically useless
for its intended purpose with that parameter, because it's highly
likely that the failure mode it's supposedly checking for will be
masked by the "random" function's tendency to spit out the same
value all the time.

The first value is taken about 75% of the time for N=1000 and s=2.5, which
means that a non deterministic implementation would succeed about 0.75² ~
56% of the time on that one.

Right, that's about what we've been seeing on OpenBSD.

Also, the drawing procedure is less efficient when the parameter is close
to 1 because it is more likely to loop,

That might be something to fix, but I agree it's a reason not to go
overboard trying to flatten the test case's distribution right now.

If you want something more drastic, using 1.5 instead of 2.5 would reduce
the probability of accidentaly passing the test by chance to about 20%, so
it would fail 80% of the time.

I think your math is off; 1.5 works quite well here. I saw one failure
to produce distinct values in 20 attempts. It's not demonstrably slower
than 2.5 either. (1.1 is measurably slower; probably not by enough for
anyone to care, but 1.5 is good enough for me.)

regards, tom lane

#33

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Tom Lane (#32)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

The first value is taken about 75% of the time for N=1000 and s=2.5, which
means that a non deterministic implementation would succeed about 0.75² ~
56% of the time on that one.

Right, that's about what we've been seeing on OpenBSD.

Also, the drawing procedure is less efficient when the parameter is close
to 1 because it is more likely to loop,

That might be something to fix, but I agree it's a reason not to go
overboard trying to flatten the test case's distribution right now.

Probably you would have to invent a new method to draw a zipfian
distribution for that, which would be nice.

If you want something more drastic, using 1.5 instead of 2.5 would reduce
the probability of accidentaly passing the test by chance to about 20%, so
it would fail 80% of the time.

I think your math is off;

Argh. Although I confirm my computation, ISTM that with 1.5 the first
value as 39% chance of getting out so collision on 15% of cases, second
value 14% so collision on 2%, ... total cumulated probability about 18%.

1.5 works quite well here. I saw one failure to produce distinct values
in 20 attempts.

For 3 failure expected, that is possible.

It's not demonstrably slower than 2.5 either. (1.1 is measurably
slower; probably not by enough for anyone to care, but 1.5 is good
enough for me.)

Good if it fails quick enough for you.

--
Fabien.

#34

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Fabien COELHO (#33)

1 attachment(s)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

It's not demonstrably slower than 2.5 either. (1.1 is measurably slower;
probably not by enough for anyone to care, but 1.5 is good enough for me.)

Good if it fails quick enough for you.

Attached a patch with the zipf doc update & the TAP test parameter change.

--
Fabien.

Attachments:

pgbench-zipf-2.patchtext/x-diff; name=pgbench-zipf-2.patchDownload

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 15ee7c0f2b..10285d655b 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1613,6 +1613,14 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
       frequently values to the beginning of the interval are drawn.
       The closer to 0 <replaceable>parameter</replaceable> is,
       the flatter (more uniform) the access distribution.
+      The distribution is such that, assuming the range starts from 1,
+      the ratio of probability of drawing <replaceable>k</replaceable> versus
+      drawing <replaceable>k+1</replaceable> is
+      <literal>((<replaceable>k</replaceable>+1)/<replaceable>k</replaceable>)**<replaceable>parameter</replaceable></literal>.
+      For instance <literal>random_zipfian(1, ..., 2.5)</literal> draws
+      value <literal>1</literal> about <literal>(2/1)**2.5 = 5.66</literal> times more frequently
+      than <literal>2</literal>, which itself is drawn <literal>(3/2)*2.5 = 2.76</literal> times more
+      frequently than <literal>3</literal>, and so on.
      </para>
     </listitem>
    </itemizedlist>
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index c87748086a..c0cdfbf5f7 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -471,7 +471,7 @@ for my $i (1, 2)
 \set ur random(1000, 1999)
 \set er random_exponential(2000, 2999, 2.0)
 \set gr random_gaussian(3000, 3999, 3.0)
-\set zr random_zipfian(4000, 4999, 2.5)
+\set zr random_zipfian(4000, 4999, 1.5)
 INSERT INTO seeded_random(seed, rand, val) VALUES
   (:random_seed, 'uniform', :ur),
   (:random_seed, 'exponential', :er),

#35

coelho@cri.ensmp.fr

almost 7 years ago

In reply to: Tom Lane (#31)

1 attachment(s)

Re: PSA: we lack TAP test coverage on NetBSD and OpenBSD

Hello Tom,

Here is a POC which defines an internal interface for a PRNG, and use it
within pgbench, with several possible implementations which default to
rand48.

I seriously dislike this patch. pgbench's random support is quite
overengineered already IMO, and this proposes to add a whole batch of
new code and new APIs to fix a very small bug.

My intention is rather to discuss postgres' PRNG, in passing. Full success
on this point:-)

Our immediate problem is to fix a portability failure, which we need to
back-patch into at least one released branch, ergo conservatism is
warranted.

Sure, the patch I sent is definitely not for backpatching, it is for
discussion.

I had in mind something more like the attached.

Yep.

I'm not too happy that it mixes API levels, and about the int/double/int
path.

Attached an updated version which relies on pg_jrand48 instead. Also, as
the pseudo-random state is fully controlled, seeded test results are
deterministic so the expected value can be fully checked.

I did a few sanity tests which were all ok.

I think that this version is appropriate for backpatching. I also think
that it would be appropriate to consider having a better PRNG to replace
rand48 in a future release.

--
Fabien.

Attachments:

pgbench-prng-3.patchtext/x-diff; name=pgbench-prng-3.patchDownload

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b5c75ce1c6..27aac479e3 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -185,7 +185,7 @@ int64		latency_limit = 0;
 char	   *tablespace = NULL;
 char	   *index_tablespace = NULL;
 
-/* random seed used when calling srandom() */
+/* random seed used to initialize base_random_sequence */
 int64		random_seed = -1;
 
 /*
@@ -287,6 +287,9 @@ typedef struct RandomState
 	unsigned short xseed[3];
 } RandomState;
 
+/* Various random sequences are initialized from this one. */
+static RandomState base_random_sequence;
+
 /*
  * Connection state machine states.
  */
@@ -598,6 +601,8 @@ static const BuiltinScript builtin_script[] =
 
 
 /* Function prototypes */
+static void initRandomState(RandomState *random_state);
+static int64 getrand(RandomState *random_state, int64 min, int64 max);
 static void setNullValue(PgBenchValue *pv);
 static void setBoolValue(PgBenchValue *pv, bool bval);
 static void setIntValue(PgBenchValue *pv, int64 ival);
@@ -833,16 +838,28 @@ strtodouble(const char *str, bool errorOK, double *dv)
 
 /*
  * Initialize a random state struct.
+ *
+ * We derive the seed from base_random_sequence, which must be set up already.
  */
 static void
 initRandomState(RandomState *random_state)
 {
-	random_state->xseed[0] = random();
-	random_state->xseed[1] = random();
-	random_state->xseed[2] = random();
+	random_state->xseed[0] = (unsigned short)
+		pg_jrand48(base_random_sequence.xseed) & 0xFFFF;
+	random_state->xseed[1] = (unsigned short)
+		pg_jrand48(base_random_sequence.xseed) & 0xFFFF;
+	random_state->xseed[2] = (unsigned short)
+		pg_jrand48(base_random_sequence.xseed) & 0xFFFF;
 }
 
-/* random number generator: uniform distribution from min to max inclusive */
+/*
+ * Random number generator: uniform distribution from min to max inclusive.
+ *
+ * Although the limits are expressed as int64, you can't generate the full
+ * int64 range in one call, because the difference of the limits mustn't
+ * overflow int64.  In practice it's unwise to ask for more than an int32
+ * range, because of the limited precision of pg_erand48().
+ */
 static int64
 getrand(RandomState *random_state, int64 min, int64 max)
 {
@@ -5126,12 +5143,14 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	}
 }
 
-/* call srandom based on some seed. NULL triggers the default behavior. */
+/*
+ * Set up a random seed according to seed parameter (NULL means default),
+ * and initialize base_random_sequence for use in initializing other sequences.
+ */
 static bool
 set_random_seed(const char *seed)
 {
-	/* srandom expects an unsigned int */
-	unsigned int iseed;
+	uint64		iseed;
 
 	if (seed == NULL || strcmp(seed, "time") == 0)
 	{
@@ -5139,7 +5158,7 @@ set_random_seed(const char *seed)
 		instr_time	now;
 
 		INSTR_TIME_SET_CURRENT(now);
-		iseed = (unsigned int) INSTR_TIME_GET_MICROSEC(now);
+		iseed = (uint64) INSTR_TIME_GET_MICROSEC(now);
 	}
 	else if (strcmp(seed, "rand") == 0)
 	{
@@ -5155,7 +5174,7 @@ set_random_seed(const char *seed)
 		/* parse seed unsigned int value */
 		char		garbage;
 
-		if (sscanf(seed, "%u%c", &iseed, &garbage) != 1)
+		if (sscanf(seed, UINT64_FORMAT "%c", &iseed, &garbage) != 1)
 		{
 			fprintf(stderr,
 					"unrecognized random seed option \"%s\": expecting an unsigned integer, \"time\" or \"rand\"\n",
@@ -5165,10 +5184,14 @@ set_random_seed(const char *seed)
 	}
 
 	if (seed != NULL)
-		fprintf(stderr, "setting random seed to %u\n", iseed);
-	srandom(iseed);
-	/* no precision loss: 32 bit unsigned int cast to 64 bit int */
+		fprintf(stderr, "setting random seed to " UINT64_FORMAT "\n", iseed);
 	random_seed = iseed;
+
+	/* Fill base_random_sequence with low-order bits of seed */
+	base_random_sequence.xseed[0] = iseed & 0xFFFF;
+	base_random_sequence.xseed[1] = (iseed >> 16) & 0xFFFF;
+	base_random_sequence.xseed[2] = (iseed >> 32) & 0xFFFF;
+
 	return true;
 }
 
@@ -5862,10 +5885,9 @@ main(int argc, char **argv)
 	/* set default seed for hash functions */
 	if (lookupVariable(&state[0], "default_seed") == NULL)
 	{
-		uint64		seed = ((uint64) (random() & 0xFFFF) << 48) |
-		((uint64) (random() & 0xFFFF) << 32) |
-		((uint64) (random() & 0xFFFF) << 16) |
-		(uint64) (random() & 0xFFFF);
+		uint64		seed =
+			((uint64) pg_jrand48(base_random_sequence.xseed) & 0xFFFFFFFF) |
+			(((uint64) pg_jrand48(base_random_sequence.xseed) & 0xFFFFFFFF) << 32);
 
 		for (i = 0; i < nclients; i++)
 			if (!putVariableInt(&state[i], "startup", "default_seed", (int64) seed))
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index c87748086a..4aaac76c03 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -259,11 +259,11 @@ pgbench(
 	[
 		qr{setting random seed to 5432\b},
 
-		# After explicit seeding, the four * random checks (1-3,20) should be
-		# deterministic, but not necessarily portable.
-		qr{command=1.: int 1\d\b},        # uniform random: 12 on linux
-		qr{command=2.: int 1\d\d\b},      # exponential random: 106 on linux
-		qr{command=3.: int 1\d\d\d\b},    # gaussian random: 1462 on linux
+		# After explicit seeding, the four random checks (1-3,20) are
+		# deterministic
+		qr{command=1.: int 13\b},      # uniform random
+		qr{command=2.: int 116\b},     # exponential random
+		qr{command=3.: int 1498\b},    # gaussian random
 		qr{command=4.: int 4\b},
 		qr{command=5.: int 5\b},
 		qr{command=6.: int 6\b},
@@ -276,7 +276,7 @@ pgbench(
 		qr{command=15.: double 15\b},
 		qr{command=16.: double 16\b},
 		qr{command=17.: double 17\b},
-		qr{command=20.: int \d\b},    # zipfian random: 1 on linux
+		qr{command=20.: int 1\b},      # zipfian random
 		qr{command=21.: double -27\b},
 		qr{command=22.: double 1024\b},
 		qr{command=23.: double 1\b},

#36