Support getrandom() for pg_strong_random() source

Started by Masahiko Sawada, 8 months ago, 67 messages
#1Masahiko Sawada
sawada.mshk@gmail.com

Hi all,

Currently we have three options for pg_strong_random() sources:

1. OpenSSL's RAND_bytes()
2. Windows' CryptGenRandom() function
3. /dev/urandom

The patch supports the getrandom() function as a new source of
pg_strong_random(). The getrandom() function uses the same source as
the /dev/urandom device but it seems much faster than opening,
reading, and closing /dev/urandom. Here is the execution time of
generating 1 million UUIDv4 data measured on my environment:

HEAD(/dev/urandom): 1863.064 ms
Patched(getrandom()): 516.627 ms

I guess that while OpenSSL's RAND_bytes() should still be prioritized
where available it might be a good idea to support getrandom() for
builds where RAND_bytes() is not available.
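
For readers who have not used the call, here is a minimal sketch of filling a buffer with getrandom(). It is not the code from the attached patch; the function name fill_with_getrandom() is made up for illustration, and the sketch assumes Linux with glibc 2.25 or later so that <sys/random.h> is present.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fill buf with len random bytes using getrandom().
 * Returns true on success, false on an unrecoverable error.
 */
static bool
fill_with_getrandom(void *buf, size_t len)
{
	char	   *p = buf;

	while (len > 0)
	{
		/* flags = 0: use the urandom source, block until it is initialized */
		ssize_t		res = getrandom(p, len, 0);

		if (res < 0)
		{
			if (errno == EINTR)
				continue;		/* interrupted by a signal, retry */
			return false;
		}

		/* getrandom() may return fewer bytes than requested */
		p += res;
		len -= (size_t) res;
	}

	return true;
}
```

No file descriptor is involved, which is where the saving over open/read/close of /dev/urandom would be expected to come from.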

Feedback is very welcome.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Support-getrandom-as-the-source-of-pg_strong_rand.patch (application/octet-stream, +77 -6)
#2Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#1)
Re: Support getrandom() for pg_strong_random() source

On Mon, Jul 21, 2025 at 11:43:35PM -0700, Masahiko Sawada wrote:

The patch supports the getrandom() function as a new source of
pg_strong_random(). The getrandom() function uses the same source as
the /dev/urandom device but it seems much faster than opening,
reading, and closing /dev/urandom. Here is the execution time of
generating 1 million UUIDv4 data measured on my environment:

HEAD(/dev/urandom): 1863.064 ms
Patched(getrandom()): 516.627 ms

Interesting. Are there platforms where this is not available? I'd be
pretty sure that some animals in the buildfarm would not like this
suggestion but I'm saying it anyway. Perhaps we could even drop
/dev/urandom?

I guess that while OpenSSL's RAND_bytes() should still be prioritized
where available it might be a good idea to support getrandom() for
builds where RAND_bytes() is not available.

Feedback is very welcome.

I am wondering how much non-OpenSSL builds matter these days, TBH, so
I am not sure that this is worth the addition of an extra
configure/meson check and this stuff has its cost just for such
builds. I am not saying that we should make OpenSSL mandatory, of
course not, but all production instances of Postgres have likely
OpenSSL enabled anyway. Perhaps some embedded deployments like
--without-openssl, who knows..
--
Michael

#3Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#2)
Re: Support getrandom() for pg_strong_random() source

On Tue, Jul 22, 2025 at 12:13 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jul 21, 2025 at 11:43:35PM -0700, Masahiko Sawada wrote:

The patch supports the getrandom() function as a new source of
pg_strong_random(). The getrandom() function uses the same source as
the /dev/urandom device but it seems much faster than opening,
reading, and closing /dev/urandom. Here is the execution time of
generating 1 million UUIDv4 data measured on my environment:

HEAD(/dev/urandom): 1863.064 ms
Patched(getrandom()): 516.627 ms

Interesting. Are there platforms where this is not available? I'd be
pretty sure that some animals in the buildfarm would not like this
suggestion but I'm saying it anyway. Perhaps we could even drop
/dev/urandom?

As far as I know macOS doesn't support getrandom() but supports
getentropy() instead. And an older glibc version might not support it.
It's supported since Linux 3.17 and glibc 2.25.

I guess that while OpenSSL's RAND_bytes() should still be prioritized
where available it might be a good idea to support getrandom() for
builds where RAND_bytes() is not available.

Feedback is very welcome.

I am wondering how much non-OpenSSL builds matter these days, TBH, so
I am not sure that this is worth the addition of an extra
configure/meson check and this stuff has its cost just for such
builds. I am not saying that we should make OpenSSL mandatory, of
course not, but all production instances of Postgres have likely
OpenSSL enabled anyway. Perhaps some embedded deployments like
--without-openssl, who knows..

Fair point. In fact, I was not using OpenSSL and just realized
generating UUID by PostgreSQL's uuidv4() and uuidv7() was much slower
than generating it by Rust's UUID crate. On my environment,
getrandom() is faster than RAND_bytes() so I thought there are some
cases where users want to use the getrandom() source rather than
RAND_bytes(), but I'm not sure since there is also a difference in the
secureness.
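
To make the workload concrete: a version 4 UUID is 16 random bytes with the version and variant bits overwritten, so each uuidv4() call amounts to one small request to the random source. A rough sketch follows, assuming getentropy() as a stand-in for pg_strong_random(); make_uuid_v4() is a hypothetical name, not PostgreSQL's implementation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/random.h>

/*
 * Hypothetical helper: build a version 4 UUID from 16 random bytes.
 * getentropy() stands in for whatever strong random source is configured.
 */
static bool
make_uuid_v4(uint8_t uuid[16])
{
	if (getentropy(uuid, 16) != 0)
		return false;

	/* version field: high nibble of byte 6 is 0100 (version 4) */
	uuid[6] = (uint8_t) ((uuid[6] & 0x0F) | 0x40);

	/* variant field: two high bits of byte 8 are 10 */
	uuid[8] = (uint8_t) ((uuid[8] & 0x3F) | 0x80);

	return true;
}
```

With only 16 bytes requested per call, fixed per-call overhead in the random source is likely to dominate, which is why this workload is sensitive to the choice of source.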

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#4Dagfinn Ilmari Mannsåker
ilmari@ilmari.org
In reply to: Masahiko Sawada (#3)
Re: Support getrandom() for pg_strong_random() source

Masahiko Sawada <sawada.mshk@gmail.com> writes:

On Tue, Jul 22, 2025 at 12:13 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jul 21, 2025 at 11:43:35PM -0700, Masahiko Sawada wrote:

The patch supports the getrandom() function as a new source of
pg_strong_random(). The getrandom() function uses the same source as
the /dev/urandom device but it seems much faster than opening,
reading, and closing /dev/urandom. Here is the execution time of
generating 1 million UUIDv4 data measured on my environment:

HEAD(/dev/urandom): 1863.064 ms
Patched(getrandom()): 516.627 ms

Interesting. Are there platforms where this is not available? I'd be
pretty sure that some animals in the buildfarm would not like this
suggestion but I'm saying it anyway. Perhaps we could even drop
/dev/urandom?

As far as I know macOS doesn't support getrandom() but supports
getentropy() instead. And an older glibc version might not support it.
It's supported since Linux 3.17 and glibc 2.25.

getrandom() is Linux-specific, while getentropy() is specified by POSIX
(since 2024). It was originally introduced by OpenBSD 5.6 in 2014, and
was added to macOS 10.12 in 2016, glibc 2.25 (same as getrandom()) in
2017, musl 1.1.20 and FreeBSD 12.0 in 2018, and NetBSD 10.0 in 2024

Sources:

https://pubs.opengroup.org/onlinepubs/9799919799/functions/getentropy.html
https://dotat.at/@/2024-10-01-getentropy.html

So I think it's more worthwhile to add support for getentropy() than
getrandom().

- ilmari

#5Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Dagfinn Ilmari Mannsåker (#4)
Re: Support getrandom() for pg_strong_random() source

On Tue, Jul 22, 2025 at 4:12 AM Dagfinn Ilmari Mannsåker
<ilmari@ilmari.org> wrote:

Masahiko Sawada <sawada.mshk@gmail.com> writes:

On Tue, Jul 22, 2025 at 12:13 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jul 21, 2025 at 11:43:35PM -0700, Masahiko Sawada wrote:

The patch supports the getrandom() function as a new source of
pg_strong_random(). The getrandom() function uses the same source as
the /dev/urandom device but it seems much faster than opening,
reading, and closing /dev/urandom. Here is the execution time of
generating 1 million UUIDv4 data measured on my environment:

HEAD(/dev/urandom): 1863.064 ms
Patched(getrandom()): 516.627 ms

Interesting. Are there platforms where this is not available? I'd be
pretty sure that some animals in the buildfarm would not like this
suggestion but I'm saying it anyway. Perhaps we could even drop
/dev/urandom?

As far as I know macOS doesn't support getrandom() but supports
getentropy() instead. And an older glibc version might not support it.
It's supported since Linux 3.17 and glibc 2.25.

getrandom() is Linux-specific, while getentropy() is specified by POSIX
(since 2024). It was originally introduced by OpenBSD 5.6 in 2014, and
was added to macOS 10.12 in 2016, glibc 2.25 (same as getrandom()) in
2017, musl 1.1.20 and FreeBSD 12.0 in 2018, and NetBSD 10.0 in 2024

Sources:

https://pubs.opengroup.org/onlinepubs/9799919799/functions/getentropy.html
https://dotat.at/@/2024-10-01-getentropy.html

So I think it's more worthwhile to add support for getentropy() than
getrandom().

While getentropy() has better portability, according to the
getentropy() manual, the maximum length is limited to 256 bytes. It
works in some cases such as generating UUID data but seems not
appropriate for our general pg_strong_random() use cases.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#6Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Masahiko Sawada (#5)
Re: Support getrandom() for pg_strong_random() source

On Tue, Jul 22, 2025 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

While getentropy() has better portability, according to the
getentropy() manual, the maximum length is limited to 256 bytes. It
works in some cases such as generating UUID data but seems not
appropriate for our general pg_strong_random() use cases.

I imagine that the code would look very similar to your patch, though
(loop, in chunks of size GETENTROPY_MAX, until the required length is
met). Without looking too deeply, I have to say that implementing a
newer POSIX API as opposed to a Linux-specific one does seem like a
better cost-benefit tradeoff, if we decide to do this.
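
A minimal sketch of the chunked loop described here, assuming the 256-byte limit from the manual. ENTROPY_CHUNK and fill_with_getentropy() are illustrative names only and are not taken from any posted patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/random.h>

/* Assumed per-call limit, per the getentropy() manual */
#define ENTROPY_CHUNK 256

/*
 * Hypothetical helper: fill buf with len random bytes by calling
 * getentropy() repeatedly in chunks of at most ENTROPY_CHUNK bytes.
 */
static bool
fill_with_getentropy(void *buf, size_t len)
{
	char	   *p = buf;

	while (len > 0)
	{
		size_t		chunk = len < ENTROPY_CHUNK ? len : ENTROPY_CHUNK;

		/* getentropy() fills the whole chunk or fails; no short reads */
		if (getentropy(p, chunk) != 0)
			return false;

		/* advance the write position so later chunks do not overwrite earlier ones */
		p += chunk;
		len -= chunk;
	}

	return true;
}
```

For requests of 256 bytes or less, such as a UUID, the loop body runs exactly once.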

Can you talk more about this part:

On my environment,
getrandom() is faster than RAND_bytes() so I thought there are some
cases where users want to use the getrandom() source rather than
RAND_bytes(), but I'm not sure since there is also a difference in the
secureness.

That is _really_ surprising to me at first glance. I thought
RAND_bytes() was supposed to be a userspace PRNG, which I would
naively expect to take much less time than pulling data from Linux.
(Once the OpenSSL PRNG has been seeded, that is.) Are there any other
details about your environment (or the test itself) that are unusual?

Thanks,
--Jacob

#7DINESH NAIR
Dinesh_Nair@iitmpravartak.net
In reply to: Masahiko Sawada (#5)
Re: Support getrandom() for pg_strong_random() source

Hi,

On Tue, Jul 22, 2025 at 4:12 AM Dagfinn Ilmari Mannsåker
<ilmari@ilmari.org> wrote:

Masahiko Sawada <sawada.mshk@gmail.com> writes:

On Tue, Jul 22, 2025 at 12:13 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jul 21, 2025 at 11:43:35PM -0700, Masahiko Sawada wrote:

The patch supports the getrandom() function as a new source of
pg_strong_random(). The getrandom() function uses the same source as
the /dev/urandom device but it seems much faster than opening,
reading, and closing /dev/urandom. Here is the execution time of
generating 1 million UUIDv4 data measured on my environment:

HEAD(/dev/urandom): 1863.064 ms
Patched(getrandom()): 516.627 ms

Interesting. Are there platforms where this is not available? I'd be
pretty sure that some animals in the buildfarm would not like this
suggestion but I'm saying it anyway. Perhaps we could even drop
/dev/urandom?

As far as I know macOS doesn't support getrandom() but supports
getentropy() instead. And an older glibc version might not support it.
It's supported since Linux 3.17 and glibc 2.25.

getrandom() is Linux-specific, while getentropy() is specified by POSIX
(since 2024). It was originally introduced by OpenBSD 5.6 in 2014, and
was added to macOS 10.12 in 2016, glibc 2.25 (same as getrandom()) in
2017, musl 1.1.20 and FreeBSD 12.0 in 2018, and NetBSD 10.0 in 2024

Sources:
While getentropy() has better portability, according to the
getentropy() manual, the maximum length is limited to 256 bytes. It
works in some cases such as generating UUID data but seems not
appropriate for our general pg_strong_random() use cases.

The getentropy() function can return at most 256 bytes per call and is
not supported on Windows. It is therefore not recommended for
cryptographic operations that need large buffers of high-quality
randomness efficiently.
https://brandur.org/fragments/secure-bytes-without-pgcrypto

Thanks

Regards

Dinesh Nair


#8Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Jacob Champion (#6)
Re: Support getrandom() for pg_strong_random() source

On Tue, Jul 22, 2025 at 11:46 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

On Tue, Jul 22, 2025 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

While getentropy() has better portability, according to the
getentropy() manual, the maximum length is limited to 256 bytes. It
works in some cases such as generating UUID data but seems not
appropriate for our general pg_strong_random() use cases.

I imagine that the code would look very similar to your patch, though
(loop, in chunks of size GETENTROPY_MAX, until the required length is
met). Without looking too deeply, I have to say that implementing a
newer POSIX API as opposed to a Linux-specific one does seem like a
better cost-benefit tradeoff, if we decide to do this.

Can you talk more about this part:

On my environment,
getrandom() is faster than RAND_bytes() so I thought there are some
cases where users want to use the getrandom() source rather than
RAND_bytes(), but I'm not sure since there is also a difference in the
secureness.

That is _really_ surprising to me at first glance. I thought
RAND_bytes() was supposed to be a userspace PRNG, which I would
naively expect to take much less time than pulling data from Linux.
(Once the OpenSSL PRNG has been seeded, that is.) Are there any other
details about your environment (or the test itself) that are unusual?

Yes, it surprised me too. The environment I used for this benchmark was:

% cat /etc/redhat-release
Red Hat Enterprise Linux release 8.10 (Ootpa)
% uname -r
4.18.0-553.22.1.el8_10.x86_64
% rpm -qa | grep openssl
openssl-libs-1.1.1k-14.el8_6.x86_64
openssl-debugsource-1.1.1k-14.el8_6.x86_64
rubygem-openssl-2.1.2-114.module+el8.10.0+23088+750dc6ca.x86_64
openssl-devel-1.1.1k-14.el8_6.x86_64
openssl-pkcs11-0.4.10-3.el8.x86_64
openssl-1.1.1k-14.el8_6.x86_64
openssl-debuginfo-1.1.1k-14.el8_6.x86_64
% openssl version
OpenSSL 1.1.1k FIPS 25 Mar 2021

and I measured the execution time of the following query:

explain analyze select uuidv4() from generate_series(1, 1_000_000);

The result is:

getrandom: 517.120 ms
RAND_bytes: 1150.051 ms
/dev/urandom: 1862.483 ms

Since the above environment uses an old Linux kernel and OpenSSL
version, I ran the same benchmark on another environment:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.2 LTS
Release: 24.04
Codename: noble
$ apt list --installed | grep ssl
libssl-dev/noble-updates,noble-security,now 3.0.13-0ubuntu3.5 amd64 [installed]
libssl3t64/noble-updates,noble-security,now 3.0.13-0ubuntu3.5 amd64 [installed,automatic]
libxmlsec1t64-openssl/noble,now 1.2.39-5build2 amd64 [installed,automatic]
openssl/noble-updates,noble-security,now 3.0.13-0ubuntu3.5 amd64 [installed,automatic]
python3-openssl/noble,now 23.2.0-1 all [installed,automatic]
ssl-cert/noble,now 1.1.2ubuntu1 all [installed,automatic]

The trend of the results was similar:

getrandom: 497.061 ms
RAND_bytes: 1152.260 ms
/dev/urandom: 1696.065 ms

Please let me know if I'm missing configurations or settings to
measure this workload properly.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#9Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Masahiko Sawada (#8)
Re: Support getrandom() for pg_strong_random() source

On Tue, Jul 22, 2025 at 4:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The trend of the results was similar:

getrandom: 497.061 ms
RAND_bytes: 1152.260 ms
/dev/urandom: 1696.065 ms

Please let me know if I'm missing configurations or settings to
measure this workload properly.

I don't think you're missing anything, or else I'm missing something
too. If I modify pg_strong_random() to call getentropy() in addition
to the existing RAND_bytes() code, `perf` shows RAND_bytes() taking up
2.4x the samples that getentropy() does. That's very similar to your
results.

On Tue, Jul 22, 2025 at 11:46 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

That is _really_ surprising to me at first glance. I thought
RAND_bytes() was supposed to be a userspace PRNG, which I would
naively expect to take much less time than pulling data from Linux.

So my expectation was naive for sure. This has sent me down a bit of a
rabbit hole, starting with Adam Langley's BoringSSL post [1], which led
to a post/rant on urandom [2]. I don't think an API that advertises
"strong randomness" should ever prioritize performance over strength.
But maybe the pendulum has swung far enough that we can expect any
kernel supporting getentropy() to be able to do the job just as well
as OpenSSL does in userspace, except also faster? I think it might be
worth a discussion.

Thanks,
--Jacob

[1]: https://www.imperialviolet.org/2015/10/17/boringssl.html
[2]: https://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/

#10Daniel Gustafsson
daniel@yesql.se
In reply to: Jacob Champion (#9)
Re: Support getrandom() for pg_strong_random() source

On 23 Jul 2025, at 19:11, Jacob Champion <jacob.champion@enterprisedb.com> wrote:

.. maybe the pendulum has swung far enough that we can expect any
kernel supporting getentropy() to be able to do the job just as well
as OpenSSL does in userspace, except also faster? I think it might be
worth a discussion.

There have in the past been discussions (at least off-list in hallway tracks)
about allowing randomness to be chosen separately from underlying factors such
as OpenSSL support. At the time it didn't seem worth the trouble, but that may
well have changed.

With OpenSSL 1.1.1 being the baseline we can also make use of the _priv_bytes
functions to get increased isolation.

--
Daniel Gustafsson

#11Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Daniel Gustafsson (#10)
Re: Support getrandom() for pg_strong_random() source

On Mon, Jul 28, 2025 at 4:36 AM Daniel Gustafsson <daniel@yesql.se> wrote:

There have in the past been discussions (at least off-list in hallway tracks)
about allowing randomness to be chosen separately from underlying factors such
as OpenSSL support. At the time it didn't seem worth the trouble, but that may
well have changed.

Yeah, especially if other options with similar strength could be much
faster. But the comparison is really going to be OS-dependent [1, 2].

With OpenSSL 1.1.1 being the baseline we can also make use of the _priv_bytes
functions to get increased isolation.

Hmm, that's an interesting idea too.

To move this forward a tiny bit: I would be okay with maintaining a
new getentropy() case. (I'm less excited about getrandom() because of
its reduced reach.) And maybe down the line we should discuss choosing
an option at configure time?

--Jacob

[1]: https://lwn.net/Articles/983186/
[2]: https://dotat.at/@/2024-10-01-getentropy.html

#12Daniel Gustafsson
daniel@yesql.se
In reply to: Jacob Champion (#11)
Re: Support getrandom() for pg_strong_random() source

On 28 Jul 2025, at 17:29, Jacob Champion <jacob.champion@enterprisedb.com> wrote:

To move this forward a tiny bit: I would be okay with maintaining a
new getentropy() case. (I'm less excited about getrandom() because of
its reduced reach.) And maybe down the line we should discuss choosing
an option at configure time?

I would not be opposed to starting there.

--
Daniel Gustafsson

#13Michael Paquier
michael@paquier.xyz
In reply to: Daniel Gustafsson (#12)
Re: Support getrandom() for pg_strong_random() source

On Mon, Jul 28, 2025 at 06:14:20PM +0200, Daniel Gustafsson wrote:

On 28 Jul 2025, at 17:29, Jacob Champion <jacob.champion@enterprisedb.com> wrote:

To move this forward a tiny bit: I would be okay with maintaining a
new getentropy() case. (I'm less excited about getrandom() because of
its reduced reach.) And maybe down the line we should discuss choosing
an option at configure time?

I would not be opposed to starting there.

Both of you know the options of these areas of the code more than the
average committer, I think, so if you think that getentropy() could be
a good choice, while making the choice configurable to give the
possibility to be outside of OpenSSL, why not.

My understanding of the problem is that it is a choice of efficiency
vs entropy, and that it's not really possible to have both parts of
the cake. If we make that configurable, documentation sounds like the
key point to me, to explain which one has more benefits over the
other.

Could getentropy() be more efficient at the end on most platforms,
meaning that this could limit the meaning of having a GUC switch?
Having it in POSIX is appealing with the long-term picture in mind..
--
Michael

#14Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#13)
Re: Support getrandom() for pg_strong_random() source

On Mon, Jul 28, 2025 at 6:30 PM Michael Paquier <michael@paquier.xyz> wrote:

My understanding of the problem is that it is a choice of efficiency
vs entropy, and that it's not really possible to have both parts of
the cake.

That was my understanding too, but then [1] called that into question.
If -- and I don't really have enough expertise to verify that "if" --
the reason OpenSSL is slower isn't because of "entropy", but because
of operations and safety checks that it has to ask the kernel to make
for it, then it stands to reason that the kernel could do that a lot
faster.

If it turns out that's not the case (the post at [1] is ten years old;
things change, or Adam could have been wrong, or...), I think that
getentropy() is still a straight upgrade from /dev/urandom, due to its
increased safety guarantees at startup.
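
One reading of the startup guarantee, as I understand the Linux getrandom(2) manual: the call blocks until the kernel's urandom source has been initialized, unless GRND_NONBLOCK is passed, in which case it fails with EAGAIN instead. The sketch below probes that state. kernel_rng_ready() is a hypothetical name and the code is Linux-specific.

```c
#include <errno.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical probe: report whether the kernel random source is ready.
 * Returns 1 if initialized, 0 if not yet initialized, -1 on any other error.
 */
static int
kernel_rng_ready(void)
{
	unsigned char probe;
	ssize_t		res = getrandom(&probe, sizeof(probe), GRND_NONBLOCK);

	if (res == (ssize_t) sizeof(probe))
		return 1;

	/* EAGAIN with GRND_NONBLOCK means the source is not yet initialized */
	if (res < 0 && errno == EAGAIN)
		return 0;

	return -1;
}
```

Without GRND_NONBLOCK the caller simply waits in the not-yet-initialized case rather than needing a check like this.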

If we make that configurable, documentation sounds like the
key point to me, to explain which one has more benefits over the
other.

Agreed.

Could getentropy() be more efficient at the end on most platforms,
meaning that this could limit the meaning of having a GUC switch?

I don't know. [2] implies that the performance comparison depends on
several factors, and falls in favor of OpenSSL when the number of
bytes per call is large -- but our use of pg_strong_random() is
generally on small buffers. We would need to do a _lot_ more research
before, say, switching any defaults.

Thanks,
--Jacob

[1]: https://www.imperialviolet.org/2015/10/17/boringssl.html
[2]: https://dotat.at/@/2024-10-01-getentropy.html

#15Dagfinn Ilmari Mannsåker
ilmari@ilmari.org
In reply to: Jacob Champion (#14)
Re: Support getrandom() for pg_strong_random() source

Jacob Champion <jacob.champion@enterprisedb.com> writes:

On Mon, Jul 28, 2025 at 6:30 PM Michael Paquier <michael@paquier.xyz> wrote:

Could getentropy() be more efficient at the end on most platforms,
meaning that this could limit the meaning of having a GUC switch?

I don't know. [2] implies that the performance comparison depends on
several factors, and falls in favor of OpenSSL when the number of
bytes per call is large

[...]

[2] https://dotat.at/@/2024-10-01-getentropy.html

Note that that test was done on an older Linux kernel without the vDSO
implementation of getentropy(), so on newer kernel (>=6.11) and glibc
(>= 2.41) versions the difference might be smaller or the other way
around.

- ilmari

#16Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Jacob Champion (#14)
Re: Support getrandom() for pg_strong_random() source

On Tue, Jul 29, 2025 at 8:55 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

On Mon, Jul 28, 2025 at 6:30 PM Michael Paquier <michael@paquier.xyz> wrote:

My understanding of the problem is that it is a choice of efficiency
vs entropy, and that it's not really possible to have both parts of
the cake.

Agreed. I think the optimal choice would depend on the specific use
case. For instance, since UUIDs are not intended for security
purposes, they don't require particularly high entropy. In UUID
generation, the efficiency of random data generation tends to be
prioritized over the quality of randomness.

Could getentropy() be more efficient at the end on most platforms,
meaning that this could limit the meaning of having a GUC switch?

I don't know. [2] implies that the performance comparison depends on
several factors, and falls in favor of OpenSSL when the number of
bytes per call is large -- but our use of pg_strong_random() is
generally on small buffers. We would need to do a _lot_ more research
before, say, switching any defaults.

The performance issue with getentropy, particularly when len=1024,
likely stems from the need for multiple getentropy() calls due to its
256-byte length restriction.

Analysis of RAND_bytes() through strace reveals that it internally
makes calls to getrandom() with a fixed length of 32 bytes. While I'm
uncertain of the exact purpose, it's logical that a single
getentropy() call would be more efficient than RAND_bytes(), which
involves additional overhead beyond just calling getrandom(),
especially when dealing with smaller byte sizes.

I've updated the patch to support getentropy() instead of getrandom().

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Support-getentropy-as-source-of-pg_strong_random-.patch (application/octet-stream, +75 -6)
#17Peter Eisentraut
peter_e@gmx.net
In reply to: Dagfinn Ilmari Mannsåker (#4)
Re: Support getrandom() for pg_strong_random() source

On 22.07.25 13:11, Dagfinn Ilmari Mannsåker wrote:

getrandom() is Linux-specific, while getentropy() is specified by POSIX
(since 2024). It was originally introduced by OpenBSD 5.6 in 2014, and
was added to macOS 10.12 in 2016, glibc 2.25 (same as getrandom()) in
2017, musl 1.1.20 and FreeBSD 12.0 in 2018, and NetBSD 10.0 in 2024

Sources:

https://pubs.opengroup.org/onlinepubs/9799919799/functions/getentropy.html
https://dotat.at/@/2024-10-01-getentropy.html

So I think it's more worthwhile to add support for getentropy() than
getrandom().

The POSIX description of getentropy() says:

"The intended use of this function is to create a seed for other
pseudo-random number generators."

So using getentropy() for generating the random numbers that are passed
back to the application code would appear to be the wrong use.
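
For illustration, the seeding pattern the POSIX text describes looks roughly like the sketch below: getentropy() is called once to seed a userspace generator, which then produces the values handed to the application. SplitMix64 is used here only because it is short. It is not cryptographically secure and is not something proposed in this thread; demo_prng and its functions are made-up names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/random.h>

/* Hypothetical userspace generator state (SplitMix64, NOT a CSPRNG) */
typedef struct demo_prng
{
	uint64_t	state;
} demo_prng;

/* Seed the generator once from the kernel via getentropy(). */
static bool
demo_prng_seed(demo_prng *g)
{
	return getentropy(&g->state, sizeof(g->state)) == 0;
}

/* Produce the next 64-bit value without entering the kernel. */
static uint64_t
demo_prng_next(demo_prng *g)
{
	uint64_t	z = (g->state += UINT64_C(0x9E3779B97F4A7C15));

	z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
	z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
	return z ^ (z >> 31);
}
```

A strong random API would need a cryptographic generator in place of SplitMix64, which is roughly the role OpenSSL's RAND_bytes() plays today.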

#18Peter Eisentraut
peter_e@gmx.net
In reply to: Masahiko Sawada (#16)
Re: Support getrandom() for pg_strong_random() source

On 30.07.25 08:59, Masahiko Sawada wrote:

I've updated the patch to support getentropy() instead of getrandom().

The point still stands that the number of installations without OpenSSL
support is approximately zero, so what is the purpose of this patch if
approximately no one will be able to use it?

In reply to: Masahiko Sawada (#16)
Re: Support getrandom() for pg_strong_random() source

Masahiko Sawada <sawada.mshk@gmail.com> writes:

On Tue, Jul 29, 2025 at 8:55 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

On Mon, Jul 28, 2025 at 6:30 PM Michael Paquier <michael@paquier.xyz> wrote:

My understanding of the problem is that it is a choice of efficiency
vs entropy, and that it's not really possible to have both parts of
the cake.

Agreed. I think the optimal choice would depend on the specific use
case. For instance, since UUIDs are not intended for security
purposes, they don't require particularly high entropy. In UUID
generation, the efficiency of random data generation tends to be
prioritized over the quality of randomness.

Could getentropy() be more efficient at the end on most platforms,
meaning that this could limit the meaning of having a GUC switch?

I don't know. [2] implies that the performance comparison depends on
several factors, and falls in favor of OpenSSL when the number of
bytes per call is large -- but our use of pg_strong_random() is
generally on small buffers. We would need to do a _lot_ more research
before, say, switching any defaults.

The performance issue with getentropy, particularly when len=1024,
likely stems from the need for multiple getentropy() calls due to its
256-byte length restriction.
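
To make the arithmetic concrete: with a 256-byte cap per call, a request for len bytes needs ceil(len / 256) calls, so len=1024 takes four. A small sketch follows; the helper name is hypothetical, and 256 is the limit quoted above rather than a value read from <limits.h>.

```c
#include <stddef.h>

/* Per-call cap quoted in the discussion; an assumption for this sketch. */
#define GETENTROPY_CAP 256

/* Hypothetical helper: number of getentropy() calls needed for len bytes. */
size_t
getentropy_calls_needed(size_t len)
{
	/* Round up so that a partial final chunk still counts as one call. */
	return (len + GETENTROPY_CAP - 1) / GETENTROPY_CAP;
}
```

A 16-byte request (the size of a UUID) fits in a single call, which is the common case for small buffers.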

Analysis of RAND_bytes() through strace reveals that it internally
makes calls to getrandom() with a fixed length of 32 bytes. While I'm
uncertain of the exact purpose, it's logical that a single
getentropy() call would be more efficient than RAND_bytes(), which
involves additional overhead beyond just calling getrandom(),
especially when dealing with smaller byte sizes.

I've updated the patch to support getentropy() instead of getrandom().

Thanks, just a few comments:

The blog post at
https://dotat.at/@/2024-10-01-getentropy.html#portability-of-getentropy-
points out a couple of caveats:

* Originally getentropy() was declared in <sys/random.h> but POSIX
declares it in <unistd.h>. You need to include both headers to be
sure.

So the probes need to include both <sys/random.h> (if available) and
<unistd.h>, and in the code <sys/random.h> should only be included if
available.

* POSIX specifies a GETENTROPY_MAX macro in <limits.h> for the largest
buffer getentropy() will fill. Most systems don’t yet have this
macro; if it isn’t defined the limit is 256 bytes.

And this means we should include <limits.h> and only define
GETENTROPY_MAX to 256 if it's not already defined.
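
For illustration, the two caveats could be handled roughly as follows. This is only a sketch, not the code from the attached patch: the __has_include test stands in for a configure-time check (something like a HAVE_SYS_RANDOM_H macro, which is a hypothetical name here), and small_random() is a made-up helper.

```c
#define _DEFAULT_SOURCE		/* glibc: expose getentropy() under strict -std modes */

#include <limits.h>			/* GETENTROPY_MAX, on systems that define it */
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>			/* where POSIX declares getentropy() */

/*
 * Stand-in for a configure-time HAVE_SYS_RANDOM_H test (hypothetical macro
 * name). __has_include is a GCC/Clang extension before C23.
 */
#if defined(__has_include)
#if __has_include(<sys/random.h>)
#include <sys/random.h>		/* original home of getentropy() */
#endif
#endif

/* Fall back to 256 only if <limits.h> did not define the macro. */
#ifndef GETENTROPY_MAX
#define GETENTROPY_MAX 256
#endif

/*
 * Hypothetical helper: fetch up to GETENTROPY_MAX bytes in a single call.
 * Larger requests are rejected here rather than split into chunks.
 */
bool
small_random(void *buf, size_t len)
{
	if (len > (size_t) GETENTROPY_MAX)
		return false;
	return getentropy(buf, len) == 0;
}
```

The #ifndef guard matters because a system that does define GETENTROPY_MAX may allow more than 256 bytes, and redefining it would both warn and throw that information away.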

+bool
+pg_strong_random(void *buf, size_t len)
+{
+	char	   *p = buf;
+	ssize_t		res;
+
+	while (len)
+	{
+		size_t		l = Min(len, GETENTROPY_MAX_LEN);
+
+		res = getentropy(buf, l);

This should be getentropy(p, l), otherwise it will just fill the
first GETENTROPY_MAX_LEN bytes of buf repeatedly. On my machine I got a
warning about that:

../postgresql/src/port/pg_strong_random.c:159:11: warning: variable 'p' set but not used [-Wunused-but-set-variable]
159 | char *p = buf;
| ^

Do we not have any tests for pg_strong_random that make sure it fills
the entire buffer for various sizes?
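
As a rough idea of what the corrected loop and such a test could look like (a standalone sketch, not the patch itself; fill_random() and fills_whole_buffer() are hypothetical names, and the __has_include check again stands in for a configure probe):

```c
#define _DEFAULT_SOURCE		/* glibc: expose getentropy() under strict -std modes */

#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#if defined(__has_include)
#if __has_include(<sys/random.h>)
#include <sys/random.h>
#endif
#endif

#ifndef GETENTROPY_MAX
#define GETENTROPY_MAX 256
#endif

#define TEST_BUF_MAX 4096

/* Fill buf with len random bytes, at most GETENTROPY_MAX per call. */
bool
fill_random(void *buf, size_t len)
{
	unsigned char *p = buf;
	const size_t max = GETENTROPY_MAX;

	while (len > 0)
	{
		size_t		chunk = len < max ? len : max;

		if (getentropy(p, chunk) != 0)
			return false;
		p += chunk;			/* advance, or later calls overwrite the start */
		len -= chunk;
	}
	return true;
}

/*
 * Check that every byte position gets written. A random byte can match the
 * sentinel by chance, so run several rounds and count a position as touched
 * if it differs from the sentinel in any of them. A written position is
 * missed only with probability (1/256)^8.
 */
bool
fills_whole_buffer(size_t len)
{
	unsigned char buf[TEST_BUF_MAX];
	bool		touched[TEST_BUF_MAX] = {false};

	assert(len <= TEST_BUF_MAX);
	for (int round = 0; round < 8; round++)
	{
		unsigned char sentinel = (unsigned char) (round * 37);

		memset(buf, sentinel, len);
		if (!fill_random(buf, len))
			return false;
		for (size_t i = 0; i < len; i++)
			if (buf[i] != sentinel)
				touched[i] = true;
	}
	for (size_t i = 0; i < len; i++)
		if (!touched[i])
			return false;
	return true;
}
```

With the getentropy(buf, l) bug, a check like fills_whole_buffer(257) would fail, because nothing past the first chunk is ever written. Sizes just below, at, and above the chunk limit are the interesting ones to cover.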

I've attached an updated patch for the above, except I don't know enough
autoconf/m4 to make it include <unistd.h> and optionally <sys/random.h>
in the check there.

- ilmari

Attachments:

v3-0001-Support-getentropy-as-source-of-pg_strong_random-.patch (text/x-diff, +88 -6)
#20Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#18)
Re: Support getrandom() for pg_strong_random() source

On 30 Jul 2025, at 13:10, Peter Eisentraut <peter@eisentraut.org> wrote:

On 30.07.25 08:59, Masahiko Sawada wrote:

I've updated the patch to support getentropy() instead of getrandom().

The point still stands that the number of installations without OpenSSL support is approximately zero, so what is the purpose of this patch if approximately no one will be able to use it?

The main use case I've heard discussed (mostly in hallway tracks IIRC) is to
allow multiple PRNGs so that codepaths which favor performance over
cryptographic properties can choose. This would not be that, but a small step
on that path (whether or not that's the appropriate step is debatable).

For installations without OpenSSL, getrandom() as an API over /dev/urandom
still works when /dev is chrooted away. That subset might be too small to
spend code on though.

--
Daniel Gustafsson
