Fix rounding method used to compute huge pages

Started by Anthonin Bonnefoy · 1 day ago · 8 messages
#1 Anthonin Bonnefoy
anthonin.bonnefoy@datadoghq.com
2 attachment(s)

Hi,

When computing the dynamic value of shared_memory_size_in_huge_pages,
(1+size_b/hp_size) is currently used. This works when size_b is not
divisible by hp_size. However, it will yield an additional huge page
when size_b is divisible by hp_size.

On CreateAnonymousSegment's side, the allocation size is rounded up to
the next huge page boundary when necessary. However, there's no
overflow check when doing this round-up.

0001: This patch replicates CreateAnonymousSegment's rounding method
to InitializeShmemGUCs, only rounding up when the value is not
divisible by hp_size.

0002: This patch uses add_size in CreateAnonymousSegment when the
allocation size is rounded up, to check for possible overflow.

Regards,
Anthonin Bonnefoy

Attachments:

v1-0001-Fix-rounding-method-used-to-compute-shared_memory.patch (application/octet-stream)
From 43f795ef933d1661e15f4d758e05167e10dc3887 Mon Sep 17 00:00:00 2001
From: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Date: Thu, 22 Jan 2026 15:16:21 +0100
Subject: Fix rounding method used to compute shared_memory_size_in_huge_pages

When computing the dynamic value of shared_memory_size_in_huge_pages,
(1+size_b/hp_size) is currently used. This works when size_b is not
divisible by hp_size. However, it will yield an additional huge page
when size_b is divisible by hp_size.

This patch replicates the same rounding method used for the mmap
allocation in CreateAnonymousSegment, only rounding up when the value
is not divisible by hp_size.
---
 src/backend/storage/ipc/ipci.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 85c67b2c183..268040dbec5 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -363,7 +363,9 @@ InitializeShmemGUCs(void)
 	{
 		Size		hp_required;
 
-		hp_required = add_size(size_b / hp_size, 1);
+		if (size_b % hp_size != 0)
+			size_b = add_size(size_b, hp_size - (size_b % hp_size));
+		hp_required = size_b / hp_size;
 		sprintf(buf, "%zu", hp_required);
 		SetConfigOption("shared_memory_size_in_huge_pages", buf,
 						PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
-- 
2.52.0

v1-0002-Check-for-overflow-when-rounding-up-allocsize.patch (application/octet-stream)
From 7feed212172266444a2f3e09d183a005a673a8bd Mon Sep 17 00:00:00 2001
From: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Date: Thu, 22 Jan 2026 15:50:29 +0100
Subject: Check for overflow when rounding up allocsize

The amount of shared memory to mmap is calculated using
CalculateShmemSize, which takes care of checking for possible overflow
using add_size.

When using huge pages, the final allocation size is rounded up to the
next required huge pages when necessary. However, there's no overflow
check when doing this round up, leaving the possibility for an
overflow.

This patch uses add_size when rounding up to check for possible
overflow.
---
 src/backend/port/sysv_shmem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index de491897118..5239b6acbbc 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -34,6 +34,7 @@
 #include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "storage/shmem.h"
 #include "utils/guc.h"
 #include "utils/guc_hooks.h"
 #include "utils/pidfile.h"
@@ -617,7 +618,7 @@ CreateAnonymousSegment(Size *size)
 		GetHugePageSize(&hugepagesize, &mmap_flags);
 
 		if (allocsize % hugepagesize != 0)
-			allocsize += hugepagesize - (allocsize % hugepagesize);
+			allocsize = add_size(allocsize, hugepagesize - (allocsize % hugepagesize));
 
 		ptr = mmap(NULL, allocsize, PROT_READ | PROT_WRITE,
 				   PG_MMAP_FLAGS | mmap_flags, -1, 0);
-- 
2.52.0

#2 Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Anthonin Bonnefoy (#1)
Re: Fix rounding method used to compute huge pages

On Thu, Jan 22, 2026 at 10:13 PM Anthonin Bonnefoy
<anthonin.bonnefoy@datadoghq.com> wrote:

> Hi,
>
> When computing the dynamic value of shared_memory_size_in_huge_pages,
> (1+size_b/hp_size) is currently used. This works when size_b is not
> divisible by hp_size. However, it will yield an additional huge page
> when size_b is divisible by hp_size.
>
> On CreateAnonymousSegment's side, the allocation size is rounded up to
> the next huge page boundary when necessary. However, there's no
> overflow check when doing this round-up.
>
> 0001: This patch replicates CreateAnonymousSegment's rounding method
> to InitializeShmemGUCs, only rounding up when the value is not
> divisible by hp_size.
>
> 0002: This patch uses add_size in CreateAnonymousSegment when the
> allocation size is rounded up, to check for possible overflow.

We have a similar incantation in CalculateShmemSize():
size = add_size(size, 8192 - (size % 8192));

I think we should just introduce a function ceil_size(), place it
near add_size() and mul_size(), and use it wherever we round sizes:

Size
ceil_size(Size size, Size base)
{
	return add_size(size, base - (size % base));
}

InitializeShmemGUCs() also has the following line, which could be
replaced with ceil_size(size_b, 1024 * 1024) / (1024 * 1024):

size_mb = add_size(size_b, (1024 * 1024) - 1) / (1024 * 1024);

--
Best Wishes,
Ashutosh Bapat

#3 Nathan Bossart
nathandbossart@gmail.com
In reply to: Anthonin Bonnefoy (#1)
Re: Fix rounding method used to compute huge pages

On Thu, Jan 22, 2026 at 05:42:44PM +0100, Anthonin Bonnefoy wrote:

> When computing the dynamic value of shared_memory_size_in_huge_pages,
> (1+size_b/hp_size) is currently used. This works when size_b is not
> divisible by hp_size. However, it will yield an additional huge page
> when size_b is divisible by hp_size.

Oops, it looks like this is my fault. I doubt this causes any practical
problems, but we might as well fix it.

> +		if (size_b % hp_size != 0)
> +			size_b = add_size(size_b, hp_size - (size_b % hp_size));
> +		hp_required = size_b / hp_size;

I think we could simplify this a tad:

hp_required = size_b / hp_size;
if (size_b % hp_size != 0)
	hp_required = add_size(hp_required, 1);

> 0002: This patch uses add_size in CreateAnonymousSegment when the
> allocation size is rounded up, to check for possible overflow.

Seems reasonable.

--
nathan

#4 Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#3)
Re: Fix rounding method used to compute huge pages

On Thu, Jan 22, 2026 at 03:24:13PM -0600, Nathan Bossart wrote:

> Oops, it looks like this is my fault. I doubt this causes any practical
> problems, but we might as well fix it.

And it's something I have committed.

>> +		if (size_b % hp_size != 0)
>> +			size_b = add_size(size_b, hp_size - (size_b % hp_size));
>> +		hp_required = size_b / hp_size;
>
> I think we could simplify this a tad:
>
> hp_required = size_b / hp_size;
> if (size_b % hp_size != 0)
> 	hp_required = add_size(hp_required, 1);

FWIW, we have always been kind of sloppy with slightly overestimating
the shmem size required in the backend code, and here it's just one.
I don't see a strong need for a backpatch here. Of course, no
objections to adjusting that on HEAD. Nathan, are you planning to
take care of that as the original author? This should fall under my
bucket as the original committer, but as you were an author, feel free
to take priority here, of course.
--
Michael

#5 Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#4)
Re: Fix rounding method used to compute huge pages

On Fri, Jan 23, 2026 at 08:04:29AM +0900, Michael Paquier wrote:

> FWIW, we have always been kind of sloppy with slightly overestimating
> the shmem size required in the backend code, and here it's just one.
> I don't see a strong need for a backpatch here. Of course, no
> objections to adjusting that on HEAD.

Agreed.

> Nathan, are you planning to
> take care of that as the original author? This should fall under my
> bucket as the original committer, but as you were an author, feel free
> to take priority here, of course.

I'll take care of it (likely tomorrow).

--
nathan

#6 Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#5)
Re: Fix rounding method used to compute huge pages

On Thu, Jan 22, 2026 at 05:10:08PM -0600, Nathan Bossart wrote:

> I'll take care of it (likely tomorrow).

Thanks!
--
Michael

#7 Anthonin Bonnefoy
anthonin.bonnefoy@datadoghq.com
In reply to: Michael Paquier (#6)
Re: Fix rounding method used to compute huge pages

On Thu, Jan 22, 2026 at 10:24 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:

> Oops, it looks like this is my fault. I doubt this causes any practical
> problems, but we might as well fix it.

Yeah, the chance of this being a problem is pretty low.

>> +		if (size_b % hp_size != 0)
>> +			size_b = add_size(size_b, hp_size - (size_b % hp_size));
>> +		hp_required = size_b / hp_size;
>
> I think we could simplify this a tad:
>
> hp_required = size_b / hp_size;
> if (size_b % hp_size != 0)
> 	hp_required = add_size(hp_required, 1);

From my understanding, 'add_size(hp_required, 1)' will never overflow
since size_b was checked for overflow, and hp_size should always be >1
(except if huge pages of 1 byte exist somewhere).

For consistency with CreateAnonymousSegment, using 'add_size(size_b,
hp_size - (size_b % hp_size))' will also check that the final
requested allocation doesn't overflow.

#8 Nathan Bossart
nathandbossart@gmail.com
In reply to: Anthonin Bonnefoy (#7)
Re: Fix rounding method used to compute huge pages

Committed.

On Fri, Jan 23, 2026 at 09:21:53AM +0100, Anthonin Bonnefoy wrote:

> From my understanding, 'add_size(hp_required, 1)' will never overflow
> since size_b was checked for overflow, and hp_size should always be >1
> (except if huge pages of 1 byte exist somewhere).

That's true, but for this sort of thing, I usually prefer to avoid relying
on those kinds of assumptions to reason about the correctness of the code.
The overflow check costs little, and IIUC this function is run exactly once
for the lifetime of the server.

> For consistency with CreateAnonymousSegment, using 'add_size(size_b,
> hp_size - (size_b % hp_size))' will also check that the final
> requested allocation doesn't overflow.

*shrug* I don't see a strong reason for consistency here. AFAICT you'd
have to be trying to allocate something like 18 exabytes on most systems
for there to be a problem, at which point there are probably bigger issues
to sort out.

Thanks for the patch!

--
nathan