[BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

Started by Paul Bunn, 6 days ago, 10 messages

#1 Paul Bunn
paul.bunn@icloud.com

Hi hackers,

Sorry for the previously poorly-formatted/threaded email.

We've identified a bug in the DSA (Dynamic Shared Memory Area) allocator
that causes memory corruption and crashes during parallel hash joins with
large data sets. The bug has been present since the DSA implementation
was introduced and affects all supported branches.

Attached is a minimal fix (5 lines added, 0 changed).

== Bug Summary ==

In make_new_segment() (src/backend/utils/mmgr/dsa.c), there are two paths
for computing segment layout:

Path 1 (geometric): knows total_size upfront, computes
    total_pages = total_size / FPM_PAGE_SIZE
    pagemap entries = total_pages          <-- correct

Path 2 (odd-sized, when requested > geometric): works forward from
    usable_pages = requested_pages
    pagemap entries = usable_pages         <-- BUG

The pagemap is indexed by absolute page number. The FreePageManager hands
out pages with indices from metadata_pages to (metadata_pages +
usable_pages - 1). Since metadata_pages >= 1, page indices at the high
end of the range exceed usable_pages, making the pagemap accesses
out-of-bounds.

== How It Was Found on Postgres 15 ==

Multiple parallel worker backends crashed simultaneously with SIGSEGV in
dsa_get_address(), called from dsa_free() in ExecHashTableDetachBatch()
during parallel hash join batch cleanup.

Stack trace:

    #0 dsa_get_address (area, dp) at dsa.c:955
    #1 dsa_free (area, dp) at dsa.c:839
    #2 ExecHashTableDetachBatch (hashtable) at nodeHash.c:3189
    #3 ExecParallelHashJoinNewBatch (hjstate) at nodeHashjoin.c:1157

All crashing workers computed the same pageno (196993), which was the last
valid FPM page but beyond the pagemap array (usable_pages = 196609).

== Root Cause (from core dump analysis) ==

The crashing segment had:

    usable_pages   = 196,609  (pagemap has this many entries)
    metadata_pages = 385
    total_pages    = 196,994  (= metadata_pages + usable_pages)
    last FPM page  = 196,993  (= metadata_pages + usable_pages - 1)

    pagemap valid indices: 0 .. 196,608
    FPM page indices:      385 .. 196,993

Pages 196,609 through 196,993 (385 pages) have no valid pagemap entry.

The pagemap array ends 3,072 bytes before the data area starts (padding
zone). Most out-of-bounds entries fall in this padding and cause silent
corruption. The last 17 entries fall in the data area itself, causing
bidirectional corruption: pagemap writes destroy allocated object data,
and subsequent pagemap reads return garbage, crashing dsa_free().

== The Fix ==

After computing metadata_bytes with usable_pages pagemap entries, add
entries for the metadata pages themselves:

    metadata_bytes +=
        ((metadata_bytes / (FPM_PAGE_SIZE - sizeof(dsa_pointer))) + 1) *
        sizeof(dsa_pointer);

The divisor (FPM_PAGE_SIZE - sizeof(dsa_pointer)) = 4088 accounts for
the circular dependency: each metadata page costs one pagemap entry
(8 bytes), so only 4088 bytes of each 4096-byte metadata page are
net-free for other pagemap entries. The +1 absorbs the ceiling.

== Repro ==

It's tricky to create a test, or standalone SQL, that causes the bug to
manifest. This is a long-dormant bug that only causes a crash in rare
circumstances.

--

Thanks,

Paul

Attachments:

0001-Fix-DSA-pagemap-undersizing-in-make_new_segment.patch (application/octet-stream, +5/-1)
#2 Paul Bunn
paul.bunn@icloud.com
In reply to: Paul Bunn (#1)
RE: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

Attached is a new patch to demonstrate the bug in make_new_segment.

It's not meant as a permanent new test (it's using some hard-coded
assumptions about the x64 platform) -- just useful in validating the bug and
fix.

From: paul.bunn@icloud.com <paul.bunn@icloud.com>
Sent: Wednesday, March 4, 2026 12:01 AM
To: pgsql-hackers@postgresql.org
Subject: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

[...]

Attachments:

test_dsa_pagemap_overflow.patch (application/octet-stream, +85/-1)
#3 Chao Li
li.evan.chao@gmail.com
In reply to: Paul Bunn (#2)
Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

On Mar 5, 2026, at 07:33, paul.bunn@icloud.com wrote:

> Attached is a new patch to demonstrate the bug in make_new_segment.
> It's not meant as a permanent new test (it's using some hard-coded
> assumptions about the x64 platform) -- just useful in validating the bug
> and fix.
>
> [...]
>
> The pagemap array ends 3,072 bytes before the data area starts (padding
> zone). Most out-of-bounds entries fall in this padding and cause silent
> corruption. The last 17 entries fall in the data area itself, causing
> bidirectional corruption: pagemap writes destroy allocated object data,
> and subsequent pagemap reads return garbage, crashing dsa_free().

While reviewing this patch, I took the opportunity to read through the DSA code again.

I think the reason the bug is hard to trigger is that, even when an out-of-bounds pagemap access happens, in most cases it lands in the padding area, which happens to be safe. Only when the padding area is not big enough to hold metadata_bytes / FPM_PAGE_SIZE dsa_pointer entries do we get a segmentation fault.

> == The Fix ==
>
> After computing metadata_bytes with usable_pages pagemap entries, add
> entries for the metadata pages themselves:
>
>     metadata_bytes +=
>         ((metadata_bytes / (FPM_PAGE_SIZE - sizeof(dsa_pointer))) + 1) *
>         sizeof(dsa_pointer);
>
> The divisor (FPM_PAGE_SIZE - sizeof(dsa_pointer)) = 4088 accounts for
> the circular dependency: each metadata page costs one pagemap entry
> (8 bytes), so only 4088 of each 4096-byte metadata page is net-free for
> other pagemap entries. The +1 absorbs the ceiling.

I think this fix is very clever. But the problem I see is that it adds extra bytes to metadata_bytes and then computes the padding size, while I believe the padding area is itself safe to hold pagemap entries. So if the padding area is big enough, we don't need to add extra bytes to metadata_bytes at all. Based on that, I made a fix:
```
diff --git a/src/backend/utils/mmgr/dsa.c b/src/backend/utils/mmgr/dsa.c
index ce9ede4c196..b7a089bc288 100644
--- a/src/backend/utils/mmgr/dsa.c
+++ b/src/backend/utils/mmgr/dsa.c
@@ -2131,6 +2131,7 @@ static dsa_segment_map *
 make_new_segment(dsa_area *area, size_t requested_pages)
 {
        dsa_segment_index new_index;
+       size_t          metadata_fixed_bytes;
        size_t          metadata_bytes;
        size_t          total_size;
        size_t          total_pages;
@@ -2180,9 +2181,11 @@ make_new_segment(dsa_area *area, size_t requested_pages)
                                         area->control->total_segment_size);
        total_pages = total_size / FPM_PAGE_SIZE;
-       metadata_bytes =
+       metadata_fixed_bytes =
                MAXALIGN(sizeof(dsa_segment_header)) +
-               MAXALIGN(sizeof(FreePageManager)) +
+               MAXALIGN(sizeof(FreePageManager));
+       metadata_bytes =
+               metadata_fixed_bytes +
                sizeof(dsa_pointer) * total_pages;
        /* Add padding up to next page boundary. */
@@ -2196,19 +2199,36 @@ make_new_segment(dsa_area *area, size_t requested_pages)
        /* See if that is enough... */
        if (requested_pages > usable_pages)
        {
+               size_t pagemap_entries;
+
                /*
                 * We'll make an odd-sized segment, working forward from the requested
                 * number of pages.
                 */
                usable_pages = requested_pages;
-               metadata_bytes =
-                       MAXALIGN(sizeof(dsa_segment_header)) +
-                       MAXALIGN(sizeof(FreePageManager)) +
-                       usable_pages * sizeof(dsa_pointer);
-
-               /* Add padding up to next page boundary. */
-               if (metadata_bytes % FPM_PAGE_SIZE != 0)
-                       metadata_bytes += FPM_PAGE_SIZE - (metadata_bytes % FPM_PAGE_SIZE);
+               pagemap_entries = usable_pages;
+
+               for (;;)
+               {
+                       size_t actual_pages;
+                       size_t metadata_pages;
+
+                       metadata_bytes =
+                               metadata_fixed_bytes +
+                               pagemap_entries * sizeof(dsa_pointer);
+
+                       /* Add padding up to next page boundary. */
+                       if (metadata_bytes % FPM_PAGE_SIZE != 0)
+                               metadata_bytes += FPM_PAGE_SIZE - (metadata_bytes % FPM_PAGE_SIZE);
+
+                       metadata_pages = metadata_bytes / FPM_PAGE_SIZE;
+                       actual_pages = metadata_pages + usable_pages;
+                       if (pagemap_entries >= actual_pages)
+                               break;
+
+                       pagemap_entries = actual_pages;
+               }
+
                total_size = metadata_bytes + usable_pages * FPM_PAGE_SIZE;

/* Is that too large for dsa_pointer's addressing scheme? */
```

I tested my version of the fix with the test script you attached in your next email, and the test passed. I also feel my version is easier to understand.

But I haven't thought more deeply about whether your version really wastes the padding area; if it doesn't, then I agree your version is shorter and neater. My only concern is that it's a bit hard to understand, but that wouldn't be a problem IMO if a proper comment were added.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#4 Sami Imseih
samimseih@gmail.com
In reply to: Paul Bunn (#1)
Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

Hi,

> The pagemap array ends 3,072 bytes before the data area starts (padding
> zone). Most out-of-bounds entries fall in this padding and cause silent
> corruption. The last 17 entries fall in the data area itself, causing
> bidirectional corruption: pagemap writes destroy allocated object data,

AFAICT, the pagemap array tracks all pages in the segment, including the
metadata pages. The bug is that the pagemap entries for the metadata
pages were not allocated when we go into the odd-sized allocation path.

It is indeed difficult to get a reproducer for this, but not allocating
pagemap entries for the metadata pages is clearly wrong.

So, we should first add an Assert that determines we have allocated
enough pagemap entries for all the pages. We can do this by computing the number
of pagemap entries that fit in the metadata region and verifying it is at
least as large as the total number of pages in the segment.

--- a/src/backend/utils/mmgr/dsa.c
+++ b/src/backend/utils/mmgr/dsa.c
@@ -2196,6 +2196,8 @@ make_new_segment(dsa_area *area, size_t requested_pages)
        /* See if that is enough... */
        if (requested_pages > usable_pages)
        {
+               size_t          total_pages;
+
                /*
                 * We'll make an odd-sized segment, working forward from the requested
                 * number of pages.
@@ -2210,6 +2212,11 @@ make_new_segment(dsa_area *area, size_t requested_pages)
                if (metadata_bytes % FPM_PAGE_SIZE != 0)
                        metadata_bytes += FPM_PAGE_SIZE - (metadata_bytes % FPM_PAGE_SIZE);
                total_size = metadata_bytes + usable_pages * FPM_PAGE_SIZE;
+               total_pages = total_size / FPM_PAGE_SIZE;
+
+               /* Verify we allocated enough pagemap entries for all pages */
+               Assert((metadata_bytes - MAXALIGN(sizeof(dsa_segment_header)) -
+                               MAXALIGN(sizeof(FreePageManager))) /
+                          sizeof(dsa_pointer) >= total_pages);

With the above assert in place, the usable_pages of 879 as you have in
test_dsa.c crashes due to the assertion failure.

As far as the fix goes, I agree with what you have in 0001, except I am
not so sure about the "+1 for rounding". Can we just do ceiling division?

+               /*
+                * We must also account for pagemap entries needed to cover the
+                * metadata pages themselves.  The pagemap must track all pages
+                * in the segment, including the pages occupied by metadata.
+                */
+               metadata_bytes +=
+                       ((metadata_bytes + (FPM_PAGE_SIZE - sizeof(dsa_pointer)) - 1) /
+                        (FPM_PAGE_SIZE - sizeof(dsa_pointer))) *
+                       sizeof(dsa_pointer);

I don't think we should add a test_dsa, but I do think it was useful
to verify the issue.

This sounds like it should be backpatched, but we'll see what a
committer thinks.

See attached.

What do you think?

--
Sami Imseih
Amazon Web Services

Attachments:

v2-0001-Fix-DSA-pagemap-undersizing-in-make_new_segment.patch (application/octet-stream, +20/-1)
#5 Michael Paquier
michael@paquier.xyz
In reply to: Sami Imseih (#4)
Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

On Thu, Mar 05, 2026 at 04:25:57PM -0600, Sami Imseih wrote:

> With the above assert in place, the usable_pages of 879 as you have
> in test_dsa.c crashes due to the assertion failure.

Yep, the assertion looks useful to have in place.

> As far as the fix, I do agree with what you have in 0001-, except I am
> not so sure about the "+1 for rounding".
> Can we just do ceiling division?

Ceiling division is more precise at the end and can be checked; I'd tend
to stick with your approach.

> +               /*
> +                * We must also account for pagemap entries needed to cover the
> +                * metadata pages themselves.  The pagemap must track all pages
> +                * in the segment, including the pages occupied by metadata.
> +                */
> +               metadata_bytes +=
> +                       ((metadata_bytes + (FPM_PAGE_SIZE - sizeof(dsa_pointer)) - 1) /
> +                        (FPM_PAGE_SIZE - sizeof(dsa_pointer))) *
> +                       sizeof(dsa_pointer);

> I don't think we should add a test_dsa, but I do think it was useful
> to verify the issue.

I think that a test would actually be nice to have in test_dsa, but the
proposed one lacks flexibility. How about changing it to a function that
takes as input the values you'd want to test for pagemap_start,
usable_pages and the offset? The point is to check a dsa_allocate() with
the sanity of an offset, hence a simple test_dsa_allocate() as the
function name? The test module exists down to v17, which should be
enough to check things across platforms moving ahead.

> This sounds like it should be backpatched, but we'll see what a
> committer thinks.

This should be backpatched. If you'd told me that the allocation was
oversized, then an optimization might have been possible on HEAD,
especially if the overestimation was large; a small reduction in size may
not even be worth worrying about in some cases, as we tend to oversize
some shmem areas as well. An undersized calculation is a very different
story: memory corruption is not cool at all, especially on stable
branches, and it leads to unpredictable behavior.
--
Michael

#6 Chao Li
li.evan.chao@gmail.com
In reply to: Michael Paquier (#5)
Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

On Mar 6, 2026, at 06:49, Michael Paquier <michael@paquier.xyz> wrote:

> On Thu, Mar 05, 2026 at 04:25:57PM -0600, Sami Imseih wrote:
>
>> With the above assert in place, the usable_pages of 879 as you have
>> in test_dsa.c crashes due to the assertion failure.
>
> Yep, the assertion looks useful to have in place.

While debugging the code, I used the same Assert, which helped quickly trigger the bug. Without the Assert, even if an OOB access happens, it might not immediately crash.

But now that the bug has been identified and fixed, and the function is self-contained, I'm not sure how much the Assert would still help. Unless we are unsure whether the fix really works, I don't think the Assert is necessary.

However, this is not a strong objection. If we do want to add the Assert, I notice that the expression MAXALIGN(sizeof(dsa_segment_header)) + MAXALIGN(sizeof(FreePageManager)) appears multiple times, and the Assert uses it as well. We could add a local variable like:
```
size_t metadata_fixed_bytes =
MAXALIGN(sizeof(dsa_segment_header)) +
MAXALIGN(sizeof(FreePageManager));
```

That would help reduce code redundancy.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#7 Sami Imseih
samimseih@gmail.com
In reply to: Michael Paquier (#5)
Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

> I think that a test would be actually nice to have in test_dsa, but the
> proposed one lacks flexibility. How about changing it to a function
> that takes in input the values you'd want to test for pagemap_start,
> usable_pages and the offset? The point is to check a dsa_allocate()
> with the sanity of an offset, hence a simple test_dsa_allocate() as
> function name? The test module exists down to v17, which should be
> enough to check things across platforms moving ahead.

How about we loop through a wider range of allocations and rely on the
assert to fail if we hit an undersized pagemap?

Test allocations up to 10k pages, skipping 100 pages at a time:

```
SELECT test_dsa_allocate(10000, 100);
```

I think this gives us broader coverage for dsa_allocate().

See v3.

> This should be backpatched. [...]

I agree.

--
Sami Imseih
Amazon Web Services (AWS)

Attachments:

v3-0001-Fix-DSA-pagemap-undersizing-in-make_new_segment.patch (application/octet-stream, +71/-1)
#8 Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#7)
Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

> See v3.

ugh, the .out file was not updated. v4 fixes that.

Paul, can you create a commitfest entry [1] for this as a bug fix? This
will need to go on all stable versions.

[1] https://commitfest.postgresql.org/59/

--
Sami Imseih
Amazon Web Services (AWS)

Attachments:

v4-0001-Fix-DSA-pagemap-undersizing-in-make_new_segment.patch (application/octet-stream, +75/-1)
#9 Paul Bunn
paul.bunn@icloud.com
In reply to: Sami Imseih (#8)
Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

Thanks.
I'll be able to create the commitfest entry by Monday. Appreciate the
effort and the quick responses.

On Mar 6, 2026, at 5:29 AM, Sami Imseih <samimseih@gmail.com> wrote:

>> See v3.
>
> ugh, the .out file was not updated. v4 fixes that.
>
> Paul, can you create a commitfest entry [1] for this as a bug fix? This
> will need to go on all stable versions.
>
> [1] https://commitfest.postgresql.org/59/
>
> --
> Sami Imseih
> Amazon Web Services (AWS)
> <v4-0001-Fix-DSA-pagemap-undersizing-in-make_new_segment.patch>

#10 Michael Paquier
michael@paquier.xyz
In reply to: Paul Bunn (#9)
Re: [BUG + PATCH] DSA pagemap out-of-bounds in make_new_segment odd-sized path

On Fri, Mar 06, 2026 at 08:22:27AM -0800, Paul Bunn wrote:

> ugh, the .out file was not updated. v4 fixes that.

I have looked at that, and found that the assumptions behind the formulas
could use better documentation; some of that information was only added
to the commit log of the first versions of the patch. Having it in code
comments felt more appropriate.

The test case was taking a long time to run, up to a couple of seconds,
and it was much slower in the buildfarm. The allocation time stacks up
quickly once we aim higher, and the odd-sized case can be exercised with
1901 and 6501 for the number of pages. Hence I have slightly reworked the
function, tweaking it with a "start" and an "end" location on top of the
custom "step".

A last issue was with total_pages, which shadowed a declaration with the
same name in make_new_segment(). This can be switched to a
PG_USED_FOR_ASSERTS_ONLY variable with a different name.

And backpatched down to v14, with the test down to v17. The test required
a slight tweak in v18 and v17.
--
Michael