BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

Started by sunil s · 3 months ago · 7 messages
#1 sunil s
sunilfeb26@gmail.com
1 attachment(s)

Previously, *heapBlk* was defined as an unsigned 32-bit integer. When
incremented by *pagesPerRange* on very large tables, it could wrap around,
causing the condition *heapBlk < nblocks* to remain true indefinitely,
resulting in an *infinite loop*.

This could cause the PostgreSQL backend to hang, consuming 100% CPU
indefinitely and preventing operations from completing on large tables.
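The wraparound can be reproduced in isolation with plain C. This is a minimal
sketch with a stand-alone `BlockNumber` typedef; `increment_wraps` and
`increment_wraps_uint64` are hypothetical helpers for illustration, not
PostgreSQL code:

```c
#include <stdint.h>

typedef uint32_t BlockNumber;	/* PostgreSQL's BlockNumber is 32 bits wide */

/*
 * Returns 1 when adding pagesPerRange to heapBlk wraps modulo 2^32 and
 * lands back below nblocks, i.e. when "heapBlk < nblocks" would stay true
 * forever in the buggy loop.
 */
static int
increment_wraps(BlockNumber heapBlk, BlockNumber pagesPerRange,
				BlockNumber nblocks)
{
	BlockNumber next = heapBlk + pagesPerRange;	/* unsigned wraparound */

	return next < heapBlk && next < nblocks;
}

/* The same check with a 64-bit counter: the sum can no longer wrap. */
static int
increment_wraps_uint64(uint64_t heapBlk, uint64_t pagesPerRange,
					   BlockNumber nblocks)
{
	uint64_t	next = heapBlk + pagesPerRange;

	return next < heapBlk && next < nblocks;
}
```

For example, with `heapBlk = 4294967280` (close to 2^32) and
`pagesPerRange = 128`, the 32-bit sum wraps to 112, which is again below
`nblocks`, so the loop never terminates; the 64-bit sum does not wrap.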

The solution is straightforward: the data type of `heapBlk` has been changed
from the 32-bit `BlockNumber` type to a 64-bit integer, ensuring it can
safely handle extremely large tables without risk of overflow.

This was explained very nicely by Tomas Vondra[1], and the two solutions
below were suggested:

i) Change heapBlk to int64
ii) Track the previous heap block (prevHeapBlk)

Of these two, I feel solution #1 is more feasible (similar to the approach
previously used in commit 4bc6fb57f774ea18187fd8565aad9994160bfc17[2]),
though the other solution also works.

I’ve attached a patch with the changes for *solution #1*.
Kindly review it and share your feedback or suggestions; your input would
be greatly appreciated.

Reference:

[1]: https://www.postgresql.org/message-id/b8a4e04c-c091-056c-a379-11d35c7b2d8d%40enterprisedb.com

[2]: https://github.com/postgres/postgres/commit/4bc6fb57f774ea18187fd8565aad9994160bfc17

Thanks & Regards,
Sunil Seetharama
Broadcom Inc

Attachments:

0001-BRIN-Prevent-the-heapblk-overflow-during-index-summa.patch (application/octet-stream)
From f0a59c18ecb068bbf417c9a8c8cbbcc37460dd3f Mon Sep 17 00:00:00 2001
From: Sunil Seetharama <sunil.s@broadcom.com>
Date: Fri, 17 Oct 2025 10:51:42 +0530
Subject: [PATCH] BRIN: Prevent the heapblk overflow during index summarization
 on very large tables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, `heapBlk` was defined as an unsigned 32-bit integer. When incremented
by `pagesPerRange` on very large tables, it could wrap around, causing the condition
`heapBlk < nblocks` to remain true indefinitely — resulting in an infinite loop.

This could cause the PostgreSQL backend to hang, consuming 100% CPU indefinitely
and preventing operations from completing on large tables.

The solution is straightforward — the data type of `heapBlk` has been changed
from a 32-bit integer to a 64-bit `BlockNumber` (int64), ensuring it can safely
handle extremely large tables without risk of overflow.

This was explained very nicely by Tomas Vondra[1] and below two solutions were
suggested.
	i)  Change to int64
	ii) Tracking the prevHeapBlk

Among these two I feel using solution #1 would be more feasible(similar to previously used solution #2), though
other solution also works.

Reference:
[1] https://www.postgresql.org/message-id/b8a4e04c-c091-056c-a379-11d35c7b2d8d%40enterprisedb.com
[2] https://github.com/postgres/postgres/commit/4bc6fb57f774ea18187fd8565aad9994160bfc17
---
 src/backend/access/brin/brin.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 7ff7467e462..6a273e9fc39 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -573,7 +573,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
 	Relation	heapRel;
 	BrinOpaque *opaque;
 	BlockNumber nblocks;
-	BlockNumber heapBlk;
+	int64 heapBlk;
 	int64		totalpages = 0;
 	FmgrInfo   *consistentFn;
 	MemoryContext oldcxt;
@@ -749,7 +749,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
 
 		MemoryContextReset(perRangeCxt);
 
-		tup = brinGetTupleForHeapBlock(opaque->bo_rmAccess, heapBlk, &buf,
+		tup = brinGetTupleForHeapBlock(opaque->bo_rmAccess, (BlockNumber )heapBlk, &buf,
 									   &off, &size, BUFFER_LOCK_SHARE);
 		if (tup)
 		{
-- 
2.50.1

#2 David Rowley
dgrowleyml@gmail.com
In reply to: sunil s (#1)
Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

On Sun, 19 Oct 2025 at 19:03, sunil s <sunilfeb26@gmail.com> wrote:

Previously, heapBlk was defined as an unsigned 32-bit integer. When incremented
by pagesPerRange on very large tables, it could wrap around, causing the condition
heapBlk < nblocks to remain true indefinitely — resulting in an infinite loop.

This was explained very nicely by Tomas Vondra[1] and below two solutions were

suggested.
i) Change to int64
ii) Tracking the prevHeapBlk

This is similar to what table_block_parallelscan_nextpage() has to do
to avoid wrapping around when working with tables containing near 2^32
pages. I'd suggest using uint64 rather than int64 and also adding a
comment to mention why that type is being used rather than
BlockNumber. Something like: "We make use of uint64 for heapBlk as a
BlockNumber could wrap for tables with close to 2^32 pages."

You could move the declaration of heapBlk into the for loop line so
that the comment about using uint64 is closer to the declaration.

I suspect you will also want to switch to using uint64 for the
"pageno" variable at the end of the loop. My compiler isn't warning
about the "pageno = heapBlk;", but maybe other compilers will.
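The suggested shape can be sketched as follows. This is a hypothetical,
simplified stand-in for the real revmap-scanning loop (`count_ranges` is
not a PostgreSQL function; it only counts the ranges the loop would visit),
showing the uint64 declaration moved into the for statement with the
comment beside it:

```c
#include <stdint.h>

typedef uint32_t BlockNumber;	/* matches PostgreSQL's 32-bit definition */

/*
 * Hypothetical sketch of the suggested loop shape: returns the number of
 * page ranges visited for a table of nblocks pages.
 */
static uint64_t
count_ranges(BlockNumber nblocks, BlockNumber pagesPerRange)
{
	uint64_t	ranges = 0;

	/*
	 * We make use of uint64 for heapBlk as a BlockNumber could wrap for
	 * tables with close to 2^32 pages.
	 */
	for (uint64_t heapBlk = 0; heapBlk < nblocks; heapBlk += pagesPerRange)
	{
		/*
		 * In the real code, the "pageno" variable in the tbm_add_page()
		 * loop would also be uint64, avoiding a cast back to BlockNumber.
		 */
		ranges++;
	}
	return ranges;
}
```

With `nblocks = 4294967295` and `pagesPerRange = 2^27`, the 64-bit heapBlk
reaches 2^32 after 32 ranges and the loop ends; a 32-bit heapBlk would have
wrapped to 0 and never terminated.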

Not for this patch, but I wonder why we switch memory contexts so
often in that final tbm_add_page() loop. I think it would be better to
switch once before the loop and back again after it. What's there
seems a bit wasteful for any pagesPerRange > 1.

David

#3 sunil s
sunilfeb26@gmail.com
In reply to: David Rowley (#2)
1 attachment(s)
Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

Thanks, David, for providing feedback on the changes.
I’ve addressed your comments and attached a rebased patch.

Thanks & Regards,
Sunil S

Attachments:

v2-0001-BRIN-Prevent-integer-overflow-during-index-summar.patch (application/octet-stream)
From 723c16770af3a183b5df8cdea9f4b9432ddc4ea3 Mon Sep 17 00:00:00 2001
From: Sunil Seetharama <sunilfeb26@gmail.com>
Date: Fri, 17 Oct 2025 10:51:42 +0530
Subject: [PATCH v2] BRIN: Prevent integer overflow during index summarization on very large tables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, `heapBlk` was defined as an unsigned 32-bit integer. When incremented
by `pagesPerRange` on very large tables, it could wrap around, causing the condition
`heapBlk < nblocks` to remain true indefinitely — resulting in an infinite loop.

This could cause the PostgreSQL backend to hang, consuming 100% CPU indefinitely
and preventing operations from completing on large tables.

The solution is straightforward — the data type of `heapBlk` has been changed
from a 32-bit integer to a 64-bit `BlockNumber` (int64), ensuring it can safely
handle extremely large tables without risk of overflow.

This was explained very nicely by Tomas Vondra[1] and below two solutions were
suggested.
	i)  Change to int64
	ii) Tracking the prevHeapBlk

Among these two I feel using solution #1 would be more feasible (similar to previously used solution #2), though
other solution also works. This is also similar to the logic used in table_block_parallelscan_nextpage().

Reference:
[1] https://www.postgresql.org/message-id/b8a4e04c-c091-056c-a379-11d35c7b2d8d%40enterprisedb.com
[2] https://github.com/postgres/postgres/commit/4bc6fb57f774ea18187fd8565aad9994160bfc17
---
 src/backend/access/brin/brin.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 7ff7467e462..d3fef042f04 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -573,7 +573,6 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
 	Relation	heapRel;
 	BrinOpaque *opaque;
 	BlockNumber nblocks;
-	BlockNumber heapBlk;
 	int64		totalpages = 0;
 	FmgrInfo   *consistentFn;
 	MemoryContext oldcxt;
@@ -736,8 +735,15 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
 	 * Now scan the revmap.  We start by querying for heap page 0,
 	 * incrementing by the number of pages per range; this gives us a full
 	 * view of the table.
+	 *
+	 * Since heapBlk is incremented by opaque->bo_pagesPerRange, it can exceed
+	 * the maximum 32-bit limit (2^32) on very large tables, potentially causing
+	 * the loop to become infinite.
+	 *
+	 * To prevent this overflow, the counter must use a 64-bit type, ensuring it
+	 * can handle cases where nblocks approaches 2^32.
 	 */
-	for (heapBlk = 0; heapBlk < nblocks; heapBlk += opaque->bo_pagesPerRange)
+	for (uint64 heapBlk = 0; heapBlk < nblocks; heapBlk += opaque->bo_pagesPerRange)
 	{
 		bool		addrange;
 		bool		gottuple = false;
@@ -749,7 +755,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
 
 		MemoryContextReset(perRangeCxt);
 
-		tup = brinGetTupleForHeapBlock(opaque->bo_rmAccess, heapBlk, &buf,
+		tup = brinGetTupleForHeapBlock(opaque->bo_rmAccess, (BlockNumber )heapBlk, &buf,
 									   &off, &size, BUFFER_LOCK_SHARE);
 		if (tup)
 		{
@@ -926,7 +932,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
 		{
 			BlockNumber pageno;
 
-			for (pageno = heapBlk;
+			for (pageno = (BlockNumber)heapBlk;
 				 pageno <= Min(nblocks, heapBlk + opaque->bo_pagesPerRange) - 1;
 				 pageno++)
 			{
-- 
2.50.1

#4 Michael Paquier
michael@paquier.xyz
In reply to: sunil s (#3)
Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

On Mon, Oct 20, 2025 at 03:11:01PM +0530, sunil s wrote:

Thanks, David, for providing feedback on the changes.
I’ve addressed your comments and attached a rebased patch.

Using signed or unsigned is not going to matter much at the end. We
would be far from the count even if the number is signed.

+	 * Since heapBlk is incremented by opaque->bo_pagesPerRange, it can exceed
+	 * the maximum 32-bit limit (2^32) on very large tables, potentially causing
+	 * the loop to become infinite.
+	 *
+	 * To prevent this overflow, the counter must use a 64-bit type, ensuring it
+	 * can handle cases where nblocks approaches 2^32.

2^32 is mentioned twice. A simpler suggestion:
Since heapBlk is incremented by opaque->bo_pagesPerRange, it could
exceed the maximum 32-bit limit (2^32) on very large tables and
wraparound. The counter must be 64 bits wide for this reason.

Like totalpages, there is an argument about consistency based on the
result type of bringetbitmap(). It's minor, still.

It would be simpler to switch "pageno" to be 64-bit wide as well,
rather than casting it back to BlockNumber.
--
Michael

#5 David Rowley
dgrowleyml@gmail.com
In reply to: Michael Paquier (#4)
Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

On Tue, 21 Oct 2025 at 15:55, Michael Paquier <michael@paquier.xyz> wrote:

Using signed or unsigned is not going to matter much at the end. We
would be far from the count even if the number is signed.

I'd leave it as uint64. There's no reason to mixup the signedness
between these two variables.

+        * Since heapBlk is incremented by opaque->bo_pagesPerRange, it can exceed
+        * the maximum 32-bit limit (2^32) on very large tables, potentially causing
+        * the loop to become infinite.
+        *
+        * To prevent this overflow, the counter must use a 64-bit type, ensuring it
+        * can handle cases where nblocks approaches 2^32.

2^32 is mentioned twice. A simpler suggestion:
Since heapBlk is incremented by opaque->bo_pagesPerRange, it could
exceed the maximum 32-bit limit (2^32) on very large tables and
wraparound. The counter must be 64 bits wide for this reason.

I wasn't a fan of that change either. I suggested "We make use of
uint64 for heapBlk as a BlockNumber could wrap for tables with close
to 2^32 pages.", but that's not what happened.

Like totalpages, there is an argument about consistency based on the
result type of bringetbitmap(). It's minor, still.

I don't think totalpages being int64 is an argument to make the heapBlk int64.

It would be simpler to switch "pageno" to be 64-bit wide as well,
rather than casting it back to BlockNumber.

I suggested that too, but ...

I'm happy to finish this one off. I was leaving it for Tomas to
comment, but I think he'll be busy with pgconf.eu for the next few
days.

David

#6 Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#5)
Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

On Tue, Oct 21, 2025 at 06:32:01PM +1300, David Rowley wrote:

I'd leave it as uint64. There's no reason to mixup the signedness
between these two variables.

That's fine as well.

I'm happy to finish this one off. I was leaving it for Tomas to
comment, but I think he'll be busy with pgconf.eu for the next few
days.

Cool, thanks. I'm leaving that stuff up to any of you, have fun.
FWIW, I am on the same line as you. Your suggestions are better than
what the proposed patch does, as far as I've looked.
--
Michael

#7 David Rowley
dgrowleyml@gmail.com
In reply to: Michael Paquier (#6)
Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

On Tue, 21 Oct 2025 at 18:53, Michael Paquier <michael@paquier.xyz> wrote:

FWIW, I am on the same line as you. Your suggestions are better than
what the proposed patch does, as far as I've looked.

Pushed with the agreed comment change and type change to pageno.

David