[PATCH] Align verify_heapam.c error message offset with test expectations

Started by zengman3 days ago4 messages
#1zengman
zengman@halodbtech.com
2 attachment(s)

Hi all,

I think there might be an issue with the error messages in contrib/amcheck/verify_heapam.c.
The test comment in src/bin/pg_amcheck/t/004_verify_heapam.pl (line 715) clearly states:
```
# Tuple at offset 43 is the successor of this one
```
This indicates that for the test case at offnum == 36, the error message should report "offset 43" (the successor), not "offset 36" (the current tuple).
However, when I updated the test expectation from \d+ wildcard to the exact value offset 43, the test fails.
This makes me wonder whether the current code in verify_heapam.c (lines 777, 793, 799) should be using nextoffnum instead of ctx.offnum.
This would also be consistent with similar error messages at lines 746-747 and 753-754, which use nextoffnum when referring to the produced tuple location.

--
Regards,
Man Zeng
www.openhalo.org

Attachments:

0001-fix-hardcode-offset-43-in-expected-tuple-error-regex.patchapplication/octet-stream; charset=ISO-8859-1; name=0001-fix-hardcode-offset-43-in-expected-tuple-error-regex.patchDownload
From c642a764af508d578f37008e24bd9d32738ac734 Mon Sep 17 00:00:00 2001
From: Man Zeng <zengman@halodbtech.com>
Date: Wed, 21 Jan 2026 22:16:32 +0800
Subject: [PATCH 1/2] fix: hardcode offset 43 in expected tuple error regex

Align expected error message with comment logic - the corrupted successor
tuple should be at offset 43, not arbitrary offset. Update regex to match
offset 43 instead of \d+ for accurate error validation.
---
 src/bin/pg_amcheck/t/004_verify_heapam.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_amcheck/t/004_verify_heapam.pl b/src/bin/pg_amcheck/t/004_verify_heapam.pl
index 95f1f34c90d..f9c76f4b460 100644
--- a/src/bin/pg_amcheck/t/004_verify_heapam.pl
+++ b/src/bin/pg_amcheck/t/004_verify_heapam.pl
@@ -720,7 +720,7 @@ for (my $tupidx = 0; $tupidx < $ROWCOUNT; $tupidx++)
 		$tup->{t_xmax} = $in_progress_xid;
 		$tup->{t_infomask} &= ~HEAP_XMIN_COMMITTED;
 		push @expected,
-		  qr/${header}tuple with aborted xmin \d+ was updated to produce a tuple at offset \d+ with in-progress xmin \d+/;
+		  qr/${header}tuple with aborted xmin \d+ was updated to produce a tuple at offset 43 with in-progress xmin \d+/;
 	}
 	elsif ($offnum == 40)
 	{
-- 
2.45.2

0002-fix-update-offset-variable-in-corruption-report-mess.patchapplication/octet-stream; charset=ISO-8859-1; name=0002-fix-update-offset-variable-in-corruption-report-mess.patchDownload
From 5c2cbdfd907892c3f1e6a4f1a2ae6a13e7d4fa76 Mon Sep 17 00:00:00 2001
From: Man Zeng <zengman@halodbtech.com>
Date: Wed, 21 Jan 2026 22:17:44 +0800
Subject: [PATCH 2/2] fix: update offset variable in corruption report messages

---
 contrib/amcheck/verify_heapam.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index 30c2f583173..fa2451d6f1d 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -774,7 +774,7 @@ verify_heapam(PG_FUNCTION_ARGS)
 				report_corruption(&ctx,
 								  psprintf("tuple with in-progress xmin %u was updated to produce a tuple at offset %d with committed xmin %u",
 										   curr_xmin,
-										   ctx.offnum,
+										   nextoffnum,
 										   next_xmin));
 			}
 
@@ -790,13 +790,13 @@ verify_heapam(PG_FUNCTION_ARGS)
 					report_corruption(&ctx,
 									  psprintf("tuple with aborted xmin %u was updated to produce a tuple at offset %d with in-progress xmin %u",
 											   curr_xmin,
-											   ctx.offnum,
+											   nextoffnum,
 											   next_xmin));
 				else if (xmin_commit_status[nextoffnum] == XID_COMMITTED)
 					report_corruption(&ctx,
 									  psprintf("tuple with aborted xmin %u was updated to produce a tuple at offset %d with committed xmin %u",
 											   curr_xmin,
-											   ctx.offnum,
+											   nextoffnum,
 											   next_xmin));
 			}
 		}
-- 
2.45.2

#2Neil Chen
carpenter.nail.cz@gmail.com
In reply to: zengman (#1)
Re: [PATCH] Align verify_heapam.c error message offset with test expectations

Hi zengman,

On Wed, Jan 21, 2026 at 10:37 PM zengman <zengman@halodbtech.com> wrote:

This indicates that for the test case at offnum == 36, the error message
should report "offset 43" (the successor), not "offset 36" (the current
tuple).
However, when I updated the test expectation from \d+ wildcard to the
exact value offset 43, the test fails.
This makes me wonder whether the current code in verify_heapam.c (lines
777, 793, 799) should be using nextoffnum instead of ctx.offnum.

Fully agreed.
I think there's no need to modify the expectations. Seems the 0002 patch
alone is sufficient.

#3zengman
zengman@halodbtech.com
In reply to: Neil Chen (#2)
Re: [PATCH] Align verify_heapam.c error message offset with test expectations

Fully agreed.
I think there's no need to modify the expectations. Seems the 0002 patch alone is sufficient.

Hi,

Thanks for your feedback. I don’t have a strong opinion either way about whether to keep or remove the 0001 patch — I included it more as a way to illustrate the issue.

--
Regards,
Man Zeng
www.openhalo.org

#4Khoa Nguyen
khoaduynguyen@gmail.com
In reply to: zengman (#1)
Re: [PATCH] Align verify_heapam.c error message offset with test expectations

On Thu, Jan 22, 2026 at 9:44 AM zengman <zengman@halodbtech.com> wrote:

Hi all,

I think there might be an issue with the error messages in
contrib/amcheck/verify_heapam.c.
The test comment in src/bin/pg_amcheck/t/004_verify_heapam.pl (line 715)
clearly states:
```
# Tuple at offset 43 is the successor of this one
```
This indicates that for the test case at offnum == 36, the error message
should report "offset 43" (the successor), not "offset 36" (the current
tuple).
However, when I updated the test expectation from \d+ wildcard to the
exact value offset 43, the test fails.

Patch 1: Updating the wildcard matching from "\d+" to the exact value "43"
makes the test more precise and clear. After all, this change led to the
discovery of the bug you fixed in Patch 2 - the original wildcard masked
the problem. Since the test data was engineered to be deterministic and
there is precedent for hardcoding test values, I think Patch 1 is a good
addition.

Lines 283-296 implies test data was engineer to be deterministic.
# Data for HOT chain validation, so not calling VACUUM FREEZE.
$node->safe_psql(
'postgres', qq(
INSERT INTO public.test (a, b, c)
VALUES ( x'DEADF9F9DEADF9F9'::bigint, 'abcdefg',
generate_series(7,15)); -- offset numbers 28 to 36
UPDATE public.test SET c = 'a' WHERE c = '7'; -- offset number 37
UPDATE public.test SET c = 'a' WHERE c = '10'; -- offset number 38
UPDATE public.test SET c = 'a' WHERE c = '11'; -- offset number 39
UPDATE public.test SET c = 'a' WHERE c = '12'; -- offset number 40
UPDATE public.test SET c = 'a' WHERE c = '13'; -- offset number 41
UPDATE public.test SET c = 'a' WHERE c = '14'; -- offset number 42
UPDATE public.test SET c = 'a' WHERE c = '15'; -- offset number 43
));

I think there are a few other test cases that Patch 2 will affect but the
diff will be masked out by regexp. It is up to you if you want to follow
through with the exercise. At a minimum, I think we should address test
offnum == 33 a line 693.

This makes me wonder whether the current code in verify_heapam.c (lines

777, 793, 799) should be using nextoffnum instead of ctx.offnum.
This would also be consistent with similar error messages at lines 746-747
and 753-754, which use nextoffnum when referring to the produced tuple
location.

Great catch on Patch 2. Lines 633-664 show that ctx.offnum is simply an
iteration of the line pointer in the page and nextoffnum maps to the
successor of each iteration's line pointer.

Given that the error message is "... was updated to produce a tuple at
offset %d ...", the successor's offset should be used. Therefore the
change on lines 777, 793, and 799 from using ctx.offnum to using nextoffnum
looks good.

Both patches look great to me.