[PATCH] Perform check for oversized WAL record before calculating record CRC

Started by Sergey Fukanchik4 months ago3 messages
#1Sergey Fukanchik
s.fukanchik@postgrespro.ru
1 attachment(s)

Hi Postgres hackers,
I found a case where CRC of 1Gb block is calculated first and then
immediately
discarded.

There is a limit on WAL record size - XLogRecordMaxSize. If the record
being inserted is larger than that, it is discarded and error is reported:

ERROR:  oversized WAL record
DETAIL:  WAL record would be 1069547521 bytes (of maximum 1069547520 bytes)

However, crc of record data is calculated before the record size is
validated,
and in case of oversized record this crc is not used anywhere.

It is surely a minor issue, but might be worth fixing. I'm proposing a
patch.
Since this situation is not covered by any tests I also included a test case
for failing on huge WAL records.
---
Sergey Fukanchik

Attachments:

0001-Perform-check-for-oversized-WAL-record-before-calcul.patchtext/x-patch; charset=UTF-8; name=0001-Perform-check-for-oversized-WAL-record-before-calcul.patchDownload
From 1795b6eca263db9a8a941f27c7ebae911954dcff Mon Sep 17 00:00:00 2001
From: Sergey Fukanchik <s.fukanchik@postgrespro.ru>
Date: Sat, 6 Sep 2025 12:40:01 +0300
Subject: [PATCH] Perform check for oversized WAL record before calculating
 record CRC

---
 src/backend/access/transam/xloginsert.c       | 26 +++++++++----------
 .../regress/expected/oversized_wal_record.out |  5 ++++
 src/test/regress/parallel_schedule            |  2 +-
 src/test/regress/sql/oversized_wal_record.sql |  4 +++
 4 files changed, 23 insertions(+), 14 deletions(-)
 create mode 100644 src/test/regress/expected/oversized_wal_record.out
 create mode 100644 src/test/regress/sql/oversized_wal_record.sql

diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index c7571429e8e..e8c55924c12 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -904,19 +904,6 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 	hdr_rdt.len = (scratch - hdr_scratch);
 	total_len += hdr_rdt.len;
 
-	/*
-	 * Calculate CRC of the data
-	 *
-	 * Note that the record header isn't added into the CRC initially since we
-	 * don't know the prev-link yet.  Thus, the CRC will represent the CRC of
-	 * the whole record in the order: rdata, then backup blocks, then record
-	 * header.
-	 */
-	INIT_CRC32C(rdata_crc);
-	COMP_CRC32C(rdata_crc, hdr_scratch + SizeOfXLogRecord, hdr_rdt.len - SizeOfXLogRecord);
-	for (rdt = hdr_rdt.next; rdt != NULL; rdt = rdt->next)
-		COMP_CRC32C(rdata_crc, rdt->data, rdt->len);
-
 	/*
 	 * Ensure that the XLogRecord is not too large.
 	 *
@@ -930,6 +917,19 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 				 errdetail_internal("WAL record would be %" PRIu64 " bytes (of maximum %u bytes); rmid %u flags %u.",
 									total_len, XLogRecordMaxSize, rmid, info)));
 
+	/*
+	 * Calculate CRC of the data
+	 *
+	 * Note that the record header isn't added into the CRC initially since we
+	 * don't know the prev-link yet.  Thus, the CRC will represent the CRC of
+	 * the whole record in the order: rdata, then backup blocks, then record
+	 * header.
+	 */
+	INIT_CRC32C(rdata_crc);
+	COMP_CRC32C(rdata_crc, hdr_scratch + SizeOfXLogRecord, hdr_rdt.len - SizeOfXLogRecord);
+	for (rdt = hdr_rdt.next; rdt != NULL; rdt = rdt->next)
+		COMP_CRC32C(rdata_crc, rdt->data, rdt->len);
+
 	/*
 	 * Fill in the fields in the record header. Prev-link is filled in later,
 	 * once we know where in the WAL the record will be inserted. The CRC does
diff --git a/src/test/regress/expected/oversized_wal_record.out b/src/test/regress/expected/oversized_wal_record.out
new file mode 100644
index 00000000000..c9da04b59ec
--- /dev/null
+++ b/src/test/regress/expected/oversized_wal_record.out
@@ -0,0 +1,5 @@
+-- Try to create a WAL record which is larger than the limit.
+-- That should raise an error.
+select pg_logical_emit_message(false, 'a', repeat('00000000', (1020 * 1024 * 1024)/8+1), true);
+ERROR:  oversized WAL record
+DETAIL:  WAL record would be 1069547583 bytes (of maximum 1069547520 bytes); rmid 21 flags 0.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fbffc67ae60..913e2583129 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -36,7 +36,7 @@ test: geometry horology tstypes regex type_sanity opr_sanity misc_sanity comment
 # execute two copy tests in parallel, to check that copy itself
 # is concurrent safe.
 # ----------
-test: copy copyselect copydml copyencoding insert insert_conflict
+test: copy copyselect copydml copyencoding insert insert_conflict oversized_wal_record
 
 # ----------
 # More groups of parallel tests
diff --git a/src/test/regress/sql/oversized_wal_record.sql b/src/test/regress/sql/oversized_wal_record.sql
new file mode 100644
index 00000000000..555b6035c0f
--- /dev/null
+++ b/src/test/regress/sql/oversized_wal_record.sql
@@ -0,0 +1,4 @@
+-- Try to create a WAL record which is larger than the limit.
+-- That should raise an error.
+
+select pg_logical_emit_message(false, 'a', repeat('00000000', (1020 * 1024 * 1024)/8+1), true);
-- 
2.34.1

#2Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Sergey Fukanchik (#1)
Re: [PATCH] Perform check for oversized WAL record before calculating record CRC

On 6 Sep 2025, at 16:00, Sergey Fukanchik <s.fukanchik@postgrespro.ru> wrote:

<0001-Perform-check-for-oversized-WAL-record-before-calcul.patch>

Hi Sergey!

It seems to me reasonable to move size check above CRC computation. However, it seems suspicious to me to run a test that allocates 1Gb in `make check`. Maybe, there are places that are not exercised too often. Perhaps recovery tests or something like that.

Best regards, Andrey Borodin.

#3Sergey Fukanchik
s.fukanchik@postgrespro.ru
In reply to: Andrey Borodin (#2)
2 attachment(s)
Re: [PATCH] Perform check for oversized WAL record before calculating record CRC

It seems to me reasonable to move size check above CRC computation. However, it seems suspicious to me to run a test that allocates 1Gb in `make check`. Maybe, there are places that are not exercised too often. Perhaps recovery tests or something like that.

Hi Andrey,

I share your concern about memory consumption and also agree recovery
tests look like the right place for this test.

So I split the patch into two - the change proper and the converted TAP
test, guarded by PG_TEST_EXTRA. Attaching both patches.

---

Sergey

Attachments:

v2-0001-Perform-check-for-oversized-WAL-record-before-cal.patchtext/x-patch; charset=UTF-8; name=v2-0001-Perform-check-for-oversized-WAL-record-before-cal.patchDownload
From c3818555611ab4c9c39dee90e9e56e69c2fcc1ed Mon Sep 17 00:00:00 2001
From: Sergey Fukanchik <s.fukanchik@postgrespro.ru>
Date: Sun, 7 Sep 2025 12:06:03 +0300
Subject: [PATCH v2 1/2] Perform check for oversized WAL record before
 calculating record CRC

---
 src/backend/access/transam/xloginsert.c | 26 ++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index c7571429e8e..e8c55924c12 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -904,19 +904,6 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 	hdr_rdt.len = (scratch - hdr_scratch);
 	total_len += hdr_rdt.len;
 
-	/*
-	 * Calculate CRC of the data
-	 *
-	 * Note that the record header isn't added into the CRC initially since we
-	 * don't know the prev-link yet.  Thus, the CRC will represent the CRC of
-	 * the whole record in the order: rdata, then backup blocks, then record
-	 * header.
-	 */
-	INIT_CRC32C(rdata_crc);
-	COMP_CRC32C(rdata_crc, hdr_scratch + SizeOfXLogRecord, hdr_rdt.len - SizeOfXLogRecord);
-	for (rdt = hdr_rdt.next; rdt != NULL; rdt = rdt->next)
-		COMP_CRC32C(rdata_crc, rdt->data, rdt->len);
-
 	/*
 	 * Ensure that the XLogRecord is not too large.
 	 *
@@ -930,6 +917,19 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 				 errdetail_internal("WAL record would be %" PRIu64 " bytes (of maximum %u bytes); rmid %u flags %u.",
 									total_len, XLogRecordMaxSize, rmid, info)));
 
+	/*
+	 * Calculate CRC of the data
+	 *
+	 * Note that the record header isn't added into the CRC initially since we
+	 * don't know the prev-link yet.  Thus, the CRC will represent the CRC of
+	 * the whole record in the order: rdata, then backup blocks, then record
+	 * header.
+	 */
+	INIT_CRC32C(rdata_crc);
+	COMP_CRC32C(rdata_crc, hdr_scratch + SizeOfXLogRecord, hdr_rdt.len - SizeOfXLogRecord);
+	for (rdt = hdr_rdt.next; rdt != NULL; rdt = rdt->next)
+		COMP_CRC32C(rdata_crc, rdt->data, rdt->len);
+
 	/*
 	 * Fill in the fields in the record header. Prev-link is filled in later,
 	 * once we know where in the WAL the record will be inserted. The CRC does
-- 
2.34.1

v2-0002-Add-a-TAP-test-for-oversized-WAL-records.patchtext/x-patch; charset=UTF-8; name=v2-0002-Add-a-TAP-test-for-oversized-WAL-records.patchDownload
From 23172b798c82b039ec5be178a3740362ae65fc37 Mon Sep 17 00:00:00 2001
From: Sergey Fukanchik <s.fukanchik@postgrespro.ru>
Date: Sun, 7 Sep 2025 12:06:27 +0300
Subject: [PATCH v2 2/2] Add a TAP test for oversized WAL records

---
 .../recovery/t/049_oversized_wal_record.pl    | 50 +++++++++++++++++++
 1 file changed, 50 insertions(+)
 create mode 100644 src/test/recovery/t/049_oversized_wal_record.pl

diff --git a/src/test/recovery/t/049_oversized_wal_record.pl b/src/test/recovery/t/049_oversized_wal_record.pl
new file mode 100644
index 00000000000..8604ef53d48
--- /dev/null
+++ b/src/test/recovery/t/049_oversized_wal_record.pl
@@ -0,0 +1,50 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Try to create a WAL record which is larger than the limit XLogRecordMaxSize.
+# That should raise an error.
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# Disable the test unless requested explicitly because it requires 1Gb of memory
+if (exists $ENV{PG_TEST_EXTRA}
+	&& $ENV{PG_TEST_EXTRA} =~ m/\boversized_wal_record\b/)
+{
+	plan tests => 4;
+}
+else
+{
+	plan skip_all =>
+      'Skipping oversized_wal_record test as it requires a lot of memory';
+}
+
+note "Check handing of oversized WAL records";
+
+# Initialize and start basic node
+my $node = PostgreSQL::Test::Cluster->new('node');
+
+$node->init;
+
+$node->start;
+
+# Insert oversized record
+my ($ret, $stdout, $stderr) = $node->psql(
+	'postgres',
+	"select pg_logical_emit_message(false, 'a', repeat('00000000', (1020 * 1024 * 1024)/8+1), true)",
+	on_error_die => 0
+  );
+
+# Verify we have error
+ok($ret != 0, 'exit code is non-zero');
+
+ok($stderr =~ qr/ERROR:  oversized WAL record/, 'expect error');
+
+ok($stderr =~ qr/DETAIL:  WAL record would be [0-9]+ bytes \(of maximum [0-9]+ bytes\);/,
+   'error details');
+
+$node->stop;
+
+done_testing();
-- 
2.34.1