[BUG] pg_basebackup produces wrong incremental files after relation truncation in segmented tables
Hello PostgreSQL developers,
I’ve encountered a bug in the incremental backup feature that prevents restoration of backups containing relations larger than 1 GB that were vacuum-truncated.
Problem Description
When taking incremental backups of relations that span multiple segments, if the relation is truncated during VACUUM (after the base backup but before the incremental one), pg_combinebackup fails with:
```
file "%s" has truncation block length %u in excess of segment size %u
```
pg_basebackup itself completes without errors, but the resulting incremental backup cannot be restored.
Root Cause
For segmented relations, a VACUUM that truncates blocks sets a limit_block in the WAL summary. The incremental backup logic then miscalculates truncation_block_length when processing segments 0…N, because it compares a segment-local size against a relation-wide limit.
In src/backend/backup/basebackup_incremental.c:
```
*truncation_block_length = size / BLCKSZ;
if (BlockNumberIsValid(limit_block))
{
unsigned relative_limit = limit_block - segno * RELSEG_SIZE;
if (*truncation_block_length < relative_limit) /* ← problematic */
*truncation_block_length = relative_limit;
}
```
For example, if limit_block lies in segment 10, then relative_limit will be at least 10 * RELSEG_SIZE while processing segment 0. This forces truncation_block_length far beyond the actual segment size, producing a claimed segment length larger than RELSEG_SIZE and eventually the restore error.
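To make the arithmetic concrete, here is a standalone sketch of the miscalculation; the constants assume the default 8 kB block size (so RELSEG_SIZE is 131072 blocks per 1 GB segment), and the specific limit_block value is illustrative only:
```
#include <stdio.h>

#define RELSEG_SIZE 131072		/* blocks per 1 GB segment at 8 kB BLCKSZ */

int
main(void)
{
	unsigned	limit_block = 10 * RELSEG_SIZE + 50;	/* limit in segment 10 */
	unsigned	segno = 0;		/* currently processing segment 0 */
	unsigned	truncation_block_length = RELSEG_SIZE;	/* segment 0 is full */
	unsigned	relative_limit = limit_block - segno * RELSEG_SIZE;

	if (truncation_block_length < relative_limit)
		truncation_block_length = relative_limit;	/* 1310770 blocks! */

	printf("relative_limit = %u (segment size is only %u)\n",
		   relative_limit, RELSEG_SIZE);
	printf("truncation_block_length = %u\n", truncation_block_length);
	return 0;
}
```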
Reproduction Steps
Create a table larger than 1 GB (multiple segments).
Take a full base backup.
Delete rows that occupy the end of the relation.
Run VACUUM (VERBOSE, TRUNCATE) to ensure blocks are removed.
(optional) Confirm that the WAL summary includes a limit entry for the relation.
Take an incremental backup with pg_basebackup.
Attempt to restore using pg_combinebackup.
Observe the truncation block length error.
Patch
A patch correcting this logic is attached, and I’m happy to provide additional details or revisions if helpful.
Best regards,
Oleg Tkachenko

Attachments:
bug_truncation_block_length.patch
Hello,
I am following up on my earlier report and patch, which did not receive any responses. I suspect the issue only manifests under specific conditions, so I am sending additional details along with a reliable reproducer.
For context:
Incremental backups cannot be restored when a relation larger than 1 GB (multiple segments) is vacuum-truncated between the base backup and the incremental backup (pg_basebackup itself completes successfully, but the resulting incremental backup is not restorable).
For segmented relations, the WAL summarizer records limit_block in the WAL summary. During incremental backup, the truncation length is computed incorrectly because a relation-wide limit is compared against a segment-local size.
Reproducer
I am attaching a bash script that reliably reproduces the issue on my system. The script:
Creates a table large enough to span multiple segments.
Takes a full base backup.
Deletes rows at the end of the relation.
Runs VACUUM (TRUNCATE) to remove blocks.
Takes an incremental backup.
Fails during pg_combinebackup.
The script is fully automated and intended to be run as-is.
The patch itself is attached to the previous message.
I would appreciate feedback on the approach and am happy to revise it if needed.
Best regards,
Oleg Tkachenko

Attachments:
reproducer.zip
On Mon, Dec 15, 2025 at 4:39 PM Oleg Tkachenko <oatkachenko@gmail.com> wrote:
[....]
A patch correcting this logic is attached, and I’m happy to provide additional details or revisions if helpful.
Thanks for the reproducer; I can see the reported issue, but I am not
quite sure the proposed fix is correct, and it might break other cases
(I haven't tried constructing such a case yet). There is a comment
detailing that case just before the point where you are planning to
make the changes:
/*
* The truncation block length is the minimum length of the reconstructed
* file. Any block numbers below this threshold that are not present in
* the backup need to be fetched from the prior backup. At or above this
* threshold, blocks should only be included in the result if they are
* present in the backup. (This may require inserting zero blocks if the
* blocks included in the backup are non-consecutive.)
*/
IIUC, we might need to keep the original assignment logic as it is. But we
need to ensure that truncation_block_length is not set to a value that
exceeds RELSEG_SIZE.
Regards,
Amul
[ sorry for not noticing this thread sooner; thanks to Andres for
pointing me to it ]
On Mon, Dec 15, 2025 at 9:01 AM Amul Sul <sulamul@gmail.com> wrote:
IIUC, we might need to keep the original assignment logic as it is. But we
need to ensure that truncation_block_length is not set to a value that
exceeds RELSEG_SIZE.
I think you're right. By way of example, let's say that the current
length of the file is 200 blocks, but the limit block is 100 blocks
into the current segment. That means that the only blocks that we can
get from any previous backup are blocks 0-99. Blocks 100-199 of the
current segment are either mentioned in the WAL summaries we're using
for this backup, or they're all zeroes. We can't set the
truncation_block_length to a value greater than 100, or we'll go
looking for the contents of any zero-filled blocks in previous
backups, which will either fail or produce the wrong answer. But Oleg
is correct that we also shouldn't set it to a value greater than
RELSEG_SIZE. So my guess is that the correct fix might be something
like the attached (untested, for discussion).
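For readers without the attachment, the shape of the change is roughly the following (a sketch only, untested; the attached patch is the actual proposal, and the sketch assumes limit_block does not fall in an earlier segment at this point):
```
*truncation_block_length = size / BLCKSZ;
if (BlockNumberIsValid(limit_block))
{
	unsigned	relative_limit = limit_block - segno * RELSEG_SIZE;

	/*
	 * The limit block may fall in a later segment, in which case it tells
	 * us nothing stricter than "this whole segment"; never claim a minimum
	 * reconstructed length in excess of the segment size.
	 */
	if (relative_limit > RELSEG_SIZE)
		relative_limit = RELSEG_SIZE;

	if (*truncation_block_length < relative_limit)
		*truncation_block_length = relative_limit;
}
```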
--
Robert Haas
EDB: http://www.enterprisedb.com
Attachments:
v1-0001-Don-t-set-the-truncation_block_length-rather-than.patch
Hello Robert,
Thank you for the explanation.
At first, I also thought about clamping truncation_block_length to RELSEG_SIZE. But I hesitated because I thought the reconstructed relation file couldn’t be larger than relative_limit.
After reading the reconstruction code and the comments on top of the discussed block of code (many times), I finally understood that truncation_block_length is the minimum length of the reconstructed file, not just a safety limit. It determines which blocks must be fetched from older backups. So a simple clamp could change how reconstruction works if some blocks are included in incremental backups.
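Restating my understanding in code (hypothetical helper and type names; this is only a paraphrase of that comment, not the actual pg_combinebackup logic):
```
#include <stdbool.h>

typedef unsigned BlockNumber;

enum block_source
{
	FROM_INCREMENTAL_BACKUP,	/* block data is present in this backup */
	FROM_PRIOR_BACKUP,			/* must be fetched from an older backup */
	ZERO_OR_ABSENT				/* zero-filled only if later blocks follow */
};

/* Hypothetical paraphrase of the rule described in the quoted comment. */
static enum block_source
choose_source(BlockNumber blkno, BlockNumber truncation_block_length,
			  bool present_in_incremental)
{
	if (present_in_incremental)
		return FROM_INCREMENTAL_BACKUP;
	if (blkno < truncation_block_length)
		return FROM_PRIOR_BACKUP;	/* below the minimum reconstructed length */
	return ZERO_OR_ABSENT;			/* at or above it: never consult old backups */
}
```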
I’ve tested the version with the limit enforced to RELSEG_SIZE, and it works correctly.
Also, I’ve attached a patch based on your guidance. The changes are effectively the same as your suggested approach, but I would be happy to be listed as a contributor.
Regards,
Oleg Tkachenko


Attachments:
0001-Clamp-truncation_block_length-to-RELSEG_SIZE.patch
v1-0001-Don-t-set-the-truncation_block_length-rather-than.patch
On Mon, Dec 15, 2025 at 1:46 PM Oleg Tkachenko <oatkachenko@gmail.com> wrote:
Also, I’ve attached a patch based on your guidance. The changes are effectively the same as your suggested approach, but I would be happy to be listed as a contributor.
You'll certainly be listed as the reporter for this issue when a fix is
committed. If you want to be listed as a co-author of the patch, I
think it is fair to say that it will need to contain some code written
by you. For example, maybe you would like to try writing a TAP test
case that fails without this fix and passes with it.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Dec 16, 2025, at 00:35, Robert Haas <robertmhaas@gmail.com> wrote:
<v1-0001-Don-t-set-the-truncation_block_length-rather-than.patch>
The change looks good to me. My only nitpick is:
```
Subject: [PATCH v1] Don't set the truncation_block_length rather than
RELSEG_SIZE.
```
I guess you meant to say “larger (or greater) than” instead of “rather than”.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On Tue, Dec 16, 2025 at 3:06 AM Chao Li <li.evan.chao@gmail.com> wrote:
I guess you meant to say “larger (or greater) than” instead of “rather than”.
Yes, thanks.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hello Robert,
I’ve created a small test that reproduces the issue. With the proposed fix applied, the test passes, and the reconstruction behaves as expected.
I’m attaching the test for review. Please let me know if this looks OK or if you would like it changed.
Regards,
Oleg

Attachments:
0001-Test-the-correctness-of-the-truncation_block_length-.patch
On Wed, Dec 17, 2025 at 3:25 AM Oleg Tkachenko <oatkachenko@gmail.com> wrote:
Hello, Robert
I’ve created a small test that reproduces the issue. With the proposed fix applied, the test passes, and the reconstruction behaves as expected.
I’m attaching the test for review. Please let me know if this looks OK or if you would like it changed.
The test looks good to me, but I have three suggestions, as follows:
1. To minimize repetition in the insert: use fillfactor 10, the minimum
we can set for a table, so that we minimize the number of tuples per
page. Use a longer string and a lower count in repeat(), which I believe
makes the test a bit faster.
2. I think we could add this test to the existing pg_combinebackup
test file instead of creating a new file with a single test. See the
attached version; it's a bit smaller than your original patch, but
since I haven't copied all of your comments yet, I've marked it as
WIP.
3. Kindly combine the code fix and tests together into a single patch.
Regards,
Amul
Attachments:
WIP-add-tests-to-pg_combinebackup.patch
On Thu, Dec 18, 2025 at 1:05 AM Amul Sul <sulamul@gmail.com> wrote:
The test looks good to me, but I have three suggestions, as follows:
1. To minimize repetition in the insert: use fillfactor 10, the minimum
we can set for a table, so that we minimize the number of tuples per
page. Use a longer string and a lower count in repeat(), which I believe
makes the test a bit faster.
I haven't checked how big a relation the test case creates, but it's
worth keeping in mind that the CI tests run on one platform with the
segment size set to six blocks. I think we should design the test case
with that in mind, i.e., don't worry about catching the bug when the
segment size is 1GB, but make sure the test fails in CI without the
bug fix. Let's not rely on fillfactor -- the cost here is the disk
space and the time to write the blocks, not how many tuples they
actually contain.
2. I think we could add this test to the existing pg_combinebackup
test file instead of creating a new file with a single test. See the
attached version; it's a bit smaller than your original patch, but
since I haven't copied all of your comments yet, I've marked it as
WIP.
-1. This kind of thing tends to make the tests harder to understand.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
Here is a refactored test.
Now, it creates data depending on the relation block size, so it works even if the segment size is not standard. I tested it locally with segment_size_blocks = 6, and it works correctly.
I would be happy to hear your comments or suggestions.
Regards,
Oleg

Attachments:
0001-Clamp-truncation_block_length-to-RELSEG_SIZE-when-se.patch
On Thu, Dec 18, 2025 at 12:24 PM Oleg Tkachenko <oatkachenko@gmail.com> wrote:
Here is a refactored test.
Now, it creates data depending on the relation block size, so it works even if the segment size is not standard. I tested it locally with segment_size_blocks = 6, and it works correctly.
I would be happy to hear your comments or suggestions.
Hi Oleg,
I have been mostly on vacation since you sent this email, but here I
am back again. I tried running this on CI with and without the actual
code fix, and was pleased to see the CI failed on this test without
the code fix and passed with it. But then I noticed that you hadn't
updated meson.build in src/bin/pg_basebackup for the new test, which
means that the test was only running in configure/make builds and not
in meson/ninja builds. When I fixed that, things didn't look so good.
The test then fails:
pg_combinebackup: reconstructing
"/tmp/cirrus-ci-build/build/testrun/pg_basebackup/050_incremental_backup_truncation_block/data/t_050_incremental_backup_truncation_block_node2_data/pgdata/base/5/16384.1"
(1 blocks, checksum CRC32C)
pg_combinebackup: reconstruction plan:
0:/tmp/cirrus-ci-build/build/testrun/pg_basebackup/050_incremental_backup_truncation_block/data/t_050_incremental_backup_truncation_block_primary_data/backup/full/base/5/16384.1@0
pg_combinebackup: read 1 blocks from
"/tmp/cirrus-ci-build/build/testrun/pg_basebackup/050_incremental_backup_truncation_block/data/t_050_incremental_backup_truncation_block_primary_data/backup/full/base/5/16384.1"
pg_combinebackup: reconstructing
"/tmp/cirrus-ci-build/build/testrun/pg_basebackup/050_incremental_backup_truncation_block/data/t_050_incremental_backup_truncation_block_node2_data/pgdata/base/5/16384_vm"
(131072 blocks, checksum CRC32C)
pg_combinebackup: reconstruction plan:
0-3:/tmp/cirrus-ci-build/build/testrun/pg_basebackup/050_incremental_backup_truncation_block/data/t_050_incremental_backup_truncation_block_primary_data/backup/full/base/5/16384_vm@24576
4:/tmp/cirrus-ci-build/build/testrun/pg_basebackup/050_incremental_backup_truncation_block/data/t_050_incremental_backup_truncation_block_primary_data/backup/incr/base/5/INCREMENTAL.16384_vm@8192
5-131071:zero
pg_combinebackup: error: could not write file
"/tmp/cirrus-ci-build/build/testrun/pg_basebackup/050_incremental_backup_truncation_block/data/t_050_incremental_backup_truncation_block_node2_data/pgdata/base/5/16384_vm":
No space left on device
I'm not sure what's going on here exactly, but it seems bad. The
output implies that 16384_vm is a full 1GB in size, which doesn't
really make any sense to me at all, but the same thing also happens
when I run the test locally. The VM fork is normally quite small
compared to the data, and here the data is only one block over 1GB, so
I'd expect the VM fork to be just a few blocks. Are we somehow
confusing the length of the VM fork with the length of the main fork?
A couple of stylistic notes: All of the existing incremental backup
tests are in src/bin/pg_combinebackup/t. I suggest putting this one
there too. Normally, our TAP test names are all lower-case, so do the
same here. Try to format the test file so that things fit within 80
columns, by breaking comments and Perl statements at appropriate
points. Consider running src/tools/pgindent/pgperltidy over the script
to check that the way you've broken the Perl statements won't get
reindented.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hello Robert,
I checked the VM fork file and found that its incremental version has a wrong
block number in the header:
```
xxd -l 12 INCREMENTAL.16384_vm
0d1f aed3 0100 0000 0000 0200 <--- 131072 blocks (1 GB)
                    ^^^^ ^^^^
```
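For reference, those 12 bytes decode as three little-endian 32-bit fields. As far as I can tell, the layout is the magic number, the incremental block count, and then the truncation block length; the field order here is my reading of the sources, so treat it as an assumption to be verified against reconstruct.c:
```
#include <stdint.h>

/*
 * My reading of the INCREMENTAL.* file header (an assumption, not a
 * quotation of the sources): three 32-bit fields, followed by num_blocks
 * relative block numbers and then the block contents.
 */
typedef struct
{
	uint32_t	magic;						/* here: 0xd3ae1f0d */
	uint32_t	num_blocks;					/* here: 0x00000001 = 1 block */
	uint32_t	truncation_block_length;	/* here: 0x00020000 = 131072 */
} IncrementalFileHeader;
```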
This value can only come from the WAL summaries, so I checked them too.
One of the summary files contains:
```
TS 1663, DB 5, REL 16384, FORK main: limit 131073
TS 1663, DB 5, REL 16384, FORK vm: limit 131073
TS 1663, DB 5, REL 16384, FORK vm: block 4
```
Both forks have the same limit, which looks wrong.
So I checked the WAL files to see what really happened with the VM fork.
I did not find any “truncate” records for the VM file.
I only found this record for the main fork
(actually, the fork isn’t mentioned at all):
```
rmgr: Storage len (rec/tot): 46/46, tx: 759, lsn: 0/4600D318,
prev 0/4600B2C8, desc: TRUNCATE base/5/16384 to 131073 blocks flags 7
```
This suggests that the WAL summarizer may be mixing up information between
relation forks.
I also noticed this comment in basebackup_incremental.c:
```
/*
* The free-space map fork is not properly WAL-logged, so we need to
* backup the entire file every time.
*/
if (forknum == FSM_FORKNUM)
return BACK_UP_FILE_FULLY;
```
Maybe we should treat the VM fork the same way and always back it up fully?
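A sketch of that first idea (the idea only; whether giving up incremental treatment of the VM fork is acceptable is exactly the question):
```
/*
 * Sketch only: treat the visibility-map fork like the free-space map
 * and always back it up in full, sidestepping the bogus limit_block.
 */
if (forknum == FSM_FORKNUM || forknum == VISIBILITYMAP_FORKNUM)
    return BACK_UP_FILE_FULLY;
```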
Another option is to fix the summarizer so it handles forks correctly.
Best regards,
Oleg
On Wed, Jan 7, 2026 at 9:50 AM Oleg Tkachenko <oatkachenko@gmail.com> wrote:
Both forks have the same limit, which looks wrong.
So I checked the WAL files to see what really happened with the VM fork.
I did not find any “truncate" records for the VM file.
I only found this record for the main fork
(actually, the fork isn’t mentioned at all):
rmgr: Storage len (rec/tot): 46/46, tx: 759, lsn: 0/4600D318,
prev 0/4600B2C8, desc: TRUNCATE base/5/16384 to 131073 blocks flags 7
Flags 7 for Storage/TRUNCATE means all forks:
#define SMGR_TRUNCATE_HEAP 0x0001
#define SMGR_TRUNCATE_VM 0x0002
#define SMGR_TRUNCATE_FSM 0x0004
#define SMGR_TRUNCATE_ALL \
(SMGR_TRUNCATE_HEAP|SMGR_TRUNCATE_VM|SMGR_TRUNCATE_FSM)
I think this comes from RelationTruncate(), which does indeed set
xlrec.flags = SMGR_TRUNCATE_ALL. It seems bananas to me to use the
same count of blocks for all forks, but it seems that is the way the
code treats it. RelationTruncate() goes on to call
smgrtruncate(RelationGetSmgr(rel), forks, nforks, old_blocks, blocks)
which iterates over all forks and uses the same block number for all
of them, smgr_redo() also does this, and SummarizeSmgrRecord() also
calls BlockRefTableSetLimitBlock() for each relevant fork with that
same block number. This really makes no sense to me unless the block
count happens to be zero, but AFAICT all the code agrees that this is
how it's supposed to work.
I think the problem here is that the incremental backup code makes the
apparently-naive assumption that the purpose of truncation is to make
things shorter. In this case, all forks were truncated to a random
length that was well in excess of the length of the VM fork, and in
pg_combinebackup, find_reconstructed_block_length() interprets that to
mean that the output file should be at least as long as the truncation
length. I am at present uncertain whether that can be safely changed
without breaking anything else. I don't think that what we're doing is
unsafe in the sense of producing corrupted data, because a bunch of
trailing blocks of zeroes are harmless, but it's obviously potentially
pretty problematic if it causes a huge disk space blowup as it did
here. So I think something should be done about this, but I think the
original issue you reported is more urgent.
So my suggestion is to change the test so that it produces a file that
is the same small size on every platform. On most platforms, this will
be 1 segment. On the CI platform where we set the segment size to 6,
it will be multiple segments, and on that platform only it will
effectively test for this bug. If you do that, then we can commit the
fix for the original problem. We (or someone else) can then look into
what needs to be done to address the excessive zero-padding as a separate issue.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi Robert,
As you suggested, I've updated the test so that it produces small files consistently across platforms.
On one specific platform (with segment size = 6 blocks), it still exercises the relevant code path.
And I've also applied the style changes you mentioned.
Best regards,
Oleg
Attachments:
0001-Clamp-truncation_block_length-to-RELSEG_SIZE-when-se.patch
On Fri, Jan 16, 2026 at 7:10 AM Oleg Tkachenko <oatkachenko@gmail.com> wrote:
As you suggested, I've updated the test so that it produces small files consistently across platforms.
On one specific platform (with segment size = 6 blocks), it still exercises the relevant code path.
And I've also applied the style changes you mentioned.
Thanks. I have committed the fix and back-patched to v17. I made some
style changes to your test, especially rewriting comments, but the
substance of it is unchanged.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hello Robert,
19.01.2026 19:42, Robert Haas wrote:
Thanks. I have committed the fix and back-patched to v17. I made some
style changes to your test, especially rewriting comments, but the
substance of it is unchanged.
As Windows animal fairywren shows at [1], the test
011_incremental_backup_truncation_block, added by ecd275718, hit the
Windows path length limitation:
126/281 postgresql:pg_combinebackup / pg_combinebackup/011_incremental_backup_truncation_block ERROR 96.71s
(exit status 255 or signal 127 SIGinvalid)
log/011_incremental_backup_truncation_block_primary.log contains:
...
2026-01-20 02:11:14.760 UTC [7480:2] [unknown] LOG: connection authenticated: user="pgrunner" method=trust
(C:/tools/xmsys64/home/pgrunner/bf/root/REL_18_STABLE/pgsql.build/testrun/pg_combinebackup/011_incremental_backup_truncation_block/data/t_011_incremental_backup_truncation_block_primary_data/pgdata/pg_hba.conf:117)
...
2026-01-20 02:11:23.506 UTC [1252:3] LOG: checkpoint starting: immediate force wait
2026-01-20 02:11:23.580 UTC [1252:4] LOG: checkpoint complete: wrote 5 buffers (3.9%), wrote 1 SLRU buffers; 0 WAL
file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.074 s; sync files=0, longest=0.000 s,
average=0.000 s; distance=32768 kB, estimate=32768 kB; lsn=0/4000080, redo lsn=0/4000028
2026-01-20 02:11:23.974 UTC [8968:2] ERROR: could not rename file "pg_wal/summaries/temp.summary" to
"pg_wal/summaries/00000001000000000100002800000000010CAA50.summary": No such file or directory
1 file(s) copied.
2026-01-20 02:11:33.645 UTC [7968:14] 011_incremental_backup_truncation_block.pl WARNING: still waiting for WAL
summarization through 0/4000028 after 10 seconds
2026-01-20 02:11:33.645 UTC [7968:15] 011_incremental_backup_truncation_block.pl DETAIL: Summarization has reached
0/1000000 on disk and 0/10CAA50 in memory.
That is, the target filename for the rename operation is:
C:/tools/xmsys64/home/pgrunner/bf/root/REL_18_STABLE/pgsql.build/testrun/pg_combinebackup/011_incremental_backup_truncation_block/data/t_011_incremental_backup_truncation_block_primary_data/pgdata/pg_wal/summaries/00000001000000000100002800000000010CAA50.summary
I can reproduce this locally with the source tree located at
/c/src/postgresql12345678901234567890123456789012345678901.
Not reproduced with HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem\LongPathsEnabled set to 1.
Note that the test passes on the master branch ([2]) because "master" is
shorter than "REL_18_STABLE" by 7 chars.
See also [3].
[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2026-01-20%2001%3A26%3A02
[2]: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=fairywren&dt=2026-01-24%2010%3A03%3A10&stg=misc-check
[3]: /messages/by-id/666ac55b-3400-fb2c-2cea-0281bf36a53c@dunslane.net
Best regards,
Alexander
On Sun, Jan 25, 2026 at 7:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
As Windows animal fairywren shows at [1], the test
011_incremental_backup_truncation_block, added by ecd275718, hit the
Windows path length limitation:
126/281 postgresql:pg_combinebackup / pg_combinebackup/011_incremental_backup_truncation_block ERROR 96.71s
(exit status 255 or signal 127 SIGinvalid)
Thanks for the report. I have pushed a commit to rename this test case
from 011_incremental_backup_truncation_block to 011_ib_truncation, which
I hope will be enough to fix this. I noticed when committing
originally that this test case's name was a lot longer than anything
else in the same directory, but I figured it didn't matter enough to
bother changing it. Oops.
I also wonder if there's some way we could change some of our pathname
construction logic to mitigate this. Notice that in this pathname:
C:/tools/xmsys64/home/pgrunner/bf/root/REL_18_STABLE/pgsql.build/testrun/pg_combinebackup/011_incremental_backup_truncation_block/data/t_011_incremental_backup_truncation_block_primary_data/pgdata/pg_wal/summaries/00000001000000000100002800000000010CAA50.summary
...the full name of the test case appears twice, once as a
subdirectory of pg_combinebackup, indicating which pg_combinebackup
test is running, and then again as part of the name of the data
directory. But why does the directory need to be named
t_011_incremental_backup_truncation_block_primary_data instead of, you
know, primary_data? I would sort of like to hope that in 2026, we
wouldn't be subject to a 260-character pathname limit. But if we are,
repeating the same strings multiple times in that pathname doesn't
seem like the way to go.
--
Robert Haas
EDB: http://www.enterprisedb.com