small pg_combinebackup improvements

Started by Robert Haasabout 1 year ago6 messages
#1Robert Haas
robertmhaas@gmail.com
2 attachment(s)

Hi,

Here are two small patches to improve pg_combinebackup slightly.

0001: I noticed that there is some logic in reconstruct.c that
constructs a pathname of the form a/b//c instead of a/b/c. AFAICT,
this still works fine; it just looks ugly. It's possible to get one of
these pathnames to show up in an error message if you have a corrupted
backup, e.g.:

pg_combinebackup: error: could not open file
"x/base/1//INCREMENTAL.6229": No such file or directory
pg_combinebackup: removing output directory "xy"

So 0001 fixes this.

0002: If you try to do something like pg_combinebackup incr1 incr2 -o
result, you'll get an error saying that the first backup must be a
full backup. This is an implementation restriction that might be good
to lift, but right now that's how it is. However, if you do
pg_combinebackup full incr -o result, but the full backup happens to
contain an INCREMENTAL.* file that also exists in the incremental
backup, then you won't get an error. Instead you'll reconstruct a
completely bogus full file and write it to the result directory. Now,
this should really be impossible unless you're intentionally trying to
break pg_combinebackup, but it's also pretty lame, so 0002 fixes
things so that you get a proper error, and also adds a test case.

Review appreciated. My plan is to commit at least to master and
possibly back-patch. Opinions on whether to back-patch are also
appreciated.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v1-0001-pg_combinebackup-When-reconstructing-avoid-double.patchapplication/octet-stream; name=v1-0001-pg_combinebackup-When-reconstructing-avoid-double.patchDownload
From 46d2ab10897626fd3d22282713e67e38945f2d60 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 30 Oct 2024 15:02:38 -0400
Subject: [PATCH v1 1/2] pg_combinebackup: When reconstructing, avoid double
 slash in filename.

This function is always called with a relative_path that ends in a
slash, so there's no need to insert a second one. So, don't. Instead,
add an assertion to verify that nothing gets broken in the future, and
adjust the comments.
---
 src/bin/pg_combinebackup/reconstruct.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_combinebackup/reconstruct.c b/src/bin/pg_combinebackup/reconstruct.c
index db3f968d271..ae8a5125263 100644
--- a/src/bin/pg_combinebackup/reconstruct.c
+++ b/src/bin/pg_combinebackup/reconstruct.c
@@ -77,8 +77,9 @@ static void read_block(rfile *s, off_t off, uint8 *buffer);
  *
  * relative_path should be the path to the directory containing this file,
  * relative to the root of the backup (NOT relative to the root of the
- * tablespace). bare_file_name should be the name of the file within that
- * directory, without "INCREMENTAL.".
+ * tablespace). It must always end with a trailing slash. bare_file_name
+ * should be the name of the file within that directory, without
+ * "INCREMENTAL.".
  *
  * n_prior_backups is the number of prior backups, and prior_backup_dirs is
  * an array of pathnames where those backups can be found.
@@ -111,6 +112,10 @@ reconstruct_from_incremental_file(char *input_filename,
 	rfile	   *copy_source = NULL;
 	pg_checksum_context checksum_ctx;
 
+	/* Sanity check the relative_path. */
+	Assert(relative_path[0] != '\0');
+	Assert(relative_path[strlen(relative_path) - 1] == '/');
+
 	/*
 	 * Every block must come either from the latest version of the file or
 	 * from one of the prior backups.
@@ -174,11 +179,11 @@ reconstruct_from_incremental_file(char *input_filename,
 		 * Look for the full file in the previous backup. If not found, then
 		 * look for an incremental file instead.
 		 */
-		snprintf(source_filename, MAXPGPATH, "%s/%s/%s",
+		snprintf(source_filename, MAXPGPATH, "%s/%s%s",
 				 prior_backup_dirs[sidx], relative_path, bare_file_name);
 		if ((s = make_rfile(source_filename, true)) == NULL)
 		{
-			snprintf(source_filename, MAXPGPATH, "%s/%s/INCREMENTAL.%s",
+			snprintf(source_filename, MAXPGPATH, "%s/%sINCREMENTAL.%s",
 					 prior_backup_dirs[sidx], relative_path, bare_file_name);
 			s = make_incremental_rfile(source_filename);
 		}
-- 
2.39.3 (Apple Git-145)

v1-0002-pg_combinebackup-Error-if-incremental-file-exists.patchapplication/octet-stream; name=v1-0002-pg_combinebackup-Error-if-incremental-file-exists.patchDownload
From 44625b0d549ce43ef5925ae12951167df94b9868 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 30 Oct 2024 15:33:22 -0400
Subject: [PATCH v1 2/2] pg_combinebackup: Error if incremental file exists in
 full backup.

Suppose that you run a command like "pg_combinebackup b1 b2 -o output",
but both b1 and b2 contain an INCREMENTAL.$something file in a directory
that is expected to contain relation files. This is an error, but the
previous code would not detect the problem and instead write a garbage
full file named $something to the output directory. This commit adds
code to detect the error and a test case to verify the behavior.
---
 src/bin/pg_combinebackup/meson.build          |  1 +
 src/bin/pg_combinebackup/reconstruct.c        |  8 +++
 .../pg_combinebackup/t/009_no_full_file.pl    | 66 +++++++++++++++++++
 3 files changed, 75 insertions(+)
 create mode 100644 src/bin/pg_combinebackup/t/009_no_full_file.pl

diff --git a/src/bin/pg_combinebackup/meson.build b/src/bin/pg_combinebackup/meson.build
index d142608e949..550d3503269 100644
--- a/src/bin/pg_combinebackup/meson.build
+++ b/src/bin/pg_combinebackup/meson.build
@@ -36,6 +36,7 @@ tests += {
       't/006_db_file_copy.pl',
       't/007_wal_level_minimal.pl',
       't/008_promote.pl',
+      't/009_no_full_file.pl',
     ],
   }
 }
diff --git a/src/bin/pg_combinebackup/reconstruct.c b/src/bin/pg_combinebackup/reconstruct.c
index ae8a5125263..37ae38b6108 100644
--- a/src/bin/pg_combinebackup/reconstruct.c
+++ b/src/bin/pg_combinebackup/reconstruct.c
@@ -326,11 +326,19 @@ reconstruct_from_incremental_file(char *input_filename,
 	 * result, then forget about performing reconstruction and just copy that
 	 * file in its entirety.
 	 *
+	 * If we have only incremental files, and there's no full file at any
+	 * point in the backup chain, something has gone wrong. Emit an error.
+	 *
 	 * Otherwise, reconstruct.
 	 */
 	if (copy_source != NULL)
 		copy_file(copy_source->filename, output_filename,
 				  &checksum_ctx, copy_method, dry_run);
+	else if (sidx == 0 && source[0]->header_length != 0)
+	{
+		pg_fatal("full backup contains unexpected incremental file \"%s\"",
+				 source[0]->filename);
+	}
 	else
 	{
 		write_reconstructed_file(input_filename, output_filename,
diff --git a/src/bin/pg_combinebackup/t/009_no_full_file.pl b/src/bin/pg_combinebackup/t/009_no_full_file.pl
new file mode 100644
index 00000000000..8a23fa7a4cf
--- /dev/null
+++ b/src/bin/pg_combinebackup/t/009_no_full_file.pl
@@ -0,0 +1,66 @@
+# Copyright (c) 2021-2024, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+use File::Copy;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Can be changed to test the other modes.
+my $mode = $ENV{PG_TEST_PG_COMBINEBACKUP_MODE} || '--copy';
+
+note "testing using mode $mode";
+
+# Set up a new database instance.
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(has_archiving => 1, allows_streaming => 1);
+$primary->append_conf('postgresql.conf', 'summarize_wal = on');
+$primary->start;
+
+# Take a full backup.
+my $backup1path = $primary->backup_dir . '/backup1';
+$primary->command_ok(
+	[ 'pg_basebackup', '-D', $backup1path, '--no-sync', '-cfast' ],
+	"full backup");
+
+# Take an incremental backup.
+my $backup2path = $primary->backup_dir . '/backup2';
+$primary->command_ok(
+	[
+		'pg_basebackup', '-D', $backup2path, '--no-sync', '-cfast',
+		'--incremental', $backup1path . '/backup_manifest'
+	],
+	"incremental backup");
+
+# Find an incremental file in the incremental backup for which there is a full
+# file in the full backup. When we find one, replace the full file with an
+# incremental file.
+my @filelist = grep { /^INCREMENTAL\./ } slurp_dir("$backup2path/base/1");
+my $success = 0;
+for my $iname (@filelist)
+{
+	my $name = $iname;
+	$name =~ s/^INCREMENTAL.//;
+
+	if (-f "$backup1path/base/1/$name")
+	{
+		copy("$backup2path/base/1/$iname", "$backup1path/base/1/$iname")
+			|| die "copy $backup2path/base/1/$iname: $!";
+		unlink("$backup1path/base/1/$name")
+			|| die "unlink $backup1path/base/1/$name: $!";
+		$success = 1;
+		last;
+	}
+}
+
+# pg_combinebackup should fail.
+my $outpath = $primary->backup_dir . '/out';
+$primary->command_fails_like(
+	[
+		'pg_combinebackup', $backup1path, $backup2path, '-o', $outpath,
+	],
+	qr/full backup contains unexpected incremental file/,
+	"pg_combinebackup fails");
+
+done_testing();
-- 
2.39.3 (Apple Git-145)

#2Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Robert Haas (#1)
Re: small pg_combinebackup improvements

Hi,

On Wed, Oct 30, 2024 at 03:50:53PM -0400, Robert Haas wrote:

Hi,

Here are two small patches to improve pg_combinebackup slightly.

0001: I noticed that there is some logic in reconstruct.c that
constructs a pathname of the form a/b//c instead of a/b/c. AFAICT,
this still works fine; it just looks ugly. It's possible to get one of
these pathnames to show up in an error message if you have a corrupted
backup, e.g.:

pg_combinebackup: error: could not open file
"x/base/1//INCREMENTAL.6229": No such file or directory
pg_combinebackup: removing output directory "xy"

So 0001 fixes this.

Yeah, I don't think that was an issue per say but better to display a path of
the form a/b/c (to not be "distracted" by the output at the first place).

0001 looks pretty straightforward and good to me.

0002: If you try to do something like pg_combinebackup incr1 incr2 -o
result, you'll get an error saying that the first backup must be a
full backup. This is an implementation restriction that might be good
to lift, but right now that's how it is. However, if you do
pg_combinebackup full incr -o result, but the full backup happens to
contain an INCREMENTAL.* file that also exists in the incremental
backup, then you won't get an error. Instead you'll reconstruct a
completely bogus full file and write it to the result directory. Now,
this should really be impossible unless you're intentionally trying to
break pg_combinebackup, but it's also pretty lame, so 0002 fixes
things so that you get a proper error, and also adds a test case.

1 ===

+        * If we have only incremental files, and there's no full file at any
+        * point in the backup chain, something has gone wrong. Emit an error.
+        *
         * Otherwise, reconstruct.
         */
        if (copy_source != NULL)
                copy_file(copy_source->filename, output_filename,
                                  &checksum_ctx, copy_method, dry_run);
+       else if (sidx == 0 && source[0]->header_length != 0)
+       {
+               pg_fatal("full backup contains unexpected incremental file \"%s\"",
+                                source[0]->filename);
+       }

What about moving the new comment just before the new "else if"?

2 ===

+ copy("$backup2path/base/1/$iname", "$backup1path/base/1/$iname")

Yeah, "copy" is needed. I tested to "just" create an empty incremental file
and got:

$ pg_combinebackup backup1 backup2 -o restored_full
pg_combinebackup: error: could not read file "backup1/base/1/INCREMENTAL.6113": read 0 of 4

Which is not the error we want to produce.

3 ===

+       qr/full backup contains unexpected incremental file/,
+       "pg_combinebackup fails");

s/pg_combinebackup fails/pg_combinebackup fails due to an unexpected incremental file/?

Review appreciated. My plan is to commit at least to master and
possibly back-patch. Opinions on whether to back-patch are also
appreciated.

I'm not sure about 0001 but I think 0002 deserves a back patch as I think it fits
into the "low-risk fixes" category [0]https://www.postgresql.org/support/versioning/.

[0]: https://www.postgresql.org/support/versioning/

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#3Robert Haas
robertmhaas@gmail.com
In reply to: Bertrand Drouvot (#2)
Re: small pg_combinebackup improvements

On Thu, Oct 31, 2024 at 11:41 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

0001 looks pretty straightforward and good to me.

Thanks for the review.

What about moving the new comment just before the new "else if"?

Well, the block comment applies to the whole if-else if-else
construction. If we get too many else-ifs here it may need
restructuring, but I don't think it needs that now.

Yeah, "copy" is needed. I tested to "just" create an empty incremental file
and got:

$ pg_combinebackup backup1 backup2 -o restored_full
pg_combinebackup: error: could not read file "backup1/base/1/INCREMENTAL.6113": read 0 of 4

Which is not the error we want to produce.

Right, it needs to look like a valid incremental file.

s/pg_combinebackup fails/pg_combinebackup fails due to an unexpected incremental file/?

OK

I'm not sure about 0001 but I think 0002 deserves a back patch as I think it fits
into the "low-risk fixes" category [0].

I'm inclined to back-patch both, then. We might have more small fixes
and they'll be less risky to back-patch if we back-patch them all.

--
Robert Haas
EDB: http://www.enterprisedb.com

#4Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Robert Haas (#3)
Re: small pg_combinebackup improvements

Hi,

On Thu, Oct 31, 2024 at 12:06:25PM -0400, Robert Haas wrote:

On Thu, Oct 31, 2024 at 11:41 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

I'm not sure about 0001 but I think 0002 deserves a back patch as I think it fits
into the "low-risk fixes" category [0].

I'm inclined to back-patch both, then. We might have more small fixes
and they'll be less risky to back-patch if we back-patch them all.

Yeah, that makes fully sense. +1 to back-patch both then.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#5Amul Sul
sulamul@gmail.com
In reply to: Bertrand Drouvot (#4)
Re: small pg_combinebackup improvements

On Fri, Nov 1, 2024 at 1:31 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Thu, Oct 31, 2024 at 12:06:25PM -0400, Robert Haas wrote:

On Thu, Oct 31, 2024 at 11:41 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

I'm not sure about 0001 but I think 0002 deserves a back patch as I think it fits
into the "low-risk fixes" category [0].

I'm inclined to back-patch both, then. We might have more small fixes
and they'll be less risky to back-patch if we back-patch them all.

Yeah, that makes fully sense. +1 to back-patch both then.

+1 for the back-patching.

For 0002, I think we could report the error a bit earlier — the better
place might be in the else part of the following IF-block, IMO:

/*
* If s->header_length == 0, then this is a full file; otherwise, it's
* an incremental file.
*/
if (s->header_length == 0)

Regards,
Amul

#6Robert Haas
robertmhaas@gmail.com
In reply to: Amul Sul (#5)
Re: small pg_combinebackup improvements

On Sun, Nov 3, 2024 at 11:40 PM Amul Sul <sulamul@gmail.com> wrote:

+1 for the back-patching.

For 0002, I think we could report the error a bit earlier — the better
place might be in the else part of the following IF-block, IMO:

/*
* If s->header_length == 0, then this is a full file; otherwise, it's
* an incremental file.
*/
if (s->header_length == 0)

We could, but I have some follow-up patches adding some new
functionality which I plan to post soon, and for those patches, it
turns out to be more convenient to have the error check where I put
it. So I'm sticking with that placement.

Committed and back-patched both to v17; thanks to both of you for the review.

--
Robert Haas
EDB: http://www.enterprisedb.com