Add -k/--link option to pg_combinebackup

Started by Israel Barth Rubio · 12 months ago · 29 messages
#1 Israel Barth Rubio
barthisrael@gmail.com
1 attachment(s)

Hello all,

With the current implementation of pg_combinebackup, we have a few
copy methods: --clone, --copy and --copy-file-range. Each of them
performs an actual file copy in the file system, which implies, among
other things, additional disk usage.

While discussing with some people, e.g. Robert Haas and Martín Marqués,
about possible ways to improve pg_combinebackup performance and/or
reduce disk usage taken by the synthetic backup, there was a thought
of using hard links.

I'm submitting a patch in this thread which introduces the -k/--link
option to pg_combinebackup, making it similar to the options exposed
by pg_upgrade. The patch reuses the same code flow that already exists
in pg_combinebackup, and whenever the tool judges a file should be
copied from either of the input backups to the synthetic backup, if the
link mode was chosen, a hard link is created instead of performing a
copy.

Depending on the pattern of modification of PGDATA files between
backups (based on the workload), the user might face improved
performance of pg_combinebackup as well as lower disk usage by the
synthetic backup.

That enhancement comes at a cost: the input backups may be invalidated
if the user modifies the shared files from the synthetic backup, or if
the user starts the cluster from the synthetic backup. With that in
mind, a warning has been added to pg_combinebackup, as well as to the
docs, recommending that the user move the synthetic backup to another
file system or machine before making use of it.

I've run the existing test suite with the --link option and it worked fine,
except for an issue when running pg_verifybackup in t/003_timeline.pl.
The issue was expected given the description shared above. I modified
the tests to behave slightly differently depending on the copy method
selected by the runner.

Besides that, you can find below the outcomes of manual testing:

* Create some tables:

postgres=# CREATE TABLE test_1 AS SELECT generate_series(1, 1000000);
SELECT 1000000
postgres=# CREATE TABLE test_2 AS SELECT generate_series(1, 1000000);
SELECT 1000000
postgres=# CREATE TABLE test_3 AS SELECT generate_series(1, 1000000);
SELECT 1000000

* Check their OID:

postgres=# SELECT oid, relname FROM pg_class WHERE relname LIKE 'test_%';
oid | relname
-------+---------
16388 | test_1
16391 | test_2
16394 | test_3
(3 rows)

* Enable the WAL summarizer:

postgres=# ALTER SYSTEM SET summarize_wal TO on;
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

* Take a full backup:

$ pg_basebackup -d 'dbname=postgres' -D ~/monday -c fast -Xn
NOTICE: WAL archiving is not enabled; you must ensure that all required
WAL segments are copied through other means to complete the backup

* Perform an update that will touch the whole file:

postgres=# UPDATE test_2 SET generate_series = generate_series + 1;
UPDATE 1000000

* Perform an update that will touch a single page:

postgres=# UPDATE test_3 SET generate_series = generate_series + 1 WHERE
generate_series = 1;
UPDATE 1

* Take an incremental backup:

$ pg_basebackup -d 'dbname=postgres' -D ~/tuesday -c fast -Xn --incremental
~/monday/backup_manifest
NOTICE: WAL archiving is not enabled; you must ensure that all required
WAL segments are copied through other means to complete the backup

* Check a dry run of pg_combinebackup, focusing on the OIDs of the
tables:

$ pg_combinebackup -d -n --link -o ~/tuesday_full ~/monday ~/tuesday/ 2>&1
| grep -P '16388|16391|16394'
pg_combinebackup: would copy "/home/vagrant/monday/base/5/16388" to
"/home/vagrant/tuesday_full/base/5/16388" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16391" to
"/home/vagrant/tuesday_full/base/5/16391" using strategy link
pg_combinebackup: would reconstruct
"/home/vagrant/tuesday_full/base/5/16394" (4480 blocks, checksum CRC32C)
pg_combinebackup: reconstruction plan:
0:/home/vagrant/tuesday//base/5/INCREMENTAL.16394@8192
1-4423:/home/vagrant/monday/base/5/16394@36233216
4424:/home/vagrant/tuesday//base/5/INCREMENTAL.16394@16384
4425-4479:/home/vagrant/monday/base/5/16394@36691968
pg_combinebackup: would have read 4478 blocks from
"/home/vagrant/monday/base/5/16394"
pg_combinebackup: would have read 2 blocks from
"/home/vagrant/tuesday//base/5/INCREMENTAL.16394"
pg_combinebackup: would copy "/home/vagrant/monday/base/5/16388_vm" to
"/home/vagrant/tuesday_full/base/5/16388_vm" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16388_fsm" to
"/home/vagrant/tuesday_full/base/5/16388_fsm" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16391_vm" to
"/home/vagrant/tuesday_full/base/5/16391_vm" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16391_fsm" to
"/home/vagrant/tuesday_full/base/5/16391_fsm" using strategy link
pg_combinebackup: would copy "/home/vagrant/monday/base/5/16394_vm" to
"/home/vagrant/tuesday_full/base/5/16394_vm" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16394_fsm" to
"/home/vagrant/tuesday_full/base/5/16394_fsm" using strategy link

* Create a synthetic backup:
$ pg_combinebackup --link -o ~/tuesday_full ~/monday ~/tuesday
pg_combinebackup: warning: Modifying files or starting the cluster from the
synthetic backup might invalidate one or more of the input backups
pg_combinebackup: hint: It is recommended to move the synthetic backup to
another file system or machine before performing changes and/or starting
the cluster

* Check files with link counter set to 1:

$ find ~/tuesday_full -type f -links 1
/home/vagrant/tuesday_full/backup_label
/home/vagrant/tuesday_full/base/5/1259
/home/vagrant/tuesday_full/base/5/2619
/home/vagrant/tuesday_full/base/5/16394
/home/vagrant/tuesday_full/backup_manifest

* Check files with link counter set to 2, focusing on the OIDs of the
tables:

$ find ~/tuesday_full -type f -links 2 | grep -P '16388|16391|16394'
/home/vagrant/tuesday_full/base/5/16388
/home/vagrant/tuesday_full/base/5/16391
/home/vagrant/tuesday_full/base/5/16388_vm
/home/vagrant/tuesday_full/base/5/16388_fsm
/home/vagrant/tuesday_full/base/5/16391_vm
/home/vagrant/tuesday_full/base/5/16391_fsm
/home/vagrant/tuesday_full/base/5/16394_vm
/home/vagrant/tuesday_full/base/5/16394_fsm

(There are many other linked files).

Attachments:

v1-0001-pg_combinebackup-add-support-for-hard-links.patch (application/octet-stream)
From e32e636cff9910f660c5bcaa4c2cb2e9586f41b9 Mon Sep 17 00:00:00 2001
From: Israel Barth Rubio <barthisrael@gmail.com>
Date: Tue, 14 Jan 2025 19:49:35 +0000
Subject: [PATCH v1] pg_combinebackup: add support for hard links

Up to now, pg_combinebackup reconstructs incremental files, if needed,
and otherwise copies them from any of the input backups to the output
directory. That copy mechanism can use different methods, depending on
the argument specified by the user.

This commit adds support for a new "copy method": hard links
(-k/--link). When using that mode, instead of copying unmodified files
from the input backups to the output directory, pg_combinebackup
creates the files as hard links from the output directory to the input
backups.

The new link method might speed up the reconstruction of the synthetic
backup (no file copy) and reduce disk usage taken by the synthetic
backup. The benefits depend on the modification pattern of files in
PGDATA between backups, imposed by the workload on Postgres.

This feature requires that the input backups plus the output directory
are in the same file system. Also, caution is required from the user
when modifying or starting the cluster from a synthetic backup, as that
might invalidate one or more of the input backups.

Signed-off-by: Israel Barth Rubio <barthisrael@gmail.com>
---
 doc/src/sgml/ref/pg_combinebackup.sgml      | 56 ++++++++++++++++++++-
 src/bin/pg_combinebackup/copy_file.c        | 25 +++++++++
 src/bin/pg_combinebackup/copy_file.h        |  1 +
 src/bin/pg_combinebackup/pg_combinebackup.c | 18 ++++++-
 src/bin/pg_combinebackup/t/003_timeline.pl  | 35 +++++++++++--
 5 files changed, 128 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 091982f62a..49be0217c5 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -137,13 +137,38 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-k</option></term>
+      <term><option>--link</option></term>
+      <listitem>
+       <para>
+        Use hard links instead of copying files to the synthetic backup.
+        Reconstruction of the synthetic backup might be faster (no file copying)
+        and use less disk space.
+       </para>
+
+       <para>
+        Requires that the input backups and the output directory are in the
+        same file system.
+       </para>
+
+       <para>
+        If a backup manifest is not available or does not contain a checksum
+        of the right type, hard links will still be created, but the file
+        will also be read block-by-block for the checksum calculation.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--clone</option></term>
       <listitem>
        <para>
         Use efficient file cloning (also known as <quote>reflinks</quote> on
         some systems) instead of copying files to the new data directory,
-        which can result in near-instantaneous copying of the data files.
+        which can result in near-instantaneous copying of the data files, giving
+        the speed advantages of <option>-k</option>/<option>--link</option>
+        while leaving the input backups untouched.
        </para>
 
        <para>
@@ -167,7 +192,8 @@ PostgreSQL documentation
       <listitem>
        <para>
         Perform regular file copy.  This is the default.  (See also
-        <option>--copy-file-range</option> and <option>--clone</option>.)
+        <option>--copy-file-range</option>, <option>--clone</option> and
+        <option>-k</option>/<option>--link</option>.)
        </para>
       </listitem>
      </varlistentry>
@@ -308,6 +334,32 @@ PostgreSQL documentation
   </para>
  </refsect1>
 
+ <refsect1>
+  <title>Notes</title>
+
+  <para>
+   When <application>pg_combinebackup</application> is run with
+   <option>-k</option>/<option>--link</option>, the output synthetic backup may
+   share several files with the input backups because of the hard links it
+   creates. Any changes performed on files that were linked between any of the
+   input backups and the synthetic backup will be reflected in both backups.
+  </para>
+
+  <caution>
+   <para>
+    Modifying files or starting a cluster from a synthetic backup that was
+    reconstructed with <option>-k</option>/<option>--link</option> might
+    invalidate one or more of the input backups that were used to reconstruct
+    it.
+   </para>
+
+   <para>
+    It is recommended to move the synthetic backup to another file system or
+    machine before performing changes and/or starting the cluster.
+   </para>
+  </caution>
+ </refsect1>
+
  <refsect1>
   <title>See Also</title>
 
diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index 4e27814839..8687e23fde 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -40,6 +40,9 @@ static void copy_file_copyfile(const char *src, const char *dst,
 							   pg_checksum_context *checksum_ctx);
 #endif
 
+static void copy_file_link(const char *src, const char *dest,
+						   pg_checksum_context *checksum_ctx);
+
 /*
  * Copy a regular file, optionally computing a checksum, and emitting
  * appropriate debug messages. But if we're in dry-run mode, then just emit
@@ -93,6 +96,10 @@ copy_file(const char *src, const char *dst,
 			strategy_implementation = copy_file_copyfile;
 			break;
 #endif
+		case COPY_METHOD_LINK:
+			strategy_name = "link";
+			strategy_implementation = copy_file_link;
+			break;
 	}
 
 	if (dry_run)
@@ -304,3 +311,21 @@ copy_file_copyfile(const char *src, const char *dst,
 	checksum_file(src, checksum_ctx);
 }
 #endif							/* WIN32 */
+
+/*
+ * copy_file_link
+ * 		Hard-links a file from src to dest.
+ *
+ * If needed, also reads the file and calculates the checksum.
+ */
+static void
+copy_file_link(const char *src, const char *dest,
+			   pg_checksum_context *checksum_ctx)
+{
+	if (link(src, dest) < 0)
+		pg_fatal("error while linking file from \"%s\" to \"%s\": %m",
+				 src, dest);
+
+	/* if needed, calculate checksum of the file */
+	checksum_file(src, checksum_ctx);
+}
diff --git a/src/bin/pg_combinebackup/copy_file.h b/src/bin/pg_combinebackup/copy_file.h
index 92f104115b..5a8517629c 100644
--- a/src/bin/pg_combinebackup/copy_file.h
+++ b/src/bin/pg_combinebackup/copy_file.h
@@ -25,6 +25,7 @@ typedef enum CopyMethod
 #ifdef WIN32
 	COPY_METHOD_COPYFILE,
 #endif
+	COPY_METHOD_LINK,
 } CopyMethod;
 
 extern void copy_file(const char *src, const char *dst,
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574f..76f7a8704e 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -135,6 +135,7 @@ main(int argc, char *argv[])
 		{"no-sync", no_argument, NULL, 'N'},
 		{"output", required_argument, NULL, 'o'},
 		{"tablespace-mapping", required_argument, NULL, 'T'},
+		{"link", no_argument, NULL, 'k'},
 		{"manifest-checksums", required_argument, NULL, 1},
 		{"no-manifest", no_argument, NULL, 2},
 		{"sync-method", required_argument, NULL, 3},
@@ -172,7 +173,7 @@ main(int argc, char *argv[])
 	opt.copy_method = COPY_METHOD_COPY;
 
 	/* process command-line options */
-	while ((c = getopt_long(argc, argv, "dnNo:T:",
+	while ((c = getopt_long(argc, argv, "dnNo:T:k",
 							long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -193,6 +194,9 @@ main(int argc, char *argv[])
 			case 'T':
 				add_tablespace_mapping(&opt, optarg);
 				break;
+			case 'k':
+				opt.copy_method = COPY_METHOD_LINK;
+				break;
 			case 1:
 				if (!pg_checksum_parse_type(optarg,
 											&opt.manifest_checksums))
@@ -424,6 +428,17 @@ main(int argc, char *argv[])
 		}
 	}
 
+	/* Warn about the possibility of compromising the backups in link mode */
+	if (opt.copy_method == COPY_METHOD_LINK)
+	{
+		pg_log_warning("Modifying files or starting the cluster from the "
+					   "synthetic backup might invalidate one or more of the "
+					   "input backups");
+		pg_log_warning_hint("It is recommended to move the synthetic backup to "
+							"another file system or machine before performing "
+							"changes and/or starting the cluster");
+	}
+
 	/* It's a success, so don't remove the output directories. */
 	reset_directory_cleanup_list();
 	exit(0);
@@ -766,6 +781,7 @@ help(const char *progname)
 	printf(_("  -o, --output=DIRECTORY    output directory\n"));
 	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
 			 "                            relocate tablespace in OLDDIR to NEWDIR\n"));
+	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("      --clone               clone (reflink) files instead of copying\n"));
 	printf(_("      --copy                copy files (default)\n"));
 	printf(_("      --copy-file-range     copy using copy_file_range() system call\n"));
diff --git a/src/bin/pg_combinebackup/t/003_timeline.pl b/src/bin/pg_combinebackup/t/003_timeline.pl
index ad2dd04872..09d37a604a 100644
--- a/src/bin/pg_combinebackup/t/003_timeline.pl
+++ b/src/bin/pg_combinebackup/t/003_timeline.pl
@@ -70,6 +70,20 @@ $node2->command_ok(
 	],
 	"incremental backup from node2");
 
+# Verify all the backups at this point, when mode is --link.
+# In that case, the next init_from_backup runs pg_combinebackup with --link and
+# then changes the linked postgresql.conf, causing future executions of
+# pg_verifybackup to fail because of a mismatch in the size of that file.
+if ($mode eq "--link")
+{
+	for my $backup_name (qw(backup1 backup2 backup3))
+	{
+		$node1->command_ok(
+			[ 'pg_verifybackup', $node1->backup_dir . '/' . $backup_name ],
+			"verify backup $backup_name");
+	}
+}
+
 # Restore the incremental backup and use it to create a new node.
 my $node3 = PostgreSQL::Test::Cluster->new('node3');
 $node3->init_from_backup(
@@ -89,12 +103,25 @@ select string_agg(a::text, ':'), string_agg(b, ':') from mytable;
 EOM
 is($result, '1:2:4:5|aardvark:beetle:dingo:elephant');
 
-# Let's also verify all the backups.
+# Let's also verify all the backups, now for any mode.
 for my $backup_name (qw(backup1 backup2 backup3))
 {
-	$node1->command_ok(
-		[ 'pg_verifybackup', $node1->backup_dir . '/' . $backup_name ],
-		"verify backup $backup_name");
+	my $cmd = [ 'pg_verifybackup', $node1->backup_dir . '/' . $backup_name ];
+	my $msg = "verify backup $backup_name";
+
+	# We expect verify backup3 to fail, when mode is --link. For more details,
+	# refer to a previous comment related with pg_verifybackup execution.
+	if ($mode eq "--link" && $backup_name eq "backup3")
+	{
+		$node1->command_fails_like(
+			$cmd,
+			qr/"postgresql\.conf" has size \d+ on disk but size \d+ in the manifest/,
+			$msg);
+	}
+	else
+	{
+		$node1->command_ok($cmd, $msg);
+	}
 }
 
 # OK, that's all.
-- 
2.43.5

#2 Israel Barth Rubio
barthisrael@gmail.com
In reply to: Israel Barth Rubio (#1)
Re: Add -k/--link option to pg_combinebackup

One concern that I have about this --link mode, which Euler Taveira
also raised: it can invalidate the original backups if the user
modifies or starts the synthetic backup without moving it to another
file system or machine.

At the moment, pg_combinebackup issues a warning about that problem
right before it finishes execution. There is also a warning in the
docs. But some users don't pay proper attention to the output and/or
docs.

I wonder if we could do something like pg_upgrade does (renaming a
global file in the original cluster), so that we increase the
protection against that issue?

#3 Robert Haas
robertmhaas@gmail.com
In reply to: Israel Barth Rubio (#1)
Re: Add -k/--link option to pg_combinebackup

On Wed, Jan 15, 2025 at 12:42 PM Israel Barth Rubio
<barthisrael@gmail.com> wrote:

> With the current implementation of pg_combinebackup, we have a few
> copy methods: --clone, --copy and --copy-file-range. Each of them
> performs an actual file copy in the file system, which implies, among
> other things, additional disk usage.
>
> While discussing with some people, e.g. Robert Haas and Martín Marqués,
> about possible ways to improve pg_combinebackup performance and/or
> reduce disk usage taken by the synthetic backup, there was a thought
> of using hard links.
>
> I'm submitting a patch in this thread which introduces the -k/--link
> option to pg_combinebackup, making it similar to the options exposed
> by pg_upgrade.

In general, +1, although I think that the patch needs polishing in a
bunch of areas.

Originally, I thought what we wanted was something like a --in-place
option to pg_combinebackup, allowing the output directory to be the
same as one of the input directories. However, I now think that what
this patch does is likely better, and this patch is a lot simpler to
write than that one would be. With the --in-place option, if I'm
combining backup1 and backup2, I must write either "pg_combinebackup
backup1 backup2 -o backup1" or "pg_combinebackup backup1 backup2 -o
backup2". In either case, I can skip whole-file copies from one of the
two source directories -- whichever one happens to be the output
directory. However, if I write "pg_combinebackup --link backup1
backup2 -o result", I can skip *all* whole-file copies from *every*
input directory, which seems a lot nicer.

Furthermore, if all the input directories are staging directories,
basically copies of original backups stored elsewhere, then the fact
that those directories get trashed is of no concern to me. In fact,
they don't even get trashed if pg_combinebackup is interrupted partway
through, because I can just remove the output directory and recreate
it and everything is fine. With an --in-place option, that would be
trickier.

Regarding the patch itself, I think we need to rethink the test case
changes in particular. As written, the patch won't do any testing of
link mode unless the user sets PG_TEST_PG_COMBINEBACKUP_MODE=--link;
and if they do set that option, then we won't test anything else.
Also, even with that environment variable set, we'll only test --link
mode a bit ... accidentally. The patch doesn't really do anything to
make sure that link mode actually does what it's intended to do. It
only adapts the existing tests not to fail. I think it would be better
to decide that you're not supposed to set
PG_TEST_PG_COMBINEBACKUP_MODE=--link, and then write some new test,
not depending on PG_TEST_PG_COMBINEBACKUP_MODE, that specifically
verifies that link mode behaves as expected.

After all, link mode is noticeably different from the other copy
modes. Those should all produce equivalent results, or fail outright,
we suppose. This produces a different user-visible result, so we
probably ought to test that we get that result. For example, we might
check that certain files end up with a link count of 1 or 2, as
appropriate.

Does link() work on Windows?

I'm not sure what to do about the issue that using --link trashes the
input cluster if you subsequently start the database or otherwise
modify any hard-linked files. Keep in mind that, for command-line
arguments other than the first, these are incremental backups, and you
already can't run postgres on those directories. Only the first input
directory, which is a full backup not an incremental, is a potential
target for an unexpected database start. I'm tentatively inclined to
think we shouldn't modify the input directories and just emit a
warning like:

warning: --link mode was used; any modifications to the output
directory may destructively modify input directories

...but maybe that's not the right idea. Happy to hear other opinions.

--
Robert Haas
EDB: http://www.enterprisedb.com

#4 Israel Barth Rubio
barthisrael@gmail.com
In reply to: Robert Haas (#3)
1 attachment(s)
Re: Add -k/--link option to pg_combinebackup

Hello Robert,

> In general, +1, although I think that the patch needs polishing in a
> bunch of areas.

Thanks for your review!

> Originally, I thought what we wanted was something like a --in-place
> option to pg_combinebackup, allowing the output directory to be the
> same as one of the input directories. However, I now think that what
> this patch does is likely better, and this patch is a lot simpler to
> write than that one would be. With the --in-place option, if I'm
> combining backup1 and backup2, I must write either "pg_combinebackup
> backup1 backup2 -o backup1" or "pg_combinebackup backup1 backup2 -o
> backup2". In either case, I can skip whole-file copies from one of the
> two source directories -- whichever one happens to be the output
> directory. However, if I write "pg_combinebackup --link backup1
> backup2 -o result", I can skip *all* whole-file copies from *every*
> input directory, which seems a lot nicer.
>
> Furthermore, if all the input directories are staging directories,
> basically copies of original backups stored elsewhere, then the fact
> that those directories get trashed is of no concern to me. In fact,
> they don't even get trashed if pg_combinebackup is interrupted partway
> through, because I can just remove the output directory and recreate
> it and everything is fine. With an --in-place option, that would be
> trickier.

I see! I tried not to reinvent the wheel while writing this, so I
replicated in pg_combinebackup the same behavior implemented in
pg_upgrade. It turns out the outcomes were convincing.

> Regarding the patch itself, I think we need to rethink the test case
> changes in particular. As written, the patch won't do any testing of
> link mode unless the user sets PG_TEST_PG_COMBINEBACKUP_MODE=--link;
> and if they do set that option, then we won't test anything else.
> Also, even with that environment variable set, we'll only test --link
> mode a bit ... accidentally. The patch doesn't really do anything to
> make sure that link mode actually does what it's intended to do. It
> only adapts the existing tests not to fail. I think it would be better
> to decide that you're not supposed to set
> PG_TEST_PG_COMBINEBACKUP_MODE=--link, and then write some new test,
> not depending on PG_TEST_PG_COMBINEBACKUP_MODE, that specifically
> verifies that link mode behaves as expected.
>
> After all, link mode is noticeably different from the other copy
> modes. Those should all produce equivalent results, or fail outright,
> we suppose. This produces a different user-visible result, so we
> probably ought to test that we get that result. For example, we might
> check that certain files end up with a link count of 1 or 2, as
> appropriate.

That makes sense to me too. I'm submitting a new patch which includes
a new file, 010_links.pl. That test generates full and incremental
backups, then combines them with --link mode and ensures that the hard
link count for all files under PGDATA is as expected, based on the
operations that were applied in the cluster.

I kept the "support" for PG_TEST_PG_COMBINEBACKUP_MODE=--link around,
just in case someone wants to run the test suite with the --link mode
and verify those outcomes. At least now we have a proper test for
checking --link, which always runs independently of the env variable.

> Does link() work on Windows?

link() is a POSIX function and is not natively available on Windows.
The equivalent function on Windows is CreateHardLink.

However, in Postgres link() works on either operating system because
we implement a link() wrapper around the call to CreateHardLink.
That's coded in the file src/port/win32link.c

> I'm not sure what to do about the issue that using --link trashes the
> input cluster if you subsequently start the database or otherwise
> modify any hard-linked files. Keep in mind that, for command-line
> arguments other than the first, these are incremental backups, and you
> already can't run postgres on those directories. Only the first input
> directory, which is a full backup not an incremental, is a potential
> target for an unexpected database start. I'm tentatively inclined to
> think we shouldn't modify the input directories and just emit a
> warning like:
>
> warning: --link mode was used; any modifications to the output
> directory may destructively modify input directories
>
> ...but maybe that's not the right idea. Happy to hear other opinions.
directory may destructively modify input directories

The previous patch already had a warning, but I liked your wording, so
I slightly changed the warning and the docs in the new version of the
patch.

Best regards,
Israel.

Attachments:

v2-0001-pg_combinebackup-add-support-for-hard-links.patch (application/octet-stream)
From 668772af9bcce2c266bb1d135aa38b3259b83859 Mon Sep 17 00:00:00 2001
From: Israel Barth Rubio <barthisrael@gmail.com>
Date: Tue, 14 Jan 2025 19:49:35 +0000
Subject: [PATCH v2] pg_combinebackup: add support for hard links

Up to now, pg_combinebackup reconstructs incremental files, if needed,
and otherwise copies them from any of the input backups to the output
directory. That copy mechanism can use different methods, depending on
the argument specified by the user.

This commit adds support for a new "copy method": hard links
(-k/--link). When using that mode, instead of copying unmodified files
from the input backups to the output directory, pg_combinebackup
creates the files as hard links from the output directory to the input
backups.

The new link method might speed up the reconstruction of the synthetic
backup (no file copy) and reduce disk usage taken by the synthetic
backup. The benefits depend on the modification pattern of files in
PGDATA between backups, imposed by the workload on Postgres.

This feature requires that the input backups plus the output directory
are in the same file system. Also, caution is required from the user
when modifying or starting the cluster from a synthetic backup, as that
might invalidate one or more of the input backups.

Signed-off-by: Israel Barth Rubio <barthisrael@gmail.com>
---
 doc/src/sgml/ref/pg_combinebackup.sgml      |  57 +++++++-
 src/bin/pg_combinebackup/copy_file.c        |  25 ++++
 src/bin/pg_combinebackup/copy_file.h        |   1 +
 src/bin/pg_combinebackup/pg_combinebackup.c |  19 ++-
 src/bin/pg_combinebackup/t/003_timeline.pl  |  35 ++++-
 src/bin/pg_combinebackup/t/010_links.pl     | 139 ++++++++++++++++++++
 6 files changed, 269 insertions(+), 7 deletions(-)
 create mode 100644 src/bin/pg_combinebackup/t/010_links.pl

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 091982f62a..185cf6fff4 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -137,13 +137,38 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-k</option></term>
+      <term><option>--link</option></term>
+      <listitem>
+       <para>
+        Use hard links instead of copying files to the synthetic backup.
+        Reconstruction of the synthetic backup might be faster (no file copying)
+        and use less disk space.
+       </para>
+
+       <para>
+        Requires that the input backups and the output directory are in the
+        same file system.
+       </para>
+
+       <para>
+        If a backup manifest is not available or does not contain a checksum
+        of the right type, hard links will still be created, but the file
+        will also be read block-by-block for the checksum calculation.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--clone</option></term>
       <listitem>
        <para>
         Use efficient file cloning (also known as <quote>reflinks</quote> on
         some systems) instead of copying files to the new data directory,
-        which can result in near-instantaneous copying of the data files.
+        which can result in near-instantaneous copying of the data files, giving
+        the speed advantages of <option>-k</option>/<option>--link</option>
+        while leaving the input backups untouched.
        </para>
 
        <para>
@@ -167,7 +192,8 @@ PostgreSQL documentation
       <listitem>
        <para>
         Perform regular file copy.  This is the default.  (See also
-        <option>--copy-file-range</option> and <option>--clone</option>.)
+        <option>--copy-file-range</option>, <option>--clone</option> and
+        <option>-k</option>/<option>--link</option>.)
        </para>
       </listitem>
      </varlistentry>
@@ -308,6 +334,33 @@ PostgreSQL documentation
   </para>
  </refsect1>
 
+ <refsect1>
+  <title>Notes</title>
+
+  <para>
+   When <application>pg_combinebackup</application> is run with
+   <option>-k</option>/<option>--link</option>, the output synthetic backup may
+   share several files with the input backups because of the hard links it
+   creates. Any changes performed on files that were linked between any of the
+   input backups and the synthetic backup will be reflected in both places.
+  </para>
+
+  <caution>
+   <para>
+    Modifying files or starting a cluster from a synthetic backup that was
+    reconstructed with <option>-k</option>/<option>--link</option> might
+    invalidate one or more of the input backups that were used to reconstruct
+    it.
+   </para>
+
+   <para>
+    If the input directories are not staging directories, it is recommended to
+    move the synthetic backup to another file system or machine before
+    performing changes to its files and/or starting the cluster.
+   </para>
+  </caution>
+ </refsect1>
+
  <refsect1>
   <title>See Also</title>
 
diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index 4e27814839..8687e23fde 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -40,6 +40,9 @@ static void copy_file_copyfile(const char *src, const char *dst,
 							   pg_checksum_context *checksum_ctx);
 #endif
 
+static void copy_file_link(const char *src, const char *dest,
+						   pg_checksum_context *checksum_ctx);
+
 /*
  * Copy a regular file, optionally computing a checksum, and emitting
  * appropriate debug messages. But if we're in dry-run mode, then just emit
@@ -93,6 +96,10 @@ copy_file(const char *src, const char *dst,
 			strategy_implementation = copy_file_copyfile;
 			break;
 #endif
+		case COPY_METHOD_LINK:
+			strategy_name = "link";
+			strategy_implementation = copy_file_link;
+			break;
 	}
 
 	if (dry_run)
@@ -304,3 +311,21 @@ copy_file_copyfile(const char *src, const char *dst,
 	checksum_file(src, checksum_ctx);
 }
 #endif							/* WIN32 */
+
+/*
+ * copy_file_link
+ * 		Hard-links a file from src to dest.
+ *
+ * If needed, also reads the file and calculates the checksum.
+ */
+static void
+copy_file_link(const char *src, const char *dest,
+			   pg_checksum_context *checksum_ctx)
+{
+	if (link(src, dest) < 0)
+		pg_fatal("error while linking file from \"%s\" to \"%s\": %m",
+				 src, dest);
+
+	/* if needed, calculate checksum of the file */
+	checksum_file(src, checksum_ctx);
+}
diff --git a/src/bin/pg_combinebackup/copy_file.h b/src/bin/pg_combinebackup/copy_file.h
index 92f104115b..5a8517629c 100644
--- a/src/bin/pg_combinebackup/copy_file.h
+++ b/src/bin/pg_combinebackup/copy_file.h
@@ -25,6 +25,7 @@ typedef enum CopyMethod
 #ifdef WIN32
 	COPY_METHOD_COPYFILE,
 #endif
+	COPY_METHOD_LINK,
 } CopyMethod;
 
 extern void copy_file(const char *src, const char *dst,
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574f..16b6715d7c 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -135,6 +135,7 @@ main(int argc, char *argv[])
 		{"no-sync", no_argument, NULL, 'N'},
 		{"output", required_argument, NULL, 'o'},
 		{"tablespace-mapping", required_argument, NULL, 'T'},
+		{"link", no_argument, NULL, 'k'},
 		{"manifest-checksums", required_argument, NULL, 1},
 		{"no-manifest", no_argument, NULL, 2},
 		{"sync-method", required_argument, NULL, 3},
@@ -172,7 +173,7 @@ main(int argc, char *argv[])
 	opt.copy_method = COPY_METHOD_COPY;
 
 	/* process command-line options */
-	while ((c = getopt_long(argc, argv, "dnNo:T:",
+	while ((c = getopt_long(argc, argv, "dnNo:T:k",
 							long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -193,6 +194,9 @@ main(int argc, char *argv[])
 			case 'T':
 				add_tablespace_mapping(&opt, optarg);
 				break;
+			case 'k':
+				opt.copy_method = COPY_METHOD_LINK;
+				break;
 			case 1:
 				if (!pg_checksum_parse_type(optarg,
 											&opt.manifest_checksums))
@@ -424,6 +428,18 @@ main(int argc, char *argv[])
 		}
 	}
 
+	/* Warn about the possibility of compromising the backups in link mode */
+	if (opt.copy_method == COPY_METHOD_LINK)
+	{
+		pg_log_warning("--link mode was used; any modifications to the output "
+					   "directory may destructively modify input directories");
+		pg_log_warning_hint("If the input directories are not staging "
+							"directories, it is recommended to move the output "
+							"directory to another file system or machine "
+							"before performing changes to the files and/or "
+							"starting the cluster");
+	}
+
 	/* It's a success, so don't remove the output directories. */
 	reset_directory_cleanup_list();
 	exit(0);
@@ -766,6 +782,7 @@ help(const char *progname)
 	printf(_("  -o, --output=DIRECTORY    output directory\n"));
 	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
 			 "                            relocate tablespace in OLDDIR to NEWDIR\n"));
+	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("      --clone               clone (reflink) files instead of copying\n"));
 	printf(_("      --copy                copy files (default)\n"));
 	printf(_("      --copy-file-range     copy using copy_file_range() system call\n"));
diff --git a/src/bin/pg_combinebackup/t/003_timeline.pl b/src/bin/pg_combinebackup/t/003_timeline.pl
index 0205a59f92..cce551276c 100644
--- a/src/bin/pg_combinebackup/t/003_timeline.pl
+++ b/src/bin/pg_combinebackup/t/003_timeline.pl
@@ -81,6 +81,20 @@ $node2->command_ok(
 	],
 	"incremental backup from node2");
 
+# Verify all the backups at this point, when mode is --link.
+# In that case, the next init_from_backup runs pg_combinebackup with --link and
+# then changes the linked postgresql.conf, causing future executions of
+# pg_verifybackup to fail because of a mismatch in the size of that file.
+if ($mode eq "--link")
+{
+	for my $backup_name (qw(backup1 backup2 backup3))
+	{
+		$node1->command_ok(
+			[ 'pg_verifybackup', $node1->backup_dir . '/' . $backup_name ],
+			"verify backup $backup_name");
+	}
+}
+
 # Restore the incremental backup and use it to create a new node.
 my $node3 = PostgreSQL::Test::Cluster->new('node3');
 $node3->init_from_backup(
@@ -100,12 +114,25 @@ select string_agg(a::text, ':'), string_agg(b, ':') from mytable;
 EOM
 is($result, '1:2:4:5|aardvark:beetle:dingo:elephant');
 
-# Let's also verify all the backups.
+# Let's also verify all the backups, now for any mode.
 for my $backup_name (qw(backup1 backup2 backup3))
 {
-	$node1->command_ok(
-		[ 'pg_verifybackup', $node1->backup_dir . '/' . $backup_name ],
-		"verify backup $backup_name");
+	my $cmd = [ 'pg_verifybackup', $node1->backup_dir . '/' . $backup_name ];
+	my $msg = "verify backup $backup_name";
+
+	# We expect verifying backup3 to fail when mode is --link. For more
+	# details, refer to the earlier comment about pg_verifybackup execution.
+	if ($mode eq "--link" && $backup_name eq "backup3")
+	{
+		$node1->command_fails_like(
+			$cmd,
+			qr/"postgresql\.conf" has size \d+ on disk but size \d+ in the manifest/,
+			$msg);
+	}
+	else
+	{
+		$node1->command_ok($cmd, $msg);
+	}
 }
 
 # OK, that's all.
diff --git a/src/bin/pg_combinebackup/t/010_links.pl b/src/bin/pg_combinebackup/t/010_links.pl
new file mode 100644
index 0000000000..3b5e8dc71d
--- /dev/null
+++ b/src/bin/pg_combinebackup/t/010_links.pl
@@ -0,0 +1,139 @@
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+#
+# This test aims to validate that hard-links are created as expected in the
+# output directory, when running pg_combinebackup with --link mode.
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use File::Find;
+use File::Spec;
+use File::stat;
+
+# Set up a new database instance.
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(has_archiving => 1, allows_streaming => 1);
+$primary->append_conf('postgresql.conf', 'summarize_wal = on');
+$primary->start;
+
+# Create some tables.
+$primary->safe_psql('postgres', <<EOM);
+CREATE TABLE test_1 AS SELECT generate_series(1, 1000000);
+CREATE TABLE test_2 AS SELECT generate_series(1, 1000000);
+CREATE TABLE test_3 AS SELECT generate_series(1, 1000000);
+EOM
+
+# Fetch information about the data files.
+my $query = "SELECT pg_relation_filepath(oid) FROM pg_class WHERE relname = '%s';";
+
+my $pg_attribute_path = $primary->safe_psql('postgres', sprintf($query, 'pg_attribute'));
+note "pg_attribute path is $pg_attribute_path";
+my $pg_class_path = $primary->safe_psql('postgres', sprintf($query, 'pg_class'));
+note "pg_class path is $pg_class_path";
+my $pg_statistic_path = $primary->safe_psql('postgres', sprintf($query, 'pg_statistic'));
+note "pg_statistic path is $pg_statistic_path";
+my $test_1_path = $primary->safe_psql('postgres', sprintf($query, 'test_1'));
+note "test_1 path is $test_1_path";
+my $test_2_path = $primary->safe_psql('postgres', sprintf($query, 'test_2'));
+note "test_2 path is $test_2_path";
+my $test_3_path = $primary->safe_psql('postgres', sprintf($query, 'test_3'));
+note "test_3 path is $test_3_path";
+
+# Take a full backup.
+my $backup1path = $primary->backup_dir . '/backup1';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup1path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none'
+	],
+	"full backup");
+
+# Perform an update that touches the whole file.
+$primary->safe_psql('postgres', <<EOM);
+UPDATE test_2
+SET generate_series = generate_series + 1;
+EOM
+
+# Perform an insert that touches a single page.
+$primary->safe_psql('postgres', <<EOM);
+INSERT INTO test_3 (generate_series) VALUES (1);
+EOM
+
+# Take an incremental backup.
+my $backup2path = $primary->backup_dir . '/backup2';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup2path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none',
+		'--incremental' => $backup1path . '/backup_manifest'
+	],
+	"incremental backup");
+
+# Restore the incremental backup and use it to create a new node.
+my $restore = PostgreSQL::Test::Cluster->new('restore');
+$restore->init_from_backup(
+	$primary, 'backup2',
+	combine_with_prior => ['backup1'],
+	combine_mode => '--link');
+
+# Ensure files have the expected count of hard links.
+# We expect all files to have 2 hard-links in this case, except for:
+# 
+# * backup_label and backup_manifest: the backups were taken in different
+#   LSNs and with different contents on the test tables.
+# * pg_attribute, pg_class and pg_statistic might have been modified in the
+#   meantime, so they can have 1 or 2 hard-links.
+# * data file of table test_3 is different because of changes in a page.
+
+# Directory to traverse
+my $backup_dest = $restore->data_dir;
+
+# Set of non-linked files (these are the ones with 1 hard link)
+my %non_linked_files = map { $_ => 1 } (
+    File::Spec->catfile($restore->data_dir, $test_3_path),
+    File::Spec->catfile($restore->data_dir, 'backup_manifest'),
+    File::Spec->catfile($restore->data_dir, 'backup_label')
+);
+
+# Set of linked or non-linked files (these are the ones that may be with 1 or 2
+# hard links)
+my %linked_or_non_linked_files = map { $_ => 1 } (
+    File::Spec->catfile($restore->data_dir, $pg_attribute_path),
+    File::Spec->catfile($restore->data_dir, $pg_class_path),
+    File::Spec->catfile($restore->data_dir, $pg_statistic_path),
+);
+
+# Recursively traverse the directory
+find(sub {
+    my $file = $File::Find::name;
+    
+    # Skip directories
+    return if -d $file;
+    
+    # Get the file's stat information
+    my $stat = stat($file);
+    my $nlink_count = $stat->nlink;
+
+    # Check if the file's exact path matches any in the non_linked_files set
+    if (exists $non_linked_files{$file}) {
+        # Non-linked files should have 1 hard link
+        ok($nlink_count == 1, "File '$file' has 1 hard link");
+    } elsif (exists $linked_or_non_linked_files{$file}) {
+        # These files can have either 1 or 2 hard links
+        ok($nlink_count == 1 || $nlink_count == 2, "File '$file' has 1 or 2 hard links");
+    } else {
+        # Other files should have 2 hard links
+        ok($nlink_count == 2, "File '$file' has 2 hard links");
+    }
+}, $backup_dest);
+
+# OK, that's all.
+done_testing();
-- 
2.43.5

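The caveat above follows directly from hard-link semantics: both directory entries name the same inode, so a write through either name is visible through the other. A minimal shell sketch of what --link mode implies per file, with made-up file names (`stat -c %h` is the GNU coreutils spelling of the link count):

```shell
cd "$(mktemp -d)"
printf 'original' > full_backup_file        # stands in for a file in an input backup
ln full_backup_file combined_file           # one hard link per file, as --link does
nlink=$(stat -c %h combined_file)
echo "nlink=$nlink"                         # prints nlink=2: one inode, two names
printf 'modified' > combined_file           # e.g. the server rewriting a file on startup
cat full_backup_file                        # prints "modified": the input backup changed too
```

This is why starting a cluster from the synthetic backup can silently invalidate the full backup it was combined from.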
#5Robert Haas
robertmhaas@gmail.com
In reply to: Israel Barth Rubio (#4)
Re: Add -k/--link option to pg_combinebackup

Some more review:

+        Use hard links instead of copying files to the synthetic backup.
+        Reconstruction of the synthetic backup might be faster (no file copying)
+        and use less disk space.

I think it would be good to add a caveat at the end of this sentence:
and use less disk space, but care must be taken when using the output
directory, because any modifications to that directory (for example,
starting the server) can also affect the input directories. Likewise,
changes to the input directories (for example, starting the server on
the full backup) could affect the output directory. Thus, this option
is best used when the input directories are only copies that will be
removed after <application>pg_combinebackup</application> has
completed.

-        which can result in near-instantaneous copying of the data files.
+        which can result in near-instantaneous copying of the data files, giving
+        the speed advantages of <option>-k</option>/<option>--link</option>
+        while leaving the input backups untouched.

I don't think we need this hunk.

+   When <application>pg_combinebackup</application> is run with
+   <option>-k</option>/<option>--link</option>, the output synthetic backup may
+   share several files with the input backups because of the hard links it
+   creates. Any changes performed on files that were linked between any of the
+   input backups and the synthetic backup, will be reflected on both places.

I don't like the wording of this for a couple of reasons. First,
"several" means more than 1 and less than all, but we really have no
idea: it could be none, one, some, many, or all. Second, we want to be
a little careful not to define what a hard link means here. Also,
these notes are a bit far from the documentation of --link, so
somebody might miss them. I suggest that we can probably just leave
out the notes you've added here if we add something like what I
suggested above to the documentation of --link itself.

+               pg_log_warning_hint("If the input directories are not staging "
+                                                       "directories, it is recommended to move the output"
+                                                       "directory to another file system or machine "
+                                                       "before performing changes to the files and/or "
+                                                       "starting the cluster");

I don't think "staging directories" is a sufficiently well-known and
unambiguous term that we should be using it here. Also, "move the
output directory" seems like fuzzy wording. You can really only move
files around within a filesystem; otherwise, you're making a copy and
deleting the original. I think we can just drop this hint entirely.
The primary warning message seems sufficient to me.

I'm still not a fan of the changes to 010_links.pl; let's drop those.

cfbot is fairly unhappy with your patch. See
http://cfbot.cputube.org/israel-barth.html or
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5508 --
I haven't looked into what is going wrong here, and there may well be
more than one thing, but nothing can be committed until this is all
green.

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Robert Haas
robertmhaas@gmail.com
In reply to: Israel Barth Rubio (#4)
Re: Add -k/--link option to pg_combinebackup

Oh, one more thing:

+        If a backup manifest is not available or does not contain checksum of
+        the right type, file cloning will be used to copy the file, but the
+        file will be also read block-by-block for the checksum calculation.

s/file cloning will be used to copy the file/hard links will still be created/

--
Robert Haas
EDB: http://www.enterprisedb.com

#7Israel Barth Rubio
barthisrael@gmail.com
In reply to: Robert Haas (#6)
1 attachment(s)
Re: Add -k/--link option to pg_combinebackup

Thanks for the review, Robert!
I applied all of your suggestions.

I'm still not a fan of the changes to 010_links.pl; let's drop those.

As discussed through a side chat, 010_links.pl is the new test file
which ensures the hard links are created as expected in the output
directory.

I've removed the changes that the v2 patch had in the other test files,
which were the ones you were suggesting to drop here.

cfbot is fairly unhappy with your patch. See
http://cfbot.cputube.org/israel-barth.html or
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5508 --
I haven't looked into what is going wrong here, and there may well be
more than one thing, but nothing can be committed until this is all
green.

There are some failures on compiler warnings and Windows jobs that
seem unrelated to this patch.

Besides that, there were failures when executing `010_links.pl` in the
Linux autoconf jobs. As discussed in a side chat with Euler and Robert,
and pointed out by Robert, Cirrus CI uses a very small
`--with-segsize-blocks`. My tests in the V2 patch were assuming 1GB
segments, so they were failing in Cirrus CI.

The patch V3, which I'm attaching in this reply, implements some logic
to handle any segment size.

Best regards,
Israel.

Attachments:

v3-0001-pg_combinebackup-add-support-for-hard-links.patch (application/octet-stream)
From d2d65fcc6a494ce9191706e6b05e57d31a35d96b Mon Sep 17 00:00:00 2001
From: Israel Barth Rubio <barthisrael@gmail.com>
Date: Wed, 19 Feb 2025 03:26:30 +0000
Subject: [PATCH v3] pg_combinebackup: add support for hard links

Up to now, pg_combinebackup reconstructs incremental files if needed,
and otherwise copies them from any of the input backups to the output
directory. That copy mechanism can use different methods, depending on
the argument specified by the user.

This commit adds support for a new "copy method": hard links
(-k/--link). When using that mode, instead of copying unmodified files
from the input backups to the output directory, pg_combinebackup
creates the files as hard links from the output directory to the input
backups.

The new link method might speed up the reconstruction of the synthetic
backup (no file copy) and reduce the disk space taken by the synthetic
backup. The benefits depend on the modification pattern of files in
PGDATA between backups, imposed by the workload on Postgres.

This feature requires that the input backups and the output directory
are in the same file system. Also, caution is required from the user
when modifying or starting the cluster from a synthetic backup, as that
might invalidate one or more of the input backups.

Signed-off-by: Israel Barth Rubio <barthisrael@gmail.com>
---
 doc/src/sgml/ref/pg_combinebackup.sgml      |  32 ++-
 src/bin/pg_combinebackup/copy_file.c        |  25 +++
 src/bin/pg_combinebackup/copy_file.h        |   1 +
 src/bin/pg_combinebackup/pg_combinebackup.c |  14 +-
 src/bin/pg_combinebackup/t/010_links.pl     | 212 ++++++++++++++++++++
 5 files changed, 282 insertions(+), 2 deletions(-)
 create mode 100644 src/bin/pg_combinebackup/t/010_links.pl

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 091982f62a..79c73ad460 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -137,6 +137,35 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-k</option></term>
+      <term><option>--link</option></term>
+      <listitem>
+       <para>
+        Use hard links instead of copying files to the synthetic backup.
+        Reconstruction of the synthetic backup might be faster (no file copying)
+        and use less disk space, but care must be taken when using the output
+        directory, because any modifications to that directory (for example,
+        starting the server) can also affect the input directories. Likewise,
+        changes to the input directories (for example, starting the server on
+        the full backup) could affect the output directory. Thus, this option
+        is best used when the input directories are only copies that will be
+        removed after <application>pg_combinebackup</application> has completed.
+       </para>
+
+       <para>
+        Requires that the input backups and the output directory are in the
+        same file system.
+       </para>
+
+       <para>
+        If a backup manifest is not available or does not contain a checksum of
+        the right type, hard links will still be created, but the file will
+        also be read block-by-block for the checksum calculation.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--clone</option></term>
       <listitem>
@@ -167,7 +196,8 @@ PostgreSQL documentation
       <listitem>
        <para>
         Perform regular file copy.  This is the default.  (See also
-        <option>--copy-file-range</option> and <option>--clone</option>.)
+        <option>--copy-file-range</option>, <option>--clone</option> and
+        <option>-k</option>/<option>--link</option>.)
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index 4e27814839..8687e23fde 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -40,6 +40,9 @@ static void copy_file_copyfile(const char *src, const char *dst,
 							   pg_checksum_context *checksum_ctx);
 #endif
 
+static void copy_file_link(const char *src, const char *dest,
+						   pg_checksum_context *checksum_ctx);
+
 /*
  * Copy a regular file, optionally computing a checksum, and emitting
  * appropriate debug messages. But if we're in dry-run mode, then just emit
@@ -93,6 +96,10 @@ copy_file(const char *src, const char *dst,
 			strategy_implementation = copy_file_copyfile;
 			break;
 #endif
+		case COPY_METHOD_LINK:
+			strategy_name = "link";
+			strategy_implementation = copy_file_link;
+			break;
 	}
 
 	if (dry_run)
@@ -304,3 +311,21 @@ copy_file_copyfile(const char *src, const char *dst,
 	checksum_file(src, checksum_ctx);
 }
 #endif							/* WIN32 */
+
+/*
+ * copy_file_link
+ * 		Hard-links a file from src to dest.
+ *
+ * If needed, also reads the file and calculates the checksum.
+ */
+static void
+copy_file_link(const char *src, const char *dest,
+			   pg_checksum_context *checksum_ctx)
+{
+	if (link(src, dest) < 0)
+		pg_fatal("error while linking file from \"%s\" to \"%s\": %m",
+				 src, dest);
+
+	/* if needed, calculate checksum of the file */
+	checksum_file(src, checksum_ctx);
+}
diff --git a/src/bin/pg_combinebackup/copy_file.h b/src/bin/pg_combinebackup/copy_file.h
index 92f104115b..5a8517629c 100644
--- a/src/bin/pg_combinebackup/copy_file.h
+++ b/src/bin/pg_combinebackup/copy_file.h
@@ -25,6 +25,7 @@ typedef enum CopyMethod
 #ifdef WIN32
 	COPY_METHOD_COPYFILE,
 #endif
+	COPY_METHOD_LINK,
 } CopyMethod;
 
 extern void copy_file(const char *src, const char *dst,
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574f..5a48876298 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -135,6 +135,7 @@ main(int argc, char *argv[])
 		{"no-sync", no_argument, NULL, 'N'},
 		{"output", required_argument, NULL, 'o'},
 		{"tablespace-mapping", required_argument, NULL, 'T'},
+		{"link", no_argument, NULL, 'k'},
 		{"manifest-checksums", required_argument, NULL, 1},
 		{"no-manifest", no_argument, NULL, 2},
 		{"sync-method", required_argument, NULL, 3},
@@ -172,7 +173,7 @@ main(int argc, char *argv[])
 	opt.copy_method = COPY_METHOD_COPY;
 
 	/* process command-line options */
-	while ((c = getopt_long(argc, argv, "dnNo:T:",
+	while ((c = getopt_long(argc, argv, "dnNo:T:k",
 							long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -193,6 +194,9 @@ main(int argc, char *argv[])
 			case 'T':
 				add_tablespace_mapping(&opt, optarg);
 				break;
+			case 'k':
+				opt.copy_method = COPY_METHOD_LINK;
+				break;
 			case 1:
 				if (!pg_checksum_parse_type(optarg,
 											&opt.manifest_checksums))
@@ -424,6 +428,13 @@ main(int argc, char *argv[])
 		}
 	}
 
+	/* Warn about the possibility of compromising the backups in link mode */
+	if (opt.copy_method == COPY_METHOD_LINK)
+	{
+		pg_log_warning("--link mode was used; any modifications to the output "
+					   "directory may destructively modify input directories");
+	}
+
 	/* It's a success, so don't remove the output directories. */
 	reset_directory_cleanup_list();
 	exit(0);
@@ -766,6 +777,7 @@ help(const char *progname)
 	printf(_("  -o, --output=DIRECTORY    output directory\n"));
 	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
 			 "                            relocate tablespace in OLDDIR to NEWDIR\n"));
+	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("      --clone               clone (reflink) files instead of copying\n"));
 	printf(_("      --copy                copy files (default)\n"));
 	printf(_("      --copy-file-range     copy using copy_file_range() system call\n"));
diff --git a/src/bin/pg_combinebackup/t/010_links.pl b/src/bin/pg_combinebackup/t/010_links.pl
new file mode 100644
index 0000000000..fae0d12c5a
--- /dev/null
+++ b/src/bin/pg_combinebackup/t/010_links.pl
@@ -0,0 +1,212 @@
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+#
+# This test aims to validate that hard links are created as expected in the
+# output directory, when running pg_combinebackup with --link mode.
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use File::Find;
+use File::Spec;
+use File::stat;
+
+# Set up a new database instance.
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(has_archiving => 1, allows_streaming => 1);
+$primary->append_conf('postgresql.conf', 'summarize_wal = on');
+$primary->start;
+
+# Create some tables (~35MB each).
+$primary->safe_psql('postgres', <<EOM);
+CREATE TABLE test_1 AS SELECT generate_series(1, 1000000);
+CREATE TABLE test_2 AS SELECT generate_series(1, 1000000);
+CREATE TABLE test_3 AS SELECT generate_series(1, 1000000);
+EOM
+
+# Fetch information about the data files.
+my $query = "SELECT pg_relation_filepath(oid) FROM pg_class WHERE relname = '%s';";
+
+my $pg_attribute_path = $primary->safe_psql('postgres', sprintf($query, 'pg_attribute'));
+note "pg_attribute path is $pg_attribute_path";
+
+my $pg_class_path = $primary->safe_psql('postgres', sprintf($query, 'pg_class'));
+note "pg_class path is $pg_class_path";
+
+my $pg_statistic_path = $primary->safe_psql('postgres', sprintf($query, 'pg_statistic'));
+note "pg_statistic path is $pg_statistic_path";
+
+my $test_1_path = $primary->safe_psql('postgres', sprintf($query, 'test_1'));
+note "test_1 path is $test_1_path";
+
+my $test_2_path = $primary->safe_psql('postgres', sprintf($query, 'test_2'));
+note "test_2 path is $test_2_path";
+
+my $test_3_path = $primary->safe_psql('postgres', sprintf($query, 'test_3'));
+note "test_3 path is $test_3_path";
+
+# Take a full backup.
+my $backup1path = $primary->backup_dir . '/backup1';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup1path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none'
+	],
+	"full backup");
+
+# Perform an update that touches the whole test_2 data file(s).
+$primary->safe_psql('postgres', <<EOM);
+UPDATE test_2
+SET generate_series = generate_series + 1;
+EOM
+
+# Perform an insert that touches a single page of the last test_3 data file.
+$primary->safe_psql('postgres', <<EOM);
+INSERT INTO test_3 (generate_series) VALUES (1);
+EOM
+
+# Take an incremental backup.
+my $backup2path = $primary->backup_dir . '/backup2';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup2path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none',
+		'--incremental' => $backup1path . '/backup_manifest'
+	],
+	"incremental backup");
+
+# Restore the incremental backup and use it to create a new node.
+my $restore = PostgreSQL::Test::Cluster->new('restore');
+$restore->init_from_backup(
+	$primary, 'backup2',
+	combine_with_prior => ['backup1'],
+	combine_mode => '--link');
+
+# Ensure files have the expected count of hard links.
+# We expect all files to have 2 hard links in this case, except for:
+#
+# * backup_label and backup_manifest: the backups were taken in different
+#   LSNs and with different contents on the test tables.
+# * pg_attribute, pg_class and pg_statistic might have been modified by Postgres
+#   in the meantime, so they can have 1 or 2 hard links.
+# * data file of table test_3 is different because of changes in a page.
+
+# Directory to traverse
+my $backup_dest = $restore->data_dir;
+
+# Set of non-linked files (these are the ones with 1 hard link)
+my %non_linked_files = map { $_ => 1 } (
+    File::Spec->catfile($restore->data_dir, 'backup_manifest'),
+    File::Spec->catfile($restore->data_dir, 'backup_label')
+);
+
+# Set of linked or non-linked files (these are the ones that may be with 1 or 2
+# hard links)
+my %linked_or_non_linked_files = map { $_ => 1 } (
+    File::Spec->catfile($restore->data_dir, $pg_attribute_path),
+    File::Spec->catfile($restore->data_dir, $pg_class_path),
+    File::Spec->catfile($restore->data_dir, $pg_statistic_path),
+);
+
+# By default Postgres uses 1GB segments for the data files, and our test tables
+# are 35MB worth of data each. However, that segment size is configurable, so we
+# have to handle all possibilities here. Cirrus CI e.g. is configured with 6
+# blocks per segment, and we need to cover that test case too. That's why we
+# didn't put test_3_path in the non_linked_files variable.
+my $test_3_full_path = File::Spec->catfile($restore->data_dir, $test_3_path);
+my @test_3_segments = ();
+
+# Recursively traverse the directory
+find(sub {
+    my $file = $File::Find::name;
+
+    # Skip directories
+    return if -d $file;
+
+    # Get the base name, in case it is a data file with more than one segment.
+    # This logic is also used for non-data files, and in essence it does
+    # nothing in that case, as most non-data files contain no dots in their
+    # names.
+    my $basename = (split /\./, $file)[0];
+
+    if ($test_3_full_path eq $basename) {
+        # The test_3 table is a special case, because we touched a single block
+        # of the last segment of its data file. So we need to handle this case
+        # separately later.
+        push @test_3_segments, $file;
+    } else {
+        # Get the file's stat information
+        my $stat = stat($file);
+        my $nlink_count = $stat->nlink;
+
+        if (exists $non_linked_files{$basename}) {
+            # Non-linked files should have 1 hard link
+            ok($nlink_count == 1, "File '$file' has 1 hard link");
+        } elsif (exists $linked_or_non_linked_files{$basename}) {
+            # These files can have either 1 or 2 hard links, as they are a special case
+            ok($nlink_count == 1 || $nlink_count == 2, "File '$file' has 1 or 2 hard links");
+        } else {
+            # Other files should have 2 hard links
+            ok($nlink_count == 2, "File '$file' has 2 hard links");
+        }
+    }
+}, $backup_dest);
+
+# All segments of the data file of the table test_3 should contain 2 hard links,
+# except for the last one, where we inserted one tuple.
+@test_3_segments = sort { natural_sort ($a, $b) } @test_3_segments;
+my $last_test_3_segment = pop @test_3_segments;
+
+for my $test_3_segment (@test_3_segments) {
+    # Get the file's stat information of each segment
+    my $stat = stat($test_3_segment);
+    my $nlink_count = $stat->nlink;
+    ok($nlink_count == 2, "File '$test_3_segment' has 2 hard links");
+}
+
+# Get the file's stat information of the last segment
+my $stat = stat($last_test_3_segment);
+my $nlink_count = $stat->nlink;
+ok($nlink_count == 1, "File '$last_test_3_segment' has 1 hard link");
+
+# OK, that's all.
+done_testing();
+
+# Natural comparison subroutine for strings with numbers
+# This is just a helper function for sorting strings with numbers (we want
+# "base/123.13" to come before "base/123.123", for example)
+sub natural_sort {
+    my ($a, $b) = @_;
+
+    # Split into non-numeric and numeric parts
+    my @parts_a = $a =~ /(\D+|\d+)/g;
+    my @parts_b = $b =~ /(\D+|\d+)/g;
+
+    for (my $i = 0; $i < scalar(@parts_a) && $i < scalar(@parts_b); $i++) {
+        if ($parts_a[$i] =~ /^\d/ && $parts_b[$i] =~ /^\d/) {
+            # Compare numerically if both parts are numbers
+            if ($parts_a[$i] < $parts_b[$i]) {
+                return -1;
+            } elsif ($parts_a[$i] > $parts_b[$i]) {
+                return 1;
+            }
+        } else {
+            # Compare lexicographically if not numbers
+            if ($parts_a[$i] lt $parts_b[$i]) {
+                return -1;
+            } elsif ($parts_a[$i] gt $parts_b[$i]) {
+                return 1;
+            }
+        }
+    }
+
+    # If all compared parts are the same, the shorter string comes first
+    return scalar(@parts_a) <=> scalar(@parts_b);
+}
-- 
2.43.5

#8vignesh C
vignesh21@gmail.com
In reply to: Israel Barth Rubio (#7)
Re: Add -k/--link option to pg_combinebackup

On Fri, 21 Feb 2025 at 03:50, Israel Barth Rubio <barthisrael@gmail.com> wrote:

There are some failures on compiler warnings and Windows jobs that
seem unrelated to this patch.

Besides that, there were failures when executing `010_links.pl` in the
Linux autoconf jobs. As discussed in a side chat with Euler and Robert,
and pointed out by Robert, Cirrus CI uses a very small
`--with-segsize-blocks`. My tests in the V2 patch were assuming 1GB
segments, so they were failing in Cirrus CI.

The patch V3, which I'm attaching in this reply, implements some logic
to handle any segment size.

Few comments:
1) The new file 010_links.pl added should be included to meson.build also:
'tests': [
't/010_pg_basebackup.pl',
't/011_in_place_tablespace.pl',
't/020_pg_receivewal.pl',
't/030_pg_recvlogical.pl',
't/040_pg_createsubscriber.pl',
],

2) Since it is a new file, "Copyright (c) 2021-2025" should be
"Copyright (c) 2025":
+++ b/src/bin/pg_combinebackup/t/010_links.pl
@@ -0,0 +1,212 @@
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+#
+# This test aims to validate that hard links are created as expected in the
+# output directory, when running pg_combinebackup with --link mode.
3) Since it is a single statement, no need of braces here:
+       /* Warn about the possibility of compromising the backups, when link mode */
+       if (opt.copy_method == COPY_METHOD_LINK)
+       {
+               pg_log_warning("--link mode was used; any modifications to the output "
+                                          "directory may destructively modify input directories");
+       }

Regards,
Vignesh

#9Robert Haas
robertmhaas@gmail.com
In reply to: vignesh C (#8)
Re: Add -k/--link option to pg_combinebackup

On Thu, Feb 27, 2025 at 4:50 AM vignesh C <vignesh21@gmail.com> wrote:

Few comments:
1) The new file 010_links.pl added should be included to meson.build also:
'tests': [
't/010_pg_basebackup.pl',
't/011_in_place_tablespace.pl',
't/020_pg_receivewal.pl',
't/030_pg_recvlogical.pl',
't/040_pg_createsubscriber.pl',
],

Also give it a different number than any existing file -- if we
already have an 010 in that directory then make this one something
else. 012, 050, or whatever makes most sense.

--
Robert Haas
EDB: http://www.enterprisedb.com

#10Israel Barth Rubio
barthisrael@gmail.com
In reply to: Robert Haas (#9)
1 attachment(s)
Re: Add -k/--link option to pg_combinebackup

Thanks for the new round of reviews!

1) The new file 010_links.pl added should be included to meson.build also

Added 010_links.pl to src/bin/pg_combinebackup/meson.build.

2) Since it is a new file, "Copyright (c) 2021-2025" should be
"Copyright (c) 2025"

Done!

3) Since it is a single statement, no need of braces here

I missed removing the braces along with the warning hint in
the previous version. Adjusted.

Also give it a different number than any existing file -- if we
already have an 010 in that directory then make this one something
else. 012, 050, or whatever makes most sense.

Oh, t/010_links.pl was not conflicting with anything.
The snippet that Vignesh quoted in his reply was from the
meson.build file of pg_basebackup instead of from
pg_combinebackup.

Attaching the new version of the patch in this reply.

I had to slightly change 010_links.pl and copy_file.c to
properly handle Windows, as the meson tests on
Windows complained about the code of v3:

* copy_file.c was forcing CopyFile on Windows regardless
of the option chosen by the user. Now, to make that behavior
backward compatible, I'm only forcing CopyFile on Windows
when the copy method is not --link. That allows my patch to
work, and doesn't change the previous behavior.
* Had to normalize paths when performing string matching in
the test that verifies the hard link count on files.

Best regards,
Israel.

Attachments:

v4-0001-pg_combinebackup-add-support-for-hard-links.patchapplication/octet-stream; name=v4-0001-pg_combinebackup-add-support-for-hard-links.patchDownload
From 431909d92594e4d3bd6c35a3da546d23f103016f Mon Sep 17 00:00:00 2001
From: Israel Barth Rubio <barthisrael@gmail.com>
Date: Wed, 19 Feb 2025 03:26:30 +0000
Subject: [PATCH v4] pg_combinebackup: add support for hard links

Up to now, pg_combinebackup reconstructs incremental files if needed,
and otherwise copies them from any of the input backups to the output
directory. That copy mechanism can use different methods, depending on
the argument specified by the user.

This commit adds support for a new "copy method": hard links
(-k/--link). When using that mode, instead of copying unmodified files
from the input backups to the output directory, pg_combinebackup
creates the files as hard links from the output directory to the input
backups.

The new link method might speed up the reconstruction of the synthetic
backup (no file copy) and reduce the disk usage of the synthetic
backup. The benefits depend on the modification pattern of files in
PGDATA between backups, imposed by the workload on Postgres.

This feature requires that the input backups and the output directory
be on the same file system. Also, caution is required from the user
when modifying or starting the cluster from a synthetic backup, as that
might invalidate one or more of the input backups.

Signed-off-by: Israel Barth Rubio <barthisrael@gmail.com>
---
 doc/src/sgml/ref/pg_combinebackup.sgml      |  32 ++-
 src/bin/pg_combinebackup/copy_file.c        |  34 ++-
 src/bin/pg_combinebackup/copy_file.h        |   1 +
 src/bin/pg_combinebackup/meson.build        |   1 +
 src/bin/pg_combinebackup/pg_combinebackup.c |  12 +-
 src/bin/pg_combinebackup/t/010_links.pl     | 230 ++++++++++++++++++++
 6 files changed, 307 insertions(+), 3 deletions(-)
 create mode 100644 src/bin/pg_combinebackup/t/010_links.pl

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 091982f62a..79c73ad460 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -137,6 +137,35 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-k</option></term>
+      <term><option>--link</option></term>
+      <listitem>
+       <para>
+        Use hard links instead of copying files to the synthetic backup.
+        Reconstruction of the synthetic backup might be faster (no file copying)
+        and use less disk space, but care must be taken when using the output
+        directory, because any modifications to that directory (for example,
+        starting the server) can also affect the input directories. Likewise,
+        changes to the input directories (for example, starting the server on
+        the full backup) could affect the output directory. Thus, this option
+        is best used when the input directories are only copies that will be
+        removed after <application>pg_combinebackup</application> has completed.
+       </para>
+
+       <para>
+        Requires that the input backups and the output directory are in the
+        same file system.
+       </para>
+
+       <para>
+        If a backup manifest is not available or does not contain checksums of
+        the right type, hard links will still be created, but the file will
+        also be read block by block for the checksum calculation.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--clone</option></term>
       <listitem>
@@ -167,7 +196,8 @@ PostgreSQL documentation
       <listitem>
        <para>
         Perform regular file copy.  This is the default.  (See also
-        <option>--copy-file-range</option> and <option>--clone</option>.)
+        <option>--copy-file-range</option>, <option>--clone</option> and
+        <option>-k</option>/<option>--link</option>.)
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index 4e27814839..f4f38e9ad7 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -40,6 +40,9 @@ static void copy_file_copyfile(const char *src, const char *dst,
 							   pg_checksum_context *checksum_ctx);
 #endif
 
+static void copy_file_link(const char *src, const char *dest,
+						   pg_checksum_context *checksum_ctx);
+
 /*
  * Copy a regular file, optionally computing a checksum, and emitting
  * appropriate debug messages. But if we're in dry-run mode, then just emit
@@ -69,7 +72,14 @@ copy_file(const char *src, const char *dst,
 	}
 
 #ifdef WIN32
-	copy_method = COPY_METHOD_COPYFILE;
+	/*
+	 * Windows only supports two "copy methods": CopyFile and
+	 * CreateHardLink. Whenever the user selects a method other than
+	 * --link, we force pg_combinebackup to use CopyFile, as --clone and
+	 * --copy-file-range are not supported on that platform.
+	 */
+	if (copy_method != COPY_METHOD_LINK)
+		copy_method = COPY_METHOD_COPYFILE;
 #endif
 
 	/* Determine the name of the copy strategy for use in log messages. */
@@ -93,6 +103,10 @@ copy_file(const char *src, const char *dst,
 			strategy_implementation = copy_file_copyfile;
 			break;
 #endif
+		case COPY_METHOD_LINK:
+			strategy_name = "link";
+			strategy_implementation = copy_file_link;
+			break;
 	}
 
 	if (dry_run)
@@ -304,3 +318,21 @@ copy_file_copyfile(const char *src, const char *dst,
 	checksum_file(src, checksum_ctx);
 }
 #endif							/* WIN32 */
+
+/*
+ * copy_file_link
+ * 		Hard-links a file from src to dest.
+ *
+ * If needed, also reads the file and calculates the checksum.
+ */
+static void
+copy_file_link(const char *src, const char *dest,
+			   pg_checksum_context *checksum_ctx)
+{
+	if (link(src, dest) < 0)
+		pg_fatal("error while linking file from \"%s\" to \"%s\": %m",
+				 src, dest);
+
+	/* if needed, calculate checksum of the file */
+	checksum_file(src, checksum_ctx);
+}
diff --git a/src/bin/pg_combinebackup/copy_file.h b/src/bin/pg_combinebackup/copy_file.h
index 92f104115b..5a8517629c 100644
--- a/src/bin/pg_combinebackup/copy_file.h
+++ b/src/bin/pg_combinebackup/copy_file.h
@@ -25,6 +25,7 @@ typedef enum CopyMethod
 #ifdef WIN32
 	COPY_METHOD_COPYFILE,
 #endif
+	COPY_METHOD_LINK,
 } CopyMethod;
 
 extern void copy_file(const char *src, const char *dst,
diff --git a/src/bin/pg_combinebackup/meson.build b/src/bin/pg_combinebackup/meson.build
index 0c4fd9e627..e19c309ad2 100644
--- a/src/bin/pg_combinebackup/meson.build
+++ b/src/bin/pg_combinebackup/meson.build
@@ -37,6 +37,7 @@ tests += {
       't/007_wal_level_minimal.pl',
       't/008_promote.pl',
       't/009_no_full_file.pl',
+      't/010_links.pl',
     ],
   }
 }
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574f..e005b033ed 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -135,6 +135,7 @@ main(int argc, char *argv[])
 		{"no-sync", no_argument, NULL, 'N'},
 		{"output", required_argument, NULL, 'o'},
 		{"tablespace-mapping", required_argument, NULL, 'T'},
+		{"link", no_argument, NULL, 'k'},
 		{"manifest-checksums", required_argument, NULL, 1},
 		{"no-manifest", no_argument, NULL, 2},
 		{"sync-method", required_argument, NULL, 3},
@@ -172,7 +173,7 @@ main(int argc, char *argv[])
 	opt.copy_method = COPY_METHOD_COPY;
 
 	/* process command-line options */
-	while ((c = getopt_long(argc, argv, "dnNo:T:",
+	while ((c = getopt_long(argc, argv, "dnNo:T:k",
 							long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -193,6 +194,9 @@ main(int argc, char *argv[])
 			case 'T':
 				add_tablespace_mapping(&opt, optarg);
 				break;
+			case 'k':
+				opt.copy_method = COPY_METHOD_LINK;
+				break;
 			case 1:
 				if (!pg_checksum_parse_type(optarg,
 											&opt.manifest_checksums))
@@ -424,6 +428,11 @@ main(int argc, char *argv[])
 		}
 	}
 
+	/* Warn about the possibility of compromising the backups, when link mode */
+	if (opt.copy_method == COPY_METHOD_LINK)
+		pg_log_warning("--link mode was used; any modifications to the output "
+					   "directory may destructively modify input directories");
+
 	/* It's a success, so don't remove the output directories. */
 	reset_directory_cleanup_list();
 	exit(0);
@@ -766,6 +775,7 @@ help(const char *progname)
 	printf(_("  -o, --output=DIRECTORY    output directory\n"));
 	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
 			 "                            relocate tablespace in OLDDIR to NEWDIR\n"));
+	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("      --clone               clone (reflink) files instead of copying\n"));
 	printf(_("      --copy                copy files (default)\n"));
 	printf(_("      --copy-file-range     copy using copy_file_range() system call\n"));
diff --git a/src/bin/pg_combinebackup/t/010_links.pl b/src/bin/pg_combinebackup/t/010_links.pl
new file mode 100644
index 0000000000..3fe4271a8f
--- /dev/null
+++ b/src/bin/pg_combinebackup/t/010_links.pl
@@ -0,0 +1,230 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+#
+# This test aims to validate that hard links are created as expected in the
+# output directory, when running pg_combinebackup with --link mode.
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use File::Find;
+
+# Set up a new database instance.
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(has_archiving => 1, allows_streaming => 1);
+$primary->append_conf('postgresql.conf', 'summarize_wal = on');
+$primary->start;
+
+# Create some tables (~35MB each).
+$primary->safe_psql('postgres', <<EOM);
+CREATE TABLE test_1 AS SELECT generate_series(1, 1000000);
+CREATE TABLE test_2 AS SELECT generate_series(1, 1000000);
+CREATE TABLE test_3 AS SELECT generate_series(1, 1000000);
+EOM
+
+# Fetch information about the data files.
+my $query = "SELECT pg_relation_filepath(oid) FROM pg_class WHERE relname = '%s';";
+
+my $pg_attribute_path = $primary->safe_psql('postgres', sprintf($query, 'pg_attribute'));
+note "pg_attribute path is $pg_attribute_path";
+
+my $pg_class_path = $primary->safe_psql('postgres', sprintf($query, 'pg_class'));
+note "pg_class path is $pg_class_path";
+
+my $pg_statistic_path = $primary->safe_psql('postgres', sprintf($query, 'pg_statistic'));
+note "pg_statistic path is $pg_statistic_path";
+
+my $test_1_path = $primary->safe_psql('postgres', sprintf($query, 'test_1'));
+note "test_1 path is $test_1_path";
+
+my $test_2_path = $primary->safe_psql('postgres', sprintf($query, 'test_2'));
+note "test_2 path is $test_2_path";
+
+my $test_3_path = $primary->safe_psql('postgres', sprintf($query, 'test_3'));
+note "test_3 path is $test_3_path";
+
+# Take a full backup.
+my $backup1path = $primary->backup_dir . '/backup1';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup1path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none'
+	],
+	"full backup");
+
+# Perform an update that touches the whole test_2 data file(s).
+$primary->safe_psql('postgres', <<EOM);
+UPDATE test_2
+SET generate_series = generate_series + 1;
+EOM
+
+# Perform an insert that touches a single page of the last test_3 data file.
+$primary->safe_psql('postgres', <<EOM);
+INSERT INTO test_3 (generate_series) VALUES (1);
+EOM
+
+# Take an incremental backup.
+my $backup2path = $primary->backup_dir . '/backup2';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup2path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none',
+		'--incremental' => $backup1path . '/backup_manifest'
+	],
+	"incremental backup");
+
+# Restore the incremental backup and use it to create a new node.
+my $restore = PostgreSQL::Test::Cluster->new('restore');
+$restore->init_from_backup(
+	$primary, 'backup2',
+	combine_with_prior => ['backup1'],
+	combine_mode => '--link');
+
+# Ensure files have the expected counter of hard links.
+# We expect all files to have 2 hard links in this case, except for:
+#
+# * backup_label and backup_manifest: the backups were taken in different
+#   LSNs and with different contents on the test tables.
+# * pg_attribute, pg_class and pg_statistic might have been modified by Postgres
+#   in the meantime, so they can have 1 or 2 hard links.
+# * data file of table test_3 is different because of changes in a page.
+
+# Directory to traverse
+my $restore_dir = $restore->data_dir;
+
+# Work around differences between Windows and Linux test runners.
+# The find function from the File::Find module returns paths with forward
+# slashes, while the restore->data_dir variable contains back slashes on
+# Windows. This step is just to normalize the paths, so we are able to
+# match strings later.
+my $restore_dir_normalized = $restore_dir;
+$restore_dir_normalized =~ s/\\/\//g;
+
+# Set of non-linked files (these are the ones with 1 hard link)
+my %non_linked_files = map { $_ => 1 } (
+    join('/', $restore_dir_normalized, 'backup_manifest'),
+    join('/', $restore_dir_normalized, 'backup_label')
+);
+
+# Set of linked or non-linked files (these are the ones that may have 1 or 2
+# hard links)
+my %linked_or_non_linked_files = map { $_ => 1 } (
+    join('/', $restore_dir_normalized, $pg_attribute_path),
+    join('/', $restore_dir_normalized, $pg_class_path),
+    join('/', $restore_dir_normalized, $pg_statistic_path),
+);
+
+# By default Postgres uses 1GB segments for the data files, and our test tables
+# are 35MB worth of data each. However, that segment size is configurable, so we
+# have to handle all possibilities here. Cirrus CI e.g. is configured with 6
+# blocks per segment, and we need to cover that test case too. That's why we
+# didn't put test_3_path in the non_linked_files variable.
+my $test_3_full_path = join('/', $restore_dir_normalized, $test_3_path);
+my @test_3_segments = ();
+
+# Recursively traverse the directory
+find(sub {
+    my $file = $File::Find::name;
+
+    # Skip directories
+    return if -d $file;
+
+    # Get base name, in case it is a data file and it has more than one segment
+    # This logic is used also for non-data files, and in essence it does
+    # nothing in that case, as most of the non-data files contain no dots in the
+    # name.
+    my $basename = (split /\./, $file)[0];
+
+    if ($test_3_full_path eq $basename) {
+        # The test_3 table is a special case, because we touched a single block
+        # of the last segment of its data file. So we need to handle this case
+        # separately later.
+        push @test_3_segments, $file;
+    } else {
+        # Get the file's stat information
+        my $nlink_count = get_hard_link_count($file);
+
+        if (exists $non_linked_files{$basename}) {
+            # Non-linked files should have 1 hard link
+            ok($nlink_count == 1, "File '$file' has 1 hard link");
+        } elsif (exists $linked_or_non_linked_files{$basename}) {
+            # These files can have either 1 or 2 hard links, as they are a special case
+            ok($nlink_count == 1 || $nlink_count == 2, "File '$file' has 1 or 2 hard links");
+        } else {
+            # Other files should have 2 hard links
+            ok($nlink_count == 2, "File '$file' has 2 hard links");
+        }
+    }
+}, $restore_dir);
+
+# All segments of the data file of the table test_3 should contain 2 hard links,
+# except for the last one, where we inserted one tuple.
+@test_3_segments = sort { natural_sort ($a, $b) } @test_3_segments;
+my $last_test_3_segment = pop @test_3_segments;
+
+for my $test_3_segment (@test_3_segments) {
+    # Get the file's stat information of each segment
+    my $nlink_count = get_hard_link_count($test_3_segment);
+    ok($nlink_count == 2, "File '$test_3_segment' has 2 hard links");
+}
+
+# Get the file's stat information of the last segment
+my $nlink_count = get_hard_link_count($last_test_3_segment);
+ok($nlink_count == 1, "File '$last_test_3_segment' has 1 hard link");
+
+# OK, that's all.
+done_testing();
+
+# Natural comparison subroutine for strings with numbers
+# This is just a helper function for sorting strings with numbers (we want
+# "base/123.13" to come before "base/123.123", for example)
+sub natural_sort {
+    my ($a, $b) = @_;
+
+    # Split into non-numeric and numeric parts
+    my @parts_a = $a =~ /(\D+|\d+)/g;
+    my @parts_b = $b =~ /(\D+|\d+)/g;
+
+    for (my $i = 0; $i < scalar(@parts_a) && $i < scalar(@parts_b); $i++) {
+        if ($parts_a[$i] =~ /^\d/ && $parts_b[$i] =~ /^\d/) {
+            # Compare numerically if both parts are numbers
+            if ($parts_a[$i] < $parts_b[$i]) {
+                return -1;
+            }
+            elsif ($parts_a[$i] > $parts_b[$i]) {
+                return 1;
+            }
+        }
+        else {
+            # Compare lexicographically if not numbers
+            if ($parts_a[$i] lt $parts_b[$i]) {
+                return -1;
+            }
+            elsif ($parts_a[$i] gt $parts_b[$i]) {
+                return 1;
+            }
+        }
+    }
+
+    # If all compared parts are the same, the shorter string comes first
+    return scalar(@parts_a) <=> scalar(@parts_b);
+}
+
+
+# Subroutine to get hard-link count of a given file.
+sub get_hard_link_count {
+    my ($file) = @_;
+
+    # Get file stats
+    my @stats = stat($file);
+    my $nlink = $stats[3];  # Number of hard links
+
+    return $nlink;
+}
-- 
2.43.5

#11Robert Haas
robertmhaas@gmail.com
In reply to: Israel Barth Rubio (#10)
1 attachment(s)
Re: Add -k/--link option to pg_combinebackup

On Mon, Mar 3, 2025 at 9:00 AM Israel Barth Rubio <barthisrael@gmail.com> wrote:

2) Since it is a new file, "Copyright (c) 2021-2025" should be
"Copyright (c) 2025"

Done!

For the record, this proposed change is not a project policy, AIUI. I
don't care very much what we do here, but -1 for kibitzing the range
of years people put in the copyright header.

Attaching the new version of the patch in this reply.

I had to slightly change 010_links.pl and copy_file.c to
properly handle Windows, as the meson tests on
Windows complained about the code of v3:

* copy_file.c was forcing CopyFile on Windows regardless
of the option chosen by the user. Now, to make that behavior
backward compatible, I'm only forcing CopyFile on Windows
when the copy method is not --link. That allows my patch to
work, and doesn't change the previous behavior.
* Had to normalize paths when performing string matching in
the test that verifies the hard link count on files.

When I looked at the code for this test, the questions that sprang to
mind for me were:

1. Is this really going to be stable?
2. Is this going to be too expensive?

With regard to the second question, why does this test need to create
three tables with a million rows each instead of some small number of
rows? If you need to fill multiple blocks, consider making the rows
wider instead of inserting such a large number.

With regard to the first question, I'm going to say that the answer is
"no" because when I run this test locally, I get this:

<lots of output omitted>
[12:26:07.979](0.000s) ok 973 - File '/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/2664' has 2 hard links
[12:26:07.980](0.000s) ok 974 - File '/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/1249_vm' has 2 hard links
[12:26:07.980](0.000s) ok 975 - File '/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/2836' has 2 hard links
[12:26:07.980](0.000s) ok 976 - File '/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/2663' has 2 hard links
[12:26:07.980](0.000s) ok 977 - File '/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/6237' has 2 hard links
Use of uninitialized value $file in stat at /Users/robert.haas/pgsql/src/bin/pg_combinebackup/t/010_links.pl line 226.

I'm not sure whether that "Use of uninitialized value" message at the
end is killing the script at that point or whether it's just a
warning, but it should be cleaned up either way. For the rest, I'm not
a fan of testing every single file in the data directory like this. It
seems sufficient to me to test two files, one that you expect to
definitely be modified and one that you expect to definitely be not
modified. That assumes that you can guarantee that the file definitely
wasn't modified, which I'm not quite sure about. The test doesn't seem
to disable autovacuum, so what would keep autovacuum from sometimes
processing that table after the full backup and before the incremental
backup, and sometimes not? But there's something else wrong here, too,
because even when I adjusted the test to disable autovacuum, it still
failed in the same way as shown above.

Also, project style is uncuddled braces and lines less than 80 where
possible. So don't do this:

for my $test_3_segment (@test_3_segments) {

The curly brace belongs on the next line. Similarly this should be
three separate lines:

} else {

Regarding long lines, some of these cases are easy to fix:

my $query = "SELECT pg_relation_filepath(oid) FROM pg_class WHERE relname = '%s';";

This could be fixed by writing my $query = <<EOM;

my $pg_attribute_path = $primary->safe_psql('postgres', sprintf($query, 'pg_attribute'));

This could be fixed by moving the sprintf() down a line and indenting
it under 'postgres'.

Attached is a small delta patch to fix a few other issues. It does the
following:

* adds a serial comma to pg_combinebackup.sgml. This isn't absolutely
mandatory but we usually prefer this style.

* puts the new -k option in proper alphabetical order in several places

* changes the test in copy_file() to test for == COPY_METHOD_COPY
instead of != COPY_METHOD_COPYFILE and updates the comment to reflect
what I believe the actual reasoning to be

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

rmh-tweaks.txttext/plain; charset=US-ASCII; name=rmh-tweaks.txtDownload
diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 79c73ad460d..55bc46849db 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -196,7 +196,7 @@ PostgreSQL documentation
       <listitem>
        <para>
         Perform regular file copy.  This is the default.  (See also
-        <option>--copy-file-range</option>, <option>--clone</option> and
+        <option>--copy-file-range</option>, <option>--clone</option>, and
         <option>-k</option>/<option>--link</option>.)
        </para>
       </listitem>
diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index f4f38e9ad71..97ecda5a66d 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -73,12 +73,11 @@ copy_file(const char *src, const char *dst,
 
 #ifdef WIN32
 	/*
-	 * Windows only supports two "copy methods": CopyFile and
-	 * CreateHardLink. Whenever the user selects a method other than
-	 * --link, we force pg_combinebackup to use CopyFile, as --clone and
-	 * --copy-file-range are not supported in that platform.
+	 * We have no specific switch to enable CopyFile on Windows, because
+	 * it's supported (as far as we know) on all Windows machines. So,
+	 * automatically enable it unless some other strategy was selected.
 	 */
-	if (copy_method != COPY_METHOD_LINK)
+	if (copy_method == COPY_METHOD_COPY)
 		copy_method = COPY_METHOD_COPYFILE;
 #endif
 
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index e005b033ed0..d480dc74436 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -173,7 +173,7 @@ main(int argc, char *argv[])
 	opt.copy_method = COPY_METHOD_COPY;
 
 	/* process command-line options */
-	while ((c = getopt_long(argc, argv, "dnNo:T:k",
+	while ((c = getopt_long(argc, argv, "dknNo:T:",
 							long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -182,6 +182,9 @@ main(int argc, char *argv[])
 				opt.debug = true;
 				pg_logging_increase_verbosity();
 				break;
+			case 'k':
+				opt.copy_method = COPY_METHOD_LINK;
+				break;
 			case 'n':
 				opt.dry_run = true;
 				break;
@@ -194,9 +197,6 @@ main(int argc, char *argv[])
 			case 'T':
 				add_tablespace_mapping(&opt, optarg);
 				break;
-			case 'k':
-				opt.copy_method = COPY_METHOD_LINK;
-				break;
 			case 1:
 				if (!pg_checksum_parse_type(optarg,
 											&opt.manifest_checksums))
@@ -770,12 +770,12 @@ help(const char *progname)
 	printf(_("  %s [OPTION]... DIRECTORY...\n"), progname);
 	printf(_("\nOptions:\n"));
 	printf(_("  -d, --debug               generate lots of debugging output\n"));
+	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("  -n, --dry-run             do not actually do anything\n"));
 	printf(_("  -N, --no-sync             do not wait for changes to be written safely to disk\n"));
 	printf(_("  -o, --output=DIRECTORY    output directory\n"));
 	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
 			 "                            relocate tablespace in OLDDIR to NEWDIR\n"));
-	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("      --clone               clone (reflink) files instead of copying\n"));
 	printf(_("      --copy                copy files (default)\n"));
 	printf(_("      --copy-file-range     copy using copy_file_range() system call\n"));
#12Israel Barth Rubio
barthisrael@gmail.com
In reply to: Robert Haas (#11)
1 attachment(s)
Re: Add -k/--link option to pg_combinebackup

With regard to the second question, why does this test need to create
three tables with a million rows each instead of some small number of
rows? If you need to fill multiple blocks, consider making the rows
wider instead of inserting such a large number.

Makes sense! Changed that to use tables with wider rows, but fewer
rows.
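For the record, a wider-row setup of the kind suggested could look like this (the row width and row count here are illustrative only, not necessarily what the attached patch uses):

```sql
-- ~512-byte filler per row gives roughly 15 rows per 8 kB block,
-- so 2000 rows already spans on the order of a hundred blocks.
CREATE TABLE test_1 AS
SELECT i, repeat('x', 512) AS filler
FROM generate_series(1, 2000) AS i;
```

This keeps the tables multi-block (and multi-segment under a small --with-segsize-blocks) without the cost of inserting a million rows.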

With regard to the first question, I'm going to say that the answer is
"no" because when I run this test locally, I get this:

Use of uninitialized value $file in stat at
/Users/robert.haas/pgsql/src/bin/pg_combinebackup/t/010_links.pl line
226.

I'm not sure whether that "Use of uninitialized value" message at the
end is killing the script at that point or whether it's just a
warning, but it should be cleaned up either way.

That's unexpected. It seems the function was somehow called with an
undefined value as its argument.

For the rest, I'm not
a fan of testing every single file in the data directory like this. It
seems sufficient to me to test two files, one that you expect to
definitely be modified and one that you expect to definitely be not
modified. That assumes that you can guarantee that the file definitely
wasn't modified, which I'm not quite sure about. The test doesn't seem
to disable autovacuum, so what would keep autovacuum from sometimes
processing that table after the full backup and before the incremental
backup, and sometimes not? But there's something else wrong here, too,
because even when I adjusted the test to disable autovacuum, it still
failed in the same way as shown above.

Right, maybe the previous test was trying to do much more than it
should.

I've refactored the test to focus only on the user tables that we create
and use in the test.

I've also added `autovacuum = off` to the configuration, just in case.

Also, project style is uncuddled braces and lines less than 80 where
possible. So don't do this:

for my $test_3_segment (@test_3_segments) {

The curly brace belongs on the next line. Similarly this should be
three separate lines:

} else {

Regarding long lines, some of these cases are easy to fix:

my $query = "SELECT pg_relation_filepath(oid) FROM pg_class WHERE
relname = '%s';";

This could be fixed by writing my $query = <<EOM;

my $pg_attribute_path = $primary->safe_psql('postgres',
sprintf($query, 'pg_attribute'));

This could be fixed by moving the sprintf() down a line and indenting
it under 'postgres'.
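The heredoc style Robert suggests keeps every line under 80 columns; a
quick shell sketch of the same idea (the variable names here are made
up for illustration, not taken from the test):

```shell
# Heredoc version of the query: the body can be laid out freely across
# lines, so nothing needs to exceed 80 columns. The relation name is
# interpolated into the heredoc body.
relname=pg_attribute
query=$(cat <<EOM
SELECT pg_relation_filepath(oid)
FROM pg_class
WHERE relname = '$relname';
EOM
)
printf '%s\n' "$query"
```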

Oh, I thought the style guide from the Postgres wiki applied only to
the .c/.h files in the source code. Not sure how I came to that
conclusion, but I have applied the style guide to the Perl files in the
new version of the patch.

Attached is a small delta patch to fix a few other issues. It does the
following:

* adds a serial comma to pg_combinebackup.sgml. This isn't absolutely
mandatory but we usually prefer this style.

* puts the new -k option in proper alphabetical order in several places

* changes the test in copy_file() to test for == COPY_METHOD_COPY
instead of != COPY_METHOD_COPYFILE and updates the comment to reflect
what I believe the actual reasoning to be

Thanks for the patch with the suggestions.
About the last one, related to the copy method, my first thought was to
do the same as you did (use == instead of !=). However, I was concerned
that it would change the previous behavior. But I do prefer how it
stands in your suggestion, thanks!

I'm attaching the v5 of the patch now.

Best regards,
Israel.

Attachments:

v5-0001-pg_combinebackup-add-support-for-hard-links.patch (application/octet-stream)
From dcd2530630c2fee0ee5b3cbf9508430a7019d193 Mon Sep 17 00:00:00 2001
From: Israel Barth Rubio <barthisrael@gmail.com>
Date: Wed, 19 Feb 2025 03:26:30 +0000
Subject: [PATCH v5] pg_combinebackup: add support for hard links

Up to now, pg_combinebackup reconstructs incremental files when needed,
and otherwise copies them from any of the input backups to the output
directory. That copy mechanism can use different methods, depending on
the argument specified by the user.

This commit adds support for a new "copy method": hard links
(-k/--link). When using that mode, instead of copying unmodified files
from the input backups to the output directory, pg_combinebackup
creates the files as hard links from the output directory to the input
backups.

The new link method might speed up the reconstruction of the synthetic
backup (no file copy) and reduce the disk space taken by the synthetic
backup. The benefits depend on the modification pattern of files in
PGDATA between backups, imposed by the workload on Postgres.

This feature requires that the input backups and the output directory
are in the same file system. Also, caution is required from the user
when modifying or starting the cluster from a synthetic backup, as that
might invalidate one or more of the input backups.

Signed-off-by: Israel Barth Rubio <barthisrael@gmail.com>
---
 doc/src/sgml/ref/pg_combinebackup.sgml      |  32 ++-
 src/bin/pg_combinebackup/copy_file.c        |  33 ++-
 src/bin/pg_combinebackup/copy_file.h        |   1 +
 src/bin/pg_combinebackup/meson.build        |   1 +
 src/bin/pg_combinebackup/pg_combinebackup.c |  12 +-
 src/bin/pg_combinebackup/t/010_links.pl     | 228 ++++++++++++++++++++
 6 files changed, 304 insertions(+), 3 deletions(-)
 create mode 100644 src/bin/pg_combinebackup/t/010_links.pl

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 091982f62a..55bc46849d 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -137,6 +137,35 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-k</option></term>
+      <term><option>--link</option></term>
+      <listitem>
+       <para>
+        Use hard links instead of copying files to the synthetic backup.
+        Reconstruction of the synthetic backup might be faster (no file copying)
+        and use less disk space, but care must be taken when using the output
+        directory, because any modifications to that directory (for example,
+        starting the server) can also affect the input directories. Likewise,
+        changes to the input directories (for example, starting the server on
+        the full backup) could affect the output directory. Thus, this option
+        is best used when the input directories are only copies that will be
+        removed after <application>pg_combinebackup</application> has completed.
+       </para>
+
+       <para>
+        Requires that the input backups and the output directory are in the
+        same file system.
+       </para>
+
+       <para>
+        If a backup manifest is not available or does not contain checksums of
+        the right type, hard links will still be created, but the file will
+        also be read block by block to calculate the checksum.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--clone</option></term>
       <listitem>
@@ -167,7 +196,8 @@ PostgreSQL documentation
       <listitem>
        <para>
         Perform regular file copy.  This is the default.  (See also
-        <option>--copy-file-range</option> and <option>--clone</option>.)
+        <option>--copy-file-range</option>, <option>--clone</option>, and
+        <option>-k</option>/<option>--link</option>.)
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index 4e27814839..97ecda5a66 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -40,6 +40,9 @@ static void copy_file_copyfile(const char *src, const char *dst,
 							   pg_checksum_context *checksum_ctx);
 #endif
 
+static void copy_file_link(const char *src, const char *dest,
+						   pg_checksum_context *checksum_ctx);
+
 /*
  * Copy a regular file, optionally computing a checksum, and emitting
  * appropriate debug messages. But if we're in dry-run mode, then just emit
@@ -69,7 +72,13 @@ copy_file(const char *src, const char *dst,
 	}
 
 #ifdef WIN32
-	copy_method = COPY_METHOD_COPYFILE;
+	/*
+	 * We have no specific switch to enable CopyFile on Windows, because
+	 * it's supported (as far as we know) on all Windows machines. So,
+	 * automatically enable it unless some other strategy was selected.
+	 */
+	if (copy_method == COPY_METHOD_COPY)
+		copy_method = COPY_METHOD_COPYFILE;
 #endif
 
 	/* Determine the name of the copy strategy for use in log messages. */
@@ -93,6 +102,10 @@ copy_file(const char *src, const char *dst,
 			strategy_implementation = copy_file_copyfile;
 			break;
 #endif
+		case COPY_METHOD_LINK:
+			strategy_name = "link";
+			strategy_implementation = copy_file_link;
+			break;
 	}
 
 	if (dry_run)
@@ -304,3 +317,21 @@ copy_file_copyfile(const char *src, const char *dst,
 	checksum_file(src, checksum_ctx);
 }
 #endif							/* WIN32 */
+
+/*
+ * copy_file_link
+ * 		Hard-links a file from src to dest.
+ *
+ * If needed, also reads the file and calculates the checksum.
+ */
+static void
+copy_file_link(const char *src, const char *dest,
+			   pg_checksum_context *checksum_ctx)
+{
+	if (link(src, dest) < 0)
+		pg_fatal("error while linking file from \"%s\" to \"%s\": %m",
+				 src, dest);
+
+	/* if needed, calculate checksum of the file */
+	checksum_file(src, checksum_ctx);
+}
diff --git a/src/bin/pg_combinebackup/copy_file.h b/src/bin/pg_combinebackup/copy_file.h
index 92f104115b..5a8517629c 100644
--- a/src/bin/pg_combinebackup/copy_file.h
+++ b/src/bin/pg_combinebackup/copy_file.h
@@ -25,6 +25,7 @@ typedef enum CopyMethod
 #ifdef WIN32
 	COPY_METHOD_COPYFILE,
 #endif
+	COPY_METHOD_LINK,
 } CopyMethod;
 
 extern void copy_file(const char *src, const char *dst,
diff --git a/src/bin/pg_combinebackup/meson.build b/src/bin/pg_combinebackup/meson.build
index 0c4fd9e627..e19c309ad2 100644
--- a/src/bin/pg_combinebackup/meson.build
+++ b/src/bin/pg_combinebackup/meson.build
@@ -37,6 +37,7 @@ tests += {
       't/007_wal_level_minimal.pl',
       't/008_promote.pl',
       't/009_no_full_file.pl',
+      't/010_links.pl',
     ],
   }
 }
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574f..d480dc7443 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -135,6 +135,7 @@ main(int argc, char *argv[])
 		{"no-sync", no_argument, NULL, 'N'},
 		{"output", required_argument, NULL, 'o'},
 		{"tablespace-mapping", required_argument, NULL, 'T'},
+		{"link", no_argument, NULL, 'k'},
 		{"manifest-checksums", required_argument, NULL, 1},
 		{"no-manifest", no_argument, NULL, 2},
 		{"sync-method", required_argument, NULL, 3},
@@ -172,7 +173,7 @@ main(int argc, char *argv[])
 	opt.copy_method = COPY_METHOD_COPY;
 
 	/* process command-line options */
-	while ((c = getopt_long(argc, argv, "dnNo:T:",
+	while ((c = getopt_long(argc, argv, "dknNo:T:",
 							long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -181,6 +182,9 @@ main(int argc, char *argv[])
 				opt.debug = true;
 				pg_logging_increase_verbosity();
 				break;
+			case 'k':
+				opt.copy_method = COPY_METHOD_LINK;
+				break;
 			case 'n':
 				opt.dry_run = true;
 				break;
@@ -424,6 +428,11 @@ main(int argc, char *argv[])
 		}
 	}
 
+	/* Warn about the possibility of compromising the backups, when link mode */
+	if (opt.copy_method == COPY_METHOD_LINK)
+		pg_log_warning("--link mode was used; any modifications to the output "
+					   "directory may destructively modify input directories");
+
 	/* It's a success, so don't remove the output directories. */
 	reset_directory_cleanup_list();
 	exit(0);
@@ -761,6 +770,7 @@ help(const char *progname)
 	printf(_("  %s [OPTION]... DIRECTORY...\n"), progname);
 	printf(_("\nOptions:\n"));
 	printf(_("  -d, --debug               generate lots of debugging output\n"));
+	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("  -n, --dry-run             do not actually do anything\n"));
 	printf(_("  -N, --no-sync             do not wait for changes to be written safely to disk\n"));
 	printf(_("  -o, --output=DIRECTORY    output directory\n"));
diff --git a/src/bin/pg_combinebackup/t/010_links.pl b/src/bin/pg_combinebackup/t/010_links.pl
new file mode 100644
index 0000000000..6da02ac243
--- /dev/null
+++ b/src/bin/pg_combinebackup/t/010_links.pl
@@ -0,0 +1,228 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+#
+# This test aims to validate that hard links are created as expected in the
+# output directory, when running pg_combinebackup with --link mode.
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use File::Basename;
+
+# Set up a new database instance.
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(has_archiving => 1, allows_streaming => 1);
+$primary->append_conf('postgresql.conf', 'summarize_wal = on');
+$primary->append_conf('postgresql.conf', 'autovacuum = off');
+$primary->start;
+
+# Create some tables (~264KB each).
+my $query = <<'EOM';
+CREATE TABLE test_%s AS
+    SELECT x.id::bigint,
+           repeat('a', 1600) AS value
+    FROM generate_series(1, 100) AS x(id);
+EOM
+
+$primary->safe_psql('postgres', sprintf($query, '1'));
+$primary->safe_psql('postgres', sprintf($query, '2'));
+
+# Fetch information about the data files.
+$query = <<'EOM';
+SELECT pg_relation_filepath(oid)
+FROM pg_class
+WHERE relname = 'test_%s';
+EOM
+
+my $test_1_path = $primary->safe_psql('postgres', sprintf($query, '1'));
+note "test_1 path is $test_1_path";
+
+my $test_2_path = $primary->safe_psql('postgres', sprintf($query, '2'));
+note "test_2 path is $test_2_path";
+
+# Take a full backup.
+my $backup1path = $primary->backup_dir . '/backup1';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup1path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none'
+	],
+	"full backup");
+
+# Perform an insert that touches pages of the last segment of the data file of
+# table test_2.
+$primary->safe_psql('postgres', <<EOM);
+INSERT INTO test_2 (id, value)
+    SELECT x.id::bigint,
+           repeat('a', 1600) AS value
+    FROM generate_series(101, 110) AS x(id);
+EOM
+
+# Take an incremental backup.
+my $backup2path = $primary->backup_dir . '/backup2';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup2path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none',
+		'--incremental' => $backup1path . '/backup_manifest'
+	],
+	"incremental backup");
+
+# Restore the incremental backup and use it to create a new node.
+my $restore = PostgreSQL::Test::Cluster->new('restore');
+$restore->init_from_backup(
+	$primary, 'backup2',
+	combine_with_prior => ['backup1'],
+	combine_mode => '--link');
+
+# Work around differences between Windows and Linux test runners. The perl
+# functions to inspect files return paths with forward slashes, while the
+# restore->data_dir variable contains back slashes on Windows. This step is just
+# to normalize the paths, so we are able to match strings later.
+my $restore_dir_normalized = $restore->data_dir;
+$restore_dir_normalized =~ s/\\/\//g;
+
+# Ensure files have the expected count of hard links. We expect all data files
+# from test_1 to contain 2 hard links, because they were not touched between the
+# full and incremental backups, and the last data file of table test_2 to
+# contain a single hard link because of changes in its last pages.
+my $test_1_full_path = join('/', $restore_dir_normalized, $test_1_path);
+check_data_file($test_1_full_path, 2);
+
+my $test_2_full_path = join('/', $restore_dir_normalized, $test_2_path);
+check_data_file($test_2_full_path, 1);
+
+# OK, that's all.
+done_testing();
+
+
+# Given the path to the first segment of a data file, inspect its parent
+# directory to find all the segments of that data file, and make sure all the
+# segments contain 2 hard links. The last one must have the given number of hard
+# links.
+#
+# Parameters:
+# * data_file: path to the first segment of a data file, as per the output of
+#              pg_relation_filepath.
+# * last_segment_nlinks: the number of hard links expected in the last segment
+#                        of the given data file.
+sub check_data_file
+{
+    my ($data_file, $last_segment_nlinks) = @_;
+
+    # By default Postgres uses 1GB segments for the data files, and our test
+    # tables are 264KB worth of data each. However, that segment size is
+    # configurable, so we have to handle all possibilities here. Cirrus CI e.g.
+    # is configured with 6 blocks per segment, and we need to cover that test
+    # case too. We want to scan all files under the directory which contains the
+    # data file, and get all the segments of that data file.
+    my @data_file_segments = ();
+    my $dir = dirname($data_file);
+
+    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
+
+    while (my $file = readdir($dh))
+    {
+        my $full_path = "$dir/$file";
+
+        # Skip the '.' and '..' entries, and directories
+        next if $file eq '.' or $file eq '..' or !-f $full_path;
+
+        # The first segment of the data file contains no dots. From the second
+        # onwards, it follows the same name pattern of the first segment,
+        # followed by a dot and the sequence number. Here we want to normalize
+        # all the segments to the same name.
+        my $basename = (split /\./, $full_path)[0];
+
+        # If it is a segment of the given data file, add it to the list of
+        # segments.
+        if ($basename eq $data_file)
+        {
+            push @data_file_segments, $full_path;
+        }
+    }
+
+    closedir($dh);
+
+    # All segments of the given data file should contain 2 hard links, except
+    # for the last one, which should match the given number of links.
+    @data_file_segments = sort { natural_sort ($a, $b) } @data_file_segments;
+    my $last_segment = pop @data_file_segments;
+
+    for my $segment (@data_file_segments)
+    {
+        # Get the file's stat information of each segment
+        my $nlink_count = get_hard_link_count($segment);
+        ok($nlink_count == 2, "File '$segment' has 2 hard links");
+    }
+
+    # Get the file's stat information of the last segment
+    my $nlink_count = get_hard_link_count($last_segment);
+    ok($nlink_count == $last_segment_nlinks,
+       "File '$last_segment' has $last_segment_nlinks hard link(s)");
+}
+
+# Natural comparison subroutine for strings with numbers.
+# This is just a helper function for sorting strings with numbers. We want
+# "base/123.13" to come before "base/123.123", for example.
+sub natural_sort
+{
+    my ($a, $b) = @_;
+
+    # Split into non-numeric and numeric parts
+    my @parts_a = $a =~ /(\D+|\d+)/g;
+    my @parts_b = $b =~ /(\D+|\d+)/g;
+
+    for (my $i = 0; $i < scalar(@parts_a) && $i < scalar(@parts_b); $i++)
+    {
+        if ($parts_a[$i] =~ /^\d/ && $parts_b[$i] =~ /^\d/)
+        {
+            # Compare numerically if both parts are numbers
+            if ($parts_a[$i] < $parts_b[$i])
+            {
+                return -1;
+            }
+            elsif ($parts_a[$i] > $parts_b[$i])
+            {
+                return 1;
+            }
+        }
+        else
+        {
+            # Compare lexicographically if not numbers
+            if ($parts_a[$i] lt $parts_b[$i])
+            {
+                return -1;
+            }
+            elsif ($parts_a[$i] gt $parts_b[$i])
+            {
+                return 1;
+            }
+        }
+    }
+
+    # If all compared parts are the same, the shorter string comes first
+    return scalar(@parts_a) <=> scalar(@parts_b);
+}
+
+
+# Subroutine to get hard link count of a given file.
+# Receives the path to a file, and returns the number of hard links of
+# that file.
+sub get_hard_link_count
+{
+    my ($file) = @_;
+
+    # Get file stats
+    my @stats = stat($file);
+    my $nlink = $stats[3];  # Number of hard links
+
+    return $nlink;
+}
-- 
2.43.5

#13Robert Haas
robertmhaas@gmail.com
In reply to: Israel Barth Rubio (#12)
Re: Add -k/--link option to pg_combinebackup

On Tue, Mar 4, 2025 at 7:07 AM Israel Barth Rubio <barthisrael@gmail.com> wrote:

About the last one, related to the copy method, my first thought was to do
something like you did (in the sense of using == instead of !=). However, I
was concerned that I would change the previous behavior. But I do prefer
how it stands in your suggestion, thanks!

I don't think it does, because those options are blocked on Windows
anyway. See the comment block that says /* Check that the platform
supports the requested copy method. */

I'm attaching the v5 of the patch now.

So, this still fails for me, in a manner quite similar to before. The
first call to check_data_file() is for
/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/16384
but @data_file_segments ends up zero-length. Therefore, $last_segment
is undef when we access it at the bottom of check_data_file(). I think
the problem is here:

my $basename = (split /\./, $full_path)[0];

This code assumes that no path name component other than the filename
at the end contains a period, but in my case that's not true, because
my home directory on this particular computer is /Users/robert.haas.

But I'm wondering why you are even using readdir() here in the first
place. You could just test -f "16384"; if it exists test -f "16384.1";
if that exists test -f "16384.2" and keep going until you reach a file
that does not exist. Not only would this avoid reading the entire
directory contents for every call to this function, but it would also
generate the list of files in sorted order, which would let you throw
away natural_sort().
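The probe-until-missing approach Robert describes can be sketched in
shell terms (the directory and relfilenode here are hypothetical; the
actual test does this in Perl with -f checks):

```shell
# Collect the segments of a relation by testing each successive suffix
# and stopping at the first gap. The first segment has no suffix; later
# ones are <relfilenode>.1, <relfilenode>.2, and so on. The loop builds
# the list in segment order, so no natural sort is needed afterwards.
dir=$(mktemp -d)
touch "$dir/16384" "$dir/16384.1" "$dir/16384.2"

segments="$dir/16384"
i=1
while [ -f "$dir/16384.$i" ]; do
    segments="$segments $dir/16384.$i"
    i=$((i + 1))
done

echo "$segments" | wc -w    # prints 3: all segments found, in order
```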

Overall, I feel like this could just be significantly simpler. For
instance, I don't quite see the need to have both test_1 and test_2.
If we just have one table and all the segments but the last have 2
hard links while the last have 1, isn't that enough? What test
coverage do we get by adding the second relation? All of the segments
of test_1 behave just like the non-final segments of test_2, so why
test both? If you go down to 1 table, you can simplify a bunch of
things here.

Why does the insert add 10 new rows instead of just 1? Doesn't that
introduce a significant risk that with some choice of segment size
those inserts will be spread across two segments rather than one,
leading to the test failing?

--
Robert Haas
EDB: http://www.enterprisedb.com

#14Israel Barth Rubio
barthisrael@gmail.com
In reply to: Robert Haas (#13)
1 attachment(s)
Re: Add -k/--link option to pg_combinebackup

I don't think it does, because I think those options are anyway
blocked on Windows. See the comment block that says /* Check that the
platform supports the requested copy method. */

Yeah, you are right. I could swear I ran `pg_combinebackup --clone` on
a Windows VM and, instead of erroring out, it ran with CopyFile. But
that's not the case: it errors out (as expected).

So, this still fails for me, in a manner quite similar to before. The
first call to check_data_file() is for

/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/16384

but @data_file_segments ends up zero-length. Therefore, $last_segment
is undef when we access it at the bottom of check_data_file(). I think
the problem is here:

my $basename = (split /\./, $full_path)[0];

This code assumes that no path name component other than the filename
at the end contains a period, but in my case that's not true, because
my home directory on this particular computer is /Users/robert.haas.

Right, well spotted. The test code made a bad assumption: it should
have split on the file name, not on the full path, which is why it
broke in your environment.

But I'm wondering why you are even using readdir() here in the first
place. You could just test -f "16384"; if it exists test -f "16384.1";
if that exists test -f "16384.2" and keep going until you reach a file
that does not exist. Not only would this avoid reading the entire
directory contents for every call to this function, but it would also
generate the list of files in sorted order, which would let you throw
away natural_sort().

That makes total sense. In the v6 version I threw out the natural_sort
and started using the approach you described, which is much more
efficient and less error prone.

Overall, I feel like this could just be significantly simpler. For
instance, I don't quite see the need to have both test_1 and test_2.
If we just have one table and all the segments but the last have 2
hard links while the last have 1, isn't that enough? What test
coverage do we get by adding the second relation? All of the segments
of test_1 behave just like the non-final segments of test_2, so why
test both? If you go down to 1 table, you can simplify a bunch of
things here.

I think that isn't enough because it makes assumptions about the
environment -- in this case, the Linux autoconf environment from
Cirrus CI.

The test creates a couple of tables of ~264KB each. If you have a very
small segment size, like we have in the autoconf job of Cirrus CI, then
that's enough to create more than a single segment file, so we are able
to verify segments with both 2 and 1 hard links.

However, when running through other jobs, like Windows meson, the
default segment size of 1GB is used, and in that case a single table
would give us only a single segment with 1 hard link.

With that in mind, and taking into consideration that local builds from
devs might use the default segment size, having a couple of tables with
different purposes, i.e. test_1 for the "unmodified table" scenario and
test_2 for the "modified table" scenario, looks more appropriate to me.
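The 2-versus-1 link expectation follows from plain hard-link semantics.
A minimal shell sketch of what the test asserts (all paths here are
made up and independent of pg_combinebackup):

```shell
# An unmodified segment is hard-linked from the full backup into the
# output directory, so both directory entries point at the same inode
# (link count 2). A reconstructed segment is a freshly written file in
# the output directory only (link count 1).
full=$(mktemp -d)    # stands in for the full backup
out=$(mktemp -d)     # stands in for the synthetic backup

echo 'unmodified segment' > "$full/16384"
ln "$full/16384" "$out/16384"                 # what --link does for unchanged files
echo 'reconstructed segment' > "$out/16384.1" # what reconstruction produces

stat -c %h "$out/16384"      # prints 2 (GNU stat; macOS would use stat -f %l)
stat -c %h "$out/16384.1"    # prints 1
```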

For now I'm keeping the patch as it was in that sense, i.e. with a
check_data_file function that is called for both test_1 and test_2. I
added a comment about that in the test file. I can send a new patch
version if having two test tables still sounds like overkill.

Why does the insert add 10 new rows instead of just 1? Doesn't that
introduce a significant risk that with some choice of segment size
those inserts will be spread across two segments rather than one,
leading to the test failing?

That was a leftover from some manual attempts on my side, sorry! I
started with a simple INSERT, then tried an INSERT with a SELECT on a
range, and missed reverting it before sending the patch. The v6 version
of the patch contains the simple INSERT, which adds only one tuple to
the test_2 table and is safer, as you pointed out.

Best regards,
Israel.

Attachments:

v6-0001-pg_combinebackup-add-support-for-hard-links.patch (application/octet-stream)
From c79530e7b28fa195826a890cc18512285aa70b2b Mon Sep 17 00:00:00 2001
From: Israel Barth Rubio <barthisrael@gmail.com>
Date: Wed, 19 Feb 2025 03:26:30 +0000
Subject: [PATCH v6] pg_combinebackup: add support for hard links

Up to now, pg_combinebackup reconstructs incremental files when needed,
and otherwise copies them from any of the input backups to the output
directory. That copy mechanism can use different methods, depending on
the argument specified by the user.

This commit adds support for a new "copy method": hard links
(-k/--link). When using that mode, instead of copying unmodified files
from the input backups to the output directory, pg_combinebackup
creates the files as hard links from the output directory to the input
backups.

The new link method might speed up the reconstruction of the synthetic
backup (no file copy) and reduce the disk space taken by the synthetic
backup. The benefits depend on the modification pattern of files in
PGDATA between backups, imposed by the workload on Postgres.

This feature requires that the input backups and the output directory
are in the same file system. Also, caution is required from the user
when modifying or starting the cluster from a synthetic backup, as that
might invalidate one or more of the input backups.

Signed-off-by: Israel Barth Rubio <barthisrael@gmail.com>
---
 doc/src/sgml/ref/pg_combinebackup.sgml      |  32 +++-
 src/bin/pg_combinebackup/copy_file.c        |  33 +++-
 src/bin/pg_combinebackup/copy_file.h        |   1 +
 src/bin/pg_combinebackup/meson.build        |   1 +
 src/bin/pg_combinebackup/pg_combinebackup.c |  12 +-
 src/bin/pg_combinebackup/t/010_links.pl     | 180 ++++++++++++++++++++
 6 files changed, 256 insertions(+), 3 deletions(-)
 create mode 100644 src/bin/pg_combinebackup/t/010_links.pl

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 091982f62a..55bc46849d 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -137,6 +137,35 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-k</option></term>
+      <term><option>--link</option></term>
+      <listitem>
+       <para>
+        Use hard links instead of copying files to the synthetic backup.
+        Reconstruction of the synthetic backup might be faster (no file copying)
+        and use less disk space, but care must be taken when using the output
+        directory, because any modifications to that directory (for example,
+        starting the server) can also affect the input directories. Likewise,
+        changes to the input directories (for example, starting the server on
+        the full backup) could affect the output directory. Thus, this option
+        is best used when the input directories are only copies that will be
+        removed after <application>pg_combinebackup</application> has completed.
+       </para>
+
+       <para>
+        Requires that the input backups and the output directory are in the
+        same file system.
+       </para>
+
+       <para>
+        If a backup manifest is not available or does not contain checksums of
+        the right type, hard links will still be created, but the file will
+        also be read block by block to calculate the checksum.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--clone</option></term>
       <listitem>
@@ -167,7 +196,8 @@ PostgreSQL documentation
       <listitem>
        <para>
         Perform regular file copy.  This is the default.  (See also
-        <option>--copy-file-range</option> and <option>--clone</option>.)
+        <option>--copy-file-range</option>, <option>--clone</option>, and
+        <option>-k</option>/<option>--link</option>.)
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index 4e27814839..97ecda5a66 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -40,6 +40,9 @@ static void copy_file_copyfile(const char *src, const char *dst,
 							   pg_checksum_context *checksum_ctx);
 #endif
 
+static void copy_file_link(const char *src, const char *dest,
+						   pg_checksum_context *checksum_ctx);
+
 /*
  * Copy a regular file, optionally computing a checksum, and emitting
  * appropriate debug messages. But if we're in dry-run mode, then just emit
@@ -69,7 +72,13 @@ copy_file(const char *src, const char *dst,
 	}
 
 #ifdef WIN32
-	copy_method = COPY_METHOD_COPYFILE;
+	/*
+	 * We have no specific switch to enable CopyFile on Windows, because
+	 * it's supported (as far as we know) on all Windows machines. So,
+	 * automatically enable it unless some other strategy was selected.
+	 */
+	if (copy_method == COPY_METHOD_COPY)
+		copy_method = COPY_METHOD_COPYFILE;
 #endif
 
 	/* Determine the name of the copy strategy for use in log messages. */
@@ -93,6 +102,10 @@ copy_file(const char *src, const char *dst,
 			strategy_implementation = copy_file_copyfile;
 			break;
 #endif
+		case COPY_METHOD_LINK:
+			strategy_name = "link";
+			strategy_implementation = copy_file_link;
+			break;
 	}
 
 	if (dry_run)
@@ -304,3 +317,21 @@ copy_file_copyfile(const char *src, const char *dst,
 	checksum_file(src, checksum_ctx);
 }
 #endif							/* WIN32 */
+
+/*
+ * copy_file_link
+ * 		Hard-links a file from src to dest.
+ *
+ * If needed, also reads the file and calculates the checksum.
+ */
+static void
+copy_file_link(const char *src, const char *dest,
+			   pg_checksum_context *checksum_ctx)
+{
+	if (link(src, dest) < 0)
+		pg_fatal("error while linking file from \"%s\" to \"%s\": %m",
+				 src, dest);
+
+	/* if needed, calculate checksum of the file */
+	checksum_file(src, checksum_ctx);
+}
diff --git a/src/bin/pg_combinebackup/copy_file.h b/src/bin/pg_combinebackup/copy_file.h
index 92f104115b..5a8517629c 100644
--- a/src/bin/pg_combinebackup/copy_file.h
+++ b/src/bin/pg_combinebackup/copy_file.h
@@ -25,6 +25,7 @@ typedef enum CopyMethod
 #ifdef WIN32
 	COPY_METHOD_COPYFILE,
 #endif
+	COPY_METHOD_LINK,
 } CopyMethod;
 
 extern void copy_file(const char *src, const char *dst,
diff --git a/src/bin/pg_combinebackup/meson.build b/src/bin/pg_combinebackup/meson.build
index 0c4fd9e627..e19c309ad2 100644
--- a/src/bin/pg_combinebackup/meson.build
+++ b/src/bin/pg_combinebackup/meson.build
@@ -37,6 +37,7 @@ tests += {
       't/007_wal_level_minimal.pl',
       't/008_promote.pl',
       't/009_no_full_file.pl',
+      't/010_links.pl',
     ],
   }
 }
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574f..d480dc7443 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -135,6 +135,7 @@ main(int argc, char *argv[])
 		{"no-sync", no_argument, NULL, 'N'},
 		{"output", required_argument, NULL, 'o'},
 		{"tablespace-mapping", required_argument, NULL, 'T'},
+		{"link", no_argument, NULL, 'k'},
 		{"manifest-checksums", required_argument, NULL, 1},
 		{"no-manifest", no_argument, NULL, 2},
 		{"sync-method", required_argument, NULL, 3},
@@ -172,7 +173,7 @@ main(int argc, char *argv[])
 	opt.copy_method = COPY_METHOD_COPY;
 
 	/* process command-line options */
-	while ((c = getopt_long(argc, argv, "dnNo:T:",
+	while ((c = getopt_long(argc, argv, "dknNo:T:",
 							long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -181,6 +182,9 @@ main(int argc, char *argv[])
 				opt.debug = true;
 				pg_logging_increase_verbosity();
 				break;
+			case 'k':
+				opt.copy_method = COPY_METHOD_LINK;
+				break;
 			case 'n':
 				opt.dry_run = true;
 				break;
@@ -424,6 +428,11 @@ main(int argc, char *argv[])
 		}
 	}
 
+	/* Warn about the possibility of compromising the backups, when link mode */
+	if (opt.copy_method == COPY_METHOD_LINK)
+		pg_log_warning("--link mode was used; any modifications to the output "
+					   "directory may destructively modify input directories");
+
 	/* It's a success, so don't remove the output directories. */
 	reset_directory_cleanup_list();
 	exit(0);
@@ -761,6 +770,7 @@ help(const char *progname)
 	printf(_("  %s [OPTION]... DIRECTORY...\n"), progname);
 	printf(_("\nOptions:\n"));
 	printf(_("  -d, --debug               generate lots of debugging output\n"));
+	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("  -n, --dry-run             do not actually do anything\n"));
 	printf(_("  -N, --no-sync             do not wait for changes to be written safely to disk\n"));
 	printf(_("  -o, --output=DIRECTORY    output directory\n"));
diff --git a/src/bin/pg_combinebackup/t/010_links.pl b/src/bin/pg_combinebackup/t/010_links.pl
new file mode 100644
index 0000000000..024dd90433
--- /dev/null
+++ b/src/bin/pg_combinebackup/t/010_links.pl
@@ -0,0 +1,180 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+#
+# This test aims to validate that hard links are created as expected in the
+# output directory, when running pg_combinebackup with --link mode.
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Set up a new database instance.
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(has_archiving => 1, allows_streaming => 1);
+$primary->append_conf('postgresql.conf', 'summarize_wal = on');
+# We disable autovacuum to prevent "something else" from modifying our test tables.
+$primary->append_conf('postgresql.conf', 'autovacuum = off');
+$primary->start;
+
+# Create a couple of tables (~264KB each).
+# Note: Cirrus CI runs some tests with a very small segment size, so a single
+# table of 264KB would be enough to cover assertions of hard link count set to 1
+# or to 2. However, a default user installation comes with 1GB of segment size,
+# so we use 2 different tables here, in a way that this same test can be used to
+# cover count set to 1 or to 2 in that installation as well. In other words:
+#
+# * test_1:
+#   * Small segment size: all segments will contain 2 hard links.
+#   * Default segment size: the only one segment will contain 2 hard links.
+# * test_2:
+#   * Small segment size: all segments will have 2 hard links, except for the
+#                         last one, which will have 1 hard link.
+#   * Default segment size: the only one segment will contain 1 hard link.
+my $query = <<'EOM';
+CREATE TABLE test_%s AS
+    SELECT x.id::bigint,
+           repeat('a', 1600) AS value
+    FROM generate_series(1, 100) AS x(id);
+EOM
+
+$primary->safe_psql('postgres', sprintf($query, '1'));
+$primary->safe_psql('postgres', sprintf($query, '2'));
+
+# Fetch information about the data files.
+$query = <<'EOM';
+SELECT pg_relation_filepath(oid)
+FROM pg_class
+WHERE relname = 'test_%s';
+EOM
+
+my $test_1_path = $primary->safe_psql('postgres', sprintf($query, '1'));
+note "test_1 path is $test_1_path";
+
+my $test_2_path = $primary->safe_psql('postgres', sprintf($query, '2'));
+note "test_2 path is $test_2_path";
+
+# Take a full backup.
+my $backup1path = $primary->backup_dir . '/backup1';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup1path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none'
+	],
+	"full backup");
+
+# Perform an insert that touches a page of the last segment of the data file of
+# table test_2.
+$primary->safe_psql('postgres', <<EOM);
+INSERT INTO test_2 (id, value) VALUES (101, repeat('a', 1600));
+EOM
+
+# Take an incremental backup.
+my $backup2path = $primary->backup_dir . '/backup2';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup2path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none',
+		'--incremental' => $backup1path . '/backup_manifest'
+	],
+	"incremental backup");
+
+# Restore the incremental backup and use it to create a new node.
+my $restore = PostgreSQL::Test::Cluster->new('restore');
+$restore->init_from_backup(
+	$primary, 'backup2',
+	combine_with_prior => ['backup1'],
+	combine_mode => '--link');
+
+# Ensure files have the expected count of hard links. We expect all data files
+# from test_1 to contain 2 hard links, because they were not touched between the
+# full and incremental backups, and the last data file of table test_2 to
+# contain a single hard link because of changes in its last page.
+my $test_1_full_path = join('/', $restore->data_dir, $test_1_path);
+check_data_file($test_1_full_path, 2);
+
+my $test_2_full_path = join('/', $restore->data_dir, $test_2_path);
+check_data_file($test_2_full_path, 1);
+
+# OK, that's all.
+done_testing();
+
+
+# Given the path to the first segment of a data file, inspect its parent
+# directory to find all the segments of that data file, and make sure all the
+# segments contain 2 hard links. The last one must have the given number of hard
+# links.
+#
+# Parameters:
+# * data_file: path to the first segment of a data file, as per the output of
+#              pg_relation_filepath.
+# * last_segment_nlinks: the number of hard links expected in the last segment
+#                        of the given data file.
+sub check_data_file
+{
+    my ($data_file, $last_segment_nlinks) = @_;
+
+    # By default Postgres uses 1GB segments for the data files, and our test
+    # tables are 264KB worth of data each. However, that segment size is
+    # configurable, so we have to handle all possibilities here. Cirrus CI e.g.
+    # is configured with 6 blocks per segment, and we need to cover that test
+    # case too. We want to get all the segments of that data file.
+    my @data_file_segments = ($data_file);
+
+    # Start checking for additional segments
+    my $segment_number = 1;
+
+    while (1)
+    {
+        my $next_segment = $data_file . '.' . $segment_number;
+
+        # If the file exists and is a regular file, add it to the list
+        if (-f $next_segment)
+        {
+            push @data_file_segments, $next_segment;
+            $segment_number++;
+        }
+        # Stop the loop if the file doesn't exist
+        else
+        {
+            last;
+        }
+    }
+
+    # All segments of the given data file should contain 2 hard links, except
+    # for the last one, which should match the given number of links.
+    my $last_segment = pop @data_file_segments;
+
+    for my $segment (@data_file_segments)
+    {
+        # Get the file's stat information of each segment
+        my $nlink_count = get_hard_link_count($segment);
+        ok($nlink_count == 2, "File '$segment' has 2 hard links");
+    }
+
+    # Get the file's stat information of the last segment
+    my $nlink_count = get_hard_link_count($last_segment);
+    ok($nlink_count == $last_segment_nlinks,
+       "File '$last_segment' has $last_segment_nlinks hard link(s)");
+}
+
+
+# Subroutine to get hard link count of a given file.
+# Receives the path to a file, and returns the number of hard links of
+# that file.
+sub get_hard_link_count
+{
+    my ($file) = @_;
+
+    # Get file stats
+    my @stats = stat($file);
+    my $nlink = $stats[3];  # Number of hard links
+
+    return $nlink;
+}
-- 
2.43.5
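For readers following the nlink assertions in the test above, the hard-link semantics involved can be sketched outside the patch. This is an illustrative Python sketch (the patch itself is C and Perl, and the file names here are invented), showing the same link()/st_nlink behavior that copy_file_link() and get_hard_link_count() rely on:

```python
import os
import tempfile

# Hard-link semantics that --link relies on: link() creates a second
# directory entry for the same inode, so no data is copied, the link
# count goes to 2, and a write through either name is visible via both.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "16384")       # stand-in for a relation segment
    dst = os.path.join(d, "16384.out")   # stand-in for the combined output

    with open(src, "wb") as f:
        f.write(b"x" * 8192)

    os.link(src, dst)                    # what copy_file_link() does via link()

    # Both names point at one inode: st_nlink is 2 for each, which is
    # exactly what the test's get_hard_link_count() checks.
    assert os.stat(src).st_nlink == 2
    assert os.stat(dst).st_nlink == 2

    # Modifying the output also modifies the input -- the reason for the
    # warning pg_combinebackup emits in --link mode.
    with open(dst, "r+b") as f:
        f.write(b"y")
    with open(src, "rb") as f:
        assert f.read(1) == b"y"
```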

#15Robert Haas
robertmhaas@gmail.com
In reply to: Israel Barth Rubio (#14)
1 attachment(s)
Re: Add -k/--link option to pg_combinebackup

On Wed, Mar 5, 2025 at 9:47 AM Israel Barth Rubio <barthisrael@gmail.com> wrote:

The v6 version of the patch contains the simple INSERT version, which
adds only one tuple to the test_2 table, and is safer as your pointed
out.

Cool. Here is v7, with minor changes. I rewrote the commit message,
renamed the test case to 010_hardlink.pl, and adjusted some of the
comments in the test case.

I'm happy with this now, so barring objections or complaints from CI,
I plan to commit it soon.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v7-0001-pg_combinebackup-Add-k-link-option.patch (application/octet-stream)
From 6d3f73d4eb8d8f8cd33aaf655d551cfe4975c86d Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 5 Mar 2025 13:44:26 -0500
Subject: [PATCH v7] pg_combinebackup: Add -k, --link option.

This is similar to pg_upgrade's --link option, except that here we won't
typically be able to use it for every input file: sometimes we will need
to reconstruct a complete backup from blocks stored in different files.
However, when a whole file does need to be copied, we can use an
optimized copying strategy: see the existing --clone and
--copy-file-range options and the code to use CopyFile() on Windows.
This commit adds a new strategy: add a hard link to an existing file.
Making a hard link doesn't actually copy anything, but it makes sense
for the code to treat it as doing so.

This is useful when the input directories are merely staging directories
that will be removed once the restore is complete. In such cases, there
is no need to actually copy the data, and making a bunch of new hard
links can be very quick. However, it would be quite dangerous to use it
if the input directories might later be reused for any other purpose,
since starting postgres on the output directory would destructively
modify the input directories. For that reason, using this new option
causes pg_combinebackup to emit a warning about the danger involved.

Author: Israel Barth Rubio <barthisrael@gmail.com>
Co-authored-by: Robert Haas <robertmhaas@gmail.com> (cosmetic changes)
Reviewed-by: Vignesh C <vignesh21@gmail.com>
---
 doc/src/sgml/ref/pg_combinebackup.sgml      |  32 +++-
 src/bin/pg_combinebackup/copy_file.c        |  33 +++-
 src/bin/pg_combinebackup/copy_file.h        |   1 +
 src/bin/pg_combinebackup/meson.build        |   1 +
 src/bin/pg_combinebackup/pg_combinebackup.c |  12 +-
 src/bin/pg_combinebackup/t/010_hardlink.pl  | 169 ++++++++++++++++++++
 6 files changed, 245 insertions(+), 3 deletions(-)
 create mode 100644 src/bin/pg_combinebackup/t/010_hardlink.pl

diff --git a/doc/src/sgml/ref/pg_combinebackup.sgml b/doc/src/sgml/ref/pg_combinebackup.sgml
index 091982f62ad..55bc46849db 100644
--- a/doc/src/sgml/ref/pg_combinebackup.sgml
+++ b/doc/src/sgml/ref/pg_combinebackup.sgml
@@ -137,6 +137,35 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>-k</option></term>
+      <term><option>--link</option></term>
+      <listitem>
+       <para>
+        Use hard links instead of copying files to the synthetic backup.
+        Reconstruction of the synthetic backup might be faster (no file copying)
+        and use less disk space, but care must be taken when using the output
+        directory, because any modifications to that directory (for example,
+        starting the server) can also affect the input directories. Likewise,
+        changes to the input directories (for example, starting the server on
+        the full backup) could affect the output directory. Thus, this option
+        is best used when the input directories are only copies that will be
+        removed after <application>pg_combinebackup</application> has completed.
+       </para>
+
+       <para>
+        Requires that the input backups and the output directory are in the
+        same file system.
+       </para>
+
+       <para>
+        If a backup manifest is not available or does not contain a checksum
+        of the right type, hard links will still be created, but the file
+        will also be read block-by-block for the checksum calculation.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>--clone</option></term>
       <listitem>
@@ -167,7 +196,8 @@ PostgreSQL documentation
       <listitem>
        <para>
         Perform regular file copy.  This is the default.  (See also
-        <option>--copy-file-range</option> and <option>--clone</option>.)
+        <option>--copy-file-range</option>, <option>--clone</option>, and
+        <option>-k</option>/<option>--link</option>.)
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index 4e27814839c..97ecda5a66d 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -40,6 +40,9 @@ static void copy_file_copyfile(const char *src, const char *dst,
 							   pg_checksum_context *checksum_ctx);
 #endif
 
+static void copy_file_link(const char *src, const char *dest,
+						   pg_checksum_context *checksum_ctx);
+
 /*
  * Copy a regular file, optionally computing a checksum, and emitting
  * appropriate debug messages. But if we're in dry-run mode, then just emit
@@ -69,7 +72,13 @@ copy_file(const char *src, const char *dst,
 	}
 
 #ifdef WIN32
-	copy_method = COPY_METHOD_COPYFILE;
+	/*
+	 * We have no specific switch to enable CopyFile on Windows, because
+	 * it's supported (as far as we know) on all Windows machines. So,
+	 * automatically enable it unless some other strategy was selected.
+	 */
+	if (copy_method == COPY_METHOD_COPY)
+		copy_method = COPY_METHOD_COPYFILE;
 #endif
 
 	/* Determine the name of the copy strategy for use in log messages. */
@@ -93,6 +102,10 @@ copy_file(const char *src, const char *dst,
 			strategy_implementation = copy_file_copyfile;
 			break;
 #endif
+		case COPY_METHOD_LINK:
+			strategy_name = "link";
+			strategy_implementation = copy_file_link;
+			break;
 	}
 
 	if (dry_run)
@@ -304,3 +317,21 @@ copy_file_copyfile(const char *src, const char *dst,
 	checksum_file(src, checksum_ctx);
 }
 #endif							/* WIN32 */
+
+/*
+ * copy_file_link
+ * 		Hard-links a file from src to dest.
+ *
+ * If needed, also reads the file and calculates the checksum.
+ */
+static void
+copy_file_link(const char *src, const char *dest,
+			   pg_checksum_context *checksum_ctx)
+{
+	if (link(src, dest) < 0)
+		pg_fatal("error while linking file from \"%s\" to \"%s\": %m",
+				 src, dest);
+
+	/* if needed, calculate checksum of the file */
+	checksum_file(src, checksum_ctx);
+}
diff --git a/src/bin/pg_combinebackup/copy_file.h b/src/bin/pg_combinebackup/copy_file.h
index 92f104115bb..5a8517629c7 100644
--- a/src/bin/pg_combinebackup/copy_file.h
+++ b/src/bin/pg_combinebackup/copy_file.h
@@ -25,6 +25,7 @@ typedef enum CopyMethod
 #ifdef WIN32
 	COPY_METHOD_COPYFILE,
 #endif
+	COPY_METHOD_LINK,
 } CopyMethod;
 
 extern void copy_file(const char *src, const char *dst,
diff --git a/src/bin/pg_combinebackup/meson.build b/src/bin/pg_combinebackup/meson.build
index 0c4fd9e6270..e80a4756a7f 100644
--- a/src/bin/pg_combinebackup/meson.build
+++ b/src/bin/pg_combinebackup/meson.build
@@ -37,6 +37,7 @@ tests += {
       't/007_wal_level_minimal.pl',
       't/008_promote.pl',
       't/009_no_full_file.pl',
+      't/010_hardlink.pl',
     ],
   }
 }
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 5864ec574fb..d480dc74436 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -135,6 +135,7 @@ main(int argc, char *argv[])
 		{"no-sync", no_argument, NULL, 'N'},
 		{"output", required_argument, NULL, 'o'},
 		{"tablespace-mapping", required_argument, NULL, 'T'},
+		{"link", no_argument, NULL, 'k'},
 		{"manifest-checksums", required_argument, NULL, 1},
 		{"no-manifest", no_argument, NULL, 2},
 		{"sync-method", required_argument, NULL, 3},
@@ -172,7 +173,7 @@ main(int argc, char *argv[])
 	opt.copy_method = COPY_METHOD_COPY;
 
 	/* process command-line options */
-	while ((c = getopt_long(argc, argv, "dnNo:T:",
+	while ((c = getopt_long(argc, argv, "dknNo:T:",
 							long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -181,6 +182,9 @@ main(int argc, char *argv[])
 				opt.debug = true;
 				pg_logging_increase_verbosity();
 				break;
+			case 'k':
+				opt.copy_method = COPY_METHOD_LINK;
+				break;
 			case 'n':
 				opt.dry_run = true;
 				break;
@@ -424,6 +428,11 @@ main(int argc, char *argv[])
 		}
 	}
 
+	/* Warn about the possibility of compromising the backups, when link mode */
+	if (opt.copy_method == COPY_METHOD_LINK)
+		pg_log_warning("--link mode was used; any modifications to the output "
+					   "directory may destructively modify input directories");
+
 	/* It's a success, so don't remove the output directories. */
 	reset_directory_cleanup_list();
 	exit(0);
@@ -761,6 +770,7 @@ help(const char *progname)
 	printf(_("  %s [OPTION]... DIRECTORY...\n"), progname);
 	printf(_("\nOptions:\n"));
 	printf(_("  -d, --debug               generate lots of debugging output\n"));
+	printf(_("  -k, --link                link files instead of copying\n"));
 	printf(_("  -n, --dry-run             do not actually do anything\n"));
 	printf(_("  -N, --no-sync             do not wait for changes to be written safely to disk\n"));
 	printf(_("  -o, --output=DIRECTORY    output directory\n"));
diff --git a/src/bin/pg_combinebackup/t/010_hardlink.pl b/src/bin/pg_combinebackup/t/010_hardlink.pl
new file mode 100644
index 00000000000..a0ee419090c
--- /dev/null
+++ b/src/bin/pg_combinebackup/t/010_hardlink.pl
@@ -0,0 +1,169 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+#
+# This test aims to validate that hard links are created as expected in the
+# output directory, when running pg_combinebackup with --link mode.
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Set up a new database instance.
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(has_archiving => 1, allows_streaming => 1);
+$primary->append_conf('postgresql.conf', 'summarize_wal = on');
+# We disable autovacuum to prevent "something else" to modify our test tables.
+$primary->append_conf('postgresql.conf', 'autovacuum = off');
+$primary->start;
+
+# Create a couple of tables (~264KB each).
+# Note: Cirrus CI runs some tests with a very small segment size, so, in that
+# environment, a single table of 264KB would have both a segment with a link
+# count of 1 and also one with a link count of 2. But in a normal installation,
+# segment size is 1GB.  Therefore, we use 2 different tables here: for test_1,
+# all segments (or the only one) will have two hard links; for test_2, the
+# last segment (or the only one) will have 1 hard link, and any others will
+# have 2.
+my $query = <<'EOM';
+CREATE TABLE test_%s AS
+    SELECT x.id::bigint,
+           repeat('a', 1600) AS value
+    FROM generate_series(1, 100) AS x(id);
+EOM
+
+$primary->safe_psql('postgres', sprintf($query, '1'));
+$primary->safe_psql('postgres', sprintf($query, '2'));
+
+# Fetch information about the data files.
+$query = <<'EOM';
+SELECT pg_relation_filepath(oid)
+FROM pg_class
+WHERE relname = 'test_%s';
+EOM
+
+my $test_1_path = $primary->safe_psql('postgres', sprintf($query, '1'));
+note "test_1 path is $test_1_path";
+
+my $test_2_path = $primary->safe_psql('postgres', sprintf($query, '2'));
+note "test_2 path is $test_2_path";
+
+# Take a full backup.
+my $backup1path = $primary->backup_dir . '/backup1';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup1path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none'
+	],
+	"full backup");
+
+# Perform an insert that touches a page of the last segment of the data file of
+# table test_2.
+$primary->safe_psql('postgres', <<EOM);
+INSERT INTO test_2 (id, value) VALUES (101, repeat('a', 1600));
+EOM
+
+# Take an incremental backup.
+my $backup2path = $primary->backup_dir . '/backup2';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup2path,
+		'--no-sync',
+		'--checkpoint' => 'fast',
+        '--wal-method' => 'none',
+		'--incremental' => $backup1path . '/backup_manifest'
+	],
+	"incremental backup");
+
+# Restore the incremental backup and use it to create a new node.
+my $restore = PostgreSQL::Test::Cluster->new('restore');
+$restore->init_from_backup(
+	$primary, 'backup2',
+	combine_with_prior => ['backup1'],
+	combine_mode => '--link');
+
+# Ensure files have the expected count of hard links. We expect all data files
+# from test_1 to contain 2 hard links, because they were not touched between the
+# full and incremental backups, and the last data file of table test_2 to
+# contain a single hard link because of changes in its last page.
+my $test_1_full_path = join('/', $restore->data_dir, $test_1_path);
+check_data_file($test_1_full_path, 2);
+
+my $test_2_full_path = join('/', $restore->data_dir, $test_2_path);
+check_data_file($test_2_full_path, 1);
+
+# OK, that's all.
+done_testing();
+
+
+# Given the path to the first segment of a data file, inspect its parent
+# directory to find all the segments of that data file, and make sure all the
+# segments contain 2 hard links. The last one must have the given number of hard
+# links.
+#
+# Parameters:
+# * data_file: path to the first segment of a data file, as per the output of
+#              pg_relation_filepath.
+# * last_segment_nlinks: the number of hard links expected in the last segment
+#                        of the given data file.
+sub check_data_file
+{
+    my ($data_file, $last_segment_nlinks) = @_;
+
+    my @data_file_segments = ($data_file);
+
+    # Start checking for additional segments
+    my $segment_number = 1;
+
+    while (1)
+    {
+        my $next_segment = $data_file . '.' . $segment_number;
+
+        # If the file exists and is a regular file, add it to the list
+        if (-f $next_segment)
+        {
+            push @data_file_segments, $next_segment;
+            $segment_number++;
+        }
+        # Stop the loop if the file doesn't exist
+        else
+        {
+            last;
+        }
+    }
+
+    # All segments of the given data file should contain 2 hard links, except
+    # for the last one, which should match the given number of links.
+    my $last_segment = pop @data_file_segments;
+
+    for my $segment (@data_file_segments)
+    {
+        # Get the file's stat information of each segment
+        my $nlink_count = get_hard_link_count($segment);
+        ok($nlink_count == 2, "File '$segment' has 2 hard links");
+    }
+
+    # Get the file's stat information of the last segment
+    my $nlink_count = get_hard_link_count($last_segment);
+    ok($nlink_count == $last_segment_nlinks,
+       "File '$last_segment' has $last_segment_nlinks hard link(s)");
+}
+
+
+# Subroutine to get hard link count of a given file.
+# Receives the path to a file, and returns the number of hard links of
+# that file.
+sub get_hard_link_count
+{
+    my ($file) = @_;
+
+    # Get file stats
+    my @stats = stat($file);
+    my $nlink = $stats[3];  # Number of hard links
+
+    return $nlink;
+}
-- 
2.39.3 (Apple Git-145)
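The documentation hunk above notes that, without a usable manifest checksum, the file is still hard-linked but must also be read block-by-block to compute the checksum. A rough Python sketch of that flow, for illustration only: `link_with_checksum` and the SHA-256 choice are inventions of this sketch, while the real C code goes through pg_checksum_context.

```python
import hashlib
import os
import tempfile

def link_with_checksum(src: str, dst: str, block_size: int = 8192) -> str:
    """Mirror of what copy_file_link() does when a checksum is requested:
    the "copy" is just a new directory entry, but the source still has to
    be read block-by-block, since linking never touches file contents."""
    os.link(src, dst)
    ctx = hashlib.sha256()               # stand-in for pg_checksum_context
    with open(src, "rb") as f:
        while chunk := f.read(block_size):
            ctx.update(chunk)
    return ctx.hexdigest()

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "seg")
    with open(src, "wb") as f:
        f.write(b"\0" * 8192)
    digest = link_with_checksum(src, os.path.join(d, "seg.linked"))
    assert digest == hashlib.sha256(b"\0" * 8192).hexdigest()
```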

#16Israel Barth Rubio
barthisrael@gmail.com
In reply to: Robert Haas (#15)
Re: Add -k/--link option to pg_combinebackup

I'm happy with this now, so barring objections or complaints from CI,
I plan to commit it soon.

Great, thank you!

#17Robert Haas
robertmhaas@gmail.com
In reply to: Israel Barth Rubio (#16)
Re: Add -k/--link option to pg_combinebackup

On Thu, Mar 6, 2025 at 6:18 AM Israel Barth Rubio <barthisrael@gmail.com> wrote:

I'm happy with this now, so barring objections or complaints from CI,
I plan to commit it soon.

Great, thank you!

"Soon" ended up being somewhat later than planned, because I've been
out of town, but done now.

Now, to see if the buildfarm explodes...

--
Robert Haas
EDB: http://www.enterprisedb.com

#18Nathan Bossart
nathandbossart@gmail.com
In reply to: Robert Haas (#17)
Re: Add -k/--link option to pg_combinebackup

On Mon, Mar 17, 2025 at 02:10:14PM -0400, Robert Haas wrote:

Now, to see if the buildfarm explodes...

I think koel might complain. pgindent on my machine yields the following:

diff --git a/src/bin/pg_combinebackup/copy_file.c b/src/bin/pg_combinebackup/copy_file.c
index 97ecda5a66d..b0c94f6ee31 100644
--- a/src/bin/pg_combinebackup/copy_file.c
+++ b/src/bin/pg_combinebackup/copy_file.c
@@ -72,9 +72,10 @@ copy_file(const char *src, const char *dst,
     }
 
 #ifdef WIN32
+
     /*
-     * We have no specific switch to enable CopyFile on Windows, because
-     * it's supported (as far as we know) on all Windows machines. So,
+     * We have no specific switch to enable CopyFile on Windows, because it's
+     * supported (as far as we know) on all Windows machines. So,
      * automatically enable it unless some other strategy was selected.
      */
     if (copy_method == COPY_METHOD_COPY)

--
nathan

#19Robert Haas
robertmhaas@gmail.com
In reply to: Nathan Bossart (#18)
Re: Add -k/--link option to pg_combinebackup

On Mon, Mar 17, 2025 at 2:17 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

I think koel might complain. pgindent on my machine yields the following:

Indeed, it did. I still think it's a mistake to have testing for this
in the buildfarm and not in 'meson test' or CI.

--
Robert Haas
EDB: http://www.enterprisedb.com

#20Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#19)
Re: Add -k/--link option to pg_combinebackup

On 2025-03-17 Mo 4:07 PM, Robert Haas wrote:

On Mon, Mar 17, 2025 at 2:17 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

I think koel might complain. pgindent on my machine yields the following:

Indeed, it did. I still think it's a mistake to have testing for this
in the buildfarm and not in 'meson test' or CI.

Maybe we could have a TAP test module that would run it but only if you
have "code-indent" in your PG_TEST_EXTRA.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
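The opt-in gate being proposed can be sketched roughly as follows. This is illustrative Python, not the Perl TAP code an in-tree test would actually use, and "code-indent" is a hypothetical PG_TEST_EXTRA keyword that does not exist today:

```python
import os

def should_run_indent_check(env: dict) -> bool:
    # PG_TEST_EXTRA is a space-separated list of opt-in test classes;
    # the test would skip itself unless the keyword is present.
    extra = env.get("PG_TEST_EXTRA", "").split()
    return "code-indent" in extra

assert should_run_indent_check({"PG_TEST_EXTRA": "kerberos code-indent"})
assert not should_run_indent_check({})
```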

#21Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#20)
Re: Add -k/--link option to pg_combinebackup

On Tue, Mar 18, 2025 at 9:27 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Maybe we could have a TAP test module that would run it but only if you
have "code-indent" in your PG_TEST_EXTRA.

What is the argument against always testing this?

--
Robert Haas
EDB: http://www.enterprisedb.com

#22Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#21)
Re: Add -k/--link option to pg_combinebackup

On Mar 18, 2025, at 9:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 18, 2025 at 9:27 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Maybe we could have a TAP test module that would run it but only if you
have "code-indent" in your PG_TEST_EXTRA.

What is the argument against always testing this?

Just a question of whether everyone wants to run this. Koel takes about 10s to run the indent test.

Cheers

Andrew

#23Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#22)
Re: Add -k/--link option to pg_combinebackup

On Tue, Mar 18, 2025 at 10:31 AM Andrew Dunstan <andrew@dunslane.net> wrote:

On Mar 18, 2025, at 9:34 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Mar 18, 2025 at 9:27 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Maybe we could have a TAP test module that would run it but only if you
have "code-indent" in your PG_TEST_EXTRA.

What is the argument against always testing this?

Just a question if everyone wants to run this. Koel takes about 10s to run the indent test.

Well, IMHO, that's pretty cheap insurance against having to push a
second commit to fix indentation. But I guess one problem is not
everyone may have pgindent installed locally.

But at least I should get it set up here. I tried setting
PG_TEST_EXTRA=code-indent locally and running 'meson test' and I
didn't get a failure -- and 'git grep code-indent' returned nothing.
Any tips?

--
Robert Haas
EDB: http://www.enterprisedb.com

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#23)
Re: Add -k/--link option to pg_combinebackup

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Mar 18, 2025 at 10:31 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Just a question if everyone wants to run this. Koel takes about 10s to run the indent test.

Well, IMHO, that's pretty cheap insurance against having to push a
second commit to fix indentation. But I guess one problem is not
everyone may have pgindent installed locally.

10s added to every check-world run doesn't sound "cheap" to me.
However, both pg_bsd_indent and the typedefs list are in-tree
nowadays, so I don't see any reason that this couldn't be a
totally self-contained check.

But at least I should get it set up here. I tried setting
PG_TEST_EXTRA=code-indent locally and running 'meson test' and I
didn't get a failure -- and 'git grep code-indent' returned nothing.
Any tips?

Andrew was suggesting a test we could write, not one we
already have.

regards, tom lane

#25Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#24)
Re: Add -k/--link option to pg_combinebackup

On Tue, Mar 18, 2025 at 1:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

10s added to every check-world run doesn't sound "cheap" to me.

Really? If you're sensitive to the time the tests take to run, maybe
use 'meson test' rather than 'make check-world'? I do that locally
with MESON_TESTTHREADS=8 and the entire test suite finishes in under 3
minutes. A 10-second pgindent test will add something to that,
presumably, because it'll take up one of those test threads for the
time it's running, but if you assume perfect thread utilization before
and after adding this test we're talking about the difference between
'meson test' finishing in 2:55 and 2:57, or something like that. That
seems like pretty cheap insurance to me for what is, at least for me,
a common error.
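[Editor's note: the scheduling argument above can be illustrated with a toy experiment. The sleeps below are stand-ins, scaled down from the 10-second check under discussion; the point is only that with free test slots, an added test overlaps existing work, so wall time grows by much less than the test's own runtime.]

```shell
# Two concurrent 2-second "tests": run sequentially they would take
# about 4 seconds, but run in parallel the wall time is about 2.
start=$(date +%s)
sleep 2 &    # stand-in for an existing long-running test
sleep 2 &    # stand-in for the newly added indent check
wait
elapsed=$(( $(date +%s) - start ))
echo "wall time: ${elapsed}s"
```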

(make check-world for me takes 12:57 -- and yes that could be faster
with parallelism too, but then the output is jumbled. At any rate, if
I'm waiting that long, I'm getting up and walking away from the
computer and then the difference between being gone for 13 minutes and
13 minutes and 10 seconds isn't really material either, at least not
to me.)

However, both pg_bsd_indent and the typedefs list are in-tree
nowadays, so I don't see any reason that this couldn't be a
totally self-contained check.

Well, +1 from me. Every year when I go through the commit log from the
previous year for contribution statistics, it always shocks me how many
fix-up commits we have. Now, some of that is unavoidable -- people are
always going to miss things in review and discover them after commit.
But it should be possible to reduce purely mechanical errors like
failing to rerun pgindent after the very last cosmetic change, or
whatever. I'd like to spend more time on substantive issues and less
on things that could have been caught automatically. Tons and tons of
commits have two, three, or even more fix-up commits and that all adds
up to a lot of committer time that could be better spent.

--
Robert Haas
EDB: http://www.enterprisedb.com

#26 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#25)
Re: Add -k/--link option to pg_combinebackup

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Mar 18, 2025 at 1:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

10s added to every check-world run doesn't sound "cheap" to me.

Really? If you're sensitive to the time the tests take to run, maybe
use 'meson test' rather than 'make check-world'?

FTR, what I actually use is

make -s -j10 check-world >/dev/null

which I like because I don't see any output unless there's a problem.
On my machine, that took right about 2 minutes for a long time;
lately it's crept up to about 2:15 which is already annoying me.
Admittedly an additional 10-second test might not add much to that
due to parallelism, but then again it might; it's hard to tell how
much slop we have in the parallel scheduling.

Anyway, I'm totally fine with finding a way to allow pgindent
checking to be included, as long as I can opt out of that.
I don't especially want to see buildfarm cycles going into it
on every animal, either. If there's any test we have that ought
to be completely deterministic, it'd be this one.

regards, tom lane

#27 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#24)
Re: Add -k/--link option to pg_combinebackup

On 2025-03-18 Tu 1:55 PM, Tom Lane wrote:

But at least I should get it set up here. I tried setting
PG_TEST_EXTRA=code-indent locally and running 'meson test' and I
didn't get a failure -- and 'git grep code-indent' returned nothing.
Any tips?

Andrew was suggesting a test we could write, not one we
already have.

Yeah, will try to have one next week.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#28 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#26)
Re: Add -k/--link option to pg_combinebackup

On Tue, Mar 18, 2025 at 4:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

On my machine, that took right about 2 minutes for a long time;
lately it's crept up to about 2:15 which is already annoying me.

I'm constantly amazed by how well-optimized you and Andres are around
this stuff and how little delay it takes to annoy you. I'm not
questioning that it does, but personally I'm so dumb that the dominant
cost for me is almost always how long it takes me to figure out what
the heck I should even be doing. I could run the tests ten times as
often as I actually do and probably still not care about the
difference between 2 and
do and probably still not care about the difference between 2 and
2:15, because it would be dwarfed by the 47:25 I spent looking for
some stupid bug I introduced. :-)

--
Robert Haas
EDB: http://www.enterprisedb.com

#29 Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#27)
Re: Add -k/--link option to pg_combinebackup

Hi,

On 2025-03-19 08:28:44 -0400, Andrew Dunstan wrote:

On 2025-03-18 Tu 1:55 PM, Tom Lane wrote:

But at least I should get it set up here. I tried setting
PG_TEST_EXTRA=code-indent locally and running 'meson test' and I
didn't get a failure -- and 'git grep code-indent' returned nothing.
Any tips?

Andrew was suggesting a test we could write, not one we
already have.

Yeah, will try to have one next week.

FWIW, I had written this up a while back:

/messages/by-id/20231019044907.ph6dw637loqg3lqk@awork3.anarazel.de
https://github.com/anarazel/postgres/commits/ci-pgindent/

I just didn't seem to get a whole lot of traction back then. But it might be
an interesting basis to get started.

Greetings,

Andres Freund