Add progress reporting to pg_verifybackup
Hi all,
I've attached the simple patch to add the progress reporting option to
pg_verifybackup. The progress information is displayed with --progress
option only during the checksum verification, which is the most time
consuming task. It cannot be used together with --quiet option.
Feedback is very welcome.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-Add-progress-reporting-to-pg_verifybackup.patchapplication/octet-stream; name=v1-0001-Add-progress-reporting-to-pg_verifybackup.patchDownload
From 146923a725a918416482b5a9391e10b5df0ab8e5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 6 Jan 2023 13:26:09 +0900
Subject: [PATCH v1] Add progress reporting to pg_verifybackup
This adds a new option to pg_verifybackup called -P/--progress,
showing every second some information about the state of checksum
verification.
Similarly to what is done for pg_rewind and pg_basebackup, the
information printed in the progress report consists of the current
amount of data computed and the total amount of data to compute. This
could be extended later on.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 15 ++++
src/bin/pg_verifybackup/pg_verifybackup.c | 85 +++++++++++++++++++++--
2 files changed, 96 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 5f83c98706..36335e0a18 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -178,6 +178,21 @@ PostgreSQL documentation
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>-P</option></term>
+ <term><option>--progress</option></term>
+ <listitem>
+ <para>
+ Enable progress reporting. Turning this on will deliver a progress
+ report while verifying checksums.
+ </para>
+ <para>
+ This option cannot be used together with the option
+ <option>--quiet</option>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-q</option></term>
<term><option>--quiet</option></term>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 7634dfc285..b2d54ce25f 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -16,12 +16,14 @@
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
+#include <time.h>
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
#include "getopt_long.h"
#include "parse_manifest.h"
+#include "pgtime.h"
/*
* For efficiency, we'd like our hash table containing information about the
@@ -57,6 +59,8 @@ typedef struct manifest_file
bool matched;
bool bad;
} manifest_file;
+#define should_check_checksums(m) \
+ (((m)->matched) && !((m)->bad) && (((m)->checksum_type) != CHECKSUM_TYPE_NONE))
/*
* Define a hash table which we can use to store information about the files
@@ -147,10 +151,19 @@ static void report_fatal_error(const char *pg_restrict fmt,...)
pg_attribute_printf(1, 2) pg_attribute_noreturn();
static bool should_ignore_relpath(verifier_context *context, char *relpath);
+static void progress_report(bool finished);
static void usage(void);
static const char *progname;
+/* options */
+static bool show_progress = false;
+static bool skip_checksums = false;
+
+/* Progress indicators */
+static uint64 total_size = 0;
+static uint64 done_size = 0;
+
/*
* Main entry point.
*/
@@ -162,6 +175,7 @@ main(int argc, char **argv)
{"ignore", required_argument, NULL, 'i'},
{"manifest-path", required_argument, NULL, 'm'},
{"no-parse-wal", no_argument, NULL, 'n'},
+ {"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
{"wal-directory", required_argument, NULL, 'w'},
@@ -174,7 +188,6 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- bool skip_checksums = false;
char *wal_directory = NULL;
char *pg_waldump_path = NULL;
@@ -219,7 +232,7 @@ main(int argc, char **argv)
simple_string_list_append(&context.ignore_list, "recovery.signal");
simple_string_list_append(&context.ignore_list, "standby.signal");
- while ((c = getopt_long(argc, argv, "ei:m:nqsw:", long_options, NULL)) != -1)
+ while ((c = getopt_long(argc, argv, "ei:m:nPqsw:", long_options, NULL)) != -1)
{
switch (c)
{
@@ -241,6 +254,9 @@ main(int argc, char **argv)
case 'n':
no_parse_wal = true;
break;
+ case 'P':
+ show_progress = true;
+ break;
case 'q':
quiet = true;
break;
@@ -277,6 +293,10 @@ main(int argc, char **argv)
exit(1);
}
+ /* Complain if the specified arguments conflict */
+ if (show_progress && quiet)
+ pg_fatal("cannot specify both --progress and --quiet");
+
/* Unless --no-parse-wal was specified, we will need pg_waldump. */
if (!no_parse_wal)
{
@@ -638,6 +658,10 @@ verify_backup_file(verifier_context *context, char *relpath, char *fullpath)
m->bad = true;
}
+ /* Update statistics for progress report, if necessary */
+ if (show_progress && !skip_checksums && should_check_checksums(m))
+ total_size += m->size;
+
/*
* We don't verify checksums at this stage. We first finish verifying that
* we have the expected set of files with the expected sizes, and only
@@ -675,11 +699,12 @@ verify_backup_checksums(verifier_context *context)
manifest_files_iterator it;
manifest_file *m;
+ progress_report(false);
+
manifest_files_start_iterate(context->ht, &it);
while ((m = manifest_files_iterate(context->ht, &it)) != NULL)
{
- if (m->matched && !m->bad && m->checksum_type != CHECKSUM_TYPE_NONE &&
- !should_ignore_relpath(context, m->pathname))
+ if (should_check_checksums(m) && !should_ignore_relpath(context, m->pathname))
{
char *fullpath;
@@ -694,6 +719,8 @@ verify_backup_checksums(verifier_context *context)
pfree(fullpath);
}
}
+
+ progress_report(true);
}
/*
@@ -740,6 +767,10 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
close(fd);
return;
}
+
+ /* Report progress */
+ done_size += rc;
+ progress_report(false);
}
if (rc < 0)
report_backup_error(context, "could not read file \"%s\": %m",
@@ -894,6 +925,51 @@ hash_string_pointer(char *s)
return hash_bytes(ss, strlen(s));
}
+/*
+ * Print a progress report based on the global variables.
+ *
+ * Progress report is written at maximum once per second, unless the finished
+ * parameter is set to true.
+ *
+ * If finished is set to true, this is the last progress report. The cursor
+ * is moved to the next line.
+ */
+static void
+progress_report(bool finished)
+{
+ static pg_time_t last_progress_report = 0;
+ pg_time_t now;
+ int percent_size = 0;
+ char totalsize_str[32];
+ char donesize_str[32];
+
+ if (!show_progress)
+ return;
+
+ now = time(NULL);
+ if (now == last_progress_report && !finished)
+ return; /* Max once per second */
+
+ last_progress_report = now;
+ percent_size = total_size ? (int) ((done_size * 100 / total_size)) : 0;
+
+ snprintf(totalsize_str, sizeof(totalsize_str), UINT64_FORMAT,
+ total_size / 1024);
+ snprintf(donesize_str, sizeof(donesize_str), UINT64_FORMAT,
+ done_size / 1024);
+
+ fprintf(stderr,
+ _("%*s/%s kB (%d%%) verified"),
+ (int) strlen(totalsize_str),
+ donesize_str, totalsize_str, percent_size);
+
+ /*
+ * Stay on the same line if reporting to a terminal and we're not done
+ * yet.
+ */
+ fputc((!finished && isatty(fileno(stderr))) ? '\r' : '\n', stderr);
+}
+
/*
* Print out usage information and exit.
*/
@@ -907,6 +983,7 @@ usage(void)
printf(_(" -i, --ignore=RELATIVE_PATH ignore indicated path\n"));
printf(_(" -m, --manifest-path=PATH use specified path for manifest\n"));
printf(_(" -n, --no-parse-wal do not try to parse WAL files\n"));
+ printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
--
2.31.1
On Fri, Jan 06, 2023 at 04:28:42PM +0900, Masahiko Sawada wrote:
I've attached the simple patch to add the progress reporting option to
pg_verifybackup. The progress information is displayed with --progress
option only during the checksum verification, which is the most time
consuming task. It cannot be used together with --quiet option.
That looks helpful, particularly when a backup has many relation
files. Calculating the total size when browsing the file list looks
fine.
+ /* Complain if the specified arguments conflict */
+ if (show_progress && quiet)
+ pg_fatal("cannot specify both --progress and --quiet");
Nothing on HEAD proposes --progress and --quiet at the same time from
what I can see, so just disabling the combination is fine by me. For
the error message, I would recommend to switch to a more generic
pattern, as of:
"cannot specify both %s and %s", "-P/--progress", "-q/--quiet"
Could you add a check based on command_fails_like() in 004_options.pl,
at least? A second test to check after the output of --progress would
be a nice bonus, for example by sticking --progress into one of the
existing commands doing a command_like().
--
Michael
On Wed, Feb 1, 2023 at 10:25 AM Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Jan 06, 2023 at 04:28:42PM +0900, Masahiko Sawada wrote:
I've attached the simple patch to add the progress reporting option to
pg_verifybackup. The progress information is displayed with --progress
option only during the checksum verification, which is the most time
consuming task. It cannot be used together with --quiet option.That looks helpful, particularly when a backup has many relation
files. Calculating the total size when browsing the file list looks
fine.+ /* Complain if the specified arguments conflict */ + if (show_progress && quiet) + pg_fatal("cannot specify both --progress and --quiet");Nothing on HEAD proposes --progress and --quiet at the same time from
what I can see, so just disabling the combination is fine by me. For
the error message, I would recommend to switch to a more generic
pattern, as of:
"cannot specify both %s and %s", "-P/--progress", "-q/--quiet"
Agreed.
Could you add a check based on command_fails_like() in 004_options.pl,
at least?
Agreed, done in v2 patch.
A second test to check after the output of --progress would
be a nice bonus, for example by sticking --progress into one of the
existing commands doing a command_like().
It seems that the --progress option doesn't work with command_like()
since the progress information is written in stderr but command_like()
doesn't want it.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
v2-0001-Add-progress-reporting-to-pg_verifybackup.patchapplication/octet-stream; name=v2-0001-Add-progress-reporting-to-pg_verifybackup.patchDownload
From 38a95ed1d76101162d2bb62d1c943fe41b4ce84d Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 6 Jan 2023 13:26:09 +0900
Subject: [PATCH v2] Add progress reporting to pg_verifybackup
This adds a new option to pg_verifybackup called -P/--progress,
showing every second some information about the state of checksum
verification.
Similarly to what is done for pg_rewind and pg_basebackup, the
information printed in the progress report consists of the current
amount of data computed and the total amount of data to compute. This
could be extended later on.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 15 ++++
src/bin/pg_verifybackup/pg_verifybackup.c | 85 +++++++++++++++++++++--
src/bin/pg_verifybackup/t/004_options.pl | 6 ++
3 files changed, 102 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 5f83c98706..36335e0a18 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -178,6 +178,21 @@ PostgreSQL documentation
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>-P</option></term>
+ <term><option>--progress</option></term>
+ <listitem>
+ <para>
+ Enable progress reporting. Turning this on will deliver a progress
+ report while verifying checksums.
+ </para>
+ <para>
+ This option cannot be used together with the option
+ <option>--quiet</option>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-q</option></term>
<term><option>--quiet</option></term>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 7634dfc285..ca0bcabb6d 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -16,12 +16,14 @@
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
+#include <time.h>
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
#include "getopt_long.h"
#include "parse_manifest.h"
+#include "pgtime.h"
/*
* For efficiency, we'd like our hash table containing information about the
@@ -57,6 +59,8 @@ typedef struct manifest_file
bool matched;
bool bad;
} manifest_file;
+#define should_check_checksums(m) \
+ (((m)->matched) && !((m)->bad) && (((m)->checksum_type) != CHECKSUM_TYPE_NONE))
/*
* Define a hash table which we can use to store information about the files
@@ -147,10 +151,19 @@ static void report_fatal_error(const char *pg_restrict fmt,...)
pg_attribute_printf(1, 2) pg_attribute_noreturn();
static bool should_ignore_relpath(verifier_context *context, char *relpath);
+static void progress_report(bool finished);
static void usage(void);
static const char *progname;
+/* options */
+static bool show_progress = false;
+static bool skip_checksums = false;
+
+/* Progress indicators */
+static uint64 total_size = 0;
+static uint64 done_size = 0;
+
/*
* Main entry point.
*/
@@ -162,6 +175,7 @@ main(int argc, char **argv)
{"ignore", required_argument, NULL, 'i'},
{"manifest-path", required_argument, NULL, 'm'},
{"no-parse-wal", no_argument, NULL, 'n'},
+ {"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
{"wal-directory", required_argument, NULL, 'w'},
@@ -174,7 +188,6 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- bool skip_checksums = false;
char *wal_directory = NULL;
char *pg_waldump_path = NULL;
@@ -219,7 +232,7 @@ main(int argc, char **argv)
simple_string_list_append(&context.ignore_list, "recovery.signal");
simple_string_list_append(&context.ignore_list, "standby.signal");
- while ((c = getopt_long(argc, argv, "ei:m:nqsw:", long_options, NULL)) != -1)
+ while ((c = getopt_long(argc, argv, "ei:m:nPqsw:", long_options, NULL)) != -1)
{
switch (c)
{
@@ -241,6 +254,9 @@ main(int argc, char **argv)
case 'n':
no_parse_wal = true;
break;
+ case 'P':
+ show_progress = true;
+ break;
case 'q':
quiet = true;
break;
@@ -277,6 +293,10 @@ main(int argc, char **argv)
exit(1);
}
+ /* Complain if the specified arguments conflict */
+ if (show_progress && quiet)
+ pg_fatal("cannot specify both %s and %s", "-P/--progress", "-q/--quiet");
+
/* Unless --no-parse-wal was specified, we will need pg_waldump. */
if (!no_parse_wal)
{
@@ -638,6 +658,10 @@ verify_backup_file(verifier_context *context, char *relpath, char *fullpath)
m->bad = true;
}
+ /* Update statistics for progress report, if necessary */
+ if (show_progress && !skip_checksums && should_check_checksums(m))
+ total_size += m->size;
+
/*
* We don't verify checksums at this stage. We first finish verifying that
* we have the expected set of files with the expected sizes, and only
@@ -675,11 +699,12 @@ verify_backup_checksums(verifier_context *context)
manifest_files_iterator it;
manifest_file *m;
+ progress_report(false);
+
manifest_files_start_iterate(context->ht, &it);
while ((m = manifest_files_iterate(context->ht, &it)) != NULL)
{
- if (m->matched && !m->bad && m->checksum_type != CHECKSUM_TYPE_NONE &&
- !should_ignore_relpath(context, m->pathname))
+ if (should_check_checksums(m) && !should_ignore_relpath(context, m->pathname))
{
char *fullpath;
@@ -694,6 +719,8 @@ verify_backup_checksums(verifier_context *context)
pfree(fullpath);
}
}
+
+ progress_report(true);
}
/*
@@ -740,6 +767,10 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
close(fd);
return;
}
+
+ /* Report progress */
+ done_size += rc;
+ progress_report(false);
}
if (rc < 0)
report_backup_error(context, "could not read file \"%s\": %m",
@@ -894,6 +925,51 @@ hash_string_pointer(char *s)
return hash_bytes(ss, strlen(s));
}
+/*
+ * Print a progress report based on the global variables.
+ *
+ * Progress report is written at maximum once per second, unless the finished
+ * parameter is set to true.
+ *
+ * If finished is set to true, this is the last progress report. The cursor
+ * is moved to the next line.
+ */
+static void
+progress_report(bool finished)
+{
+ static pg_time_t last_progress_report = 0;
+ pg_time_t now;
+ int percent_size = 0;
+ char totalsize_str[32];
+ char donesize_str[32];
+
+ if (!show_progress)
+ return;
+
+ now = time(NULL);
+ if (now == last_progress_report && !finished)
+ return; /* Max once per second */
+
+ last_progress_report = now;
+ percent_size = total_size ? (int) ((done_size * 100 / total_size)) : 0;
+
+ snprintf(totalsize_str, sizeof(totalsize_str), UINT64_FORMAT,
+ total_size / 1024);
+ snprintf(donesize_str, sizeof(donesize_str), UINT64_FORMAT,
+ done_size / 1024);
+
+ fprintf(stderr,
+ _("%*s/%s kB (%d%%) verified"),
+ (int) strlen(totalsize_str),
+ donesize_str, totalsize_str, percent_size);
+
+ /*
+ * Stay on the same line if reporting to a terminal and we're not done
+ * yet.
+ */
+ fputc((!finished && isatty(fileno(stderr))) ? '\r' : '\n', stderr);
+}
+
/*
* Print out usage information and exit.
*/
@@ -907,6 +983,7 @@ usage(void)
printf(_(" -i, --ignore=RELATIVE_PATH ignore indicated path\n"));
printf(_(" -m, --manifest-path=PATH use specified path for manifest\n"));
printf(_(" -n, --no-parse-wal do not try to parse WAL files\n"));
+ printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
diff --git a/src/bin/pg_verifybackup/t/004_options.pl b/src/bin/pg_verifybackup/t/004_options.pl
index 25c485e0ee..466d79c967 100644
--- a/src/bin/pg_verifybackup/t/004_options.pl
+++ b/src/bin/pg_verifybackup/t/004_options.pl
@@ -28,6 +28,12 @@ ok($result, "-q succeeds: exit code 0");
is($stdout, '', "-q succeeds: no stdout");
is($stderr, '', "-q succeeds: no stderr");
+# Test invalid options
+command_fails_like(
+ [ 'pg_verifybackup', '-P', '-q', $backup_path ],
+ qr{cannot specify both -P/--progress and -q/--quiet},
+ 'cannot use options -P and -q at the same time');
+
# Corrupt the PG_VERSION file.
my $version_pathname = "$backup_path/PG_VERSION";
my $version_contents = slurp_file($version_pathname);
--
2.31.1
On Thu, Feb 02, 2023 at 02:57:44PM +0900, Masahiko Sawada wrote:
It seems that the --progress option doesn't work with command_like()
since the progress information is written in stderr but command_like()
doesn't want it.
What about command_checks_all()? It should check for stderr, stdout
as well as the expected error code.
--
Michael
On Thu, Feb 2, 2023 at 3:12 PM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Feb 02, 2023 at 02:57:44PM +0900, Masahiko Sawada wrote:
It seems that the --progress option doesn't work with command_like()
since the progress information is written in stderr but command_like()
doesn't want it.What about command_checks_all()? It should check for stderr, stdout
as well as the expected error code.
Seems a good idea. Please find an attached patch.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
v3-0001-Add-progress-reporting-to-pg_verifybackup.patchapplication/octet-stream; name=v3-0001-Add-progress-reporting-to-pg_verifybackup.patchDownload
From c9d70aab87cf73ca86a820f89d4995d233055fc6 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 6 Jan 2023 13:26:09 +0900
Subject: [PATCH v3] Add progress reporting to pg_verifybackup
This adds a new option to pg_verifybackup called -P/--progress,
showing every second some information about the state of checksum
verification.
Similarly to what is done for pg_rewind and pg_basebackup, the
information printed in the progress report consists of the current
amount of data computed and the total amount of data to compute. This
could be extended later on.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 15 ++++
src/bin/pg_verifybackup/pg_verifybackup.c | 85 +++++++++++++++++++++--
src/bin/pg_verifybackup/t/004_options.pl | 18 +++--
3 files changed, 110 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 5f83c98706..36335e0a18 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -178,6 +178,21 @@ PostgreSQL documentation
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>-P</option></term>
+ <term><option>--progress</option></term>
+ <listitem>
+ <para>
+ Enable progress reporting. Turning this on will deliver a progress
+ report while verifying checksums.
+ </para>
+ <para>
+ This option cannot be used together with the option
+ <option>--quiet</option>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-q</option></term>
<term><option>--quiet</option></term>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 7634dfc285..ca0bcabb6d 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -16,12 +16,14 @@
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
+#include <time.h>
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
#include "getopt_long.h"
#include "parse_manifest.h"
+#include "pgtime.h"
/*
* For efficiency, we'd like our hash table containing information about the
@@ -57,6 +59,8 @@ typedef struct manifest_file
bool matched;
bool bad;
} manifest_file;
+#define should_check_checksums(m) \
+ (((m)->matched) && !((m)->bad) && (((m)->checksum_type) != CHECKSUM_TYPE_NONE))
/*
* Define a hash table which we can use to store information about the files
@@ -147,10 +151,19 @@ static void report_fatal_error(const char *pg_restrict fmt,...)
pg_attribute_printf(1, 2) pg_attribute_noreturn();
static bool should_ignore_relpath(verifier_context *context, char *relpath);
+static void progress_report(bool finished);
static void usage(void);
static const char *progname;
+/* options */
+static bool show_progress = false;
+static bool skip_checksums = false;
+
+/* Progress indicators */
+static uint64 total_size = 0;
+static uint64 done_size = 0;
+
/*
* Main entry point.
*/
@@ -162,6 +175,7 @@ main(int argc, char **argv)
{"ignore", required_argument, NULL, 'i'},
{"manifest-path", required_argument, NULL, 'm'},
{"no-parse-wal", no_argument, NULL, 'n'},
+ {"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
{"wal-directory", required_argument, NULL, 'w'},
@@ -174,7 +188,6 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- bool skip_checksums = false;
char *wal_directory = NULL;
char *pg_waldump_path = NULL;
@@ -219,7 +232,7 @@ main(int argc, char **argv)
simple_string_list_append(&context.ignore_list, "recovery.signal");
simple_string_list_append(&context.ignore_list, "standby.signal");
- while ((c = getopt_long(argc, argv, "ei:m:nqsw:", long_options, NULL)) != -1)
+ while ((c = getopt_long(argc, argv, "ei:m:nPqsw:", long_options, NULL)) != -1)
{
switch (c)
{
@@ -241,6 +254,9 @@ main(int argc, char **argv)
case 'n':
no_parse_wal = true;
break;
+ case 'P':
+ show_progress = true;
+ break;
case 'q':
quiet = true;
break;
@@ -277,6 +293,10 @@ main(int argc, char **argv)
exit(1);
}
+ /* Complain if the specified arguments conflict */
+ if (show_progress && quiet)
+ pg_fatal("cannot specify both %s and %s", "-P/--progress", "-q/--quiet");
+
/* Unless --no-parse-wal was specified, we will need pg_waldump. */
if (!no_parse_wal)
{
@@ -638,6 +658,10 @@ verify_backup_file(verifier_context *context, char *relpath, char *fullpath)
m->bad = true;
}
+ /* Update statistics for progress report, if necessary */
+ if (show_progress && !skip_checksums && should_check_checksums(m))
+ total_size += m->size;
+
/*
* We don't verify checksums at this stage. We first finish verifying that
* we have the expected set of files with the expected sizes, and only
@@ -675,11 +699,12 @@ verify_backup_checksums(verifier_context *context)
manifest_files_iterator it;
manifest_file *m;
+ progress_report(false);
+
manifest_files_start_iterate(context->ht, &it);
while ((m = manifest_files_iterate(context->ht, &it)) != NULL)
{
- if (m->matched && !m->bad && m->checksum_type != CHECKSUM_TYPE_NONE &&
- !should_ignore_relpath(context, m->pathname))
+ if (should_check_checksums(m) && !should_ignore_relpath(context, m->pathname))
{
char *fullpath;
@@ -694,6 +719,8 @@ verify_backup_checksums(verifier_context *context)
pfree(fullpath);
}
}
+
+ progress_report(true);
}
/*
@@ -740,6 +767,10 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
close(fd);
return;
}
+
+ /* Report progress */
+ done_size += rc;
+ progress_report(false);
}
if (rc < 0)
report_backup_error(context, "could not read file \"%s\": %m",
@@ -894,6 +925,51 @@ hash_string_pointer(char *s)
return hash_bytes(ss, strlen(s));
}
+/*
+ * Print a progress report based on the global variables.
+ *
+ * Progress report is written at maximum once per second, unless the finished
+ * parameter is set to true.
+ *
+ * If finished is set to true, this is the last progress report. The cursor
+ * is moved to the next line.
+ */
+static void
+progress_report(bool finished)
+{
+ static pg_time_t last_progress_report = 0;
+ pg_time_t now;
+ int percent_size = 0;
+ char totalsize_str[32];
+ char donesize_str[32];
+
+ if (!show_progress)
+ return;
+
+ now = time(NULL);
+ if (now == last_progress_report && !finished)
+ return; /* Max once per second */
+
+ last_progress_report = now;
+ percent_size = total_size ? (int) ((done_size * 100 / total_size)) : 0;
+
+ snprintf(totalsize_str, sizeof(totalsize_str), UINT64_FORMAT,
+ total_size / 1024);
+ snprintf(donesize_str, sizeof(donesize_str), UINT64_FORMAT,
+ done_size / 1024);
+
+ fprintf(stderr,
+ _("%*s/%s kB (%d%%) verified"),
+ (int) strlen(totalsize_str),
+ donesize_str, totalsize_str, percent_size);
+
+ /*
+ * Stay on the same line if reporting to a terminal and we're not done
+ * yet.
+ */
+ fputc((!finished && isatty(fileno(stderr))) ? '\r' : '\n', stderr);
+}
+
/*
* Print out usage information and exit.
*/
@@ -907,6 +983,7 @@ usage(void)
printf(_(" -i, --ignore=RELATIVE_PATH ignore indicated path\n"));
printf(_(" -m, --manifest-path=PATH use specified path for manifest\n"));
printf(_(" -n, --no-parse-wal do not try to parse WAL files\n"));
+ printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
diff --git a/src/bin/pg_verifybackup/t/004_options.pl b/src/bin/pg_verifybackup/t/004_options.pl
index 25c485e0ee..43eeb84e88 100644
--- a/src/bin/pg_verifybackup/t/004_options.pl
+++ b/src/bin/pg_verifybackup/t/004_options.pl
@@ -28,6 +28,12 @@ ok($result, "-q succeeds: exit code 0");
is($stdout, '', "-q succeeds: no stdout");
is($stderr, '', "-q succeeds: no stderr");
+# Test invalid options
+command_fails_like(
+ [ 'pg_verifybackup', '-P', '-q', $backup_path ],
+ qr{cannot specify both -P/--progress and -q/--quiet},
+ 'cannot use options -P and -q at the same time');
+
# Corrupt the PG_VERSION file.
my $version_pathname = "$backup_path/PG_VERSION";
my $version_contents = slurp_file($version_pathname);
@@ -48,12 +54,16 @@ command_like(
qr/backup successfully verified/,
'-s skips checksumming');
-# Validation should succeed if we ignore the problem file.
-command_like(
- [ 'pg_verifybackup', '-i', 'PG_VERSION', $backup_path ],
- qr/backup successfully verified/,
+# Validation should succeed if we ignore the problem file. Also, check
+# the progress information.
+command_checks_all(
+ [ 'pg_verifybackup', '-P', '-i', 'PG_VERSION', $backup_path ],
+ 0,
+ [qr/backup successfully verified/],
+ [qr{(\d+/\d+ kB \(\d+%\) verified)+}],
'-i ignores problem file');
+
# PG_VERSION is already corrupt; let's try also removing all of pg_xact.
rmtree($backup_path . "/pg_xact");
--
2.31.1
On Thu, Feb 02, 2023 at 05:56:47PM +0900, Masahiko Sawada wrote:
Seems a good idea. Please find an attached patch.
That seems rather OK seen from here. I'll see about getting that
applied except if there is an objection of any kind.
--
Michael
On Sat, Feb 04, 2023 at 12:32:15PM +0900, Michael Paquier wrote:
That seems rather OK seen from here. I'll see about getting that
applied except if there is an objection of any kind.
Okay, I have looked at that again this morning and I've spotted one
tiny issue: specifying --progress with --skip-checksums does not
really make sense.
Ignoring entries with a bad size would lead to incorrect progress
report (for example, say an entry in the manifest has a largely
oversized size number), so your approach on this side is correct. The
application of the ignore list via -i is also correct, as a patch
matching with should_ignore_relpath() does not compute an extra size
for total_size.
I was also wondering for a few minutes while on it whether it would
have been cleaner to move the check for should_ignore_relpath()
directly in verify_file_checksum() and verify_backup_file(), but
nobody has complained about that as being a problem, either.
What do you think about the updated version attached?
--
Michael
Attachments:
v4-0001-Add-progress-reporting-to-pg_verifybackup.patchtext/x-diff; charset=us-asciiDownload
From 08a33cc650fb3cd58e742a79b232dbd5d9843008 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Mon, 6 Feb 2023 09:34:19 +0900
Subject: [PATCH v4] Add progress reporting to pg_verifybackup
This adds a new option to pg_verifybackup called -P/--progress,
showing every second some information about the state of checksum
verification.
Similarly to what is done for pg_rewind and pg_basebackup, the
information printed in the progress report consists of the current
amount of data computed and the total amount of data to compute. This
could be extended later on.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 86 ++++++++++++++++++++++-
src/bin/pg_verifybackup/t/004_options.pl | 21 ++++--
doc/src/sgml/ref/pg_verifybackup.sgml | 15 ++++
3 files changed, 116 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 7634dfc285..8d5cb1c72b 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -16,12 +16,14 @@
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
+#include <time.h>
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
#include "getopt_long.h"
#include "parse_manifest.h"
+#include "pgtime.h"
/*
* For efficiency, we'd like our hash table containing information about the
@@ -57,6 +59,8 @@ typedef struct manifest_file
bool matched;
bool bad;
} manifest_file;
+#define should_check_checksums(m) \
+ (((m)->matched) && !((m)->bad) && (((m)->checksum_type) != CHECKSUM_TYPE_NONE))
/*
* Define a hash table which we can use to store information about the files
@@ -147,10 +151,18 @@ static void report_fatal_error(const char *pg_restrict fmt,...)
pg_attribute_printf(1, 2) pg_attribute_noreturn();
static bool should_ignore_relpath(verifier_context *context, char *relpath);
+static void progress_report(bool finished);
static void usage(void);
static const char *progname;
+/* options */
+static bool show_progress = false;
+
+/* Progress indicators */
+static uint64 total_size = 0;
+static uint64 done_size = 0;
+
/*
* Main entry point.
*/
@@ -162,6 +174,7 @@ main(int argc, char **argv)
{"ignore", required_argument, NULL, 'i'},
{"manifest-path", required_argument, NULL, 'm'},
{"no-parse-wal", no_argument, NULL, 'n'},
+ {"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
{"wal-directory", required_argument, NULL, 'w'},
@@ -219,7 +232,7 @@ main(int argc, char **argv)
simple_string_list_append(&context.ignore_list, "recovery.signal");
simple_string_list_append(&context.ignore_list, "standby.signal");
- while ((c = getopt_long(argc, argv, "ei:m:nqsw:", long_options, NULL)) != -1)
+ while ((c = getopt_long(argc, argv, "ei:m:nPqsw:", long_options, NULL)) != -1)
{
switch (c)
{
@@ -241,6 +254,9 @@ main(int argc, char **argv)
case 'n':
no_parse_wal = true;
break;
+ case 'P':
+ show_progress = true;
+ break;
case 'q':
quiet = true;
break;
@@ -277,6 +293,14 @@ main(int argc, char **argv)
exit(1);
}
+ /* Complain if the specified arguments conflict */
+ if (show_progress && quiet)
+ pg_fatal("cannot specify both %s and %s",
+ "-P/--progress", "-q/--quiet");
+ if (show_progress && skip_checksums)
+ pg_fatal("cannot specify both %s and %s",
+ "-P/--progress", "-s/--skip-checksums");
+
/* Unless --no-parse-wal was specified, we will need pg_waldump. */
if (!no_parse_wal)
{
@@ -638,6 +662,10 @@ verify_backup_file(verifier_context *context, char *relpath, char *fullpath)
m->bad = true;
}
+ /* Update statistics for progress report, if necessary */
+ if (show_progress && should_check_checksums(m))
+ total_size += m->size;
+
/*
* We don't verify checksums at this stage. We first finish verifying that
* we have the expected set of files with the expected sizes, and only
@@ -675,10 +703,12 @@ verify_backup_checksums(verifier_context *context)
manifest_files_iterator it;
manifest_file *m;
+ progress_report(false);
+
manifest_files_start_iterate(context->ht, &it);
while ((m = manifest_files_iterate(context->ht, &it)) != NULL)
{
- if (m->matched && !m->bad && m->checksum_type != CHECKSUM_TYPE_NONE &&
+ if (should_check_checksums(m) &&
!should_ignore_relpath(context, m->pathname))
{
char *fullpath;
@@ -694,6 +724,8 @@ verify_backup_checksums(verifier_context *context)
pfree(fullpath);
}
}
+
+ progress_report(true);
}
/*
@@ -740,6 +772,10 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
close(fd);
return;
}
+
+ /* Report progress */
+ done_size += rc;
+ progress_report(false);
}
if (rc < 0)
report_backup_error(context, "could not read file \"%s\": %m",
@@ -894,6 +930,51 @@ hash_string_pointer(char *s)
return hash_bytes(ss, strlen(s));
}
+/*
+ * Print a progress report based on the global variables.
+ *
+ * Progress report is written at maximum once per second, unless the finished
+ * parameter is set to true.
+ *
+ * If finished is set to true, this is the last progress report. The cursor
+ * is moved to the next line.
+ */
+static void
+progress_report(bool finished)
+{
+ static pg_time_t last_progress_report = 0;
+ pg_time_t now;
+ int percent_size = 0;
+ char totalsize_str[32];
+ char donesize_str[32];
+
+ if (!show_progress)
+ return;
+
+ now = time(NULL);
+ if (now == last_progress_report && !finished)
+ return; /* Max once per second */
+
+ last_progress_report = now;
+ percent_size = total_size ? (int) ((done_size * 100 / total_size)) : 0;
+
+ snprintf(totalsize_str, sizeof(totalsize_str), UINT64_FORMAT,
+ total_size / 1024);
+ snprintf(donesize_str, sizeof(donesize_str), UINT64_FORMAT,
+ done_size / 1024);
+
+ fprintf(stderr,
+ _("%*s/%s kB (%d%%) verified"),
+ (int) strlen(totalsize_str),
+ donesize_str, totalsize_str, percent_size);
+
+ /*
+ * Stay on the same line if reporting to a terminal and we're not done
+ * yet.
+ */
+ fputc((!finished && isatty(fileno(stderr))) ? '\r' : '\n', stderr);
+}
+
/*
* Print out usage information and exit.
*/
@@ -907,6 +988,7 @@ usage(void)
printf(_(" -i, --ignore=RELATIVE_PATH ignore indicated path\n"));
printf(_(" -m, --manifest-path=PATH use specified path for manifest\n"));
printf(_(" -n, --no-parse-wal do not try to parse WAL files\n"));
+ printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
diff --git a/src/bin/pg_verifybackup/t/004_options.pl b/src/bin/pg_verifybackup/t/004_options.pl
index 25c485e0ee..654cba743b 100644
--- a/src/bin/pg_verifybackup/t/004_options.pl
+++ b/src/bin/pg_verifybackup/t/004_options.pl
@@ -28,6 +28,16 @@ ok($result, "-q succeeds: exit code 0");
is($stdout, '', "-q succeeds: no stdout");
is($stderr, '', "-q succeeds: no stderr");
+# Test invalid options
+command_fails_like(
+ [ 'pg_verifybackup', '--progress', '--quiet', $backup_path ],
+ qr{cannot specify both -P/--progress and -q/--quiet},
+ 'cannot use --progress and --quiet at the same time');
+command_fails_like(
+ [ 'pg_verifybackup', '--progress', '--skip-checksums', $backup_path ],
+ qr{cannot specify both -P/--progress and -s/--skip-checksums},
+ 'cannot use options --progress and --skip-checksums at the same time');
+
# Corrupt the PG_VERSION file.
my $version_pathname = "$backup_path/PG_VERSION";
my $version_contents = slurp_file($version_pathname);
@@ -48,10 +58,13 @@ command_like(
qr/backup successfully verified/,
'-s skips checksumming');
-# Validation should succeed if we ignore the problem file.
-command_like(
- [ 'pg_verifybackup', '-i', 'PG_VERSION', $backup_path ],
- qr/backup successfully verified/,
+# Validation should succeed if we ignore the problem file. Also, check
+# the progress information.
+command_checks_all(
+ [ 'pg_verifybackup', '-P', '-i', 'PG_VERSION', $backup_path ],
+ 0,
+ [qr/backup successfully verified/],
+ [qr{(\d+/\d+ kB \(\d+%\) verified)+}],
'-i ignores problem file');
# PG_VERSION is already corrupt; let's try also removing all of pg_xact.
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 5f83c98706..b4e0a52ad3 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -178,6 +178,21 @@ PostgreSQL documentation
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>-P</option></term>
+ <term><option>--progress</option></term>
+ <listitem>
+ <para>
+ Enable progress reporting. Turning this on will deliver a progress
+ report while verifying checksums.
+ </para>
+ <para>
+ This option cannot be used together with the option
+ <option>--quiet</option> or <option>--skip-checksums</option>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-q</option></term>
<term><option>--quiet</option></term>
--
2.39.1
On Mon, Feb 6, 2023 at 9:35 AM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Feb 04, 2023 at 12:32:15PM +0900, Michael Paquier wrote:
That seems rather OK seen from here. I'll see about getting that
applied except if there is an objection of any kind.Okay, I have looked at that again this morning and I've spotted one
tiny issue: specifying --progress with --skip-checksums does not
really make sense.
I thought that too, but I thought it's better to ignore it, instead of
erroring out. For example, we can specify both --disable and
--progress options to pg_checksum commands, but we don't write any
progress information in this case.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Mon, Feb 06, 2023 at 12:27:51PM +0900, Masahiko Sawada wrote:
I thought that too, but I thought it's better to ignore it, instead of
erroring out. For example, we can specify both --disable and
--progress options to pg_checksum commands, but we don't write any
progress information in this case.
Well, if you don't feel strongly about that, that's fine by me as
well, so I have applied your v3 with the tweaks I posted previously,
without the restriction on --skip-checksums.
--
Michael
On Mon, Feb 6, 2023 at 2:45 PM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Feb 06, 2023 at 12:27:51PM +0900, Masahiko Sawada wrote:
I thought that too, but I thought it's better to ignore it, instead of
erroring out. For example, we can specify both --disable and
--progress options to pg_checksum commands, but we don't write any
progress information in this case.Well, if you don't feel strongly about that, that's fine by me as
well, so I have applied your v3 with the tweaks I posted previously,
without the restriction on --skip-checksums.
Thank you!
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com