add --no-sync to pg_upgrade's calls to pg_dump and pg_dumpall
This is likely small potatoes compared to some of the other
pg_upgrade-related improvements I've proposed [0]/messages/by-id/20240418041712.GA3441570@nathanxps13 [1]/messages/by-id/20240503025140.GA1227404@nathanxps13 or plan to propose,
but this is easy enough, and I already wrote the patch, so here it is.
AFAICT there's no reason to bother syncing these dump files to disk. If
someone pulls the plug during pg_upgrade, it's not like you can resume
pg_upgrade from where it left off. Also, I think we skipped syncing before
v10, anyway, as the --no-sync flag was only added in commit 96a7128, which
added the code to sync dump files, too.
[0]: /messages/by-id/20240418041712.GA3441570@nathanxps13
[1]: /messages/by-id/20240503025140.GA1227404@nathanxps13
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-add-no-sync-to-pg_upgrade-s-calls-to-pg_dump-all.patchtext/x-diff; charset=us-asciiDownload
From 998613046fb5ff5c6618964c821109b573bfa89b Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 3 May 2024 10:35:21 -0500
Subject: [PATCH v1 1/1] add --no-sync to pg_upgrade's calls to pg_dump[all]
---
src/bin/pg_upgrade/dump.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
index 29fb45b928..f052a4985c 100644
--- a/src/bin/pg_upgrade/dump.c
+++ b/src/bin/pg_upgrade/dump.c
@@ -22,9 +22,10 @@ generate_old_dump(void)
/* run new pg_dumpall binary for globals */
exec_prog(UTILITY_LOG_FILE, NULL, true, true,
"\"%s/pg_dumpall\" %s --globals-only --quote-all-identifiers "
- "--binary-upgrade %s -f \"%s/%s\"",
+ "--binary-upgrade %s %s -f \"%s/%s\"",
new_cluster.bindir, cluster_conn_opts(&old_cluster),
log_opts.verbose ? "--verbose" : "",
+ GET_MAJOR_VERSION(old_cluster.major_version) >= 1000 ? "--no-sync" : "",
log_opts.dumpdir,
GLOBALS_DUMP_FILE);
check_ok();
@@ -53,9 +54,10 @@ generate_old_dump(void)
parallel_exec_prog(log_file_name, NULL,
"\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
- "--binary-upgrade --format=custom %s --file=\"%s/%s\" %s",
+ "--binary-upgrade --format=custom %s %s --file=\"%s/%s\" %s",
new_cluster.bindir, cluster_conn_opts(&old_cluster),
log_opts.verbose ? "--verbose" : "",
+ GET_MAJOR_VERSION(old_cluster.major_version) >= 1000 ? "--no-sync" : "",
log_opts.dumpdir,
sql_file_name, escaped_connstr.data);
--
2.25.1
On 03.05.24 19:13, Nathan Bossart wrote:
This is likely small potatoes compared to some of the other
pg_upgrade-related improvements I've proposed [0] [1] or plan to propose,
but this is easy enough, and I already wrote the patch, so here it is.
AFAICT there's no reason to bother syncing these dump files to disk. If
someone pulls the plug during pg_upgrade, it's not like you can resume
pg_upgrade from where it left off. Also, I think we skipped syncing before
v10, anyway, as the --no-sync flag was only added in commit 96a7128, which
added the code to sync dump files, too.
Looks good to me.
On Wed, May 08, 2024 at 10:09:46AM +0200, Peter Eisentraut wrote:
On 03.05.24 19:13, Nathan Bossart wrote:
This is likely small potatoes compared to some of the other
pg_upgrade-related improvements I've proposed [0] [1] or plan to propose,
but this is easy enough, and I already wrote the patch, so here it is.
AFAICT there's no reason to bother syncing these dump files to disk. If
someone pulls the plug during pg_upgrade, it's not like you can resume
pg_upgrade from where it left off. Also, I think we skipped syncing before
v10, anyway, as the --no-sync flag was only added in commit 96a7128, which
added the code to sync dump files, too.Looks good to me.
Thanks for looking. I noticed that the version check is unnecessary since
we always use the new binary's pg_dump[all], so I removed that in v2.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v2-0001-add-no-sync-to-pg_upgrade-s-calls-to-pg_dump-all.patchtext/x-diff; charset=us-asciiDownload
From 265a999ed65bf56491f76ae013f705ab64491486 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 3 May 2024 10:35:21 -0500
Subject: [PATCH v2 1/1] add --no-sync to pg_upgrade's calls to pg_dump[all]
---
src/bin/pg_upgrade/dump.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_upgrade/dump.c b/src/bin/pg_upgrade/dump.c
index 29fb45b928..8345f55be8 100644
--- a/src/bin/pg_upgrade/dump.c
+++ b/src/bin/pg_upgrade/dump.c
@@ -22,7 +22,7 @@ generate_old_dump(void)
/* run new pg_dumpall binary for globals */
exec_prog(UTILITY_LOG_FILE, NULL, true, true,
"\"%s/pg_dumpall\" %s --globals-only --quote-all-identifiers "
- "--binary-upgrade %s -f \"%s/%s\"",
+ "--binary-upgrade %s --no-sync -f \"%s/%s\"",
new_cluster.bindir, cluster_conn_opts(&old_cluster),
log_opts.verbose ? "--verbose" : "",
log_opts.dumpdir,
@@ -53,7 +53,7 @@ generate_old_dump(void)
parallel_exec_prog(log_file_name, NULL,
"\"%s/pg_dump\" %s --schema-only --quote-all-identifiers "
- "--binary-upgrade --format=custom %s --file=\"%s/%s\" %s",
+ "--binary-upgrade --format=custom %s --no-sync --file=\"%s/%s\" %s",
new_cluster.bindir, cluster_conn_opts(&old_cluster),
log_opts.verbose ? "--verbose" : "",
log_opts.dumpdir,
--
2.25.1
Nathan Bossart <nathandbossart@gmail.com> writes:
Thanks for looking. I noticed that the version check is unnecessary since
we always use the new binary's pg_dump[all], so I removed that in v2.
+1
regards, tom lane
On Wed, May 08, 2024 at 02:49:58PM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
Thanks for looking. I noticed that the version check is unnecessary since
we always use the new binary's pg_dump[all], so I removed that in v2.+1
+1. Could there be an argument in favor of a backpatch? This is a
performance improvement, but one could also side that the addition of
sync support in pg_dump[all] has made that a regression that we'd
better fix because the flushes don't matter in this context. They
also bring costs for no gain.
--
Michael
On Thu, May 09, 2024 at 09:03:56AM +0900, Michael Paquier wrote:
+1. Could there be an argument in favor of a backpatch? This is a
performance improvement, but one could also side that the addition of
sync support in pg_dump[all] has made that a regression that we'd
better fix because the flushes don't matter in this context. They
also bring costs for no gain.
I don't see a strong need to back-patch this, if for no other reason than
it seems to have gone unnoticed for 7 major versions. Plus, based on my
admittedly limited testing, this is unlikely to provide significant
improvements.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On 9 May 2024, at 21:34, Nathan Bossart <nathandbossart@gmail.com> wrote:
On Thu, May 09, 2024 at 09:03:56AM +0900, Michael Paquier wrote:
+1. Could there be an argument in favor of a backpatch? This is a
performance improvement, but one could also side that the addition of
sync support in pg_dump[all] has made that a regression that we'd
better fix because the flushes don't matter in this context. They
also bring costs for no gain.I don't see a strong need to back-patch this, if for no other reason than
it seems to have gone unnoticed for 7 major versions. Plus, based on my
admittedly limited testing, this is unlikely to provide significant
improvements.
Agreed, this is a nice little improvement but it's unlikely to be enough of a
speedup to warrant changing the backbranches.
--
Daniel Gustafsson