pg_upgrade version checking questions

Started by Tom Lanealmost 7 years ago25 messages
#1Tom Lane
tgl@sss.pgh.pa.us

While poking around trying to find an explanation for the pg_upgrade
failure described here:
/messages/by-id/CACmJi2JUhGo2ZxqDkh-EPHNjEN1ZA1S64uHLJFWHBhUuV4492w@mail.gmail.com
I noticed a few things that seem a bit fishy about pg_upgrade.
I can't (yet) connect any of these to Tomasz' problem, but:

1. check_bin_dir() does validate_exec() for pg_dumpall and pg_dump,
but not for pg_restore, though pg_upgrade surely calls that too.
For that matter, it's not validating initdb and vacuumdb, though
it's grown dependencies on those as well. Seems like there's little
point in checking these if we're not going to check all of them.

2. check_cluster_versions() insists that the target version be the
same major version as pg_upgrade itself, but is that really good enough?
As things stand, it looks like pg_upgrade 11.3 would happily use pg_dump
11.1, or vice versa. With this rule, we cannot safely make any fixes
in minor releases that rely on synchronized changes in the behavior of
pg_upgrade and pg_dump/pg_dumpall/pg_restore. I've not gone looking
to see if we've already made such changes in the past, but even if we
never have, that's a rather tight-looking straitjacket. I think we
should insist that the new_cluster.bin_version be an exact match
to pg_upgrade's own PG_VERSION_NUM.

3. Actually, I'm kind of wondering why pg_upgrade has a --new-bindir
option at all, rather than just insisting on finding the new-version
executables in the same directory it is in. This seems like, at best,
a hangover from before it got into core. Even if you don't want to
remove the option, we could surely provide a useful default setting
based on find_my_exec. (I'm amused to notice that pg_upgrade
currently takes the trouble to find out its own path, and then does
precisely nothing with the information.)

Thoughts?

regards, tom lane

#2Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#1)
Re: pg_upgrade version checking questions

On Mon, Mar 18, 2019 at 07:35:17PM -0400, Tom Lane wrote:

While poking around trying to find an explanation for the pg_upgrade
failure described here:
/messages/by-id/CACmJi2JUhGo2ZxqDkh-EPHNjEN1ZA1S64uHLJFWHBhUuV4492w@mail.gmail.com
I noticed a few things that seem a bit fishy about pg_upgrade.
I can't (yet) connect any of these to Tomasz' problem, but:

1. check_bin_dir() does validate_exec() for pg_dumpall and pg_dump,
but not for pg_restore, though pg_upgrade surely calls that too.
For that matter, it's not validating initdb and vacuumdb, though
it's grown dependencies on those as well. Seems like there's little
point in checking these if we're not going to check all of them.

Yes, adding those checks would be nice. I guess I never suspected there
would be mixed-version binaries in that directory.

2. check_cluster_versions() insists that the target version be the
same major version as pg_upgrade itself, but is that really good enough?
As things stand, it looks like pg_upgrade 11.3 would happily use pg_dump
11.1, or vice versa. With this rule, we cannot safely make any fixes
in minor releases that rely on synchronized changes in the behavior of
pg_upgrade and pg_dump/pg_dumpall/pg_restore. I've not gone looking
to see if we've already made such changes in the past, but even if we
never have, that's a rather tight-looking straitjacket. I think we
should insist that the new_cluster.bin_version be an exact match
to pg_upgrade's own PG_VERSION_NUM.

Again, I never considered minor-version changes, so yeah, forcing minor
version matching makes sense.

3. Actually, I'm kind of wondering why pg_upgrade has a --new-bindir
option at all, rather than just insisting on finding the new-version
executables in the same directory it is in. This seems like, at best,
a hangover from before it got into core. Even if you don't want to
remove the option, we could surely provide a useful default setting
based on find_my_exec. (I'm amused to notice that pg_upgrade
currently takes the trouble to find out its own path, and then does
precisely nothing with the information.)

Good point. You are right that when it was outside of the source tree,
and even in /contrib, that would not have worked easily. Makes sense to
at least default to the same directory as pg_upgrade.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
#3Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#2)
Re: pg_upgrade version checking questions

On Tue, Mar 19, 2019 at 02:43:49AM -0400, Bruce Momjian wrote:

3. Actually, I'm kind of wondering why pg_upgrade has a --new-bindir
option at all, rather than just insisting on finding the new-version
executables in the same directory it is in. This seems like, at best,
a hangover from before it got into core. Even if you don't want to
remove the option, we could surely provide a useful default setting
based on find_my_exec. (I'm amused to notice that pg_upgrade
currently takes the trouble to find out its own path, and then does
precisely nothing with the information.)

Good point. You are right that when it was outside of the source tree,
and even in /contrib, that would not have worked easily. Makes sense to
at least default to the same directory as pg_upgrade.

I guess an open question is whether we should remove the --new-bindir
option completely.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
#4Daniel Gustafsson
daniel@yesql.se
In reply to: Bruce Momjian (#3)
Re: pg_upgrade version checking questions

On Tuesday, March 19, 2019 7:55 AM, Bruce Momjian <bruce@momjian.us> wrote:

On Tue, Mar 19, 2019 at 02:43:49AM -0400, Bruce Momjian wrote:

3. Actually, I'm kind of wondering why pg_upgrade has a --new-bindir
option at all, rather than just insisting on finding the new-version
executables in the same directory it is in. This seems like, at best,
a hangover from before it got into core. Even if you don't want to
remove the option, we could surely provide a useful default setting
based on find_my_exec. (I'm amused to notice that pg_upgrade
currently takes the trouble to find out its own path, and then does
precisely nothing with the information.)

Good point. You are right that when it was outside of the source tree,
and even in /contrib, that would not have worked easily. Makes sense to
at least default to the same directory as pg_upgrade.

I guess an open question is whether we should remove the --new-bindir
option completely.

If the default is made to find the new-version binaries in the same directory,
keeping --new-bindir could still be useful for easier testing of pg_upgrade.

cheers ./daniel

#5Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#1)
Re: pg_upgrade version checking questions

On 2019-03-19 00:35, Tom Lane wrote:

2. check_cluster_versions() insists that the target version be the
same major version as pg_upgrade itself, but is that really good enough?
As things stand, it looks like pg_upgrade 11.3 would happily use pg_dump
11.1, or vice versa.

I'd hesitate to tie this down too much. It's possible that either the
client or the server package cannot currently be upgraded because of
some other dependencies. In fact, a careful packager might as a result
of a change like this tie the client and server packages together with
an exact version match. This has the potential to make the global
dependency hell worse.

3. Actually, I'm kind of wondering why pg_upgrade has a --new-bindir
option at all, rather than just insisting on finding the new-version
executables in the same directory it is in. This seems like, at best,
a hangover from before it got into core. Even if you don't want to
remove the option, we could surely provide a useful default setting
based on find_my_exec.

Previously discussed here:
/messages/by-id/1304710184.28821.9.camel@vanquo.pezone.net
(Summary: right)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#5)
Re: pg_upgrade version checking questions

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 2019-03-19 00:35, Tom Lane wrote:

2. check_cluster_versions() insists that the target version be the
same major version as pg_upgrade itself, but is that really good enough?
As things stand, it looks like pg_upgrade 11.3 would happily use pg_dump
11.1, or vice versa.

I'd hesitate to tie this down too much. It's possible that either the
client or the server package cannot currently be upgraded because of
some other dependencies. In fact, a careful packager might as a result
of a change like this tie the client and server packages together with
an exact version match. This has the potential to make the global
dependency hell worse.

I'm not really getting your point here. Packagers ordinarily tie
those versions together anyway, I'd expect --- there's no upside
to not doing so, and plenty of risk if one doesn't, because of
exactly the sort of coordinated-changes hazard I'm talking about here.

3. Actually, I'm kind of wondering why pg_upgrade has a --new-bindir
option at all, rather than just insisting on finding the new-version
executables in the same directory it is in. This seems like, at best,
a hangover from before it got into core. Even if you don't want to
remove the option, we could surely provide a useful default setting
based on find_my_exec.

Previously discussed here:
/messages/by-id/1304710184.28821.9.camel@vanquo.pezone.net
(Summary: right)

Mmm. The point that a default is of no particular use to scripts is
still valid. Shall we then remove the useless call to find_my_exec?

regards, tom lane

#7Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: pg_upgrade version checking questions

On 2019-03-19 16:51, Tom Lane wrote:

I'm not really getting your point here. Packagers ordinarily tie
those versions together anyway, I'd expect --- there's no upside
to not doing so, and plenty of risk if one doesn't, because of
exactly the sort of coordinated-changes hazard I'm talking about here.

The RPM packages do that, but the Debian packages do not.

3. Actually, I'm kind of wondering why pg_upgrade has a --new-bindir
option at all, rather than just insisting on finding the new-version
executables in the same directory it is in. This seems like, at best,
a hangover from before it got into core. Even if you don't want to
remove the option, we could surely provide a useful default setting
based on find_my_exec.

Previously discussed here:
/messages/by-id/1304710184.28821.9.camel@vanquo.pezone.net
(Summary: right)

Mmm. The point that a default is of no particular use to scripts is
still valid. Shall we then remove the useless call to find_my_exec?

I'm still in favor of defaulting --new-bindir appropriately. It seems
silly not to. We know where the directory is, we don't have to ask anyone.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#8Christoph Berg
myon@debian.org
In reply to: Peter Eisentraut (#7)
Re: pg_upgrade version checking questions

Re: Peter Eisentraut 2019-03-22 <57769959-8960-a9ca-fc9c-4dbb12629b8a@2ndquadrant.com>

I'm still in favor of defaulting --new-bindir appropriately. It seems
silly not to. We know where the directory is, we don't have to ask anyone.

Fwiw I've been wondering why I have to pass that option every time
I've been using pg_upgrade. +1 on making it optional/redundant.

Christoph

#9Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#1)
3 attachment(s)
Re: pg_upgrade version checking questions

On Tuesday, March 19, 2019 12:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I noticed a few things that seem a bit fishy about pg_upgrade.

Attached are three patches which takes a stab at the issues raised here (and
the discussion in this thread):

0001 - Enforces the version check to the full version including the minor
0002 - Tests for all the binaries that pg_upgrade executes
0003 - Make -B default to CWD and remove the exec_path check

Defaulting to CWD for the new bindir has the side effect that the default
sockdir is in the bin/ directory which may be less optimal.

cheers ./daniel

Attachments:

0002-Check-all-used-executables.patchapplication/octet-stream; name=0002-Check-all-used-executables.patchDownload
From 340c4f2a76b770180f49d36cdbfa3218adecd09c Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:50:43 +0100
Subject: [PATCH 2/3] Check all used executables

Expand the validate_exec() calls to cover all the used binaries.
---
 src/bin/pg_upgrade/exec.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index a42e1ac6d3..8e08089b12 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -404,6 +404,9 @@ check_bin_dir(ClusterInfo *cluster)
 		validate_exec(cluster->bindir, "psql");
 		validate_exec(cluster->bindir, "pg_dump");
 		validate_exec(cluster->bindir, "pg_dumpall");
+		validate_exec(cluster->bindir, "pg_restore");
+		validate_exec(cluster->bindir, "initdb");
+		validate_exec(cluster->bindir, "vacuumdb");
 	}
 }
 
-- 
2.14.1.145.gb3622a4ee

0003-Default-new-bindir-to-CWD.patchapplication/octet-stream; name=0003-Default-new-bindir-to-CWD.patchDownload
From 0c44439fff2c43d678b29814fe134e60371a672c Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:57:25 +0100
Subject: [PATCH 3/3] Default new bindir to CWD

Make the current working directory the default for new bindir, and
remove the codepath for finding the exec_path as the result is no
longer used.
---
 doc/src/sgml/ref/pgupgrade.sgml |  1 +
 src/bin/pg_upgrade/option.c     |  4 ++--
 src/bin/pg_upgrade/pg_upgrade.c | 11 -----------
 src/bin/pg_upgrade/pg_upgrade.h |  1 -
 4 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index c896882dd1..cd2e351464 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -89,6 +89,7 @@
       <term><option>-B</option> <replaceable>bindir</replaceable></term>
       <term><option>--new-bindir=</option><replaceable>bindir</replaceable></term>
       <listitem><para>the new PostgreSQL executable directory;
+      default is current working directory;
       environment variable <envar>PGBINNEW</envar></para></listitem>
      </varlistentry>
 
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index ab5e2d6a30..d49b148531 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -250,7 +250,7 @@ parseCommandLine(int argc, char *argv[])
 	/* Get values from env if not already set */
 	check_required_directory(&old_cluster.bindir, "PGBINOLD", false,
 							 "-b", _("old cluster binaries reside"));
-	check_required_directory(&new_cluster.bindir, "PGBINNEW", false,
+	check_required_directory(&new_cluster.bindir, "PGBINNEW", true,
 							 "-B", _("new cluster binaries reside"));
 	check_required_directory(&old_cluster.pgdata, "PGDATAOLD", false,
 							 "-d", _("old cluster data resides"));
@@ -291,7 +291,7 @@ usage(void)
 	printf(_("  pg_upgrade [OPTION]...\n\n"));
 	printf(_("Options:\n"));
 	printf(_("  -b, --old-bindir=BINDIR       old cluster executable directory\n"));
-	printf(_("  -B, --new-bindir=BINDIR       new cluster executable directory\n"));
+	printf(_("  -B, --new-bindir=BINDIR       new cluster executable directory (default CWD)\n"));
 	printf(_("  -c, --check                   check clusters only, don't change any data\n"));
 	printf(_("  -d, --old-datadir=DATADIR     old cluster data directory\n"));
 	printf(_("  -D, --new-datadir=DATADIR     new cluster data directory\n"));
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 92d8940a9f..ed444996ee 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -202,8 +202,6 @@ main(int argc, char **argv)
 static void
 setup(char *argv0, bool *live_check)
 {
-	char		exec_path[MAXPGPATH];	/* full path to my executable */
-
 	/*
 	 * make sure the user has a clean environment, otherwise, we may confuse
 	 * libpq when we connect to one (or both) of the servers.
@@ -245,15 +243,6 @@ setup(char *argv0, bool *live_check)
 			pg_fatal("There seems to be a postmaster servicing the new cluster.\n"
 					 "Please shutdown that postmaster and try again.\n");
 	}
-
-	/* get path to pg_upgrade executable */
-	if (find_my_exec(argv0, exec_path) < 0)
-		pg_fatal("%s: could not find own program executable\n", argv0);
-
-	/* Trim off program name and keep just path */
-	*last_dir_separator(exec_path) = '\0';
-	canonicalize_path(exec_path);
-	os_info.exec_path = pg_strdup(exec_path);
 }
 
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index baeb8ff0f8..04b537a8cd 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -320,7 +320,6 @@ typedef struct
 typedef struct
 {
 	const char *progname;		/* complete pathname for this program */
-	char	   *exec_path;		/* full path to my executable */
 	char	   *user;			/* username for clusters */
 	bool		user_specified; /* user specified on command-line */
 	char	  **old_tablespaces;	/* tablespaces */
-- 
2.14.1.145.gb3622a4ee

0001-Only-allow-upgrades-by-the-same-exact-version-new-bi.patchapplication/octet-stream; name=0001-Only-allow-upgrades-by-the-same-exact-version-new-bi.patchDownload
From e6cd29c0de796f4256e443fba54192cee3896e30 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:49:18 +0100
Subject: [PATCH 1/3] Only allow upgrades by the same exact version new bindir

Extend the check for bin_version to ensure the upgrade is using
the exact same version of binaries as pg_upgrade.
---
 src/bin/pg_upgrade/check.c |  2 ++
 src/bin/pg_upgrade/exec.c  | 18 +++++++++++++-----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..4bafd15715 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -283,6 +283,8 @@ check_cluster_versions(void)
 	if (GET_MAJOR_VERSION(new_cluster.major_version) !=
 		GET_MAJOR_VERSION(new_cluster.bin_version))
 		pg_fatal("New cluster data and binary directories are from different major versions.\n");
+	if (new_cluster.bin_version != PG_VERSION_NUM)
+		pg_fatal("New cluster binaries and upgrade utility are from different versions.\n");
 
 	check_ok();
 }
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 043373f9a5..a42e1ac6d3 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -34,8 +34,10 @@ get_bin_version(ClusterInfo *cluster)
 	char		cmd[MAXPGPATH],
 				cmd_output[MAX_STRING];
 	FILE	   *output;
+	int			matches;
 	int			v1 = 0,
-				v2 = 0;
+				v2 = 0,
+				v3 = 0;
 
 	snprintf(cmd, sizeof(cmd), "\"%s/pg_ctl\" --version", cluster->bindir);
 
@@ -46,17 +48,23 @@ get_bin_version(ClusterInfo *cluster)
 
 	pclose(output);
 
-	if (sscanf(cmd_output, "%*s %*s %d.%d", &v1, &v2) < 1)
+	matches = sscanf(cmd_output, "%*s %*s %d.%d.%d", &v1, &v2, &v3);
+	if (matches < 1)
 		pg_fatal("could not get pg_ctl version output from %s\n", cmd);
 
-	if (v1 < 10)
+	if (matches == 3)
 	{
 		/* old style, e.g. 9.6.1 */
-		cluster->bin_version = v1 * 10000 + v2 * 100;
+		cluster->bin_version = v1 * 10000 + v2 * 100 + v3;
 	}
-	else
+	else if (matches == 2)
 	{
 		/* new style, e.g. 10.1 */
+		cluster->bin_version = v1 * 10000 + v2;
+	}
+	else
+	{
+		/* new style pre-release, e.g. 12devel */
 		cluster->bin_version = v1 * 10000;
 	}
 }
-- 
2.14.1.145.gb3622a4ee

#10Christoph Berg
myon@debian.org
In reply to: Daniel Gustafsson (#9)
Re: pg_upgrade version checking questions

Re: Daniel Gustafsson 2019-03-26 <pC-NMmh4vQLQP76YTwY4AuoD4OdNw9egikekyQpXFpgqmTlGjIZXOTd2W5RDZPpRski5N3ADRrLYgLk6QUuvmuT5fWC9acPAYyDU1AVxJcU=@yesql.se>

0003 - Make -B default to CWD and remove the exec_path check

Defaulting to CWD for the new bindir has the side effect that the default
sockdir is in the bin/ directory which may be less optimal.

Hmm, I would have thought that the default for the new bindir is the
directory where pg_upgrade is located, not the CWD, which is likely to
be ~postgres or the like?

On Debian, the incantation is

/usr/lib/postgresql/12/bin/pg_upgrade \
-b /usr/lib/postgresql/11/bin \
-B /usr/lib/postgresql/12/bin <-- should be redundant

Christoph

#11Daniel Gustafsson
daniel@yesql.se
In reply to: Christoph Berg (#10)
3 attachment(s)
Re: pg_upgrade version checking questions

On Wednesday, March 27, 2019 1:43 PM, Christoph Berg <myon@debian.org> wrote:

Re: Daniel Gustafsson 2019-03-26 pC-NMmh4vQLQP76YTwY4AuoD4OdNw9egikekyQpXFpgqmTlGjIZXOTd2W5RDZPpRski5N3ADRrLYgLk6QUuvmuT5fWC9acPAYyDU1AVxJcU=@yesql.se

0003 - Make -B default to CWD and remove the exec_path check
Defaulting to CWD for the new bindir has the side effect that the default
sockdir is in the bin/ directory which may be less optimal.

Hmm, I would have thought that the default for the new bindir is the
directory where pg_upgrade is located, not the CWD, which is likely to
be ~postgres or the like?

Yes, thinking on it that's obviously better. The attached v2 repurposes the
find_my_exec() check to make the current directory of pg_upgrade the default
for new_cluster.bindir (the other two patches are left as they were).

cheers ./daniel

Attachments:

0001-Only-allow-upgrades-by-the-same-exact-version-new-v2.patchapplication/octet-stream; name=0001-Only-allow-upgrades-by-the-same-exact-version-new-v2.patchDownload
From bb5ece58679a4c62b4b0e08c294c3a4f4c762b58 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:49:18 +0100
Subject: [PATCH 1/3] Only allow upgrades by the same exact version new bindir

Extend the check for bin_version to ensure the upgrade is using
the exact same version of binaries as pg_upgrade.
---
 src/bin/pg_upgrade/check.c |  2 ++
 src/bin/pg_upgrade/exec.c  | 18 +++++++++++++-----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..4bafd15715 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -283,6 +283,8 @@ check_cluster_versions(void)
 	if (GET_MAJOR_VERSION(new_cluster.major_version) !=
 		GET_MAJOR_VERSION(new_cluster.bin_version))
 		pg_fatal("New cluster data and binary directories are from different major versions.\n");
+	if (new_cluster.bin_version != PG_VERSION_NUM)
+		pg_fatal("New cluster binaries and upgrade utility are from different versions.\n");
 
 	check_ok();
 }
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 043373f9a5..a42e1ac6d3 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -34,8 +34,10 @@ get_bin_version(ClusterInfo *cluster)
 	char		cmd[MAXPGPATH],
 				cmd_output[MAX_STRING];
 	FILE	   *output;
+	int			matches;
 	int			v1 = 0,
-				v2 = 0;
+				v2 = 0,
+				v3 = 0;
 
 	snprintf(cmd, sizeof(cmd), "\"%s/pg_ctl\" --version", cluster->bindir);
 
@@ -46,17 +48,23 @@ get_bin_version(ClusterInfo *cluster)
 
 	pclose(output);
 
-	if (sscanf(cmd_output, "%*s %*s %d.%d", &v1, &v2) < 1)
+	matches = sscanf(cmd_output, "%*s %*s %d.%d.%d", &v1, &v2, &v3);
+	if (matches < 1)
 		pg_fatal("could not get pg_ctl version output from %s\n", cmd);
 
-	if (v1 < 10)
+	if (matches == 3)
 	{
 		/* old style, e.g. 9.6.1 */
-		cluster->bin_version = v1 * 10000 + v2 * 100;
+		cluster->bin_version = v1 * 10000 + v2 * 100 + v3;
 	}
-	else
+	else if (matches == 2)
 	{
 		/* new style, e.g. 10.1 */
+		cluster->bin_version = v1 * 10000 + v2;
+	}
+	else
+	{
+		/* new style pre-release, e.g. 12devel */
 		cluster->bin_version = v1 * 10000;
 	}
 }
-- 
2.14.1.145.gb3622a4ee

0002-Check-all-used-executables-v2.patchapplication/octet-stream; name=0002-Check-all-used-executables-v2.patchDownload
From 4603623a283e2dbf6c6765ef028c3c9ddb116500 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:50:43 +0100
Subject: [PATCH 2/3] Check all used executables

Expand the validate_exec() calls to cover all the used binaries.
---
 src/bin/pg_upgrade/exec.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index a42e1ac6d3..8e08089b12 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -404,6 +404,9 @@ check_bin_dir(ClusterInfo *cluster)
 		validate_exec(cluster->bindir, "psql");
 		validate_exec(cluster->bindir, "pg_dump");
 		validate_exec(cluster->bindir, "pg_dumpall");
+		validate_exec(cluster->bindir, "pg_restore");
+		validate_exec(cluster->bindir, "initdb");
+		validate_exec(cluster->bindir, "vacuumdb");
 	}
 }
 
-- 
2.14.1.145.gb3622a4ee

0003-Default-new-bindir-to-exec_path-v2.patchapplication/octet-stream; name=0003-Default-new-bindir-to-exec_path-v2.patchDownload
From 7dbd62f61c9c0aa4dfb639460e864cea81df44c8 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:57:25 +0100
Subject: [PATCH 3/3] Default new bindir to exec_path

Make the directory where the pg_upgrade binary resides the default
for new bindir; move the codepath for finding the exec_path to the
options parsing to support this. The rationale is that since the
new pg_upgrade is highly likely to be executed from the new bindir,
make this the default and avoid the need to separately specify it.
---
 doc/src/sgml/ref/pgupgrade.sgml |  1 +
 src/bin/pg_upgrade/option.c     | 15 ++++++++++++++-
 src/bin/pg_upgrade/pg_upgrade.c | 11 -----------
 src/bin/pg_upgrade/pg_upgrade.h |  1 -
 4 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index c896882dd1..424f0730f1 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -89,6 +89,7 @@
       <term><option>-B</option> <replaceable>bindir</replaceable></term>
       <term><option>--new-bindir=</option><replaceable>bindir</replaceable></term>
       <listitem><para>the new PostgreSQL executable directory;
+      default is the directory where <application>pg_upgrade</application> resides;
       environment variable <envar>PGBINNEW</envar></para></listitem>
      </varlistentry>
 
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index ab5e2d6a30..a54fe35eb1 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -65,6 +65,7 @@ parseCommandLine(int argc, char *argv[])
 	FILE	   *fp;
 	char	  **filename;
 	time_t		run_time = time(NULL);
+	char		exec_path[MAXPGPATH];	/* full path to my executable */
 
 	user_opts.transfer_mode = TRANSFER_MODE_COPY;
 
@@ -74,6 +75,17 @@ parseCommandLine(int argc, char *argv[])
 	old_cluster.port = getenv("PGPORTOLD") ? atoi(getenv("PGPORTOLD")) : DEF_PGUPORT;
 	new_cluster.port = getenv("PGPORTNEW") ? atoi(getenv("PGPORTNEW")) : DEF_PGUPORT;
 
+	/*
+	 * Get path to pg_upgrade executable to use as default in case -B isn't
+	 * supplied by the user. Load the path here for usage() output.
+	 */
+	if (find_my_exec(argv[0], exec_path) < 0)
+		pg_fatal("%s: could not find own program executable\n", argv[0]);
+	/* Trim off program name and keep just path */
+	*last_dir_separator(exec_path) = '\0';
+	canonicalize_path(exec_path);
+	new_cluster.bindir = pg_strdup(exec_path);
+
 	os_user_effective_id = get_user_info(&os_info.user);
 	/* we override just the database user name;  we got the OS id above */
 	if (getenv("PGUSER"))
@@ -114,6 +126,7 @@ parseCommandLine(int argc, char *argv[])
 				break;
 
 			case 'B':
+				pg_free(new_cluster.bindir);
 				new_cluster.bindir = pg_strdup(optarg);
 				break;
 
@@ -291,7 +304,7 @@ usage(void)
 	printf(_("  pg_upgrade [OPTION]...\n\n"));
 	printf(_("Options:\n"));
 	printf(_("  -b, --old-bindir=BINDIR       old cluster executable directory\n"));
-	printf(_("  -B, --new-bindir=BINDIR       new cluster executable directory\n"));
+	printf(_("  -B, --new-bindir=BINDIR       new cluster executable directory (default \"%s\"\n"), new_cluster.bindir);
 	printf(_("  -c, --check                   check clusters only, don't change any data\n"));
 	printf(_("  -d, --old-datadir=DATADIR     old cluster data directory\n"));
 	printf(_("  -D, --new-datadir=DATADIR     new cluster data directory\n"));
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 0b304bbd56..f3d0501fa6 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -204,8 +204,6 @@ main(int argc, char **argv)
 static void
 setup(char *argv0, bool *live_check)
 {
-	char		exec_path[MAXPGPATH];	/* full path to my executable */
-
 	/*
 	 * make sure the user has a clean environment, otherwise, we may confuse
 	 * libpq when we connect to one (or both) of the servers.
@@ -247,15 +245,6 @@ setup(char *argv0, bool *live_check)
 			pg_fatal("There seems to be a postmaster servicing the new cluster.\n"
 					 "Please shutdown that postmaster and try again.\n");
 	}
-
-	/* get path to pg_upgrade executable */
-	if (find_my_exec(argv0, exec_path) < 0)
-		pg_fatal("%s: could not find own program executable\n", argv0);
-
-	/* Trim off program name and keep just path */
-	*last_dir_separator(exec_path) = '\0';
-	canonicalize_path(exec_path);
-	os_info.exec_path = pg_strdup(exec_path);
 }
 
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index baeb8ff0f8..04b537a8cd 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -320,7 +320,6 @@ typedef struct
 typedef struct
 {
 	const char *progname;		/* complete pathname for this program */
-	char	   *exec_path;		/* full path to my executable */
 	char	   *user;			/* username for clusters */
 	bool		user_specified; /* user specified on command-line */
 	char	  **old_tablespaces;	/* tablespaces */
-- 
2.14.1.145.gb3622a4ee

#12Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Daniel Gustafsson (#11)
Re: pg_upgrade version checking questions

On 2019-04-04 15:40, Daniel Gustafsson wrote:

On Wednesday, March 27, 2019 1:43 PM, Christoph Berg <myon@debian.org> wrote:

Re: Daniel Gustafsson 2019-03-26 pC-NMmh4vQLQP76YTwY4AuoD4OdNw9egikekyQpXFpgqmTlGjIZXOTd2W5RDZPpRski5N3ADRrLYgLk6QUuvmuT5fWC9acPAYyDU1AVxJcU=@yesql.se

0003 - Make -B default to CWD and remove the exec_path check
Defaulting to CWD for the new bindir has the side effect that the default
sockdir is in the bin/ directory which may be less optimal.

Hmm, I would have thought that the default for the new bindir is the
directory where pg_upgrade is located, not the CWD, which is likely to
be ~postgres or the like?

Yes, thinking on it that's obviously better. The attached v2 repurposes the
find_my_exec() check to make the current directory of pg_upgrade the default
for new_cluster.bindir (the other two patches are left as they were).

0001-Only-allow-upgrades-by-the-same-exact-version-new-v2.patch

I don't understand what this does. Please explain.

0002-Check-all-used-executables-v2.patch

I think we'd also need a check for pg_controldata.

Perhaps this comment could be improved:

/* these are only needed in the new cluster */

to

/* these are only needed for the target version */

(pg_dump runs on the old cluster but has to be of the new version.)

0003-Default-new-bindir-to-exec_path-v2.patch

I don't like how the find_my_exec() code has been moved around. That
makes the modularity of the code worse. Let's keep it where it was and
then structure it like this:

if -B was given:
new_cluster.bindir = what was given for -B
else:
# existing block

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#13Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#12)
3 attachment(s)
Re: pg_upgrade version checking questions

On 22 Jul 2019, at 10:46, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 2019-04-04 15:40, Daniel Gustafsson wrote:

On Wednesday, March 27, 2019 1:43 PM, Christoph Berg <myon@debian.org> wrote:

Re: Daniel Gustafsson 2019-03-26 pC-NMmh4vQLQP76YTwY4AuoD4OdNw9egikekyQpXFpgqmTlGjIZXOTd2W5RDZPpRski5N3ADRrLYgLk6QUuvmuT5fWC9acPAYyDU1AVxJcU=@yesql.se

0003 - Make -B default to CWD and remove the exec_path check
Defaulting to CWD for the new bindir has the side effect that the default
sockdir is in the bin/ directory which may be less optimal.

Hmm, I would have thought that the default for the new bindir is the
directory where pg_upgrade is located, not the CWD, which is likely to
be ~postgres or the like?

Yes, thinking on it that's obviously better. The attached v2 repurposes the
find_my_exec() check to make the current directory of pg_upgrade the default
for new_cluster.bindir (the other two patches are left as they were).

Thanks for reviewing!

0001-Only-allow-upgrades-by-the-same-exact-version-new-v2.patch

I don't understand what this does. Please explain.

This patch makes the version check stricter to ensure that pg_upgrade and the
new cluster is of the same major and minor version. The code grabs the full
version from the various formats we have (x.y.z, x.z, xdevel) where we used to
skip the minor rev. This is done to address one of Toms original complaints in
this thread.

0002-Check-all-used-executables-v2.patch

I think we'd also need a check for pg_controldata.

Fixed. I also rearranged the new cluster checks to be in alphabetical order
since the list makes more sense then (starting with initdb etc).

Perhaps this comment could be improved:

/* these are only needed in the new cluster */

to

/* these are only needed for the target version */

(pg_dump runs on the old cluster but has to be of the new version.)

I like this suggestion, fixed with a little bit of wordsmithing.

0003-Default-new-bindir-to-exec_path-v2.patch

I don't like how the find_my_exec() code has been moved around. That
makes the modularity of the code worse. Let's keep it where it was and
then structure it like this:

if -B was given:
new_cluster.bindir = what was given for -B
else:
# existing block

The reason for moving is that we print default values in usage(), and that
requires the value to be computed before calling usage(). We already do this
for resolving environment values in parseCommandLine(). If we do it in setup,
then we’d have to split out resolving the new_cluster.bindir into it’s own
function exposed to option.c, or do you have any other suggestions there?

I’ve attached all three patches as v3 to be compatible with the CFBot, only
0002 changed so far.

cheers ./daniel

Attachments:

0003-Default-new-bindir-to-exec_path-v3.patchapplication/octet-stream; name=0003-Default-new-bindir-to-exec_path-v3.patch; x-unix-mode=0644Download
From 17b7b2c6cdd04e85db3759b9b434b2fec12a580a Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:57:25 +0100
Subject: [PATCH 3/3] Default new bindir to exec_path

Make the directory where the pg_upgrade binary resides the default
for new bindir; move the codepath for finding the exec_path to the
options parsing to support this. The rationale is that since the
new pg_upgrade is highly likely to be executed from the new bindir,
make this the default and avoid the need to separately specify it.
---
 doc/src/sgml/ref/pgupgrade.sgml |  1 +
 src/bin/pg_upgrade/option.c     | 15 ++++++++++++++-
 src/bin/pg_upgrade/pg_upgrade.c | 11 -----------
 src/bin/pg_upgrade/pg_upgrade.h |  1 -
 4 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 82886760f1..3b69db909f 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -89,6 +89,7 @@
       <term><option>-B</option> <replaceable>bindir</replaceable></term>
       <term><option>--new-bindir=</option><replaceable>bindir</replaceable></term>
       <listitem><para>the new PostgreSQL executable directory;
+      default is the directory where <application>pg_upgrade</application> resides;
       environment variable <envar>PGBINNEW</envar></para></listitem>
      </varlistentry>
 
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index d76f27c9e8..e7131e9896 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -65,6 +65,7 @@ parseCommandLine(int argc, char *argv[])
 	FILE	   *fp;
 	char	  **filename;
 	time_t		run_time = time(NULL);
+	char		exec_path[MAXPGPATH];	/* full path to my executable */
 
 	user_opts.transfer_mode = TRANSFER_MODE_COPY;
 
@@ -74,6 +75,17 @@ parseCommandLine(int argc, char *argv[])
 	old_cluster.port = getenv("PGPORTOLD") ? atoi(getenv("PGPORTOLD")) : DEF_PGUPORT;
 	new_cluster.port = getenv("PGPORTNEW") ? atoi(getenv("PGPORTNEW")) : DEF_PGUPORT;
 
+	/*
+	 * Get path to pg_upgrade executable to use as default in case -B isn't
+	 * supplied by the user. Load the path here for usage() output.
+	 */
+	if (find_my_exec(argv[0], exec_path) < 0)
+		pg_fatal("%s: could not find own program executable\n", argv[0]);
+	/* Trim off program name and keep just path */
+	*last_dir_separator(exec_path) = '\0';
+	canonicalize_path(exec_path);
+	new_cluster.bindir = pg_strdup(exec_path);
+
 	os_user_effective_id = get_user_info(&os_info.user);
 	/* we override just the database user name;  we got the OS id above */
 	if (getenv("PGUSER"))
@@ -111,6 +123,7 @@ parseCommandLine(int argc, char *argv[])
 				break;
 
 			case 'B':
+				pg_free(new_cluster.bindir);
 				new_cluster.bindir = pg_strdup(optarg);
 				break;
 
@@ -293,7 +306,7 @@ usage(void)
 	printf(_("  pg_upgrade [OPTION]...\n\n"));
 	printf(_("Options:\n"));
 	printf(_("  -b, --old-bindir=BINDIR       old cluster executable directory\n"));
-	printf(_("  -B, --new-bindir=BINDIR       new cluster executable directory\n"));
+	printf(_("  -B, --new-bindir=BINDIR       new cluster executable directory (default \"%s\"\n"), new_cluster.bindir);
 	printf(_("  -c, --check                   check clusters only, don't change any data\n"));
 	printf(_("  -d, --old-datadir=DATADIR     old cluster data directory\n"));
 	printf(_("  -D, --new-datadir=DATADIR     new cluster data directory\n"));
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index d1975aab2b..f51e9334dc 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -204,8 +204,6 @@ main(int argc, char **argv)
 static void
 setup(char *argv0, bool *live_check)
 {
-	char		exec_path[MAXPGPATH];	/* full path to my executable */
-
 	/*
 	 * make sure the user has a clean environment, otherwise, we may confuse
 	 * libpq when we connect to one (or both) of the servers.
@@ -247,15 +245,6 @@ setup(char *argv0, bool *live_check)
 			pg_fatal("There seems to be a postmaster servicing the new cluster.\n"
 					 "Please shutdown that postmaster and try again.\n");
 	}
-
-	/* get path to pg_upgrade executable */
-	if (find_my_exec(argv0, exec_path) < 0)
-		pg_fatal("%s: could not find own program executable\n", argv0);
-
-	/* Trim off program name and keep just path */
-	*last_dir_separator(exec_path) = '\0';
-	canonicalize_path(exec_path);
-	os_info.exec_path = pg_strdup(exec_path);
 }
 
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 5d31750d86..ca6a9efd9c 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -314,7 +314,6 @@ typedef struct
 typedef struct
 {
 	const char *progname;		/* complete pathname for this program */
-	char	   *exec_path;		/* full path to my executable */
 	char	   *user;			/* username for clusters */
 	bool		user_specified; /* user specified on command-line */
 	char	  **old_tablespaces;	/* tablespaces */
-- 
2.14.1.145.gb3622a4ee

0002-Check-all-used-executables-v3.patchapplication/octet-stream; name=0002-Check-all-used-executables-v3.patch; x-unix-mode=0644Download
From 61426f60f9bd7649d3378a79bfb88b0a605528f6 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:50:43 +0100
Subject: [PATCH 2/3] Check all used executables

Expand the validate_exec() calls to cover all the used binaries.
---
 src/bin/pg_upgrade/exec.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index e66170ffdd..f31c9d661a 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -384,6 +384,7 @@ check_bin_dir(ClusterInfo *cluster)
 					  cluster->bindir);
 
 	validate_exec(cluster->bindir, "postgres");
+	validate_exec(cluster->bindir, "pg_controldata");
 	validate_exec(cluster->bindir, "pg_ctl");
 
 	/*
@@ -398,12 +399,20 @@ check_bin_dir(ClusterInfo *cluster)
 		validate_exec(cluster->bindir, "pg_resetxlog");
 	else
 		validate_exec(cluster->bindir, "pg_resetwal");
+
 	if (cluster == &new_cluster)
 	{
-		/* these are only needed in the new cluster */
-		validate_exec(cluster->bindir, "psql");
+		/*
+		 * These binaries are only needed for the target version. pg_dump and
+		 * pg_dumpall are used to dump the old cluster, but must be of the
+		 * target version.
+		 */
+		validate_exec(cluster->bindir, "initdb");
 		validate_exec(cluster->bindir, "pg_dump");
 		validate_exec(cluster->bindir, "pg_dumpall");
+		validate_exec(cluster->bindir, "pg_restore");
+		validate_exec(cluster->bindir, "psql");
+		validate_exec(cluster->bindir, "vacuumdb");
 	}
 }
 
-- 
2.14.1.145.gb3622a4ee

0001-Only-allow-upgrades-by-the-same-exact-version-new-v3.patchapplication/octet-stream; name=0001-Only-allow-upgrades-by-the-same-exact-version-new-v3.patch; x-unix-mode=0644Download
From 77d22dea3ed6823e27d38fcce7bf3c4a27302df4 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:49:18 +0100
Subject: [PATCH 1/3] Only allow upgrades by the same exact version new bindir

Extend the check for bin_version to ensure the upgrade is using
the exact same version of binaries as pg_upgrade.
---
 src/bin/pg_upgrade/check.c |  2 ++
 src/bin/pg_upgrade/exec.c  | 18 +++++++++++++-----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 617270f101..31cc5d79f5 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -283,6 +283,8 @@ check_cluster_versions(void)
 	if (GET_MAJOR_VERSION(new_cluster.major_version) !=
 		GET_MAJOR_VERSION(new_cluster.bin_version))
 		pg_fatal("New cluster data and binary directories are from different major versions.\n");
+	if (new_cluster.bin_version != PG_VERSION_NUM)
+		pg_fatal("New cluster binaries and upgrade utility are from different versions.\n");
 
 	check_ok();
 }
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 0363309328..e66170ffdd 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -34,8 +34,10 @@ get_bin_version(ClusterInfo *cluster)
 	char		cmd[MAXPGPATH],
 				cmd_output[MAX_STRING];
 	FILE	   *output;
+	int			matches;
 	int			v1 = 0,
-				v2 = 0;
+				v2 = 0,
+				v3 = 0;
 
 	snprintf(cmd, sizeof(cmd), "\"%s/pg_ctl\" --version", cluster->bindir);
 
@@ -46,17 +48,23 @@ get_bin_version(ClusterInfo *cluster)
 
 	pclose(output);
 
-	if (sscanf(cmd_output, "%*s %*s %d.%d", &v1, &v2) < 1)
+	matches = sscanf(cmd_output, "%*s %*s %d.%d.%d", &v1, &v2, &v3);
+	if (matches < 1)
 		pg_fatal("could not get pg_ctl version output from %s\n", cmd);
 
-	if (v1 < 10)
+	if (matches == 3)
 	{
 		/* old style, e.g. 9.6.1 */
-		cluster->bin_version = v1 * 10000 + v2 * 100;
+		cluster->bin_version = v1 * 10000 + v2 * 100 + v3;
 	}
-	else
+	else if (matches == 2)
 	{
 		/* new style, e.g. 10.1 */
+		cluster->bin_version = v1 * 10000 + v2;
+	}
+	else
+	{
+		/* new style pre-release, e.g. 12devel */
 		cluster->bin_version = v1 * 10000;
 	}
 }
-- 
2.14.1.145.gb3622a4ee

#14Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Daniel Gustafsson (#13)
Re: pg_upgrade version checking questions

On 2019-07-23 17:30, Daniel Gustafsson wrote:

The reason for moving is that we print default values in usage(), and that
requires the value to be computed before calling usage(). We already do this
for resolving environment values in parseCommandLine(). If we do it in setup,
then we’d have to split out resolving the new_cluster.bindir into it’s own
function exposed to option.c, or do you have any other suggestions there?

I think doing nontrivial work in order to print default values in
usage() is bad practice, because in unfortunate cases it would even
prevent you from calling --help. Also, in this case, it would probably
very often exceed the typical line length of --help output and create
some general ugliness. Writing something like "(default: same as this
pg_upgrade)" would probably achieve just about the same.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#14)
4 attachment(s)
Re: pg_upgrade version checking questions

On 24 Jul 2019, at 22:32, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 2019-07-23 17:30, Daniel Gustafsson wrote:

The reason for moving is that we print default values in usage(), and that
requires the value to be computed before calling usage(). We already do this
for resolving environment values in parseCommandLine(). If we do it in setup,
then we’d have to split out resolving the new_cluster.bindir into it’s own
function exposed to option.c, or do you have any other suggestions there?

I think doing nontrivial work in order to print default values in
usage() is bad practice, because in unfortunate cases it would even
prevent you from calling --help. Also, in this case, it would probably
very often exceed the typical line length of --help output and create
some general ugliness. Writing something like "(default: same as this
pg_upgrade)" would probably achieve just about the same.

Fair enough, those are both excellent points. I’ve shuffled the code around to
move back the check for exec_path to setup (albeit earlier than before due to
where we perform directory checking). This does mean that the directory
checking in the options parsing must learn to cope with missing directories,
which is a bit unfortunate since it’s already doing a few too many things IMHO.
To ensure dogfooding, I also removed the use of -B in ‘make check’ for
pg_upgrade, which should bump the coverage.

Also spotted a typo in a pg_upgrade file header in a file touched by this, so
included it in this thread too as a 0004.

Thanks again for reviewing, much appreciated!

cheers ./daniel

Attachments:

0001-Only-allow-upgrades-by-the-same-exact-version-new-v4.patchapplication/octet-stream; name=0001-Only-allow-upgrades-by-the-same-exact-version-new-v4.patch; x-unix-mode=0644Download
From 86c2b84a282f05873b3cb7c73cbc688ac40f1a5f Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:49:18 +0100
Subject: [PATCH 1/4] Only allow upgrades by the same exact version new bindir

Extend the check for bin_version to ensure the upgrade is using
the exact same version of binaries as pg_upgrade.
---
 src/bin/pg_upgrade/check.c |  2 ++
 src/bin/pg_upgrade/exec.c  | 18 +++++++++++++-----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 617270f101..31cc5d79f5 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -283,6 +283,8 @@ check_cluster_versions(void)
 	if (GET_MAJOR_VERSION(new_cluster.major_version) !=
 		GET_MAJOR_VERSION(new_cluster.bin_version))
 		pg_fatal("New cluster data and binary directories are from different major versions.\n");
+	if (new_cluster.bin_version != PG_VERSION_NUM)
+		pg_fatal("New cluster binaries and upgrade utility are from different versions.\n");
 
 	check_ok();
 }
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 0363309328..e66170ffdd 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -34,8 +34,10 @@ get_bin_version(ClusterInfo *cluster)
 	char		cmd[MAXPGPATH],
 				cmd_output[MAX_STRING];
 	FILE	   *output;
+	int			matches;
 	int			v1 = 0,
-				v2 = 0;
+				v2 = 0,
+				v3 = 0;
 
 	snprintf(cmd, sizeof(cmd), "\"%s/pg_ctl\" --version", cluster->bindir);
 
@@ -46,17 +48,23 @@ get_bin_version(ClusterInfo *cluster)
 
 	pclose(output);
 
-	if (sscanf(cmd_output, "%*s %*s %d.%d", &v1, &v2) < 1)
+	matches = sscanf(cmd_output, "%*s %*s %d.%d.%d", &v1, &v2, &v3);
+	if (matches < 1)
 		pg_fatal("could not get pg_ctl version output from %s\n", cmd);
 
-	if (v1 < 10)
+	if (matches == 3)
 	{
 		/* old style, e.g. 9.6.1 */
-		cluster->bin_version = v1 * 10000 + v2 * 100;
+		cluster->bin_version = v1 * 10000 + v2 * 100 + v3;
 	}
-	else
+	else if (matches == 2)
 	{
 		/* new style, e.g. 10.1 */
+		cluster->bin_version = v1 * 10000 + v2;
+	}
+	else
+	{
+		/* new style pre-release, e.g. 12devel */
 		cluster->bin_version = v1 * 10000;
 	}
 }
-- 
2.14.1.145.gb3622a4ee

0004-Fix-typo-in-pg_upgrade-file-header-v4.patchapplication/octet-stream; name=0004-Fix-typo-in-pg_upgrade-file-header-v4.patch; x-unix-mode=0644Download
From 01694ee1927eaa538cce802913069e69bddfe248 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Thu, 25 Jul 2019 16:17:19 +0200
Subject: [PATCH 4/4] Fix typo in pg_upgrade file header

---
 src/bin/pg_upgrade/option.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index b9f10db3bd..88d4d4bc49 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -1,5 +1,5 @@
 /*
- *	opt.c
+ *	option.c
  *
  *	options functions
  *
-- 
2.14.1.145.gb3622a4ee

0003-Default-new-bindir-to-exec_path-v4.patchapplication/octet-stream; name=0003-Default-new-bindir-to-exec_path-v4.patch; x-unix-mode=0644Download
From 154c7f4fe2ebd2acbaedc8ffd94b555c3b129b07 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:57:25 +0100
Subject: [PATCH 3/4] Default new bindir to exec_path

Make the directory where the pg_upgrade binary resides the default
for new bindir, as running the pg_upgrade binary from where the new
cluster is installed is a very common scenario. Setting this as the
defauly bindir for the new cluster will remove the need to provide
is explicitly via -B in many cases.

To support directories being missing from option parsing, extend the
directory check with a missingOk mode where the path must be filled
at a later point before being used. Also move the exec_path check
to earlier in setup to make sure we know the new cluster bindir when
we scan for required executables.

This removes the exec_path from the OSInfo struct as it is not used
anywhere.
---
 doc/src/sgml/ref/pgupgrade.sgml |  1 +
 src/bin/pg_upgrade/option.c     | 21 +++++++++++++--------
 src/bin/pg_upgrade/pg_upgrade.c | 24 +++++++++++++++---------
 src/bin/pg_upgrade/pg_upgrade.h |  1 -
 src/bin/pg_upgrade/test.sh      |  2 +-
 5 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 82886760f1..3b69db909f 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -89,6 +89,7 @@
       <term><option>-B</option> <replaceable>bindir</replaceable></term>
       <term><option>--new-bindir=</option><replaceable>bindir</replaceable></term>
       <listitem><para>the new PostgreSQL executable directory;
+      default is the directory where <application>pg_upgrade</application> resides;
       environment variable <envar>PGBINNEW</envar></para></listitem>
      </varlistentry>
 
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index d76f27c9e8..b9f10db3bd 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -23,7 +23,8 @@
 static void usage(void);
 static void check_required_directory(char **dirpath,
 									 const char *envVarName, bool useCwd,
-									 const char *cmdLineOption, const char *description);
+									 const char *cmdLineOption, const char *description,
+									 bool missingOk);
 #define FIX_DEFAULT_READ_ONLY "-c default_transaction_read_only=false"
 
 
@@ -251,15 +252,15 @@ parseCommandLine(int argc, char *argv[])
 
 	/* Get values from env if not already set */
 	check_required_directory(&old_cluster.bindir, "PGBINOLD", false,
-							 "-b", _("old cluster binaries reside"));
+							 "-b", _("old cluster binaries reside"), false);
 	check_required_directory(&new_cluster.bindir, "PGBINNEW", false,
-							 "-B", _("new cluster binaries reside"));
+							 "-B", _("new cluster binaries reside"), true);
 	check_required_directory(&old_cluster.pgdata, "PGDATAOLD", false,
-							 "-d", _("old cluster data resides"));
+							 "-d", _("old cluster data resides"), false);
 	check_required_directory(&new_cluster.pgdata, "PGDATANEW", false,
-							 "-D", _("new cluster data resides"));
+							 "-D", _("new cluster data resides"), false);
 	check_required_directory(&user_opts.socketdir, "PGSOCKETDIR", true,
-							 "-s", _("sockets will be created"));
+							 "-s", _("sockets will be created"), false);
 
 #ifdef WIN32
 
@@ -293,7 +294,7 @@ usage(void)
 	printf(_("  pg_upgrade [OPTION]...\n\n"));
 	printf(_("Options:\n"));
 	printf(_("  -b, --old-bindir=BINDIR       old cluster executable directory\n"));
-	printf(_("  -B, --new-bindir=BINDIR       new cluster executable directory\n"));
+	printf(_("  -B, --new-bindir=BINDIR       new cluster executable directory (default pg_upgrade binary dir)"));
 	printf(_("  -c, --check                   check clusters only, don't change any data\n"));
 	printf(_("  -d, --old-datadir=DATADIR     old cluster data directory\n"));
 	printf(_("  -D, --new-datadir=DATADIR     new cluster data directory\n"));
@@ -351,13 +352,15 @@ usage(void)
  *	useCwd		  - true if OK to default to CWD
  *	cmdLineOption - the command line option for this directory
  *	description   - a description of this directory option
+ *	missingOk	  - true if OK that both dirpath and envVarName are NULL
  *
  * We use the last two arguments to construct a meaningful error message if the
  * user hasn't provided the required directory name.
  */
 static void
 check_required_directory(char **dirpath, const char *envVarName, bool useCwd,
-						 const char *cmdLineOption, const char *description)
+						 const char *cmdLineOption, const char *description,
+						 bool missingOk)
 {
 	if (*dirpath == NULL || strlen(*dirpath) == 0)
 	{
@@ -373,6 +376,8 @@ check_required_directory(char **dirpath, const char *envVarName, bool useCwd,
 				pg_fatal("could not determine current directory\n");
 			*dirpath = pg_strdup(cwd);
 		}
+		else if (missingOk)
+			return;
 		else
 			pg_fatal("You must identify the directory where the %s.\n"
 					 "Please use the %s command-line option or the %s environment variable.\n",
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index d1975aab2b..82f7d8b919 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -212,6 +212,21 @@ setup(char *argv0, bool *live_check)
 	 */
 	check_pghost_envvar();
 
+	/*
+	 * In case the user hasn't specified the directory for the new binaries
+	 * with -B, default to using the path of the currently executed pg_upgrade
+	 * binary.
+	 */
+	if (!new_cluster.bindir)
+	{
+		if (find_my_exec(argv0, exec_path) < 0)
+			pg_fatal("%s: could not find own program executable\n", argv0);
+		/* Trim off program name and keep just path */
+		*last_dir_separator(exec_path) = '\0';
+		canonicalize_path(exec_path);
+		new_cluster.bindir = pg_strdup(exec_path);
+	}
+
 	verify_directories();
 
 	/* no postmasters should be running, except for a live check */
@@ -247,15 +262,6 @@ setup(char *argv0, bool *live_check)
 			pg_fatal("There seems to be a postmaster servicing the new cluster.\n"
 					 "Please shutdown that postmaster and try again.\n");
 	}
-
-	/* get path to pg_upgrade executable */
-	if (find_my_exec(argv0, exec_path) < 0)
-		pg_fatal("%s: could not find own program executable\n", argv0);
-
-	/* Trim off program name and keep just path */
-	*last_dir_separator(exec_path) = '\0';
-	canonicalize_path(exec_path);
-	os_info.exec_path = pg_strdup(exec_path);
 }
 
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 5d31750d86..ca6a9efd9c 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -314,7 +314,6 @@ typedef struct
 typedef struct
 {
 	const char *progname;		/* complete pathname for this program */
-	char	   *exec_path;		/* full path to my executable */
 	char	   *user;			/* username for clusters */
 	bool		user_specified; /* user specified on command-line */
 	char	  **old_tablespaces;	/* tablespaces */
diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index 78820247b3..7e44747e39 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -220,7 +220,7 @@ PGDATA="$BASE_PGDATA"
 
 standard_initdb 'initdb'
 
-pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -B "$bindir" -p "$PGPORT" -P "$PGPORT"
+pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "$PGPORT" -P "$PGPORT"
 
 # make sure all directories and files have group permissions, on Unix hosts
 # Windows hosts don't support Unix-y permissions.
-- 
2.14.1.145.gb3622a4ee

0002-Check-all-used-executables-v4.patchapplication/octet-stream; name=0002-Check-all-used-executables-v4.patch; x-unix-mode=0644Download
From a7c9f5bedf973dc60de8966ce3e1fd67d83b6187 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Mon, 25 Mar 2019 23:50:43 +0100
Subject: [PATCH 2/4] Check all used executables

Expand the validate_exec() calls to cover all the used binaries.
---
 src/bin/pg_upgrade/exec.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index e66170ffdd..f31c9d661a 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -384,6 +384,7 @@ check_bin_dir(ClusterInfo *cluster)
 					  cluster->bindir);
 
 	validate_exec(cluster->bindir, "postgres");
+	validate_exec(cluster->bindir, "pg_controldata");
 	validate_exec(cluster->bindir, "pg_ctl");
 
 	/*
@@ -398,12 +399,20 @@ check_bin_dir(ClusterInfo *cluster)
 		validate_exec(cluster->bindir, "pg_resetxlog");
 	else
 		validate_exec(cluster->bindir, "pg_resetwal");
+
 	if (cluster == &new_cluster)
 	{
-		/* these are only needed in the new cluster */
-		validate_exec(cluster->bindir, "psql");
+		/*
+		 * These binaries are only needed for the target version. pg_dump and
+		 * pg_dumpall are used to dump the old cluster, but must be of the
+		 * target version.
+		 */
+		validate_exec(cluster->bindir, "initdb");
 		validate_exec(cluster->bindir, "pg_dump");
 		validate_exec(cluster->bindir, "pg_dumpall");
+		validate_exec(cluster->bindir, "pg_restore");
+		validate_exec(cluster->bindir, "psql");
+		validate_exec(cluster->bindir, "vacuumdb");
 	}
 }
 
-- 
2.14.1.145.gb3622a4ee

#16Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Daniel Gustafsson (#15)
Re: pg_upgrade version checking questions

On 2019-07-25 16:33, Daniel Gustafsson wrote:

Fair enough, those are both excellent points. I’ve shuffled the code around to
move back the check for exec_path to setup (albeit earlier than before due to
where we perform directory checking). This does mean that the directory
checking in the options parsing must learn to cope with missing directories,
which is a bit unfortunate since it’s already doing a few too many things IMHO.
To ensure dogfooding, I also removed the use of -B in ‘make check’ for
pg_upgrade, which should bump the coverage.

Also spotted a typo in a pg_upgrade file header in a file touched by this, so
included it in this thread too as a 0004.

I have committed 0002, 0003, and 0004.

The implementation in 0001 (Only allow upgrades by the same exact
version new bindir) has a problem. It compares (new_cluster.bin_version
!= PG_VERSION_NUM), but new_cluster.bin_version is actually just the
version of pg_ctl, so this is just comparing the version of pg_upgrade
with the version of pg_ctl, which is not wrong, but doesn't really
achieve the full goal of having all binaries match.

I think a better structure would be to add a version check for each
validate_exec() so that each program is checked against pg_upgrade.
This should mirror what find_other_exec() in src/common/exec.c does. In
a better world we would use find_other_exec() directly, but then we
can't support -B. Maybe expand find_other_exec() to support this, or
make a private copy for pg_upgrade to support this. (Also, we have two
copies of validate_exec() around. Maybe this could all be unified.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#16)
Re: pg_upgrade version checking questions

On 27 Jul 2019, at 08:42, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

I have committed 0002, 0003, and 0004.

Thanks!

The implementation in 0001 (Only allow upgrades by the same exact
version new bindir) has a problem. It compares (new_cluster.bin_version
!= PG_VERSION_NUM), but new_cluster.bin_version is actually just the
version of pg_ctl, so this is just comparing the version of pg_upgrade
with the version of pg_ctl, which is not wrong, but doesn't really
achieve the full goal of having all binaries match.

Right, it seemed the cleanest option at the time more or less based on the
issues outlined below.

I think a better structure would be to add a version check for each
validate_exec() so that each program is checked against pg_upgrade.
This should mirror what find_other_exec() in src/common/exec.c does. In
a better world we would use find_other_exec() directly, but then we
can't support -B. Maybe expand find_other_exec() to support this, or
make a private copy for pg_upgrade to support this. (Also, we have two
copies of validate_exec() around. Maybe this could all be unified.)

I’ll take a stab at tidying all of this up to require less duplication, we’ll
see where that ends up.

cheers ./daniel

#18Thomas Munro
thomas.munro@gmail.com
In reply to: Daniel Gustafsson (#17)
Re: pg_upgrade version checking questions

On Wed, Jul 31, 2019 at 3:13 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Jul 2019, at 08:42, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
I have committed 0002, 0003, and 0004.

Thanks!

The implementation in 0001 (Only allow upgrades by the same exact
version new bindir) has a problem. It compares (new_cluster.bin_version
!= PG_VERSION_NUM), but new_cluster.bin_version is actually just the
version of pg_ctl, so this is just comparing the version of pg_upgrade
with the version of pg_ctl, which is not wrong, but doesn't really
achieve the full goal of having all binaries match.

Right, it seemed the cleanest option at the time more or less based on the
issues outlined below.

I think a better structure would be to add a version check for each
validate_exec() so that each program is checked against pg_upgrade.
This should mirror what find_other_exec() in src/common/exec.c does. In
a better world we would use find_other_exec() directly, but then we
can't support -B. Maybe expand find_other_exec() to support this, or
make a private copy for pg_upgrade to support this. (Also, we have two
copies of validate_exec() around. Maybe this could all be unified.)

I’ll take a stab at tidying all of this up to require less duplication, we’ll
see where that ends up.

Hi Daniel,

I've moved this to the next CF, because it sounds like you're working
on a new version of 0001. If I misunderstood and you're happy with
just 0002-0004 being committed for now, please feel free to mark the
September entry 'Committed'.

--
Thomas Munro
https://enterprisedb.com

#19Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Daniel Gustafsson (#17)
Re: pg_upgrade version checking questions

On 2019-Jul-30, Daniel Gustafsson wrote:

I’ll take a stab at tidying all of this up to require less duplication, we’ll
see where that ends up.

Hello Daniel, are you submitting a new version soon?

Thanks,

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#20Daniel Gustafsson
daniel@yesql.se
In reply to: Alvaro Herrera (#19)
Re: pg_upgrade version checking questions

On 2 Sep 2019, at 19:59, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2019-Jul-30, Daniel Gustafsson wrote:

I’ll take a stab at tidying all of this up to require less duplication, we’ll
see where that ends up.

Hello Daniel, are you submitting a new version soon?

I am working on an updated version which unfortunately got a bit delayed, but
will be submitted shortly (targeting this week).

cheers ./daniel

#21Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#16)
1 attachment(s)
Re: pg_upgrade version checking questions

On 27 Jul 2019, at 08:42, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 2019-07-25 16:33, Daniel Gustafsson wrote:

Fair enough, those are both excellent points. I’ve shuffled the code around to
move back the check for exec_path to setup (albeit earlier than before due to
where we perform directory checking). This does mean that the directory
checking in the options parsing must learn to cope with missing directories,
which is a bit unfortunate since it’s already doing a few too many things IMHO.
To ensure dogfooding, I also removed the use of -B in ‘make check’ for
pg_upgrade, which should bump the coverage.

Also spotted a typo in a pg_upgrade file header in a file touched by this, so
included it in this thread too as a 0004.

I have committed 0002, 0003, and 0004.

The implementation in 0001 (Only allow upgrades by the same exact
version new bindir) has a problem. It compares (new_cluster.bin_version
!= PG_VERSION_NUM), but new_cluster.bin_version is actually just the
version of pg_ctl, so this is just comparing the version of pg_upgrade
with the version of pg_ctl, which is not wrong, but doesn't really
achieve the full goal of having all binaries match.

I think a better structure would be to add a version check for each
validate_exec() so that each program is checked against pg_upgrade.
This should mirror what find_other_exec() in src/common/exec.c does. In
a better world we would use find_other_exec() directly, but then we
can't support -B. Maybe expand find_other_exec() to support this, or
make a private copy for pg_upgrade to support this. (Also, we have two
copies of validate_exec() around. Maybe this could all be unified.)

Turns out I overshot my original estimate of a new 0001 by a hair (by ~530 days
or so) but attached is an updated version.

This exports validate_exec to reduce duplication, and implements a custom
find_other_exec-like function in pg_upgrade to check each binary for the
version number. Keeping a local copy of validate_exec is easy to do if it's
deemed not worth it to export it.

--
Daniel Gustafsson https://vmware.com/

Attachments:

v5-0001-Check-version-of-target-cluster-binaries.patchapplication/octet-stream; name=v5-0001-Check-version-of-target-cluster-binaries.patch; x-unix-mode=0644Download
From 1e7d5d155257bbce1c35c9f0ce2e0fa57b1c7f51 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Tue, 23 Feb 2021 16:48:38 +0100
Subject: [PATCH v5] Check version of target cluster binaries

This expands the binary validation in pg_upgrade with a version
check per binary to ensure the health of the target cluster. In
order to reduce duplication, validate_exec is exported from port.h
and the local copy in pg_upgrade is removed.
---
 src/bin/pg_upgrade/exec.c | 85 +++++++++++++++------------------------
 src/common/exec.c         |  3 +-
 src/include/port.h        |  1 +
 3 files changed, 35 insertions(+), 54 deletions(-)

diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 43a4565c2e..71471143e3 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -16,7 +16,7 @@
 static void check_data_dir(ClusterInfo *cluster);
 static void check_bin_dir(ClusterInfo *cluster);
 static void get_bin_version(ClusterInfo *cluster);
-static void validate_exec(const char *dir, const char *cmdName);
+static void find_exec(const char *dir, const char *program);
 
 #ifdef WIN32
 static int	win32_check_directory_write_permissions(void);
@@ -375,9 +375,9 @@ check_bin_dir(ClusterInfo *cluster)
 		report_status(PG_FATAL, "\"%s\" is not a directory\n",
 					  cluster->bindir);
 
-	validate_exec(cluster->bindir, "postgres");
-	validate_exec(cluster->bindir, "pg_controldata");
-	validate_exec(cluster->bindir, "pg_ctl");
+	find_exec(cluster->bindir, "postgres");
+	find_exec(cluster->bindir, "pg_controldata");
+	find_exec(cluster->bindir, "pg_ctl");
 
 	/*
 	 * Fetch the binary version after checking for the existence of pg_ctl.
@@ -388,9 +388,9 @@ check_bin_dir(ClusterInfo *cluster)
 
 	/* pg_resetxlog has been renamed to pg_resetwal in version 10 */
 	if (GET_MAJOR_VERSION(cluster->bin_version) <= 906)
-		validate_exec(cluster->bindir, "pg_resetxlog");
+		find_exec(cluster->bindir, "pg_resetxlog");
 	else
-		validate_exec(cluster->bindir, "pg_resetwal");
+		find_exec(cluster->bindir, "pg_resetwal");
 
 	if (cluster == &new_cluster)
 	{
@@ -399,63 +399,44 @@ check_bin_dir(ClusterInfo *cluster)
 		 * pg_dumpall are used to dump the old cluster, but must be of the
 		 * target version.
 		 */
-		validate_exec(cluster->bindir, "initdb");
-		validate_exec(cluster->bindir, "pg_dump");
-		validate_exec(cluster->bindir, "pg_dumpall");
-		validate_exec(cluster->bindir, "pg_restore");
-		validate_exec(cluster->bindir, "psql");
-		validate_exec(cluster->bindir, "vacuumdb");
+		find_exec(cluster->bindir, "initdb");
+		find_exec(cluster->bindir, "pg_dump");
+		find_exec(cluster->bindir, "pg_dumpall");
+		find_exec(cluster->bindir, "pg_restore");
+		find_exec(cluster->bindir, "psql");
+		find_exec(cluster->bindir, "vacuumdb");
 	}
 }
 
-
-/*
- * validate_exec()
- *
- * validate "path" as an executable file
- */
 static void
-validate_exec(const char *dir, const char *cmdName)
+find_exec(const char *dir, const char *program)
 {
-	char		path[MAXPGPATH];
-	struct stat buf;
+	char	path[MAXPGPATH];
+	char	line[MAXPGPATH];
+	char	cmd[MAXPGPATH];
+	char	versionstr[128];
+	int		ret;
 
-	snprintf(path, sizeof(path), "%s/%s", dir, cmdName);
+	snprintf(path, sizeof(path), "%s/%s%s", dir, program, EXE);
 
-#ifdef WIN32
-	/* Windows requires a .exe suffix for stat() */
-	if (strlen(path) <= strlen(EXE_EXT) ||
-		pg_strcasecmp(path + strlen(path) - strlen(EXE_EXT), EXE_EXT) != 0)
-		strlcat(path, EXE_EXT, sizeof(path));
-#endif
+	ret = validate_exec(path);
 
-	/*
-	 * Ensure that the file exists and is a regular file.
-	 */
-	if (stat(path, &buf) < 0)
-		pg_fatal("check for \"%s\" failed: %s\n",
-				 path, strerror(errno));
-	else if (!S_ISREG(buf.st_mode))
+	if (ret == -1)
 		pg_fatal("check for \"%s\" failed: not a regular file\n",
 				 path);
-
-	/*
-	 * Ensure that the file is both executable and readable (required for
-	 * dynamic loading).
-	 */
-#ifndef WIN32
-	if (access(path, R_OK) != 0)
-#else
-	if ((buf.st_mode & S_IRUSR) == 0)
-#endif
-		pg_fatal("check for \"%s\" failed: cannot read file (permission denied)\n",
+	else if (ret == -2)
+		pg_fatal("check for \"%s\" failed: cannot execute (permission denied)\n",
 				 path);
 
-#ifndef WIN32
-	if (access(path, X_OK) != 0)
-#else
-	if ((buf.st_mode & S_IXUSR) == 0)
-#endif
-		pg_fatal("check for \"%s\" failed: cannot execute (permission denied)\n",
+	snprintf(cmd, sizeof(cmd), "\"%s\" -V", path);
+
+	if (!pipe_read_line(cmd, line, sizeof(line)))
+		pg_fatal("check for \"%s\" failed: cannot execute\n",
 				 path);
+
+	snprintf(versionstr, sizeof(versionstr), "%s (PostgreSQL) " PG_VERSION "\n", program);
+
+	if (strcmp(line, versionstr) != 0)
+		pg_fatal("check for \"%s\" failed: incorrect version (\"%s\" vs \"%s\")\n",
+				 path, line, versionstr);
 }
diff --git a/src/common/exec.c b/src/common/exec.c
index 6cbc820042..66c3aa6acc 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -49,7 +49,6 @@
 #define getcwd(cwd,len)  GetCurrentDirectory(len, cwd)
 #endif
 
-static int	validate_exec(const char *path);
 static int	resolve_symlinks(char *path);
 
 #ifdef WIN32
@@ -63,7 +62,7 @@ static BOOL GetTokenUser(HANDLE hToken, PTOKEN_USER *ppTokenUser);
  *		  -1 if the regular file "path" does not exist or cannot be executed.
  *		  -2 if the file is otherwise valid but cannot be read.
  */
-static int
+int
 validate_exec(const char *path)
 {
 	struct stat buf;
diff --git a/src/include/port.h b/src/include/port.h
index 6486db9fdd..227ef4b148 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -125,6 +125,7 @@ extern void pgfnames_cleanup(char **filenames);
 extern void set_pglocale_pgservice(const char *argv0, const char *app);
 
 /* Portable way to find and execute binaries (in exec.c) */
+extern int	validate_exec(const char *path);
 extern int	find_my_exec(const char *argv0, char *retpath);
 extern int	find_other_exec(const char *argv0, const char *target,
 							const char *versionstr, char *retpath);
-- 
2.21.1 (Apple Git-122.3)

#22Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Daniel Gustafsson (#21)
Re: pg_upgrade version checking questions

On 23.02.21 17:14, Daniel Gustafsson wrote:

This exports validate_exec to reduce duplication, and implements a custom
find_other_exec-like function in pg_upgrade to check each binary for the
version number. Keeping a local copy of validate_exec is easy to do if it's
deemed not worth it to export it.

This looks mostly okay to me.

The commit message says something about "to ensure the health of the
target cluster", which doesn't make sense to me. Maybe find a better
wording.

The name find_exec() seems not very accurate. It doesn't find anything.
Maybe "check"?

I'm not sure why the new find_exec() adds EXE. AFAIK, this is only
required for stat(), and validate_exec() already does it.

#23Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#22)
1 attachment(s)
Re: pg_upgrade version checking questions

On 2 Mar 2021, at 14:20, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 23.02.21 17:14, Daniel Gustafsson wrote:

This exports validate_exec to reduce duplication, and implements a custom
find_other_exec-like function in pg_upgrade to check each binary for the
version number. Keeping a local copy of validate_exec is easy to do if it's
deemed not worth it to export it.

This looks mostly okay to me.

Thanks for reviewing!

The commit message says something about "to ensure the health of the target cluster", which doesn't make sense to me. Maybe find a better wording.

Reworded in the attached updated version.

The name find_exec() seems not very accurate. It doesn't find anything. Maybe "check"?

I'm not wild about check_exec(), but every other name I could think of was
drastically worse so I went with check_exec.

I'm not sure why the new find_exec() adds EXE. AFAIK, this is only required for stat(), and validate_exec() already does it.

Good point, fixed.

--
Daniel Gustafsson https://vmware.com/

Attachments:

v6-0001-Check-version-of-target-cluster-binaries.patchapplication/octet-stream; name=v6-0001-Check-version-of-target-cluster-binaries.patch; x-unix-mode=0644Download
From e44a9cfd8de54eb4ed61397b674b095a7a26b494 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Tue, 2 Mar 2021 15:53:52 +0100
Subject: [PATCH v6] Check version of target cluster binaries

This expands the binary validation in pg_upgrade with a version
check per binary to ensure that the target cluster installation
only contains binaries from the target version.

In order to reduce duplication, validate_exec is exported from
port.h and the local copy in pg_upgrade is removed.
---
 src/bin/pg_upgrade/exec.c | 85 +++++++++++++++------------------------
 src/common/exec.c         |  3 +-
 src/include/port.h        |  1 +
 3 files changed, 35 insertions(+), 54 deletions(-)

diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 43a4565c2e..270a1c9e57 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -16,7 +16,7 @@
 static void check_data_dir(ClusterInfo *cluster);
 static void check_bin_dir(ClusterInfo *cluster);
 static void get_bin_version(ClusterInfo *cluster);
-static void validate_exec(const char *dir, const char *cmdName);
+static void check_exec(const char *dir, const char *program);
 
 #ifdef WIN32
 static int	win32_check_directory_write_permissions(void);
@@ -375,9 +375,9 @@ check_bin_dir(ClusterInfo *cluster)
 		report_status(PG_FATAL, "\"%s\" is not a directory\n",
 					  cluster->bindir);
 
-	validate_exec(cluster->bindir, "postgres");
-	validate_exec(cluster->bindir, "pg_controldata");
-	validate_exec(cluster->bindir, "pg_ctl");
+	check_exec(cluster->bindir, "postgres");
+	check_exec(cluster->bindir, "pg_controldata");
+	check_exec(cluster->bindir, "pg_ctl");
 
 	/*
 	 * Fetch the binary version after checking for the existence of pg_ctl.
@@ -388,9 +388,9 @@ check_bin_dir(ClusterInfo *cluster)
 
 	/* pg_resetxlog has been renamed to pg_resetwal in version 10 */
 	if (GET_MAJOR_VERSION(cluster->bin_version) <= 906)
-		validate_exec(cluster->bindir, "pg_resetxlog");
+		check_exec(cluster->bindir, "pg_resetxlog");
 	else
-		validate_exec(cluster->bindir, "pg_resetwal");
+		check_exec(cluster->bindir, "pg_resetwal");
 
 	if (cluster == &new_cluster)
 	{
@@ -399,63 +399,44 @@ check_bin_dir(ClusterInfo *cluster)
 		 * pg_dumpall are used to dump the old cluster, but must be of the
 		 * target version.
 		 */
-		validate_exec(cluster->bindir, "initdb");
-		validate_exec(cluster->bindir, "pg_dump");
-		validate_exec(cluster->bindir, "pg_dumpall");
-		validate_exec(cluster->bindir, "pg_restore");
-		validate_exec(cluster->bindir, "psql");
-		validate_exec(cluster->bindir, "vacuumdb");
+		check_exec(cluster->bindir, "initdb");
+		check_exec(cluster->bindir, "pg_dump");
+		check_exec(cluster->bindir, "pg_dumpall");
+		check_exec(cluster->bindir, "pg_restore");
+		check_exec(cluster->bindir, "psql");
+		check_exec(cluster->bindir, "vacuumdb");
 	}
 }
 
-
-/*
- * validate_exec()
- *
- * validate "path" as an executable file
- */
 static void
-validate_exec(const char *dir, const char *cmdName)
+check_exec(const char *dir, const char *program)
 {
-	char		path[MAXPGPATH];
-	struct stat buf;
+	char	path[MAXPGPATH];
+	char	line[MAXPGPATH];
+	char	cmd[MAXPGPATH];
+	char	versionstr[128];
+	int		ret;
 
-	snprintf(path, sizeof(path), "%s/%s", dir, cmdName);
+	snprintf(path, sizeof(path), "%s/%s", dir, program);
 
-#ifdef WIN32
-	/* Windows requires a .exe suffix for stat() */
-	if (strlen(path) <= strlen(EXE_EXT) ||
-		pg_strcasecmp(path + strlen(path) - strlen(EXE_EXT), EXE_EXT) != 0)
-		strlcat(path, EXE_EXT, sizeof(path));
-#endif
+	ret = validate_exec(path);
 
-	/*
-	 * Ensure that the file exists and is a regular file.
-	 */
-	if (stat(path, &buf) < 0)
-		pg_fatal("check for \"%s\" failed: %s\n",
-				 path, strerror(errno));
-	else if (!S_ISREG(buf.st_mode))
+	if (ret == -1)
 		pg_fatal("check for \"%s\" failed: not a regular file\n",
 				 path);
-
-	/*
-	 * Ensure that the file is both executable and readable (required for
-	 * dynamic loading).
-	 */
-#ifndef WIN32
-	if (access(path, R_OK) != 0)
-#else
-	if ((buf.st_mode & S_IRUSR) == 0)
-#endif
-		pg_fatal("check for \"%s\" failed: cannot read file (permission denied)\n",
+	else if (ret == -2)
+		pg_fatal("check for \"%s\" failed: cannot execute (permission denied)\n",
 				 path);
 
-#ifndef WIN32
-	if (access(path, X_OK) != 0)
-#else
-	if ((buf.st_mode & S_IXUSR) == 0)
-#endif
-		pg_fatal("check for \"%s\" failed: cannot execute (permission denied)\n",
+	snprintf(cmd, sizeof(cmd), "\"%s\" -V", path);
+
+	if (!pipe_read_line(cmd, line, sizeof(line)))
+		pg_fatal("check for \"%s\" failed: cannot execute\n",
 				 path);
+
+	snprintf(versionstr, sizeof(versionstr), "%s (PostgreSQL) " PG_VERSION "\n", program);
+
+	if (strcmp(line, versionstr) != 0)
+		pg_fatal("check for \"%s\" failed: incorrect version (\"%s\" vs \"%s\")\n",
+				 path, line, versionstr);
 }
diff --git a/src/common/exec.c b/src/common/exec.c
index 6cbc820042..66c3aa6acc 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -49,7 +49,6 @@
 #define getcwd(cwd,len)  GetCurrentDirectory(len, cwd)
 #endif
 
-static int	validate_exec(const char *path);
 static int	resolve_symlinks(char *path);
 
 #ifdef WIN32
@@ -63,7 +62,7 @@ static BOOL GetTokenUser(HANDLE hToken, PTOKEN_USER *ppTokenUser);
  *		  -1 if the regular file "path" does not exist or cannot be executed.
  *		  -2 if the file is otherwise valid but cannot be read.
  */
-static int
+int
 validate_exec(const char *path)
 {
 	struct stat buf;
diff --git a/src/include/port.h b/src/include/port.h
index 6486db9fdd..227ef4b148 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -125,6 +125,7 @@ extern void pgfnames_cleanup(char **filenames);
 extern void set_pglocale_pgservice(const char *argv0, const char *app);
 
 /* Portable way to find and execute binaries (in exec.c) */
+extern int	validate_exec(const char *path);
 extern int	find_my_exec(const char *argv0, char *retpath);
 extern int	find_other_exec(const char *argv0, const char *target,
 							const char *versionstr, char *retpath);
-- 
2.21.1 (Apple Git-122.3)

#24Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Daniel Gustafsson (#23)
Re: pg_upgrade version checking questions

On 02.03.21 22:51, Daniel Gustafsson wrote:

The commit message says something about "to ensure the health of the target cluster", which doesn't make sense to me. Maybe find a better wording.

Reworded in the attached updated version.

The name find_exec() seems not very accurate. It doesn't find anything. Maybe "check"?

I'm not wild about check_exec(), but every other name I could think of was
drastically worse so I went with check_exec.

I'm not sure why the new find_exec() adds EXE. AFAIK, this is only required for stat(), and validate_exec() already does it.

Good point, fixed.

I committed this. I added a pg_strip_crlf() so that there are no
newlines in the error message. I also slightly reworded the error
message to make the found and expected value distinguishable.

#25Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#24)
Re: pg_upgrade version checking questions

On 3 Mar 2021, at 09:57, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

I committed this. I added a pg_strip_crlf() so that there are no newlines in the error message.

Right, that's much better, thanks!

--
Daniel Gustafsson https://vmware.com/