pipe_read_line for reading arbitrary strings

Started by Daniel Gustafssonalmost 3 years ago16 messages
#1Daniel Gustafsson
daniel@yesql.se
1 attachment(s)

When skimming through pg_rewind during a small review I noticed the use of
pipe_read_line for reading arbitrary data from a pipe, the mechanics of which
seemed odd.

Commit 5b2f4afffe6 refactored find_other_exec() and broke out pipe_read_line()
as a static convenience routine for reading a single line of output to catch a
version number. Many years later, commit a7e8ece41 exposed it externally in
order to read a GUC from postgresql.conf using "postgres -C ..". f06b1c598
also make use of it for reading a version string much like find_other_exec().
Funnily enough, while now used for arbitrary string reading the variable is
still "pgver".

Since the function requires passing a buffer/size, and at most size - 1 bytes
will be read via fgets(), there is a truncation risk when using this for
reading GUCs (like how pg_rewind does, though the risk there is slim to none).

If we are going to continue using this for reading $stuff from pipes, maybe we
should think about presenting a nicer API which removes that risk? Returning
an allocated buffer which contains all the output along the lines of the recent
pg_get_line work seems a lot nicer and safer IMO.

The attached POC diff replace fgets() with pg_get_line(), which may not be an
Ok way to cross the streams (it's clearly not a great fit), but as a POC it
provided a neater interface for reading one-off lines from a pipe IMO. Does
anyone else think this is worth fixing before too many callsites use it, or is
this another case of my fear of silent subtle truncation bugs? =)

--
Daniel Gustafsson

Attachments:

pipe_read_line.diffapplication/octet-stream; name=pipe_read_line.diff; x-unix-mode=0644Download
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index f7f3b8227f..038543defb 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -1042,8 +1042,7 @@ static void
 getRestoreCommand(const char *argv0)
 {
 	int			rc;
-	char		postgres_exec_path[MAXPGPATH],
-				cmd_output[MAXPGPATH];
+	char		postgres_exec_path[MAXPGPATH];
 	PQExpBuffer postgres_cmd;
 
 	if (!restore_wal)
@@ -1092,16 +1091,15 @@ getRestoreCommand(const char *argv0)
 	/* add -C switch, for restore_command */
 	appendPQExpBufferStr(postgres_cmd, " -C restore_command");
 
-	if (!pipe_read_line(postgres_cmd->data, cmd_output, sizeof(cmd_output)))
+	restore_command = pipe_read_line(postgres_cmd->data);
+	if (restore_command == NULL)
 		exit(1);
 
-	(void) pg_strip_crlf(cmd_output);
+	(void) pg_strip_crlf(restore_command);
 
-	if (strcmp(cmd_output, "") == 0)
+	if (strcmp(restore_command, "") == 0)
 		pg_fatal("restore_command is not set in the target cluster");
 
-	restore_command = pg_strdup(cmd_output);
-
 	pg_log_debug("using for rewind restore_command = \'%s\'",
 				 restore_command);
 
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 5b2edebe41..f56f22af4e 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -431,7 +431,7 @@ static void
 check_exec(const char *dir, const char *program, bool check_version)
 {
 	char		path[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 	char		cmd[MAXPGPATH];
 	char		versionstr[128];
 
@@ -442,7 +442,7 @@ check_exec(const char *dir, const char *program, bool check_version)
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", path);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		pg_fatal("check for \"%s\" failed: cannot execute",
 				 path);
 
@@ -456,4 +456,6 @@ check_exec(const char *dir, const char *program, bool check_version)
 			pg_fatal("check for \"%s\" failed: incorrect version: found \"%s\", expected \"%s\"",
 					 path, line, versionstr);
 	}
+
+	pg_free(line);
 }
diff --git a/src/common/exec.c b/src/common/exec.c
index 93b2faf2c4..99130fed11 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -33,6 +33,8 @@
 #endif
 #endif
 
+#include "common/string.h"
+
 /* Inhibit mingw CRT's auto-globbing of command line arguments */
 #if defined(WIN32) && !defined(_MSC_VER)
 extern int	_CRT_glob = 0;		/* 0 turns off globbing; 1 turns it on */
@@ -352,7 +354,7 @@ find_other_exec(const char *argv0, const char *target,
 				const char *versionstr, char *retpath)
 {
 	char		cmd[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 
 	if (find_my_exec(argv0, retpath) < 0)
 		return -1;
@@ -370,46 +372,48 @@ find_other_exec(const char *argv0, const char *target,
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", retpath);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		return -1;
 
 	if (strcmp(line, versionstr) != 0)
+	{
+		pfree(line);
 		return -2;
+	}
 
+	pfree(line);
 	return 0;
 }
 
 
 /*
- * Execute a command in a pipe and read the first line from it.
+ * Execute a command in a pipe and read the first line from it. The returned
+ * string is allocated, the caller is responsible for freeing.
  */
 char *
-pipe_read_line(char *cmd, char *line, int maxsize)
+pipe_read_line(char *cmd)
 {
-	FILE	   *pgver;
+	FILE	   *pipe_cmd;
+	char	   *line;
 
 	fflush(NULL);
 
 	errno = 0;
-	if ((pgver = popen(cmd, "r")) == NULL)
+	if ((pipe_cmd = popen(cmd, "r")) == NULL)
 	{
 		perror("popen failure");
 		return NULL;
 	}
 
-	errno = 0;
-	if (fgets(line, maxsize, pgver) == NULL)
+	line = pg_get_line(pipe_cmd, NULL);
+
+	if (line == NULL)
 	{
-		if (feof(pgver))
-			fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
-		else
-			perror("fgets failure");
-		pclose(pgver);			/* no error checking */
-		return NULL;
+		log_error(errcode_for_file_access(),
+				  _("no data was returned by command \"%s\""), cmd);
 	}
 
-	if (pclose_check(pgver))
-		return NULL;
+	(void) pclose_check(pipe_cmd);
 
 	return line;
 }
diff --git a/src/include/port.h b/src/include/port.h
index e66193bed9..6eac3757d3 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -137,7 +137,7 @@ extern int	validate_exec(const char *path);
 extern int	find_my_exec(const char *argv0, char *retpath);
 extern int	find_other_exec(const char *argv0, const char *target,
 							const char *versionstr, char *retpath);
-extern char *pipe_read_line(char *cmd, char *line, int maxsize);
+extern char *pipe_read_line(char *cmd);
 
 /* Doesn't belong here, but this is used with find_other_exec(), so... */
 #define PG_BACKEND_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"
#2Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#1)
1 attachment(s)
Re: pipe_read_line for reading arbitrary strings

On 7 Mar 2023, at 23:05, Daniel Gustafsson <daniel@yesql.se> wrote:

When skimming through pg_rewind during a small review I noticed the use of
pipe_read_line for reading arbitrary data from a pipe, the mechanics of which
seemed odd.

A rebase of this for the CFBot since I realized I had forgotten to add this to
the July CF.

--
Daniel Gustafsson

Attachments:

v2-0001-Refactor-pipe_read_line-to-return-the-full-line.patchapplication/octet-stream; name=v2-0001-Refactor-pipe_read_line-to-return-the-full-line.patch; x-unix-mode=0644Download
From a66759e24b3cc597f964ec9f9189251dab5f3ca6 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Wed, 17 May 2023 12:55:55 +0200
Subject: [PATCH v2] Refactor pipe_read_line to return the full line

Commit 5b2f4afffe6 refactored find_other_exec() and in the process broke
out pipe_read_line() into a static routine for reading a single line of
output, aimed at reading a version number.  Commit a7e8ece41 later exposed
it externally in order to read a GUC from postgresql.conf using "postgres
-C ..".  f06b1c598 also make use of it for reading a version string much
like find_other_exec().  The internal variable remained "pgver" though.

Since the function requires passing a buffer/size, and at most size - 1
bytes will be read via fgets(), there is a truncation risk when using this
for reading GUCs (like how pg_rewind does, though the risk in this case is
marginal).

To keep this as generic functionality for reading a line from a pipe, this
refactors pipe_read_line into returning an allocated buffer containing all
of the line to remove the risk of silent truncation.

Discussion: https://postgr.es/m/DEDF73CE-D528-49A3-9089-B3592FD671A9@yesql.se
---
 src/bin/pg_rewind/pg_rewind.c | 14 ++++++--------
 src/bin/pg_upgrade/exec.c     |  6 ++++--
 src/common/exec.c             | 36 +++++++++++++++++++----------------
 src/include/port.h            |  2 +-
 4 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index f7f3b8227f..c9fbe740e7 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -1042,8 +1042,7 @@ static void
 getRestoreCommand(const char *argv0)
 {
 	int			rc;
-	char		postgres_exec_path[MAXPGPATH],
-				cmd_output[MAXPGPATH];
+	char		postgres_exec_path[MAXPGPATH];
 	PQExpBuffer postgres_cmd;
 
 	if (!restore_wal)
@@ -1092,16 +1091,15 @@ getRestoreCommand(const char *argv0)
 	/* add -C switch, for restore_command */
 	appendPQExpBufferStr(postgres_cmd, " -C restore_command");
 
-	if (!pipe_read_line(postgres_cmd->data, cmd_output, sizeof(cmd_output)))
-		exit(1);
+	restore_command = pipe_read_line(postgres_cmd->data);
+	if (restore_command == NULL)
+		pg_fatal("unable to read restore_command from target cluster");
 
-	(void) pg_strip_crlf(cmd_output);
+	(void) pg_strip_crlf(restore_command);
 
-	if (strcmp(cmd_output, "") == 0)
+	if (strcmp(restore_command, "") == 0)
 		pg_fatal("restore_command is not set in the target cluster");
 
-	restore_command = pg_strdup(cmd_output);
-
 	pg_log_debug("using for rewind restore_command = \'%s\'",
 				 restore_command);
 
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 5b2edebe41..f56f22af4e 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -431,7 +431,7 @@ static void
 check_exec(const char *dir, const char *program, bool check_version)
 {
 	char		path[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 	char		cmd[MAXPGPATH];
 	char		versionstr[128];
 
@@ -442,7 +442,7 @@ check_exec(const char *dir, const char *program, bool check_version)
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", path);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		pg_fatal("check for \"%s\" failed: cannot execute",
 				 path);
 
@@ -456,4 +456,6 @@ check_exec(const char *dir, const char *program, bool check_version)
 			pg_fatal("check for \"%s\" failed: incorrect version: found \"%s\", expected \"%s\"",
 					 path, line, versionstr);
 	}
+
+	pg_free(line);
 }
diff --git a/src/common/exec.c b/src/common/exec.c
index f209b934df..70c9397643 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -42,6 +42,8 @@
 #endif
 #endif
 
+#include "common/string.h"
+
 /* Inhibit mingw CRT's auto-globbing of command line arguments */
 #if defined(WIN32) && !defined(_MSC_VER)
 extern int	_CRT_glob = 0;		/* 0 turns off globbing; 1 turns it on */
@@ -328,7 +330,7 @@ find_other_exec(const char *argv0, const char *target,
 				const char *versionstr, char *retpath)
 {
 	char		cmd[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 
 	if (find_my_exec(argv0, retpath) < 0)
 		return -1;
@@ -346,46 +348,48 @@ find_other_exec(const char *argv0, const char *target,
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", retpath);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		return -1;
 
 	if (strcmp(line, versionstr) != 0)
+	{
+		pfree(line);
 		return -2;
+	}
 
+	pfree(line);
 	return 0;
 }
 
 
 /*
- * Execute a command in a pipe and read the first line from it.
+ * Execute a command in a pipe and read the first line from it. The returned
+ * string is allocated, the caller is responsible for freeing.
  */
 char *
-pipe_read_line(char *cmd, char *line, int maxsize)
+pipe_read_line(char *cmd)
 {
-	FILE	   *pgver;
+	FILE	   *pipe_cmd;
+	char	   *line;
 
 	fflush(NULL);
 
 	errno = 0;
-	if ((pgver = popen(cmd, "r")) == NULL)
+	if ((pipe_cmd = popen(cmd, "r")) == NULL)
 	{
 		perror("popen failure");
 		return NULL;
 	}
 
-	errno = 0;
-	if (fgets(line, maxsize, pgver) == NULL)
+	line = pg_get_line(pipe_cmd, NULL);
+
+	if (line == NULL)
 	{
-		if (feof(pgver))
-			fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
-		else
-			perror("fgets failure");
-		pclose(pgver);			/* no error checking */
-		return NULL;
+		log_error(errcode_for_file_access(),
+				  _("no data was returned by command \"%s\""), cmd);
 	}
 
-	if (pclose_check(pgver))
-		return NULL;
+	(void) pclose_check(pipe_cmd);
 
 	return line;
 }
diff --git a/src/include/port.h b/src/include/port.h
index a88d403483..76ef003a69 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -137,7 +137,7 @@ extern int	validate_exec(const char *path);
 extern int	find_my_exec(const char *argv0, char *retpath);
 extern int	find_other_exec(const char *argv0, const char *target,
 							const char *versionstr, char *retpath);
-extern char *pipe_read_line(char *cmd, char *line, int maxsize);
+extern char *pipe_read_line(char *cmd);
 
 /* Doesn't belong here, but this is used with find_other_exec(), so... */
 #define PG_BACKEND_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"
-- 
2.32.1 (Apple Git-133)

#3Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Daniel Gustafsson (#1)
Re: pipe_read_line for reading arbitrary strings

On 08/03/2023 00:05, Daniel Gustafsson wrote:

When skimming through pg_rewind during a small review I noticed the use of
pipe_read_line for reading arbitrary data from a pipe, the mechanics of which
seemed odd.

Commit 5b2f4afffe6 refactored find_other_exec() and broke out pipe_read_line()
as a static convenience routine for reading a single line of output to catch a
version number. Many years later, commit a7e8ece41 exposed it externally in
order to read a GUC from postgresql.conf using "postgres -C ..". f06b1c598
also make use of it for reading a version string much like find_other_exec().
Funnily enough, while now used for arbitrary string reading the variable is
still "pgver".

Since the function requires passing a buffer/size, and at most size - 1 bytes
will be read via fgets(), there is a truncation risk when using this for
reading GUCs (like how pg_rewind does, though the risk there is slim to none).

Good point.

If we are going to continue using this for reading $stuff from pipes, maybe we
should think about presenting a nicer API which removes that risk? Returning
an allocated buffer which contains all the output along the lines of the recent
pg_get_line work seems a lot nicer and safer IMO.

+1

/*
* Execute a command in a pipe and read the first line from it. The returned
* string is allocated, the caller is responsible for freeing.
*/
char *
pipe_read_line(char *cmd)

I think it's worth being explicit here that it's palloc'd, or malloc'd
in frontend programs, rather than just "allocated". Like in pg_get_line.

Other than that, LGTM.

--
Heikki Linnakangas
Neon (https://neon.tech)

#4Daniel Gustafsson
daniel@yesql.se
In reply to: Heikki Linnakangas (#3)
Re: pipe_read_line for reading arbitrary strings

On 4 Jul 2023, at 13:59, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 08/03/2023 00:05, Daniel Gustafsson wrote:

If we are going to continue using this for reading $stuff from pipes, maybe we
should think about presenting a nicer API which removes that risk? Returning
an allocated buffer which contains all the output along the lines of the recent
pg_get_line work seems a lot nicer and safer IMO.

+1

Thanks for review!

/*
* Execute a command in a pipe and read the first line from it. The returned
* string is allocated, the caller is responsible for freeing.
*/
char *
pipe_read_line(char *cmd)

I think it's worth being explicit here that it's palloc'd, or malloc'd in frontend programs, rather than just "allocated". Like in pg_get_line.

Good point, I'll make that happen before committing this.

--
Daniel Gustafsson

#5Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#4)
1 attachment(s)
Re: pipe_read_line for reading arbitrary strings

On 4 Jul 2023, at 14:50, Daniel Gustafsson <daniel@yesql.se> wrote:

On 4 Jul 2023, at 13:59, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 08/03/2023 00:05, Daniel Gustafsson wrote:

If we are going to continue using this for reading $stuff from pipes, maybe we
should think about presenting a nicer API which removes that risk? Returning
an allocated buffer which contains all the output along the lines of the recent
pg_get_line work seems a lot nicer and safer IMO.

+1

Thanks for review!

/*
* Execute a command in a pipe and read the first line from it. The returned
* string is allocated, the caller is responsible for freeing.
*/
char *
pipe_read_line(char *cmd)

I think it's worth being explicit here that it's palloc'd, or malloc'd in frontend programs, rather than just "allocated". Like in pg_get_line.

Good point, I'll make that happen before committing this.

Fixed, along with commit message wordsmithing in the attached. Unless objected
to I'll go ahead with this version.

--
Daniel Gustafsson

Attachments:

v3-0001-Refactor-pipe_read_line-to-return-the-full-line.patchapplication/octet-stream; name=v3-0001-Refactor-pipe_read_line-to-return-the-full-line.patch; x-unix-mode=0644Download
From d17d50e5341447c744320d00e06de897e481c9cd Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Mon, 25 Sep 2023 09:46:33 +0200
Subject: [PATCH v3] Refactor pipe_read_line to return the full line

Commit 5b2f4afffe6 refactored find_other_exec() and in the process
created pipe_read_line() into a static routine for reading a single
line of output, aimed at reading version numbers.  Commit a7e8ece41
later exposed it externally in order to read a postgresql.conf GUC
using "postgres -C ..".  Further, f06b1c598 also made use of it for
reading a version string much like find_other_exec().  The internal
variable remained "pgver", even when used for other purposes.

Since the function requires passing a buffer and its size, and at
most size - 1 bytes will be read via fgets(), there is a truncation
risk when using this for reading GUCs (like how pg_rewind does,
though the risk in this case is marginal).

To keep this as generic functionality for reading a line from a pipe,
this refactors pipe_read_line() into returning an allocated buffer
containing all of the line to remove the risk of silent truncation.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/DEDF73CE-D528-49A3-9089-B3592FD671A9@yesql.se
---
 src/bin/pg_rewind/pg_rewind.c | 14 ++++++-------
 src/bin/pg_upgrade/exec.c     |  6 ++++--
 src/common/exec.c             | 37 ++++++++++++++++++++---------------
 src/include/port.h            |  2 +-
 4 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index bfd44a284e..16d5923ae3 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -1055,8 +1055,7 @@ static void
 getRestoreCommand(const char *argv0)
 {
 	int			rc;
-	char		postgres_exec_path[MAXPGPATH],
-				cmd_output[MAXPGPATH];
+	char		postgres_exec_path[MAXPGPATH];
 	PQExpBuffer postgres_cmd;
 
 	if (!restore_wal)
@@ -1105,16 +1104,15 @@ getRestoreCommand(const char *argv0)
 	/* add -C switch, for restore_command */
 	appendPQExpBufferStr(postgres_cmd, " -C restore_command");
 
-	if (!pipe_read_line(postgres_cmd->data, cmd_output, sizeof(cmd_output)))
-		exit(1);
+	restore_command = pipe_read_line(postgres_cmd->data);
+	if (restore_command == NULL)
+		pg_fatal("unable to read restore_command from target cluster");
 
-	(void) pg_strip_crlf(cmd_output);
+	(void) pg_strip_crlf(restore_command);
 
-	if (strcmp(cmd_output, "") == 0)
+	if (strcmp(restore_command, "") == 0)
 		pg_fatal("restore_command is not set in the target cluster");
 
-	restore_command = pg_strdup(cmd_output);
-
 	pg_log_debug("using for rewind restore_command = \'%s\'",
 				 restore_command);
 
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 5b2edebe41..f56f22af4e 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -431,7 +431,7 @@ static void
 check_exec(const char *dir, const char *program, bool check_version)
 {
 	char		path[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 	char		cmd[MAXPGPATH];
 	char		versionstr[128];
 
@@ -442,7 +442,7 @@ check_exec(const char *dir, const char *program, bool check_version)
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", path);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		pg_fatal("check for \"%s\" failed: cannot execute",
 				 path);
 
@@ -456,4 +456,6 @@ check_exec(const char *dir, const char *program, bool check_version)
 			pg_fatal("check for \"%s\" failed: incorrect version: found \"%s\", expected \"%s\"",
 					 path, line, versionstr);
 	}
+
+	pg_free(line);
 }
diff --git a/src/common/exec.c b/src/common/exec.c
index f209b934df..efa539c0e0 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -42,6 +42,8 @@
 #endif
 #endif
 
+#include "common/string.h"
+
 /* Inhibit mingw CRT's auto-globbing of command line arguments */
 #if defined(WIN32) && !defined(_MSC_VER)
 extern int	_CRT_glob = 0;		/* 0 turns off globbing; 1 turns it on */
@@ -328,7 +330,7 @@ find_other_exec(const char *argv0, const char *target,
 				const char *versionstr, char *retpath)
 {
 	char		cmd[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 
 	if (find_my_exec(argv0, retpath) < 0)
 		return -1;
@@ -346,46 +348,49 @@ find_other_exec(const char *argv0, const char *target,
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", retpath);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		return -1;
 
 	if (strcmp(line, versionstr) != 0)
+	{
+		pfree(line);
 		return -2;
+	}
 
+	pfree(line);
 	return 0;
 }
 
 
 /*
- * Execute a command in a pipe and read the first line from it.
+ * Execute a command in a pipe and read the first line from it. The returned
+ * string is palloc'd (malloc'd in frontend code), the caller is responsible
+ * for freeing.
  */
 char *
-pipe_read_line(char *cmd, char *line, int maxsize)
+pipe_read_line(char *cmd)
 {
-	FILE	   *pgver;
+	FILE	   *pipe_cmd;
+	char	   *line;
 
 	fflush(NULL);
 
 	errno = 0;
-	if ((pgver = popen(cmd, "r")) == NULL)
+	if ((pipe_cmd = popen(cmd, "r")) == NULL)
 	{
 		perror("popen failure");
 		return NULL;
 	}
 
-	errno = 0;
-	if (fgets(line, maxsize, pgver) == NULL)
+	line = pg_get_line(pipe_cmd, NULL);
+
+	if (line == NULL)
 	{
-		if (feof(pgver))
-			fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
-		else
-			perror("fgets failure");
-		pclose(pgver);			/* no error checking */
-		return NULL;
+		log_error(errcode_for_file_access(),
+				  _("no data was returned by command \"%s\""), cmd);
 	}
 
-	if (pclose_check(pgver))
-		return NULL;
+	(void) pclose_check(pipe_cmd);
 
 	return line;
 }
diff --git a/src/include/port.h b/src/include/port.h
index 5979139ebc..be8928c8e1 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -137,7 +137,7 @@ extern int	validate_exec(const char *path);
 extern int	find_my_exec(const char *argv0, char *retpath);
 extern int	find_other_exec(const char *argv0, const char *target,
 							const char *versionstr, char *retpath);
-extern char *pipe_read_line(char *cmd, char *line, int maxsize);
+extern char *pipe_read_line(char *cmd);
 
 /* Doesn't belong here, but this is used with find_other_exec(), so... */
 #define PG_BACKEND_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"
-- 
2.32.1 (Apple Git-133)

#6John Naylor
johncnaylorls@gmail.com
In reply to: Daniel Gustafsson (#5)
Re: pipe_read_line for reading arbitrary strings

On Mon, Sep 25, 2023 at 2:55 PM Daniel Gustafsson <daniel@yesql.se> wrote:

Fixed, along with commit message wordsmithing in the attached. Unless objected
to I'll go ahead with this version.

+1

#7Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Daniel Gustafsson (#1)
Re: pipe_read_line for reading arbitrary strings

On 2023-Mar-07, Daniel Gustafsson wrote:

The attached POC diff replace fgets() with pg_get_line(), which may not be an
Ok way to cross the streams (it's clearly not a great fit), but as a POC it
provided a neater interface for reading one-off lines from a pipe IMO. Does
anyone else think this is worth fixing before too many callsites use it, or is
this another case of my fear of silent subtle truncation bugs? =)

I think this is generally a good change.

I think pipe_read_line should have a "%m" in the "no data returned"
error message. pg_read_line is careful to retain errno (and it was
already zero at start), so this should be okay ... or should we set
errno again to zero after popen(), even if it works?

(I'm not sure I buy pg_read_line's use of perror in the backend case.
Maybe this is only okay because the backend doesn't use this code?)

pg_get_line says caller can distinguish error from no-input-before-EOF
with ferror(), but pipe_read_line does no such thing. I wonder what
happens if an NFS mount containing a file being read disappears in the
middle of reading it, for example ... will we think that we completed
reading the file, rather than erroring out?

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"That sort of implies that there are Emacs keystrokes which aren't obscure.
I've been using it daily for 2 years now and have yet to discover any key
sequence which makes any sense." (Paul Thomas)

#8Daniel Gustafsson
daniel@yesql.se
In reply to: Alvaro Herrera (#7)
1 attachment(s)
Re: pipe_read_line for reading arbitrary strings

On 22 Nov 2023, at 13:47, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2023-Mar-07, Daniel Gustafsson wrote:

The attached POC diff replace fgets() with pg_get_line(), which may not be an
Ok way to cross the streams (it's clearly not a great fit), but as a POC it
provided a neater interface for reading one-off lines from a pipe IMO. Does
anyone else think this is worth fixing before too many callsites use it, or is
this another case of my fear of silent subtle truncation bugs? =)

I think this is generally a good change.

Thanks for review!

I think pipe_read_line should have a "%m" in the "no data returned"
error message.

Good point.

pg_read_line is careful to retain errno (and it was
already zero at start), so this should be okay ... or should we set
errno again to zero after popen(), even if it works?

While it shouldn't be needed, reading manpages from a variety of systems
indicates that popen() isn't entirely reliable when it comes to errno so I've
added an explicit errno=0 just to be certain.

(I'm not sure I buy pg_read_line's use of perror in the backend case.
Maybe this is only okay because the backend doesn't use this code?)

In EXEC_BACKEND builds the postmaster will use find_other_exec which in turn
calls pipe_read_line, so there is a possibility. I agree it's proabably not a
good idea, I'll have a look at it separately and will raise on a new thread.

pg_get_line says caller can distinguish error from no-input-before-EOF
with ferror(), but pipe_read_line does no such thing. I wonder what
happens if an NFS mount containing a file being read disappears in the
middle of reading it, for example ... will we think that we completed
reading the file, rather than erroring out?

Interesting, that's an omission which should be fixed. I notice there are a
number of callsites using pg_get_line which skip validating with ferror(), I'll
have a look at those next (posting findings to a new thread).

--
Daniel Gustafsson

Attachments:

v4-0001-Refactor-pipe_read_line-to-return-the-full-line.patchapplication/octet-stream; name=v4-0001-Refactor-pipe_read_line-to-return-the-full-line.patch; x-unix-mode=0644Download
From 8eb38c83ca64c984a28dd231f6bf28533d8237b8 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 24 Nov 2023 10:04:32 +0100
Subject: [PATCH v4] Refactor pipe_read_line to return the full line
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Commit 5b2f4afffe6 refactored find_other_exec() and in the process
created pipe_read_line() into a static routine for reading a single
line of output, aimed at reading version numbers.  Commit a7e8ece41
later exposed it externally in order to read a postgresql.conf GUC
using "postgres -C ..".  Further, f06b1c598 also made use of it for
reading a version string much like find_other_exec().  The internal
variable remained "pgver", even when used for other purposes.

Since the function requires passing a buffer and its size, and at
most size - 1 bytes will be read via fgets(), there is a truncation
risk when using this for reading GUCs (like how pg_rewind does,
though the risk in this case is marginal).

To keep this as generic functionality for reading a line from a pipe,
this refactors pipe_read_line() into returning an allocated buffer
containing all of the line to remove the risk of silent truncation.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/DEDF73CE-D528-49A3-9089-B3592FD671A9@yesql.se
---
 src/bin/pg_rewind/pg_rewind.c | 14 ++++++-------
 src/bin/pg_upgrade/exec.c     |  6 ++++--
 src/common/exec.c             | 39 ++++++++++++++++++++++-------------
 src/include/port.h            |  2 +-
 4 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index bfd44a284e..16d5923ae3 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -1055,8 +1055,7 @@ static void
 getRestoreCommand(const char *argv0)
 {
 	int			rc;
-	char		postgres_exec_path[MAXPGPATH],
-				cmd_output[MAXPGPATH];
+	char		postgres_exec_path[MAXPGPATH];
 	PQExpBuffer postgres_cmd;
 
 	if (!restore_wal)
@@ -1105,16 +1104,15 @@ getRestoreCommand(const char *argv0)
 	/* add -C switch, for restore_command */
 	appendPQExpBufferStr(postgres_cmd, " -C restore_command");
 
-	if (!pipe_read_line(postgres_cmd->data, cmd_output, sizeof(cmd_output)))
-		exit(1);
+	restore_command = pipe_read_line(postgres_cmd->data);
+	if (restore_command == NULL)
+		pg_fatal("unable to read restore_command from target cluster");
 
-	(void) pg_strip_crlf(cmd_output);
+	(void) pg_strip_crlf(restore_command);
 
-	if (strcmp(cmd_output, "") == 0)
+	if (strcmp(restore_command, "") == 0)
 		pg_fatal("restore_command is not set in the target cluster");
 
-	restore_command = pg_strdup(cmd_output);
-
 	pg_log_debug("using for rewind restore_command = \'%s\'",
 				 restore_command);
 
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 5b2edebe41..f56f22af4e 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -431,7 +431,7 @@ static void
 check_exec(const char *dir, const char *program, bool check_version)
 {
 	char		path[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 	char		cmd[MAXPGPATH];
 	char		versionstr[128];
 
@@ -442,7 +442,7 @@ check_exec(const char *dir, const char *program, bool check_version)
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", path);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		pg_fatal("check for \"%s\" failed: cannot execute",
 				 path);
 
@@ -456,4 +456,6 @@ check_exec(const char *dir, const char *program, bool check_version)
 			pg_fatal("check for \"%s\" failed: incorrect version: found \"%s\", expected \"%s\"",
 					 path, line, versionstr);
 	}
+
+	pg_free(line);
 }
diff --git a/src/common/exec.c b/src/common/exec.c
index f209b934df..676cb5014b 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -42,6 +42,8 @@
 #endif
 #endif
 
+#include "common/string.h"
+
 /* Inhibit mingw CRT's auto-globbing of command line arguments */
 #if defined(WIN32) && !defined(_MSC_VER)
 extern int	_CRT_glob = 0;		/* 0 turns off globbing; 1 turns it on */
@@ -328,7 +330,7 @@ find_other_exec(const char *argv0, const char *target,
 				const char *versionstr, char *retpath)
 {
 	char		cmd[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 
 	if (find_my_exec(argv0, retpath) < 0)
 		return -1;
@@ -346,46 +348,55 @@ find_other_exec(const char *argv0, const char *target,
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", retpath);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		return -1;
 
 	if (strcmp(line, versionstr) != 0)
+	{
+		pfree(line);
 		return -2;
+	}
 
+	pfree(line);
 	return 0;
 }
 
 
 /*
- * Execute a command in a pipe and read the first line from it.
+ * Execute a command in a pipe and read the first line from it. The returned
+ * string is palloc'd (malloc'd in frontend code), the caller is responsible
+ * for freeing.
  */
 char *
-pipe_read_line(char *cmd, char *line, int maxsize)
+pipe_read_line(char *cmd)
 {
-	FILE	   *pgver;
+	FILE	   *pipe_cmd;
+	char	   *line;
 
 	fflush(NULL);
 
 	errno = 0;
-	if ((pgver = popen(cmd, "r")) == NULL)
+	if ((pipe_cmd = popen(cmd, "r")) == NULL)
 	{
 		perror("popen failure");
 		return NULL;
 	}
 
+	/* Make sure popen() didn't change errno */
 	errno = 0;
-	if (fgets(line, maxsize, pgver) == NULL)
+	line = pg_get_line(pipe_cmd, NULL);
+
+	if (line == NULL)
 	{
-		if (feof(pgver))
-			fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
+		if (ferror(pipe_cmd))
+			log_error(errcode_for_file_access(),
+					  _("could not read from command \"%s\": %m"), cmd);
 		else
-			perror("fgets failure");
-		pclose(pgver);			/* no error checking */
-		return NULL;
+			log_error(errcode_for_file_access(),
+					  _("no data was returned by command \"%s\": %m"), cmd);
 	}
 
-	if (pclose_check(pgver))
-		return NULL;
+	(void) pclose_check(pipe_cmd);
 
 	return line;
 }
diff --git a/src/include/port.h b/src/include/port.h
index 5979139ebc..be8928c8e1 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -137,7 +137,7 @@ extern int	validate_exec(const char *path);
 extern int	find_my_exec(const char *argv0, char *retpath);
 extern int	find_other_exec(const char *argv0, const char *target,
 							const char *versionstr, char *retpath);
-extern char *pipe_read_line(char *cmd, char *line, int maxsize);
+extern char *pipe_read_line(char *cmd);
 
 /* Doesn't belong here, but this is used with find_other_exec(), so... */
 #define PG_BACKEND_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"
-- 
2.32.1 (Apple Git-133)

#9Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#8)
1 attachment(s)
Re: pipe_read_line for reading arbitrary strings

The attached v5 is a rebase with no new changes just to get a fresh run in the
CFBot before pushing. All review comments have been addressed and the patch
has been Ready for Committer for some time, just didn't have time to get to it
in the last CF.

--
Daniel Gustafsson

Attachments:

v5-0001-Refactor-pipe_read_line-to-return-the-full-line.patchapplication/octet-stream; name=v5-0001-Refactor-pipe_read_line-to-return-the-full-line.patch; x-unix-mode=0644Download
From ff60b4bc285c8fff69382b4b9eb53111fd1aa81f Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 24 Nov 2023 10:04:32 +0100
Subject: [PATCH v5] Refactor pipe_read_line to return the full line
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Commit 5b2f4afffe6 refactored find_other_exec() and in the process
created pipe_read_line() into a static routine for reading a single
line of output, aimed at reading version numbers.  Commit a7e8ece41
later exposed it externally in order to read a postgresql.conf GUC
using "postgres -C ..".  Further, f06b1c598 also made use of it for
reading a version string much like find_other_exec().  The internal
variable remained "pgver", even when used for other purposes.

Since the function requires passing a buffer and its size, and at
most size - 1 bytes will be read via fgets(), there is a truncation
risk when using this for reading GUCs (like how pg_rewind does,
though the risk in this case is marginal).

To keep this as generic functionality for reading a line from a pipe,
this refactors pipe_read_line() into returning an allocated buffer
containing all of the line to remove the risk of silent truncation.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/DEDF73CE-D528-49A3-9089-B3592FD671A9@yesql.se
---
 src/bin/pg_rewind/pg_rewind.c | 14 ++++++-------
 src/bin/pg_upgrade/exec.c     |  6 ++++--
 src/common/exec.c             | 39 ++++++++++++++++++++++-------------
 src/include/port.h            |  2 +-
 4 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index 57a168fea2..bde90bf60b 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -1055,8 +1055,7 @@ static void
 getRestoreCommand(const char *argv0)
 {
 	int			rc;
-	char		postgres_exec_path[MAXPGPATH],
-				cmd_output[MAXPGPATH];
+	char		postgres_exec_path[MAXPGPATH];
 	PQExpBuffer postgres_cmd;
 
 	if (!restore_wal)
@@ -1105,16 +1104,15 @@ getRestoreCommand(const char *argv0)
 	/* add -C switch, for restore_command */
 	appendPQExpBufferStr(postgres_cmd, " -C restore_command");
 
-	if (!pipe_read_line(postgres_cmd->data, cmd_output, sizeof(cmd_output)))
-		exit(1);
+	restore_command = pipe_read_line(postgres_cmd->data);
+	if (restore_command == NULL)
+		pg_fatal("unable to read restore_command from target cluster");
 
-	(void) pg_strip_crlf(cmd_output);
+	(void) pg_strip_crlf(restore_command);
 
-	if (strcmp(cmd_output, "") == 0)
+	if (strcmp(restore_command, "") == 0)
 		pg_fatal("restore_command is not set in the target cluster");
 
-	restore_command = pg_strdup(cmd_output);
-
 	pg_log_debug("using for rewind restore_command = \'%s\'",
 				 restore_command);
 
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index fec8dc4c2f..3552cf00af 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -431,7 +431,7 @@ static void
 check_exec(const char *dir, const char *program, bool check_version)
 {
 	char		path[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 	char		cmd[MAXPGPATH];
 	char		versionstr[128];
 
@@ -442,7 +442,7 @@ check_exec(const char *dir, const char *program, bool check_version)
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", path);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		pg_fatal("check for \"%s\" failed: cannot execute",
 				 path);
 
@@ -456,4 +456,6 @@ check_exec(const char *dir, const char *program, bool check_version)
 			pg_fatal("check for \"%s\" failed: incorrect version: found \"%s\", expected \"%s\"",
 					 path, line, versionstr);
 	}
+
+	pg_free(line);
 }
diff --git a/src/common/exec.c b/src/common/exec.c
index 8cfc721b12..da929f15b9 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -42,6 +42,8 @@
 #endif
 #endif
 
+#include "common/string.h"
+
 /* Inhibit mingw CRT's auto-globbing of command line arguments */
 #if defined(WIN32) && !defined(_MSC_VER)
 extern int	_CRT_glob = 0;		/* 0 turns off globbing; 1 turns it on */
@@ -328,7 +330,7 @@ find_other_exec(const char *argv0, const char *target,
 				const char *versionstr, char *retpath)
 {
 	char		cmd[MAXPGPATH];
-	char		line[MAXPGPATH];
+	char	   *line;
 
 	if (find_my_exec(argv0, retpath) < 0)
 		return -1;
@@ -346,46 +348,55 @@ find_other_exec(const char *argv0, const char *target,
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", retpath);
 
-	if (!pipe_read_line(cmd, line, sizeof(line)))
+	if ((line = pipe_read_line(cmd)) == NULL)
 		return -1;
 
 	if (strcmp(line, versionstr) != 0)
+	{
+		pfree(line);
 		return -2;
+	}
 
+	pfree(line);
 	return 0;
 }
 
 
 /*
- * Execute a command in a pipe and read the first line from it.
+ * Execute a command in a pipe and read the first line from it. The returned
+ * string is palloc'd (malloc'd in frontend code), the caller is responsible
+ * for freeing.
  */
 char *
-pipe_read_line(char *cmd, char *line, int maxsize)
+pipe_read_line(char *cmd)
 {
-	FILE	   *pgver;
+	FILE	   *pipe_cmd;
+	char	   *line;
 
 	fflush(NULL);
 
 	errno = 0;
-	if ((pgver = popen(cmd, "r")) == NULL)
+	if ((pipe_cmd = popen(cmd, "r")) == NULL)
 	{
 		perror("popen failure");
 		return NULL;
 	}
 
+	/* Make sure popen() didn't change errno */
 	errno = 0;
-	if (fgets(line, maxsize, pgver) == NULL)
+	line = pg_get_line(pipe_cmd, NULL);
+
+	if (line == NULL)
 	{
-		if (feof(pgver))
-			fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
+		if (ferror(pipe_cmd))
+			log_error(errcode_for_file_access(),
+					  _("could not read from command \"%s\": %m"), cmd);
 		else
-			perror("fgets failure");
-		pclose(pgver);			/* no error checking */
-		return NULL;
+			log_error(errcode_for_file_access(),
+					  _("no data was returned by command \"%s\": %m"), cmd);
 	}
 
-	if (pclose_check(pgver))
-		return NULL;
+	(void) pclose_check(pipe_cmd);
 
 	return line;
 }
diff --git a/src/include/port.h b/src/include/port.h
index 2e3425da63..263cf791a3 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -137,7 +137,7 @@ extern int	validate_exec(const char *path);
 extern int	find_my_exec(const char *argv0, char *retpath);
 extern int	find_other_exec(const char *argv0, const char *target,
 							const char *versionstr, char *retpath);
-extern char *pipe_read_line(char *cmd, char *line, int maxsize);
+extern char *pipe_read_line(char *cmd);
 
 /* Doesn't belong here, but this is used with find_other_exec(), so... */
 #define PG_BACKEND_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"
-- 
2.32.1 (Apple Git-133)

#10Peter Eisentraut
peter@eisentraut.org
In reply to: Alvaro Herrera (#7)
Re: pipe_read_line for reading arbitrary strings

On 22.11.23 13:47, Alvaro Herrera wrote:

On 2023-Mar-07, Daniel Gustafsson wrote:

The attached POC diff replace fgets() with pg_get_line(), which may not be an
Ok way to cross the streams (it's clearly not a great fit), but as a POC it
provided a neater interface for reading one-off lines from a pipe IMO. Does
anyone else think this is worth fixing before too many callsites use it, or is
this another case of my fear of silent subtle truncation bugs? =)

I think this is generally a good change.

I think pipe_read_line should have a "%m" in the "no data returned"
error message. pg_read_line is careful to retain errno (and it was
already zero at start), so this should be okay ... or should we set
errno again to zero after popen(), even if it works?

Is this correct? The code now looks like this:

line = pg_get_line(pipe_cmd, NULL);

if (line == NULL)
{
if (ferror(pipe_cmd))
log_error(errcode_for_file_access(),
_("could not read from command \"%s\": %m"), cmd);
else
log_error(errcode_for_file_access(),
_("no data was returned by command \"%s\": %m"),
cmd);
}

We already handle the case where an error happened in the first branch,
so there cannot be an error set in the second branch (unless something
nonobvious is going on?).

It seems to me that if the command being run just happens to print
nothing but is otherwise successful, this would print a bogus error code
(or "Success")?

#11Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#10)
1 attachment(s)
Re: pipe_read_line for reading arbitrary strings

On 6 Mar 2024, at 10:07, Peter Eisentraut <peter@eisentraut.org> wrote:

On 22.11.23 13:47, Alvaro Herrera wrote:

On 2023-Mar-07, Daniel Gustafsson wrote:

The attached POC diff replace fgets() with pg_get_line(), which may not be an
Ok way to cross the streams (it's clearly not a great fit), but as a POC it
provided a neater interface for reading one-off lines from a pipe IMO. Does
anyone else think this is worth fixing before too many callsites use it, or is
this another case of my fear of silent subtle truncation bugs? =)

I think this is generally a good change.
I think pipe_read_line should have a "%m" in the "no data returned"
error message. pg_read_line is careful to retain errno (and it was
already zero at start), so this should be okay ... or should we set
errno again to zero after popen(), even if it works?

Is this correct? The code now looks like this:

line = pg_get_line(pipe_cmd, NULL);

if (line == NULL)
{
if (ferror(pipe_cmd))
log_error(errcode_for_file_access(),
_("could not read from command \"%s\": %m"), cmd);
else
log_error(errcode_for_file_access(),
_("no data was returned by command \"%s\": %m"), cmd);
}

We already handle the case where an error happened in the first branch, so there cannot be an error set in the second branch (unless something nonobvious is going on?).

It seems to me that if the command being run just happens to print nothing but is otherwise successful, this would print a bogus error code (or "Success")?

Good catch, that's an incorrect copy/paste, it should use ERRCODE_NO_DATA. I'm
not convinced that a function to read from a pipe should consider not reading
anything successful by default, output is sort expected here. We could add a
flag parameter to use for signalling that no data is fine though as per the
attached (as of yet untested) diff?

--
Daniel Gustafsson

Attachments:

fix_errhandling.diffapplication/octet-stream; name=fix_errhandling.diff; x-unix-mode=0644Download
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index bde90bf60b..44f555b8e7 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -1104,7 +1104,7 @@ getRestoreCommand(const char *argv0)
 	/* add -C switch, for restore_command */
 	appendPQExpBufferStr(postgres_cmd, " -C restore_command");
 
-	restore_command = pipe_read_line(postgres_cmd->data);
+	restore_command = pipe_read_line(postgres_cmd->data, false);
 	if (restore_command == NULL)
 		pg_fatal("unable to read restore_command from target cluster");
 
diff --git a/src/bin/pg_upgrade/exec.c b/src/bin/pg_upgrade/exec.c
index 3552cf00af..2cd76f9f99 100644
--- a/src/bin/pg_upgrade/exec.c
+++ b/src/bin/pg_upgrade/exec.c
@@ -442,7 +442,7 @@ check_exec(const char *dir, const char *program, bool check_version)
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", path);
 
-	if ((line = pipe_read_line(cmd)) == NULL)
+	if ((line = pipe_read_line(cmd, false)) == NULL)
 		pg_fatal("check for \"%s\" failed: cannot execute",
 				 path);
 
diff --git a/src/common/exec.c b/src/common/exec.c
index da929f15b9..2f4d4264c7 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -348,7 +348,7 @@ find_other_exec(const char *argv0, const char *target,
 
 	snprintf(cmd, sizeof(cmd), "\"%s\" -V", retpath);
 
-	if ((line = pipe_read_line(cmd)) == NULL)
+	if ((line = pipe_read_line(cmd, false)) == NULL)
 		return -1;
 
 	if (strcmp(line, versionstr) != 0)
@@ -368,7 +368,7 @@ find_other_exec(const char *argv0, const char *target,
  * for freeing.
  */
 char *
-pipe_read_line(char *cmd)
+pipe_read_line(char *cmd, bool empty_ok)
 {
 	FILE	   *pipe_cmd;
 	char	   *line;
@@ -391,9 +391,9 @@ pipe_read_line(char *cmd)
 		if (ferror(pipe_cmd))
 			log_error(errcode_for_file_access(),
 					  _("could not read from command \"%s\": %m"), cmd);
-		else
-			log_error(errcode_for_file_access(),
-					  _("no data was returned by command \"%s\": %m"), cmd);
+		if (!empty_ok)
+			log_error(errcode(ERRCODE_NO_DATA),
+					  _("no data was returned by command \"%s\""), cmd);
 	}
 
 	(void) pclose_check(pipe_cmd);
diff --git a/src/include/port.h b/src/include/port.h
index ae115d2d97..20f150b3a5 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -137,7 +137,7 @@ extern int	validate_exec(const char *path);
 extern int	find_my_exec(const char *argv0, char *retpath);
 extern int	find_other_exec(const char *argv0, const char *target,
 							const char *versionstr, char *retpath);
-extern char *pipe_read_line(char *cmd);
+extern char *pipe_read_line(char *cmd, bool empty_ok);
 
 /* Doesn't belong here, but this is used with find_other_exec(), so... */
 #define PG_BACKEND_VERSIONSTR "postgres (PostgreSQL) " PG_VERSION "\n"
#12Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Daniel Gustafsson (#11)
Re: pipe_read_line for reading arbitrary strings

On 2024-Mar-06, Daniel Gustafsson wrote:

Good catch, that's an incorrect copy/paste, it should use ERRCODE_NO_DATA. I'm
not convinced that a function to read from a pipe should consider not reading
anything successful by default, output is sort expected here. We could add a
flag parameter to use for signalling that no data is fine though as per the
attached (as of yet untested) diff?

I think adding dead code is not a great plan, particularly if it's hairy
enough that we need to very carefully dissect what happens in error
cases. IMO if and when somebody has a need for an empty return string
being acceptable, they can add it then.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

#13Daniel Gustafsson
daniel@yesql.se
In reply to: Alvaro Herrera (#12)
Re: pipe_read_line for reading arbitrary strings

On 6 Mar 2024, at 11:46, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-06, Daniel Gustafsson wrote:

Good catch, that's an incorrect copy/paste, it should use ERRCODE_NO_DATA. I'm
not convinced that a function to read from a pipe should consider not reading
anything successful by default, output is sort expected here. We could add a
flag parameter to use for signalling that no data is fine though as per the
attached (as of yet untested) diff?

I think adding dead code is not a great plan, particularly if it's hairy
enough that we need to very carefully dissect what happens in error
cases. IMO if and when somebody has a need for an empty return string
being acceptable, they can add it then.

I agree with that, there are no callers today and I can't imagine one off the
cuff. The change to use the appropriate errcode still applies though.

--
Daniel Gustafsson

#14Peter Eisentraut
peter@eisentraut.org
In reply to: Daniel Gustafsson (#11)
Re: pipe_read_line for reading arbitrary strings

On 06.03.24 10:54, Daniel Gustafsson wrote:

On 6 Mar 2024, at 10:07, Peter Eisentraut <peter@eisentraut.org> wrote:

On 22.11.23 13:47, Alvaro Herrera wrote:

On 2023-Mar-07, Daniel Gustafsson wrote:

The attached POC diff replace fgets() with pg_get_line(), which may not be an
Ok way to cross the streams (it's clearly not a great fit), but as a POC it
provided a neater interface for reading one-off lines from a pipe IMO. Does
anyone else think this is worth fixing before too many callsites use it, or is
this another case of my fear of silent subtle truncation bugs? =)

I think this is generally a good change.
I think pipe_read_line should have a "%m" in the "no data returned"
error message. pg_read_line is careful to retain errno (and it was
already zero at start), so this should be okay ... or should we set
errno again to zero after popen(), even if it works?

Is this correct? The code now looks like this:

line = pg_get_line(pipe_cmd, NULL);

if (line == NULL)
{
if (ferror(pipe_cmd))
log_error(errcode_for_file_access(),
_("could not read from command \"%s\": %m"), cmd);
else
log_error(errcode_for_file_access(),
_("no data was returned by command \"%s\": %m"), cmd);
}

We already handle the case where an error happened in the first branch, so there cannot be an error set in the second branch (unless something nonobvious is going on?).

It seems to me that if the command being run just happens to print nothing but is otherwise successful, this would print a bogus error code (or "Success")?

Good catch, that's an incorrect copy/paste, it should use ERRCODE_NO_DATA.

Also it shouldn't print %m, was my point.

#15Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#14)
Re: pipe_read_line for reading arbitrary strings

On 8 Mar 2024, at 18:13, Peter Eisentraut <peter@eisentraut.org> wrote:

Good catch, that's an incorrect copy/paste, it should use ERRCODE_NO_DATA.

Also it shouldn't print %m, was my point.

Absolutely, I removed that in the patch upthread, it was clearly wrong.

--
Daniel Gustafsson

#16Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#15)
Re: pipe_read_line for reading arbitrary strings

On 8 Mar 2024, at 19:38, Daniel Gustafsson <daniel@yesql.se> wrote:

On 8 Mar 2024, at 18:13, Peter Eisentraut <peter@eisentraut.org> wrote:

Good catch, that's an incorrect copy/paste, it should use ERRCODE_NO_DATA.

Also it shouldn't print %m, was my point.

Absolutely, I removed that in the patch upthread, it was clearly wrong.

Pushed the fix for the incorrect logline and errcode.

--
Daniel Gustafsson