Alternative to \copy in psql modelled after \g

Started by Daniel Veriteabout 7 years ago36 messages
#1Daniel Verite
daniel@manitou-mail.org
1 attachment(s)

Hi,

Currently \copy cannot span multiple lines (like any meta-command)
and cannot use psql variables whereas \g can do both.

The POC patch attached implements two meta-commands \copyfrom
and \copyto that are to COPY what \g is to any other query:

- they take the COPY query already var-expanded from the query buffer,
which must mention FROM STDIN or TO STDOUT.

- they accept an argument declaring the local data source or destination,
either a filename or a program (|command args) or empty for stdin/stdout.

By contrast \copy has a specific parser to extract the data source
or dest from its line of arguments, plus whether it's a COPY FROM or TO,
and build a COPY query from that.

Examples of use

1. $ psql -v filename="/path/data-$(date -I).csv"
COPY (SELECT *
FROM table
WHERE ...)
TO STDOUT (FORMAT csv) \copyto :filename

2. -- copy only the first 100 lines
COPY table FROM stdin \copyfrom |head -n 100 /data/datafile.txt

3. $ cat script.sql
COPY table1 FROM stdin; -- copy inline data
data line
data line
\.

-- copy data from psql's stdin
COPY table2 FROM stdin \copyfrom

# copy both in-script and out-of-script data
$ psql -f script.sql < table2.data

Comments? Does that look useful as an alternative to \copy ?

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachments:

poc-patch-psql-copy-like-g.difftext/x-patch; name=poc-patch-psql-copy-like-g.diffDownload
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 0dea54d..7398dec 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -65,6 +65,8 @@ static backslashResult exec_command_cd(PsqlScanState scan_state, bool active_bra
 				const char *cmd);
 static backslashResult exec_command_conninfo(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_copy(PsqlScanState scan_state, bool active_branch);
+static backslashResult exec_command_copyfrom(PQExpBuffer query_buf, PsqlScanState scan_state, bool active_branch);
+static backslashResult exec_command_copyto(PQExpBuffer query_buf, PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_copyright(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_crosstabview(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_d(PsqlScanState scan_state, bool active_branch,
@@ -305,6 +307,10 @@ exec_command(const char *cmd,
 		status = exec_command_conninfo(scan_state, active_branch);
 	else if (pg_strcasecmp(cmd, "copy") == 0)
 		status = exec_command_copy(scan_state, active_branch);
+	else if (pg_strcasecmp(cmd, "copyfrom") == 0)
+		status = exec_command_copyfrom(query_buf, scan_state, active_branch);
+	else if (pg_strcasecmp(cmd, "copyto") == 0)
+		status = exec_command_copyto(query_buf, scan_state, active_branch);
 	else if (strcmp(cmd, "copyright") == 0)
 		status = exec_command_copyright(scan_state, active_branch);
 	else if (strcmp(cmd, "crosstabview") == 0)
@@ -634,6 +640,53 @@ exec_command_copy(PsqlScanState scan_state, bool active_branch)
 }
 
 /*
+ * \copyfrom -- run a COPY FROM command
+ * The COPY query is obtained from the query buffer
+ * The argument is 'filename' as in \copy
+ */
+static backslashResult
+exec_command_copyfrom(PQExpBuffer query_buf, PsqlScanState scan_state, bool active_branch)
+{
+	bool		success = true;
+
+	if (active_branch)
+	{
+		char	   *opt = psql_scan_slash_option(scan_state,
+												 OT_FILEPIPE, NULL, false);
+
+		success = do_copy_query(query_buf->data, opt, copy_from);
+		resetPQExpBuffer(query_buf);
+		free(opt);
+	}
+	else
+		ignore_slash_filepipe(scan_state);
+
+	return success ? PSQL_CMD_SKIP_LINE : PSQL_CMD_ERROR;
+}
+/*
+ * \copyto -- run a COPY TO command
+ */
+static backslashResult
+exec_command_copyto(PQExpBuffer query_buf, PsqlScanState scan_state, bool active_branch)
+{
+	bool		success = true;
+
+	if (active_branch)
+	{
+		char	   *opt = psql_scan_slash_option(scan_state,
+												 OT_FILEPIPE, NULL, false);
+
+		success = do_copy_query(query_buf->data, opt, copy_to);
+		resetPQExpBuffer(query_buf);
+		free(opt);
+	}
+	else
+		ignore_slash_filepipe(scan_state);
+
+	return success ? PSQL_CMD_SKIP_LINE : PSQL_CMD_ERROR;
+}
+
+/*
  * \copyright -- print copyright notice
  */
 static backslashResult
diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c
index 555c633..95c5c5e 100644
--- a/src/bin/psql/copy.c
+++ b/src/bin/psql/copy.c
@@ -260,28 +260,11 @@ error:
 }
 
 
-/*
- * Execute a \copy command (frontend copy). We have to open a file (or execute
- * a command), then submit a COPY query to the backend and either feed it data
- * from the file or route its response into the file.
- */
-bool
-do_copy(const char *args)
+static
+FILE*
+open_copy_stream(struct copy_options *options)
 {
-	PQExpBufferData query;
-	FILE	   *copystream;
-	struct copy_options *options;
-	bool		success;
-
-	/* parse options */
-	options = parse_slash_copy(args);
-
-	if (!options)
-		return false;
-
-	/* prepare to read or write the target file */
-	if (options->file && !options->program)
-		canonicalize_path(options->file);
+	FILE *copystream = NULL;
 
 	if (options->from)
 	{
@@ -331,8 +314,7 @@ do_copy(const char *args)
 		else
 			psql_error("%s: %s\n",
 					   options->file, strerror(errno));
-		free_copy_options(options);
-		return false;
+		return NULL;
 	}
 
 	if (!options->program)
@@ -352,27 +334,18 @@ do_copy(const char *args)
 		if (result < 0 || S_ISDIR(st.st_mode))
 		{
 			fclose(copystream);
-			free_copy_options(options);
-			return false;
+			return NULL;
 		}
 	}
 
-	/* build the command we will send to the backend */
-	initPQExpBuffer(&query);
-	printfPQExpBuffer(&query, "COPY ");
-	appendPQExpBufferStr(&query, options->before_tofrom);
-	if (options->from)
-		appendPQExpBufferStr(&query, " FROM STDIN ");
-	else
-		appendPQExpBufferStr(&query, " TO STDOUT ");
-	if (options->after_tofrom)
-		appendPQExpBufferStr(&query, options->after_tofrom);
+	return copystream;
+}
 
-	/* run it like a user command, but with copystream as data source/sink */
-	pset.copyStream = copystream;
-	success = SendQuery(query.data);
-	pset.copyStream = NULL;
-	termPQExpBuffer(&query);
+static
+bool
+close_copy_stream(FILE* copystream, struct copy_options *options)
+{
+	bool success = true;
 
 	if (options->file != NULL)
 	{
@@ -407,6 +380,119 @@ do_copy(const char *args)
 			}
 		}
 	}
+	return success;
+}
+
+/*
+ * Execute a \copy command (frontend copy). We have to open a file (or execute
+ * a command), then submit a COPY query to the backend and either feed it data
+ * from the file or route its response into the file.
+ */
+bool
+do_copy(const char *args)
+{
+	PQExpBufferData query;
+	FILE	   *copystream;
+	struct copy_options *options;
+	bool		success;
+
+	/* parse options */
+	options = parse_slash_copy(args);
+
+	if (!options)
+		return false;
+
+	/* prepare to read or write the target file */
+	if (options->file && !options->program)
+		canonicalize_path(options->file);
+
+	/* open the file or program from which or to which data goes */
+	copystream = open_copy_stream(options);
+
+	if (!copystream)
+	{
+		/* a user-visible error has already been emitted at this point, so
+		 * just clean up and return */
+		free_copy_options(options);
+		return false;
+	}
+
+	/* build the command we will send to the backend */
+	initPQExpBuffer(&query);
+	printfPQExpBuffer(&query, "COPY ");
+	appendPQExpBufferStr(&query, options->before_tofrom);
+	if (options->from)
+		appendPQExpBufferStr(&query, " FROM STDIN ");
+	else
+		appendPQExpBufferStr(&query, " TO STDOUT ");
+	if (options->after_tofrom)
+		appendPQExpBufferStr(&query, options->after_tofrom);
+
+	/* run it like a user command, but with copystream as data source/sink */
+	pset.copyStream = copystream;
+	success = SendQuery(query.data);
+
+	pset.copyStream = NULL;
+	termPQExpBuffer(&query);
+	success = close_copy_stream(copystream, options);
+	free_copy_options(options);
+	return success;
+}
+
+/*
+ * Send the COPY query (frontend copy) in the query buffer, and feed it data
+ * from the file or program given as argument (when dir=copy_from),
+ * or route its response into the file or program (when dir=copy_to).
+ */
+bool
+do_copy_query(const char* querystring, const char *args, enum copy_direction dir)
+{
+	FILE	   *copystream;
+	struct copy_options *options;
+	bool		success;
+
+	options = pg_malloc0(sizeof(struct copy_options));
+	if (!options)
+		return false;
+
+	options->from = (dir == copy_from);
+
+	if (args)
+	{
+		if (args[0] == '|')
+		{
+			options->program = true;
+			options->file = pg_strdup(args+1);
+		}
+		else
+		{
+			options->file = pg_strdup(args);
+			canonicalize_path(options->file);
+		}
+	}
+	else
+	{
+		/* Use psql's stdin or stdout if no file or program given */
+		options->psql_inout = true;
+	}
+
+	/* open the file or program from which or to which data goes */
+	copystream = open_copy_stream(options);
+	if (!copystream)
+	{
+		/* a user-visible error has already been emitted at this point, so
+		 * just clean up and return */
+		free_copy_options(options);
+		return false;
+	}
+
+	/* run the COPY query with copystream as data source/sink */
+	pset.copyStream = copystream;
+	success = SendQuery(querystring);
+	pset.copyStream = NULL;
+
+	success = close_copy_stream(copystream, options);
+
 	free_copy_options(options);
 	return success;
 }
diff --git a/src/bin/psql/copy.h b/src/bin/psql/copy.h
index f4107d7..7755ac7 100644
--- a/src/bin/psql/copy.h
+++ b/src/bin/psql/copy.h
@@ -10,10 +10,18 @@
 
 #include "libpq-fe.h"
 
+enum copy_direction
+{
+	copy_from = 1,
+	copy_to
+};
 
 /* handler for \copy */
 extern bool do_copy(const char *args);
 
+/* handler for \copyfrom and \copyto */
+extern bool do_copy_query(const char* querystring, const char *args, enum copy_direction dir);
+
 /* lower level processors for copy in/out streams */
 
 extern bool handleCopyOut(PGconn *conn, FILE *copystream,
#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Daniel Verite (#1)
Re: Alternative to \copy in psql modelled after \g

On Fri, Nov 9, 2018 at 4:19 AM Daniel Verite <daniel@manitou-mail.org> wrote:

Examples of use

1. $ psql -v filename="/path/data-$(date -I).csv"
COPY (SELECT *
FROM table
WHERE ...)
TO STDOUT (FORMAT csv) \copyto :filename

Do I understand correctly that you are proposing a slightly less
verbose alternative of:

\o :filename
COPY TO STDOUT
\o

David J.

#3Daniel Verite
daniel@manitou-mail.org
In reply to: David G. Johnston (#2)
Re: Alternative to \copy in psql modelled after \g

David G. Johnston wrote:

Do I understand correctly that you are proposing a slightly less
verbose alternative of:

\o :filename
COPY TO STDOUT
\o

Not strictly the same because of this:

\o or \out [ filename ]
....
“Query results” includes all tables, command responses, and notices
obtained from the database server, as well as output of various
backslash commands that query the database (such as \d); but not
error messages.

If for instance psql received a notification between the two \o it would
end up in the file, which it wouldn't with \copyto.
The same is true for SELECT... \g file as opposed to \o file

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Daniel Verite (#3)
Re: Alternative to \copy in psql modelled after \g

On Fri, Nov 9, 2018 at 9:35 AM Daniel Verite <daniel@manitou-mail.org> wrote:

If for instance psql received a notification between the two \o it would
end up in the file, which it wouldn't with \copyto.
The same is true for SELECT... \g file as opposed to \o file

Is that something we could change instead of adding a new command?
POLA, for me at least, would be for \g [filename] to do exactly what
you are describing with the \copyto feature. That is doesn't and there
haven't been complaints I would chalk up to notices and command
responses not being prevalent in situations where people would use \g
[filename] or paired \o [filename].

David J.

#5Daniel Verite
daniel@manitou-mail.org
In reply to: David G. Johnston (#4)
Re: Alternative to \copy in psql modelled after \g

David G. Johnston wrote:

POLA, for me at least, would be for \g [filename] to do exactly what
you are describing with the \copyto feature.

I admit that if we could improve \g to handle COPY, it would be more
elegant than the current proposal adding two meta-commands.

But the copy-workflow and non-copy-workflow are different, and in
order to know which one to start, \g would need to analyze the query
to determine whether it's a COPY FROM, COPY TO or something else.
psql parses queries syntactically, but not semantically AFAIK, and I
suspect we don't want to start doing that, as it breaks a separation
of concerns.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#6Daniel Verite
daniel@manitou-mail.org
In reply to: Daniel Verite (#5)
1 attachment(s)
Re: Alternative to \copy in psql modelled after \g

I wrote:

I admit that if we could improve \g to handle COPY, it would be more
elegant than the current proposal adding two meta-commands.

But the copy-workflow and non-copy-workflow are different, and in
order to know which one to start, \g would need to analyze the query

It turns out I was wrong on this. The workflows are different but when
psql receives the first PGresult, it's still time to handle the I/O
redirection. It just differs from \copy in the case of an error:
\copy won't even send a command to the server if the local output
stream can't be opened, whereas COPY \g would send the command, and
will have to deal with the client-side error afterwards.

Well, except that up to now, COPY ... TO STDOUT \g file or \g |command
has been ignoring "file" or "|command", which is arguably a bug as Tom
puts it in another discussion in [1]bug #15535 /messages/by-id/6309.1544031175@sss.pgh.pa.us.

So as a replacement for the \copyto I was proposing earlier, PFA a patch
for COPY TO STDOUT to make use of the argument to \g.

[1]: bug #15535 /messages/by-id/6309.1544031175@sss.pgh.pa.us
/messages/by-id/6309.1544031175@sss.pgh.pa.us

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachments:

psql-copyout-g.patchtext/plainDownload
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 62c2928..1488bb9 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1095,14 +1095,31 @@ ProcessResult(PGresult **results)
 			 * If pset.copyStream is set, use that as data source/sink,
 			 * otherwise use queryFout or cur_cmd_source as appropriate.
 			 */
-			FILE	   *copystream = pset.copyStream;
+			FILE	   *copystream = NULL;
 			PGresult   *copy_result;
 
 			SetCancelConn();
 			if (result_status == PGRES_COPY_OUT)
 			{
-				if (!copystream)
+				bool is_pipe;
+				if (pset.gfname)
+				{
+					/*
+					 * COPY TO STDOUT \g [|]file may be used as an alternative
+					 * to \copy
+					 */
+					if (!openQueryOutputFile(pset.gfname, &copystream, &is_pipe))
+					{
+						copystream = NULL; /* will discard the COPY data entirely */
+					}
+					if (is_pipe)
+						disable_sigpipe_trap();
+				}
+				else if (pset.copyStream)
+					copystream = pset.copyStream;
+				else
 					copystream = pset.queryFout;
+
 				success = handleCopyOut(pset.db,
 										copystream,
 										&copy_result) && success;
@@ -1117,11 +1134,25 @@ ProcessResult(PGresult **results)
 					PQclear(copy_result);
 					copy_result = NULL;
 				}
+
+				if (pset.gfname && copystream != NULL)
+				{
+					/* close \g argument file/pipe */
+					if (is_pipe)
+					{
+						pclose(copystream);
+						restore_sigpipe_trap();
+					}
+					else
+					{
+						fclose(copystream);
+					}
+				}
 			}
 			else
 			{
-				if (!copystream)
-					copystream = pset.cur_cmd_source;
+				/* COPY IN */
+				copystream = pset.copyStream ? pset.copyStream : pset.cur_cmd_source;
 				success = handleCopyIn(pset.db,
 									   copystream,
 									   PQbinaryTuples(*results),
diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c
index 555c633..645970e 100644
--- a/src/bin/psql/copy.c
+++ b/src/bin/psql/copy.c
@@ -426,6 +426,7 @@ do_copy(const char *args)
  * conn should be a database connection that you just issued COPY TO on
  * and got back a PGRES_COPY_OUT result.
  * copystream is the file stream for the data to go to.
+ * copystream can be NULL to pump the data without writing it anywhere.
  * The final status for the COPY is returned into *res (but note
  * we already reported the error, if it's not a success result).
  *
@@ -434,7 +435,7 @@ do_copy(const char *args)
 bool
 handleCopyOut(PGconn *conn, FILE *copystream, PGresult **res)
 {
-	bool		OK = true;
+	bool		OK = (copystream != NULL);
 	char	   *buf;
 	int			ret;
 
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Verite (#6)
Re: Alternative to \copy in psql modelled after \g

"Daniel Verite" <daniel@manitou-mail.org> writes:

So as a replacement for the \copyto I was proposing earlier, PFA a patch
for COPY TO STDOUT to make use of the argument to \g.

Sounds plausible, please add to next commitfest so we don't forget it.

regards, tom lane

#8Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Daniel Verite (#6)
Re: Alternative to \copy in psql modelled after \g

Bonjour Daniel,

But the copy-workflow and non-copy-workflow are different, and in
order to know which one to start, \g would need to analyze the query

It turns out I was wrong on this. The workflows are different but when
psql receives the first PGresult, it's still time to handle the I/O
redirection. It just differs from \copy in the case of an error:
\copy won't even send a command to the server if the local output
stream can't be opened, whereas COPY \g would send the command, and
will have to deal with the client-side error afterwards.

Well, except that up to now, COPY ... TO STDOUT \g file or \g |command
has been ignoring "file" or "|command", which is arguably a bug as Tom
puts it in another discussion in [1].

So as a replacement for the \copyto I was proposing earlier, PFA a patch
for COPY TO STDOUT to make use of the argument to \g.

Patch applies cleanly, compiles, make check is ok.

However, it does not contain any tests (bad:-) nor documentation (hmmm...
maybe none needed, though).

Is this just kind of a bug fix? Beforehand the documentation says "\g fn"
sends to file, but that was not happening with COPY, and now it does as it
says?

A question: if opening the output file fails, should not the query be
cancelled and an error be reported? Maybe it is too late for that. It
seems that "SELECT pg_sleep(10) \g /bad/file" fails but executes the query
nevertheless.

ISTM that overriding copystream on open failures is not necessary, because
its value is already NULL in this case.

I'd suggest that is_pipe should be initialized to false, otherwise it is
unclear from the code when it is set before use, and some compilers may
complain.

--
Fabien.

#9Daniel Verite
daniel@manitou-mail.org
In reply to: Fabien COELHO (#8)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO wrote:

Is this just kind of a bug fix? Beforehand the documentation says "\g fn"
sends to file, but that was not happening with COPY, and now it does as it
says?

Yes. The doc says about \g:

Sends the current query buffer to the server for execution. If an
argument is given, the query's output is written to the named file
or piped to the given shell command, instead of displaying it as
usual.

It does not add "unless the query is a COPY", so it seems right
to make that just work, and call it a bug fix.

A question: if opening the output file fails, should not the query
be cancelled and an error be reported? Maybe it is too late for
that. It seems that "SELECT pg_sleep(10) \g /bad/file" fails but
executes the query nevertheless.

Yes. This part works as documented:

"The file or command is written to only if the query successfully
returns zero or more tuples, not if the query fails or is a
non-data-returning SQL command."

However, it does not contain any tests (bad:-)

Testing this requires at least some interaction with the OS which I'm
uncomfortable to add. There is a precedent in
regress/sql/hs_standby_allowed.sql doing:

COPY hs1 TO '/tmp/copy_test'
\! cat /tmp/copy_test

We could add something like that in psql.sql, but I'm not fond of it
because it assumes a Unix-ish environment, and that it's OK to clobber
the hardcoded /tmp/copy_test should it preexist. I'd rather work with
a dedicated temporary directory. On Linux mktemp -d could be used, but
I don't know about a portable solution, plus POSIX has deprecated
mktemp.
I'm open to ideas on a portable way for psql.sql to test \g writing to a
file or a program, but ATM my inclination is to not add that test.

nor documentation (hmmm... maybe none needed, though).

\copy has this paragraph:

"The syntax of this command is similar to that of the SQL COPY
command. All options other than the data source/destination are as
specified for COPY. Because of this, special parsing rules apply to
the \copy meta-command. Unlike most other meta-commands, the entire
remainder of the line is always taken to be the arguments of \copy,
and neither variable interpolation nor backquote expansion are
performed in the arguments".

We could add that COPY TO STDOUT with a redirection might be used as an
alternative. The downside is that with four paragraphs plus one tip,
the explanations on \copy give already a lot to chew on, so is it
worth to add more?

ISTM that overriding copystream on open failures is not necessary, because
its value is already NULL in this case.

I'd suggest that is_pipe should be initialized to false, otherwise it is
unclear from the code when it is set before use, and some compilers may
complain.

OK, will consider these in the next revision.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#10Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Daniel Verite (#9)
Re: Alternative to \copy in psql modelled after \g

Hello,

Is this just kind of a bug fix? Beforehand the documentation says "\g fn"
sends to file, but that was not happening with COPY, and now it does as it
says?

Yes. [...]

It does not add "unless the query is a COPY", so it seems right
to make that just work, and call it a bug fix.

Does this suggest backpatching?

However, it does not contain any tests (bad:-)

Testing this requires at least some interaction with the OS which I'm
uncomfortable to add.

Hmmm. This means that "\g filename" goes fully untested:-( A casual grep
seems to confirm this. Sigh:-(

There is a precedent in regress/sql/hs_standby_allowed.sql doing:

COPY hs1 TO '/tmp/copy_test'
\! cat /tmp/copy_test

Indeed. I'm unsure windows has cat or /tmp, so I do not understand how it
works on such platform. Maybe I'm missing something.

We could add something like that in psql.sql, but I'm not fond of it
because it assumes a Unix-ish environment,

Yep.

I'm open to ideas on a portable way for psql.sql to test \g writing to a
file or a program, but ATM my inclination is to not add that test.

A relative file could be ok? After some testing, the standard regression
tests do not cd to some special place, so it may fail?

However TAP tests do that, and I have used this extensively with pgbench,
so a psql TAP test could do that and other things, such as importing a csv
file or whatever.

nor documentation (hmmm... maybe none needed, though).

\copy has this paragraph:

"The syntax of this command is similar to that of the SQL COPY
command. All options other than the data source/destination are as
specified for COPY. Because of this, special parsing rules apply to
the \copy meta-command. Unlike most other meta-commands, the entire
remainder of the line is always taken to be the arguments of \copy,
and neither variable interpolation nor backquote expansion are
performed in the arguments".

We could add that COPY TO STDOUT with a redirection might be used as an
alternative. The downside is that with four paragraphs plus one tip,
the explanations on \copy give already a lot to chew on, so is it
worth to add more?

I'd say that a small paragraph with the tip would be ok.

--
Fabien.

#11Daniel Verite
daniel@manitou-mail.org
In reply to: Fabien COELHO (#10)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO wrote:

It does not add "unless the query is a COPY", so it seems right
to make that just work, and call it a bug fix.

Does this suggest backpatching?

Yes, I think it's a candidate for back-patching.

There is a precedent in regress/sql/hs_standby_allowed.sql doing:

COPY hs1 TO '/tmp/copy_test'
\! cat /tmp/copy_test

Indeed. I'm unsure windows has cat or /tmp, so I do not understand how it
works on such platform. Maybe I'm missing something.

It's exercised only on a standby. Possibly few machines run this test,
among which none powered by Windows? And maybe it even works
on Windows in some cases: the reference to /tmp would work
in an MSYS/MingW environment and "cat" might too if \! gets
to the /bin/sh of that environment.

However TAP tests do that, and I have used this extensively with pgbench,
so a psql TAP test could do that and other things, such as importing a csv
file or whatever.

It looks a significant step forward, to be brought by a patch on its own
without prospect of being back-patched.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#12Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Daniel Verite (#11)
Re: Alternative to \copy in psql modelled after \g

It looks a significant step forward, to be brought by a patch on its own
without prospect of being back-patched.

Dunno. Even if an additional tap test would not be backpatched, it could
be added on master. I'm basically sadden by pg test coverage, especially
psql which is under 40%, so I try to grasp any improvement opportunity…

See https://coverage.postgresql.org/

--
Fabien.

#13Daniel Verite
daniel@manitou-mail.org
In reply to: Fabien COELHO (#12)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO wrote:

Dunno. Even if an additional tap test would not be backpatched, it could
be added on master. I'm basically sadden by pg test coverage, especially
psql which is under 40%, so I try to grasp any improvement opportunity…

See https://coverage.postgresql.org/

Maybe I misunderstand something, as I'm not familiar with TAP tests,
but hasn't psql no such test to begin with, as opposed to the
other programs in src/bin that have a t/ directory?

$ find . -name t
./pg_resetwal/t
./scripts/t
./pg_archivecleanup/t
./pg_verify_checksums/t
./pg_config/t
./pg_controldata/t
./pgbench/t
./pg_rewind/t
./pg_basebackup/t
./pg_dump/t
./initdb/t
./pg_ctl/t

In that case, the first thing we'd need is to add check and installcheck
targets in .../psql/Makefile, and a t/ directory with at least one Perl
script.
If that's the way to go forward, let's just do that in a patch
with a specific entry in the next CF like "Add TAP tests to psql".
Personally I'll be willing to submit and review new tests in t
independently of the patch discussed in $subject.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#14Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Daniel Verite (#13)
Re: Alternative to \copy in psql modelled after \g

On 2018-Dec-27, Daniel Verite wrote:

Maybe I misunderstand something, as I'm not familiar with TAP tests,
but hasn't psql no such test to begin with, as opposed to the
other programs in src/bin that have a t/ directory?

That's correct. psql does have some tests though, in
src/test/regress/sql/psql.sql and psql_crosstab.sql. It's also tested
indirectly because it's used to run all the src/test/regress files.

If you want to add more tests and increase coverage, that's a good goal,
but keep in mind those other files that can also be used. It doesn't
all have to be TAP.

Some things such as help.c, sql_help.c are hard to test. describe.c
could use more coverage for sure, but lots of it is version-specific,
which makes things harder.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#14)
Re: Alternative to \copy in psql modelled after \g

I took a quick look at this patch. Some thoughts:

* I think the right way to proceed is to commit (and back-patch)
this without any regression test case, and maybe later look into
adding a TAP test that covers "\g file". We have no test cases
for any variant of "\g file", and because of platform variability
and suchlike considerations, adding that seems like a project of
its own --- one I'd not want to back-patch.

* I do agree with documenting this by adding some small note to the
discussion of \copy.

* I think you've made the wrong choice about how this interacts with
the pset.copyStream option. If copyStream is set, that should
continue to take priority, IMO, as anything else would break do_copy's
assumptions about what will happen. One example of why it would be
problematic is that both levels of code would think they control what
to do with the sigpipe trap. Now, it's likely that \copy's syntax
makes it impossible for both copyStream and gfname to be set at the
same time anyway, because \copy extends to the end of the line. But
if it were to happen, we don't want it to be something that do_copy
has to take into account; better therefore just to leave gfname alone.
(Note also the comment just above where you patched, which you failed
to update.)

* You need to be more careful about error cases. I note that
openQueryOutputFile is capable of returning failure with is_pipe
set true, which would lead this patch to disable the sigpipe trap
and never re-enable it.

regards, tom lane

#16Daniel Verite
daniel@manitou-mail.org
In reply to: Tom Lane (#15)
1 attachment(s)
Re: Alternative to \copy in psql modelled after \g

Tom Lane wrote:

I took a quick look at this patch.

PFA an updated patch addressing your comments and Fabien's.

I've also changed handleCopyOut() to return success if it
could pump the data without writing it out locally for lack of
an output stream. It seems to make more sense like that.

While adding the note to the doc I've noticed that the other \copy
tip says:

"This operation is not as efficient as the SQL COPY command because
all data must pass through the client/server connection. For large
amounts of data the SQL command might be preferable.

It doesn't specify that it's for COPY TO/FROM file, not COPY TO
STDOUT/FROM STDIN. Of course the latter would rank the same as \copy
with respect to client/server throughput. Should this tip be more
specific?

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachments:

psql-copyout-g-v2.patchtext/plainDownload
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6c76cf2..8d17708 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1044,6 +1044,17 @@ testdb=&gt;
         </para>
         </tip>
 
+        <tip>
+        <para>
+        As an alternative to <literal>\copy ... to 'filename' | program 'command'</literal>,
+        an equivalent <acronym>SQL</acronym> <literal>COPY ... TO STDOUT</literal> command
+        terminated by <literal>\g filename</literal> or <literal>\g |program</literal>
+        may be used to the same effect. The latter form has the advantages that it
+        can span multiple lines and allows for variable interpolation, both in the query
+        and in the argument following <literal>\g</literal>.
+        </para>
+        </tip>
+
         </listitem>
       </varlistentry>
 
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index b11d7ac6..8d2fa59 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1092,20 +1092,49 @@ ProcessResult(PGresult **results)
 			 * connection out of its COPY state, then call PQresultStatus()
 			 * once and report any error.
 			 *
-			 * If pset.copyStream is set, use that as data source/sink,
-			 * otherwise use queryFout or cur_cmd_source as appropriate.
+			 * For COPY OUT, direct the output to pset.copyStream if it's set,
+			 * otherwise to pset.gfname if it's set, otherwise to queryFout.
+			 * For COPY IN, use pset.copyStream as data source if it's set,
+			 * otherwise cur_cmd_source.
 			 */
-			FILE	   *copystream = pset.copyStream;
+			FILE	   *copystream;
 			PGresult   *copy_result;
 
 			SetCancelConn();
 			if (result_status == PGRES_COPY_OUT)
 			{
-				if (!copystream)
+				bool is_pipe;
+
+				if (pset.copyStream)
+				{
+					/* instantiated by \copy */
+					copystream = pset.copyStream;
+				}
+				else if (pset.gfname)
+				{
+					/*
+					 * COPY TO STDOUT \g [|]file may be used as an alternative
+					 * to \copy
+					 */
+					if (!openQueryOutputFile(pset.gfname, &copystream, &is_pipe))
+					{
+						copystream = NULL; /* will discard the COPY data entirely */
+						is_pipe = false;
+					}
+					if (is_pipe)
+						disable_sigpipe_trap();
+				}
+				else
+				{
+					/* fall back to the generic query output stream */
 					copystream = pset.queryFout;
+				}
+
 				success = handleCopyOut(pset.db,
 										copystream,
-										&copy_result) && success;
+										&copy_result)
+					&& success
+					&& (copystream != NULL);
 
 				/*
 				 * Suppress status printing if the report would go to the same
@@ -1117,11 +1146,25 @@ ProcessResult(PGresult **results)
 					PQclear(copy_result);
 					copy_result = NULL;
 				}
+
+				if (pset.gfname && copystream != NULL)
+				{
+					/* close \g argument file/pipe */
+					if (is_pipe)
+					{
+						pclose(copystream);
+						restore_sigpipe_trap();
+					}
+					else
+					{
+						fclose(copystream);
+					}
+				}
 			}
 			else
 			{
-				if (!copystream)
-					copystream = pset.cur_cmd_source;
+				/* COPY IN */
+				copystream = pset.copyStream ? pset.copyStream : pset.cur_cmd_source;
 				success = handleCopyIn(pset.db,
 									   copystream,
 									   PQbinaryTuples(*results),
diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c
index 02c8511..0f3ea33 100644
--- a/src/bin/psql/copy.c
+++ b/src/bin/psql/copy.c
@@ -426,6 +426,7 @@ do_copy(const char *args)
  * conn should be a database connection that you just issued COPY TO on
  * and got back a PGRES_COPY_OUT result.
  * copystream is the file stream for the data to go to.
+ * copystream can be NULL to pump the data without writing it anywhere.
  * The final status for the COPY is returned into *res (but note
  * we already reported the error, if it's not a success result).
  *
@@ -447,7 +448,7 @@ handleCopyOut(PGconn *conn, FILE *copystream, PGresult **res)
 
 		if (buf)
 		{
-			if (OK && fwrite(buf, 1, ret, copystream) != ret)
+			if (OK && copystream && fwrite(buf, 1, ret, copystream) != ret)
 			{
 				psql_error("could not write COPY data: %s\n",
 						   strerror(errno));
@@ -458,7 +459,7 @@ handleCopyOut(PGconn *conn, FILE *copystream, PGresult **res)
 		}
 	}
 
-	if (OK && fflush(copystream))
+	if (OK && copystream && fflush(copystream))
 	{
 		psql_error("could not write COPY data: %s\n",
 				   strerror(errno));
#17Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Daniel Verite (#16)
Re: Alternative to \copy in psql modelled after \g

Bonjour Daniel,

I took a quick look at this patch.

PFA an updated patch addressing your comments and Fabien's.

About this v2.

Applies cleanly, compiles cleanly, "make check" ok.

No tests, but Tom suggests this does not belong here: the committer is
right:-)

I tested the feature manually, and I noticed that with your patch
ROW_COUNT is set more consistently:

sql> COPY pgbench_branches TO STDOUT \g /dev/null # COPY 10
sql> \echo :ROW_COUNT # 10

But previously we had:

sql> \echo :ROW_COUNT # 0

This is an improvement, although I'm not sure where it comes from.

I've also changed handleCopyOut() to return success if it
could pump the data without writing it out locally for lack of
an output stream. It seems to make more sense like that.

I'm hesitating on this one.

On the one hand the SQL query is executed, on the other hand the \g
command partially failed. There is a debatable side effect: If there is an
error, eg:

sql> COPY pgbench_branches TO STDOUT \g /NIET
/NIET: Permission denied
sql> \echo :ERROR :ROW_COUNT # true 0

I understand from the code that the COPY is really executed, so the ERROR
and so ROW_COUNT about the SQL should reflect that. Basically the change
makes the client believe that there is an SQL error whereas the error is
on the client.

Does anyone else have an opinion?

About the code:

I'm unclear whether the project policy accepts "foo" for "foo != NULL",
whether the later is prefered, or whether it does not care about it.

/*
* COPY TO STDOUT \g [|]file may be used as an alternative
* to \copy
*/

The later part of the comment is already stated in the documentation, I'm
not sure it is worth repeating it here. I'd suggest a simpler /* handle
"COPY TO STDOUT \g ..." */

While adding the note to the doc I've noticed that the other \copy
tip says:

"This operation is not as efficient as the SQL COPY command because
all data must pass through the client/server connection. For large
amounts of data the SQL command might be preferable.

It doesn't specify that it's for COPY TO/FROM file, not COPY TO
STDOUT/FROM STDIN. Of course the latter would rank the same as \copy
with respect to client/server throughput. Should this tip be more
specific?

Yep.

This tip also overlooks that the client and server are not necessary on
the same host with the same permissions: it seems to say "prefer COPY to
\copy", while the alternative may work only under special conditions which
are not hinted about in any way.

--
Fabien.

#18Daniel Verite
daniel@manitou-mail.org
In reply to: Fabien COELHO (#17)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO wrote:

I've also changed handleCopyOut() to return success if it
could pump the data without writing it out locally for lack of
an output stream. It seems to make more sense like that.

I'm hesitating on this one.

On the one hand the SQL query is executed, on the other hand the \g
command partially failed. There is a debatable side effect: If there is an
error, eg:

The change mentioned above is not supposed to have any user-visible
effect, compared to the previous version. The success of the command
as a whole, that is reflected by :ERROR, is the result of AND'ing the
successes of different steps of the command. Having handleCopyOut()
succeed does not change the falseness of the aggregated result,
which is still false when it fails to open the output stream.

But previously we had:

sql> \echo :ROW_COUNT # 0

Previously "COPY... \g file" behaved as "COPY... \g", the argument being
ignored. In this case, and the patch doesn't change that, the code does:

/*
* Suppress status printing if the report would go to the same
* place as the COPY data just went. Note this doesn't
* prevent error reporting, since handleCopyOut did that.
*/
if (copystream == pset.queryFout)
{
PQclear(copy_result);
copy_result = NULL;
}

One effect of clearing that result is that :ROW_COUNT is going to
be set to zero by SetResultVariables(), which explains the above
output.

:ROW_COUNT is incorrect after COPY TO STDOUT but this is a PG11-only
bug, :ROW_COUNT being a recent addition, so maybe we should deal with
this separately, since the current patch is supposed to address all versions?

I understand from the code that the COPY is really executed, so the ERROR
and so ROW_COUNT about the SQL should reflect that. Basically the change
makes the client believe that there is an SQL error whereas the error is
on the client.

Right, but wether COPY fails because psql can't write the output,
possibly half-way because of a disk full condition, or because the
query was cancelled or the server went down, are these distinctions
meaningful for a script?
If we say yes, how can the user know that the data fetched is
empty or incomplete? We don't have a CLIENT_SIDE_ERROR variable.
Also, the fact that psql retrieves the results when it doesn't have
a valid destination for them is an implementation detail.
I think we could also cancel the query in a way that would be
technically an SQL error, as Ctrl+C would do.
Hiding these details under a generic ERROR=true seems reasonable,
especially since we expose SQLSTATE for fine-grained error checking,
should that be needed.
ERROR=true and SQLSTATE empty is already mentioned as plausible
in SetResultVariables():

SetVariable(pset.vars, "ERROR", "true");

/*
* If there is no SQLSTATE code, use an empty string. This can
happen
* for libpq-detected errors (e.g., lost connection, ENOMEM).
*/
if (code == NULL)
code = "";
SetVariable(pset.vars, "SQLSTATE", code);

The later part of the comment is already stated in the documentation, I'm
not sure it is worth repeating it here. I'd suggest a simpler /* handle
"COPY TO STDOUT \g ..." */

The bug we're fixing here is due to missing the point the comment is
making, so being thick seems fair.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#19Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Daniel Verite (#18)
Re: Alternative to \copy in psql modelled after \g

Bonjour Daniel,

:ROW_COUNT is incorrect after COPY TO STDOUT but this is a PG11-only
bug, :ROW_COUNT being a recent addition, so maybe we should deal with
this separately, since the current patch is supposed to address all versions?

ISTM that the patch is considered a bug fix, so it shoud be applied to
pg11 and fix it?

I understand from the code that the COPY is really executed, so the ERROR
and so ROW_COUNT about the SQL should reflect that. Basically the change
makes the client believe that there is an SQL error whereas the error is
on the client.

Right, but wether COPY fails because psql can't write the output,
possibly half-way because of a disk full condition, or because the
query was cancelled or the server went down, are these distinctions
meaningful for a script?

It could if the SQL command has side effects, but probably this does not
apply to COPY TO which cannot have.

If we say yes, how can the user know that the data fetched is
empty or incomplete? We don't have a CLIENT_SIDE_ERROR variable.

Yep. Maybe we should.

The documentation states that ERROR is about SQL, not psql internal stuff:

ERROR true if the last SQL query failed, false if it succeeded.
See also SQLSTATE.

And this is somehow the behavior on all other SQL commands, with or
without your patch:

sql> SELECT 1 \g /BAD
/BAD: Permission denied

sql> \echo :ERROR
false

Basically, with your patch, the behavior becomes inconsistent between COPY
and other SQL commands, so there is something to fix.

Given that, I see two options:

(1) document ERROR as being muddy, i.e. there has been some error which
may be SQL or possibly client side. Although SQLSTATE would still allow to
know whether an SQL error occured, there is still no client side
expressions, and even if I had moved pgbench expressions to psql they
would still need to be extended to handle strings. This suggest that maybe
there could be an SQL_ERROR boolean which does store whether SQL succeeded
or not, and possibly a CLIENT_ERROR on the side, and ERROR = SQL_ERROR OR
CLIENT_ERROR.

(2) keep ERROR as is, i.e. about SQL, and add some CLIENT_ERROR, but then
I see another issue, if it is *only* the client error, then it should be
true if there is an SQL error, thus checking if there is any error becomes
ERROR OR CLIENT_ERROR, which is muddy as well especially as there are no
client-side expressions in psql.

Also, the fact that psql retrieves the results when it doesn't have
a valid destination for them is an implementation detail.

Yep, but if there are side effects, a script could want to know if they
occured?

I think we could also cancel the query in a way that would be
technically an SQL error, as Ctrl+C would do.

Hmmm.

Hiding these details under a generic ERROR=true seems reasonable,
especially since we expose SQLSTATE for fine-grained error checking,
should that be needed.

Yep, but no expression.

ERROR=true and SQLSTATE empty is already mentioned as plausible
in SetResultVariables():

Indeed. This suggest that ERROR is already muddy, contrary to the
documentation.

SetVariable(pset.vars, "ERROR", "true");
if (code == NULL)
code = "";
SetVariable(pset.vars, "SQLSTATE", code);

Overall, ISTM that it should point to solution (1).

- document that ERROR is muddy, "some SQL or client error occured"
- add SQL_ERROR, which is always about SQL error and nothing else
- add CLIENT_ERROR, same
- make the behavior consistent for all SQL command, COPY & others

The later part of the comment is already stated in the documentation, I'm
not sure it is worth repeating it here. I'd suggest a simpler /* handle
"COPY TO STDOUT \g ..." */

The bug we're fixing here is due to missing the point the comment is
making, so being thick seems fair.

I would not be, because ISTM that mentionning that COPY TO STDOUT is
specially handled already points clearly to the previous issue. No big
deal.

--
Fabien.

#20Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Fabien COELHO (#19)
Re: Alternative to \copy in psql modelled after \g

I understand from the code that the COPY is really executed, so the ERROR
and so ROW_COUNT about the SQL should reflect that. Basically the change
makes the client believe that there is an SQL error whereas the error is
on the client.

Right, but wether COPY fails because psql can't write the output,
possibly half-way because of a disk full condition, or because the
query was cancelled or the server went down, are these distinctions
meaningful for a script?

It could if the SQL command has side effects, but probably this does not
apply to COPY TO which cannot have.

Yes it can:

COPY (
UPDATE pgbench_branches
SET bbalance = bbalance + 1
WHERE bid <= 5
RETURNING *) TO STDOUT \g /BAD

The SQL command is executed but the backslash command fails.

--
Fabien.

#21Daniel Verite
daniel@manitou-mail.org
In reply to: Fabien COELHO (#19)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO wrote:

sql> SELECT 1 \g /BAD
/BAD: Permission denied

sql> \echo :ERROR
false

That seems broken, because it's pointless to leave out a class of errors
from ERROR. Presumably the purpose of ERROR is to enable
error checking like:
\if :ERROR
... error processing
\endif

Now if you download data with SELECT or COPY and we can't even
create the file, how is that a good idea to intentionally have the
script fail to detect it? What purpose does it satisfy?

The documentation states that ERROR is about SQL, not psql internal stuff:

ERROR true if the last SQL query failed, false if it succeeded.
See also SQLSTATE.

ERROR is set by SetResultVariables(PGresult *results, bool success)
and takes the value of "success", ignoring the PGresult.
So why is ERROR independant of the SQL result, relatively to your
claim that it should never reflect anything else?

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#22Daniel Verite
daniel@manitou-mail.org
In reply to: Fabien COELHO (#19)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO wrote:

(1) document ERROR as being muddy, i.e. there has been some error which
may be SQL or possibly client side. Although SQLSTATE would still allow to
know whether an SQL error occured, there is still no client side
expressions, and even if I had moved pgbench expressions to psql they
would still need to be extended to handle strings. This suggest that maybe
there could be an SQL_ERROR boolean which does store whether SQL succeeded
or not, and possibly a CLIENT_ERROR on the side, and ERROR = SQL_ERROR OR
CLIENT_ERROR.

(2) keep ERROR as is, i.e. about SQL, and add some CLIENT_ERROR, but then
I see another issue, if it is *only* the client error, then it should be
true if there is an SQL error, thus checking if there is any error becomes
ERROR OR CLIENT_ERROR, which is muddy as well especially as there are no
client-side expressions in psql.

(3) Set ERROR=true if ON_ERROR_STOP would quit on the exact same
conditions.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#23Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Daniel Verite (#21)
Re: Alternative to \copy in psql modelled after \g

Bonsoir Daniel,

sql> SELECT 1 \g /BAD
/BAD: Permission denied

sql> \echo :ERROR
false

That seems broken, because it's pointless to leave out a class of errors
from ERROR.

Yep. That is my point, but fixing it requires to decide whether it is okay
to change ERROR documentation and behavior, and ISTM that there is some
legit case to have the SQL_ERROR information independently.

Now if you download data with SELECT or COPY and we can't even
create the file, how is that a good idea to intentionally have the
script fail to detect it? What purpose does it satisfy?

It means that the client knows that the SQL command, and possible
associated side effects, were executed server-side, and that if we are in
a transaction it is still going on:

BEGIN;
UPDATE pgbench_branches SET bbalance = bbalance + 1 RETURNING * \g /bad
// the update is performed, the transaction is not rollbacked,
// *but* the output file was not written, "COMMIT" save changes.

The documentation states that ERROR is about SQL, not psql internal stuff:

ERROR true if the last SQL query failed, false if it succeeded.
See also SQLSTATE.

ERROR is set by SetResultVariables(PGresult *results, bool success)
and takes the value of "success", ignoring the PGresult.

Yes and no: "success" pretty often currently depends on PGresult, eg:

if (PQresultStatus(results) != PGRES_COMMAND_OK) {
...
SetResultVariables(results, false);

results = PQexec(pset.db, buf.data);
OK = AcceptResult(results); // true if SQL is ok
...
SetResultVariables(results, OK);

results = PQexec(pset.db, buf.data);
OK = AcceptResult(results) &&
(PQresultStatus(results) == PGRES_COMMAND_OK);
if (!OK)
SetResultVariables(results, OK);

So why is ERROR independant of the SQL result, relatively to your
claim that it should never reflect anything else?

AFAICS I'm not claiming that, on the contrary I'm basically ok with
changing ERROR documentation and implementation (what I called option 1),
although it would have to pass through committers, *AND* I'm still
thinking that having a separate access to whether the SQL failed or not is
of interest, so there is an argument to add a SQL_ERROR which reflects the
current documentation, if not fully the implementation.

--
Fabien .

#24Daniel Verite
daniel@manitou-mail.org
In reply to: Fabien COELHO (#23)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO wrote:

Now if you download data with SELECT or COPY and we can't even
create the file, how is that a good idea to intentionally have the
script fail to detect it? What purpose does it satisfy?

It means that the client knows that the SQL command, and possible
associated side effects, were executed server-side, and that if we are in
a transaction it is still going on:

BEGIN;
UPDATE pgbench_branches SET bbalance = bbalance + 1 RETURNING * \g /bad
// the update is performed, the transaction is not rollbacked,
// *but* the output file was not written, "COMMIT" save changes.

if PQexec() could not store the results for lack of memory, it would
yield a PGRES_FATAL_ERROR, then :ERROR would be set to true, and yet
the server-side operation would have been performed. Additionally, If
that BEGIN was not there, the statement would also have been
committed, so its effect would be durable independently of the value
of :ERROR.

My point is that client-side issues already affect :ERROR,
so it can't be assumed that :ERROR=true implies that the SQL
statement did not have effects on the server.

In that sense, the patch in its current state does not break this
guarantee, since it does not exist in the first place.

OTOH I hope that :ERROR = false is a true guarantee that there
have been no problem whatsoever in the execution of
the last statement.

on the contrary I'm basically ok with changing ERROR
documentation and implementation (what I called option 1),

The doc only says:

ERROR
true if the last SQL query failed, false if it succeeded.
See also SQLSTATE.

If the failure to retrieve results is included in "query failed", which
seems a reasonable interpretation to me, it doesn't need to be changed.

What's not right is "SELECT ... \g /bad" having a different effect on
:ERROR than "COPY... \g /bad".
I plan to follow up on that because there are other problems with
SELECT ... \g anyway, for instance, when a disk full occurs,
it's not reported at all. But that problem is not in the code path
of COPY.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#25Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Daniel Verite (#24)
Re: Alternative to \copy in psql modelled after \g

BEGIN;
UPDATE pgbench_branches SET bbalance = bbalance + 1 RETURNING * \g /bad
// the update is performed, the transaction is not rollbacked,
// *but* the output file was not written, "COMMIT" save changes.

if PQexec() could not store the results for lack of memory, it would
yield a PGRES_FATAL_ERROR, then :ERROR would be set to true, and yet
the server-side operation would have been performed. Additionally, If
that BEGIN was not there, the statement would also have been
committed, so its effect would be durable independently of the value
of :ERROR.

Indeed, OOM is a special case when the client is unable to know whether
the command was actually executed. However ISTM that the connection is
likely to be aborted, so if there is an explicit transaction it would be
aborted as well.

My point is that client-side issues already affect :ERROR,
so it can't be assumed that :ERROR=true implies that the SQL
statement did not have effects on the server.

Not from my reading of the doc (which I probably wrote, and did the
implementation without a clear view of the different client-specific error
conditions, so all this is really my fault BTW), where I understand that
ERROR as true says that the SQL failed.

In that sense, the patch in its current state does not break this
guarantee, since it does not exist in the first place.

Sure, but the patch in its current state creates an inconsistency between
an SQL command with "\g /BAD" and the same command in "COPY (...) TO
STDOUT \g /BAD". I think that it must at the minimum be consistent.

OTOH I hope that :ERROR = false is a true guarantee that there have been
no problem whatsoever in the execution of the last statement.

on the contrary I'm basically ok with changing ERROR
documentation and implementation (what I called option 1),

The doc only says:

ERROR
true if the last SQL query failed, false if it succeeded.
See also SQLSTATE.

If the failure to retrieve results is included in "query failed", which
seems a reasonable interpretation to me, it doesn't need to be changed.

I understand "SQL query failed" as a server issue, especially as SQLSTATE
is the reported command status at the PQ protocol level.

What's not right is "SELECT ... \g /bad" having a different effect on
:ERROR than "COPY... \g /bad".

Yep, I've been saying that.

I plan to follow up on that because there are other problems with
SELECT ... \g anyway, for instance, when a disk full occurs,
it's not reported at all. But that problem is not in the code path
of COPY.

Sure. As there are several bugs (doubtful features) uncovered, a first
patch could fix "COPY ...TO STDOUT \g file", but probably replicate ERROR
current behavior however debatable it is (i.e. your patch without the
ERROR change, which is unrelated to the bug being fixed), and then another
patch should fix/modify the behavior around ERROR (everywhere and
consistently), and probably IMHO add an SQL_ERROR.

--
Fabien.

#26Daniel Verite
daniel@manitou-mail.org
In reply to: Fabien COELHO (#25)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO wrote:

Sure. As there are several bugs (doubtful features) uncovered, a first
patch could fix "COPY ...TO STDOUT \g file", but probably replicate ERROR
current behavior however debatable it is (i.e. your patch without the
ERROR change, which is unrelated to the bug being fixed), and then another
patch should fix/modify the behavior around ERROR (everywhere and
consistently), and probably IMHO add an SQL_ERROR.

It's not as if the patch issued an explicit call SetVariable("ERROR", "true")
that could be commented, or something like that. The assignment
of the variable happens as a consequence of patched code that aims at
being correct in its error handling.
So I'm for leaving this decision to a maintainer, because I don't agree
with your conclusion that the current patch should be changed in
that regard.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#27Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Daniel Verite (#26)
Re: Alternative to \copy in psql modelled after \g

Bonsoir Daniel,

Sure. As there are several bugs (doubtful features) uncovered, a first
patch could fix "COPY ...TO STDOUT \g file", but probably replicate ERROR
current behavior however debatable it is (i.e. your patch without the
ERROR change, which is unrelated to the bug being fixed), and then another
patch should fix/modify the behavior around ERROR (everywhere and
consistently), and probably IMHO add an SQL_ERROR.

It's not as if the patch issued an explicit call SetVariable("ERROR", "true")
that could be commented, or something like that. The assignment
of the variable happens as a consequence of patched code that aims at
being correct in its error handling.
So I'm for leaving this decision to a maintainer, because I don't agree
with your conclusion that the current patch should be changed in
that regard.

Ok, fine with me.

To summarize the debate, when fixing "\g filename" for "COPY TO STDOUT"
the patch does not set error consistently on:

sql> COPY (SELECT 1) TO STDOUT \g /BAD
# Permission denied on "/BAD"
-> :ERROR is true
# the command is executed but could not write to the file

compared to the existing behavior:

sql> SELECT 1 \g BAD
# Permission denied on "/BAD"
-> :ERROR is false
# the SQL command is fine, although writing to the file failed

Should we include this inconsistency and then fix the problem later, or
replicate the initial (doubtful?) behavior for consistency and then fix
the problem later, anyway?

Fixing the problem envolves deciding what is the right behavior, and
update the documentation and the implementation accordingly. Currently the
documentation suggests that :ERROR is about SQL error (subject to
interpretation), and the implementation is more or less consistent with
that view, but not always, as pointed out by Daniel.

Note that as the author of the initial patch adding :ERROR and others
(69835bc8), I'm probably somehow responsible for this situation, so shame
on me.

--
Fabien.

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fabien COELHO (#27)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO <coelho@cri.ensmp.fr> writes:

Fixing the problem envolves deciding what is the right behavior, and
update the documentation and the implementation accordingly. Currently the
documentation suggests that :ERROR is about SQL error (subject to
interpretation), and the implementation is more or less consistent with
that view, but not always, as pointed out by Daniel.

FWIW, I think we should adopt the rule that :ERROR reflects any error
condition, whether server-side or client-side. It's unlikely that
scripts would really not care about client-side errors. Moreover,
I do not think we can reliably tell the difference; there are always
going to be corner cases that are hard to classify. Example: if we
lose the server connection, is that a server-side error or client-side?
Or maybe neither, maybe the network went down.

Because of this concern, I find the idea of inventing separate
SQL_ERROR and CLIENT_ERROR variables to be a pretty terrible one.
I think we'd be creating a lot of make-work and hard choices for
ourselves, to do something that most script writers won't give a
fig about. If you've got subtle error-handling logic in mind,
you shouldn't be trying to implement your app in a psql script
in the first place, IMO anyway.

It's unclear to me whether to push ahead with Daniel's existing
patch or not. It doesn't look to me like it's making things
any worse from the error-consistency standpoint than they were
already, so I'd be inclined to consider error semantics cleanup
as something to be done separately/later.

BTW, it looks to me like the last patch is still not right as far
as when to close copystream --- won't it incorrectly close a
stream obtained from pset.copyStream? While you could fix that
with

-	if (pset.gfname && copystream != NULL)
+	if (!pset.copyStream && pset.gfname && copystream != NULL)

I think that's getting overly complex and unmaintainable. I'd
be inclined to code things more like

FILE *copystream = NULL;
bool need_close = false;

...

need_close = openQueryOutputFile(...);

...

if (need_close)
// close copystream here

regards, tom lane

#29Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tom Lane (#28)
Re: Alternative to \copy in psql modelled after \g

Hello Tom,

Fixing the problem envolves deciding what is the right behavior, and
update the documentation and the implementation accordingly. Currently the
documentation suggests that :ERROR is about SQL error (subject to
interpretation), and the implementation is more or less consistent with
that view, but not always, as pointed out by Daniel.

FWIW, I think we should adopt the rule that :ERROR reflects any error
condition, whether server-side or client-side.

I think that everybody agrees with that.

It's unlikely that scripts would really not care about client-side
errors. Moreover, I do not think we can reliably tell the difference;
there are always going to be corner cases that are hard to classify.
Example: if we lose the server connection, is that a server-side error
or client-side? Or maybe neither, maybe the network went down.

Because of this concern, I find the idea of inventing separate
SQL_ERROR and CLIENT_ERROR variables to be a pretty terrible one.

Then the committer is right:-)

I think we'd be creating a lot of make-work and hard choices for
ourselves, to do something that most script writers won't give a
fig about. If you've got subtle error-handling logic in mind,
you shouldn't be trying to implement your app in a psql script
in the first place, IMO anyway.

Possibly. I was also thinking of debug. ISTM that SQL_ERROR is pretty
trivial to implement because we already set SQLSTATE, and SQL_ERROR is
probably something like SQLSTATE != "0000" or close to that.

It's unclear to me whether to push ahead with Daniel's existing
patch or not. It doesn't look to me like it's making things
any worse from the error-consistency standpoint than they were
already, so I'd be inclined to consider error semantics cleanup
as something to be done separately/later.

Fine.

BTW, it looks to me like the last patch is still not right as far
as when to close copystream --- won't it incorrectly close a
stream obtained from pset.copyStream?

Argh, I should have checked this point.

--
Fabien.

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fabien COELHO (#29)
Re: Alternative to \copy in psql modelled after \g

Fabien COELHO <coelho@cri.ensmp.fr> writes:

It's unclear to me whether to push ahead with Daniel's existing
patch or not. It doesn't look to me like it's making things
any worse from the error-consistency standpoint than they were
already, so I'd be inclined to consider error semantics cleanup
as something to be done separately/later.

Fine.

OK. I fixed the error-cleanup issue and pushed it.

The patch applied cleanly back to 9.5, but the code for \g is a good
bit different in 9.4. I didn't have the interest to try to make the
patch work with that, so I just left 9.4 alone.

regards, tom lane

#31Daniel Verite
daniel@manitou-mail.org
In reply to: Tom Lane (#30)
Re: Alternative to \copy in psql modelled after \g

Tom Lane wrote:

OK. I fixed the error-cleanup issue and pushed it.

The patch applied cleanly back to 9.5, but the code for \g is a good
bit different in 9.4. I didn't have the interest to try to make the
patch work with that, so I just left 9.4 alone.

Thanks!

Now as far as I can see, there is nothing that \copy to file or program
can do that COPY TO STDOUT cannot do. The next thing would be to
figure out how to similarly improve COPY FROM in psql, after which
\copy might be seen as obsolete.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Verite (#31)
Re: Alternative to \copy in psql modelled after \g

"Daniel Verite" <daniel@manitou-mail.org> writes:

Now as far as I can see, there is nothing that \copy to file or program
can do that COPY TO STDOUT cannot do.

I don't think there's a way to get the effect of "\copy to pstdout"
(which, IIRC without any caffeine, means write to psql's stdout regardless
of where queryFout is currently pointing). Somebody was excited enough
about that to submit a patch for it, so maybe it's worth covering.
My first thought about syntax is to define "\g -" as meaning that.

The next thing would be to
figure out how to similarly improve COPY FROM in psql, after which
\copy might be seen as obsolete.

I suggested upthread that we could just define "\g foo" as reading
from foo, not writing it, if the command turns out to be COPY FROM.
Maybe that's too weird/mistake-prone? A variant that might or might
not be safer is "\g <foo", ie we insist on you putting a mark there
that shows you intended to read.

Also, not quite clear what we'd do about copy-from-program.
I think "\g |foo" is definitely confusing for that. "\g foo|"
would be better if it doesn't have syntax issues.

regards, tom lane

#33Daniel Verite
daniel@manitou-mail.org
In reply to: Tom Lane (#32)
Re: Alternative to \copy in psql modelled after \g

Tom Lane wrote:

A variant that might or might not be safer is "\g <foo", ie we
insist on you putting a mark there that shows you intended to read.

Also, not quite clear what we'd do about copy-from-program.
I think "\g |foo" is definitely confusing for that. "\g foo|"
would be better if it doesn't have syntax issues.

I haven't written any patch yet, but I was thinking of submitting
something like that, with the addition of "\g >foo" as a synonym of
"\g foo" for the symmetry with "<".
Perl's open() also uses the "program |" syntax.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Verite (#33)
Re: Alternative to \copy in psql modelled after \g

"Daniel Verite" <daniel@manitou-mail.org> writes:

Tom Lane wrote:

A variant that might or might not be safer is "\g <foo", ie we
insist on you putting a mark there that shows you intended to read.

I haven't written any patch yet, but I was thinking of submitting
something like that, with the addition of "\g >foo" as a synonym of
"\g foo" for the symmetry with "<".

+1, the same had occurred to me.

regards, tom lane

#35Daniel Verite
daniel@manitou-mail.org
In reply to: Tom Lane (#32)
Re: Alternative to \copy in psql modelled after \g

Tom Lane wrote:

Now as far as I can see, there is nothing that \copy to file or program
can do that COPY TO STDOUT cannot do.

I don't think there's a way to get the effect of "\copy to pstdout"
(which, IIRC without any caffeine, means write to psql's stdout regardless
of where queryFout is currently pointing).

\g /dev/stdout would work already on systems like Linux that provide
this alias.
Otherwise "\g -" looks good as a portable solution.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#36Corey Huinker
corey.huinker@gmail.com
In reply to: Daniel Verite (#35)
Re: Alternative to \copy in psql modelled after \g

Otherwise "\g -" looks good as a portable solution.

+1