Improvements and additions to COPY progress reporting

Started by Matthias van de Meentalmost 5 years ago51 messages
#1Matthias van de Meent
boekewurm+postgres@gmail.com
5 attachment(s)

Hi,

With [0]/messages/by-id/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com we got COPY progress reporting. Before the column names of
this newly added view are effectively set in stone with the release of
pg14, I propose the following set of relatively small patches. These
are v2, because it is a patchset that is based on a set of patches
that I previously posted in [0]/messages/by-id/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com.

0001 Adds a column to pg_stat_progress_copy which details the amount
of tuples that were excluded from insertion by the WHERE clause of the
COPY FROM command.

0002 alters pg_stat_progress_copy to use 'tuple'-terminology instead
of 'line'-terminology. 'Line' doesn't make sense in the binary copy
case, and only for the 'text' copy format there can be a guarantee
that the source / output file actually contains the reported amount of
lines, whereas the amount of data tuples (which is also what it's
called internally) is guaranteed to equal for all data types.

There was some discussion about this in [0]/messages/by-id/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com where the author thought
'line' is more consistent with the CSV documentation, and where I
argued that 'tuple' is both more consistent with the rest of the
progress reporting tables and more consistent with the actual counted
items: these are the tuples serialized / inserted (as noted in the CSV
docs; "Thus the files are not strictly one line per table row like
text-format files.").

Patch 0003 adds backlinks to the progress reporting docs from the docs
of the commands that have progress reporting (re/index, cluster,
vacuum, etc.) such that progress reporting is better discoverable from
the relevant commands, and removes the datname column from the
progress_copy view (that column was never committed). This too should
be fairly trivial and uncontroversial.

0004 adds the 'command' column to the progress_copy view; which
distinguishes between COPY FROM and COPY TO. The two commands are (in
my opinion) significantly different enough to warrant this column;
similar to the difference between CREATE INDEX/REINDEX [CONCURRENTLY]
which also report that information. I believe that this change is
appropriate; as the semantics of the columns change depending on the
command being executed.

Lastly, 0005 adds 'io_target' to the reported information, that is,
FILE, PROGRAM, STDIO or CALLBACK. Although this can relatively easily
be determined based on the commands in pg_stat_activity, it is
reasonably something that a user would want to query on, as the
origin/target of COPY has security and performance implications,
whereas other options (e.g. format) are less interesting for clients
that are not executing that specific COPY command.

Of special interest in 0005 is that it reports the io_target for the
logical replications' initial tablesyncs' internal COPY. This would
otherwise be measured, but no knowledge about the type of copy (or its
origin) would be available on the worker's side. I'm not married to
this patch 0005, but I believe it could be useful, and therefore
included it in the patchset.

With regards,

Matthias van de Meent.

[0]: /messages/by-id/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com

Attachments:

v2-0005-Add-a-io_target-column-to-the-copy-progress-view.patchtext/x-patch; charset=US-ASCII; name=v2-0005-Add-a-io_target-column-to-the-copy-progress-view.patchDownload
From 02949f1eea264266fdd1c7b53fc58f8c9e590153 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Feb 2021 18:13:46 +0100
Subject: [PATCH v2 5/5] Add a io_target column to the copy progress view

This allows filtering on IO target for progress reporting, and allows for
further filtering of COPY commands. Additionally, this allows for
identification of logical replication's initial tablesync workers at the
subscriber side through io_target = CALLBACK, as it is the only current
supplier of a callback datasource.
---
 doc/src/sgml/monitoring.sgml         | 13 +++++++++++++
 src/backend/catalog/system_views.sql |  5 +++++
 src/backend/commands/copyfrom.c      | 29 +++++++++++++++++++++++++++-
 src/backend/commands/copyto.c        | 26 ++++++++++++++++++++++++-
 src/include/commands/progress.h      |  7 +++++++
 src/test/regress/expected/rules.out  |  7 +++++++
 6 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index ca84b53896..9dcce405fb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6554,6 +6554,19 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_target</structfield> <type>text</type>
+      </para>
+      <para>
+       The io target that the data is read from or written to: 
+       <literal>FILE</literal>, <literal>PROGRAM</literal>, 
+       <literal>STDIO</literal> (for COPY FROM STDIN and COPY TO STDOUT),
+       or <literal>CALLBACK</literal> (used in the tablesync background 
+       worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1082b7d253..6a3ac47b85 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1132,6 +1132,11 @@ CREATE VIEW pg_stat_progress_copy AS
         CASE S.param5 WHEN 1 THEN 'COPY FROM'
                       WHEN 2 THEN 'COPY TO'
                       END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_target,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
         S.param3 AS tuples_processed,
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index ce343dbf80..21d71427c7 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1428,7 +1428,34 @@ BeginCopyFrom(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
-	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
+	{
+		const int progress_index[] = {
+			PROGRESS_COPY_COMMAND,
+			PROGRESS_COPY_IO_TARGET
+		};
+		int64 progress_vals[] = {
+			PROGRESS_COPY_COMMAND_FROM,
+			0
+		};
+		switch (cstate->copy_src)
+		{
+			case COPY_FILE:
+				if (is_program)
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+			case COPY_CALLBACK:
+				progress_vals[1] = PROGRESS_COPY_IO_TARGET_CALLBACK;
+				break;
+		}
+		pgstat_progress_update_multi_param(2, progress_index, progress_vals);
+	}
+
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 534c091c75..42c4a828df 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,7 +772,31 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
-	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_TO);
+	{
+		const int progress_index[] = {
+			PROGRESS_COPY_COMMAND,
+			PROGRESS_COPY_IO_TARGET
+		};
+		int64 progress_vals[] = {
+			PROGRESS_COPY_COMMAND_TO,
+			0
+		};
+		switch (cstate->copy_dest)
+		{
+			case COPY_FILE:
+				if (is_program)
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+		}
+		pgstat_progress_update_multi_param(2, progress_index, progress_vals);
+	}
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1c30d09abb..e003217554 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -139,9 +139,16 @@
 #define PROGRESS_COPY_TUPLES_PROCESSED 2
 #define PROGRESS_COPY_TUPLES_EXCLUDED 3
 #define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TARGET 5
 
 /* Commands of PROGRESS_COPY_COMMAND */
 #define PROGRESS_COPY_COMMAND_FROM 1
 #define PROGRESS_COPY_COMMAND_TO 2
 
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TARGET_FILE 1
+#define PROGRESS_COPY_IO_TARGET_PROGRAM 2
+#define PROGRESS_COPY_IO_TARGET_STDIO 3
+#define PROGRESS_COPY_IO_TARGET_CALLBACK 4
+
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 63b5e33083..0698c71d23 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1953,6 +1953,13 @@ pg_stat_progress_copy| SELECT s.pid,
             WHEN 2 THEN 'COPY TO'::text
             ELSE NULL::text
         END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_target,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
     s.param3 AS tuples_processed,
-- 
2.20.1

v2-0002-Rename-lines-to-tuples-in-COPY-progress-reporting.patchtext/x-patch; charset=US-ASCII; name=v2-0002-Rename-lines-to-tuples-in-COPY-progress-reporting.patchDownload
From d6723f2cfbe98e9837948336420f283312c91b51 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Thu, 7 Jan 2021 16:51:30 +0100
Subject: [PATCH v2 2/5] Rename lines* to tuples* in COPY progress reporting
 view

This pulls the names of the columns in line with the other progress
reporting views and the comments surrounding the code that updates the columns.

It also makes it clear to the user that for the foreign data consumed or
produced in the COPY command, the line count need not be related to these
columns.
---
 doc/src/sgml/monitoring.sgml         | 8 ++++----
 src/backend/catalog/system_views.sql | 4 ++--
 src/backend/commands/copyfrom.c      | 4 ++--
 src/backend/commands/copyto.c        | 4 ++--
 src/include/commands/progress.h      | 4 ++--
 src/test/regress/expected/rules.out  | 4 ++--
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dc998bc5f7..940e9dcee4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6565,19 +6565,19 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples already processed by <command>COPY</command> command.
       </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_excluded</structfield> <type>bigint</type>
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines not processed because they were excluded by the
+       Number of tuples not processed because they were excluded by the
        <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a9bd983419..e7e227792c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1131,8 +1131,8 @@ CREATE VIEW pg_stat_progress_copy AS
         S.relid AS relid,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed,
-        S.param4 AS lines_excluded
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 86e6a422e2..fb3c7e2c0c 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -871,7 +871,7 @@ CopyFrom(CopyFromState cstate)
 			if (!ExecQual(cstate->qualexpr, econtext))
 			{
 				/* Report that this tuple was filtered out by the WHERE clause */
-				pgstat_progress_update_param(PROGRESS_COPY_LINES_EXCLUDED, ++excluded);
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
 			}
 		}
@@ -1111,7 +1111,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..9ffe7a6ee3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -954,7 +954,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1321,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index e6f6545033..8b2b188bd5 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,7 +136,7 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
-#define PROGRESS_COPY_LINES_EXCLUDED 3
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 25a863627e..970f6909c2 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,8 +1950,8 @@ pg_stat_progress_copy| SELECT s.pid,
     s.relid,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed,
-    s.param4 AS lines_excluded
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v2-0001-Add-progress-reporting-for-excluded-rows.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Add-progress-reporting-for-excluded-rows.patchDownload
From 177fabe331cbf21b695c9d8f7bab513d14e8f727 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Thu, 7 Jan 2021 16:39:57 +0100
Subject: [PATCH v2 1/5] Add progress reporting for excluded rows.

COPY ... FROM ... WHERE (condition) will exclude rows that do not match the
condition, which we now register as explicitly being excluded from insertion.
---
 doc/src/sgml/monitoring.sgml         | 10 ++++++++++
 src/backend/catalog/system_views.sql |  3 ++-
 src/backend/commands/copyfrom.c      |  5 +++++
 src/include/commands/progress.h      |  1 +
 src/test/regress/expected/rules.out  |  3 ++-
 5 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index c602ee4427..dc998bc5f7 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6571,6 +6571,16 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
        Number of lines already processed by <command>COPY</command> command.
       </para></entry>
      </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>lines_excluded</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of lines not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
+      </para></entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..a9bd983419 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1131,7 +1131,8 @@ CREATE VIEW pg_stat_progress_copy AS
         S.relid AS relid,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS lines_processed,
+        S.param4 AS lines_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index c39cc736ed..86e6a422e2 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -540,6 +540,7 @@ CopyFrom(CopyFromState cstate)
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
 	uint64		processed = 0;
+	uint64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -868,7 +869,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_LINES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..e6f6545033 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -137,5 +137,6 @@
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
 #define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_LINES_EXCLUDED 3
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6173473de9..25a863627e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,7 +1950,8 @@ pg_stat_progress_copy| SELECT s.pid,
     s.relid,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS lines_processed,
+    s.param4 AS lines_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v2-0003-Add-backlinks-to-progress-reporting-documentation.patchtext/x-patch; charset=US-ASCII; name=v2-0003-Add-backlinks-to-progress-reporting-documentation.patchDownload
From 04d6fcb33d863a940d828d06971617e12b35b3cc Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v2 3/5] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       | 1 +
 doc/src/sgml/ref/cluster.sgml       | 1 +
 doc/src/sgml/ref/copy.sgml          | 8 ++++++++
 doc/src/sgml/ref/create_index.sgml  | 1 +
 doc/src/sgml/ref/pg_basebackup.sgml | 1 +
 doc/src/sgml/ref/reindex.sgml       | 1 +
 doc/src/sgml/ref/vacuum.sgml        | 2 ++
 7 files changed, 15 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..25381a86bf 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -291,6 +291,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..65902f0131 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -242,6 +242,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..62c8504a6e 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -1052,4 +1052,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+ 
+ <refsect1>
+  <title>See Also</title>
+ 
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..36cf69ec1e 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -978,6 +978,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..a8b3b5efc5 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -551,6 +551,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 21ab57d880..06cde1ca16 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -407,6 +407,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v2-0004-Add-a-command-column-to-the-copy-progress-view.patchtext/x-patch; charset=US-ASCII; name=v2-0004-Add-a-command-column-to-the-copy-progress-view.patchDownload
From 9f04d5a97652f6b08747ccc9af3b079fe1221d4b Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Feb 2021 17:36:00 +0100
Subject: [PATCH v2 4/5] Add a command column to the copy progress view

This allows filtering on COPY FROM / COPY TO for progress reporting, and makes
it possible to determine the further meaning of the columns involved.
---
 doc/src/sgml/monitoring.sgml         | 10 ++++++++++
 src/backend/catalog/system_views.sql |  3 +++
 src/backend/commands/copyfrom.c      |  1 +
 src/backend/commands/copyto.c        |  1 +
 src/include/commands/progress.h      |  5 +++++
 src/test/regress/expected/rules.out  |  5 +++++
 6 files changed, 25 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 940e9dcee4..ca84b53896 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6544,6 +6544,16 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e7e227792c..1082b7d253 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,6 +1129,9 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
         S.param3 AS tuples_processed,
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index fb3c7e2c0c..ce343dbf80 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1428,6 +1428,7 @@ BeginCopyFrom(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 9ffe7a6ee3..534c091c75 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,6 +772,7 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_TO);
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 8b2b188bd5..1c30d09abb 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -138,5 +138,10 @@
 #define PROGRESS_COPY_BYTES_TOTAL 1
 #define PROGRESS_COPY_TUPLES_PROCESSED 2
 #define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 970f6909c2..63b5e33083 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1948,6 +1948,11 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
     s.param3 AS tuples_processed,
-- 
2.20.1

#2Josef Šimánek
josef.simanek@gmail.com
In reply to: Matthias van de Meent (#1)
Re: Improvements and additions to COPY progress reporting

po 8. 2. 2021 v 19:35 odesílatel Matthias van de Meent
<boekewurm+postgres@gmail.com> napsal:

Hi,

With [0] we got COPY progress reporting. Before the column names of
this newly added view are effectively set in stone with the release of
pg14, I propose the following set of relatively small patches. These
are v2, because it is a patchset that is based on a set of patches
that I previously posted in [0].

Hello. I had this in my backlog to revisit this feature as well before
the release. Thanks for picking this up.

0001 Adds a column to pg_stat_progress_copy which details the amount
of tuples that were excluded from insertion by the WHERE clause of the
COPY FROM command.

0002 alters pg_stat_progress_copy to use 'tuple'-terminology instead
of 'line'-terminology. 'Line' doesn't make sense in the binary copy
case, and only for the 'text' copy format there can be a guarantee
that the source / output file actually contains the reported amount of
lines, whereas the amount of data tuples (which is also what it's
called internally) is guaranteed to equal for all data types.

There was some discussion about this in [0] where the author thought
'line' is more consistent with the CSV documentation, and where I
argued that 'tuple' is both more consistent with the rest of the
progress reporting tables and more consistent with the actual counted
items: these are the tuples serialized / inserted (as noted in the CSV
docs; "Thus the files are not strictly one line per table row like
text-format files.").

As an mentioned author I have no preference over line or tuple
terminology here. For some cases "line" terminology fits better, for
some "tuple" one. Docs can be improved later if needed to make it
clear at some cases (for example in most common case probably - CSV
import/export) tuple equals one line in CSV.

Patch 0003 adds backlinks to the progress reporting docs from the docs
of the commands that have progress reporting (re/index, cluster,
vacuum, etc.) such that progress reporting is better discoverable from
the relevant commands, and removes the datname column from the
progress_copy view (that column was never committed). This too should
be fairly trivial and uncontroversial.

0004 adds the 'command' column to the progress_copy view; which
distinguishes between COPY FROM and COPY TO. The two commands are (in
my opinion) significantly different enough to warrant this column;
similar to the difference between CREATE INDEX/REINDEX [CONCURRENTLY]
which also report that information. I believe that this change is
appropriate; as the semantics of the columns change depending on the
command being executed.

This was part of my initial patch as well, but I decided to strip it
out to make the final patch as small as possible to make it quickly
mergeable without need of further discussion. From my side this is
useful to have directly in the progress report as well.

Lastly, 0005 adds 'io_target' to the reported information, that is,
FILE, PROGRAM, STDIO or CALLBACK. Although this can relatively easily
be determined based on the commands in pg_stat_activity, it is
reasonably something that a user would want to query on, as the
origin/target of COPY has security and performance implications,
whereas other options (e.g. format) are less interesting for clients
that are not executing that specific COPY command.

Similar (simplified, not supporting CALLBACK) info was also part of
the initial patch and stripped out later. I'm also +1 on this info
being useful to have directly in the progress report.

Of special interest in 0005 is that it reports the io_target for the
logical replications' initial tablesyncs' internal COPY. This would
otherwise be measured, but no knowledge about the type of copy (or its
origin) would be available on the worker's side. I'm not married to
this patch 0005, but I believe it could be useful, and therefore
included it in the patchset.

All patches seem good to me. I was able to apply them to current clean
master and "make check" has succeeded without problems.

Show quoted text

With regards,

Matthias van de Meent.

[0] /messages/by-id/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com

#3Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Matthias van de Meent (#1)
Re: Improvements and additions to COPY progress reporting

On Tue, Feb 9, 2021 at 12:06 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

With [0] we got COPY progress reporting. Before the column names of
this newly added view are effectively set in stone with the release of
pg14, I propose the following set of relatively small patches. These
are v2, because it is a patchset that is based on a set of patches
that I previously posted in [0].

Thanks for working on the patches. Here are some comments:

0001 - +1 to add tuples_excluded and the patch LGTM.

0002 - Yes, the tuples_processed or tuples_excluded makes more sense
to me than lines_processed and lines_excluded. The patch LGTM.

0003 - Instead of just adding the progress reporting to "See also"
sections in the footer of the respective pages analyze, cluster and
others, it would be nice if we have a mention of it in the description
as pg_basebackup has something like below:
<para>
Whenever <application>pg_basebackup</application> is taking a base
backup, the server's <structname>pg_stat_progress_basebackup</structname>
view will report the progress of the backup.
See <xref linkend="basebackup-progress-reporting"/> for details.

0004 -
1) How about PROGRESS_COPY_COMMAND_TYPE instead of
PROGRESS_COPY_COMMAND? The names looks bit confusing with the existing
PROGRESS_COMMAND_COPY.

0005 -
1) How about
+       or <literal>CALLBACK</literal> (used in the table
synchronization background
+       worker).
instead of
+       or <literal>CALLBACK</literal> (used in the tablesync background
+       worker).
Because "table synchronization" is being used in logical-replication.sgml.

2) I think cstate->copy_src = COPY_CALLBACK is assigned after the
switch case added in copyfrom.c
if (data_source_cb)
{
cstate->copy_src = COPY_CALLBACK;
cstate->data_source_cb = data_source_cb;
}

Also, you can add this to the current commitfest.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#4Josef Šimánek
josef.simanek@gmail.com
In reply to: Matthias van de Meent (#1)
1 attachment(s)
Re: Improvements and additions to COPY progress reporting

po 8. 2. 2021 v 19:35 odesílatel Matthias van de Meent
<boekewurm+postgres@gmail.com> napsal:

Hi,

With [0] we got COPY progress reporting. Before the column names of
this newly added view are effectively set in stone with the release of
pg14, I propose the following set of relatively small patches. These
are v2, because it is a patchset that is based on a set of patches
that I previously posted in [0].

0001 Adds a column to pg_stat_progress_copy which details the amount
of tuples that were excluded from insertion by the WHERE clause of the
COPY FROM command.

0002 alters pg_stat_progress_copy to use 'tuple'-terminology instead
of 'line'-terminology. 'Line' doesn't make sense in the binary copy
case, and only for the 'text' copy format there can be a guarantee
that the source / output file actually contains the reported amount of
lines, whereas the amount of data tuples (which is also what it's
called internally) is guaranteed to equal for all data types.

There was some discussion about this in [0] where the author thought
'line' is more consistent with the CSV documentation, and where I
argued that 'tuple' is both more consistent with the rest of the
progress reporting tables and more consistent with the actual counted
items: these are the tuples serialized / inserted (as noted in the CSV
docs; "Thus the files are not strictly one line per table row like
text-format files.").

Patch 0003 adds backlinks to the progress reporting docs from the docs
of the commands that have progress reporting (re/index, cluster,
vacuum, etc.) such that progress reporting is better discoverable from
the relevant commands, and removes the datname column from the
progress_copy view (that column was never committed). This too should
be fairly trivial and uncontroversial.

0004 adds the 'command' column to the progress_copy view; which
distinguishes between COPY FROM and COPY TO. The two commands are (in
my opinion) significantly different enough to warrant this column;
similar to the difference between CREATE INDEX/REINDEX [CONCURRENTLY]
which also report that information. I believe that this change is
appropriate; as the semantics of the columns change depending on the
command being executed.

Lastly, 0005 adds 'io_target' to the reported information, that is,
FILE, PROGRAM, STDIO or CALLBACK. Although this can relatively easily
be determined based on the commands in pg_stat_activity, it is
reasonably something that a user would want to query on, as the
origin/target of COPY has security and performance implications,
whereas other options (e.g. format) are less interesting for clients
that are not executing that specific COPY command.

I took a little deeper look and I'm not sure if I understand FILE and
STDIO. I have finally tried to finalize some initial regress testing
of COPY command progress using triggers. I have attached the initial
patch applicable to your changes. As you can see COPY FROM STDIN is
reported as FILE. That's probably expected, but it is a little
confusing for me since STDIN and STDIO sound similar. What is the
purpose of STDIO? When is the COPY command reported with io_target of
STDIO?

Show quoted text

Of special interest in 0005 is that it reports the io_target for the
logical replications' initial tablesyncs' internal COPY. This would
otherwise be measured, but no knowledge about the type of copy (or its
origin) would be available on the worker's side. I'm not married to
this patch 0005, but I believe it could be useful, and therefore
included it in the patchset.

With regards,

Matthias van de Meent.

[0] /messages/by-id/CAFp7Qwr6_FmRM6pCO0x_a0mymOfX_Gg+FEKet4XaTGSW=LitKQ@mail.gmail.com

Attachments:

v2-0006-initial-regress-testing-of-COPY-progress.patchtext/x-patch; charset=US-ASCII; name=v2-0006-initial-regress-testing-of-COPY-progress.patchDownload
From d91fbe3a80c1858e9cbc126e4ade44945cea9ae9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 09:27:21 +0100
Subject: [PATCH] Add initial regress test for COPY progress reporting.

---
 src/test/regress/expected/progress.out | 43 ++++++++++++++++++++++++++
 src/test/regress/parallel_schedule     |  2 +-
 src/test/regress/sql/progress.sql      | 39 +++++++++++++++++++++++
 3 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100644 src/test/regress/expected/progress.out
 create mode 100644 src/test/regress/sql/progress.sql

diff --git a/src/test/regress/expected/progress.out b/src/test/regress/expected/progress.out
new file mode 100644
index 000000000000..2c745b946366
--- /dev/null
+++ b/src/test/regress/expected/progress.out
@@ -0,0 +1,43 @@
+-- setup for COPY progress testing
+CREATE TEMP TABLE test_progress_with_trigger (
+  a int,
+  b text
+) ;
+CREATE OR REPLACE function notice_after_progress_reporting() RETURNS trigger AS
+$$
+DECLARE report record;
+BEGIN
+  SELECT INTO report * FROM pg_stat_progress_copy report WHERE pid = pg_backend_pid();
+  raise info 'progress datname: %', report.datname;
+  raise info 'progress command: %', report.command;
+  raise info 'progress io_target: %', report.io_target;
+  raise info 'progress bytes_processed: %', report.bytes_processed;
+  raise info 'progress bytes_total: %', report.bytes_total;
+  raise info 'progress tuples_processed: %', report.tuples_processed;
+  raise info 'progress tuples_excluded: %', report.tuples_excluded;
+
+  RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+CREATE OR REPLACE TRIGGER check_after_progress_reporting
+    AFTER INSERT ON test_progress_with_trigger
+    FOR EACH ROW
+    EXECUTE FUNCTION notice_after_progress_reporting();
+-- simple COPY from STDIN
+COPY test_progress_with_trigger (a, b) FROM STDIN;
+INFO:  progress datname: regression
+INFO:  progress command: COPY FROM
+INFO:  progress io_target: FILE
+INFO:  progress bytes_processed: 9
+INFO:  progress bytes_total: 0
+INFO:  progress tuples_processed: 1
+INFO:  progress tuples_excluded: 0
+-- COPY from STDIN with WHERE skipping lines
+COPY test_progress_with_trigger (a, b) FROM STDIN WHERE a > 1;
+INFO:  progress datname: regression
+INFO:  progress command: COPY FROM
+INFO:  progress io_target: FILE
+INFO:  progress bytes_processed: 18
+INFO:  progress bytes_total: 0
+INFO:  progress tuples_processed: 1
+INFO:  progress tuples_excluded: 1
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 12bb67e49118..e445ad24f9ed 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -97,7 +97,7 @@ test: publication subscription
 # ----------
 # Another group of parallel tests
 # ----------
-test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock indirect_toast equivclass
+test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock indirect_toast equivclass progress
 
 # ----------
 # Another group of parallel tests (JSON related)
diff --git a/src/test/regress/sql/progress.sql b/src/test/regress/sql/progress.sql
new file mode 100644
index 000000000000..ef5d95056197
--- /dev/null
+++ b/src/test/regress/sql/progress.sql
@@ -0,0 +1,39 @@
+-- setup for COPY progress testing
+CREATE TEMP TABLE test_progress_with_trigger (
+  a int,
+  b text
+) ;
+
+CREATE OR REPLACE function notice_after_progress_reporting() RETURNS trigger AS
+$$
+DECLARE report record;
+BEGIN
+  SELECT INTO report * FROM pg_stat_progress_copy report WHERE pid = pg_backend_pid();
+  raise info 'progress datname: %', report.datname;
+  raise info 'progress command: %', report.command;
+  raise info 'progress io_target: %', report.io_target;
+  raise info 'progress bytes_processed: %', report.bytes_processed;
+  raise info 'progress bytes_total: %', report.bytes_total;
+  raise info 'progress tuples_processed: %', report.tuples_processed;
+  raise info 'progress tuples_excluded: %', report.tuples_excluded;
+
+  RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+CREATE OR REPLACE TRIGGER check_after_progress_reporting
+    AFTER INSERT ON test_progress_with_trigger
+    FOR EACH ROW
+    EXECUTE FUNCTION notice_after_progress_reporting();
+
+-- simple COPY from STDIN
+COPY test_progress_with_trigger (a, b) FROM STDIN;
+1	test_1
+\.
+
+-- COPY from STDIN with WHERE skipping lines
+COPY test_progress_with_trigger (a, b) FROM STDIN WHERE a > 1;
+1	test_1
+2	test_2
+\.
+
#50010203112132233
boekewurm@gmail.com
In reply to: Josef Šimánek (#4)
Re: Improvements and additions to COPY progress reporting

On Tue, 9 Feb 2021 at 09:32, Josef Šimánek <josef.simanek@gmail.com> wrote:

po 8. 2. 2021 v 19:35 odesílatel Matthias van de Meent
<boekewurm+postgres@gmail.com> napsal:

Lastly, 0005 adds 'io_target' to the reported information, that is,
FILE, PROGRAM, STDIO or CALLBACK. Although this can relatively easily
be determined based on the commands in pg_stat_activity, it is
reasonably something that a user would want to query on, as the
origin/target of COPY has security and performance implications,
whereas other options (e.g. format) are less interesting for clients
that are not executing that specific COPY command.

I took a little deeper look and I'm not sure if I understand FILE and
STDIO. I have finally tried to finalize some initial regress testing
of COPY command progress using triggers. I have attached the initial
patch applicable to your changes. As you can see COPY FROM STDIN is
reported as FILE. That's probably expected, but it is a little
confusing for me since STDIN and STDIO sound similar. What is the
purpose of STDIO? When is the COPY command reported with io_target of
STDIO?

I checked for the type of the copy_src before it was correctly set,
therefore only reporting FILE type, but this will be fixed shortly in
v3.

Matthias

#6Josef Šimánek
josef.simanek@gmail.com
In reply to: 0010203112132233 (#5)
Re: Improvements and additions to COPY progress reporting

út 9. 2. 2021 v 12:51 odesílatel 0010203112132233 <boekewurm@gmail.com> napsal:

On Tue, 9 Feb 2021 at 09:32, Josef Šimánek <josef.simanek@gmail.com> wrote:

po 8. 2. 2021 v 19:35 odesílatel Matthias van de Meent
<boekewurm+postgres@gmail.com> napsal:

Lastly, 0005 adds 'io_target' to the reported information, that is,
FILE, PROGRAM, STDIO or CALLBACK. Although this can relatively easily
be determined based on the commands in pg_stat_activity, it is
reasonably something that a user would want to query on, as the
origin/target of COPY has security and performance implications,
whereas other options (e.g. format) are less interesting for clients
that are not executing that specific COPY command.

I took a little deeper look and I'm not sure if I understand FILE and
STDIO. I have finally tried to finalize some initial regress testing
of COPY command progress using triggers. I have attached the initial
patch applicable to your changes. As you can see COPY FROM STDIN is
reported as FILE. That's probably expected, but it is a little
confusing for me since STDIN and STDIO sound similar. What is the
purpose of STDIO? When is the COPY command reported with io_target of
STDIO?

I checked for the type of the copy_src before it was correctly set,
therefore only reporting FILE type, but this will be fixed shortly in
v3.

OK, would you mind to integrate my regression test initial patch as
well in v3 or should I submit it later in a separate way?

Show quoted text

Matthias

#7Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#3)
6 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Tue, 9 Feb 2021 at 08:12, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Feb 9, 2021 at 12:06 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

With [0] we got COPY progress reporting. Before the column names of
this newly added view are effectively set in stone with the release of
pg14, I propose the following set of relatively small patches. These
are v2, because it is a patchset that is based on a set of patches
that I previously posted in [0].

Thanks for working on the patches. Here are some comments:

0001 - +1 to add tuples_excluded and the patch LGTM.

0002 - Yes, the tuples_processed or tuples_excluded makes more sense
to me than lines_processed and lines_excluded. The patch LGTM.

0003 - Instead of just adding the progress reporting to "See also"
sections in the footer of the respective pages analyze, cluster and
others, it would be nice if we have a mention of it in the description
as pg_basebackup has something like below:
<para>
Whenever <application>pg_basebackup</application> is taking a base
backup, the server's <structname>pg_stat_progress_basebackup</structname>
view will report the progress of the backup.
See <xref linkend="basebackup-progress-reporting"/> for details.

Added

0004 -
1) How about PROGRESS_COPY_COMMAND_TYPE instead of
PROGRESS_COPY_COMMAND? The names looks bit confusing with the existing
PROGRESS_COMMAND_COPY.

The current name is consistent with the naming of the other
command-reporting progress views; CREATEIDX and CLUSTER both use the
*_COMMAND as this column indexes' internal name.

0005 -
1) How about
+       or <literal>CALLBACK</literal> (used in the table
synchronization background
+       worker).
instead of
+       or <literal>CALLBACK</literal> (used in the tablesync background
+       worker).
Because "table synchronization" is being used in logical-replication.sgml.

Fixed

2) I think cstate->copy_src = COPY_CALLBACK is assigned after the
switch case added in copyfrom.c
if (data_source_cb)
{
cstate->copy_src = COPY_CALLBACK;
cstate->data_source_cb = data_source_cb;
}

Yes, I noticed this too while working on the patchset, but apparently
didn't act on this... Fixed in attachted version.

Also, you can add this to the current commitfest.

See https://commitfest.postgresql.org/32/2977/

On Tue, 9 Feb 2021 at 12:53, Josef Šimánek <josef.simanek@gmail.com> wrote:

OK, would you mind to integrate my regression test initial patch as
well in v3 or should I submit it later in a separate way?

Attached, with minor fixes

With regards,

Matthias van de Meent

Attachments:

v3-0006-Add-progress-reporting-regression-tests.patchtext/x-patch; charset=US-ASCII; name=v3-0006-Add-progress-reporting-regression-tests.patchDownload
From 720f58890c2910bba7829c380ccb5a022596a6c8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v3 6/6] Add progress reporting regression tests

This tests some basic features of copy progress reporting, as this is
possible to implement in one session using triggers. Other views may follow.
---
 src/test/regress/expected/progress.out | 47 ++++++++++++++++++++++++++
 src/test/regress/parallel_schedule     |  2 +-
 src/test/regress/sql/progress.sql      | 43 +++++++++++++++++++++++
 3 files changed, 91 insertions(+), 1 deletion(-)
 create mode 100644 src/test/regress/expected/progress.out
 create mode 100644 src/test/regress/sql/progress.sql

diff --git a/src/test/regress/expected/progress.out b/src/test/regress/expected/progress.out
new file mode 100644
index 0000000000..8feed9df42
--- /dev/null
+++ b/src/test/regress/expected/progress.out
@@ -0,0 +1,47 @@
+-- setup for COPY progress testing
+CREATE TEMP TABLE test_progress_with_trigger (
+  a int,
+  b text
+) ;
+CREATE function notice_after_progress_reporting() RETURNS trigger AS
+$$
+DECLARE report record;
+BEGIN
+  SELECT INTO report * FROM pg_stat_progress_copy report WHERE pid = pg_backend_pid();
+  raise info 'progress datname: %', report.datname;
+  raise info 'progress command: %', report.command;
+  raise info 'progress io_target: %', report.io_target;
+  raise info 'progress bytes_processed: %', report.bytes_processed;
+  raise info 'progress bytes_total: %', report.bytes_total;
+  raise info 'progress tuples_processed: %', report.tuples_processed;
+  raise info 'progress tuples_excluded: %', report.tuples_excluded;
+
+  RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+CREATE TRIGGER check_after_progress_reporting
+    AFTER INSERT ON test_progress_with_trigger
+    FOR EACH ROW
+    EXECUTE FUNCTION notice_after_progress_reporting();
+-- simple COPY from STDIN
+COPY test_progress_with_trigger (a, b) FROM STDIN;
+INFO:  progress datname: regression
+INFO:  progress command: COPY FROM
+INFO:  progress io_target: STDIO
+INFO:  progress bytes_processed: 9
+INFO:  progress bytes_total: 0
+INFO:  progress tuples_processed: 1
+INFO:  progress tuples_excluded: 0
+-- COPY from STDIN with WHERE skipping lines
+COPY test_progress_with_trigger (a, b) FROM STDIN WHERE a > 1;
+INFO:  progress datname: regression
+INFO:  progress command: COPY FROM
+INFO:  progress io_target: STDIO
+INFO:  progress bytes_processed: 18
+INFO:  progress bytes_total: 0
+INFO:  progress tuples_processed: 1
+INFO:  progress tuples_excluded: 1
+-- cleanup
+DROP TRIGGER check_after_progress_reporting ON test_progress_with_trigger;
+DROP FUNCTION notice_after_progress_reporting;
+DROP TABLE test_progress_with_trigger;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 12bb67e491..e445ad24f9 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -97,7 +97,7 @@ test: publication subscription
 # ----------
 # Another group of parallel tests
 # ----------
-test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock indirect_toast equivclass
+test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock indirect_toast equivclass progress
 
 # ----------
 # Another group of parallel tests (JSON related)
diff --git a/src/test/regress/sql/progress.sql b/src/test/regress/sql/progress.sql
new file mode 100644
index 0000000000..9b25bc027d
--- /dev/null
+++ b/src/test/regress/sql/progress.sql
@@ -0,0 +1,43 @@
+-- setup for COPY progress testing
+CREATE TEMP TABLE test_progress_with_trigger (
+  a int,
+  b text
+) ;
+
+CREATE function notice_after_progress_reporting() RETURNS trigger AS
+$$
+DECLARE report record;
+BEGIN
+  SELECT INTO report * FROM pg_stat_progress_copy report WHERE pid = pg_backend_pid();
+  raise info 'progress datname: %', report.datname;
+  raise info 'progress command: %', report.command;
+  raise info 'progress io_target: %', report.io_target;
+  raise info 'progress bytes_processed: %', report.bytes_processed;
+  raise info 'progress bytes_total: %', report.bytes_total;
+  raise info 'progress tuples_processed: %', report.tuples_processed;
+  raise info 'progress tuples_excluded: %', report.tuples_excluded;
+
+  RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+CREATE TRIGGER check_after_progress_reporting
+    AFTER INSERT ON test_progress_with_trigger
+    FOR EACH ROW
+    EXECUTE FUNCTION notice_after_progress_reporting();
+
+-- simple COPY from STDIN
+COPY test_progress_with_trigger (a, b) FROM STDIN;
+1	test_1
+\.
+
+-- COPY from STDIN with WHERE skipping lines
+COPY test_progress_with_trigger (a, b) FROM STDIN WHERE a > 1;
+1	test_1
+2	test_2
+\.
+
+-- cleanup
+DROP TRIGGER check_after_progress_reporting ON test_progress_with_trigger;
+DROP FUNCTION notice_after_progress_reporting;
+DROP TABLE test_progress_with_trigger;
-- 
2.20.1

v3-0004-Add-a-command-column-to-the-copy-progress-view.patchtext/x-patch; charset=US-ASCII; name=v3-0004-Add-a-command-column-to-the-copy-progress-view.patchDownload
From f7d761f6774753d4914d0dbc80effbb1ab09b58e Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Feb 2021 17:36:00 +0100
Subject: [PATCH v3 4/6] Add a command column to the copy progress view

This allows filtering on COPY FROM / COPY TO for progress reporting, and makes
it possible to determine the further meaning of the columns involved.
---
 doc/src/sgml/monitoring.sgml         | 10 ++++++++++
 src/backend/catalog/system_views.sql |  3 +++
 src/backend/commands/copyfrom.c      |  1 +
 src/backend/commands/copyto.c        |  1 +
 src/include/commands/progress.h      |  5 +++++
 src/test/regress/expected/rules.out  |  5 +++++
 6 files changed, 25 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 940e9dcee4..ca84b53896 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6544,6 +6544,16 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e7e227792c..1082b7d253 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,6 +1129,9 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
         S.param3 AS tuples_processed,
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index fb3c7e2c0c..ce343dbf80 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1428,6 +1428,7 @@ BeginCopyFrom(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 9ffe7a6ee3..534c091c75 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,6 +772,7 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_TO);
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 8b2b188bd5..1c30d09abb 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -138,5 +138,10 @@
 #define PROGRESS_COPY_BYTES_TOTAL 1
 #define PROGRESS_COPY_TUPLES_PROCESSED 2
 #define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 970f6909c2..63b5e33083 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1948,6 +1948,11 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
     s.param3 AS tuples_processed,
-- 
2.20.1

v3-0003-Add-backlinks-to-progress-reporting-documentation.patchtext/x-patch; charset=US-ASCII; name=v3-0003-Add-backlinks-to-progress-reporting-documentation.patchDownload
From 1aaf82200c79aaec10a81f669fa49b801ff8b4ed Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v3 3/6] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  8 ++++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 54 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..9db9070b62 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running the <command>ANALYZE</command> command will report their
+    progress to the <structname>pg_stat_progress_analyze</structname> view.
+    See <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..5c2270f71b 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running the <command>CLUSTER</command> command will report their
+    progress to the <structname>pg_stat_progress_cluster</structname> view.
+    See <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..af3ce72561 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running the <command>COPY</command> command will report their
+    progress to the <structname>pg_stat_progress_copy</structname> view.
+    See <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+ 
+ <refsect1>
+  <title>See Also</title>
+ 
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..278058f500 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -865,6 +865,13 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running the <command>CREATE INDEX</command> command will 
+    report their progress to the 
+    <structname>pg_stat_progress_create_index</structname> view.
+    See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -978,6 +985,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..f69f5db403 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -493,6 +493,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running the <command>REINDEX</command> command will report
+    their progress to the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -551,6 +557,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 21ab57d880..f087087816 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -378,6 +378,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running the <command>VACUUM</command> command without the
+    <literal>FULL</literal> option will report their progress in the 
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report 
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
+    See <xref linkend="vacuum-progress-reporting"/> and 
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -407,6 +416,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v3-0002-Rename-lines-to-tuples-in-COPY-progress-reporting.patchtext/x-patch; charset=US-ASCII; name=v3-0002-Rename-lines-to-tuples-in-COPY-progress-reporting.patchDownload
From d6723f2cfbe98e9837948336420f283312c91b51 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Thu, 7 Jan 2021 16:51:30 +0100
Subject: [PATCH v3 2/6] Rename lines* to tuples* in COPY progress reporting
 view

This pulls the names of the columns in line with the other progress
reporting views and the comments surrounding the code that updates the columns.

It also makes it clear to the user that for the foreign data consumed or
produced in the COPY command, the line count need not be related to these
columns.
---
 doc/src/sgml/monitoring.sgml         | 8 ++++----
 src/backend/catalog/system_views.sql | 4 ++--
 src/backend/commands/copyfrom.c      | 4 ++--
 src/backend/commands/copyto.c        | 4 ++--
 src/include/commands/progress.h      | 4 ++--
 src/test/regress/expected/rules.out  | 4 ++--
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dc998bc5f7..940e9dcee4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6565,19 +6565,19 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples already processed by <command>COPY</command> command.
       </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_excluded</structfield> <type>bigint</type>
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines not processed because they were excluded by the
+       Number of tuples not processed because they were excluded by the
        <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a9bd983419..e7e227792c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1131,8 +1131,8 @@ CREATE VIEW pg_stat_progress_copy AS
         S.relid AS relid,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed,
-        S.param4 AS lines_excluded
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 86e6a422e2..fb3c7e2c0c 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -871,7 +871,7 @@ CopyFrom(CopyFromState cstate)
 			if (!ExecQual(cstate->qualexpr, econtext))
 			{
 				/* Report that this tuple was filtered out by the WHERE clause */
-				pgstat_progress_update_param(PROGRESS_COPY_LINES_EXCLUDED, ++excluded);
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
 			}
 		}
@@ -1111,7 +1111,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..9ffe7a6ee3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -954,7 +954,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1321,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index e6f6545033..8b2b188bd5 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,7 +136,7 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
-#define PROGRESS_COPY_LINES_EXCLUDED 3
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 25a863627e..970f6909c2 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,8 +1950,8 @@ pg_stat_progress_copy| SELECT s.pid,
     s.relid,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed,
-    s.param4 AS lines_excluded
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v3-0005-Add-a-io_target-column-to-the-copy-progress-view.patchtext/x-patch; charset=US-ASCII; name=v3-0005-Add-a-io_target-column-to-the-copy-progress-view.patchDownload
From 35e9815dec8e1770d380d6b2446b41010b1007a0 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Feb 2021 18:13:46 +0100
Subject: [PATCH v3 5/6] Add a io_target column to the copy progress view

This allows filtering on IO target for progress reporting, and allows for
further filtering of COPY commands. Additionally, this allows for
identification of logical replication's initial table synchronization
background workers at the subscriber side through io_target = CALLBACK, as it
is the only current supplier of a callback datasource.
---
 doc/src/sgml/monitoring.sgml         | 13 +++++++++++++
 src/backend/catalog/system_views.sql |  5 +++++
 src/backend/commands/copyfrom.c      | 22 ++++++++++++++++++++++
 src/backend/commands/copyto.c        | 26 +++++++++++++++++++++++++-
 src/include/commands/progress.h      |  7 +++++++
 src/test/regress/expected/rules.out  |  7 +++++++
 6 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index ca84b53896..3c39c82f1a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6554,6 +6554,19 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_target</structfield> <type>text</type>
+      </para>
+      <para>
+       The io target that the data is read from or written to: 
+       <literal>FILE</literal>, <literal>PROGRAM</literal>, 
+       <literal>STDIO</literal> (for COPY FROM STDIN and COPY TO STDOUT),
+       or <literal>CALLBACK</literal> (used in the table synchronization
+       background worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1082b7d253..6a3ac47b85 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1132,6 +1132,11 @@ CREATE VIEW pg_stat_progress_copy AS
         CASE S.param5 WHEN 1 THEN 'COPY FROM'
                       WHEN 2 THEN 'COPY TO'
                       END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_target,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
         S.param3 AS tuples_processed,
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index ce343dbf80..bf952fa293 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1429,6 +1429,7 @@ BeginCopyFrom(ParseState *pstate,
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
 	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
+
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
@@ -1506,6 +1507,27 @@ BeginCopyFrom(ParseState *pstate,
 		ReceiveCopyBinaryHeader(cstate);
 	}
 
+	{
+		int64 io_target;
+		switch (cstate->copy_src)
+		{
+			case COPY_FILE:
+				if (is_program)
+					io_target = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					io_target = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				io_target = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+			case COPY_CALLBACK:
+				io_target = PROGRESS_COPY_IO_TARGET_CALLBACK;
+				break;
+		}
+		pgstat_progress_update_param(PROGRESS_COPY_IO_TARGET, io_target);
+	}
+
 	/* create workspace for CopyReadAttributes results */
 	if (!cstate->opts.binary)
 	{
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 534c091c75..42c4a828df 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,7 +772,31 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
-	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_TO);
+	{
+		const int progress_index[] = {
+			PROGRESS_COPY_COMMAND,
+			PROGRESS_COPY_IO_TARGET
+		};
+		int64 progress_vals[] = {
+			PROGRESS_COPY_COMMAND_TO,
+			0
+		};
+		switch (cstate->copy_dest)
+		{
+			case COPY_FILE:
+				if (is_program)
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+		}
+		pgstat_progress_update_multi_param(2, progress_index, progress_vals);
+	}
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1c30d09abb..e003217554 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -139,9 +139,16 @@
 #define PROGRESS_COPY_TUPLES_PROCESSED 2
 #define PROGRESS_COPY_TUPLES_EXCLUDED 3
 #define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TARGET 5
 
 /* Commands of PROGRESS_COPY_COMMAND */
 #define PROGRESS_COPY_COMMAND_FROM 1
 #define PROGRESS_COPY_COMMAND_TO 2
 
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TARGET_FILE 1
+#define PROGRESS_COPY_IO_TARGET_PROGRAM 2
+#define PROGRESS_COPY_IO_TARGET_STDIO 3
+#define PROGRESS_COPY_IO_TARGET_CALLBACK 4
+
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 63b5e33083..0698c71d23 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1953,6 +1953,13 @@ pg_stat_progress_copy| SELECT s.pid,
             WHEN 2 THEN 'COPY TO'::text
             ELSE NULL::text
         END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_target,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
     s.param3 AS tuples_processed,
-- 
2.20.1

v3-0001-Add-progress-reporting-for-excluded-rows.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Add-progress-reporting-for-excluded-rows.patchDownload
From 177fabe331cbf21b695c9d8f7bab513d14e8f727 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Thu, 7 Jan 2021 16:39:57 +0100
Subject: [PATCH v3 1/6] Add progress reporting for excluded rows.

COPY ... FROM ... WHERE (condition) will exclude rows that do not match the
condition, which we now register as explicitly being excluded from insertion.
---
 doc/src/sgml/monitoring.sgml         | 10 ++++++++++
 src/backend/catalog/system_views.sql |  3 ++-
 src/backend/commands/copyfrom.c      |  5 +++++
 src/include/commands/progress.h      |  1 +
 src/test/regress/expected/rules.out  |  3 ++-
 5 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index c602ee4427..dc998bc5f7 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6571,6 +6571,16 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
        Number of lines already processed by <command>COPY</command> command.
       </para></entry>
      </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>lines_excluded</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of lines not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
+      </para></entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..a9bd983419 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1131,7 +1131,8 @@ CREATE VIEW pg_stat_progress_copy AS
         S.relid AS relid,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS lines_processed,
+        S.param4 AS lines_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index c39cc736ed..86e6a422e2 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -540,6 +540,7 @@ CopyFrom(CopyFromState cstate)
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
 	uint64		processed = 0;
+	uint64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -868,7 +869,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_LINES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..e6f6545033 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -137,5 +137,6 @@
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
 #define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_LINES_EXCLUDED 3
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6173473de9..25a863627e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,7 +1950,8 @@ pg_stat_progress_copy| SELECT s.pid,
     s.relid,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS lines_processed,
+    s.param4 AS lines_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

#8Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Matthias van de Meent (#7)
Re: Improvements and additions to COPY progress reporting

On Tue, Feb 9, 2021 at 6:02 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Also, you can add this to the current commitfest.

See https://commitfest.postgresql.org/32/2977/

On Tue, 9 Feb 2021 at 12:53, Josef Šimánek <josef.simanek@gmail.com> wrote:

OK, would you mind to integrate my regression test initial patch as
well in v3 or should I submit it later in a separate way?

Attached, with minor fixes

Why do we need to have a new test file progress.sql for the test
cases? Can't we add them into existing copy.sql or copy2.sql? Or do
you have a plan to add test cases into progress.sql for other progress
reporting commands?

IMO, it's better not add any new test file but add the tests to existing files.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#9Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#8)
Re: Improvements and additions to COPY progress reporting

On Wed, 10 Feb 2021 at 07:43, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Feb 9, 2021 at 6:02 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Also, you can add this to the current commitfest.

See https://commitfest.postgresql.org/32/2977/

On Tue, 9 Feb 2021 at 12:53, Josef Šimánek <josef.simanek@gmail.com> wrote:

OK, would you mind to integrate my regression test initial patch as
well in v3 or should I submit it later in a separate way?

Attached, with minor fixes

Why do we need to have a new test file progress.sql for the test
cases? Can't we add them into existing copy.sql or copy2.sql? Or do
you have a plan to add test cases into progress.sql for other progress
reporting commands?

I don't mind moving the test into copy or copy2, but the main reason
to put it in a seperate file is to test the 'copy' component of the
feature called 'progress reporting'. If the feature instead is 'copy'
and 'progress reporting' is part of that feature, then I'd put it in
the copy-tests, but because the documentation of this has it's own
docs page 'progress reporting', and because 'copy' is a subsection of
that, I do think that this feature warrants its own regression test
file.

There are no other tests for the progress reporting feature yet,
because COPY ... FROM is the only command that is progress reported
_and_ that can fire triggers while running the command, so checking
the progress view during the progress reported command is only
feasable in COPY progress reporting. To test the other progress
reporting views, we would need multiple sessions, which I believe is
impossible in this test format. Please correct me if I'm wrong; I'd
love to add tests for the other components. That will not be in this
patchset, though.

IMO, it's better not add any new test file but add the tests to existing files.

In general I agree, but in some cases (e.g. new system component, new
full-fledged feature), new test files are needed. I think that this
could be one of those cases.

With regards,

Matthias van de Meent

#10Josef Šimánek
josef.simanek@gmail.com
In reply to: Matthias van de Meent (#9)
Re: Improvements and additions to COPY progress reporting

čt 11. 2. 2021 v 15:27 odesílatel Matthias van de Meent
<boekewurm+postgres@gmail.com> napsal:

On Wed, 10 Feb 2021 at 07:43, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Feb 9, 2021 at 6:02 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Also, you can add this to the current commitfest.

See https://commitfest.postgresql.org/32/2977/

On Tue, 9 Feb 2021 at 12:53, Josef Šimánek <josef.simanek@gmail.com> wrote:

OK, would you mind to integrate my regression test initial patch as
well in v3 or should I submit it later in a separate way?

Attached, with minor fixes

Why do we need to have a new test file progress.sql for the test
cases? Can't we add them into existing copy.sql or copy2.sql? Or do
you have a plan to add test cases into progress.sql for other progress
reporting commands?

I don't mind moving the test into copy or copy2, but the main reason
to put it in a seperate file is to test the 'copy' component of the
feature called 'progress reporting'. If the feature instead is 'copy'
and 'progress reporting' is part of that feature, then I'd put it in
the copy-tests, but because the documentation of this has it's own
docs page 'progress reporting', and because 'copy' is a subsection of
that, I do think that this feature warrants its own regression test
file.

There are no other tests for the progress reporting feature yet,
because COPY ... FROM is the only command that is progress reported
_and_ that can fire triggers while running the command, so checking
the progress view during the progress reported command is only
feasable in COPY progress reporting. To test the other progress
reporting views, we would need multiple sessions, which I believe is
impossible in this test format. Please correct me if I'm wrong; I'd
love to add tests for the other components. That will not be in this
patchset, though.

IMO, it's better not add any new test file but add the tests to existing files.

In general I agree, but in some cases (e.g. new system component, new
full-fledged feature), new test files are needed. I think that this
could be one of those cases.

I have split it since it should be the start of progress reporting
testing at all. If you better consider this as part of COPY testing,
feel free to move it to already existing copy testing related files.
There's no real reason to keep it separated if not needed.

Show quoted text

With regards,

Matthias van de Meent

#11Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Josef Šimánek (#10)
Re: Improvements and additions to COPY progress reporting

On Thu, Feb 11, 2021, 8:08 PM Josef Šimánek <josef.simanek@gmail.com> wrote:

čt 11. 2. 2021 v 15:27 odesílatel Matthias van de Meent
<boekewurm+postgres@gmail.com> napsal:

On Wed, 10 Feb 2021 at 07:43, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Feb 9, 2021 at 6:02 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Also, you can add this to the current commitfest.

See https://commitfest.postgresql.org/32/2977/

On Tue, 9 Feb 2021 at 12:53, Josef Šimánek <josef.simanek@gmail.com>

wrote:

OK, would you mind to integrate my regression test initial patch as
well in v3 or should I submit it later in a separate way?

Attached, with minor fixes

Why do we need to have a new test file progress.sql for the test
cases? Can't we add them into existing copy.sql or copy2.sql? Or do
you have a plan to add test cases into progress.sql for other progress
reporting commands?

I don't mind moving the test into copy or copy2, but the main reason
to put it in a seperate file is to test the 'copy' component of the
feature called 'progress reporting'. If the feature instead is 'copy'
and 'progress reporting' is part of that feature, then I'd put it in
the copy-tests, but because the documentation of this has it's own
docs page 'progress reporting', and because 'copy' is a subsection of
that, I do think that this feature warrants its own regression test
file.

There are no other tests for the progress reporting feature yet,
because COPY ... FROM is the only command that is progress reported
_and_ that can fire triggers while running the command, so checking
the progress view during the progress reported command is only
feasable in COPY progress reporting. To test the other progress
reporting views, we would need multiple sessions, which I believe is
impossible in this test format. Please correct me if I'm wrong; I'd
love to add tests for the other components. That will not be in this
patchset, though.

IMO, it's better not add any new test file but add the tests to

existing files.

In general I agree, but in some cases (e.g. new system component, new
full-fledged feature), new test files are needed. I think that this
could be one of those cases.

I have split it since it should be the start of progress reporting
testing at all. If you better consider this as part of COPY testing,
feel free to move it to already existing copy testing related files.
There's no real reason to keep it separated if not needed.

+1 to move those test cases to existing copy test files.

#12Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#11)
6 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Thu, 11 Feb 2021 at 15:44, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Feb 11, 2021, 8:08 PM Josef Šimánek <josef.simanek@gmail.com> wrote:

I have split it since it should be the start of progress reporting
testing at all. If you better consider this as part of COPY testing,
feel free to move it to already existing copy testing related files.
There's no real reason to keep it separated if not needed.

+1 to move those test cases to existing copy test files.

Thanks for your reviews. PFA v4 of the patchset, in which the tests
are put into copy.sql (well, input/copy.source). This also adds tests
for correctly reporting COPY ... FROM 'file'.

I've changed the notice-alerted format from manually naming each
column to calling to_jsonb and removing the unstable columns from the
reported value; this should therefore be stable and give direct notice
to changes in the view.

With regards,

Matthias van de Meent.

Attachments:

v4-0004-Add-a-command-column-to-the-copy-progress-view.patchtext/x-patch; charset=US-ASCII; name=v4-0004-Add-a-command-column-to-the-copy-progress-view.patchDownload
From f7d761f6774753d4914d0dbc80effbb1ab09b58e Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Feb 2021 17:36:00 +0100
Subject: [PATCH v4 4/6] Add a command column to the copy progress view

This allows filtering on COPY FROM / COPY TO for progress reporting, and makes
it possible to determine the further meaning of the columns involved.
---
 doc/src/sgml/monitoring.sgml         | 10 ++++++++++
 src/backend/catalog/system_views.sql |  3 +++
 src/backend/commands/copyfrom.c      |  1 +
 src/backend/commands/copyto.c        |  1 +
 src/include/commands/progress.h      |  5 +++++
 src/test/regress/expected/rules.out  |  5 +++++
 6 files changed, 25 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 940e9dcee4..ca84b53896 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6544,6 +6544,16 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e7e227792c..1082b7d253 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,6 +1129,9 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
         S.param3 AS tuples_processed,
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index fb3c7e2c0c..ce343dbf80 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1428,6 +1428,7 @@ BeginCopyFrom(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 9ffe7a6ee3..534c091c75 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,6 +772,7 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_TO);
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 8b2b188bd5..1c30d09abb 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -138,5 +138,10 @@
 #define PROGRESS_COPY_BYTES_TOTAL 1
 #define PROGRESS_COPY_TUPLES_PROCESSED 2
 #define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 970f6909c2..63b5e33083 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1948,6 +1948,11 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
     s.param3 AS tuples_processed,
-- 
2.20.1

v4-0006-Add-copy-progress-reporting-regression-tests.patchtext/x-patch; charset=UTF-8; name=v4-0006-Add-copy-progress-reporting-regression-tests.patchDownload
From 181e95a07d7f1d3bac08f1596622e6da2519d069 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v4 6/6] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 57 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 44 ++++++++++++++++++++++
 2 files changed, 101 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..0ce267e1cc 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,60 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+-- 
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_builddir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..32edc60319 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,47 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+-- 
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_builddir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

v4-0002-Rename-lines-to-tuples-in-COPY-progress-reporting.patchtext/x-patch; charset=US-ASCII; name=v4-0002-Rename-lines-to-tuples-in-COPY-progress-reporting.patchDownload
From d6723f2cfbe98e9837948336420f283312c91b51 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Thu, 7 Jan 2021 16:51:30 +0100
Subject: [PATCH v4 2/6] Rename lines* to tuples* in COPY progress reporting
 view

This pulls the names of the columns in line with the other progress
reporting views and the comments surrounding the code that updates the columns.

It also makes it clear to the user that for the foreign data consumed or
produced in the COPY command, the line count need not be related to these
columns.
---
 doc/src/sgml/monitoring.sgml         | 8 ++++----
 src/backend/catalog/system_views.sql | 4 ++--
 src/backend/commands/copyfrom.c      | 4 ++--
 src/backend/commands/copyto.c        | 4 ++--
 src/include/commands/progress.h      | 4 ++--
 src/test/regress/expected/rules.out  | 4 ++--
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dc998bc5f7..940e9dcee4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6565,19 +6565,19 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples already processed by <command>COPY</command> command.
       </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_excluded</structfield> <type>bigint</type>
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines not processed because they were excluded by the
+       Number of tuples not processed because they were excluded by the
        <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a9bd983419..e7e227792c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1131,8 +1131,8 @@ CREATE VIEW pg_stat_progress_copy AS
         S.relid AS relid,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed,
-        S.param4 AS lines_excluded
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 86e6a422e2..fb3c7e2c0c 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -871,7 +871,7 @@ CopyFrom(CopyFromState cstate)
 			if (!ExecQual(cstate->qualexpr, econtext))
 			{
 				/* Report that this tuple was filtered out by the WHERE clause */
-				pgstat_progress_update_param(PROGRESS_COPY_LINES_EXCLUDED, ++excluded);
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
 			}
 		}
@@ -1111,7 +1111,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..9ffe7a6ee3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -954,7 +954,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1321,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index e6f6545033..8b2b188bd5 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,7 +136,7 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
-#define PROGRESS_COPY_LINES_EXCLUDED 3
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 25a863627e..970f6909c2 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,8 +1950,8 @@ pg_stat_progress_copy| SELECT s.pid,
     s.relid,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed,
-    s.param4 AS lines_excluded
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v4-0003-Add-backlinks-to-progress-reporting-documentation.patchtext/x-patch; charset=US-ASCII; name=v4-0003-Add-backlinks-to-progress-reporting-documentation.patchDownload
From 1aaf82200c79aaec10a81f669fa49b801ff8b4ed Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v4 3/6] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  8 ++++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 54 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..9db9070b62 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running the <command>ANALYZE</command> command will report their
+    progress to the <structname>pg_stat_progress_analyze</structname> view.
+    See <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..5c2270f71b 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running the <command>CLUSTER</command> command will report their
+    progress to the <structname>pg_stat_progress_cluster</structname> view.
+    See <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..af3ce72561 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running the <command>COPY</command> command will report their
+    progress to the <structname>pg_stat_progress_copy</structname> view.
+    See <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+ 
+ <refsect1>
+  <title>See Also</title>
+ 
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..278058f500 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -865,6 +865,13 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running the <command>CREATE INDEX</command> command will 
+    report their progress to the 
+    <structname>pg_stat_progress_create_index</structname> view.
+    See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -978,6 +985,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..f69f5db403 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -493,6 +493,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running the <command>REINDEX</command> command will report
+    their progress to the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -551,6 +557,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 21ab57d880..f087087816 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -378,6 +378,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running the <command>VACUUM</command> command without the
+    <literal>FULL</literal> option will report their progress in the 
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report 
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
+    See <xref linkend="vacuum-progress-reporting"/> and 
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -407,6 +416,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v4-0005-Add-a-io_target-column-to-the-copy-progress-view.patchtext/x-patch; charset=US-ASCII; name=v4-0005-Add-a-io_target-column-to-the-copy-progress-view.patchDownload
From 35e9815dec8e1770d380d6b2446b41010b1007a0 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Feb 2021 18:13:46 +0100
Subject: [PATCH v4 5/6] Add a io_target column to the copy progress view

This allows filtering on IO target for progress reporting, and allows for
further filtering of COPY commands. Additionally, this allows for
identification of logical replication's initial table synchronization
background workers at the subscriber side through io_target = CALLBACK, as it
is the only current supplier of a callback datasource.
---
 doc/src/sgml/monitoring.sgml         | 13 +++++++++++++
 src/backend/catalog/system_views.sql |  5 +++++
 src/backend/commands/copyfrom.c      | 22 ++++++++++++++++++++++
 src/backend/commands/copyto.c        | 26 +++++++++++++++++++++++++-
 src/include/commands/progress.h      |  7 +++++++
 src/test/regress/expected/rules.out  |  7 +++++++
 6 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index ca84b53896..3c39c82f1a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6554,6 +6554,19 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_target</structfield> <type>text</type>
+      </para>
+      <para>
+       The io target that the data is read from or written to: 
+       <literal>FILE</literal>, <literal>PROGRAM</literal>, 
+       <literal>STDIO</literal> (for COPY FROM STDIN and COPY TO STDOUT),
+       or <literal>CALLBACK</literal> (used in the table synchronization
+       background worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1082b7d253..6a3ac47b85 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1132,6 +1132,11 @@ CREATE VIEW pg_stat_progress_copy AS
         CASE S.param5 WHEN 1 THEN 'COPY FROM'
                       WHEN 2 THEN 'COPY TO'
                       END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_target,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
         S.param3 AS tuples_processed,
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index ce343dbf80..bf952fa293 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1429,6 +1429,7 @@ BeginCopyFrom(ParseState *pstate,
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
 	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
+
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
@@ -1506,6 +1507,27 @@ BeginCopyFrom(ParseState *pstate,
 		ReceiveCopyBinaryHeader(cstate);
 	}
 
+	{
+		int64 io_target;
+		switch (cstate->copy_src)
+		{
+			case COPY_FILE:
+				if (is_program)
+					io_target = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					io_target = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				io_target = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+			case COPY_CALLBACK:
+				io_target = PROGRESS_COPY_IO_TARGET_CALLBACK;
+				break;
+		}
+		pgstat_progress_update_param(PROGRESS_COPY_IO_TARGET, io_target);
+	}
+
 	/* create workspace for CopyReadAttributes results */
 	if (!cstate->opts.binary)
 	{
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 534c091c75..42c4a828df 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,7 +772,31 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
-	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_TO);
+	{
+		const int progress_index[] = {
+			PROGRESS_COPY_COMMAND,
+			PROGRESS_COPY_IO_TARGET
+		};
+		int64 progress_vals[] = {
+			PROGRESS_COPY_COMMAND_TO,
+			0
+		};
+		switch (cstate->copy_dest)
+		{
+			case COPY_FILE:
+				if (is_program)
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+		}
+		pgstat_progress_update_multi_param(2, progress_index, progress_vals);
+	}
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1c30d09abb..e003217554 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -139,9 +139,16 @@
 #define PROGRESS_COPY_TUPLES_PROCESSED 2
 #define PROGRESS_COPY_TUPLES_EXCLUDED 3
 #define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TARGET 5
 
 /* Commands of PROGRESS_COPY_COMMAND */
 #define PROGRESS_COPY_COMMAND_FROM 1
 #define PROGRESS_COPY_COMMAND_TO 2
 
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TARGET_FILE 1
+#define PROGRESS_COPY_IO_TARGET_PROGRAM 2
+#define PROGRESS_COPY_IO_TARGET_STDIO 3
+#define PROGRESS_COPY_IO_TARGET_CALLBACK 4
+
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 63b5e33083..0698c71d23 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1953,6 +1953,13 @@ pg_stat_progress_copy| SELECT s.pid,
             WHEN 2 THEN 'COPY TO'::text
             ELSE NULL::text
         END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_target,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
     s.param3 AS tuples_processed,
-- 
2.20.1

v4-0001-Add-progress-reporting-for-excluded-rows.patchtext/x-patch; charset=US-ASCII; name=v4-0001-Add-progress-reporting-for-excluded-rows.patchDownload
From 177fabe331cbf21b695c9d8f7bab513d14e8f727 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Thu, 7 Jan 2021 16:39:57 +0100
Subject: [PATCH v4 1/6] Add progress reporting for excluded rows.

COPY ... FROM ... WHERE (condition) will exclude rows that do not match the
condition, which we now register as explicitly being excluded from insertion.
---
 doc/src/sgml/monitoring.sgml         | 10 ++++++++++
 src/backend/catalog/system_views.sql |  3 ++-
 src/backend/commands/copyfrom.c      |  5 +++++
 src/include/commands/progress.h      |  1 +
 src/test/regress/expected/rules.out  |  3 ++-
 5 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index c602ee4427..dc998bc5f7 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6571,6 +6571,16 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
        Number of lines already processed by <command>COPY</command> command.
       </para></entry>
      </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>lines_excluded</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of lines not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
+      </para></entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..a9bd983419 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1131,7 +1131,8 @@ CREATE VIEW pg_stat_progress_copy AS
         S.relid AS relid,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS lines_processed,
+        S.param4 AS lines_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index c39cc736ed..86e6a422e2 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -540,6 +540,7 @@ CopyFrom(CopyFromState cstate)
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
 	uint64		processed = 0;
+	uint64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -868,7 +869,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_LINES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..e6f6545033 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -137,5 +137,6 @@
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
 #define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_LINES_EXCLUDED 3
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6173473de9..25a863627e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,7 +1950,8 @@ pg_stat_progress_copy| SELECT s.pid,
     s.relid,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS lines_processed,
+    s.param4 AS lines_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

#13Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Matthias van de Meent (#12)
6 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Fri, 12 Feb 2021 at 12:23, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Thu, 11 Feb 2021 at 15:44, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Feb 11, 2021, 8:08 PM Josef Šimánek <josef.simanek@gmail.com> wrote:

I have split it since it should be the start of progress reporting
testing at all. If you better consider this as part of COPY testing,
feel free to move it to already existing copy testing related files.
There's no real reason to keep it separated if not needed.

+1 to move those test cases to existing copy test files.

Thanks for your reviews. PFA v4 of the patchset, in which the tests
are put into copy.sql (well, input/copy.source). This also adds tests
for correctly reporting COPY ... FROM 'file'.

PFA v5, which fixes a failure in the pg_upgrade regression tests due
to incorrect usage of @abs_builddir@. I had the changes staged, but
forgot to add them to the patches.

Sorry for the noise.

-Matthias

Attachments:

v5-0006-Add-copy-progress-reporting-regression-tests.patchtext/x-patch; charset=UTF-8; name=v5-0006-Add-copy-progress-reporting-regression-tests.patchDownload
From 5727e7d6de2e0ffa9c5f319fb92c8eb35421f50b Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 12:53:14 +0100
Subject: [PATCH v5 6/6] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 2 +-
 src/test/regress/output/copy.source | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index 0ce267e1cc..ddde33e7cc 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -250,7 +250,7 @@ bill	20	(11,10)	1000	sharon
 
 -- reporting of FILE imports, and correct reporting of tuples-excluded
 
-copy progress_reporting from '@abs_builddir@/data/emp.data'
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
 	where (salary < 2000);
 
 -- cleanup progress_reporting
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 32edc60319..60f4206aa1 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -202,7 +202,7 @@ create trigger check_after_progress_reporting
 copy progress_reporting from stdin;
 INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
 -- reporting of FILE imports, and correct reporting of tuples-excluded
-copy progress_reporting from '@abs_builddir@/data/emp.data'
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
 	where (salary < 2000);
 INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
 -- cleanup progress_reporting
-- 
2.20.1

v5-0005-Add-copy-progress-reporting-regression-tests.patchtext/x-patch; charset=UTF-8; name=v5-0005-Add-copy-progress-reporting-regression-tests.patchDownload
From 181e95a07d7f1d3bac08f1596622e6da2519d069 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v5 5/6] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 57 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 44 ++++++++++++++++++++++
 2 files changed, 101 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..0ce267e1cc 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,60 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+-- 
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_builddir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..32edc60319 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,47 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+-- 
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_builddir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

v5-0002-Add-backlinks-to-progress-reporting-documentation.patchtext/x-patch; charset=US-ASCII; name=v5-0002-Add-backlinks-to-progress-reporting-documentation.patchDownload
From 1aaf82200c79aaec10a81f669fa49b801ff8b4ed Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v5 2/6] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  8 ++++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 54 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..9db9070b62 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running the <command>ANALYZE</command> command will report their
+    progress to the <structname>pg_stat_progress_analyze</structname> view.
+    See <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..5c2270f71b 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running the <command>CLUSTER</command> command will report their
+    progress to the <structname>pg_stat_progress_cluster</structname> view.
+    See <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..af3ce72561 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running the <command>COPY</command> command will report their
+    progress to the <structname>pg_stat_progress_copy</structname> view.
+    See <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+ 
+ <refsect1>
+  <title>See Also</title>
+ 
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..278058f500 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -865,6 +865,13 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running the <command>CREATE INDEX</command> command will 
+    report their progress to the 
+    <structname>pg_stat_progress_create_index</structname> view.
+    See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -978,6 +985,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..f69f5db403 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -493,6 +493,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running the <command>REINDEX</command> command will report
+    their progress to the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -551,6 +557,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 21ab57d880..f087087816 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -378,6 +378,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running the <command>VACUUM</command> command without the
+    <literal>FULL</literal> option will report their progress in the 
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report 
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
+    See <xref linkend="vacuum-progress-reporting"/> and 
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -407,6 +416,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v5-0004-Add-a-io_target-column-to-the-copy-progress-view.patchtext/x-patch; charset=US-ASCII; name=v5-0004-Add-a-io_target-column-to-the-copy-progress-view.patchDownload
From 35e9815dec8e1770d380d6b2446b41010b1007a0 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Feb 2021 18:13:46 +0100
Subject: [PATCH v5 4/6] Add a io_target column to the copy progress view

This allows filtering on IO target for progress reporting, and allows for
further filtering of COPY commands. Additionally, this allows for
identification of logical replication's initial table synchronization
background workers at the subscriber side through io_target = CALLBACK, as it
is the only current supplier of a callback datasource.
---
 doc/src/sgml/monitoring.sgml         | 13 +++++++++++++
 src/backend/catalog/system_views.sql |  5 +++++
 src/backend/commands/copyfrom.c      | 22 ++++++++++++++++++++++
 src/backend/commands/copyto.c        | 26 +++++++++++++++++++++++++-
 src/include/commands/progress.h      |  7 +++++++
 src/test/regress/expected/rules.out  |  7 +++++++
 6 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index ca84b53896..3c39c82f1a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6554,6 +6554,19 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_target</structfield> <type>text</type>
+      </para>
+      <para>
+       The io target that the data is read from or written to: 
+       <literal>FILE</literal>, <literal>PROGRAM</literal>, 
+       <literal>STDIO</literal> (for COPY FROM STDIN and COPY TO STDOUT),
+       or <literal>CALLBACK</literal> (used in the table synchronization
+       background worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1082b7d253..6a3ac47b85 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1132,6 +1132,11 @@ CREATE VIEW pg_stat_progress_copy AS
         CASE S.param5 WHEN 1 THEN 'COPY FROM'
                       WHEN 2 THEN 'COPY TO'
                       END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_target,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
         S.param3 AS tuples_processed,
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index ce343dbf80..bf952fa293 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1429,6 +1429,7 @@ BeginCopyFrom(ParseState *pstate,
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
 	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
+
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
@@ -1506,6 +1507,27 @@ BeginCopyFrom(ParseState *pstate,
 		ReceiveCopyBinaryHeader(cstate);
 	}
 
+	{
+		int64 io_target;
+		switch (cstate->copy_src)
+		{
+			case COPY_FILE:
+				if (is_program)
+					io_target = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					io_target = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				io_target = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+			case COPY_CALLBACK:
+				io_target = PROGRESS_COPY_IO_TARGET_CALLBACK;
+				break;
+		}
+		pgstat_progress_update_param(PROGRESS_COPY_IO_TARGET, io_target);
+	}
+
 	/* create workspace for CopyReadAttributes results */
 	if (!cstate->opts.binary)
 	{
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 534c091c75..42c4a828df 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,7 +772,31 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
-	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_TO);
+	{
+		const int progress_index[] = {
+			PROGRESS_COPY_COMMAND,
+			PROGRESS_COPY_IO_TARGET
+		};
+		int64 progress_vals[] = {
+			PROGRESS_COPY_COMMAND_TO,
+			0
+		};
+		switch (cstate->copy_dest)
+		{
+			case COPY_FILE:
+				if (is_program)
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+		}
+		pgstat_progress_update_multi_param(2, progress_index, progress_vals);
+	}
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1c30d09abb..e003217554 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -139,9 +139,16 @@
 #define PROGRESS_COPY_TUPLES_PROCESSED 2
 #define PROGRESS_COPY_TUPLES_EXCLUDED 3
 #define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TARGET 5
 
 /* Commands of PROGRESS_COPY_COMMAND */
 #define PROGRESS_COPY_COMMAND_FROM 1
 #define PROGRESS_COPY_COMMAND_TO 2
 
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TARGET_FILE 1
+#define PROGRESS_COPY_IO_TARGET_PROGRAM 2
+#define PROGRESS_COPY_IO_TARGET_STDIO 3
+#define PROGRESS_COPY_IO_TARGET_CALLBACK 4
+
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 63b5e33083..0698c71d23 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1953,6 +1953,13 @@ pg_stat_progress_copy| SELECT s.pid,
             WHEN 2 THEN 'COPY TO'::text
             ELSE NULL::text
         END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_target,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
     s.param3 AS tuples_processed,
-- 
2.20.1

v5-0003-Add-a-command-column-to-the-copy-progress-view.patchtext/x-patch; charset=US-ASCII; name=v5-0003-Add-a-command-column-to-the-copy-progress-view.patchDownload
From f7d761f6774753d4914d0dbc80effbb1ab09b58e Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Feb 2021 17:36:00 +0100
Subject: [PATCH v5 3/6] Add a command column to the copy progress view

This allows filtering on COPY FROM / COPY TO for progress reporting, and makes
it possible to determine the further meaning of the columns involved.
---
 doc/src/sgml/monitoring.sgml         | 10 ++++++++++
 src/backend/catalog/system_views.sql |  3 +++
 src/backend/commands/copyfrom.c      |  1 +
 src/backend/commands/copyto.c        |  1 +
 src/include/commands/progress.h      |  5 +++++
 src/test/regress/expected/rules.out  |  5 +++++
 6 files changed, 25 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 940e9dcee4..ca84b53896 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6544,6 +6544,16 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e7e227792c..1082b7d253 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,6 +1129,9 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
         S.param3 AS tuples_processed,
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index fb3c7e2c0c..ce343dbf80 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1428,6 +1428,7 @@ BeginCopyFrom(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 9ffe7a6ee3..534c091c75 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,6 +772,7 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_TO);
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 8b2b188bd5..1c30d09abb 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -138,5 +138,10 @@
 #define PROGRESS_COPY_BYTES_TOTAL 1
 #define PROGRESS_COPY_TUPLES_PROCESSED 2
 #define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 970f6909c2..63b5e33083 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1948,6 +1948,11 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
     s.param3 AS tuples_processed,
-- 
2.20.1

v5-0001-Rename-lines-to-tuples-in-COPY-progress-reporting.patchtext/x-patch; charset=US-ASCII; name=v5-0001-Rename-lines-to-tuples-in-COPY-progress-reporting.patchDownload
From d6723f2cfbe98e9837948336420f283312c91b51 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Thu, 7 Jan 2021 16:51:30 +0100
Subject: [PATCH v5 1/6] Rename lines* to tuples* in COPY progress reporting
 view

This pulls the names of the columns in line with the other progress
reporting views and the comments surrounding the code that updates the columns.

It also makes it clear to the user that for the foreign data consumed or
produced in the COPY command, the line count need not be related to these
columns.
---
 doc/src/sgml/monitoring.sgml         | 8 ++++----
 src/backend/catalog/system_views.sql | 4 ++--
 src/backend/commands/copyfrom.c      | 4 ++--
 src/backend/commands/copyto.c        | 4 ++--
 src/include/commands/progress.h      | 4 ++--
 src/test/regress/expected/rules.out  | 4 ++--
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dc998bc5f7..940e9dcee4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6565,19 +6565,19 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples already processed by <command>COPY</command> command.
       </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_excluded</structfield> <type>bigint</type>
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines not processed because they were excluded by the
+       Number of tuples not processed because they were excluded by the
        <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a9bd983419..e7e227792c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1131,8 +1131,8 @@ CREATE VIEW pg_stat_progress_copy AS
         S.relid AS relid,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed,
-        S.param4 AS lines_excluded
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 86e6a422e2..fb3c7e2c0c 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -871,7 +871,7 @@ CopyFrom(CopyFromState cstate)
 			if (!ExecQual(cstate->qualexpr, econtext))
 			{
 				/* Report that this tuple was filtered out by the WHERE clause */
-				pgstat_progress_update_param(PROGRESS_COPY_LINES_EXCLUDED, ++excluded);
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
 			}
 		}
@@ -1111,7 +1111,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..9ffe7a6ee3 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -954,7 +954,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1321,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index e6f6545033..8b2b188bd5 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,7 +136,7 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
-#define PROGRESS_COPY_LINES_EXCLUDED 3
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 25a863627e..970f6909c2 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,8 +1950,8 @@ pg_stat_progress_copy| SELECT s.pid,
     s.relid,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed,
-    s.param4 AS lines_excluded
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

#14Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Matthias van de Meent (#13)
Re: Improvements and additions to COPY progress reporting

On Fri, Feb 12, 2021 at 5:40 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Fri, 12 Feb 2021 at 12:23, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Thu, 11 Feb 2021 at 15:44, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Feb 11, 2021, 8:08 PM Josef Šimánek <josef.simanek@gmail.com> wrote:

I have split it since it should be the start of progress reporting
testing at all. If you better consider this as part of COPY testing,
feel free to move it to already existing copy testing related files.
There's no real reason to keep it separated if not needed.

+1 to move those test cases to existing copy test files.

Thanks for your reviews. PFA v4 of the patchset, in which the tests
are put into copy.sql (well, input/copy.source). This also adds tests
for correctly reporting COPY ... FROM 'file'.

PFA v5, which fixes a failure in the pg_upgrade regression tests due
to incorrect usage of @abs_builddir@. I had the changes staged, but
forgot to add them to the patches.

Sorry for the noise.

Looks like the patch 0001 that was adding tuples_excluded was missing
and cfbot is also not happy with the v5 patch set.

Maybe, we may not need 6 patches as they are relatively very small
patches. IMO, the following are enough:

0001 - tuples_excluded, lines to tuples change, COPY FROM/COPY TO
addition, io_target -- basically all the code related patches can go
into 0001
0002 - documentation
0003 - tests - I think we can only have a simple test(in copy2.sql)
showing stdin/stdout and not have file related tests. Because these
patches work as expected, please find my testing below:

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_target |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+-----------+-----------+-----------------+-------------+------------------+-----------------
2886103 | 12977 | postgres | 16384 | COPY FROM | FILE |
83099648 | 85777795 | 9553999 | 1111111
(1 row)

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_target |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+-----------+-----------+-----------------+-------------+------------------+-----------------
2886103 | 12977 | postgres | 16384 | COPY FROM | STDIO |
0 | 0 | 0 | 0
(1 row)

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_target |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+---------+-----------+-----------------+-------------+------------------+-----------------
2886103 | 12977 | postgres | 16384 | COPY TO | FILE |
37771610 | 0 | 4999228 | 0
(1 row)

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_target |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+-----------+-----------+-----------------+-------------+------------------+-----------------
2892816 | 12977 | postgres | 16384 | COPY FROM | CALLBACK |
249777823 | 0 | 31888892 | 0
(1 row)

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#15Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#14)
3 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Fri, 12 Feb 2021 at 13:40, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Fri, Feb 12, 2021 at 5:40 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Fri, 12 Feb 2021 at 12:23, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Thu, 11 Feb 2021 at 15:44, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Feb 11, 2021, 8:08 PM Josef Šimánek <josef.simanek@gmail.com> wrote:

I have split it since it should be the start of progress reporting
testing at all. If you better consider this as part of COPY testing,
feel free to move it to already existing copy testing related files.
There's no real reason to keep it separated if not needed.

+1 to move those test cases to existing copy test files.

Thanks for your reviews. PFA v4 of the patchset, in which the tests
are put into copy.sql (well, input/copy.source). This also adds tests
for correctly reporting COPY ... FROM 'file'.

PFA v5, which fixes a failure in the pg_upgrade regression tests due
to incorrect usage of @abs_builddir@. I had the changes staged, but
forgot to add them to the patches.

Sorry for the noise.

Looks like the patch 0001 that was adding tuples_excluded was missing
and cfbot is also not happy with the v5 patch set.

Maybe, we may not need 6 patches as they are relatively very small
patches. IMO, the following are enough:

0001 - tuples_excluded, lines to tuples change, COPY FROM/COPY TO
addition, io_target -- basically all the code related patches can go
into 0001
0002 - documentation
0003 - tests - I think we can only have a simple test(in copy2.sql)
showing stdin/stdout and not have file related tests. Because these
patches work as expected, please find my testing below:

I agree with that split, the current split was mainly for the reason
that some of the patches (except 1, 3 and 6, which are quite
substantially different from the rest) each have had their seperate
concerns voiced about the changes contained in that patch (be it
direct or indirect); e.g. the renaming of lines_* to tuples_* is in my
opionion a good thing, and Josef disagrees.

Anyway, please find attached patchset v6 applying that split.

Regarding only a simple test: I believe it is useful to have at least
a test that distinguishes between two different import types. I've
made a mistake before, so I think it is useful to add a regression
tests to prevent someone else from making this same mistake (trivial
as it may be). Additionally, testing in copy.sql also allows for
validating the bytes_total column, which cannot be tested in copy2.sql
due to the lack of COPY FROM FILE -support over there. I'm +0.5 on
keeping it as-is in copy.sql, so unless someone has some strong
feelings about this, I'd like to keep it in copy.sql.

With regards,

Matthias van de Meent.

Attachments:

v6-0002-Add-backlinks-to-progress-reporting-documentation.patchapplication/x-patch; name=v6-0002-Add-backlinks-to-progress-reporting-documentation.patchDownload
From daf793916a91157f917bafd669f50af3d3751ee4 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v6 2/3] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  8 ++++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 54 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..9db9070b62 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running the <command>ANALYZE</command> command will report their
+    progress to the <structname>pg_stat_progress_analyze</structname> view.
+    See <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..5c2270f71b 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running the <command>CLUSTER</command> command will report their
+    progress to the <structname>pg_stat_progress_cluster</structname> view.
+    See <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..af3ce72561 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running the <command>COPY</command> command will report their
+    progress to the <structname>pg_stat_progress_copy</structname> view.
+    See <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+ 
+ <refsect1>
+  <title>See Also</title>
+ 
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..278058f500 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -865,6 +865,13 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running the <command>CREATE INDEX</command> command will 
+    report their progress to the 
+    <structname>pg_stat_progress_create_index</structname> view.
+    See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -978,6 +985,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..f69f5db403 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -493,6 +493,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running the <command>REINDEX</command> command will report
+    their progress to the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -551,6 +557,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 4bb624979b..bab3d00313 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -393,6 +393,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running the <command>VACUUM</command> command without the
+    <literal>FULL</literal> option will report their progress in the 
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report 
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
+    See <xref linkend="vacuum-progress-reporting"/> and 
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -422,6 +431,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v6-0003-Add-copy-progress-reporting-regression-tests.patchapplication/x-patch; name=v6-0003-Add-copy-progress-reporting-regression-tests.patchDownload
From e02e2d80deccd033f6283caa15f0df9240087240 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v6 3/3] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 57 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 44 ++++++++++++++++++++++
 2 files changed, 101 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..ddde33e7cc 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,60 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+-- 
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..60f4206aa1 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,47 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+-- 
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

v6-0001-Add-progress-reported-components-for-COPY-progres.patchapplication/x-patch; name=v6-0001-Add-progress-reported-components-for-COPY-progres.patchDownload
From 275f47f5a2495db5ab0195cc2ff5101563724f36 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH v6 1/3] Add progress-reported components for COPY progress
 reporting

The command, io target and excluded tuple count (by COPY ... FROM ... s'
WHERE -clause) are now reported in the pg_stat_progress_copy view. 
Additionally, the column lines_processed is renamed to tuples_processed to
disambiguate the meaning of this column in cases of CSV and BINARY copies and
to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of io_target, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 37 ++++++++++++++++++++++++++--
 src/backend/catalog/system_views.sql | 11 ++++++++-
 src/backend/commands/copyfrom.c      | 30 +++++++++++++++++++++-
 src/backend/commands/copyto.c        | 29 ++++++++++++++++++++--
 src/include/commands/progress.h      | 15 ++++++++++-
 src/test/regress/expected/rules.out  | 15 ++++++++++-
 6 files changed, 129 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index c602ee4427..3c39c82f1a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6544,6 +6544,29 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_target</structfield> <type>text</type>
+      </para>
+      <para>
+       The io target that the data is read from or written to: 
+       <literal>FILE</literal>, <literal>PROGRAM</literal>, 
+       <literal>STDIO</literal> (for COPY FROM STDIN and COPY TO STDOUT),
+       or <literal>CALLBACK</literal> (used in the table synchronization
+       background worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
@@ -6565,10 +6588,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..6a3ac47b85 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,9 +1129,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_target,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 796ca7b3f7..c3610eb67e 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -540,6 +540,7 @@ CopyFrom(CopyFromState cstate)
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
 	uint64		processed = 0;
+	uint64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1424,6 +1429,8 @@ BeginCopyFrom(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
+
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
@@ -1501,6 +1508,27 @@ BeginCopyFrom(ParseState *pstate,
 		ReceiveCopyBinaryHeader(cstate);
 	}
 
+	{
+		int64 io_target;
+		switch (cstate->copy_src)
+		{
+			case COPY_FILE:
+				if (is_program)
+					io_target = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					io_target = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				io_target = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+			case COPY_CALLBACK:
+				io_target = PROGRESS_COPY_IO_TARGET_CALLBACK;
+				break;
+		}
+		pgstat_progress_update_param(PROGRESS_COPY_IO_TARGET, io_target);
+	}
+
 	/* create workspace for CopyReadAttributes results */
 	if (!cstate->opts.binary)
 	{
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..42c4a828df 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,6 +772,31 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	{
+		const int progress_index[] = {
+			PROGRESS_COPY_COMMAND,
+			PROGRESS_COPY_IO_TARGET
+		};
+		int64 progress_vals[] = {
+			PROGRESS_COPY_COMMAND_TO,
+			0
+		};
+		switch (cstate->copy_dest)
+		{
+			case COPY_FILE:
+				if (is_program)
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+		}
+		pgstat_progress_update_multi_param(2, progress_index, progress_vals);
+	}
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -954,7 +979,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1346,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..e003217554 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,6 +136,19 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TARGET 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TARGET_FILE 1
+#define PROGRESS_COPY_IO_TARGET_PROGRAM 2
+#define PROGRESS_COPY_IO_TARGET_STDIO 3
+#define PROGRESS_COPY_IO_TARGET_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b632d9f2ea..d244f9b020 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1948,9 +1948,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_target,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

#16Justin Pryzby
pryzby@telsasoft.com
In reply to: Matthias van de Meent (#15)
Re: Improvements and additions to COPY progress reporting
--- a/doc/src/sgml/ref/analyze.sgml                                                                                                                                                                                            
+++ b/doc/src/sgml/ref/analyze.sgml                                                                                                                                                                                            
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea                                                                                                                          
     will not record new statistics for that table.  Any existing statistics                                                                                                                                                   
     will be retained.                                                                                                                                                                                                         
   </para>                                                                                                                                                                                                                     
+                                                                                                                                                                                                                              
+  <para>                                                                                                                                                                                                                      
+    Each backend running the <command>ANALYZE</command> command will report their                                                                                                                                             
+    progress to the <structname>pg_stat_progress_analyze</structname> view.                                                                                                                                                   
+    See <xref linkend="analyze-progress-reporting"/> for details.                                                                                                                                                             
+  </para>                                                                                                                                                                                                                     

I think this should say:

"..will report its progress to.."

Or:

"The progress of each backend running >ANALYZE< is reported in the

pg_stat_progress_analyze< view."

--
Justin

#17Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Matthias van de Meent (#15)
6 attachment(s)
Re: Improvements and additions to COPY progress reporting

Hi,

I agree with these changes in general - I have a couple minor comment:

1) 0001

- the SGML docs are missing a couple tags

- The blocks in copyfrom.cc/copyto.c should be reworked - I don't think
we do this in our codebase. Move the variable declarations to the
beginning, get rid of the out block. Or something like that.

- I fir the "io_target" name misleading, because in some cases it's
actually the *source*.

2) 0002

- I believe "each backend ... reports its" (not theirs), right?

- This seems more like a generic docs improvement, not quite specific to
the COPY progress patch. It's a bit buried, maybe it should be posted
separately. OTOH it's pretty small.

3) 0003

- Some whitespace noise, triggering "git am" warnings.

- Might be good to briefly explain what the regression test does with
the triggers, etc.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

v7-0001-Add-progress-reported-components-for-COPY-progress-r.patchtext/x-patch; charset=UTF-8; name=v7-0001-Add-progress-reported-components-for-COPY-progress-r.patchDownload
From 32e8bf3e84d58b43c8ab5888eb909e86bdf0ceea Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH 1/6] Add progress-reported components for COPY progress
 reporting

The command, io target and excluded tuple count (by COPY ... FROM ... s'
WHERE -clause) are now reported in the pg_stat_progress_copy view.
Additionally, the column lines_processed is renamed to tuples_processed to
disambiguate the meaning of this column in cases of CSV and BINARY copies and
to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of io_target, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 37 ++++++++++++++++++++++++++--
 src/backend/catalog/system_views.sql | 11 ++++++++-
 src/backend/commands/copyfrom.c      | 30 +++++++++++++++++++++-
 src/backend/commands/copyto.c        | 29 ++++++++++++++++++++--
 src/include/commands/progress.h      | 15 ++++++++++-
 src/test/regress/expected/rules.out  | 15 ++++++++++-
 6 files changed, 129 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index c602ee4427..3c39c82f1a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6544,6 +6544,29 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_target</structfield> <type>text</type>
+      </para>
+      <para>
+       The io target that the data is read from or written to: 
+       <literal>FILE</literal>, <literal>PROGRAM</literal>, 
+       <literal>STDIO</literal> (for COPY FROM STDIN and COPY TO STDOUT),
+       or <literal>CALLBACK</literal> (used in the table synchronization
+       background worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
@@ -6565,10 +6588,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..6a3ac47b85 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,9 +1129,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_target,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 796ca7b3f7..c3610eb67e 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -540,6 +540,7 @@ CopyFrom(CopyFromState cstate)
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
 	uint64		processed = 0;
+	uint64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1424,6 +1429,8 @@ BeginCopyFrom(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_param(PROGRESS_COPY_COMMAND, PROGRESS_COPY_COMMAND_FROM);
+
 	cstate->bytes_processed = 0;
 
 	/* We keep those variables in cstate. */
@@ -1501,6 +1508,27 @@ BeginCopyFrom(ParseState *pstate,
 		ReceiveCopyBinaryHeader(cstate);
 	}
 
+	{
+		int64 io_target;
+		switch (cstate->copy_src)
+		{
+			case COPY_FILE:
+				if (is_program)
+					io_target = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					io_target = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				io_target = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+			case COPY_CALLBACK:
+				io_target = PROGRESS_COPY_IO_TARGET_CALLBACK;
+				break;
+		}
+		pgstat_progress_update_param(PROGRESS_COPY_IO_TARGET, io_target);
+	}
+
 	/* create workspace for CopyReadAttributes results */
 	if (!cstate->opts.binary)
 	{
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..42c4a828df 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,6 +772,31 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	{
+		const int progress_index[] = {
+			PROGRESS_COPY_COMMAND,
+			PROGRESS_COPY_IO_TARGET
+		};
+		int64 progress_vals[] = {
+			PROGRESS_COPY_COMMAND_TO,
+			0
+		};
+		switch (cstate->copy_dest)
+		{
+			case COPY_FILE:
+				if (is_program)
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
+				else
+					progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
+				break;
+			case COPY_OLD_FE:
+			case COPY_NEW_FE:
+				progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
+				break;
+		}
+		pgstat_progress_update_multi_param(2, progress_index, progress_vals);
+	}
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -954,7 +979,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1346,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..e003217554 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,6 +136,19 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TARGET 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TARGET_FILE 1
+#define PROGRESS_COPY_IO_TARGET_PROGRAM 2
+#define PROGRESS_COPY_IO_TARGET_STDIO 3
+#define PROGRESS_COPY_IO_TARGET_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 10a1f34ebc..3c6776429d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1949,9 +1949,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_target,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.26.2

v7-0002-0001-review.patchtext/x-patch; charset=UTF-8; name=v7-0002-0001-review.patchDownload
From ddcc5f263958efefda871e03f17715aa6111a2dc Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Mon, 15 Feb 2021 16:26:32 +0100
Subject: [PATCH 2/6] 0001 review

---
 doc/src/sgml/monitoring.sgml    | 2 +-
 src/backend/commands/copyfrom.c | 1 +
 src/backend/commands/copyto.c   | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3c39c82f1a..c5048fefd4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6561,7 +6561,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       <para>
        The io target that the data is read from or written to: 
        <literal>FILE</literal>, <literal>PROGRAM</literal>, 
-       <literal>STDIO</literal> (for COPY FROM STDIN and COPY TO STDOUT),
+       <literal>STDIO</literal> (for <command>COPY FROM STDIN</command> and <command>COPY TO STDOUT</command>),
        or <literal>CALLBACK</literal> (used in the table synchronization
        background worker).
       </para></entry>
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index c3610eb67e..d98f447cf5 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1508,6 +1508,7 @@ BeginCopyFrom(ParseState *pstate,
 		ReceiveCopyBinaryHeader(cstate);
 	}
 
+	/* XXX this block seems weird */
 	{
 		int64 io_target;
 		switch (cstate->copy_src)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 42c4a828df..4b76bf5ef6 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -772,6 +772,8 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+
+	/* XXX this too */
 	{
 		const int progress_index[] = {
 			PROGRESS_COPY_COMMAND,
-- 
2.26.2

v7-0003-Add-backlinks-to-progress-reporting-documentation.patchtext/x-patch; charset=UTF-8; name=v7-0003-Add-backlinks-to-progress-reporting-documentation.patchDownload
From c2a92e0fee760e52a604f6eb11e0766f5ffdf090 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH 3/6] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  8 ++++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 54 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..9db9070b62 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running the <command>ANALYZE</command> command will report their
+    progress to the <structname>pg_stat_progress_analyze</structname> view.
+    See <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..5c2270f71b 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running the <command>CLUSTER</command> command will report their
+    progress to the <structname>pg_stat_progress_cluster</structname> view.
+    See <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..af3ce72561 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running the <command>COPY</command> command will report their
+    progress to the <structname>pg_stat_progress_copy</structname> view.
+    See <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+ 
+ <refsect1>
+  <title>See Also</title>
+ 
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..278058f500 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -865,6 +865,13 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running the <command>CREATE INDEX</command> command will 
+    report their progress to the 
+    <structname>pg_stat_progress_create_index</structname> view.
+    See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -978,6 +985,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..f69f5db403 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -493,6 +493,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running the <command>REINDEX</command> command will report
+    their progress to the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -551,6 +557,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 4bb624979b..bab3d00313 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -393,6 +393,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running the <command>VACUUM</command> command without the
+    <literal>FULL</literal> option will report their progress in the 
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report 
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
+    See <xref linkend="vacuum-progress-reporting"/> and 
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -422,6 +431,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.26.2

v7-0004-0002-review.patchtext/x-patch; charset=UTF-8; name=v7-0004-0002-review.patchDownload
From d0d3e87d5c52d894c93c434ff7d0e03055948a1c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Mon, 15 Feb 2021 16:31:58 +0100
Subject: [PATCH 4/6] 0002 review

---
 doc/src/sgml/ref/analyze.sgml      | 2 +-
 doc/src/sgml/ref/cluster.sgml      | 2 +-
 doc/src/sgml/ref/copy.sgml         | 2 +-
 doc/src/sgml/ref/create_index.sgml | 2 +-
 doc/src/sgml/ref/reindex.sgml      | 2 +-
 doc/src/sgml/ref/vacuum.sgml       | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 9db9070b62..5a9e3322b3 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -275,7 +275,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
   </para>
 
   <para>
-    Each backend running the <command>ANALYZE</command> command will report their
+    Each backend running the <command>ANALYZE</command> command will report its
     progress to the <structname>pg_stat_progress_analyze</structname> view.
     See <xref linkend="analyze-progress-reporting"/> for details.
   </para>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5c2270f71b..70c4b33454 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -193,7 +193,7 @@ CLUSTER [VERBOSE]
    </para>
 
   <para>
-    Each backend running the <command>CLUSTER</command> command will report their
+    Each backend running the <command>CLUSTER</command> command will report its
     progress to the <structname>pg_stat_progress_cluster</structname> view.
     See <xref linkend="cluster-progress-reporting"/> for details.
   </para>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index af3ce72561..41e1d7c371 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -84,7 +84,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
   </para>
 
   <para>
-    Each backend running the <command>COPY</command> command will report their
+    Each backend running the <command>COPY</command> command will report its
     progress to the <structname>pg_stat_progress_copy</structname> view.
     See <xref linkend="copy-progress-reporting"/> for details.
   </para>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 278058f500..6079a0ef28 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -868,7 +868,7 @@ Indexes:
 
   <para>
     Each backend running the <command>CREATE INDEX</command> command will 
-    report their progress to the 
+    report it's progress to the 
     <structname>pg_stat_progress_create_index</structname> view.
     See <xref linkend="create-index-progress-reporting"/> for details.
   </para>
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index f69f5db403..3b4d409856 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -496,7 +496,7 @@ Indexes:
 
   <para>
     Each backend running the <command>REINDEX</command> command will report
-    their progress to the <structname>pg_stat_progress_create_index</structname>
+    its progress to the <structname>pg_stat_progress_create_index</structname>
     view. See <xref linkend="create-index-progress-reporting"/> for details.
   </para>
   </refsect2>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index bab3d00313..0624235696 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -395,7 +395,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
    </para>
    <para>
     Each backend running the <command>VACUUM</command> command without the
-    <literal>FULL</literal> option will report their progress in the 
+    <literal>FULL</literal> option will report its progress in the 
     <structname>pg_stat_progress_vacuum</structname> view. Backends running
     <command>VACUUM</command> with the <literal>FULL</literal> option report 
     progress in the <structname>pg_stat_progress_cluster</structname> instead.
-- 
2.26.2

v7-0005-Add-copy-progress-reporting-regression-tests.patchtext/x-patch; charset=UTF-8; name=v7-0005-Add-copy-progress-reporting-regression-tests.patchDownload
From c76d1d779b797a42ebc58abe7ead4f5d80b45bcb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH 5/6] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 57 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 44 ++++++++++++++++++++++
 2 files changed, 101 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..ddde33e7cc 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,60 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+-- 
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..60f4206aa1 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,47 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+-- 
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.26.2

v7-0006-0003-review.patchtext/x-patch; charset=UTF-8; name=v7-0006-0003-review.patchDownload
From 07d065c2a127301011849e6bd35b8062c38e915b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Mon, 15 Feb 2021 16:36:28 +0100
Subject: [PATCH 6/6] 0003 review

---
 src/test/regress/input/copy.source  | 4 ++--
 src/test/regress/output/copy.source | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index ddde33e7cc..4ee1b26326 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -204,7 +204,7 @@ drop table parted_copytest;
 
 --
 -- progress reporting
--- 
+--
 
 -- setup
 -- reuse employer datatype, that has a small sized data set
@@ -221,7 +221,7 @@ create function notice_after_progress_reporting() returns trigger AS
 $$
 declare report record;
 begin
-	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
 	-- variance in system process ids, and concurrency in runs of tests.
 	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
 	-- also is not consistent between runs.
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 60f4206aa1..8ebfb7ae81 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -167,7 +167,7 @@ select * from parted_copytest where b = 2;
 drop table parted_copytest;
 --
 -- progress reporting
--- 
+--
 -- setup
 -- reuse employer datatype, that has a small sized data set
 create table progress_reporting (
@@ -181,7 +181,7 @@ create function notice_after_progress_reporting() returns trigger AS
 $$
 declare report record;
 begin
-	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to 
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
 	-- variance in system process ids, and concurrency in runs of tests.
 	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
 	-- also is not consistent between runs.
-- 
2.26.2

#18Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Tomas Vondra (#17)
3 attachment(s)
Re: Improvements and additions to COPY progress reporting

Hi,

Thank you all for the suggestions. PFA version 8 of the patchset, in
which I have applied most of your comments. Unless explicitly named
below, I have applied the suggestions.

On Mon, 15 Feb 2021 at 17:07, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

- The blocks in copyfrom.cc/copyto.c should be reworked - I don't think
we do this in our codebase.

I saw this being used in (re)index progress reporting, that's where I
took inspiration from. It has been fixed in the attached version.

- I fir the "io_target" name misleading, because in some cases it's
actually the *source*.

Yes, I was also not quite happy with this, but couldn't find a better
one at the point of writing the initial patchset. Would
"io_operations", "io_port", "operates_through" or "through" maybe be
better?

With regards,

Matthias van de Meent

Attachments:

v8-0001-Add-progress-reported-components-for-COPY-progres.patchapplication/x-patch; name=v8-0001-Add-progress-reported-components-for-COPY-progres.patchDownload
From 40fade23e4a5c4ccf072d034b6f9b22e2fdfcfa5 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH v8 1/3] Add progress-reported components for COPY progress
 reporting

The command, io target and excluded tuple count (by the
COPY ... FROM ... WHERE -clause) are now reported in the pg_stat_progress_copy
view. Additionally, the column lines_processed is renamed to tuples_processed
to disambiguate the meaning of this column in cases of CSV and BINARY copies
and to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of io_target, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 37 ++++++++++++++++++++++++++--
 src/backend/catalog/system_views.sql | 11 ++++++++-
 src/backend/commands/copyfrom.c      | 27 +++++++++++++++++---
 src/backend/commands/copyto.c        | 19 ++++++++++++--
 src/include/commands/progress.h      | 15 ++++++++++-
 src/test/regress/expected/rules.out  | 15 ++++++++++-
 6 files changed, 114 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index c602ee4427..35cc938b20 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6544,6 +6544,29 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_target</structfield> <type>text</type>
+      </para>
+      <para>
+       The io target that the data is read from or written to:
+       <literal>FILE</literal>, <literal>PROGRAM</literal>,
+       <literal>STDIO</literal> (for <command>COPY FROM STDIN</command> and
+       <command>COPY TO STDOUT</command>), or <literal>CALLBACK</literal>
+       (used in the table synchronization background worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
@@ -6565,10 +6588,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..6a3ac47b85 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,9 +1129,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_target,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 796ca7b3f7..b74b43ab84 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -539,7 +539,8 @@ CopyFrom(CopyFromState cstate)
 	BulkInsertState bistate = NULL;
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
-	uint64		processed = 0;
+	int64		processed = 0;
+	int64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1200,6 +1205,16 @@ BeginCopyFrom(ParseState *pstate,
 	ExprState **defexprs;
 	MemoryContext oldcontext;
 	bool		volatile_defexprs;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TARGET,
+		PROGRESS_COPY_BYTES_TOTAL
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_FROM,
+		0,
+		0
+	};
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
@@ -1437,11 +1452,13 @@ BeginCopyFrom(ParseState *pstate,
 
 	if (data_source_cb)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TARGET_CALLBACK;
 		cstate->copy_src = COPY_CALLBACK;
 		cstate->data_source_cb = data_source_cb;
 	}
 	else if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput == DestRemote)
 			ReceiveCopyBegin(cstate);
@@ -1454,6 +1471,7 @@ BeginCopyFrom(ParseState *pstate,
 
 		if (cstate->is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -1465,6 +1483,7 @@ BeginCopyFrom(ParseState *pstate,
 		{
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 			{
@@ -1491,10 +1510,12 @@ BeginCopyFrom(ParseState *pstate,
 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 						 errmsg("\"%s\" is a directory", cstate->filename)));
 
-			pgstat_progress_update_param(PROGRESS_COPY_BYTES_TOTAL, st.st_size);
+			progress_vals[2] = st.st_size;
 		}
 	}
 
+	pgstat_progress_update_multi_param(3, progress_cols, progress_vals);
+
 	if (cstate->opts.binary)
 	{
 		/* Read and verify binary header */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..d29e1382e8 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -394,6 +394,14 @@ BeginCopyTo(ParseState *pstate,
 	TupleDesc	tupDesc;
 	int			num_phys_attrs;
 	MemoryContext oldcontext;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TARGET
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_TO,
+		0
+	};
 
 	if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION)
 	{
@@ -702,7 +710,10 @@ BeginCopyTo(ParseState *pstate,
 	{
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput != DestRemote)
+		{
+			progress_vals[1] = PROGRESS_COPY_IO_TARGET_STDIO;
 			cstate->copy_file = stdout;
+		}
 	}
 	else
 	{
@@ -711,6 +722,7 @@ BeginCopyTo(ParseState *pstate,
 
 		if (is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TARGET_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -723,6 +735,7 @@ BeginCopyTo(ParseState *pstate,
 			mode_t		oumask; /* Pre-existing umask value */
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TARGET_FILE;
 			/*
 			 * Prevent write to relative path ... too easy to shoot oneself in
 			 * the foot by overwriting a database file ...
@@ -772,6 +785,8 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_multi_param(2, progress_cols, progress_vals);
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -954,7 +969,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1336,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..e003217554 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,6 +136,19 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TARGET 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TARGET_FILE 1
+#define PROGRESS_COPY_IO_TARGET_PROGRAM 2
+#define PROGRESS_COPY_IO_TARGET_STDIO 3
+#define PROGRESS_COPY_IO_TARGET_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b632d9f2ea..d244f9b020 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1948,9 +1948,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_target,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v8-0002-Add-backlinks-to-progress-reporting-documentation.patchapplication/x-patch; name=v8-0002-Add-backlinks-to-progress-reporting-documentation.patchDownload
From b3a95f76f5fe1050c33515a74cf6acabe926fe37 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v8 2/3] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  7 +++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 53 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..e44a5e07c1 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running <command>ANALYZE</command> will report its progress to
+    the <structname>pg_stat_progress_analyze</structname> view. See
+    <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..6dcf886e54 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running <command>CLUSTER</command> will report its progress to
+    the <structname>pg_stat_progress_cluster</structname> view. See
+    <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..bd2dd87c52 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running <command>COPY</command> will report its progress to
+    the <structname>pg_stat_progress_copy</structname> view. See
+    <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+
+ <refsect1>
+  <title>See Also</title>
+
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..092372694d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -865,6 +865,12 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running <command>CREATE INDEX</command> will report its
+    progress to the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -978,6 +984,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..67cdd83449 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -493,6 +493,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running <command>REINDEX</command> will report its progress
+    to the <structname>pg_stat_progress_create_index</structname> view. See
+    <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -551,6 +557,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 4bb624979b..08a40e94f7 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -393,6 +393,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
+    See <xref linkend="vacuum-progress-reporting"/> and
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -422,6 +431,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v8-0003-Add-copy-progress-reporting-regression-tests.patchapplication/x-patch; name=v8-0003-Add-copy-progress-reporting-regression-tests.patchDownload
From 21b972b65cee4f9c7eaa559f3d7cfbc59f261697 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v8 3/3] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 60 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 47 ++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..4f1cbc73d2 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,63 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+--
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..42bcaed760 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,50 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+--
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_target": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

#19Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Matthias van de Meent (#18)
Re: Improvements and additions to COPY progress reporting

On 2/18/21 4:46 PM, Matthias van de Meent wrote:

Hi,

Thank you all for the suggestions. PFA version 8 of the patchset, in
which I have applied most of your comments. Unless explicitly named
below, I have applied the suggestions.

Thanks.

On Mon, 15 Feb 2021 at 17:07, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

- The blocks in copyfrom.cc/copyto.c should be reworked - I don't think
we do this in our codebase.

I saw this being used in (re)index progress reporting, that's where I
took inspiration from. It has been fixed in the attached version.

Hmmm, good point. I haven't looked at the other places reporting
progress and I only ever saw this pattern in old code. I kinda dislike
these blocks, but admittedly that's rather subjective view. So if other
similar places do this when reporting progress, this probably should
too. What's your opinion on this?

- I fir the "io_target" name misleading, because in some cases it's
actually the *source*.

Yes, I was also not quite happy with this, but couldn't find a better
one at the point of writing the initial patchset. Would
"io_operations", "io_port", "operates_through" or "through" maybe be
better?

No idea. Let's see if someone has a better proposal ...

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Tomas Vondra (#19)
Re: Improvements and additions to COPY progress reporting

On Fri, Feb 19, 2021 at 2:34 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

On Mon, 15 Feb 2021 at 17:07, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

- The blocks in copyfrom.cc/copyto.c should be reworked - I don't think
we do this in our codebase.

I saw this being used in (re)index progress reporting, that's where I
took inspiration from. It has been fixed in the attached version.

Hmmm, good point. I haven't looked at the other places reporting
progress and I only ever saw this pattern in old code. I kinda dislike
these blocks, but admittedly that's rather subjective view. So if other
similar places do this when reporting progress, this probably should
too. What's your opinion on this?

Actually in the code base the style of that variable declaration and
usage of pgstat_progress_update_multi_param is a mix. For instance, in
lazy_scan_heap, ReindexRelationConcurrently, the variables are
declared at the start of the function. And in _bt_spools_heapscan,
index_build, validate_index, perform_base_backup, the variables are
declared within a separate block.

IMO, we can have the arrays declared at the start of the functions
i.e. the way it's done in v8-0001, because we can extend them for
reporting some other parameter(maybe in future).

- I fir the "io_target" name misleading, because in some cases it's
actually the *source*.

Yes, I was also not quite happy with this, but couldn't find a better
one at the point of writing the initial patchset. Would
"io_operations", "io_port", "operates_through" or "through" maybe be
better?

No idea. Let's see if someone has a better proposal ...

For COPY TO the name "source_type" column and for COPY FROM the name
"destination_type" makes sense. To have a combined column name for
both, how about naming that column as "io_type"?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#21Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#20)
Re: Improvements and additions to COPY progress reporting

On Sat, Feb 20, 2021 at 11:39:22AM +0530, Bharath Rupireddy wrote:

Actually in the code base the style of that variable declaration and
usage of pgstat_progress_update_multi_param is a mix. For instance, in
lazy_scan_heap, ReindexRelationConcurrently, the variables are
declared at the start of the function. And in _bt_spools_heapscan,
index_build, validate_index, perform_base_backup, the variables are
declared within a separate block.

I think that we should encourage the use of
pgstat_progress_update_multi_param() where we can, as it makes
consistent the updates to all the parameters according to
st_changecount. That's also usually cleaner to store all the
parameters that are changed if these are updated multiple times like
the REINDEX CONCURRENTLY ones. The context of the code also matters,
of course.

Scanning through the patch set, 0002 is a good idea taken
independently.
--
Michael

#22Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#21)
Re: Improvements and additions to COPY progress reporting

On Sat, Feb 20, 2021 at 12:49 PM Michael Paquier <michael@paquier.xyz>
wrote:

On Sat, Feb 20, 2021 at 11:39:22AM +0530, Bharath Rupireddy wrote:

Actually in the code base the style of that variable declaration and
usage of pgstat_progress_update_multi_param is a mix. For instance, in
lazy_scan_heap, ReindexRelationConcurrently, the variables are
declared at the start of the function. And in _bt_spools_heapscan,
index_build, validate_index, perform_base_backup, the variables are
declared within a separate block.

I think that we should encourage the use of
pgstat_progress_update_multi_param() where we can, as it makes
consistent the updates to all the parameters according to
st_changecount. That's also usually cleaner to store all the
parameters that are changed if these are updated multiple times like
the REINDEX CONCURRENTLY ones. The context of the code also matters,
of course.

Yeah. We could use pgstat_progress_update_multi_param instead of
pgstat_progress_update_param to update multiple params.

On a quick scan through the code, I found that we can do the following. If
okay, I can start a new thread so that we don't divert the main thread
here. Thoughts?

@@ -3686,12 +3686,18 @@ reindex_index(Oid indexId, bool
skip_constraint_checks, char persistence,
      if (progress)
     {
+        const int    progress_cols[] = {
+            PROGRESS_CREATEIDX_COMMAND,
+            PROGRESS_CREATEIDX_INDEX_OID
+        };
+        const int64    progress_vals[] = {
+            PROGRESS_CREATEIDX_COMMAND_REINDEX,
+            indexId
+        };
+
         pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
                                       heapId);
-        pgstat_progress_update_param(PROGRESS_CREATEIDX_COMMAND,
-                                     PROGRESS_CREATEIDX_COMMAND_REINDEX);
-        pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
-                                     indexId);
+        pgstat_progress_update_multi_param(2, progress_cols,
progress_vals);
     }
@@ -1457,10 +1457,21 @@ DefineIndex(Oid relationId,
         set_indexsafe_procflags();
     /*
-     * The index is now visible, so we can report the OID.
+     * The index is now visible, so we can report the OID. And also, report
+     * Phase 2 of concurrent index build.
      */
-    pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
-                                 indexRelationId);
+    {
+        const int    progress_cols[] = {
+            PROGRESS_CREATEIDX_INDEX_OID,
+            PROGRESS_CREATEIDX_PHASE
+        };
+        const int64    progress_vals[] = {
+            indexRelationId,
+            PROGRESS_CREATEIDX_PHASE_WAIT_1
+        };
+
+        pgstat_progress_update_multi_param(2, progress_cols,
progress_vals);
+    }
@@ -284,12 +284,9 @@ cluster_rel(Oid tableOid, Oid indexOid, ClusterParams
*params)
     CHECK_FOR_INTERRUPTS();
      pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
-    if (OidIsValid(indexOid))
-        pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
-                                     PROGRESS_CLUSTER_COMMAND_CLUSTER);
-    else
-        pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
-                                     PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+    pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+                                 OidIsValid(indexOid) ?
PROGRESS_CLUSTER_COMMAND_CLUSTER :
+                                 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#23Josef Šimánek
josef.simanek@gmail.com
In reply to: Bharath Rupireddy (#20)
Re: Improvements and additions to COPY progress reporting

so 20. 2. 2021 v 7:09 odesílatel Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> napsal:

On Fri, Feb 19, 2021 at 2:34 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

On Mon, 15 Feb 2021 at 17:07, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

- The blocks in copyfrom.cc/copyto.c should be reworked - I don't think
we do this in our codebase.

I saw this being used in (re)index progress reporting, that's where I
took inspiration from. It has been fixed in the attached version.

Hmmm, good point. I haven't looked at the other places reporting
progress and I only ever saw this pattern in old code. I kinda dislike
these blocks, but admittedly that's rather subjective view. So if other
similar places do this when reporting progress, this probably should
too. What's your opinion on this?

Actually in the code base the style of that variable declaration and
usage of pgstat_progress_update_multi_param is a mix. For instance, in
lazy_scan_heap, ReindexRelationConcurrently, the variables are
declared at the start of the function. And in _bt_spools_heapscan,
index_build, validate_index, perform_base_backup, the variables are
declared within a separate block.

IMO, we can have the arrays declared at the start of the functions
i.e. the way it's done in v8-0001, because we can extend them for
reporting some other parameter(maybe in future).

- I fir the "io_target" name misleading, because in some cases it's
actually the *source*.

Yes, I was also not quite happy with this, but couldn't find a better
one at the point of writing the initial patchset. Would
"io_operations", "io_port", "operates_through" or "through" maybe be
better?

No idea. Let's see if someone has a better proposal ...

For COPY TO the name "source_type" column and for COPY FROM the name
"destination_type" makes sense. To have a combined column name for
both, how about naming that column as "io_type"?

+1 on "io_type", that is my best candidate as well

Show quoted text

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#24Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#22)
Re: Improvements and additions to COPY progress reporting

On Sat, Feb 20, 2021 at 02:29:44PM +0530, Bharath Rupireddy wrote:

Yeah. We could use pgstat_progress_update_multi_param instead of
pgstat_progress_update_param to update multiple params.

On a quick scan through the code, I found that we can do the following. If
okay, I can start a new thread so that we don't divert the main thread
here. Thoughts?

Having a separate thread to discuss this part would be right. This
way any patches sent would attract the correct audience.
--
Michael

#25Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#24)
3 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Sat, 20 Feb 2021 at 07:09, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

For COPY TO the name "source_type" column and for COPY FROM the name
"destination_type" makes sense. To have a combined column name for
both, how about naming that column as "io_type"?

Thank you, that's way better! PFA what I believe is a finalized
patchset v9, utilizing io_type terminology instead of io_target.

With regards,

Matthias van de Meent

Attachments:

v9-0001-Add-progress-reported-components-for-COPY-progres.patchtext/x-patch; charset=US-ASCII; name=v9-0001-Add-progress-reported-components-for-COPY-progres.patchDownload
From 5c0c9658e7c50a2a2d74a738b7bdbfcaa6362e24 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH v9 1/3] Add progress-reported components for COPY progress
 reporting

The command, io target and excluded tuple count (by the
COPY ... FROM ... WHERE -clause) are now reported in the pg_stat_progress_copy
view. Additionally, the column lines_processed is renamed to tuples_processed
to disambiguate the meaning of this column in cases of CSV and BINARY copies
and to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of io_type, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 37 ++++++++++++++++++++++++++--
 src/backend/catalog/system_views.sql | 11 ++++++++-
 src/backend/commands/copyfrom.c      | 27 +++++++++++++++++---
 src/backend/commands/copyto.c        | 19 ++++++++++++--
 src/include/commands/progress.h      | 15 ++++++++++-
 src/test/regress/expected/rules.out  | 15 ++++++++++-
 6 files changed, 114 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..a21bee69d8 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6546,6 +6546,29 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_type</structfield> <type>text</type>
+      </para>
+      <para>
+       The io type that the data is read from or written to:
+       <literal>FILE</literal>, <literal>PROGRAM</literal>,
+       <literal>STDIO</literal> (for <command>COPY FROM STDIN</command> and
+       <command>COPY TO STDOUT</command>), or <literal>CALLBACK</literal>
+       (used in the table synchronization background worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
@@ -6567,10 +6590,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..f96ad0a5dc 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,9 +1129,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_type,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 796ca7b3f7..36828246c1 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -539,7 +539,8 @@ CopyFrom(CopyFromState cstate)
 	BulkInsertState bistate = NULL;
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
-	uint64		processed = 0;
+	int64		processed = 0;
+	int64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1200,6 +1205,16 @@ BeginCopyFrom(ParseState *pstate,
 	ExprState **defexprs;
 	MemoryContext oldcontext;
 	bool		volatile_defexprs;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE,
+		PROGRESS_COPY_BYTES_TOTAL
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_FROM,
+		0,
+		0
+	};
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
@@ -1437,11 +1452,13 @@ BeginCopyFrom(ParseState *pstate,
 
 	if (data_source_cb)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_CALLBACK;
 		cstate->copy_src = COPY_CALLBACK;
 		cstate->data_source_cb = data_source_cb;
 	}
 	else if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput == DestRemote)
 			ReceiveCopyBegin(cstate);
@@ -1454,6 +1471,7 @@ BeginCopyFrom(ParseState *pstate,
 
 		if (cstate->is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -1465,6 +1483,7 @@ BeginCopyFrom(ParseState *pstate,
 		{
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 			{
@@ -1491,10 +1510,12 @@ BeginCopyFrom(ParseState *pstate,
 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 						 errmsg("\"%s\" is a directory", cstate->filename)));
 
-			pgstat_progress_update_param(PROGRESS_COPY_BYTES_TOTAL, st.st_size);
+			progress_vals[2] = st.st_size;
 		}
 	}
 
+	pgstat_progress_update_multi_param(3, progress_cols, progress_vals);
+
 	if (cstate->opts.binary)
 	{
 		/* Read and verify binary header */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..d8534bc613 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -394,6 +394,14 @@ BeginCopyTo(ParseState *pstate,
 	TupleDesc	tupDesc;
 	int			num_phys_attrs;
 	MemoryContext oldcontext;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_TO,
+		0
+	};
 
 	if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION)
 	{
@@ -702,7 +710,10 @@ BeginCopyTo(ParseState *pstate,
 	{
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput != DestRemote)
+		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
 			cstate->copy_file = stdout;
+		}
 	}
 	else
 	{
@@ -711,6 +722,7 @@ BeginCopyTo(ParseState *pstate,
 
 		if (is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -723,6 +735,7 @@ BeginCopyTo(ParseState *pstate,
 			mode_t		oumask; /* Pre-existing umask value */
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			/*
 			 * Prevent write to relative path ... too easy to shoot oneself in
 			 * the foot by overwriting a database file ...
@@ -772,6 +785,8 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_multi_param(2, progress_cols, progress_vals);
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -954,7 +969,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1336,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..508b562d92 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,6 +136,19 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TYPE 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TYPE_FILE 1
+#define PROGRESS_COPY_IO_TYPE_PROGRAM 2
+#define PROGRESS_COPY_IO_TYPE_STDIO 3
+#define PROGRESS_COPY_IO_TYPE_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 10a1f34ebc..a6ae73bd09 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1949,9 +1949,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_type,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v9-0003-Add-copy-progress-reporting-regression-tests.patchtext/x-patch; charset=UTF-8; name=v9-0003-Add-copy-progress-reporting-regression-tests.patchDownload
From 773085ed047e83395624fd2aef7be229431f9c4e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v9 3/3] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 60 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 47 ++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..4f1cbc73d2 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,63 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+--
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..f3596bdd15 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,50 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+--
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

v9-0002-Add-backlinks-to-progress-reporting-documentation.patchtext/x-patch; charset=US-ASCII; name=v9-0002-Add-backlinks-to-progress-reporting-documentation.patchDownload
From 82e457b1614781479270f561257441c86f5e064c Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v9 2/3] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  7 +++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 53 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..e44a5e07c1 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running <command>ANALYZE</command> will report its progress to
+    the <structname>pg_stat_progress_analyze</structname> view. See
+    <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..6dcf886e54 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running <command>CLUSTER</command> will report its progress to
+    the <structname>pg_stat_progress_cluster</structname> view. See
+    <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..bd2dd87c52 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running <command>COPY</command> will report its progress to
+    the <structname>pg_stat_progress_copy</structname> view. See
+    <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+
+ <refsect1>
+  <title>See Also</title>
+
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..092372694d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -865,6 +865,12 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running <command>CREATE INDEX</command> will report its
+    progress to the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -978,6 +984,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..67cdd83449 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -493,6 +493,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running <command>REINDEX</command> will report its progress
+    to the <structname>pg_stat_progress_create_index</structname> view. See
+    <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -551,6 +557,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 4bb624979b..08a40e94f7 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -393,6 +393,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
+    See <xref linkend="vacuum-progress-reporting"/> and
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -422,6 +431,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

#26Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Matthias van de Meent (#25)
Re: Improvements and additions to COPY progress reporting

On Mon, Feb 22, 2021 at 12:40 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Sat, 20 Feb 2021 at 07:09, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

For COPY TO the name "source_type" column and for COPY FROM the name
"destination_type" makes sense. To have a combined column name for
both, how about naming that column as "io_type"?

Thank you, that's way better! PFA what I believe is a finalized
patchset v9, utilizing io_type terminology instead of io_target.

Thanks for the patches. I reviewed them.

0001 - I think there's a bug. See COPY TO stdout doesn't print
io_type as "STDIO".

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_type |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
--------+-------+----------+-------+---------+---------+-----------------+-------------+------------------+-----------------
977510 | 13003 | postgres | 16384 | COPY TO | |
23961591 | 0 | 2662399 | 0
(1 row)

we should do below(like we do for copyfrom.c):

@@ -702,7 +710,10 @@ BeginCopyTo(ParseState *pstate,
if (pipe)
{
progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
Assert(!is_program); /* the grammar does not allow this */
if (whereToSendOutput != DestRemote)
cstate->copy_file = stdout;
}

Because if "pipe" is true, that means the transfer is between STDIO.
See below comment:

* If <pipe> is false, transfer is between the table and the file named
* <filename>. Otherwise, transfer is between the table and our regular
* input/output stream. The latter could be either stdin/stdout or a
* socket, depending on whether we're running under Postmaster control.
*

0002 patch looks good to me.

0003 - patch:
I'm doubtful if the "bytes_total": 79 i.e. test file size will be the
same across different platforms and file systems types, if true, then
the below tests will not be stable. Do we also want to exclude the
bytes_total from the output, just to be on the safer side? Thoughts?

copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression",
"io_type": "STDIO", "bytes_total": 0, "bytes_processed": 79,
"tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+    where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression",
"io_type": "FILE", "bytes_total": 79, "bytes_processed": 79,
"tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#27Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#26)
3 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Mon, 22 Feb 2021 at 05:49, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Mon, Feb 22, 2021 at 12:40 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Sat, 20 Feb 2021 at 07:09, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

For COPY TO the name "source_type" column and for COPY FROM the name
"destination_type" makes sense. To have a combined column name for
both, how about naming that column as "io_type"?

Thank you, that's way better! PFA what I believe is a finalized
patchset v9, utilizing io_type terminology instead of io_target.

Thanks for the patches. I reviewed them.

0001 - I think there's a bug. See COPY TO stdout doesn't print
io_type as "STDIO".

Fixed in attached

0003 - patch:
I'm doubtful if the "bytes_total": 79 i.e. test file size will be the
same across different platforms and file systems types, if true, then
the below tests will not be stable. Do we also want to exclude the
bytes_total from the output, just to be on the safer side? Thoughts?

I'm fairly certain that input files of the regression tests are
considered 'binary files' to the test framework and that contents
don't change between different architectures or OSes. I also think
that any POSIX-compliant file system would report anything but the
size of the file contents, i.e. the size of the blob that is the file,
and that is correctly reported here. Other than that, if bytes_total
wouldn't be stable, then bytes_processed wouldn't make sense either.

For STDIN / STDOUT you might also have a point (different input
methods might have different length encodings for the specified
input), but insofar that I understand the test framework and the
expected git configurations, the tests run using UTF-8 / ascii only,
with a single style of newlines[+]. Sadly, I cannot provide examples
nor outputs for other test framework settings due to my lack of
experience with running the tests with non-standard settings.

Note, I'm happy to be proven wrong here, in which case I don't
disagree, but according to my limited knowledge, these outputs should
be stable.

With regards,

Matthias van de Meent

[+] Except when explicitly configured to run using non-standard
configurations, in which case there are different expected output
values to be configured for that configuration.

Attachments:

v10-0001-Add-progress-reported-components-for-COPY-progre.patchapplication/x-patch; name=v10-0001-Add-progress-reported-components-for-COPY-progre.patchDownload
From 3729dd30ac10af14b056db6c41d44f8c6b39b444 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH v10 1/3] Add progress-reported components for COPY progress
 reporting

The command, io target and excluded tuple count (by the
COPY ... FROM ... WHERE -clause) are now reported in the pg_stat_progress_copy
view. Additionally, the column lines_processed is renamed to tuples_processed
to disambiguate the meaning of this column in cases of CSV and BINARY copies
and to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of io_type, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 37 ++++++++++++++++++++++++++--
 src/backend/catalog/system_views.sql | 11 ++++++++-
 src/backend/commands/copyfrom.c      | 27 +++++++++++++++++---
 src/backend/commands/copyto.c        | 18 ++++++++++++--
 src/include/commands/progress.h      | 15 ++++++++++-
 src/test/regress/expected/rules.out  | 15 ++++++++++-
 6 files changed, 113 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..a21bee69d8 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6546,6 +6546,29 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_type</structfield> <type>text</type>
+      </para>
+      <para>
+       The io type that the data is read from or written to:
+       <literal>FILE</literal>, <literal>PROGRAM</literal>,
+       <literal>STDIO</literal> (for <command>COPY FROM STDIN</command> and
+       <command>COPY TO STDOUT</command>), or <literal>CALLBACK</literal>
+       (used in the table synchronization background worker).
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>bytes_processed</structfield> <type>bigint</type>
@@ -6567,10 +6590,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..f96ad0a5dc 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1129,9 +1129,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_type,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 796ca7b3f7..36828246c1 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -539,7 +539,8 @@ CopyFrom(CopyFromState cstate)
 	BulkInsertState bistate = NULL;
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
-	uint64		processed = 0;
+	int64		processed = 0;
+	int64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1200,6 +1205,16 @@ BeginCopyFrom(ParseState *pstate,
 	ExprState **defexprs;
 	MemoryContext oldcontext;
 	bool		volatile_defexprs;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE,
+		PROGRESS_COPY_BYTES_TOTAL
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_FROM,
+		0,
+		0
+	};
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
@@ -1437,11 +1452,13 @@ BeginCopyFrom(ParseState *pstate,
 
 	if (data_source_cb)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_CALLBACK;
 		cstate->copy_src = COPY_CALLBACK;
 		cstate->data_source_cb = data_source_cb;
 	}
 	else if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput == DestRemote)
 			ReceiveCopyBegin(cstate);
@@ -1454,6 +1471,7 @@ BeginCopyFrom(ParseState *pstate,
 
 		if (cstate->is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -1465,6 +1483,7 @@ BeginCopyFrom(ParseState *pstate,
 		{
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 			{
@@ -1491,10 +1510,12 @@ BeginCopyFrom(ParseState *pstate,
 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 						 errmsg("\"%s\" is a directory", cstate->filename)));
 
-			pgstat_progress_update_param(PROGRESS_COPY_BYTES_TOTAL, st.st_size);
+			progress_vals[2] = st.st_size;
 		}
 	}
 
+	pgstat_progress_update_multi_param(3, progress_cols, progress_vals);
+
 	if (cstate->opts.binary)
 	{
 		/* Read and verify binary header */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index e04ec1e331..606f6f190c 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -394,6 +394,14 @@ BeginCopyTo(ParseState *pstate,
 	TupleDesc	tupDesc;
 	int			num_phys_attrs;
 	MemoryContext oldcontext;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_TO,
+		0
+	};
 
 	if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION)
 	{
@@ -700,6 +708,8 @@ BeginCopyTo(ParseState *pstate,
 
 	if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
+
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput != DestRemote)
 			cstate->copy_file = stdout;
@@ -711,6 +721,7 @@ BeginCopyTo(ParseState *pstate,
 
 		if (is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -723,6 +734,7 @@ BeginCopyTo(ParseState *pstate,
 			mode_t		oumask; /* Pre-existing umask value */
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			/*
 			 * Prevent write to relative path ... too easy to shoot oneself in
 			 * the foot by overwriting a database file ...
@@ -772,6 +784,8 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_multi_param(2, progress_cols, progress_vals);
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -954,7 +968,7 @@ CopyTo(CopyToState cstate)
 			CopyOneRowTo(cstate, slot);
 
 			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1321,7 +1335,7 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	CopyOneRowTo(cstate, slot);
 
 	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..508b562d92 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,6 +136,19 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TYPE 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TYPE_FILE 1
+#define PROGRESS_COPY_IO_TYPE_PROGRAM 2
+#define PROGRESS_COPY_IO_TYPE_STDIO 3
+#define PROGRESS_COPY_IO_TYPE_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 10a1f34ebc..a6ae73bd09 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1949,9 +1949,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_type,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v10-0002-Add-backlinks-to-progress-reporting-documentatio.patchapplication/x-patch; name=v10-0002-Add-backlinks-to-progress-reporting-documentatio.patchDownload
From 82dfa694f48874d23f25a6d2673fefdd09010e9e Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v10 2/3] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  7 +++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 53 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..e44a5e07c1 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running <command>ANALYZE</command> will report its progress to
+    the <structname>pg_stat_progress_analyze</structname> view. See
+    <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..6dcf886e54 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running <command>CLUSTER</command> will report its progress to
+    the <structname>pg_stat_progress_cluster</structname> view. See
+    <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..bd2dd87c52 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running <command>COPY</command> will report its progress to
+    the <structname>pg_stat_progress_copy</structname> view. See
+    <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+
+ <refsect1>
+  <title>See Also</title>
+
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index a5271a9f8f..092372694d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -865,6 +865,12 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running <command>CREATE INDEX</command> will report its
+    progress to the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -978,6 +984,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 07795b5737..67cdd83449 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -493,6 +493,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running <command>REINDEX</command> will report its progress
+    to the <structname>pg_stat_progress_create_index</structname> view. See
+    <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -551,6 +557,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 4bb624979b..08a40e94f7 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -393,6 +393,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
+    See <xref linkend="vacuum-progress-reporting"/> and
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -422,6 +431,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v10-0003-Add-copy-progress-reporting-regression-tests.patchapplication/x-patch; name=v10-0003-Add-copy-progress-reporting-regression-tests.patchDownload
From 6062d394a508bf117e666b794a9ff36c1ef0b0d3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v10 3/3] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 60 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 47 ++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..4f1cbc73d2 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,63 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+--
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..f3596bdd15 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,50 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+--
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

#28Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Matthias van de Meent (#27)
Re: Improvements and additions to COPY progress reporting

On Tue, Feb 23, 2021 at 2:57 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Mon, 22 Feb 2021 at 05:49, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Mon, Feb 22, 2021 at 12:40 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Sat, 20 Feb 2021 at 07:09, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

For COPY TO the name "source_type" column and for COPY FROM the name
"destination_type" makes sense. To have a combined column name for
both, how about naming that column as "io_type"?

Thank you, that's way better! PFA what I believe is a finalized
patchset v9, utilizing io_type terminology instead of io_target.

Thanks for the patches. I reviewed them.

0001 - I think there's a bug. See COPY TO stdout doesn't print
io_type as "STDIO".

Fixed in attached

Thanks.

0003 - patch:
I'm doubtful if the "bytes_total": 79 i.e. test file size will be the
same across different platforms and file systems types, if true, then
the below tests will not be stable. Do we also want to exclude the
bytes_total from the output, just to be on the safer side? Thoughts?

I'm fairly certain that input files of the regression tests are
considered 'binary files' to the test framework and that contents
don't change between different architectures or OSes. I also think
that any POSIX-compliant file system would report anything but the
size of the file contents, i.e. the size of the blob that is the file,
and that is correctly reported here. Other than that, if bytes_total
wouldn't be stable, then bytes_processed wouldn't make sense either.

For STDIN / STDOUT you might also have a point (different input
methods might have different length encodings for the specified
input), but insofar that I understand the test framework and the
expected git configurations, the tests run using UTF-8 / ascii only,
with a single style of newlines[+]. Sadly, I cannot provide examples
nor outputs for other test framework settings due to my lack of
experience with running the tests with non-standard settings.

Note, I'm happy to be proven wrong here, in which case I don't
disagree, but according to my limited knowledge, these outputs should
be stable.

I'm no expert in different OS architectures, but I see that the
patches are passing on cf bot where they get tested on Windows,
FreeBSD, Linux and macOS platforms.

I have no further comments on the v10 patch set. I tested the patches,
they work as expected. I will mark the cf entry as "Ready for
Committer".

Below are some snapshots from testing:

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_type |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+-----------+---------+-----------------+-------------+------------------+-----------------
1089927 | 13003 | postgres | 16384 | COPY FROM | FILE |
104660992 | 888888898 | 12861112 | 0
1089969 | 13003 | postgres | 16384 | COPY FROM | FILE |
76611584 | 888888898 | 9712999 | 0

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_type |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+-----------+---------+-----------------+-------------+------------------+-----------------
1089927 | 13003 | postgres | 16384 | COPY FROM | FILE |
203161600 | 888888898 | 0 | 23804080
1089969 | 13003 | postgres | 16384 | COPY FROM | FILE |
150601728 | 888888898 | 0 | 17961241

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_type |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+---------+---------+-----------------+-------------+------------------+-----------------
1089927 | 13003 | postgres | 16384 | COPY TO | FILE |
66806479 | 0 | 7422942 | 0
1089969 | 13003 | postgres | 16384 | COPY TO | FILE |
29803951 | 0 | 3311550 | 0

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_type |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+---------+---------+-----------------+-------------+------------------+-----------------
1089927 | 13003 | postgres | 16384 | COPY TO | STDIO |
5998293 | 0 | 666477 | 0
1089969 | 13003 | postgres | 16384 | COPY TO | STDIO |
2780586 | 0 | 308954 | 0

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_type |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+---------+---------+-----------------+-------------+------------------+-----------------
1089927 | 13003 | postgres | 0 | COPY TO | FILE |
124447239 | 0 | 13827471 | 0
1089969 | 13003 | postgres | 0 | COPY TO | FILE |
90992466 | 0 | 10110274 | 0

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_type |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+-----------+---------+-----------------+-------------+------------------+-----------------
1090927 | 13003 | postgres | 16384 | COPY FROM | STDIO |
492465897 | 0 | 55952999 | 0
1091000 | 13003 | postgres | 16384 | COPY FROM | STDIO |
30494360 | 0 | 3950683 | 0

postgres=# select * from pg_stat_progress_copy;
pid | datid | datname | relid | command | io_type |
bytes_processed | bytes_total | tuples_processed | tuples_excluded
---------+-------+----------+-------+-----------+---------+-----------------+-------------+------------------+-----------------
1091217 | 13003 | postgres | 16384 | COPY FROM | STDIO |
230516127 | 0 | 0 | 26847469
1091224 | 13003 | postgres | 16384 | COPY FROM | STDIO |
212020065 | 0 | 0 | 24792351

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#29Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#27)
Re: Improvements and additions to COPY progress reporting

On Tue, Feb 23, 2021 at 10:27:24AM +0100, Matthias van de Meent wrote:

Note, I'm happy to be proven wrong here, in which case I don't
disagree, but according to my limited knowledge, these outputs should
be stable.

I am planning to look more at 0001 and 0003, but for now I have been
looking at 0002 which is interesting on its own.

+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report
+    progress in the <structname>pg_stat_progress_cluster</structname> instead.
You have missed one "view" after pg_stat_progress_cluster here.
Except that, this stuff looks fine.  So I'd like to apply it if there
are no objections.
--
Michael
#30Justin Pryzby
pryzby@telsasoft.com
In reply to: Matthias van de Meent (#25)
Re: Improvements and additions to COPY progress reporting

On Sun, Feb 21, 2021 at 08:10:09PM +0100, Matthias van de Meent wrote:

Subject: [PATCH v9 1/3] Add progress-reported components for COPY progress
reporting

/* Increment amount of processed tuples and update the progress */
/* Increment amount of processed tuples and update the progress */

Ideally, this would say "number of processed tuples"

Subject: [PATCH v9 2/3] Add backlinks to progress reporting documentation

I think these should say that they report their progress *in* the view (not
"to"):

+    Each backend running <command>ANALYZE</command> will report its progress to
+    the <structname>pg_stat_progress_analyze</structname> view. See
+    Each backend running <command>CLUSTER</command> will report its progress to
+    the <structname>pg_stat_progress_cluster</structname> view. See
+    Each backend running <command>COPY</command> will report its progress to
+    the <structname>pg_stat_progress_copy</structname> view. See
+    Each backend running <command>CREATE INDEX</command> will report its
+    progress to the <structname>pg_stat_progress_create_index</structname>
+    Each backend running <command>REINDEX</command> will report its progress
+    to the <structname>pg_stat_progress_create_index</structname> view. See

Like this one:

+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the

I'm sorry I didn't include that in last week's message. You could also write:

|"The progress of each backend running >ANALYZE< is reported in the >pg_stat_progress_analyze< view."

Looking at the existing docs:

https://www.postgresql.org/docs/devel/progress-reporting.html#COPY-PROGRESS-REPORTING
| OID of the table on which the COPY command is executed

Maybe it should say ".. is executing". Or ".. being executed":
| OID of the table on which the COPY command is being executed

--
Justin

#31Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#30)
Re: Improvements and additions to COPY progress reporting

On Wed, Feb 24, 2021 at 01:53:03AM -0600, Justin Pryzby wrote:

On Sun, Feb 21, 2021 at 08:10:09PM +0100, Matthias van de Meent wrote:
I think these should say that they report their progress *in* the view (not
"to"):

+    Each backend running <command>ANALYZE</command> will report its progress to
+    the <structname>pg_stat_progress_analyze</structname> view. See

What is proposed in the patch is:
"Each backend running <command>blah</> will report its progress to the
pg_stat_progress_blah view."

What you propose is:
"Each backend running <command>blah</> will report its progress in the
pg_stat_progress_blah view."

What pg_basebackup tells is:
"Whenever <application>pg_basebackup</application> is taking a base
backup, the server's pg_stat_progress_basebackup view will report the
progress of the backup."

Here is an extra idea:
"Whenever a backend runs <command>blah</>, the server's
pg_stat_progress_blah will report the progress of this command."
--
Michael

#32Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#31)
Re: Improvements and additions to COPY progress reporting

On Thu, Feb 25, 2021 at 1:44 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Feb 24, 2021 at 01:53:03AM -0600, Justin Pryzby wrote:

On Sun, Feb 21, 2021 at 08:10:09PM +0100, Matthias van de Meent wrote:
I think these should say that they report their progress *in* the view (not
"to"):

+    Each backend running <command>ANALYZE</command> will report its progress to
+    the <structname>pg_stat_progress_analyze</structname> view. See

What is proposed in the patch is:
"Each backend running <command>blah</> will report its progress to the
pg_stat_progress_blah view."

What you propose is:
"Each backend running <command>blah</> will report its progress in the
pg_stat_progress_blah view."

IMO, the phrasing proposed by Justin upthread looks good. It's like this:

+    Each backend running <command>ANALYZE</command> will report its progress in
+    the <structname>pg_stat_progress_analyze</structname> view. See
+    Each backend running <command>CREATE INDEX</command> will report its
+    progress in the <structname>pg_stat_progress_create_index</structname> view

...
...

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#33Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Justin Pryzby (#30)
Re: Improvements and additions to COPY progress reporting

On Wed, Feb 24, 2021 at 1:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Sun, Feb 21, 2021 at 08:10:09PM +0100, Matthias van de Meent wrote:

Subject: [PATCH v9 1/3] Add progress-reported components for COPY progress
reporting

/* Increment amount of processed tuples and update the progress */
/* Increment amount of processed tuples and update the progress */

Ideally, this would say "number of processed tuples"

Correct. It's introduced by the original COPY progress reporting
patch. Having said that, we could just correct them in the 0001 patch.

Looking at the existing docs:

https://www.postgresql.org/docs/devel/progress-reporting.html#COPY-PROGRESS-REPORTING
| OID of the table on which the COPY command is executed

Maybe it should say ".. is executing". Or ".. being executed":
| OID of the table on which the COPY command is being executed

+1 for changing it to "OID of the table on which the COPY command is
being executed."

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#34Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#32)
Re: Improvements and additions to COPY progress reporting

On Thu, Mar 04, 2021 at 12:16:17PM +0530, Bharath Rupireddy wrote:

IMO, the phrasing proposed by Justin upthread looks good. It's like this:

+    Each backend running <command>ANALYZE</command> will report its progress in
+    the <structname>pg_stat_progress_analyze</structname> view. See

No objections to just go with that. As a new patch set is needed, I
am switching the CF entry to "Waiting on Author".
--
Michael

#35Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#34)
3 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Thu, 4 Mar 2021 at 11:38, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Mar 04, 2021 at 12:16:17PM +0530, Bharath Rupireddy wrote:

IMO, the phrasing proposed by Justin upthread looks good. It's like this:

+    Each backend running <command>ANALYZE</command> will report its progress in
+    the <structname>pg_stat_progress_analyze</structname> view. See

No objections to just go with that. As a new patch set is needed, I
am switching the CF entry to "Waiting on Author".

Thanks for all your comments, and sorry for the delayed response.
Please find attached a new version of the patch set, that is rebased
and contains the requested changes:

1/3:
Docs:
- on which the COPY command is executed
+ on which the COPY command is being executed
Reworded existing commment:
- /* Increment amount of processed tuples and update the progress */
+ /* Increment the number of processed tuples, and report the progress */
2/3:
Docs:
- ... report its progress to ...
+ ... report its progress in ...
- report its progress to the >pg_stat_progress_cluster< ...
+ report its progress in the >pg_stat_progress_cluster< view ...

3/3:
No changes

I believe that that was the extent of the not-yet-resolved comments
and suggestions.

With regards,

Matthias van de Meent.

Attachments:

v11-0003-Add-copy-progress-reporting-regression-tests.patchtext/x-patch; charset=UTF-8; name=v11-0003-Add-copy-progress-reporting-regression-tests.patchDownload
From 7831549452b6d95c3db9060333750256675ff322 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v11 3/3] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 60 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 47 ++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..4f1cbc73d2 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,63 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+--
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..f3596bdd15 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,50 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+--
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

v11-0002-Add-backlinks-to-progress-reporting-documentatio.patchtext/x-patch; charset=US-ASCII; name=v11-0002-Add-backlinks-to-progress-reporting-documentatio.patchDownload
From e0bb19f4e5857cdfa9fd00c937ca802e91ba26f1 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v11 2/3] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  7 +++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 53 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..bc64843948 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running <command>ANALYZE</command> will report its progress in
+    the <structname>pg_stat_progress_analyze</structname> view. See
+    <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..ddcdc8191a 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running <command>CLUSTER</command> will report its progress in
+    the <structname>pg_stat_progress_cluster</structname> view. See
+    <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..fd9309af5f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running <command>COPY</command> will report its progress in
+    the <structname>pg_stat_progress_copy</structname> view. See
+    <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+
+ <refsect1>
+  <title>See Also</title>
+
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 965dcf472c..51b4d57939 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -867,6 +867,12 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running <command>CREATE INDEX</command> will report its
+    progress in the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -980,6 +986,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index b22d39eba9..ff4dba8c36 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -495,6 +495,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running <command>REINDEX</command> will report its progress
+    in the <structname>pg_stat_progress_create_index</structname> view. See
+    <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -553,6 +559,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 4bb624979b..fa25268de8 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -393,6 +393,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report
+    progress in the <structname>pg_stat_progress_cluster</structname> view 
+    instead. See <xref linkend="vacuum-progress-reporting"/> and
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -422,6 +431,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v11-0001-Add-progress-reported-components-for-COPY-progre.patchtext/x-patch; charset=US-ASCII; name=v11-0001-Add-progress-reported-components-for-COPY-progre.patchDownload
From c4f47ddf81acb95e465c488057aa8701f5614747 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH v11 1/3] Add progress-reported components for COPY progress
 reporting

The command, io target and excluded tuple count (by the
COPY ... FROM ... WHERE -clause) are now reported in the pg_stat_progress_copy
view. Additionally, the column lines_processed is renamed to tuples_processed
to disambiguate the meaning of this column in cases of CSV and BINARY copies
and to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of io_type, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 42 +++++++++++++++++++++++++---
 src/backend/catalog/system_views.sql | 11 +++++++-
 src/backend/commands/copyfrom.c      | 27 ++++++++++++++++--
 src/backend/commands/copyto.c        | 22 ++++++++++++---
 src/include/commands/progress.h      | 15 +++++++++-
 src/test/regress/expected/rules.out  | 15 +++++++++-
 6 files changed, 118 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..c10de3406d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6541,8 +6541,32 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
        <structfield>relid</structfield> <type>oid</type>
       </para>
       <para>
-       OID of the table on which the <command>COPY</command> command is executed.
-       It is set to 0 if copying from a <command>SELECT</command> query.
+       OID of the table on which the <command>COPY</command> command is being
+       executed. It is set to 0 if copying from a <command>SELECT</command>
+       query.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_type</structfield> <type>text</type>
+      </para>
+      <para>
+       The io type that the data is read from or written to:
+       <literal>FILE</literal>, <literal>PROGRAM</literal>,
+       <literal>STDIO</literal> (for <command>COPY FROM STDIN</command> and
+       <command>COPY TO STDOUT</command>), or <literal>CALLBACK</literal>
+       (used in the table synchronization background worker).
       </para></entry>
      </row>
 
@@ -6567,10 +6591,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fc94a73a54..d94714adb0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1130,9 +1130,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_type,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index f05e2d2347..72b71f5058 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -539,7 +539,8 @@ CopyFrom(CopyFromState cstate)
 	BulkInsertState bistate = NULL;
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
-	uint64		processed = 0;
+	int64		processed = 0;
+	int64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1193,6 +1198,16 @@ BeginCopyFrom(ParseState *pstate,
 	ExprState **defexprs;
 	MemoryContext oldcontext;
 	bool		volatile_defexprs;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE,
+		PROGRESS_COPY_BYTES_TOTAL
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_FROM,
+		0,
+		0
+	};
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
@@ -1430,11 +1445,13 @@ BeginCopyFrom(ParseState *pstate,
 
 	if (data_source_cb)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_CALLBACK;
 		cstate->copy_src = COPY_CALLBACK;
 		cstate->data_source_cb = data_source_cb;
 	}
 	else if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput == DestRemote)
 			ReceiveCopyBegin(cstate);
@@ -1447,6 +1464,7 @@ BeginCopyFrom(ParseState *pstate,
 
 		if (cstate->is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -1458,6 +1476,7 @@ BeginCopyFrom(ParseState *pstate,
 		{
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 			{
@@ -1484,10 +1503,12 @@ BeginCopyFrom(ParseState *pstate,
 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 						 errmsg("\"%s\" is a directory", cstate->filename)));
 
-			pgstat_progress_update_param(PROGRESS_COPY_BYTES_TOTAL, st.st_size);
+			progress_vals[2] = st.st_size;
 		}
 	}
 
+	pgstat_progress_update_multi_param(3, progress_cols, progress_vals);
+
 	if (cstate->opts.binary)
 	{
 		/* Read and verify binary header */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 46155015cf..1d4bdabb3a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -353,6 +353,14 @@ BeginCopyTo(ParseState *pstate,
 	TupleDesc	tupDesc;
 	int			num_phys_attrs;
 	MemoryContext oldcontext;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_TO,
+		0
+	};
 
 	if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION)
 	{
@@ -659,6 +667,8 @@ BeginCopyTo(ParseState *pstate,
 
 	if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
+
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput != DestRemote)
 			cstate->copy_file = stdout;
@@ -670,6 +680,7 @@ BeginCopyTo(ParseState *pstate,
 
 		if (is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -682,6 +693,7 @@ BeginCopyTo(ParseState *pstate,
 			mode_t		oumask; /* Pre-existing umask value */
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			/*
 			 * Prevent write to relative path ... too easy to shoot oneself in
 			 * the foot by overwriting a database file ...
@@ -731,6 +743,8 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_multi_param(2, progress_cols, progress_vals);
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -881,8 +895,8 @@ DoCopyTo(CopyToState cstate)
 			/* Format and send the data */
 			CopyOneRowTo(cstate, slot);
 
-			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			/* Increment the number of processed tuples, and report the progress */
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1251,8 +1265,8 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	/* Send the data */
 	CopyOneRowTo(cstate, slot);
 
-	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	/* Increment the number of processed tuples, and report the progress */
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..508b562d92 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,6 +136,19 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TYPE 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TYPE_FILE 1
+#define PROGRESS_COPY_IO_TYPE_PROGRAM 2
+#define PROGRESS_COPY_IO_TYPE_STDIO 3
+#define PROGRESS_COPY_IO_TYPE_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b1c9b7bdfe..ace98af739 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,9 +1950,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_type,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

#36Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Matthias van de Meent (#35)
Re: Improvements and additions to COPY progress reporting

On Thu, Mar 4, 2021 at 5:02 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Thu, 4 Mar 2021 at 11:38, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Mar 04, 2021 at 12:16:17PM +0530, Bharath Rupireddy wrote:

IMO, the phrasing proposed by Justin upthread looks good. It's like this:

+    Each backend running <command>ANALYZE</command> will report its progress in
+    the <structname>pg_stat_progress_analyze</structname> view. See

No objections to just go with that. As a new patch set is needed, I
am switching the CF entry to "Waiting on Author".

Thanks for all your comments, and sorry for the delayed response.
Please find attached a new version of the patch set, that is rebased
and contains the requested changes:

1/3:
Docs:
- on which the COPY command is executed
+ on which the COPY command is being executed
Reworded existing commment:
- /* Increment amount of processed tuples and update the progress */
+ /* Increment the number of processed tuples, and report the progress */

LGTM.

2/3:
Docs:
- ... report its progress to ...
+ ... report its progress in ...
- report its progress to the >pg_stat_progress_cluster< ...
+ report its progress in the >pg_stat_progress_cluster< view ...
+   <para>
+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report
+    progress in the <structname>pg_stat_progress_cluster</structname> view
+    instead. See <xref linkend="vacuum-progress-reporting"/> and
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>

I think a typo, missing "will" between option and report - it's "with
the <literal>FULL</literal> option will report"

Except the above typo, 0002 LGTM.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#37Josef Šimánek
josef.simanek@gmail.com
In reply to: Matthias van de Meent (#35)
Re: Improvements and additions to COPY progress reporting

čt 4. 3. 2021 v 12:32 odesílatel Matthias van de Meent
<boekewurm+postgres@gmail.com> napsal:

On Thu, 4 Mar 2021 at 11:38, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Mar 04, 2021 at 12:16:17PM +0530, Bharath Rupireddy wrote:

IMO, the phrasing proposed by Justin upthread looks good. It's like this:

+    Each backend running <command>ANALYZE</command> will report its progress in
+    the <structname>pg_stat_progress_analyze</structname> view. See

No objections to just go with that. As a new patch set is needed, I
am switching the CF entry to "Waiting on Author".

Thanks for all your comments, and sorry for the delayed response.
Please find attached a new version of the patch set, that is rebased
and contains the requested changes:

1/3:
Docs:
- on which the COPY command is executed
+ on which the COPY command is being executed
Reworded existing commment:
- /* Increment amount of processed tuples and update the progress */
+ /* Increment the number of processed tuples, and report the progress */
2/3:
Docs:
- ... report its progress to ...
+ ... report its progress in ...
- report its progress to the >pg_stat_progress_cluster< ...
+ report its progress in the >pg_stat_progress_cluster< view ...

3/3:
No changes

I believe that that was the extent of the not-yet-resolved comments
and suggestions.

LGTM, special thanks for taking over the work on initial COPY progress
regression tests.

Show quoted text

With regards,

Matthias van de Meent.

#38Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#36)
Re: Improvements and additions to COPY progress reporting

On Thu, 4 Mar 2021 at 13:36, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

+   <para>
+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report
+    progress in the <structname>pg_stat_progress_cluster</structname> view
+    instead. See <xref linkend="vacuum-progress-reporting"/> and
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>

I think a typo, missing "will" between option and report - it's "with
the <literal>FULL</literal> option will report"

"Backends running [...] report progress to [...] instead" is,
a.f.a.i.k., correct English. Adding 'will' would indeed still be
correct, but it would in my opinion also be decremental to the
readability of the section due to the repeated use of the same
template sentence structure. I think that keeping it as-is is just
fine.

With regards,

Matthias van de Meent.

#39Justin Pryzby
pryzby@telsasoft.com
In reply to: Matthias van de Meent (#38)
Re: Improvements and additions to COPY progress reporting

On Thu, Mar 04, 2021 at 05:19:18PM +0100, Matthias van de Meent wrote:

On Thu, 4 Mar 2021 at 13:36, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

+   <para>
+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM</command> with the <literal>FULL</literal> option report
+    progress in the <structname>pg_stat_progress_cluster</structname> view
+    instead. See <xref linkend="vacuum-progress-reporting"/> and
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>

I think a typo, missing "will" between option and report - it's "with
the <literal>FULL</literal> option will report"

"Backends running [...] report progress to [...] instead" is,
a.f.a.i.k., correct English. Adding 'will' would indeed still be
correct, but it would in my opinion also be decremental to the
readability of the section due to the repeated use of the same
template sentence structure. I think that keeping it as-is is just
fine.

I'd prefer to see the same thing repeated, since then it's easy to compare, for
readers, and also future doc authors. That's normal in technical documentation
to have redundancy. It's easy to read.

I'd suggest to move "instead" into the middle of the sentence,
and combine VACUUM+FULL, and add "their":

+    ... Backends running <command>VACUUM FULL</literal> will instead report
+    their progress in the <structname>pg_stat_progress_cluster</structname> view.

--
Justin

#40Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Justin Pryzby (#39)
3 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Thu, 4 Mar 2021 at 17:29, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Thu, Mar 04, 2021 at 05:19:18PM +0100, Matthias van de Meent wrote:

"Backends running [...] report progress to [...] instead" is,
a.f.a.i.k., correct English. Adding 'will' would indeed still be
correct, but it would in my opinion also be decremental to the
readability of the section due to the repeated use of the same
template sentence structure. I think that keeping it as-is is just
fine.

I'd prefer to see the same thing repeated, since then it's easy to compare, for
readers, and also future doc authors. That's normal in technical documentation
to have redundancy. It's easy to read.

I'd suggest to move "instead" into the middle of the sentence,
and combine VACUUM+FULL, and add "their":

+    ... Backends running <command>VACUUM FULL</literal> will instead report
+    their progress in the <structname>pg_stat_progress_cluster</structname> view.

Sure, I'm convinced. PFA the patchset with this change applied.

With regards,

Matthias van de Meent

Attachments:

v12-0003-Add-copy-progress-reporting-regression-tests.patchtext/x-patch; charset=UTF-8; name=v12-0003-Add-copy-progress-reporting-regression-tests.patchDownload
From af85e442b5f61e145e6214d1fbfc4269af3bc9f1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v12 3/3] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 60 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 47 ++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..4f1cbc73d2 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,63 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+--
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..f3596bdd15 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,50 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+--
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

v12-0002-Add-backlinks-to-progress-reporting-documentatio.patchtext/x-patch; charset=US-ASCII; name=v12-0002-Add-backlinks-to-progress-reporting-documentatio.patchDownload
From 9d296e608ca3c555c91b618e38f06c611efbc181 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 5 Feb 2021 23:11:50 +0100
Subject: [PATCH v12 2/3] Add backlinks to progress reporting documentation

Previously, for most progress-reported features, the only place the
feature was mentioned is in the progress reporting document itself.
This makes the progress reporting more discoverable from the reported
commands.
---
 doc/src/sgml/ref/analyze.sgml       |  7 +++++++
 doc/src/sgml/ref/cluster.sgml       |  6 ++++++
 doc/src/sgml/ref/copy.sgml          | 14 ++++++++++++++
 doc/src/sgml/ref/create_index.sgml  |  7 +++++++
 doc/src/sgml/ref/pg_basebackup.sgml |  1 +
 doc/src/sgml/ref/reindex.sgml       |  7 +++++++
 doc/src/sgml/ref/vacuum.sgml        | 11 +++++++++++
 7 files changed, 53 insertions(+)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 7d816c87c6..bc64843948 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -273,6 +273,12 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     will not record new statistics for that table.  Any existing statistics
     will be retained.
   </para>
+
+  <para>
+    Each backend running <command>ANALYZE</command> will report its progress in
+    the <structname>pg_stat_progress_analyze</structname> view. See
+    <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -291,6 +297,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 5dd21a0189..ddcdc8191a 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -192,6 +192,11 @@ CLUSTER [VERBOSE]
     are periodically reclustered.
    </para>
 
+  <para>
+    Each backend running <command>CLUSTER</command> will report its progress in
+    the <structname>pg_stat_progress_cluster</structname> view. See
+    <xref linkend="cluster-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -242,6 +247,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
 
   <simplelist type="inline">
    <member><xref linkend="app-clusterdb"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 0fca6583af..fd9309af5f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -82,6 +82,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
    specified, data is transmitted via the connection between the
    client and the server.
   </para>
+
+  <para>
+    Each backend running <command>COPY</command> will report its progress in
+    the <structname>pg_stat_progress_copy</structname> view. See
+    <xref linkend="copy-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -1052,4 +1058,12 @@ COPY [ BINARY ] <replaceable class="parameter">table_name</replaceable>
     [ WITH NULL AS '<replaceable class="parameter">null_string</replaceable>' ]
 </synopsis></para>
  </refsect1>
+
+ <refsect1>
+  <title>See Also</title>
+
+  <simplelist type="inline">
+   <member><xref linkend="copy-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 965dcf472c..51b4d57939 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -867,6 +867,12 @@ Indexes:
    will interpret it as <literal>USING gist</literal>, to simplify conversion
    of old databases to GiST.
   </para>
+
+  <para>
+    Each backend running <command>CREATE INDEX</command> will report its
+    progress in the <structname>pg_stat_progress_create_index</structname>
+    view. See <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -980,6 +986,7 @@ CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
    <member><xref linkend="sql-alterindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="sql-reindex"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 5754ad5aa6..14cc88a852 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -904,6 +904,7 @@ PostgreSQL documentation
 
   <simplelist type="inline">
    <member><xref linkend="app-pgdump"/></member>
+   <member><xref linkend="basebackup-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index b22d39eba9..ff4dba8c36 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -495,6 +495,12 @@ Indexes:
     is reindexed concurrently, those indexes will be skipped.  (It is possible
     to reindex such indexes without the <command>CONCURRENTLY</command> option.)
    </para>
+
+  <para>
+    Each backend running <command>REINDEX</command> will report its progress
+    in the <structname>pg_stat_progress_create_index</structname> view. See
+    <xref linkend="create-index-progress-reporting"/> for details.
+  </para>
   </refsect2>
  </refsect1>
 
@@ -553,6 +559,7 @@ REINDEX TABLE CONCURRENTLY my_broken_table;
    <member><xref linkend="sql-createindex"/></member>
    <member><xref linkend="sql-dropindex"/></member>
    <member><xref linkend="app-reindexdb"/></member>
+   <member><xref linkend="create-index-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 4bb624979b..bb1c0f8fb6 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -393,6 +393,15 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     information about automatic and manual vacuuming, see
     <xref linkend="routine-vacuuming"/>.
    </para>
+   <para>
+    Each backend running <command>VACUUM</command> without the
+    <literal>FULL</literal> option will report its progress in the
+    <structname>pg_stat_progress_vacuum</structname> view. Backends running
+    <command>VACUUM FULL</command> will instead report their progress in the
+    <structname>pg_stat_progress_cluster</structname> view. See 
+    <xref linkend="vacuum-progress-reporting"/> and
+    <xref linkend="cluster-progress-reporting"/> for details.
+   </para>
  </refsect1>
 
  <refsect1>
@@ -422,6 +431,8 @@ VACUUM (VERBOSE, ANALYZE) onek;
    <member><xref linkend="app-vacuumdb"/></member>
    <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
    <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="vacuum-progress-reporting"/></member>
+   <member><xref linkend="cluster-progress-reporting"/></member>
   </simplelist>
  </refsect1>
 </refentry>
-- 
2.20.1

v12-0001-Add-progress-reported-components-for-COPY-progre.patchtext/x-patch; charset=US-ASCII; name=v12-0001-Add-progress-reported-components-for-COPY-progre.patchDownload
From c4f47ddf81acb95e465c488057aa8701f5614747 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH v12 1/3] Add progress-reported components for COPY progress
 reporting

The command, io target and excluded tuple count (by the
COPY ... FROM ... WHERE -clause) are now reported in the pg_stat_progress_copy
view. Additionally, the column lines_processed is renamed to tuples_processed
to disambiguate the meaning of this column in cases of CSV and BINARY copies
and to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of io_type, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 42 +++++++++++++++++++++++++---
 src/backend/catalog/system_views.sql | 11 +++++++-
 src/backend/commands/copyfrom.c      | 27 ++++++++++++++++--
 src/backend/commands/copyto.c        | 22 ++++++++++++---
 src/include/commands/progress.h      | 15 +++++++++-
 src/test/regress/expected/rules.out  | 15 +++++++++-
 6 files changed, 118 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..c10de3406d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6541,8 +6541,32 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
        <structfield>relid</structfield> <type>oid</type>
       </para>
       <para>
-       OID of the table on which the <command>COPY</command> command is executed.
-       It is set to 0 if copying from a <command>SELECT</command> query.
+       OID of the table on which the <command>COPY</command> command is being
+       executed. It is set to 0 if copying from a <command>SELECT</command>
+       query.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_type</structfield> <type>text</type>
+      </para>
+      <para>
+       The io type that the data is read from or written to:
+       <literal>FILE</literal>, <literal>PROGRAM</literal>,
+       <literal>STDIO</literal> (for <command>COPY FROM STDIN</command> and
+       <command>COPY TO STDOUT</command>), or <literal>CALLBACK</literal>
+       (used in the table synchronization background worker).
       </para></entry>
      </row>
 
@@ -6567,10 +6591,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fc94a73a54..d94714adb0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1130,9 +1130,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_type,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index f05e2d2347..72b71f5058 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -539,7 +539,8 @@ CopyFrom(CopyFromState cstate)
 	BulkInsertState bistate = NULL;
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
-	uint64		processed = 0;
+	int64		processed = 0;
+	int64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1193,6 +1198,16 @@ BeginCopyFrom(ParseState *pstate,
 	ExprState **defexprs;
 	MemoryContext oldcontext;
 	bool		volatile_defexprs;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE,
+		PROGRESS_COPY_BYTES_TOTAL
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_FROM,
+		0,
+		0
+	};
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
@@ -1430,11 +1445,13 @@ BeginCopyFrom(ParseState *pstate,
 
 	if (data_source_cb)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_CALLBACK;
 		cstate->copy_src = COPY_CALLBACK;
 		cstate->data_source_cb = data_source_cb;
 	}
 	else if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput == DestRemote)
 			ReceiveCopyBegin(cstate);
@@ -1447,6 +1464,7 @@ BeginCopyFrom(ParseState *pstate,
 
 		if (cstate->is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -1458,6 +1476,7 @@ BeginCopyFrom(ParseState *pstate,
 		{
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 			{
@@ -1484,10 +1503,12 @@ BeginCopyFrom(ParseState *pstate,
 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 						 errmsg("\"%s\" is a directory", cstate->filename)));
 
-			pgstat_progress_update_param(PROGRESS_COPY_BYTES_TOTAL, st.st_size);
+			progress_vals[2] = st.st_size;
 		}
 	}
 
+	pgstat_progress_update_multi_param(3, progress_cols, progress_vals);
+
 	if (cstate->opts.binary)
 	{
 		/* Read and verify binary header */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 46155015cf..1d4bdabb3a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -353,6 +353,14 @@ BeginCopyTo(ParseState *pstate,
 	TupleDesc	tupDesc;
 	int			num_phys_attrs;
 	MemoryContext oldcontext;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_TO,
+		0
+	};
 
 	if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION)
 	{
@@ -659,6 +667,8 @@ BeginCopyTo(ParseState *pstate,
 
 	if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
+
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput != DestRemote)
 			cstate->copy_file = stdout;
@@ -670,6 +680,7 @@ BeginCopyTo(ParseState *pstate,
 
 		if (is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -682,6 +693,7 @@ BeginCopyTo(ParseState *pstate,
 			mode_t		oumask; /* Pre-existing umask value */
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			/*
 			 * Prevent write to relative path ... too easy to shoot oneself in
 			 * the foot by overwriting a database file ...
@@ -731,6 +743,8 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_multi_param(2, progress_cols, progress_vals);
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -881,8 +895,8 @@ DoCopyTo(CopyToState cstate)
 			/* Format and send the data */
 			CopyOneRowTo(cstate, slot);
 
-			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			/* Increment the number of processed tuples, and report the progress */
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1251,8 +1265,8 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	/* Send the data */
 	CopyOneRowTo(cstate, slot);
 
-	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	/* Increment the number of processed tuples, and report the progress */
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..508b562d92 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,6 +136,19 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TYPE 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TYPE_FILE 1
+#define PROGRESS_COPY_IO_TYPE_PROGRAM 2
+#define PROGRESS_COPY_IO_TYPE_STDIO 3
+#define PROGRESS_COPY_IO_TYPE_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b1c9b7bdfe..ace98af739 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,9 +1950,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_type,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

#41Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#40)
Re: Improvements and additions to COPY progress reporting

On Thu, Mar 04, 2021 at 05:45:50PM +0100, Matthias van de Meent wrote:

Sure, I'm convinced. PFA the patchset with this change applied.

0002 looks fine to me, and in line with the discussion, so applied.
--
Michael

#42Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#41)
2 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Fri, Mar 5, 2021 at 11:30 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Mar 04, 2021 at 05:45:50PM +0100, Matthias van de Meent wrote:

Sure, I'm convinced. PFA the patchset with this change applied.

0002 looks fine to me, and in line with the discussion, so applied.

Thanks.

Attaching remaining patches 0001 and 0003 from the v11 patch
set(posted upthread) here to make cfbot happier.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v11-0001-Add-progress-reported-components-for-COPY-progre.patchapplication/octet-stream; name=v11-0001-Add-progress-reported-components-for-COPY-progre.patchDownload
From c4f47ddf81acb95e465c488057aa8701f5614747 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH v11 1/3] Add progress-reported components for COPY progress
 reporting

The command, io target and excluded tuple count (by the
COPY ... FROM ... WHERE -clause) are now reported in the pg_stat_progress_copy
view. Additionally, the column lines_processed is renamed to tuples_processed
to disambiguate the meaning of this column in cases of CSV and BINARY copies
and to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of io_type, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 42 +++++++++++++++++++++++++---
 src/backend/catalog/system_views.sql | 11 +++++++-
 src/backend/commands/copyfrom.c      | 27 ++++++++++++++++--
 src/backend/commands/copyto.c        | 22 ++++++++++++---
 src/include/commands/progress.h      | 15 +++++++++-
 src/test/regress/expected/rules.out  | 15 +++++++++-
 6 files changed, 118 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..c10de3406d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6541,8 +6541,32 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
        <structfield>relid</structfield> <type>oid</type>
       </para>
       <para>
-       OID of the table on which the <command>COPY</command> command is executed.
-       It is set to 0 if copying from a <command>SELECT</command> query.
+       OID of the table on which the <command>COPY</command> command is being
+       executed. It is set to 0 if copying from a <command>SELECT</command>
+       query.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>io_type</structfield> <type>text</type>
+      </para>
+      <para>
+       The io type that the data is read from or written to:
+       <literal>FILE</literal>, <literal>PROGRAM</literal>,
+       <literal>STDIO</literal> (for <command>COPY FROM STDIN</command> and
+       <command>COPY TO STDOUT</command>), or <literal>CALLBACK</literal>
+       (used in the table synchronization background worker).
       </para></entry>
      </row>
 
@@ -6567,10 +6591,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fc94a73a54..d94714adb0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1130,9 +1130,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'STDIO'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS io_type,
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index f05e2d2347..72b71f5058 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -539,7 +539,8 @@ CopyFrom(CopyFromState cstate)
 	BulkInsertState bistate = NULL;
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
-	uint64		processed = 0;
+	int64		processed = 0;
+	int64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1193,6 +1198,16 @@ BeginCopyFrom(ParseState *pstate,
 	ExprState **defexprs;
 	MemoryContext oldcontext;
 	bool		volatile_defexprs;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE,
+		PROGRESS_COPY_BYTES_TOTAL
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_FROM,
+		0,
+		0
+	};
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
@@ -1430,11 +1445,13 @@ BeginCopyFrom(ParseState *pstate,
 
 	if (data_source_cb)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_CALLBACK;
 		cstate->copy_src = COPY_CALLBACK;
 		cstate->data_source_cb = data_source_cb;
 	}
 	else if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput == DestRemote)
 			ReceiveCopyBegin(cstate);
@@ -1447,6 +1464,7 @@ BeginCopyFrom(ParseState *pstate,
 
 		if (cstate->is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -1458,6 +1476,7 @@ BeginCopyFrom(ParseState *pstate,
 		{
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 			{
@@ -1484,10 +1503,12 @@ BeginCopyFrom(ParseState *pstate,
 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 						 errmsg("\"%s\" is a directory", cstate->filename)));
 
-			pgstat_progress_update_param(PROGRESS_COPY_BYTES_TOTAL, st.st_size);
+			progress_vals[2] = st.st_size;
 		}
 	}
 
+	pgstat_progress_update_multi_param(3, progress_cols, progress_vals);
+
 	if (cstate->opts.binary)
 	{
 		/* Read and verify binary header */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 46155015cf..1d4bdabb3a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -353,6 +353,14 @@ BeginCopyTo(ParseState *pstate,
 	TupleDesc	tupDesc;
 	int			num_phys_attrs;
 	MemoryContext oldcontext;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_IO_TYPE
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_TO,
+		0
+	};
 
 	if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION)
 	{
@@ -659,6 +667,8 @@ BeginCopyTo(ParseState *pstate,
 
 	if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_IO_TYPE_STDIO;
+
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput != DestRemote)
 			cstate->copy_file = stdout;
@@ -670,6 +680,7 @@ BeginCopyTo(ParseState *pstate,
 
 		if (is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -682,6 +693,7 @@ BeginCopyTo(ParseState *pstate,
 			mode_t		oumask; /* Pre-existing umask value */
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_IO_TYPE_FILE;
 			/*
 			 * Prevent write to relative path ... too easy to shoot oneself in
 			 * the foot by overwriting a database file ...
@@ -731,6 +743,8 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_multi_param(2, progress_cols, progress_vals);
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -881,8 +895,8 @@ DoCopyTo(CopyToState cstate)
 			/* Format and send the data */
 			CopyOneRowTo(cstate, slot);
 
-			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			/* Increment the number of processed tuples, and report the progress */
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1251,8 +1265,8 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	/* Send the data */
 	CopyOneRowTo(cstate, slot);
 
-	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	/* Increment the number of processed tuples, and report the progress */
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..508b562d92 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -136,6 +136,19 @@
 /* Commands of PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_IO_TYPE 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_INOUT_TYPE */
+#define PROGRESS_COPY_IO_TYPE_FILE 1
+#define PROGRESS_COPY_IO_TYPE_PROGRAM 2
+#define PROGRESS_COPY_IO_TYPE_STDIO 3
+#define PROGRESS_COPY_IO_TYPE_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b1c9b7bdfe..ace98af739 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,9 +1950,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'STDIO'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS io_type,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v11-0003-Add-copy-progress-reporting-regression-tests.patchapplication/octet-stream; name=v11-0003-Add-copy-progress-reporting-regression-tests.patchDownload
From 7831549452b6d95c3db9060333750256675ff322 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v11 3/3] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 60 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 47 ++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..4f1cbc73d2 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,63 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+--
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..f3596bdd15 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,50 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+--
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "STDIO", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"command": "COPY FROM", "datname": "regression", "io_type": "FILE", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

#43Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#42)
Re: Improvements and additions to COPY progress reporting

On Sun, Mar 07, 2021 at 04:50:31PM +0530, Bharath Rupireddy wrote:

Attaching remaining patches 0001 and 0003 from the v11 patch
set(posted upthread) here to make cfbot happier.

Looking at patch 0002, the location of each progress report looks good
to me. I have some issues with some of the names chosen though, so I
would like to suggest a few changes to simplify things:
- PROGRESS_COPY_IO_TYPE_* => PROGRESS_COPY_TYPE_*
- PROGRESS_COPY_IO_TYPE => PROGRESS_COPY_TYPE
- PROGRESS_COPY_TYPE_STDIO => PROGRESS_COPY_TYPE_PIPE
- In pg_stat_progress_copy, io_type => type

It seems a bit confusing to not count insertions on foreign tables
where nothing happened. I am fine to live with that, but can I ask if
this has been thought about?
--
Michael

#44Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#43)
3 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Mon, 8 Mar 2021 at 09:24, Michael Paquier <michael@paquier.xyz> wrote:

On Sun, Mar 07, 2021 at 04:50:31PM +0530, Bharath Rupireddy wrote:

Attaching remaining patches 0001 and 0003 from the v11 patch
set(posted upthread) here to make cfbot happier.

Looking at patch 0002, the location of each progress report looks good
to me. I have some issues with some of the names chosen though, so I
would like to suggest a few changes to simplify things:
- PROGRESS_COPY_IO_TYPE_* => PROGRESS_COPY_TYPE_*
- PROGRESS_COPY_IO_TYPE => PROGRESS_COPY_TYPE
- PROGRESS_COPY_TYPE_STDIO => PROGRESS_COPY_TYPE_PIPE
- In pg_stat_progress_copy, io_type => type

Seems reasonable. PFA updated patches. I've renamed the previous 0003
to 0002 to keep git-format-patch easy.

It seems a bit confusing to not count insertions on foreign tables
where nothing happened. I am fine to live with that, but can I ask if
this has been thought about?

This is keeping current behaviour of the implementation as committed
with 8a4f618e, with the rationale of that patch being that this number
should mirror the number returned by the copy command.

I am not opposed to adding another column for `tuples_inserted` and
changing the logic accordingly (see prototype 0003), but that was not
in the intended scope of this patchset. Unless you think that this
should be included in this current patchset, I'll spin that patch out
into a different thread, but I'm not sure that would make it into
pg14.

With regards,

Matthias van de Meent.

Attachments:

v12-0003-Adapt-COPY-progress-reporting-to-report-processe.txttext/plain; charset=US-ASCII; name=v12-0003-Adapt-COPY-progress-reporting-to-report-processe.txtDownload
From 94876abe0ab9c28a6f4b0ac006f356251ca4746c Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Mon, 8 Mar 2021 16:08:24 +0100
Subject: [PATCH v12 3/3] Adapt COPY progress reporting to report processed and
 inserted tuples

Previously, tuples_processed was implied to be the amount of tuples inserted
in COPY ... FROM. This code is now changed to make tuples_processed count
all tuples that were identified in the COPY ... FROM command, and make
tuples_inserted count all tuples that were inserted into the table (that is,
it excludes the tuples excluded using the WHERE clause, and those that were
not inserted due to triggers, or failure to insert into foreign tables).

This also renumbers the columns to be back in-order, before the stamping of
pg14 makes those numbers effectively immutable
---
 doc/src/sgml/monitoring.sgml         | 30 ++++++++++++++++++++++++
 src/backend/catalog/system_views.sql | 13 ++++++-----
 src/backend/commands/copyfrom.c      | 34 ++++++++++++++++++++++++----
 src/include/commands/progress.h      | 13 ++++++-----
 src/test/regress/expected/rules.out  | 13 ++++++-----
 src/test/regress/output/copy.source  |  4 ++--
 6 files changed, 83 insertions(+), 24 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 32bebc70db..aa2e15a748 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6595,6 +6595,24 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       </para>
       <para>
        Number of tuples already processed by <command>COPY</command> command.
+       For <command>COPY FROM</command> this includes all tuples from the
+       source material (including tuples excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command>
+       command, and tuples that were not inserted as a result of trigger
+       behaviour). For <command>COPY TO</command> this includes all tuples
+       that exist in the table, or those that are returned by the
+       <command>SELECT</command>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_inserted</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples inserted into the table with the
+       <command>COPY FROM</command> command. Is <literal>0</literal> for
+       <command>COPY TO</command>.
       </para></entry>
      </row>
 
@@ -6610,6 +6628,18 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
     </tbody>
    </tgroup>
   </table>
+  <note>
+   <para>
+    The <literal>tuples_excluded</literal> column does not count the tuples
+    that failed to insert, but only those tuples that were available in the
+    source material and not eligible for insertion through exclusion by the
+    <command>WHERE</command> clause. You can calculate the number of tuples
+    that failed to insert into the table due to triggers (or that otherwise
+    silently fail to insert) by subtracting both
+    <literal>tuples_excluded</literal> and <literal>tuples_inserted</literal>
+    from <literal>tuples_processed</literal> instead.
+   </para>
+  </note>
  </sect2>
 
  </sect1>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8f9987f311..f59de36742 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1130,18 +1130,19 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
-        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+        CASE S.param1 WHEN 1 THEN 'COPY FROM'
                       WHEN 2 THEN 'COPY TO'
                       END AS command,
-        CASE S.param6 WHEN 1 THEN 'FILE'
+        CASE S.param2 WHEN 1 THEN 'FILE'
                       WHEN 2 THEN 'PROGRAM'
                       WHEN 3 THEN 'PIPE'
                       WHEN 4 THEN 'CALLBACK'
                       END AS "type",
-        S.param1 AS bytes_processed,
-        S.param2 AS bytes_total,
-        S.param3 AS tuples_processed,
-        S.param4 AS tuples_excluded
+        S.param3 AS bytes_processed,
+        S.param4 AS bytes_total,
+        S.param5 AS tuples_processed,
+        S.param6 AS tuples_inserted,
+        S.param7 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 8856bf215b..0aae5ae71b 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -540,6 +540,7 @@ CopyFrom(CopyFromState cstate)
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
 	int64		processed = 0;
+	int64		inserted = 0;
 	int64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
@@ -854,6 +855,9 @@ CopyFrom(CopyFromState cstate)
 		if (!NextCopyFrom(cstate, econtext, myslot->tts_values, myslot->tts_isnull))
 			break;
 
+		/* We're starting processing the next tuple, so update processed */
+		++processed;
+
 		ExecStoreVirtualTuple(myslot);
 
 		/*
@@ -871,8 +875,15 @@ CopyFrom(CopyFromState cstate)
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
 			{
-				/* Report that this tuple was filtered out by the WHERE clause */
-				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
+				const int progress_params[] = {
+					PROGRESS_COPY_TUPLES_PROCESSED,
+					PROGRESS_COPY_TUPLES_EXCLUDED
+				};
+				const int64 progress_vals[] = {
+					processed,
+					++excluded
+				};
+				pgstat_progress_update_multi_param(2, progress_params, progress_vals);
 				continue;
 			}
 		}
@@ -1001,8 +1012,18 @@ CopyFrom(CopyFromState cstate)
 				skip_tuple = true;	/* "do nothing" */
 		}
 
-		if (!skip_tuple)
+		if (skip_tuple)
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, processed);
+		else
 		{
+			const int progress_params[] = {
+				PROGRESS_COPY_TUPLES_PROCESSED,
+				PROGRESS_COPY_TUPLES_INSERTED
+			};
+			int64 progress_vals[] = {
+				processed,
+				0
+			};
 			/*
 			 * If there is an INSTEAD OF INSERT ROW trigger, let it handle the
 			 * tuple.  Otherwise, proceed with inserting the tuple into the
@@ -1073,7 +1094,11 @@ CopyFrom(CopyFromState cstate)
 																				 NULL);
 
 						if (myslot == NULL) /* "do nothing" */
+						{
+							/* `processed` was updated at the top of this iteration; report the progress */
+							pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, processed); 
 							continue;	/* next tuple please */
+						}
 
 						/*
 						 * AFTER ROW Triggers might reference the tableoid
@@ -1112,7 +1137,8 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
+			progress_vals[1] = ++inserted;
+			pgstat_progress_update_multi_param(2, progress_params, progress_vals);
 		}
 	}
 
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7884a11383..6fa78a5405 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -134,12 +134,13 @@
 #define PROGRESS_BASEBACKUP_PHASE_TRANSFER_WAL			5
 
 /* Progress parameters for PROGRESS_COPY */
-#define PROGRESS_COPY_BYTES_PROCESSED 0
-#define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_TUPLES_PROCESSED 2
-#define PROGRESS_COPY_TUPLES_EXCLUDED 3
-#define PROGRESS_COPY_COMMAND 4
-#define PROGRESS_COPY_TYPE 5
+#define PROGRESS_COPY_COMMAND 0
+#define PROGRESS_COPY_TYPE 1
+#define PROGRESS_COPY_BYTES_PROCESSED 2
+#define PROGRESS_COPY_BYTES_TOTAL 3
+#define PROGRESS_COPY_TUPLES_PROCESSED 4
+#define PROGRESS_COPY_TUPLES_INSERTED 5
+#define PROGRESS_COPY_TUPLES_EXCLUDED 6
 
 /* Commands of PROGRESS_COPY_COMMAND */
 #define PROGRESS_COPY_COMMAND_FROM 1
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 02f5e7c905..8558fec72a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,22 +1950,23 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
-        CASE s.param5
+        CASE s.param1
             WHEN 1 THEN 'COPY FROM'::text
             WHEN 2 THEN 'COPY TO'::text
             ELSE NULL::text
         END AS command,
-        CASE s.param6
+        CASE s.param2
             WHEN 1 THEN 'FILE'::text
             WHEN 2 THEN 'PROGRAM'::text
             WHEN 3 THEN 'PIPE'::text
             WHEN 4 THEN 'CALLBACK'::text
             ELSE NULL::text
         END AS type,
-    s.param1 AS bytes_processed,
-    s.param2 AS bytes_total,
-    s.param3 AS tuples_processed,
-    s.param4 AS tuples_excluded
+    s.param3 AS bytes_processed,
+    s.param4 AS bytes_total,
+    s.param5 AS tuples_processed,
+    s.param6 AS tuples_inserted,
+    s.param7 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 6819c1334c..eb3f38ea9e 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -203,11 +203,11 @@ create trigger check_after_progress_reporting
 	execute function notice_after_progress_reporting();
 -- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
 copy progress_reporting from stdin;
-INFO:  progress: {"type": "PIPE", "command": "COPY FROM", "datname": "regression", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+INFO:  progress: {"type": "PIPE", "command": "COPY FROM", "datname": "regression", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_inserted": 3, "tuples_processed": 3}
 -- reporting of FILE imports, and correct reporting of tuples-excluded
 copy progress_reporting from '@abs_srcdir@/data/emp.data'
 	where (salary < 2000);
-INFO:  progress: {"type": "FILE", "command": "COPY FROM", "datname": "regression", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+INFO:  progress: {"type": "FILE", "command": "COPY FROM", "datname": "regression", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_inserted": 2, "tuples_processed": 3}
 -- cleanup progress_reporting
 drop trigger check_after_progress_reporting on progress_reporting;
 drop function notice_after_progress_reporting();
-- 
2.20.1

v12-0001-Additional-progress-reported-components-for-COPY.patchtext/x-patch; charset=US-ASCII; name=v12-0001-Additional-progress-reported-components-for-COPY.patchDownload
From 9add598d220ca7fd14c86d90cb93daf890436896 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm@gmail.com>
Date: Fri, 12 Feb 2021 14:06:44 +0100
Subject: [PATCH v12 1/3] Additional progress-reported components for COPY
 progress reporting

The command, io target and excluded tuple count (by the
COPY ... FROM ... WHERE -clause) are now reported in the pg_stat_progress_copy
view. Additionally, the column lines_processed is renamed to tuples_processed
to disambiguate the meaning of this column in cases of CSV and BINARY copies
and to stay consistent with regards to names in the pg_stat_progress_*-views.

Of special interest is the reporting of the copy type, with which we can
distinguish logical replications' initial table synchronization workers'
progress without having to join the pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml         | 42 +++++++++++++++++++++++++---
 src/backend/catalog/system_views.sql | 11 +++++++-
 src/backend/commands/copyfrom.c      | 27 ++++++++++++++++--
 src/backend/commands/copyto.c        | 22 ++++++++++++---
 src/include/commands/progress.h      | 17 +++++++++--
 src/test/regress/expected/rules.out  | 15 +++++++++-
 6 files changed, 119 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..32bebc70db 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6541,8 +6541,32 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
        <structfield>relid</structfield> <type>oid</type>
       </para>
       <para>
-       OID of the table on which the <command>COPY</command> command is executed.
-       It is set to 0 if copying from a <command>SELECT</command> query.
+       OID of the table on which the <command>COPY</command> command is being
+       executed. It is set to 0 if copying from a <command>SELECT</command>
+       query.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>command</structfield> <type>text</type>
+      </para>
+      <para>
+       The command that is running: <literal>COPY FROM</literal>, or
+       <literal>COPY TO</literal>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>type</structfield> <type>text</type>
+      </para>
+      <para>
+       The io type that the data is read from or written to:
+       <literal>FILE</literal>, <literal>PROGRAM</literal>,
+       <literal>PIPE</literal> (for <command>COPY FROM STDIN</command> and
+       <command>COPY TO STDOUT</command>), or <literal>CALLBACK</literal>
+       (used in the table synchronization background worker).
       </para></entry>
      </row>
 
@@ -6567,10 +6591,20 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>lines_processed</structfield> <type>bigint</type>
+       <structfield>tuples_processed</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of tuples already processed by <command>COPY</command> command.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_excluded</structfield> <type>bigint</type>
       </para>
       <para>
-       Number of lines already processed by <command>COPY</command> command.
+       Number of tuples not processed because they were excluded by the
+       <command>WHERE</command> clause of the <command>COPY</command> command.
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fc94a73a54..8f9987f311 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1130,9 +1130,18 @@ CREATE VIEW pg_stat_progress_copy AS
     SELECT
         S.pid AS pid, S.datid AS datid, D.datname AS datname,
         S.relid AS relid,
+        CASE S.param5 WHEN 1 THEN 'COPY FROM'
+                      WHEN 2 THEN 'COPY TO'
+                      END AS command,
+        CASE S.param6 WHEN 1 THEN 'FILE'
+                      WHEN 2 THEN 'PROGRAM'
+                      WHEN 3 THEN 'PIPE'
+                      WHEN 4 THEN 'CALLBACK'
+                      END AS "type",
         S.param1 AS bytes_processed,
         S.param2 AS bytes_total,
-        S.param3 AS lines_processed
+        S.param3 AS tuples_processed,
+        S.param4 AS tuples_excluded
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index f05e2d2347..8856bf215b 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -539,7 +539,8 @@ CopyFrom(CopyFromState cstate)
 	BulkInsertState bistate = NULL;
 	CopyInsertMethod insertMethod;
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
-	uint64		processed = 0;
+	int64		processed = 0;
+	int64		excluded = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -869,7 +870,11 @@ CopyFrom(CopyFromState cstate)
 			econtext->ecxt_scantuple = myslot;
 			/* Skip items that don't match COPY's WHERE clause */
 			if (!ExecQual(cstate->qualexpr, econtext))
+			{
+				/* Report that this tuple was filtered out by the WHERE clause */
+				pgstat_progress_update_param(PROGRESS_COPY_TUPLES_EXCLUDED, ++excluded);
 				continue;
+			}
 		}
 
 		/* Determine the partition to insert the tuple into */
@@ -1107,7 +1112,7 @@ CopyFrom(CopyFromState cstate)
 			 * for counting tuples inserted by an INSERT command. Update
 			 * progress of the COPY command as well.
 			 */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 	}
 
@@ -1193,6 +1198,16 @@ BeginCopyFrom(ParseState *pstate,
 	ExprState **defexprs;
 	MemoryContext oldcontext;
 	bool		volatile_defexprs;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_TYPE,
+		PROGRESS_COPY_BYTES_TOTAL
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_FROM,
+		0,
+		0
+	};
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
@@ -1430,11 +1445,13 @@ BeginCopyFrom(ParseState *pstate,
 
 	if (data_source_cb)
 	{
+		progress_vals[1] = PROGRESS_COPY_TYPE_CALLBACK;
 		cstate->copy_src = COPY_CALLBACK;
 		cstate->data_source_cb = data_source_cb;
 	}
 	else if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_TYPE_PIPE;
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput == DestRemote)
 			ReceiveCopyBegin(cstate);
@@ -1447,6 +1464,7 @@ BeginCopyFrom(ParseState *pstate,
 
 		if (cstate->is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -1458,6 +1476,7 @@ BeginCopyFrom(ParseState *pstate,
 		{
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_TYPE_FILE;
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
 			if (cstate->copy_file == NULL)
 			{
@@ -1484,10 +1503,12 @@ BeginCopyFrom(ParseState *pstate,
 						(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 						 errmsg("\"%s\" is a directory", cstate->filename)));
 
-			pgstat_progress_update_param(PROGRESS_COPY_BYTES_TOTAL, st.st_size);
+			progress_vals[2] = st.st_size;
 		}
 	}
 
+	pgstat_progress_update_multi_param(3, progress_cols, progress_vals);
+
 	if (cstate->opts.binary)
 	{
 		/* Read and verify binary header */
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 46155015cf..3b5a20a065 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -353,6 +353,14 @@ BeginCopyTo(ParseState *pstate,
 	TupleDesc	tupDesc;
 	int			num_phys_attrs;
 	MemoryContext oldcontext;
+	const int	progress_cols[] = {
+		PROGRESS_COPY_COMMAND,
+		PROGRESS_COPY_TYPE
+	};
+	int64		progress_vals[] = {
+		PROGRESS_COPY_COMMAND_TO,
+		0
+	};
 
 	if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION)
 	{
@@ -659,6 +667,8 @@ BeginCopyTo(ParseState *pstate,
 
 	if (pipe)
 	{
+		progress_vals[1] = PROGRESS_COPY_TYPE_PIPE;
+
 		Assert(!is_program);	/* the grammar does not allow this */
 		if (whereToSendOutput != DestRemote)
 			cstate->copy_file = stdout;
@@ -670,6 +680,7 @@ BeginCopyTo(ParseState *pstate,
 
 		if (is_program)
 		{
+			progress_vals[1] = PROGRESS_COPY_TYPE_PROGRAM;
 			cstate->copy_file = OpenPipeStream(cstate->filename, PG_BINARY_W);
 			if (cstate->copy_file == NULL)
 				ereport(ERROR,
@@ -682,6 +693,7 @@ BeginCopyTo(ParseState *pstate,
 			mode_t		oumask; /* Pre-existing umask value */
 			struct stat st;
 
+			progress_vals[1] = PROGRESS_COPY_TYPE_FILE;
 			/*
 			 * Prevent write to relative path ... too easy to shoot oneself in
 			 * the foot by overwriting a database file ...
@@ -731,6 +743,8 @@ BeginCopyTo(ParseState *pstate,
 	/* initialize progress */
 	pgstat_progress_start_command(PROGRESS_COMMAND_COPY,
 								  cstate->rel ? RelationGetRelid(cstate->rel) : InvalidOid);
+	pgstat_progress_update_multi_param(2, progress_cols, progress_vals);
+
 	cstate->bytes_processed = 0;
 
 	MemoryContextSwitchTo(oldcontext);
@@ -881,8 +895,8 @@ DoCopyTo(CopyToState cstate)
 			/* Format and send the data */
 			CopyOneRowTo(cstate, slot);
 
-			/* Increment amount of processed tuples and update the progress */
-			pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++processed);
+			/* Increment the number of processed tuples, and report the progress */
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++processed);
 		}
 
 		ExecDropSingleTupleTableSlot(slot);
@@ -1251,8 +1265,8 @@ copy_dest_receive(TupleTableSlot *slot, DestReceiver *self)
 	/* Send the data */
 	CopyOneRowTo(cstate, slot);
 
-	/* Increment amount of processed tuples and update the progress */
-	pgstat_progress_update_param(PROGRESS_COPY_LINES_PROCESSED, ++myState->processed);
+	/* Increment the number of processed tuples, and report the progress */
+	pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, ++myState->processed);
 
 	return true;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 95ec5d02e9..7884a11383 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -133,9 +133,22 @@
 #define PROGRESS_BASEBACKUP_PHASE_WAIT_WAL_ARCHIVE		4
 #define PROGRESS_BASEBACKUP_PHASE_TRANSFER_WAL			5
 
-/* Commands of PROGRESS_COPY */
+/* Progress parameters for PROGRESS_COPY */
 #define PROGRESS_COPY_BYTES_PROCESSED 0
 #define PROGRESS_COPY_BYTES_TOTAL 1
-#define PROGRESS_COPY_LINES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_PROCESSED 2
+#define PROGRESS_COPY_TUPLES_EXCLUDED 3
+#define PROGRESS_COPY_COMMAND 4
+#define PROGRESS_COPY_TYPE 5
+
+/* Commands of PROGRESS_COPY_COMMAND */
+#define PROGRESS_COPY_COMMAND_FROM 1
+#define PROGRESS_COPY_COMMAND_TO 2
+
+/* Types of PROGRESS_COPY_TYPE */
+#define PROGRESS_COPY_TYPE_FILE 1
+#define PROGRESS_COPY_TYPE_PROGRAM 2
+#define PROGRESS_COPY_TYPE_PIPE 3
+#define PROGRESS_COPY_TYPE_CALLBACK 4
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b1c9b7bdfe..02f5e7c905 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1950,9 +1950,22 @@ pg_stat_progress_copy| SELECT s.pid,
     s.datid,
     d.datname,
     s.relid,
+        CASE s.param5
+            WHEN 1 THEN 'COPY FROM'::text
+            WHEN 2 THEN 'COPY TO'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param6
+            WHEN 1 THEN 'FILE'::text
+            WHEN 2 THEN 'PROGRAM'::text
+            WHEN 3 THEN 'PIPE'::text
+            WHEN 4 THEN 'CALLBACK'::text
+            ELSE NULL::text
+        END AS type,
     s.param1 AS bytes_processed,
     s.param2 AS bytes_total,
-    s.param3 AS lines_processed
+    s.param3 AS tuples_processed,
+    s.param4 AS tuples_excluded
    FROM (pg_stat_get_progress_info('COPY'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_create_index| SELECT s.pid,
-- 
2.20.1

v12-0002-Add-copy-progress-reporting-regression-tests.patchtext/x-patch; charset=UTF-8; name=v12-0002-Add-copy-progress-reporting-regression-tests.patchDownload
From c26e13408c97dfd469f753a7341730f543ca0084 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.simanek@gmail.com>
Date: Tue, 9 Feb 2021 13:06:45 +0100
Subject: [PATCH v12 2/3] Add copy progress reporting regression tests.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This tests some basic features of copy progress reporting.

Sadly, we can only request one snapshot of progress_reporting each
transaction / statement, so we can't (easily) get intermediate results without
each result requiring a seperate statement being run.

Author: Josef Šimánek, Matthias van de Meent
---
 src/test/regress/input/copy.source  | 60 +++++++++++++++++++++++++++++
 src/test/regress/output/copy.source | 47 ++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..4f1cbc73d2 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,63 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- progress reporting
+--
+
+-- setup
+-- reuse employer datatype, that has a small sized data set
+
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+
+copy progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+-- cleanup progress_reporting
+
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..6819c1334c 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,50 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- progress reporting
+--
+-- setup
+-- reuse employer datatype, that has a small sized data set
+create table progress_reporting (
+	name 		text,
+	age			int4,
+	location 	point,
+	salary 		int4,
+	manager 	name
+);
+-- Add a trigger to 'raise info' the contents of pg_stat_progress_copy. This
+-- allows us to test and verify the contents of this view, which would
+-- otherwise require a concurrent session checking that table.
+create function notice_after_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- We cannot expect 'pid' nor 'relid' to be consistent over runs due to
+	-- variance in system process ids, and concurrency in runs of tests.
+	-- Additionally, due to the usage of this test in pg_regress, the 'datid'
+	-- also is not consistent between runs.
+	select into report (to_jsonb(r) - '{pid,relid,datid}'::text[]) as value
+		from pg_stat_progress_copy r
+		where pid = pg_backend_pid();
+
+	raise info 'progress: %', report.value::text;
+
+	RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+create trigger check_after_progress_reporting
+	after insert on progress_reporting
+	for each statement
+	execute function notice_after_progress_reporting();
+-- reporting of STDIO imports, and correct bytes-processed/tuples-processed reporting
+copy progress_reporting from stdin;
+INFO:  progress: {"type": "PIPE", "command": "COPY FROM", "datname": "regression", "bytes_total": 0, "bytes_processed": 79, "tuples_excluded": 0, "tuples_processed": 3}
+-- reporting of FILE imports, and correct reporting of tuples-excluded
+copy progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"type": "FILE", "command": "COPY FROM", "datname": "regression", "bytes_total": 79, "bytes_processed": 79, "tuples_excluded": 1, "tuples_processed": 2}
+-- cleanup progress_reporting
+drop trigger check_after_progress_reporting on progress_reporting;
+drop function notice_after_progress_reporting();
+drop table progress_reporting;
-- 
2.20.1

#45Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#44)
Re: Improvements and additions to COPY progress reporting

On Mon, Mar 08, 2021 at 05:33:40PM +0100, Matthias van de Meent wrote:

Seems reasonable. PFA updated patches. I've renamed the previous 0003
to 0002 to keep git-format-patch easy.

Thanks for updating the patch. 0001 has been applied, after tweaking
a bit comments, indentation and the docs.

This is keeping current behaviour of the implementation as committed
with 8a4f618e, with the rationale of that patch being that this number
should mirror the number returned by the copy command.

I am not opposed to adding another column for `tuples_inserted` and
changing the logic accordingly (see prototype 0003), but that was not
in the intended scope of this patchset. Unless you think that this
should be included in this current patchset, I'll spin that patch out
into a different thread, but I'm not sure that would make it into
pg14.

Okay, point taken. If there is demand for it in the future, we could
extend the existing set of columns. After thinking more about it the
usecase if not completely clear to me from a monitoring point of
view.

I have not looked at 0002 in details yet, but I am wondering first if
the size estimations in the expected output are actually portable.
Second, I doubt a bit that the extra cycles spent on that are actually
worth the coverage, even if the trick with an AFTER INSERT trigger is
interesting.
--
Michael

#46Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#45)
Re: Improvements and additions to COPY progress reporting

On Tue, Mar 9, 2021 at 11:04 AM Michael Paquier <michael@paquier.xyz> wrote:

This is keeping current behaviour of the implementation as committed
with 8a4f618e, with the rationale of that patch being that this number
should mirror the number returned by the copy command.

I am not opposed to adding another column for `tuples_inserted` and
changing the logic accordingly (see prototype 0003), but that was not
in the intended scope of this patchset. Unless you think that this
should be included in this current patchset, I'll spin that patch out
into a different thread, but I'm not sure that would make it into
pg14.

Okay, point taken. If there is demand for it in the future, we could
extend the existing set of columns. After thinking more about it the
usecase if not completely clear to me from a monitoring point of
view.

I think, for now, having better comments on what's included and what's
not in the existing tuples_processed column would help.

I have not looked at 0002 in details yet, but I am wondering first if
the size estimations in the expected output are actually portable.
Second, I doubt a bit that the extra cycles spent on that are actually
worth the coverage, even if the trick with an AFTER INSERT trigger is
interesting.

I was having the same doubt [1]/messages/by-id/CALj2ACUyNREth8f3M7wXrHVeycfnqBn5pVygYOoBVs=ifo8V4A@mail.gmail.com, please have a look at it and the few
threads that follow. IMO, we can have the tests without the
"bytes_total" column if we think that it's not stable across all OS
platforms. I think those tests don't take much time to run and AFAIK
it's the first progress report command to have tests because others
such as analyse, vacuum, cluster, base backup don't have.

[1]: /messages/by-id/CALj2ACUyNREth8f3M7wXrHVeycfnqBn5pVygYOoBVs=ifo8V4A@mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#47Josef Šimánek
josef.simanek@gmail.com
In reply to: Michael Paquier (#45)
Re: Improvements and additions to COPY progress reporting

út 9. 3. 2021 v 6:34 odesílatel Michael Paquier <michael@paquier.xyz> napsal:

On Mon, Mar 08, 2021 at 05:33:40PM +0100, Matthias van de Meent wrote:

Seems reasonable. PFA updated patches. I've renamed the previous 0003
to 0002 to keep git-format-patch easy.

Thanks for updating the patch. 0001 has been applied, after tweaking
a bit comments, indentation and the docs.

This is keeping current behaviour of the implementation as committed
with 8a4f618e, with the rationale of that patch being that this number
should mirror the number returned by the copy command.

I am not opposed to adding another column for `tuples_inserted` and
changing the logic accordingly (see prototype 0003), but that was not
in the intended scope of this patchset. Unless you think that this
should be included in this current patchset, I'll spin that patch out
into a different thread, but I'm not sure that would make it into
pg14.

Okay, point taken. If there is demand for it in the future, we could
extend the existing set of columns. After thinking more about it the
usecase if not completely clear to me from a monitoring point of
view.

I have not looked at 0002 in details yet, but I am wondering first if
the size estimations in the expected output are actually portable.
Second, I doubt a bit that the extra cycles spent on that are actually
worth the coverage, even if the trick with an AFTER INSERT trigger is
interesting.

Those extra cycles are in there to cover at least parts of the COPY
progress from being accidentally broken. I have seen various patches
modifying COPY command being currently in progress. It would be nice
to ensure at least basic functionality works well in an automated way.
On my machine there is no huge overhead added by adding those tests
(they finish almost instantly).

Show quoted text

--
Michael

#48Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#45)
Re: Improvements and additions to COPY progress reporting

On Tue, 9 Mar 2021 at 06:34, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Mar 08, 2021 at 05:33:40PM +0100, Matthias van de Meent wrote:

Seems reasonable. PFA updated patches. I've renamed the previous 0003
to 0002 to keep git-format-patch easy.

Thanks for updating the patch. 0001 has been applied, after tweaking
a bit comments, indentation and the docs.

Thanks!

This is keeping current behaviour of the implementation as committed
with 8a4f618e, with the rationale of that patch being that this number
should mirror the number returned by the copy command.

I am not opposed to adding another column for `tuples_inserted` and
changing the logic accordingly (see prototype 0003), but that was not
in the intended scope of this patchset. Unless you think that this
should be included in this current patchset, I'll spin that patch out
into a different thread, but I'm not sure that would make it into
pg14.

Okay, point taken. If there is demand for it in the future, we could
extend the existing set of columns. After thinking more about it the
usecase if not completely clear to me from a monitoring point of
view.

I have not looked at 0002 in details yet, but I am wondering first if
the size estimations in the expected output are actually portable.
Second, I doubt a bit that the extra cycles spent on that are actually
worth the coverage, even if the trick with an AFTER INSERT trigger is
interesting.

There are examples in which pg_stat_progress_* -views report
inaccurate data. I think it is fairly reasonable to at least validate
some part of the progress reporting, as it is one of the few methods
for administrators to look at the state of currently running
administrative tasks, and as such, this user-visible api should be
validated.

With regards,

Matthias van de Meent

#49Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#48)
1 attachment(s)
Re: Improvements and additions to COPY progress reporting

On Wed, Mar 10, 2021 at 09:35:10AM +0100, Matthias van de Meent wrote:

There are examples in which pg_stat_progress_* -views report
inaccurate data. I think it is fairly reasonable to at least validate
some part of the progress reporting, as it is one of the few methods
for administrators to look at the state of currently running
administrative tasks, and as such, this user-visible api should be
validated.

Looking closer at 0002, the size numbers are actually incorrect on
Windows for the second query. The CRLFs at the end of each line of
emp.data add three bytes to the report of COPY FROM, so this finishes
with 82 bytes for bytes_total and bytes_processed instead of 79.
Let's make this useful but simpler here, so I propose to check that
the counters are higher than zero instead of an exact number. Let's
also add the relation name relid::regclass while on it.

The tests introduced are rather limited, but you are right that
something is better than nothing here, and I have slightly updated
what the tests sent previously as per the attached. What do you
think?
--
Michael

Attachments:

copy-from-report-tests-13.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..9e4766898d 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -201,3 +201,65 @@ select * from parted_copytest where b = 1;
 select * from parted_copytest where b = 2;
 
 drop table parted_copytest;
+
+--
+-- Progress reporting for COPY
+--
+create table tab_progress_reporting (
+	name text,
+	age int4,
+	location point,
+	salary int4,
+	manager name
+);
+
+-- Add a trigger to catch and print the contents of the catalog view
+-- pg_stat_progress_copy during data insertion.  This allows to test
+-- the validation of some progress reports for COPY FROM where the trigger
+-- would fire.
+create function notice_after_tab_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- The fields ignored here are the ones that may not remain
+	-- consistent across multiple runs.  The sizes reported may differ
+	-- across platforms, so just check if these are strictly positive.
+	with progress_data as (
+	  select
+	     relid::regclass::text as relname,
+	     command,
+	     type,
+	     bytes_processed > 0 as has_bytes_processed,
+	     bytes_total > 0 as has_bytes_total,
+	     tuples_processed,
+	     tuples_excluded
+	    from pg_stat_progress_copy
+	    where pid = pg_backend_pid())
+	select into report (to_jsonb(r)) as value
+		from progress_data r;
+
+	raise info 'progress: %', report.value::text;
+	return new;
+end;
+$$ language plpgsql;
+
+create trigger check_after_tab_progress_reporting
+	after insert on tab_progress_reporting
+	for each statement
+	execute function notice_after_tab_progress_reporting();
+
+-- Generate COPY FROM report with PIPE.
+copy tab_progress_reporting from stdin;
+sharon	25	(15,12)	1000	sam
+sam	30	(10,5)	2000	bill
+bill	20	(11,10)	1000	sharon
+\.
+
+-- Generate COPY FROM report with FILE, with some excluded tuples.
+truncate tab_progress_reporting;
+copy tab_progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+
+drop trigger check_after_tab_progress_reporting on tab_progress_reporting;
+drop function notice_after_tab_progress_reporting();
+drop table tab_progress_reporting;
diff --git a/src/test/regress/output/copy.source b/src/test/regress/output/copy.source
index 938d3551da..8cf37dc257 100644
--- a/src/test/regress/output/copy.source
+++ b/src/test/regress/output/copy.source
@@ -165,3 +165,57 @@ select * from parted_copytest where b = 2;
 (1 row)
 
 drop table parted_copytest;
+--
+-- Progress reporting for COPY
+--
+create table tab_progress_reporting (
+	name text,
+	age int4,
+	location point,
+	salary int4,
+	manager name
+);
+-- Add a trigger to catch and print the contents of the catalog view
+-- pg_stat_progress_copy during data insertion.  This allows to test
+-- the validation of some progress reports for COPY FROM where the trigger
+-- would fire.
+create function notice_after_tab_progress_reporting() returns trigger AS
+$$
+declare report record;
+begin
+	-- The fields ignored here are the ones that may not remain
+	-- consistent across multiple runs.  The sizes reported may differ
+	-- across platforms, so just check if these are strictly positive.
+	with progress_data as (
+	  select
+	     relid::regclass::text as relname,
+	     command,
+	     type,
+	     bytes_processed > 0 as has_bytes_processed,
+	     bytes_total > 0 as has_bytes_total,
+	     tuples_processed,
+	     tuples_excluded
+	    from pg_stat_progress_copy
+	    where pid = pg_backend_pid())
+	select into report (to_jsonb(r)) as value
+		from progress_data r;
+
+	raise info 'progress: %', report.value::text;
+	return new;
+end;
+$$ language plpgsql;
+create trigger check_after_tab_progress_reporting
+	after insert on tab_progress_reporting
+	for each statement
+	execute function notice_after_tab_progress_reporting();
+-- Generate COPY FROM report with PIPE.
+copy tab_progress_reporting from stdin;
+INFO:  progress: {"type": "PIPE", "command": "COPY FROM", "relname": "tab_progress_reporting", "has_bytes_total": false, "tuples_excluded": 0, "tuples_processed": 3, "has_bytes_processed": true}
+-- Generate COPY FROM report with FILE, with some excluded tuples.
+truncate tab_progress_reporting;
+copy tab_progress_reporting from '@abs_srcdir@/data/emp.data'
+	where (salary < 2000);
+INFO:  progress: {"type": "FILE", "command": "COPY FROM", "relname": "tab_progress_reporting", "has_bytes_total": true, "tuples_excluded": 1, "tuples_processed": 2, "has_bytes_processed": true}
+drop trigger check_after_tab_progress_reporting on tab_progress_reporting;
+drop function notice_after_tab_progress_reporting();
+drop table tab_progress_reporting;
#50Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#49)
Re: Improvements and additions to COPY progress reporting

On Mon, 15 Mar 2021 at 05:53, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Mar 10, 2021 at 09:35:10AM +0100, Matthias van de Meent wrote:

There are examples in which pg_stat_progress_* -views report
inaccurate data. I think it is fairly reasonable to at least validate
some part of the progress reporting, as it is one of the few methods
for administrators to look at the state of currently running
administrative tasks, and as such, this user-visible api should be
validated.

Looking closer at 0002, the size numbers are actually incorrect on
Windows for the second query. The CRLFs at the end of each line of
emp.data add three bytes to the report of COPY FROM, so this finishes
with 82 bytes for bytes_total and bytes_processed instead of 79.

Hmm, does CFBot not run checkout on windows with crlf line endings? I
had expected it to do as such.

Let's make this useful but simpler here, so I propose to check that
the counters are higher than zero instead of an exact number. Let's
also add the relation name relid::regclass while on it.

+1, I hadn't thought of casting relid to its regclass to get a stable
identifier.

The tests introduced are rather limited, but you are right that
something is better than nothing here, and I have slightly updated
what the tests sent previously as per the attached. What do you
think?

That seems great, thanks for picking this up.

With regards,

Matthias van de Meent

#51Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#50)
Re: Improvements and additions to COPY progress reporting

On Mon, Mar 15, 2021 at 12:43:40PM +0100, Matthias van de Meent wrote:

Hmm, does CFBot not run checkout on windows with crlf line endings? I
had expected it to do as such.

This is environment-sensitive, so I am not surprised that Appveyor
changes the way newlines are handled there. I could see the
difference by running the tests manually on Windows command prompt for
example.

That seems great, thanks for picking this up.

Okay. Applied, then.
--
Michael