CLUSTER command progress monitor

Started by Tatsuro Yamada over 8 years ago, 112 messages

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

1 attachment(s)

Hi,

The following is a proposal for reporting the progress of the CLUSTER command.

It seems that the following could be the phases of CLUSTER processing:

1. scanning heap
2. sort tuples
3. writing new heap
4. scan heap and write new heap
5. swapping relation files
6. rebuild index
7. performing final cleanup

These phases are based on Rahila's presentation at PGCon 2017
(https://www.pgcon.org/2017/schedule/attachments/472_Progress%20Measurement%20PostgreSQL.pdf),
to which I added some phases.

The CLUSTER command may use either an Index Scan or a Seq Scan when scanning
the heap. Depending on which one is chosen, the command will proceed through
the following sequence of phases:

* Seq Scan
1. scanning heap
2. sort tuples
3. writing new heap
5. swapping relation files
6. rebuild index
7. performing final cleanup

* Index Scan
4. scan heap and write new heap
5. swapping relation files
6. rebuild index
7. performing final cleanup
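
To make the two sequences above concrete, here is a minimal C sketch that encodes them as lookup tables. The numeric values mirror the PROGRESS_CLUSTER_PHASE_* and PROGRESS_CLUSTER_METHOD_* constants in the attached patch; the enum names and the expected_phases() helper are hypothetical, for illustration only, and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Phase codes; values follow the PROGRESS_CLUSTER_PHASE_* constants in the patch. */
enum cluster_phase {
    PHASE_INITIALIZING = 0,
    PHASE_SCAN_HEAP = 1,
    PHASE_SORT_TUPLES = 2,
    PHASE_WRITE_NEW_HEAP = 3,
    PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP = 4,
    PHASE_SWAP_REL_FILES = 5,
    PHASE_REBUILD_INDEX = 6,
    PHASE_FINAL_CLEANUP = 7
};

/* Scan methods; values follow PROGRESS_CLUSTER_METHOD_* in the patch. */
enum scan_method {
    METHOD_INDEX_SCAN = 0,
    METHOD_SEQ_SCAN = 1
};

/* Seq Scan path: scan, sort, write, then the common tail. */
static const enum cluster_phase seq_scan_phases[] = {
    PHASE_SCAN_HEAP,
    PHASE_SORT_TUPLES,
    PHASE_WRITE_NEW_HEAP,
    PHASE_SWAP_REL_FILES,
    PHASE_REBUILD_INDEX,
    PHASE_FINAL_CLEANUP
};

/* Index Scan path: scanning and writing happen in one combined phase. */
static const enum cluster_phase index_scan_phases[] = {
    PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP,
    PHASE_SWAP_REL_FILES,
    PHASE_REBUILD_INDEX,
    PHASE_FINAL_CLEANUP
};

/*
 * Hypothetical helper: return the expected phase sequence for a scan
 * method and store its length in *count.
 */
static const enum cluster_phase *
expected_phases(enum scan_method method, size_t *count)
{
    assert(count != NULL);

    if (method == METHOD_SEQ_SCAN)
    {
        *count = sizeof(seq_scan_phases) / sizeof(seq_scan_phases[0]);
        return seq_scan_phases;
    }

    *count = sizeof(index_scan_phases) / sizeof(index_scan_phases[0]);
    return index_scan_phases;
}
```

A monitoring tool could use a table like this to tell how many phases remain for a given scan_method; both paths share the last three phases.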

I have a couple of questions.

* Should CLUSTER and VACUUM FULL have separate views? Or should both be covered
by the same view with some indication of which command (CLUSTER or VACUUM FULL)
is actually running?
I ask because this progress monitor could cover not only the CLUSTER command but
also the VACUUM FULL command.

* I chose tuples as the counter for the heap scan (heap_tuples_scanned) since it's
not easy to get the current block number from an Index Scan. Is that OK?
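
As a rough illustration of how a client might consume the tuple counters, here is a minimal C sketch that turns heap_tuples_total and heap_tuples_scanned into a percentage. The function name cluster_scan_percent() is hypothetical and not part of the patch. The sketch assumes the total comes from pg_class.reltuples, as in the attached patch, so it is only an estimate: it may be zero or smaller than the number of tuples actually scanned, and the result is clamped for that reason.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage of heap tuples processed so far.
 *
 * heap_tuples_total is assumed to be an estimate (reltuples), so the
 * ratio is clamped to 100 and a missing estimate yields 0.
 */
static double
cluster_scan_percent(double heap_tuples_total, double heap_tuples_scanned)
{
    double pct;

    assert(heap_tuples_scanned >= 0);

    /* No usable estimate: report 0 rather than dividing by zero. */
    if (heap_tuples_total <= 0)
        return 0.0;

    pct = 100.0 * heap_tuples_scanned / heap_tuples_total;

    /* The estimate can be stale, so scanned may exceed total. */
    return (pct > 100.0) ? 100.0 : pct;
}
```

Because the counter is reset when the "writing new heap" phase starts, a percentage computed this way is relative to the current phase rather than to the whole command.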

I'll add this patch to CF2017-09.
Any comments or suggestions are welcome.

Regards,
Tatsuro Yamada
NTT Open Source Software Center

Attachments:

progress_monitor_cluster_v1.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 5575c2c..18fe2c6 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -332,6 +332,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3229,9 +3237,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the commands
+   which support progress reporting are <command>VACUUM</> and <command>CLUSTER</>.
+   This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3423,6 +3431,157 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   </table>
 
  </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   one row for each backend that is currently clustering. 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</></entry>
+     <entry><type>integer</></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</></entry>
+     <entry><type>name</></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>phase</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Current processing phase of cluster.  See <xref linkend='cluster-phases'>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_method</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Scan method of table: index scan/seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_index_relid</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       OID of the index.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_total</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Total number of heap tuples in the table.  This number is reported
+       as of the beginning of the scan; tuples added later will not be (and
+       need not be) visited by this <command>CLUSTER</>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>scanning heap</>,
+       <literal>writing new heap</> or <literal>scan heap and write new heap</>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently scanning heap from the table by
+       seq scan. This phase is shown when the <structfield>scan_method</> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently sorting tuples. 
+       This phase is shown when the <structfield>scan_method</> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scan heap and write new heap</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently scanning heap from the table and
+       writing new clustered heap.  This phase is shown when the <structfield>scan_method</> is
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</> is rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index dc40cde..c10c830 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -899,6 +899,30 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'scanning heap'
+                      WHEN 2 THEN 'sorting tuples'
+                      WHEN 3 THEN 'writing new heap'
+                      WHEN 4 THEN 'scan heap and write new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        CASE S.param2 WHEN 0 THEN 'index scan'
+                      WHEN 1 THEN 'seq scan'
+                      END AS scan_method,
+        S.param3 AS scan_index_relid,
+        S.param4 AS heap_tuples_total,
+        S.param5 AS heap_tuples_scanned
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 48f1e6e..8f2a473 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -34,10 +34,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/planner.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
 void
 cluster(ClusterStmt *stmt, bool isTopLevel)
 {
+
 	if (stmt->relation != NULL)
 	{
 		/* This is the single-relation case. */
@@ -276,6 +279,11 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
 	if (!OldHeap)
 		return;
 
+	/* Start progress monitor for cluster command */
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+	/* Set indexOid to column */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_INDEX_RELID, indexOid);
+
 	/*
 	 * Since we may open a new transaction for each relation, we have to check
 	 * that the relation still is what we think it is.
@@ -404,6 +412,8 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does heap_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -771,6 +781,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	else
 		OldIndex = NULL;
 
+	/* Set reltuples to total_tuples */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES, OldHeap->rd_rel->reltuples);
+
 	/*
 	 * Their tuple descriptors should be exactly alike, but here we only need
 	 * assume that they have the same number of columns.
@@ -902,12 +915,16 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_INDEX_SCAN);
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_SEQ_SCAN);
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
 	}
@@ -1039,6 +1056,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1052,8 +1072,15 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1064,10 +1091,13 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1480,6 +1510,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1514,6 +1547,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1529,6 +1566,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 20ce48b..90bde85 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -467,6 +467,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9472ecc..28ccf38 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,24 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_PHASE					0
+#define PROGRESS_CLUSTER_SCAN_METHOD			1
+#define PROGRESS_CLUSTER_SCAN_INDEX_RELID		2
+#define PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES	  	3
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	4
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP						1
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES						2
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP					3
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP		4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES					5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX					6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP					7
+
+/* Scan methods of cluster */
+#define PROGRESS_CLUSTER_METHOD_INDEX_SCAN		0
+#define PROGRESS_CLUSTER_METHOD_SEQ_SCAN		1
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index cb05d9b..1c6d5c7 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -915,7 +915,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d582bc9..cacece5 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1841,6 +1841,31 @@ pg_stat_progress_vacuum| SELECT s.pid,
     s.param7 AS num_dead_tuples
    FROM (pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_cluster| SELECT
+    s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'scanning heap'::text
+            WHEN 2 THEN 'sorting tuples'::text
+            WHEN 3 THEN 'writing new heap'::text
+            WHEN 4 THEN 'scan heap and write new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE S.param2
+            WHEN 0 THEN 'index scan'
+            WHEN 1 THEN 'seq scan'
+            END AS scan_method,
+    s.param3 AS index_relid,
+    s.param4 AS heap_blks_total,
+    s.param5 AS heap_blks_scanned
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_replication| SELECT s.pid,
     s.usesysid,
     u.rolname AS usename,

Thomas Munro

thomas.munro@enterprisedb.com

over 8 years ago

In reply to: Tatsuro Yamada (#1)

Re: CLUSTER command progress monitor

On Thu, Aug 31, 2017 at 2:12 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Any comments or suggestion are welcome.

Although this patch updates src/test/regress/expected/rules.out I
think perhaps you included the wrong version? That regression test
fails for me

--
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Thomas Munro (#2)

Re: CLUSTER command progress monitor

Hi Thomas,

Any comments or suggestion are welcome.

Although this patch updates src/test/regress/expected/rules.out I
think perhaps you included the wrong version? That regression test
fails for me

Thanks for the comment.

I use the patch on 7b69b6ce and it's fine.
Did you use "initdb" command after "make install"?
The pg_stat_progress_cluster view is created in initdb, probably.

Regards,
Tatsuro Yamada

Masahiko Sawada

sawada.mshk@gmail.com

over 8 years ago

In reply to: Tatsuro Yamada (#3)

Re: CLUSTER command progress monitor

On Fri, Sep 1, 2017 at 3:38 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Hi Thomas,

Any comments or suggestion are welcome.

Although this patch updates src/test/regress/expected/rules.out I
think perhaps you included the wrong version? That regression test
fails for me

Thanks for the comment.

I use the patch on 7b69b6ce and it's fine.
Did you use "initdb" command after "make install"?
The pg_stat_progress_cluster view is created in initdb, probably.

I also got a regression test error (applied to abe85ef). Here is
the regression.diff file.

*** /home/masahiko/source/postgresql/src/test/regress/expected/rules.out	2017-09-01 17:27:33.680055612 -0700
--- /home/masahiko/source/postgresql/src/test/regress/results/rules.out	2017-09-01 17:28:10.410055596 -0700
***************
*** 1819,1824 ****
--- 1819,1849 ----
      pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
      pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
     FROM pg_database d;
+ pg_stat_progress_cluster| SELECT s.pid,
+     s.datid,
+     d.datname,
+     s.relid,
+         CASE s.param1
+             WHEN 0 THEN 'initializing'::text
+             WHEN 1 THEN 'scanning heap'::text
+             WHEN 2 THEN 'sorting tuples'::text
+             WHEN 3 THEN 'writing new heap'::text
+             WHEN 4 THEN 'scan heap and write new heap'::text
+             WHEN 5 THEN 'swapping relation files'::text
+             WHEN 6 THEN 'rebuilding index'::text
+             WHEN 7 THEN 'performing final cleanup'::text
+             ELSE NULL::text
+         END AS phase,
+         CASE s.param2
+             WHEN 0 THEN 'index scan'::text
+             WHEN 1 THEN 'seq scan'::text
+             ELSE NULL::text
+         END AS scan_method,
+     s.param3 AS scan_index_relid,
+     s.param4 AS heap_tuples_total,
+     s.param5 AS heap_tuples_scanned
+    FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
  pg_stat_progress_vacuum| SELECT s.pid,
      s.datid,
      d.datname,
***************
*** 1841,1871 ****
      s.param7 AS num_dead_tuples
     FROM (pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
       LEFT JOIN pg_database d ON ((s.datid = d.oid)));
- pg_stat_progress_cluster| SELECT
-     s.pid,
-     s.datid,
-     d.datname,
-     s.relid,
-         CASE s.param1
-             WHEN 0 THEN 'initializing'::text
-             WHEN 1 THEN 'scanning heap'::text
-             WHEN 2 THEN 'sorting tuples'::text
-             WHEN 3 THEN 'writing new heap'::text
-             WHEN 4 THEN 'scan heap and write new heap'::text
-             WHEN 5 THEN 'swapping relation files'::text
-             WHEN 6 THEN 'rebuilding index'::text
-             WHEN 7 THEN 'performing final cleanup'::text
-             ELSE NULL::text
-         END AS phase,
-         CASE S.param2
-             WHEN 0 THEN 'index scan'
-             WHEN 1 THEN 'seq scan'
-             END AS scan_method,
-     s.param3 AS index_relid,
-     s.param4 AS heap_blks_total,
-     s.param5 AS heap_blks_scanned
-    FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5)
-      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
  pg_stat_replication| SELECT s.pid,
      s.usesysid,
      u.rolname AS usename,
--- 1866,1871 ----

======================================================================

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Masahiko Sawada (#4)

1 attachment(s)

Re: CLUSTER command progress monitor

Hi Sawada-san, Thomas,

Thanks for sharing the regression.diff.
I realized that Thomas's comment was right.

The attached patch is the fixed version.
Could you try it?

Regards,

Tatsuro Yamada
NTT Open Source Software Center

Attachments:

progress_monitor_cluster_v2.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 38bf636..35a5c63 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -332,6 +332,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3233,9 +3241,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the commands
+   which support progress reporting are <command>VACUUM</> and <command>CLUSTER</>.
+   This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3427,6 +3435,157 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   </table>
 
  </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   one row for each backend that is currently clustering. 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</></entry>
+     <entry><type>integer</></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</></entry>
+     <entry><type>name</></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>phase</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Current processing phase of cluster.  See <xref linkend='cluster-phases'>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_method</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Scan method of table: index scan/seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_index_relid</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       OID of the index.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_total</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Total number of heap tuples in the table.  This number is reported
+       as of the beginning of the scan; tuples added later will not be (and
+       need not be) visited by this <command>CLUSTER</>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>scanning heap</>,
+       <literal>writing new heap</> or <literal>scan heap and write new heap</>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently scanning heap from the table by
+       seq scan. This phase is shown when the <structfield>scan_method</> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently sorting tuples. 
+       This phase is shown when the <structfield>scan_method</> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scan heap and write new heap</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently scanning heap from the table and
+       writing new clustered heap.  This phase is shown when the <structfield>scan_method</> is
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</> is rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index dc40cde..c10c830 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -899,6 +899,30 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'scanning heap'
+                      WHEN 2 THEN 'sorting tuples'
+                      WHEN 3 THEN 'writing new heap'
+                      WHEN 4 THEN 'scan heap and write new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        CASE S.param2 WHEN 0 THEN 'index scan'
+                      WHEN 1 THEN 'seq scan'
+                      END AS scan_method,
+        S.param3 AS scan_index_relid,
+        S.param4 AS heap_tuples_total,
+        S.param5 AS heap_tuples_scanned
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 48f1e6e..8f2a473 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -34,10 +34,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/planner.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
 void
 cluster(ClusterStmt *stmt, bool isTopLevel)
 {
+
 	if (stmt->relation != NULL)
 	{
 		/* This is the single-relation case. */
@@ -276,6 +279,11 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
 	if (!OldHeap)
 		return;
 
+	/* Start progress monitor for cluster command */
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+	/* Set indexOid to column */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_INDEX_RELID, indexOid);
+
 	/*
 	 * Since we may open a new transaction for each relation, we have to check
 	 * that the relation still is what we think it is.
@@ -404,6 +412,8 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does heap_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -771,6 +781,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	else
 		OldIndex = NULL;
 
+	/* Set reltuples to total_tuples */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES, OldHeap->rd_rel->reltuples);
+
 	/*
 	 * Their tuple descriptors should be exactly alike, but here we only need
 	 * assume that they have the same number of columns.
@@ -902,12 +915,16 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_INDEX_SCAN);
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_SEQ_SCAN);
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
 	}
@@ -1039,6 +1056,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1052,8 +1072,15 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1064,10 +1091,13 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1480,6 +1510,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1514,6 +1547,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1529,6 +1566,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 20ce48b..90bde85 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -467,6 +467,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if(pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9472ecc..28ccf38 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,24 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_PHASE					0
+#define PROGRESS_CLUSTER_SCAN_METHOD			1
+#define PROGRESS_CLUSTER_SCAN_INDEX_RELID		2
+#define PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES	  	3
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	4
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP						1
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES						2
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP					3
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP		4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES					5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX					6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP					7
+
+/* Scan methods of cluster */
+#define PROGRESS_CLUSTER_METHOD_INDEX_SCAN		0
+#define PROGRESS_CLUSTER_METHOD_SEQ_SCAN		1
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 57ac5d4..1c8dd67 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -916,7 +916,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d582bc9..ede9242 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1819,6 +1819,31 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'scanning heap'::text
+            WHEN 2 THEN 'sorting tuples'::text
+            WHEN 3 THEN 'writing new heap'::text
+            WHEN 4 THEN 'scan heap and write new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN 0 THEN 'index scan'::text
+            WHEN 1 THEN 'seq scan'::text
+            ELSE NULL::text
+        END AS scan_method,
+    s.param3 AS scan_index_relid,
+    s.param4 AS heap_tuples_total,
+    s.param5 AS heap_tuples_scanned
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

Masahiko Sawada

sawada.mshk@gmail.com

over 8 years ago

In reply to: Tatsuro Yamada (#5)

Re: CLUSTER command progress monitor

On Mon, Sep 4, 2017 at 11:37 AM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Hi Sawada-san, Thomas,

Thanks for sharing the regression.diff.
I realized Thomas's comment is right.

The attached patch is the fixed version.
Could you try it?

Yeah, in my environment the regression test passed. Thanks.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Masahiko Sawada (#6)

Re: CLUSTER command progress monitor

Hi Sawada-san,

Thanks for taking the time.
I'll be more careful.

Regards,
Tatsuro Yamada


Michael Paquier

michael.paquier@gmail.com

over 8 years ago

In reply to: Tatsuro Yamada (#1)

Re: CLUSTER command progress monitor

On Thu, Aug 31, 2017 at 11:12 AM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Then I have questions.

* Should we have separate views for them? Or should both be covered by the
same view with some indication of which command (CLUSTER or VACUUM FULL)
is actually running?

Using the same view for both, with a column telling whether VACUUM FULL or
CLUSTER is running, would be better IMO. A name more generic than
pg_stat_progress_cluster may be better, though, if the view covers
VACUUM FULL as well: the user-facing documentation does not say that
VACUUM FULL is actually CLUSTER.

I'll add this patch to CF2017-09.
Any comments or suggestions are welcome.

Nice to see that you are taking the time to implement patches for
upstream, Yamada-san!
--
Michael


Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Michael Paquier (#8)

Re: CLUSTER command progress monitor

On 2017/09/04 15:38, Michael Paquier wrote:

On Thu, Aug 31, 2017 at 11:12 AM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Then I have questions.

* Should we have separate views for them? Or should both be covered by the
same view with some indication of which command (CLUSTER or VACUUM FULL)
is actually running?

Using the same view for both, with a column telling whether VACUUM FULL or
CLUSTER is running, would be better IMO. A name more generic than
pg_stat_progress_cluster may be better, though, if the view covers
VACUUM FULL as well: the user-facing documentation does not say that
VACUUM FULL is actually CLUSTER.

Thanks for sharing your thoughts.
Agreed.
I'll add a new column such as "command" to tell whether CLUSTER or VACUUM FULL is running.
And how about this new view name?
- pg_stat_progress_reorg
Is it a more general name than the previous one, given that it covers both commands?
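
As a rough illustration only (not code from any posted patch), the value of such a "command" column could be derived from whether an index OID was supplied, in the same way cluster_rel() already chooses between "CLUSTER" and "VACUUM" for its CheckTableNotInUse() message. The helper name progress_command_label and the local Oid stand-ins below are hypothetical.

```c
/* Stand-ins for PostgreSQL's Oid type and InvalidOid (which is 0). */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/*
 * Hypothetical helper, not part of any posted patch: pick the label the
 * "command" column would show.  Assumption: CLUSTER always runs with a
 * valid index OID, while VACUUM FULL reaches the same code without one.
 */
static const char *
progress_command_label(Oid indexOid)
{
	return (indexOid != InvalidOid) ? "CLUSTER" : "VACUUM FULL";
}
```

Calling progress_command_label(InvalidOid) yields the VACUUM FULL label; any non-zero OID yields CLUSTER.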

I'll add this patch to CF2017-09.
Any comments or suggestions are welcome.

Nice to see that you are taking the time to implement patches for
upstream, Yamada-san!

Same here. :)
I'd like to contribute features that are useful for DBAs and users.
In my experience as a DBA, progress monitoring is important.
I'd be happy if you could lend a hand.

Thanks,
Tatsuro Yamada


#10

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Tatsuro Yamada (#9)

1 attachment(s)

Re: CLUSTER command progress monitor

Hi Hackers,

I revised the patch like this:

- Add a "command" column to the view
It tells whether the running command is CLUSTER or VACUUM FULL.

- Enable the VACUUM FULL progress monitor
Add heap_tuples_vacuumed and heap_tuples_recently_dead as counters in the view.
The sequence of phases is as follows:
1. scanning heap
5. swapping relation files
6. rebuild index
7. performing final cleanup
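
The VACUUM FULL sequence above, together with the two CLUSTER sequences from the original proposal (seq scan: 1, 2, 3, 5, 6, 7; index scan: 4, 5, 6, 7), can be sketched as a small standalone helper. This is only an illustration under those assumptions: the enum, phase_name and expected_phases are hypothetical names, and the phase codes and labels simply follow the numbering used in this thread.

```c
#include <stddef.h>

/* Phase codes, following the numbering used in this thread. */
enum
{
	PHASE_INITIALIZING = 0,
	PHASE_SCAN_HEAP = 1,
	PHASE_SORT_TUPLES = 2,
	PHASE_WRITE_NEW_HEAP = 3,
	PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP = 4,
	PHASE_SWAP_REL_FILES = 5,
	PHASE_REBUILD_INDEX = 6,
	PHASE_FINAL_CLEANUP = 7
};

/* Map a phase code to the label shown in the view's "phase" column. */
static const char *
phase_name(int phase)
{
	switch (phase)
	{
		case PHASE_INITIALIZING: return "initializing";
		case PHASE_SCAN_HEAP: return "scanning heap";
		case PHASE_SORT_TUPLES: return "sorting tuples";
		case PHASE_WRITE_NEW_HEAP: return "writing new heap";
		case PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP: return "scan heap and write new heap";
		case PHASE_SWAP_REL_FILES: return "swapping relation files";
		case PHASE_REBUILD_INDEX: return "rebuilding index";
		case PHASE_FINAL_CLEANUP: return "performing final cleanup";
	}
	return "unknown";
}

/*
 * Hypothetical helper: fill out[] (at least 6 entries) with the phases a
 * command is expected to pass through, in order, and return the count.
 */
static size_t
expected_phases(int is_vacuum_full, int use_index_scan, int *out)
{
	size_t		n = 0;

	if (is_vacuum_full || !use_index_scan)
	{
		out[n++] = PHASE_SCAN_HEAP;
		if (!is_vacuum_full)
		{
			/* CLUSTER by seq scan sorts, then writes the new heap */
			out[n++] = PHASE_SORT_TUPLES;
			out[n++] = PHASE_WRITE_NEW_HEAP;
		}
	}
	else
		out[n++] = PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP;

	/* The tail is common to every variant. */
	out[n++] = PHASE_SWAP_REL_FILES;
	out[n++] = PHASE_REBUILD_INDEX;
	out[n++] = PHASE_FINAL_CLEANUP;
	return n;
}
```

A monitoring tool could use such a table to show "phase k of n" alongside the tuple counters, since the number of phases differs between VACUUM FULL and the two CLUSTER variants.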

I didn't change the name of the view (pg_stat_progress_cluster) because I'm not sure
whether the new name (pg_stat_progress_reorg) is suitable or not.

Any comments or suggestions are welcome.

Thanks,
Tatsuro Yamada


Attachments:

progress_monitor_cluster_v3.patchtext/x-patch; name=progress_monitor_cluster_v3.patchDownload

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 38bf636..33cedc0 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -332,6 +332,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</> or <command>VACUUM FULL</>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3233,9 +3241,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the commands
+   which support progress reporting are <command>VACUUM</> and <command>CLUSTER</>.
+   This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3247,9 +3255,8 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</>
-   and backends running <command>VACUUM FULL</> will not be listed in this
-   view.
+   Backends running <command>VACUUM FULL</> are listed in the <structname>pg_stat_progress_cluster</structname>
+   view instead, because <command>VACUUM FULL</> shares its implementation with <command>CLUSTER</>. See <xref linkend='cluster-progress-reporting'>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3427,6 +3434,228 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   </table>
 
  </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</> or <command>VACUUM FULL</> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   one row for each backend that is currently running either command.
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</></entry>
+     <entry><type>integer</></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</></entry>
+     <entry><type>name</></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Current processing command: CLUSTER/VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Current processing phase of cluster/vacuum full.  See <xref linkend='cluster-phases'> or <xref linkend='vacuum-full-phases'>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_method</></entry>
+     <entry><type>text</></entry>
+     <entry>
+       Scan method of table: index scan/seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_index_relid</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       OID of the index.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_total</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Total number of heap tuples in the table.  This number is reported
+       as of the beginning of the scan; tuples added later will not be (and
+       need not be) visited by this <command>CLUSTER</> or <command>VACUUM FULL</>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>scanning heap</>,
+       <literal>writing new heap</> or <literal>scan heap and write new heap</>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_vacuumed</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Number of heap tuples vacuumed. This counter only advances when the
+       command is <literal>VACUUM FULL</> and the phase is <literal>scanning heap</>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_recently_dead</></entry>
+     <entry><type>bigint</></entry>
+     <entry>
+       Number of heap tuples that could not be vacuumed because they were only recently dead.
+       This counter only advances when the command is <literal>VACUUM FULL</> and 
+       the phase is <literal>scanning heap</>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently scanning heap from the table by
+       seq scan. This phase is shown when the <structfield>scan_method</> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently sorting tuples. 
+       This phase is shown when the <structfield>scan_method</> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scan heap and write new heap</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently scanning heap from the table and
+       writing new clustered heap.  This phase is shown when the <structfield>scan_method</> is
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="vacuum-full-phases">
+   <title>VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>VACUUM FULL</> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>VACUUM FULL</> is currently scanning heap from the table.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>VACUUM FULL</> is currently swapping old heap and new vacuumed heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>VACUUM FULL</> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>VACUUM FULL</> is performing final cleanup.  When this phase is
+       completed, <command>VACUUM FULL</> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index dc40cde..0174d8e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -899,6 +899,35 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'scanning heap'
+                      WHEN 2 THEN 'sorting tuples'
+                      WHEN 3 THEN 'writing new heap'
+                      WHEN 4 THEN 'scan heap and write new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        CASE S.param3 WHEN 1 THEN 'index scan'
+                      WHEN 2 THEN 'seq scan'
+                      END AS scan_method,
+        S.param4 AS scan_index_relid,
+        S.param5 AS heap_tuples_total,
+        S.param6 AS heap_tuples_scanned,
+        S.param7 AS heap_tuples_vacuumed,
+        S.param8 AS heap_tuples_recently_dead
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 48f1e6e..08c24e6 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -34,10 +34,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/planner.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
 void
 cluster(ClusterStmt *stmt, bool isTopLevel)
 {
+
 	if (stmt->relation != NULL)
 	{
 		/* This is the single-relation case. */
@@ -178,7 +181,9 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
 		heap_close(rel, NoLock);
 
 		/* Do the job. */
+		pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
 		cluster_rel(tableOid, indexOid, false, stmt->verbose);
+		pgstat_progress_end_command();
 	}
 	else
 	{
@@ -226,7 +231,9 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
 			/* functions in indexes may want a snapshot set */
 			PushActiveSnapshot(GetTransactionSnapshot());
 			/* Do the job. */
+			pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, rvtc->tableOid);
 			cluster_rel(rvtc->tableOid, rvtc->indexOid, true, stmt->verbose);
+			pgstat_progress_end_command();
 			PopActiveSnapshot();
 			CommitTransactionCommand();
 		}
@@ -374,6 +381,19 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if(OidIsValid(indexOid))
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND, PROGRESS_CLUSTER_COMMAND_CLUSTER);
+		/* Set indexOid to column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_INDEX_RELID, indexOid);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND, PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -771,6 +791,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	else
 		OldIndex = NULL;
 
+	/* Set reltuples to total_tuples */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES, OldHeap->rd_rel->reltuples);
+
 	/*
 	 * Their tuple descriptors should be exactly alike, but here we only need
 	 * assume that they have the same number of columns.
@@ -902,12 +925,16 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_INDEX_SCAN);
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_SEQ_SCAN);
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
 	}
@@ -1028,6 +1055,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				tups_vacuumed += 1;
 				tups_recently_dead -= 1;
 			}
+			/* set tups_vacuumed and tups_recently_dead to columns for VACUUM FULL */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED, tups_vacuumed);
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD, tups_recently_dead);
 			continue;
 		}
 
@@ -1039,6 +1069,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1052,8 +1085,15 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1064,10 +1104,13 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1480,6 +1523,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+	/* Set scan_method to NULL */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, -1);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1514,6 +1562,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1529,6 +1581,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index faa1812..5c8b0d0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1438,8 +1438,10 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 		onerel = NULL;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
+		pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, relid);
 		cluster_rel(relid, InvalidOid, false,
 					(options & VACOPT_VERBOSE) != 0);
+		pgstat_progress_end_command();
 	}
 	else
 		lazy_vacuum_rel(onerel, options, params, vac_strategy);
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 20ce48b..90bde85 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -467,6 +467,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if(pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9472ecc..41dd696 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,32 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND					0
+#define PROGRESS_CLUSTER_PHASE						1
+#define PROGRESS_CLUSTER_SCAN_METHOD				2
+#define PROGRESS_CLUSTER_SCAN_INDEX_RELID			3
+#define PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES	  		4
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED		5
+#define PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED		6
+#define PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD	7
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP						1
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES						2
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP					3
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP		4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES					5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX					6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP					7
+
+/* Scan methods of cluster */
+#define PROGRESS_CLUSTER_METHOD_INDEX_SCAN		1
+#define PROGRESS_CLUSTER_METHOD_SEQ_SCAN		2
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 57ac5d4..1c8dd67 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -916,7 +916,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d582bc9..e2f751c 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1819,6 +1819,38 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'scanning heap'::text
+            WHEN 2 THEN 'sorting tuples'::text
+            WHEN 3 THEN 'writing new heap'::text
+            WHEN 4 THEN 'scan heap and write new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param3
+            WHEN 1 THEN 'index scan'::text
+            WHEN 2 THEN 'seq scan'::text
+            ELSE NULL::text
+        END AS scan_method,
+    s.param4 AS scan_index_relid,
+    s.param5 AS heap_tuples_total,
+    s.param6 AS heap_tuples_scanned,
+    s.param7 AS heap_tuples_vacuumed,
+    s.param8 AS heap_tuples_recently_dead
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

#11

Michael Paquier

michael.paquier@gmail.com

over 8 years ago

In reply to: Tatsuro Yamada (#10)

Re: CLUSTER command progress monitor

On Wed, Sep 6, 2017 at 3:58 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

I revised the patch like this:

You should avoid top-posting.

I didn't change the name of view (pg_stat_progress_cluster) because I'm not sure
whether the new name (pg_stat_progress_reorg) is suitable or not.

Here are some ideas: rewrite (incorrect for ALTER TABLE), organize
(somewhat fine), order, operate (too general?), bundle, reform,
assemble.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Michael Paquier (#11)

Re: CLUSTER command progress monitor

On 2017/09/06 16:11, Michael Paquier wrote:

On Wed, Sep 6, 2017 at 3:58 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

I revised the patch like this:

You should avoid top-posting.

I see.

I didn't change the name of view (pg_stat_progress_cluster) because I'm not sure
whether the new name (pg_stat_progress_reorg) is suitable or not.

Here are some ideas: rewrite (incorrect for ALTER TABLE), organize
(somewhat fine), order, operate (too general?), bundle, reform,
assemble.

Thanks for sharing your ideas.
I searched Google for phrases such as "reform table", "reassemble table" and
"reorganize table". I found that "reorganize table" is a better choice than the
others because many DBMSs use this term. Therefore, I'll change the name as follows:

before
pg_stat_progress_cluster

after
pg_stat_progress_reorg (I abbreviate reorganize to reorg.)

Does anyone have any suggestions?
I'll revise the patch.

Regards,
Tatsuro Yamada


#13

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Tatsuro Yamada (#1)

Re: CLUSTER command progress monitor

On Wed, Aug 30, 2017 at 10:12 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

1. scanning heap
2. sort tuples

These two phases overlap, though. I believe progress reporting for
sorts is really hard. In the simple case where the data fits in
work_mem, none of the work of the sort gets done until all the data is
read. Once you switch to an external sort, you're writing batch
files, so a lot of the work is now being done during data loading.
But as the number of batch files grows, the final merge at the end
becomes an increasingly noticeable part of the cost, and eventually
you end up needing multiple merge passes. I think we need some smart
way to report on sorts so that we can tell how much of the work has
really been done, but I don't know how to do it.
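
To make the point about multiple merge passes concrete, here is a minimal sketch in C. It assumes a simplified balanced k-way merge in which every pass reduces the number of runs by a factor of fan_in; this is an illustrative model only, not the algorithm tuplesort.c actually uses, and the function name merge_passes is hypothetical.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not PostgreSQL's tuplesort logic):
 * each merge pass combines up to fan_in runs into one, so the number of
 * runs shrinks by a factor of fan_in per pass. Returns how many passes
 * are needed before a single sorted run remains.
 */
int
merge_passes(long runs, int fan_in)
{
	int			passes = 0;

	assert(runs >= 0 && fan_in >= 2);

	while (runs > 1)
	{
		runs = (runs + fan_in - 1) / fan_in;	/* ceil(runs / fan_in) */
		passes++;
	}
	return passes;
}
```

Under this model, 8 runs with a fan-in of 8 need one pass, while 9 runs already need two, which is why a single "tuples sorted" counter says little about how much work is left.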

heap_tuples_total | bigint | | |

The patch is getting the value reported as heap_tuples_total from
OldHeap->rd_rel->reltuples. I think this is pointless: the user can
see that value anyway if they wish. The point of the progress
counters is to expose things the user couldn't otherwise see. It's
also not necessarily accurate: it's only an estimate in the best case,
and may be way off if the relation has recently been extended by a large
amount. I think it's pretty important that we try hard to only report
values that are known to be accurate, because users hate (and mock)
inaccurate progress reports.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#14

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Robert Haas (#13)

Re: CLUSTER command progress monitor

On 2017/09/08 18:55, Robert Haas wrote:

On Wed, Aug 30, 2017 at 10:12 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

1. scanning heap
2. sort tuples

These two phases overlap, though. I believe progress reporting for
sorts is really hard. In the simple case where the data fits in
work_mem, none of the work of the sort gets done until all the data is
read. Once you switch to an external sort, you're writing batch
files, so a lot of the work is now being done during data loading.
But as the number of batch files grows, the final merge at the end
becomes an increasingly noticeable part of the cost, and eventually
you end up needing multiple merge passes. I think we need some smart
way to report on sorts so that we can tell how much of the work has
really been done, but I don't know how to do it.

Thanks for the comment.

As you know, the CLUSTER command chooses SEQ SCAN or INDEX SCAN as its scan
method based on cost estimation. In the case of SEQ SCAN, these two phases do not
overlap. With INDEX SCAN, however, they do overlap. Therefore I created the phase
"scan heap and write new heap" for the case where INDEX SCAN is selected.

I agree that progress reporting for sort is difficult. So the current design of
the CLUSTER progress monitor only reports the phase ("sorting tuples");
it doesn't report a counter for the sort.

heap_tuples_total | bigint | | |

The patch is getting the value reported as heap_tuples_total from
OldHeap->rd_rel->reltuples. I think this is pointless: the user can
see that value anyway if they wish. The point of the progress
counters is to expose things the user couldn't otherwise see. It's
also not necessarily accurate: it's only an estimate in the best case,
and may be way off if the relation has recently be extended by a large
amount. I think it's pretty important that we try hard to only report
values that are known to be accurate, because users hate (and mock)
inaccurate progress reports.

Do you mean that I should use the number of rows from the calculation below
instead of OldHeap->rd_rel->reltuples?

estimate rows = physical table size / average row length

I understand that OldHeap->rd_rel->reltuples is sometimes useless because
it is only corrected by autoanalyze, which does not run while changes stay under a threshold.
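
As a rough illustration of the calculation above, the following C sketch divides a table size by an average row length. The function name estimate_rows and its parameters are hypothetical, and the result ignores page headers, free space and dead tuples, so it is only an approximation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough row estimate: physical table size / average row length.
 * Hypothetical helper; ignores page headers, free space and dead
 * tuples. Returns 0 when the average row length is unknown.
 */
int64_t
estimate_rows(int64_t table_bytes, int64_t avg_row_bytes)
{
	assert(table_bytes >= 0);

	if (avg_row_bytes <= 0)
		return 0;				/* no usable average, no estimate */

	return table_bytes / avg_row_bytes;
}
```

For example, 1000 pages of 8192 bytes with an assumed average row length of 64 bytes gives an estimate of 128000 rows.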

I'll add it in the next patch and also share the current design of the
progress monitor for CLUSTER in more detail.

Regards,
Tatsuro Yamada


#15

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Tatsuro Yamada (#14)

Re: CLUSTER command progress monitor

On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Thanks for the comment.

As you know, CLUSTER command uses SEQ SCAN or INDEX SCAN as a scan method by
cost estimation. In the case of SEQ SCAN, these two phases not overlap.
However, in INDEX SCAN, it overlaps. Therefore I created the phase of "scan
heap and write new heap" when INDEX SCAN was selected.

I agree that progress reporting for sort is difficult. So it only reports
the phase ("sorting tuples") in the current design of progress monitor of
cluster.
It doesn't report counter of sort.

Doesn't that make it almost useless? I would guess that scanning the
heap and writing the new heap would ordinarily account for most of the
runtime, or at least enough that you're going to want something more
than just knowing that's the phase you're in.

The patch is getting the value reported as heap_tuples_total from
OldHeap->rd_rel->reltuples. I think this is pointless: the user can
see that value anyway if they wish. The point of the progress
counters is to expose things the user couldn't otherwise see. It's
also not necessarily accurate: it's only an estimate in the best case,
and may be way off if the relation has recently be extended by a large
amount. I think it's pretty important that we try hard to only report
values that are known to be accurate, because users hate (and mock)
inaccurate progress reports.

Do you mean to use the number of rows by using below calculation instead
OldHeap->rd_rel->reltuples?

estimate rows = physical table size / average row length

No, I mean don't report it at all. The caller can do that calculation
if they wish, without any help from the progress reporting machinery.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#16

Peter Geoghegan

pg@bowt.ie

over 8 years ago

In reply to: Robert Haas (#15)

Re: CLUSTER command progress monitor

On Mon, Sep 11, 2017 at 7:38 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Thanks for the comment.

As you know, CLUSTER command uses SEQ SCAN or INDEX SCAN as a scan method by
cost estimation. In the case of SEQ SCAN, these two phases not overlap.
However, in INDEX SCAN, it overlaps. Therefore I created the phase of "scan
heap and write new heap" when INDEX SCAN was selected.

I agree that progress reporting for sort is difficult. So it only reports
the phase ("sorting tuples") in the current design of progress monitor of
cluster.
It doesn't report counter of sort.

Doesn't that make it almost useless? I would guess that scanning the
heap and writing the new heap would ordinarily account for most of the
runtime, or at least enough that you're going to want something more
than just knowing that's the phase you're in.

It's definitely my experience that CLUSTER is incredibly I/O bound.
You're shoveling the tuples through tuplesort.c, but the actual
sorting component isn't where the real costs are. Profiling shows that
writing out the new heap (including moderately complicated
bookkeeping) is the bottleneck, IIRC. That's why parallel CLUSTER
didn't look attractive, even though it would be a fairly
straightforward matter to add that on top of the parallel CREATE INDEX
structure from the patch that I wrote to do that.

--
Peter Geoghegan


#17

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Robert Haas (#15)

Re: CLUSTER command progress monitor

On 2017/09/11 23:38, Robert Haas wrote:

On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Thanks for the comment.

As you know, CLUSTER command uses SEQ SCAN or INDEX SCAN as a scan method by
cost estimation. In the case of SEQ SCAN, these two phases not overlap.
However, in INDEX SCAN, it overlaps. Therefore I created the phase of "scan
heap and write new heap" when INDEX SCAN was selected.

I agree that progress reporting for sort is difficult. So it only reports
the phase ("sorting tuples") in the current design of progress monitor of
cluster.
It doesn't report counter of sort.

Doesn't that make it almost useless? I would guess that scanning the
heap and writing the new heap would ordinarily account for most of the
runtime, or at least enough that you're going to want something more
than just knowing that's the phase you're in.

Hmmm, should I add a counter in tuplesort.c (tuplesort_performsort())?
I know that an external merge sort takes more time than a quicksort.
I'll try investigating how to get a counter from the external merge sort processing.
Is this the right way?

The patch is getting the value reported as heap_tuples_total from
OldHeap->rd_rel->reltuples. I think this is pointless: the user can
see that value anyway if they wish. The point of the progress
counters is to expose things the user couldn't otherwise see. It's
also not necessarily accurate: it's only an estimate in the best case,
and may be way off if the relation has recently be extended by a large
amount. I think it's pretty important that we try hard to only report
values that are known to be accurate, because users hate (and mock)
inaccurate progress reports.

Do you mean to use the number of rows by using below calculation instead
OldHeap->rd_rel->reltuples?

estimate rows = physical table size / average row length

No, I mean don't report it at all. The caller can do that calculation
if they wish, without any help from the progress reporting machinery.

I see. I'll remove that column in the next patch.

Regards,
Tatsuro Yamada


#18

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 8 years ago

In reply to: Tatsuro Yamada (#17)

Re: CLUSTER command progress monitor

On 2017/09/12 21:20, Tatsuro Yamada wrote:

On 2017/09/11 23:38, Robert Haas wrote:

On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Thanks for the comment.

As you know, CLUSTER command uses SEQ SCAN or INDEX SCAN as a scan method by
cost estimation. In the case of SEQ SCAN, these two phases not overlap.
However, in INDEX SCAN, it overlaps. Therefore I created the phase of "scan
heap and write new heap" when INDEX SCAN was selected.

I agree that progress reporting for sort is difficult. So it only reports
the phase ("sorting tuples") in the current design of progress monitor of
cluster.
It doesn't report counter of sort.

Doesn't that make it almost useless? I would guess that scanning the
heap and writing the new heap would ordinarily account for most of the
runtime, or at least enough that you're going to want something more
than just knowing that's the phase you're in.

Hmmm, Should I add a counter in tuplesort.c? (tuplesort_performsort())
I know that external merge sort takes a time than quick sort.
I'll try investigating how to get a counter from external merge sort processing.
Is this the right way?

The patch is getting the value reported as heap_tuples_total from
OldHeap->rd_rel->reltuples. I think this is pointless: the user can
see that value anyway if they wish. The point of the progress
counters is to expose things the user couldn't otherwise see. It's
also not necessarily accurate: it's only an estimate in the best case,
and may be way off if the relation has recently be extended by a large
amount. I think it's pretty important that we try hard to only report
values that are known to be accurate, because users hate (and mock)
inaccurate progress reports.

Do you mean to use the number of rows by using below calculation instead
OldHeap->rd_rel->reltuples?

estimate rows = physical table size / average row length

No, I mean don't report it at all. The caller can do that calculation
if they wish, without any help from the progress reporting machinery.

I see. I'll remove that column on next patch.

I will summarize the current design and future corrections before sending
the next patch.

=== Current design ===

CLUSTER command may use Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

* Scan method: Seq Scan
1. scanning heap (*1)
2. sorting tuples (*2)
3. writing new heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

* Scan method: Index Scan
4. scan heap and write new heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

VACUUM FULL command will proceed in the following sequence of phases:

1. scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

(*1): increasing the value in heap_tuples_scanned column
(*2): only shows the phase in the phase column
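
The phase sequences above can be sketched as a small lookup in C. The phase codes match the PROGRESS_CLUSTER_PHASE_* values defined in the patch; the enum run_kind and the helpers expected_phases and phase_updates_tuple_counter are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Phase codes, matching the PROGRESS_CLUSTER_PHASE_* values in the patch. */
enum
{
	PHASE_SCAN_HEAP = 1,
	PHASE_SORT_TUPLES = 2,
	PHASE_WRITE_NEW_HEAP = 3,
	PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP = 4,
	PHASE_SWAP_REL_FILES = 5,
	PHASE_REBUILD_INDEX = 6,
	PHASE_FINAL_CLEANUP = 7
};

/* Hypothetical classification of a run; not part of the patch. */
enum run_kind
{
	RUN_CLUSTER_SEQ_SCAN,
	RUN_CLUSTER_INDEX_SCAN,
	RUN_VACUUM_FULL
};

/*
 * Return the expected phase sequence for a run and store its length in
 * *count. The arrays mirror the lists in the design summary.
 */
const int *
expected_phases(enum run_kind kind, size_t *count)
{
	static const int seq_scan[] = {
		PHASE_SCAN_HEAP, PHASE_SORT_TUPLES, PHASE_WRITE_NEW_HEAP,
		PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX, PHASE_FINAL_CLEANUP
	};
	static const int index_scan[] = {
		PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP,
		PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX, PHASE_FINAL_CLEANUP
	};
	static const int vacuum_full[] = {
		PHASE_SCAN_HEAP,
		PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX, PHASE_FINAL_CLEANUP
	};

	assert(count != NULL);

	switch (kind)
	{
		case RUN_CLUSTER_SEQ_SCAN:
			*count = sizeof(seq_scan) / sizeof(seq_scan[0]);
			return seq_scan;
		case RUN_CLUSTER_INDEX_SCAN:
			*count = sizeof(index_scan) / sizeof(index_scan[0]);
			return index_scan;
		case RUN_VACUUM_FULL:
			*count = sizeof(vacuum_full) / sizeof(vacuum_full[0]);
			return vacuum_full;
	}
	*count = 0;
	return NULL;
}

/* Phases marked (*1): heap_tuples_scanned advances during them. */
int
phase_updates_tuple_counter(int phase)
{
	return phase == PHASE_SCAN_HEAP ||
		phase == PHASE_WRITE_NEW_HEAP ||
		phase == PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP;
}
```

A monitoring client could use such a table to show which phases remain, while treating every phase marked (*2) as opaque because no counter moves during it.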

=== Changes planned for the next patch ===

- Rename pg_stat_progress_cluster to pg_stat_progress_reorg
- Remove heap_tuples_total column from the view
- Add a progress counter in the phase of "sorting tuples" (difficult?!)

=== My test case as a bonus ===

Here is my test case for the progress monitor.
If you want to watch the current progress monitor in action, you can use
this test case as an example.

[Terminal1]
Run this query on psql:

select * from pg_stat_progress_cluster; \watch 0.05

[Terminal2]
Run these queries on psql:

drop table if exists t1;

create table t1 as select a, random() * 1000 as b from generate_series(0, 99999999) a;
create index idx_t1 on t1(a);
create index idx_t1_b on t1(b);
analyze t1;

-- index scan
set enable_seqscan to off;
cluster verbose t1 using idx_t1;

-- seq scan
set enable_seqscan to on;
set enable_indexscan to off;
cluster verbose t1 using idx_t1;

-- cluster command with only a table name
cluster verbose t1;

-- cluster command with no arguments
cluster verbose;

-- vacuum full on one table
vacuum full t1;

-- vacuum full on the whole database
vacuum full;

Thanks,
Tatsuro Yamada


#19

Jeff Janes

jeff.janes@gmail.com

over 8 years ago

In reply to: Tatsuro Yamada (#1)

Re: CLUSTER command progress monitor

On Wed, Aug 30, 2017 at 7:12 PM, Tatsuro Yamada <yamada.tatsuro@lab.ntt.co.jp> wrote:

The view provides the information of CLUSTER command progress details as
follows
postgres=# \d pg_stat_progress_cluster
        View "pg_catalog.pg_stat_progress_cluster"
       Column        |  Type   | Collation | Nullable | Default
---------------------+---------+-----------+----------+---------
 pid                 | integer |           |          |
 datid               | oid     |           |          |
 datname             | name    |           |          |
 relid               | oid     |           |          |
 phase               | text    |           |          |
 scan_method         | text    |           |          |
 scan_index_relid    | bigint  |           |          |
 heap_tuples_total   | bigint  |           |          |
 heap_tuples_scanned | bigint  |           |          |

I think it should be cluster_index_relid, not scan_index_relid. If the
scan_method is seq, then the index isn't being scanned.

Cheers,

Jeff

#20

Daniel Gustafsson

daniel@yesql.se

over 8 years ago

In reply to: Tatsuro Yamada (#18)

Re: CLUSTER command progress monitor

On 12 Sep 2017, at 14:57, Tatsuro Yamada <yamada.tatsuro@lab.ntt.co.jp> wrote:

On 2017/09/12 21:20, Tatsuro Yamada wrote:

On 2017/09/11 23:38, Robert Haas wrote:

On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Thanks for the comment.

As you know, CLUSTER command uses SEQ SCAN or INDEX SCAN as a scan method by
cost estimation. In the case of SEQ SCAN, these two phases not overlap.
However, in INDEX SCAN, it overlaps. Therefore I created the phase of "scan
heap and write new heap" when INDEX SCAN was selected.

I agree that progress reporting for sort is difficult. So it only reports
the phase ("sorting tuples") in the current design of progress monitor of
cluster.
It doesn't report counter of sort.

Doesn't that make it almost useless? I would guess that scanning the
heap and writing the new heap would ordinarily account for most of the
runtime, or at least enough that you're going to want something more
than just knowing that's the phase you're in.

Hmmm, Should I add a counter in tuplesort.c? (tuplesort_performsort())
I know that external merge sort takes a time than quick sort.
I'll try investigating how to get a counter from external merge sort processing.
Is this the right way?

The patch is getting the value reported as heap_tuples_total from
OldHeap->rd_rel->reltuples. I think this is pointless: the user can
see that value anyway if they wish. The point of the progress
counters is to expose things the user couldn't otherwise see. It's
also not necessarily accurate: it's only an estimate in the best case,
and may be way off if the relation has recently be extended by a large
amount. I think it's pretty important that we try hard to only report
values that are known to be accurate, because users hate (and mock)
inaccurate progress reports.

Do you mean to use the number of rows by using below calculation instead
OldHeap->rd_rel->reltuples?

estimate rows = physical table size / average row length

No, I mean don't report it at all. The caller can do that calculation
if they wish, without any help from the progress reporting machinery.

I see. I'll remove that column on next patch.

I will summarize the current design and future corrections before sending
the next patch.

=== Current design ===

CLUSTER command may use Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

* Scan method: Seq Scan
1. scanning heap (*1)
2. sorting tuples (*2)
3. writing new heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

* Scan method: Index Scan
4. scan heap and write new heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

VACUUM FULL command will proceed in the following sequence of phases:

1. scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

(*1): increasing the value in heap_tuples_scanned column
(*2): only shows the phase in the phase column

The view provides the information of CLUSTER command progress details as follows
# \d pg_stat_progress_cluster
           View "pg_catalog.pg_stat_progress_cluster"
          Column           |  Type   | Collation | Nullable | Default
---------------------------+---------+-----------+----------+---------
 pid                       | integer |           |          |
 datid                     | oid     |           |          |
 datname                   | name    |           |          |
 relid                     | oid     |           |          |
 command                   | text    |           |          |
 phase                     | text    |           |          |
 scan_method               | text    |           |          |
 scan_index_relid          | bigint  |           |          |
 heap_tuples_total         | bigint  |           |          |
 heap_tuples_scanned       | bigint  |           |          |
 heap_tuples_vacuumed      | bigint  |           |          |
 heap_tuples_recently_dead | bigint  |           |          |

=== It will be changed on next patch ===

- Rename pg_stat_progress_cluster to pg_stat_progress_reorg
- Remove heap_tuples_total column from the view
- Add a progress counter in the phase of "sorting tuples" (difficult?!)

=== My test case as a bonus ===

I share my test case of progress monitor.
If someone wants to watch the current progress monitor, you can use
this test case as a example.

[Terminal1]
Run this query on psql:

select * from pg_stat_progress_cluster; \watch 0.05

[Terminal2]
Run these queries on psql:

drop table t1;

create table t1 as select a, random() * 1000 as b from generate_series(0, 99999999) a;
create index idx_t1 on t1(a);
create index idx_t1_b on t1(b);
analyze t1;

-- index scan
set enable_seqscan to off;
cluster verbose t1 using idx_t1;

-- seq scan
set enable_seqscan to on;
set enable_indexscan to off;
cluster verbose t1 using idx_t1;

-- only given table name to cluster command
cluster verbose t1;

-- only cluster command
cluster verbose;

-- vacuum full
vacuum full t1;

-- vacuum full
vacuum full;

Based on this thread, this patch has been marked Returned with Feedback.
Please re-submit a new version to a future commitfest.

cheers ./daniel


#21

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Tatsuro Yamada (#17)

Re: CLUSTER command progress monitor

On Tue, Sep 12, 2017 at 8:20 AM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

I agree that progress reporting for sort is difficult. So the current design
of the CLUSTER progress monitor only reports the phase ("sorting tuples").
It doesn't report a counter for the sort.

Doesn't that make it almost useless? I would guess that scanning the
heap and writing the new heap would ordinarily account for most of the
runtime, or at least enough that you're going to want something more
than just knowing that's the phase you're in.

Hmm, should I add a counter in tuplesort.c (tuplesort_performsort())?
I know that an external merge sort takes more time than a quicksort.
I'll try investigating how to get a counter from external merge sort
processing.
Is this the right way?

Progress reporting on sorts seems like a tricky problem to me, as I
said before. In most cases, a sort is going to involve an initial
stage where it reads all the input tuples and writes out quicksorted
runs, and then a merge phase where it merges all the output tapes into
a sorted result. There are some complexities; for example, if the
number of tapes is really large, then we might need multiple merge
phases, only the last of which will produce tuples. On the other
hand, if work_mem is very large, the time taken for sorting each run
might itself be significant enough that we'd like to have insight into
progress. If we ignore those complexities, though, a reasonable way
of reporting progress might be to report the following:

1. blocks read from the relation
2. # of tuples we've put into the tuplesort
3. # of tuples we've extracted from the tuplesort

During the first part of the sort, (1) and (2) will be growing, and
the user can measure progress by comparing (1) to the total size of
the relation. During the final merge, (3) will be growing, eventually
becoming equal to (2), so the user can measure progress by comparing
(2) with (3).

This approach only works for a seqscan-and-sort, though. I'm not sure
what to do about the index scan case.
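
The three counters above could be turned into progress figures roughly as follows. This is only a minimal sketch in standalone C under the simplifications named in the message (a single load stage followed by one final merge); it is not code from the patch, and the names `SortProgress`, `load_fraction` and `merge_fraction` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters mirroring items (1)-(3) above. */
typedef struct SortProgress
{
    uint64_t blocks_read;   /* (1) blocks read from the relation */
    uint64_t blocks_total;  /* relation size in blocks, known up front */
    uint64_t tuples_in;     /* (2) tuples put into the tuplesort */
    uint64_t tuples_out;    /* (3) tuples extracted from the tuplesort */
} SortProgress;

/* Load stage: compare (1) with the total size of the relation. */
double
load_fraction(const SortProgress *p)
{
    assert(p->blocks_read <= p->blocks_total);
    if (p->blocks_total == 0)
        return 1.0;         /* empty relation: nothing to read */
    return (double) p->blocks_read / (double) p->blocks_total;
}

/* Final merge: compare (3) with (2). */
double
merge_fraction(const SortProgress *p)
{
    assert(p->tuples_out <= p->tuples_in);
    if (p->tuples_in == 0)
        return 1.0;         /* nothing was fed into the sort */
    return (double) p->tuples_out / (double) p->tuples_in;
}
```

Both figures are ratios of values that have actually been observed, so neither depends on a planner estimate.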

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#22

Antonin Houska

ah@cybertec.at

about 8 years ago

In reply to: Robert Haas (#13)

Re: [HACKERS] CLUSTER command progress monitor

Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 30, 2017 at 10:12 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

1. scanning heap
2. sort tuples

These two phases overlap, though. I believe progress reporting for
sorts is really hard. In the simple case where the data fits in
work_mem, none of the work of the sort gets done until all the data is
read. Once you switch to an external sort, you're writing batch
files, so a lot of the work is now being done during data loading.
But as the number of batch files grows, the final merge at the end
becomes an increasingly noticeable part of the cost, and eventually
you end up needing multiple merge passes. I think we need some smart
way to report on sorts so that we can tell how much of the work has
really been done, but I don't know how to do it.

Whatever complexity is hidden in the sort, cost_sort() should have taken it
into consideration when called via plan_cluster_use_sort(). Thus I think that
once we have both startup and total cost, the current progress of the sort
stage can be estimated from the current number of input and output
rows. Please remind me if my proposal appears to be too simplistic.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

#23

Tom Lane

tgl@sss.pgh.pa.us

about 8 years ago

In reply to: Antonin Houska (#22)

Re: [HACKERS] CLUSTER command progress monitor

Antonin Houska <ah@cybertec.at> writes:

Robert Haas <robertmhaas@gmail.com> wrote:

These two phases overlap, though. I believe progress reporting for
sorts is really hard.

Whatever complexity is hidden in the sort, cost_sort() should have taken it
into consideration when called via plan_cluster_use_sort(). Thus I think that
once we have both startup and total cost, the current progress of the sort
stage can be estimated from the current number of input and output
rows. Please remind me if my proposal appears to be too simplistic.

Well, even if you assume that the planner's cost model omits nothing
(which I wouldn't bet on), its result is only going to be as good as the
planner's estimate of the number of rows to be sorted. And, in cases
where people actually care about progress monitoring, it's likely that
the planner got that wrong, maybe horribly so. I think it's a bad idea
for progress monitoring to depend on the planner's estimates in any way
whatsoever.

regards, tom lane

#24

Antonin Houska

ah@cybertec.at

about 8 years ago

In reply to: Tom Lane (#23)

Re: [HACKERS] CLUSTER command progress monitor

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Antonin Houska <ah@cybertec.at> writes:

Robert Haas <robertmhaas@gmail.com> wrote:

These two phases overlap, though. I believe progress reporting for
sorts is really hard.

Whatever complexity is hidden in the sort, cost_sort() should have taken it
into consideration when called via plan_cluster_use_sort(). Thus I think that
once we have both startup and total cost, the current progress of the sort
stage can be estimated from the current number of input and output
rows. Please remind me if my proposal appears to be too simplistic.

Well, even if you assume that the planner's cost model omits nothing
(which I wouldn't bet on), its result is only going to be as good as the
planner's estimate of the number of rows to be sorted. And, in cases
where people actually care about progress monitoring, it's likely that
the planner got that wrong, maybe horribly so. I think it's a bad idea
for progress monitoring to depend on the planner's estimates in any way
whatsoever.

The general idea was that some sort of prediction of the total cost is needed
anyway if we should tell during execution what fraction of work has already
been done. And also that the cost computation that we perform during execution
shouldn't (ideally) differ from cost_sort(). So I thought that it's easier to
refine cost_sort() than to implement the same computation from scratch
elsewhere.

Besides that I see 2 circumstances that make the estimate of the number of
input tuples simpler in the CLUSTER case:

* There's only 1 input relation w/o any kind of clause.

* CLUSTER uses SnapshotAny, so pg_class(reltuples) is closer to the actual
number of input rows than it would be in the general case. (Of course, pg_class
would only be useful for the initial estimate.)

Unlike the planner, the executor could recalculate the cost estimate at some
point(s) as it recognizes that the actual number of tuples per page appears to
differ from the density derived from pg_class initially. Still wrong?
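
The recalculation described above might look something like the following. It is a minimal sketch in standalone C, assuming the executor knows the relation's size in pages and counts tuples as it scans; `reestimate_total_tuples` and its parameters are hypothetical names, not anything from the patch or from PostgreSQL. The result is still an estimate, which is exactly what the replies in this thread object to.

```c
#include <assert.h>

/*
 * Re-estimate the total tuple count from the density observed so far.
 * Until at least one page has been scanned, fall back to the initial
 * estimate (for example, one derived from pg_class.reltuples).
 */
double
reestimate_total_tuples(double initial_estimate,
                        double tuples_seen,
                        double pages_seen,
                        double pages_total)
{
    assert(pages_seen >= 0 && pages_seen <= pages_total);
    if (pages_seen == 0)
        return initial_estimate;
    /* observed tuples per page, extrapolated to the whole relation */
    return (tuples_seen / pages_seen) * pages_total;
}
```

For example, after 10 of 100 pages with 500 tuples seen, the extrapolated total is 5000 regardless of what the initial estimate was.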

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

#25

Robert Haas

robertmhaas@gmail.com

about 8 years ago

In reply to: Tom Lane (#23)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Nov 20, 2017 at 12:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Antonin Houska <ah@cybertec.at> writes:

Robert Haas <robertmhaas@gmail.com> wrote:

These two phases overlap, though. I believe progress reporting for
sorts is really hard.

Whatever complexity is hidden in the sort, cost_sort() should have taken it
into consideration when called via plan_cluster_use_sort(). Thus I think that
once we have both startup and total cost, the current progress of the sort
stage can be estimated from the current number of input and output
rows. Please remind me if my proposal appears to be too simplistic.

Well, even if you assume that the planner's cost model omits nothing
(which I wouldn't bet on), its result is only going to be as good as the
planner's estimate of the number of rows to be sorted. And, in cases
where people actually care about progress monitoring, it's likely that
the planner got that wrong, maybe horribly so. I think it's a bad idea
for progress monitoring to depend on the planner's estimates in any way
whatsoever.

I agree.

I have been of the opinion all along that progress monitoring needs to
report facts, not theories. The number of tuples read thus far is a
fact, and is fine to report for whatever value it may have to someone.
The number of tuples that will be read in the future is a theory, and
as you say, progress monitoring is most likely to be used in cases
where theory and practice ended up being very different.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#26

Robert Haas

robertmhaas@gmail.com

about 8 years ago

In reply to: Antonin Houska (#22)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Nov 20, 2017 at 12:05 PM, Antonin Houska <ah@cybertec.at> wrote:

Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 30, 2017 at 10:12 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

1. scanning heap
2. sort tuples

These two phases overlap, though. I believe progress reporting for
sorts is really hard. In the simple case where the data fits in
work_mem, none of the work of the sort gets done until all the data is
read. Once you switch to an external sort, you're writing batch
files, so a lot of the work is now being done during data loading.
But as the number of batch files grows, the final merge at the end
becomes an increasingly noticeable part of the cost, and eventually
you end up needing multiple merge passes. I think we need some smart
way to report on sorts so that we can tell how much of the work has
really been done, but I don't know how to do it.

Whatever complexity is hidden in the sort, cost_sort() should have taken it
into consideration when called via plan_cluster_use_sort(). Thus I think that
once we have both startup and total cost, the current progress of the sort
stage can be estimated from the current number of input and output
rows. Please remind me if my proposal appears to be too simplistic.

I think it is far too simplistic. If the sort is being fed by a
sequential scan, reporting the number of blocks scanned so far as
compared to the total number that will be scanned would be a fine way
of reporting on the progress of the sequential scan -- and it's better
to use blocks, which we know for sure about, than rows, at which we
can only guess. But that's the *scan* progress, not the *sort*
progress.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#27

Peter Geoghegan

pg@bowt.ie

about 8 years ago

In reply to: Robert Haas (#21)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Oct 2, 2017 at 6:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Progress reporting on sorts seems like a tricky problem to me, as I
said before. In most cases, a sort is going to involve an initial
stage where it reads all the input tuples and writes out quicksorted
runs, and then a merge phase where it merges all the output tapes into
a sorted result. There are some complexities; for example, if the
number of tapes is really large, then we might need multiple merge
phases, only the last of which will produce tuples.

This would ordinarily be the point at which I'd say "but you're very
unlikely to require multiple passes for an external sort these days".
But I won't say that on this thread, because CLUSTER generally has
unusually wide tuples, and so is much more likely to be I/O bound, to
require multiple passes, etc. (I bet the v10 enhancements
disproportionately improved CLUSTER performance.)
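
To illustrate why the number of runs matters, here is a rough model of how many merge passes a sort needs. It is a simplified balanced k-way merge in standalone C, not the algorithm tuplesort.c actually uses; `merge_passes`, `runs` and `fan_in` are hypothetical names. Wide tuples mean fewer tuples fit into each in-memory run, so the same table produces more runs and can cross into an extra pass.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of merge passes needed to reduce `runs` sorted runs to one,
 * when each pass merges up to `fan_in` runs into a single run.
 */
unsigned
merge_passes(uint64_t runs, uint64_t fan_in)
{
    unsigned passes = 0;

    assert(fan_in >= 2);
    while (runs > 1)
    {
        runs = (runs + fan_in - 1) / fan_in;    /* ceil(runs / fan_in) */
        passes++;
    }
    return passes;
}
```

In this model only the last pass produces output tuples, so a counter of extracted tuples stays at zero during all earlier passes.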

--
Peter Geoghegan

#28

Peter Geoghegan

pg@bowt.ie

about 8 years ago

In reply to: Robert Haas (#25)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Nov 21, 2017 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I agree.

I have been of the opinion all along that progress monitoring needs to
report facts, not theories. The number of tuples read thus far is a
fact, and is fine to report for whatever value it may have to someone.

That makes a lot of sense to me. I sometimes think that we're too
hesitant to expose internal information due to concerns about it being
hard to interpret. I see wait events as bucking this trend, which I
welcome. We see similar trends in the Linux kernel, with tools like
perf and BCC/eBPF now being regularly used to debug production issues.

The number of tuples that will be read in the future is a theory, and
as you say, progress monitoring is most likely to be used in cases
where theory and practice ended up being very different.

You hit the nail on the head here.

It's not that these things are not difficult to interpret - the
concern itself is justified. It just needs to be weighed against the
benefit of having some instrumentation to start with. People are much
more likely to complain about obscure debug information, which makes
them feel dumb, than they are to complain about the absence of any
instrumentation, but I still think that the latter is the bigger
problem.

Besides, you don't necessarily have to understand something to act on
it. The internals of Oracle are trade secrets, but they were the first
to have wait events, I think. At least having something that you can
Google can make all the difference.

--
Peter Geoghegan

#29

Michael Paquier

michael.paquier@gmail.com

about 8 years ago

In reply to: Robert Haas (#25)

Re: [HACKERS] CLUSTER command progress monitor

On Wed, Nov 22, 2017 at 5:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I have been of the opinion all along that progress monitoring needs to
report facts, not theories. The number of tuples read thus far is a
fact, and is fine to report for whatever value it may have to someone.
The number of tuples that will be read in the future is a theory, and
as you say, progress monitoring is most likely to be used in cases
where theory and practice ended up being very different.

+1. We should never as well enter in things like trying to estimate
the amount of time remaining to finish a task [1]https://www.xkcd.com/612/ -- Michael.

[1]: https://www.xkcd.com/612/ -- Michael
--
Michael

#30

Thomas Munro

thomas.munro@enterprisedb.com

about 8 years ago

In reply to: Michael Paquier (#29)

Re: [HACKERS] CLUSTER command progress monitor

On Wed, Nov 22, 2017 at 1:53 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Wed, Nov 22, 2017 at 5:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I have been of the opinion all along that progress monitoring needs to
report facts, not theories. The number of tuples read thus far is a
fact, and is fine to report for whatever value it may have to someone.
The number of tuples that will be read in the future is a theory, and
as you say, progress monitoring is most likely to be used in cases
where theory and practice ended up being very different.

+1. We should never as well enter in things like trying to estimate
the amount of time remaining to finish a task [1].

[1]: https://www.xkcd.com/612/

That is one reason I made pg_stat_replication.XXX_lag report the lag
of WAL that has been processed, not (say) the time until we catch up.
In some information-poor scenarios it interpolates, which isn't perfect,
but the general idea is that it shows you measurements of the past
(facts), not predictions about the future (theories).

--
Thomas Munro
http://www.enterprisedb.com

#31

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

over 7 years ago

In reply to: Peter Geoghegan (#27)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On 2017/11/22 6:07, Peter Geoghegan wrote:

On Mon, Oct 2, 2017 at 6:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Progress reporting on sorts seems like a tricky problem to me, as I
said before. In most cases, a sort is going to involve an initial
stage where it reads all the input tuples and writes out quicksorted
runs, and then a merge phase where it merges all the output tapes into
a sorted result. There are some complexities; for example, if the
number of tapes is really large, then we might need multiple merge
phases, only the last of which will produce tuples.

This would ordinarily be the point at which I'd say "but you're very
unlikely to require multiple passes for an external sort these days".
But I won't say that on this thread, because CLUSTER generally has
unusually wide tuples, and so is much more likely to be I/O bound, to
require multiple passes, etc. (I bet the v10 enhancements
disproportionately improved CLUSTER performance.)

Hi,

I'm back to developing this feature for the community.
The v4 patch makes the following changes:

- Rebase on master (143290efd)
- Fix the documentation
- Replace the column name scan_index_relid with cluster_index_relid.
Thanks to Jeff Janes!

I'm now working on improving the patch based on Robert's comment related to
"Seqscan and Sort case" and also considering how to handle the "Index scan case".

Please find attached file.
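
For readers skimming the attached patch: its pg_stat_progress_cluster view turns the numeric phase parameter (0 to 7) into a text label with a CASE expression. The same mapping can be sketched in standalone C as below. `cluster_phase_name` is a hypothetical helper for illustration only; the patch does this in SQL, and the labels are copied from its view definition.

```c
#include <stddef.h>

/*
 * Map a phase number to the label shown in the view's "phase" column.
 * Returns NULL for unknown values, as a CASE without ELSE would.
 */
const char *
cluster_phase_name(int phase)
{
    static const char *const names[] = {
        "initializing",                 /* 0 */
        "scanning heap",                /* 1 */
        "sorting tuples",               /* 2 */
        "writing new heap",             /* 3 */
        "scan heap and write new heap", /* 4 */
        "swapping relation files",      /* 5 */
        "rebuilding index",             /* 6 */
        "performing final cleanup"      /* 7 */
    };
    const int nphases = (int) (sizeof(names) / sizeof(names[0]));

    if (phase < 0 || phase >= nphases)
        return NULL;
    return names[phase];
}
```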

Regards,
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v4.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 0484cfa77a..5a4bd203ea 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -332,6 +332,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> and <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3338,9 +3346,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the commands
+   which support progress reporting are <command>VACUUM</command> and <command>CLUSTER</command>.
+   This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3352,9 +3360,8 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   Running <command>VACUUM FULL</command> is listed in <structname>pg_stat_progress_cluster</structname>
+   view because it uses <command>CLUSTER</command> command internally.  See <xref linkend='cluster-progress-reporting'>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3531,6 +3538,228 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    </tgroup>
   </table>
 
+ </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   one row for each backend that is currently clustering or vacuuming (VACUUM FULL). 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing command: CLUSTER/VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase of cluster/vacuum full.  See <xref linkend='cluster-phases'> or <xref linkend='vacuum-full-phases'>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_method</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Scan method of table: index scan/seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       OID of the index.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_total</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Total number of heap tuples in the table.  This number is reported
+       as of the beginning of the scan; tuples added later will not be (and
+       need not be) visited by this <command>CLUSTER</command> and <command>VACUUM FULL</command>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>scanning heap</literal>, 
+       <literal>writing new heap</literal> and <literal>scan heap and write new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_vacuumed</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples vacuumed. This counter only advances when the
+       command is <literal>VACUUM FULL</literal> and the phase is <literal>scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_recently_dead</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples not vacuumed because they were marked recently dead.
+       This counter only advances when the command is <literal>VACUUM FULL</literal> and 
+       the phase is <literal>scanning heap</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning heap from the table by
+       seq scan. This phase is shown when the <structfield>scan_method</structfield> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples. 
+       This phase is shown when the <structfield>scan_method</structfield> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scan heap and write new heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning heap from the table and
+       writing new clustered heap.  This phase is shown when the <structfield>scan_method</structfield> is
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="vacuum-full-phases">
+   <title>VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently scanning heap from the table.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently swapping old heap and new vacuumed heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is performing final cleanup.  When this phase is
+       completed, <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
  </sect2>
  </sect1>
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 7251552419..cf3e25c9f9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -903,6 +903,35 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'scanning heap'
+                      WHEN 2 THEN 'sorting tuples'
+                      WHEN 3 THEN 'writing new heap'
+                      WHEN 4 THEN 'scan heap and write new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        CASE S.param3 WHEN 1 THEN 'index scan'
+                      WHEN 2 THEN 'seq scan'
+                      END AS scan_method,
+        S.param4 AS cluster_index_relid,
+        S.param5 AS heap_tuples_total,
+        S.param6 AS heap_tuples_scanned,
+        S.param7 AS heap_tuples_vacuumed,
+        S.param8 AS heap_tuples_recently_dead
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 68be470977..f1fc04a96c 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -34,10 +34,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/planner.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
 void
 cluster(ClusterStmt *stmt, bool isTopLevel)
 {
+
 	if (stmt->relation != NULL)
 	{
 		/* This is the single-relation case. */
@@ -186,7 +189,9 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
 		heap_close(rel, NoLock);
 
 		/* Do the job. */
+		pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
 		cluster_rel(tableOid, indexOid, stmt->options);
+		pgstat_progress_end_command();
 	}
 	else
 	{
@@ -234,8 +239,10 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
 			/* functions in indexes may want a snapshot set */
 			PushActiveSnapshot(GetTransactionSnapshot());
 			/* Do the job. */
+			pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, rvtc->tableOid);
 			cluster_rel(rvtc->tableOid, rvtc->indexOid,
 						stmt->options | CLUOPT_RECHECK);
+			pgstat_progress_end_command();
 			PopActiveSnapshot();
 			CommitTransactionCommand();
 		}
@@ -385,6 +392,19 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if(OidIsValid(indexOid))
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND, PROGRESS_CLUSTER_COMMAND_CLUSTER);
+		/* Set indexOid to column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_RELID, indexOid);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND, PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -793,6 +813,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	else
 		OldIndex = NULL;
 
+	/* Set reltuples to total_tuples */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES, OldHeap->rd_rel->reltuples);
+
 	/*
 	 * Their tuple descriptors should be exactly alike, but here we only need
 	 * assume that they have the same number of columns.
@@ -925,12 +948,16 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_INDEX_SCAN);
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_SEQ_SCAN);
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
 	}
@@ -1051,6 +1078,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				tups_vacuumed += 1;
 				tups_recently_dead -= 1;
 			}
+			/* set tups_vacuumed and tups_recently_dead to columns for VACUUM FULL */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED, tups_vacuumed);
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD, tups_recently_dead);
 			continue;
 		}
 
@@ -1062,6 +1092,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1075,8 +1108,15 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1087,10 +1127,13 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 NewHeap->rd_rel->relhasoids, rwstate);
+
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1529,6 +1572,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+	/* Set scan_method to NULL */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, -1);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1563,6 +1611,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1578,6 +1630,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ee32fe8871..f22e4da198 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1561,7 +1561,9 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
+		pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, relid);
 		cluster_rel(relid, InvalidOid, cluster_options);
+		pgstat_progress_end_command();
 	}
 	else
 		lazy_vacuum_rel(onerel, options, params, vac_strategy);
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e95e347184..c3283c7b70 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if(pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 6a6b467fee..27553ee678 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,32 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND					0
+#define PROGRESS_CLUSTER_PHASE						1
+#define PROGRESS_CLUSTER_SCAN_METHOD				2
+#define PROGRESS_CLUSTER_INDEX_RELID				3
+#define PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES	  		4
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED		5
+#define PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED		6
+#define PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD	7
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP						1
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES						2
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP					3
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP		4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES					5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX					6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP					7
+
+/* Scan methods of cluster */
+#define PROGRESS_CLUSTER_METHOD_INDEX_SCAN		1
+#define PROGRESS_CLUSTER_METHOD_SEQ_SCAN		2
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index d59c24ae23..14559d4d2f 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -933,7 +933,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 078129f251..15a3f3c0af 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1821,6 +1821,38 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'scanning heap'::text
+            WHEN 2 THEN 'sorting tuples'::text
+            WHEN 3 THEN 'writing new heap'::text
+            WHEN 4 THEN 'scan heap and write new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param3
+            WHEN 1 THEN 'index scan'::text
+            WHEN 2 THEN 'seq scan'::text
+            ELSE NULL::text
+        END AS scan_method,
+    s.param4 AS cluster_index_relid,
+    s.param5 AS heap_tuples_total,
+    s.param6 AS heap_tuples_scanned,
+    s.param7 AS heap_tuples_vacuumed,
+    s.param8 AS heap_tuples_recently_dead
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

#32

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Tatsuro Yamada (#31)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Aug 24, 2018 at 7:06 AM Tatsuro Yamada <yamada.tatsuro@lab.ntt.co.jp> wrote:

On 2017/11/22 6:07, Peter Geoghegan wrote:

On Mon, Oct 2, 2017 at 6:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Progress reporting on sorts seems like a tricky problem to me, as I
said before. In most cases, a sort is going to involve an initial
stage where it reads all the input tuples and writes out quicksorted
runs, and then a merge phase where it merges all the output tapes into
a sorted result. There are some complexities; for example, if the
number of tapes is really large, then we might need multiple merge
phases, only the last of which will produce tuples.

This would ordinarily be the point at which I'd say "but you're very
unlikely to require multiple passes for an external sort these days".
But I won't say that on this thread, because CLUSTER generally has
unusually wide tuples, and so is much more likely to be I/O bound, to
require multiple passes, etc. (I bet the v10 enhancements
disproportionately improved CLUSTER performance.)

Hi,

I have come back to develop this feature for the community.
The V4 patch corrects the following points:

- Rebase on master (143290efd)
- Fix document
- Replace the column name scan_index_relid with cluster_index_relid.
Thanks to Jeff Janes!

I'm now working on improving the patch based on Robert's comment related to
"Seqscan and Sort case" and also considering how to handle the "Index scan case".

Thank you,

Unfortunately, this patch has some conflicts now. Could you rebase it? Also,
what is the status of your work on improving it based on the
provided feedback?

In the meantime I'm moving it to the next CF.

#33

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

about 7 years ago

In reply to: Dmitry Dolgov (#32)

Re: [HACKERS] CLUSTER command progress monitor

On 2018/11/29 21:20, Dmitry Dolgov wrote:

On Fri, Aug 24, 2018 at 7:06 AM Tatsuro Yamada <yamada.tatsuro@lab.ntt.co.jp> wrote:

On 2017/11/22 6:07, Peter Geoghegan wrote:

On Mon, Oct 2, 2017 at 6:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Progress reporting on sorts seems like a tricky problem to me, as I
said before. In most cases, a sort is going to involve an initial
stage where it reads all the input tuples and writes out quicksorted
runs, and then a merge phase where it merges all the output tapes into
a sorted result. There are some complexities; for example, if the
number of tapes is really large, then we might need multiple merge
phases, only the last of which will produce tuples.

This would ordinarily be the point at which I'd say "but you're very
unlikely to require multiple passes for an external sort these days".
But I won't say that on this thread, because CLUSTER generally has
unusually wide tuples, and so is much more likely to be I/O bound, to
require multiple passes, etc. (I bet the v10 enhancements
disproportionately improved CLUSTER performance.)

Hi,

I have come back to develop this feature for the community.
The V4 patch corrects the following points:

- Rebase on master (143290efd)
- Fix document
- Replace the column name scan_index_relid with cluster_index_relid.
Thanks to Jeff Janes!

I'm now working on improving the patch based on Robert's comment related to
"Seqscan and Sort case" and also considering how to handle the "Index scan case".

Thank you,

Unfortunately, this patch has some conflicts now. Could you rebase it? Also,
what is the status of your work on improving it based on the
provided feedback?

In the meantime I'm moving it to the next CF.

Thank you for managing the CF, and sorry for the late reply.
I'll rebase it for the next CF. I also need to clear my head, because I think the
patch needs a design change to address the feedback. So the current status is
"reconsidering the design of the patch". :)

Regards,
Tatsuro Yamada
NTT Open Source Software Center

#34

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Tatsuro Yamada (#33)

Re: [HACKERS] CLUSTER command progress monitor

On 2018-Dec-03, Tatsuro Yamada wrote:

In the meantime I'm moving it to the next CF.

Thank you for managing the CF, and sorry for the late reply.
I'll rebase it for the next CF. I also need to clear my head, because I think the
patch needs a design change to address the feedback. So the current status is
"reconsidering the design of the patch". :)

I think we should mark it as Returned with Feedback then.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#35

Michael Paquier

michael@paquier.xyz

about 7 years ago

In reply to: Alvaro Herrera (#34)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Dec 03, 2018 at 02:17:25PM -0300, Alvaro Herrera wrote:

I think we should mark it as Returned with Feedback then.

+1.
--
Michael

#36

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Tatsuro Yamada (#33)

Re: [HACKERS] CLUSTER command progress monitor

Hello Yamada-san,

On 2018-Dec-03, Tatsuro Yamada wrote:

Thank you for managing the CF, and sorry for the late reply.
I'll rebase it for the next CF. I also need to clear my head, because I think the
patch needs a design change to address the feedback. So the current status is
"reconsidering the design of the patch". :)

Do you have a new version of this patch? If not, do you think you'll
have something in time for the upcoming commitfest?

Thanks

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#37

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Peter Geoghegan (#27)

Re: [HACKERS] CLUSTER command progress monitor

On 2017-Nov-21, Peter Geoghegan wrote:

On Mon, Oct 2, 2017 at 6:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Progress reporting on sorts seems like a tricky problem to me, as I
said before. In most cases, a sort is going to involve an initial
stage where it reads all the input tuples and writes out quicksorted
runs, and then a merge phase where it merges all the output tapes into
a sorted result. There are some complexities; for example, if the
number of tapes is really large, then we might need multiple merge
phases, only the last of which will produce tuples.

This would ordinarily be the point at which I'd say "but you're very
unlikely to require multiple passes for an external sort these days".
But I won't say that on this thread, because CLUSTER generally has
unusually wide tuples, and so is much more likely to be I/O bound, to
require multiple passes, etc. (I bet the v10 enhancements
disproportionately improved CLUSTER performance.)

When the seqscan-and-sort strategy is used, we feed tuplesort with every
tuple from the scan. Once that's completed, we call `performsort`, then
retrieve tuples.

If we see this in terms of tapes and merges, we can report the total
number of each of those that we have completed. As far as I understand,
we write one tape to completion, and only then start another one, right?
Since there's no way to know how many tapes/merges are needed in total,
it's not possible to compute a percentage of completion. That seems
okay -- we're just telling the user that progress is being made, and we
only report facts not theory. Perhaps we can (also?) indicate disk I/O
utilization, in terms of the number of blocks written by tuplesort.

I suppose that in order to have tuplesort.c report progress, we would
have to have some kind of API that tuplesort would invoke internally to
indicate events such as "tape started/completed", "merge started/completed".
One idea is to use a callback system; each tuplesort caller could
optionally pass a callback to the "begin" function, for progress
reporting purposes. Initially only cluster.c would use it, but I
suppose eventually every tuplesort caller would want that.
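
To make the callback idea concrete, here is a minimal sketch in plain C. It
is not PostgreSQL code: every name in it (SortProgressEvent,
SortProgressCallback, DemoSortState, demo_sort_begin and so on) is
hypothetical, and the real tuplesort "begin" functions take no such argument.
It only shows the shape of an optional callback that the sort invokes on
tape and merge events, with the caller deciding what to count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; none of these names exist in tuplesort.c. */
typedef enum SortProgressEvent
{
	SORT_EVENT_TAPE_STARTED,
	SORT_EVENT_TAPE_COMPLETED,
	SORT_EVENT_MERGE_STARTED,
	SORT_EVENT_MERGE_COMPLETED
} SortProgressEvent;

/* Callback signature a caller would pass to the "begin" function. */
typedef void (*SortProgressCallback) (SortProgressEvent event, void *arg);

/* Stand-in for the sort state; only the progress hook is modelled. */
typedef struct DemoSortState
{
	SortProgressCallback progress_cb;	/* NULL means "no reporting" */
	void	   *progress_arg;			/* passed back to the callback */
} DemoSortState;

/* Counters that a caller such as cluster.c might keep and publish. */
typedef struct DemoSortCounters
{
	long		tapes_completed;
	long		merges_completed;
} DemoSortCounters;

/* Register the optional callback when the sort is set up. */
void
demo_sort_begin(DemoSortState *state, SortProgressCallback cb, void *arg)
{
	assert(state != NULL);
	state->progress_cb = cb;
	state->progress_arg = arg;
}

/* Invoked internally by the sort whenever an event occurs. */
void
demo_sort_report(DemoSortState *state, SortProgressEvent event)
{
	assert(state != NULL);
	if (state->progress_cb != NULL)
		state->progress_cb(event, state->progress_arg);
}

/* Example callback: count completed tapes and merges, ignore the rest. */
void
demo_count_events(SortProgressEvent event, void *arg)
{
	DemoSortCounters *counters = (DemoSortCounters *) arg;

	if (event == SORT_EVENT_TAPE_COMPLETED)
		counters->tapes_completed++;
	else if (event == SORT_EVENT_MERGE_COMPLETED)
		counters->merges_completed++;
}
```

Callers that pass NULL pay only for one pointer test per event, which is why
an optional callback looks cheaper than making every tuplesort caller report
progress from the start.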

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#38

Peter Geoghegan

pg@bowt.ie

about 7 years ago

In reply to: Alvaro Herrera (#37)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Dec 18, 2018 at 1:02 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

If we see this in terms of tapes and merges, we can report the total
number of each of those that we have completed. As far as I understand,
we write one tape to completion, and only then start another one, right?
Since there's no way to know how many tapes/merges are needed in total,
it's not possible to compute a percentage of completion. That seems
okay -- we're just telling the user that progress is being made, and we
only report facts not theory. Perhaps we can (also?) indicate disk I/O
utilization, in terms of the number of blocks written by tuplesort.

The number of blocks tuplesort uses is constant from the end of
initial run generation, since logtape.c will recycle blocks.

I suppose that in order to have tuplesort.c report progress, we would
have to have some kind of API that tuplesort would invoke internally to
indicate events such as "tape started/completed", "merge started/completed".
One idea is to use a callback system; each tuplesort caller could
optionally pass a callback to the "begin" function, for progress
reporting purposes. Initially only cluster.c would use it, but I
suppose eventually every tuplesort caller would want that.

I think that you could have a callback that did something with the
information currently reported by trace_sort. That's not a bad way of
scoping the problem. That's how I myself monitor the progress of a
sort, and it works pretty well (whether or not that means other people
can do it is not exactly clear to me).

We predict the number of merge passes within cost_sort() already. That
doesn't seem all that hard to generalize, so that you report the
expected number of passes against the current pass. Some passes are
much quicker than others, but you generally don't have that many with
realistic cases. I don't expect that it will work very well with an
internal sort, but in the case of CLUSTER that almost seems
irrelevant. And maybe even in all cases.
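
As a rough illustration of "expected number of passes against the current
pass", the sketch below estimates merge passes from a run count and a merge
order. It is not the cost_sort() code: the helper names are hypothetical, and
it assumes each pass merges up to merge_order runs into one. The second
helper raises the denominator when the estimate turns out too low, so a
report never shows a current pass beyond the expected total.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of merge passes needed to reduce nruns
 * initial runs to a single sorted output, assuming every pass merges
 * groups of up to merge_order runs.  Returns 0 when there is at most
 * one run, i.e. nothing to merge.
 */
int
demo_expected_merge_passes(long nruns, int merge_order)
{
	int			passes = 0;

	assert(merge_order >= 2);
	while (nruns > 1)
	{
		/* ceiling division: runs left after one more pass */
		nruns = (nruns + merge_order - 1) / merge_order;
		passes++;
	}
	return passes;
}

/*
 * Hypothetical helper: the total to display next to current_pass.
 * If the sort is already past the estimate, report the current pass
 * as the total instead of claiming more than 100% completion.
 */
int
demo_reported_total_passes(int current_pass, int expected_passes)
{
	return (current_pass > expected_passes) ? current_pass : expected_passes;
}
```

With 100 initial runs and a merge order of 6, this gives 3 passes
(100 -> 17 -> 3 -> 1), so a monitor could show "pass 2 of 3" while the
second pass runs.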

I think that the user is going to have to be willing to develop some
intuition about the progress for it to be all that useful. They're
really looking for something that gives a clue if they'll have to wait
an hour, a day, or a week, which it seems like trace_sort-like
information gives you some idea of. (BTW, dtrace probes can already
give the user much the same information -- I think that more people
should use those, since tracing technology on Linux has improved
drastically in the last few years.)

--
Peter Geoghegan

#39

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Peter Geoghegan (#38)

Re: [HACKERS] CLUSTER command progress monitor

On 2018-Dec-18, Peter Geoghegan wrote:

On Tue, Dec 18, 2018 at 1:02 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

If we see this in terms of tapes and merges, we can report the total
number of each of those that we have completed. As far as I understand,
we write one tape to completion, and only then start another one, right?
Since there's no way to know how many tapes/merges are needed in total,
it's not possible to compute a percentage of completion. That seems
okay -- we're just telling the user that progress is being made, and we
only report facts not theory. Perhaps we can (also?) indicate disk I/O
utilization, in terms of the number of blocks written by tuplesort.

The number of blocks tuplesort uses is constant from the end of
initial run generation, since logtape.c will recycle blocks.

Well, if you think about individual blocks in terms of storage space,
maybe that's true, but I meant it in the Heraclitean sense of never
stepping into the same river twice -- the second time you write the block,
it's not the same block you wrote before, so you count it twice. It's
not the actual disk space utilization that matters, but how much I/O
you have done (even if it is just to kernel cache, I suppose).
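
The distinction can be sketched with two counters. This is a hypothetical
illustration in plain C, not logtape.c code; DemoTapeIoStats and
demo_write_block are made-up names. Space in use stops growing once blocks
are recycled, while the count of writes keeps advancing and is the number
that would be reported as progress.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters kept alongside the temporary tape file. */
typedef struct DemoTapeIoStats
{
	long		blocks_in_use;	/* distinct blocks allocated on disk */
	long		blocks_written; /* every write, recycled block or not */
} DemoTapeIoStats;

/*
 * Record one block write.  A recycled block reuses existing space, so
 * it does not grow blocks_in_use, but it is still counted as I/O.
 */
void
demo_write_block(DemoTapeIoStats *stats, int recycled)
{
	assert(stats != NULL);
	if (!recycled)
		stats->blocks_in_use++;
	stats->blocks_written++;
}
```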

I suppose that in order to have tuplesort.c report progress, we would
have to have some kind of API that tuplesort would invoke internally to
indicate events such as "tape started/completed", "merge started/completed".
One idea is to use a callback system; each tuplesort caller could
optionally pass a callback to the "begin" function, for progress
reporting purposes. Initially only cluster.c would use it, but I
suppose eventually every tuplesort caller would want that.

I think that you could have a callback that did something with the
information currently reported by trace_sort. That's not a bad way of
scoping the problem. That's how I myself monitor the progress of a
sort, and it works pretty well (whether or not that means other people
can do it is not exactly clear to me).

Thanks, that looks useful.

I suppose mapping such numbers to actual progress is a bit of an art (or
intuition as you say), but it seems to be the best we can do, if we do
anything at all.

We predict the number of merge passes within cost_sort() already. That
doesn't seem all that hard to generalize, so that you report the
expected number of passes against the current pass. Some passes are
much quicker than others, but you generally don't have that many with
realistic cases. I don't expect that it will work very well with an
internal sort, but in the case of CLUSTER that almost seems
irrelevant. And maybe even in all cases.

How good are those predictions? The feeling I get from this thread is
that if the estimation of the number of passes is unreliable, it's
better not to report it at all; just return how many we've done thus
far. It's undesirable to report that we're about 150% done (or take
hours to get to 40% done, then suddenly be over).

I wonder if internal sorts are really all that interesting from the PoV
of progress reporting. Also, I have the impression that quicksort isn't
very amenable to letting you know how much work is left.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#40

Peter Geoghegan

pg@bowt.ie

about 7 years ago

In reply to: Alvaro Herrera (#39)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Dec 18, 2018 at 2:47 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Well, if you think about individual blocks in terms of storage space,
maybe that's true, but I meant it in the Heraclitean sense of never
stepping into the same river twice -- the second time you write the block,
it's not the same block you wrote before, so you count it twice. It's
not the actual disk space utilization that matters, but how much I/O
you have done (even if it is just to kernel cache, I suppose).

Right.

I suppose mapping such numbers to actual progress is a bit of an art (or
intuition as you say), but it seems to be the best we can do, if we do
anything at all.

I think that it's fairly useful. I suspect that you don't have to have
my theoretical grounding in sorting to be able to do almost as well.
All you need is a little bit of experience.

How good are those predictions? The feeling I get from this thread is
that if the estimation of the number of passes is unreliable, it's
better not to report it at all; just return how many we've done thus
far. It's undesirable to report that we're about 150% done (or take
hours to get to 40% done, then suddenly be over).

Maybe it isn't that reliable. But on second thought I think that it
might not matter, and maybe we should just not do that.

"How slow can I make this sort go by subtracting work_mem?" is a game
that I like to play sometimes. This blogpost plays that game, and
reaches some pretty amusing conclusions:

https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/

It says that sorting numeric is 60% slower when you do an external
sort. But it's an apples to asteroids comparison, because the
comparison is made between 4MB of work_mem, and 1GB. I think that it's
pretty damn impressive that it's only 60% slower! Besides, even that
difference is probably on the high side of average, because numeric
abbreviated keys work particularly well, and you won't get the same
benefit with unique numeric values when you happen to be doing a lot
of merging. If you tried the same experiment with integers, or even
text + abbreviated keys, I bet the difference would be a lot smaller.
Despite the huge gap in the amount of memory used.

On modern hardware, where doing some amount of random I/O is not that
noticeable, you'll have a very hard time finding a case where even a
paltry amount of memory with many passes does all that much worse than
an internal sort (OTOH, it's not hard to find cases where an external
sort is *faster*). Even if you make a generic estimate, it's still
probably going to be pretty good, because there just isn't that much
variation in how long the sort will take as you vary the amount of
memory it can use. Some people will be surprised at this, but it's a
pretty robust effect. (This is why I think that a hash_mem GUC might
be a good medium term solution that improves upon work_mem -- the
situation is dramatically different when it comes to hashing.)

My point is that you could offer users the kind of insight they'd find
very useful with only a very crude estimate of the amount of merging.
Even if it was 60% slower than initially projected, that's still not
an awful estimate to most users. That just leaves initial run
generation, but it's relatively easy to accurately estimate the amount
of initial runs. I rarely see a case where merging takes more than 40%
of the total, barring parallel CREATE INDEX.

I wonder if internal sorts are really all that interesting from the PoV
of progress reporting. Also, I have the impression that quicksort isn't
very amenable to letting you know how much work is left.

It is hard to predict the duration of one massive quicksort, but it
seems fairly easy to recognize a kind of cadence across multiple
quicksorts/runs that each take seconds to a couple of minutes. That's
going to be the vast, vast majority of cases we care about.

--
Peter Geoghegan

#41

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

about 7 years ago

In reply to: Alvaro Herrera (#36)

Re: [HACKERS] CLUSTER command progress monitor

Hi Alvaro,

On 2018/12/19 2:23, Alvaro Herrera wrote:

Hello Yamada-san,

On 2018-Dec-03, Tatsuro Yamada wrote:

Thank you for managing the CF, and sorry for the late reply.
I'll rebase it for the next CF. I also need to clear my head, because I think the
patch needs a design change to address the feedback. So the current status is
"reconsidering the design of the patch". :)

Do you have a new version of this patch? If not, do you think you'll
have something in time for the upcoming commitfest?

Not yet. I'll only be able to send a rebased patch by the end of this month.
It will have no design changes, because I haven't yet worked out how to get
progress from the sort and the index scan. However, I'm going to register the
patch for the next CF. I'd be happy if you are interested in the patch. :)

Thanks,
Tatsuro Yamada

#42

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

about 7 years ago

In reply to: Alvaro Herrera (#36)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

Hi,

Do you have a new version of this patch? If not, do you think you'll
have something in time for the upcoming commitfest?

Not yet. I'll only be able to send a rebased patch by the end of this month.
It will have no design changes, because I haven't yet worked out how to get
progress from the sort and the index scan. However, I'm going to register the
patch for the next CF. I'd be happy if you are interested in the patch.

This patch is rebased on HEAD.
I'll work on revising the patch based on the feedback next month.

Happy holidays!
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v5.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 96bcc3a63b..83421e5105 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -332,6 +332,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> and <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3351,9 +3359,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the commands
+   that support progress reporting are <command>VACUUM</command> and <command>CLUSTER</command>.
+   This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3365,9 +3373,8 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   A backend running <command>VACUUM FULL</command> is listed in the <structname>pg_stat_progress_cluster</structname>
+   view instead, because it uses the <command>CLUSTER</command> code internally.  See <xref linkend='cluster-progress-reporting'>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3544,6 +3551,228 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    </tgroup>
   </table>
 
+ </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   one row for each backend that is currently clustering or vacuuming (VACUUM FULL). 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing command: CLUSTER/VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase of cluster/vacuum full.  See <xref linkend='cluster-phases'> or <xref linkend='vacuum-full-phases'>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_method</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Scan method of table: index scan/seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       OID of the index.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_total</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Total number of heap tuples in the table.  This number is reported
+       as of the beginning of the scan; tuples added later will not be (and
+       need not be) visited by this <command>CLUSTER</command> and <command>VACUUM FULL</command>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>scanning heap</literal>, 
+       <literal>writing new heap</literal> and <literal>scan heap and write new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_vacuumed</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples vacuumed. This counter only advances when the
+       command is <literal>VACUUM FULL</literal> and the phase is <literal>scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_recently_dead</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples not vacuumed because they were marked recently dead.
+       This counter only advances when the command is <literal>VACUUM FULL</literal> and 
+       the phase is <literal>scanning heap</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning heap from the table by
+       seq scan. This phase is shown when the <structfield>scan_method</structfield> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples. 
+       This phase is shown when the <structfield>scan_method</structfield> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scan heap and write new heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning heap from the table and
+       writing new clustered heap.  This phase is shown when the <structfield>scan_method</structfield> is
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="vacuum-full-phases">
+   <title>VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently scanning heap from the table.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently swapping old heap and new vacuumed heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is performing final cleanup.  When this phase is
+       completed, <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
  </sect2>
  </sect1>
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5253837b54..6c0f10e11e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -904,6 +904,35 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'scanning heap'
+                      WHEN 2 THEN 'sorting tuples'
+                      WHEN 3 THEN 'writing new heap'
+                      WHEN 4 THEN 'scan heap and write new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        CASE S.param3 WHEN 1 THEN 'index scan'
+                      WHEN 2 THEN 'seq scan'
+                      END AS scan_method,
+        S.param4 AS cluster_index_relid,
+        S.param5 AS heap_tuples_total,
+        S.param6 AS heap_tuples_scanned,
+        S.param7 AS heap_tuples_vacuumed,
+        S.param8 AS heap_tuples_recently_dead
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 610e425a56..bc93704725 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -34,10 +34,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/planner.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
 void
 cluster(ClusterStmt *stmt, bool isTopLevel)
 {
+
 	if (stmt->relation != NULL)
 	{
 		/* This is the single-relation case. */
@@ -186,7 +189,9 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
 		heap_close(rel, NoLock);
 
 		/* Do the job. */
+		pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
 		cluster_rel(tableOid, indexOid, stmt->options);
+		pgstat_progress_end_command();
 	}
 	else
 	{
@@ -234,8 +239,10 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
 			/* functions in indexes may want a snapshot set */
 			PushActiveSnapshot(GetTransactionSnapshot());
 			/* Do the job. */
+			pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, rvtc->tableOid);
 			cluster_rel(rvtc->tableOid, rvtc->indexOid,
 						stmt->options | CLUOPT_RECHECK);
+			pgstat_progress_end_command();
 			PopActiveSnapshot();
 			CommitTransactionCommand();
 		}
@@ -385,6 +392,19 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if(OidIsValid(indexOid))
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND, PROGRESS_CLUSTER_COMMAND_CLUSTER);
+		/* Set indexOid to column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_RELID, indexOid);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND, PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -791,6 +811,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	else
 		OldIndex = NULL;
 
+	/* Set reltuples to total_tuples */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES, OldHeap->rd_rel->reltuples);
+
 	/*
 	 * Their tuple descriptors should be exactly alike, but here we only need
 	 * assume that they have the same number of columns.
@@ -923,12 +946,16 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_INDEX_SCAN);
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SCAN_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, PROGRESS_CLUSTER_METHOD_SEQ_SCAN);
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
 	}
@@ -1049,6 +1076,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				tups_vacuumed += 1;
 				tups_recently_dead -= 1;
 			}
+			/* set tups_vacuumed and tups_recently_dead to columns for VACUUM FULL */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED, tups_vacuumed);
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD, tups_recently_dead);
 			continue;
 		}
 
@@ -1060,6 +1090,9 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1073,8 +1106,15 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1085,10 +1125,13 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1527,6 +1570,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+	/* Set scan_method to NULL */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, -1);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1561,6 +1609,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1576,6 +1628,9 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE, PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 25b3b0312c..276c0e407e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1708,7 +1708,9 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
+		pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, relid);
 		cluster_rel(relid, InvalidOid, cluster_options);
+		pgstat_progress_end_command();
 	}
 	else
 		lazy_vacuum_rel(onerel, options, params, vac_strategy);
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f955f1912a..fcc30259a8 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if(pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 6a6b467fee..27553ee678 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,32 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND					0
+#define PROGRESS_CLUSTER_PHASE						1
+#define PROGRESS_CLUSTER_SCAN_METHOD				2
+#define PROGRESS_CLUSTER_INDEX_RELID				3
+#define PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES	  		4
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED		5
+#define PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED		6
+#define PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD	7
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP						1
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES						2
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP					3
+#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP		4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES					5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX					6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP					7
+
+/* Scan methods of cluster */
+#define PROGRESS_CLUSTER_METHOD_INDEX_SCAN		1
+#define PROGRESS_CLUSTER_METHOD_SEQ_SCAN		2
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index f1c10d16b8..c52e6bf003 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -934,7 +934,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e384cd2279..c2a64604be 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1821,6 +1821,38 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'scanning heap'::text
+            WHEN 2 THEN 'sorting tuples'::text
+            WHEN 3 THEN 'writing new heap'::text
+            WHEN 4 THEN 'scan heap and write new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param3
+            WHEN 1 THEN 'index scan'::text
+            WHEN 2 THEN 'seq scan'::text
+            ELSE NULL::text
+        END AS scan_method,
+    s.param4 AS cluster_index_relid,
+    s.param5 AS heap_tuples_total,
+    s.param6 AS heap_tuples_scanned,
+    s.param7 AS heap_tuples_vacuumed,
+    s.param8 AS heap_tuples_recently_dead
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

#43

Robert Haas

robertmhaas@gmail.com

almost 7 years ago

In reply to: Tatsuro Yamada (#42)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Dec 28, 2018 at 3:20 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

This patch is rebased on HEAD.
I'll tackle revising the patch based on feedback next month.

+   Running <command>VACUUM FULL</command> is listed in
<structname>pg_stat_progress_cluster</structname>
+   view because it uses <command>CLUSTER</command> command
internally.  See <xref linkend='cluster-progress-reporting'>.

It's not really true to say that VACUUM FULL uses the CLUSTER command
internally. It uses a good chunk of the same
infrastructure, but it certainly doesn't use the actual command, and
it's not really the exact same thing either, because internally it's
doing a sequential scan but no sort, which never happens with CLUSTER.
I'm not sure exactly how to rephrase this, but I think we need to make
it more precise.

One idea is that maybe we should try to think of a design that could
also handle the rewriting variants of ALTER TABLE, and call it
pg_stat_progress_rewrite. Maybe that's moving the goalposts too far,
but I'm not saying we'd necessarily have to do all the work now, just
have a view that we think could also handle that. Then again, maybe
the needs are too different.

+   Whenever <command>CLUSTER</command> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   one row for each backend that is currently clustering or vacuuming
(VACUUM FULL).

That sentence contradicts itself. Just say that it contains a row for
each backend that is currently running CLUSTER or VACUUM FULL.

@@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
void
cluster(ClusterStmt *stmt, bool isTopLevel)
{
+
if (stmt->relation != NULL)
{
/* This is the single-relation case. */

Useless hunk.

@@ -186,7 +189,9 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
heap_close(rel, NoLock);

  /* Do the job. */
+ pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
  cluster_rel(tableOid, indexOid, stmt->options);
+ pgstat_progress_end_command();
  }
  else
  {

It seems like that stuff should be inside cluster_rel().
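
A minimal sketch of why that placement helps, using stand-in stubs rather than the real PostgreSQL functions. Every `mock_*` and `sketch_*` name is hypothetical, and the mock start function drops the command-type argument that the real `pgstat_progress_start_command()` takes. The point it illustrates is that once start and end live in the shared worker, the single-relation CLUSTER path, the multi-relation CLUSTER path, and VACUUM FULL all get reporting without repeating the calls, provided every early exit also ends reporting.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins so the sketch compiles without PostgreSQL headers. */
typedef unsigned int Oid;

static int  progress_starts = 0;
static int  progress_ends = 0;
static bool progress_active = false;

/* Hypothetical stub: marks progress reporting as started for a relation. */
static void
mock_progress_start_command(Oid relid)
{
    (void) relid;
    assert(!progress_active);
    progress_active = true;
    progress_starts++;
}

/* Hypothetical stub: marks progress reporting as finished. */
static void
mock_progress_end_command(void)
{
    progress_active = false;
    progress_ends++;
}

/*
 * Sketch of a worker shared by CLUSTER and VACUUM FULL.  Reporting starts
 * and ends here, so callers do not need to wrap the call themselves.
 */
static void
sketch_cluster_rel(Oid tableOid, Oid indexOid, bool table_still_exists)
{
    mock_progress_start_command(tableOid);

    if (!table_still_exists)
    {
        /* Early exits have to end reporting too. */
        mock_progress_end_command();
        return;
    }

    (void) indexOid;            /* the rewrite itself would happen here */

    mock_progress_end_command();
}
```

A caller then only needs `sketch_cluster_rel(tableOid, indexOid, true);`, whichever command it belongs to.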

+ /* Set reltuples to total_tuples */
+ pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES,
OldHeap->rd_rel->reltuples);

I object. If the user wants that, they can get it from pg_class
themselves via an SQL query. It's also an estimate, not something we
know to be accurate; I want us to only report facts here, not theories.

+ pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP);
+ pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD,
PROGRESS_CLUSTER_METHOD_INDEX_SCAN);

I think you should use pgstat_progress_update_multi_param() if
updating multiple parameters at the same time.

Also, some lines in this patch, such as this one, are very long.
Consider techniques to reduce the line length to 80 characters or
less, such as inserting a line break between the two arguments.
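
As a rough illustration of both points, the two updates above could be folded into one call with short lines. This is only a sketch: `mock_progress_update_multi_param()` is a hypothetical stand-in that mirrors the argument shape of `pgstat_progress_update_multi_param()` (count, index array, value array) and writes into a local array, and the constant values are copied from the patch's progress.h hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without PostgreSQL headers. */
#define PGSTAT_NUM_PROGRESS_PARAM   10
#define PROGRESS_CLUSTER_PHASE      1
#define PROGRESS_CLUSTER_SCAN_METHOD 2
#define PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP 4
#define PROGRESS_CLUSTER_METHOD_INDEX_SCAN 1

/* Local array standing in for the backend's progress parameter slots. */
static int64_t st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];

/* Hypothetical mock: stores each value in the slot named by its index. */
static void
mock_progress_update_multi_param(int nparam, const int *index,
                                 const int64_t *val)
{
    for (int i = 0; i < nparam; i++)
    {
        assert(index[i] >= 0 && index[i] < PGSTAT_NUM_PROGRESS_PARAM);
        st_progress_param[index[i]] = val[i];
    }
}

/* Report phase and scan method together, one array element per line. */
static void
report_index_scan_start(void)
{
    const int   index[] = {
        PROGRESS_CLUSTER_PHASE,
        PROGRESS_CLUSTER_SCAN_METHOD
    };
    const int64_t val[] = {
        PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP,
        PROGRESS_CLUSTER_METHOD_INDEX_SCAN
    };

    mock_progress_update_multi_param(2, index, val);
}
```

Putting each constant on its own line inside the array initializers keeps every line well under 80 characters, even with the long macro names.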

+ /* Set scan_method to NULL */
+ pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, -1);

NULL and -1 are not the same thing.

I think that we shouldn't have both
PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP and
PROGRESS_CLUSTER_PHASE_SCAN_HEAP. They're the same thing. Let's just
use PROGRESS_CLUSTER_PHASE_SCAN_HEAP for both. Actually, better yet,
let's get rid of PROGRESS_CLUSTER_SCAN_METHOD and have
PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP and
PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP. That seems noticeably
simpler.
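
One way that simplification might look is sketched below. The numbering and the display strings are assumptions made for illustration, and the `sketch_*` helpers are hypothetical; only the two new phase names come from the suggestion above.

```c
#include <stddef.h>

/* Hypothetical numbering: the scan method is folded into the phase. */
#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP    1
#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP  2
#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES      3
#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP   4
#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES   5
#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX    6
#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP    7

/* Choose the scan phase directly; no separate scan-method parameter. */
static int
sketch_scan_phase(int use_index_scan)
{
    return use_index_scan ? PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP
                          : PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP;
}

/* Illustrative display names; returns NULL for an unknown phase. */
static const char *
sketch_phase_name(int phase)
{
    switch (phase)
    {
        case 0:
            return "initializing";
        case PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP:
            return "seq scanning heap";
        case PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP:
            return "index scanning heap";
        case PROGRESS_CLUSTER_PHASE_SORT_TUPLES:
            return "sorting tuples";
        case PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP:
            return "writing new heap";
        case PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES:
            return "swapping relation files";
        case PROGRESS_CLUSTER_PHASE_REBUILD_INDEX:
            return "rebuilding index";
        case PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP:
            return "performing final cleanup";
        default:
            return NULL;
    }
}
```

With this layout a reader of the view learns the scan type from the phase column alone, and there is no scan-method value that has to be reset once the scan is over.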

I agree that it's acceptable to report
PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED and
PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD, but I'm not sure I
understand why it's valuable to do so in the context of a progress
indicator.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#44

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Robert Haas (#43)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On 2019/02/23 6:02, Robert Haas wrote:

On Fri, Dec 28, 2018 at 3:20 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

This patch is rebased on HEAD.
I'll tackle revising the patch based on feedback next month.
+   Running <command>VACUUM FULL</command> is listed in
<structname>pg_stat_progress_cluster</structname>
+   view because it uses <command>CLUSTER</command> command
internally.  See <xref linkend='cluster-progress-reporting'>.
It's not really true to say that VACUUM FULL uses the CLUSTER command
internally. It uses a good chunk of the same
infrastructure, but it certainly doesn't use the actual command, and
it's not really the exact same thing either, because internally it's
doing a sequential scan but no sort, which never happens with CLUSTER.
I'm not sure exactly how to rephrase this, but I think we need to make
it more precise.

One idea is that maybe we should try to think of a design that could
also handle the rewriting variants of ALTER TABLE, and call it
pg_stat_progress_rewrite. Maybe that's moving the goalposts too far,
but I'm not saying we'd necessarily have to do all the work now, just
have a view that we think could also handle that. Then again, maybe
the needs are too different.

Hmm..., I see.
If possible, I'd like to set VACUUM FULL aside for now to avoid complicating
the implementation.
I don't have enough time at the moment to design pg_stat_progress_rewrite, and
I suspect it would be tough work.

+   Whenever <command>CLUSTER</command> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   one row for each backend that is currently clustering or vacuuming
(VACUUM FULL).
That sentence contradicts itself. Just say that it contains a row for
each backend that is currently running CLUSTER or VACUUM FULL.

Fixed.

@@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
void
cluster(ClusterStmt *stmt, bool isTopLevel)
{
+
if (stmt->relation != NULL)
{
/* This is the single-relation case. */

Useless hunk.

Fixed.

@@ -186,7 +189,9 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
heap_close(rel, NoLock);
/* Do the job. */
+ pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
cluster_rel(tableOid, indexOid, stmt->options);
+ pgstat_progress_end_command();
}
else
{
It seems like that stuff should be inside cluster_rel().

Fixed.

+ /* Set reltuples to total_tuples */
+ pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES,
OldHeap->rd_rel->reltuples);
I object. If the user wants that, they can get it from pg_class
themselves via an SQL query. It's also an estimate, not something we
know to be accurate; I want us to only report facts here, not theories.

I understand that progress monitor should only report facts, so I
removed that code.

+ pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP);
+ pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD,
PROGRESS_CLUSTER_METHOD_INDEX_SCAN);
I think you should use pgstat_progress_update_multi_param() if
updating multiple parameters at the same time.

Also, some lines in this patch, such as this one, are very long.
Consider techniques to reduce the line length to 80 characters or
less, such as inserting a line break between the two arguments.

Fixed.

+ /* Set scan_method to NULL */
+ pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, -1);

NULL and -1 are not the same thing.

Oops, fixed.

I think that we shouldn't have both
PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP and
PROGRESS_CLUSTER_PHASE_SCAN_HEAP. They're the same thing. Let's just
use PROGRESS_CLUSTER_PHASE_SCAN_HEAP for both. Actually, better yet,
let's get rid of PROGRESS_CLUSTER_SCAN_METHOD and have
PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP and
PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP. That seems noticeably
simpler.

Fixed.

I agree that it's acceptable to report
PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED and
PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD, but I'm not sure I
understand why it's valuable to do so in the context of a progress
indicator.

Actually, I'm not sure why I added it since so much time has passed. :(
So, I'll remove PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD at least.

The attached patch is a WIP patch.

Thanks!
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v6.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 0e73cdcdda..8cf829e72c 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -344,6 +344,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> and <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3376,9 +3384,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the supported
+   progress reporting commands are <command>VACUUM</command> and <command>CLUSTER</command>.
+   This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3390,9 +3398,8 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   Running <command>VACUUM FULL</command> is listed in <structname>pg_stat_progress_cluster</structname>
+   view because it uses <command>CLUSTER</command> command internally.  See <xref linkend='cluster-progress-reporting'>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3569,6 +3576,228 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    </tgroup>
   </table>
 
+ </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> is running, the
+   <structname>pg_stat_progress_cluster</structname> view will contain
+   a row for each backend that is currently running CLUSTER or VACUUM FULL. 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing command: CLUSTER/VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase of cluster/vacuum full.  See <xref linkend='cluster-phases'> or <xref linkend='vacuum-full-phases'>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>scan_method</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Scan method of table: index scan/seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       OID of the index.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_total</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Total number of heap tuples in the table.  This number is reported
+       as of the beginning of the scan; tuples added later will not be (and
+       need not be) visited by this <command>CLUSTER</command> and <command>VACUUM FULL</command>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>scanning heap</literal>,
+       <literal>writing new heap</literal> or <literal>scan heap and write new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_vacuumed</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples vacuumed. This counter only advances when the
+       command is <literal>VACUUM FULL</literal> and the phase is <literal>scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_recently_dead</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples not vacuumed because they were marked recently dead.
+       This counter only advances when the command is <literal>VACUUM FULL</literal> and
+       the phase is <literal>scanning heap</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning heap from the table by
+       seq scan. This phase is shown when the <structfield>scan_method</structfield> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples. 
+       This phase is shown when the <structfield>scan_method</structfield> is seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scan heap and write new heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning heap from the table and
+       writing new clustered heap.  This phase is shown when the <structfield>scan_method</structfield> is
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="vacuum-full-phases">
+   <title>VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>scanning heap</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently scanning heap from the table.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently swapping old heap and new vacuumed heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is performing final cleanup.  When this phase is
+       completed, <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
  </sect2>
  </sect1>
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3e229c693c..046b83447c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -906,6 +906,32 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_total,
+        S.param5 AS heap_tuples_scanned,
+        S.param6 AS heap_tuples_vacuumed,
+        S.param7 AS heap_tuples_recently_dead
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index a74af4c171..d1449d54d4 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -35,10 +35,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -275,6 +277,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -385,6 +389,27 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if(OidIsValid(indexOid))
+	{
+		const int   cir_index[] = {
+			PROGRESS_CLUSTER_COMMAND,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       cir_val[2];
+
+		/* Set indexOid to column */
+		cir_val[0] = PROGRESS_CLUSTER_COMMAND_CLUSTER;
+		cir_val[1] = indexOid;
+		pgstat_progress_update_multi_param(2, cir_index, cir_val);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -415,6 +440,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -923,12 +950,18 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP);
+
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
 	}
@@ -1041,6 +1074,12 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 
 		if (isdead)
 		{
+			const int   chp_index[] = {
+				PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED,
+				PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD
+			};
+			int64       chp_val[2];
+
 			tups_vacuumed += 1;
 			/* heap rewrite module still needs to see it... */
 			if (rewrite_heap_dead_tuple(rwstate, tuple))
@@ -1049,6 +1088,11 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				tups_vacuumed += 1;
 				tups_recently_dead -= 1;
 			}
+			/* set tups_vacuumed and tups_recently_dead to columns for VACUUM FULL */
+			chp_val[0] = tups_vacuumed;
+			chp_val[1] = tups_recently_dead;
+			pgstat_progress_update_multi_param(2, chp_index, chp_val);
+
 			continue;
 		}
 
@@ -1060,6 +1104,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+									 num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1073,8 +1121,23 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+		const int   cp_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED
+		};
+		int64       cp_val[2];
+
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP;
+		cp_val[1] = num_tuples;
+		pgstat_progress_update_multi_param(2, cp_index, cp_val);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1085,10 +1148,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1527,6 +1594,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1561,6 +1632,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1576,6 +1652,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e91df2171e..954219bc83 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1707,7 +1707,9 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 			cluster_options |= CLUOPT_VERBOSE;
 
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
+		pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, relid);
 		cluster_rel(relid, InvalidOid, cluster_options);
+		pgstat_progress_end_command();
 	}
 	else
 		heap_vacuum_rel(onerel, options, params, vac_strategy);
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 69f7265779..37ff3dbff6 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if(pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36a38..6ea5817b3d 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,27 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND					0
+#define PROGRESS_CLUSTER_PHASE						1
+#define PROGRESS_CLUSTER_INDEX_RELID				2
+#define PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES	  		3
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED		4
+#define PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED		5
+#define PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD	6
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP					1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP					2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES						3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP					4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES					5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX					6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP					7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 88a75fb798..745685c8a6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -934,7 +934,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 98f417cb57..b6c08e0b9c 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1829,6 +1829,38 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'scanning heap'::text
+            WHEN 2 THEN 'sorting tuples'::text
+            WHEN 3 THEN 'writing new heap'::text
+            WHEN 4 THEN 'scan heap and write new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param3
+            WHEN 1 THEN 'index scan'::text
+            WHEN 2 THEN 'seq scan'::text
+            ELSE NULL::text
+        END AS scan_method,
+    s.param4 AS cluster_index_relid,
+    s.param5 AS heap_tuples_total,
+    s.param6 AS heap_tuples_scanned,
+    s.param7 AS heap_tuples_vacuumed,
+    s.param8 AS heap_tuples_recently_dead
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

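For readers following the patch, here is a minimal C sketch of how the progress slots line up with the view columns. It assumes only what the hunks above show: the 0-based PROGRESS_CLUSTER_* slot numbers from progress.h, and the phase names from the CASE expression in the system_views.sql hunk (the rules.out hunk uses different names). The helpers view_param_number(), phase_name() and phase_code() are hypothetical and are not part of the patch or of PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Slot numbers copied from the progress.h hunk of the patch. */
#define PROGRESS_CLUSTER_COMMAND	0
#define PROGRESS_CLUSTER_PHASE		1

/* Mirrors PGSTAT_NUM_PROGRESS_PARAM from pgstat.h. */
#define NUM_PROGRESS_PARAM			10

/* Phase names in code order, as in the system_views.sql CASE expression. */
static const char *const phase_names[] = {
	"initializing",
	"seq scanning heap",
	"index scanning heap",
	"sorting tuples",
	"writing new heap",
	"swapping relation files",
	"rebuilding index",
	"performing final cleanup"
};

#define NUM_PHASES ((int64_t) (sizeof(phase_names) / sizeof(phase_names[0])))

/*
 * Hypothetical helper: number of the "paramN" column that exposes a slot.
 * Slots are 0-based; the paramN columns used by the view are 1-based.
 */
int
view_param_number(int slot)
{
	assert(slot >= 0 && slot < NUM_PROGRESS_PARAM);
	return slot + 1;
}

/*
 * Hypothetical helper: decode a phase code the way the view does.
 * Unknown codes yield NULL, like a CASE with no matching WHEN.
 */
const char *
phase_name(int64_t code)
{
	if (code < 0 || code >= NUM_PHASES)
		return NULL;
	return phase_names[code];
}

/* Hypothetical helper: reverse lookup, returning -1 for unknown names. */
int64_t
phase_code(const char *name)
{
	int64_t		i;

	for (i = 0; i < NUM_PHASES; i++)
	{
		if (strcmp(phase_names[i], name) == 0)
			return i;
	}
	return -1;
}
```

With these definitions, view_param_number(PROGRESS_CLUSTER_PHASE) is 2, matching S.param2 AS phase in the view, and phase_name(3) is "sorting tuples", matching PROGRESS_CLUSTER_PHASE_SORT_TUPLES.
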
#45

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Tatsuro Yamada (#44)

Re: [HACKERS] CLUSTER command progress monitor

Attached patch is wip patch.

Is it possible to remove the following patch?
Because I registered the patch twice on CF Mar.

https://commitfest.postgresql.org/22/2049/

Thanks,
Tatsuro Yamada

#46

Etsuro Fujita

fujita.etsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Tatsuro Yamada (#45)

Re: [HACKERS] CLUSTER command progress monitor

(2019/03/01 14:17), Tatsuro Yamada wrote:

Attached patch is wip patch.

Is it possible to remove the following patch?
Because I registered the patch twice on CF Mar.

https://commitfest.postgresql.org/22/2049/

Please remove the above and keep this:

https://commitfest.postgresql.org/22/1779/

which I moved from the January CF to the March one on his behalf.

Best regards,
Etsuro Fujita

#47

Robert Haas

robertmhaas@gmail.com

almost 7 years ago

In reply to: Tatsuro Yamada (#44)

Re: [HACKERS] CLUSTER command progress monitor

On Thu, Feb 28, 2019 at 11:54 PM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Attached patch is wip patch.

+ <command>CLUSTER</command> and <command>VACUUM FULL</command>,
showing current progress.

and -> or

+   certain commands during command execution.  Currently, the suppoted
+   progress reporting commands are <command>VACUUM</command> and
<command>CLUSTER</command>.

suppoted -> supported

But I'd just say: Currently, the only commands which support progress
reporting are <command>VACUUM</command> and
<command>CLUSTER</command>.

+   Running <command>VACUUM FULL</command> is listed in
<structname>pg_stat_progress_cluster</structname>
+   view because it uses <command>CLUSTER</command> command
internally.  See <xref linkend='cluster-progress-reporting'>.

How about: Running <command>VACUUM FULL</command> is listed in
<structname>pg_stat_progress_cluster</structname> because both
<command>VACUUM FULL</command> and <command>CLUSTER</command> rewrite
the table, while regular <command>VACUUM</command> only modifies it in
place.

+ Current processing command: CLUSTER/VACUUM FULL.

The command that is running. Either CLUSTER or VACUUM FULL.

+ Current processing phase of cluster/vacuum full. See <xref
linkend='cluster-phases'> or <xref linkend='vacuum-full-phases'>.

Current processing phase of CLUSTER or VACUUM FULL.

Or maybe better, just abbreviate to: Current processing phase.

+ Scan method of table: index scan/seq scan.

Eh, shouldn't this be gone now? And likewise for the view definition?

+ OID of the index.

If the table is being scanned using an index, this is the OID of the
index being used; otherwise, it is zero.

+ <entry><structfield>heap_tuples_total</structfield></entry>

Leftovers. Skipping over the rest of your documentation changes since
it looks like a bunch of things there still need to be updated.

+ pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);

This now appears inside cluster_rel(), but also vacuum_rel() is still
doing the same thing. That's wrong.
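
To make the problem concrete, here is a small self-contained C mock of the nesting. Everything in it (MockBackendStatus, mock_progress_start() and so on) is a simplified stand-in invented for illustration, not the pgstat API. It assumes that starting a command resets the parameter slots and that ending a command marks reporting as inactive, which is how I read the real functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_PROGRESS_PARAM 10	/* mirrors PGSTAT_NUM_PROGRESS_PARAM */

/* Mock types and functions: simplified stand-ins, not the pgstat API. */
typedef enum MockCommand
{
	CMD_INVALID,
	CMD_CLUSTER
} MockCommand;

typedef struct MockBackendStatus
{
	MockCommand command;
	unsigned int target;		/* OID of the relation being processed */
	int64_t		param[NUM_PROGRESS_PARAM];
	int			start_calls;	/* bookkeeping for this demo only */
} MockBackendStatus;

void
mock_progress_start(MockBackendStatus *st, MockCommand cmd, unsigned int relid)
{
	st->command = cmd;
	st->target = relid;
	memset(st->param, 0, sizeof(st->param));	/* assumed: slots are reset */
	st->start_calls++;
}

void
mock_progress_update(MockBackendStatus *st, int index, int64_t val)
{
	assert(index >= 0 && index < NUM_PROGRESS_PARAM);
	st->param[index] = val;
}

void
mock_progress_end(MockBackendStatus *st)
{
	st->command = CMD_INVALID;
	st->target = 0;
}

/* Modelled on cluster_rel() in the patch: starts and ends reporting itself. */
void
mock_cluster_rel(MockBackendStatus *st, unsigned int relid)
{
	mock_progress_start(st, CMD_CLUSTER, relid);
	mock_progress_update(st, 1, 7);		/* phase slot: final cleanup */
	mock_progress_end(st);
}

/*
 * Modelled on the VACUUM FULL branch of vacuum_rel() in the patch, which
 * wraps the call in a second start/end pair.  Returns 1 if reporting is
 * still active when the inner call returns, else 0.
 */
int
mock_vacuum_full(MockBackendStatus *st, unsigned int relid)
{
	int			still_reporting;

	mock_progress_start(st, CMD_CLUSTER, relid);
	mock_cluster_rel(st, relid);
	still_reporting = (st->command != CMD_INVALID);
	mock_progress_end(st);		/* redundant: already ended by the callee */
	return still_reporting;
}
```

In this mock, mock_vacuum_full() returns 0 and start_calls ends up at 2: the outer start is immediately overwritten by the inner one, and the outer end has nothing left to end. Keeping the start/end pair in one place avoids both.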

+ if(OidIsValid(indexOid))

Missing space. Please pgindent.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#48

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Robert Haas (#47)

2 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On 2019/03/02 4:15, Robert Haas wrote:

On Thu, Feb 28, 2019 at 11:54 PM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Attached patch is wip patch.

Thanks for your comments! :)
I revised the code and the documentation.

+ <command>CLUSTER</command> and <command>VACUUM FULL</command>,
showing current progress.

and -> or

Fixed.

+   certain commands during command execution.  Currently, the suppoted
+   progress reporting commands are <command>VACUUM</command> and
<command>CLUSTER</command>.

suppoted -> supported

But I'd just say: Currently, the only commands which support progress
reporting are <command>VACUUM</command> and
<command>CLUSTER</command>.

I chose the latter. Fixed.

+   Running <command>VACUUM FULL</command> is listed in
<structname>pg_stat_progress_cluster</structname>
+   view because it uses <command>CLUSTER</command> command
internally.  See <xref linkend='cluster-progress-reporting'>.

How about: Running <command>VACUUM FULL</command> is listed in
<structname>pg_stat_progress_cluster</structname> because both
<command>VACUUM FULL</command> and <command>CLUSTER</command> rewrite
the table, while regular <command>VACUUM</command> only modifies it in
place.

Fixed.

+ Current processing command: CLUSTER/VACUUM FULL.

The command that is running. Either CLUSTER or VACUUM FULL.

Fixed.

+ Current processing phase of cluster/vacuum full. See <xref
linkend='cluster-phases'> or <xref linkend='vacuum-full-phases'>.

Current processing phase of CLUSTER or VACUUM FULL.

Or maybe better, just abbreviate to: Current processing phase.

Fixed as you suggested.

+ Scan method of table: index scan/seq scan.

Eh, shouldn't this be gone now? And likewise for the view definition?

Fixed. Sorry, it was an oversight.

+ OID of the index.

If the table is being scanned using an index, this is the OID of the
index being used; otherwise, it is zero.

Fixed.

+ <entry><structfield>heap_tuples_total</structfield></entry>

Leftovers. Skipping over the rest of your documentation changes since
it looks like a bunch of things there still need to be updated.

I agree. Thanks a lot!
I'll split the patch into two patches: code and documentation.

+ pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);

This now appears inside cluster_rel(), but also vacuum_rel() is still
doing the same thing. That's wrong.

That was an oversight too. Fixed.

+ if(OidIsValid(indexOid))

Missing space. Please pgindent.

Fixed.
I will run pgindent later.

Please find attached files. :)

Thanks,
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v7_code.patch (text/x-patch)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3e229c693c..0d0f8f0e31 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -906,6 +906,30 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_scanned,
+        S.param5 AS heap_tuples_vacuumed
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index a74af4c171..f22ff590f0 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -35,10 +35,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -275,6 +277,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -385,6 +389,18 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if (OidIsValid(indexOid))
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -415,6 +431,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -923,12 +941,26 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
 	}
@@ -1049,6 +1081,11 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				tups_vacuumed += 1;
 				tups_recently_dead -= 1;
 			}
+
+			/* set tups_vacuumed column for VACUUM FULL */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED,
+										 tups_vacuumed);
+
 			continue;
 		}
 
@@ -1060,6 +1097,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+									 num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1073,8 +1114,25 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+		const int   cp_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED
+		};
+		int64       cp_val[2];
+
+		/* Report that we are now sorting tuples */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_SORT_TUPLES;
+		cp_val[1] = num_tuples;
+		pgstat_progress_update_multi_param(2, cp_index, cp_val);
+
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP;
+		cp_val[1] = num_tuples;
+		pgstat_progress_update_multi_param(2, cp_index, cp_val);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1085,10 +1143,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1526,6 +1588,16 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	Oid			mapped_tables[4];
 	int			reindex_flags;
 	int			i;
+	const int   cp_index[] = {
+		PROGRESS_CLUSTER_PHASE,
+		PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED
+	};
+	int64       cp_val[2];
+
+	/* Report that we are now swapping relation files */
+	cp_val[0] = PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES;
+	cp_val[1] = 0;
+	pgstat_progress_update_multi_param(2, cp_index, cp_val);
 
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
@@ -1561,6 +1633,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1576,6 +1653,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 69f7265779..37ff3dbff6 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36a38..480f2e6820 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,24 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND				0
+#define PROGRESS_CLUSTER_PHASE					1
+#define PROGRESS_CLUSTER_INDEX_RELID			2
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
+#define PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED	4
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP	1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP	2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES		3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP	4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES	5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX	6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP	7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 88a75fb798..745685c8a6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -934,7 +934,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 98f417cb57..e102e91172 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1829,6 +1829,31 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'seq scanning heap'::text
+            WHEN 2 THEN 'index scanning heap'::text
+            WHEN 3 THEN 'sorting tuples'::text
+            WHEN 4 THEN 'writing new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+    s.param4 AS cluster_index_relid,
+    s.param5 AS heap_tuples_scanned,
+    s.param6 AS heap_tuples_vacuumed
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

progress_monitor_for_cluster_command_v7_doc.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 0e73cdcdda..178e21fc1f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -344,6 +344,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> or <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3376,9 +3384,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the only commands which 
+   support progress reporting are <command>VACUUM</command> and
+   <command>CLUSTER</command>. This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3390,9 +3398,10 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   Running <command>VACUUM FULL</command> is listed in <structname>pg_stat_progress_cluster</structname>
+   because both <command>VACUUM FULL</command> and <command>CLUSTER</command> 
+   rewrite the table, while regular <command>VACUUM</command> only modifies it 
+   in place. See <xref linkend='cluster-progress-reporting'>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3569,6 +3578,202 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    </tgroup>
   </table>
 
+ </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> or <command>VACUUM FULL</command> is
+   running, the <structname>pg_stat_progress_cluster</structname> view will
+   contain a row for each backend that is currently running either command.
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       The command that is running. Either CLUSTER or VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase. See <xref linkend='cluster-phases'> or <xref linkend='vacuum-full-phases'>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       If the table is being scanned using an index, this is the OID of the
+       index being used; otherwise, it is zero.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>seq scanning heap</literal>
+       or <literal>writing new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_vacuumed</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples vacuumed. This counter only advances when the
+       command is <literal>VACUUM FULL</literal> and the phase is <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning the heap of the table
+       using a sequential scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>index scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning the heap of the table
+       using an index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples. 
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="vacuum-full-phases">
+   <title>VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently scanning the heap of the table.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently swapping old heap and new vacuumed heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is performing final cleanup.  When this phase is
+       completed, <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
  </sect2>
  </sect1>

#49

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Robert Haas (#47)

Re: [HACKERS] CLUSTER command progress monitor

On 2019/03/02 4:15, Robert Haas wrote:

On Thu, Feb 28, 2019 at 11:54 PM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Attached patch is wip patch.

I rewrote the current design of the progress monitor and also
wrote discussion points in the middle of this email. I'd like to
get any feedback from -hackers.

=== Current design ===

CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

* Scan method: Seq Scan
0. initializing (*2)
1. seq scanning heap (*1)
3. sorting tuples (*2)
4. writing new heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

* Scan method: Index Scan
0. initializing (*2)
2. index scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

VACUUM FULL command will proceed in the following sequence of phases:

1. seq scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

(*1): increasing the value in heap_tuples_scanned column
(*2): only shows the phase in the phase column
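
The three sequences above can be written down as data, which makes it easy to check what a monitoring client should expect to see. The following is only a rough sketch and is not taken from the patch: the enum and function names (cluster_phase, rewrite_kind, expected_phases, phase_advances_tuple_counter) are hypothetical, and only the phase numbers and the (*1)/(*2) markers come from the lists above.

```c
#include <assert.h>
#include <stddef.h>

/* Phase numbers exactly as listed in the message (0 = initializing). */
enum cluster_phase
{
    PHASE_INITIALIZING = 0,
    PHASE_SEQ_SCAN_HEAP = 1,
    PHASE_INDEX_SCAN_HEAP = 2,
    PHASE_SORT_TUPLES = 3,
    PHASE_WRITE_NEW_HEAP = 4,
    PHASE_SWAP_REL_FILES = 5,
    PHASE_REBUILD_INDEX = 6,
    PHASE_FINAL_CLEANUP = 7
};

/* Hypothetical selector for the three sequences described above. */
enum rewrite_kind
{
    KIND_CLUSTER_SEQ_SCAN,
    KIND_CLUSTER_INDEX_SCAN,
    KIND_VACUUM_FULL
};

/* Return the expected phase sequence for a kind; store its length in *n. */
static const int *
expected_phases(enum rewrite_kind kind, size_t *n)
{
    static const int seq_scan[] = {
        PHASE_INITIALIZING, PHASE_SEQ_SCAN_HEAP, PHASE_SORT_TUPLES,
        PHASE_WRITE_NEW_HEAP, PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX,
        PHASE_FINAL_CLEANUP
    };
    static const int index_scan[] = {
        PHASE_INITIALIZING, PHASE_INDEX_SCAN_HEAP, PHASE_SWAP_REL_FILES,
        PHASE_REBUILD_INDEX, PHASE_FINAL_CLEANUP
    };
    /* The VACUUM FULL list in the message starts at phase 1. */
    static const int vacuum_full[] = {
        PHASE_SEQ_SCAN_HEAP, PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX,
        PHASE_FINAL_CLEANUP
    };

    assert(n != NULL);
    switch (kind)
    {
        case KIND_CLUSTER_SEQ_SCAN:
            *n = sizeof(seq_scan) / sizeof(seq_scan[0]);
            return seq_scan;
        case KIND_CLUSTER_INDEX_SCAN:
            *n = sizeof(index_scan) / sizeof(index_scan[0]);
            return index_scan;
        case KIND_VACUUM_FULL:
            *n = sizeof(vacuum_full) / sizeof(vacuum_full[0]);
            return vacuum_full;
    }
    *n = 0;
    return NULL;
}

/* (*1) phases advance heap_tuples_scanned; (*2) phases only set the phase. */
static int
phase_advances_tuple_counter(int phase)
{
    return phase == PHASE_SEQ_SCAN_HEAP ||
           phase == PHASE_INDEX_SCAN_HEAP ||
           phase == PHASE_WRITE_NEW_HEAP;
}
```

A client could call expected_phases() once per command and use phase_advances_tuple_counter() to decide whether a non-moving heap_tuples_scanned value is expected in the current phase.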

=== Discussion points ===

- Progress counter for "3. sorting tuples" phase
- Should we add pgstat_progress_update_param() in tuplesort.c like a
"trace_sort"?
Thanks to Peter Geoghegan for the useful advice!

- Progress counter for "6. rebuilding index" phase
- Should we add "index_vacuum_count" in the view like a vacuum progress monitor?
If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
However, I'm not sure whether it is okay or not.

- pg_stat_progress_rewrite
- TBA

=== My test case ===

Here is my test case for the progress monitor.
If you want to watch the current progress monitor in action, you can use
this test case as an example.

[Terminal1]
Run this query on psql:

select * from pg_stat_progress_cluster; \watch 0.05

[Terminal2]
Run these queries on psql:

drop table t1;

create table t1 as select a, random() * 1000 as b from generate_series(0, 999999) a;
create index idx_t1 on t1(a);
create index idx_t1_b on t1(b);
analyze t1;

-- index scan
set enable_seqscan to off;
cluster verbose t1 using idx_t1;

-- seq scan
set enable_seqscan to on;
set enable_indexscan to off;
cluster verbose t1 using idx_t1;

-- only given table name to cluster command
cluster verbose t1;

-- only cluster command
cluster verbose;

-- vacuum full
vacuum full t1;

-- vacuum full
vacuum full;

Thanks,
Tatsuro Yamada

#50

Robert Haas

robertmhaas@gmail.com

almost 7 years ago

In reply to: Tatsuro Yamada (#49)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

=== Current design ===

CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

* Scan method: Seq Scan
0. initializing (*2)
1. seq scanning heap (*1)
3. sorting tuples (*2)
4. writing new heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

* Scan method: Index Scan
0. initializing (*2)
2. index scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

VACUUM FULL command will proceed in the following sequence of phases:

1. seq scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

(*1): increasing the value in heap_tuples_scanned column
(*2): only shows the phase in the phase column

All of that sounds good.

The view provides the information of CLUSTER command progress details as follows
# \d pg_stat_progress_cluster
           View "pg_catalog.pg_stat_progress_cluster"
        Column        |  Type   | Collation | Nullable | Default
----------------------+---------+-----------+----------+---------
 pid                  | integer |           |          |
 datid                | oid     |           |          |
 datname              | name    |           |          |
 relid                | oid     |           |          |
 command              | text    |           |          |
 phase                | text    |           |          |
 cluster_index_relid  | bigint  |           |          |
 heap_tuples_scanned  | bigint  |           |          |
 heap_tuples_vacuumed | bigint  |           |          |

Still not sure if we need heap_tuples_vacuumed. We could try to
report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
we're using a Seq Scan.

=== Discussion points ===

- Progress counter for "3. sorting tuples" phase
- Should we add pgstat_progress_update_param() in tuplesort.c like a
"trace_sort"?
Thanks to Peter Geoghegan for the useful advice!

How would we avoid an abstraction violation?

- Progress counter for "6. rebuilding index" phase
- Should we add "index_vacuum_count" in the view like a vacuum progress monitor?
If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
However, I'm not sure whether it is okay or not.

Doesn't seem unreasonable to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#51

David Steele

david@pgmasters.net

almost 7 years ago

In reply to: Etsuro Fujita (#46)

Re: Re: [HACKERS] CLUSTER command progress monitor

On 3/1/19 7:48 AM, Etsuro Fujita wrote:

(2019/03/01 14:17), Tatsuro Yamada wrote:

Attached patch is wip patch.

Is it possible to remove the following patch?
Because I registered the patch twice on CF Mar.

https://commitfest.postgresql.org/22/2049/

Please remove the above and keep this:

https://commitfest.postgresql.org/22/1779/

which I moved from the January CF to the March one on behalf of him.

I have closed the duplicate entry (#2049) and retained the entry which
contains the CF history (#1779).

Regards,
--
-David
david@pgmasters.net

#52

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: David Steele (#51)

Re: [HACKERS] CLUSTER command progress monitor

Hi David,

On 2019/03/05 17:29, David Steele wrote:

On 3/1/19 7:48 AM, Etsuro Fujita wrote:

(2019/03/01 14:17), Tatsuro Yamada wrote:

Attached patch is wip patch.

Is it possible to remove the following patch?
Because I registered the patch twice on CF Mar.

https://commitfest.postgresql.org/22/2049/

Please remove the above and keep this:

https://commitfest.postgresql.org/22/1779/

which I moved from the January CF to the March one on behalf of him.

I have closed the duplicate entry (#2049) and retained the entry which contains the CF history (#1779).

Thank you! :)

Regards,
Tatsuro Yamada

#53

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Robert Haas (#50)

Re: [HACKERS] CLUSTER command progress monitor

Hi Robert!

On 2019/03/05 11:35, Robert Haas wrote:

On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

=== Current design ===

CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

* Scan method: Seq Scan
0. initializing (*2)
1. seq scanning heap (*1)
3. sorting tuples (*2)
4. writing new heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

* Scan method: Index Scan
0. initializing (*2)
2. index scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

VACUUM FULL command will proceed in the following sequence of phases:

1. seq scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

(*1): increasing the value in heap_tuples_scanned column
(*2): only shows the phase in the phase column

All of that sounds good.

The view provides the information of CLUSTER command progress details as follows
# \d pg_stat_progress_cluster
           View "pg_catalog.pg_stat_progress_cluster"
        Column        |  Type   | Collation | Nullable | Default
----------------------+---------+-----------+----------+---------
 pid                  | integer |           |          |
 datid                | oid     |           |          |
 datname              | name    |           |          |
 relid                | oid     |           |          |
 command              | text    |           |          |
 phase                | text    |           |          |
 cluster_index_relid  | bigint  |           |          |
 heap_tuples_scanned  | bigint  |           |          |
 heap_tuples_vacuumed | bigint  |           |          |

Still not sure if we need heap_tuples_vacuumed. We could try to
report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
we're using a Seq Scan.

I have no strong opinion about keeping heap_tuples_vacuumed, so I'll remove it in
the next patch.

Regarding heap_blks_scanned and heap_blks_total, I suppose we can get
those from initscan(). I'll investigate it more.

cluster.c
copy_heap_data()
    heap_beginscan()
      heap_beginscan_internal()
        initscan()

=== Discussion points ===

- Progress counter for "3. sorting tuples" phase
- Should we add pgstat_progress_update_param() in tuplesort.c like a
"trace_sort"?
Thanks to Peter Geoghegan for the useful advice!

How would we avoid an abstraction violation?

Hmm... What do you mean by an abstraction violation?
If it is difficult to solve, I'd rather not add the progress counter for sorting tuples.

- Progress counter for "6. rebuilding index" phase
- Should we add "index_vacuum_count" in the view like a vacuum progress monitor?
If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
However, I'm not sure whether it is okay or not.

Doesn't seem unreasonable to me.

I see, I'll add it later.

Regards,
Tatsuro Yamada

#54

Robert Haas

robertmhaas@gmail.com

almost 7 years ago

In reply to: Tatsuro Yamada (#53)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Mar 5, 2019 at 3:56 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

=== Discussion points ===

- Progress counter for "3. sorting tuples" phase
- Should we add pgstat_progress_update_param() in tuplesort.c like a
"trace_sort"?
Thanks to Peter Geoghegan for the useful advice!

How would we avoid an abstraction violation?

Hmm... What do you mean by an abstraction violation?
If it is difficult to solve, I'd rather not add the progress counter for sorting tuples.

What I mean is... I think it would be useful to have this counter, but
I'm not sure how the tuplesort code would know to update the counter
in this case and not in other cases. The tuplesort code is used for
lots of things; we can't update a counter for CLUSTER if the tuplesort
is being used for CREATE INDEX or a Sort node in a query or whatever.
So my question is how we would indicate to the tuplesort that it needs
to do the counter update, and whether that would end up making for
ugly code.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#55

Alvaro Herrera

alvherre@2ndquadrant.com

almost 7 years ago

In reply to: Robert Haas (#50)

Re: [HACKERS] CLUSTER command progress monitor

On 2019-Mar-04, Robert Haas wrote:

On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

=== Discussion points ===

- Progress counter for "3. sorting tuples" phase
- Should we add pgstat_progress_update_param() in tuplesort.c like a
"trace_sort"?
Thanks to Peter Geoghegan for the useful advice!

How would we avoid an abstraction violation?

The theory embodied in my patch at /messages/by-id/20190304204607.GA15946@alvherre.pgsql
is that we don't; tuplesort.c functions (index.c's IndexBuildHeapScan in
my case) would get a boolean parameter to indicate whether to update
some params or not -- the param number(s) to update are supposed to be
generic in the sense that it's not part of any individual command's
implementation (PROGRESS_SCAN_BLOCKS_DONE for what you call "blks
scanned", PROGRESS_SCAN_BLOCKS_TOTAL for "blks total"), but rather
defined by the "progress update provider" (index.c or tuplesort.c).

One, err, small issue with that idea is that we need the param numbers
not to conflict for any "progress update providers" that are to be used
simultaneously by any command.
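
The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the referenced patch: progress_update_param() is a stand-in for pgstat_progress_update_param(), scan_blocks() is a hypothetical "progress update provider", and the slot numbers 8 and 9 are made up. Only the names PROGRESS_SCAN_BLOCKS_DONE and PROGRESS_SCAN_BLOCKS_TOTAL and the 10-slot limit appear in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PROGRESS_PARAM 10    /* mirrors PGSTAT_NUM_PROGRESS_PARAM */

/* Generic slots owned by the provider, not by any command (numbers assumed). */
#define PROGRESS_SCAN_BLOCKS_TOTAL 8
#define PROGRESS_SCAN_BLOCKS_DONE  9

/* Stand-in for the backend's per-command progress parameter array. */
static int64_t progress_param[NUM_PROGRESS_PARAM];

/* Stand-in for pgstat_progress_update_param(). */
static void
progress_update_param(int index, int64_t val)
{
    assert(index >= 0 && index < NUM_PROGRESS_PARAM);
    progress_param[index] = val;
}

/*
 * Hypothetical provider: processes nblocks blocks and reports progress
 * only when the caller asks for it, so unrelated callers stay silent.
 */
static int64_t
scan_blocks(int64_t nblocks, bool report_progress)
{
    int64_t blkno;

    if (report_progress)
        progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL, nblocks);

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        /* ... per-block work would go here ... */
        if (report_progress)
            progress_update_param(PROGRESS_SCAN_BLOCKS_DONE, blkno + 1);
    }
    return nblocks;
}

/* True if two sets of slot numbers share no slot (the "small issue"). */
static bool
slots_disjoint(const int *a, int na, const int *b, int nb)
{
    int i, j;

    for (i = 0; i < na; i++)
        for (j = 0; j < nb; j++)
            if (a[i] == b[j])
                return false;
    return true;
}
```

A command that uses two providers at once would have to make sure their slot sets pass a check like slots_disjoint(), and that neither overlaps the command's own parameters.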

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#56

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Robert Haas (#54)

Re: [HACKERS] CLUSTER command progress monitor

On 2019/03/06 1:13, Robert Haas wrote:

On Tue, Mar 5, 2019 at 3:56 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

=== Discussion points ===

- Progress counter for "3. sorting tuples" phase
- Should we add pgstat_progress_update_param() in tuplesort.c like a
"trace_sort"?
Thanks to Peter Geoghegan for the useful advice!

How would we avoid an abstraction violation?

Hmm... What do you mean by an abstraction violation?
If it is difficult to solve, I'd rather not add the progress counter for sorting tuples.

What I mean is... I think it would be useful to have this counter, but
I'm not sure how the tuplesort code would know to update the counter
in this case and not in other cases. The tuplesort code is used for
lots of things; we can't update a counter for CLUSTER if the tuplesort
is being used for CREATE INDEX or a Sort node in a query or whatever.
So my question is how we would indicate to the tuplesort that it needs
to do the counter update, and whether that would end up making for
ugly code.

Thanks for your explanation!
I understand now. I guess it would require an API for reporting the progress of sort processing,
so let's put it aside for now. I'd like to leave that to an appropriate person.

Regards,
Tatsuro Yamada

#57

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Tatsuro Yamada (#53)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On 2019/03/05 17:56, Tatsuro Yamada wrote:

Hi Robert!

On 2019/03/05 11:35, Robert Haas wrote:

On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

=== Current design ===

CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

    * Scan method: Seq Scan
      0. initializing                 (*2)
      1. seq scanning heap            (*1)
      3. sorting tuples               (*2)
      4. writing new heap             (*1)
      5. swapping relation files      (*2)
      6. rebuilding index             (*2)
      7. performing final cleanup     (*2)

    * Scan method: Index Scan
      0. initializing                 (*2)
      2. index scanning heap          (*1)
      5. swapping relation files      (*2)
      6. rebuilding index             (*2)
      7. performing final cleanup     (*2)

VACUUM FULL command will proceed in the following sequence of phases:

      1. seq scanning heap            (*1)
      5. swapping relation files      (*2)
      6. rebuilding index             (*2)
      7. performing final cleanup     (*2)

(*1): increasing the value in heap_tuples_scanned column
(*2): only shows the phase in the phase column

All of that sounds good.

The view provides the information of CLUSTER command progress details as follows
# \d pg_stat_progress_cluster
           View "pg_catalog.pg_stat_progress_cluster"
        Column        |  Type   | Collation | Nullable | Default
----------------------+---------+-----------+----------+---------
 pid                  | integer |           |          |
 datid                | oid     |           |          |
 datname              | name    |           |          |
 relid                | oid     |           |          |
 command              | text    |           |          |
 phase                | text    |           |          |
 cluster_index_relid  | bigint  |           |          |
 heap_tuples_scanned  | bigint  |           |          |
 heap_tuples_vacuumed | bigint  |           |          |

Still not sure if we need heap_tuples_vacuumed. We could try to
report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
we're using a Seq Scan.

I have no strong opinion about keeping heap_tuples_vacuumed, so I'll remove it in
the next patch.

Regarding heap_blks_scanned and heap_blks_total, I suppose we can get
those from initscan(). I'll investigate it more.

cluster.c
copy_heap_data()
    heap_beginscan()
      heap_beginscan_internal()
        initscan()

=== Discussion points ===

   - Progress counter for "3. sorting tuples" phase
      - Should we add pgstat_progress_update_param() in tuplesort.c like a
        "trace_sort"?
        Thanks to Peter Geoghegan for the useful advice!

How would we avoid an abstraction violation?

Hmm... What do you mean by an abstraction violation?
If it is difficult to solve, I'd rather not add the progress counter for sorting tuples.

   - Progress counter for "6. rebuilding index" phase
      - Should we add "index_vacuum_count" in the view like a vacuum progress monitor?
        If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
        However, I'm not sure whether it is okay or not.

Doesn't seem unreasonable to me.

I see, I'll add it later.

The attached file is a revised WIP patch that includes the following changes:

- Remove heap_tuples_vacuumed
- Add heap_blks_scanned and heap_blks_total
- Add index_vacuum_count

While adding the heap_blks_scanned and heap_blks_total columns, I realized that
the heap_tuples_scanned column is suitable as a counter for both the index-scan
and seq-scan methods, because CLUSTER processes the heap on a tuple basis.

Regards,
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v8_code.patch (text/x-patch)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index d16c3d0ea5..a88e8d2492 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -51,6 +51,7 @@
 #include "catalog/storage.h"
 #include "commands/tablecmds.h"
 #include "commands/event_trigger.h"
+#include "commands/progress.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -58,6 +59,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/parser.h"
+#include "pgstat.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -3850,6 +3852,7 @@ reindex_relation(Oid relid, int flags, int options)
 	List	   *indexIds;
 	bool		is_pg_class;
 	bool		result;
+	int			i;
 
 	/*
 	 * Open and lock the relation.  ShareLock is sufficient since we only need
@@ -3937,6 +3940,7 @@ reindex_relation(Oid relid, int flags, int options)
 
 		/* Reindex all the indexes. */
 		doneIndexes = NIL;
+		i = 1;
 		foreach(indexId, indexIds)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
@@ -3954,6 +3958,11 @@ reindex_relation(Oid relid, int flags, int options)
 
 			if (is_pg_class)
 				doneIndexes = lappend_oid(doneIndexes, indexOid);
+
+			/* Set index rebuild count */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+										 i);
+			i++;
 		}
 	}
 	PG_CATCH();
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3e229c693c..88f3940fa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -906,6 +906,32 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_scanned,
+        S.param5 AS heap_blks_total,
+        S.param6 AS heap_blks_scanned,
+        S.param7 AS index_rebuild_count
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index a74af4c171..73b5e73b04 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -35,10 +35,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -275,6 +277,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -385,6 +389,18 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if (OidIsValid(indexOid))
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -415,6 +431,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -923,14 +941,33 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		/* Set phase */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
+
+		/* Set total heap blocks */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+									 heapScan->rs_nblocks);
 	}
 
 	/* Log what we're doing */
@@ -984,6 +1021,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				break;
 
 			buf = heapScan->rs_cbuf;
+
+			/* Set heap blocks scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+										 heapScan->rs_cblock);
 		}
 
 		LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -1049,6 +1090,13 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				tups_vacuumed += 1;
 				tups_recently_dead -= 1;
 			}
+
+			/* set tups_vacuumed column for VACUUM FULL */
+			/*
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED,
+										 tups_vacuumed);
+			*/
+
 			continue;
 		}
 
@@ -1060,6 +1108,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+									 num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1073,8 +1125,29 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+		const int   cp_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+			PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+			PROGRESS_CLUSTER_HEAP_BLKS_SCANNED
+		};
+		int64       cp_val[4];
+
+		/* Report that we are now sorting tuples */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_SORT_TUPLES;
+		cp_val[1] = num_tuples;
+		cp_val[2] = 0;
+		cp_val[3] = 0;
+		pgstat_progress_update_multi_param(4, cp_index, cp_val);
+
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP;
+		cp_val[1] = num_tuples;
+		pgstat_progress_update_multi_param(2, cp_index, cp_val);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1085,10 +1158,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+			/* Report num_tuples */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1526,6 +1603,16 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	Oid			mapped_tables[4];
 	int			reindex_flags;
 	int			i;
+	const int   cp_index[] = {
+		PROGRESS_CLUSTER_PHASE,
+		PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED
+	};
+	int64       cp_val[2];
+
+	/* Report that we are now swapping relation files */
+	cp_val[0] = PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES;
+	cp_val[1] = 0;
+	pgstat_progress_update_multi_param(2, cp_index, cp_val);
 
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
@@ -1561,6 +1648,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1576,6 +1668,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 69f7265779..37ff3dbff6 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36a38..0f637fe4e7 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,26 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND				0
+#define PROGRESS_CLUSTER_PHASE					1
+#define PROGRESS_CLUSTER_INDEX_RELID			2
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
+#define PROGRESS_CLUSTER_TOTAL_HEAP_BLKS		4
+#define PROGRESS_CLUSTER_HEAP_BLKS_SCANNED		5
+#define PROGRESS_CLUSTER_INDEX_REBUILD_COUNT	6
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP	1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP	2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES		3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP	4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES	5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX	6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP	7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 88a75fb798..745685c8a6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -934,7 +934,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 98f417cb57..f72fdd4d92 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1829,6 +1829,33 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'seq scanning heap'::text
+            WHEN 2 THEN 'index scanning heap'::text
+            WHEN 3 THEN 'sorting tuples'::text
+            WHEN 4 THEN 'writing new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+    s.param3 AS cluster_index_relid,
+    s.param4 AS heap_tuples_scanned,
+    s.param5 AS heap_blks_total,
+    s.param6 AS heap_blks_scanned,
+    s.param7 AS index_rebuild_count
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

#58

Robert Haas

robertmhaas@gmail.com

almost 7 years ago

In reply to: Alvaro Herrera (#55)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Mar 5, 2019 at 8:03 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

One, err, small issue with that idea is that we need the param numbers
not to conflict for any "progress update providers" that are to be used
simultaneously by any command.

Is that really an issue? I think progress reporting -- at least with
the current infrastructure -- is only ever going to be possible for
utility commands, not queries. And those really shouldn't have very
many sorts going on at once.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#59

Alvaro Herrera

alvherre@2ndquadrant.com

almost 7 years ago

In reply to: Robert Haas (#58)

Re: [HACKERS] CLUSTER command progress monitor

On 2019-Mar-06, Robert Haas wrote:

On Tue, Mar 5, 2019 at 8:03 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

One, err, small issue with that idea is that we need the param numbers
not to conflict for any "progress update providers" that are to be used
simultaneously by any command.

Is that really an issue? I think progress reporting -- at least with
the current infrastructure -- is only ever going to be possible for
utility commands, not queries. And those really shouldn't have very
many sorts going on at once.

Well, I don't think it is, but I thought it was worth pointing out
explicitly.
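
To make the concern concrete, here is a minimal sketch in C of why param numbers must not overlap. It is not PostgreSQL code: `ProgressSlots`, `progress_update()` and `progress_indexes_conflict()` are hypothetical names, and the only detail taken from the patch context is the slot count of 10 (PGSTAT_NUM_PROGRESS_PARAM). It assumes each backend has one flat array of counters, so two providers writing the same index simply overwrite each other.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors PGSTAT_NUM_PROGRESS_PARAM from the patch context. */
#define NUM_PROGRESS_PARAM 10

/* Hypothetical stand-in for one backend's progress counters. */
typedef struct
{
	int64_t		param[NUM_PROGRESS_PARAM];
} ProgressSlots;

/* Clear all counters, as happens when a command starts reporting. */
void
progress_reset(ProgressSlots *s)
{
	memset(s, 0, sizeof(*s));
}

/* Store a value in one slot; the last writer wins. */
void
progress_update(ProgressSlots *s, int index, int64_t val)
{
	assert(index >= 0 && index < NUM_PROGRESS_PARAM);
	s->param[index] = val;
}

/* Return 1 if two providers use at least one common slot index. */
int
progress_indexes_conflict(const int *a, int na, const int *b, int nb)
{
	int			i;
	int			j;

	for (i = 0; i < na; i++)
		for (j = 0; j < nb; j++)
			if (a[i] == b[j])
				return 1;
	return 0;
}
```

With CLUSTER using indexes 0 to 6 as in the patch, a sort-progress provider would be safe on, say, indexes 7 and 8 (an arbitrary choice for illustration), but would clobber heap_tuples_scanned if it wrote to index 3.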

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#60

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Tatsuro Yamada (#57)

2 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On 2019/03/06 15:38, Tatsuro Yamada wrote:

On 2019/03/05 17:56, Tatsuro Yamada wrote:

On 2019/03/05 11:35, Robert Haas wrote:

On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

=== Current design ===

CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

    * Scan method: Seq Scan
      0. initializing                 (*2)
      1. seq scanning heap            (*1)
      3. sorting tuples               (*2)
      4. writing new heap             (*1)
      5. swapping relation files      (*2)
      6. rebuilding index             (*2)
      7. performing final cleanup     (*2)

    * Scan method: Index Scan
      0. initializing                 (*2)
      2. index scanning heap          (*1)
      5. swapping relation files      (*2)
      6. rebuilding index             (*2)
      7. performing final cleanup     (*2)

VACUUM FULL command will proceed in the following sequence of phases:

      1. seq scanning heap            (*1)
      5. swapping relation files      (*2)
      6. rebuilding index             (*2)
      7. performing final cleanup     (*2)

(*1): increasing the value in heap_tuples_scanned column
(*2): only shows the phase in the phase column
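
The phase sequences above can be sketched as a small lookup in C. This is only an illustration under stated assumptions: the numeric codes and phase names follow the view definition in the patch, but `cluster_phase_name()` and `cluster_expected_phases()` are hypothetical helpers that do not exist in the patch or in PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Phase names indexed by the code stored in the phase parameter. */
static const char *const phase_names[] = {
	"initializing",				/* 0 */
	"seq scanning heap",		/* 1 */
	"index scanning heap",		/* 2 */
	"sorting tuples",			/* 3 */
	"writing new heap",			/* 4 */
	"swapping relation files",	/* 5 */
	"rebuilding index",			/* 6 */
	"performing final cleanup"	/* 7 */
};

/* Expected phase order for each CLUSTER scan method, per the lists above. */
static const int seq_scan_phases[] = {0, 1, 3, 4, 5, 6, 7};
static const int index_scan_phases[] = {0, 2, 5, 6, 7};

/* Map a phase code to its display name; NULL for unknown codes. */
const char *
cluster_phase_name(int code)
{
	int			n = (int) (sizeof(phase_names) / sizeof(phase_names[0]));

	if (code < 0 || code >= n)
		return NULL;
	return phase_names[code];
}

/* Return the number of phases and point *phases at the expected sequence. */
size_t
cluster_expected_phases(int use_index_scan, const int **phases)
{
	assert(phases != NULL);
	if (use_index_scan)
	{
		*phases = index_scan_phases;
		return sizeof(index_scan_phases) / sizeof(index_scan_phases[0]);
	}
	*phases = seq_scan_phases;
	return sizeof(seq_scan_phases) / sizeof(seq_scan_phases[0]);
}
```

Returning NULL for unknown codes matches what the view's CASE expression yields when no WHEN branch applies.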

All of that sounds good.

The view provides CLUSTER command progress details as follows:

# \d pg_stat_progress_cluster
           View "pg_catalog.pg_stat_progress_cluster"
        Column        |  Type   | Collation | Nullable | Default
----------------------+---------+-----------+----------+---------
 pid                  | integer |           |          |
 datid                | oid     |           |          |
 datname              | name    |           |          |
 relid                | oid     |           |          |
 command              | text    |           |          |
 phase                | text    |           |          |
 cluster_index_relid  | bigint  |           |          |
 heap_tuples_scanned  | bigint  |           |          |
 heap_tuples_vacuumed | bigint  |           |          |

Still not sure if we need heap_tuples_vacuumed. We could try to
report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
we're using a Seq Scan.

I have no strong opinion on adding heap_tuples_vacuumed, so I'll remove it in
the next patch.

Regarding heap_blks_scanned and heap_blks_total, I suppose we can get those
from initscan(). I'll investigate it more.

cluster.c
   copy_heap_data()
     heap_beginscan()
       heap_beginscan_internal()
         initscan()
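
As a rough sketch of how a consumer might use the two block counters once they are reported, the following C helper turns them into a percentage. `heap_scan_percent()` is a hypothetical function, not part of the patch; it assumes `blks_total` is zero when no total has been reported (for example during an index scan) and clamps the result to 100.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Percentage of heap blocks scanned so far, in the range 0..100.
 * Returns 0 when no total is known, so the caller never divides by zero.
 */
double
heap_scan_percent(int64_t blks_scanned, int64_t blks_total)
{
	if (blks_total <= 0 || blks_scanned <= 0)
		return 0.0;
	if (blks_scanned > blks_total)
		blks_scanned = blks_total;
	return 100.0 * (double) blks_scanned / (double) blks_total;
}
```

For example, 50 blocks scanned out of 200 gives 25 percent.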

=== Discussion points ===

   - Progress counter for "3. sorting tuples" phase
      - Should we add pgstat_progress_update_param() calls in tuplesort.c,
        similar to "trace_sort"?
        Thanks to Peter Geoghegan for the useful advice!

How would we avoid an abstraction violation?

Hmm... What do you mean by an abstraction violation?
If it is difficult to solve, I'd rather not add a progress counter for the sorting tuples phase.

   - Progress counter for "6. rebuilding index" phase
      - Should we add "index_vacuum_count" to the view, as in the vacuum progress monitor?
        If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
        However, I'm not sure whether it is okay or not.

Doesn't seem unreasonable to me.

I see, I'll add it later.

The attached file is a revised WIP patch that includes:

- Remove heap_tuples_vacuumed
- Add heap_blks_scanned and heap_blks_total
- Add index_vacuum_count

I tried adding the "heap_blks_scanned" and "heap_blks_total" columns, and I realized that
the "heap_tuples_scanned" column is suitable as a counter for both the index-scan and
seq-scan methods, because CLUSTER works on a tuple basis.

The attached file is the patch rebased on current HEAD.
I changed the status. :)

Regards,
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v9_code.patch (text/x-patch)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 1ee1ed2894..acda12bf52 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -51,6 +51,7 @@
 #include "catalog/storage.h"
 #include "commands/tablecmds.h"
 #include "commands/event_trigger.h"
+#include "commands/progress.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -58,6 +59,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/parser.h"
+#include "pgstat.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -3851,6 +3853,7 @@ reindex_relation(Oid relid, int flags, int options)
 	List	   *indexIds;
 	bool		is_pg_class;
 	bool		result;
+	int			i;
 
 	/*
 	 * Open and lock the relation.  ShareLock is sufficient since we only need
@@ -3938,6 +3941,7 @@ reindex_relation(Oid relid, int flags, int options)
 
 		/* Reindex all the indexes. */
 		doneIndexes = NIL;
+		i = 1;
 		foreach(indexId, indexIds)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
@@ -3955,6 +3959,11 @@ reindex_relation(Oid relid, int flags, int options)
 
 			if (is_pg_class)
 				doneIndexes = lappend_oid(doneIndexes, indexOid);
+
+			/* Set index rebuild count */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+										 i);
+			i++;
 		}
 	}
 	PG_CATCH();
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3e229c693c..88f3940fa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -906,6 +906,32 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_scanned,
+        S.param5 AS heap_blks_total,
+        S.param6 AS heap_blks_scanned,
+        S.param7 AS index_rebuild_count
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 4d6453d924..c9a84ac805 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -35,10 +35,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -275,6 +277,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -385,6 +389,18 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if (OidIsValid(indexOid))
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -415,6 +431,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -924,14 +942,33 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		/* Set phase */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
+
+		/* Set total heap blocks */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+									 heapScan->rs_nblocks);
 	}
 
 	/* Log what we're doing */
@@ -985,6 +1022,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				break;
 
 			buf = heapScan->rs_cbuf;
+
+			/* Set heap blocks scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+										 heapScan->rs_cblock);
 		}
 
 		LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -1061,6 +1102,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+									 num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1074,8 +1119,29 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+		const int   cp_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+			PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+			PROGRESS_CLUSTER_HEAP_BLKS_SCANNED
+		};
+		int64       cp_val[4];
+
+		/* Report that we are now sorting tuples */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_SORT_TUPLES;
+		cp_val[1] = num_tuples;
+		cp_val[2] = 0;
+		cp_val[3] = 0;
+		pgstat_progress_update_multi_param(4, cp_index, cp_val);
+
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP;
+		cp_val[1] = num_tuples;
+		pgstat_progress_update_multi_param(2, cp_index, cp_val);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1086,10 +1152,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+			/* Report num_tuples */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1527,6 +1597,16 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	Oid			mapped_tables[4];
 	int			reindex_flags;
 	int			i;
+	const int   cp_index[] = {
+		PROGRESS_CLUSTER_PHASE,
+		PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED
+	};
+	int64       cp_val[2];
+
+	/* Report that we are now swapping relation files */
+	cp_val[0] = PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES;
+	cp_val[1] = 0;
+	pgstat_progress_update_multi_param(2, cp_index, cp_val);
 
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
@@ -1562,6 +1642,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1577,6 +1662,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 69f7265779..37ff3dbff6 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36a38..0f637fe4e7 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,26 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND				0
+#define PROGRESS_CLUSTER_PHASE					1
+#define PROGRESS_CLUSTER_INDEX_RELID			2
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
+#define PROGRESS_CLUSTER_TOTAL_HEAP_BLKS		4
+#define PROGRESS_CLUSTER_HEAP_BLKS_SCANNED		5
+#define PROGRESS_CLUSTER_INDEX_REBUILD_COUNT	6
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP	1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP	2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES		3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP	4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES	5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX	6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP	7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 88a75fb798..745685c8a6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -934,7 +934,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 98f417cb57..f72fdd4d92 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1829,6 +1829,33 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'seq scanning heap'::text
+            WHEN 2 THEN 'index scanning heap'::text
+            WHEN 3 THEN 'sorting tuples'::text
+            WHEN 4 THEN 'writing new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+    s.param3 AS cluster_index_relid,
+    s.param4 AS heap_tuples_scanned,
+    s.param5 AS heap_blks_total,
+    s.param6 AS heap_blks_scanned,
+    s.param7 AS index_rebuild_count
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

progress_monitor_for_cluster_command_v9_doc.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 0e73cdcdda..cd1743fcac 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -344,6 +344,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> or <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend="cluster-progress-reporting"/>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3376,9 +3384,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the only commands which
+   support progress reporting are <command>VACUUM</command> and
+   <command>CLUSTER</command>.  This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3390,9 +3398,10 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   Backends running <command>VACUUM FULL</command> are listed in
+   <structname>pg_stat_progress_cluster</structname> instead, because both
+   <command>VACUUM FULL</command> and <command>CLUSTER</command> rewrite the table, while
+   regular <command>VACUUM</command> only modifies it in place.  See <xref linkend="cluster-progress-reporting"/>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3569,6 +3578,218 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    </tgroup>
   </table>
 
+ </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> or <command>VACUUM FULL</command> is
+   running, the <structname>pg_stat_progress_cluster</structname> view will contain
+   a row for each backend that is currently running either command.
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       The command that is running. Either CLUSTER or VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase. See <xref linkend='cluster-phases'/> or <xref linkend='vacuum-full-phases'/>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       If the table is being scanned using an index, this is the OID of the
+       index being used; otherwise, it is zero.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>seq scanning heap</literal>,
+       <literal>index scanning heap</literal> or <literal>writing new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_total</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Total number of heap blocks in the table.  This number is reported
+       as of the beginning of <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap blocks scanned. 
+       This counter only advances when the phase is <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>index_rebuild_count</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of indexes rebuilt.
+       This counter only advances when the phase is <literal>rebuilding index</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning the table using a
+       sequential scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>index scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning the table using an
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently swapping the old heap with the new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently rebuilding indexes.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="vacuum-full-phases">
+   <title>VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently scanning the table using a sequential scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently swapping the old heap with the new vacuumed heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently rebuilding indexes.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is performing final cleanup.  When this phase is
+       completed, <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
  </sect2>
  </sect1>

#61

Rafia Sabih

rafia.pghackers@gmail.com

almost 7 years ago

In reply to: Tatsuro Yamada (#60)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, 8 Mar 2019 at 09:14, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

On 2019/03/06 15:38, Tatsuro Yamada wrote:

On 2019/03/05 17:56, Tatsuro Yamada wrote:

On 2019/03/05 11:35, Robert Haas wrote:

On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

=== Current design ===

CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

* Scan method: Seq Scan
0. initializing (*2)
1. seq scanning heap (*1)
3. sorting tuples (*2)
4. writing new heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

* Scan method: Index Scan
0. initializing (*2)
2. index scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

VACUUM FULL command will proceed in the following sequence of phases:

1. seq scanning heap (*1)
5. swapping relation files (*2)
6. rebuilding index (*2)
7. performing final cleanup (*2)

(*1): increasing the value in heap_tuples_scanned column
(*2): only shows the phase in the phase column
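
As a rough illustration of the design above, the phase sequences and the
(*1)/(*2) distinction could be modelled as follows. This is only a sketch:
the enum and function names are hypothetical and are not taken from the
patch; only the phase numbers 0..7 and their order come from the lists above.

```c
#include <assert.h>
#include <stddef.h>

/* Phase numbers as used in the design above; the names are illustrative. */
enum cluster_phase {
    PHASE_INITIALIZING = 0,
    PHASE_SEQ_SCAN_HEAP = 1,
    PHASE_INDEX_SCAN_HEAP = 2,
    PHASE_SORT_TUPLES = 3,
    PHASE_WRITE_NEW_HEAP = 4,
    PHASE_SWAP_REL_FILES = 5,
    PHASE_REBUILD_INDEX = 6,
    PHASE_FINAL_CLEANUP = 7
};

/* Hypothetical selector for the three cases described above. */
enum scan_method { METHOD_SEQ_SCAN, METHOD_INDEX_SCAN, METHOD_VACUUM_FULL };

/* Store the expected phase sequence in *seq and return its length. */
static size_t expected_phases(enum scan_method method, const int **seq)
{
    static const int seq_scan[] = {
        PHASE_INITIALIZING, PHASE_SEQ_SCAN_HEAP, PHASE_SORT_TUPLES,
        PHASE_WRITE_NEW_HEAP, PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX,
        PHASE_FINAL_CLEANUP
    };
    static const int index_scan[] = {
        PHASE_INITIALIZING, PHASE_INDEX_SCAN_HEAP, PHASE_SWAP_REL_FILES,
        PHASE_REBUILD_INDEX, PHASE_FINAL_CLEANUP
    };
    /* The VACUUM FULL list above starts at "seq scanning heap". */
    static const int vacuum_full[] = {
        PHASE_SEQ_SCAN_HEAP, PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX,
        PHASE_FINAL_CLEANUP
    };

    switch (method) {
    case METHOD_SEQ_SCAN:
        *seq = seq_scan;
        return sizeof seq_scan / sizeof seq_scan[0];
    case METHOD_INDEX_SCAN:
        *seq = index_scan;
        return sizeof index_scan / sizeof index_scan[0];
    case METHOD_VACUUM_FULL:
        *seq = vacuum_full;
        return sizeof vacuum_full / sizeof vacuum_full[0];
    }
    assert(0 && "unknown scan method");
    *seq = NULL;
    return 0;
}

/* Phases marked (*1) advance heap_tuples_scanned; (*2) only set the phase. */
static int phase_advances_tuple_counter(int phase)
{
    return phase == PHASE_SEQ_SCAN_HEAP ||
           phase == PHASE_INDEX_SCAN_HEAP ||
           phase == PHASE_WRITE_NEW_HEAP;
}
```

A monitoring client could use such a table to decide whether an unchanged
heap_tuples_scanned value is expected in the current phase.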

All of that sounds good.

The view provides the information of CLUSTER command progress details as follows:
# \d pg_stat_progress_cluster
            View "pg_catalog.pg_stat_progress_cluster"
        Column        |  Type   | Collation | Nullable | Default
----------------------+---------+-----------+----------+---------
 pid                  | integer |           |          |
 datid                | oid     |           |          |
 datname              | name    |           |          |
 relid                | oid     |           |          |
 command              | text    |           |          |
 phase                | text    |           |          |
 cluster_index_relid  | bigint  |           |          |
 heap_tuples_scanned  | bigint  |           |          |
 heap_tuples_vacuumed | bigint  |           |          |

Still not sure if we need heap_tuples_vacuumed. We could try to
report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
we're using a Seq Scan.
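
To illustrate why block counters are useful for a Seq Scan, here is a small
sketch of how a client might turn heap_blks_scanned and heap_blks_total into
a percentage. The function name and the -1.0 "unknown" convention are
assumptions made for this example; they are not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: percentage of the heap scanned so far.
 * Returns -1.0 when the total is not known (for example during an
 * index scan, where heap_blks_total is not reported).
 */
static double seq_scan_percent_done(int64_t heap_blks_scanned,
                                    int64_t heap_blks_total)
{
    if (heap_blks_total <= 0 || heap_blks_scanned < 0)
        return -1.0;
    /* Clamp, in case the table grew after the total was reported. */
    if (heap_blks_scanned >= heap_blks_total)
        return 100.0;
    return 100.0 * (double) heap_blks_scanned / (double) heap_blks_total;
}
```

No comparable ratio is available from heap_tuples_scanned alone, since the
total number of tuples is not reported.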

I have no strong opinion about adding heap_tuples_vacuumed, so I'll remove it in
the next patch.

Regarding heap_blks_scanned and heap_blks_total, I suppose they can be obtained
from initscan(). I'll investigate it more.

cluster.c
  copy_heap_data()
    heap_beginscan()
      heap_beginscan_internal()
        initscan()

=== Discussion points ===

- Progress counter for "3. sorting tuples" phase
- Should we add pgstat_progress_update_param() in tuplesort.c like a
"trace_sort"?
Thanks to Peter Geoghegan for the useful advice!

How would we avoid an abstraction violation?

Hmm... What do you mean by an abstraction violation?
If it is difficult to solve, I'd rather not add a progress counter for the sorting tuples phase.

- Progress counter for "6. rebuilding index" phase
- Should we add "index_vacuum_count" in the view like a vacuum progress monitor?
If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
However, I'm not sure whether it is okay or not.

Doesn't seem unreasonable to me.

I see, I'll add it later.

Attached file is revised and WIP patch including:

- Remove heap_tuples_vacuumed
- Add heap_blks_scanned and heap_blks_total
- Add index_vacuum_count

I added the heap_blks_scanned and heap_blks_total columns, and I realized that
the heap_tuples_scanned column is suitable as a counter for both index scans and
seq scans, because CLUSTER works on a tuple basis.

The attached file is the patch rebased on current HEAD.
I changed the status. :)

Looks like the patch needs a rebase.
I was on the commit fb5806533f9fe0433290d84c9b019399cd69e9c2

PFA reject file in case you want to have a look.

Regards,
Tatsuro Yamada

--
Regards,
Rafia Sabih

Attachments:

cluster.c.rej (text/x-reject; charset=US-ASCII)

--- src/backend/commands/cluster.c
+++ src/backend/commands/cluster.c
@@ -942,14 +960,33 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
 	{
+		/* Set phase */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		heapScan = heap_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		indexScan = NULL;
+
+		/* Set total heap blocks */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+									 heapScan->rs_nblocks);
 	}
 
 	/* Log what we're doing */

#62

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Rafia Sabih (#61)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

Hi Rafia!

On 2019/03/18 20:42, Rafia Sabih wrote:

On Fri, 8 Mar 2019 at 09:14, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

The attached file is the patch rebased on current HEAD.
I changed the status. :)

Looks like the patch needs a rebase.
I was on the commit fb5806533f9fe0433290d84c9b019399cd69e9c2

PFA reject file in case you want to have a look.

Thanks for testing it. :)
I rebased the patch on the current head: f2004f19ed9c9228d3ea2b12379ccb4b9212641f.

Please find attached file.

Also, I'm sharing my test case for the progress monitor below.

=== My test case ===

[Terminal1]
Run this query on psql:

\a \t
select * from pg_stat_progress_cluster; \watch 0.05

[Terminal2]
Run these queries on psql:

drop table if exists t1;

create table t1 as select a, random() * 1000 as b from generate_series(0, 999999) a;
create index idx_t1 on t1(a);
create index idx_t1_b on t1(b);
analyze t1;

-- index scan
set enable_seqscan to off;
cluster verbose t1 using idx_t1;

-- seq scan
set enable_seqscan to on;
set enable_indexscan to off;
cluster verbose t1 using idx_t1;

-- only the table name given to the cluster command
cluster verbose t1;

-- cluster command with no arguments
cluster verbose;

-- vacuum full on one table
vacuum full t1;

-- vacuum full on all tables
vacuum full;

====================

Regards,
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v10_code.patch (text/x-patch)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c339a2bb77..8a634dd57e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -52,6 +52,7 @@
 #include "catalog/storage.h"
 #include "commands/tablecmds.h"
 #include "commands/event_trigger.h"
+#include "commands/progress.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -59,6 +60,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/parser.h"
+#include "pgstat.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -3851,6 +3853,7 @@ reindex_relation(Oid relid, int flags, int options)
 	List	   *indexIds;
 	bool		is_pg_class;
 	bool		result;
+	int			i;
 
 	/*
 	 * Open and lock the relation.  ShareLock is sufficient since we only need
@@ -3938,6 +3941,7 @@ reindex_relation(Oid relid, int flags, int options)
 
 		/* Reindex all the indexes. */
 		doneIndexes = NIL;
+		i = 1;
 		foreach(indexId, indexIds)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
@@ -3955,6 +3959,11 @@ reindex_relation(Oid relid, int flags, int options)
 
 			if (is_pg_class)
 				doneIndexes = lappend_oid(doneIndexes, indexOid);
+
+			/* Set index rebuild count */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+										 i);
+			i++;
 		}
 	}
 	PG_CATCH();
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index d962648bc5..87c0092787 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -907,6 +907,32 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_scanned,
+        S.param5 AS heap_blks_total,
+        S.param6 AS heap_blks_scanned,
+        S.param7 AS index_rebuild_count
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 3e2a807640..478894c869 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -36,10 +36,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -276,6 +278,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -386,6 +390,18 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if (OidIsValid(indexOid))
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -416,6 +432,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -928,6 +946,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		tableScan = NULL;
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
@@ -935,9 +964,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	}
 	else
 	{
+		/* Set phase */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		heapScan = (HeapScanDesc) tableScan;
 		indexScan = NULL;
+
+		/* Set total heap blocks */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+									 heapScan->rs_nblocks);
 	}
 
 	slot = table_slot_create(OldHeap, NULL);
@@ -994,6 +1031,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				break;
 
 			buf = heapScan->rs_cbuf;
+
+			/* Set heap blocks scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+										 heapScan->rs_cblock);
 		}
 
 		LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -1070,6 +1111,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+									 num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1085,8 +1130,29 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+		const int   cp_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+			PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+			PROGRESS_CLUSTER_HEAP_BLKS_SCANNED
+		};
+		int64       cp_val[4];
+
+		/* Report that we are now sorting tuples */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_SORT_TUPLES;
+		cp_val[1] = num_tuples;
+		cp_val[2] = 0;
+		cp_val[3] = 0;
+		pgstat_progress_update_multi_param(4, cp_index, cp_val);
+
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP;
+		cp_val[1] = num_tuples;
+		pgstat_progress_update_multi_param(2, cp_index, cp_val);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1097,10 +1163,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+			/* Report num_tuples */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1538,6 +1608,20 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	Oid			mapped_tables[4];
 	int			reindex_flags;
 	int			i;
+	const int   cp_index[] = {
+		PROGRESS_CLUSTER_PHASE,
+		PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+		PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+		PROGRESS_CLUSTER_HEAP_BLKS_SCANNED
+	};
+	int64       cp_val[4];
+
+	/* Report that we are now swapping relation files */
+	cp_val[0] = PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES;
+	cp_val[1] = 0;
+	cp_val[2] = 0;
+	cp_val[3] = 0;
+	pgstat_progress_update_multi_param(4, cp_index, cp_val);
 
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
@@ -1573,6 +1657,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1588,6 +1677,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index da1d685c08..a7256dfefa 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36a38..0f637fe4e7 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,26 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND				0
+#define PROGRESS_CLUSTER_PHASE					1
+#define PROGRESS_CLUSTER_INDEX_RELID			2
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
+#define PROGRESS_CLUSTER_TOTAL_HEAP_BLKS		4
+#define PROGRESS_CLUSTER_HEAP_BLKS_SCANNED		5
+#define PROGRESS_CLUSTER_INDEX_REBUILD_COUNT	6
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP	1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP	2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES		3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP	4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES	5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX	6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP	7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ea6cc8b560..c080fa6388 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -950,7 +950,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f104dc4a62..45ac8085ea 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1830,6 +1830,33 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'seq scanning heap'::text
+            WHEN 2 THEN 'index scanning heap'::text
+            WHEN 3 THEN 'sorting tuples'::text
+            WHEN 4 THEN 'writing new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+    s.param3 AS cluster_index_relid,
+    s.param4 AS heap_tuples_scanned,
+    s.param5 AS heap_blks_total,
+    s.param6 AS heap_blks_scanned,
+    s.param7 AS index_rebuild_count
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

#63

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Tatsuro Yamada (#62)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On 2019/03/19 10:43, Tatsuro Yamada wrote:

Hi Rafia!

On 2019/03/18 20:42, Rafia Sabih wrote:

On Fri, 8 Mar 2019 at 09:14, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

The attached file is the patch rebased on current HEAD.
I changed the status. :)

Looks like the patch needs a rebase.
I was on the commit fb5806533f9fe0433290d84c9b019399cd69e9c2

PFA reject file in case you want to have a look.

Thanks for testing it. :)
I rebased the patch on the current head: f2004f19ed9c9228d3ea2b12379ccb4b9212641f.

Please find attached file.

Also, I'm sharing my test case for the progress monitor below.

Attached patch is a rebased document patch. :)

Thanks,
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v10_doc.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index ac2721c8ad..79d98bb601 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -344,6 +344,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> or <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend='cluster-progress-reporting'/>.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
@@ -3394,9 +3402,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the only commands which 
+   support progress reporting are <command>VACUUM</command> and
+   <command>CLUSTER</command>. This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3408,9 +3416,10 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   Backends running <command>VACUUM FULL</command> are instead listed in
+   <structname>pg_stat_progress_cluster</structname>, because both
+   <command>VACUUM FULL</command> and <command>CLUSTER</command> rewrite the table,
+   while regular <command>VACUUM</command> only modifies it in place. See <xref linkend='cluster-progress-reporting'/>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3587,6 +3596,218 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    </tgroup>
   </table>
 
+ </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> or <command>VACUUM FULL</command> is
+   running, the <structname>pg_stat_progress_cluster</structname> view will contain
+   a row for each backend that is currently running either command.
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       The command that is running. Either CLUSTER or VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase. See <xref linkend='cluster-phases'/> or <xref linkend='vacuum-full-phases'/>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       If the table is being scanned using an index, this is the OID of the
+       index being used; otherwise, it is zero.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is <literal>seq scanning heap</literal>,
+       <literal>index scanning heap</literal> or <literal>writing new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_total</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Total number of heap blocks in the table.  This number is reported
+       as of the beginning of <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap blocks scanned. 
+       This counter only advances when the phase is <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>index_rebuild_count</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of rebuilded indexes.
+       This counter only advances when the phase is <literal>rebuilding index</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning heap from the table by
+       seq scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>index scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning heap from the table by
+       index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples. 
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently swapping old heap and new clustered heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="vacuum-full-phases">
+   <title>VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is preparing to begin scanning the heap.  This
+       phase is expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently scanning heap from the table.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently swapping old heap and new vacuumed heap.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is currently rebuilding index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       <command>VACUUM FULL</command> is performing final cleanup.  When this phase is
+       completed, <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
  </sect2>
  </sect1>
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c339a2bb77..8a634dd57e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -52,6 +52,7 @@
 #include "catalog/storage.h"
 #include "commands/tablecmds.h"
 #include "commands/event_trigger.h"
+#include "commands/progress.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -59,6 +60,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/parser.h"
+#include "pgstat.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -3851,6 +3853,7 @@ reindex_relation(Oid relid, int flags, int options)
 	List	   *indexIds;
 	bool		is_pg_class;
 	bool		result;
+	int			i;
 
 	/*
 	 * Open and lock the relation.  ShareLock is sufficient since we only need
@@ -3938,6 +3941,7 @@ reindex_relation(Oid relid, int flags, int options)
 
 		/* Reindex all the indexes. */
 		doneIndexes = NIL;
+		i = 1;
 		foreach(indexId, indexIds)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
@@ -3955,6 +3959,11 @@ reindex_relation(Oid relid, int flags, int options)
 
 			if (is_pg_class)
 				doneIndexes = lappend_oid(doneIndexes, indexOid);
+
+			/* Set index rebuild count */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+										 i);
+			i++;
 		}
 	}
 	PG_CATCH();
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index d962648bc5..87c0092787 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -907,6 +907,32 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_scanned,
+        S.param5 AS heap_blks_total,
+        S.param6 AS heap_blks_scanned,
+        S.param7 AS index_rebuild_count
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 3e2a807640..478894c869 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -36,10 +36,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -276,6 +278,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -386,6 +390,18 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	 */
 	CheckTableNotInUse(OldHeap, OidIsValid(indexOid) ? "CLUSTER" : "VACUUM");
 
+	/* Set command to column */
+	if (OidIsValid(indexOid))
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+	}
+
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
 		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
@@ -416,6 +432,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -928,6 +946,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		tableScan = NULL;
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
@@ -935,9 +964,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	}
 	else
 	{
+		/* Set phase */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		heapScan = (HeapScanDesc) tableScan;
 		indexScan = NULL;
+
+		/* Set total heap blocks */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+									 heapScan->rs_nblocks);
 	}
 
 	slot = table_slot_create(OldHeap, NULL);
@@ -994,6 +1031,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				break;
 
 			buf = heapScan->rs_cbuf;
+
+			/* Set heap blocks scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+										 heapScan->rs_cblock);
 		}
 
 		LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -1070,6 +1111,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+		/* Regardless of index scan or seq scan, update tuples_scanned column */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+									 num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1085,8 +1130,29 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+		const int   cp_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+			PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+			PROGRESS_CLUSTER_HEAP_BLKS_SCANNED
+		};
+		int64       cp_val[4];
+
+		/* Report that we are now sorting tuples */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_SORT_TUPLES;
+		cp_val[1] = num_tuples;
+		cp_val[2] = 0;
+		cp_val[3] = 0;
+		pgstat_progress_update_multi_param(4, cp_index, cp_val);
+
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP;
+		cp_val[1] = num_tuples;
+		pgstat_progress_update_multi_param(2, cp_index, cp_val);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1097,10 +1163,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+			/* Report num_tuples */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1538,6 +1608,20 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	Oid			mapped_tables[4];
 	int			reindex_flags;
 	int			i;
+	const int   cp_index[] = {
+		PROGRESS_CLUSTER_PHASE,
+		PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+		PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+		PROGRESS_CLUSTER_HEAP_BLKS_SCANNED
+	};
+	int64       cp_val[4];
+
+	/* Report that we are now swapping relation files */
+	cp_val[0] = PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES;
+	cp_val[1] = 0;
+	cp_val[2] = 0;
+	cp_val[3] = 0;
+	pgstat_progress_update_multi_param(4, cp_index, cp_val);
 
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
@@ -1573,6 +1657,11 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	 * because the new heap won't contain any HOT chains at all, let alone
 	 * broken ones, so it can't be necessary to set indcheckxmin.
 	 */
+
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_flags = REINDEX_REL_SUPPRESS_INDEX_USE;
 	if (check_constraints)
 		reindex_flags |= REINDEX_REL_CHECK_CONSTRAINTS;
@@ -1588,6 +1677,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index da1d685c08..a7256dfefa 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if(pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36a38..0f637fe4e7 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,26 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND				0
+#define PROGRESS_CLUSTER_PHASE					1
+#define PROGRESS_CLUSTER_INDEX_RELID			2
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
+#define PROGRESS_CLUSTER_TOTAL_HEAP_BLKS		4
+#define PROGRESS_CLUSTER_HEAP_BLKS_SCANNED		5
+#define PROGRESS_CLUSTER_INDEX_REBUILD_COUNT	6
+
+/* Phases of cluster (as dvertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP	1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP	2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES		3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP	4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES	5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX	6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP	7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ea6cc8b560..c080fa6388 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -950,7 +950,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f104dc4a62..45ac8085ea 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1830,6 +1830,33 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'seq scanning heap'::text
+            WHEN 2 THEN 'index scanning heap'::text
+            WHEN 3 THEN 'sorting tuples'::text
+            WHEN 4 THEN 'writing new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+    s.param4 AS cluster_index_relid,
+    s.param5 AS heap_tuples_scanned,
+    s.param6 AS heap_blks_total,
+    s.param7 AS heap_blks_scanned,
+    s.param8 AS index_rebuild_count
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

#64

Kyotaro HORIGUCHI

horiguchi.kyotaro@lab.ntt.co.jp

almost 7 years ago

In reply to: Tatsuro Yamada (#63)

Re: [HACKERS] CLUSTER command progress monitor

At Tue, 19 Mar 2019 11:02:57 +0900, Tatsuro Yamada <yamada.tatsuro@lab.ntt.co.jp> wrote in <dc0dd07c-f185-0cf9-ba54-c5c31f6514f2@lab.ntt.co.jp>

On 2019/03/19 10:43, Tatsuro Yamada wrote:

Hi Rafia!
On 2019/03/18 20:42, Rafia Sabih wrote:

On Fri, 8 Mar 2019 at 09:14, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Attached file is rebased patch on current HEAD.
I changed a status. :)

Looks like the patch needs a rebase.
I was on the commit fb5806533f9fe0433290d84c9b019399cd69e9c2

PFA reject file in case you want to have a look.

Thanks for testing it. :)
I rebased the patch on the current head:
f2004f19ed9c9228d3ea2b12379ccb4b9212641f.
Please find attached file.
Also, I share my test case of progress monitor below.

Attached patch is a rebased document patch. :)

The monitor view has four columns:

heap_tuples_scanned
: used during the "seq scan" and "index scan" phases.

heap_blks_total, heap_blks_scanned
: used only during the "seq scan" phase.

index_rebuild_count
: used only during the "rebuilding index" phase.

Couldn't we change the view like the following?

Only the seq scan phase has two kinds of progress indicator, so if we
choose "heap blks pct" as the only indicator for that phase, it
could be simplified as:

A downside of the view is that it looks quite different from
pg_stat_progress_vacuum. Since I'm not sure it's a good design,
feel free to oppose/reject this.
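
For illustration only, the "heap blks pct" indicator mentioned above could
be derived from the two existing block counters roughly as follows. This is
a minimal sketch and not part of any posted patch: heap_blks_pct() is a
hypothetical helper name, and treating an unknown or zero block total as 0%
is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): percentage of heap blocks
 * scanned, computed from the heap_blks_scanned and heap_blks_total
 * counters that the view already exposes.
 */
static double
heap_blks_pct(int64_t heap_blks_scanned, int64_t heap_blks_total)
{
	/* The counters are never negative. */
	assert(heap_blks_scanned >= 0 && heap_blks_total >= 0);

	/* Assumption: report 0% while the total is not yet known. */
	if (heap_blks_total == 0)
		return 0.0;

	return 100.0 * (double) heap_blks_scanned / (double) heap_blks_total;
}
```

A call such as heap_blks_pct(250, 1000) yields 25.0. The same ratio could
equally be computed in the view definition or by the client.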

finish_heap_swap is also called in the matview code path, but the
function doesn't seem to detect that situation. Is that the right
behavior? (I didn't confirm what happens when it is called from
the matview refresh path, though.)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#65

Robert Haas

robertmhaas@gmail.com

almost 7 years ago

In reply to: Tatsuro Yamada (#63)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Mar 18, 2019 at 10:03 PM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Attached patch is a rebased document patch. :)

Attached is an updated patch. I went through this patch carefully
today, in the hopes of committing it, and I think the attached version
is pretty close to being committable, but there's at least one open
issue remaining, as described below.

- The regression tests did not pass because expected/rules.out was not
properly updated. I fixed that.

- The documentation did not build because some tags were not properly
terminated e.g. <xref linkend='...'> rather than <xref
linkend='...'/>. I also fixed that.

- The documentation had two nearly-identical lists of phases. I
merged them into one. There might be room for some further
fine-tuning here.

- cluster_rel() had multiple places where it could return without
calling pgstat_progress_end_command(). I fixed that.

- cluster_rel() inexplicably delayed updating PROGRESS_CLUSTER_COMMAND
for longer than seems necessary. I fixed that.

- copy_heap_data() zeroed out the heap-tuples-scanned,
heap-blocks-scanned, and total-heap-blocks counters when it began
PROGRESS_CLUSTER_PHASE_SORT_TUPLES and
PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES. This seems like throwing away
useful information for no good reason. I changed it not to do that in
all cases except the one mentioned in the next paragraph.

- It *is* currently necessary to reset PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED
because that counter gets reused to indicate the number of heap tuples
*written back out*, but I think that is bad design for three reasons.
First, the documentation does not explain that sometimes the number of
heap tuples scanned is really reporting the number of heap tuples
written. Second, it's bad for columns to have misleading names.
Third, it would actually be really useful to store these values in
separate columns, because then you could expect that the number of tuples
written would eventually equal the number scanned, and you'd still
have the number that were scanned around so that you could clearly see
how close you were getting to rewriting the entire heap. This is the
one thing I found but did not fix; any chance you could make this
change and update the documentation to match?

- The comment about reporting that we are now reindexing relations was
jammed in between an existing comment and the associated code. I
moved it to a more logical place.

- The new if-statement in pg_stat_get_progress_info was missing a
space required by project style. I added the space.
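
To make the separate-columns suggestion above concrete, here is a small
standalone sketch. It is not code from the attached patch:
progress_update_param() is a stand-in for pgstat_progress_update_param(),
the progress_param array is a stand-in for the backend's progress slots,
and PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN with slot number 7 is a
hypothetical name and value chosen only because that slot is unused in
this version of progress.h.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PROGRESS_PARAM 10	/* mirrors PGSTAT_NUM_PROGRESS_PARAM */

/* Slot 3 matches the patch; slot 7 is a hypothetical new counter. */
#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
#define PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN	7

/* Stand-in for the per-backend progress parameter array. */
static int64_t progress_param[NUM_PROGRESS_PARAM];

/* Stand-in for pgstat_progress_update_param(). */
static void
progress_update_param(int index, int64_t val)
{
	assert(index >= 0 && index < NUM_PROGRESS_PARAM);
	progress_param[index] = val;
}

/*
 * Simulate the seq scan + sort path: count tuples while scanning, then
 * count them again in a separate slot while writing the new heap.  The
 * scanned counter is never reset, so both values stay visible.
 */
static void
simulate_rewrite(int64_t ntuples)
{
	int64_t		i;

	for (i = 1; i <= ntuples; i++)
		progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED, i);

	/* tuplesort_performsort() would run here in the real code */

	for (i = 1; i <= ntuples; i++)
		progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN, i);
}
```

After simulate_rewrite(5), both slots hold 5, so a client watching the
view could compare written against scanned to see how far the rewrite
has progressed.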

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

progress_monitor_for_cluster_command_v11.patchapplication/octet-stream; name=progress_monitor_for_cluster_command_v11.patchDownload

From 011fb31ee82db6f893e140d77920bace81837ac4 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 19 Mar 2019 14:30:21 -0400
Subject: [PATCH] x

---
 doc/src/sgml/monitoring.sgml         | 190 ++++++++++++++++++++++++++-
 src/backend/catalog/index.c          |   9 ++
 src/backend/catalog/system_views.sql |  26 ++++
 src/backend/commands/cluster.c       |  80 +++++++++++
 src/backend/utils/adt/pgstatfuncs.c  |   2 +
 src/include/commands/progress.h      |  22 ++++
 src/include/pgstat.h                 |   3 +-
 src/test/regress/expected/rules.out  |  27 ++++
 8 files changed, 351 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index ac2721c8ad..4479b64efc 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -340,7 +340,15 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       <entry><structname>pg_stat_progress_vacuum</structname><indexterm><primary>pg_stat_progress_vacuum</primary></indexterm></entry>
       <entry>One row for each backend (including autovacuum worker processes) running
        <command>VACUUM</command>, showing current progress.
-       See <xref linkend='vacuum-progress-reporting'/>.
+       See <xref linkend='vacuum-progress-reporting' />.
+      </entry>
+     </row>
+
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> or <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend='cluster-progress-reporting' />.
       </entry>
      </row>
 
@@ -3394,9 +3402,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the only commands
+   which support progress reporting are <command>VACUUM</command> and
+   <command>CLUSTER</command>. This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3408,9 +3416,11 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   Progress for <command>VACUUM FULL</command> commands is reported via
+   <structname>pg_stat_progress_cluster</structname>
+   because both <command>VACUUM FULL</command> and <command>CLUSTER</command> 
+   rewrite the table, while regular <command>VACUUM</command> only modifies it 
+   in place. See <xref linkend='cluster-progress-reporting'/>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3587,6 +3597,172 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    </tgroup>
   </table>
 
+ </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> or <command>VACUUM FULL</command> is
+   running, the <structname>pg_stat_progress_cluster</structname> view will
+   contain a row for each backend that is currently running either command. 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       The command that is running. Either CLUSTER or VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase. See <xref linkend='cluster-phases' />.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       If the table is being scanned using an index, this is the OID of the
+       index being used; otherwise, it is zero.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is
+       <literal>seq scanning heap</literal>,
+       <literal>index scanning heap</literal>
+       or <literal>writing new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_total</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Total number of heap blocks in the table.  This number is reported
+       as of the beginning of <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap blocks scanned.  This counter only advances when the
+       phase is <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>index_rebuild_count</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of indexes rebuilt.  This counter only advances when the phase
+       is <literal>rebuilding index</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER and VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       The command is preparing to begin scanning the heap.  This phase is
+       expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       The command is currently scanning the table using a sequential scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>index scanning heap</literal></entry>
+     <entry>
+       The command is currently scanning the table using an index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples. 
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       The command is currently swapping newly-built files into place.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       The command is currently rebuilding an index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       The command is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command>
+       or <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
  </sect2>
  </sect1>
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index c339a2bb77..8a634dd57e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -52,6 +52,7 @@
 #include "catalog/storage.h"
 #include "commands/tablecmds.h"
 #include "commands/event_trigger.h"
+#include "commands/progress.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -59,6 +60,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/parser.h"
+#include "pgstat.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -3851,6 +3853,7 @@ reindex_relation(Oid relid, int flags, int options)
 	List	   *indexIds;
 	bool		is_pg_class;
 	bool		result;
+	int			i;
 
 	/*
 	 * Open and lock the relation.  ShareLock is sufficient since we only need
@@ -3938,6 +3941,7 @@ reindex_relation(Oid relid, int flags, int options)
 
 		/* Reindex all the indexes. */
 		doneIndexes = NIL;
+		i = 1;
 		foreach(indexId, indexIds)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
@@ -3955,6 +3959,11 @@ reindex_relation(Oid relid, int flags, int options)
 
 			if (is_pg_class)
 				doneIndexes = lappend_oid(doneIndexes, indexOid);
+
+			/* Set index rebuild count */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+										 i);
+			i++;
 		}
 	}
 	PG_CATCH();
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index d962648bc5..87c0092787 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -907,6 +907,32 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_scanned,
+        S.param5 AS heap_blks_total,
+        S.param6 AS heap_blks_scanned,
+        S.param7 AS index_rebuild_count
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 3e2a807640..2e4fbd0663 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -36,10 +36,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -276,6 +278,14 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+	if (OidIsValid(indexOid))
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
+	else
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -286,7 +296,10 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 
 	/* If the table has gone away, we can skip processing it */
 	if (!OldHeap)
+	{
+		pgstat_progress_end_command();
 		return;
+	}
 
 	/*
 	 * Since we may open a new transaction for each relation, we have to check
@@ -305,6 +318,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (!pg_class_ownercheck(tableOid, GetUserId()))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
+			pgstat_progress_end_command();
 			return;
 		}
 
@@ -319,6 +333,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (RELATION_IS_OTHER_TEMP(OldHeap))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
+			pgstat_progress_end_command();
 			return;
 		}
 
@@ -330,6 +345,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(indexOid)))
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 
@@ -340,6 +356,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!HeapTupleIsValid(tuple))	/* probably can't happen */
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 			indexForm = (Form_pg_index) GETSTRUCT(tuple);
@@ -347,6 +364,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			{
 				ReleaseSysCache(tuple);
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 			ReleaseSysCache(tuple);
@@ -401,6 +419,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		!RelationIsPopulated(OldHeap))
 	{
 		relation_close(OldHeap, AccessExclusiveLock);
+		pgstat_progress_end_command();
 		return;
 	}
 
@@ -416,6 +435,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -928,6 +949,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		tableScan = NULL;
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
@@ -935,9 +967,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	}
 	else
 	{
+		/* Set phase */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		heapScan = (HeapScanDesc) tableScan;
 		indexScan = NULL;
+
+		/* Set total heap blocks */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+									 heapScan->rs_nblocks);
 	}
 
 	slot = table_slot_create(OldHeap, NULL);
@@ -994,6 +1034,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				break;
 
 			buf = heapScan->rs_cbuf;
+
+			/* Set heap blocks scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+										 heapScan->rs_cblock);
 		}
 
 		LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -1070,6 +1114,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+		/* Report increase in number of tuples scanned */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+									 num_tuples);
 	}
 
 	if (indexScan != NULL)
@@ -1085,8 +1133,24 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double num_tuples = 0;
+		const int   cp_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+		};
+		int64       cp_val[2];
+
+		/* Report that we are now sorting tuples */
+		cp_val[0] = PROGRESS_CLUSTER_PHASE_SORT_TUPLES;
+		cp_val[1] = num_tuples;
+		pgstat_progress_update_multi_param(2, cp_index, cp_val);
+
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1097,10 +1161,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			num_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+			/* Report num_tuples */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1539,6 +1607,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1586,8 +1658,16 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
 		reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
 
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index da1d685c08..90a817a25c 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36a38..0f637fe4e7 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,26 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND				0
+#define PROGRESS_CLUSTER_PHASE					1
+#define PROGRESS_CLUSTER_INDEX_RELID			2
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
+#define PROGRESS_CLUSTER_TOTAL_HEAP_BLKS		4
+#define PROGRESS_CLUSTER_HEAP_BLKS_SCANNED		5
+#define PROGRESS_CLUSTER_INDEX_REBUILD_COUNT	6
+
+/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP	1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP	2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES		3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP	4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES	5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX	6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP	7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ea6cc8b560..c080fa6388 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -950,7 +950,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f104dc4a62..05cc9db019 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1830,6 +1830,33 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'seq scanning heap'::text
+            WHEN 2 THEN 'index scanning heap'::text
+            WHEN 3 THEN 'sorting tuples'::text
+            WHEN 4 THEN 'writing new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+    s.param3 AS cluster_index_relid,
+    s.param4 AS heap_tuples_scanned,
+    s.param5 AS heap_blks_total,
+    s.param6 AS heap_blks_scanned,
+    s.param7 AS index_rebuild_count
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
-- 
2.17.2 (Apple Git-113)

#66

Robert Haas

robertmhaas@gmail.com

almost 7 years ago

In reply to: Robert Haas (#65)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Mar 19, 2019 at 2:47 PM Robert Haas <robertmhaas@gmail.com> wrote:

> how close you were getting to rewriting the entire heap. This is the
> one thing I found but did not fix; any chance you could make this
> change and update the documentation to match?

Hi, is anybody working on this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#67

Tattsu Yama

yamatattsu@gmail.com

almost 7 years ago

In reply to: Robert Haas (#66)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

Hi Robert!

>On Tue, Mar 19, 2019 at 2:47 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> how close you were getting to rewriting the entire heap. This is the
>> one thing I found but did not fix; any chance you could make this
>> change and update the documentation to match?
>
>Hi, is anybody working on this?

Thank you so much for reviewing the patch, and sorry for the late reply.
I only noticed your email about the patch today, because I had been on
sick leave from work for a while. I have created a new patch based on your
comments as quickly as I could. I hope it is acceptable to you. :)

Please find the attached file.

Changes
- Add new column *heap_tuples_written* to the view.
  This column is updated when the phase is "seq scanning heap",
  "index scanning heap" or "writing new heap".
- Fix the documentation.
- Revise the patch on top of 280a408b48d5ee42969f981bceb9e9426c3a344c.
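
For readers following the thread, the way the view turns the raw progress parameters into readable values can be sketched in plain C. This is only an illustrative, standalone sketch and not part of the patch: the phase strings are taken from the CASE expression in system_views.sql, while the helper names cluster_phase_name and heap_blks_scanned_pct are hypothetical. It also assumes that heap_blks_total stays at zero on the index scan path, since the patch appears to set it only before a sequential scan.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Phase names indexed by phase code, copied from the view definition
 * in the patch.  Code 0 is the implicit "initializing" state.
 */
static const char *const cluster_phase_names[] = {
    "initializing",
    "seq scanning heap",
    "index scanning heap",
    "sorting tuples",
    "writing new heap",
    "swapping relation files",
    "rebuilding index",
    "performing final cleanup",
};

/*
 * Hypothetical helper: map a phase code to its name.  Returns NULL for
 * an unknown code, mirroring the NULL the view's CASE yields.
 */
const char *
cluster_phase_name(int64_t code)
{
    size_t n = sizeof(cluster_phase_names) / sizeof(cluster_phase_names[0]);

    if (code < 0 || (uint64_t) code >= n)
        return NULL;
    return cluster_phase_names[code];
}

/*
 * Hypothetical helper: percentage of heap blocks scanned so far.
 * Returns -1.0 when the total is not known (assumed to be the case
 * for the index scan path, where the total is never reported).
 */
double
heap_blks_scanned_pct(int64_t scanned, int64_t total)
{
    if (total <= 0)
        return -1.0;
    return 100.0 * (double) scanned / (double) total;
}
```

A monitoring tool could apply the same arithmetic to heap_blks_scanned and heap_blks_total read from pg_stat_progress_cluster, and fall back to the tuple counters when the block total is unavailable.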

Regards,

Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v12.patch (application/octet-stream)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index ac2721c..26a6899 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -340,7 +340,15 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       <entry><structname>pg_stat_progress_vacuum</structname><indexterm><primary>pg_stat_progress_vacuum</primary></indexterm></entry>
       <entry>One row for each backend (including autovacuum worker processes) running
        <command>VACUUM</command>, showing current progress.
-       See <xref linkend='vacuum-progress-reporting'/>.
+       See <xref linkend='vacuum-progress-reporting' />.
+      </entry>
+     </row>
+
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> or <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend='cluster-progress-reporting' />.
       </entry>
      </row>
 
@@ -3394,9 +3402,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the only commands
+   which support progress reporting are <command>VACUUM</command> and
+   <command>CLUSTER</command>. This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3408,9 +3416,11 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   Progress for <command>VACUUM FULL</command> commands is reported via
+   <structname>pg_stat_progress_cluster</structname>
+   because both <command>VACUUM FULL</command> and <command>CLUSTER</command> 
+   rewrite the table, while regular <command>VACUUM</command> only modifies it 
+   in place. See <xref linkend='cluster-progress-reporting'/>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3588,6 +3598,183 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   </table>
 
  </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> or <command>VACUUM FULL</command> is
+   running, the <structname>pg_stat_progress_cluster</structname> view will
+   contain a row for each backend that is currently running either command. 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       The command that is running. Either CLUSTER or VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase. See <xref linkend='cluster-phases' />.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       If the table is being scanned using an index, this is the OID of the
+       index being used; otherwise, it is zero.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is
+       <literal>seq scanning heap</literal>,
+       <literal>index scanning heap</literal>
+       or <literal>writing new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_written</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples written.
+       This counter only advances when the phase is
+       <literal>seq scanning heap</literal>,
+       <literal>index scanning heap</literal>
+       or <literal>writing new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_total</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Total number of heap blocks in the table.  This number is reported
+       as of the beginning of <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap blocks scanned.  This counter only advances when the
+       phase is <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>index_rebuild_count</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of indexes rebuilt.  This counter only advances when the phase
+       is <literal>rebuilding index</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER and VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       The command is preparing to begin scanning the heap.  This phase is
+       expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       The command is currently scanning the table using a sequential scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>index scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning the table using an index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples. 
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       The command is currently swapping newly-built files into place.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       The command is currently rebuilding an index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       The command is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command>
+       or <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index cb2c001..d2e284f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -52,6 +52,7 @@
 #include "catalog/storage.h"
 #include "commands/tablecmds.h"
 #include "commands/event_trigger.h"
+#include "commands/progress.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -59,6 +60,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/parser.h"
+#include "pgstat.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -3846,6 +3848,7 @@ reindex_relation(Oid relid, int flags, int options)
 	List	   *indexIds;
 	bool		is_pg_class;
 	bool		result;
+	int			i;
 
 	/*
 	 * Open and lock the relation.  ShareLock is sufficient since we only need
@@ -3933,6 +3936,7 @@ reindex_relation(Oid relid, int flags, int options)
 
 		/* Reindex all the indexes. */
 		doneIndexes = NIL;
+		i = 1;
 		foreach(indexId, indexIds)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
@@ -3950,6 +3954,11 @@ reindex_relation(Oid relid, int flags, int options)
 
 			if (is_pg_class)
 				doneIndexes = lappend_oid(doneIndexes, indexOid);
+
+			/* Set index rebuild count */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+										 i);
+			i++;
 		}
 	}
 	PG_CATCH();
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index d962648..b89df70 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -907,6 +907,33 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_scanned,
+        S.param5 AS heap_tuples_written,
+        S.param6 AS heap_blks_total,
+        S.param7 AS heap_blks_scanned,
+        S.param8 AS index_rebuild_count
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 3e2a807..205070b 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -36,10 +36,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -276,6 +278,14 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+	if (OidIsValid(indexOid))
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
+	else
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -286,7 +296,10 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 
 	/* If the table has gone away, we can skip processing it */
 	if (!OldHeap)
+	{
+		pgstat_progress_end_command();
 		return;
+	}
 
 	/*
 	 * Since we may open a new transaction for each relation, we have to check
@@ -305,6 +318,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (!pg_class_ownercheck(tableOid, GetUserId()))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
+			pgstat_progress_end_command();
 			return;
 		}
 
@@ -319,6 +333,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (RELATION_IS_OTHER_TEMP(OldHeap))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
+			pgstat_progress_end_command();
 			return;
 		}
 
@@ -330,6 +345,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(indexOid)))
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 
@@ -340,6 +356,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!HeapTupleIsValid(tuple))	/* probably can't happen */
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 			indexForm = (Form_pg_index) GETSTRUCT(tuple);
@@ -347,6 +364,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			{
 				ReleaseSysCache(tuple);
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 			ReleaseSysCache(tuple);
@@ -401,6 +419,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		!RelationIsPopulated(OldHeap))
 	{
 		relation_close(OldHeap, AccessExclusiveLock);
+		pgstat_progress_end_command();
 		return;
 	}
 
@@ -416,6 +435,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -928,6 +949,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		tableScan = NULL;
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
@@ -935,9 +967,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	}
 	else
 	{
+		/* In scan-and-sort mode and also VACUUM FULL, set phase */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		heapScan = (HeapScanDesc) tableScan;
 		indexScan = NULL;
+
+		/* Set total heap blocks */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+									 heapScan->rs_nblocks);
 	}
 
 	slot = table_slot_create(OldHeap, NULL);
@@ -994,6 +1034,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				break;
 
 			buf = heapScan->rs_cbuf;
+
+			/* In scan-and-sort mode and also VACUUM FULL, set heap blocks scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+										 heapScan->rs_cblock + 1);
 		}
 
 		LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -1064,12 +1108,31 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 
 		num_tuples += 1;
 		if (tuplesort != NULL)
+		{
 			tuplesort_putheaptuple(tuplesort, tuple);
+
+			/* In scan-and-sort mode, report increase in number of tuples scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
+		}
 		else
+		{
+			const int   ct_index[] = {
+				PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+				PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN
+			};
+			int64       ct_val[2];
+
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+			/* In indexscan mode and also VACUUM FULL, report increase in number of tuples scanned and written */
+			ct_val[0] = num_tuples;
+			ct_val[1] = num_tuples;
+			pgstat_progress_update_multi_param(2, ct_index, ct_val);
+		}
 	}
 
 	if (indexScan != NULL)
@@ -1085,8 +1148,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double n_tuples = 0;
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
+
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1097,10 +1169,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			n_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+			/* Report n_tuples */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN,
+										 n_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1539,6 +1615,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1586,8 +1666,16 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
 		reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
 
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index da1d685..90a817a 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36..04542d9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,27 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND				0
+#define PROGRESS_CLUSTER_PHASE					1
+#define PROGRESS_CLUSTER_INDEX_RELID			2
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
+#define PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN	4
+#define PROGRESS_CLUSTER_TOTAL_HEAP_BLKS		5
+#define PROGRESS_CLUSTER_HEAP_BLKS_SCANNED		6
+#define PROGRESS_CLUSTER_INDEX_REBUILD_COUNT	7
+
+/* Phases of cluster (as dvertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP	1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP	2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES		3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP	4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES	5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX	6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP	7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ea6cc8b..c080fa6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -950,7 +950,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f104dc4..49ca3be 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1830,6 +1830,34 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'seq scanning heap'::text
+            WHEN 2 THEN 'index scanning heap'::text
+            WHEN 3 THEN 'sorting tuples'::text
+            WHEN 4 THEN 'writing new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+    s.param3 AS cluster_index_relid,
+    s.param4 AS heap_tuples_scanned,
+    s.param5 AS heap_tuples_written,
+    s.param6 AS heap_blks_total,
+    s.param7 AS heap_blks_scanned,
+    s.param8 AS index_rebuild_count
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

#68

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Robert Haas (#66)

2 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

Hi Robert!

On 2019/03/23 3:31, Robert Haas wrote:

On Tue, Mar 19, 2019 at 2:47 PM Robert Haas <robertmhaas@gmail.com> wrote:

how close you were getting to rewriting the entire heap. This is the
one thing I found but did not fix; any chance you could make this
change and update the documentation to match?

Hi, is anybody working on this?

I first sent this email from my personal email address (yamatattsu@gmail-.),
so I am re-sending it here with the patch and my test results.

Thank you so much for reviewing the patch, and sorry for the late reply.
I only noticed your email about the patch today because I had been on
sick leave from work for a while. I created a new patch based on your comments as soon as I could.
I hope it is acceptable to you. :)

Changes
- Add a new column *heap_tuples_written* to the view.
This column is updated when the phase is "seq scanning heap",
"index scanning heap" or "writing new heap".
- Fix the documentation.
- Rebase the patch onto the current HEAD: 940311e4bb32a5fe99155052e41179c88b5d48af.

Please find attached files. :)

Regards,
Tatsuro Yamada

Attachments:

progress_monitor_for_cluster_command_v12.patch (text/x-patch)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index ac2721c..26a6899 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -340,7 +340,15 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       <entry><structname>pg_stat_progress_vacuum</structname><indexterm><primary>pg_stat_progress_vacuum</primary></indexterm></entry>
       <entry>One row for each backend (including autovacuum worker processes) running
        <command>VACUUM</command>, showing current progress.
-       See <xref linkend='vacuum-progress-reporting'/>.
+       See <xref linkend='vacuum-progress-reporting' />.
+      </entry>
+     </row>
+
+     <row>
+      <entry><structname>pg_stat_progress_cluster</structname><indexterm><primary>pg_stat_progress_cluster</primary></indexterm></entry>
+      <entry>One row for each backend running
+       <command>CLUSTER</command> or <command>VACUUM FULL</command>, showing current progress.
+       See <xref linkend='cluster-progress-reporting' />.
       </entry>
      </row>
 
@@ -3394,9 +3402,9 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 
   <para>
    <productname>PostgreSQL</productname> has the ability to report the progress of
-   certain commands during command execution.  Currently, the only command
-   which supports progress reporting is <command>VACUUM</command>.  This may be
-   expanded in the future.
+   certain commands during command execution.  Currently, the only commands
+   which support progress reporting are <command>VACUUM</command> and
+   <command>CLUSTER</command>. This may be expanded in the future.
   </para>
 
  <sect2 id="vacuum-progress-reporting">
@@ -3408,9 +3416,11 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
    one row for each backend (including autovacuum worker processes) that is
    currently vacuuming.  The tables below describe the information
    that will be reported and provide information about how to interpret it.
-   Progress reporting is not currently supported for <command>VACUUM FULL</command>
-   and backends running <command>VACUUM FULL</command> will not be listed in this
-   view.
+   Progress for <command>VACUUM FULL</command> commands is reported via
+   <structname>pg_stat_progress_cluster</structname>
+   because both <command>VACUUM FULL</command> and <command>CLUSTER</command> 
+   rewrite the table, while regular <command>VACUUM</command> only modifies it 
+   in place. See <xref linkend='cluster-progress-reporting'/>.
   </para>
 
   <table id="pg-stat-progress-vacuum-view" xreflabel="pg_stat_progress_vacuum">
@@ -3588,6 +3598,183 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   </table>
 
  </sect2>
+
+ <sect2 id="cluster-progress-reporting">
+  <title>CLUSTER Progress Reporting</title>
+
+  <para>
+   Whenever <command>CLUSTER</command> or <command>VACUUM FULL</command> is
+   running, the <structname>pg_stat_progress_cluster</structname> view will
+   contain a row for each backend that is currently running either command. 
+   The tables below describe the information that will be reported and
+   provide information about how to interpret it.
+  </para>
+
+  <table id="pg-stat-progress-cluster-view" xreflabel="pg_stat_progress_cluster">
+   <title><structname>pg_stat_progress_cluster</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</structfield></entry>
+     <entry><type>integer</type></entry>
+     <entry>Process ID of backend.</entry>
+    </row>
+    <row>
+     <entry><structfield>datid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>datname</structfield></entry>
+     <entry><type>name</type></entry>
+     <entry>Name of the database to which this backend is connected.</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</structfield></entry>
+     <entry><type>oid</type></entry>
+     <entry>OID of the table being clustered.</entry>
+    </row>
+    <row>
+     <entry><structfield>command</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       The command that is running. Either CLUSTER or VACUUM FULL.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>phase</structfield></entry>
+     <entry><type>text</type></entry>
+     <entry>
+       Current processing phase. See <xref linkend='cluster-phases' />.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>cluster_index_relid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       If the table is being scanned using an index, this is the OID of the
+       index being used; otherwise, it is zero.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples scanned.
+       This counter only advances when the phase is
+       <literal>seq scanning heap</literal>,
+       <literal>index scanning heap</literal>
+       or <literal>writing new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_tuples_written</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap tuples written.
+       This counter only advances when the phase is
+       <literal>seq scanning heap</literal>,
+       <literal>index scanning heap</literal>
+       or <literal>writing new heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_total</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Total number of heap blocks in the table.  This number is reported
+       as of the beginning of <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>heap_blks_scanned</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of heap blocks scanned.  This counter only advances when the
+       phase is <literal>seq scanning heap</literal>.
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>index_rebuild_count</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>
+       Number of indexes rebuilt.  This counter only advances when the phase
+       is <literal>rebuilding index</literal>.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <table id="cluster-phases">
+   <title>CLUSTER and VACUUM FULL phases</title>
+   <tgroup cols="2">
+    <thead>
+    <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><literal>initializing</literal></entry>
+     <entry>
+       The command is preparing to begin scanning the heap.  This phase is
+       expected to be very brief.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>seq scanning heap</literal></entry>
+     <entry>
+       The command is currently scanning the table using a sequential scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>index scanning heap</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently scanning the table using an index scan.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>sorting tuples</literal></entry>
+     <entry>
+       <command>CLUSTER</command> is currently sorting tuples. 
+     </entry>
+    </row>
+    <row>
+     <entry><literal>swapping relation files</literal></entry>
+     <entry>
+       The command is currently swapping newly-built files into place.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>rebuilding index</literal></entry>
+     <entry>
+       The command is currently rebuilding an index.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>performing final cleanup</literal></entry>
+     <entry>
+       The command is performing final cleanup.  When this phase is 
+       completed, <command>CLUSTER</command>
+       or <command>VACUUM FULL</command> will end.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index cb2c001..d2e284f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -52,6 +52,7 @@
 #include "catalog/storage.h"
 #include "commands/tablecmds.h"
 #include "commands/event_trigger.h"
+#include "commands/progress.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "miscadmin.h"
@@ -59,6 +60,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/parser.h"
+#include "pgstat.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
@@ -3846,6 +3848,7 @@ reindex_relation(Oid relid, int flags, int options)
 	List	   *indexIds;
 	bool		is_pg_class;
 	bool		result;
+	int			i;
 
 	/*
 	 * Open and lock the relation.  ShareLock is sufficient since we only need
@@ -3933,6 +3936,7 @@ reindex_relation(Oid relid, int flags, int options)
 
 		/* Reindex all the indexes. */
 		doneIndexes = NIL;
+		i = 1;
 		foreach(indexId, indexIds)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
@@ -3950,6 +3954,11 @@ reindex_relation(Oid relid, int flags, int options)
 
 			if (is_pg_class)
 				doneIndexes = lappend_oid(doneIndexes, indexOid);
+
+			/* Set index rebuild count */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+										 i);
+			i++;
 		}
 	}
 	PG_CATCH();
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index d962648..b89df70 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -907,6 +907,33 @@ CREATE VIEW pg_stat_progress_vacuum AS
     FROM pg_stat_get_progress_info('VACUUM') AS S
 		LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_cluster AS
+    SELECT
+        S.pid AS pid,
+        S.datid AS datid,
+        D.datname AS datname,
+        S.relid AS relid,
+        CASE S.param1 WHEN 1 THEN 'CLUSTER'
+                      WHEN 2 THEN 'VACUUM FULL'
+                      END AS command,
+        CASE S.param2 WHEN 0 THEN 'initializing'
+                      WHEN 1 THEN 'seq scanning heap'
+                      WHEN 2 THEN 'index scanning heap'
+                      WHEN 3 THEN 'sorting tuples'
+                      WHEN 4 THEN 'writing new heap'
+                      WHEN 5 THEN 'swapping relation files'
+                      WHEN 6 THEN 'rebuilding index'
+                      WHEN 7 THEN 'performing final cleanup'
+                      END AS phase,
+        S.param3 AS cluster_index_relid,
+        S.param4 AS heap_tuples_scanned,
+        S.param5 AS heap_tuples_written,
+        S.param6 AS heap_blks_total,
+        S.param7 AS heap_blks_scanned,
+        S.param8 AS index_rebuild_count
+    FROM pg_stat_get_progress_info('CLUSTER') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 3e2a807..205070b 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -36,10 +36,12 @@
 #include "catalog/objectaccess.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
+#include "commands/progress.h"
 #include "commands/tablecmds.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
+#include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
@@ -276,6 +278,14 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
 
+	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
+	if (OidIsValid(indexOid))
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
+	else
+		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
+
 	/*
 	 * We grab exclusive access to the target rel and index for the duration
 	 * of the transaction.  (This is redundant for the single-transaction
@@ -286,7 +296,10 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 
 	/* If the table has gone away, we can skip processing it */
 	if (!OldHeap)
+	{
+		pgstat_progress_end_command();
 		return;
+	}
 
 	/*
 	 * Since we may open a new transaction for each relation, we have to check
@@ -305,6 +318,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (!pg_class_ownercheck(tableOid, GetUserId()))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
+			pgstat_progress_end_command();
 			return;
 		}
 
@@ -319,6 +333,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (RELATION_IS_OTHER_TEMP(OldHeap))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
+			pgstat_progress_end_command();
 			return;
 		}
 
@@ -330,6 +345,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(indexOid)))
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 
@@ -340,6 +356,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!HeapTupleIsValid(tuple))	/* probably can't happen */
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 			indexForm = (Form_pg_index) GETSTRUCT(tuple);
@@ -347,6 +364,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			{
 				ReleaseSysCache(tuple);
 				relation_close(OldHeap, AccessExclusiveLock);
+				pgstat_progress_end_command();
 				return;
 			}
 			ReleaseSysCache(tuple);
@@ -401,6 +419,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		!RelationIsPopulated(OldHeap))
 	{
 		relation_close(OldHeap, AccessExclusiveLock);
+		pgstat_progress_end_command();
 		return;
 	}
 
@@ -416,6 +435,8 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	rebuild_relation(OldHeap, indexOid, verbose);
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
+
+	pgstat_progress_end_command();
 }
 
 /*
@@ -928,6 +949,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (OldIndex != NULL && !use_sort)
 	{
+		const int   ci_index[] = {
+			PROGRESS_CLUSTER_PHASE,
+			PROGRESS_CLUSTER_INDEX_RELID
+		};
+		int64       ci_val[2];
+
+		/* Set phase and OIDOldIndex to columns */
+		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
+		ci_val[1] = OIDOldIndex;
+		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+
 		tableScan = NULL;
 		heapScan = NULL;
 		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
@@ -935,9 +967,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	}
 	else
 	{
+		/* In scan-and-sort mode and also VACUUM FULL, set phase */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
+
 		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
 		heapScan = (HeapScanDesc) tableScan;
 		indexScan = NULL;
+
+		/* Set total heap blocks */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+									 heapScan->rs_nblocks);
 	}
 
 	slot = table_slot_create(OldHeap, NULL);
@@ -994,6 +1034,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 				break;
 
 			buf = heapScan->rs_cbuf;
+
+			/* In scan-and-sort mode and also VACUUM FULL, set heap blocks scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+										 heapScan->rs_cblock + 1);
 		}
 
 		LockBuffer(buf, BUFFER_LOCK_SHARE);
@@ -1064,12 +1108,31 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 
 		num_tuples += 1;
 		if (tuplesort != NULL)
+		{
 			tuplesort_putheaptuple(tuplesort, tuple);
+
+			/* In scan-and-sort mode, report increase in number of tuples scanned */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+										 num_tuples);
+		}
 		else
+		{
+			const int   ct_index[] = {
+				PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+				PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN
+			};
+			int64       ct_val[2];
+
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+
+			/* In indexscan mode and also VACUUM FULL, report increase in number of tuples scanned and written */
+			ct_val[0] = num_tuples;
+			ct_val[1] = num_tuples;
+			pgstat_progress_update_multi_param(2, ct_index, ct_val);
+		}
 	}
 
 	if (indexScan != NULL)
@@ -1085,8 +1148,17 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 */
 	if (tuplesort != NULL)
 	{
+		double n_tuples = 0;
+		/* Report that we are now sorting tuples */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
+
 		tuplesort_performsort(tuplesort);
 
+		/* Report that we are now writing new heap */
+		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+									 PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
+
 		for (;;)
 		{
 			HeapTuple	tuple;
@@ -1097,10 +1169,14 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 			if (tuple == NULL)
 				break;
 
+			n_tuples += 1;
 			reform_and_rewrite_tuple(tuple,
 									 oldTupDesc, newTupDesc,
 									 values, isnull,
 									 rwstate);
+			/* Report n_tuples */
+			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN,
+										 n_tuples);
 		}
 
 		tuplesort_end(tuplesort);
@@ -1539,6 +1615,10 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			reindex_flags;
 	int			i;
 
+	/* Report that we are now swapping relation files */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
+
 	/* Zero out possible results from swapped_relation_files */
 	memset(mapped_tables, 0, sizeof(mapped_tables));
 
@@ -1586,8 +1666,16 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	else if (newrelpersistence == RELPERSISTENCE_PERMANENT)
 		reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
 
+	/* Report that we are now reindexing relations */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
+
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
+	/* Report that we are now doing clean up */
+	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
+
 	/*
 	 * If the relation being rebuild is pg_class, swap_relation_files()
 	 * couldn't update pg_class's own pg_class entry (check comments in
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index da1d685..90a817a 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -468,6 +468,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 	/* Translate command name into command type code. */
 	if (pg_strcasecmp(cmd, "VACUUM") == 0)
 		cmdtype = PROGRESS_COMMAND_VACUUM;
+	else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
+		cmdtype = PROGRESS_COMMAND_CLUSTER;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 9858b36..04542d9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -34,4 +34,27 @@
 #define PROGRESS_VACUUM_PHASE_TRUNCATE			5
 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP		6
 
+/* Progress parameters for cluster */
+#define PROGRESS_CLUSTER_COMMAND				0
+#define PROGRESS_CLUSTER_PHASE					1
+#define PROGRESS_CLUSTER_INDEX_RELID			2
+#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED	3
+#define PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN	4
+#define PROGRESS_CLUSTER_TOTAL_HEAP_BLKS		5
+#define PROGRESS_CLUSTER_HEAP_BLKS_SCANNED		6
+#define PROGRESS_CLUSTER_INDEX_REBUILD_COUNT	7
+
+/* Phases of cluster (as dvertised via PROGRESS_CLUSTER_PHASE) */
+#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP	1
+#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP	2
+#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES		3
+#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP	4
+#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES	5
+#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX	6
+#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP	7
+
+/* Commands of PROGRESS_CLUSTER */
+#define PROGRESS_CLUSTER_COMMAND_CLUSTER		1
+#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL	2
+
 #endif
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ea6cc8b..c080fa6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -950,7 +950,8 @@ typedef enum
 typedef enum ProgressCommandType
 {
 	PROGRESS_COMMAND_INVALID,
-	PROGRESS_COMMAND_VACUUM
+	PROGRESS_COMMAND_VACUUM,
+	PROGRESS_COMMAND_CLUSTER
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f104dc4..49ca3be 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1830,6 +1830,34 @@ pg_stat_database_conflicts| SELECT d.oid AS datid,
     pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin,
     pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock
    FROM pg_database d;
+pg_stat_progress_cluster| SELECT s.pid,
+    s.datid,
+    d.datname,
+    s.relid,
+        CASE s.param1
+            WHEN 1 THEN 'CLUSTER'::text
+            WHEN 2 THEN 'VACUUM FULL'::text
+            ELSE NULL::text
+        END AS command,
+        CASE s.param2
+            WHEN 0 THEN 'initializing'::text
+            WHEN 1 THEN 'seq scanning heap'::text
+            WHEN 2 THEN 'index scanning heap'::text
+            WHEN 3 THEN 'sorting tuples'::text
+            WHEN 4 THEN 'writing new heap'::text
+            WHEN 5 THEN 'swapping relation files'::text
+            WHEN 6 THEN 'rebuilding index'::text
+            WHEN 7 THEN 'performing final cleanup'::text
+            ELSE NULL::text
+        END AS phase,
+    s.param3 AS cluster_index_relid,
+    s.param4 AS heap_tuples_scanned,
+    s.param5 AS heap_tuples_written,
+    s.param6 AS heap_blks_total,
+    s.param7 AS heap_blks_scanned,
+    s.param8 AS index_rebuild_count
+   FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,

test_result.txt (text/plain; charset=UTF-8)

#69

Robert Haas

robertmhaas@gmail.com

almost 7 years ago

In reply to: Tatsuro Yamada (#68)

Re: [HACKERS] CLUSTER command progress monitor

On Sun, Mar 24, 2019 at 9:02 PM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Please find attached files. :)

Committed. Thanks!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#70

Tatsuro Yamada

yamada.tatsuro@lab.ntt.co.jp

almost 7 years ago

In reply to: Robert Haas (#69)

Re: [HACKERS] CLUSTER command progress monitor

Hi Robert and Reviewers!

On 2019/03/26 0:02, Robert Haas wrote:

On Sun, Mar 24, 2019 at 9:02 PM Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:

Please find attached files. :)

Committed. Thanks!

Thank you!
Hope this feature will help DBA and user. :)

Regards,
Tatsuro Yamada
NTT Open Source Software Center

#71

Michael Paquier

michael@paquier.xyz

almost 7 years ago

In reply to: Tatsuro Yamada (#70)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Mar 26, 2019 at 10:04:48AM +0900, Tatsuro Yamada wrote:

Hope this feature will help DBA and user. :)

Congrats, Yamada-san.
--
Michael

#72

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Tatsuro Yamada (#68)

Re: [HACKERS] CLUSTER command progress monitor

Hmm, I'm trying this out now and I don't see the index_rebuild_count
ever go up. I think it's because the indexes are built using parallel
index build ... or maybe it was the table AM changes that moved things
around, not sure. There's a period at the end when the CLUSTER command
keeps working, but it's gone from pg_stat_progress_cluster.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#73

Tatsuro Yamada

tatsuro.yamada.tf@nttcom.co.jp

over 6 years ago

In reply to: Alvaro Herrera (#72)

Re: [HACKERS] CLUSTER command progress monitor

Hi Alvaro!

On 2019/08/02 3:43, Alvaro Herrera wrote:

Hmm, I'm trying this out now and I don't see the index_rebuild_count
ever go up. I think it's because the indexes are built using parallel
index build ... or maybe it was the table AM changes that moved things
around, not sure. There's a period at the end when the CLUSTER command
keeps working, but it's gone from pg_stat_progress_cluster.

Thanks for your report.
I'll investigate it. :)

Thanks,
Tatsuro Yamada

#74

Tatsuro Yamada

tatsuro.yamada.tf@nttcom.co.jp

over 6 years ago

In reply to: Tatsuro Yamada (#73)

Re: [HACKERS] CLUSTER command progress monitor

Hi Alvaro and All,

On 2019/08/13 14:40, Tatsuro Yamada wrote:

Hi Alvaro!

On 2019/08/02 3:43, Alvaro Herrera wrote:

Hmm, I'm trying this out now and I don't see the index_rebuild_count
ever go up. I think it's because the indexes are built using parallel
index build ... or maybe it was the table AM changes that moved things
around, not sure. There's a period at the end when the CLUSTER command
keeps working, but it's gone from pg_stat_progress_cluster.

Thanks for your report.
I'll investigate it. :)

I did "git bisect" and found the commit:

03f9e5cba0ee1633af4abe734504df50af46fbd8
Report progress of REINDEX operations

In src/backend/catalog/index.c, CLUSTER progress reporting increments
index_rebuild_count in reindex_relation() via
pgstat_progress_update_param().
However, reindex_relation() calls reindex_index(), and REINDEX progress
reporting now lives in the latter function: it calls
pgstat_progress_start_command() and pgstat_progress_end_command() for
REINDEX progress reporting.
Therefore, I think CLUSTER progress reporting fails to update
index_rebuild_count because the update mistakenly goes to the REINDEX
view instead.
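
The interaction described here can be modeled outside PostgreSQL. The following is a minimal sketch, assuming a single per-backend progress slot as in the pre-patch pgstat.c; every name in it (DemoSlot, slot_start, and so on) is hypothetical, and the parameter index 7 is arbitrary rather than taken from progress.h. It only illustrates why the CLUSTER row disappears once the nested start/end pair has run.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for ProgressCommandType and the backend slot. */
typedef enum { CMD_INVALID = 0, CMD_CLUSTER, CMD_CREATE_INDEX } DemoCommand;

#define DEMO_NUM_PARAMS 20
#define DEMO_INDEX_REBUILD_COUNT 7	/* arbitrary index for this sketch */

typedef struct
{
	DemoCommand command;	/* which progress view the slot feeds */
	int64_t		params[DEMO_NUM_PARAMS];
} DemoSlot;

/* Starting a command overwrites the single slot and zeroes the counters. */
void
slot_start(DemoSlot *s, DemoCommand cmd)
{
	s->command = cmd;
	memset(s->params, 0, sizeof(s->params));
}

/* Updates carry no command type; they go to whatever owns the slot. */
void
slot_update(DemoSlot *s, int index, int64_t val)
{
	assert(index >= 0 && index < DEMO_NUM_PARAMS);
	s->params[index] = val;
}

/* Ending a command marks the slot invalid, whoever started it. */
void
slot_end(DemoSlot *s)
{
	s->command = CMD_INVALID;
}

/* A view for cmd only lists backends whose slot currently holds cmd. */
int
slot_visible_in_view(const DemoSlot *s, DemoCommand cmd)
{
	return s->command == cmd;
}

/* Replay the CLUSTER sequence; returns 1 if CLUSTER is still visible. */
int
replay_cluster(DemoSlot *s)
{
	slot_start(s, CMD_CLUSTER);			/* cluster_rel() */
	slot_start(s, CMD_CREATE_INDEX);	/* reindex_index() starts */
	slot_end(s);						/* reindex_index() ends */
	slot_update(s, DEMO_INDEX_REBUILD_COUNT, 1);	/* reindex_relation() */
	return slot_visible_in_view(s, CMD_CLUSTER);
}
```

In this model the counter is written, but the slot no longer belongs to CLUSTER, so a view filtered on the CLUSTER command type would show nothing for the rest of the run.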

My idea to fix that is the following:

- Add a target view name parameter to Progress monitor's API

For example:
<Before>
pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT, i).

<After>
pgstat_progress_update_param(*PROGRESS_CLUSTER_VIEW*, PROGRESS_CLUSTER_INDEX_REBUILD_COUNT, i).

However, I'm not sure whether that is feasible because I haven't read
the code of the API yet.
What do you think about that? :)

Thanks,
Tatsuro Yamada

#75

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Tatsuro Yamada (#74)

Re: [HACKERS] CLUSTER command progress monitor

On Wed, Aug 14, 2019 at 11:38:01AM +0900, Tatsuro Yamada wrote:

On 2019/08/13 14:40, Tatsuro Yamada wrote:

On 2019/08/02 3:43, Alvaro Herrera wrote:

Hmm, I'm trying this out now and I don't see the index_rebuild_count
ever go up. I think it's because the indexes are built using parallel
index build ... or maybe it was the table AM changes that moved things
around, not sure. There's a period at the end when the CLUSTER command
keeps working, but it's gone from pg_stat_progress_cluster.

Thanks for your report.
I'll investigate it. :)

I did "git bisect" and found the commit:

03f9e5cba0ee1633af4abe734504df50af46fbd8
Report progress of REINDEX operations

I am adding an open item for this one.
--
Michael

#76

Tatsuro Yamada

tatsuro.yamada.tf@nttcom.co.jp

over 6 years ago

In reply to: Michael Paquier (#75)

Re: [HACKERS] CLUSTER command progress monitor

Hi Michael, Alvaro and Robert!

On 2019/08/14 11:52, Michael Paquier wrote:

On Wed, Aug 14, 2019 at 11:38:01AM +0900, Tatsuro Yamada wrote:

On 2019/08/13 14:40, Tatsuro Yamada wrote:

On 2019/08/02 3:43, Alvaro Herrera wrote:

Hmm, I'm trying this out now and I don't see the index_rebuild_count
ever go up. I think it's because the indexes are built using parallel
index build ... or maybe it was the table AM changes that moved things
around, not sure. There's a period at the end when the CLUSTER command
keeps working, but it's gone from pg_stat_progress_cluster.

Thanks for your report.
I'll investigate it. :)

I did "git bisect" and found the commit:

03f9e5cba0ee1633af4abe734504df50af46fbd8
Report progress of REINDEX operations

I am adding an open item for this one.
--
Michael

Okay, I checked it on the wiki.

https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items
- index_rebuild_count in CLUSTER reporting never increments

To be clear, 03f9e5cb broke CLUSTER progress reporting, but
I investigated a little more and would like to share my ideas to fix
the problem.

* Call stack
========================================
cluster_rel
  pgstat_progress_start_command(CLUSTER) *A1
  rebuild_relation
    finish_heap_swap
      reindex_relation
        reindex_index
          pgstat_progress_start_command(CREATE_INDEX) *B1
          pgstat_progress_end_command() *B2
        pgstat_progress_update_param(INDEX_REBUILD_COUNT, i) <- failed :(
  pgstat_progress_end_command() *A2

Note
  These are sets:
    A1 and A2,
    B1 and B2
========================================

* Ideas to fix
There are three options, I guess.
========================================
1. Call pgstat_progress_start_command(CLUSTER) again
before pgstat_progress_update_param(INDEX_REBUILD_COUNT, i).

2. Add "save and restore" functions for the following two
variables of MyBeentry in pgstat.c.
- st_progress_command
- st_progress_command_target

3. Use Hash or List to store multiple values for the two
variables in pgstat.c.
========================================

I tried 1. and it showed index_rebuild_count, but it also showed the
"initializing" phase again and the other columns were empty. So, it is
not a suitable fix for the problem. :(
I'm going to try 2. and 3., but, unfortunately, I can't get enough
time to do that until after PGConf.Asia 2019.
If we select 3., it affects the following progress reporting
features: VACUUM, CLUSTER, CREATE_INDEX and ANALYZE. But that's okay,
I suppose. Any comments welcome! :)
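
Option 2 could look roughly like the sketch below. This is only an illustration under assumptions: progress_save() and progress_restore() are hypothetical names, not existing pgstat.c functions, and the sketch saves the parameter array as well as the two variables listed above, because starting a nested command zeroes the counters. The parameter index 7 is arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef enum { CMD_INVALID = 0, CMD_CLUSTER, CMD_CREATE_INDEX } DemoCommand;

#define DEMO_NUM_PARAMS 20
#define DEMO_INDEX_REBUILD_COUNT 7	/* arbitrary index for this sketch */

/* Hypothetical model of the single per-backend progress slot. */
typedef struct
{
	DemoCommand command;
	uint32_t	target;		/* stands in for the relation OID */
	int64_t		params[DEMO_NUM_PARAMS];
} DemoSlot;

void
slot_start(DemoSlot *s, DemoCommand cmd, uint32_t target)
{
	s->command = cmd;
	s->target = target;
	memset(s->params, 0, sizeof(s->params));
}

void
slot_end(DemoSlot *s)
{
	s->command = CMD_INVALID;
	s->target = 0;
}

/* Hypothetical "save": copy the whole slot before a nested command. */
DemoSlot
progress_save(const DemoSlot *s)
{
	return *s;
}

/* Hypothetical "restore": put the outer command's state back. */
void
progress_restore(DemoSlot *s, const DemoSlot *saved)
{
	*s = *saved;
}

/* Models reindex_index() wrapped in save/restore by its caller. */
void
nested_reindex(DemoSlot *s, uint32_t index_oid)
{
	DemoSlot	saved = progress_save(s);

	slot_start(s, CMD_CREATE_INDEX, index_oid);
	/* ... the index build would report its own progress here ... */
	slot_end(s);

	progress_restore(s, &saved);
	assert(s->command == saved.command);
}
```

With this shape the outer CLUSTER report survives the nested call, at the cost of hiding the inner report once it finishes.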

Thanks,
Tatsuro Yamada

#77

Masahiko Sawada

sawada.mshk@gmail.com

over 6 years ago

In reply to: Tatsuro Yamada (#76)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On Thu, Aug 15, 2019 at 12:48 PM Tatsuro Yamada
<tatsuro.yamada.tf@nttcom.co.jp> wrote:

Hi Michael, Alvaro and Robert!

On 2019/08/14 11:52, Michael Paquier wrote:

On Wed, Aug 14, 2019 at 11:38:01AM +0900, Tatsuro Yamada wrote:

On 2019/08/13 14:40, Tatsuro Yamada wrote:

On 2019/08/02 3:43, Alvaro Herrera wrote:

Hmm, I'm trying this out now and I don't see the index_rebuild_count
ever go up. I think it's because the indexes are built using parallel
index build ... or maybe it was the table AM changes that moved things
around, not sure. There's a period at the end when the CLUSTER command
keeps working, but it's gone from pg_stat_progress_cluster.

Thanks for your report.
I'll investigate it. :)

I did "git bisect" and found the commit:

03f9e5cba0ee1633af4abe734504df50af46fbd8
Report progress of REINDEX operations

I am adding an open item for this one.
--
Michael

Okay, I checked it on the wiki.

https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items
- index_rebuild_count in CLUSTER reporting never increments

To be clear, 03f9e5cb broke CLUSTER progress reporting, but
I investigated little more and share my ideas to fix the problem.

* Call stack
========================================
cluster_rel
  pgstat_progress_start_command(CLUSTER) *A1
  rebuild_relation
    finish_heap_swap
      reindex_relation
        reindex_index
          pgstat_progress_start_command(CREATE_INDEX) *B1
          pgstat_progress_end_command() *B2
        pgstat_progress_update_param(INDEX_REBUILD_COUNT, i) <- failed :(
  pgstat_progress_end_command() *A2

Note
  These are sets:
    A1 and A2,
    B1 and B2
========================================

* Ideas to fix
There are Three options, I guess.
========================================
1. Call pgstat_progress_start_command(CLUSTER) again
before pgstat_progress_update_param(INDEX_REBUILD_COUNT, i).

2. Add "save and restore" functions for the following two
variables of MyBeentry in pgstat.c.
- st_progress_command
- st_progress_command_target

3. Use Hash or List to store multiple values for the two
variables in pgstat.c.
========================================

I tried 1. and it shown index_rebuild_count, but it also shown
"initializing" phase again and other columns were empty. So, it is
not suitable to fix the problem. :(
I'm going to try 2. and 3., but, unfortunately, I can't get enough
time to do that after PGConf.Asia 2019.
If we selected 3., it affects following these progress reporting
features: VACUUM, CLUSTER, CREATE_INDEX and ANALYZE. But it's okay,
I suppose. Any comments welcome! :)

I looked at this open item. I prefer #3 but I think it would be enough
to have a small stack using a fixed-length array to track nested
progress information and the currently executing command (maybe 4 or 8
would be enough for the maximum nesting level for now?). That way, we
don't need to change any of the interfaces. AFAICS there is no case
where we call only one of pgstat_progress_start_command and
pgstat_progress_end_command without the other (although
pgstat_progress_update_param is called without start). So I think that
having a small stack for tracking multiple reports would work.
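
The stack idea can be tried in isolation. What follows is a standalone sketch, not the attached patch: the names (DemoStack, stack_start, and so on) are hypothetical, the depth of 4 follows the "maybe 4 or 8" guess, and the parameter index 7 is arbitrary. One deliberate difference is that every function checks for an empty stack before indexing into the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { CMD_INVALID = 0, CMD_CLUSTER, CMD_CREATE_INDEX } DemoCommand;

#define DEMO_NUM_PARAMS 20
#define DEMO_MAX_NESTING 4			/* "maybe 4 or 8" */
#define DEMO_INDEX_REBUILD_COUNT 7	/* arbitrary index for this sketch */

typedef struct
{
	DemoCommand command;
	int64_t		params[DEMO_NUM_PARAMS];
} DemoProgress;

typedef struct
{
	int			current;	/* top of stack, -1 when empty */
	DemoProgress cmds[DEMO_MAX_NESTING];
} DemoStack;

void
stack_init(DemoStack *st)
{
	memset(st, 0, sizeof(*st));
	st->current = -1;
}

/* Push a new entry; returns 0 on success, -1 if the stack is full. */
int
stack_start(DemoStack *st, DemoCommand cmd)
{
	if (st->current >= DEMO_MAX_NESTING - 1)
		return -1;
	st->current++;
	memset(&st->cmds[st->current], 0, sizeof(DemoProgress));
	st->cmds[st->current].command = cmd;
	return 0;
}

/* Updates go to the innermost running command; ignored when none runs. */
void
stack_update(DemoStack *st, int index, int64_t val)
{
	assert(index >= 0 && index < DEMO_NUM_PARAMS);
	if (st->current < 0)
		return;
	st->cmds[st->current].params[index] = val;
}

/* Pop the innermost command, leaving the outer entries untouched. */
void
stack_end(DemoStack *st)
{
	if (st->current < 0)
		return;
	st->cmds[st->current].command = CMD_INVALID;
	st->current--;
}

/* Return the entry for cmd if it is anywhere on the stack, else NULL. */
const DemoProgress *
stack_find(const DemoStack *st, DemoCommand cmd)
{
	for (int i = 0; i <= st->current; i++)
	{
		if (st->cmds[i].command == cmd)
			return &st->cmds[i];
	}
	return NULL;
}
```

Replaying the CLUSTER call sequence against this model (start CLUSTER, start and end CREATE_INDEX, then update the rebuild count) leaves the CLUSTER entry on the stack with the counter set.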

Attached is a draft patch that fixes this issue. Please review it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

track_nested_command_progress.patch (text/x-patch; charset=US-ASCII)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index d362e7f7d7..99e4844e6c 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3016,8 +3016,10 @@ pgstat_bestart(void)
 #endif
 
 	lbeentry.st_state = STATE_UNDEFINED;
-	lbeentry.st_progress_command = PROGRESS_COMMAND_INVALID;
-	lbeentry.st_progress_command_target = InvalidOid;
+	MemSet(&(lbeentry.st_progress_cmds), 0,
+		   sizeof(PgBackendProgressInfo) * PGSTAT_MAX_PROGRESS_INFO);
+	/* Set invalid command index */
+	lbeentry.st_current_cmd = -1;
 
 	/*
 	 * we don't zero st_progress_param here to save cycles; nobody should
@@ -3203,10 +3205,22 @@ pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid)
 	if (!beentry || !pgstat_track_activities)
 		return;
 
+	Assert(beentry->st_current_cmd >= -1);
+
+	/* The given command is already started */
+	if (beentry->st_progress_cmds[beentry->st_current_cmd].command == cmdtype)
+		return;
+
+	/* The progress information queue is full */
+	if (beentry->st_current_cmd >= PGSTAT_MAX_PROGRESS_INFO - 1)
+		elog(ERROR, "progress information per backends is full");
+
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
-	beentry->st_progress_command = cmdtype;
-	beentry->st_progress_command_target = relid;
-	MemSet(&beentry->st_progress_param, 0, sizeof(beentry->st_progress_param));
+	beentry->st_current_cmd++;
+	MemSet(&(beentry->st_progress_cmds[beentry->st_current_cmd]),
+		   0, sizeof(PgBackendProgressInfo));
+	beentry->st_progress_cmds[beentry->st_current_cmd].command = cmdtype;
+	beentry->st_progress_cmds[beentry->st_current_cmd].target = relid;
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
 }
 
@@ -3223,11 +3237,11 @@ pgstat_progress_update_param(int index, int64 val)
 
 	Assert(index >= 0 && index < PGSTAT_NUM_PROGRESS_PARAM);
 
-	if (!beentry || !pgstat_track_activities)
+	if (!beentry || !pgstat_track_activities || beentry->st_current_cmd < 0)
 		return;
 
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
-	beentry->st_progress_param[index] = val;
+	beentry->st_progress_cmds[beentry->st_current_cmd].params[index] = val;
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
 }
 
@@ -3245,7 +3259,8 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
 	volatile PgBackendStatus *beentry = MyBEEntry;
 	int			i;
 
-	if (!beentry || !pgstat_track_activities || nparam == 0)
+	if (!beentry || !pgstat_track_activities || nparam == 0 ||
+		beentry->st_current_cmd < 0)
 		return;
 
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
@@ -3254,7 +3269,8 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
 	{
 		Assert(index[i] >= 0 && index[i] < PGSTAT_NUM_PROGRESS_PARAM);
 
-		beentry->st_progress_param[index[i]] = val[i];
+		beentry->st_progress_cmds[beentry->st_current_cmd].params[index[i]] =
+			val[i];
 	}
 
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
@@ -3274,13 +3290,18 @@ pgstat_progress_end_command(void)
 
 	if (!beentry)
 		return;
-	if (!pgstat_track_activities
-		&& beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
+
+	if (!pgstat_track_activities || beentry->st_current_cmd < 0 ||
+		beentry->st_progress_cmds[beentry->st_current_cmd].command ==
+		PROGRESS_COMMAND_INVALID)
 		return;
 
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
-	beentry->st_progress_command = PROGRESS_COMMAND_INVALID;
-	beentry->st_progress_command_target = InvalidOid;
+	beentry->st_progress_cmds[beentry->st_current_cmd].command =
+		PROGRESS_COMMAND_INVALID;
+	beentry->st_progress_cmds[beentry->st_current_cmd].target =
+		InvalidOid;
+	beentry->st_current_cmd--;
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
 }
 
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 05240bfd14..76ae2e0115 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -495,6 +495,7 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		Datum		values[PG_STAT_GET_PROGRESS_COLS];
 		bool		nulls[PG_STAT_GET_PROGRESS_COLS];
 		int			i;
+		int			cmdidx;
 
 		MemSet(values, 0, sizeof(values));
 		MemSet(nulls, 0, sizeof(nulls));
@@ -510,7 +511,18 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		 * Report values for only those backends which are running the given
 		 * command.
 		 */
-		if (!beentry || beentry->st_progress_command != cmdtype)
+		if (!beentry)
+			continue;
+
+		/* Look up the progress information of the given command */
+		for (cmdidx = 0; cmdidx <= beentry->st_current_cmd; cmdidx++)
+		{
+			if (beentry->st_progress_cmds[cmdidx].command == cmdtype)
+				break;
+		}
+
+		/* Skip if the command is not running */
+		if (cmdidx > beentry->st_current_cmd)
 			continue;
 
 		/* Value available to all callers */
@@ -520,9 +532,9 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		/* show rest of the values including relid only to role members */
 		if (has_privs_of_role(GetUserId(), beentry->st_userid))
 		{
-			values[2] = ObjectIdGetDatum(beentry->st_progress_command_target);
+			values[2] = ObjectIdGetDatum(beentry->st_progress_cmds[cmdidx].target);
 			for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
-				values[i + 3] = Int64GetDatum(beentry->st_progress_param[i]);
+				values[i + 3] = Int64GetDatum(beentry->st_progress_cmds[cmdidx].params[i]);
 		}
 		else
 		{
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index fe076d823d..b87f253777 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -953,13 +953,14 @@ typedef enum
  */
 typedef enum ProgressCommandType
 {
-	PROGRESS_COMMAND_INVALID,
+	PROGRESS_COMMAND_INVALID = 0,
 	PROGRESS_COMMAND_VACUUM,
 	PROGRESS_COMMAND_CLUSTER,
 	PROGRESS_COMMAND_CREATE_INDEX
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
+#define PGSTAT_MAX_PROGRESS_INFO	4
 
 /* ----------
  * Shared-memory data structures
@@ -1010,6 +1011,18 @@ typedef struct PgBackendGSSStatus
 
 } PgBackendGSSStatus;
 
+/*
+ * Struct for command progress information of one command. The 'target'
+ * should be the OID of the relation which the command targets (we assume
+ * there's just one, as this is meant for utility commands), but the
+ * meaning of each element in the params array is command-specific.
+ */
+typedef struct PgBackendProgressInfo
+{
+	ProgressCommandType command;
+	Oid			target;
+	int64		params[PGSTAT_NUM_PROGRESS_PARAM];
+} PgBackendProgressInfo;
 
 /* ----------
  * PgBackendStatus
@@ -1085,17 +1098,16 @@ typedef struct PgBackendStatus
 	char	   *st_activity_raw;
 
 	/*
-	 * Command progress reporting.  Any command which wishes can advertise
-	 * that it is running by setting st_progress_command,
-	 * st_progress_command_target, and st_progress_param[].
-	 * st_progress_command_target should be the OID of the relation which the
-	 * command targets (we assume there's just one, as this is meant for
-	 * utility commands), but the meaning of each element in the
-	 * st_progress_param array is command-specific.
+	 * Command progress reporting. Since it's possible to nested call
+	 * multiple commands that support progress report, e.g. CLUSTER
+	 * executes REINDEX inside, we track progress of multiple commands.
+	 * st_current_cmd is the index of st_progress_cmds of current executing
+	 * command, -1 for invalid. Any command which wishes can advertise
+	 * that is is running by increasing st_current_cmd and setting
+	 * the corresponding elements of st_progress_cmd.
 	 */
-	ProgressCommandType st_progress_command;
-	Oid			st_progress_command_target;
-	int64		st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
+	int			st_current_cmd;
+	PgBackendProgressInfo st_progress_cmds[PGSTAT_MAX_PROGRESS_INFO];
 } PgBackendStatus;
 
 /*

#78

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Masahiko Sawada (#77)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Aug 30, 2019 at 07:45:57PM +0900, Masahiko Sawada wrote:

I tried 1. and it shown index_rebuild_count, but it also shown
"initializing" phase again and other columns were empty. So, it is
not suitable to fix the problem. :(
I'm going to try 2. and 3., but, unfortunately, I can't get enough
time to do that after PGConf.Asia 2019.
If we selected 3., it affects following these progress reporting
features: VACUUM, CLUSTER, CREATE_INDEX and ANALYZE. But it's okay,
I suppose. Any comments welcome! :)

I looked at this open item. I prefer #3 but I think it would be enough
to have a small stack using a fixed length array to track nested
progress information and the current executing command (maybe 4 or 8
would be enough for maximum nested level for now?). That way, we don't
need any change the interfaces. AFAICS there is no case where we call
only either pgstat_progress_start_command or
pgstat_progress_end_command without each other (although
pgstat_progress_update_param is called without start). So I think that
having a small stack for tracking multiple reports would work.

Attached the draft patch that fixes this issue. Please review it.

Do we actually want to show to the user information about CREATE INDEX
which is different than CLUSTER? It could be confusing for the user
to see a progress report from a command different than the one
actually launched. There could be a method 4 here: do not start a new
command progress when there is another one already started, and do not
try to end it in the code path where it could not be started as it did
not stack. So while I see the advantages of stacking the progress
records as you do when doing cascading calls of the progress
reporting, I am not sure that:
1) We should bloat more PgBackendStatus for that.
2) We want to add more complication in this area close to the
release of 12.

Another solution as mentioned by Yamada-san is just to start again the
progress for the cluster command in reindex_relation() however you got
an issue here as reindex_relation() is also called by REINDEX TABLE. I
find actually very weird that we have a cluster-related field added
for REINDEX, so it seems to me that all the interactions between the
code paths of CLUSTER and REINDEX have not been completely thought
through. This part has been added by 6f97457 :(
--
Michael

#79

Masahiko Sawada

sawada.mshk@gmail.com

over 6 years ago

In reply to: Michael Paquier (#78)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Sep 2, 2019 at 4:59 PM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Aug 30, 2019 at 07:45:57PM +0900, Masahiko Sawada wrote:

I tried 1. and it shown index_rebuild_count, but it also shown
"initializing" phase again and other columns were empty. So, it is
not suitable to fix the problem. :(
I'm going to try 2. and 3., but, unfortunately, I can't get enough
time to do that after PGConf.Asia 2019.
If we selected 3., it affects following these progress reporting
features: VACUUM, CLUSTER, CREATE_INDEX and ANALYZE. But it's okay,
I suppose. Any comments welcome! :)

I looked at this open item. I prefer #3 but I think it would be enough
to have a small stack using a fixed length array to track nested
progress information and the current executing command (maybe 4 or 8
would be enough for maximum nested level for now?). That way, we don't
need any change the interfaces. AFAICS there is no case where we call
only either pgstat_progress_start_command or
pgstat_progress_end_command without each other (although
pgstat_progress_update_param is called without start). So I think that
having a small stack for tracking multiple reports would work.

Attached the draft patch that fixes this issue. Please review it.

Do we actually want to show to the user information about CREATE INDEX
which is different than CLUSTER? It could be confusing for the user
to see a progress report from a command different than the one
actually launched.

I personally think it would be helpful for users. We provide
progress reporting for each command but it might not be detailed
enough. But we can provide more detailed progress information for
each command by combining them. Only users who want to check the
details need to look at the different progress reports.

There could be a method 4 here: do not start a new
command progress when there is another one already started, and do not
try to end it in the code path where it could not be started as it did
not stack. So while I see the advantages of stacking the progress
records as you do when doing cascading calls of the progress
reporting, I am not sure that:
1) We should bloat more PgBackendStatus for that.
2) We want to add more complication in this area close to the
release of 12.

I agree, especially with 2. We can live with method #4 in PG12 and the
patch I proposed could be discussed as a new feature.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#80

Masahiko Sawada

sawada.mshk@gmail.com

over 6 years ago

In reply to: Masahiko Sawada (#79)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Sep 2, 2019 at 6:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Sep 2, 2019 at 4:59 PM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Aug 30, 2019 at 07:45:57PM +0900, Masahiko Sawada wrote:

I tried 1. and it shown index_rebuild_count, but it also shown
"initializing" phase again and other columns were empty. So, it is
not suitable to fix the problem. :(
I'm going to try 2. and 3., but, unfortunately, I can't get enough
time to do that after PGConf.Asia 2019.
If we selected 3., it affects following these progress reporting
features: VACUUM, CLUSTER, CREATE_INDEX and ANALYZE. But it's okay,
I suppose. Any comments welcome! :)

I looked at this open item. I prefer #3 but I think it would be enough
to have a small stack using a fixed length array to track nested
progress information and the current executing command (maybe 4 or 8
would be enough for maximum nested level for now?). That way, we don't
need any change the interfaces. AFAICS there is no case where we call
only either pgstat_progress_start_command or
pgstat_progress_end_command without each other (although
pgstat_progress_update_param is called without start). So I think that
having a small stack for tracking multiple reports would work.

Attached the draft patch that fixes this issue. Please review it.

Do we actually want to show to the user information about CREATE INDEX
which is different than CLUSTER? It could be confusing for the user
to see a progress report from a command different than the one
actually launched.

I personally think it would be helpful for users. We provide the
progress reporting for each commands but it might not be detailed
enough. But we can provide more details of progress information of
each commands by combining them. Only users who want to confirm the
details need to see different progress reports.

There could be a method 4 here: do not start a new
command progress when there is another one already started, and do not
try to end it in the code path where it could not be started as it did
not stack. So while I see the advantages of stacking the progress
records as you do when doing cascading calls of the progress
reporting, I am not sure that:
1) We should bloat more PgBackendStatus for that.
2) We want to add more complication in this area close to the
release of 12.

I agreed, especially to 2. We can live with #4 method in PG12 and the
patch I proposed could be discussed as a new feature.

After more thought: even if we don't start a new command progress when
another one is already started, the progress update functions could
still be called, and these functions don't specify the command type.
Therefore, the progress information might be overwritten with a wrong
value by a different command. We could probably change the callers of
the progress update functions so that they don't call them at all if
the command could not start a new progress report, but that might be a
big change.

As an alternative idea, we can make pgstat_progress_end_command() take
one argument: the command the caller wants to end. That is, we
don't end the command progress when the specified command type doesn't
match the command type of the currently running command progress. We
unconditionally clear the progress information at CommitTransaction()
and AbortTransaction(), but we can still do that by passing
PROGRESS_COMMAND_INVALID to pgstat_progress_end_command().
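
A rough sketch of that alternative, with hypothetical names (end_command_for is not an existing function, and DemoSlot only models the command field of the backend entry). It assumes, as in the paragraph above, that the invalid command value doubles as "clear unconditionally".

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CMD_INVALID = 0, CMD_CLUSTER, CMD_CREATE_INDEX } DemoCommand;

/* Hypothetical model: only the command field of the progress slot. */
typedef struct
{
	DemoCommand command;
} DemoSlot;

/*
 * Hypothetical variant of the end-command call that names the command
 * the caller wants to end.  CMD_INVALID means "clear unconditionally",
 * as would be done at transaction commit or abort.
 * Returns 1 if the slot was cleared, 0 if it was left alone.
 */
int
end_command_for(DemoSlot *s, DemoCommand cmd)
{
	assert(s != NULL);

	if (s->command == CMD_INVALID)
		return 0;			/* nothing is running */
	if (cmd != CMD_INVALID && s->command != cmd)
		return 0;			/* another command's report: leave it */

	s->command = CMD_INVALID;
	return 1;
}
```

In this model a nested REINDEX path asking to end CREATE_INDEX would leave a running CLUSTER report untouched, while the transaction-end path could still clear whatever is there.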

BTW the following condition in pgstat_progress_end_command() seems to
be wrong. We should return from the function when either
pgstat_track_activities is disabled or the current progress command is
invalid.

    if (!pgstat_track_activities
        && beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
        return;

I've attached a patch that fixes the issue I newly found.
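
To make the difference concrete, the two conditions can be compared side by side. This is a small sketch with hypothetical names; it models only the boolean test, not the rest of pgstat_progress_end_command().

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { CMD_INVALID = 0, CMD_VACUUM } DemoCommand;

/* Early-return test as currently written (uses &&). */
bool
skip_end_old(bool track_activities, DemoCommand current)
{
	return !track_activities && current == CMD_INVALID;
}

/* Early-return test as proposed in the attached patch (uses ||). */
bool
skip_end_fixed(bool track_activities, DemoCommand current)
{
	return !track_activities || current == CMD_INVALID;
}
```

The two versions agree when tracking is off and no command is running, and when tracking is on and a command is running. They differ in the mixed cases: with tracking on and no command running, or with tracking off and a command recorded, only the || version returns early.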

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

fix_progress_end_command.patch (application/octet-stream)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index d362e7f..7d8f235 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3274,8 +3274,8 @@ pgstat_progress_end_command(void)
 
 	if (!beentry)
 		return;
-	if (!pgstat_track_activities
-		&& beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
+	if (!pgstat_track_activities ||
+		beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
 		return;
 
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);

#81

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Masahiko Sawada (#80)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Sep 03, 2019 at 01:59:00PM +0900, Masahiko Sawada wrote:

After more thought, even if we don't start a new command progress when
there is another one already started the progress update functions
could be called and these functions don't specify the command type.
Therefore, the progress information might be changed with wrong value
by different command. Probably we can change the caller of progress
updating function so that it doesn't call all of them if the command
could not start a new progress report, but it might be a big change.

That's one issue.

As an alternative idea, we can make pgstat_progress_end_command() have
one argument that is command the caller wants to end. That is, we
don't end the command progress when the specified command type doesn't
match to the command type of current running command progress. We
unconditionally clear the progress information at CommitTransaction()
and AbortTransaction() but we can do that by passing
PROGRESS_COMMAND_INVALID to pgstat_progress_end_command().

Possibly. I don't dislike the idea of piling up the progress
information for cascading calls and I would use that with a depth
counter and a fixed-size array.

BTW the following condition in pgstat_progress_end_command() seems to
be wrong. We should return from the function when either
pgstat_track_activities is disabled or the current progress command is
invalid.

    if (!pgstat_track_activities
        && beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
        return;

I've attached the patch fixes the issue I newly found.

Indeed, good catch. This is wrong since b6fb647 which has introduced
the progress reports. I'll fix that one and back-patch if there are
no objections.

With my RMT hat on for v12, I don't think that it is really the moment
to discuss how we want to change this API post beta3, and we have room
for that in future development cycles. There are quite some questions
which need answers I am unsure of:
- Do we really want to support nested calls of progress reports for
multiple commands?
- Perhaps for some commands it makes sense to have an overlap of the
fields used, but we need a clear definition of what can be done or
not. I am not really comfortable with the idea of having in
reindex_relation() a progress report related only to CLUSTER, which is
also a REINDEX code path. The semantics shared between both commands
need to be thought through a bit more. For example
PROGRESS_CLUSTER_INDEX_REBUILD_COUNT could cause the system catalog to
report PROGRESS_CREATEIDX_PHASE_WAIT_3 because of an incorrect command
type, which would be just wrong for a CLUSTER command.
- Which command should be reported to the user, only the upper-level
one?
- Perhaps we can live only with the approach of not registering a new
command if one already exists, and actually be more flexible with the
phase fields used, in short we use unique numbers for each phase?

The most conservative bet from a release point of view, and actually
my bet because that's safe, would be to basically revert 6f97457
(CLUSTER), 03f9e5c (REINDEX) and ab0dfc96 (CREATE INDEX which has
overlaps with REINDEX in the build and validation paths). What I am
really scared of is that we have just barely scratched the surface of
the issues caused by the inter-dependencies between all the code
paths of those commands, and that we have much more waiting behind
this open item.
--
Michael

#82

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Michael Paquier (#81)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Sep 03, 2019 at 02:52:28PM +0900, Michael Paquier wrote:

Indeed, good catch. This is wrong since b6fb647 which has introduced
the progress reports. I'll fix that one and back-patch if there are
no objections.

OK, applied this part down to 9.6.
--
Michael

#83

Masahiko Sawada

sawada.mshk@gmail.com

over 6 years ago

In reply to: Michael Paquier (#82)

Re: [HACKERS] CLUSTER command progress monitor

On Wed, Sep 4, 2019 at 3:48 PM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Sep 03, 2019 at 02:52:28PM +0900, Michael Paquier wrote:

Indeed, good catch. This is wrong since b6fb647 which has introduced
the progress reports. I'll fix that one and back-patch if there are
no objections.

OK, applied this part down to 9.6.

Thank you!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#84

Robert Haas

robertmhaas@gmail.com

over 6 years ago

In reply to: Masahiko Sawada (#80)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Sep 3, 2019 at 1:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

After more thought, even if we don't start a new command progress when
there is another one already started the progress update functions
could be called and these functions don't specify the command type.
Therefore, the progress information might be changed with wrong value
by different command. Probably we can change the caller of progress
updating function so that it doesn't call all of them if the command
could not start a new progress report, but it might be a big change.

As an alternative idea, we can make pgstat_progress_end_command() have
one argument that is command the caller wants to end. That is, we
don't end the command progress when the specified command type doesn't
match to the command type of current running command progress. We
unconditionally clear the progress information at CommitTransaction()
and AbortTransaction() but we can do that by passing
PROGRESS_COMMAND_INVALID to pgstat_progress_end_command().

I think this is all going in the wrong direction. It's the
responsibility of the people who are calling the pgstat_progress_*
functions to only do so when it's appropriate. Having the
pgstat_progress_* functions try to untangle whether or not they ought
to ignore calls made to them is backwards.

It seems to me that the problem here can be summarized in this way:
there's a bunch of code reuse in the relevant paths here, and somebody
wasn't careful enough when adding progress reporting for one of the
new commands, and so now things are broken, because e.g. CLUSTER
progress reporting gets ended by a pgstat_progress_end_command() call
that was intended for some other utility command but is reached in the
CLUSTER case anyway. The solution to that problem in my book is to
figure out which commit broke it, and then the person who made that
commit either needs to fix it or revert it.

It's quite possible here that we need a bigger redesign to make adding
progress reporting for new commands easier and less prone to bugs, but
I don't think it's at all clear what such a redesign should look like
or even that we definitely need one, and September is not the right
time to be redesigning features for the pending release.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#85

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Robert Haas (#84)

Re: [HACKERS] CLUSTER command progress monitor

On Wed, Sep 04, 2019 at 09:18:39AM -0400, Robert Haas wrote:

I think this is all going in the wrong direction. It's the
responsibility of the people who are calling the pgstat_progress_*
functions to only do so when it's appropriate. Having the
pgstat_progress_* functions try to untangle whether or not they ought
to ignore calls made to them is backwards.

Check.

It seems to me that the problem here can be summarized in this way:
there's a bunch of code reuse in the relevant paths here, and somebody
wasn't careful enough when adding progress reporting for one of the
new commands, and so now things are broken, because e.g. CLUSTER
progress reporting gets ended by a pgstat_progress_end_command() call
that was intended for some other utility command but is reached in the
CLUSTER case anyway. The solution to that problem in my book is to
figure out which commit broke it, and then the person who made that
commit either needs to fix it or revert it.

I am not sure that it is right as well to say that the first committed
patch is right and that the follow-up ones are wrong. CLUSTER
progress was committed first (6f97457), followed a couple of days
after by CREATE INDEX (ab0dfc9) and then REINDEX (03f9e5c). So let's
have a look at them..

For CLUSTER, the progress starts and ends in cluster_rel(). CLUSTER
uses its code paths at the beginning, but then things get more
complicated, particularly with finish_heap_swap() which calls directly
reindex_table(). 6f97457 includes one progress update at point which
can be a problem per its shared nature in reindex_relation() with
PROGRESS_CLUSTER_INDEX_REBUILD_COUNT. This last part is wrong IMO,
why should cluster reporting take priority in this code path,
enforcing anything else?

For CREATE INDEX, the progress reporting starts and ends once in
DefineIndex(). However, we have updates of progress within each index
AM build routine, which could be taken by many code paths. Is it
actually fine to give priority to CREATE INDEX in those cases? Those
paths can as well be taken by REINDEX or CLUSTER (right?), so having a
counter for CREATE INDEX looks logically wrong to me. The part where
we wait for snapshots looks actually good from the perspective of
REINDEX CONCURRENTLY and CREATE INDEX CONCURRENTLY.

For REINDEX, we have a problematic start progress call in
reindex_index() which is for example called by reindex_relation for
each relation's index for a non-concurrent case (also in
ReindexRelationConcurrently()). I think that these are incorrect
locations, and I would have placed them in ReindexIndex(),
ReindexTable() and ReindexMultipleTables() so as we avoid anything
low-level. This has added two calls to pgstat_progress_update_param()
in reindex_index(), which is shared between all. Why would it be fine
to give the priority to a CREATE INDEX marker here if CLUSTER can also
cross this way?

On top of those issues, I see some problems with the current state of
affairs, and I am repeating myself:
- It is possible that pgstat_progress_update_param() is called for a
given command for a code path taken by a completely different
command, and that we have a real risk of showing a status completely
buggy as the progress phases share the same numbers.
- We don't consider wisely end and start progress handling for
cascading calls, leading to a risk where we start command A, but
for shared code paths where we assume that only command B can run then
the processing abruptly ends for command A.
- Is it actually fine to report information about a command completely
different than the one provided by the client? It is for example
possible to call CLUSTER, but show up to the user progress report
about PROGRESS_COMMAND_CREATE_INDEX (see reindex_index). This could
actually make sense if we are able to handle cascading progress
reports.

These are, at least it seems to me, fundamental problems we need to
ponder more about if we begin to include more progress reporting, and
we don't have that now, and that worries me.

It's quite possible here that we need a bigger redesign to make adding
progress reporting for new commands easier and less prone to bugs, but
I don't think it's at all clear what such a redesign should look like
or even that we definitely need one, and September is not the right
time to be redesigning features for the pending release.

Definitely.
--
Michael

#86

Robert Haas

robertmhaas@gmail.com

over 6 years ago

In reply to: Michael Paquier (#85)

Re: [HACKERS] CLUSTER command progress monitor

On Wed, Sep 4, 2019 at 9:03 PM Michael Paquier <michael@paquier.xyz> wrote:

For CLUSTER, the progress starts and ends in cluster_rel(). CLUSTER
uses its code paths at the beginning, but then things get more
complicated, particularly with finish_heap_swap() which calls directly
reindex_table(). 6f97457 includes one progress update at point which
can be a problem per its shared nature in reindex_relation() with
PROGRESS_CLUSTER_INDEX_REBUILD_COUNT. This last part is wrong IMO,
why should cluster reporting take priority in this code path,
enforcing anything else?

Oops. Yeah, that's bogus (as are some of the other things you
mention). I think we're going to have to fix this by passing down
some flags to these functions to tell them what kind of progress
updates to do (or to do none). Or else pass down a callback function
and a context object, but that seems like it might be overkill.

On top of those issues, I see some problems with the current state of
affairs, and I am repeating myself:
- It is possible that pgstat_progress_update_param() is called for a
given command for a code path taken by a completely different
command, and that we have a real risk of showing a status completely
buggy as the progress phases share the same numbers.
- We don't consider wisely end and start progress handling for
cascading calls, leading to a risk where we start command A, but
for shared code paths where we assume that only command B can run then
the processing abruptly ends for command A.

Those are just weaknesses of the infrastructure. Perhaps there is a
better solution, but there's no intrinsic reason that we can't avoid
them by careful coding.

- Is it actually fine to report information about a command completely
different than the one provided by the client? It is for example
possible to call CLUSTER, but show up to the user progress report
about PROGRESS_COMMAND_CREATE_INDEX (see reindex_index). This could
actually make sense if we are able to handle cascading progress
reports.

Well, it might be OK to do that if we're clear that this is the index
progress-reporting view and the command is CLUSTER but it happens to
be building an index now so we're showing it here. But I don't see
how it would work anyway: you can't report cascading progress
reports in shared memory because you've only got a fixed amount of
space.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#87

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Robert Haas (#86)

Re: [HACKERS] CLUSTER command progress monitor

On Thu, Sep 05, 2019 at 03:17:51PM -0400, Robert Haas wrote:

Oops. Yeah, that's bogus (as are some of the other things you
mention). I think we're going to have to fix this by passing down
some flags to these functions to tell them what kind of progress
updates to do (or to do none). Or else pass down a callback function
and a context object, but that seems like it might be overkill.

One idea I got was to pass the command ID as an extra argument of the
update routine. I am not completely sure either if we need this level
of complication.

Those are just weaknesses of the infrastructure. Perhaps there is a
better solution, but there's no intrinsic reason that we can't avoid
them by careful coding.

Perhaps. The current infra allows the addition of a progress report
in code paths which are isolated from other things. For CLUSTER, most
things are fine as long as the progress is updated in cluster_rel(),
the rest is too internal.

Well, it might be OK to do that if we're clear that this is the index
progress-reporting view and the command is CLUSTER but it happens to
be building an index now so we're showing it here. But I don't see
how it would work anyway: you can't report cascading progress
reports in shared memory because you've only got a fixed amount of
space.

I don't see exactly why we could not switch to a fixed number of
slots, say 8, with one code path to start a progress which adds an
extra report on the stack, one to remove one entry from the stack, and
a new one to reset the whole thing for a backend. This would not need
much restructuration of course.

Finally comes the question of what do we do for v12? I am adding in
CC Peter, Alvaro being already present, who have been involved in the
commits with CREATE INDEX and REINDEX. It would be sad to revert
this feature, but well I'd rather do that now than regret later
releasing the feature as it is currently shaped.. Let's see what the
others think.
--
Michael

#88

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Michael Paquier (#87)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Sep 06, 2019 at 02:44:18PM +0900, Michael Paquier wrote:

I don't see exactly why we could not switch to a fixed number of
slots, say 8, with one code path to start a progress which adds an
extra report on the stack, one to remove one entry from the stack, and
a new one to reset the whole thing for a backend. This would not need
much restructuration of course.

Wake up, Neo. Your last sentence is confusing. I meant that this
would need more design efforts, so that's not in scope for v12.
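
To make the fixed-slot stack idea quoted above concrete, here is a minimal sketch in plain C. It is a toy model, not PostgreSQL code: every `toy_*` / `Toy*` name, the parameter count, and the layout are assumptions made up for illustration, and only the depth of 8 comes from the message. It shows the three entry points described: one to push a report, one to pop it, and one to reset the whole stack for a backend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_DEPTH  8    /* "say 8" slots, as floated in the message */
#define TOY_NUM_PARAMS 20   /* arbitrary size chosen for this sketch */

/* Hypothetical command identifiers. */
typedef enum ToyCommand
{
    TOY_CMD_INVALID,
    TOY_CMD_CLUSTER,
    TOY_CMD_CREATE_INDEX
} ToyCommand;

/* One progress report: the command plus its counters. */
typedef struct ToyProgressEntry
{
    ToyCommand command;
    int64_t    params[TOY_NUM_PARAMS];
} ToyProgressEntry;

/* Fixed-size stack of reports for a single backend. */
typedef struct ToyProgressStack
{
    int              depth;     /* number of entries in use */
    ToyProgressEntry entries[TOY_MAX_DEPTH];
} ToyProgressStack;

/* Start a nested report; returns false if all slots are taken. */
bool
toy_progress_push(ToyProgressStack *st, ToyCommand cmd)
{
    ToyProgressEntry *e;

    if (st->depth >= TOY_MAX_DEPTH)
        return false;
    e = &st->entries[st->depth++];
    memset(e, 0, sizeof(*e));
    e->command = cmd;
    return true;
}

/* Update a counter of the innermost (most recently started) report. */
void
toy_progress_update(ToyProgressStack *st, int index, int64_t val)
{
    assert(index >= 0 && index < TOY_NUM_PARAMS);
    if (st->depth > 0)
        st->entries[st->depth - 1].params[index] = val;
}

/* End the innermost report; outer reports are left untouched. */
void
toy_progress_pop(ToyProgressStack *st)
{
    if (st->depth > 0)
        st->depth--;
}

/* Drop every report for this backend, e.g. at transaction end. */
void
toy_progress_reset(ToyProgressStack *st)
{
    st->depth = 0;
}
```

With this shape, a nested index build pushed on top of a CLUSTER report can be popped without disturbing the CLUSTER entry underneath. The fixed depth still bounds how much nesting can be reported.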
--
Michael

#89

Robert Haas

robertmhaas@gmail.com

over 6 years ago

In reply to: Michael Paquier (#87)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Sep 6, 2019 at 1:44 AM Michael Paquier <michael@paquier.xyz> wrote:

One idea I got was to pass the command ID as an extra argument of the
update routine. I am not completely sure either if we need this level
of complication.

I still don't think that's the right approach.

Those are just weaknesses of the infrastructure. Perhaps there is a
better solution, but there's no intrinsic reason that we can't avoid
them by careful coding.

Perhaps. The current infra allows the addition of a progress report
in code paths which are isolated from other things. For CLUSTER, most
things are fine as long as the progress is updated in cluster_rel(),
the rest is too internal.

It's fine if things are updated as well -- it's just you need to make
sure that those places know whether or not they are supposed to be
doing those updates. Again, why can't we just pass down a value
telling them "do reindex-style progress updates" or "do cluster-style
progress updates" or "do no progress updates"?
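
As a rough illustration of "pass down a value", the following toy sketch threads a reporting mode through a shared index-rebuild routine. Nothing here is PostgreSQL's actual API: the `toy_*` names, the enum, and the parameter indexes are all hypothetical and exist only to show the shape of the idea.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_PARAMS 20

/* Toy stand-in for a backend's progress counters. */
int64_t toy_params[TOY_NUM_PARAMS];

/* Hypothetical parameter indexes, chosen only for this sketch. */
#define TOY_CLUSTER_INDEX_REBUILD_COUNT 7
#define TOY_REINDEX_INDEXES_DONE        12

/* What kind of progress updates the shared code should perform. */
typedef enum ToyProgressMode
{
    TOY_PROGRESS_NONE,      /* "do no progress updates" */
    TOY_PROGRESS_REINDEX,   /* "do reindex-style progress updates" */
    TOY_PROGRESS_CLUSTER    /* "do cluster-style progress updates" */
} ToyProgressMode;

/* Shared code path: rebuilds nindexes indexes, reporting as the caller asks. */
void
toy_rebuild_indexes(int nindexes, ToyProgressMode mode)
{
    assert(nindexes >= 0);

    for (int i = 0; i < nindexes; i++)
    {
        /* ... the actual index rebuild would happen here ... */

        switch (mode)
        {
            case TOY_PROGRESS_CLUSTER:
                toy_params[TOY_CLUSTER_INDEX_REBUILD_COUNT] = i + 1;
                break;
            case TOY_PROGRESS_REINDEX:
                toy_params[TOY_REINDEX_INDEXES_DONE] = i + 1;
                break;
            case TOY_PROGRESS_NONE:
                break;          /* leave the caller's state alone */
        }
    }
}
```

The shared routine never guesses which command is running. The top-level caller, which does know, decides which counter moves.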

Well, it might be OK to do that if we're clear that this is the index
progress-reporting view and the command is CLUSTER but it happens to
be building an index now so we're showing it here. But I don't see
how it would work anyway: you can't report cascading progress
reports in shared memory because you've only got a fixed amount of
space.

I don't see exactly why we could not switch to a fixed number of
slots, say 8, with one code path to start a progress which adds an
extra report on the stack, one to remove one entry from the stack, and
a new one to reset the whole thing for a backend. This would not need
much restructuration of course.

You could do that, but I don't think it's probably that great of an
idea. Now you've built something which is significantly more complex
than the original design of this feature, but still not good enough to
report on the progress of a query tree. I tend to think we should
confine ourselves to the progress reporting that can reasonably be
done within the current infrastructure until somebody invents a really
general mechanism that can handle, essentially, an EXPLAIN-on-the-fly
of a current query tree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#90

Alvaro Herrera from 2ndQuadrant

alvherre@alvh.no-ip.org

over 6 years ago

In reply to: Michael Paquier (#87)

Re: [HACKERS] CLUSTER command progress monitor

On 2019-Sep-06, Michael Paquier wrote:

Finally comes the question of what do we do for v12? I am adding in
CC Peter, Alvaro being already present, who have been involved in the
commits with CREATE INDEX and REINDEX. It would be sad to revert
this feature, but well I'd rather do that now than regret later
releasing the feature as it is currently shaped.. Let's see what the
others think.

As far as I understand, CREATE INDEX is not affected -- only REINDEX is.
Of course, it would be sad to revert even the latter, but it's not as
bleak as reverting the whole thing.

That said, I did spend some time on this type of issue when doing CREATE
INDEX support; you can tell because I defined the columns for block
numbers in a scan separately from CREATE INDEX specific fields,
precisely to avoid multiple commands running concurrently from
clobbering unrelated columns:

/* Block numbers in a generic relation scan */
#define PROGRESS_SCAN_BLOCKS_TOTAL 15
#define PROGRESS_SCAN_BLOCKS_DONE 16

I would say that it's fairly useful to have CLUSTER report progress on
indexes being created underneath, but I understand that it might be too
late to be designing the CLUSTER report to take advantage of the CREATE
INDEX metrics.

I think a workable, not terribly invasive approach is to have REINDEX
process its commands conditionally: have the caller indicate whether
progress is to be reported, and skip the calls if not. That would
(should) prevent it from clobbering the state set up by CLUSTER.
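
A toy model of that approach, assuming a single progress slot per backend. All `toy_*` names are hypothetical, and the real functions take different arguments. It shows how a nested start/end pair in the low-level routine wipes the outer command's state, and how a caller-supplied flag avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_PARAMS 20

typedef enum ToyCommand
{
    TOY_CMD_INVALID,
    TOY_CMD_CLUSTER,
    TOY_CMD_REINDEX
} ToyCommand;

/* Single progress slot: only one report can exist at a time. */
typedef struct ToySlot
{
    ToyCommand command;
    int64_t    params[TOY_NUM_PARAMS];
} ToySlot;

ToySlot toy_slot;

/* Starting a report overwrites whatever was in the slot. */
void
toy_start(ToyCommand cmd)
{
    assert(cmd != TOY_CMD_INVALID);
    memset(&toy_slot, 0, sizeof(toy_slot));
    toy_slot.command = cmd;
}

/* Ending a report marks the slot as unused. */
void
toy_end(void)
{
    toy_slot.command = TOY_CMD_INVALID;
}

/* Low-level routine reachable from both REINDEX and CLUSTER. */
void
toy_reindex_index(bool report_progress)
{
    if (report_progress)
        toy_start(TOY_CMD_REINDEX);

    /* ... rebuild the index ... */

    if (report_progress)
        toy_end();
}

/*
 * CLUSTER-like caller.  It deliberately returns without ending its own
 * report so that the state left behind by the inner call can be inspected.
 */
void
toy_cluster(bool inner_reports)
{
    toy_start(TOY_CMD_CLUSTER);
    toy_slot.params[1] = 5;     /* pretend "phase" value */
    toy_reindex_index(inner_reports);
    /* final cleanup would still want to report progress here */
}
```

Calling `toy_cluster(true)` leaves the slot invalid, because the inner end call discarded the CLUSTER report. Calling `toy_cluster(false)` leaves the CLUSTER report intact.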

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#91

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Alvaro Herrera from 2ndQuadrant (#90)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Sep 06, 2019 at 10:27:02AM -0400, Alvaro Herrera from 2ndQuadrant wrote:

That said, I did spend some time on this type of issue when doing CREATE
INDEX support; you can tell because I defined the columns for block
numbers in a scan separately from CREATE INDEX specific fields,
precisely to avoid multiple commands running concurrently from
clobbering unrelated columns:

/* Block numbers in a generic relation scan */
#define PROGRESS_SCAN_BLOCKS_TOTAL 15
#define PROGRESS_SCAN_BLOCKS_DONE 16

Hm. It is not really clear what is the intention by looking at the
contents of progress.h.

I would say that it's fairly useful to have CLUSTER report progress on
indexes being created underneath, but I understand that it might be too
late to be designing the CLUSTER report to take advantage of the CREATE
INDEX metrics.

The same can be said about the reporting done in reindex_relation for
PROGRESS_CLUSTER_INDEX_REBUILD_COUNT. I think that it should be
removed for now.

I think a workable, not terribly invasive approach is to have REINDEX
process its commands conditionally: have the caller indicate whether
progress is to be reported, and skip the calls if not. That would
(should) prevent it from clobbering the state set up by CLUSTER.

So, you would basically add an extra flag in the options of
reindex_index() to decide if a progress report should be started or
not? I am not a fan of that because it does not take care of the root
issue which is that the start of the progress reports is too much
internal. I think that it would be actually less error prone to move
the start of the progress reporting for REINDEX out of reindex_index()
and start it at a higher level. Looking again at the code, I would
recommend that we should start the progress in ReindexIndex() before
calling reindex_index(), ReindexMultipleTables() before calling
reindex_relation() and ReindexRelationConcurrently(), and
ReindexTable() before calling reindex_relation(). That will keep
the commands from clobbering each other's in-progress reports.

It would be also very good to document clearly how the overlaps for
the progress parameter values are not happening. For example for
CREATE INDEX, we don't know why 1, 2 and 7 are not used.

Note that there is an ID collision with PROGRESS_CREATEIDX_INDEX_OID
updated in reindex_index() and the CLUSTER part
PROGRESS_CLUSTER_HEAP_BLKS_SCANNED.. There could be an argument to
use a completely different range of IDs for each command phase to
avoid any overlap, but it is scary to consider that we may not have
found all the issues with one command clobbering another one's
state..
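
The collision described above can be reproduced in a small toy model. The numeric value 6 and all `toy_*` names below are assumptions for illustration only; the point is that two parameter macros resolving to the same array index silently overwrite each other when the update call carries no command type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_PARAMS 20

/* Hypothetical numbering in which two commands reuse the same index. */
#define TOY_CLUSTER_HEAP_BLKS_SCANNED 6
#define TOY_CREATEIDX_INDEX_OID       6

/* One shared parameter array per backend; there is no per-command copy. */
int64_t toy_params[TOY_NUM_PARAMS];

/* Untyped update: the array cannot tell which command is writing. */
void
toy_update_param(int index, int64_t val)
{
    assert(index >= 0 && index < TOY_NUM_PARAMS);
    toy_params[index] = val;
}

/* Return true if any index used by one command is also used by the other. */
bool
toy_indexes_collide(const int *a, int na, const int *b, int nb)
{
    for (int i = 0; i < na; i++)
        for (int j = 0; j < nb; j++)
            if (a[i] == b[j])
                return true;
    return false;
}
```

If CLUSTER stores a block count at `TOY_CLUSTER_HEAP_BLKS_SCANNED` and shared code then stores an index OID at `TOY_CREATEIDX_INDEX_OID`, the CLUSTER column reads back the OID. A checker such as `toy_indexes_collide()` could flag overlapping ranges, but only for the parameter sets someone remembers to feed it.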
--
Michael

#92

Peter Geoghegan

pg@bowt.ie

over 6 years ago

In reply to: Robert Haas (#89)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Sep 6, 2019 at 5:11 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Sep 6, 2019 at 1:44 AM Michael Paquier <michael@paquier.xyz> wrote:

I don't see exactly why we could not switch to a fixed number of
slots, say 8, with one code path to start a progress which adds an
extra report on the stack, one to remove one entry from the stack, and
a new one to reset the whole thing for a backend. This would not need
much restructuration of course.

You could do that, but I don't think it's probably that great of an
idea. Now you've built something which is significantly more complex
than the original design of this feature, but still not good enough to
report on the progress of a query tree. I tend to think we should
confine ourselves to the progress reporting that can reasonably be
done within the current infrastructure until somebody invents a really
general mechanism that can handle, essentially, an EXPLAIN-on-the-fly
of a current query tree.

+1. Let's not complicate the progress reporting infrastructure for an
uncertain benefit.

CLUSTER/VACUUM FULL is fundamentally an awkward utility command to
target with progress reporting infrastructure. I think that it's okay
to redefine how progress reporting works with CLUSTER now, in order to
fix the REINDEX/CLUSTER state clobbering bug.

--
Peter Geoghegan

#93

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Robert Haas (#89)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Sep 06, 2019 at 08:10:58AM -0400, Robert Haas wrote:

It's fine if things are updated as well -- it's just you need to make
sure that those places know whether or not they are supposed to be
doing those updates. Again, why can't we just pass down a value
telling them "do reindex-style progress updates" or "do cluster-style
progress updates" or "do no progress updates"?

That's invasive. CREATE INDEX reporting goes pretty deep into the
tree, with steps dedicated to the builds and scans of btree (nbtsort.c
for example) and hash index AMs. In this case we have something that
does somewhat what you are looking for with report_progress which gets
set to true only for VACUUM. If we were to do something like that, we
would also need to keep some sort of mapping regarding which command
ID (as defined by ProgressCommandType) is able to use which set of
parameter flags (as defined by the arguments of
pgstat_progress_update_param() and its multi_* cousin). Then comes
the issue that some parameters may be used by multiple command types,
while others don't (REINDEX and CREATE INDEX have some shared
mapping).
--
Michael

#94

Robert Haas

robertmhaas@gmail.com

over 6 years ago

In reply to: Michael Paquier (#93)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Sep 13, 2019 at 2:49 AM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Sep 06, 2019 at 08:10:58AM -0400, Robert Haas wrote:

It's fine if things are updated as well -- it's just you need to make
sure that those places know whether or not they are supposed to be
doing those updates. Again, why can't we just pass down a value
telling them "do reindex-style progress updates" or "do cluster-style
progress updates" or "do no progress updates"?

That's invasive. CREATE INDEX reporting goes pretty deep into the
tree, with steps dedicated to the builds and scans of btree (nbtsort.c
for example) and hash index AMs. In this case we have something that
does somewhat what you are looking for with report_progress which gets
set to true only for VACUUM. If we were to do something like that, we
would also need to keep some sort of mapping regarding which command
ID (as defined by ProgressCommandType) is able to use which set of
parameter flags (as defined by the arguments of
pgstat_progress_update_param() and its multi_* cousin). Then comes
the issue that some parameters may be used by multiple command types,
while others don't (REINDEX and CREATE INDEX have some shared
mapping).

Well, if CREATE INDEX progress reporting can't be reasonably done
within the current infrastructure, then it should be reverted for v12
and not committed again until somebody upgrades the infrastructure.

I admit that I was a bit suspicious about that commit, but I figured
Alvaro knew what he was doing and didn't study it in any depth. And
perhaps he did know what he was doing and will disagree with your
assessment. But so far I haven't heard an idea for evolving the
infrastructure that sounds more than half-baked.

It's generally clear, though, that the existing infrastructure is not
well-suited to progress reporting for code that bounces all over the
tree. It's not clear that *any* infrastructure is *entirely*
well-suited to do that; such problems are inherently complex. But this
is just a very simple system which was designed to be simple and very
low cost to use, and it may be that it's been stretched outside of its
comfort zone.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#95

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Robert Haas (#94)

Re: [HACKERS] CLUSTER command progress monitor

On 2019-Sep-13, Robert Haas wrote:

On Fri, Sep 13, 2019 at 2:49 AM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Sep 06, 2019 at 08:10:58AM -0400, Robert Haas wrote:

It's fine if things are updated as well -- it's just you need to make
sure that those places know whether or not they are supposed to be
doing those updates. Again, why can't we just pass down a value
telling them "do reindex-style progress updates" or "do cluster-style
progress updates" or "do no progress updates"?

That's invasive. CREATE INDEX reporting goes pretty deep into the
tree, with steps dedicated to the builds and scans of btree (nbtsort.c
for example) and hash index AMs.

Well, if CREATE INDEX progress reporting can't be reasonably done
within the current infrastructure, then it should be reverted for v12
and not committed again until somebody upgrades the infrastructure.

Ummm ... I've been operating --in this thread-- under the assumption
that it is REINDEX to blame for this problem, not CREATE INDEX, because
my recollection is that I tested CREATE INDEX together with CLUSTER and
it worked fine. Has anybody done any actual research that the problem
is to blame on CREATE INDEX and not REINDEX?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#96

Robert Haas

robertmhaas@gmail.com

over 6 years ago

In reply to: Alvaro Herrera (#95)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Sep 13, 2019 at 12:03 PM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Ummm ... I've been operating --in this thread-- under the assumption
that it is REINDEX to blame for this problem, not CREATE INDEX, because
my recollection is that I tested CREATE INDEX together with CLUSTER and
it worked fine. Has anybody done any actual research that the problem
is to blame on CREATE INDEX and not REINDEX?

I am not sure. I think, though, that the point is that all three
commands rebuild indexes. So unless they all expect the same things in
terms of which counters get set during that process, things will not
work correctly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#97

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Tatsuro Yamada (#73)

Re: [HACKERS] CLUSTER command progress monitor

Hello Tatsuro,

On 2019-Aug-13, Tatsuro Yamada wrote:

On 2019/08/02 3:43, Alvaro Herrera wrote:

Hmm, I'm trying this out now and I don't see the index_rebuild_count
ever go up. I think it's because the indexes are built using parallel
index build ... or maybe it was the table AM changes that moved things
around, not sure. There's a period at the end when the CLUSTER command
keeps working, but it's gone from pg_stat_progress_cluster.

Thanks for your report.
I'll investigate it. :)

I have fixed it. Can you please verify?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#98

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Robert Haas (#96)

Re: [HACKERS] CLUSTER command progress monitor

On Fri, Sep 13, 2019 at 12:48:40PM -0400, Robert Haas wrote:

On Fri, Sep 13, 2019 at 12:03 PM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Ummm ... I've been operating --in this thread-- under the assumption
that it is REINDEX to blame for this problem, not CREATE INDEX, because
my recollection is that I tested CREATE INDEX together with CLUSTER and
it worked fine. Has anybody done any actual research that the problem
is to blame on CREATE INDEX and not REINDEX?

I am not sure. I think, though, that the point is that all three
commands rebuild indexes. So unless they all expect the same things in
terms of which counters get set during that process, things will not
work correctly.

I have provided a short summary of the two issues on the open item
page (https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items) as
the open item was too evasive. Here is a copy-paste for the
archives of what I wrote:
1) A progress may be started while another one is already in progress.
Hence, if progress gets stopped the previously-started state is
removed, causing all follow-up updates to not happen.
2) Progress updates happening in a code path shared between those
three commands may clobber a previous state present.

Regarding 1) and based on what I found in the code, you can blame
REINDEX reporting which has added progress_start calls in code paths
which are also taken by CREATE INDEX and CLUSTER, causing their
progress reporting to go to the void. In order to fix this one we
could do what I summarized in [1].

As mentioned by Robert, the problem summarized in 2) is much more
complex using the current infrastructure, and one could blame all the
commands per the way they do not share the same set of progress
phases. There are a couple of potential solutions which have been
discussed on the thread:
- Allow commands to share the same set of phases, which requires some
kind of mapping between the phases (?).
- Allow progress reports to become a stack. This would also take care
of any kind of issues in 1) for the future, and this can cause the
incorrect command to be reported to pg_stat_activity if not being
careful.
- Allow only reporting for a given command ID, which would basically
require to pass down the command ID to progress update APIs and bypass
an update if the command ID provided by caller does not match the
existing one started (?).

1) is pretty easy to fix based on the current state of the code, 2)
requires much more consideration, and that's no material for v12. It
could be perfectly possible as well that we have another solution not
discussed yet on this thread.

[1]: /messages/by-id/20190905010316.GB14853@paquier.xyz
--
Michael

#99

Tattsu Yama

yamatattsu@gmail.com

over 6 years ago

In reply to: Michael Paquier (#98)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

Hi Alvaro!

Hello Tatsuro,
On 2019-Aug-13, Tatsuro Yamada wrote:

On 2019/08/02 3:43, Alvaro Herrera wrote:

Hmm, I'm trying this out now and I don't see the index_rebuild_count
ever go up. I think it's because the indexes are built using parallel
index build ... or maybe it was the table AM changes that moved things
around, not sure. There's a period at the end when the CLUSTER command
keeps working, but it's gone from pg_stat_progress_cluster.

Thanks for your report.
I'll investigate it. :)

I have fixed it. Can you please verify?

Thanks! I can review your patch that fixes it.
However, I had already started fixing the problem on the last day of PGConf.Asia
(11 Sep).
The attached file is a WIP patch. In my patch, I added a "command id" to all APIs of
progress reporting to isolate commands. Therefore, it doesn't allow
cascading updates of the system views. My patch is still WIP, so it needs clean-up
and testing.
I share it anyway. :)
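
For readers skimming the attachment, the general shape of a "command id" guard can be sketched as below. This is a toy model rather than the patch itself: the `toy_*` names are hypothetical, and it assumes that a call naming a different command than the one in progress is simply ignored. The end routine follows the earlier suggestion in this thread that ending should apply only to a matching command, with an invalid command ID forcing a clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_PARAMS 20

typedef enum ToyCommand
{
    TOY_CMD_INVALID,
    TOY_CMD_CLUSTER,
    TOY_CMD_CREATE_INDEX
} ToyCommand;

/* Single progress slot for one backend. */
typedef struct ToySlot
{
    ToyCommand command;
    int64_t    params[TOY_NUM_PARAMS];
} ToySlot;

ToySlot toy_slot;

/* Begin reporting for cmd, clearing any earlier counters. */
void
toy_start_command(ToyCommand cmd)
{
    memset(&toy_slot, 0, sizeof(toy_slot));
    toy_slot.command = cmd;
}

/* Apply the update only if cmd owns the slot; report whether it was applied. */
bool
toy_update_param(ToyCommand cmd, int index, int64_t val)
{
    assert(index >= 0 && index < TOY_NUM_PARAMS);
    if (toy_slot.command != cmd)
        return false;           /* another command's update: ignored */
    toy_slot.params[index] = val;
    return true;
}

/* End the report only for its owner; TOY_CMD_INVALID clears unconditionally. */
bool
toy_end_command(ToyCommand cmd)
{
    if (cmd != TOY_CMD_INVALID && toy_slot.command != cmd)
        return false;
    toy_slot.command = TOY_CMD_INVALID;
    return true;
}
```

Under these assumptions a CREATE INDEX-style update issued while CLUSTER owns the slot is dropped, which isolates the commands but also means nested work is never reported.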

Here is a test result of my patch.
The last column, index_rebuild_count, now increases.
========================================
postgres=# select * from pg_stat_progress_cluster ; \watch 0.001;
11636|13591|postgres|16384|CLUSTER|initializing|0|0|0|0|0|0
11636|13591|postgres|16384|CLUSTER|index scanning heap|16389|251|251|0|0|0
...
11636|13591|postgres|16384|CLUSTER|index scanning heap|16389|10000|10000|0|0|0
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|0
...
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|1
...
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|2
...
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|3
...
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|4
...
11636|13591|postgres|16384|CLUSTER|performing final cleanup|16389|10000|10000|0|0|5
========================================

Thanks,
Tatsuro Yamada

Attachments:

v1_fix_progress_report_for_cluster.patch (application/octet-stream)

diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 5cc30da..1cbd336 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -166,7 +166,8 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
 	reltuples = table_index_build_scan(heap, index, indexInfo, true, true,
 									   hashbuildCallback,
 									   (void *) &buildstate, NULL);
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_TOTAL,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_TUPLES_TOTAL,
 								 buildstate.indtuples);
 
 	if (buildstate.spool)
diff --git a/src/backend/access/hash/hashsort.c b/src/backend/access/hash/hashsort.c
index 293f80f..98653d4 100644
--- a/src/backend/access/hash/hashsort.c
+++ b/src/backend/access/hash/hashsort.c
@@ -145,7 +145,8 @@ _h_indexbuild(HSpool *hspool, Relation heapRel)
 
 		_hash_doinsert(hspool->index, itup, heapRel);
 
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_DONE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_TUPLES_DONE,
 									 ++tups_done);
 	}
 }
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index fc19f40..6f086b1 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -759,7 +759,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		/* Set phase and OIDOldIndex to columns */
 		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
 		ci_val[1] = RelationGetRelid(OldIndex);
-		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CLUSTER,
+										   2, ci_index, ci_val);
 
 		tableScan = NULL;
 		heapScan = NULL;
@@ -769,7 +770,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 	else
 	{
 		/* In scan-and-sort mode and also VACUUM FULL, set phase */
-		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_PHASE,
 									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
 
 		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
@@ -777,7 +779,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		indexScan = NULL;
 
 		/* Set total heap blocks */
-		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
 									 heapScan->rs_nblocks);
 	}
 
@@ -816,7 +819,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 			 * In scan-and-sort mode and also VACUUM FULL, set heap blocks
 			 * scanned
 			 */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+										 PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
 										 heapScan->rs_cblock + 1);
 		}
 
@@ -898,7 +902,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 			 * In scan-and-sort mode, report increase in number of tuples
 			 * scanned
 			 */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+										 PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
 										 *num_tuples);
 		}
 		else
@@ -918,7 +923,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 			 */
 			ct_val[0] = *num_tuples;
 			ct_val[1] = *num_tuples;
-			pgstat_progress_update_multi_param(2, ct_index, ct_val);
+			pgstat_progress_update_multi_param(PROGRESS_COMMAND_CLUSTER,
+											   2, ct_index, ct_val);
 		}
 	}
 
@@ -938,13 +944,15 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		double		n_tuples = 0;
 
 		/* Report that we are now sorting tuples */
-		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_PHASE,
 									 PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 
 		tuplesort_performsort(tuplesort);
 
 		/* Report that we are now writing new heap */
-		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_PHASE,
 									 PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
 
 		for (;;)
@@ -963,7 +971,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 									 values, isnull,
 									 rwstate);
 			/* Report n_tuples */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+										 PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN,
 										 n_tuples);
 		}
 
@@ -1276,7 +1285,8 @@ heapam_index_build_range_scan(Relation heapRelation,
 		else
 			nblocks = hscan->rs_nblocks;
 
-		pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_SCAN_BLOCKS_TOTAL,
 									 nblocks);
 	}
 
@@ -1319,7 +1329,8 @@ heapam_index_build_range_scan(Relation heapRelation,
 
 			if (blocks_done != previous_blkno)
 			{
-				pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_SCAN_BLOCKS_DONE,
 											 blocks_done);
 				previous_blkno = blocks_done;
 			}
@@ -1681,7 +1692,8 @@ heapam_index_build_range_scan(Relation heapRelation,
 		else
 			blks_done = hscan->rs_nblocks;
 
-		pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_SCAN_BLOCKS_DONE,
 									 blks_done);
 	}
 
@@ -1762,7 +1774,8 @@ heapam_index_validate_scan(Relation heapRelation,
 								 false);	/* syncscan not OK */
 	hscan = (HeapScanDesc) scan;
 
-	pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_SCAN_BLOCKS_TOTAL,
 								 hscan->rs_nblocks);
 
 	/*
@@ -1781,7 +1794,8 @@ heapam_index_validate_scan(Relation heapRelation,
 		if ((previous_blkno == InvalidBlockNumber) ||
 			(hscan->rs_cblock != previous_blkno))
 		{
-			pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_SCAN_BLOCKS_DONE,
 										 hscan->rs_cblock);
 			previous_blkno = hscan->rs_cblock;
 		}
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1d..b7d4840 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -268,7 +268,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						get_database_name(MyDatabaseId),
 						get_namespace_name(RelationGetNamespace(onerel)),
 						RelationGetRelationName(onerel))));
-		pgstat_progress_end_command();
+		pgstat_progress_end_command(PROGRESS_COMMAND_VACUUM);
 		return;
 	}
 
@@ -314,7 +314,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 		lazy_truncate_heap(onerel, vacrelstats);
 
 	/* Report that we are now doing final cleanup */
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_FINAL_CLEANUP);
 
 	/*
@@ -367,7 +368,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						 onerel->rd_rel->relisshared,
 						 new_live_tuples,
 						 vacrelstats->new_dead_tuples);
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_VACUUM);
 
 	/* and log the action if appropriate */
 	if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
@@ -560,7 +561,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
 	initprog_val[2] = vacrelstats->max_dead_tuples;
-	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
+	pgstat_progress_update_multi_param(PROGRESS_COMMAND_VACUUM,
+									   3, initprog_index, initprog_val);
 
 	/*
 	 * Except when aggressive is set, we want to skip pages that are
@@ -656,7 +658,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 #define FORCE_CHECK_PAGE() \
 		(blkno == nblocks - 1 && should_attempt_truncation(params, vacrelstats))
 
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+									 PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
 
 		if (blkno == next_unskippable_block)
 		{
@@ -762,7 +765,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			vacuum_log_cleanup_info(onerel, vacrelstats);
 
 			/* Report that we are now vacuuming indexes */
-			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+										 PROGRESS_VACUUM_PHASE,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
@@ -779,7 +783,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
 			hvp_val[1] = vacrelstats->num_index_scans + 1;
-			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
+			pgstat_progress_update_multi_param(PROGRESS_COMMAND_VACUUM,
+											   2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -800,7 +805,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			next_fsm_block_to_vacuum = blkno;
 
 			/* Report that we are once again scanning the heap */
-			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+										 PROGRESS_VACUUM_PHASE,
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
@@ -1389,7 +1395,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	}
 
 	/* report that everything is scanned and vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
 
 	pfree(frozen);
 
@@ -1430,7 +1437,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
 		/* Report that we are now vacuuming indexes */
-		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+									 PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
@@ -1442,10 +1450,12 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
 		hvp_val[1] = vacrelstats->num_index_scans + 1;
-		pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_VACUUM,
+										   2, hvp_index, hvp_val);
 
 		/* Remove tuples from heap */
-		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+									 PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
 		lazy_vacuum_heap(onerel, vacrelstats);
 		vacrelstats->num_index_scans++;
@@ -1459,8 +1469,10 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
 	/* report all blocks vacuumed; and that we're cleaning up */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
@@ -1597,7 +1609,8 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 	TransactionId visibility_cutoff_xid;
 	bool		all_frozen;
 
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
 
 	START_CRIT_SECTION();
 
@@ -1880,7 +1893,8 @@ lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats)
 	pg_rusage_init(&ru0);
 
 	/* Report that we are now truncating */
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_TRUNCATE);
 
 	/*
@@ -2186,7 +2200,8 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 	{
 		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
 		vacrelstats->num_dead_tuples++;
-		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
+		pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+									 PROGRESS_VACUUM_NUM_DEAD_TUPLES,
 									 vacrelstats->num_dead_tuples);
 	}
 }
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd528..028afb1 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -1024,7 +1024,8 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			UnlockRelationForExtension(rel, ExclusiveLock);
 
 		if (info->report_progress)
-			pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_SCAN_BLOCKS_TOTAL,
 										 num_pages);
 
 		/* Quit if we've scanned the whole relation */
@@ -1035,7 +1036,8 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 		{
 			btvacuumpage(&vstate, blkno, blkno);
 			if (info->report_progress)
-				pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_SCAN_BLOCKS_DONE,
 											 blkno);
 		}
 	}
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index d0b9013..a4582cc 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -397,7 +397,8 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
 	buildstate->spool = btspool;
 
 	/* Report table scan phase started */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_SUBPHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_SUBPHASE,
 								 PROGRESS_BTREE_PHASE_INDEXBUILD_TABLESCAN);
 
 	/* Attempt to launch parallel worker scan when required */
@@ -508,7 +509,8 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
 			0, 0
 		};
 
-		pgstat_progress_update_multi_param(3, index, val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   3, index, val);
 	}
 
 	/* okay, all heap tuples are spooled */
@@ -559,12 +561,14 @@ _bt_leafbuild(BTSpool *btspool, BTSpool *btspool2)
 	}
 #endif							/* BTREE_BUILD_STATS */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_SUBPHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_SUBPHASE,
 								 PROGRESS_BTREE_PHASE_PERFORMSORT_1);
 	tuplesort_performsort(btspool->sortstate);
 	if (btspool2)
 	{
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_SUBPHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_SUBPHASE,
 									 PROGRESS_BTREE_PHASE_PERFORMSORT_2);
 		tuplesort_performsort(btspool2->sortstate);
 	}
@@ -584,7 +588,8 @@ _bt_leafbuild(BTSpool *btspool, BTSpool *btspool2)
 	wstate.btws_pages_written = 0;
 	wstate.btws_zeropage = NULL;	/* until needed */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_SUBPHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_SUBPHASE,
 								 PROGRESS_BTREE_PHASE_LEAF_LOAD);
 	_bt_load(&wstate, btspool, btspool2);
 }
@@ -1259,7 +1264,8 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
 			}
 
 			/* Report progress */
-			pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_DONE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_CREATEIDX_TUPLES_DONE,
 										 ++tuples_done);
 		}
 		pfree(sortKeys);
@@ -1277,7 +1283,8 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
 			_bt_buildadd(wstate, state, itup);
 
 			/* Report progress */
-			pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_DONE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_CREATEIDX_TUPLES_DONE,
 										 ++tuples_done);
 		}
 	}
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c87524e..fe30835 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2584,7 +2584,8 @@ AbortTransaction(void)
 
 	/* Clear wait information and command progress indicator */
 	pgstat_report_wait_end();
-	pgstat_progress_end_command();
+
+	pgstat_progress_end_command(PROGRESS_COMMAND_INVALID);
 
 	/* Clean up buffer I/O and buffer context locks, too */
 	AbortBufferIO();
@@ -4896,7 +4897,7 @@ AbortSubTransaction(void)
 	LWLockReleaseAll();
 
 	pgstat_report_wait_end();
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_INVALID);
 	AbortBufferIO();
 	UnlockBuffers();
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 795597b..ac15986 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2779,7 +2779,8 @@ index_build(Relation heapRelation,
 			0, 0, 0, 0
 		};
 
-		pgstat_progress_update_multi_param(6, index, val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   6, index, val);
 	}
 
 	/*
@@ -3083,7 +3084,8 @@ validate_index(Oid heapId, Oid indexId, Snapshot snapshot)
 			0, 0, 0, 0
 		};
 
-		pgstat_progress_update_multi_param(5, index, val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   5, index, val);
 	}
 
 	/* Open and lock the parent heap relation */
@@ -3150,14 +3152,16 @@ validate_index(Oid heapId, Oid indexId, Snapshot snapshot)
 			0, 0
 		};
 
-		pgstat_progress_update_multi_param(3, index, val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   3, index, val);
 	}
 	tuplesort_performsort(state.tuplesort);
 
 	/*
 	 * Now scan the heap and "merge" it with the index
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_VALIDATE_TABLESCAN);
 	table_index_validate_scan(heapRelation,
 							  indexRelation,
@@ -3340,9 +3344,11 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 
 	pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
 								  heapId);
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_COMMAND,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_COMMAND,
 								 PROGRESS_CREATEIDX_COMMAND_REINDEX);
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_INDEX_OID,
 								 indexId);
 
 	/*
@@ -3351,7 +3357,8 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 	 */
 	iRel = index_open(indexId, AccessExclusiveLock);
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
 								 iRel->rd_rel->relam);
 
 	/*
@@ -3505,7 +3512,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 				 errdetail_internal("%s",
 									pg_rusage_show(&ru0))));
 
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 	/* Close rels, but keep locks */
 	index_close(iRel, NoLock);
@@ -3631,7 +3638,8 @@ reindex_relation(Oid relid, int flags, int options)
 			Assert(!ReindexIsProcessingIndex(indexOid));
 
 			/* Set index rebuild count */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+										 PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
 										 i);
 			i++;
 		}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index ebaec4f..7f276e0 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -274,10 +274,12 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 
 	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
 	if (OidIsValid(indexOid))
-		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_COMMAND,
 									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
 	else
-		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_COMMAND,
 									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
 
 	/*
@@ -291,7 +293,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* If the table has gone away, we can skip processing it */
 	if (!OldHeap)
 	{
-		pgstat_progress_end_command();
+		pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 		return;
 	}
 
@@ -312,7 +314,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (!pg_class_ownercheck(tableOid, GetUserId()))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 			return;
 		}
 
@@ -327,7 +329,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (RELATION_IS_OTHER_TEMP(OldHeap))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 			return;
 		}
 
@@ -339,7 +341,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(indexOid)))
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
-				pgstat_progress_end_command();
+				pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 				return;
 			}
 
@@ -350,7 +352,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!HeapTupleIsValid(tuple))	/* probably can't happen */
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
-				pgstat_progress_end_command();
+				pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 				return;
 			}
 			indexForm = (Form_pg_index) GETSTRUCT(tuple);
@@ -358,7 +360,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			{
 				ReleaseSysCache(tuple);
 				relation_close(OldHeap, AccessExclusiveLock);
-				pgstat_progress_end_command();
+				pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 				return;
 			}
 			ReleaseSysCache(tuple);
@@ -413,7 +415,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		!RelationIsPopulated(OldHeap))
 	{
 		relation_close(OldHeap, AccessExclusiveLock);
-		pgstat_progress_end_command();
+		pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 		return;
 	}
 
@@ -430,7 +432,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
 
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 }
 
 /*
@@ -1353,7 +1355,8 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			i;
 
 	/* Report that we are now swapping relation files */
-	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+								 PROGRESS_CLUSTER_PHASE,
 								 PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
 
 	/* Zero out possible results from swapped_relation_files */
@@ -1404,13 +1407,15 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 		reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
 
 	/* Report that we are now reindexing relations */
-	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+								 PROGRESS_CLUSTER_PHASE,
 								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
 	/* Report that we are now doing clean up */
-	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+								 PROGRESS_CLUSTER_PHASE,
 								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
 
 	/*
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index cbac314..a746d4a 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -347,7 +347,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress)
 										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
 										  &n_old_snapshots);
 	if (progress)
-		pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, n_old_snapshots);
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_WAITFOR_TOTAL, n_old_snapshots);
 
 	for (i = 0; i < n_old_snapshots; i++)
 	{
@@ -388,14 +389,16 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress)
 			{
 				PGPROC	   *holder = BackendIdGetProc(old_snapshots[i].backendId);
 
-				pgstat_progress_update_param(PROGRESS_WAITFOR_CURRENT_PID,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_WAITFOR_CURRENT_PID,
 											 holder->pid);
 			}
 			VirtualXactLock(old_snapshots[i], true);
 		}
 
 		if (progress)
-			pgstat_progress_update_param(PROGRESS_WAITFOR_DONE, i + 1);
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_WAITFOR_DONE, i + 1);
 	}
 }
 
@@ -491,7 +494,8 @@ DefineIndex(Oid relationId,
 	{
 		pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
 									  relationId);
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_COMMAND,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_COMMAND,
 									 stmt->concurrent ?
 									 PROGRESS_CREATEIDX_COMMAND_CREATE_CONCURRENTLY :
 									 PROGRESS_CREATEIDX_COMMAND_CREATE);
@@ -500,7 +504,8 @@ DefineIndex(Oid relationId,
 	/*
 	 * No index OID to report yet
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_INDEX_OID,
 								 InvalidOid);
 
 	/*
@@ -724,7 +729,8 @@ DefineIndex(Oid relationId,
 	accessMethodId = accessMethodForm->oid;
 	amRoutine = GetIndexAmRoutine(accessMethodForm->amhandler);
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
 								 accessMethodId);
 
 	if (stmt->unique && !amRoutine->amcanunique)
@@ -1007,7 +1013,7 @@ DefineIndex(Oid relationId,
 
 		/* If this is the top-level index, we're done */
 		if (!OidIsValid(parentIndexId))
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 		return address;
 	}
@@ -1034,7 +1040,8 @@ DefineIndex(Oid relationId,
 			TupleDesc	parentDesc;
 			Oid		   *opfamOids;
 
-			pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_TOTAL,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_CREATEIDX_PARTITIONS_TOTAL,
 										 nparts);
 
 			memcpy(part_oids, partdesc->oids, sizeof(Oid) * nparts);
@@ -1214,7 +1221,8 @@ DefineIndex(Oid relationId,
 								skip_build, quiet);
 				}
 
-				pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_DONE,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_CREATEIDX_PARTITIONS_DONE,
 											 i + 1);
 				pfree(attmap);
 			}
@@ -1250,7 +1258,7 @@ DefineIndex(Oid relationId,
 		 */
 		table_close(rel, NoLock);
 		if (!OidIsValid(parentIndexId))
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 		return address;
 	}
 
@@ -1261,7 +1269,7 @@ DefineIndex(Oid relationId,
 
 		/* If this is the top-level index, we're done. */
 		if (!OidIsValid(parentIndexId))
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 		return address;
 	}
@@ -1301,7 +1309,8 @@ DefineIndex(Oid relationId,
 	/*
 	 * The index is now visible, so we can report the OID.
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_INDEX_OID,
 								 indexRelationId);
 
 	/*
@@ -1320,7 +1329,8 @@ DefineIndex(Oid relationId,
 	 * exclusive lock on our table.  The lock code will detect deadlock and
 	 * error out properly.
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_1);
 	WaitForLockers(heaplocktag, ShareLock, true);
 
@@ -1363,7 +1373,8 @@ DefineIndex(Oid relationId,
 	 * We once again wait until no transaction can have the table open with
 	 * the index marked as read-only for updates.
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_2);
 	WaitForLockers(heaplocktag, ShareLock, true);
 
@@ -1422,7 +1433,8 @@ DefineIndex(Oid relationId,
 	 * before the reference snap was taken, we have to wait out any
 	 * transactions that might have older snapshots.
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_3);
 	WaitForOlderSnapshots(limitXmin, true);
 
@@ -1446,7 +1458,7 @@ DefineIndex(Oid relationId,
 	 */
 	UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
 
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 	return address;
 }
@@ -2939,11 +2951,14 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 
 		pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
 									  RelationGetRelid(heapRel));
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_COMMAND,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_COMMAND,
 									 PROGRESS_CREATEIDX_COMMAND_REINDEX_CONCURRENTLY);
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_INDEX_OID,
 									 indexId);
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
 									 indexRel->rd_rel->relam);
 
 		/* Choose a temporary relation name for the new index */
@@ -3040,7 +3055,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 * DefineIndex() for more details.
 	 */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_1);
 	WaitForLockersMultiple(lockTags, ShareLock, true);
 	CommitTransactionCommand();
@@ -3084,7 +3100,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 * for more details.
 	 */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_2);
 	WaitForLockersMultiple(lockTags, ShareLock, true);
 	CommitTransactionCommand();
@@ -3134,7 +3151,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 		 * just before the reference snap was taken, we have to wait out any
 		 * transactions that might have older snapshots.
 		 */
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_PHASE,
 									 PROGRESS_CREATEIDX_PHASE_WAIT_3);
 		WaitForOlderSnapshots(limitXmin, true);
 
@@ -3207,7 +3225,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 * index_drop() for more details.
 	 */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_4);
 	WaitForLockersMultiple(lockTags, AccessExclusiveLock, true);
 
@@ -3231,7 +3250,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 * Drop the old indexes.
 	 */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_4);
 	WaitForLockersMultiple(lockTags, AccessExclusiveLock, true);
 
@@ -3308,7 +3328,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 
 	MemoryContextDelete(private_context);
 
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 	return true;
 }
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 099e14d..f32c597 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3203,6 +3203,9 @@ pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid)
 	if (!beentry || !pgstat_track_activities)
 		return;
 
+	if (beentry->st_progress_command != PROGRESS_COMMAND_INVALID)
+		return;
+
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
 	beentry->st_progress_command = cmdtype;
 	beentry->st_progress_command_target = relid;
@@ -3217,7 +3220,7 @@ pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid)
  *-----------
  */
 void
-pgstat_progress_update_param(int index, int64 val)
+pgstat_progress_update_param(ProgressCommandType cmdtype, int index, int64 val)
 {
 	volatile PgBackendStatus *beentry = MyBEEntry;
 
@@ -3226,9 +3229,12 @@ pgstat_progress_update_param(int index, int64 val)
 	if (!beentry || !pgstat_track_activities)
 		return;
 
-	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
-	beentry->st_progress_param[index] = val;
-	PGSTAT_END_WRITE_ACTIVITY(beentry);
+	if (cmdtype == PROGRESS_COMMAND_INVALID || beentry->st_progress_command == cmdtype)
+	{
+		PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
+		beentry->st_progress_param[index] = val;
+		PGSTAT_END_WRITE_ACTIVITY(beentry);
+	}
 }
 
 /*-----------
@@ -3239,7 +3245,8 @@ pgstat_progress_update_param(int index, int64 val)
  *-----------
  */
 void
-pgstat_progress_update_multi_param(int nparam, const int *index,
+pgstat_progress_update_multi_param(ProgressCommandType cmdtype,
+								   int nparam, const int *index,
 								   const int64 *val)
 {
 	volatile PgBackendStatus *beentry = MyBEEntry;
@@ -3248,16 +3255,19 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
 	if (!beentry || !pgstat_track_activities || nparam == 0)
 		return;
 
-	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
-
-	for (i = 0; i < nparam; ++i)
+	if (cmdtype == PROGRESS_COMMAND_INVALID || beentry->st_progress_command == cmdtype)
 	{
-		Assert(index[i] >= 0 && index[i] < PGSTAT_NUM_PROGRESS_PARAM);
+		PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
 
-		beentry->st_progress_param[index[i]] = val[i];
-	}
+		for (i = 0; i < nparam; ++i)
+		{
+			Assert(index[i] >= 0 && index[i] < PGSTAT_NUM_PROGRESS_PARAM);
 
-	PGSTAT_END_WRITE_ACTIVITY(beentry);
+			beentry->st_progress_param[index[i]] = val[i];
+		}
+
+		PGSTAT_END_WRITE_ACTIVITY(beentry);
+	}
 }
 
 /*-----------
@@ -3268,7 +3278,7 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
  *-----------
  */
 void
-pgstat_progress_end_command(void)
+pgstat_progress_end_command(ProgressCommandType cmdtype)
 {
 	volatile PgBackendStatus *beentry = MyBEEntry;
 
@@ -3278,6 +3288,9 @@ pgstat_progress_end_command(void)
 	if (beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
 		return;
 
+	if (beentry->st_progress_command != cmdtype)
+		return;
+
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
 	beentry->st_progress_command = PROGRESS_COMMAND_INVALID;
 	beentry->st_progress_command_target = InvalidOid;
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index f838b0f..52efd31 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -885,7 +885,8 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 	}
 
 	if (progress)
-		pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, total);
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_WAITFOR_TOTAL, total);
 
 	/*
 	 * Note: GetLockConflicts() never reports our own xid, hence we need not
@@ -908,14 +909,16 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 			{
 				PGPROC	   *holder = BackendIdGetProc(lockholders->backendId);
 
-				pgstat_progress_update_param(PROGRESS_WAITFOR_CURRENT_PID,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_WAITFOR_CURRENT_PID,
 											 holder->pid);
 			}
 			VirtualXactLock(*lockholders, true);
 			lockholders++;
 
 			if (progress)
-				pgstat_progress_update_param(PROGRESS_WAITFOR_DONE, ++done);
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_WAITFOR_DONE, ++done);
 		}
 	}
 	if (progress)
@@ -929,7 +932,8 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 			0, 0, 0
 		};
 
-		pgstat_progress_update_multi_param(3, index, values);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   3, index, values);
 	}
 
 	list_free_deep(holders);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 0a3ad3a..a5c3ca2 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -1288,10 +1288,12 @@ extern const char *pgstat_get_backend_desc(BackendType backendType);
 
 extern void pgstat_progress_start_command(ProgressCommandType cmdtype,
 										  Oid relid);
-extern void pgstat_progress_update_param(int index, int64 val);
-extern void pgstat_progress_update_multi_param(int nparam, const int *index,
+extern void pgstat_progress_update_param(ProgressCommandType cmdtype,
+										 int index, int64 val);
+extern void pgstat_progress_update_multi_param(ProgressCommandType cmdtype,
+											   int nparam, const int *index,
 											   const int64 *val);
-extern void pgstat_progress_end_command(void);
+extern void pgstat_progress_end_command(ProgressCommandType cmdtype);
 
 extern PgStat_TableStatus *find_tabstat_entry(Oid rel_id);
 extern PgStat_BackendFunctionEntry *find_funcstat_entry(Oid func_id);

#100

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Tattsu Yama (#99)

Re: [HACKERS] CLUSTER command progress monitor

On Sat, Sep 14, 2019 at 01:06:32PM +0900, Tattsu Yama wrote:

Thanks! I can review your patch to fix it.
However, I started working on a fix for the problem on the last day of
PGConf.Asia (11 Sep).
The attached file is a WIP patch. In my patch, I added a "command id" to all
of the progress reporting APIs to isolate commands, so one command can no
longer cascade updates into another command's system view. The patch is
still WIP, so it needs clean-up and testing.
I'm sharing it anyway. :)

+       if (cmdtype == PROGRESS_COMMAND_INVALID || beentry->st_progress_command == cmdtype)
+       {
+               PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
+               beentry->st_progress_param[index] = val;
+               PGSTAT_END_WRITE_ACTIVITY(beentry);
+       }
You basically don't need the progress reports if the command ID is
invalid, no?
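
To make that question concrete, here is a minimal sketch (not part of the patch) that isolates the guard as a pure predicate. The enum mirrors ProgressCommandType from pgstat.h; guard_wip(), guard_strict() and demo_guards() are hypothetical names used only for illustration, and guard_strict() is just one possible reading of the suggestion.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the enum in pgstat.h; only the values matter here. */
typedef enum ProgressCommandType
{
	PROGRESS_COMMAND_INVALID,
	PROGRESS_COMMAND_VACUUM,
	PROGRESS_COMMAND_CLUSTER,
	PROGRESS_COMMAND_CREATE_INDEX
} ProgressCommandType;

/* Guard as written in the WIP hunk: INVALID acts as a wildcard. */
bool
guard_wip(ProgressCommandType running, ProgressCommandType cmdtype)
{
	return cmdtype == PROGRESS_COMMAND_INVALID || running == cmdtype;
}

/* Hypothetical stricter guard: only an exact match with a live command. */
bool
guard_strict(ProgressCommandType running, ProgressCommandType cmdtype)
{
	return cmdtype != PROGRESS_COMMAND_INVALID && running == cmdtype;
}

/* Walks through the cases where the two guards agree and differ. */
void
demo_guards(void)
{
	/* Matching command: both guards let the update through. */
	assert(guard_wip(PROGRESS_COMMAND_CLUSTER, PROGRESS_COMMAND_CLUSTER));
	assert(guard_strict(PROGRESS_COMMAND_CLUSTER, PROGRESS_COMMAND_CLUSTER));

	/* Mismatched command: both guards skip the update. */
	assert(!guard_wip(PROGRESS_COMMAND_CLUSTER, PROGRESS_COMMAND_CREATE_INDEX));
	assert(!guard_strict(PROGRESS_COMMAND_CLUSTER, PROGRESS_COMMAND_CREATE_INDEX));

	/* Caller passes INVALID: only the WIP guard writes into CLUSTER's slots. */
	assert(guard_wip(PROGRESS_COMMAND_CLUSTER, PROGRESS_COMMAND_INVALID));
	assert(!guard_strict(PROGRESS_COMMAND_CLUSTER, PROGRESS_COMMAND_INVALID));
}
```

The only difference is the case where the caller passes PROGRESS_COMMAND_INVALID: the WIP form then writes no matter which command is being reported.
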

Another note is that you don't actually fix the problems related to
the calls of pgstat_progress_end_command() which have been added for
REINDEX reporting, so a progress report started for CLUSTER can get
ended earlier than expected, preventing the follow-up progress updates
from showing up.
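
A small model of that failure mode, assuming the unguarded behaviour visible in the "-" lines of the pgstat.c hunks: start takes over the entry, update writes unconditionally, and end clears whatever is running. Entry, Cmd, the old_* functions, visible_in_view() and the phase values are illustrative stand-ins, not PostgreSQL identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins; these names are illustrative, not PostgreSQL's. */
typedef enum { CMD_INVALID, CMD_CLUSTER, CMD_CREATE_INDEX } Cmd;

#define N_PARAMS 4				/* arbitrary size for this sketch */
#define PARAM_PHASE 0			/* hypothetical slot holding the phase */

typedef struct
{
	Cmd			command;		/* models st_progress_command */
	int64_t		param[N_PARAMS];	/* models st_progress_param[] */
} Entry;

/* Unguarded start: always takes over the entry and zeroes the slots. */
void
old_start(Entry *e, Cmd cmd)
{
	e->command = cmd;
	memset(e->param, 0, sizeof(e->param));
}

/* Unguarded update: writes without checking which command is running. */
void
old_update(Entry *e, int index, int64_t val)
{
	assert(index >= 0 && index < N_PARAMS);
	e->param[index] = val;
}

/* Unguarded end: clears whatever command is currently being reported. */
void
old_end(Entry *e)
{
	e->command = CMD_INVALID;
}

/* A per-command view only shows entries tagged with its own command. */
bool
visible_in_view(const Entry *e, Cmd view_cmd)
{
	return e->command == view_cmd;
}

/* CLUSTER rebuilding an index through a nested REINDEX-style report. */
bool
cluster_row_survives_nested_reindex(void)
{
	Entry		e = {CMD_INVALID, {0}};

	old_start(&e, CMD_CLUSTER);
	old_update(&e, PARAM_PHASE, 1);		/* some CLUSTER phase */

	old_start(&e, CMD_CREATE_INDEX);	/* nested report begins */
	old_end(&e);						/* nested report ends */

	old_update(&e, PARAM_PHASE, 2);		/* CLUSTER's later phase */
	return visible_in_view(&e, CMD_CLUSTER);
}
```

In this model the function returns false: the later CLUSTER update is still written, but the entry is no longer tagged as CLUSTER, so a CLUSTER-only view has no row to show it in.
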
--
Michael

#101

Tattsu Yama

yamatattsu@gmail.com

over 6 years ago

In reply to: Michael Paquier (#100)

Re: [HACKERS] CLUSTER command progress monitor

Hi Michael!

The attached file is a WIP patch. In my patch, I added a "command id" to all
of the progress reporting APIs to isolate commands, so one command can no
longer cascade updates into another command's system view. The patch is
still WIP, so it needs clean-up and testing.
I'm sharing it anyway. :)

+       if (cmdtype == PROGRESS_COMMAND_INVALID || beentry->st_progress_command == cmdtype)
+       {
+               PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
+               beentry->st_progress_param[index] = val;
+               PGSTAT_END_WRITE_ACTIVITY(beentry);
+       }
You basically don't need the progress reports if the command ID is
invalid, no?

Ah, right.
I'll check and fix that today. :)

Another note is that you don't actually fix the problems related to
the calls of pgstat_progress_end_command() which have been added for
REINDEX reporting, so a progress report started for CLUSTER can get
ended earlier than expected, preventing the follow-up progress updates
to show up.

Hmm... I fixed the problem. Please check the test result below: with the
patch applied, CLUSTER reaches its last phase, "performing final cleanup".

# Test result
========================================
postgres=# select * from pg_stat_progress_cluster ; \watch 0.001;
11636|13591|postgres|16384|CLUSTER|initializing|0|0|0|0|0|0
11636|13591|postgres|16384|CLUSTER|index scanning heap|16389|251|251|0|0|0
11636|13591|postgres|16384|CLUSTER|index scanning heap|16389|10000|10000|0|0|0
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|0 <== The last column (index_rebuild_count) is increasing!
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|1
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|2
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|3
11636|13591|postgres|16384|CLUSTER|rebuilding index|16389|10000|10000|0|0|4
11636|13591|postgres|16384|CLUSTER|performing final cleanup|16389|10000|10000|0|0|5 <== The last phase of CLUSTER!
========================================

Thanks,
Tatsuro Yamada

#102

Tattsu Yama

yamatattsu@gmail.com

over 6 years ago

In reply to: Tattsu Yama (#101)

1 attachment(s)

Re: [HACKERS] CLUSTER command progress monitor

Hi Michael,

The attached file is a WIP patch. In my patch, I added a "command id" to all
of the progress reporting APIs to isolate commands, so one command can no
longer cascade updates into another command's system view. The patch is
still WIP, so it needs clean-up and testing.
I'm sharing it anyway. :)

+       if (cmdtype == PROGRESS_COMMAND_INVALID || beentry->st_progress_command == cmdtype)
+       {
+               PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
+               beentry->st_progress_param[index] = val;
+               PGSTAT_END_WRITE_ACTIVITY(beentry);
+       }
You basically don't need the progress reports if the command ID is
invalid, no?

Ah, right.
I'll check and fix that today. :)

I fixed the patch based on your comment.
Please find it attached. :)

I should have explained the API changes in more detail. I added cmdtype as a
parameter to all of the functions (see below), so I think my patch is
similar to the following fix that you mentioned on -hackers:

- Allow only reporting for a given command ID, which would basically
require to pass down the command ID to progress update APIs and bypass an
update if the command ID provided by caller does not match the existing one
started (?).

# pgstat.c
pgstat_progress_start_command(ProgressCommandType cmdtype,...)
- Progress reporting starts only when beentry->st_progress_command is
PROGRESS_COMMAND_INVALID, i.e. no other command is being reported

pgstat_progress_end_command(ProgressCommandType cmdtype)
- Progress reporting ends only when beentry->st_progress_command equals
cmdtype

pgstat_progress_update_param(ProgressCommandType cmdtype,...) and
pgstat_progress_update_multi_param(ProgressCommandType cmdtype,...)
- Parameters are updated only if beentry->st_progress_command equals
cmdtype
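
The three rules above can be sketched as a tiny stand-alone model. This is only an illustration under stated assumptions: ProgressEntry, progress_start(), progress_update(), progress_end(), N_PARAMS and the phase values are hypothetical stand-ins for the backend status entry and the pgstat_progress_* functions, and shared-memory write protection is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef enum ProgressCommandType
{
	PROGRESS_COMMAND_INVALID,
	PROGRESS_COMMAND_VACUUM,
	PROGRESS_COMMAND_CLUSTER,
	PROGRESS_COMMAND_CREATE_INDEX
} ProgressCommandType;

#define N_PARAMS 4				/* arbitrary size for this sketch */

typedef struct ProgressEntry
{
	ProgressCommandType command;	/* models st_progress_command */
	int64_t		param[N_PARAMS];	/* models st_progress_param[] */
} ProgressEntry;

/* Start only if nothing else is being reported. */
void
progress_start(ProgressEntry *e, ProgressCommandType cmdtype)
{
	if (e->command != PROGRESS_COMMAND_INVALID)
		return;
	e->command = cmdtype;
	memset(e->param, 0, sizeof(e->param));
}

/* Update only if the caller's command is the one being reported. */
void
progress_update(ProgressEntry *e, ProgressCommandType cmdtype,
				int index, int64_t val)
{
	assert(index >= 0 && index < N_PARAMS);
	if (e->command != cmdtype)
		return;
	e->param[index] = val;
}

/* End only if the caller's command is the one being reported. */
void
progress_end(ProgressEntry *e, ProgressCommandType cmdtype)
{
	if (e->command == PROGRESS_COMMAND_INVALID || e->command != cmdtype)
		return;
	e->command = PROGRESS_COMMAND_INVALID;
}

/* Replays CLUSTER with a nested CREATE INDEX report; returns slot 0. */
int64_t
replay_cluster_with_nested_reindex(ProgressEntry *e)
{
	progress_start(e, PROGRESS_COMMAND_CLUSTER);
	progress_update(e, PROGRESS_COMMAND_CLUSTER, 0, 1);

	/* Nested calls carry a different command ID, so all three are no-ops. */
	progress_start(e, PROGRESS_COMMAND_CREATE_INDEX);
	progress_update(e, PROGRESS_COMMAND_CREATE_INDEX, 0, 99);
	progress_end(e, PROGRESS_COMMAND_CREATE_INDEX);

	progress_update(e, PROGRESS_COMMAND_CLUSTER, 0, 2);
	return e->param[0];
}
```

Starting from an entry whose command is PROGRESS_COMMAND_INVALID, the replay leaves the entry tagged as CLUSTER with CLUSTER's last value in slot 0. One side effect of the end rule in this model is that progress_end(e, PROGRESS_COMMAND_INVALID) clears nothing, which may be worth checking against the abort paths that pass that value.
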

Note:
cmdtype means the ProgressCommandType below:

# pgstat.h
typedef enum ProgressCommandType
{
	PROGRESS_COMMAND_INVALID,
	PROGRESS_COMMAND_VACUUM,
	PROGRESS_COMMAND_CLUSTER,
	PROGRESS_COMMAND_CREATE_INDEX
} ProgressCommandType;

Thanks,
Tatsuro Yamada

Attachments:

v2_fix_progress_report_for_cluster.patch (application/octet-stream)

diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 5cc30da..1cbd336 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -166,7 +166,8 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo)
 	reltuples = table_index_build_scan(heap, index, indexInfo, true, true,
 									   hashbuildCallback,
 									   (void *) &buildstate, NULL);
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_TOTAL,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_TUPLES_TOTAL,
 								 buildstate.indtuples);
 
 	if (buildstate.spool)
diff --git a/src/backend/access/hash/hashsort.c b/src/backend/access/hash/hashsort.c
index 293f80f..98653d4 100644
--- a/src/backend/access/hash/hashsort.c
+++ b/src/backend/access/hash/hashsort.c
@@ -145,7 +145,8 @@ _h_indexbuild(HSpool *hspool, Relation heapRel)
 
 		_hash_doinsert(hspool->index, itup, heapRel);
 
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_DONE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_TUPLES_DONE,
 									 ++tups_done);
 	}
 }
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index fc19f40..6f086b1 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -759,7 +759,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		/* Set phase and OIDOldIndex to columns */
 		ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
 		ci_val[1] = RelationGetRelid(OldIndex);
-		pgstat_progress_update_multi_param(2, ci_index, ci_val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CLUSTER,
+										   2, ci_index, ci_val);
 
 		tableScan = NULL;
 		heapScan = NULL;
@@ -769,7 +770,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 	else
 	{
 		/* In scan-and-sort mode and also VACUUM FULL, set phase */
-		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_PHASE,
 									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
 
 		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
@@ -777,7 +779,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		indexScan = NULL;
 
 		/* Set total heap blocks */
-		pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
 									 heapScan->rs_nblocks);
 	}
 
@@ -816,7 +819,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 			 * In scan-and-sort mode and also VACUUM FULL, set heap blocks
 			 * scanned
 			 */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+										 PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
 										 heapScan->rs_cblock + 1);
 		}
 
@@ -898,7 +902,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 			 * In scan-and-sort mode, report increase in number of tuples
 			 * scanned
 			 */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+										 PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
 										 *num_tuples);
 		}
 		else
@@ -918,7 +923,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 			 */
 			ct_val[0] = *num_tuples;
 			ct_val[1] = *num_tuples;
-			pgstat_progress_update_multi_param(2, ct_index, ct_val);
+			pgstat_progress_update_multi_param(PROGRESS_COMMAND_CLUSTER,
+											   2, ct_index, ct_val);
 		}
 	}
 
@@ -938,13 +944,15 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		double		n_tuples = 0;
 
 		/* Report that we are now sorting tuples */
-		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_PHASE,
 									 PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
 
 		tuplesort_performsort(tuplesort);
 
 		/* Report that we are now writing new heap */
-		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_PHASE,
 									 PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
 
 		for (;;)
@@ -963,7 +971,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 									 values, isnull,
 									 rwstate);
 			/* Report n_tuples */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+										 PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN,
 										 n_tuples);
 		}
 
@@ -1276,7 +1285,8 @@ heapam_index_build_range_scan(Relation heapRelation,
 		else
 			nblocks = hscan->rs_nblocks;
 
-		pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_SCAN_BLOCKS_TOTAL,
 									 nblocks);
 	}
 
@@ -1319,7 +1329,8 @@ heapam_index_build_range_scan(Relation heapRelation,
 
 			if (blocks_done != previous_blkno)
 			{
-				pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_SCAN_BLOCKS_DONE,
 											 blocks_done);
 				previous_blkno = blocks_done;
 			}
@@ -1681,7 +1692,8 @@ heapam_index_build_range_scan(Relation heapRelation,
 		else
 			blks_done = hscan->rs_nblocks;
 
-		pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_SCAN_BLOCKS_DONE,
 									 blks_done);
 	}
 
@@ -1762,7 +1774,8 @@ heapam_index_validate_scan(Relation heapRelation,
 								 false);	/* syncscan not OK */
 	hscan = (HeapScanDesc) scan;
 
-	pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_SCAN_BLOCKS_TOTAL,
 								 hscan->rs_nblocks);
 
 	/*
@@ -1781,7 +1794,8 @@ heapam_index_validate_scan(Relation heapRelation,
 		if ((previous_blkno == InvalidBlockNumber) ||
 			(hscan->rs_cblock != previous_blkno))
 		{
-			pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_SCAN_BLOCKS_DONE,
 										 hscan->rs_cblock);
 			previous_blkno = hscan->rs_cblock;
 		}
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a3c4a1d..b7d4840 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -268,7 +268,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						get_database_name(MyDatabaseId),
 						get_namespace_name(RelationGetNamespace(onerel)),
 						RelationGetRelationName(onerel))));
-		pgstat_progress_end_command();
+		pgstat_progress_end_command(PROGRESS_COMMAND_VACUUM);
 		return;
 	}
 
@@ -314,7 +314,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 		lazy_truncate_heap(onerel, vacrelstats);
 
 	/* Report that we are now doing final cleanup */
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_FINAL_CLEANUP);
 
 	/*
@@ -367,7 +368,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
 						 onerel->rd_rel->relisshared,
 						 new_live_tuples,
 						 vacrelstats->new_dead_tuples);
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_VACUUM);
 
 	/* and log the action if appropriate */
 	if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
@@ -560,7 +561,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = nblocks;
 	initprog_val[2] = vacrelstats->max_dead_tuples;
-	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
+	pgstat_progress_update_multi_param(PROGRESS_COMMAND_VACUUM,
+									   3, initprog_index, initprog_val);
 
 	/*
 	 * Except when aggressive is set, we want to skip pages that are
@@ -656,7 +658,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 #define FORCE_CHECK_PAGE() \
 		(blkno == nblocks - 1 && should_attempt_truncation(params, vacrelstats))
 
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+									 PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
 
 		if (blkno == next_unskippable_block)
 		{
@@ -762,7 +765,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			vacuum_log_cleanup_info(onerel, vacrelstats);
 
 			/* Report that we are now vacuuming indexes */
-			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+										 PROGRESS_VACUUM_PHASE,
 										 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 			/* Remove index entries */
@@ -779,7 +783,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			 */
 			hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
 			hvp_val[1] = vacrelstats->num_index_scans + 1;
-			pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
+			pgstat_progress_update_multi_param(PROGRESS_COMMAND_VACUUM,
+											   2, hvp_index, hvp_val);
 
 			/* Remove tuples from heap */
 			lazy_vacuum_heap(onerel, vacrelstats);
@@ -800,7 +805,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 			next_fsm_block_to_vacuum = blkno;
 
 			/* Report that we are once again scanning the heap */
-			pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+										 PROGRESS_VACUUM_PHASE,
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
@@ -1389,7 +1395,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 	}
 
 	/* report that everything is scanned and vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
 
 	pfree(frozen);
 
@@ -1430,7 +1437,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
 		/* Report that we are now vacuuming indexes */
-		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+									 PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_INDEX);
 
 		/* Remove index entries */
@@ -1442,10 +1450,12 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		/* Report that we are now vacuuming the heap */
 		hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
 		hvp_val[1] = vacrelstats->num_index_scans + 1;
-		pgstat_progress_update_multi_param(2, hvp_index, hvp_val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_VACUUM,
+										   2, hvp_index, hvp_val);
 
 		/* Remove tuples from heap */
-		pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+									 PROGRESS_VACUUM_PHASE,
 									 PROGRESS_VACUUM_PHASE_VACUUM_HEAP);
 		lazy_vacuum_heap(onerel, vacrelstats);
 		vacrelstats->num_index_scans++;
@@ -1459,8 +1469,10 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
 		FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
 
 	/* report all blocks vacuumed; and that we're cleaning up */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);
 
 	/* Do post-vacuum cleanup and statistics update for each index */
@@ -1597,7 +1609,8 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
 	TransactionId visibility_cutoff_xid;
 	bool		all_frozen;
 
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
 
 	START_CRIT_SECTION();
 
@@ -1880,7 +1893,8 @@ lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats)
 	pg_rusage_init(&ru0);
 
 	/* Report that we are now truncating */
-	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+								 PROGRESS_VACUUM_PHASE,
 								 PROGRESS_VACUUM_PHASE_TRUNCATE);
 
 	/*
@@ -2186,7 +2200,8 @@ lazy_record_dead_tuple(LVRelStats *vacrelstats,
 	{
 		vacrelstats->dead_tuples[vacrelstats->num_dead_tuples] = *itemptr;
 		vacrelstats->num_dead_tuples++;
-		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
+		pgstat_progress_update_param(PROGRESS_COMMAND_VACUUM,
+									 PROGRESS_VACUUM_NUM_DEAD_TUPLES,
 									 vacrelstats->num_dead_tuples);
 	}
 }
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 4cfd528..028afb1 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -1024,7 +1024,8 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 			UnlockRelationForExtension(rel, ExclusiveLock);
 
 		if (info->report_progress)
-			pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_SCAN_BLOCKS_TOTAL,
 										 num_pages);
 
 		/* Quit if we've scanned the whole relation */
@@ -1035,7 +1036,8 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 		{
 			btvacuumpage(&vstate, blkno, blkno);
 			if (info->report_progress)
-				pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_SCAN_BLOCKS_DONE,
 											 blkno);
 		}
 	}
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index d0b9013..a4582cc 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -397,7 +397,8 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
 	buildstate->spool = btspool;
 
 	/* Report table scan phase started */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_SUBPHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_SUBPHASE,
 								 PROGRESS_BTREE_PHASE_INDEXBUILD_TABLESCAN);
 
 	/* Attempt to launch parallel worker scan when required */
@@ -508,7 +509,8 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
 			0, 0
 		};
 
-		pgstat_progress_update_multi_param(3, index, val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   3, index, val);
 	}
 
 	/* okay, all heap tuples are spooled */
@@ -559,12 +561,14 @@ _bt_leafbuild(BTSpool *btspool, BTSpool *btspool2)
 	}
 #endif							/* BTREE_BUILD_STATS */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_SUBPHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_SUBPHASE,
 								 PROGRESS_BTREE_PHASE_PERFORMSORT_1);
 	tuplesort_performsort(btspool->sortstate);
 	if (btspool2)
 	{
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_SUBPHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_SUBPHASE,
 									 PROGRESS_BTREE_PHASE_PERFORMSORT_2);
 		tuplesort_performsort(btspool2->sortstate);
 	}
@@ -584,7 +588,8 @@ _bt_leafbuild(BTSpool *btspool, BTSpool *btspool2)
 	wstate.btws_pages_written = 0;
 	wstate.btws_zeropage = NULL;	/* until needed */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_SUBPHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_SUBPHASE,
 								 PROGRESS_BTREE_PHASE_LEAF_LOAD);
 	_bt_load(&wstate, btspool, btspool2);
 }
@@ -1259,7 +1264,8 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
 			}
 
 			/* Report progress */
-			pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_DONE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_CREATEIDX_TUPLES_DONE,
 										 ++tuples_done);
 		}
 		pfree(sortKeys);
@@ -1277,7 +1283,8 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
 			_bt_buildadd(wstate, state, itup);
 
 			/* Report progress */
-			pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_DONE,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_CREATEIDX_TUPLES_DONE,
 										 ++tuples_done);
 		}
 	}
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c87524e..fe30835 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2584,7 +2584,8 @@ AbortTransaction(void)
 
 	/* Clear wait information and command progress indicator */
 	pgstat_report_wait_end();
-	pgstat_progress_end_command();
+
+	pgstat_progress_end_command(PROGRESS_COMMAND_INVALID);
 
 	/* Clean up buffer I/O and buffer context locks, too */
 	AbortBufferIO();
@@ -4896,7 +4897,7 @@ AbortSubTransaction(void)
 	LWLockReleaseAll();
 
 	pgstat_report_wait_end();
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_INVALID);
 	AbortBufferIO();
 	UnlockBuffers();
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 795597b..ac15986 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2779,7 +2779,8 @@ index_build(Relation heapRelation,
 			0, 0, 0, 0
 		};
 
-		pgstat_progress_update_multi_param(6, index, val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   6, index, val);
 	}
 
 	/*
@@ -3083,7 +3084,8 @@ validate_index(Oid heapId, Oid indexId, Snapshot snapshot)
 			0, 0, 0, 0
 		};
 
-		pgstat_progress_update_multi_param(5, index, val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   5, index, val);
 	}
 
 	/* Open and lock the parent heap relation */
@@ -3150,14 +3152,16 @@ validate_index(Oid heapId, Oid indexId, Snapshot snapshot)
 			0, 0
 		};
 
-		pgstat_progress_update_multi_param(3, index, val);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   3, index, val);
 	}
 	tuplesort_performsort(state.tuplesort);
 
 	/*
 	 * Now scan the heap and "merge" it with the index
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_VALIDATE_TABLESCAN);
 	table_index_validate_scan(heapRelation,
 							  indexRelation,
@@ -3340,9 +3344,11 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 
 	pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
 								  heapId);
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_COMMAND,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_COMMAND,
 								 PROGRESS_CREATEIDX_COMMAND_REINDEX);
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_INDEX_OID,
 								 indexId);
 
 	/*
@@ -3351,7 +3357,8 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 	 */
 	iRel = index_open(indexId, AccessExclusiveLock);
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
 								 iRel->rd_rel->relam);
 
 	/*
@@ -3505,7 +3512,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 				 errdetail_internal("%s",
 									pg_rusage_show(&ru0))));
 
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 	/* Close rels, but keep locks */
 	index_close(iRel, NoLock);
@@ -3631,7 +3638,8 @@ reindex_relation(Oid relid, int flags, int options)
 			Assert(!ReindexIsProcessingIndex(indexOid));
 
 			/* Set index rebuild count */
-			pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+										 PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
 										 i);
 			i++;
 		}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index ebaec4f..7f276e0 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -274,10 +274,12 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 
 	pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
 	if (OidIsValid(indexOid))
-		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_COMMAND,
 									 PROGRESS_CLUSTER_COMMAND_CLUSTER);
 	else
-		pgstat_progress_update_param(PROGRESS_CLUSTER_COMMAND,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+									 PROGRESS_CLUSTER_COMMAND,
 									 PROGRESS_CLUSTER_COMMAND_VACUUM_FULL);
 
 	/*
@@ -291,7 +293,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 	/* If the table has gone away, we can skip processing it */
 	if (!OldHeap)
 	{
-		pgstat_progress_end_command();
+		pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 		return;
 	}
 
@@ -312,7 +314,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (!pg_class_ownercheck(tableOid, GetUserId()))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 			return;
 		}
 
@@ -327,7 +329,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		if (RELATION_IS_OTHER_TEMP(OldHeap))
 		{
 			relation_close(OldHeap, AccessExclusiveLock);
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 			return;
 		}
 
@@ -339,7 +341,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(indexOid)))
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
-				pgstat_progress_end_command();
+				pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 				return;
 			}
 
@@ -350,7 +352,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			if (!HeapTupleIsValid(tuple))	/* probably can't happen */
 			{
 				relation_close(OldHeap, AccessExclusiveLock);
-				pgstat_progress_end_command();
+				pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 				return;
 			}
 			indexForm = (Form_pg_index) GETSTRUCT(tuple);
@@ -358,7 +360,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 			{
 				ReleaseSysCache(tuple);
 				relation_close(OldHeap, AccessExclusiveLock);
-				pgstat_progress_end_command();
+				pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 				return;
 			}
 			ReleaseSysCache(tuple);
@@ -413,7 +415,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 		!RelationIsPopulated(OldHeap))
 	{
 		relation_close(OldHeap, AccessExclusiveLock);
-		pgstat_progress_end_command();
+		pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 		return;
 	}
 
@@ -430,7 +432,7 @@ cluster_rel(Oid tableOid, Oid indexOid, int options)
 
 	/* NB: rebuild_relation does table_close() on OldHeap */
 
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_CLUSTER);
 }
 
 /*
@@ -1353,7 +1355,8 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 	int			i;
 
 	/* Report that we are now swapping relation files */
-	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+								 PROGRESS_CLUSTER_PHASE,
 								 PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES);
 
 	/* Zero out possible results from swapped_relation_files */
@@ -1404,13 +1407,15 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
 		reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT;
 
 	/* Report that we are now reindexing relations */
-	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+								 PROGRESS_CLUSTER_PHASE,
 								 PROGRESS_CLUSTER_PHASE_REBUILD_INDEX);
 
 	reindex_relation(OIDOldHeap, reindex_flags, 0);
 
 	/* Report that we are now doing clean up */
-	pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CLUSTER,
+								 PROGRESS_CLUSTER_PHASE,
 								 PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP);
 
 	/*
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index cbac314..a746d4a 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -347,7 +347,8 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress)
 										  PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
 										  &n_old_snapshots);
 	if (progress)
-		pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, n_old_snapshots);
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_WAITFOR_TOTAL, n_old_snapshots);
 
 	for (i = 0; i < n_old_snapshots; i++)
 	{
@@ -388,14 +389,16 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress)
 			{
 				PGPROC	   *holder = BackendIdGetProc(old_snapshots[i].backendId);
 
-				pgstat_progress_update_param(PROGRESS_WAITFOR_CURRENT_PID,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_WAITFOR_CURRENT_PID,
 											 holder->pid);
 			}
 			VirtualXactLock(old_snapshots[i], true);
 		}
 
 		if (progress)
-			pgstat_progress_update_param(PROGRESS_WAITFOR_DONE, i + 1);
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_WAITFOR_DONE, i + 1);
 	}
 }
 
@@ -491,7 +494,8 @@ DefineIndex(Oid relationId,
 	{
 		pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
 									  relationId);
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_COMMAND,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_COMMAND,
 									 stmt->concurrent ?
 									 PROGRESS_CREATEIDX_COMMAND_CREATE_CONCURRENTLY :
 									 PROGRESS_CREATEIDX_COMMAND_CREATE);
@@ -500,7 +504,8 @@ DefineIndex(Oid relationId,
 	/*
 	 * No index OID to report yet
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_INDEX_OID,
 								 InvalidOid);
 
 	/*
@@ -724,7 +729,8 @@ DefineIndex(Oid relationId,
 	accessMethodId = accessMethodForm->oid;
 	amRoutine = GetIndexAmRoutine(accessMethodForm->amhandler);
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
 								 accessMethodId);
 
 	if (stmt->unique && !amRoutine->amcanunique)
@@ -1007,7 +1013,7 @@ DefineIndex(Oid relationId,
 
 		/* If this is the top-level index, we're done */
 		if (!OidIsValid(parentIndexId))
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 		return address;
 	}
@@ -1034,7 +1040,8 @@ DefineIndex(Oid relationId,
 			TupleDesc	parentDesc;
 			Oid		   *opfamOids;
 
-			pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_TOTAL,
+			pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+										 PROGRESS_CREATEIDX_PARTITIONS_TOTAL,
 										 nparts);
 
 			memcpy(part_oids, partdesc->oids, sizeof(Oid) * nparts);
@@ -1214,7 +1221,8 @@ DefineIndex(Oid relationId,
 								skip_build, quiet);
 				}
 
-				pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_DONE,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_CREATEIDX_PARTITIONS_DONE,
 											 i + 1);
 				pfree(attmap);
 			}
@@ -1250,7 +1258,7 @@ DefineIndex(Oid relationId,
 		 */
 		table_close(rel, NoLock);
 		if (!OidIsValid(parentIndexId))
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 		return address;
 	}
 
@@ -1261,7 +1269,7 @@ DefineIndex(Oid relationId,
 
 		/* If this is the top-level index, we're done. */
 		if (!OidIsValid(parentIndexId))
-			pgstat_progress_end_command();
+			pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 		return address;
 	}
@@ -1301,7 +1309,8 @@ DefineIndex(Oid relationId,
 	/*
 	 * The index is now visible, so we can report the OID.
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_INDEX_OID,
 								 indexRelationId);
 
 	/*
@@ -1320,7 +1329,8 @@ DefineIndex(Oid relationId,
 	 * exclusive lock on our table.  The lock code will detect deadlock and
 	 * error out properly.
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_1);
 	WaitForLockers(heaplocktag, ShareLock, true);
 
@@ -1363,7 +1373,8 @@ DefineIndex(Oid relationId,
 	 * We once again wait until no transaction can have the table open with
 	 * the index marked as read-only for updates.
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_2);
 	WaitForLockers(heaplocktag, ShareLock, true);
 
@@ -1422,7 +1433,8 @@ DefineIndex(Oid relationId,
 	 * before the reference snap was taken, we have to wait out any
 	 * transactions that might have older snapshots.
 	 */
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_3);
 	WaitForOlderSnapshots(limitXmin, true);
 
@@ -1446,7 +1458,7 @@ DefineIndex(Oid relationId,
 	 */
 	UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
 
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 	return address;
 }
@@ -2939,11 +2951,14 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 
 		pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
 									  RelationGetRelid(heapRel));
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_COMMAND,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_COMMAND,
 									 PROGRESS_CREATEIDX_COMMAND_REINDEX_CONCURRENTLY);
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_INDEX_OID,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_INDEX_OID,
 									 indexId);
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
 									 indexRel->rd_rel->relam);
 
 		/* Choose a temporary relation name for the new index */
@@ -3040,7 +3055,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 * DefineIndex() for more details.
 	 */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_1);
 	WaitForLockersMultiple(lockTags, ShareLock, true);
 	CommitTransactionCommand();
@@ -3084,7 +3100,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 * for more details.
 	 */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_2);
 	WaitForLockersMultiple(lockTags, ShareLock, true);
 	CommitTransactionCommand();
@@ -3134,7 +3151,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 		 * just before the reference snap was taken, we have to wait out any
 		 * transactions that might have older snapshots.
 		 */
-		pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_CREATEIDX_PHASE,
 									 PROGRESS_CREATEIDX_PHASE_WAIT_3);
 		WaitForOlderSnapshots(limitXmin, true);
 
@@ -3207,7 +3225,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 * index_drop() for more details.
 	 */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_4);
 	WaitForLockersMultiple(lockTags, AccessExclusiveLock, true);
 
@@ -3231,7 +3250,8 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 * Drop the old indexes.
 	 */
 
-	pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+	pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+								 PROGRESS_CREATEIDX_PHASE,
 								 PROGRESS_CREATEIDX_PHASE_WAIT_4);
 	WaitForLockersMultiple(lockTags, AccessExclusiveLock, true);
 
@@ -3308,7 +3328,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 
 	MemoryContextDelete(private_context);
 
-	pgstat_progress_end_command();
+	pgstat_progress_end_command(PROGRESS_COMMAND_CREATE_INDEX);
 
 	return true;
 }
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 099e14d..552779d 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3203,6 +3203,9 @@ pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid)
 	if (!beentry || !pgstat_track_activities)
 		return;
 
+	if (beentry->st_progress_command != PROGRESS_COMMAND_INVALID)
+		return;
+
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
 	beentry->st_progress_command = cmdtype;
 	beentry->st_progress_command_target = relid;
@@ -3217,7 +3220,7 @@ pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid)
  *-----------
  */
 void
-pgstat_progress_update_param(int index, int64 val)
+pgstat_progress_update_param(ProgressCommandType cmdtype, int index, int64 val)
 {
 	volatile PgBackendStatus *beentry = MyBEEntry;
 
@@ -3226,6 +3229,9 @@ pgstat_progress_update_param(int index, int64 val)
 	if (!beentry || !pgstat_track_activities)
 		return;
 
+	if (beentry->st_progress_command != cmdtype)
+		return;
+
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
 	beentry->st_progress_param[index] = val;
 	PGSTAT_END_WRITE_ACTIVITY(beentry);
@@ -3239,7 +3245,8 @@ pgstat_progress_update_param(int index, int64 val)
  *-----------
  */
 void
-pgstat_progress_update_multi_param(int nparam, const int *index,
+pgstat_progress_update_multi_param(ProgressCommandType cmdtype,
+								   int nparam, const int *index,
 								   const int64 *val)
 {
 	volatile PgBackendStatus *beentry = MyBEEntry;
@@ -3248,6 +3255,9 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
 	if (!beentry || !pgstat_track_activities || nparam == 0)
 		return;
 
+	if (beentry->st_progress_command != cmdtype)
+		return;
+
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
 
 	for (i = 0; i < nparam; ++i)
@@ -3268,14 +3278,14 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
  *-----------
  */
 void
-pgstat_progress_end_command(void)
+pgstat_progress_end_command(ProgressCommandType cmdtype)
 {
 	volatile PgBackendStatus *beentry = MyBEEntry;
 
 	if (!beentry || !pgstat_track_activities)
 		return;
 
-	if (beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
+	if (beentry->st_progress_command != cmdtype)
 		return;
 
 	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index f838b0f..52efd31 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -885,7 +885,8 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 	}
 
 	if (progress)
-		pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, total);
+		pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+									 PROGRESS_WAITFOR_TOTAL, total);
 
 	/*
 	 * Note: GetLockConflicts() never reports our own xid, hence we need not
@@ -908,14 +909,16 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 			{
 				PGPROC	   *holder = BackendIdGetProc(lockholders->backendId);
 
-				pgstat_progress_update_param(PROGRESS_WAITFOR_CURRENT_PID,
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_WAITFOR_CURRENT_PID,
 											 holder->pid);
 			}
 			VirtualXactLock(*lockholders, true);
 			lockholders++;
 
 			if (progress)
-				pgstat_progress_update_param(PROGRESS_WAITFOR_DONE, ++done);
+				pgstat_progress_update_param(PROGRESS_COMMAND_CREATE_INDEX,
+											 PROGRESS_WAITFOR_DONE, ++done);
 		}
 	}
 	if (progress)
@@ -929,7 +932,8 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
 			0, 0, 0
 		};
 
-		pgstat_progress_update_multi_param(3, index, values);
+		pgstat_progress_update_multi_param(PROGRESS_COMMAND_CREATE_INDEX,
+										   3, index, values);
 	}
 
 	list_free_deep(holders);
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 0a3ad3a..a5c3ca2 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -1288,10 +1288,12 @@ extern const char *pgstat_get_backend_desc(BackendType backendType);
 
 extern void pgstat_progress_start_command(ProgressCommandType cmdtype,
 										  Oid relid);
-extern void pgstat_progress_update_param(int index, int64 val);
-extern void pgstat_progress_update_multi_param(int nparam, const int *index,
+extern void pgstat_progress_update_param(ProgressCommandType cmdtype,
+										 int index, int64 val);
+extern void pgstat_progress_update_multi_param(ProgressCommandType cmdtype,
+											   int nparam, const int *index,
 											   const int64 *val);
-extern void pgstat_progress_end_command(void);
+extern void pgstat_progress_end_command(ProgressCommandType cmdtype);
 
 extern PgStat_TableStatus *find_tabstat_entry(Oid rel_id);
 extern PgStat_BackendFunctionEntry *find_funcstat_entry(Oid func_id);
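
To make the effect of the API change in the patch above easier to follow, here is a minimal standalone model of the three guarded functions. It is only a sketch: ModelCommandType, ModelBackendEntry and the model_* functions are hypothetical stand-ins invented for illustration, not PostgreSQL's real types, and the model ignores shared memory and the PGSTAT_BEGIN/END_WRITE_ACTIVITY protocol.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not PostgreSQL's real definitions. */
typedef enum
{
    MODEL_COMMAND_INVALID,
    MODEL_COMMAND_CLUSTER,
    MODEL_COMMAND_CREATE_INDEX
} ModelCommandType;

#define MODEL_NUM_PARAMS 4

typedef struct
{
    ModelCommandType command;               /* command currently reported */
    int64_t          param[MODEL_NUM_PARAMS];   /* its progress counters */
} ModelBackendEntry;

/* Start reporting, unless another command is already being reported. */
static void
model_progress_start(ModelBackendEntry *entry, ModelCommandType cmdtype)
{
    int i;

    if (entry->command != MODEL_COMMAND_INVALID)
        return;
    entry->command = cmdtype;
    for (i = 0; i < MODEL_NUM_PARAMS; i++)
        entry->param[i] = 0;
}

/* Update one counter, but only for the command that owns the report. */
static void
model_progress_update(ModelBackendEntry *entry, ModelCommandType cmdtype,
                      int index, int64_t val)
{
    if (entry->command != cmdtype)
        return;
    entry->param[index] = val;
}

/* End reporting, but only for the command that owns the report. */
static void
model_progress_end(ModelBackendEntry *entry, ModelCommandType cmdtype)
{
    if (entry->command != cmdtype)
        return;
    entry->command = MODEL_COMMAND_INVALID;
}

/* CLUSTER runs a nested index rebuild; returns the command still reported. */
static ModelCommandType
model_nested_demo(void)
{
    ModelBackendEntry entry = {MODEL_COMMAND_INVALID, {0}};

    model_progress_start(&entry, MODEL_COMMAND_CLUSTER);
    model_progress_update(&entry, MODEL_COMMAND_CLUSTER, 1, 5);

    /* Nested calls tagged CREATE INDEX are all ignored by the guards. */
    model_progress_start(&entry, MODEL_COMMAND_CREATE_INDEX);
    model_progress_update(&entry, MODEL_COMMAND_CREATE_INDEX, 1, 99);
    model_progress_end(&entry, MODEL_COMMAND_CREATE_INDEX);

    /* The CLUSTER counter was not clobbered. */
    assert(entry.param[1] == 5);
    return entry.command;
}
```

In this model the outer CLUSTER report survives the nested index rebuild, which appears to be the intent of passing cmdtype everywhere. The price is that the nested command's own progress is silently dropped.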

#103

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Tattsu Yama (#102)

Re: [HACKERS] CLUSTER command progress monitor

On 2019-Sep-16, Tattsu Yama wrote:

I should have explained the API changes more. I added cmdtype as a given
parameter for all functions (See below).
Therefore, I suppose that my patch is similar to the following fix as you
mentioned on -hackers.

Is this fix strictly necessary for pg12, or is this something that we
can leave for pg13?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#104

Tatsuro Yamada

tatsuro.yamada.tf@nttcom.co.jp

over 6 years ago

In reply to: Alvaro Herrera (#103)

Re: [HACKERS] CLUSTER command progress monitor

Hi Alvaro,

On 2019/09/16 23:12, Alvaro Herrera wrote:

On 2019-Sep-16, Tattsu Yama wrote:

I should have explained the API changes more. I added cmdtype as a given
parameter for all functions (See below).
Therefore, I suppose that my patch is similar to the following fix as you
mentioned on -hackers.

Is this fix strictly necessary for pg12, or is this something that we
can leave for pg13?

Not only I but many DBAs need this progress reporting feature in PG12,
so I'm trying to fix the problem. If you send another patch that
fixes the problem and is more elegant than mine, I can withdraw my patch.
Anyway, I want to avoid this feature being reverted.
Do you have any ideas for fixing the problem?

Thanks,
Tatsuro Yamada

#105

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Tatsuro Yamada (#104)

Re: [HACKERS] CLUSTER command progress monitor

On 2019-Sep-17, Tatsuro Yamada wrote:

On 2019/09/16 23:12, Alvaro Herrera wrote:

Is this fix strictly necessary for pg12, or is this something that we
can leave for pg13?

Not only I but many DBAs need this progress reporting feature in PG12,
so I'm trying to fix the problem. If you send another patch that
fixes the problem and is more elegant than mine, I can withdraw my patch.
Anyway, I want to avoid this feature being reverted.
Do you have any ideas for fixing the problem?

I committed a fix for the originally reported problem as da47e43dc32e in
branch REL_12_STABLE. Is that insufficient, and if so why?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#106

Tatsuro Yamada

tatsuro.yamada.tf@nttcom.co.jp

over 6 years ago

In reply to: Alvaro Herrera (#105)

Re: [HACKERS] CLUSTER command progress monitor

Hi Alvaro!

Is this fix strictly necessary for pg12, or is this something that we
can leave for pg13?

Not only I but many DBAs need this progress reporting feature in PG12,
so I'm trying to fix the problem. If you send another patch that
fixes the problem and is more elegant than mine, I can withdraw my patch.
Anyway, I want to avoid this feature being reverted.
Do you have any ideas for fixing the problem?

I committed a fix for the originally reported problem as da47e43dc32e in
branch REL_12_STABLE. Is that insufficient, and if so why?

Oops, I misunderstood. I now realize that you committed your patch to
fix the problem. Thanks! I'll test it later. :)

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=da47e43dc32e3c5916396f0cbcfa974b371e4875

Thanks,
Tatsuro Yamada

#107

Tatsuro Yamada

tatsuro.yamada.tf@nttcom.co.jp

over 6 years ago

In reply to: Tatsuro Yamada (#106)

Re: [HACKERS] CLUSTER command progress monitor

Hi Alvaro!

Is this fix strictly necessary for pg12, or is this something that we
can leave for pg13?

Not only I but many DBAs need this progress reporting feature in PG12,
so I'm trying to fix the problem. If you send another patch that
fixes the problem and is more elegant than mine, I can withdraw my patch.
Anyway, I want to avoid this feature being reverted.
Do you have any ideas for fixing the problem?

I committed a fix for the originally reported problem as da47e43dc32e in
branch REL_12_STABLE. Is that insufficient, and if so why?

Oops, I misunderstood. I now realize that you committed your patch to
fix the problem. Thanks! I'll test it later. :)

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=da47e43dc32e3c5916396f0cbcfa974b371e4875

I tested your patch (da47e43d) and it works fine. Thanks! :)
So my patch improving the progress reporting API can be left for PG13.

#Test scenario
===================
[Session #1]
select * from pg_stat_progress_cluster ; \watch 0.0001

[Session #2]
create table hoge as select a from generate_series(1, 100000) a;
create index ind_hoge1 on hoge(a);
create index ind_hoge2 on hoge((a%2));
create index ind_hoge3 on hoge((a%3));
create index ind_hoge4 on hoge((a%4));
create index ind_hoge5 on hoge((a%5));
cluster hoge using ind_hoge1;
===================

#Test result
===================
22283|13593|postgres|16384|CLUSTER|initializing|0|0|0|0|0|0
...
22283|13593|postgres|16384|CLUSTER|rebuilding index|16387|100000|100000|0|0|0 <= Increasing from 0 to 5
22283|13593|postgres|16384|CLUSTER|rebuilding index|16387|100000|100000|0|0|1
22283|13593|postgres|16384|CLUSTER|rebuilding index|16387|100000|100000|0|0|2
22283|13593|postgres|16384|CLUSTER|rebuilding index|16387|100000|100000|0|0|3
22283|13593|postgres|16384|CLUSTER|rebuilding index|16387|100000|100000|0|0|4
22283|13593|postgres|16384|CLUSTER|performing final cleanup|16387|100000|100000|0|0|5
===================

Thanks,
Tatsuro Yamada

#108

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Tattsu Yama (#102)

Re: [HACKERS] CLUSTER command progress monitor

On Mon, Sep 16, 2019 at 03:26:10PM +0900, Tattsu Yama wrote:

I should have explained the API changes more. I added cmdtype as a given
parameter for all functions (See below).
Therefore, I suppose that my patch is similar to the following fix as you
mentioned on -hackers.

Yes, that's an option I mentioned here, but it has drawbacks:
/messages/by-id/20190914024547.GB15406@paquier.xyz

I have just looked at it again after a small rebase and there are
issues with the design of your patch:
- When aborting a transaction, we need to enforce a reset of the
command ID stored in st_progress_command to PROGRESS_COMMAND_INVALID.
Unfortunately, your patch does not consider the case where an error
happens while a command ID is set, so the command is still tracked
in the following transactions of the session. Even worse, it
prevents pgstat_progress_start_command() from being called again in
this case for another command.
- CLUSTER can rebuild indexes, and we'd likely want to be able to
track some of the information from CREATE INDEX for CLUSTER.
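
The first point above can be sketched with a small standalone model. This is an illustration under stated assumptions, not PostgreSQL code: StaleCmd and the stale_* functions are hypothetical names, the guards mimic the ones added by the patch, and stale_abort_reset() stands for the unconditional reset that abort handling would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the progress command type. */
typedef enum
{
    STALE_CMD_INVALID,
    STALE_CMD_CLUSTER,
    STALE_CMD_CREATE_INDEX
} StaleCmd;

static StaleCmd stale_current = STALE_CMD_INVALID;

/* Patch-style start: refused while any command is still recorded. */
static bool
stale_start(StaleCmd cmd)
{
    if (stale_current != STALE_CMD_INVALID)
        return false;
    stale_current = cmd;
    return true;
}

/* Patch-style end: clears the state only for the matching command. */
static void
stale_end(StaleCmd cmd)
{
    if (stale_current == cmd)
        stale_current = STALE_CMD_INVALID;
}

/* Hypothetical unconditional reset, as abort handling would need. */
static void
stale_abort_reset(void)
{
    stale_current = STALE_CMD_INVALID;
}

/*
 * Simulate a CLUSTER that errors out, followed by CREATE INDEX in the next
 * transaction.  Returns whether CREATE INDEX was able to start reporting.
 */
static bool
stale_demo(bool reset_on_abort)
{
    bool started;

    stale_current = STALE_CMD_INVALID;
    started = stale_start(STALE_CMD_CLUSTER);
    assert(started);

    /* An error is raised here, so stale_end(STALE_CMD_CLUSTER) never runs. */
    if (reset_on_abort)
        stale_abort_reset();

    started = stale_start(STALE_CMD_CREATE_INDEX);
    if (started)
        stale_end(STALE_CMD_CREATE_INDEX);
    return started;
}
```

Without the reset, stale_demo(false) leaves CLUSTER recorded and the later CREATE INDEX can never start its report. With the reset, stale_demo(true) behaves as expected.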

The second issue is perhaps fine as it is not really straightforward
to share the same progress phases across multiple commands, and we
could live without it for now, or require a follow-up patch to make
the information of CREATE INDEX available to CLUSTER.

Now, the first issue is of another caliber and a no-go :(

On HEAD, pgstat_progress_end_command() cannot stack multiple
commands, so calling it in cascade may erase the progress state of a
command (it is not designed for that anyway). This is what happens
with CLUSTER when reindex_index() starts a new progress report.
However, the simplicity of the current infrastructure makes failure
handling very safe: a reset happens as long as the command ID is not
invalid. Your patch makes that part unpredictable.
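
The trade-off described here can be modelled in a few lines. This is a sketch only: HeadCmd and the head_* functions are hypothetical names. They mirror the unguarded start and the "reset unless already invalid" end discussed in this thread, without any of the shared-memory details.

```c
#include <assert.h>

/* Hypothetical stand-in for the progress command type. */
typedef enum
{
    HEAD_CMD_INVALID,
    HEAD_CMD_CLUSTER,
    HEAD_CMD_CREATE_INDEX
} HeadCmd;

static HeadCmd head_current = HEAD_CMD_INVALID;

/* Unguarded start: whatever was recorded before is overwritten. */
static void
head_start(HeadCmd cmd)
{
    head_current = cmd;
}

/* Unconditional end: always resets, so it is safe to call on abort. */
static void
head_end(void)
{
    if (head_current == HEAD_CMD_INVALID)
        return;
    head_current = HEAD_CMD_INVALID;
}

/*
 * CLUSTER starts a report, then a nested index rebuild starts and ends its
 * own.  Returns the command recorded afterwards.
 */
static HeadCmd
head_nested_demo(void)
{
    head_current = HEAD_CMD_INVALID;

    head_start(HEAD_CMD_CLUSTER);
    assert(head_current == HEAD_CMD_CLUSTER);

    head_start(HEAD_CMD_CREATE_INDEX);  /* clobbers the CLUSTER state */
    head_end();                         /* ends the nested report */

    return head_current;
}
```

After the nested report ends, nothing is recorded any more, so the outer CLUSTER has lost its progress state. On the other hand, a single head_end() call is always enough to clean up after an error, whatever command was running.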
--
Michael

#109

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Michael Paquier (#98)

Re: [HACKERS] CLUSTER command progress monitor

On Sat, Sep 14, 2019 at 11:45:47AM +0900, Michael Paquier wrote:

I have provided a short summary of the two issues on the open item
page (https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items) as
the open item was too vague. Here is a copy-paste for the
archives of what I wrote:
1) A progress report may be started while another one is already in
progress. Hence, if the new report gets stopped, the previously-started
state is removed, causing all follow-up updates to not happen.
2) Progress updates happening in a code path shared between those
three commands may clobber a previously set state.
Regarding 1) and based on what I found in the code, you can blame
REINDEX reporting which has added progress_start calls in code paths
which are also taken by CREATE INDEX and CLUSTER, causing their
progress reporting to go to the void. In order to fix this one we
could do what I summarized in [1].

[1]: /messages/by-id/20190905010316.GB14853@paquier.xyz

So, with the clock ticking and the release getting close by, what do
we do for this set of issues? REINDEX, CREATE INDEX and CLUSTER all
try to build indexes and the current infrastructure is not really
adapted to hold all that. Robert, Alvaro and Peter E, do you have any
comments to offer?
--
Michael

#110

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Michael Paquier (#109)

Re: [HACKERS] CLUSTER command progress monitor

On 2019-Sep-18, Michael Paquier wrote:

So, with the clock ticking and the release getting close by, what do
we do for this set of issues? REINDEX, CREATE INDEX and CLUSTER all
try to build indexes and the current infrastructure is not really
adapted to hold all that. Robert, Alvaro and Peter E, do you have any
comments to offer?

Which part of it is not already fixed?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#111

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Alvaro Herrera (#110)

Re: [HACKERS] CLUSTER command progress monitor

On Tue, Sep 17, 2019 at 10:50:22PM -0300, Alvaro Herrera wrote:

On 2019-Sep-18, Michael Paquier wrote:

So, with the clock ticking and the release getting close by, what do
we do for this set of issues? REINDEX, CREATE INDEX and CLUSTER all
try to build indexes and the current infrastructure is not really
adapted to hold all that. Robert, Alvaro and Peter E, do you have any
comments to offer?

Which part of it is not already fixed?

I can still see at least two problems. There is one issue with
pgstat_progress_update_param() which gets called in reindex_table()
for a progress phase of CLUSTER, and this even if
REINDEXOPT_REPORT_PROGRESS is not set in the options. Also it seems
to me that the calls to pgstat_progress_start_command() and
pgstat_progress_end_command() are at incorrect locations for
reindex_index() and that those should be one level higher on the stack
to avoid any kind of interactions with another command whose progress
has already started.
--
Michael

#112

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Michael Paquier (#111)

Re: [HACKERS] CLUSTER command progress monitor

On 2019-Sep-18, Michael Paquier wrote:

On Tue, Sep 17, 2019 at 10:50:22PM -0300, Alvaro Herrera wrote:

On 2019-Sep-18, Michael Paquier wrote:

So, with the clock ticking and the release getting close by, what do
we do for this set of issues? REINDEX, CREATE INDEX and CLUSTER all
try to build indexes and the current infrastructure is not really
adapted to hold all that. Robert, Alvaro and Peter E, do you have any
comments to offer?

Which part of it is not already fixed?

I can still see at least two problems. There is one issue with
pgstat_progress_update_param() which gets called in reindex_table()
for a progress phase of CLUSTER, and this even if
REINDEXOPT_REPORT_PROGRESS is not set in the options.

(You mean reindex_relation.)

... but that param update is there for CLUSTER, not for REINDEX, so if
we made it dependent on the option to turn on CREATE INDEX progress
updates, it would break CLUSTER progress reporting. Also, the parameter
being updated is not used by CREATE INDEX, so there's no spurious
change. I think this ain't broke, and thus it don't need fixin'.

If anything, I would like the CLUSTER report to show index creation
progress, which would go the opposite way. But that seems a patch for
pg13.

Also it seems
to me that the calls to pgstat_progress_start_command() and
pgstat_progress_end_command() are at incorrect locations for
reindex_index() and that those should be one level higher on the stack
to avoid any kind of interactions with another command whose progress
has already started.

That doesn't work, because the caller doesn't have the OID of the table,
which pgstat_progress_start_command() needs.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services