TODO item: adding VERBOSE option to CLUSTER

Started by Jim Coxover 17 years ago19 messages
#1Jim Cox
shakahshakah@gmail.com

Is anyone working the "CLUSTER: Add VERBOSE option..." TODO item listed
on the PostgreSQL Wiki? If not, would it be wise for me to use
VERBOSE handling in an existing command (e.g. VACUUM)
as a guide while adding VERBOSE to CLUSTER?

#2Jim Cox
shakahshakah@gmail.com
In reply to: Jim Cox (#1)
1 attachment(s)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

On Thu, Oct 9, 2008 at 9:37 AM, Jim Cox <shakahshakah@gmail.com> wrote:

Is anyone working the "CLUSTER: Add VERBOSE option..." TODO item listed
on the PostgreSQL Wiki? If not, would it be wise for me to use
VERBOSE handling in an existing command (e.g. VACUUM)
as a guide while adding VERBOSE to CLUSTER?

A patch s/b attached which adds a "VERBOSE" option to the CLUSTER command as
mentioned in the following TODO item for CLUSTER: "Add VERBOSE option
to report tables as they are processed, like VACUUM VERBOSE".

In short, all three variations of the CLUSTER command now take an optional
"VERBOSE" arg, if present an INFO message is generated which displays
the schema.tblname just before actual clustering is kicked off (see example
below).

postgres=# CLUSTER ;
CLUSTER

postgres=# CLUSTER VERBOSE ;
INFO: clustering "public.my_b"
INFO: clustering "public.my_c"
INFO: clustering "public.my_a"
CLUSTER

postgres=# CLUSTER public.my_c ;
CLUSTER

postgres=# CLUSTER public.my_c VERBOSE ;
INFO: clustering "public.my_c"
CLUSTER

Attachments:

cluster-verbose.patchtext/x-diff; name=cluster-verbose.patchDownload
Index: src/backend/commands/cluster.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.177
diff -c -r1.177 cluster.c
*** src/backend/commands/cluster.c	19 Jun 2008 00:46:04 -0000	1.177
--- src/backend/commands/cluster.c	10 Oct 2008 12:47:04 -0000
***************
*** 60,66 ****
  } RelToCluster;
  
  
! static void cluster_rel(RelToCluster *rv, bool recheck);
  static void rebuild_relation(Relation OldHeap, Oid indexOid);
  static TransactionId copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex);
  static List *get_tables_to_cluster(MemoryContext cluster_context);
--- 60,66 ----
  } RelToCluster;
  
  
! static void cluster_rel(RelToCluster *rv, bool recheck, int elevel);
  static void rebuild_relation(Relation OldHeap, Oid indexOid);
  static TransactionId copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex);
  static List *get_tables_to_cluster(MemoryContext cluster_context);
***************
*** 176,182 ****
  		heap_close(rel, NoLock);
  
  		/* Do the job */
! 		cluster_rel(&rvtc, false);
  	}
  	else
  	{
--- 176,182 ----
  		heap_close(rel, NoLock);
  
  		/* Do the job */
! 		cluster_rel(&rvtc, false, stmt->verbose ? INFO : DEBUG2);
  	}
  	else
  	{
***************
*** 225,231 ****
  			StartTransactionCommand();
  			/* functions in indexes may want a snapshot set */
  			PushActiveSnapshot(GetTransactionSnapshot());
! 			cluster_rel(rvtc, true);
  			PopActiveSnapshot();
  			CommitTransactionCommand();
  		}
--- 225,231 ----
  			StartTransactionCommand();
  			/* functions in indexes may want a snapshot set */
  			PushActiveSnapshot(GetTransactionSnapshot());
! 			cluster_rel(rvtc, true, stmt->verbose ? INFO : DEBUG2);
  			PopActiveSnapshot();
  			CommitTransactionCommand();
  		}
***************
*** 253,259 ****
   * them incrementally while we load the table.
   */
  static void
! cluster_rel(RelToCluster *rvtc, bool recheck)
  {
  	Relation	OldHeap;
  
--- 253,259 ----
   * them incrementally while we load the table.
   */
  static void
! cluster_rel(RelToCluster *rvtc, bool recheck, int elevel)
  {
  	Relation	OldHeap;
  
***************
*** 343,348 ****
--- 343,352 ----
  	check_index_is_clusterable(OldHeap, rvtc->indexOid, recheck);
  
  	/* rebuild_relation does all the dirty work */
+ 	ereport(elevel,
+ 			(errmsg("clustering \"%s.%s\"",
+ 					get_namespace_name(RelationGetNamespace(OldHeap)),
+ 					RelationGetRelationName(OldHeap))));
  	rebuild_relation(OldHeap, rvtc->indexOid);
  
  	/* NB: rebuild_relation does heap_close() on OldHeap */
Index: src/backend/nodes/copyfuncs.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/nodes/copyfuncs.c,v
retrieving revision 1.408
diff -c -r1.408 copyfuncs.c
*** src/backend/nodes/copyfuncs.c	7 Oct 2008 19:27:04 -0000	1.408
--- src/backend/nodes/copyfuncs.c	10 Oct 2008 12:47:04 -0000
***************
*** 2259,2264 ****
--- 2259,2265 ----
  
  	COPY_NODE_FIELD(relation);
  	COPY_STRING_FIELD(indexname);
+         COPY_SCALAR_FIELD(verbose) ;
  
  	return newnode;
  }
Index: src/backend/nodes/equalfuncs.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/nodes/equalfuncs.c,v
retrieving revision 1.333
diff -c -r1.333 equalfuncs.c
*** src/backend/nodes/equalfuncs.c	6 Oct 2008 17:39:26 -0000	1.333
--- src/backend/nodes/equalfuncs.c	10 Oct 2008 12:47:04 -0000
***************
*** 1000,1005 ****
--- 1000,1006 ----
  {
  	COMPARE_NODE_FIELD(relation);
  	COMPARE_STRING_FIELD(indexname);
+ 	COMPARE_SCALAR_FIELD(verbose);
  
  	return true;
  }
Index: src/backend/parser/gram.y
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.625
diff -c -r2.625 gram.y
*** src/backend/parser/gram.y	4 Oct 2008 21:56:54 -0000	2.625
--- src/backend/parser/gram.y	10 Oct 2008 12:47:05 -0000
***************
*** 5738,5770 ****
  /*****************************************************************************
   *
   *		QUERY:
!  *				CLUSTER <qualified_name> [ USING <index_name> ]
!  *				CLUSTER
!  *				CLUSTER <index_name> ON <qualified_name> (for pre-8.3)
   *
   *****************************************************************************/
  
  ClusterStmt:
! 			CLUSTER qualified_name cluster_index_specification
  				{
  			       ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = $2;
  				   n->indexname = $3;
  				   $$ = (Node*)n;
  				}
! 			| CLUSTER
  			    {
  				   ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = NULL;
  				   n->indexname = NULL;
  				   $$ = (Node*)n;
  				}
  			/* kept for pre-8.3 compatibility */
! 			| CLUSTER index_name ON qualified_name
  				{
  				   ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = $4;
  				   n->indexname = $2;
  				   $$ = (Node*)n;
  				}
  		;
--- 5738,5773 ----
  /*****************************************************************************
   *
   *		QUERY:
!  *				CLUSTER <qualified_name> [ USING <index_name> ] [VERBOSE]
!  *				CLUSTER [VERBOSE]
!  *				CLUSTER <index_name> ON <qualified_name> [VERBOSE] (for pre-8.3)
   *
   *****************************************************************************/
  
  ClusterStmt:
! 			CLUSTER qualified_name cluster_index_specification opt_verbose
  				{
  			       ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = $2;
  				   n->indexname = $3;
+ 				   n->verbose = $4;
  				   $$ = (Node*)n;
  				}
! 			| CLUSTER opt_verbose
  			    {
  				   ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = NULL;
  				   n->indexname = NULL;
+ 				   n->verbose = $2;
  				   $$ = (Node*)n;
  				}
  			/* kept for pre-8.3 compatibility */
! 			| CLUSTER index_name ON qualified_name opt_verbose
  				{
  				   ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = $4;
  				   n->indexname = $2;
+ 				   n->verbose = $5;
  				   $$ = (Node*)n;
  				}
  		;
Index: src/include/nodes/parsenodes.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/nodes/parsenodes.h,v
retrieving revision 1.376
diff -c -r1.376 parsenodes.h
*** src/include/nodes/parsenodes.h	4 Oct 2008 21:56:55 -0000	1.376
--- src/include/nodes/parsenodes.h	10 Oct 2008 12:47:06 -0000
***************
*** 1940,1945 ****
--- 1940,1946 ----
  	NodeTag		type;
  	RangeVar   *relation;		/* relation being indexed, or NULL if all */
  	char	   *indexname;		/* original index defined */
+ 	bool		verbose;		/* print progress info */
  } ClusterStmt;
  
  /* ----------------------
#3Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Jim Cox (#2)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

"Jim Cox" <shakahshakah@gmail.com> wrote:

if present an INFO message is generated which displays
the schema.tblname just before actual clustering is kicked off (see

example

below).

postgres=# CLUSTER VERBOSE ;
INFO: clustering "public.my_b"
INFO: clustering "public.my_c"
INFO: clustering "public.my_a"
CLUSTER

Would it make sense to display the pg_total_relation_size before and
after?

-Kevin

#4Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Kevin Grittner (#3)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Kevin Grittner wrote:

"Jim Cox" <shakahshakah@gmail.com> wrote:

if present an INFO message is generated which displays
the schema.tblname just before actual clustering is kicked off (see

example

below).

postgres=# CLUSTER VERBOSE ;
INFO: clustering "public.my_b"
INFO: clustering "public.my_c"
INFO: clustering "public.my_a"
CLUSTER

Would it make sense to display the pg_total_relation_size before and
after?

Assuming you run CLUSTER as a replacement for VACUUM FULL, yes. More
interesting would be a metric of "clusteredness", I think.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#5Jim Cox
shakahshakah@gmail.com
In reply to: Heikki Linnakangas (#4)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

On Fri, Oct 10, 2008 at 10:23 AM, Heikki Linnakangas <
heikki.linnakangas@enterprisedb.com> wrote:

Kevin Grittner wrote:

"Jim Cox" <shakahshakah@gmail.com> wrote:

if present an INFO message is generated which displays

the schema.tblname just before actual clustering is kicked off (see

example

below).

postgres=# CLUSTER VERBOSE ;
INFO: clustering "public.my_b"
INFO: clustering "public.my_c"
INFO: clustering "public.my_a"
CLUSTER

Would it make sense to display the pg_total_relation_size before and
after?

Assuming you run CLUSTER as a replacement for VACUUM FULL, yes. More
interesting would be a metric of "clusteredness", I think.

Something more like the following?

postgres=# CLUSTER VERBOSE ;
INFO: clustering "public.my_b"
INFO: complete, 0 rows scanned, 0 rows now live
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: clustering "public.my_c"
INFO: complete, 20 rows scanned, 10 rows now live
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: clustering "public.my_a"
INFO: complete, 10 rows scanned, 10 rows now live
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
CLUSTER

#6Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jim Cox (#5)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Jim Cox wrote:

On Fri, Oct 10, 2008 at 10:23 AM, Heikki Linnakangas <
heikki.linnakangas@enterprisedb.com> wrote:

Kevin Grittner wrote:

"Jim Cox" <shakahshakah@gmail.com> wrote:

if present an INFO message is generated which displays

the schema.tblname just before actual clustering is kicked off (see

example

below).

postgres=# CLUSTER VERBOSE ;
INFO: clustering "public.my_b"
INFO: clustering "public.my_c"
INFO: clustering "public.my_a"
CLUSTER

Would it make sense to display the pg_total_relation_size before and
after?

Assuming you run CLUSTER as a replacement for VACUUM FULL, yes. More
interesting would be a metric of "clusteredness", I think.

Something more like the following?

postgres=# CLUSTER VERBOSE ;
INFO: clustering "public.my_b"
INFO: complete, 0 rows scanned, 0 rows now live
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: clustering "public.my_c"
INFO: complete, 20 rows scanned, 10 rows now live
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: clustering "public.my_a"
INFO: complete, 10 rows scanned, 10 rows now live
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
CLUSTER

No, I was thinking of something along the lines of:
INFO: clustering "public.my_c"
INFO: complete, was 33%, now 100% clustered

The only such measure that we have is the correlation, which isn't very
good anyway, so I'm not sure if that's worthwhile.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#6)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

No, I was thinking of something along the lines of:
INFO: clustering "public.my_c"
INFO: complete, was 33%, now 100% clustered
The only such measure that we have is the correlation, which isn't very
good anyway, so I'm not sure if that's worthwhile.

It'd be possible to count the number of order reversals during the
indexscan, ie the number of tuples with CTID lower than the previous
one's. But I'm not sure how useful that number really is. Also it's
not clear how to preserve such functionality if cluster is
re-implemented with a sort.

regards, tom lane

#8Jim Cox
shakahshakah@gmail.com
In reply to: Tom Lane (#7)
1 attachment(s)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

On Mon, Oct 13, 2008 at 8:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

No, I was thinking of something along the lines of:
INFO: clustering "public.my_c"
INFO: complete, was 33%, now 100% clustered
The only such measure that we have is the correlation, which isn't very
good anyway, so I'm not sure if that's worthwhile.

It'd be possible to count the number of order reversals during the
indexscan, ie the number of tuples with CTID lower than the previous
one's. But I'm not sure how useful that number really is. Also it's
not clear how to preserve such functionality if cluster is
re-implemented with a sort.

regards, tom lane

Another version of the patch should be attached, this time counting the
number of "inversions" (pairs of tuples in the table that are in the wrong
order) as a measure of the "sortedness" of the original data (scanned/live
numbers still reported as an indication of the extent to which the table was
vacuumed).

N.B. -- I'm not familiar enough with Postgres internals to know if the
included inversion_count() method is a valid way to identify inversions.

In any case, example VERBOSE output:

postgres=# CLUSTER public.my_c VERBOSE ;
INFO: clustering "public.my_c"
INFO: complete, 15 tuples scanned, 10 tuples now live, 2 inversions
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
CLUSTER

Attachments:

cluster-verbose-2.patchtext/x-diff; name=cluster-verbose-2.patchDownload
Index: src/backend/commands/cluster.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.177
diff -c -r1.177 cluster.c
*** src/backend/commands/cluster.c	19 Jun 2008 00:46:04 -0000	1.177
--- src/backend/commands/cluster.c	13 Oct 2008 14:25:09 -0000
***************
*** 46,51 ****
--- 46,52 ----
  #include "utils/snapmgr.h"
  #include "utils/syscache.h"
  #include "utils/tqual.h"
+ #include "utils/pg_rusage.h"
  
  
  /*
***************
*** 59,70 ****
  	Oid			indexOid;
  } RelToCluster;
  
! 
! static void cluster_rel(RelToCluster *rv, bool recheck);
! static void rebuild_relation(Relation OldHeap, Oid indexOid);
! static TransactionId copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex);
  static List *get_tables_to_cluster(MemoryContext cluster_context);
! 
  
  
  /*---------------------------------------------------------------------------
--- 60,80 ----
  	Oid			indexOid;
  } RelToCluster;
  
! /*
!  * This struct is used to collect stats for VERBOSE output.
!  */
! typedef struct
! {
! 	double	tuples_scanned ;
! 	double	tuples_copied ;
! 	double	inversions ;
! } ClusterStatistics;
! 
! static void cluster_rel(RelToCluster *rv, bool recheck, int elevel);
! static void rebuild_relation(Relation OldHeap, Oid indexOid, ClusterStatistics *cs);
! static TransactionId copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, ClusterStatistics *cs);
  static List *get_tables_to_cluster(MemoryContext cluster_context);
! static int inversion_count(BlockIdData *bid_old, OffsetNumber off_old, BlockIdData *bid, OffsetNumber off) ;
  
  
  /*---------------------------------------------------------------------------
***************
*** 176,182 ****
  		heap_close(rel, NoLock);
  
  		/* Do the job */
! 		cluster_rel(&rvtc, false);
  	}
  	else
  	{
--- 186,192 ----
  		heap_close(rel, NoLock);
  
  		/* Do the job */
! 		cluster_rel(&rvtc, false, stmt->verbose ? INFO : DEBUG2);
  	}
  	else
  	{
***************
*** 225,231 ****
  			StartTransactionCommand();
  			/* functions in indexes may want a snapshot set */
  			PushActiveSnapshot(GetTransactionSnapshot());
! 			cluster_rel(rvtc, true);
  			PopActiveSnapshot();
  			CommitTransactionCommand();
  		}
--- 235,241 ----
  			StartTransactionCommand();
  			/* functions in indexes may want a snapshot set */
  			PushActiveSnapshot(GetTransactionSnapshot());
! 			cluster_rel(rvtc, true, stmt->verbose ? INFO : DEBUG2);
  			PopActiveSnapshot();
  			CommitTransactionCommand();
  		}
***************
*** 253,261 ****
   * them incrementally while we load the table.
   */
  static void
! cluster_rel(RelToCluster *rvtc, bool recheck)
  {
  	Relation	OldHeap;
  
  	/* Check for user-requested abort. */
  	CHECK_FOR_INTERRUPTS();
--- 263,273 ----
   * them incrementally while we load the table.
   */
  static void
! cluster_rel(RelToCluster *rvtc, bool recheck, int elevel)
  {
  	Relation	OldHeap;
+ 	PGRUsage	ru0 ;
+ 	ClusterStatistics	cs ;
  
  	/* Check for user-requested abort. */
  	CHECK_FOR_INTERRUPTS();
***************
*** 343,349 ****
  	check_index_is_clusterable(OldHeap, rvtc->indexOid, recheck);
  
  	/* rebuild_relation does all the dirty work */
! 	rebuild_relation(OldHeap, rvtc->indexOid);
  
  	/* NB: rebuild_relation does heap_close() on OldHeap */
  }
--- 355,373 ----
  	check_index_is_clusterable(OldHeap, rvtc->indexOid, recheck);
  
  	/* rebuild_relation does all the dirty work */
! 	ereport(elevel,
! 			(errmsg("clustering \"%s.%s\"",
! 					get_namespace_name(RelationGetNamespace(OldHeap)),
! 					RelationGetRelationName(OldHeap))));
! 	pg_rusage_init(&ru0);
! 	cs.tuples_scanned = cs.tuples_copied = cs.inversions = 0 ;
! 	rebuild_relation(OldHeap, rvtc->indexOid, &cs);
! 	ereport(elevel,
! 			(errmsg("complete, %.0f tuples scanned, %.0f tuples now live, %.0f inversions",
! 				cs.tuples_scanned,
! 				cs.tuples_copied,
! 				cs.inversions),
!                          errdetail("%s.", pg_rusage_show(&ru0))));
  
  	/* NB: rebuild_relation does heap_close() on OldHeap */
  }
***************
*** 556,566 ****
   *
   * OldHeap: table to rebuild --- must be opened and exclusive-locked!
   * indexOid: index to cluster by
   *
   * NB: this routine closes OldHeap at the right time; caller should not.
   */
  static void
! rebuild_relation(Relation OldHeap, Oid indexOid)
  {
  	Oid			tableOid = RelationGetRelid(OldHeap);
  	Oid			tableSpace = OldHeap->rd_rel->reltablespace;
--- 580,591 ----
   *
   * OldHeap: table to rebuild --- must be opened and exclusive-locked!
   * indexOid: index to cluster by
+  * cs: ClusterStatistics for VERBOSE option
   *
   * NB: this routine closes OldHeap at the right time; caller should not.
   */
  static void
! rebuild_relation(Relation OldHeap, Oid indexOid, ClusterStatistics *cs)
  {
  	Oid			tableOid = RelationGetRelid(OldHeap);
  	Oid			tableSpace = OldHeap->rd_rel->reltablespace;
***************
*** 593,599 ****
  	/*
  	 * Copy the heap data into the new table in the desired order.
  	 */
! 	frozenXid = copy_heap_data(OIDNewHeap, tableOid, indexOid);
  
  	/* To make the new heap's data visible (probably not needed?). */
  	CommandCounterIncrement();
--- 618,624 ----
  	/*
  	 * Copy the heap data into the new table in the desired order.
  	 */
! 	frozenXid = copy_heap_data(OIDNewHeap, tableOid, indexOid, cs);
  
  	/* To make the new heap's data visible (probably not needed?). */
  	CommandCounterIncrement();
***************
*** 703,709 ****
   * freeze cutoff point for the tuples.
   */
  static TransactionId
! copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex)
  {
  	Relation	NewHeap,
  				OldHeap,
--- 728,734 ----
   * freeze cutoff point for the tuples.
   */
  static TransactionId
! copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, ClusterStatistics *cs)
  {
  	Relation	NewHeap,
  				OldHeap,
***************
*** 719,724 ****
--- 744,751 ----
  	TransactionId OldestXmin;
  	TransactionId FreezeXid;
  	RewriteState rwstate;
+ 	BlockIdData bid_old = { 0, 0 } ;
+ 	OffsetNumber off_old = 0 ;
  
  	/*
  	 * Open the relations we need.
***************
*** 784,789 ****
--- 811,820 ----
  
  		CHECK_FOR_INTERRUPTS();
  
+ 		if(cs) {
+ 			cs->tuples_scanned++ ;
+ 		}
+ 
  		/* Since we used no scan keys, should never need to recheck */
  		if (scan->xs_recheck)
  			elog(ERROR, "CLUSTER does not support lossy index conditions");
***************
*** 856,861 ****
--- 887,903 ----
  		 *
  		 * So, we must reconstruct the tuple from component Datums.
  		 */
+ 		/* ...keep track of inversions (pairs of tuples in the table that are in the wrong order) */
+ 		if(cs->tuples_copied) {
+ 			cs->inversions += inversion_count(
+ 						&bid_old,
+ 						off_old,
+ 						&(tuple->t_data->t_ctid.ip_blkid),
+ 						tuple->t_data->t_ctid.ip_posid) ;
+ 		}
+ 		bid_old = tuple->t_data->t_ctid.ip_blkid ;
+ 		off_old = tuple->t_data->t_ctid.ip_posid ;
+ 
  		heap_deform_tuple(tuple, oldTupDesc, values, isnull);
  
  		/* Be sure to null out any dropped columns */
***************
*** 875,880 ****
--- 917,924 ----
  		rewrite_heap_tuple(rwstate, tuple, copiedTuple);
  
  		heap_freetuple(copiedTuple);
+ 
+ 		cs->tuples_copied++ ;
  	}
  
  	index_endscan(scan);
***************
*** 1119,1121 ****
--- 1163,1171 ----
  
  	return rvs;
  }
+ 
+ static int inversion_count(BlockIdData *bid_old, OffsetNumber off_old, BlockIdData *bid, OffsetNumber off) {
+   return    bid_old->bi_hi > bid->bi_hi
+          || (bid_old->bi_hi==bid->bi_hi && bid_old->bi_lo > bid->bi_lo)
+          || (bid_old->bi_hi==bid->bi_hi && bid_old->bi_lo==bid->bi_lo && off_old > off) ;
+ }
Index: src/backend/nodes/copyfuncs.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/nodes/copyfuncs.c,v
retrieving revision 1.408
diff -c -r1.408 copyfuncs.c
*** src/backend/nodes/copyfuncs.c	7 Oct 2008 19:27:04 -0000	1.408
--- src/backend/nodes/copyfuncs.c	13 Oct 2008 14:25:10 -0000
***************
*** 2259,2264 ****
--- 2259,2265 ----
  
  	COPY_NODE_FIELD(relation);
  	COPY_STRING_FIELD(indexname);
+ 	COPY_SCALAR_FIELD(verbose) ;
  
  	return newnode;
  }
Index: src/backend/nodes/equalfuncs.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/nodes/equalfuncs.c,v
retrieving revision 1.333
diff -c -r1.333 equalfuncs.c
*** src/backend/nodes/equalfuncs.c	6 Oct 2008 17:39:26 -0000	1.333
--- src/backend/nodes/equalfuncs.c	13 Oct 2008 14:25:10 -0000
***************
*** 1000,1005 ****
--- 1000,1006 ----
  {
  	COMPARE_NODE_FIELD(relation);
  	COMPARE_STRING_FIELD(indexname);
+ 	COMPARE_SCALAR_FIELD(verbose);
  
  	return true;
  }
Index: src/backend/parser/gram.y
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.625
diff -c -r2.625 gram.y
*** src/backend/parser/gram.y	4 Oct 2008 21:56:54 -0000	2.625
--- src/backend/parser/gram.y	13 Oct 2008 14:25:10 -0000
***************
*** 5738,5770 ****
  /*****************************************************************************
   *
   *		QUERY:
!  *				CLUSTER <qualified_name> [ USING <index_name> ]
!  *				CLUSTER
!  *				CLUSTER <index_name> ON <qualified_name> (for pre-8.3)
   *
   *****************************************************************************/
  
  ClusterStmt:
! 			CLUSTER qualified_name cluster_index_specification
  				{
  			       ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = $2;
  				   n->indexname = $3;
  				   $$ = (Node*)n;
  				}
! 			| CLUSTER
  			    {
  				   ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = NULL;
  				   n->indexname = NULL;
  				   $$ = (Node*)n;
  				}
  			/* kept for pre-8.3 compatibility */
! 			| CLUSTER index_name ON qualified_name
  				{
  				   ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = $4;
  				   n->indexname = $2;
  				   $$ = (Node*)n;
  				}
  		;
--- 5738,5773 ----
  /*****************************************************************************
   *
   *		QUERY:
!  *				CLUSTER <qualified_name> [ USING <index_name> ] [VERBOSE]
!  *				CLUSTER [VERBOSE]
!  *				CLUSTER <index_name> ON <qualified_name> [VERBOSE] (for pre-8.3)
   *
   *****************************************************************************/
  
  ClusterStmt:
! 			CLUSTER qualified_name cluster_index_specification opt_verbose
  				{
  			       ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = $2;
  				   n->indexname = $3;
+ 				   n->verbose = $4;
  				   $$ = (Node*)n;
  				}
! 			| CLUSTER opt_verbose
  			    {
  				   ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = NULL;
  				   n->indexname = NULL;
+ 				   n->verbose = $2;
  				   $$ = (Node*)n;
  				}
  			/* kept for pre-8.3 compatibility */
! 			| CLUSTER index_name ON qualified_name opt_verbose
  				{
  				   ClusterStmt *n = makeNode(ClusterStmt);
  				   n->relation = $4;
  				   n->indexname = $2;
+ 				   n->verbose = $5;
  				   $$ = (Node*)n;
  				}
  		;
Index: src/include/nodes/parsenodes.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/nodes/parsenodes.h,v
retrieving revision 1.376
diff -c -r1.376 parsenodes.h
*** src/include/nodes/parsenodes.h	4 Oct 2008 21:56:55 -0000	1.376
--- src/include/nodes/parsenodes.h	13 Oct 2008 14:25:10 -0000
***************
*** 1940,1945 ****
--- 1940,1946 ----
  	NodeTag		type;
  	RangeVar   *relation;		/* relation being indexed, or NULL if all */
  	char	   *indexname;		/* original index defined */
+ 	bool		verbose;		/* print progress info */
  } ClusterStmt;
  
  /* ----------------------
#9Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Jim Cox (#8)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Jim Cox wrote:

On Mon, Oct 13, 2008 at 8:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

No, I was thinking of something along the lines of:
INFO: clustering "public.my_c"
INFO: complete, was 33%, now 100% clustered
The only such measure that we have is the correlation, which isn't very
good anyway, so I'm not sure if that's worthwhile.

It'd be possible to count the number of order reversals during the
indexscan, ie the number of tuples with CTID lower than the previous
one's. But I'm not sure how useful that number really is.

It will look bad for patterns like:
2
1
4
3
6
5
..

which for all practical purposes is just as good as a perfectly sorted
table. So no, I don't think that's a very useful metric either without
somehow taking caching effects into account.

Another version of the patch should be attached, this time counting the
number of "inversions" (pairs of tuples in the table that are in the wrong
order) as a measure of the "sortedness" of the original data (scanned/live
numbers still reported as an indication of the extent to which the table was
vacuumed).

Until we have a better metric for "sortedness", my earlier suggestion to
print it was probably a bad idea. If anything, should probably print the
same correlation metric that ANALYZE calculates, so that it would at
least match what the planner uses for decision-making.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#10Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#7)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

On Mon, 2008-10-13 at 08:30 -0400, Tom Lane wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

No, I was thinking of something along the lines of:
INFO: clustering "public.my_c"
INFO: complete, was 33%, now 100% clustered
The only such measure that we have is the correlation, which isn't very
good anyway, so I'm not sure if that's worthwhile.

It'd be possible to count the number of order reversals during the
indexscan, ie the number of tuples with CTID lower than the previous
one's. But I'm not sure how useful that number really is. Also it's
not clear how to preserve such functionality if cluster is
re-implemented with a sort.

I assume here you mean a CTID with a lower page number, as the line
pointer wouldn't make any difference, right?

I think it would be a useful metric to decide whether or not to use an
index scan (I don't know how easy it is to estimate this from a sample,
but a CLUSTER could clearly get an exact number). It would solve the
problem where synchronized scans used by pg_dump could result in poor
correlation on restore and therefore not choose index scans (which is
what prompted turning off sync scans for pg_dump).

Regards,
Jeff Davis

#11Gregory Stark
stark@enterprisedb.com
In reply to: Heikki Linnakangas (#9)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Jim Cox wrote:

On Mon, Oct 13, 2008 at 8:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

It'd be possible to count the number of order reversals during the
indexscan, ie the number of tuples with CTID lower than the previous
one's. But I'm not sure how useful that number really is.

Incidentally it finally occurred to me that "sortedness" is actually a pretty
good term to search on. I found several papers for estimating metrics of
sortedness from samples even. Though the best looks like it requires a sample
of size O(sqrt(n)) which is more than we currently take.

The two metrics which seem popular is either the length of the longest
subsequence which is sorted or the number of sorted subsequences. I think the
latter is equivalent to counting the inversions.

I didn't find any papers which claimed to present good ways to draw
conclusions based on these metrics but I only did a quick search. I imagine if
everyone is looking for ways to estimate them they they must be useful for
something...

For some reason my access to the ACM digital library stopped working. Does
anyone else have access?

It will look bad for patterns like:
2
1
4
3
6
5
..

Hm, you could include some measure of how far the inversion goes -- but I
think that's counter-productive. Sure some of them will be cached but others
won't and that'll be equally bad regardless of how far back it goes.

Until we have a better metric for "sortedness", my earlier suggestion to print
it was probably a bad idea. If anything, should probably print the same
correlation metric that ANALYZE calculates, so that it would at least match
what the planner uses for decision-making.

I agree with that. I like the idea of printing a message though -- we should
just have it print the correlation for now and when we improve the stats we'll
print the new metric.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gregory Stark (#11)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Gregory Stark <stark@enterprisedb.com> writes:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Until we have a better metric for "sortedness", my earlier suggestion to print
it was probably a bad idea. If anything, should probably print the same
correlation metric that ANALYZE calculates, so that it would at least match
what the planner uses for decision-making.

I agree with that. I like the idea of printing a message though -- we should
just have it print the correlation for now and when we improve the stats we'll
print the new metric.

Short of actually running an ANALYZE, I'm not seeing a good way to
derive the same number it derives.

regards, tom lane

#13Gregory Stark
stark@enterprisedb.com
In reply to: Tom Lane (#12)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Tom Lane <tgl@sss.pgh.pa.us> writes:

Gregory Stark <stark@enterprisedb.com> writes:

I agree with that. I like the idea of printing a message though -- we should
just have it print the correlation for now and when we improve the stats we'll
print the new metric.

Short of actually running an ANALYZE, I'm not seeing a good way to
derive the same number it derives.

Well we could print the _old_ value at least. So if you run cluster
periodically you can see whether you're running it often enough.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gregory Stark (#13)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Gregory Stark <stark@enterprisedb.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> writes:

Short of actually running an ANALYZE, I'm not seeing a good way to
derive the same number it derives.

Well we could print the _old_ value at least.

+1 ... seems an appropriate amount of effort for the likely value.

regards, tom lane

#15Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#14)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Gregory Stark <stark@enterprisedb.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> writes:

Short of actually running an ANALYZE, I'm not seeing a good way to
derive the same number it derives.

Well we could print the _old_ value at least.

+1 ... seems an appropriate amount of effort for the likely value.

That seems fine for sortedness, but am I the only one who would like
the verbose mode to show the bloat reduction? Essentially, an INFO
line to show the same information you could get by bracketing the
CLUSTER with a couple SELECTs:

ccdev=# select pg_total_relation_size('"DbTranImageStatus"');
pg_total_relation_size
------------------------
253952
(1 row)

ccdev=# cluster "DbTranImageStatus";
CLUSTER
ccdev=# select pg_total_relation_size('"DbTranImageStatus"');
pg_total_relation_size
------------------------
32768
(1 row)

-Kevin

#16Joshua Drake
jd@commandprompt.com
In reply to: Kevin Grittner (#15)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

On Mon, 13 Oct 2008 15:34:04 -0500
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

ccdev=# select pg_total_relation_size('"DbTranImageStatus"');
pg_total_relation_size
------------------------
253952
(1 row)

ccdev=# cluster "DbTranImageStatus";
CLUSTER
ccdev=# select pg_total_relation_size('"DbTranImageStatus"');
pg_total_relation_size
------------------------
32768
(1 row)

-Kevin

Although I think it is an interesting bit of information, I find that
"if" I am going to be clustering, I have already done the above.

Sincerely,

Joshua D. Drake

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/

#17Robert Treat
xzilla@users.sourceforge.net
In reply to: Joshua Drake (#16)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

On Monday 13 October 2008 16:45:35 Joshua Drake wrote:

On Mon, 13 Oct 2008 15:34:04 -0500

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

ccdev=# select pg_total_relation_size('"DbTranImageStatus"');
pg_total_relation_size
------------------------
253952
(1 row)

ccdev=# cluster "DbTranImageStatus";
CLUSTER
ccdev=# select pg_total_relation_size('"DbTranImageStatus"');
pg_total_relation_size
------------------------
32768
(1 row)

-Kevin

Although I think it is an interesting bit of information, I find that
"if" I am going to be clustering, I have already done the above.

I agree with that, though I still wouldn't mind seeing some size specific
information, similar to what vacuum verbose emits.. eg.

INFO: "my_users_mods": removed 790 row versions in 4 pages
INFO: "my_users_mods": found 790 removable, 308 nonremovable row versions in
6 pages

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

#18Peter Eisentraut
peter_e@gmx.net
In reply to: Jim Cox (#2)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Jim Cox wrote:

A patch s/b attached which adds a "VERBOSE" option to the CLUSTER command as
mentioned in the following TODO item for CLUSTER: "Add VERBOSE option
to report tables as they are processed, like VACUUM VERBOSE".

I have committed this version of your patch, but moving the VERBOSE
option directly after the CLUSTER word, to be more consistent with
VACUUM and ANALYZE. Also I added the corresponding support to the
clusterdb command.

Additional processing information as discussed later in the thread can
now be added easily, but I think there was no consensus on what exactly
to print.

#19Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Peter Eisentraut (#18)
Re: TODO item: adding VERBOSE option to CLUSTER [with patch]

Peter Eisentraut <peter_e@gmx.net> wrote:

Additional processing information as discussed later in the thread

can

now be added easily, but I think there was no consensus on what

exactly

to print.

Since I use CLUSTER almost exclusively to eliminate bloat, I'd like to
see the before and after value for pg_total_relation_size for each
table. Some didn't seem terribly interested in that, as they use the
command primarily to sequence the heap, so some measure(s) of how
out-of-order the heap was would make sense.

We are talking about an option named VERBOSE, so there's no real
reason not to throw in all useful information.

-Kevin