Review of: explain / allow collecting row counts without timing info

Started by Josh Berkusalmost 14 years ago14 messages

Josh Berkus

josh@agliodbs.com

almost 14 years ago

Eric's review follows:

Compiling on Ubuntu 10.04 LTS AMD64 on a GoGrid virtual machine from
2012-01-12 git checkout.

Patch applied fine.

'make check' results in failures when this patch is put into place.

========================
6 of 128 tests failed.
========================

Here are the relevant failures.

parallel group (2 tests): create_view create_index
create_index ... FAILED
parallel group (9 tests): create_cast create_aggregate drop_if_exists
typed_table vacuum constraints create_table_like triggers inherit
inherit ... FAILED
parallel group (20 tests): select_distinct_on btree_index update random
select_distinct select_having namespace delete case hash_index union
select_implicit select_into transactions portals subselect arrays
aggregates join prepared_xacts
join ... FAILED
aggregates ... FAILED
parallel group (15 tests): combocid portals_p2 advisory_lock xmlmap
tsdicts guc functional_deps dependency select_views cluster tsearch
window foreign_data foreign_key bitmapops
foreign_data ... FAILED
parallel group (19 tests): limit prepare copy2 conversion xml plancache
returning temp sequence without_oid with rowtypes truncate polymorphism
domain largeobject rangefuncs alter_table plpgsql
alter_table ... FAILED

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Tomas Vondra

tv@fuzzy.cz

almost 14 years ago

In reply to: Josh Berkus (#1)

1 attachment(s)

Re: Review of: explain / allow collecting row counts without timing info

On 13.1.2012 18:07, Josh Berkus wrote:

Eric's review follows:

Compiling on Ubuntu 10.04 LTS AMD64 on a GoGrid virtual machine from
2012-01-12 git checkout.

Patch applied fine.

'make check' results in failures when this patch is put into place.

========================
6 of 128 tests failed.
========================

Here are the relevant failures.

parallel group (2 tests): create_view create_index
create_index ... FAILED
parallel group (9 tests): create_cast create_aggregate drop_if_exists
typed_table vacuum constraints create_table_like triggers inherit
inherit ... FAILED
parallel group (20 tests): select_distinct_on btree_index update random
select_distinct select_having namespace delete case hash_index union
select_implicit select_into transactions portals subselect arrays
aggregates join prepared_xacts
join ... FAILED
aggregates ... FAILED
parallel group (15 tests): combocid portals_p2 advisory_lock xmlmap
tsdicts guc functional_deps dependency select_views cluster tsearch
window foreign_data foreign_key bitmapops
foreign_data ... FAILED
parallel group (19 tests): limit prepare copy2 conversion xml plancache
returning temp sequence without_oid with rowtypes truncate polymorphism
domain largeobject rangefuncs alter_table plpgsql
alter_table ... FAILED

Fixed. The default value of TIMING option did not work as intended, it
was set to true even for plain EXPLAIN (without ANALYZE). In that case
the EXPLAIN failed.

Tomas

Attachments:

explain-analyze-rows-only-v3.difftext/plain; name=explain-analyze-rows-only-v3.diffDownload

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 61da6a2..9fc0ae1 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -23,6 +23,7 @@ static int	auto_explain_log_min_duration = -1; /* msec or -1 */
 static bool auto_explain_log_analyze = false;
 static bool auto_explain_log_verbose = false;
 static bool auto_explain_log_buffers = false;
+static bool auto_explain_log_timing  = false;
 static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
 
@@ -133,6 +134,17 @@ _PG_init(void)
 							 NULL,
 							 NULL);
 
+	DefineCustomBoolVariable("auto_explain.log_timing",
+							 "Collect timing data, not just row counts.",
+							 NULL,
+							 &auto_explain_log_timing,
+							 true,
+							 PGC_SUSET,
+							 0,
+							 NULL,
+							 NULL,
+							 NULL);
+
 	EmitWarningsOnPlaceholders("auto_explain");
 
 	/* Install hooks. */
@@ -170,7 +182,12 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 		/* Enable per-node instrumentation iff log_analyze is required. */
 		if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)
 		{
-			queryDesc->instrument_options |= INSTRUMENT_TIMER;
+			if (auto_explain_log_timing)
+				queryDesc->instrument_options |= INSTRUMENT_TIMER;
+			else
+				queryDesc->instrument_options |= INSTRUMENT_ROWS;
+
+
 			if (auto_explain_log_buffers)
 				queryDesc->instrument_options |= INSTRUMENT_BUFFERS;
 		}
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 8a9c9de..4488956 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] <replaceable class="parameter">statement</replac
     VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
     COSTS [ <replaceable class="parameter">boolean</replaceable> ]
     BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
+    TIMING [ <replaceable class="parameter">boolean</replaceable> ]
     FORMAT { TEXT | XML | JSON | YAML }
 </synopsis>
  </refsynopsisdiv>
@@ -172,6 +173,20 @@ ROLLBACK;
    </varlistentry>
 
    <varlistentry>
+    <term><literal>TIMING</literal></term>
+    <listitem>
+     <para>
+      Include information on timing for each node. Specifically, include the
+      actual startup time and time spent in the node. This may involve a lot
+      of gettimeofday calls, and on some systems this means significant
+      slowdown of the query. 
+      This parameter may only be used when <literal>ANALYZE</literal> is also
+      enabled.  It defaults to <literal>TRUE</literal>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORMAT</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 8b48105..8251908 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -116,6 +116,7 @@ ExplainQuery(ExplainStmt *stmt, const char *queryString,
 	TupOutputState *tstate;
 	List	   *rewritten;
 	ListCell   *lc;
+	bool	    timing_set = false;
 
 	/* Initialize ExplainState. */
 	ExplainInitState(&es);
@@ -133,6 +134,11 @@ ExplainQuery(ExplainStmt *stmt, const char *queryString,
 			es.costs = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "buffers") == 0)
 			es.buffers = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "timing") == 0)
+		{
+			timing_set = true;
+			es.timing = defGetBoolean(opt);
+		}
 		else if (strcmp(opt->defname, "format") == 0)
 		{
 			char	   *p = defGetString(opt);
@@ -162,6 +168,15 @@ ExplainQuery(ExplainStmt *stmt, const char *queryString,
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("EXPLAIN option BUFFERS requires ANALYZE")));
+	
+	/* if the timing was not set explicitly, set default value */
+	es.timing = (timing_set) ? es.timing : es.analyze;
+
+	/* check that timing is used with EXPLAIN ANALYZE */
+	if (es.timing && !es.analyze)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("EXPLAIN option TIMING requires ANALYZE")));
 
 	/*
 	 * Parse analysis was done already, but we still have to run the rule
@@ -355,8 +370,11 @@ ExplainOnePlan(PlannedStmt *plannedstmt, ExplainState *es,
 	int			eflags;
 	int			instrument_option = 0;
 
-	if (es->analyze)
+	if (es->analyze && es->timing)
 		instrument_option |= INSTRUMENT_TIMER;
+	else if (es->analyze)
+		instrument_option |= INSTRUMENT_ROWS;
+
 	if (es->buffers)
 		instrument_option |= INSTRUMENT_BUFFERS;
 
@@ -951,29 +969,42 @@ ExplainNode(PlanState *planstate, List *ancestors,
 
 		if (es->format == EXPLAIN_FORMAT_TEXT)
 		{
-			appendStringInfo(es->str,
-							 " (actual time=%.3f..%.3f rows=%.0f loops=%.0f)",
-							 startup_sec, total_sec, rows, nloops);
+			if (planstate->instrument->need_timer)
+				appendStringInfo(es->str,
+								 " (actual time=%.3f..%.3f rows=%.0f loops=%.0f)",
+								 startup_sec, total_sec, rows, nloops);
+			else
+				appendStringInfo(es->str,
+								 " (actual rows=%.0f loops=%.0f)",
+								 rows, nloops);
 		}
 		else
 		{
-			ExplainPropertyFloat("Actual Startup Time", startup_sec, 3, es);
-			ExplainPropertyFloat("Actual Total Time", total_sec, 3, es);
+			if (planstate->instrument->need_timer)
+			{
+				ExplainPropertyFloat("Actual Startup Time", startup_sec, 3, es);
+				ExplainPropertyFloat("Actual Total Time", total_sec, 3, es);
+			}
 			ExplainPropertyFloat("Actual Rows", rows, 0, es);
 			ExplainPropertyFloat("Actual Loops", nloops, 0, es);
 		}
 	}
 	else if (es->analyze)
 	{
+
 		if (es->format == EXPLAIN_FORMAT_TEXT)
 			appendStringInfo(es->str, " (never executed)");
-		else
+		else if (planstate->instrument->need_timer)
 		{
 			ExplainPropertyFloat("Actual Startup Time", 0.0, 3, es);
 			ExplainPropertyFloat("Actual Total Time", 0.0, 3, es);
+		}
+		else
+		{
 			ExplainPropertyFloat("Actual Rows", 0.0, 0, es);
 			ExplainPropertyFloat("Actual Loops", 0.0, 0, es);
 		}
+
 	}
 
 	/* in text format, first line ends here */
diff --git a/src/backend/executor/instrument.c b/src/backend/executor/instrument.c
index dde73b7..2c749b1 100644
--- a/src/backend/executor/instrument.c
+++ b/src/backend/executor/instrument.c
@@ -29,17 +29,19 @@ InstrAlloc(int n, int instrument_options)
 {
 	Instrumentation *instr;
 
-	/* timer is always required for now */
-	Assert(instrument_options & INSTRUMENT_TIMER);
-
 	/* initialize all fields to zeroes, then modify as needed */
 	instr = palloc0(n * sizeof(Instrumentation));
-	if (instrument_options & INSTRUMENT_BUFFERS)
+	if (instrument_options & (INSTRUMENT_BUFFERS | INSTRUMENT_TIMER))
 	{
 		int			i;
+		bool		need_buffers = instrument_options & INSTRUMENT_BUFFERS;
+		bool		need_timer = instrument_options & INSTRUMENT_TIMER;
 
 		for (i = 0; i < n; i++)
-			instr[i].need_bufusage = true;
+		{
+			instr[i].need_bufusage = need_buffers;
+			instr[i].need_timer = need_timer;
+		}
 	}
 
 	return instr;
@@ -49,7 +51,7 @@ InstrAlloc(int n, int instrument_options)
 void
 InstrStartNode(Instrumentation *instr)
 {
-	if (INSTR_TIME_IS_ZERO(instr->starttime))
+	if (instr->need_timer && INSTR_TIME_IS_ZERO(instr->starttime))
 		INSTR_TIME_SET_CURRENT(instr->starttime);
 	else
 		elog(DEBUG2, "InstrStartNode called twice in a row");
@@ -68,16 +70,22 @@ InstrStopNode(Instrumentation *instr, double nTuples)
 	/* count the returned tuples */
 	instr->tuplecount += nTuples;
 
-	if (INSTR_TIME_IS_ZERO(instr->starttime))
+	/* let's update the time only if the timer was requested */
+	if (instr->need_timer)
 	{
-		elog(DEBUG2, "InstrStopNode called without start");
-		return;
-	}
 
-	INSTR_TIME_SET_CURRENT(endtime);
-	INSTR_TIME_ACCUM_DIFF(instr->counter, endtime, instr->starttime);
+		if (INSTR_TIME_IS_ZERO(instr->starttime))
+		{
+			elog(DEBUG2, "InstrStopNode called without start");
+			return;
+		}
 
-	INSTR_TIME_SET_ZERO(instr->starttime);
+		INSTR_TIME_SET_CURRENT(endtime);
+		INSTR_TIME_ACCUM_DIFF(instr->counter, endtime, instr->starttime);
+
+		INSTR_TIME_SET_ZERO(instr->starttime);
+
+	}
 
 	/* Add delta of buffer usage since entry to node's totals */
 	if (instr->need_bufusage)
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3443259..a987c43 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -31,6 +31,7 @@ typedef struct ExplainState
 	bool		analyze;		/* print actual times */
 	bool		costs;			/* print costs */
 	bool		buffers;		/* print buffer usage */
+	bool		timing;			/* print timing */
 	ExplainFormat format;		/* output format */
 	/* other states */
 	PlannedStmt *pstmt;			/* top of plan */
diff --git a/src/include/executor/instrument.h b/src/include/executor/instrument.h
index 9ecb544..084302e 100644
--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -31,14 +31,16 @@ typedef struct BufferUsage
 /* Flag bits included in InstrAlloc's instrument_options bitmask */
 typedef enum InstrumentOption
 {
-	INSTRUMENT_TIMER = 1 << 0,	/* needs timer */
+	INSTRUMENT_TIMER = 1 << 0,	/* needs timer (and row counts) */
 	INSTRUMENT_BUFFERS = 1 << 1,	/* needs buffer usage */
+	INSTRUMENT_ROWS = 1 << 2,	/* needs row count */
 	INSTRUMENT_ALL = 0x7FFFFFFF
 } InstrumentOption;
 
 typedef struct Instrumentation
 {
 	/* Parameters set at node creation: */
+	bool		need_timer;	    /* TRUE if we need timer data */
 	bool		need_bufusage;	/* TRUE if we need buffer usage data */
 	/* Info about current plan cycle: */
 	bool		running;		/* TRUE if we've completed first tuple */

Jeff Janes

jeff.janes@gmail.com

almost 14 years ago

In reply to: Tomas Vondra (#2)

Re: Review of: explain / allow collecting row counts without timing info

On Fri, Jan 13, 2012 at 3:07 PM, Tomas Vondra <tv@fuzzy.cz> wrote:

Fixed. The default value of TIMING option did not work as intended, it
was set to true even for plain EXPLAIN (without ANALYZE). In that case
the EXPLAIN failed.

I've applied this over the "show Heap Fetches in EXPLAIN ANALYZE
output for index-only scans" patch. It applied and built and passes
installcheck.

I have not done a full review of this, but I want to say that I want
this very much. Not only can I get the row counts, but also the Heap
Fetches and the output of BUFFERS, without the overhead of timing. I
haven't looked at contrib aspects of it at all.

Thank you for making this.

Cheers,

Jeff

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Jeff Janes (#3)

1 attachment(s)

Re: Review of: explain / allow collecting row counts without timing info

On Sat, Jan 21, 2012 at 10:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Fri, Jan 13, 2012 at 3:07 PM, Tomas Vondra <tv@fuzzy.cz> wrote:

Fixed. The default value of TIMING option did not work as intended, it
was set to true even for plain EXPLAIN (without ANALYZE). In that case
the EXPLAIN failed.

I've applied this over the "show Heap Fetches in EXPLAIN ANALYZE
output for index-only scans" patch. It applied and built and passes
installcheck.

I have not done a full review of this, but I want to say that I want
this very much. Not only can I get the row counts, but also the Heap
Fetches and the output of BUFFERS, without the overhead of timing. I
haven't looked at contrib aspects of it at all.

Thank you for making this.

+1.

After looking at this a bit, I think we can make the interface a bit
more convenient. I'd like to propose that we call the EXPLAIN option
"ROWS" rather than "TIMING", and that we provide the row-count
information if *either* ROWS or ANALYZE is specified.

Then, if you want rows without timing, you can say this:

EXPLAIN (ROWS) query...

Rather than, as in the approach taken by the current patch:

EXPLAIN (ANALYZE, TIMING off) query...

...which is a little more long-winded. This also has the nice
advantage of making the name of the EXPLAIN option match the name of
the INSTRUMENT flag.

Revised patch along those lines attached. Barring objections, I'll
commit this version.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

explain-rows.patchapplication/octet-stream; name=explain-rows.patchDownload

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 61da6a2..967cbd9 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -23,6 +23,7 @@ static int	auto_explain_log_min_duration = -1; /* msec or -1 */
 static bool auto_explain_log_analyze = false;
 static bool auto_explain_log_verbose = false;
 static bool auto_explain_log_buffers = false;
+static bool auto_explain_log_rows = false;
 static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
 
@@ -133,6 +134,17 @@ _PG_init(void)
 							 NULL,
 							 NULL);
 
+	DefineCustomBoolVariable("auto_explain.log_rows",
+							 "Collect row counts.",
+							 NULL,
+							 &auto_explain_log_rows,
+							 true,
+							 PGC_SUSET,
+							 0,
+							 NULL,
+							 NULL,
+							 NULL);
+
 	EmitWarningsOnPlaceholders("auto_explain");
 
 	/* Install hooks. */
@@ -168,9 +180,13 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
 	if (auto_explain_enabled())
 	{
 		/* Enable per-node instrumentation iff log_analyze is required. */
-		if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)
+		if ((eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)
 		{
-			queryDesc->instrument_options |= INSTRUMENT_TIMER;
+			if (auto_explain_log_analyze)
+				queryDesc->instrument_options |= INSTRUMENT_TIMER;
+			else if (auto_explain_log_rows)
+				queryDesc->instrument_options |= INSTRUMENT_ROWS;
+
 			if (auto_explain_log_buffers)
 				queryDesc->instrument_options |= INSTRUMENT_BUFFERS;
 		}
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 8a9c9de..dc373e7 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] <replaceable class="parameter">statement</replac
     VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
     COSTS [ <replaceable class="parameter">boolean</replaceable> ]
     BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
+    ROWS [ <replaceable class="parameter">boolean</replaceable> ]
     FORMAT { TEXT | XML | JSON | YAML }
 </synopsis>
  </refsynopsisdiv>
@@ -172,6 +173,18 @@ ROLLBACK;
    </varlistentry>
 
    <varlistentry>
+    <term><literal>ROWS</literal></term>
+    <listitem>
+     <para>
+      Include information on the number of rows produced by node.  This
+      information is also displayed when <literal>ANALYZE</literal> is used,
+      but using only <literal>ROWS</literal> without <literal>ANALYZE</literal>
+      involves less overhead.  It defaults to <literal>FALSE</literal>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>FORMAT</literal></term>
     <listitem>
      <para>
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e35ac59..5b01bc3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -133,6 +133,8 @@ ExplainQuery(ExplainStmt *stmt, const char *queryString,
 			es.costs = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "buffers") == 0)
 			es.buffers = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "rows") == 0)
+			es.rows = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "format") == 0)
 		{
 			char	   *p = defGetString(opt);
@@ -362,6 +364,9 @@ ExplainOnePlan(PlannedStmt *plannedstmt, ExplainState *es,
 
 	if (es->analyze)
 		instrument_option |= INSTRUMENT_TIMER;
+	else if (es->rows)
+		instrument_option |= INSTRUMENT_ROWS;
+
 	if (es->buffers)
 		instrument_option |= INSTRUMENT_BUFFERS;
 
@@ -380,7 +385,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, ExplainState *es,
 								None_Receiver, params, instrument_option);
 
 	/* Select execution options */
-	if (es->analyze)
+	if (instrument_option != 0)
 		eflags = 0;				/* default run-to-completion flags */
 	else
 		eflags = EXEC_FLAG_EXPLAIN_ONLY;
@@ -389,7 +394,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, ExplainState *es,
 	ExecutorStart(queryDesc, eflags);
 
 	/* Execute the plan for statistics if asked for */
-	if (es->analyze)
+	if (instrument_option != 0)
 	{
 		/* run the plan */
 		ExecutorRun(queryDesc, ForwardScanDirection, 0L);
@@ -956,29 +961,42 @@ ExplainNode(PlanState *planstate, List *ancestors,
 
 		if (es->format == EXPLAIN_FORMAT_TEXT)
 		{
-			appendStringInfo(es->str,
-							 " (actual time=%.3f..%.3f rows=%.0f loops=%.0f)",
-							 startup_sec, total_sec, rows, nloops);
+			if (planstate->instrument->need_timer)
+				appendStringInfo(es->str,
+								 " (actual time=%.3f..%.3f rows=%.0f loops=%.0f)",
+								 startup_sec, total_sec, rows, nloops);
+			else
+				appendStringInfo(es->str,
+								 " (actual rows=%.0f loops=%.0f)",
+								 rows, nloops);
 		}
 		else
 		{
-			ExplainPropertyFloat("Actual Startup Time", startup_sec, 3, es);
-			ExplainPropertyFloat("Actual Total Time", total_sec, 3, es);
+			if (planstate->instrument->need_timer)
+			{
+				ExplainPropertyFloat("Actual Startup Time", startup_sec, 3, es);
+				ExplainPropertyFloat("Actual Total Time", total_sec, 3, es);
+			}
 			ExplainPropertyFloat("Actual Rows", rows, 0, es);
 			ExplainPropertyFloat("Actual Loops", nloops, 0, es);
 		}
 	}
 	else if (es->analyze)
 	{
+
 		if (es->format == EXPLAIN_FORMAT_TEXT)
 			appendStringInfo(es->str, " (never executed)");
-		else
+		else if (planstate->instrument->need_timer)
 		{
 			ExplainPropertyFloat("Actual Startup Time", 0.0, 3, es);
 			ExplainPropertyFloat("Actual Total Time", 0.0, 3, es);
+		}
+		else
+		{
 			ExplainPropertyFloat("Actual Rows", 0.0, 0, es);
 			ExplainPropertyFloat("Actual Loops", 0.0, 0, es);
 		}
+
 	}
 
 	/* in text format, first line ends here */
diff --git a/src/backend/executor/instrument.c b/src/backend/executor/instrument.c
index dde73b7..ec15656 100644
--- a/src/backend/executor/instrument.c
+++ b/src/backend/executor/instrument.c
@@ -29,17 +29,19 @@ InstrAlloc(int n, int instrument_options)
 {
 	Instrumentation *instr;
 
-	/* timer is always required for now */
-	Assert(instrument_options & INSTRUMENT_TIMER);
-
 	/* initialize all fields to zeroes, then modify as needed */
 	instr = palloc0(n * sizeof(Instrumentation));
-	if (instrument_options & INSTRUMENT_BUFFERS)
+	if (instrument_options & (INSTRUMENT_BUFFERS | INSTRUMENT_TIMER))
 	{
 		int			i;
+		bool		need_buffers = instrument_options & INSTRUMENT_BUFFERS;
+		bool		need_timer = instrument_options & INSTRUMENT_TIMER;
 
 		for (i = 0; i < n; i++)
-			instr[i].need_bufusage = true;
+		{
+			instr[i].need_bufusage = need_buffers;
+			instr[i].need_timer = need_timer;
+		}
 	}
 
 	return instr;
@@ -49,7 +51,7 @@ InstrAlloc(int n, int instrument_options)
 void
 InstrStartNode(Instrumentation *instr)
 {
-	if (INSTR_TIME_IS_ZERO(instr->starttime))
+	if (instr->need_timer && INSTR_TIME_IS_ZERO(instr->starttime))
 		INSTR_TIME_SET_CURRENT(instr->starttime);
 	else
 		elog(DEBUG2, "InstrStartNode called twice in a row");
@@ -68,16 +70,20 @@ InstrStopNode(Instrumentation *instr, double nTuples)
 	/* count the returned tuples */
 	instr->tuplecount += nTuples;
 
-	if (INSTR_TIME_IS_ZERO(instr->starttime))
+	/* let's update the time only if the timer was requested */
+	if (instr->need_timer)
 	{
-		elog(DEBUG2, "InstrStopNode called without start");
-		return;
-	}
+		if (INSTR_TIME_IS_ZERO(instr->starttime))
+		{
+			elog(DEBUG2, "InstrStopNode called without start");
+			return;
+		}
 
-	INSTR_TIME_SET_CURRENT(endtime);
-	INSTR_TIME_ACCUM_DIFF(instr->counter, endtime, instr->starttime);
+		INSTR_TIME_SET_CURRENT(endtime);
+		INSTR_TIME_ACCUM_DIFF(instr->counter, endtime, instr->starttime);
 
-	INSTR_TIME_SET_ZERO(instr->starttime);
+		INSTR_TIME_SET_ZERO(instr->starttime);
+	}
 
 	/* Add delta of buffer usage since entry to node's totals */
 	if (instr->need_bufusage)
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3443259..a64889c 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -28,7 +28,8 @@ typedef struct ExplainState
 	StringInfo	str;			/* output buffer */
 	/* options */
 	bool		verbose;		/* be verbose */
-	bool		analyze;		/* print actual times */
+	bool		analyze;		/* print actual times and row counts */
+	bool		rows;			/* print actual row counts */
 	bool		costs;			/* print costs */
 	bool		buffers;		/* print buffer usage */
 	ExplainFormat format;		/* output format */
diff --git a/src/include/executor/instrument.h b/src/include/executor/instrument.h
index 9ecb544..084302e 100644
--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -31,14 +31,16 @@ typedef struct BufferUsage
 /* Flag bits included in InstrAlloc's instrument_options bitmask */
 typedef enum InstrumentOption
 {
-	INSTRUMENT_TIMER = 1 << 0,	/* needs timer */
+	INSTRUMENT_TIMER = 1 << 0,	/* needs timer (and row counts) */
 	INSTRUMENT_BUFFERS = 1 << 1,	/* needs buffer usage */
+	INSTRUMENT_ROWS = 1 << 2,	/* needs row count */
 	INSTRUMENT_ALL = 0x7FFFFFFF
 } InstrumentOption;
 
 typedef struct Instrumentation
 {
 	/* Parameters set at node creation: */
+	bool		need_timer;	    /* TRUE if we need timer data */
 	bool		need_bufusage;	/* TRUE if we need buffer usage data */
 	/* Info about current plan cycle: */
 	bool		running;		/* TRUE if we've completed first tuple */

Tomas Vondra

tv@fuzzy.cz

almost 14 years ago

In reply to: Robert Haas (#4)

Re: Review of: explain / allow collecting row counts without timing info

On 3 Únor 2012, 17:12, Robert Haas wrote:

On Sat, Jan 21, 2012 at 10:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Fri, Jan 13, 2012 at 3:07 PM, Tomas Vondra <tv@fuzzy.cz> wrote:

Fixed. The default value of TIMING option did not work as intended, it
was set to true even for plain EXPLAIN (without ANALYZE). In that case
the EXPLAIN failed.

I've applied this over the "show Heap Fetches in EXPLAIN ANALYZE
output for index-only scans" patch. It applied and built and passes
installcheck.

I have not done a full review of this, but I want to say that I want
this very much. Not only can I get the row counts, but also the Heap
Fetches and the output of BUFFERS, without the overhead of timing. I
haven't looked at contrib aspects of it at all.

Thank you for making this.

+1.

After looking at this a bit, I think we can make the interface a bit
more convenient. I'd like to propose that we call the EXPLAIN option
"ROWS" rather than "TIMING", and that we provide the row-count
information if *either* ROWS or ANALYZE is specified.

Then, if you want rows without timing, you can say this:

EXPLAIN (ROWS) query...

Rather than, as in the approach taken by the current patch:

EXPLAIN (ANALYZE, TIMING off) query...

...which is a little more long-winded. This also has the nice
advantage of making the name of the EXPLAIN option match the name of
the INSTRUMENT flag.

Revised patch along those lines attached. Barring objections, I'll
commit this version.

I don't think changing the EXPLAIN syntax this way is a good idea. I think
that one option should not silently enable/disable others, so ROWS should
not enable ANALYZE automatically. It should throw an error just like the
TIMING option does when invoked without ANALYZE (IIRC).

Which leads to syntax like this

EXPLAIN (ANALYZE, ROWS)

and that's a bit strange because it mentions ROWS but actually manipulates
TIMING behind the curtains. You can't actually do

EXPLAIN (ANALYZE, ROWS off)

because row counts are always collected. So I think the syntax I proposed
makes more sense.

kind regards
Tomas

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Tomas Vondra (#5)

Re: Review of: explain / allow collecting row counts without timing info

On Fri, Feb 3, 2012 at 12:08 PM, Tomas Vondra <tv@fuzzy.cz> wrote:

I don't think changing the EXPLAIN syntax this way is a good idea. I think
that one option should not silently enable/disable others, so ROWS should
not enable ANALYZE automatically.

I didn't propose that. The point is that the desired behavior
(however we name it) is a SUBSET of what ANALYZE does.

So we can either:

1. Have ANALYZE enable all the behavior, and have another option
(TIMING) that can be used to turn some of it back on again.

2. Have ANALYZE enable all the behavior, and have another option
(ROWS) that enables just the subset of it that we want.

I prefer #2 to #1, because I think it's a bit confusing to have an
option whose effect is to partially disable another option. YMMV, of
course.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Tomas Vondra

tv@fuzzy.cz

almost 14 years ago

In reply to: Robert Haas (#6)

Re: Review of: explain / allow collecting row counts without timing info

On 3.2.2012 18:28, Robert Haas wrote:

On Fri, Feb 3, 2012 at 12:08 PM, Tomas Vondra <tv@fuzzy.cz> wrote:

I don't think changing the EXPLAIN syntax this way is a good idea. I think
that one option should not silently enable/disable others, so ROWS should
not enable ANALYZE automatically.

I didn't propose that. The point is that the desired behavior
(however we name it) is a SUBSET of what ANALYZE does.

So we can either:

1. Have ANALYZE enable all the behavior, and have another option
(TIMING) that can be used to turn some of it back on again.

2. Have ANALYZE enable all the behavior, and have another option
(ROWS) that enables just the subset of it that we want.

I prefer #2 to #1, because I think it's a bit confusing to have an
option whose effect is to partially disable another option. YMMV, of
course.

OK, thanks for the explanation. I don't like the idea of subsets as it
IMHO makes it less obvious what options are enabled. For example this

EXPLAIN (ROWS) query...

does not immediately show it's actually going to do ANALYZE.

I prefer to keep the current 'ANALYZE' definition, i.e. collecting both
row counts and timing data (which is what 99% of people wants anyway),
and an option to disable the timing.

And the BUFFERS option currently works exactly like that, so defining
ROWS the way you proposed would be inconsistent with the current behavior.

Sure, we could redefine BUFFERS as a subset, so you could do

EXPLAIN (ROWS) ... instead of ... EXPLAIN (ANALYZE, TIMING off)
EXPLAIN (BUFFERS) ... instead of ... EXPLAIN (ANALYZE, BUFFERS on)

but what if someone wants both at the same time? Maybe he could do

EXPLAIN (ROWS, BUFFERS)

and treat that as a union of those subsets. I don't think it's worth it.

I surely can live with both solutions (mine or the one you proposed).

kind regards
Tomas

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Tomas Vondra (#7)

Re: Review of: explain / allow collecting row counts without timing info

On Fri, Feb 3, 2012 at 2:56 PM, Tomas Vondra <tv@fuzzy.cz> wrote:

OK, thanks for the explanation. I don't like the idea of subsets as it
IMHO makes it less obvious what options are enabled. For example this

EXPLAIN (ROWS) query...

does not immediately show it's actually going to do ANALYZE.

Well, it isn't, if ANALYZE means rows + timing...

I prefer to keep the current 'ANALYZE' definition, i.e. collecting both
row counts and timing data (which is what 99% of people wants anyway),
and an option to disable the timing.

And the BUFFERS option currently works exactly like that, so defining
ROWS the way you proposed would be inconsistent with the current behavior.

Sure, we could redefine BUFFERS as a subset, so you could do

EXPLAIN (ROWS) ... instead of ... EXPLAIN (ANALYZE, TIMING off)
EXPLAIN (BUFFERS) ... instead of ... EXPLAIN (ANALYZE, BUFFERS on)

but what if someone wants both at the same time? Maybe he could do

EXPLAIN (ROWS, BUFFERS)

and treat that as a union of those subsets. I don't think it's worth it.

Yeah, I forgot that we'd have to allow that, though I don't think it
would be a big deal to fix that.

I surely can live with both solutions (mine or the one you proposed).

Let's wait and see if anyone else has an opinion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Tomas Vondra

tv@fuzzy.cz

almost 14 years ago

In reply to: Robert Haas (#8)

Re: Review of: explain / allow collecting row counts without timing info

On 3.2.2012 22:51, Robert Haas wrote:

On Fri, Feb 3, 2012 at 2:56 PM, Tomas Vondra <tv@fuzzy.cz> wrote:

OK, thanks for the explanation. I don't like the idea of subsets as it
IMHO makes it less obvious what options are enabled. For example this

EXPLAIN (ROWS) query...

does not immediately show it's actually going to do ANALYZE.

Well, it isn't, if ANALYZE means rows + timing...

Yeah, sure. It depends on how you define ANALYZE. I see that as 'stats
collected at runtime' (in contrast to a query plan, which is just a ...
well, plan).

but what if someone wants both at the same time? Maybe he could do

EXPLAIN (ROWS, BUFFERS)

and treat that as a union of those subsets. I don't think it's worth it.

Yeah, I forgot that we'd have to allow that, though I don't think it
would be a big deal to fix that.

I surely can live with both solutions (mine or the one you proposed).

Let's wait and see if anyone else has an opinion.

Sure. And thanks for your feedback!

Tomas

#10

Noah Misch

noah@leadboat.com

almost 14 years ago

In reply to: Robert Haas (#4)

Re: Review of: explain / allow collecting row counts without timing info

On Fri, Feb 03, 2012 at 11:12:33AM -0500, Robert Haas wrote:

After looking at this a bit, I think we can make the interface a bit
more convenient. I'd like to propose that we call the EXPLAIN option
"ROWS" rather than "TIMING", and that we provide the row-count
information if *either* ROWS or ANALYZE is specified.

Then, if you want rows without timing, you can say this:

EXPLAIN (ROWS) query...

Rather than, as in the approach taken by the current patch:

EXPLAIN (ANALYZE, TIMING off) query...

...which is a little more long-winded.

I favor the originally-submitted syntax. I like having a single option,
ANALYZE, signal whether to actually run the statement. The submitted syntax
also aligns with the fact that would motivate someone to use it: "Timing has
high overhead on my system." or "Timing makes the output unstable."

We have both estimated row counts and actual row counts. It may someday prove
useful to have an invocation that plans the query and shows estimated row
counts, omitting only cost estimates. Having a "ROWS" option that implies
running the query could complicate that effort.

Thanks,
nm

#11

Jeff Janes

jeff.janes@gmail.com

almost 14 years ago

In reply to: Robert Haas (#4)

Re: Review of: explain / allow collecting row counts without timing info

On Fri, Feb 3, 2012 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Jan 21, 2012 at 10:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Fri, Jan 13, 2012 at 3:07 PM, Tomas Vondra <tv@fuzzy.cz> wrote:

Fixed. The default value of TIMING option did not work as intended, it
was set to true even for plain EXPLAIN (without ANALYZE). In that case
the EXPLAIN failed.

I've applied this over the "show Heap Fetches in EXPLAIN ANALYZE
output for index-only scans" patch. It applied and built and passes
installcheck.

I have not done a full review of this, but I want to say that I want
this very much. Not only can I get the row counts, but also the Heap
Fetches and the output of BUFFERS, without the overhead of timing. I
haven't looked at contrib aspects of it at all.

Thank you for making this.

+1.

After looking at this a bit, I think we can make the interface a bit
more convenient. I'd like to propose that we call the EXPLAIN option
"ROWS" rather than "TIMING", and that we provide the row-count
information if *either* ROWS or ANALYZE is specified.

If we are allowed to make big breaks with the past, I would get rid of
ANALYZE altogether.

"analyze" basically is a synonym of "explain", or is contained within
it. You can't explain anything until you have first analyzed it.

Without ANALYZE, you are explaining the analysis done by the planner.
With it, you are explaining an analysis done by the planner and the
executor.

So we could do away with ANALYZE and have TIMING, ROWS, and BUFFERS,
any of which causes the query to actually be executed, not just
planned.

I suspect we will be unwilling to make such a break with the past. In
that case, I think I prefer the originally proposed semantics, even
though I agree they are somewhat less natural. ANALYZE is a big flag
that means "This query will be executed, not just planned". If we are
not going to make a major break, but only nibble around the edges,
then I don't think we should remove the property that the query will
be executed if and only if ANALYZE is specified.

Cheers,

Jeff

#12

Tom Lane

tgl@sss.pgh.pa.us

almost 14 years ago

In reply to: Jeff Janes (#11)

Re: Review of: explain / allow collecting row counts without timing info

Jeff Janes <jeff.janes@gmail.com> writes:

I suspect we will be unwilling to make such a break with the past. In
that case, I think I prefer the originally proposed semantics, even
though I agree they are somewhat less natural. ANALYZE is a big flag
that means "This query will be executed, not just planned". If we are
not going to make a major break, but only nibble around the edges,
then I don't think we should remove the property that the query will
be executed if and only if ANALYZE is specified.

Yeah, I think we need to preserve that property. Unexpectedly executing
a query (which may have side-effects) is a very dangerous thing. People
are used to the idea that ANALYZE == execute, and adding random other
flags that also cause execution is going to burn somebody.

regards, tom lane

#13

Kevin Grittner

Kevin.Grittner@wicourts.gov

almost 14 years ago

In reply to: Jeff Janes (#11)

Re: Review of: explain / allow collecting row counts without timing info

Tom Lane wrote:

Yeah, I think we need to preserve that property. Unexpectedly
executing query (which may have side-effects) is a very dangerous
thing. People are used to the idea that ANALYZE == execute, and
adding random other flags that also cause execution is going to
burn somebody.

FWIW, another reason not to use Robert's suggested syntax is that you
get "rows=n" entries with or without the actual run. You just don't
get the "actual" block to compare to the estimate. So ROWS as an
option would be very ambiguous.

-Kevin

Import Notes

Resolved by subject fallback

#14

Robert Haas

robertmhaas@gmail.com

almost 14 years ago

In reply to: Kevin Grittner (#13)

Re: Review of: explain / allow collecting row counts without timing info

On Sun, Feb 5, 2012 at 12:44 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Tom Lane wrote:

Yeah, I think we need to preserve that property. Unexpectedly
executing query (which may have side-effects) is a very dangerous
thing. People are used to the idea that ANALYZE == execute, and
adding random other flags that also cause execution is going to
burn somebody.

+1

FWIW, another reason not to use Robert's suggested syntax is that you
get "rows=n" entries with or without the actual run. You just don't
get the "actual" block to compare to the estimate. So ROWS as an
option would be very ambiguous.

OK, so based on that resoundingly unanimous input, I've committed
Tomas's last version. I made some alterations to the sgml
documentation to avoid mentioning "gettimeofday" specifically, because
that might not be the call everywhere (e.g. Windows) and even if it
is, it doesn't seem too user-friendly. The code is entirely as he had
it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company