Modify pg_stat_get_activity to build a tuplestore

Started by Stephen Frostover 10 years ago3 messages
#1Stephen Frost
sfrost@snowman.net
1 attachment(s)

All,

* Robert Haas (robertmhaas@gmail.com) wrote (on the role attributes
* thread):

I think this patch is too big and does too many things. It should be
broken up into small patches which can be discussed and validated
independently. The fact that your commit message is incredibly long
is a sign that there's too much going on here, and that message
doesn't even cover all of it.

As I mentioned on another thread, you're certainly right about that, and
here's the first broken-out patch, which just changes
pg_stat_get_activity to build and return a tuplestore in a single
function call instead of using the old-style multiple-invokation call
method. Given the simplicity and the lack of any user-visible change,
and that it's an independently useful change regardless of what happens
with the other changes, I'm planning to move forward with this pretty
quickly, barring concerns, of course.

This is heavily based on the pg_stat_get_wal_senders() implementation,
hat-tip to Itagaki Takahiro for that.

Thanks!

Stephen

Attachments:

pg_stat_get_activity_tuplestore_v1.patchtext/x-diff; charset=us-asciiDownload
From d08171691f579b60e6c0652ffef886bbf0b66d85 Mon Sep 17 00:00:00 2001
From: Stephen Frost <sfrost@snowman.net>
Date: Thu, 7 May 2015 16:41:00 -0400
Subject: [PATCH] Modify pg_stat_get_activity to build a tuplestore

This updates pg_stat_get_activity() to build a tuplestore for its
results instead of using the old-style multiple-call method.  This
simplifies the function, though that wasn't the primary motivation for
the change, which is that we may turn it into a helper function which
can filter the results (or not) much more easily.
---
 src/backend/utils/adt/pgstatfuncs.c | 194 +++++++++++++-----------------------
 1 file changed, 67 insertions(+), 127 deletions(-)

diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index bbe94c3..2b3778b 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -524,137 +524,75 @@ pg_stat_get_backend_idset(PG_FUNCTION_ARGS)
 	}
 }
 
+/*
+ * Returns activity of PG backends.
+ */
 Datum
 pg_stat_get_activity(PG_FUNCTION_ARGS)
 {
-	FuncCallContext *funcctx;
-
-	if (SRF_IS_FIRSTCALL())
-	{
-		MemoryContext oldcontext;
-		TupleDesc	tupdesc;
-
-		funcctx = SRF_FIRSTCALL_INIT();
-
-		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
-
-		tupdesc = CreateTemplateTupleDesc(22, false);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
-						   OIDOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "pid",
-						   INT4OID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 3, "usesysid",
-						   OIDOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "application_name",
-						   TEXTOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "state",
-						   TEXTOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 6, "query",
-						   TEXTOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 7, "waiting",
-						   BOOLOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 8, "act_start",
-						   TIMESTAMPTZOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 9, "query_start",
-						   TIMESTAMPTZOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 10, "backend_start",
-						   TIMESTAMPTZOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 11, "state_change",
-						   TIMESTAMPTZOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 12, "client_addr",
-						   INETOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 13, "client_hostname",
-						   TEXTOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 14, "client_port",
-						   INT4OID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 15, "backend_xid",
-						   XIDOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "backend_xmin",
-						   XIDOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "ssl",
-						   BOOLOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 18, "sslversion",
-						   TEXTOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 19, "sslcipher",
-						   TEXTOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 20, "sslbits",
-						   INT4OID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 21, "sslcompression",
-						   BOOLOID, -1, 0);
-		TupleDescInitEntry(tupdesc, (AttrNumber) 22, "sslclientdn",
-						   TEXTOID, -1, 0);
-
-		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
-
-		funcctx->user_fctx = palloc0(sizeof(int));
-		if (PG_ARGISNULL(0))
-		{
-			/* Get all backends */
-			funcctx->max_calls = pgstat_fetch_stat_numbackends();
-		}
-		else
-		{
-			/*
-			 * Get one backend - locate by pid.
-			 *
-			 * We lookup the backend early, so we can return zero rows if it
-			 * doesn't exist, instead of returning a single row full of NULLs.
-			 */
-			int			pid = PG_GETARG_INT32(0);
-			int			i;
-			int			n = pgstat_fetch_stat_numbackends();
-
-			for (i = 1; i <= n; i++)
-			{
-				PgBackendStatus *be = pgstat_fetch_stat_beentry(i);
-
-				if (be)
-				{
-					if (be->st_procpid == pid)
-					{
-						*(int *) (funcctx->user_fctx) = i;
-						break;
-					}
-				}
-			}
-
-			if (*(int *) (funcctx->user_fctx) == 0)
-				/* Pid not found, return zero rows */
-				funcctx->max_calls = 0;
-			else
-				funcctx->max_calls = 1;
-		}
-
-		MemoryContextSwitchTo(oldcontext);
-	}
-
-	/* stuff done on every call of the function */
-	funcctx = SRF_PERCALL_SETUP();
-
-	if (funcctx->call_cntr < funcctx->max_calls)
+#define PG_STAT_GET_ACTIVITY_COLS	22
+	int					num_backends = pgstat_fetch_stat_numbackends();
+	int					curr_backend;
+	int					pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
+	ReturnSetInfo	   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+	TupleDesc			tupdesc;
+	Tuplestorestate	   *tupstore;
+	MemoryContext		per_query_ctx;
+	MemoryContext		oldcontext;
+
+	/* check to see if caller supports us returning a tuplestore */
+	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that cannot accept a set")));
+	if (!(rsinfo->allowedModes & SFRM_Materialize))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("materialize mode required, but it is not " \
+						"allowed in this context")));
+
+	/* Build a tuple descriptor for our result type */
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+
+	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+	oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+	tupstore = tuplestore_begin_heap(true, false, work_mem);
+	rsinfo->returnMode = SFRM_Materialize;
+	rsinfo->setResult = tupstore;
+	rsinfo->setDesc = tupdesc;
+
+	MemoryContextSwitchTo(oldcontext);
+
+	/* 1-based index */
+	for (curr_backend = 1; curr_backend <= num_backends; curr_backend++)
 	{
 		/* for each row */
-		Datum		values[22];
-		bool		nulls[22];
-		HeapTuple	tuple;
+		Datum		values[PG_STAT_GET_ACTIVITY_COLS];
+		bool		nulls[PG_STAT_GET_ACTIVITY_COLS];
 		LocalPgBackendStatus *local_beentry;
 		PgBackendStatus *beentry;
 
 		MemSet(values, 0, sizeof(values));
 		MemSet(nulls, 0, sizeof(nulls));
 
-		if (*(int *) (funcctx->user_fctx) > 0)
-		{
-			/* Get specific pid slot */
-			local_beentry = pgstat_fetch_stat_local_beentry(*(int *) (funcctx->user_fctx));
-			beentry = &local_beentry->backendStatus;
-		}
-		else
+		if (pid != -1)
 		{
-			/* Get the next one in the list */
-			local_beentry = pgstat_fetch_stat_local_beentry(funcctx->call_cntr + 1);	/* 1-based index */
-			beentry = &local_beentry->backendStatus;
+			/* Skip any which are not the one we're looking for. */
+			PgBackendStatus *be = pgstat_fetch_stat_beentry(curr_backend);
+
+			if (!be || be->st_procpid != pid)
+				continue;
+
 		}
+
+		/* Get the next one in the list */
+		local_beentry = pgstat_fetch_stat_local_beentry(curr_backend);
+		if (!local_beentry)
+			continue;
+
+		beentry = &local_beentry->backendStatus;
 		if (!beentry)
 		{
 			int			i;
@@ -665,8 +603,8 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 			nulls[5] = false;
 			values[5] = CStringGetTextDatum("<backend information not available>");
 
-			tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
-			SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
+			tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+			continue;
 		}
 
 		/* Values available to all callers */
@@ -839,15 +777,17 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 			nulls[13] = true;
 		}
 
-		tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
+		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
 
-		SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
-	}
-	else
-	{
-		/* nothing left */
-		SRF_RETURN_DONE(funcctx);
+		/* If only a single backend was requested, and we found it, break. */
+		if (pid != -1)
+			break;
 	}
+
+	/* clean up and return the tuplestore */
+	tuplestore_donestoring(tupstore);
+
+	return (Datum) 0;
 }
 
 
-- 
1.9.1

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#1)
Re: Modify pg_stat_get_activity to build a tuplestore

Stephen Frost wrote:

As I mentioned on another thread, you're certainly right about that, and
here's the first broken-out patch, which just changes
pg_stat_get_activity to build and return a tuplestore in a single
function call instead of using the old-style multiple-invokation call
method. Given the simplicity and the lack of any user-visible change,
and that it's an independently useful change regardless of what happens
with the other changes, I'm planning to move forward with this pretty
quickly, barring concerns, of course.

Um. Abhijit mentioned to me the idea of implementing function execution
code in fmgr that would allow for calling functions lazily (i.e. stop
calling it once enough rows have been returned). One of the reasons not
to do that is that most functions use the materialize mode rather than
value-per-call, so the optimization is pointless because the whole
result is computed anyway. If we change more functions to materialize,
we will make that sort of optimization even more distant.

What is the reason for this change anyway?

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Stephen Frost
sfrost@snowman.net
In reply to: Alvaro Herrera (#2)
Re: Modify pg_stat_get_activity to build a tuplestore

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

Stephen Frost wrote:

As I mentioned on another thread, you're certainly right about that, and
here's the first broken-out patch, which just changes
pg_stat_get_activity to build and return a tuplestore in a single
function call instead of using the old-style multiple-invokation call
method. Given the simplicity and the lack of any user-visible change,
and that it's an independently useful change regardless of what happens
with the other changes, I'm planning to move forward with this pretty
quickly, barring concerns, of course.

Um. Abhijit mentioned to me the idea of implementing function execution
code in fmgr that would allow for calling functions lazily (i.e. stop
calling it once enough rows have been returned). One of the reasons not
to do that is that most functions use the materialize mode rather than
value-per-call, so the optimization is pointless because the whole
result is computed anyway. If we change more functions to materialize,
we will make that sort of optimization even more distant.

With pg_stat_get_activity, at least, we already deal with this in a way-
you can pass in the specific pid you want to look at and then we'll just
return the one row that has that pid.

I'm not saying that such an optimization isn't a good idea (I'm all for
it, in fact), but it seems unlikely to help this particular function
much. Wouldn't we also have to change the API to support that, as there
are functions which expect to be called all the way through and do
something at the end?

I've long wanted to get at information, like if there is a LIMIT or a
WHERE clause or set of conditionals which applies to the results of a
given function and have that exposed to all PLs (pl/pgsql being the big
one there). That'd be far more valuable than this short-circuiting, not
that I'm against having both, of course.

What is the reason for this change anyway?

Hrm, guess I wasn't clear enough in the commit message. The reason is
that the default roles patch turns it into a helper function which takes
arguments and is then called from multiple functions which are exposed
at the SQL level. That's much simpler when working with a tuplestore
since we can create the tuplestore in the calling function and have the
helper just populate it for us, based on the parameters passed in.

Thanks!

Stephen