Addition of authenticated ID to pg_stat_activity

Started by Michael Paquierover 4 years ago19 messages
#1Michael Paquier
michael@paquier.xyz
1 attachment(s)

Hi all,

9afffcb has added the concept of authenticated identity to the
information provided in log_connections for audit purposes, with this
data stored in each backend's port. One extra thing that can be
really useful for monitoring is the capability to track this
information directly in pg_stat_activity.

Please find attached a patch to do that, with the following choices
made:
- Like query strings, authenticated IDs could be rather long, so we
need a GUC to control the maximum size allocated for these in shared
memory. The attached uses 128 bytes by default, that should be enough
in most cases even for DNs, LDAP or krb5.
- Multi-byte strings need to be truncated appropriately. As a matter
of consistency with the query string code, I have made things so as
the truncation is done each time a string is requested, with
PgBackendStatus storing the raw information truncated depending on the
maximum size allowed at the GUC level.
- Some tests are added within the SSL and LDAP code paths. We could
add more of that within the authentication and kerberos tests but that
did not strike me as mandatory either as the backend logs are checked
everywhere already.
- The new field has been added at the end of pg_stat_end_activity()
mainly as a matter of readability. I'd rather move that just after
the application_name, now only pg_stat_activity does that.

I am adding that to the next CF.

Thanks,
--
Michael

Attachments:

v1-0001-Add-authenticated-data-to-pg_stat_activity.patchtext/x-diff; charset=us-asciiDownload
From 5acd643aff5a12b8ee2ca3365532f5773c1b02a8 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Mon, 26 Apr 2021 11:25:17 +0900
Subject: [PATCH v1] Add authenticated data to pg_stat_activity

---
 src/include/catalog/pg_proc.dat               |  6 +--
 src/include/utils/backend_status.h            | 16 ++++---
 src/backend/catalog/system_views.sql          |  1 +
 src/backend/utils/activity/backend_status.c   | 45 +++++++++++++------
 src/backend/utils/adt/pgstatfuncs.c           | 21 +++++++--
 src/backend/utils/misc/guc.c                  | 11 +++++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/test/ldap/t/001_auth.pl                   |  4 +-
 src/test/regress/expected/rules.out           |  9 ++--
 src/test/ssl/t/001_ssltests.pl                |  8 ++--
 doc/src/sgml/config.sgml                      | 18 ++++++++
 doc/src/sgml/monitoring.sgml                  | 10 +++++
 12 files changed, 117 insertions(+), 33 deletions(-)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index db1abc149c..7217e017a2 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5279,9 +5279,9 @@
   proname => 'pg_stat_get_activity', prorows => '100', proisstrict => 'f',
   proretset => 't', provolatile => 's', proparallel => 'r',
   prorettype => 'record', proargtypes => 'int4',
-  proallargtypes => '{int4,oid,int4,oid,text,text,text,text,text,timestamptz,timestamptz,timestamptz,timestamptz,inet,text,int4,xid,xid,text,bool,text,text,int4,text,numeric,text,bool,text,bool,int4,int8}',
-  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,backend_type,ssl,sslversion,sslcipher,sslbits,ssl_client_dn,ssl_client_serial,ssl_issuer_dn,gss_auth,gss_princ,gss_enc,leader_pid,query_id}',
+  proallargtypes => '{int4,oid,int4,oid,text,text,text,text,text,timestamptz,timestamptz,timestamptz,timestamptz,inet,text,int4,xid,xid,text,bool,text,text,int4,text,numeric,text,bool,text,bool,int4,int8,text}',
+  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,backend_type,ssl,sslversion,sslcipher,sslbits,ssl_client_dn,ssl_client_serial,ssl_issuer_dn,gss_auth,gss_princ,gss_enc,leader_pid,query_id,authenticated_id}',
   prosrc => 'pg_stat_get_activity' },
 { oid => '3318',
   descr => 'statistics: information about progress of backends running maintenance command',
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index 0cbcc9c943..d2381f07dd 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -145,13 +145,16 @@ typedef struct PgBackendStatus
 	char	   *st_appname;
 
 	/*
-	 * Current command string; MUST be null-terminated. Note that this string
-	 * possibly is truncated in the middle of a multi-byte character. As
-	 * activity strings are stored more frequently than read, that allows to
-	 * move the cost of correct truncation to the display side. Use
-	 * pgstat_clip_activity() to truncate correctly.
+	 * Current command string and authenticated ID string; MUST be
+	 * null-terminated.  Note that those strings possibly are truncated in
+	 * the middle of a multi-byte character.  As activity strings are
+	 * stored more frequently than read, that allows to move the cost of
+	 * correct truncation to the display side.  Authenticated ID strings
+	 * are stored once at backend startup but the cost is minimal.
+	 * Use pgstat_clip_string() to truncate both of them correctly.
 	 */
 	char	   *st_activity_raw;
+	char	   *st_authn_id;
 
 	/*
 	 * Command progress reporting.  Any command which wishes can advertise
@@ -267,6 +270,7 @@ typedef struct LocalPgBackendStatus
  */
 extern PGDLLIMPORT bool pgstat_track_activities;
 extern PGDLLIMPORT int pgstat_track_activity_query_size;
+extern PGDLLIMPORT int pgstat_track_activity_authn_size;
 
 
 /* ----------
@@ -315,7 +319,7 @@ extern uint64 pgstat_get_my_query_id(void);
 extern int	pgstat_fetch_stat_numbackends(void);
 extern PgBackendStatus *pgstat_fetch_stat_beentry(int beid);
 extern LocalPgBackendStatus *pgstat_fetch_stat_local_beentry(int beid);
-extern char *pgstat_clip_activity(const char *raw_activity);
+extern char *pgstat_clip_string(const char *raw_activity, int max_size);
 
 
 #endif /* BACKEND_STATUS_H */
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 70e578894f..ef251daf17 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -821,6 +821,7 @@ CREATE VIEW pg_stat_activity AS
             S.usesysid,
             U.rolname AS usename,
             S.application_name,
+            S.authenticated_id,
             S.client_addr,
             S.client_hostname,
             S.client_port,
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index a368101103..a0f5fc98fd 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -44,6 +44,7 @@
  */
 bool		pgstat_track_activities = false;
 int			pgstat_track_activity_query_size = 1024;
+int			pgstat_track_activity_authn_size = 128;
 
 
 /* exposed so that progress.c can access it */
@@ -95,7 +96,9 @@ BackendStatusShmemSize(void)
 					mul_size(NAMEDATALEN, NumBackendStatSlots));
 	/* BackendActivityBuffer: */
 	size = add_size(size,
-					mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+					mul_size(pgstat_track_activity_query_size +
+							 pgstat_track_activity_authn_size,
+							 NumBackendStatSlots));
 #ifdef USE_SSL
 	/* BackendSslStatusBuffer: */
 	size = add_size(size,
@@ -171,7 +174,8 @@ CreateSharedBackendStatus(void)
 	}
 
 	/* Create or attach to the shared activity buffer */
-	BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+	BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size +
+										 pgstat_track_activity_authn_size,
 										 NumBackendStatSlots);
 	BackendActivityBuffer = (char *)
 		ShmemInitStruct("Backend Activity Buffer",
@@ -182,12 +186,14 @@ CreateSharedBackendStatus(void)
 	{
 		MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
 
-		/* Initialize st_activity pointers. */
+		/* Initialize st_activity and st_authn_id pointers. */
 		buffer = BackendActivityBuffer;
 		for (i = 0; i < NumBackendStatSlots; i++)
 		{
 			BackendStatusArray[i].st_activity_raw = buffer;
 			buffer += pgstat_track_activity_query_size;
+			BackendStatusArray[i].st_authn_id = buffer;
+			buffer += pgstat_track_activity_authn_size;
 		}
 	}
 
@@ -432,10 +438,22 @@ pgstat_bestart(void)
 	else
 		lbeentry.st_clienthostname[0] = '\0';
 	lbeentry.st_activity_raw[0] = '\0';
+	if (MyProcPort && MyProcPort->authn_id)
+	{
+		/* Compute the length of the field to store */
+		int		len = Min(strlen(MyProcPort->authn_id),
+						  pgstat_track_activity_authn_size - 1);
+
+		memcpy(lbeentry.st_authn_id, MyProcPort->authn_id, len);
+		lbeentry.st_authn_id[len] = '\0';
+	}
+	else
+		lbeentry.st_authn_id[0] = '\0';
 	/* Also make sure the last byte in each string area is always 0 */
 	lbeentry.st_appname[NAMEDATALEN - 1] = '\0';
 	lbeentry.st_clienthostname[NAMEDATALEN - 1] = '\0';
 	lbeentry.st_activity_raw[pgstat_track_activity_query_size - 1] = '\0';
+	lbeentry.st_authn_id[pgstat_track_activity_authn_size - 1] = '\0';
 
 #ifdef USE_SSL
 	memcpy(lbeentry.st_sslstatus, &lsslstatus, sizeof(PgBackendSSLStatus));
@@ -936,7 +954,8 @@ pgstat_get_backend_current_activity(int pid, bool checkUser)
 			else
 			{
 				/* this'll leak a bit of memory, but that seems acceptable */
-				return pgstat_clip_activity(beentry->st_activity_raw);
+				return pgstat_clip_string(beentry->st_activity_raw,
+										  pgstat_track_activity_query_size);
 			}
 		}
 
@@ -1102,7 +1121,7 @@ pgstat_fetch_stat_numbackends(void)
 }
 
 /*
- * Convert a potentially unsafely truncated activity string (see
+ * Convert a potentially unsafely truncated activity or authn ID string (see
  * PgBackendStatus.st_activity_raw's documentation) into a correctly truncated
  * one.
  *
@@ -1110,9 +1129,9 @@ pgstat_fetch_stat_numbackends(void)
  * freed.
  */
 char *
-pgstat_clip_activity(const char *raw_activity)
+pgstat_clip_string(const char *raw_string, int max_size)
 {
-	char	   *activity;
+	char	   *string;
 	int			rawlen;
 	int			cliplen;
 
@@ -1124,10 +1143,10 @@ pgstat_clip_activity(const char *raw_activity)
 	 * underlying buffer is guaranteed to be pgstat_track_activity_query_size
 	 * large.
 	 */
-	activity = pnstrdup(raw_activity, pgstat_track_activity_query_size - 1);
+	string = pnstrdup(raw_string, max_size - 1);
 
 	/* now double-guaranteed to be NUL terminated */
-	rawlen = strlen(activity);
+	rawlen = strlen(string);
 
 	/*
 	 * All supported server-encodings make it possible to determine the length
@@ -1137,10 +1156,10 @@ pgstat_clip_activity(const char *raw_activity)
 	 * even if the string earlier was truncated in the middle of a multi-byte
 	 * character.
 	 */
-	cliplen = pg_mbcliplen(activity, rawlen,
-						   pgstat_track_activity_query_size - 1);
+	cliplen = pg_mbcliplen(string, rawlen,
+						   max_size - 1);
 
-	activity[cliplen] = '\0';
+	string[cliplen] = '\0';
 
-	return activity;
+	return string;
 }
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 87f02d572e..e4215c46c4 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -569,7 +569,7 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_activity(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_ACTIVITY_COLS	30
+#define PG_STAT_GET_ACTIVITY_COLS	31
 	int			num_backends = pgstat_fetch_stat_numbackends();
 	int			curr_backend;
 	int			pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
@@ -703,10 +703,23 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 					break;
 			}
 
-			clipped_activity = pgstat_clip_activity(beentry->st_activity_raw);
+			clipped_activity = pgstat_clip_string(beentry->st_activity_raw,
+												  pgstat_track_activity_query_size);
 			values[5] = CStringGetTextDatum(clipped_activity);
 			pfree(clipped_activity);
 
+			if (beentry->st_authn_id)
+			{
+				char	   *clipped_authn_id;
+
+				clipped_authn_id = pgstat_clip_string(beentry->st_authn_id,
+													  pgstat_track_activity_authn_size);
+				values[30] = CStringGetTextDatum(clipped_authn_id);
+				pfree(clipped_authn_id);
+			}
+			else
+				nulls[30] = true;
+
 			/* leader_pid */
 			nulls[28] = true;
 
@@ -946,6 +959,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 			nulls[27] = true;
 			nulls[28] = true;
 			nulls[29] = true;
+			nulls[30] = true;
 		}
 
 		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
@@ -1026,7 +1040,8 @@ pg_stat_get_backend_activity(PG_FUNCTION_ARGS)
 	else
 		activity = beentry->st_activity_raw;
 
-	clipped_activity = pgstat_clip_activity(activity);
+	clipped_activity = pgstat_clip_string(activity,
+										  pgstat_track_activity_query_size);
 	ret = cstring_to_text(activity);
 	pfree(clipped_activity);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index b130874bdc..def2c21fb8 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3516,6 +3516,17 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"track_activity_authn_size", PGC_POSTMASTER, RESOURCES_MEM,
+			gettext_noop("Sets the size reserved for pg_stat_activity.authenticated_id, in bytes."),
+			NULL,
+			GUC_UNIT_BYTE
+		},
+		&pgstat_track_activity_authn_size,
+		128, 100, 1048576,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"gin_pending_list_limit", PGC_USERSET, CLIENT_CONN_STATEMENT,
 			gettext_noop("Sets the maximum size of the pending list for GIN index."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 0f7f49b949..762a9d67c4 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -602,6 +602,7 @@
 #track_wal_io_timing = off
 #track_functions = none			# none, pl, all
 #track_activity_query_size = 1024	# (change requires restart)
+#track_activity_authn_size = 128	# (change requires restart)
 #stats_temp_directory = 'pg_stat_tmp'
 
 
diff --git a/src/test/ldap/t/001_auth.pl b/src/test/ldap/t/001_auth.pl
index ec4721234b..17edc33776 100644
--- a/src/test/ldap/t/001_auth.pl
+++ b/src/test/ldap/t/001_auth.pl
@@ -6,7 +6,7 @@ use Test::More;
 
 if ($ENV{with_ldap} eq 'yes')
 {
-	plan tests => 28;
+	plan tests => 29;
 }
 else
 {
@@ -199,6 +199,8 @@ $ENV{"PGPASSWORD"} = 'secret1';
 test_access(
 	$node, 'test1', 0,
 	'simple bind authentication succeeds',
+	sql => 'SELECT authenticated_id FROM pg_stat_activity WHERE pid = pg_backend_pid()',
+	expected_stdout => qr/uid=test1,dc=example,dc=net/,
 	log_like => [
 		qr/connection authenticated: identity="uid=test1,dc=example,dc=net" method=ldap/
 	],);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6dff5439e0..0a5ae0bc35 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1750,6 +1750,7 @@ pg_stat_activity| SELECT s.datid,
     s.usesysid,
     u.rolname AS usename,
     s.application_name,
+    s.authenticated_id,
     s.client_addr,
     s.client_hostname,
     s.client_port,
@@ -1765,7 +1766,7 @@ pg_stat_activity| SELECT s.datid,
     s.query_id,
     s.query,
     s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)))
      LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_all_indexes| SELECT c.oid AS relid,
@@ -1877,7 +1878,7 @@ pg_stat_gssapi| SELECT s.pid,
     s.gss_auth AS gss_authenticated,
     s.gss_princ AS principal,
     s.gss_enc AS encrypted
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_prefetch_recovery| SELECT s.stats_reset,
     s.prefetch,
@@ -2058,7 +2059,7 @@ pg_stat_replication| SELECT s.pid,
     w.sync_priority,
     w.sync_state,
     w.reply_time
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id)
      JOIN pg_stat_get_wal_senders() w(pid, state, sent_lsn, write_lsn, flush_lsn, replay_lsn, write_lag, flush_lag, replay_lag, sync_priority, sync_state, reply_time) ON ((s.pid = w.pid)))
      LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_replication_slots| SELECT s.slot_name,
@@ -2090,7 +2091,7 @@ pg_stat_ssl| SELECT s.pid,
     s.ssl_client_dn AS client_dn,
     s.ssl_client_serial AS client_serial,
     s.ssl_issuer_dn AS issuer_dn
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_subscription| SELECT su.oid AS subid,
     su.subname,
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index cc797a5c98..9a6dec7c26 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -17,7 +17,7 @@ if ($ENV{with_ssl} ne 'openssl')
 }
 else
 {
-	plan tests => 110;
+	plan tests => 111;
 }
 
 #### Some configuration
@@ -256,7 +256,6 @@ $node->connect_ok(
 	"host name matching with X.509 Subject Alternative Names 2");
 $node->connect_ok("$common_connstr host=foo.wildcard.pg-ssltest.test",
 	"host name matching with X.509 Subject Alternative Names wildcard");
-
 $node->connect_fails(
 	"$common_connstr host=wronghost.alt-name.pg-ssltest.test",
 	"host name not matching with X.509 Subject Alternative Names",
@@ -446,10 +445,13 @@ $node->connect_ok(
 # same thing but using explicit CN
 $dn_connstr = "$common_connstr dbname=certdb_cn";
 
+# The full DN should still be used as the authenticated identity, within the
+# backend logs and pg_stat_activity.
 $node->connect_ok(
 	"$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt sslkey=ssl/client-dn_tmp.key",
 	"certificate authorization succeeds with CN mapping",
-	# the full DN should still be used as the authenticated identity
+	sql => 'SELECT authenticated_id FROM pg_stat_activity WHERE pid = pg_backend_pid()',
+	expected_stdout => qr/CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG/,
 	log_like => [
 		qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
 	],);
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cf75d913ce..51d040ea9c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7596,6 +7596,24 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-track-activity-authn-size" xreflabel="track_activity_authn_size">
+      <term><varname>track_activity_authn_size</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>track_activity_authn_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+       Specifies the amount of memory reserved to store the text of the
+       currently executing command for each active session, for the
+       <structname>pg_stat_activity</structname>.<structfield>authenticated_id</structfield> field.
+       If this value is specified without units, it is taken as bytes.
+       The default value is 128 bytes.
+       This parameter can only be set at server start.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-track-counts" xreflabel="track_counts">
       <term><varname>track_counts</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 886e626be8..dda95cd860 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -756,6 +756,16 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>authenticated_id</structfield> <type>text</type>
+      </para>
+      <para>
+       Name of the authenticated ID used for this backend login at
+       authentication
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>client_addr</structfield> <type>inet</type>
-- 
2.31.1

#2Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#1)
Re: Addition of authenticated ID to pg_stat_activity

On Mon, Apr 26, 2021 at 11:34:16AM +0900, Michael Paquier wrote:

+++ b/doc/src/sgml/config.sgml
@@ -7596,6 +7596,24 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+     <varlistentry id="guc-track-activity-authn-size" xreflabel="track_activity_authn_size">
+      <term><varname>track_activity_authn_size</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>track_activity_authn_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+       Specifies the amount of memory reserved to store the text of the
+       currently executing command for each active session, for the

That part looks to be a copy+paste error.

+       <structname>pg_stat_activity</structname>.<structfield>authenticated_id</structfield> field.
+       If this value is specified without units, it is taken as bytes.
+       The default value is 128 bytes.
+       This parameter can only be set at server start.
+       </para>
+      </listitem>
+     </varlistentry>

I think many/most things in log/CSV should also go in PSA, and vice versa.

It seems like there should be a comment about this - in both places - to avoid
forgetting it in the future.

--
Justin

#3Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#2)
Re: Addition of authenticated ID to pg_stat_activity

On Sun, Apr 25, 2021 at 10:14:43PM -0500, Justin Pryzby wrote:

That part looks to be a copy+paste error.

Sorry about that. I have fixed that on my own branch.

+       <structname>pg_stat_activity</structname>.<structfield>authenticated_id</structfield> field.
+       If this value is specified without units, it is taken as bytes.
+       The default value is 128 bytes.
+       This parameter can only be set at server start.
+       </para>
+      </listitem>
+     </varlistentry>

I think many/most things in log/CSV should also go in PSA, and vice versa.

It seems like there should be a comment about this - in both places - to avoid
forgetting it in the future.

I am not sure what you mean here, neither do I see in what this is
related to what is proposed on this thread.
--
Michael

#4Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#1)
Re: Addition of authenticated ID to pg_stat_activity

Hi,

On 2021-04-26 11:34:16 +0900, Michael Paquier wrote:

9afffcb has added the concept of authenticated identity to the
information provided in log_connections for audit purposes, with this
data stored in each backend's port. One extra thing that can be
really useful for monitoring is the capability to track this
information directly in pg_stat_activity.

I'm getting a bit worried about the incremental increase in
pg_stat_activity width - it's probably by far the view that's most
viewed interactively. I think we need to be careful not to add too niche
things to it. This is especially true for columns that may be wider.

It'd be bad for discoverability, but perhaps something like this, that's
not that likely to be used interactively, would be better done as a
separate function that would need to be used explicitly?

A select * from pg_stat_activity on a plain installation, connecting
over unix socket, with nothing running, is 411 chars wide for me...

Greetings,

Andres Freund

#5Stephen Frost
sfrost@snowman.net
In reply to: Andres Freund (#4)
Re: Addition of authenticated ID to pg_stat_activity

Greetings,

* Andres Freund (andres@anarazel.de) wrote:

On 2021-04-26 11:34:16 +0900, Michael Paquier wrote:

9afffcb has added the concept of authenticated identity to the
information provided in log_connections for audit purposes, with this
data stored in each backend's port. One extra thing that can be
really useful for monitoring is the capability to track this
information directly in pg_stat_activity.

I'm getting a bit worried about the incremental increase in
pg_stat_activity width - it's probably by far the view that's most
viewed interactively. I think we need to be careful not to add too niche
things to it. This is especially true for columns that may be wider.

It'd be bad for discoverability, but perhaps something like this, that's
not that likely to be used interactively, would be better done as a
separate function that would need to be used explicitly?

I mean.. we already have separate functions and views for this, though
they're auth-method-specific currently, but also provide more details,
since it isn't actually a "one size fits all" kind of thing like this
entire approach is imagining it to be.

Thanks,

Stephen

#6Michael Paquier
michael@paquier.xyz
In reply to: Stephen Frost (#5)
Re: Addition of authenticated ID to pg_stat_activity

On Mon, Apr 26, 2021 at 03:21:46PM -0400, Stephen Frost wrote:

* Andres Freund (andres@anarazel.de) wrote:

I'm getting a bit worried about the incremental increase in
pg_stat_activity width - it's probably by far the view that's most
viewed interactively. I think we need to be careful not to add too niche
things to it. This is especially true for columns that may be wider.

It'd be bad for discoverability, but perhaps something like this, that's
not that likely to be used interactively, would be better done as a
separate function that would need to be used explicitly?

I mean.. we already have separate functions and views for this, though
they're auth-method-specific currently, but also provide more details,
since it isn't actually a "one size fits all" kind of thing like this
entire approach is imagining it to be.

Referring to pg_stat_ssl and pg_stat_gssapi here, right? Yes, that
would be very limited as this leads to no visibility for LDAP, all
password-based authentications and more.

I am wondering if we should take this as an occasion to move some data
out of pg_stat_activity into a separate biew, dedicated to the data
related to the connection that remains set to the same value for the
duration of a backend's life, as of the following set:
- the backend PID
- client_addr
- client_hostname
- client_port
- authenticated ID
- application_name? (well, this one could change on reload, so I am
lying).

It would be tempting to move the database name and the username but
these are popular fields with monitoring. Maybe we could name that
pg_stat_connection_status, pg_stat_auth_status or just
pg_stat_connection?
--
Michael

#7Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#6)
Re: Addition of authenticated ID to pg_stat_activity

On Tue, Apr 27, 2021 at 09:59:18AM +0900, Michael Paquier wrote:

I am wondering if we should take this as an occasion to move some data
out of pg_stat_activity into a separate biew, dedicated to the data
related to the connection that remains set to the same value for the
duration of a backend's life, as of the following set:
- the backend PID

-1. It's already annoying enough to have to type "WHERE pid !=
pg_backend_pid()" to exclude my own backend, and I usually need it quite often.
Unless we add some new view which integrate that, something like
pg_stat_activity_except_me with a better name. I also don't see how we could
join a new dedicated view with the old one without that information.

- application_name? (well, this one could change on reload, so I am
lying).

No, it can change at any time. And the fact that it's not transactional makes
it quite convenient for poor man progress reporting. For instance in powa I
use that to report what the bgworker is currently working on, and this has
already proven to be useful.

#8Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#7)
Re: Addition of authenticated ID to pg_stat_activity

On Tue, Apr 27, 2021 at 09:26:11AM +0800, Julien Rouhaud wrote:

-1. It's already annoying enough to have to type "WHERE pid !=
pg_backend_pid()" to exclude my own backend, and I usually need it quite often.
Unless we add some new view which integrate that, something like
pg_stat_activity_except_me with a better name. I also don't see how we could
join a new dedicated view with the old one without that information.

Err, sorry for the confusion. What I meant here is to also keep the
PID in pg_stat_activity, but also add it to this new view to be able
to join things across the board.
--
Michael

#9Stephen Frost
sfrost@snowman.net
In reply to: Michael Paquier (#6)
Re: Addition of authenticated ID to pg_stat_activity

Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:

On Mon, Apr 26, 2021 at 03:21:46PM -0400, Stephen Frost wrote:

* Andres Freund (andres@anarazel.de) wrote:

I'm getting a bit worried about the incremental increase in
pg_stat_activity width - it's probably by far the view that's most
viewed interactively. I think we need to be careful not to add too niche
things to it. This is especially true for columns that may be wider.

It'd be bad for discoverability, but perhaps something like this, that's
not that likely to be used interactively, would be better done as a
separate function that would need to be used explicitly?

I mean.. we already have separate functions and views for this, though
they're auth-method-specific currently, but also provide more details,
since it isn't actually a "one size fits all" kind of thing like this
entire approach is imagining it to be.

Referring to pg_stat_ssl and pg_stat_gssapi here, right? Yes, that
would be very limited as this leads to no visibility for LDAP, all
password-based authentications and more.

Yes, of course. The point being made was that we could do the same for
the other auth methods rather than adding something to pg_stat_activity.

I am wondering if we should take this as an occasion to move some data
out of pg_stat_activity into a separate biew, dedicated to the data
related to the connection that remains set to the same value for the
duration of a backend's life, as of the following set:
- the backend PID
- client_addr
- client_hostname
- client_port
- authenticated ID
- application_name? (well, this one could change on reload, so I am
lying).

application_name certainly changes, as pointed out elsewhere.

It would be tempting to move the database name and the username but
these are popular fields with monitoring. Maybe we could name that
pg_stat_connection_status, pg_stat_auth_status or just
pg_stat_connection?

I don't know that there's really any of the existing fields that
aren't "popular fields with monitoring".. The issue that Andres brought
up wasn't about monitoring though- it was about users looking
interactively. Monitoring systems can adjust their queries for the new
major version to do whatever joins, et al, they need and that's a
once-per-major-version to do. On the other hand, people doing:

table pg_stat_activity;

Would like to get the info they really want out of that and not anything
else. If we're going to adjust the fields returned from that then
that's really the lens we should use.

So, what fields are people really looking at when querying
pg_stat_activity interactively? User, database, pid, last query,
transaction start, query start, state, wait event info, maybe backend
xmin/xid? I doubt most people looking at pg_stat_activity interactively
actually care about the non-user backends (autovacuum, et al).

Thanks,

Stephen

#10Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#6)
Re: Addition of authenticated ID to pg_stat_activity

On Tue, Apr 27, 2021 at 09:59:18AM +0900, Michael Paquier wrote:

On Mon, Apr 26, 2021 at 03:21:46PM -0400, Stephen Frost wrote:

* Andres Freund (andres@anarazel.de) wrote:

I'm getting a bit worried about the incremental increase in
pg_stat_activity width - it's probably by far the view that's most
viewed interactively. I think we need to be careful not to add too niche
things to it. This is especially true for columns that may be wider.

It'd be bad for discoverability, but perhaps something like this, that's
not that likely to be used interactively, would be better done as a
separate function that would need to be used explicitly?

I mean.. we already have separate functions and views for this, though
they're auth-method-specific currently, but also provide more details,
since it isn't actually a "one size fits all" kind of thing like this
entire approach is imagining it to be.

I am wondering if we should take this as an occasion to move some data
out of pg_stat_activity into a separate biew, dedicated to the data
related to the connection that remains set to the same value for the
duration of a backend's life, as of the following set:
- the backend PID
- client_addr
- client_hostname
- client_port
- authenticated ID
- application_name? (well, this one could change on reload, so I am
lying).

+backend type
+leader_PID

It would be tempting to move the database name and the username but
these are popular fields with monitoring. Maybe we could name that
pg_stat_connection_status, pg_stat_auth_status or just
pg_stat_connection?

Maybe - there could also be a trivial view which JOINs pg_stat_activity and
pg_stat_connection ON (pid).

Technically I think it could also move backend_start/backend_xmin, but it'd be
odd to move them if the other timestamp/xid columns stayed in pg_stat_activity.

There's no reason that pg_stat_connection would *have* to be "static" per
connction, right ? That's just how you're defining what would be included.

Stephen wrote:

Would like to get the info they really want out of that and not anything
else. If we're going to adjust the fields returned from that then
that's really the lens we should use.

So, what fields are people really looking at when querying
pg_stat_activity interactively? User, database, pid, last query,
transaction start, query start, state, wait event info, maybe backend
xmin/xid? I doubt most people looking at pg_stat_activity interactively
actually care about the non-user backends (autovacuum, et al).

I think the narrow/userfacing view would exclude only the OID/XID fields:

datid | oid | | |
usesysid | oid | | |
backend_xid | xid | | |
backend_xmin | xid | | |

I think interactive users *would* care about other backend types - they're
frequently wondering "what's going on?"

TBH, query text is often so long that I have to write left(query,33), and then
the idea of a "userfacing" variant loses its appeal, since it's necessary to
enumerate columns anyway.

--
Justin

#11Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#9)
Re: Addition of authenticated ID to pg_stat_activity

Hi,

On 2021-04-27 12:40:29 -0400, Stephen Frost wrote:

So, what fields are people really looking at when querying
pg_stat_activity interactively? User, database, pid, last query,
transaction start, query start, state, wait event info, maybe backend
xmin/xid? I doubt most people looking at pg_stat_activity interactively
actually care about the non-user backends (autovacuum, et al).

Not representative, but I personally am about as often interested in one
of the non-connection processes as the connection
ones. E.g. investigating what is autovacuum's bottleneck, are
checkpointer / wal writer / bgwriter io bound or keeping up, etc.

Greetings,

Andres Freund

#12Magnus Hagander
magnus@hagander.net
In reply to: Andres Freund (#11)
Re: Addition of authenticated ID to pg_stat_activity

On Tue, Apr 27, 2021 at 8:25 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-04-27 12:40:29 -0400, Stephen Frost wrote:

So, what fields are people really looking at when querying
pg_stat_activity interactively? User, database, pid, last query,
transaction start, query start, state, wait event info, maybe backend
xmin/xid? I doubt most people looking at pg_stat_activity interactively
actually care about the non-user backends (autovacuum, et al).

Not representative, but I personally am about as often interested in one
of the non-connection processes as the connection
ones. E.g. investigating what is autovacuum's bottleneck, are
checkpointer / wal writer / bgwriter io bound or keeping up, etc.

I definitely use it all the time to monitor autovacuum all the time.
The others as well regularly, but autovacuum continuously. I also see
a lot of people doing things like "from pg_stat_activity where query
like '%mytablename%'" where they'd want both any regular queries and
any autovacuums currently processing the table.

I'd say client address is also pretty common to identify which set of
app servers connections are coming in from -- but client port and
client hostname are a lot less interesting. But it'd be kind of weird
to split those out.

For *interactive use* I'd find pretty much all other columns
interesting and commonly used. Probably not that interested in the
oids of the database and user, but again they are the cheap ones. We
could get rid of the joints if we only showed the oids, but in
interactive use it's really the names that are interesting. But if
we're just trying to save column count, I'd say get rid of datid and
usesysid.

I'd hold everything else as interesting.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#13Michael Paquier
michael@paquier.xyz
In reply to: Magnus Hagander (#12)
Re: Addition of authenticated ID to pg_stat_activity

On Thu, Apr 29, 2021 at 04:56:42PM +0200, Magnus Hagander wrote:

I definitely use it all the time to monitor autovacuum all the time.
The others as well regularly, but autovacuum continuously. I also see
a lot of people doing things like "from pg_stat_activity where query
like '%mytablename%'" where they'd want both any regular queries and
any autovacuums currently processing the table.

When it comes to development work, I also look at things different
than backend connections, checkpointer and WAL writer included.

I'd say client address is also pretty common to identify which set of
app servers connections are coming in from -- but client port and
client hostname are a lot less interesting. But it'd be kind of weird
to split those out.

Yes, I agree that it would be confusing to split the client_* fields
across multiple views.

For *interactive use* I'd find pretty much all other columns
interesting and commonly used. Probably not that interested in the
oids of the database and user, but again they are the cheap ones. We
could get rid of the joints if we only showed the oids, but in
interactive use it's really the names that are interesting. But if
we're just trying to save column count, I'd say get rid of datid and
usesysid.

I'd hold everything else as interesting.

Yes, you have an argument here about the removal of usesysid and
datid. Now I find joins involving OIDs to be much more natural than
the object names, because that's the base of what we use in the
catalogs.

Not sure if we would be able to agree on something here, but the
barrier to what a session and a connection hold is thin when it comes
to roles and application_name. Thinking more about that, I would be
really tempted to get to do a more straight split with data that's
associated to a session, to a transaction and to a connection, say:
1) pg_stat_session, data that may change.
- PID
- leader PID
- the role name
- role ID
- application_name
- wait_event_type
- wait_event
2) pg_stat_connection, static data associated to a connection.
- PID
- database name
- database OID
- client_addr
- client_hostname
- client_port
- backend_start
- authn ID
- backend_type
3) pg_stat_transaction, or pg_stat_activity, for the transactional
activity.
- PID
- xact_start
- query_start
- backend_xid
- state_change
- query string
- query ID
- state

Or I could just drop a new function that fetches the authn ID for a
given PID, even if this makes things potentially less consistent when
it comes to the lookup of PgBackendStatus, guarantee given now by
pg_stat_get_activity().
--
Michael

#14Magnus Hagander
magnus@hagander.net
In reply to: Michael Paquier (#13)
Re: Addition of authenticated ID to pg_stat_activity

On Mon, May 17, 2021 at 6:35 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Apr 29, 2021 at 04:56:42PM +0200, Magnus Hagander wrote:

I definitely use it all the time to monitor autovacuum all the time.
The others as well regularly, but autovacuum continuously. I also see
a lot of people doing things like "from pg_stat_activity where query
like '%mytablename%'" where they'd want both any regular queries and
any autovacuums currently processing the table.

When it comes to development work, I also look at things different
than backend connections, checkpointer and WAL writer included.

While I think we should optimize these primarily for users and not
developers, I definitely do those things as well. In particular wait
events for the background processes.

I'd say client address is also pretty common to identify which set of
app servers connections are coming in from -- but client port and
client hostname are a lot less interesting. But it'd be kind of weird
to split those out.

Yes, I agree that it would be confusing to split the client_* fields
across multiple views.

For *interactive use* I'd find pretty much all other columns
interesting and commonly used. Probably not that interested in the
oids of the database and user, but again they are the cheap ones. We
could get rid of the joints if we only showed the oids, but in
interactive use it's really the names that are interesting. But if
we're just trying to save column count, I'd say get rid of datid and
usesysid.

I'd hold everything else as interesting.

Yes, you have an argument here about the removal of usesysid and
datid. Now I find joins involving OIDs to be much more natural than
the object names, because that's the base of what we use in the
catalogs.

Agreed. And I'm not sure the actual gain is that big if we can just
remove oid columns...

Not sure if we would be able to agree on something here, but the
barrier to what a session and a connection hold is thin when it comes
to roles and application_name. Thinking more about that, I would be
really tempted to get to do a more straight split with data that's
associated to a session, to a transaction and to a connection, say:
1) pg_stat_session, data that may change.
- PID
- leader PID
- the role name
- role ID
- application_name
- wait_event_type
- wait_event
2) pg_stat_connection, static data associated to a connection.
- PID
- database name
- database OID
- client_addr
- client_hostname
- client_port
- backend_start
- authn ID
- backend_type
3) pg_stat_transaction, or pg_stat_activity, for the transactional
activity.
- PID
- xact_start
- query_start
- backend_xid
- state_change
- query string
- query ID
- state

This seems extremely user-unfriendly to me.

I mean. Timestamps are nso split out between different views so you
can't track the process iwthout it. And surely wait_event info is
*extremely* related to things like what query is running and what
state the session is in. And putting backend_type off in a separate
view means most queries are going to have to join that in anyway. Or
include it in all views. And if we're forcing the majority of queries
to join multiple views, what have we actually gained?

Based on your list above, I'd definitely want at least (1) and (2) to
be in the same one, but they'd have to also gain at least the database
oid/name and backend_type, and maybe also backend_start.

So basically, it would be moving out client_*, and authn_id. If we're
doing that then as you say maybe pg_stat_connection is a good name and
could then *also* gain the information that's currently in the ssl and
gss views for a net simplification.

tld;dr; I think we have to be really careful here or the cure is going
to be way worse than the disease.

Or I could just drop a new function that fetches the authn ID for a
given PID, even if this makes things potentially less consistent when
it comes to the lookup of PgBackendStatus, guarantee given now by
pg_stat_get_activity().

Well, the authnid will never change so I'm not sure the consistency
part is a big problem? Or maybe I'm misunderstanding the risk you're
referring to?

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#15Michael Paquier
michael@paquier.xyz
In reply to: Magnus Hagander (#14)
Re: Addition of authenticated ID to pg_stat_activity

On Mon, May 17, 2021 at 10:28:49AM +0200, Magnus Hagander wrote:

On Mon, May 17, 2021 at 6:35 AM Michael Paquier <michael@paquier.xyz> wrote:

Not sure if we would be able to agree on something here, but the
barrier to what a session and a connection hold is thin when it comes
to roles and application_name. Thinking more about that, I would be
really tempted to get to do a more straight split with data that's
associated to a session, to a transaction and to a connection, say:
1) pg_stat_session, data that may change.
- PID
- leader PID
- the role name
- role ID
- application_name
- wait_event_type
- wait_event
2) pg_stat_connection, static data associated to a connection.
- PID
- database name
- database OID
- client_addr
- client_hostname
- client_port
- backend_start
- authn ID
- backend_type
3) pg_stat_transaction, or pg_stat_activity, for the transactional
activity.
- PID
- xact_start
- query_start
- backend_xid
- state_change
- query string
- query ID
- state

This seems extremely user-unfriendly to me.

I mean. Timestamps are nso split out between different views so you
can't track the process iwthout it. And surely wait_event info is
*extremely* related to things like what query is running and what
state the session is in. And putting backend_type off in a separate
view means most queries are going to have to join that in anyway. Or
include it in all views. And if we're forcing the majority of queries
to join multiple views, what have we actually gained?

Based on your list above, I'd definitely want at least (1) and (2) to
be in the same one, but they'd have to also gain at least the database
oid/name and backend_type, and maybe also backend_start.

Okay.

So basically, it would be moving out client_*, and authn_id.

So that would mean the addition of one new catalog view, called
pg_stat_connection, with the following fields:
- PID
- all three client_*
- authn ID
I can live with this split. Thoughts from others?

If we're
doing that then as you say maybe pg_stat_connection is a good name and
could then *also* gain the information that's currently in the ssl and
gss views for a net simplification.

I am less enthutiastic about this addition. SSL and GSSAPI have no
fields in common, so that would bloat the view for connection data
with mostly NULL fields most of the time.

tld;dr; I think we have to be really careful here or the cure is going
to be way worse than the disease.

Agreed.

Or I could just drop a new function that fetches the authn ID for a
given PID, even if this makes things potentially less consistent when
it comes to the lookup of PgBackendStatus, guarantee given now by
pg_stat_get_activity().

Well, the authnid will never change so I'm not sure the consistency
part is a big problem? Or maybe I'm misunderstanding the risk you're
referring to?

I just mean to keep the consistency we have now with one single call
of pg_stat_get_activity() for each catalog view, so as we still fetch
once a consistent copy of all PgBackendStatus entries in this code
path.
--
Michael

#16Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#15)
1 attachment(s)
Re: Addition of authenticated ID to pg_stat_activity

On Tue, May 18, 2021 at 11:20:49AM +0900, Michael Paquier wrote:

So that would mean the addition of one new catalog view, called
pg_stat_connection, with the following fields:
- PID
- all three client_*
- authn ID
I can live with this split. Thoughts from others?

Just to make the discussion move on, attached is an updated version
doing that.
--
Michael

Attachments:

v2-0001-Add-authenticated-data-to-pg_stat_activity.patchtext/x-diff; charset=us-asciiDownload
From 749422386ed7fd24219c3bce6a5ecfb693602d89 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 21 May 2021 13:27:05 +0900
Subject: [PATCH v2] Add authenticated data to pg_stat_activity

---
 src/include/catalog/pg_proc.dat               |   6 +-
 src/include/utils/backend_status.h            |  16 ++-
 src/backend/catalog/system_views.sql          |  12 +-
 src/backend/utils/activity/backend_status.c   |  60 +++++----
 src/backend/utils/adt/pgstatfuncs.c           |  21 +++-
 src/backend/utils/misc/guc.c                  |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/test/ldap/t/001_auth.pl                   |   4 +-
 src/test/regress/expected/rules.out           |  17 +--
 src/test/ssl/t/001_ssltests.pl                |   7 +-
 doc/src/sgml/config.sgml                      |  18 +++
 doc/src/sgml/monitoring.sgml                  | 119 +++++++++++++-----
 12 files changed, 212 insertions(+), 80 deletions(-)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index acbcae4607..6c35496554 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5280,9 +5280,9 @@
   proname => 'pg_stat_get_activity', prorows => '100', proisstrict => 'f',
   proretset => 't', provolatile => 's', proparallel => 'r',
   prorettype => 'record', proargtypes => 'int4',
-  proallargtypes => '{int4,oid,int4,oid,text,text,text,text,text,timestamptz,timestamptz,timestamptz,timestamptz,inet,text,int4,xid,xid,text,bool,text,text,int4,text,numeric,text,bool,text,bool,int4,int8}',
-  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,backend_type,ssl,sslversion,sslcipher,sslbits,ssl_client_dn,ssl_client_serial,ssl_issuer_dn,gss_auth,gss_princ,gss_enc,leader_pid,query_id}',
+  proallargtypes => '{int4,oid,int4,oid,text,text,text,text,text,timestamptz,timestamptz,timestamptz,timestamptz,inet,text,int4,xid,xid,text,bool,text,text,int4,text,numeric,text,bool,text,bool,int4,int8,text}',
+  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,backend_type,ssl,sslversion,sslcipher,sslbits,ssl_client_dn,ssl_client_serial,ssl_issuer_dn,gss_auth,gss_princ,gss_enc,leader_pid,query_id,authenticated_id}',
   prosrc => 'pg_stat_get_activity' },
 { oid => '3318',
   descr => 'statistics: information about progress of backends running maintenance command',
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index 8042b817df..5762d201b0 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -145,13 +145,16 @@ typedef struct PgBackendStatus
 	char	   *st_appname;
 
 	/*
-	 * Current command string; MUST be null-terminated. Note that this string
-	 * possibly is truncated in the middle of a multi-byte character. As
-	 * activity strings are stored more frequently than read, that allows to
-	 * move the cost of correct truncation to the display side. Use
-	 * pgstat_clip_activity() to truncate correctly.
+	 * Current command string and authenticated ID string; MUST be
+	 * null-terminated.  Note that those strings possibly are truncated in
+	 * the middle of a multi-byte character.  As activity strings are
+	 * stored more frequently than read, that allows to move the cost of
+	 * correct truncation to the display side.  Authenticated ID strings
+	 * are stored once at backend startup but the cost is minimal.
+	 * Use pgstat_clip_string() to truncate both of them correctly.
 	 */
 	char	   *st_activity_raw;
+	char	   *st_authn_id;
 
 	/*
 	 * Command progress reporting.  Any command which wishes can advertise
@@ -267,6 +270,7 @@ typedef struct LocalPgBackendStatus
  */
 extern PGDLLIMPORT bool pgstat_track_activities;
 extern PGDLLIMPORT int pgstat_track_activity_query_size;
+extern PGDLLIMPORT int pgstat_track_connection_authn_size;
 
 
 /* ----------
@@ -315,7 +319,7 @@ extern uint64 pgstat_get_my_query_id(void);
 extern int	pgstat_fetch_stat_numbackends(void);
 extern PgBackendStatus *pgstat_fetch_stat_beentry(int beid);
 extern LocalPgBackendStatus *pgstat_fetch_stat_local_beentry(int beid);
-extern char *pgstat_clip_activity(const char *raw_activity);
+extern char *pgstat_clip_string(const char *raw_activity, int max_size);
 
 
 #endif							/* BACKEND_STATUS_H */
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5c84d758bb..648cf6e138 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -821,9 +821,6 @@ CREATE VIEW pg_stat_activity AS
             S.usesysid,
             U.rolname AS usename,
             S.application_name,
-            S.client_addr,
-            S.client_hostname,
-            S.client_port,
             S.backend_start,
             S.xact_start,
             S.query_start,
@@ -840,6 +837,15 @@ CREATE VIEW pg_stat_activity AS
         LEFT JOIN pg_database AS D ON (S.datid = D.oid)
         LEFT JOIN pg_authid AS U ON (S.usesysid = U.oid);
 
+CREATE VIEW pg_stat_connection AS
+    SELECT
+            S.pid,
+            S.authenticated_id,
+            S.client_addr,
+            S.client_hostname,
+            S.client_port
+    FROM pg_stat_get_activity(NULL) AS S;
+
 CREATE VIEW pg_stat_replication AS
     SELECT
             S.pid,
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 2901f9f5a9..09252ae0e2 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -44,6 +44,7 @@
  */
 bool		pgstat_track_activities = false;
 int			pgstat_track_activity_query_size = 1024;
+int			pgstat_track_connection_authn_size = 128;
 
 
 /* exposed so that progress.c can access it */
@@ -95,7 +96,9 @@ BackendStatusShmemSize(void)
 					mul_size(NAMEDATALEN, NumBackendStatSlots));
 	/* BackendActivityBuffer: */
 	size = add_size(size,
-					mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+					mul_size(pgstat_track_activity_query_size +
+							 pgstat_track_connection_authn_size,
+							 NumBackendStatSlots));
 #ifdef USE_SSL
 	/* BackendSslStatusBuffer: */
 	size = add_size(size,
@@ -171,7 +174,8 @@ CreateSharedBackendStatus(void)
 	}
 
 	/* Create or attach to the shared activity buffer */
-	BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size,
+	BackendActivityBufferSize = mul_size(pgstat_track_activity_query_size +
+										 pgstat_track_connection_authn_size,
 										 NumBackendStatSlots);
 	BackendActivityBuffer = (char *)
 		ShmemInitStruct("Backend Activity Buffer",
@@ -182,12 +186,14 @@ CreateSharedBackendStatus(void)
 	{
 		MemSet(BackendActivityBuffer, 0, BackendActivityBufferSize);
 
-		/* Initialize st_activity pointers. */
+		/* Initialize st_activity and st_authn_id pointers. */
 		buffer = BackendActivityBuffer;
 		for (i = 0; i < NumBackendStatSlots; i++)
 		{
 			BackendStatusArray[i].st_activity_raw = buffer;
 			buffer += pgstat_track_activity_query_size;
+			BackendStatusArray[i].st_authn_id = buffer;
+			buffer += pgstat_track_connection_authn_size;
 		}
 	}
 
@@ -432,10 +438,22 @@ pgstat_bestart(void)
 	else
 		lbeentry.st_clienthostname[0] = '\0';
 	lbeentry.st_activity_raw[0] = '\0';
+	if (MyProcPort && MyProcPort->authn_id)
+	{
+		/* Compute the length of the field to store */
+		int		len = Min(strlen(MyProcPort->authn_id),
+						  pgstat_track_connection_authn_size - 1);
+
+		memcpy(lbeentry.st_authn_id, MyProcPort->authn_id, len);
+		lbeentry.st_authn_id[len] = '\0';
+	}
+	else
+		lbeentry.st_authn_id[0] = '\0';
 	/* Also make sure the last byte in each string area is always 0 */
 	lbeentry.st_appname[NAMEDATALEN - 1] = '\0';
 	lbeentry.st_clienthostname[NAMEDATALEN - 1] = '\0';
 	lbeentry.st_activity_raw[pgstat_track_activity_query_size - 1] = '\0';
+	lbeentry.st_authn_id[pgstat_track_connection_authn_size - 1] = '\0';
 
 #ifdef USE_SSL
 	memcpy(lbeentry.st_sslstatus, &lsslstatus, sizeof(PgBackendSSLStatus));
@@ -936,7 +954,8 @@ pgstat_get_backend_current_activity(int pid, bool checkUser)
 			else
 			{
 				/* this'll leak a bit of memory, but that seems acceptable */
-				return pgstat_clip_activity(beentry->st_activity_raw);
+				return pgstat_clip_string(beentry->st_activity_raw,
+										  pgstat_track_activity_query_size);
 			}
 		}
 
@@ -1103,7 +1122,7 @@ pgstat_fetch_stat_numbackends(void)
 }
 
 /*
- * Convert a potentially unsafely truncated activity string (see
+ * Convert a potentially unsafely truncated activity or authn ID string (see
  * PgBackendStatus.st_activity_raw's documentation) into a correctly truncated
  * one.
  *
@@ -1111,37 +1130,36 @@ pgstat_fetch_stat_numbackends(void)
  * freed.
  */
 char *
-pgstat_clip_activity(const char *raw_activity)
+pgstat_clip_string(const char *raw_string, int max_size)
 {
-	char	   *activity;
+	char	   *string;
 	int			rawlen;
 	int			cliplen;
 
 	/*
 	 * Some callers, like pgstat_get_backend_current_activity(), do not
-	 * guarantee that the buffer isn't concurrently modified. We try to take
+	 * guarantee that the buffer isn't concurrently modified.  We try to take
 	 * care that the buffer is always terminated by a NUL byte regardless, but
-	 * let's still be paranoid about the string's length. In those cases the
-	 * underlying buffer is guaranteed to be pgstat_track_activity_query_size
-	 * large.
+	 * let's still be paranoid about the string's length.  In those cases the
+	 * underlying buffer is guaranteed to be max_size large.
 	 */
-	activity = pnstrdup(raw_activity, pgstat_track_activity_query_size - 1);
+	string = pnstrdup(raw_string, max_size - 1);
 
 	/* now double-guaranteed to be NUL terminated */
-	rawlen = strlen(activity);
+	rawlen = strlen(string);
 
 	/*
 	 * All supported server-encodings make it possible to determine the length
 	 * of a multi-byte character from its first byte (this is not the case for
-	 * client encodings, see GB18030). As st_activity is always stored using
-	 * server encoding, this allows us to perform multi-byte aware truncation,
-	 * even if the string earlier was truncated in the middle of a multi-byte
-	 * character.
+	 * client encodings, see GB18030).  As st_activity and st_authn_id are
+	 * always stored using server encoding, this allows us to perform
+	 * multi-byte aware truncation, even if the string earlier was truncated
+	 * in the middle of a multi-byte character.
 	 */
-	cliplen = pg_mbcliplen(activity, rawlen,
-						   pgstat_track_activity_query_size - 1);
+	cliplen = pg_mbcliplen(string, rawlen,
+						   max_size - 1);
 
-	activity[cliplen] = '\0';
+	string[cliplen] = '\0';
 
-	return activity;
+	return string;
 }
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 14056f5347..16a92ff089 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -570,7 +570,7 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_activity(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_ACTIVITY_COLS	30
+#define PG_STAT_GET_ACTIVITY_COLS	31
 	int			num_backends = pgstat_fetch_stat_numbackends();
 	int			curr_backend;
 	int			pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
@@ -704,10 +704,23 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 					break;
 			}
 
-			clipped_activity = pgstat_clip_activity(beentry->st_activity_raw);
+			clipped_activity = pgstat_clip_string(beentry->st_activity_raw,
+												  pgstat_track_activity_query_size);
 			values[5] = CStringGetTextDatum(clipped_activity);
 			pfree(clipped_activity);
 
+			if (beentry->st_authn_id)
+			{
+				char	   *clipped_authn_id;
+
+				clipped_authn_id = pgstat_clip_string(beentry->st_authn_id,
+													  pgstat_track_connection_authn_size);
+				values[30] = CStringGetTextDatum(clipped_authn_id);
+				pfree(clipped_authn_id);
+			}
+			else
+				nulls[30] = true;
+
 			/* leader_pid */
 			nulls[28] = true;
 
@@ -947,6 +960,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 			nulls[27] = true;
 			nulls[28] = true;
 			nulls[29] = true;
+			nulls[30] = true;
 		}
 
 		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
@@ -1027,7 +1041,8 @@ pg_stat_get_backend_activity(PG_FUNCTION_ARGS)
 	else
 		activity = beentry->st_activity_raw;
 
-	clipped_activity = pgstat_clip_activity(activity);
+	clipped_activity = pgstat_clip_string(activity,
+										  pgstat_track_activity_query_size);
 	ret = cstring_to_text(activity);
 	pfree(clipped_activity);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ee731044b6..024d6539a6 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3472,6 +3472,17 @@ static struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"track_connection_authn_size", PGC_POSTMASTER, RESOURCES_MEM,
+			gettext_noop("Sets the size reserved for pg_stat_connection.authenticated_id, in bytes."),
+			NULL,
+			GUC_UNIT_BYTE
+		},
+		&pgstat_track_connection_authn_size,
+		128, 100, 1048576,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"gin_pending_list_limit", PGC_USERSET, CLIENT_CONN_STATEMENT,
 			gettext_noop("Sets the maximum size of the pending list for GIN index."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 6e36e4c2ef..d79d7e773c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -595,6 +595,7 @@
 
 #track_activities = on
 #track_activity_query_size = 1024	# (change requires restart)
+#track_connection_authn_size = 128	# (change requires restart)
 #track_counts = on
 #track_io_timing = off
 #track_wal_io_timing = off
diff --git a/src/test/ldap/t/001_auth.pl b/src/test/ldap/t/001_auth.pl
index 0ae14e4c85..87930b25b5 100644
--- a/src/test/ldap/t/001_auth.pl
+++ b/src/test/ldap/t/001_auth.pl
@@ -9,7 +9,7 @@ use Test::More;
 
 if ($ENV{with_ldap} eq 'yes')
 {
-	plan tests => 28;
+	plan tests => 29;
 }
 else
 {
@@ -202,6 +202,8 @@ $ENV{"PGPASSWORD"} = 'secret1';
 test_access(
 	$node, 'test1', 0,
 	'simple bind authentication succeeds',
+	sql => 'SELECT authenticated_id FROM pg_stat_connection WHERE pid = pg_backend_pid()',
+	expected_stdout => qr/uid=test1,dc=example,dc=net/,
 	log_like => [
 		qr/connection authenticated: identity="uid=test1,dc=example,dc=net" method=ldap/
 	],);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e5ab11275d..9eff8604a2 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1750,9 +1750,6 @@ pg_stat_activity| SELECT s.datid,
     s.usesysid,
     u.rolname AS usename,
     s.application_name,
-    s.client_addr,
-    s.client_hostname,
-    s.client_port,
     s.backend_start,
     s.xact_start,
     s.query_start,
@@ -1765,7 +1762,7 @@ pg_stat_activity| SELECT s.datid,
     s.query_id,
     s.query,
     s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)))
      LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_all_indexes| SELECT c.oid AS relid,
@@ -1828,6 +1825,12 @@ pg_stat_bgwriter| SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints
     pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
     pg_stat_get_buf_alloc() AS buffers_alloc,
     pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;
+pg_stat_connection| SELECT s.pid,
+    s.authenticated_id,
+    s.client_addr,
+    s.client_hostname,
+    s.client_port
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id);
 pg_stat_database| SELECT d.oid AS datid,
     d.datname,
         CASE
@@ -1877,7 +1880,7 @@ pg_stat_gssapi| SELECT s.pid,
     s.gss_auth AS gss_authenticated,
     s.gss_princ AS principal,
     s.gss_enc AS encrypted
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_progress_analyze| SELECT s.pid,
     s.datid,
@@ -2047,7 +2050,7 @@ pg_stat_replication| SELECT s.pid,
     w.sync_priority,
     w.sync_state,
     w.reply_time
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id)
      JOIN pg_stat_get_wal_senders() w(pid, state, sent_lsn, write_lsn, flush_lsn, replay_lsn, write_lag, flush_lag, replay_lag, sync_priority, sync_state, reply_time) ON ((s.pid = w.pid)))
      LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_replication_slots| SELECT s.slot_name,
@@ -2081,7 +2084,7 @@ pg_stat_ssl| SELECT s.pid,
     s.ssl_client_dn AS client_dn,
     s.ssl_client_serial AS client_serial,
     s.ssl_issuer_dn AS issuer_dn
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, application_name, state, query, wait_event_type, wait_event, xact_start, query_start, backend_start, state_change, client_addr, client_hostname, client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, sslcipher, sslbits, ssl_client_dn, ssl_client_serial, ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, query_id, authenticated_id)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_subscription| SELECT su.oid AS subid,
     su.subname,
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 44daefb002..03cd45ff71 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -20,7 +20,7 @@ if ($ENV{with_ssl} ne 'openssl')
 }
 else
 {
-	plan tests => 110;
+	plan tests => 111;
 }
 
 #### Some configuration
@@ -449,10 +449,13 @@ $node->connect_ok(
 # same thing but using explicit CN
 $dn_connstr = "$common_connstr dbname=certdb_cn";
 
+# The full DN should still be used as the authenticated identity, within the
+# backend logs and pg_stat_connection.
 $node->connect_ok(
 	"$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt sslkey=ssl/client-dn_tmp.key",
 	"certificate authorization succeeds with CN mapping",
-	# the full DN should still be used as the authenticated identity
+	sql => 'SELECT authenticated_id FROM pg_stat_connection WHERE pid = pg_backend_pid()',
+	expected_stdout => qr/CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG/,
 	log_like => [
 		qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
 	],);
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7e32b0686c..2d0c9bae65 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7513,6 +7513,24 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-track-connection-authn-size" xreflabel="track_connection_authn_size">
+      <term><varname>track_connection_authn_size</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>track_connection_authn_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+       Specifies the amount of memory reserved to store the text of the
+       authenticated identity for each active session, for the
+       <structname>pg_stat_activity</structname>.<structfield>authenticated_id</structfield> field.
+       If this value is specified without units, it is taken as bytes.
+       The default value is 128 bytes.
+       This parameter can only be set at server start.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-track-counts" xreflabel="track_counts">
       <term><varname>track_counts</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dcbb10fb6f..1d21d92d3c 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -749,40 +749,6 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </para></entry>
      </row>
 
-     <row>
-      <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>client_addr</structfield> <type>inet</type>
-      </para>
-      <para>
-       IP address of the client connected to this backend.
-       If this field is null, it indicates either that the client is
-       connected via a Unix socket on the server machine or that this is an
-       internal process such as autovacuum.
-      </para></entry>
-     </row>
-
-     <row>
-      <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>client_hostname</structfield> <type>text</type>
-      </para>
-      <para>
-       Host name of the connected client, as reported by a
-       reverse DNS lookup of <structfield>client_addr</structfield>. This field will
-       only be non-null for IP connections, and only when <xref linkend="guc-log-hostname"/> is enabled.
-      </para></entry>
-     </row>
-
-     <row>
-      <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>client_port</structfield> <type>integer</type>
-      </para>
-      <para>
-       TCP port number that the client is using for communication
-       with this backend, or <literal>-1</literal> if a Unix socket is used.
-       If this field is null, it indicates that this is an internal server process.
-      </para></entry>
-     </row>
-
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>backend_start</structfield> <type>timestamp with time zone</type>
@@ -2261,6 +2227,91 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 
  </sect2>
 
+ <sect2 id="monitoring-pg-stat-connection-view">
+  <title><structname>pg_stat_connection</structname></title>
+
+  <indexterm>
+   <primary>pg_stat_connection</primary>
+  </indexterm>
+
+  <para>
+   The <structname>pg_stat_connection</structname> view will have one row
+   per server process, showing information related to
+   the current connection of that process.
+  </para>
+
+  <table id="pg-stat-connection-view" xreflabel="pg_stat_connection">
+   <title><structname>pg_stat_connection</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>pid</structfield> <type>integer</type>
+      </para>
+      <para>
+       Process ID of this backend
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>authenticated_id</structfield> <type>text</type>
+      </para>
+      <para>
+       Name of the authenticated ID used for this backend login at
+       authentication
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>client_addr</structfield> <type>inet</type>
+      </para>
+      <para>
+       IP address of the client connected to this backend.
+       If this field is null, it indicates either that the client is
+       connected via a Unix socket on the server machine or that this is an
+       internal process such as autovacuum.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>client_hostname</structfield> <type>text</type>
+      </para>
+      <para>
+       Host name of the connected client, as reported by a
+       reverse DNS lookup of <structfield>client_addr</structfield>. This field will
+       only be non-null for IP connections, and only when <xref linkend="guc-log-hostname"/> is enabled.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>client_port</structfield> <type>integer</type>
+      </para>
+      <para>
+       TCP port number that the client is using for communication
+       with this backend, or <literal>-1</literal> if a Unix socket is used.
+       If this field is null, it indicates that this is an internal server process.
+      </para></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  <sect2 id="monitoring-pg-stat-replication-view">
   <title><structname>pg_stat_replication</structname></title>
 
-- 
2.31.1

#17Aleksander Alekseev
aleksander@timescale.com
In reply to: Michael Paquier (#16)
Re: Addition of authenticated ID to pg_stat_activity

Hi Michael,

Just to make the discussion move on, attached is an updated version
doing that.

The code seems OK, but I have mixed feelings about the way that the
VIEW currently works.

Here is what I get when a single user is connected via a UNIX socket:

43204 (master) =# select * from pg_stat_connection;
pid | authenticated_id | client_addr | client_hostname | client_port
-------+------------------+-------------+-----------------+-------------
25806 | | | |
25808 | | | |
43204 | | | | -1
25804 | | | |
25803 | | | |
25805 | | | |
(6 rows)

I bet we could be more user-friendly than this. To begin with, the
documentation says:

+  <para>
+   The <structname>pg_stat_connection</structname> view will have one row
+   per server process, showing information related to
+   the current connection of that process.
+  </para>

[...]

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>client_addr</structfield> <type>inet</type>
+      </para>
+      <para>
+       IP address of the client connected to this backend.
+       If this field is null, it indicates either that the client is
+       connected via a Unix socket on the server machine or that this is an
+       internal process such as autovacuum.
+      </para></entry>
+     </row>

Any reason why we shouldn't simply exclude internal processes from the
view? Do they have a connection that the VIEW could show?

Secondly, maybe instead of magic constants like -1, we could use an
additional text column, e.g. connection_type: "unix"? Thirdly, not
sure if client_hostname is really needed, isn't address:port pair
enough to identify the client? Lastly, introducing a new GUC for
truncating values in a single view, which can only be set at server
start, doesn't strike me as a great idea. What is the worst-case
scenario if instead we will always allocate
`strlen(MyProcPort->authn_id)` and the user will truncate the result
manually if needed?

--
Best regards,
Aleksander Alekseev

#18Michael Paquier
michael@paquier.xyz
In reply to: Aleksander Alekseev (#17)
Re: Addition of authenticated ID to pg_stat_activity

On Mon, Jul 19, 2021 at 04:56:24PM +0300, Aleksander Alekseev wrote:

Any reason why we shouldn't simply exclude internal processes from the
view? Do they have a connection that the VIEW could show?

Yeah, they don't really have any useful information. Still, I kept
that mainly as a matter of consistency with pg_stat_activity, and this
can be useful to find out about internal processes without relying on
an extra check based on pg_stat_activity.backend_type. Perhaps we
could just add a check in system_views.sql based on the NULL-ness of
client_port.

Secondly, maybe instead of magic constants like -1, we could use an
additional text column, e.g. connection_type: "unix"?

I am not really incline to break that more, as told by
pg_stat_get_activity() there are cases where this looks useful:
/*
* Unix sockets always reports NULL for host and -1 for
* port, so it's possible to tell the difference to
* connections we have no permissions to view, or with
* errors.
*/

Thirdly, not
sure if client_hostname is really needed, isn't address:port pair
enough to identify the client?

It seems to me that this is still useful to know about
Port->remote_hostname.

Lastly, introducing a new GUC for
truncating values in a single view, which can only be set at server
start, doesn't strike me as a great idea. What is the worst-case
scenario if instead we will always allocate
`strlen(MyProcPort->authn_id)` and the user will truncate the result
manually if needed?

The authenticated ID could be a SSL DN longer than the default of
128kB that this patch is proposing. I think that it is a good idea to
provide some way to the user to be able to control that without a
recompilation.
--
Michael

#19Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#18)
Re: Addition of authenticated ID to pg_stat_activity

On Wed, Jul 21, 2021 at 01:21:17PM +0900, Michael Paquier wrote:

The authenticated ID could be a SSL DN longer than the default of
128kB that this patch is proposing. I think that it is a good idea to
provide some way to the user to be able to control that without a
recompilation.

I got to think about this patch more for the last couple of days, and
I'd still think that having a GUC to control how much shared memory we
need for the authenticated ID in each BackendStatusArray. Now, the
thread has been idle for two months now, and it does not seem to
attract much attention. This also includes a split of
pg_stat_activity for client_addr, client_hostname and client_port into
a new catalog, which may be hard to justify for this feature. So I am
dropping the patch.
--
Michael