Expose lock group leader pid in pg_stat_activity
Hello,
Guillaume (in Cc) recently pointed out [1] that it's currently not
possible to retrieve the list of parallel workers for a given backend
at the SQL level. His use case was to develop a plpgsql function to
sample a given query's wait events, but it's not hard to imagine other
useful use cases for this information, for instance doing some
analysis on the average number of workers per parallel query, or the
ratio of parallel queries. IIUC parallel queries are for now the only
user of lock groups, so this should work just fine.
I'm attaching a trivial patch to expose the group leader pid if any
in pg_stat_activity. Quick example of usage:
=# SELECT query, leader_pid,
array_agg(pid) filter(WHERE leader_pid != pid) AS members
FROM pg_stat_activity
WHERE leader_pid IS NOT NULL
GROUP BY query, leader_pid;
       query       | leader_pid |    members
-------------------+------------+---------------
 select * from t1; |      28701 | {28728,28732}
(1 row)
[1]: https://twitter.com/g_lelarge/status/1209486212190343168
Attachments:
pgsa_leader_pid-v1.diff (text/x-patch, +17 −4)
On Wed, Dec 25, 2019 at 7:03 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> Guillaume (in Cc) recently pointed out [1] that it's currently not
> possible to retrieve the list of parallel workers for a given backend
> at the SQL level. [...]
> I'm attaching a trivial patch to expose the group leader pid if any
> in pg_stat_activity.
And I just realized that I forgot to update rule.out, sorry about
that. v2 attached.
Attachments:
pgsa_leader_pid-v2.diff (text/x-patch, +22 −8)
On Wed, Dec 25, 2019 at 7:30 PM, Julien Rouhaud <rjuju123@gmail.com> wrote:
> On Wed, Dec 25, 2019 at 7:03 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > I'm attaching a trivial patch to expose the group leader pid if any
> > in pg_stat_activity. [...]
> And I just realized that I forgot to update rule.out, sorry about
> that. v2 attached.
So I tried your patch this morning, and it works really well.
On a SELECT count(*), I got this:
SELECT leader_pid, pid, wait_event_type, wait_event, state, backend_type
FROM pg_stat_activity WHERE pid=111439 or leader_pid=111439;
┌────────────┬────────┬─────────────────┬──────────────┬────────┬─────────────────┐
│ leader_pid │  pid   │ wait_event_type │  wait_event  │ state  │  backend_type   │
├────────────┼────────┼─────────────────┼──────────────┼────────┼─────────────────┤
│     111439 │ 111439 │ LWLock          │ WALWriteLock │ active │ client backend  │
│     111439 │ 116887 │ LWLock          │ WALWriteLock │ active │ parallel worker │
│     111439 │ 116888 │ IO              │ WALSync      │ active │ parallel worker │
└────────────┴────────┴─────────────────┴──────────────┴────────┴─────────────────┘
(3 rows)
and this from a CREATE INDEX:
┌────────────┬────────┬─────────────────┬────────────┬────────┬─────────────────┐
│ leader_pid │  pid   │ wait_event_type │ wait_event │ state  │  backend_type   │
├────────────┼────────┼─────────────────┼────────────┼────────┼─────────────────┤
│     111439 │ 111439 │                 │            │ active │ client backend  │
│     111439 │ 118775 │                 │            │ active │ parallel worker │
└────────────┴────────┴─────────────────┴────────────┴────────┴─────────────────┘
(2 rows)
Anyway, it applies cleanly, it compiles, and it works. Documentation
is available. So it looks to me like it's good to go :)
--
Guillaume.
On Thu, Dec 26, 2019 at 9:08 AM Guillaume Lelarge
<guillaume@lelarge.info> wrote:
> [...]
> Anyway, it applies cleanly, it compiles, and it works. Documentation
> is available. So it looks to me like it's good to go :)
Thanks for the review Guillaume. Double checking the doc, I see that
I made a copy/pasto mistake in the new field name. Attached v3 should
be all good.
Attachments:
pgsa_leader_pid-v3.diff (text/x-patch, +22 −8)
On Thu, Dec 26, 2019 at 9:49 AM, Julien Rouhaud <rjuju123@gmail.com> wrote:
> [...]
> Thanks for the review Guillaume. Double checking the doc, I see that
> I made a copy/pasto mistake in the new field name. Attached v3 should
> be all good.
Feeling bad I missed this. But, yeah, it's much better with the right
column name.
For me, it looks ready for committer. Should I set it this way in the
Commit Fest app?
--
Guillaume.
On Thu, Dec 26, 2019 at 10:20 AM Guillaume Lelarge
<guillaume@lelarge.info> wrote:
> [...]
> Feeling bad I missed this. But, yeah, it's much better with the right
> column name.
> For me, it looks ready for committer. Should I set it this way in the
> Commit Fest app?
If you don't see any other issue with the patch, I'd say yes. A
committer can still put it back to waiting on author if needed.
On Thu, Dec 26, 2019 at 10:26 AM, Julien Rouhaud <rjuju123@gmail.com> wrote:
> > For me, it looks ready for committer. Should I set it this way in
> > the Commit Fest app?
> If you don't see any other issue with the patch, I'd say yes. A
> committer can still put it back to waiting on author if needed.
That's also what I thought, but as I was the only one commenting on this...
Anyway, done.
--
Guillaume.
Hello
I doubt that "Process ID of the lock group leader" is enough for user documentation. I think we need note:
- this field is related to parallel query execution
- leader_pid = pid if process is parallel leader
- leader_pid would point to pid of the leader if process is parallel worker
- leader_pid will be NULL for non-parallel queries or idle sessions
Also patch has no tests. Possible this is normal, not sure how to write a reliable test for this feature.
Patch applies, compiles, pass tests
regards, Sergei
Hello,
On Thu, Dec 26, 2019 at 12:18 PM Sergei Kornilov <sk@zsrv.org> wrote:
I doubt that "Process ID of the lock group leader" is enough for user documentation. I think we need note:
- this field is related to parallel query execution
- leader_pid = pid if process is parallel leader
- leader_pid would point to pid of the leader if process is parallel worker
- leader_pid will be NULL for non-parallel queries or idle sessions
As I understand it, lock groups are infrastructure used by parallel
queries, but they could be used for something else too. So if more
documentation is needed, we should say something like "For now, only
parallel queries can have a lock group", or something like that.
The fact that leader_pid == pid for the leader and differs for the
other members should be obvious; I'm not sure that it's worth
documenting.
> Also, the patch has no tests. Possibly this is normal; I am not sure
> how to write a reliable test for this feature.
Yes, I was unsure whether some extra testing was required. We could
set force_parallel_mode to on and query "select leader_pid is not null
from pg_stat_activity where pid = pg_backend_pid()", and then run the
opposite test, which should do the trick.
Hello
> As I understand it, lock groups are infrastructure used by parallel
> queries, but they could be used for something else too. So if more
> documentation is needed, we should say something like "For now, only
> parallel queries can have a lock group", or something like that.
If lockGroupLeader ends up being used in some way for non-parallel
queries, then the name leader_pid could be confusing, no?
I treat pg_stat_activity as a view for users. Do we document anywhere
what a "lock group leader" is (except the README in the source tree)?
I mean, a user reading the documentation will wonder: "ok, this field
is the process ID of the lock group leader, but what is that?".
Exposing the leader pid for a parallel worker will be a clear
improvement for users, and it seems lockGroupLeader->pid is exactly
that. Therefore, I would like to see such a description and the
meaning of the field documented.
> The fact that leader_pid == pid for the leader and differs for the
> other members should be obvious; I'm not sure that it's worth
> documenting.
It may not be obvious that leader_pid is not NULL in this case. But
OK, no objections.
regards, Sergei
On Fri, Dec 27, 2019 at 10:01 AM Sergei Kornilov <sk@zsrv.org> wrote:
> > So if more documentation is needed, we should say something like
> > "For now, only parallel queries can have a lock group", or something
> > like that.
> If lockGroupLeader ends up being used in some way for non-parallel
> queries, then the name leader_pid could be confusing, no? [...]
> Therefore, I would like to see such a description and the meaning of
> the field documented.
I think that not using "parallel" to name this field will help to
avoid confusion if the lock group infrastructure is eventually used
for something else, but that's only true if indeed we explain what a
lock group is.
> > The fact that leader_pid == pid for the leader and differs for the
> > other members should be obvious; I'm not sure that it's worth
> > documenting.
> It may not be obvious that leader_pid is not NULL in this case. But
> OK, no objections.
If we adapt lmgr/README to document group locking, it also addresses
this. What do you think of:
The leader_pid is NULL for processes not involved in parallel query.
When a process wants to cooperate with parallel workers, it becomes a
lock group leader, which means that this field will be set to its own
pid. When a parallel worker starts up, this field will be set to the
leader's pid.
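To make the three states concrete, here is a hypothetical query
against the patched view (the CASE labels are just illustrative, not
wording from the patch):

SELECT pid, leader_pid,
       CASE WHEN leader_pid IS NULL THEN 'not involved in parallel query'
            WHEN leader_pid = pid   THEN 'lock group leader'
            ELSE 'parallel worker'
       END AS role
FROM pg_stat_activity;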
On Fri, Dec 27, 2019 at 10:15:33AM +0100, Julien Rouhaud wrote:
> I think that not using "parallel" to name this field will help to
> avoid confusion if the lock group infrastructure is eventually used
> for something else, but that's only true if indeed we explain what a
> lock group is.
As you already pointed out, src/backend/storage/lmgr/README includes a
full description of this stuff under the section "Group Locking". So
I agree that the patch ought to document your new field in a much
better way. Avoiding the term "group locking" is even better, so as
not to confuse the reader, because this term is not present in the
docs at all.
> The leader_pid is NULL for processes not involved in parallel query.
> [...]
The first sentence is good to have. Now instead of "lock group
leader", I think that we had better use "parallel group leader" as in
other parts of the docs (see wait events for example). Then we just
need to say that if leader_pid has the same value as
pg_stat_activity.pid, then we have a group leader. If not, then it is
a parallel worker process initially spawned by the leader whose PID is
leader_pid (when executing Gather, Gather Merge, soon-to-be parallel
vacuum or for a parallel btree build, but this does not need a mention
in the docs). There could be an argument as well to have leader_pid
set to NULL for a leader, but that would be inconsistent with what
the PGPROC entry reports for the backend.
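For example, each worker row can be matched back to its leader's
entry with a simple self-join (a hypothetical query, not something
the patch itself needs to include):

SELECT w.pid AS worker_pid, l.pid AS leader_pid, l.query
FROM pg_stat_activity w
JOIN pg_stat_activity l ON w.leader_pid = l.pid
WHERE w.pid <> w.leader_pid;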
While looking at the code, I think that we could refactor things a bit
for raw_wait_event, wait_event_type and wait_event, which have some
duplicated code for backend and auxiliary processes. What about
filling in the wait event data after fetching the PGPROC entry, and
also filling in leader_pid for auxiliary processes? This does not
matter now, and perhaps it never will (or it will), but that would
make the code much more consistent.
--
Michael
On Thu, Jan 16, 2020 at 04:27:27PM +0900, Michael Paquier wrote:
> While looking at the code, I think that we could refactor things a
> bit for raw_wait_event, wait_event_type and wait_event, which have
> some duplicated code for backend and auxiliary processes. [...]
And actually, the way you are looking at the leader's PID is visibly
incorrect and inconsistent, because the patch takes no shared LWLock
on the leader using LockHashPartitionLockByProc() followed by
LWLockAcquire(), no? That's incorrect because it would be perfectly
possible to crash with this code between the moment you check whether
lockGroupLeader is NULL and the moment you look at
lockGroupLeader->pid, if a process being stopped in-between removes
itself from its lock group in ProcKill(). That's also inconsistent
because it would be perfectly possible to finish with an incorrect
view of the data while scanning all the backend entries, like a
leader set to NULL with workers pointing to the leader, for example,
or even workers marked incorrectly as NULL. The second one may not be
a problem, but the first one could be confusing.
--
Michael
On Thu, Jan 16, 2020 at 8:28 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Dec 27, 2019 at 10:15:33AM +0100, Julien Rouhaud wrote:
> > [...]
> The first sentence is good to have. Now instead of "lock group
> leader", I think that we had better use "parallel group leader" as in
> other parts of the docs (see wait events for example).
Ok, I'll change it this way.
> Then we just need to say that if leader_pid has the same value as
> pg_stat_activity.pid, then we have a group leader. [...] There could
> be an argument as well to have leader_pid set to NULL for a leader,
> but that would be inconsistent with what the PGPROC entry reports for
> the backend.
It would also slightly complicate things to get the full set of
backends involved in a parallel query, while excluding the leader is
entirely trivial.
> While looking at the code, I think that we could refactor things a
> bit for raw_wait_event, wait_event_type and wait_event, which have
> some duplicated code for backend and auxiliary processes. [...]
Yeah, I didn't think that auxiliary processes would be involved any
time soon, but I can include this refactoring.
On Thu, Jan 16, 2020 at 8:49 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Thu, Jan 16, 2020 at 04:27:27PM +0900, Michael Paquier wrote:
> > [...]
> And actually, the way you are looking at the leader's PID is visibly
> incorrect and inconsistent, because the patch takes no shared LWLock
> on the leader using LockHashPartitionLockByProc() followed by
> LWLockAcquire(), no? [...]
Oh indeed. But unless we hold some LWLock during the whole function
execution, we cannot guarantee a consistent view, right? And isn't it
already possible to e.g. see a parallel worker in pg_stat_activity
while all other queries are shown as idle, if you're unlucky enough?
Also, LockHashPartitionLockByProc requires the leader PGPROC, and
there's no guarantee that we'll see the leader before any of the
workers, so I'm unsure how to implement what you said. Wouldn't it be
better to simply fetch the leader PGPROC after acquiring a shared
ProcArrayLock, and use that copy to display the pid, after checking
that we retrieved a non-null PGPROC?
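To be concrete, here is a minimal sketch of what I have in mind,
assuming it runs in pg_stat_get_activity() (the placement and the
local variable names are hypothetical, not taken from the patch):

    /* read the group leader's pid while holding ProcArrayLock */
    PGPROC     *proc;
    int         leader_pid = 0;    /* 0 will be reported as SQL NULL */

    LWLockAcquire(ProcArrayLock, LW_SHARED);
    proc = BackendPidGetProcWithLock(beentry->st_procpid);
    if (proc != NULL && proc->lockGroupLeader != NULL)
        leader_pid = proc->lockGroupLeader->pid;
    LWLockRelease(ProcArrayLock);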
On Fri, Jan 17, 2020 at 05:07:55PM +0100, Julien Rouhaud wrote:
> Oh indeed. But unless we hold some LWLock during the whole function
> execution, we cannot guarantee a consistent view, right?
Yep. That's possible.
> And isn't it already possible to e.g. see a parallel worker in
> pg_stat_activity while all other queries are shown as idle, if
> you're unlucky enough?
Yep. That's possible.
> Also, LockHashPartitionLockByProc requires the leader PGPROC, and
> there's no guarantee that we'll see the leader before any of the
> workers, so I'm unsure how to implement what you said. Wouldn't it
> be better to simply fetch the leader PGPROC after acquiring a shared
> ProcArrayLock, and use that copy to display the pid, after checking
> that we retrieved a non-null PGPROC?
Another idea would be to check whether the current PGPROC entry is a
leader and print in an int[] the list of PIDs of all the workers,
while holding a shared LWLock to avoid anything being unregistered.
Less handy, but a bit more consistent. One problem with doing that is
that you may have in your list of PIDs some worker processes that are
already gone by the time you go through all the other backend entries.
An advantage is that an empty array could mean "I am the leader, but
nothing has been registered yet in my lock group" (note that the
leader adds itself to lockGroupMembers).
--
Michael
On Sat, Jan 18, 2020 at 3:51 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Jan 17, 2020 at 05:07:55PM +0100, Julien Rouhaud wrote:
> > [...]
> Another idea would be to check whether the current PGPROC entry is a
> leader and print in an int[] the list of PIDs of all the workers,
> while holding a shared LWLock to avoid anything being unregistered.
> [...]
So, AFAICT LockHashPartitionLockByProc is required when
iterating/modifying lockGroupMembers or lockGroupLink, but just
getting the leader pid should be safe. Since we'll never be able to
get a totally consistent view of the data here, I'm in favor of
avoiding taking extra locks. I agree that outputting an array of the
pids would be more consistent for the leader, but it would have its
own set of corner cases. It seems to me that a new leader_pid column
is easier to handle at the SQL level, so I kept that approach in the
attached v4. If you have strong objections to it, I can still change
it.
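For instance, counting the workers attached to each leader stays a
plain aggregate (a sketch against the patched view):

SELECT leader_pid,
       count(*) FILTER (WHERE pid <> leader_pid) AS n_workers
FROM pg_stat_activity
WHERE leader_pid IS NOT NULL
GROUP BY leader_pid;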
Attachments:
pgsa-leader-pid-v4.diff (text/x-patch, +35 −25)
On Tue, Jan 28, 2020 at 12:36:41PM +0100, Julien Rouhaud wrote:
> [...] It seems to me that a new leader_pid column is easier to
> handle at the SQL level, so I kept that approach in the attached v4.
> If you have strong objections to it, I can still change it.
I agree a separate "leader_id" column is easier to work with, as it does
not require unnesting and so on.
As for the consistency, I agree we probably can't make this perfect, as
we're fetching and processing the PGPROC records one by one. Fixing that
would require acquiring a much stronger lock on PGPROC, and perhaps some
other locks. That's pre-existing behavior, of course, it's just not very
obvious as we don't have any dependencies between the rows, I think.
Adding the leader_id will change, that, of course. But I think it's
still mostly OK, even with the possible inconsistency.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jan 28, 2020 at 2:09 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Adding leader_pid will change that, of course. But I think it's
> still mostly OK, even with the possible inconsistency.
There have already been some dependencies between the rows since
parallel queries were added, as you could see e.g. a parallel worker
while no query is currently active. This patch will make those corner
cases more obvious. Should I document the possible inconsistencies?
On Tue, Jan 28, 2020 at 02:26:34PM +0100, Julien Rouhaud wrote:
> On Tue, Jan 28, 2020 at 2:09 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
> > [...]
> There have already been some dependencies between the rows since
> parallel queries were added, as you could see e.g. a parallel worker
> while no query is currently active. This patch will make those
> corner cases more obvious.
Yeah, sure. I mean explicit dependencies, e.g. a column referencing
values from another row, like leader_pid does.
> Should I document the possible inconsistencies?
I think it's worth mentioning that as a comment in the code, say
before the pg_stat_get_activity function. IMO we don't need to
document all possible inconsistencies; a generic explanation is
enough.
Not sure about the user docs. Do they currently say anything about
this topic, i.e. consistency of the stats views?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services