system views for walsender activity

Started by Takahiro Itagakiover 15 years ago65 messages
#1Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
1 attachment(s)

Hi,

We don't have any statistic views for walsenders in SR's master server
in 9.0, but such views would be useful to monitor and manage standby
servers from the master server. I have two ideas for the solution -
adding a new system view or recycling pg_stat_activity:

1. Add another system view for walsenders, ex. "pg_stat_replication".
It would show pid, remote host, and sent location for each walsender.

2. Make pg_stat_activity to show walsenders. We could use current_query
to display walsender-specific information, like:
=# SELECT * FROM my_stat_activity ;
-[ RECORD 1 ]----+---------------------------------
datid | 16384
<snip>
current_query | SELECT * FROM my_stat_activity ;
-[ RECORD 2 ]----+---------------------------------
datid | 0
datname |
procpid | 4300
usesysid | 10
usename | itagaki
application_name |
client_addr | ::1
client_port | 37710
backend_start | 2010-06-16 16:47:35.646486+09
xact_start |
query_start |
waiting | f
current_query | walsender: sent=0/701AAA8

The attached patch is a WIP codes for the case 2, but it might be
better to design management policy via SQL in 9.1 before such detailed
implementation. Comments welcome.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

Attachments:

walsender_activity-20100618.patchapplication/octet-stream; name=walsender_activity-20100618.patchDownload
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8852326..da9824e 100644
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
*************** CREATE VIEW pg_stat_activity AS 
*** 342,350 ****
              S.query_start,
              S.waiting,
              S.current_query
!     FROM pg_database D, pg_stat_get_activity(NULL) AS S, pg_authid U
!     WHERE S.datid = D.oid AND 
!             S.usesysid = U.oid;
  
  CREATE VIEW pg_stat_database AS 
      SELECT 
--- 342,350 ----
              S.query_start,
              S.waiting,
              S.current_query
!     FROM pg_stat_get_activity(NULL) AS S
!          LEFT JOIN pg_database D ON (S.datid = D.oid), pg_authid U
!     WHERE S.usesysid = U.oid;
  
  CREATE VIEW pg_stat_database AS 
      SELECT 
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index edb5c80..8c2950c 100644
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
*************** pgstat_read_current_status(void)
*** 2558,2564 ****
  				 */
  				strcpy(localappname, (char *) beentry->st_appname);
  				localentry->st_appname = localappname;
! 				strcpy(localactivity, (char *) beentry->st_activity);
  				localentry->st_activity = localactivity;
  			}
  
--- 2558,2567 ----
  				 */
  				strcpy(localappname, (char *) beentry->st_appname);
  				localentry->st_appname = localappname;
! 				if (beentry->st_databaseid == InvalidOid)
! 					WALSndStatus(beentry->st_procpid, localactivity, pgstat_track_activity_query_size);
! 				else
! 					strcpy(localactivity, (char *) beentry->st_activity);
  				localentry->st_activity = localactivity;
  			}
  
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a42e6e8..8e759f9 100644
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
*************** WalSndShmemInit(void)
*** 892,897 ****
--- 892,925 ----
  	}
  }
  
+ /* Format walsender status as a text */
+ void
+ WALSndStatus(pid_t pid, char *buffer, size_t buflen)
+ {
+ 	int			i;
+ 
+ 	for (i = 0; i < max_wal_senders; i++)
+ 	{
+ 		/* use volatile pointer to prevent code rearrangement */
+ 		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+ 		XLogRecPtr	recptr;
+ 
+ 		if (walsnd->pid != pid)
+ 			continue;
+ 
+ 		SpinLockAcquire(&walsnd->mutex);
+ 		recptr = walsnd->sentPtr;
+ 		SpinLockRelease(&walsnd->mutex);
+ 
+ 		snprintf(buffer, buflen, "walsender: sent=%X/%X",
+ 				 recptr.xlogid, recptr.xrecoff);
+ 		return;
+ 	}
+ 
+ 	/* not found */
+ 	buffer[0] = '\0';
+ }
+ 
  /*
   * This isn't currently used for anything. Monitoring tools might be
   * interested in the future, and we'll need something like this in the
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index 874959e..8d48998 100644
*** a/src/include/replication/walsender.h
--- b/src/include/replication/walsender.h
*************** extern int	WalSenderMain(void);
*** 45,49 ****
--- 45,50 ----
  extern void WalSndSignals(void);
  extern Size WalSndShmemSize(void);
  extern void WalSndShmemInit(void);
+ extern void WALSndStatus(pid_t pid, char *buffer, size_t buflen);
  
  #endif   /* _WALSENDER_H */
#2Magnus Hagander
magnus@hagander.net
In reply to: Takahiro Itagaki (#1)
Re: system views for walsender activity

On Fri, Jun 18, 2010 at 04:33, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:

Hi,

We don't have any statistic views for walsenders in SR's master server
in 9.0, but such views would be useful to monitor and manage standby
servers from the master server. I have two ideas for the solution -
adding a new system view or recycling pg_stat_activity:

1. Add another system view for walsenders, ex. "pg_stat_replication".
  It would show pid, remote host, and sent location for each walsender.

2. Make pg_stat_activity to show walsenders. We could use current_query
  to display walsender-specific information, like:
   =# SELECT * FROM my_stat_activity ;
   -[ RECORD 1 ]----+---------------------------------
   datid            | 16384
   <snip>
   current_query    | SELECT * FROM my_stat_activity ;
   -[ RECORD 2 ]----+---------------------------------
   datid            | 0
   datname          |
   procpid          | 4300
   usesysid         | 10
   usename          | itagaki
   application_name |
   client_addr      | ::1
   client_port      | 37710
   backend_start    | 2010-06-16 16:47:35.646486+09
   xact_start       |
   query_start      |
   waiting          | f
   current_query    | walsender: sent=0/701AAA8

The attached patch is a WIP codes for the case 2, but it might be
better to design management policy via SQL in 9.1 before such detailed
implementation.  Comments welcome.

I like version 1 much better. It'll be a lot easier for a management
tool to get the data in proper columns and not have to parse it out of
an arbitrary string field.

The downside is that version 1 will require an initdb, and not version
2, right? In that light, #2 is probably the only one we can do for 9.0
(unless we stumble upon other initdb-forcing changes), so maybe we
should do that one as a temporary measure (and if so, note it both in
the documentation and code that it's a temporary thing).

Wouldn't we also need something similar for the receiving end?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#3Simon Riggs
simon@2ndQuadrant.com
In reply to: Takahiro Itagaki (#1)
Re: system views for walsender activity

On Fri, 2010-06-18 at 11:33 +0900, Takahiro Itagaki wrote:

1. Add another system view for walsenders, ex. "pg_stat_replication".
It would show pid, remote host, and sent location for each walsender.

I prefer this option. I consider it an omission that we should correct.

Not sure I understand why you block up access to the WALSendPtr and then
propose re-accessing it in text form a few days later. Whatever else you
do you should revoke the commit that did that.

--
Simon Riggs www.2ndQuadrant.com

#4Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#3)
Re: system views for walsender activity

On 18/06/10 13:41, Simon Riggs wrote:

On Fri, 2010-06-18 at 11:33 +0900, Takahiro Itagaki wrote:

1. Add another system view for walsenders, ex. "pg_stat_replication".
It would show pid, remote host, and sent location for each walsender.

I prefer this option. I consider it an omission that we should correct.

Not sure I understand why you block up access to the WALSendPtr and then
propose re-accessing it in text form a few days later. Whatever else you
do you should revoke the commit that did that.

Are you referring to the dead extern declaration of
GetOldestWALSendPointer()? That was indeed dead, the function itself was
already #ifdeffed out. If we go with the pg_state_replication view as
suggested above, it would've been useless anyway because it returns only
one value, not one for each walsender.

Let's discuss what the best possible user interface for the information
would be first, and then decide if we need/want to force an initdb for
that. We have pg_upgrade now, that makes initdb less painful, and if
it's just a new view it might be possible to just add a note in the
release notes that you'll have to run the CREATE VIEW manually if upgrading.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#4)
Re: system views for walsender activity

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Let's discuss what the best possible user interface for the information
would be first, and then decide if we need/want to force an initdb for
that. We have pg_upgrade now, that makes initdb less painful, and if
it's just a new view it might be possible to just add a note in the
release notes that you'll have to run the CREATE VIEW manually if upgrading.

The view would presumably depend on a C-language SRF, which isn't there
either.

I'm of the opinion that this is a 9.1 problem. It needs more thought
than we can put into it now --- one obvious question is what about
monitoring on the slave side? Another is who should be able to see the
data?

regards, tom lane

#6Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Magnus Hagander (#2)
Re: system views for walsender activity

Magnus Hagander <magnus@hagander.net> wrote:

The downside is that version 1 will require an initdb, and not version
2, right?

Unfortunately, 2 also requires initdb because pg_stat_activity will
use LEFT JOIN instead of normal JOIN not to hide rows with databaseid = 0.
All of them are items for 9.1.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

#7Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Tom Lane (#5)
Re: system views for walsender activity

Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm of the opinion that this is a 9.1 problem. It needs more thought
than we can put into it now --- one obvious question is what about
monitoring on the slave side? Another is who should be able to see the
data?

Sure. We should research user's demands for monitoring and management
of replication. I'll report some voices from users as of this moment:

* Managers often ask DBAs "How long standby servers are behind the master?"
We should provide such methods for DBAs. We have pg_xlog_location()
functions, but they should be improved for:
- The returned values are "xxx/yyy" texts, but more useful information
is the difference of two values. Subtraction functions are required.
- For easier management, the master server should provide not only
sent/flush locations but also received/replayed locations for each
standby servers. Users don't want to access both master and slaves.

* Some developers want to pause and restart replication from the master
server. They're going to use replication for application version
managements. They'll pause all replications, and test their new features
at the master, and restart replication to spread the changes to slaves.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

#8Guillaume Lelarge
guillaume@lelarge.info
In reply to: Takahiro Itagaki (#7)
Re: system views for walsender activity

Le 22/06/2010 06:40, Takahiro Itagaki a �crit :

[...]
Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm of the opinion that this is a 9.1 problem. It needs more thought
than we can put into it now --- one obvious question is what about
monitoring on the slave side? Another is who should be able to see the
data?

Sure. We should research user's demands for monitoring and management
of replication. I'll report some voices from users as of this moment:

* Managers often ask DBAs "How long standby servers are behind the master?"
We should provide such methods for DBAs. We have pg_xlog_location()
functions, but they should be improved for:
- The returned values are "xxx/yyy" texts, but more useful information
is the difference of two values. Subtraction functions are required.
- For easier management, the master server should provide not only
sent/flush locations but also received/replayed locations for each
standby servers. Users don't want to access both master and slaves.

* Some developers want to pause and restart replication from the master
server. They're going to use replication for application version
managements. They'll pause all replications, and test their new features
at the master, and restart replication to spread the changes to slaves.

I agree on these two.

Something I found lacking when I added support for Hot Standby /
Streaming Replication in pgAdmin (that was a really small patch, there
was not a lot to do) was that one cannot get the actual value of each
recovery.conf parameter. Try a "SHOW primary_conninfo;" and it will
juste reply that primary_conninfo is an unknown parameter. I already
talked about this to Heikki, but didn't get a chance to actually look at
the code.

--
Guillaume
http://www.postgresql.fr
http://dalibo.com

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Guillaume Lelarge (#8)
Re: system views for walsender activity

On Tue, 2010-06-22 at 09:54 +0200, Guillaume Lelarge wrote:

I added support for Hot Standby /
Streaming Replication in pgAdmin (that was a really small patch, there
was not a lot to do)

Well done.

Does this mean that pgAdmin has a read only mode now?

What are the details of that support? I couldn't easily see the commits
in the pgadmin list.

--
Simon Riggs www.2ndQuadrant.com

#10Guillaume Lelarge
guillaume@lelarge.info
In reply to: Simon Riggs (#9)
Re: system views for walsender activity

Le 22/06/2010 11:41, Simon Riggs a �crit :

On Tue, 2010-06-22 at 09:54 +0200, Guillaume Lelarge wrote:

I added support for Hot Standby /
Streaming Replication in pgAdmin (that was a really small patch, there
was not a lot to do)

Well done.

Does this mean that pgAdmin has a read only mode now?

Nope, it does not really have one. Though I intend to work on having
pgAdmin more aware of the actual rights of the connected user (allowing
him to get to display the "create table" dialog when we should already
know he cannot is an issue, at least to me).

What are the details of that support? I couldn't easily see the commits
in the pgadmin list.

Shamely simple : I only added some informations on the server's
properties. See
http://www.pgadmin.org/images/visualtour12/visualtour08.jpg. We only
display the fact that the server is (or isn't) in recovery, and the
result of the two admin functions (receive and replay location). Too bad
the other admin functions aren't there, I could have used them (and hope
to do so in 9.1). Too bad also we cannot know the primary server from a
connection to the slave (that's why I would love to get the value of
primary_conninfo, to found the alias/IP of the primary server).

--
Guillaume
http://www.postgresql.fr
http://dalibo.com

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Guillaume Lelarge (#10)
Re: system views for walsender activity

On Tue, 2010-06-22 at 12:19 +0200, Guillaume Lelarge wrote:

Shamely simple : I only added some informations on the server's
properties. See
http://www.pgadmin.org/images/visualtour12/visualtour08.jpg. We only
display the fact that the server is (or isn't) in recovery, and the
result of the two admin functions (receive and replay location).

If you store the is-in-Recovery result you could set the .enabled
property of many of the dialog boxes. I think its going to be painful
for people to attempt to submit a DDL command and get an error.

Too bad the other admin functions aren't there, I could have used them
(and hope to do so in 9.1). Too bad also we cannot know the primary
server from a connection to the slave (that's why I would love to get
the value of
primary_conninfo, to found the alias/IP of the primary server).

Agreed

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

#12Guillaume Lelarge
guillaume@lelarge.info
In reply to: Simon Riggs (#11)
Re: system views for walsender activity

Le 22/06/2010 12:42, Simon Riggs a �crit :

On Tue, 2010-06-22 at 12:19 +0200, Guillaume Lelarge wrote:

Shamely simple : I only added some informations on the server's
properties. See
http://www.pgadmin.org/images/visualtour12/visualtour08.jpg. We only
display the fact that the server is (or isn't) in recovery, and the
result of the two admin functions (receive and replay location).

If you store the is-in-Recovery result you could set the .enabled
property of many of the dialog boxes. I think its going to be painful
for people to attempt to submit a DDL command and get an error.

That's what I first thought. But it would be weird that we disabled all
the OK button of the dialog properties only for hotstandby servers, but
not when a user doesn't have the permission. At least, that was the
reasonning I had at the time.

Too bad the other admin functions aren't there, I could have used them
(and hope to do so in 9.1). Too bad also we cannot know the primary
server from a connection to the slave (that's why I would love to get
the value of
primary_conninfo, to found the alias/IP of the primary server).

Agreed

:)

--
Guillaume
http://www.postgresql.fr
http://dalibo.com

#13Magnus Hagander
magnus@hagander.net
In reply to: Takahiro Itagaki (#6)
Re: system views for walsender activity

On Tue, Jun 22, 2010 at 06:18, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:

Magnus Hagander <magnus@hagander.net> wrote:

The downside is that version 1 will require an initdb, and not version
2, right?

Unfortunately, 2 also requires initdb because pg_stat_activity will
use LEFT JOIN instead of normal JOIN not to hide rows with databaseid = 0.
All of them are items for 9.1.

Did this one end up on the floor?

We definitely need the very basic level for 9.1, and we can always
improve on it later :-) Do you want to keep working on it, or do you
want me to pick it up?

Any of the suggestions that includes the master showing data from the
slaves requires some kind of feedback going in from the slave, making
things a lot more complex (the slave is no longer passive) - let's
leave those for now. (you can use the 2ndquadrant replmgr to get some
of that :P)

I'm not sure it makes much sense to add walsenders to pg_stat_activity
- a lot of the fields would no longer make any sense (statement start?
query start?) - I think we're better off with a separate view for
pg_stat_walsender. It would then only need the columns for procpid,
usesysid, usename, client_addr, client_port, and the WALsender
specific fields.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#14Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Magnus Hagander (#13)
Re: system views for walsender activity

On Tue, Dec 28, 2010 at 21:46, Magnus Hagander <magnus@hagander.net> wrote:

Unfortunately, 2 also requires initdb because pg_stat_activity will
use LEFT JOIN instead of normal JOIN not to hide rows with databaseid = 0.
All of them are items for 9.1.

Did this one end up on the floor?

We definitely need the very basic level for 9.1, and we can always
improve on it later :-) Do you want to keep working on it, or do you
want me to pick it up?

OK, I'll work for it.

I'm not sure it makes much sense to add walsenders to pg_stat_activity
- a lot of the fields would no longer make any sense (statement start?
query start?) - I think we're better off with a separate view for
pg_stat_walsender. It would then only need the columns for procpid,
usesysid, usename, client_addr, client_port, and the WALsender
specific fields.

+1 for the separate view. "backend_start" (or replication_start?)
might be also reasonable.

--
Itagaki Takahiro

#15Magnus Hagander
magnus@hagander.net
In reply to: Itagaki Takahiro (#14)
Re: system views for walsender activity

On Tue, Dec 28, 2010 at 14:14, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

On Tue, Dec 28, 2010 at 21:46, Magnus Hagander <magnus@hagander.net> wrote:

Unfortunately, 2 also requires initdb because pg_stat_activity will
use LEFT JOIN instead of normal JOIN not to hide rows with databaseid = 0.
All of them are items for 9.1.

Did this one end up on the floor?

We definitely need the very basic level for 9.1, and we can always
improve on it later :-) Do you want to keep working on it, or do you
want me to pick it up?

OK, I'll work for it.

Great.

I'm not sure it makes much sense to add walsenders to pg_stat_activity
- a lot of the fields would no longer make any sense (statement start?
query start?) - I think we're better off with a separate view for
pg_stat_walsender. It would then only need the columns for procpid,
usesysid, usename, client_addr, client_port, and the WALsender
specific fields.

+1 for the separate view. "backend_start" (or replication_start?)
might be also reasonable.

Yeah, agreed. backend_start is probably the best one -
replication_start may be considered conceptually different if the
connection was dropped and reconnected. backend_start is more
explicit.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#16Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Magnus Hagander (#15)
2 attachment(s)
Re: system views for walsender activity

On Tue, Dec 28, 2010 at 22:17, Magnus Hagander <magnus@hagander.net> wrote:

We definitely need the very basic level for 9.1, and we can always
improve on it later :-)

pg_stat_walsender. It would then only need the columns for procpid,
usesysid, usename, client_addr, client_port, and the WALsender
specific fields.

Yeah, agreed. backend_start is probably the best one

Here are patches for pg_stat_walsender.
I split the feature into two pieces:

* get_host_and_port.patch
It separates host and port formatter as a subroutine from pg_stat_activity.
In addition, make pg_stat_get_backend_client_addr/port() functions to
use the subroutine to reduce duplicated codes.

* pg_stat_walsender.patch
It adds pg_stat_walsender system view. It has subset columns of
pg_stat_activity and only one additional WAL sender specific information
via WALSndStatus(). I named the column "sending location" because
standby servers might not have received the WAL record; if we had
synchronous replication, a new "sent location" wold be added.
But the naming is still an open question. Comments welcome.

There is O(max_wal_senders^2) complexity in the view, But I think
it is not so serious problem because we can expect max_wal_senders
is 10 or so at most.

CREATE VIEW pg_stat_walsender AS
SELECT
S.procpid,
S.usesysid,
U.rolname AS usename,
S.client_addr,
S.client_port,
S.backend_start,
S.xlog_sending
FROM pg_stat_get_walsender(NULL) AS S, pg_authid U
WHERE S.usesysid = U.oid;

--
Itagaki Takahiro

Attachments:

get_host_and_port-20110104.patchapplication/octet-stream; name=get_host_and_port-20110104.patchDownload
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index a95ba8b..31df9ee 100644
*** a/src/backend/utils/adt/pgstatfuncs.c
--- b/src/backend/utils/adt/pgstatfuncs.c
*************** extern Datum pg_stat_reset_shared(PG_FUN
*** 107,112 ****
--- 107,114 ----
  extern Datum pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS);
  extern Datum pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS);
  
+ static void get_host_and_port(const SockAddr *addr, inet **host, int *port);
+ 
  /* Global bgwriter statistics, from bgwriter.c */
  extern PgStat_MsgBgWriter bgwriterStats;
  
*************** pg_stat_get_activity(PG_FUNCTION_ARGS)
*** 571,577 ****
  		bool		nulls[11];
  		HeapTuple	tuple;
  		PgBackendStatus *beentry;
- 		SockAddr	zero_clientaddr;
  
  		MemSet(values, 0, sizeof(values));
  		MemSet(nulls, 0, sizeof(nulls));
--- 573,578 ----
*************** pg_stat_get_activity(PG_FUNCTION_ARGS)
*** 612,617 ****
--- 613,621 ----
  		/* Values only available to same user or superuser */
  		if (superuser() || beentry->st_userid == GetUserId())
  		{
+ 			inet	   *client_addr;
+ 			int			client_port;
+ 
  			if (*(beentry->st_activity) == '\0')
  			{
  				values[4] = CStringGetTextDatum("<command string not enabled>");
*************** pg_stat_get_activity(PG_FUNCTION_ARGS)
*** 638,701 ****
  			else
  				nulls[8] = true;
  
! 			/* A zeroed client addr means we don't know */
! 			memset(&zero_clientaddr, 0, sizeof(zero_clientaddr));
! 			if (memcmp(&(beentry->st_clientaddr), &zero_clientaddr,
! 					   sizeof(zero_clientaddr) == 0))
! 			{
  				nulls[9] = true;
! 				nulls[10] = true;
! 			}
  			else
! 			{
! 				if (beentry->st_clientaddr.addr.ss_family == AF_INET
! #ifdef HAVE_IPV6
! 					|| beentry->st_clientaddr.addr.ss_family == AF_INET6
! #endif
! 					)
! 				{
! 					char		remote_host[NI_MAXHOST];
! 					char		remote_port[NI_MAXSERV];
! 					int			ret;
! 
! 					remote_host[0] = '\0';
! 					remote_port[0] = '\0';
! 					ret = pg_getnameinfo_all(&beentry->st_clientaddr.addr,
! 											 beentry->st_clientaddr.salen,
! 											 remote_host, sizeof(remote_host),
! 											 remote_port, sizeof(remote_port),
! 											 NI_NUMERICHOST | NI_NUMERICSERV);
! 					if (ret)
! 					{
! 						nulls[9] = true;
! 						nulls[10] = true;
! 					}
! 					else
! 					{
! 						clean_ipv6_addr(beentry->st_clientaddr.addr.ss_family, remote_host);
! 						values[9] = DirectFunctionCall1(inet_in,
! 											   CStringGetDatum(remote_host));
! 						values[10] = Int32GetDatum(atoi(remote_port));
! 					}
! 				}
! 				else if (beentry->st_clientaddr.addr.ss_family == AF_UNIX)
! 				{
! 					/*
! 					 * Unix sockets always reports NULL for host and -1 for
! 					 * port, so it's possible to tell the difference to
! 					 * connections we have no permissions to view, or with
! 					 * errors.
! 					 */
! 					nulls[9] = true;
! 					values[10] = DatumGetInt32(-1);
! 				}
! 				else
! 				{
! 					/* Unknown address type, should never happen */
! 					nulls[9] = true;
! 					nulls[10] = true;
! 				}
! 			}
  		}
  		else
  		{
--- 642,656 ----
  			else
  				nulls[8] = true;
  
! 			get_host_and_port(&beentry->st_clientaddr, &client_addr, &client_port);
! 			if (client_addr != NULL)
! 				values[9] = PointerGetDatum(client_addr);
! 			else
  				nulls[9] = true;
! 			if (client_port != 0)
! 				values[10] = DatumGetInt32(client_port);
  			else
! 				nulls[10] = true;
  		}
  		else
  		{
*************** pg_stat_get_backend_client_addr(PG_FUNCT
*** 881,889 ****
  {
  	int32		beid = PG_GETARG_INT32(0);
  	PgBackendStatus *beentry;
! 	SockAddr	zero_clientaddr;
! 	char		remote_host[NI_MAXHOST];
! 	int			ret;
  
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
--- 836,842 ----
  {
  	int32		beid = PG_GETARG_INT32(0);
  	PgBackendStatus *beentry;
! 	inet	   *client_addr;
  
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
*************** pg_stat_get_backend_client_addr(PG_FUNCT
*** 891,926 ****
  	if (!superuser() && beentry->st_userid != GetUserId())
  		PG_RETURN_NULL();
  
! 	/* A zeroed client addr means we don't know */
! 	memset(&zero_clientaddr, 0, sizeof(zero_clientaddr));
! 	if (memcmp(&(beentry->st_clientaddr), &zero_clientaddr,
! 			   sizeof(zero_clientaddr) == 0))
! 		PG_RETURN_NULL();
! 
! 	switch (beentry->st_clientaddr.addr.ss_family)
! 	{
! 		case AF_INET:
! #ifdef HAVE_IPV6
! 		case AF_INET6:
! #endif
! 			break;
! 		default:
! 			PG_RETURN_NULL();
! 	}
! 
! 	remote_host[0] = '\0';
! 	ret = pg_getnameinfo_all(&beentry->st_clientaddr.addr,
! 							 beentry->st_clientaddr.salen,
! 							 remote_host, sizeof(remote_host),
! 							 NULL, 0,
! 							 NI_NUMERICHOST | NI_NUMERICSERV);
! 	if (ret)
  		PG_RETURN_NULL();
- 
- 	clean_ipv6_addr(beentry->st_clientaddr.addr.ss_family, remote_host);
- 
- 	PG_RETURN_INET_P(DirectFunctionCall1(inet_in,
- 										 CStringGetDatum(remote_host)));
  }
  
  Datum
--- 844,854 ----
  	if (!superuser() && beentry->st_userid != GetUserId())
  		PG_RETURN_NULL();
  
! 	get_host_and_port(&beentry->st_clientaddr, &client_addr, NULL);
! 	if (client_addr)
! 		PG_RETURN_INET_P(client_addr);
! 	else
  		PG_RETURN_NULL();
  }
  
  Datum
*************** pg_stat_get_backend_client_port(PG_FUNCT
*** 928,936 ****
  {
  	int32		beid = PG_GETARG_INT32(0);
  	PgBackendStatus *beentry;
! 	SockAddr	zero_clientaddr;
! 	char		remote_port[NI_MAXSERV];
! 	int			ret;
  
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
--- 856,862 ----
  {
  	int32		beid = PG_GETARG_INT32(0);
  	PgBackendStatus *beentry;
! 	int			client_port;
  
  	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
  		PG_RETURN_NULL();
*************** pg_stat_get_backend_client_port(PG_FUNCT
*** 938,973 ****
  	if (!superuser() && beentry->st_userid != GetUserId())
  		PG_RETURN_NULL();
  
! 	/* A zeroed client addr means we don't know */
! 	memset(&zero_clientaddr, 0, sizeof(zero_clientaddr));
! 	if (memcmp(&(beentry->st_clientaddr), &zero_clientaddr,
! 			   sizeof(zero_clientaddr) == 0))
! 		PG_RETURN_NULL();
! 
! 	switch (beentry->st_clientaddr.addr.ss_family)
! 	{
! 		case AF_INET:
! #ifdef HAVE_IPV6
! 		case AF_INET6:
! #endif
! 			break;
! 		case AF_UNIX:
! 			PG_RETURN_INT32(-1);
! 		default:
! 			PG_RETURN_NULL();
! 	}
! 
! 	remote_port[0] = '\0';
! 	ret = pg_getnameinfo_all(&beentry->st_clientaddr.addr,
! 							 beentry->st_clientaddr.salen,
! 							 NULL, 0,
! 							 remote_port, sizeof(remote_port),
! 							 NI_NUMERICHOST | NI_NUMERICSERV);
! 	if (ret)
  		PG_RETURN_NULL();
- 
- 	PG_RETURN_DATUM(DirectFunctionCall1(int4in,
- 										CStringGetDatum(remote_port)));
  }
  
  
--- 864,874 ----
  	if (!superuser() && beentry->st_userid != GetUserId())
  		PG_RETURN_NULL();
  
! 	get_host_and_port(&beentry->st_clientaddr, NULL, &client_port);
! 	if (client_port != 0)
! 		PG_RETURN_INT32(client_port);
! 	else
  		PG_RETURN_NULL();
  }
  
  
*************** pg_stat_reset_single_function_counters(P
*** 1515,1517 ****
--- 1416,1485 ----
  
  	PG_RETURN_VOID();
  }
+ 
+ /*
+  * Get host and port in the SockAddr. Out parameters can be NULLs.
+  */
+ static void
+ get_host_and_port(const SockAddr *addr, inet **host, int *port)
+ {
+ 	SockAddr	zero_clientaddr;
+ 
+ 	if (host)
+ 		*host = NULL;
+ 	if (port)
+ 		*port = 0;
+ 
+ 	/* A zeroed client addr means we don't know */
+ 	memset(&zero_clientaddr, 0, sizeof(zero_clientaddr));
+ 	if (memcmp(addr, &zero_clientaddr, sizeof(zero_clientaddr)) == 0)
+ 		return;
+ 
+ 	if (addr->addr.ss_family == AF_INET
+ 	#ifdef HAVE_IPV6
+ 		|| addr->addr.ss_family == AF_INET6
+ 	#endif
+ 		)
+ 	{
+ 		char		remote_host[NI_MAXHOST];
+ 		char		remote_port[NI_MAXSERV];
+ 		int			ret;
+ 
+ 		if (host)
+ 			remote_host[0] = '\0';
+ 		if (port)
+ 			remote_port[0] = '\0';
+ 		ret = pg_getnameinfo_all(&addr->addr,
+ 								 addr->salen,
+ 								 host ? remote_host : NULL,
+ 								 host ? sizeof(remote_host) : 0,
+ 								 port ? remote_port : NULL,
+ 								 port ? sizeof(remote_port) : 0,
+ 								 NI_NUMERICHOST | NI_NUMERICSERV);
+ 		if (ret == 0)
+ 		{
+ 			if (host)
+ 			{
+ 				clean_ipv6_addr(addr->addr.ss_family, remote_host);
+ 				*host = DatumGetInetP(DirectFunctionCall1(inet_in,
+ 										CStringGetDatum(remote_host)));
+ 			}
+ 			if (port)
+ 				*port = atoi(remote_port);
+ 		}
+ 	}
+ 	else if (addr->addr.ss_family == AF_UNIX)
+ 	{
+ 		/*
+ 		 * Unix sockets always reports NULL for host and -1 for port,
+ 		 * so it's possible to tell the difference to connections we
+ 		 * have no permissions to view, or with errors.
+ 		 */
+ 		if (port)
+ 			*port = -1;
+ 	}
+ 	else
+ 	{
+ 		/* Unknown address type, should never happen */
+ 	}
+ }
pg_stat_walsender-20110104.patchapplication/octet-stream; name=pg_stat_walsender-20110104.patchDownload
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 262b372..3e758f6 100644
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
*************** CREATE VIEW pg_stat_activity AS
*** 492,497 ****
--- 492,509 ----
      WHERE S.datid = D.oid AND
              S.usesysid = U.oid;
  
+ CREATE VIEW pg_stat_walsender AS
+     SELECT
+             S.procpid,
+             S.usesysid,
+             U.rolname AS usename,
+             S.client_addr,
+             S.client_port,
+             S.backend_start,
+             S.xlog_sending
+     FROM pg_stat_get_walsender(NULL) AS S, pg_authid U
+     WHERE S.usesysid = U.oid;
+ 
  CREATE VIEW pg_stat_database AS
      SELECT
              D.oid AS datid,
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 269d200..62422d4 100644
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
*************** WalSndWakeup(void)
*** 943,948 ****
--- 943,979 ----
  }
  
  /*
+  * Retrieve status of a WAL sender. Return true if a WAL sender found.
+  */
+ bool
+ WALSndStatus(pid_t pid, XLogRecPtr *sending)
+ {
+ 	int			i;
+ 
+ 	Assert(sending != NULL);
+ 
+ 	for (i = 0; i < max_wal_senders; i++)
+ 	{
+ 		/* use volatile pointer to prevent code rearrangement */
+ 		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+ 
+ 		if (walsnd->pid == pid)
+ 		{
+ 			SpinLockAcquire(&walsnd->mutex);
+ 			*sending = walsnd->sentPtr;
+ 			SpinLockRelease(&walsnd->mutex);
+ 			elog(INFO, "WALSndStatus %X/%X", sending->xlogid, sending->xrecoff);
+ 			return true;
+ 		}
+ 	}
+ 
+ 	/* not found */
+ 	elog(INFO, "WALSndStatus NOT FOUND: %d", pid);
+ 	memset(sending, 0, sizeof(XLogRecPtr));
+ 	return false;
+ }
+ 
+ /*
   * This isn't currently used for anything. Monitoring tools might be
   * interested in the future, and we'll need something like this in the
   * future for synchronous replication.
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 31df9ee..05826a8 100644
*** a/src/backend/utils/adt/pgstatfuncs.c
--- b/src/backend/utils/adt/pgstatfuncs.c
***************
*** 18,23 ****
--- 18,24 ----
  #include "miscadmin.h"
  #include "pgstat.h"
  #include "catalog/pg_type.h"
+ #include "replication/walsender.h"
  #include "utils/builtins.h"
  #include "utils/inet.h"
  #include "libpq/ip.h"
*************** extern Datum pg_stat_get_function_self_t
*** 49,54 ****
--- 50,56 ----
  
  extern Datum pg_stat_get_backend_idset(PG_FUNCTION_ARGS);
  extern Datum pg_stat_get_activity(PG_FUNCTION_ARGS);
+ extern Datum pg_stat_get_walsender(PG_FUNCTION_ARGS);
  extern Datum pg_backend_pid(PG_FUNCTION_ARGS);
  extern Datum pg_stat_get_backend_pid(PG_FUNCTION_ARGS);
  extern Datum pg_stat_get_backend_dbid(PG_FUNCTION_ARGS);
*************** pg_stat_get_activity(PG_FUNCTION_ARGS)
*** 675,680 ****
--- 677,799 ----
  	}
  }
  
+ Datum
+ pg_stat_get_walsender(PG_FUNCTION_ARGS)
+ {
+ #define PG_STAT_WALSENDERS_COLS		6
+ 	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ 	TupleDesc	tupdesc;
+ 	Tuplestorestate *tupstore;
+ 	MemoryContext per_query_ctx;
+ 	MemoryContext oldcontext;
+ 	int			num_backends;
+ 	int			beid;
+ 	int			pid = (PG_ARGISNULL(0) ? 0 : PG_GETARG_INT32(0));
+ 	Oid			userid = GetUserId();
+ 	bool		is_superuser = superuser();
+ 
+ 	/* check to see if caller supports us returning a tuplestore */
+ 	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				 errmsg("set-valued function called in context that cannot accept a set")));
+ 	if (!(rsinfo->allowedModes & SFRM_Materialize))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				 errmsg("materialize mode required, but it is not " \
+ 						"allowed in this context")));
+ 
+ 	/* Build a tuple descriptor for our result type */
+ 	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ 		elog(ERROR, "return type must be a row type");
+ 
+ 	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+ 	oldcontext = MemoryContextSwitchTo(per_query_ctx);
+ 
+ 	tupstore = tuplestore_begin_heap(true, false, work_mem);
+ 	rsinfo->returnMode = SFRM_Materialize;
+ 	rsinfo->setResult = tupstore;
+ 	rsinfo->setDesc = tupdesc;
+ 
+ 	MemoryContextSwitchTo(oldcontext);
+ 
+ 	num_backends = pgstat_fetch_stat_numbackends();
+ 	for (beid = 0; beid < num_backends; beid++)
+ 	{
+ 		/* for each row */
+ 		Datum			values[PG_STAT_WALSENDERS_COLS];
+ 		bool			nulls[PG_STAT_WALSENDERS_COLS];
+ 		int				i;
+ 		PgBackendStatus *beentry;
+ 
+ 		beentry = pgstat_fetch_stat_beentry(beid + 1);		/* 1-based index */
+ 
+ 		/* Skip if not active, normal backends, or non-target processes */
+ 		if (!beentry || beentry->st_databaseid != InvalidOid ||
+ 			(pid != 0 && beentry->st_procpid != pid))
+ 			continue;
+ 
+ 		memset(values, 0, sizeof(values));
+ 		memset(nulls, 0, sizeof(nulls));
+ 		i = 0;
+ 
+ 		/* Values available to all callers */
+ 		values[i++] = Int32GetDatum(beentry->st_procpid);
+ 		values[i++] = ObjectIdGetDatum(beentry->st_userid);
+ 
+ 		/* Values only available to same user or superuser */
+ 		if (is_superuser || beentry->st_userid == userid)
+ 		{
+ 			inet	   *client_addr;
+ 			int			client_port;
+ 			XLogRecPtr	xlog_sending;
+ 
+ 			if (beentry->st_proc_start_timestamp != 0)
+ 				values[i++] = TimestampTzGetDatum(beentry->st_proc_start_timestamp);
+ 			else
+ 				nulls[i++] = true;
+ 
+ 			get_host_and_port(&beentry->st_clientaddr, &client_addr, &client_port);
+ 			if (client_addr != NULL)
+ 				values[i++] = PointerGetDatum(client_addr);
+ 			else
+ 				nulls[i++] = true;
+ 			if (client_port != 0)
+ 				values[i++] = DatumGetInt32(client_port);
+ 			else
+ 				nulls[i++] = true;
+ 
+ 			if (WALSndStatus(beentry->st_procpid, &xlog_sending))
+ 			{
+ 				char	location[18];
+ 
+ 				snprintf(location, sizeof(location), "%X/%X",
+ 						 xlog_sending.xlogid, xlog_sending.xrecoff);
+ 				values[i++] = CStringGetTextDatum(location);
+ 			}
+ 			else
+ 				nulls[i++] = true;
+ 		}
+ 		else
+ 		{
+ 			/* No permissions to view data about this session */
+ 			nulls[i++] = true;	/* backend_start */
+ 			nulls[i++] = true;	/* client_addr */
+ 			nulls[i++] = true;	/* client_port */
+ 			nulls[i++] = true;	/* xlog_sending */
+ 		}
+ 
+ 		Assert(i == PG_STAT_WALSENDERS_COLS);
+ 
+ 		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+ 	}
+ 
+ 	/* clean up and return the tuplestore */
+ 	tuplestore_donestoring(tupstore);
+ 
+ 	return (Datum) 0;
+ }
+ 
  
  Datum
  pg_backend_pid(PG_FUNCTION_ARGS)
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index d6ed60a..27389fe 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 1936 (  pg_stat_get_ba
*** 3075,3080 ****
--- 3075,3082 ----
  DESCR("statistics: currently active backend IDs");
  DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,application_name,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active backends");
+ DATA(insert OID = 3831 (  pg_stat_get_walsender			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,23,26,1184,869,23,25}" "{i,o,o,o,o,o,o}" "{pid,procpid,usesysid,backend_start,client_addr,client_port,xlog_sending}" _null_ pg_stat_get_walsender _null_ _null_ _null_ ));
+ DESCR("statistics: information about wal senders");
  DATA(insert OID = 2026 (  pg_backend_pid				PGNSP PGUID 12 1 0 0 f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
  DESCR("statistics: current backend PID");
  DATA(insert OID = 1937 (  pg_stat_get_backend_pid		PGNSP PGUID 12 1 0 0 f f f t f s 1 0 23 "23" _null_ _null_ _null_ _null_ pg_stat_get_backend_pid _null_ _null_ _null_ ));
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index 6f543ae..87906bf 100644
*** a/src/include/replication/walsender.h
--- b/src/include/replication/walsender.h
*************** extern void WalSndSignals(void);
*** 53,57 ****
--- 53,58 ----
  extern Size WalSndShmemSize(void);
  extern void WalSndShmemInit(void);
  extern void WalSndWakeup(void);
+ extern bool WALSndStatus(pid_t pid, XLogRecPtr *sending);
  
  #endif   /* _WALSENDER_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index a6baae2..0558c13 100644
*** a/src/test/regress/expected/rules.out
--- b/src/test/regress/expected/rules.out
*************** SELECT viewname, definition FROM pg_view
*** 1301,1306 ****
--- 1301,1307 ----
   pg_stat_user_functions      | SELECT p.oid AS funcid, n.nspname AS schemaname, p.proname AS funcname, pg_stat_get_function_calls(p.oid) AS calls, (pg_stat_get_function_time(p.oid) / 1000) AS total_time, (pg_stat_get_function_self_time(p.oid) / 1000) AS self_time FROM (pg_proc p LEFT JOIN pg_namespace n ON ((n.oid = p.pronamespace))) WHERE ((p.prolang <> (12)::oid) AND (pg_stat_get_function_calls(p.oid) IS NOT NULL));
   pg_stat_user_indexes        | SELECT pg_stat_all_indexes.relid, pg_stat_all_indexes.indexrelid, pg_stat_all_indexes.schemaname, pg_stat_all_indexes.relname, pg_stat_all_indexes.indexrelname, pg_stat_all_indexes.idx_scan, pg_stat_all_indexes.idx_tup_read, pg_stat_all_indexes.idx_tup_fetch FROM pg_stat_all_indexes WHERE ((pg_stat_all_indexes.schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (pg_stat_all_indexes.schemaname !~ '^pg_toast'::text));
   pg_stat_user_tables         | SELECT pg_stat_all_tables.relid, pg_stat_all_tables.schemaname, pg_stat_all_tables.relname, pg_stat_all_tables.seq_scan, pg_stat_all_tables.seq_tup_read, pg_stat_all_tables.idx_scan, pg_stat_all_tables.idx_tup_fetch, pg_stat_all_tables.n_tup_ins, pg_stat_all_tables.n_tup_upd, pg_stat_all_tables.n_tup_del, pg_stat_all_tables.n_tup_hot_upd, pg_stat_all_tables.n_live_tup, pg_stat_all_tables.n_dead_tup, pg_stat_all_tables.last_vacuum, pg_stat_all_tables.last_autovacuum, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze, pg_stat_all_tables.vacuum_count, pg_stat_all_tables.autovacuum_count, pg_stat_all_tables.analyze_count, pg_stat_all_tables.autoanalyze_count FROM pg_stat_all_tables WHERE ((pg_stat_all_tables.schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (pg_stat_all_tables.schemaname !~ '^pg_toast'::text));
+  pg_stat_walsender           | SELECT s.procpid, s.usesysid, u.rolname AS usename, s.client_addr, s.client_port, s.backend_start, s.xlog_sending FROM pg_stat_get_walsender(NULL::integer) s(procpid, usesysid, backend_start, client_addr, client_port, xlog_sending), pg_authid u WHERE (s.usesysid = u.oid);
   pg_stat_xact_all_tables     | SELECT c.oid AS relid, n.nspname AS schemaname, c.relname, pg_stat_get_xact_numscans(c.oid) AS seq_scan, pg_stat_get_xact_tuples_returned(c.oid) AS seq_tup_read, (sum(pg_stat_get_xact_numscans(i.indexrelid)))::bigint AS idx_scan, ((sum(pg_stat_get_xact_tuples_fetched(i.indexrelid)))::bigint + pg_stat_get_xact_tuples_fetched(c.oid)) AS idx_tup_fetch, pg_stat_get_xact_tuples_inserted(c.oid) AS n_tup_ins, pg_stat_get_xact_tuples_updated(c.oid) AS n_tup_upd, pg_stat_get_xact_tuples_deleted(c.oid) AS n_tup_del, pg_stat_get_xact_tuples_hot_updated(c.oid) AS n_tup_hot_upd FROM ((pg_class c LEFT JOIN pg_index i ON ((c.oid = i.indrelid))) LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace))) WHERE (c.relkind = ANY (ARRAY['r'::"char", 't'::"char"])) GROUP BY c.oid, n.nspname, c.relname;
   pg_stat_xact_sys_tables     | SELECT pg_stat_xact_all_tables.relid, pg_stat_xact_all_tables.schemaname, pg_stat_xact_all_tables.relname, pg_stat_xact_all_tables.seq_scan, pg_stat_xact_all_tables.seq_tup_read, pg_stat_xact_all_tables.idx_scan, pg_stat_xact_all_tables.idx_tup_fetch, pg_stat_xact_all_tables.n_tup_ins, pg_stat_xact_all_tables.n_tup_upd, pg_stat_xact_all_tables.n_tup_del, pg_stat_xact_all_tables.n_tup_hot_upd FROM pg_stat_xact_all_tables WHERE ((pg_stat_xact_all_tables.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_xact_all_tables.schemaname ~ '^pg_toast'::text));
   pg_stat_xact_user_functions | SELECT p.oid AS funcid, n.nspname AS schemaname, p.proname AS funcname, pg_stat_get_xact_function_calls(p.oid) AS calls, (pg_stat_get_xact_function_time(p.oid) / 1000) AS total_time, (pg_stat_get_xact_function_self_time(p.oid) / 1000) AS self_time FROM (pg_proc p LEFT JOIN pg_namespace n ON ((n.oid = p.pronamespace))) WHERE ((p.prolang <> (12)::oid) AND (pg_stat_get_xact_function_calls(p.oid) IS NOT NULL));
*************** SELECT viewname, definition FROM pg_view
*** 1335,1341 ****
   shoelace_obsolete           | SELECT shoelace.sl_name, shoelace.sl_avail, shoelace.sl_color, shoelace.sl_len, shoelace.sl_unit, shoelace.sl_len_cm FROM shoelace WHERE (NOT (EXISTS (SELECT shoe.shoename FROM shoe WHERE (shoe.slcolor = shoelace.sl_color))));
   street                      | SELECT r.name, r.thepath, c.cname FROM ONLY road r, real_city c WHERE (c.outline ## r.thepath);
   toyemp                      | SELECT emp.name, emp.age, emp.location, (12 * emp.salary) AS annualsal FROM emp;
! (57 rows)
  
  SELECT tablename, rulename, definition FROM pg_rules
  	ORDER BY tablename, rulename;
--- 1336,1342 ----
   shoelace_obsolete           | SELECT shoelace.sl_name, shoelace.sl_avail, shoelace.sl_color, shoelace.sl_len, shoelace.sl_unit, shoelace.sl_len_cm FROM shoelace WHERE (NOT (EXISTS (SELECT shoe.shoename FROM shoe WHERE (shoe.slcolor = shoelace.sl_color))));
   street                      | SELECT r.name, r.thepath, c.cname FROM ONLY road r, real_city c WHERE (c.outline ## r.thepath);
   toyemp                      | SELECT emp.name, emp.age, emp.location, (12 * emp.salary) AS annualsal FROM emp;
! (58 rows)
  
  SELECT tablename, rulename, definition FROM pg_rules
  	ORDER BY tablename, rulename;
#17Simon Riggs
simon@2ndQuadrant.com
In reply to: Itagaki Takahiro (#16)
1 attachment(s)
Re: system views for walsender activity

On Tue, 2011-01-04 at 15:51 +0900, Itagaki Takahiro wrote:

On Tue, Dec 28, 2010 at 22:17, Magnus Hagander <magnus@hagander.net> wrote:

We definitely need the very basic level for 9.1, and we can always
improve on it later :-)

pg_stat_walsender. It would then only need the columns for procpid,
usesysid, usename, client_addr, client_port, and the WALsender
specific fields.

Yeah, agreed. backend_start is probably the best one

Here are patches for pg_stat_walsender.
I split the feature into two pieces:

* get_host_and_port.patch
It separates host and port formatter as a subroutine from pg_stat_activity.
In addition, make pg_stat_get_backend_client_addr/port() functions to
use the subroutine to reduce duplicated codes.

* pg_stat_walsender.patch
It adds pg_stat_walsender system view. It has subset columns of
pg_stat_activity and only one additional WAL sender specific information
via WALSndStatus(). I named the column "sending location" because
standby servers might not have received the WAL record; if we had
synchronous replication, a new "sent location" wold be added.
But the naming is still an open question. Comments welcome.

There is O(max_wal_senders^2) complexity in the view, But I think
it is not so serious problem because we can expect max_wal_senders
is 10 or so at most.

CREATE VIEW pg_stat_walsender AS
SELECT
S.procpid,
S.usesysid,
U.rolname AS usename,
S.client_addr,
S.client_port,
S.backend_start,
S.xlog_sending
FROM pg_stat_get_walsender(NULL) AS S, pg_authid U
WHERE S.usesysid = U.oid;

Just seen you started working on this again. Very good.

I enclose some snippets of code I was working on, which I am removing
from my patch in favour of your work as a separate commit.

The way I coded it was a new SRF that joins to the existing
pg_stat_activity. So no initdb required, and this can also easily be
included as an external module for 9.0.

Please notice also that my coding of the new SRF does not have the O^2
issue you mention, which I was keen to avoid.

The sent pointer is needed whether or not we have sync rep. We should
also include application name, since the user may set that in the
standby for all the same reasons it is set elsewhere.

Small point: please lets not call this pg_stat_walsender?
pg_stat_replication_sent and pg_stat_replication_received would be
easier for normal humans to understand.

I would very much appreciate it if one of you could complete something
here and commit in the next few days. That would then allow me to extend
the view with sync rep specific info for monitoring and patch testing.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

Attachments:

pg_stat_replication.v1.patchtext/x-patch; charset=UTF-8; name=pg_stat_replication.v1.patchDownload
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 346eaaf..75419b7 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -488,6 +488,21 @@ CREATE VIEW pg_stat_activity AS
     WHERE S.datid = D.oid AND
             S.usesysid = U.oid;
 
+CREATE VIEW pg_stat_replication_activity AS
+    SELECT
+            S.procpid,
+            S.usesysid,
+            U.rolname AS usename,
+            S.application_name,
+            S.client_addr,
+            S.client_port,
+            S.backend_start,
+            R.sent_location
+    FROM pg_stat_get_activity(NULL) AS S, pg_authid U,
+            pg_stat_get_replication_activity() AS R
+    WHERE S.usesysid = U.oid AND
+            S.procpid = R.procpid;
+
 CREATE VIEW pg_stat_database AS
     SELECT
             D.oid AS datid,
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e9d8847..4422f5a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -37,6 +37,7 @@
 #include <signal.h>
 #include <unistd.h>
 
+#include "funcapi.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_type.h"
 #include "libpq/libpq.h"
@@ -49,6 +50,7 @@
 #include "storage/ipc.h"
 #include "storage/pmsignal.h"
 #include "tcop/tcopprot.h"
+#include "utils/builtins.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -1122,6 +1124,91 @@ WalSndWakeup(void)
 }
 
 /*
+ * Returns the Send position of walsenders with given pid.
+ * Or InvalidXLogRecPtr if none.
+ */
+Datum
+pg_stat_get_replication_activity(PG_FUNCTION_ARGS)
+{
+	FuncCallContext *funcctx;
+
+	if (SRF_IS_FIRSTCALL())
+	{
+		MemoryContext oldcontext;
+		TupleDesc	tupdesc;
+
+		funcctx = SRF_FIRSTCALL_INIT();
+
+		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+#define PG_STAT_GET_REP_ACTIVITY_COLS 	2
+		tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_REP_ACTIVITY_COLS, false);
+		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "procpid", INT4OID, -1, 0);
+		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "sent_location", TEXTOID, -1, 0);
+
+		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
+
+		funcctx->user_fctx = palloc0(sizeof(int));
+		if (superuser())
+			funcctx->max_calls = max_wal_senders;
+		else
+			funcctx->max_calls = 0;
+
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	/* stuff done on every call of the function */
+	funcctx = SRF_PERCALL_SETUP();
+
+	if (funcctx->call_cntr < funcctx->max_calls)
+	{
+		/* for each row */
+		Datum		values[PG_STAT_GET_REP_ACTIVITY_COLS];
+		bool		nulls[PG_STAT_GET_REP_ACTIVITY_COLS];
+		HeapTuple	tuple;
+
+		MemSet(values, 0, sizeof(values));
+		MemSet(nulls, 0, sizeof(nulls));
+
+		{
+			/* use volatile pointer to prevent code rearrangement */
+			volatile WalSnd *walsnd = &WalSndCtl->walsnds[funcctx->call_cntr];
+			char		sendlocation[MAXFNAMELEN];
+			XLogRecPtr	recptr;
+
+			if (walsnd->pid == 0)
+			{
+				nulls[0] = true;
+				nulls[1] = true;
+			}
+			else
+			{
+				SpinLockAcquire(&walsnd->mutex);
+				recptr = walsnd->sentPtr;
+				SpinLockRelease(&walsnd->mutex);
+
+				snprintf(sendlocation, sizeof(sendlocation), "%X/%X",
+						 recptr.xlogid, recptr.xrecoff);
+
+				nulls[0] = false;
+				values[0] = Int32GetDatum(walsnd->pid);
+
+				nulls[1] = false;
+				values[1] = CStringGetTextDatum(sendlocation);
+			}
+
+			tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
+			SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
+		}
+	}
+	else
+	{
+		/* nothing left */
+		SRF_RETURN_DONE(funcctx);
+	}
+}
+
+/*
  * This isn't currently used for anything. Monitoring tools might be
  * interested in the future, and we'll need something like this in the
  * future for synchronous replication.
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 1e6e75f..3218b87 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3067,6 +3067,8 @@ DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 f f
 DESCR("statistics: currently active backend IDs");
 DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,application_name,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active backends");
+DATA(insert OID = 3099 (  pg_stat_get_replication_activity	PGNSP PGUID 12 1 10 0 f f f f t s 0 0 2249 "" "{23,25}" "{o,o}" "{pid,sent_location}" _null_ pg_stat_get_replication_activity _null_ _null_ _null_ ));
+DESCR("statistics: information about currently active replication");
 DATA(insert OID = 2026 (  pg_backend_pid				PGNSP PGUID 12 1 0 0 f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
 DESCR("statistics: current backend PID");
 DATA(insert OID = 1937 (  pg_stat_get_backend_pid		PGNSP PGUID 12 1 0 0 f f f t f s 1 0 23 "23" _null_ _null_ _null_ _null_ pg_stat_get_backend_pid _null_ _null_ _null_ ));
#18Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#17)
Re: system views for walsender activity

On Tue, Jan 4, 2011 at 12:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

The sent pointer is needed whether or not we have sync rep. We should
also include application name, since the user may set that in the
standby for all the same reasons it is set elsewhere.

Small point: please lets not call this pg_stat_walsender?
pg_stat_replication_sent and pg_stat_replication_received would be
easier for normal humans to understand.

Eh... I may be showing my status as a non-normal human, but I know
exactly what pg_stat_walsender is (it's the view that shows you the
status of the WAL senders you've allowed by configuring
max_wal_senders>0) but I have no idea what pg_stat_replication_sent
and pg_stat_replication_received are supposed to be. Why two views?
*scratches head in confusion*

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#19Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#18)
Re: system views for walsender activity

Eh... I may be showing my status as a non-normal human, but I know
exactly what pg_stat_walsender is (it's the view that shows you the
status of the WAL senders you've allowed by configuring
max_wal_senders>0) but I have no idea what pg_stat_replication_sent
and pg_stat_replication_received are supposed to be. Why two views?
*scratches head in confusion*

How about one view, called "pg_stat_replication"? This would be clear
even to newbies, which none of the other view names would ...

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#20Joshua D. Drake
jd@commandprompt.com
In reply to: Josh Berkus (#19)
Re: system views for walsender activity

On Tue, 2011-01-04 at 10:50 -0800, Josh Berkus wrote:

Eh... I may be showing my status as a non-normal human, but I know
exactly what pg_stat_walsender is (it's the view that shows you the
status of the WAL senders you've allowed by configuring
max_wal_senders>0) but I have no idea what pg_stat_replication_sent
and pg_stat_replication_received are supposed to be. Why two views?
*scratches head in confusion*

How about one view, called "pg_stat_replication"? This would be clear
even to newbies, which none of the other view names would ...

hmmm I think "pg_stat_standby" might be more relevant but I definitely
agree something more newbie appropriate is in order.

Joshua D. Drake

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

#21Josh Berkus
josh@agliodbs.com
In reply to: Joshua D. Drake (#20)
Re: system views for walsender activity

hmmm I think "pg_stat_standby" might be more relevant but I definitely
agree something more newbie appropriate is in order.

I'd be fine with that name, too.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#22Magnus Hagander
magnus@hagander.net
In reply to: Josh Berkus (#21)
Re: system views for walsender activity

On Tue, Jan 4, 2011 at 20:28, Josh Berkus <josh@agliodbs.com> wrote:

hmmm I think "pg_stat_standby" might be more relevant but I definitely
agree something more newbie appropriate is in order.

I'd be fine with that name, too.

That seems kind of backwards though - given that the view only
contains data on the master...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#23Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#22)
Re: system views for walsender activity

On Tue, Jan 4, 2011 at 2:31 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Tue, Jan 4, 2011 at 20:28, Josh Berkus <josh@agliodbs.com> wrote:

hmmm I think "pg_stat_standby" might be more relevant but I definitely
agree something more newbie appropriate is in order.

I'd be fine with that name, too.

That seems kind of backwards though - given that the view only
contains data on the master...

I think pg_stat_replication is better than pg_stat_standby, but I'm
still not convinced we shouldn't go with the obvious
pg_stat_walsenders.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#24Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#23)
Re: system views for walsender activity

On 04.01.2011 21:43, Robert Haas wrote:

On Tue, Jan 4, 2011 at 2:31 PM, Magnus Hagander<magnus@hagander.net> wrote:

On Tue, Jan 4, 2011 at 20:28, Josh Berkus<josh@agliodbs.com> wrote:

hmmm I think "pg_stat_standby" might be more relevant but I definitely
agree something more newbie appropriate is in order.

I'd be fine with that name, too.

That seems kind of backwards though - given that the view only
contains data on the master...

I think pg_stat_replication is better than pg_stat_standby, but I'm
still not convinced we shouldn't go with the obvious
pg_stat_walsenders.

How about pg_stat_replication_activity? If I understood correctly, the
view is similar to pg_stat_activity, but displays information about
connected standbys rather than regular backends. It's a bit long name,
though.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#25David Fetter
david@fetter.org
In reply to: Josh Berkus (#19)
Re: system views for walsender activity

On Tue, Jan 04, 2011 at 10:50:12AM -0800, Josh Berkus wrote:

Eh... I may be showing my status as a non-normal human, but I know
exactly what pg_stat_walsender is (it's the view that shows you the
status of the WAL senders you've allowed by configuring
max_wal_senders>0) but I have no idea what pg_stat_replication_sent
and pg_stat_replication_received are supposed to be. Why two views?
*scratches head in confusion*

How about one view, called "pg_stat_replication"? This would be clear
even to newbies, which none of the other view names would ...

Wait. We can't do that. We'd be breaking a decades-old tradition of
terrible names! ;)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#26Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Heikki Linnakangas (#24)
Re: system views for walsender activity

On Wed, Jan 5, 2011 at 04:56, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I think pg_stat_replication is better than pg_stat_standby, but I'm
still not convinced we shouldn't go with the obvious
pg_stat_walsenders.

How about pg_stat_replication_activity? If I understood correctly, the view
is similar to pg_stat_activity, but displays information about connected
standbys rather than regular backends. It's a bit long name, though.

The view currently discussed is for *master* servers. We might have some
views for replication activity in *standby* servers. So, I'd like to
choose consistent and symmetric names for them -- for example,
pg_stat_replication_master and pg_stat_replication_standby.
I've expected they will be pg_stat_wal_[senders|receivers]
when I was writing the patch, but any other better names welcome.

However, we have "max_wal_senders" GUC parameter. So, users still
need to know what "wal_senders" is.

--
Itagaki Takahiro

#27Magnus Hagander
magnus@hagander.net
In reply to: Itagaki Takahiro (#26)
Re: system views for walsender activity

On Wed, Jan 5, 2011 at 02:32, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

On Wed, Jan 5, 2011 at 04:56, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

I think pg_stat_replication is better than pg_stat_standby, but I'm
still not convinced we shouldn't go with the obvious
pg_stat_walsenders.

How about pg_stat_replication_activity? If I understood correctly, the view
is similar to pg_stat_activity, but displays information about connected
standbys rather than regular backends. It's a bit long name, though.

The view currently discussed is for *master* servers. We might have some
views for replication activity in *standby* servers. So, I'd like to
choose consistent and symmetric names for them -- for example,
pg_stat_replication_master and pg_stat_replication_standby.
I've expected they will be pg_stat_wal_[senders|receivers]
when I was writing the patch, but any other better names welcome.

However, we have "max_wal_senders" GUC parameter. So, users still
need to know what "wal_senders" is.

An example to compare with could be pg_stat_bgwriter - that's also one
the really expects you to know some internals. Now, it so happens that
it's a very *bad* example, since it contains a bunch of information
that's *not* actually about the bgwriter these days :-)

But from that perspective, is it likely to ever contain anyting
*other* than walsender information? Given that it's keyed by the
process id of a walsender, I don't expect it would. Whereas a
pg_stat_replication or such could equally be expected to contain
information about other ways of replication - like the file based
modes or even slony.

+1 for pg_stat_walsender or pg_stat_walsender_activity

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#28Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Simon Riggs (#17)
Re: system views for walsender activity

On Wed, Jan 5, 2011 at 02:48, Simon Riggs <simon@2ndquadrant.com> wrote:

The way I coded it was a new SRF that joins to the existing
pg_stat_activity. So no initdb required, and this can also easily be
included as an external module for 9.0.

Please notice also that my coding of the new SRF does not have the O^2
issue you mention, which I was keen to avoid.

Yeah, using SQL JOIN to avoid O(n^2) is a good idea. My only concern is
that pg_stat_get_activity(NULL) might return rows that are not actually
used. Is it an ignorable overhead?

We should
also include application name, since the user may set that in the
standby for all the same reasons it is set elsewhere.

Ah, we can use application_name to name each wal senders.

Small point: please lets not call this pg_stat_walsender?
pg_stat_replication_sent and pg_stat_replication_received would be
easier for normal humans to understand.

A list of proposed view names for replication master server:
- pg_stat_replication
- pg_stat_replication_sent
- pg_stat_standby
- pg_stat_walsender
- pg_stat_walsender_activity

We have some functions for standby server activity
(pg_last_xlog_[receive|replay]_[location|timestamp])
but could have a view for them:
- pg_stat_replication_received
- pg_stat_walreceiver

"pg_stat_replication" and "pg_stat_standby" might not be good names
when we have a view for standby server because the names are not
clear for master server. But if we will have a view only on master,
"pg_stat_replication" seems to be the most understandable name.

I would very much appreciate it if one of you could complete something
here and commit in the next few days. That would then allow me to extend
the view with sync rep specific info for monitoring and patch testing.

What will we name xlog locations that have been received? We call
xlog locations sent to standby as "sentPtr". If we have sync rep,
we will have two locations for each wal sender. For example,
we can call them "sent_location" and "sync_location".

--
Itagaki Takahiro

#29Simon Riggs
simon@2ndQuadrant.com
In reply to: Itagaki Takahiro (#28)
Re: system views for walsender activity

On Fri, 2011-01-07 at 12:13 +0900, Itagaki Takahiro wrote:

"pg_stat_replication" seems to be the most understandable name.

I would very much appreciate it if one of you could complete something
here and commit in the next few days. That would then allow me to extend
the view with sync rep specific info for monitoring and patch testing.

Please go with whatever you think best for now. I'm sure people will ask
for different names later anyway. Let's get this committed soon, to
reduce later patch conflicts. Thanks.

What will we name xlog locations that have been received? We call
xlog locations sent to standby as "sentPtr". If we have sync rep,
we will have two locations for each wal sender. For example,
we can call them "sent_location" and "sync_location".

Please add sent_location, I will add others.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#30Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Simon Riggs (#29)
1 attachment(s)
Re: system views for walsender activity

On Fri, Jan 7, 2011 at 19:20, Simon Riggs <simon@2ndquadrant.com> wrote:

"pg_stat_replication" seems to be the most understandable name.

Please go with whatever you think best for now. I'm sure people will ask
for different names later anyway. Let's get this committed soon, to
reduce later patch conflicts. Thanks.

Please add sent_location, I will add others.

OK, I added a view named s "pg_stat_replication". The view is basically
based on Simon's patch, but I just skipped unused WalSnd entreis in
WalSndCtl rather than return NULLs. The applied patch attached.

I expect we will have two views for master and standby servers:

* pg_stat_replication
Activity of wal senders in master server.
* pg_stat_standby (not yet)
Activity of a wal receiver and a recovery process in standby servers.

I didn't use pg_stat_wal_sender/receiver as their names because standby
activity in slaves could contain not only a wal receiver but also a
recovery process.

--
Itagaki Takahiro

Attachments:

pg_stat_replication-20110107.patchapplication/octet-stream; name=pg_stat_replication-20110107.patchDownload
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 262b372..aa89240 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -492,6 +492,21 @@ CREATE VIEW pg_stat_activity AS
     WHERE S.datid = D.oid AND
             S.usesysid = U.oid;
 
+CREATE VIEW pg_stat_replication AS
+    SELECT
+            S.procpid,
+            S.usesysid,
+            U.rolname AS usename,
+            S.application_name,
+            S.client_addr,
+            S.client_port,
+            S.backend_start,
+            W.sent_location
+    FROM pg_stat_get_activity(NULL) AS S, pg_authid U,
+            pg_stat_get_wal_senders() AS W
+    WHERE S.usesysid = U.oid AND
+            S.procpid = W.procpid;
+
 CREATE VIEW pg_stat_database AS
     SELECT
             D.oid AS datid,
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 269d200..f2a3ee2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -37,6 +37,7 @@
 #include <signal.h>
 #include <unistd.h>
 
+#include "funcapi.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_type.h"
 #include "libpq/libpq.h"
@@ -49,6 +50,7 @@
 #include "storage/ipc.h"
 #include "storage/pmsignal.h"
 #include "tcop/tcopprot.h"
+#include "utils/builtins.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -943,6 +945,78 @@ WalSndWakeup(void)
 }
 
 /*
+ * Returns activity of walsenders, including pids and xlog locations sent to
+ * standby servers.
+ */
+Datum
+pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_WAL_SENDERS_COLS 	2
+	ReturnSetInfo	   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+	TupleDesc			tupdesc;
+	Tuplestorestate	   *tupstore;
+	MemoryContext		per_query_ctx;
+	MemoryContext		oldcontext;
+	int					i;
+
+	/* check to see if caller supports us returning a tuplestore */
+	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that cannot accept a set")));
+	if (!(rsinfo->allowedModes & SFRM_Materialize))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("materialize mode required, but it is not " \
+						"allowed in this context")));
+
+	/* Build a tuple descriptor for our result type */
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+
+	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+	oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+	tupstore = tuplestore_begin_heap(true, false, work_mem);
+	rsinfo->returnMode = SFRM_Materialize;
+	rsinfo->setResult = tupstore;
+	rsinfo->setDesc = tupdesc;
+
+	MemoryContextSwitchTo(oldcontext);
+
+	for (i = 0; i < max_wal_senders; i++)
+	{
+		/* use volatile pointer to prevent code rearrangement */
+		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+		char		sent_location[MAXFNAMELEN];
+		XLogRecPtr	sentPtr;
+		Datum		values[PG_STAT_GET_WAL_SENDERS_COLS];
+		bool		nulls[PG_STAT_GET_WAL_SENDERS_COLS];
+
+		if (walsnd->pid == 0)
+			continue;
+
+		SpinLockAcquire(&walsnd->mutex);
+		sentPtr = walsnd->sentPtr;
+		SpinLockRelease(&walsnd->mutex);
+
+		snprintf(sent_location, sizeof(sent_location), "%X/%X",
+					sentPtr.xlogid, sentPtr.xrecoff);
+
+		memset(nulls, 0, sizeof(nulls));
+		values[0] = Int32GetDatum(walsnd->pid);
+		values[1] = CStringGetTextDatum(sent_location);
+
+		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+	}
+
+	/* clean up and return the tuplestore */
+	tuplestore_donestoring(tupstore);
+
+	return (Datum) 0;
+}
+
+/*
  * This isn't currently used for anything. Monitoring tools might be
  * interested in the future, and we'll need something like this in the
  * future for synchronous replication.
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index d6ed60a..dcb494f 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3075,6 +3075,8 @@ DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 f f
 DESCR("statistics: currently active backend IDs");
 DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,application_name,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active backends");
+DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 f f f f t s 0 0 2249 "" "{23,25}" "{o,o}" "{procpid,sent_location}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
+DESCR("statistics: information about currently active replication");
 DATA(insert OID = 2026 (  pg_backend_pid				PGNSP PGUID 12 1 0 0 f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
 DESCR("statistics: current backend PID");
 DATA(insert OID = 1937 (  pg_stat_get_backend_pid		PGNSP PGUID 12 1 0 0 f f f t f s 1 0 23 "23" _null_ _null_ _null_ _null_ pg_stat_get_backend_pid _null_ _null_ _null_ ));
diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h
index 6f543ae..d6767b9 100644
--- a/src/include/replication/walsender.h
+++ b/src/include/replication/walsender.h
@@ -54,4 +54,6 @@ extern Size WalSndShmemSize(void);
 extern void WalSndShmemInit(void);
 extern void WalSndWakeup(void);
 
+extern Datum pg_stat_get_wal_senders(PG_FUNCTION_ARGS);
+
 #endif   /* _WALSENDER_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index a6baae2..61bee46 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1296,6 +1296,7 @@ SELECT viewname, definition FROM pg_views WHERE schemaname <> 'information_schem
  pg_stat_bgwriter            | SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed, pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req, pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint, pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean, pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean, pg_stat_get_buf_written_backend() AS buffers_backend, pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync, pg_stat_get_buf_alloc() AS buffers_alloc;
  pg_stat_database            | SELECT d.oid AS datid, d.datname, pg_stat_get_db_numbackends(d.oid) AS numbackends, pg_stat_get_db_xact_commit(d.oid) AS xact_commit, pg_stat_get_db_xact_rollback(d.oid) AS xact_rollback, (pg_stat_get_db_blocks_fetched(d.oid) - pg_stat_get_db_blocks_hit(d.oid)) AS blks_read, pg_stat_get_db_blocks_hit(d.oid) AS blks_hit, pg_stat_get_db_tuples_returned(d.oid) AS tup_returned, pg_stat_get_db_tuples_fetched(d.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(d.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(d.oid) AS tup_updated, pg_stat_get_db_tuples_deleted(d.oid) AS tup_deleted, pg_stat_get_db_conflict_all(d.oid) AS conflicts FROM pg_database d;
  pg_stat_database_conflicts  | SELECT d.oid AS datid, d.datname, pg_stat_get_db_conflict_tablespace(d.oid) AS confl_tablespace, pg_stat_get_db_conflict_lock(d.oid) AS confl_lock, pg_stat_get_db_conflict_snapshot(d.oid) AS confl_snapshot, pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin, pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock FROM pg_database d;
+ pg_stat_replication         | SELECT s.procpid, s.usesysid, u.rolname AS usename, s.application_name, s.client_addr, s.client_port, s.backend_start, w.sent_location FROM pg_stat_get_activity(NULL::integer) s(datid, procpid, usesysid, application_name, current_query, waiting, xact_start, query_start, backend_start, client_addr, client_port), pg_authid u, pg_stat_get_wal_senders() w(procpid, sent_location) WHERE ((s.usesysid = u.oid) AND (s.procpid = w.procpid));
  pg_stat_sys_indexes         | SELECT pg_stat_all_indexes.relid, pg_stat_all_indexes.indexrelid, pg_stat_all_indexes.schemaname, pg_stat_all_indexes.relname, pg_stat_all_indexes.indexrelname, pg_stat_all_indexes.idx_scan, pg_stat_all_indexes.idx_tup_read, pg_stat_all_indexes.idx_tup_fetch FROM pg_stat_all_indexes WHERE ((pg_stat_all_indexes.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_indexes.schemaname ~ '^pg_toast'::text));
  pg_stat_sys_tables          | SELECT pg_stat_all_tables.relid, pg_stat_all_tables.schemaname, pg_stat_all_tables.relname, pg_stat_all_tables.seq_scan, pg_stat_all_tables.seq_tup_read, pg_stat_all_tables.idx_scan, pg_stat_all_tables.idx_tup_fetch, pg_stat_all_tables.n_tup_ins, pg_stat_all_tables.n_tup_upd, pg_stat_all_tables.n_tup_del, pg_stat_all_tables.n_tup_hot_upd, pg_stat_all_tables.n_live_tup, pg_stat_all_tables.n_dead_tup, pg_stat_all_tables.last_vacuum, pg_stat_all_tables.last_autovacuum, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze, pg_stat_all_tables.vacuum_count, pg_stat_all_tables.autovacuum_count, pg_stat_all_tables.analyze_count, pg_stat_all_tables.autoanalyze_count FROM pg_stat_all_tables WHERE ((pg_stat_all_tables.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_tables.schemaname ~ '^pg_toast'::text));
  pg_stat_user_functions      | SELECT p.oid AS funcid, n.nspname AS schemaname, p.proname AS funcname, pg_stat_get_function_calls(p.oid) AS calls, (pg_stat_get_function_time(p.oid) / 1000) AS total_time, (pg_stat_get_function_self_time(p.oid) / 1000) AS self_time FROM (pg_proc p LEFT JOIN pg_namespace n ON ((n.oid = p.pronamespace))) WHERE ((p.prolang <> (12)::oid) AND (pg_stat_get_function_calls(p.oid) IS NOT NULL));
@@ -1335,7 +1336,7 @@ SELECT viewname, definition FROM pg_views WHERE schemaname <> 'information_schem
  shoelace_obsolete           | SELECT shoelace.sl_name, shoelace.sl_avail, shoelace.sl_color, shoelace.sl_len, shoelace.sl_unit, shoelace.sl_len_cm FROM shoelace WHERE (NOT (EXISTS (SELECT shoe.shoename FROM shoe WHERE (shoe.slcolor = shoelace.sl_color))));
  street                      | SELECT r.name, r.thepath, c.cname FROM ONLY road r, real_city c WHERE (c.outline ## r.thepath);
  toyemp                      | SELECT emp.name, emp.age, emp.location, (12 * emp.salary) AS annualsal FROM emp;
-(57 rows)
+(58 rows)
 
 SELECT tablename, rulename, definition FROM pg_rules
 	ORDER BY tablename, rulename;
#31Magnus Hagander
magnus@hagander.net
In reply to: Itagaki Takahiro (#30)
Re: system views for walsender activity

On Fri, Jan 7, 2011 at 12:42, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

On Fri, Jan 7, 2011 at 19:20, Simon Riggs <simon@2ndquadrant.com> wrote:

"pg_stat_replication" seems to be the most understandable name.

Please go with whatever you think best for now. I'm sure people will ask
for different names later anyway. Let's get this committed soon, to
reduce later patch conflicts. Thanks.

Please add sent_location, I will add others.

OK, I added a view named s "pg_stat_replication". The view is basically
based on Simon's patch, but I just skipped unused WalSnd entreis in
WalSndCtl rather than return NULLs. The applied patch attached.

I expect we will have two views for master and standby servers:

 * pg_stat_replication
      Activity of wal senders in master server.
 * pg_stat_standby (not yet)
      Activity of a wal receiver and a recovery process in standby servers.

Just to keep the bikeshedding up, should it in this case not be
pg_stat_replication_master and pg_stat_replication_standby or such?
Replication applies to both master and slave...

I didn't use pg_stat_wal_sender/receiver as their names because standby
activity in slaves could contain not only a wal receiver but also a
recovery process.

Good point.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#32Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Magnus Hagander (#31)
Re: system views for walsender activity

On Fri, Jan 7, 2011 at 21:48, Magnus Hagander <magnus@hagander.net> wrote:

 * pg_stat_replication
 * pg_stat_standby (not yet)

Just to keep the bikeshedding up, should it in this case not be
pg_stat_replication_master and pg_stat_replication_standby or such?
Replication applies to both master and slave...

The reason I didn't use term "master" is that pg_stat_replication is
information of *standby* servers on master server. Of course,
wal senders are processes in the master, but users probably think
they are the location standby servers receives.

I forgot to update SGML for the view. I'll do it soon.
Thanks for the heads-up.

--
Itagaki Takahiro

#33Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#32)
Re: system views for walsender activity

On Fri, Jan 7, 2011 at 8:09 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

On Fri, Jan 7, 2011 at 21:48, Magnus Hagander <magnus@hagander.net> wrote:

 * pg_stat_replication
 * pg_stat_standby (not yet)

Just to keep the bikeshedding up, should it in this case not be
pg_stat_replication_master and pg_stat_replication_standby or such?
Replication applies to both master and slave...

The reason I didn't use term "master" is that pg_stat_replication is
information of *standby* servers on master server. Of course,
wal senders are processes in the master, but users probably think
they are the location standby servers receives.

To my way of thinking, pg_stat_walsender and pg_stat_walreceiver would
be more clear than pg_stat_replication_master and
pg_stat_replication_slave.

However, my way of thinking is of course not the only way of thinking.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#34Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#33)
Re: system views for walsender activity

To my way of thinking, pg_stat_walsender and pg_stat_walreceiver would
be more clear than pg_stat_replication_master and
pg_stat_replication_slave.

Let's commit it so that some of us can get a look at the data it
contains, and then we can fix the name during beta.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#35Magnus Hagander
magnus@hagander.net
In reply to: Josh Berkus (#34)
Re: system views for walsender activity

On Fri, Jan 7, 2011 at 19:46, Josh Berkus <josh@agliodbs.com> wrote:

To my way of thinking, pg_stat_walsender and pg_stat_walreceiver would
be more clear than pg_stat_replication_master and
pg_stat_replication_slave.

Let's commit it so that some of us can get a look at the data it
contains, and then we can fix the name during beta.

We try to avoid inidb-requiring changes (like renaming a system
object...) during beta.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#36Simon Riggs
simon@2ndQuadrant.com
In reply to: Magnus Hagander (#35)
Re: system views for walsender activity

On Fri, 2011-01-07 at 19:48 +0100, Magnus Hagander wrote:

On Fri, Jan 7, 2011 at 19:46, Josh Berkus <josh@agliodbs.com> wrote:

To my way of thinking, pg_stat_walsender and pg_stat_walreceiver would
be more clear than pg_stat_replication_master and
pg_stat_replication_slave.

Let's commit it so that some of us can get a look at the data it
contains, and then we can fix the name during beta.

We try to avoid inidb-requiring changes (like renaming a system
object...) during beta.

Why?

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#36)
Re: system views for walsender activity

Simon Riggs <simon@2ndQuadrant.com> writes:

On Fri, 2011-01-07 at 19:48 +0100, Magnus Hagander wrote:

We try to avoid inidb-requiring changes (like renaming a system
object...) during beta.

Why?

So that beta testers won't be forced to do a dump and reload.

When and if pg_upgrade is actually 100% trustworthy, maybe the desire
to avoid initdb's after beta1 will vanish completely; but for now it's
still a good thing to avoid.

regards, tom lane

#38Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#37)
Re: system views for walsender activity

On Fri, Jan 7, 2011 at 20:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On Fri, 2011-01-07 at 19:48 +0100, Magnus Hagander wrote:

We try to avoid inidb-requiring changes (like renaming a system
object...) during beta.

Why?

So that beta testers won't be forced to do a dump and reload.

When and if pg_upgrade is actually 100% trustworthy, maybe the desire
to avoid initdb's after beta1 will vanish completely; but for now it's
still a good thing to avoid.

I think it's still a good thing to avoid. It's at that point no longer
as necessary, but it's still best if we can avoid to make those
changes after beta when possible.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#39Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#37)
Re: system views for walsender activity

On Fri, 2011-01-07 at 14:51 -0500, Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On Fri, 2011-01-07 at 19:48 +0100, Magnus Hagander wrote:

We try to avoid inidb-requiring changes (like renaming a system
object...) during beta.

Why?

So that beta testers won't be forced to do a dump and reload.

When and if pg_upgrade is actually 100% trustworthy, maybe the desire
to avoid initdb's after beta1 will vanish completely; but for now it's
still a good thing to avoid.

Then before we go beta, we should confirm the names.

IIRC that is not until at least 15 Feb, perhaps 15 Mar.

ISTM there will be much discussion on a range of things, just as there
was last year and no doubt will be every year.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#40Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#34)
Re: system views for walsender activity

On Fri, Jan 7, 2011 at 1:46 PM, Josh Berkus <josh@agliodbs.com> wrote:

To my way of thinking, pg_stat_walsender and pg_stat_walreceiver would
be more clear than pg_stat_replication_master and
pg_stat_replication_slave.

Let's commit it so that some of us can get a look at the data it
contains, and then we can fix the name during beta.

Well, the first half is committed, under the name pg_stat_replication.
So go look at that, for starters...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#41Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#40)
Re: system views for walsender activity

On Fri, Jan 7, 2011 at 22:21, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Jan 7, 2011 at 1:46 PM, Josh Berkus <josh@agliodbs.com> wrote:

To my way of thinking, pg_stat_walsender and pg_stat_walreceiver would
be more clear than pg_stat_replication_master and
pg_stat_replication_slave.

Let's commit it so that some of us can get a look at the data it
contains, and then we can fix the name during beta.

Well, the first half is committed, under the name pg_stat_replication.
 So go look at that, for starters...

One thing I noticed is that it gives an interesting property to my
patch for streaming base backups - they now show up in
pg_stat_replication, with a streaming location of 0/0.

If the view is named pg_stat_replication, we probably want to filter
that out somehow. But then, do we want a separate view listing the
walsenders that are busy sending base backups?

For that matter, do we want an indication that separates a walsender
not sending data from one sending that happens to be at location 0/0?
Most will leave 0/0 really quickly, but a walsender can be idle (not
received a command yet), or it can be running IDENTIFY_SYSTEM for
example.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#42Simon Riggs
simon@2ndQuadrant.com
In reply to: Magnus Hagander (#41)
Re: system views for walsender activity

On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:

One thing I noticed is that it gives an interesting property to my
patch for streaming base backups - they now show up in
pg_stat_replication, with a streaming location of 0/0.

If the view is named pg_stat_replication, we probably want to filter
that out somehow. But then, do we want a separate view listing the
walsenders that are busy sending base backups?

For that matter, do we want an indication that separates a walsender
not sending data from one sending that happens to be at location 0/0?
Most will leave 0/0 really quickly, but a walsender can be idle (not
received a command yet), or it can be running IDENTIFY_SYSTEM for
example.

I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#43Magnus Hagander
magnus@hagander.net
In reply to: Simon Riggs (#42)
Re: system views for walsender activity

On Sun, Jan 9, 2011 at 15:53, Simon Riggs <simon@2ndquadrant.com> wrote:

On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:

One thing I noticed is that it gives an interesting property to my
patch for streaming base backups - they now show up in
pg_stat_replication, with a streaming location of 0/0.

If the view is named pg_stat_replication, we probably want to filter
that out somehow. But then, do we want a separate view listing the
walsenders that are busy sending base backups?

For that matter, do we want an indication that separates a walsender
not sending data from one sending that happens to be at location 0/0?
Most will leave 0/0 really quickly, but a walsender can be idle (not
received a command yet), or it can be running IDENTIFY_SYSTEM for
example.

I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.

That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#44Simon Riggs
simon@2ndQuadrant.com
In reply to: Magnus Hagander (#43)
Re: system views for walsender activity

On Mon, 2011-01-10 at 15:20 +0100, Magnus Hagander wrote:

On Sun, Jan 9, 2011 at 15:53, Simon Riggs <simon@2ndquadrant.com> wrote:

On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:

One thing I noticed is that it gives an interesting property to my
patch for streaming base backups - they now show up in
pg_stat_replication, with a streaming location of 0/0.

If the view is named pg_stat_replication, we probably want to filter
that out somehow. But then, do we want a separate view listing the
walsenders that are busy sending base backups?

For that matter, do we want an indication that separates a walsender
not sending data from one sending that happens to be at location 0/0?
Most will leave 0/0 really quickly, but a walsender can be idle (not
received a command yet), or it can be running IDENTIFY_SYSTEM for
example.

I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.

That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

That's something different.

The 3 phases are more concrete.

BACKUP --> CATCHUP <---> STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#45Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#44)
Re: system views for walsender activity

On 10.01.2011 16:49, Simon Riggs wrote:

On Mon, 2011-01-10 at 15:20 +0100, Magnus Hagander wrote:

On Sun, Jan 9, 2011 at 15:53, Simon Riggs<simon@2ndquadrant.com> wrote:

On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:

One thing I noticed is that it gives an interesting property to my
patch for streaming base backups - they now show up in
pg_stat_replication, with a streaming location of 0/0.

If the view is named pg_stat_replication, we probably want to filter
that out somehow. But then, do we want a separate view listing the
walsenders that are busy sending base backups?

For that matter, do we want an indication that separates a walsender
not sending data from one sending that happens to be at location 0/0?
Most will leave 0/0 really quickly, but a walsender can be idle (not
received a command yet), or it can be running IDENTIFY_SYSTEM for
example.

I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.

That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

That's something different.

The 3 phases are more concrete.

BACKUP --> CATCHUP<---> STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#46Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#45)
Re: system views for walsender activity

On Mon, 2011-01-10 at 17:05 +0200, Heikki Linnakangas wrote:

On 10.01.2011 16:49, Simon Riggs wrote:

On Mon, 2011-01-10 at 15:20 +0100, Magnus Hagander wrote:

On Sun, Jan 9, 2011 at 15:53, Simon Riggs<simon@2ndquadrant.com> wrote:

On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:

One thing I noticed is that it gives an interesting property to my
patch for streaming base backups - they now show up in
pg_stat_replication, with a streaming location of 0/0.

If the view is named pg_stat_replication, we probably want to filter
that out somehow. But then, do we want a separate view listing the
walsenders that are busy sending base backups?

For that matter, do we want an indication that separates a walsender
not sending data from one sending that happens to be at location 0/0?
Most will leave 0/0 really quickly, but a walsender can be idle (not
received a command yet), or it can be running IDENTIFY_SYSTEM for
example.

I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.

That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

That's something different.

The 3 phases are more concrete.

BACKUP --> CATCHUP<---> STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

There is a trivial state between connect and first command. If you think
that is worth publishing, feel free. STARTING?

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

In reply to: Heikki Linnakangas (#45)
Re: system views for walsender activity

Em 10-01-2011 12:05, Heikki Linnakangas escreveu:

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

It is kind of "initializing catchup". I think it is not worth representing
those short lifespan states (in normal scenarios).

--
Euler Taveira de Oliveira
http://www.timbira.com/

#48Magnus Hagander
magnus@hagander.net
In reply to: Simon Riggs (#46)
Re: system views for walsender activity

On Mon, Jan 10, 2011 at 16:41, Simon Riggs <simon@2ndquadrant.com> wrote:

On Mon, 2011-01-10 at 17:05 +0200, Heikki Linnakangas wrote:

On 10.01.2011 16:49, Simon Riggs wrote:

On Mon, 2011-01-10 at 15:20 +0100, Magnus Hagander wrote:

On Sun, Jan 9, 2011 at 15:53, Simon Riggs<simon@2ndquadrant.com>  wrote:

On Sun, 2011-01-09 at 12:52 +0100, Magnus Hagander wrote:

One thing I noticed is that it gives an interesting property to my
patch for streaming base backups - they now show up in
pg_stat_replication, with a streaming location of 0/0.

If the view is named pg_stat_replication, we probably want to filter
that out somehow. But then, do we want a separate view listing the
walsenders that are busy sending base backups?

For that matter, do we want an indication that separates a walsender
not sending data from one sending that happens to be at location 0/0?
Most will leave 0/0 really quickly, but a walsender can be idle (not
received a command yet), or it can be running IDENTIFY_SYSTEM for
example.

I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.

That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

That's something different.

The 3 phases are more concrete.

BACKUP -->  CATCHUP<--->  STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

There is a trivial state between connect and first command. If you think
that is worth publishing, feel free. STARTING?

If we don't publish it, it'll implicitly be in one of the others..
Unless we say NULL, of course, but I definitely prefer STARTING then.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#49Magnus Hagander
magnus@hagander.net
In reply to: Euler Taveira de Oliveira (#47)
Re: system views for walsender activity

On Mon, Jan 10, 2011 at 16:48, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

Em 10-01-2011 12:05, Heikki Linnakangas escreveu:

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

It is kind of "initializing catchup". I think it is not worth representing
those short lifespan states (in normal scenarios).

True, but it's quite important to detect and diagnose the abnormal ones...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#50Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#46)
Re: system views for walsender activity

On Mon, Jan 10, 2011 at 10:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.

That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

That's something different.

The 3 phases are more concrete.

BACKUP -->  CATCHUP<--->  STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

There is a trivial state between connect and first command. If you think
that is worth publishing, feel free. STARTING?

I think it's worth publishing. STARTING would be OK, or maybe STARTUP
to parallel the other two -UP states.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#51Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#50)
1 attachment(s)
Re: system views for walsender activity

On Tue, Jan 11, 2011 at 02:24, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jan 10, 2011 at 10:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

I think we need a status enum. ('BACKUP', 'CATCHUP', 'STREAM') for the 3
phases of replication.

That seems reasonable. But if we keep BACKUP in there, should we
really have it called pg_stat_replication? (yeah, I know, I'm not
giving up :P)

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

That's something different.

The 3 phases are more concrete.

BACKUP -->  CATCHUP<--->  STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

There is a trivial state between connect and first command. If you think
that is worth publishing, feel free. STARTING?

I think it's worth publishing.  STARTING would be OK, or maybe STARTUP
to parallel the other two -UP states.

Here's a patch for this. I chose IDLE, because that's what we call
other backends that are waiting for commands...

Does this seem correct?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachments:

walsender_state.patchtext/x-patch; charset=US-ASCII; name=walsender_state.patchDownload
*** a/doc/src/sgml/monitoring.sgml
--- b/doc/src/sgml/monitoring.sgml
***************
*** 298,305 **** postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
        <entry><structname>pg_stat_replication</><indexterm><primary>pg_stat_replication</primary></indexterm></entry>
        <entry>One row per WAL sender process, showing process <acronym>ID</>,
        user OID, user name, application name, client's address and port number,
!       time at which the server process began execution, and transaction log
!       location.
       </entry>
       </row>
  
--- 298,305 ----
        <entry><structname>pg_stat_replication</><indexterm><primary>pg_stat_replication</primary></indexterm></entry>
        <entry>One row per WAL sender process, showing process <acronym>ID</>,
        user OID, user name, application name, client's address and port number,
!       time at which the server process began execution, current WAL sender
!       state and transaction log location.
       </entry>
       </row>
  
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 501,506 **** CREATE VIEW pg_stat_replication AS
--- 501,507 ----
              S.client_addr,
              S.client_port,
              S.backend_start,
+             W.state,
              W.sent_location
      FROM pg_stat_get_activity(NULL) AS S, pg_authid U,
              pg_stat_get_wal_senders() AS W
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 24,29 ****
--- 24,30 ----
  #include "libpq/pqformat.h"
  #include "nodes/pg_list.h"
  #include "replication/basebackup.h"
+ #include "replication/walsender.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
***************
*** 83,88 **** SendBaseBackup(const char *options)
--- 84,91 ----
  										   ALLOCSET_DEFAULT_MAXSIZE);
  	old_context = MemoryContextSwitchTo(backup_context);
  
+ 	WalSndSetState(WalSndState_BACKUP);
+ 
  	if (backup_label == NULL)
  		ereport(FATAL,
  				(errcode(ERRCODE_PROTOCOL_VIOLATION),
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
***************
*** 179,184 **** WalSndHandshake(void)
--- 179,185 ----
  	{
  		int			firstchar;
  
+ 		WalSndSetState(WalSndState_IDLE);
  		set_ps_display("idle", false);
  
  		/* Wait for a command to arrive */
***************
*** 482,487 **** WalSndLoop(void)
--- 483,491 ----
  			if (!XLogSend(output_message, &caughtup))
  				break;
  		}
+ 
+ 		/* Update our state to indicate if we're behind or not */
+ 		WalSndSetState(caughtup ? WalSndState_STREAMING : WalSndState_CATCHUP);
  	}
  
  	/*
***************
*** 533,538 **** InitWalSnd(void)
--- 537,543 ----
  			 */
  			walsnd->pid = MyProcPid;
  			MemSet(&walsnd->sentPtr, 0, sizeof(XLogRecPtr));
+ 			walsnd->state = WalSndState_IDLE;
  			SpinLockRelease(&walsnd->mutex);
  			/* don't need the lock anymore */
  			OwnLatch((Latch *) &walsnd->latch);
***************
*** 960,965 **** WalSndWakeup(void)
--- 965,999 ----
  		SetLatch(&WalSndCtl->walsnds[i].latch);
  }
  
+ /* Set state for current walsender (only called in walsender) */
+ void WalSndSetState(WalSndState state)
+ {
+ 	/* use volatile pointer to prevent code rearrangement */
+ 	volatile WalSnd *walsnd = MyWalSnd;
+ 
+ 	SpinLockAcquire(&walsnd->mutex);
+ 	walsnd->state = state;
+ 	SpinLockRelease(&walsnd->mutex);
+ }
+ 
+ /*
+  * Return a string constant representing the state. This is used
+  * in system views, and should *not* be translated.
+  */
+ static const char *
+ WalSndGetStateString(WalSndState state)
+ {
+ 	switch (state)
+ 	{
+ 		case WalSndState_IDLE: return "IDLE";
+ 		case WalSndState_BACKUP: return "BACKUP";
+ 		case WalSndState_CATCHUP: return "CATCHUP";
+ 		case WalSndState_STREAMING: return "STREAMING";
+ 	}
+ 	return "UNKNOWN";
+ }
+ 
+ 
  /*
   * Returns activity of walsenders, including pids and xlog locations sent to
   * standby servers.
***************
*** 967,973 **** WalSndWakeup(void)
  Datum
  pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
  {
! #define PG_STAT_GET_WAL_SENDERS_COLS 	2
  	ReturnSetInfo	   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
  	TupleDesc			tupdesc;
  	Tuplestorestate	   *tupstore;
--- 1001,1007 ----
  Datum
  pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
  {
! #define PG_STAT_GET_WAL_SENDERS_COLS 	3
  	ReturnSetInfo	   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
  	TupleDesc			tupdesc;
  	Tuplestorestate	   *tupstore;
***************
*** 1021,1027 **** pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
  
  		memset(nulls, 0, sizeof(nulls));
  		values[0] = Int32GetDatum(walsnd->pid);
! 		values[1] = CStringGetTextDatum(sent_location);
  
  		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
  	}
--- 1055,1062 ----
  
  		memset(nulls, 0, sizeof(nulls));
  		values[0] = Int32GetDatum(walsnd->pid);
! 		values[1] = CStringGetTextDatum(WalSndGetStateString(walsnd->state));
! 		values[2] = CStringGetTextDatum(sent_location);
  
  		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
  	}
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 3075,3081 **** DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 f f
  DESCR("statistics: currently active backend IDs");
  DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,application_name,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active backends");
! DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 f f f f t s 0 0 2249 "" "{23,25}" "{o,o}" "{procpid,sent_location}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active replication");
  DATA(insert OID = 2026 (  pg_backend_pid				PGNSP PGUID 12 1 0 0 f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
  DESCR("statistics: current backend PID");
--- 3075,3081 ----
  DESCR("statistics: currently active backend IDs");
  DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,application_name,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active backends");
! DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 f f f f t s 0 0 2249 "" "{23,25,25}" "{o,o,o}" "{procpid,state,sent_location}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active replication");
  DATA(insert OID = 2026 (  pg_backend_pid				PGNSP PGUID 12 1 0 0 f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
  DESCR("statistics: current backend PID");
*** a/src/include/replication/walsender.h
--- b/src/include/replication/walsender.h
***************
*** 16,27 ****
--- 16,36 ----
  #include "storage/latch.h"
  #include "storage/spin.h"
  
+ 
+ typedef enum WalSndState {
+ 	WalSndState_IDLE = 0,
+ 	WalSndState_BACKUP,
+ 	WalSndState_CATCHUP,
+ 	WalSndState_STREAMING
+ } WalSndState;
+ 
  /*
   * Each walsender has a WalSnd struct in shared memory.
   */
  typedef struct WalSnd
  {
  	pid_t		pid;			/* this walsender's process id, or 0 */
+ 	WalSndState	state;			/* this walsender's state */
  	XLogRecPtr	sentPtr;		/* WAL has been sent up to this point */
  
  	slock_t		mutex;			/* locks shared variables shown above */
***************
*** 53,58 **** extern void WalSndSignals(void);
--- 62,68 ----
  extern Size WalSndShmemSize(void);
  extern void WalSndShmemInit(void);
  extern void WalSndWakeup(void);
+ extern void WalSndSetState(WalSndState state);
  
  extern Datum pg_stat_get_wal_senders(PG_FUNCTION_ARGS);
  
#52Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#51)
Re: system views for walsender activity

On Tue, Jan 11, 2011 at 5:28 AM, Magnus Hagander <magnus@hagander.net> wrote:

Does this seem correct?

It looks reasonable, except that I the way you've chosen to capitalize
the wal sender states makes me want to shoot myself.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#53Simon Riggs
simon@2ndQuadrant.com
In reply to: Magnus Hagander (#51)
Re: system views for walsender activity

On Tue, 2011-01-11 at 11:28 +0100, Magnus Hagander wrote:

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

That's something different.

The 3 phases are more concrete.

BACKUP --> CATCHUP<---> STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

There is a trivial state between connect and first command. If you think
that is worth publishing, feel free. STARTING?

I think it's worth publishing. STARTING would be OK, or maybe STARTUP
to parallel the other two -UP states.

Here's a patch for this. I chose IDLE, because that's what we call
other backends that are waiting for commands...

Does this seem correct?

No

You can be "idle" yet in STREAMING mode. What mode we are in has nothing
to do with idle/active. Either STARTING/STARTUP/NULL but not IDLE.

If you want that as well, then we need a second column, but personally
it's too much information and its too hard to say what it actually
means. For example, with sync rep, the WALSender might be idle, yet
there might yet be backends waiting for a reply.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#54Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#52)
Re: system views for walsender activity

On Tue, Jan 11, 2011 at 12:17, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Jan 11, 2011 at 5:28 AM, Magnus Hagander <magnus@hagander.net> wrote:

Does this seem correct?

It looks reasonable, except that I the way you've chosen to capitalize
the wal sender states makes me want to shoot myself.

Hah, I was waiting for that. I can certainly change that :-)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#55Magnus Hagander
magnus@hagander.net
In reply to: Simon Riggs (#53)
Re: system views for walsender activity

On Tue, Jan 11, 2011 at 12:23, Simon Riggs <simon@2ndquadrant.com> wrote:

On Tue, 2011-01-11 at 11:28 +0100, Magnus Hagander wrote:

(You'd need a 4th mode for WAITING or so, to indicate it's waiting for
a command)

That's something different.

The 3 phases are more concrete.

BACKUP -->  CATCHUP<--->  STREAM

When you connect you either do BACKUP or CATCHUP. Once in CATCHUP mode
you never issue a BACKUP. Once we have caught up we move to STREAM. That
has nothing to do with idle/active.

So how does a walsender that's waiting for a command from the client
show up? Surely it's not in "catchup" mode yet?

There is a trivial state between connect and first command. If you think
that is worth publishing, feel free. STARTING?

I think it's worth publishing.  STARTING would be OK, or maybe STARTUP
to parallel the other two -UP states.

Here's a patch for this. I chose IDLE, because that's what we call
other backends that are waiting for commands...

Does this seem correct?

No

You can be "idle" yet in STREAMING mode. What mode we are in has nothing
to do with idle/active. Either STARTING/STARTUP/NULL but not IDLE.

If you want that as well, then we need a second column, but personally
it's too much information and its too hard to say what it actually
means. For example, with sync rep, the WALSender might be idle, yet
there might yet be backends waiting for a reply.

That's a good point.

Just to be clear, you're objecting to the *name* of the state, right,
not how/where it's set? In particular, how the catchup/streaming
things are set?

I agree that using a second column for it is unnecessary.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#56Simon Riggs
simon@2ndQuadrant.com
In reply to: Magnus Hagander (#55)
Re: system views for walsender activity

On Tue, 2011-01-11 at 12:41 +0100, Magnus Hagander wrote:

Just to be clear, you're objecting to the *name* of the state, right,
not how/where it's set?

Yes

In particular, how the catchup/streaming
things are set?

You've set it in the right places.

I would personally constrain the state transitions, so that we can't
ever make illegal changes, such as CATCHUP -> BACKUP.

I would also check the state locally before grabbing the spinlock, as is
typical in other xlog code. Continually resetting shared state to
exactly what it is already seems strange, to me. If we make the rule
that the state can only be changed by the WALSender itself, it won't
need to grab spinlock. If you don't like that, a local variable works.

Also, mixing upper camel case and uppercase for the STATe NamES looKS
weIRD. Uppercase makes them look more clearly like enum/states as used
elsewhere in similar code.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#57Magnus Hagander
magnus@hagander.net
In reply to: Simon Riggs (#56)
Re: system views for walsender activity

On Tue, Jan 11, 2011 at 12:58, Simon Riggs <simon@2ndquadrant.com> wrote:

On Tue, 2011-01-11 at 12:41 +0100, Magnus Hagander wrote:

Just to be clear, you're objecting to the *name* of the state, right,
not how/where it's set?

Yes

In particular, how the catchup/streaming
things are set?

You've set it in the right places.

I would personally constrain the state transitions, so that we can't
ever make illegal changes, such as CATCHUP -> BACKUP.

Well, once we enter the walsender loop we can never get out of it, so
it simply cannot happen...

I would also check the state locally before grabbing the spinlock, as is
typical in other xlog code. Continually resetting shared state to
exactly what it is already seems strange, to me. If we make the rule
that the state can only be changed by the WALSender itself, it won't
need to grab spinlock. If you don't like that, a local variable works.

Hmm. I don't see why anybody other than the walsender should change
it, so yeah. You mean I can just drop the spinlock calls completely,
and then do an
if (walsnd->state != state)
walsnd->state = state;

? Do I need to keep the volatile pointer, or should I drop that as well?

Also, mixing upper camel case and uppercase for the STATe NamES looKS
weIRD. Uppercase makes them look more clearly like enum/states as used
elsewhere in similar code.

Yeah, I'll change that.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#58Simon Riggs
simon@2ndQuadrant.com
In reply to: Magnus Hagander (#57)
Re: system views for walsender activity

On Tue, 2011-01-11 at 13:04 +0100, Magnus Hagander wrote:

On Tue, Jan 11, 2011 at 12:58, Simon Riggs <simon@2ndquadrant.com> wrote:

On Tue, 2011-01-11 at 12:41 +0100, Magnus Hagander wrote:

Just to be clear, you're objecting to the *name* of the state, right,
not how/where it's set?

Yes

In particular, how the catchup/streaming
things are set?

You've set it in the right places.

I would personally constrain the state transitions, so that we can't
ever make illegal changes, such as CATCHUP -> BACKUP.

Well, once we enter the walsender loop we can never get out of it, so
it simply cannot happen...

Accidentally it can, so I would like to protect against that.

Putting those checks in are like Asserts(), they help document what is
and is not possible by design.

I would also check the state locally before grabbing the spinlock, as is
typical in other xlog code. Continually resetting shared state to
exactly what it is already seems strange, to me. If we make the rule
that the state can only be changed by the WALSender itself, it won't
need to grab spinlock. If you don't like that, a local variable works.

Hmm. I don't see why anybody other than the walsender should change
it, so yeah. You mean I can just drop the spinlock calls completely,
and then do an
if (walsnd->state != state)
walsnd->state = state;

? Do I need to keep the volatile pointer, or should I drop that as well?

No, do this at top

if (walsnd->state == state)
return;

Keep spinlocks when actually setting it.

Also, mixing upper camel case and uppercase for the STATe NamES looKS
weIRD. Uppercase makes them look more clearly like enum/states as used
elsewhere in similar code.

Yeah, I'll change that.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#59Magnus Hagander
magnus@hagander.net
In reply to: Simon Riggs (#58)
1 attachment(s)
Re: system views for walsender activity

On Tue, Jan 11, 2011 at 13:18, Simon Riggs <simon@2ndquadrant.com> wrote:

On Tue, 2011-01-11 at 13:04 +0100, Magnus Hagander wrote:

On Tue, Jan 11, 2011 at 12:58, Simon Riggs <simon@2ndquadrant.com> wrote:

On Tue, 2011-01-11 at 12:41 +0100, Magnus Hagander wrote:

Just to be clear, you're objecting to the *name* of the state, right,
not how/where it's set?

Yes

In particular, how the catchup/streaming
things are set?

You've set it in the right places.

I would personally constrain the state transitions, so that we can't
ever make illegal changes, such as CATCHUP -> BACKUP.

Well, once we enter the walsender loop we can never get out of it, so
it simply cannot happen...

Accidentally it can, so I would like to protect against that.

Putting those checks in are like Asserts(), they help document what is
and is not possible by design.

I would also check the state locally before grabbing the spinlock, as is
typical in other xlog code. Continually resetting shared state to
exactly what it is already seems strange, to me. If we make the rule
that the state can only be changed by the WALSender itself, it won't
need to grab spinlock. If you don't like that, a local variable works.

Hmm. I don't see why anybody other than the walsender should change
it, so yeah. You mean I can just drop the spinlock calls completely,
and then do an
if (walsnd->state != state)
   walsnd->state = state;

? Do I need to keep the volatile pointer, or should I drop that as well?

No, do this at top

if (walsnd->state == state)
 return;

Keep spinlocks when actually setting it.

Aha. Thanks for the pointers, pfa a new version.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachments:

walsender_state.patchtext/x-patch; charset=US-ASCII; name=walsender_state.patchDownload
*** a/doc/src/sgml/monitoring.sgml
--- b/doc/src/sgml/monitoring.sgml
***************
*** 298,305 **** postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
        <entry><structname>pg_stat_replication</><indexterm><primary>pg_stat_replication</primary></indexterm></entry>
        <entry>One row per WAL sender process, showing process <acronym>ID</>,
        user OID, user name, application name, client's address and port number,
!       time at which the server process began execution, and transaction log
!       location.
       </entry>
       </row>
  
--- 298,305 ----
        <entry><structname>pg_stat_replication</><indexterm><primary>pg_stat_replication</primary></indexterm></entry>
        <entry>One row per WAL sender process, showing process <acronym>ID</>,
        user OID, user name, application name, client's address and port number,
!       time at which the server process began execution, current WAL sender
!       state and transaction log location.
       </entry>
       </row>
  
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 501,506 **** CREATE VIEW pg_stat_replication AS
--- 501,507 ----
              S.client_addr,
              S.client_port,
              S.backend_start,
+             W.state,
              W.sent_location
      FROM pg_stat_get_activity(NULL) AS S, pg_authid U,
              pg_stat_get_wal_senders() AS W
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 24,29 ****
--- 24,30 ----
  #include "libpq/pqformat.h"
  #include "nodes/pg_list.h"
  #include "replication/basebackup.h"
+ #include "replication/walsender.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
***************
*** 83,88 **** SendBaseBackup(const char *options)
--- 84,91 ----
  										   ALLOCSET_DEFAULT_MAXSIZE);
  	old_context = MemoryContextSwitchTo(backup_context);
  
+ 	WalSndSetState(WALSNDSTATE_BACKUP);
+ 
  	if (backup_label == NULL)
  		ereport(FATAL,
  				(errcode(ERRCODE_PROTOCOL_VIOLATION),
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
***************
*** 179,184 **** WalSndHandshake(void)
--- 179,185 ----
  	{
  		int			firstchar;
  
+ 		WalSndSetState(WALSNDSTATE_IDLE);
  		set_ps_display("idle", false);
  
  		/* Wait for a command to arrive */
***************
*** 482,487 **** WalSndLoop(void)
--- 483,491 ----
  			if (!XLogSend(output_message, &caughtup))
  				break;
  		}
+ 
+ 		/* Update our state to indicate if we're behind or not */
+ 		WalSndSetState(caughtup ? WALSNDSTATE_STREAMING : WALSNDSTATE_CATCHUP);
  	}
  
  	/*
***************
*** 533,538 **** InitWalSnd(void)
--- 537,543 ----
  			 */
  			walsnd->pid = MyProcPid;
  			MemSet(&walsnd->sentPtr, 0, sizeof(XLogRecPtr));
+ 			walsnd->state = WALSNDSTATE_IDLE;
  			SpinLockRelease(&walsnd->mutex);
  			/* don't need the lock anymore */
  			OwnLatch((Latch *) &walsnd->latch);
***************
*** 960,965 **** WalSndWakeup(void)
--- 965,1004 ----
  		SetLatch(&WalSndCtl->walsnds[i].latch);
  }
  
+ /* Set state for current walsender (only called in walsender) */
+ void WalSndSetState(WalSndState state)
+ {
+ 	/* use volatile pointer to prevent code rearrangement */
+ 	volatile WalSnd *walsnd = MyWalSnd;
+ 
+ 	Assert(am_walsender);
+ 
+ 	if (walsnd->state == state)
+ 		return;
+ 
+ 	SpinLockAcquire(&walsnd->mutex);
+ 	walsnd->state = state;
+ 	SpinLockRelease(&walsnd->mutex);
+ }
+ 
+ /*
+  * Return a string constant representing the state. This is used
+  * in system views, and should *not* be translated.
+  */
+ static const char *
+ WalSndGetStateString(WalSndState state)
+ {
+ 	switch (state)
+ 	{
+ 		case WALSNDSTATE_IDLE: return "IDLE";
+ 		case WALSNDSTATE_BACKUP: return "BACKUP";
+ 		case WALSNDSTATE_CATCHUP: return "CATCHUP";
+ 		case WALSNDSTATE_STREAMING: return "STREAMING";
+ 	}
+ 	return "UNKNOWN";
+ }
+ 
+ 
  /*
   * Returns activity of walsenders, including pids and xlog locations sent to
   * standby servers.
***************
*** 967,973 **** WalSndWakeup(void)
  Datum
  pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
  {
! #define PG_STAT_GET_WAL_SENDERS_COLS 	2
  	ReturnSetInfo	   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
  	TupleDesc			tupdesc;
  	Tuplestorestate	   *tupstore;
--- 1006,1012 ----
  Datum
  pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
  {
! #define PG_STAT_GET_WAL_SENDERS_COLS 	3
  	ReturnSetInfo	   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
  	TupleDesc			tupdesc;
  	Tuplestorestate	   *tupstore;
***************
*** 1021,1027 **** pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
  
  		memset(nulls, 0, sizeof(nulls));
  		values[0] = Int32GetDatum(walsnd->pid);
! 		values[1] = CStringGetTextDatum(sent_location);
  
  		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
  	}
--- 1060,1067 ----
  
  		memset(nulls, 0, sizeof(nulls));
  		values[0] = Int32GetDatum(walsnd->pid);
! 		values[1] = CStringGetTextDatum(WalSndGetStateString(walsnd->state));
! 		values[2] = CStringGetTextDatum(sent_location);
  
  		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
  	}
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 3075,3081 **** DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 f f
  DESCR("statistics: currently active backend IDs");
  DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,application_name,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active backends");
! DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 f f f f t s 0 0 2249 "" "{23,25}" "{o,o}" "{procpid,sent_location}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active replication");
  DATA(insert OID = 2026 (  pg_backend_pid				PGNSP PGUID 12 1 0 0 f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
  DESCR("statistics: current backend PID");
--- 3075,3081 ----
  DESCR("statistics: currently active backend IDs");
  DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,application_name,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active backends");
! DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 f f f f t s 0 0 2249 "" "{23,25,25}" "{o,o,o}" "{procpid,state,sent_location}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active replication");
  DATA(insert OID = 2026 (  pg_backend_pid				PGNSP PGUID 12 1 0 0 f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _null_ ));
  DESCR("statistics: current backend PID");
*** a/src/include/replication/walsender.h
--- b/src/include/replication/walsender.h
***************
*** 16,27 ****
--- 16,36 ----
  #include "storage/latch.h"
  #include "storage/spin.h"
  
+ 
+ typedef enum WalSndState {
+ 	WALSNDSTATE_IDLE = 0,
+ 	WALSNDSTATE_BACKUP,
+ 	WALSNDSTATE_CATCHUP,
+ 	WALSNDSTATE_STREAMING
+ } WalSndState;
+ 
  /*
   * Each walsender has a WalSnd struct in shared memory.
   */
  typedef struct WalSnd
  {
  	pid_t		pid;			/* this walsender's process id, or 0 */
+ 	WalSndState	state;			/* this walsender's state */
  	XLogRecPtr	sentPtr;		/* WAL has been sent up to this point */
  
  	slock_t		mutex;			/* locks shared variables shown above */
***************
*** 53,58 **** extern void WalSndSignals(void);
--- 62,68 ----
  extern Size WalSndShmemSize(void);
  extern void WalSndShmemInit(void);
  extern void WalSndWakeup(void);
+ extern void WalSndSetState(WalSndState state);
  
  extern Datum pg_stat_get_wal_senders(PG_FUNCTION_ARGS);
  
#60Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#59)
Re: system views for walsender activity

On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander <magnus@hagander.net> wrote:

No, do this at top

if (walsnd->state == state)
 return;

Keep spinlocks when actually setting it.

I think this is safe...

Aha. Thanks for the pointers, pfa a new version.

...but I think you also need to take the spinlock when reading the value.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#61Shigeru HANADA
hanada@metrosystems.co.jp
In reply to: Magnus Hagander (#59)
1 attachment(s)
Re: system views for walsender activity

On Tue, 11 Jan 2011 13:24:33 +0100
Magnus Hagander <magnus@hagander.net> wrote:

Aha. Thanks for the pointers, pfa a new version.

Changing pg_stat replication view would require to fix regression test
"rule". Please find attached patch.

Regards,
--
Shigeru Hanada

Attachments:

rule_test.patchapplication/octet-stream; name=rule_test.patchDownload
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 61bee46..72e5630 100644
*** a/src/test/regress/expected/rules.out
--- b/src/test/regress/expected/rules.out
*************** SELECT viewname, definition FROM pg_view
*** 1296,1302 ****
   pg_stat_bgwriter            | SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed, pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req, pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint, pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean, pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean, pg_stat_get_buf_written_backend() AS buffers_backend, pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync, pg_stat_get_buf_alloc() AS buffers_alloc;
   pg_stat_database            | SELECT d.oid AS datid, d.datname, pg_stat_get_db_numbackends(d.oid) AS numbackends, pg_stat_get_db_xact_commit(d.oid) AS xact_commit, pg_stat_get_db_xact_rollback(d.oid) AS xact_rollback, (pg_stat_get_db_blocks_fetched(d.oid) - pg_stat_get_db_blocks_hit(d.oid)) AS blks_read, pg_stat_get_db_blocks_hit(d.oid) AS blks_hit, pg_stat_get_db_tuples_returned(d.oid) AS tup_returned, pg_stat_get_db_tuples_fetched(d.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(d.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(d.oid) AS tup_updated, pg_stat_get_db_tuples_deleted(d.oid) AS tup_deleted, pg_stat_get_db_conflict_all(d.oid) AS conflicts FROM pg_database d;
   pg_stat_database_conflicts  | SELECT d.oid AS datid, d.datname, pg_stat_get_db_conflict_tablespace(d.oid) AS confl_tablespace, pg_stat_get_db_conflict_lock(d.oid) AS confl_lock, pg_stat_get_db_conflict_snapshot(d.oid) AS confl_snapshot, pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin, pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock FROM pg_database d;
!  pg_stat_replication         | SELECT s.procpid, s.usesysid, u.rolname AS usename, s.application_name, s.client_addr, s.client_port, s.backend_start, w.sent_location FROM pg_stat_get_activity(NULL::integer) s(datid, procpid, usesysid, application_name, current_query, waiting, xact_start, query_start, backend_start, client_addr, client_port), pg_authid u, pg_stat_get_wal_senders() w(procpid, sent_location) WHERE ((s.usesysid = u.oid) AND (s.procpid = w.procpid));
   pg_stat_sys_indexes         | SELECT pg_stat_all_indexes.relid, pg_stat_all_indexes.indexrelid, pg_stat_all_indexes.schemaname, pg_stat_all_indexes.relname, pg_stat_all_indexes.indexrelname, pg_stat_all_indexes.idx_scan, pg_stat_all_indexes.idx_tup_read, pg_stat_all_indexes.idx_tup_fetch FROM pg_stat_all_indexes WHERE ((pg_stat_all_indexes.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_indexes.schemaname ~ '^pg_toast'::text));
   pg_stat_sys_tables          | SELECT pg_stat_all_tables.relid, pg_stat_all_tables.schemaname, pg_stat_all_tables.relname, pg_stat_all_tables.seq_scan, pg_stat_all_tables.seq_tup_read, pg_stat_all_tables.idx_scan, pg_stat_all_tables.idx_tup_fetch, pg_stat_all_tables.n_tup_ins, pg_stat_all_tables.n_tup_upd, pg_stat_all_tables.n_tup_del, pg_stat_all_tables.n_tup_hot_upd, pg_stat_all_tables.n_live_tup, pg_stat_all_tables.n_dead_tup, pg_stat_all_tables.last_vacuum, pg_stat_all_tables.last_autovacuum, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze, pg_stat_all_tables.vacuum_count, pg_stat_all_tables.autovacuum_count, pg_stat_all_tables.analyze_count, pg_stat_all_tables.autoanalyze_count FROM pg_stat_all_tables WHERE ((pg_stat_all_tables.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_tables.schemaname ~ '^pg_toast'::text));
   pg_stat_user_functions      | SELECT p.oid AS funcid, n.nspname AS schemaname, p.proname AS funcname, pg_stat_get_function_calls(p.oid) AS calls, (pg_stat_get_function_time(p.oid) / 1000) AS total_time, (pg_stat_get_function_self_time(p.oid) / 1000) AS self_time FROM (pg_proc p LEFT JOIN pg_namespace n ON ((n.oid = p.pronamespace))) WHERE ((p.prolang <> (12)::oid) AND (pg_stat_get_function_calls(p.oid) IS NOT NULL));
--- 1296,1302 ----
   pg_stat_bgwriter            | SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed, pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req, pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint, pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean, pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean, pg_stat_get_buf_written_backend() AS buffers_backend, pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync, pg_stat_get_buf_alloc() AS buffers_alloc;
   pg_stat_database            | SELECT d.oid AS datid, d.datname, pg_stat_get_db_numbackends(d.oid) AS numbackends, pg_stat_get_db_xact_commit(d.oid) AS xact_commit, pg_stat_get_db_xact_rollback(d.oid) AS xact_rollback, (pg_stat_get_db_blocks_fetched(d.oid) - pg_stat_get_db_blocks_hit(d.oid)) AS blks_read, pg_stat_get_db_blocks_hit(d.oid) AS blks_hit, pg_stat_get_db_tuples_returned(d.oid) AS tup_returned, pg_stat_get_db_tuples_fetched(d.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(d.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(d.oid) AS tup_updated, pg_stat_get_db_tuples_deleted(d.oid) AS tup_deleted, pg_stat_get_db_conflict_all(d.oid) AS conflicts FROM pg_database d;
   pg_stat_database_conflicts  | SELECT d.oid AS datid, d.datname, pg_stat_get_db_conflict_tablespace(d.oid) AS confl_tablespace, pg_stat_get_db_conflict_lock(d.oid) AS confl_lock, pg_stat_get_db_conflict_snapshot(d.oid) AS confl_snapshot, pg_stat_get_db_conflict_bufferpin(d.oid) AS confl_bufferpin, pg_stat_get_db_conflict_startup_deadlock(d.oid) AS confl_deadlock FROM pg_database d;
!  pg_stat_replication         | SELECT s.procpid, s.usesysid, u.rolname AS usename, s.application_name, s.client_addr, s.client_port, s.backend_start, w.state, w.sent_location FROM pg_stat_get_activity(NULL::integer) s(datid, procpid, usesysid, application_name, current_query, waiting, xact_start, query_start, backend_start, client_addr, client_port), pg_authid u, pg_stat_get_wal_senders() w(procpid, state, sent_location) WHERE ((s.usesysid = u.oid) AND (s.procpid = w.procpid));
   pg_stat_sys_indexes         | SELECT pg_stat_all_indexes.relid, pg_stat_all_indexes.indexrelid, pg_stat_all_indexes.schemaname, pg_stat_all_indexes.relname, pg_stat_all_indexes.indexrelname, pg_stat_all_indexes.idx_scan, pg_stat_all_indexes.idx_tup_read, pg_stat_all_indexes.idx_tup_fetch FROM pg_stat_all_indexes WHERE ((pg_stat_all_indexes.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_indexes.schemaname ~ '^pg_toast'::text));
   pg_stat_sys_tables          | SELECT pg_stat_all_tables.relid, pg_stat_all_tables.schemaname, pg_stat_all_tables.relname, pg_stat_all_tables.seq_scan, pg_stat_all_tables.seq_tup_read, pg_stat_all_tables.idx_scan, pg_stat_all_tables.idx_tup_fetch, pg_stat_all_tables.n_tup_ins, pg_stat_all_tables.n_tup_upd, pg_stat_all_tables.n_tup_del, pg_stat_all_tables.n_tup_hot_upd, pg_stat_all_tables.n_live_tup, pg_stat_all_tables.n_dead_tup, pg_stat_all_tables.last_vacuum, pg_stat_all_tables.last_autovacuum, pg_stat_all_tables.last_analyze, pg_stat_all_tables.last_autoanalyze, pg_stat_all_tables.vacuum_count, pg_stat_all_tables.autovacuum_count, pg_stat_all_tables.analyze_count, pg_stat_all_tables.autoanalyze_count FROM pg_stat_all_tables WHERE ((pg_stat_all_tables.schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (pg_stat_all_tables.schemaname ~ '^pg_toast'::text));
   pg_stat_user_functions      | SELECT p.oid AS funcid, n.nspname AS schemaname, p.proname AS funcname, pg_stat_get_function_calls(p.oid) AS calls, (pg_stat_get_function_time(p.oid) / 1000) AS total_time, (pg_stat_get_function_self_time(p.oid) / 1000) AS self_time FROM (pg_proc p LEFT JOIN pg_namespace n ON ((n.oid = p.pronamespace))) WHERE ((p.prolang <> (12)::oid) AND (pg_stat_get_function_calls(p.oid) IS NOT NULL));
#62Andrew Dunstan
andrew@dunslane.net
In reply to: Shigeru HANADA (#61)
Re: system views for walsender activity

On 01/11/2011 10:24 PM, Shigeru HANADA wrote:

On Tue, 11 Jan 2011 13:24:33 +0100
Magnus Hagander<magnus@hagander.net> wrote:

Aha. Thanks for the pointers, pfa a new version.

Changing pg_stat replication view would require to fix regression test
"rule". Please find attached patch.

I have just committed a fix for this.

cheers

andrew

#63Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#60)
Re: system views for walsender activity

On Wed, Jan 12, 2011 at 03:03, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander <magnus@hagander.net> wrote:

No, do this at top

if (walsnd->state == state)
 return;

Keep spinlocks when actually setting it.

I think this is safe...

Aha. Thanks for the pointers, pfa a new version.

...but I think you also need to take the spinlock when reading the value.

Even when it can only ever be set by one process (the owning
walsender), and the variable is atomic (as it should be, since it's a
single enum/int)?

Anyway, it should be as simple as copying it out to a local variable
when it's already in the spinlock and then use that, right?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#64Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#63)
Re: system views for walsender activity

On Thu, Jan 13, 2011 at 11:08 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Jan 12, 2011 at 03:03, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander <magnus@hagander.net> wrote:

No, do this at top

if (walsnd->state == state)
 return;

Keep spinlocks when actually setting it.

I think this is safe...

Aha. Thanks for the pointers, pfa a new version.

...but I think you also need to take the spinlock when reading the value.

Even when it can only ever be set by one process (the owning
walsender), and the variable is atomic (as it should be, since it's a
single enum/int)?

The fact that it can only be modified by one process makes it safe for
*that process* to read it without taking the lock, but another process
that wants to read it still needs the lock, I believe - otherwise you
might get a slightly stale value. That's probably not a *huge* deal
in this case, but I think it'd be better to get it right because
people tend to copy these sorts of things elsewhere, and it'd be bad
if it got copied into some place more critical.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#65Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#64)
Re: system views for walsender activity

On Thu, Jan 13, 2011 at 18:43, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 13, 2011 at 11:08 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Jan 12, 2011 at 03:03, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Jan 11, 2011 at 7:24 AM, Magnus Hagander <magnus@hagander.net> wrote:

No, do this at top

if (walsnd->state == state)
 return;

Keep spinlocks when actually setting it.

I think this is safe...

Aha. Thanks for the pointers, pfa a new version.

...but I think you also need to take the spinlock when reading the value.

Even when it can only ever be set by one process (the owning
walsender), and the variable is atomic (as it should be, since it's a
single enum/int)?

The fact that it can only be modified by one process makes it safe for
*that process* to read it without taking the lock, but another process
that wants to read it still needs the lock, I believe - otherwise you
might get a slightly stale value.  That's probably not a *huge* deal
in this case, but I think it'd be better to get it right because
people tend to copy these sorts of things elsewhere, and it'd be bad
if it got copied into some place more critical.

ok, thanks for the pointers - fix applied.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/