pg_stat_activity.waiting_start

Started by Joel Jacobson about 9 years ago, 32 messages
#1 Joel Jacobson
joel@trustly.com

Hi hackers,

We already have xact_start, query_start and backend_start
to get the timestamptz for when different things happened.

I would like to propose adding a fourth such column, "waiting_start",
which would tell how long a backend has been waiting.

The column would be NULL when waiting=FALSE.

While it's trivial to write a script that just polls pg_stat_activity
every second and keeps track of when a backend started
waiting by just checking for any new waiting=TRUE rows,
it would be more convenient to just get the information from
pg_stat_activity directly.

The use-case would be e.g. monitoring tools,
where you want to know how long queries have been waiting.

--
Joel Jacobson


#2 Joel Jacobson
joel@trustly.com
In reply to: Joel Jacobson (#1)
Re: pg_stat_activity.waiting_start

Actually, "wait_start" is a better name to match the others
("wait_event_type" and "wait_event").

On Sat, Dec 24, 2016 at 8:20 AM, Joel Jacobson <joel@trustly.com> wrote:

Hi hackers,

We already have xact_start, query_start and backend_start
to get the timestamptz for when different things happened.

I would like to propose adding a fourth such column, "waiting_start",
which would tell how long a backend has been waiting.

The column would be NULL when waiting=FALSE.

While it's trivial to write a script that just polls pg_stat_activity
every second and keeps track of when a backend started
waiting by just checking for any new waiting=TRUE rows,
it would be more convenient to just get the information from
pg_stat_activity directly.

The use-case would be e.g. monitoring tools,
where you want to know how long queries have been waiting.

--
Joel Jacobson

--
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter


#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joel Jacobson (#1)
Re: pg_stat_activity.waiting_start

Joel Jacobson <joel@trustly.com> writes:

We already have xact_start, query_start and backend_start
to get the timestamptz for when different things happened.

I would like to propose adding a fourth such column, "waiting_start",
which would tell how long a backend has been waiting.

The difficulty with that is it'd require a gettimeofday() call for
every wait start. Even on platforms where those are relatively cheap,
the overhead would be nasty --- and on some platforms, it'd be
astonishingly bad. We sweated quite a lot to get the overhead of
pg_stat_activity wait monitoring down to the point where it would be
tolerable for non-heavyweight locks, but I'm afraid this would push
it back into the not-tolerable range.

regards, tom lane


#4 Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#3)
Re: pg_stat_activity.waiting_start

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Joel Jacobson <joel@trustly.com> writes:

We already have xact_start, query_start and backend_start
to get the timestamptz for when different things happened.

I would like to propose adding a fourth such column, "waiting_start",
which would tell how long a backend has been waiting.

The difficulty with that is it'd require a gettimeofday() call for
every wait start. Even on platforms where those are relatively cheap,
the overhead would be nasty --- and on some platforms, it'd be
astonishingly bad. We sweated quite a lot to get the overhead of
pg_stat_activity wait monitoring down to the point where it would be
tolerable for non-heavyweight locks, but I'm afraid this would push
it back into the not-tolerable range.

Could we handle this like log_lock_waits..?

Thanks!

Stephen

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#4)
Re: pg_stat_activity.waiting_start

Stephen Frost <sfrost@snowman.net> writes:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

The difficulty with that is it'd require a gettimeofday() call for
every wait start. Even on platforms where those are relatively cheap,
the overhead would be nasty --- and on some platforms, it'd be
astonishingly bad. We sweated quite a lot to get the overhead of
pg_stat_activity wait monitoring down to the point where it would be
tolerable for non-heavyweight locks, but I'm afraid this would push
it back into the not-tolerable range.

Could we handle this like log_lock_waits..?

Well, that only applies to heavyweight locks, which do a gettimeofday
anyway in order to schedule the deadlock-check timeout. If you were
willing to populate this new column only for heavyweight locks, maybe it
could be done for minimal overhead. But that would be backsliding
quite a lot compared to what we just did to extend pg_stat_activity's
coverage of lock types.

regards, tom lane


#6 Joel Jacobson
joel@trustly.com
In reply to: Tom Lane (#3)
Re: pg_stat_activity.waiting_start

On Sat, Dec 24, 2016 at 9:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I would like to propose adding a fourth such column, "waiting_start",
which would tell how long a backend has been waiting.

The difficulty with that is it'd require a gettimeofday() call for
every wait start. Even on platforms where those are relatively cheap,
the overhead would be nasty --- and on some platforms, it'd be
astonishingly bad. We sweated quite a lot to get the overhead of
pg_stat_activity wait monitoring down to the point where it would be
tolerable for non-heavyweight locks, but I'm afraid this would push
it back into the not-tolerable range.

I don't think we need the microsecond resolution provided by
gettimeofday() via GetCurrentTimestamp().
It would be enough to know which second the waiting started, so we
could use time().

gettimeofday() takes 42 cycles.
time() only takes 3 cycles. [1]

[1]: http://stackoverflow.com/questions/6498972/faster-equivalent-of-gettimeofday
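
Per-call figures like these are typically measured by timing a tight loop
around the call in question. A minimal sketch of such a micro-benchmark
(illustrative only; the macro name is arbitrary and the numbers are highly
platform-dependent) could look like this:

#include <time.h>
#include <sys/time.h>

#define N_CALLS 100000000L

int
main(void)
{
    long        i;
    volatile long sink = 0;

    for (i = 0; i < N_CALLS; i++)
    {
#ifdef TIME_GETTIMEOFDAY
        struct timeval tv;

        gettimeofday(&tv, NULL);    /* microsecond-resolution clock read */
        sink += tv.tv_sec;
#else
        sink += time(NULL);         /* second-resolution clock read */
#endif
    }
    (void) sink;
    return 0;
}

Dividing the wall-clock run time by N_CALLS gives a rough per-call cost.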


#7 Joel Jacobson
joel@trustly.com
In reply to: Joel Jacobson (#6)
Re: pg_stat_activity.waiting_start

On Sat, Dec 24, 2016 at 9:56 AM, Joel Jacobson <joel@trustly.com> wrote:

The difficulty with that is it'd require a gettimeofday() call for
every wait start. Even on platforms where those are relatively cheap,

I just realized how this can be optimized.
We only need to set wait_start for every new waiting period,
not for every wait start, i.e. not for every call to pgstat_report_wait_start():

Example:

In pgstat_report_wait_start():

    if (proc->wait_start == 0)
        proc->wait_start = (pg_time_t) time(NULL);

And then in pgstat_report_wait_end():

    proc->wait_start = 0;

This means we only need to call time() or gettimeofday() once per
waiting period.

--
Joel Jacobson


#8 Joel Jacobson
joel@trustly.com
In reply to: Joel Jacobson (#7)
1 attachment(s)
Re: pg_stat_activity.waiting_start

Attached is a patch implementing the seconds-resolution wait_start, but
presented as a timestamptz to the user, just like the other *_start fields:

commit c001e5c537e36d2683a7e55c7c8bfcc154de4c9d
Author: Joel Jacobson <joel@trustly.com>
Date: Sat Dec 24 13:20:09 2016 +0700

Add OUT parameter "wait_start" timestamptz to pg_stat_get_activity()
and pg_catalog.pg_stat_activity

This is set to the timestamptz (with seconds resolution)
at which the process started waiting, and reset to NULL
when it is no longer waiting.

This is useful if you want to know not only what the process is
waiting for, but also how long it has been waiting,
e.g. in situations where it might be
normal for different users/applications to wait for some amount of time,
but abnormal if they wait longer than some threshold.

When such a threshold is exceeded, monitoring applications
could then alert the user or possibly cancel/terminate
the blocking processes.

Without information on how long processes have been waiting,
monitoring applications would have no option but
to cancel/terminate a process as soon as it starts waiting,
or to track wait durations themselves by polling
on a per-process basis,
which is less user-friendly than having PostgreSQL provide
the information directly to the user.

src/backend/catalog/system_views.sql | 3 ++-
src/backend/postmaster/pgstat.c | 1 +
src/backend/storage/lmgr/proc.c | 1 +
src/backend/utils/adt/pgstatfuncs.c | 7 ++++++-
src/include/catalog/pg_proc.h | 2 +-
src/include/pgstat.h | 6 +++++-
src/include/storage/proc.h | 3 +++
7 files changed, 19 insertions(+), 4 deletions(-)

On Sat, Dec 24, 2016 at 12:32 PM, Joel Jacobson <joel@trustly.com> wrote:

On Sat, Dec 24, 2016 at 9:56 AM, Joel Jacobson <joel@trustly.com> wrote:

The difficulty with that is it'd require a gettimeofday() call for
every wait start. Even on platforms where those are relatively cheap,

I just realized how this can be optimized.
We only need to set wait_start for every new waiting period,
not for every wait start, i.e. not for every call to
pgstat_report_wait_start():

Example:

In pgstat_report_wait_start():

    if (proc->wait_start == 0)
        proc->wait_start = (pg_time_t) time(NULL);

And then in pgstat_report_wait_end():

    proc->wait_start = 0;

This means we only need to call time() or gettimeofday() once per
waiting period.

--
Joel Jacobson

--
Joel Jacobson

Mobile: +46703603801
Trustly.com <http://trustly.com/> | Newsroom
<http://www.mynewsdesk.com/trustly_en> | LinkedIn
<https://www.linkedin.com/company/trustly-group-ab> | Twitter
<https://twitter.com/Trustly>

Attachments:

0001-pg_stat_get_activity_wait_start.patch (application/octet-stream)
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 649cef8..f52bc59 100644
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 664,670 **** CREATE VIEW pg_stat_activity AS
              S.state,
              S.backend_xid,
              s.backend_xmin,
!             S.query
      FROM pg_stat_get_activity(NULL) AS S
          LEFT JOIN pg_database AS D ON (S.datid = D.oid)
          LEFT JOIN pg_authid AS U ON (S.usesysid = U.oid);
--- 664,671 ----
              S.state,
              S.backend_xid,
              s.backend_xmin,
!             S.query,
!             S.wait_start
      FROM pg_stat_get_activity(NULL) AS S
          LEFT JOIN pg_database AS D ON (S.datid = D.oid)
          LEFT JOIN pg_authid AS U ON (S.usesysid = U.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 61e6a2c..fb855f2 100644
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
***************
*** 2833,2838 **** pgstat_report_activity(BackendState state, const char *cmd_str)
--- 2833,2839 ----
  			/* st_xact_start_timestamp and wait_event_info are also disabled */
  			beentry->st_xact_start_timestamp = 0;
  			proc->wait_event_info = 0;
+ 			proc->wait_start = 0;
  			pgstat_increment_changecount_after(beentry);
  		}
  		return;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index e9555f2..8db2c52 100644
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
***************
*** 406,411 **** InitProcess(void)
--- 406,412 ----
  
  	/* Initialize wait event information. */
  	MyProc->wait_event_info = 0;
+ 	MyProc->wait_start = 0;
  
  	/*
  	 * Acquire ownership of the PGPROC's latch, so that we can use WaitLatch
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 2d3cf9e..dbb4605 100644
*** a/src/backend/utils/adt/pgstatfuncs.c
--- b/src/backend/utils/adt/pgstatfuncs.c
***************
*** 638,644 **** pg_stat_get_progress_info(PG_FUNCTION_ARGS)
  Datum
  pg_stat_get_activity(PG_FUNCTION_ARGS)
  {
! #define PG_STAT_GET_ACTIVITY_COLS	23
  	int			num_backends = pgstat_fetch_stat_numbackends();
  	int			curr_backend;
  	int			pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
--- 638,644 ----
  Datum
  pg_stat_get_activity(PG_FUNCTION_ARGS)
  {
! #define PG_STAT_GET_ACTIVITY_COLS	24
  	int			num_backends = pgstat_fetch_stat_numbackends();
  	int			curr_backend;
  	int			pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
***************
*** 806,811 **** pg_stat_get_activity(PG_FUNCTION_ARGS)
--- 806,816 ----
  			else
  				nulls[7] = true;
  
+ 			if (proc->wait_start)
+ 				values[23] = TimestampTzGetDatum(time_t_to_timestamptz(proc->wait_start));
+ 			else
+ 				nulls[23] = true;
+ 
  			if (beentry->st_xact_start_timestamp != 0)
  				values[8] = TimestampTzGetDatum(beentry->st_xact_start_timestamp);
  			else
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index cd7b909..eee9e31 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 2764,2770 **** DATA(insert OID = 3057 ( pg_stat_get_autoanalyze_count PGNSP PGUID 12 1 0 0 0 f
  DESCR("statistics: number of auto analyzes for a table");
  DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 0 f f f f t t s r 0 0 23 "" _null_ _null_ _null_ _null_ _null_ pg_stat_get_backend_idset _null_ _null_ _null_ ));
  DESCR("statistics: currently active backend IDs");
! DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 0 f f f f f t s r 1 0 2249 "23" "{23,26,23,26,25,25,25,25,25,1184,1184,1184,1184,869,25,23,28,28,16,25,25,23,16,25}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,ssl,sslversion,sslcipher,sslbits,sslcompression,sslclientdn}" _null_ _null_ pg_stat_get_activity _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active backends");
  DATA(insert OID = 3318 (  pg_stat_get_progress_info			  PGNSP PGUID 12 1 100 0 0 f f f f t t s r 1 0 2249 "25" "{25,23,26,26,20,20,20,20,20,20,20,20,20,20}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{cmdtype,pid,datid,relid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}" _null_ _null_ pg_stat_get_progress_info _null_ _null_ _null_ ));
  DESCR("statistics: information about progress of backends running maintenance command");
--- 2764,2770 ----
  DESCR("statistics: number of auto analyzes for a table");
  DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 0 f f f f t t s r 0 0 23 "" _null_ _null_ _null_ _null_ _null_ pg_stat_get_backend_idset _null_ _null_ _null_ ));
  DESCR("statistics: currently active backend IDs");
! DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 0 f f f f f t s r 1 0 2249 "23" "{23,26,23,26,25,25,25,25,25,1184,1184,1184,1184,869,25,23,28,28,16,25,25,23,16,25,1184}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,ssl,sslversion,sslcipher,sslbits,sslcompression,sslclientdn,wait_start}" _null_ _null_ pg_stat_get_activity _null_ _null_ _null_ ));
  DESCR("statistics: information about currently active backends");
  DATA(insert OID = 3318 (  pg_stat_get_progress_info			  PGNSP PGUID 12 1 100 0 0 f f f f t t s r 1 0 2249 "25" "{25,23,26,26,20,20,20,20,20,20,20,20,20,20}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{cmdtype,pid,datid,relid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}" _null_ _null_ pg_stat_get_progress_info _null_ _null_ _null_ ));
  DESCR("statistics: information about progress of backends running maintenance command");
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 282f8ae..009055c 100644
*** a/src/include/pgstat.h
--- b/src/include/pgstat.h
***************
*** 20,26 ****
  #include "storage/proc.h"
  #include "utils/hsearch.h"
  #include "utils/relcache.h"
! 
  
  /* ----------
   * Paths for the statistics files (relative to installation's $PGDATA).
--- 20,26 ----
  #include "storage/proc.h"
  #include "utils/hsearch.h"
  #include "utils/relcache.h"
! #include "pgtime.h"
  
  /* ----------
   * Paths for the statistics files (relative to installation's $PGDATA).
***************
*** 1106,1111 **** pgstat_report_wait_start(uint32 wait_event_info)
--- 1106,1114 ----
  	 * four-bytes, updates are atomic.
  	 */
  	proc->wait_event_info = wait_event_info;
+ 
+ 	if (proc->wait_start == 0)
+ 		proc->wait_start = (pg_time_t) time(NULL);
  }
  
  /* ----------
***************
*** 1130,1135 **** pgstat_report_wait_end(void)
--- 1133,1139 ----
  	 * four-bytes, updates are atomic.
  	 */
  	proc->wait_event_info = 0;
+ 	proc->wait_start = 0;
  }
  
  /* nontransactional event counts are simple enough to inline */
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 0344f42..c259a1b 100644
*** a/src/include/storage/proc.h
--- b/src/include/storage/proc.h
***************
*** 20,25 ****
--- 20,26 ----
  #include "storage/lock.h"
  #include "storage/pg_sema.h"
  #include "storage/proclist_types.h"
+ #include "pgtime.h"
  
  /*
   * Each backend advertises up to PGPROC_MAX_CACHED_SUBXIDS TransactionIds
***************
*** 176,181 **** struct PGPROC
--- 177,184 ----
  	PGPROC	   *lockGroupLeader;	/* lock group leader, if I'm a member */
  	dlist_head	lockGroupMembers;		/* list of members, if I'm a leader */
  	dlist_node	lockGroupLink;	/* my member link, if I'm a member */
+ 
+ 	pg_time_t wait_start; /* time when the proc started waiting */
  };
  
  /* NOTE: "typedef struct PGPROC PGPROC" appears in storage/lock.h. */
#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joel Jacobson (#6)
1 attachment(s)
Re: pg_stat_activity.waiting_start

Joel Jacobson <joel@trustly.com> writes:

On Sat, Dec 24, 2016 at 9:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The difficulty with that is it'd require a gettimeofday() call for
every wait start.

I don't think we need the microsecond resolution provided by
gettimeofday() via GetCurrentTimestamp().
It would be enough to know which second the waiting started, so we
could use time().

For some use-cases, perhaps ...

gettimeofday() takes 42 cycles.
time() only takes 3 cycles. [1]

... on some platforms, perhaps, back in 2011. My own quick testing flatly
contradicts that. On Linux (RHEL6) x86_64, I see gettimeofday() taking
about 40ns (or circa 100 CPU cycles @ 2.4GHz), time() only a shade better
at 38ns. On macOS x86_64 2.7GHz, I see gettimeofday() taking about 37ns,
time() way behind at 178ns. I think we'd need at least an order of
magnitude cheaper to consider putting timing calls into spinlock or lwlock
paths, and that's just not available at all, let alone portably.

This is not an easy problem. See our most recent discussion at
/messages/by-id/31856.1400021891@sss.pgh.pa.us

I'm prepared to consider an argument that wait timing might have weaker
requirements than EXPLAIN ANALYZE (which certainly needs to measure short
durations) but you didn't actually make that argument.

regards, tom lane

Linux:

$ gcc -Wall -O2 -o time-timing-calls time-timing-calls.c
$ time ./time-timing-calls

real 0m37.678s
user 0m37.685s
sys 0m0.002s
$ gcc -Wall -O2 -o time-timing-calls -DUSE_GT time-timing-calls.c
$ time ./time-timing-calls

real 0m39.964s
user 0m39.971s
sys 0m0.001s

macOS:

$ gcc -Wall -O2 -o time-timing-calls time-timing-calls.c
$ time ./time-timing-calls

real 2m58.683s
user 2m58.515s
sys 0m0.131s
$ gcc -Wall -O2 -o time-timing-calls -DUSE_GT time-timing-calls.c
$ time ./time-timing-calls

real 0m37.004s
user 0m36.993s
sys 0m0.006s

Attachments:

time-timing-calls.c (text/x-c; charset=us-ascii)
#10 Joel Jacobson
joel@trustly.com
In reply to: Tom Lane (#9)
Re: pg_stat_activity.waiting_start

This is not an easy problem. See our most recent discussion at
/messages/by-id/31856.1400021891@sss.pgh.pa.us

Thanks for the small test program.
I tested it on my MacBook Pro and gettimeofday() was way faster than time().

The clock_gettime() used by the patch clock_gettime_1.patch in the
mailing thread from Hari Babu apparently doesn't work on OS X.

Instead, I tested the OS X specific mach_absolute_time() which was the fastest:

gcc -Wall -O2 -o time-timing-calls -DUSE_MACH_ABSOLUTE_TIME time-timing-calls.c
time ./time-timing-calls
real 0m16.806s
user 0m16.781s
sys 0m0.012s

gcc -Wall -O2 -o time-timing-calls -DUSE_GETTIMEOFDAY time-timing-calls.c
time ./time-timing-calls
real 0m35.466s
user 0m35.062s
sys 0m0.393s

Code:

#ifdef __MACH__
#include <mach/mach_time.h>
#endif
#ifdef USE_MACH_ABSOLUTE_TIME
uint64_t tv;
tv = mach_absolute_time();
#endif

I'm prepared to consider an argument that wait timing might have weaker
requirements than EXPLAIN ANALYZE (which certainly needs to measure short
durations) but you didn't actually make that argument.

I can see why timing overhead is a problem in EXPLAIN ANALYZE and in
other places,
and that would of course be a great thing to fix.

However, I'm not sure I fully understand how it can be that much of a
problem in pgstat_report_wait_start()?

As far as I can tell from reading the source code, it appears
pgstat_report_wait_start() is only entered when a process is waiting?

Is it not likely the time spent waiting will vastly exceed the amount
of extra time for the gettimeofday() call?

Is it really a typical real-life scenario that processes can be
waiting extremely often for extremely short periods of time,
where the timing overhead would be significant?


#11 Andres Freund
andres@anarazel.de
In reply to: Joel Jacobson (#10)
Re: pg_stat_activity.waiting_start

On December 25, 2016 1:21:43 AM GMT+01:00, Joel Jacobson <joel@trustly.com> wrote:

Is it really a typical real-life scenario that processes can be
waiting extremely often for extremely short periods of time,
where the timing overhead would be significant?

Yes. Consider WAL insertion, procarray or other similar contended locks.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


#12 Joel Jacobson
joel@trustly.com
In reply to: Andres Freund (#11)
Re: pg_stat_activity.waiting_start

On Sun, Dec 25, 2016 at 8:01 PM, Andres Freund <andres@anarazel.de> wrote:

On December 25, 2016 1:21:43 AM GMT+01:00, Joel Jacobson <joel@trustly.com> wrote:

Is it really a typical real-life scenario that processes can be
waiting extremely often for extremely short periods of time,
where the timing overhead would be significant?

Yes. Consider WAL insertion, procarray or other similar contended locks.

Ah, I see, then I understand it has to be blazingly fast.

Maybe a good tradeoff then would be to let "wait_start" represent the
very first time the txn started waiting?
That way gettimeofday() would only be called once per txn, and the
value would be remembered, but not exposed when the txn is not
waiting.
If the txn is waiting/not waiting multiple times during its
lifetime, the same "wait_start" value would be exposed when it's
waiting, and NULL when it's not. Sounds good?

As long as the documentation is clear on "wait_start" meaning when the
first wait started in the txn, I think that's useful enough to improve
the situation, as one could then ask a query like "select all
processes that have possibly been waiting for at least 5 seconds",
which you cannot do today.

The best you can do today is ask a query like "select all processes
that are waiting and have been running for at least 5 seconds", but
during those 5 seconds of running, they might only have
been waiting for the very last few milliseconds, which might not be a
problem at all. If I instead know they were waiting 5 seconds ago
and are still waiting, even though they might have had periods in between
where they were not waiting, I would say that is close enough to what I as a
user want to know, and I can use that information for automatic
decision-making, e.g. whether to terminate other blocking
processes.


#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joel Jacobson (#12)
Re: pg_stat_activity.waiting_start

Joel Jacobson <joel@trustly.com> writes:

Maybe a good tradeoff then would be to let "wait_start" represent the
very first time the txn started waiting?
...
As long as the documentation is clear on "wait_start" meaning when the
first wait started in the txn, I think that's useful enough to improve
the situation, as one could then ask a query like "select all
processes that have possibly been waiting for at least 5 seconds",
which you cannot do today.

Meh. You *can* do that now: query pg_stat_activity for waiting processes,
wait a couple seconds, query again, intersect the results. I think really
the only useful improvement on that would be to be able to tell that the
process has been blocked continuously for X seconds, and what you're
proposing here won't do that.

In practice, there should never be waits on LWLocks (much less spinlocks)
that exceed order-of-milliseconds; if there are, either we chose the wrong
lock type or the system is pretty broken in general. So maybe it's
sufficient if we provide a wait start time for heavyweight locks ...
though that still seems kind of ugly. (Also, I don't recall the existing
code factorization there, but getting the start time into pg_stat_activity
without an extra gettimeofday call might be hard. As I said, there is
one being done, but I'm not sure how accessible its result is.)

I did a bit more research over the weekend into the cost of gettimeofday
and clock_gettime, and have some new results that I'll post into that
thread shortly. But the short answer is still "they're damn expensive
on some platforms, and not really cheap anywhere".

regards, tom lane


#14 Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#13)
Re: pg_stat_activity.waiting_start

Tom,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

In practice, there should never be waits on LWLocks (much less spinlocks)
that exceed order-of-milliseconds; if there are, either we chose the wrong
lock type or the system is pretty broken in general. So maybe it's
sufficient if we provide a wait start time for heavyweight locks ...
though that still seems kind of ugly.

While I agree that it's a bit ugly, if the alternative is "don't have
anything", then I'd argue that it's worth it. The use-case for this
feature, as I see it, is for admins to be able to go look at how long
something has been waiting, and monitoring scripts which fire only
every minute or more; order-of-milliseconds differences aren't
significant there.

It's terribly ugly, but from a practical standpoint, we could probably
make it "waiting after deadlock timeout" and just set the time when the
deadlock timeout fires and the use-case for this would be satisfied.

Thanks!

Stephen

#15 Greg Stark
stark@mit.edu
In reply to: Tom Lane (#9)
Re: pg_stat_activity.waiting_start

On Dec 24, 2016 5:44 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

I think we'd need at least an order of
magnitude cheaper to consider putting timing calls into spinlock or lwlock
paths, and that's just not available at all, let alone portably.

For spinlocks we could conceivably just bite the bullet and use a raw rdtsc
or the equivalent for other platforms. It might be pretty easy to
distinguish sane numbers from numbers that result after a process
reschedule and we could just discard data when that happens (or count
occurrences).

That may possibly work for spinlocks but it won't work for anything heavier
where process reschedules are routine.
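
To make the rdtsc idea concrete, a hedged sketch (illustrative only;
__rdtsc() is the GCC/Clang x86 intrinsic, and the cutoff for "sane"
readings is arbitrary):

#include <stdint.h>
#include <x86intrin.h>              /* __rdtsc() on GCC/Clang x86 targets */

/* Deltas above this are assumed to include a process reschedule. */
#define SPIN_TSC_SANE_MAX 200000

static uint64_t spin_wait_cycles;       /* accumulated plausible wait cycles */
static uint64_t spin_wait_discarded;    /* readings discarded as outliers */

static inline void
record_spin_wait(uint64_t start_tsc, uint64_t end_tsc)
{
    uint64_t delta = end_tsc - start_tsc;

    if (delta <= SPIN_TSC_SANE_MAX)
        spin_wait_cycles += delta;      /* looks like a real spin wait */
    else
        spin_wait_discarded++;          /* probably rescheduled mid-wait */
}

/* Usage around a spin loop would be roughly:
 *    uint64_t t0 = __rdtsc();
 *    ... spin until the lock is free ...
 *    record_spin_wait(t0, __rdtsc());
 */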

#16 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Greg Stark (#15)
Re: pg_stat_activity.waiting_start

On 12/27/16 11:17 AM, Greg Stark wrote:

On Dec 24, 2016 5:44 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

I think we'd need at least an order of
magnitude cheaper to consider putting timing calls into spinlock or
lwlock paths, and that's just not available at all, let alone portably.

For spinlocks we could conceivably just bite the bullet and use a raw
rdtsc or the equivalent for other platforms. It might be pretty easy to
distinguish sane numbers from numbers that result after a process
reschedule and we could just discard data when that happens (or count
occurrences).

That may possibly work for spinlocks but it won't work for anything
heavier where process reschedules are routine.

Anything heavier and a ~50ns gettimeofday() call is probably not that
bad anymore...

I also wonder if setting an alarm is cheap enough to happen during a
spinlock? Set a fairly short alarm on entry, clear it on exit. If the
alarm fires, do the gettimeofday(). Add the alarm time back in and
you're going to be close enough to when the wait started for any
practical use of pg_stat_activity (perhaps this is what Stephen is
suggesting about deadlock timeout...)
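
Very roughly, the alarm idea might look like the following sketch (names
and the 5ms value are made up for illustration; whether the extra
setitimer() calls are cheap enough is exactly the open question):

#include <signal.h>
#include <sys/time.h>

static volatile sig_atomic_t wait_alarm_fired = 0;
static struct timeval wait_start_tv;    /* approximate wait start, if known */

static void
wait_alarm_handler(int signo)
{
    (void) signo;
    wait_alarm_fired = 1;               /* only set a flag in the handler */
}

/* Arm a one-shot alarm just before starting to wait. */
static void
arm_wait_alarm(void)
{
    struct itimerval t = {{0, 0}, {0, 5000}};   /* fire once, after 5 ms */

    wait_alarm_fired = 0;
    signal(SIGALRM, wait_alarm_handler);
    setitimer(ITIMER_REAL, &t, NULL);
}

/* Called from the wait/spin loop once it notices the flag. */
static void
note_wait_start(void)
{
    gettimeofday(&wait_start_tv, NULL);
    /* The wait actually began about 5 ms ago; adding the alarm interval
     * back in (subtracting it from "now") is close enough for
     * pg_stat_activity.  Sketch only: no tv_usec carry handling. */
    wait_start_tv.tv_usec -= 5000;
}

Clearing the alarm on exit takes another setitimer() call, which is part
of the cost in question.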
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


#17 Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#5)
Re: pg_stat_activity.waiting_start

On Sat, Dec 24, 2016 at 7:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Stephen Frost <sfrost@snowman.net> writes:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

The difficulty with that is it'd require a gettimeofday() call for
every wait start. Even on platforms where those are relatively cheap,
the overhead would be nasty --- and on some platforms, it'd be
astonishingly bad. We sweated quite a lot to get the overhead of
pg_stat_activity wait monitoring down to the point where it would be
tolerable for non-heavyweight locks, but I'm afraid this would push
it back into the not-tolerable range.

Could we handle this like log_lock_waits..?

Well, that only applies to heavyweight locks, which do a gettimeofday
anyway in order to schedule the deadlock-check timeout. If you were
willing to populate this new column only for heavyweight locks, maybe it
could be done for minimal overhead. But that would be backsliding
quite a lot compared to what we just did to extend pg_stat_activity's
coverage of lock types.

Can we think of introducing a new GUC, trace_system_waits or something
like that, which will indicate that sessions will report the value
of wait_start in pg_stat_activity? The default value of such a
parameter can be false, which means wait_start will be shown as NULL in
pg_stat_activity; when it is enabled, wait_start can show the
time as proposed in this thread.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


#18 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Amit Kapila (#17)
Re: pg_stat_activity.waiting_start

On 12/28/16 7:10 AM, Amit Kapila wrote:

Can we think of introducing a new GUC, trace_system_waits or something
like that, which will indicate that sessions will report the value
of wait_start in pg_stat_activity? The default value of such a
parameter can be false, which means wait_start will be shown as NULL in
pg_stat_activity; when it is enabled, wait_start can show the
time as proposed in this thread.

In my experience the problem with those kinds of settings is that they're
never enabled when you actually need them. I think it'd be much better
to find a way to always capture wait_starts that are over some minimum
duration, where collection overhead won't matter but you still have some
good info about what's going on. For pg_stat_activity I'd think that
threshold would be on the order of 50-100ms, though maybe there are other
places where a tighter tolerance would help.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


#19 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#18)
Re: pg_stat_activity.waiting_start

Jim Nasby <Jim.Nasby@BlueTreble.com> writes:

On 12/28/16 7:10 AM, Amit Kapila wrote:

Can we think of introducing a new GUC, trace_system_waits or something
like that, which will indicate that sessions will report the value
of wait_start in pg_stat_activity?

In my experience the problem with those kinds of settings is that they're
never enabled when you actually need them.

Yeah. And they especially won't be enabled in production situations,
if people find out that they cause lots of overhead.

I think it'd be much better
to find a way to always capture wait_starts that are over some minimum
duration, where collection overhead won't matter but you still have some
good info about what's going on. For pg_stat_activity I'd think that
threshold would be on the order of 50-100ms, though maybe there are other
places where a tighter tolerance would help.

The idea of just capturing the wait start for heavyweight locks, and
not other lock types, still seems superior to any of the alternatives
that have been suggested ...

regards, tom lane


#20 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#19)
Re: pg_stat_activity.waiting_start

On 12/28/16 11:25 AM, Tom Lane wrote:

The idea of just capturing the wait start for heavyweight locks, and
not other lock types, still seems superior to any of the alternatives
that have been suggested ...

Is some kind of alarm a viable option for the others? If setting the
alarm is cheap, you could just set one for say 5ms when you have to wait
on a lock. If setting one is too expensive perhaps one could be set at
transaction start that looks at a global (and the global would be set
when a wait started). Obviously that means there's inaccuracy to the
true time spent waiting, but I think that's certainly fine for
pg_stat_activity. Most importantly, it would mean that if something has
gone horribly wrong you'd at least have some kind of relatively accurate
duration to work from.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


#21 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#20)
Re: pg_stat_activity.waiting_start

Jim Nasby <Jim.Nasby@bluetreble.com> writes:

On 12/28/16 11:25 AM, Tom Lane wrote:

The idea of just capturing the wait start for heavyweight locks, and
not other lock types, still seems superior to any of the alternatives
that have been suggested ...

Is some kind of alarm a viable option for the others? If setting the
alarm is cheap,

Setting an alarm is certain to require a gettimeofday and/or a kernel
call. It is far from cheap.

regards, tom lane


#22 Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#19)
Re: pg_stat_activity.waiting_start

On Wed, Dec 28, 2016 at 10:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jim Nasby <Jim.Nasby@BlueTreble.com> writes:

On 12/28/16 7:10 AM, Amit Kapila wrote:

Can we think of introducing a new GUC, trace_system_waits or something
like that, which will indicate that sessions will report the value
of wait_start in pg_stat_activity?

In my experience the problem with those kinds of settings is that they're
never enabled when you actually need them.

I agree with that, but I think it might be better than nothing
especially for LWLock waits.

Yeah. And they especially won't be enabled in production situations,
if people find out that they cause lots of overhead.

What kind of overhead are we expecting here? I think it probably
depends on the workload, but it is quite possible that in most workloads
it is in the single digits. I agree that in an ideal situation we should
not allow even 1% overhead; however, if people are interested in doing
this detailed level of monitoring at the expense of some performance hit,
then why not give them the option to do so.

I think it'd be much better
to find a way to always capture wait_starts that are over some minimum
duration, where collection overhead won't matter but you still have some
good info about what's going on. For pg_stat_activity I'd think that
threshold would be on the order of 50-100ms, though maybe there are other
places where a tighter tolerance would help.

The idea of just capturing the wait start for heavyweight locks, and
not other lock types, still seems superior to any of the alternatives
that have been suggested ...

If we want to capture waits without any system-level parameter, then
that seems a better option, and maybe, later on, we can capture I/O waits
as well. However, I feel that, as proposed, the name of the wait_start
parameter could confuse users, considering we have wait_event and
wait_event_type for generic waits.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


#23 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#21)
Re: pg_stat_activity.waiting_start

On Wed, Dec 28, 2016 at 1:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jim Nasby <Jim.Nasby@bluetreble.com> writes:

On 12/28/16 11:25 AM, Tom Lane wrote:

The idea of just capturing the wait start for heavyweight locks, and
not other lock types, still seems superior to any of the alternatives
that have been suggested ...

Is some kind of alarm a viable option for the others? If setting the
alarm is cheap,

Setting an alarm is certain to require a gettimeofday and/or a kernel
call. It is far from cheap.

If one were OK with a really-really-rough value for when the wait
started, you could have a slot for a timestamp in PGPROC that is
cleared by pgstat_report_wait_end() but not set by
pgstat_report_wait_start(). Then you could have a background process
that goes through and sets it for all processes advertising a wait
event every second, or every 10 seconds, or every half second, or
whatever you want. Of course this doesn't eliminate the overhead but
it pushes it into the background which, if there are idle CPUs, is
almost as good.
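
A hedged sketch of that background sweep (wait_start_approx is a
hypothetical PGPROC field, not an existing one; the sweep interval is
arbitrary, and a real version would have to think about the atomicity of
the timestamp store):

/* Hypothetical loop in a background process; assumes storage/proc.h and
 * utils/timestamp.h plus the invented wait_start_approx field. */
for (;;)
{
    TimestampTz now = GetCurrentTimestamp();   /* one clock read per sweep */
    int         i;

    for (i = 0; i < ProcGlobal->allProcCount; i++)
    {
        PGPROC     *proc = &ProcGlobal->allProcs[i];

        /* Stamp only backends that advertise a wait but have no stamp yet;
         * pgstat_report_wait_end() would clear wait_start_approx. */
        if (proc->wait_event_info != 0 && proc->wait_start_approx == 0)
            proc->wait_start_approx = now;
    }

    pg_usleep(1000000L);                        /* sweep once per second */
}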

I previously proposed something similar to this as a way of profiling
waits and I believe you weren't very supportive of it, but I still
think these kinds of ideas have some legs. We can neither take the
approach that timestamp overhead is irrelevant nor the position that
timestamps are expensive so let's not have any. Some third way has to
be found.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#24 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#13)
Re: pg_stat_activity.waiting_start

On Mon, Dec 26, 2016 at 03:36:39PM -0500, Tom Lane wrote:

In practice, there should never be waits on LWLocks (much less spinlocks)
that exceed order-of-milliseconds; if there are, either we chose the wrong
lock type or the system is pretty broken in general. So maybe it's
sufficient if we provide a wait start time for heavyweight locks ...
though that still seems kind of ugly. (Also, I don't recall the existing
code factorization there, but getting the start time into pg_stat_activity
without an extra gettimeofday call might be hard. As I said, there is
one being done, but I'm not sure how accessible its result is.)

Agreed. No need to add overhead for short-lived locks because the
millisecond values are going to be meaningless to users. I would be
happy if we could find some weasel value for non-heavyweight locks.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


#25 Joel Jacobson
joel@trustly.com
In reply to: Bruce Momjian (#24)
Re: pg_stat_activity.waiting_start

On Thu, Jan 5, 2017 at 4:59 PM, Bruce Momjian <bruce@momjian.us> wrote:

Agreed. No need to add overhead for short-lived locks because the
millisecond values are going to be meaningless to users. I would be
happy if we could find some weasel value for non-heavyweight locks.

To avoid a NULL value for waiting_start, and since non-heavyweight
lock waits don't exceed order-of-milliseconds, I think it would be
acceptable to just return now() whenever something wants to know
waiting_start, i.e. when something selects from pg_stat_activity.

The exact value would only be orders-of-milliseconds away from
now() anyway, so one can argue it's not that important, as long as the
documentation is clear on that point.


#26 Bruce Momjian
bruce@momjian.us
In reply to: Joel Jacobson (#25)
Re: pg_stat_activity.waiting_start

On Thu, Jan 5, 2017 at 06:48:17PM -1000, Joel Jacobson wrote:

On Thu, Jan 5, 2017 at 4:59 PM, Bruce Momjian <bruce@momjian.us> wrote:

Agreed. No need to add overhead for short-lived locks because the
millisecond values are going to be meaningless to users. I would be
happy if we could find some weasel value for non-heavyweight locks.

To avoid a NULL value for waiting_start, and since non-heavyweight
lock waits don't exceed order-of-milliseconds, I think it would be
acceptable to just return now() whenever something wants to know
waiting_start, i.e. when something selects from pg_stat_activity.

The exact value would only be orders-of-milliseconds away from
now() anyway, so one can argue it's not that important, as long as the
documentation is clear on that point.

I don't think now() is a good value as it doesn't indicate to the user
which values are real measurements and which are not. NULL is probably
the best. +/-infinity is odd too.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


#27 Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#26)
Re: pg_stat_activity.waiting_start

On 2017-01-06 10:43:32 -0500, Bruce Momjian wrote:

On Thu, Jan 5, 2017 at 06:48:17PM -1000, Joel Jacobson wrote:

On Thu, Jan 5, 2017 at 4:59 PM, Bruce Momjian <bruce@momjian.us> wrote:

Agreed. No need to add overhead for short-lived locks because the
millisecond values are going to be meaningless to users. I would be
happy if we could find some weasel value for non-heavyweight locks.

To avoid a NULL value for waiting_start, and since non-heavyweight
lock waits don't exceed order-of-milliseconds, I think it would be
acceptable to just return now() whenever something wants to know
waiting_start, i.e. when something selects from pg_stat_activity.

The exact value would only be orders-of-milliseconds away from
now() anyway, so one can argue it's not that important, as long as the
documentation is clear on that point.

I don't think now() is a good value as it doesn't indicate to the user
which values are real measurements and which are not. NULL is probably
the best. +/-infinity is odd too.

Yea. If one wants to make NULL into now() it's trivial enough with a
single coalesce().

Andres


#28 Greg Stark
stark@mit.edu
In reply to: Bruce Momjian (#24)
Re: pg_stat_activity.waiting_start

On 6 January 2017 at 02:59, Bruce Momjian <bruce@momjian.us> wrote:

Agreed. No need to add overhead for short-lived locks because the
millisecond values are going to be meaningless to users. I would be
happy if we could find some weasel value for non-heavyweight locks.

For what it's worth I don't think this is true. It may be true that we
don't need precise measurements of short-lived locks but we do need
accurate measurements even if they're in the expected range.

What users need to know is, in aggregate, how much of the time the
database spends working on their queries goes into different
states. Even if no LWLock is held for more than a few milliseconds at
a time, if it turns out that 80% of the time is being spent waiting
on LWLocks, then there's no point in worrying about CPU speed, for
example. And knowing *which* LWLock is taking up an aggregate of 80%
of time would point at either configuration changes or code changes to
reduce that contention.

I would actually argue the reverse of the above proposal would be more
useful. What we need are counts of how often LWLocks take longer than,
say, 50ms and for shorter waits we need to know how long. Perhaps not
precisely for individual waits but in aggregate we need the totals to
be right so as long as the measurements are accurate that would be
sufficient. So an accurate but imprecise measurement +/- 10ms with low
overhead is better than a precise measurement with high overhead.

--
greg


#29 Bruce Momjian
bruce@momjian.us
In reply to: Greg Stark (#28)
Re: pg_stat_activity.waiting_start

On Sat, Jan 7, 2017 at 01:25:08PM +0000, Greg Stark wrote:

I would actually argue the reverse of the above proposal would be more
useful. What we need are counts of how often LWLocks take longer than,
say, 50ms, and for shorter waits we need to know how long they took.
Perhaps not precisely for individual waits, but in aggregate we need the
totals to be right, so as long as the measurements are accurate that
would be sufficient. So an accurate but imprecise measurement (+/- 10ms)
with low overhead is better than a precise measurement with high overhead.

I agree those values are important, but I don't think people are going
to be able to use pg_stat_activity to get them, so I don't see the point
of trying to supply them there.

See
/messages/by-id/CA+Tgmoav9Q5v5ZGT3+wP_1tQjT6TGYXrwrDcTRrWimC+ZY7RRA@mail.gmail.com
for an excellent example of getting those values via polling.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


#30 Joel Jacobson
joel@trustly.com
In reply to: Greg Stark (#28)
Re: pg_stat_activity.waiting_start

On Sat, Jan 7, 2017 at 3:25 AM, Greg Stark <stark@mit.edu> wrote:

What users need to know is, in aggregate, how much of the time the
database spends working on their queries goes into different
states.

This is a separate feature idea, but I think it's really valuable as well.

Maybe something similar to pg_stat_user_functions?
But instead grouping by wait_event_type, wait_event, and showing
accumulated count and sum of waiting time since last stat reset, just
like the other pg_stat_* views?

Maybe something like this?

\d pg_stat_waiting
        View "pg_catalog.pg_stat_waiting"
     Column      |       Type       | Modifiers
-----------------+------------------+-----------
 wait_event_type | name             |
 wait_event      | name             |
 waiting_counter | bigint           |
 waiting_time    | double precision |


#31 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Robert Haas (#23)
Re: pg_stat_activity.waiting_start

On 12/28/16 10:26 PM, Robert Haas wrote:

On Wed, Dec 28, 2016 at 1:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jim Nasby <Jim.Nasby@bluetreble.com> writes:

On 12/28/16 11:25 AM, Tom Lane wrote:

The idea of just capturing the wait start for heavyweight locks, and
not other lock types, still seems superior to any of the alternatives
that have been suggested ...

Is some kind of alarm a viable option for the others? If setting the
alarm is cheap,

Setting an alarm is certain to require a gettimeofday and/or a kernel
call. It is far from cheap.

If one were OK with a really-really-rough value for when the wait
started, you could have a slot for a timestamp in PGPROC that is
cleared by pgstat_report_wait_end() but not set by
pgstat_report_wait_start(). Then you could have a background process
that goes through and sets it for all processes advertising a wait
event every second, or every 10 seconds, or every half second, or
whatever you want. Of course this doesn't eliminate the overhead but
it pushes it into the background which, if there are idle CPUs, is
almost as good.

I previously proposed something similar to this as a way of profiling
waits and I believe you weren't very supportive of it, but I still
think these kinds of ideas have some legs. We can neither take the
approach that timestamp overhead is irrelevant nor the position that
timestamps are expensive so let's not have any. Some third way has to
be found.

I like that idea as a compromise. I've spent a lot of time looking at
systems with no process above 90% CPU and I/O below 90%, yet with
individual backends being slow, and wished I had some idea of what the
backend was doing that wasn't being accounted for.

BTW, instead of a flag I'd consider using a counter. That would allow
the background process to notice if backends were going into waits that
ultimately got resolved in between scans of PGPROC. That's obviously not
useful for pg_stat_activity, but it would be meaningful for anything
that was trying to accumulate wait time stats.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


#32 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Joel Jacobson (#30)
Re: pg_stat_activity.waiting_start

On 1/7/17 12:41 PM, Joel Jacobson wrote:

On Sat, Jan 7, 2017 at 3:25 AM, Greg Stark <stark@mit.edu> wrote:

What users need to know is, in aggregate, how much of the time the
database spends working on their queries goes into different
states.

This is a separate feature idea, but I think it's really valuable as well.

Maybe something similar to pg_stat_user_functions?
But instead grouping by wait_event_type, wait_event, and showing
accumulated count and sum of waiting time since last stat reset, just
like the other pg_stat_* views?

Maybe something like this?

\d pg_stat_waiting
        View "pg_catalog.pg_stat_waiting"
     Column      |       Type       | Modifiers
-----------------+------------------+-----------
 wait_event_type | name             |
 wait_event      | name             |
 waiting_counter | bigint           |
 waiting_time    | double precision |

Yes, I've wanted this many times in the past. If combined with Robert's
idea of a background process that does the expensive time calls, this
could potentially provide very useful information even for very short
duration locks.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
