RFC: replace pg_stat_activity.waiting with something more descriptive

robertmhaas@gmail.com

almost 11 years ago

In reply to: David G. Johnston (#3)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Mon, Jun 22, 2015 at 1:59 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:

In addition to the codes themselves I think it would aid less-experienced
operators if we would provide a meta-data categorization of the codes.
Something like, I/O Sub-System; Storage Maintenance; Concurrency, etc..

There could be a section in the documentation with these topics as section
headings and a listing and explanation of each of the possible code would
then be described within.

The meta-information is already embedded within the code/descriptions but
explicitly pulling them out would be, IMO, more user-friendly and likely
also aid in triage and speed-of-recognition when reading the corresponding
code/description.

I was thinking that the codes would probably be fairly straightforward
renderings of the underlying C identifiers, e.g.:

Lock (Relation)
Lock (Relation Extension)
Lock (Page)
...
...
LWLock (ShmemIndexLock)
LWLock (OidGenLock)
LWLock (XidGenLock)
LWLock (ProcArrayLock)
...
...
Spin Lock Delay
Buffer Cleanup Lock

We'd then have to figure out how to document that stuff.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David G. Johnston

david.g.johnston@gmail.com

almost 11 years ago

In reply to: Robert Haas (#4)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Mon, Jun 22, 2015 at 4:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jun 22, 2015 at 1:59 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:

In addition to the codes themselves I think it would aid less-experienced
operators if we would provide a meta-data categorization of the codes.
Something like, I/O Sub-System; Storage Maintenance; Concurrency, etc..

There could be a section in the documentation with these topics as

section

headings and a listing and explanation of each of the possible code would
then be described within.

The meta-information is already embedded within the code/descriptions but
explicitly pulling them out would be, IMO, more user-friendly and likely
also aid in triage and speed-of-recognition when reading the

corresponding

code/description.

I was thinking that the codes would probably be fairly straightforward
renderings of the underlying C identifiers, e.g.:

Lock (Relation)
Lock (Relation Extension)
Lock (Page)
...
...
LWLock (ShmemIndexLock)
LWLock (OidGenLock)
LWLock (XidGenLock)
LWLock (ProcArrayLock)
...
...
Spin Lock Delay
Buffer Cleanup Lock

We'd then have to figure out how to document that stuff.

Just tossing stuff at the wall...

CREATE TABLE pg_stat_wait_event_info (
wait_event integer PRIMARY KEY,
category text, --possibly a FK to a description-holding table too
event_type text, --Lock, LWLock, etc...or something higher-level; was
pondering whether ltree could be brought into core and used here...
event_documentation --asciidoc or something similar
);

Add \psql commands:

\dproc [ proc_id] --runs pg_stat_activity more-or-less
\dproc+ proc_id --shows the event information w/ description; and ideally
info from pg_locks among other possibilities

That said, the same documentation should be made available online as well -
but having this allows tools to make use the info to put the documentation
closer to the user.

David J.

Merlin Moncure

mmoncure@gmail.com

almost 11 years ago

In reply to: Robert Haas (#1)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Mon, Jun 22, 2015 at 12:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:

When a PostgreSQL system wedges, or when it becomes dreadfully slow
for some reason, I often find myself relying on tools like strace,
gdb, or perf to figure out what is happening. This doesn't tend to
instill customers with confidence; they would like (quite
understandably) a process that doesn't require installing developer
tools on their production systems, and doesn't require a developer to
interpret the results, and perhaps even something that they could
connect up to PEM or Nagios or whatever alerting system they are
using.

There are obviously many ways that we might think about improving
things here, but what I'd like to do is try to put some better
information in pg_stat_activity, so that when a process is not
running, users can get some better information about *why* it's not
running. The basic idea is that pg_stat_activity.waiting would be
replaced by a new column pg_stat_activity.wait_event, which would
display the reason why that backend is waiting. This wouldn't be a
free-form text field, because that would be too expensive to populate.
Instead it would contain a "reason code" which would be chosen from a
list of reason codes and translated to text for display.

Instead of changing the column, can't we add a new one? Adjusting
columns in PSA requires the innumerable queries written against it to
be adjusted along with all the wiki instructions to dev ops for
emergency stuck query detection etc etc. I would also prefer to
query 'waiting' in some cases, especially when in emergency
situations; it's faster to type.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

robertmhaas@gmail.com

almost 11 years ago

In reply to: Merlin Moncure (#6)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Mon, Jun 22, 2015 at 4:40 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

Instead of changing the column, can't we add a new one? Adjusting
columns in PSA requires the innumerable queries written against it to
be adjusted along with all the wiki instructions to dev ops for
emergency stuck query detection etc etc. I would also prefer to
query 'waiting' in some cases, especially when in emergency
situations; it's faster to type.

If people feel strongly about backward compatibility, yes, we can do
that. However, if waiting continues to mean "on a heavyweight lock"
for backward compatibility, then you could sometimes have waiting =
false but wait_state non-null. That seems confusing enough to be a
bad plan, at least to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Merlin Moncure (#6)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

Merlin Moncure <mmoncure@gmail.com> writes:

On Mon, Jun 22, 2015 at 12:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:

... The basic idea is that pg_stat_activity.waiting would be
replaced by a new column pg_stat_activity.wait_event, which would
display the reason why that backend is waiting.

Instead of changing the column, can't we add a new one? Adjusting
columns in PSA requires the innumerable queries written against it to
be adjusted along with all the wiki instructions to dev ops for
emergency stuck query detection etc etc.

+1. Removing the boolean column seems like it will arbitrarily break
a whole lot of client-side code, for not-very-adequate reasons.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Jim Nasby

Jim.Nasby@BlueTreble.com

almost 11 years ago

In reply to: Robert Haas (#1)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On 6/22/15 12:37 PM, Robert Haas wrote:

It's
not my goal here to create some kind of a performance counter system,
even though that would be valuable and could possibly be based on the
same infrastructure, but rather just to create a very simple system
that lets people know, without any developer tools, what is causing a
backend that has accepted a query and not yet returned a result to be
off-CPU rather than on-CPU.

Ilya Kosmodemiansky presented such a system at pgCon[1]http://www.pgcon.org/2015/schedule/events/809.en.html -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com, and hopes to
submit an initial patch in the coming weeks. The general idea was to do
something similar to what you're describing (though, I believe even more
granular) and have a bgworker accumulating that information.

[1]: http://www.pgcon.org/2015/schedule/events/809.en.html -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Robert Haas (#7)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Tue, Jun 23, 2015 at 2:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jun 22, 2015 at 4:40 PM, Merlin Moncure <mmoncure@gmail.com>

wrote:

Instead of changing the column, can't we add a new one? Adjusting
columns in PSA requires the innumerable queries written against it to
be adjusted along with all the wiki instructions to dev ops for
emergency stuck query detection etc etc. I would also prefer to
query 'waiting' in some cases, especially when in emergency
situations; it's faster to type.

If people feel strongly about backward compatibility, yes, we can do
that. However, if waiting continues to mean "on a heavyweight lock"
for backward compatibility, then you could sometimes have waiting =
false but wait_state non-null. That seems confusing enough to be a
bad plan, at least to me.

That's right if we leave the 'waiting' as it is for the sake of backward
compatibility, then it will be confusing after we add wait_event to
pg_stat_activity and if we change it such that for any kind of wait_event
waiting will be true (or entirely remove waiting), then it will break the
backward compatibility. So we have below alternatives here:

1. Remove/Change 'waiting' in pg_stat_activity and break the backward
compatibility. I think we should try to avoid going via this route.

2. Add 2 new columns to pg_stat_activity
waiting_resource - true for waits other heavy wait locks, false
otherwise
wait_event - description code for the wait event

3. Add new view 'pg_stat_wait_event' with following info:
pid - process id of this backend
waiting - true for any form of wait, false otherwise
wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait, etc
wait_event - Lock (Relation), Lock (Relation Extension), etc

Do you think 2nd or 3rd could be viable way to proceed for this feature?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#11

andres@anarazel.de

almost 11 years ago

In reply to: Amit Kapila (#10)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On 2015-06-25 16:07:45 +0530, Amit Kapila wrote:

On Tue, Jun 23, 2015 at 2:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:

If people feel strongly about backward compatibility, yes, we can do
that. However, if waiting continues to mean "on a heavyweight lock"
for backward compatibility, then you could sometimes have waiting =
false but wait_state non-null. That seems confusing enough to be a
bad plan, at least to me.

That's right if we leave the 'waiting' as it is for the sake of backward
compatibility, then it will be confusing after we add wait_event to
pg_stat_activity and if we change it such that for any kind of wait_event
waiting will be true (or entirely remove waiting), then it will break the
backward compatibility. So we have below alternatives here:

1. Remove/Change 'waiting' in pg_stat_activity and break the backward
compatibility. I think we should try to avoid going via this route.

2. Add 2 new columns to pg_stat_activity
waiting_resource - true for waits other heavy wait locks, false
otherwise
wait_event - description code for the wait event

3. Add new view 'pg_stat_wait_event' with following info:
pid - process id of this backend
waiting - true for any form of wait, false otherwise
wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait, etc
wait_event - Lock (Relation), Lock (Relation Extension), etc

Do you think 2nd or 3rd could be viable way to proceed for this feature?

3) sounds best to me. Keeping 'waiting' even makes sense in that case,
because it'll tell whether wait_event_type is currently being blocked
on. We can leave the former contents in until the next thing is being
blocked...

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Andres Freund (#11)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Thu, Jun 25, 2015 at 4:16 PM, Andres Freund <andres@anarazel.de> wrote:

On 2015-06-25 16:07:45 +0530, Amit Kapila wrote:

On Tue, Jun 23, 2015 at 2:33 AM, Robert Haas <robertmhaas@gmail.com>

wrote:

If people feel strongly about backward compatibility, yes, we can do
that. However, if waiting continues to mean "on a heavyweight lock"
for backward compatibility, then you could sometimes have waiting =
false but wait_state non-null. That seems confusing enough to be a
bad plan, at least to me.

That's right if we leave the 'waiting' as it is for the sake of backward
compatibility, then it will be confusing after we add wait_event to
pg_stat_activity and if we change it such that for any kind of

wait_event

waiting will be true (or entirely remove waiting), then it will break

the

backward compatibility. So we have below alternatives here:

1. Remove/Change 'waiting' in pg_stat_activity and break the backward
compatibility. I think we should try to avoid going via this route.

2. Add 2 new columns to pg_stat_activity
waiting_resource - true for waits other heavy wait locks, false
otherwise
wait_event - description code for the wait event

3. Add new view 'pg_stat_wait_event' with following info:
pid - process id of this backend
waiting - true for any form of wait, false otherwise
wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait, etc
wait_event - Lock (Relation), Lock (Relation Extension), etc

Do you think 2nd or 3rd could be viable way to proceed for this feature?

3) sounds best to me. Keeping 'waiting' even makes sense in that case,
because it'll tell whether wait_event_type is currently being blocked
on. We can leave the former contents in until the next thing is being
blocked...

Won't leaving former contents as it is (until the next thing is being
blocked) could give misleading information. Currently we mark 'waiting'
as false as soon as Heavy Weight Lock is over, so following that way
sounds more appropriate, is there any reason why you want it differently
than what we are doing currently?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#13

Ilya Kosmodemiansky

ilya.kosmodemiansky@postgresql-consulting.com

almost 11 years ago

In reply to: Amit Kapila (#10)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

Hi all

On Thu, Jun 25, 2015 at 12:37 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

2. Add 2 new columns to pg_stat_activity
waiting_resource - true for waits other heavy wait locks, false
otherwise
wait_event - description code for the wait event

3. Add new view 'pg_stat_wait_event' with following info:
pid - process id of this backend
waiting - true for any form of wait, false otherwise
wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait, etc
wait_event - Lock (Relation), Lock (Relation Extension), etc

Personally I think, that tracking waits is a not a good idea for
pg_stat_activity (at least in that straight-forward manner). One
process can wait for lots of things between 2 sampling of
pg_stat_activity and that sampling can be pretty useless.

My approach (about which Ive had a talk mentioned by Jim and which I
hope to finalize and submit within a few days) is a bit different and
I believe is more useful:

1. Some sort of histogram of top waits within entire database by pid.
That will be an approximate one, because I hardly believe there is a
possibility to make a precise one without significant overhead.

2. Some cyclic buffer of more precise wait statistic inside each
worker. Sampling may be turned on if we see some issues in histogram
(1) and want to have some more details.

Do you think 2nd or 3rd could be viable way to proceed for this feature?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

andres@anarazel.de

almost 11 years ago

In reply to: Amit Kapila (#12)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On 2015-06-25 16:26:39 +0530, Amit Kapila wrote:

Won't leaving former contents as it is (until the next thing is being
blocked) could give misleading information. Currently we mark 'waiting'
as false as soon as Heavy Weight Lock is over, so following that way
sounds more appropriate, is there any reason why you want it differently
than what we are doing currently?

But we don't do the same for query, so I don't think that says much. I
think it'd be useful because it gives you a bit more chance to see what
you blocked on last, even if the time the backend was blocked was very
short.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Andres Freund (#14)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Thu, Jun 25, 2015 at 4:28 PM, Andres Freund <andres@anarazel.de> wrote:

On 2015-06-25 16:26:39 +0530, Amit Kapila wrote:

Won't leaving former contents as it is (until the next thing is being
blocked) could give misleading information. Currently we mark 'waiting'
as false as soon as Heavy Weight Lock is over, so following that way
sounds more appropriate, is there any reason why you want it differently
than what we are doing currently?

But we don't do the same for query, so I don't think that says much. I
think it'd be useful because it gives you a bit more chance to see what
you blocked on last, even if the time the backend was blocked was very
short.

Sure, that's another way to look at it, if you and or others feels that is
better,
then we can follow that way.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#16

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Ilya Kosmodemiansky (#13)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Thu, Jun 25, 2015 at 4:28 PM, Ilya Kosmodemiansky <
ilya.kosmodemiansky@postgresql-consulting.com> wrote:

On Thu, Jun 25, 2015 at 12:37 PM, Amit Kapila <amit.kapila16@gmail.com>

wrote:

2. Add 2 new columns to pg_stat_activity
waiting_resource - true for waits other heavy wait locks, false
otherwise
wait_event - description code for the wait event

3. Add new view 'pg_stat_wait_event' with following info:
pid - process id of this backend
waiting - true for any form of wait, false otherwise
wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait, etc
wait_event - Lock (Relation), Lock (Relation Extension), etc

Personally I think, that tracking waits is a not a good idea for
pg_stat_activity (at least in that straight-forward manner).

As mentioned in the initial mail by Robert, that sometimes system becomes
slow (either due to contention on various kinds of locks or due to I/O or
due
to some other such reasons) that such kind of handy information via some
view is quite useful. Recently while working on one of the
performance/scalability
projects, I need to use gdb to attach to different processes to see what
they
are doing (of course one can use perf or some other utilities as well) and I
found most of them were trying to wait on some LW locks, now having such
an information available via view could be really useful, because sometimes
at customer sites, we can't use gdb or perf to see what's going on.

One
process can wait for lots of things between 2 sampling of
pg_stat_activity and that sampling can be pretty useless.

Yeah, that's right and I am not sure if we should bother about such
scenario's
as the system is generally fine in such situations, however there are other
cases where we can find most of the backends are waiting on one or other
thing.

My approach (about which Ive had a talk mentioned by Jim and which I
hope to finalize and submit within a few days) is a bit different and
I believe is more useful:

1. Some sort of histogram of top waits within entire database by pid.
That will be an approximate one, because I hardly believe there is a
possibility to make a precise one without significant overhead.

2. Some cyclic buffer of more precise wait statistic inside each
worker. Sampling may be turned on if we see some issues in histogram
(1) and want to have some more details.

I think this is some what different kind of utility which can give us
aggregated information and I think this will address different kind of
usecase and will have somewhat more complex design and it doesn't
look impossible to use part of what will be developed as part of this
proposal.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#17

Ilya Kosmodemiansky

ilya.kosmodemiansky@postgresql-consulting.com

almost 11 years ago

In reply to: Amit Kapila (#16)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Thu, Jun 25, 2015 at 1:49 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Personally I think, that tracking waits is a not a good idea for
pg_stat_activity (at least in that straight-forward manner).

As mentioned in the initial mail by Robert, that sometimes system becomes
slow (either due to contention on various kinds of locks or due to I/O or
due
to some other such reasons) that such kind of handy information via some
view is quite useful. Recently while working on one of the
performance/scalability
projects, I need to use gdb to attach to different processes to see what
they
are doing (of course one can use perf or some other utilities as well) and I
found most of them were trying to wait on some LW locks, now having such
an information available via view could be really useful, because sometimes
at customer sites, we can't use gdb or perf to see what's going on.

Yes, I understand such a use-case. But I hardly see if suggested
design can help for such cases.

Basically, a DBA has two reasons to take a look on waits:

1. Long response time for particular query (or some type of queries).
In that case it is good to know how much time we spend on waiting for
particular resources we need to get query results
2. Overall bad performance of a database. We know, that something goes
wrong and consumes resources, we need to identify which backend, which
query causes the most of waits.

In both cases we need a) some historical data rather than simple
snapshot b) some approach how to aggregate it because the will be
certainly a lot of events

So my point is, we need separate interface for waits, instead of
integrating in pg_stat_activity. And it should be several interfaces:
one for approximate top of waiting sessions (like
active_sessions_history in oracle), one for detailed tracing of a
session, one for waits per resource statistics etc.

One
process can wait for lots of things between 2 sampling of
pg_stat_activity and that sampling can be pretty useless.

Yeah, that's right and I am not sure if we should bother about such
scenario's
as the system is generally fine in such situations, however there are other
cases where we can find most of the backends are waiting on one or other
thing.

I think approach with top of waiting sessions covers both scenarios
(well, with only one exception: if we have billions of very short
waits and high contention is the problem)

However, it maybe a good idea, to identify the resource we are waiting
for from pg_stat_activity if we are waiting for a long time.

I think this is some what different kind of utility which can give us
aggregated information and I think this will address different kind of
usecase and will have somewhat more complex design and it doesn't
look impossible to use part of what will be developed as part of this
proposal.

I think it is more than possible to mix both approaches. My proof of
concept now is only about LWLocks - yours and Robert's is more
general, and certainly some wait event classification will be needed
for both approaches and its much better to implement one rather than
two different.

And at least, I will be interesting in reviewing your approach.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Ilya Kosmodemiansky,

PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
ik@postgresql-consulting.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Ilya Kosmodemiansky (#17)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On Thu, Jun 25, 2015 at 6:10 PM, Ilya Kosmodemiansky <
ilya.kosmodemiansky@postgresql-consulting.com> wrote:

On Thu, Jun 25, 2015 at 1:49 PM, Amit Kapila <amit.kapila16@gmail.com>

wrote:

Personally I think, that tracking waits is a not a good idea for
pg_stat_activity (at least in that straight-forward manner).

As mentioned in the initial mail by Robert, that sometimes system

becomes

slow (either due to contention on various kinds of locks or due to I/O

due
to some other such reasons) that such kind of handy information via some
view is quite useful. Recently while working on one of the
performance/scalability
projects, I need to use gdb to attach to different processes to see what
they
are doing (of course one can use perf or some other utilities as well)

and I

found most of them were trying to wait on some LW locks, now having such
an information available via view could be really useful, because

sometimes

at customer sites, we can't use gdb or perf to see what's going on.

Yes, I understand such a use-case. But I hardly see if suggested
design can help for such cases.

Basically, a DBA has two reasons to take a look on waits:

1. Long response time for particular query (or some type of queries).
In that case it is good to know how much time we spend on waiting for
particular resources we need to get query results
2. Overall bad performance of a database. We know, that something goes
wrong and consumes resources, we need to identify which backend, which
query causes the most of waits.

In both cases we need a) some historical data rather than simple
snapshot b) some approach how to aggregate it because the will be
certainly a lot of events

I think this thread's proposal will help for cases, when user/DBA wants to
see where currently database is spending most time (during waits).

I understand that there is a use of historical information which can
be helpful for the kind of cases which you have explained above.

I think it is more than possible to mix both approaches. My proof of
concept now is only about LWLocks - yours and Robert's is more
general, and certainly some wait event classification will be needed
for both approaches and its much better to implement one rather than
two different.

And at least, I will be interesting in reviewing your approach.

Okay, I am planning to spend time on this patch in coming few days
and when that's ready, may be we can see if that could be useful
for what you are planning to do.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#19

Peter Eisentraut

peter_e@gmx.net

almost 11 years ago

In reply to: Robert Haas (#1)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

On 6/22/15 1:37 PM, Robert Haas wrote:

Currently, the only time we report a process as waiting is when it is
waiting for a heavyweight lock. I'd like to make that somewhat more
fine-grained, by reporting the type of heavyweight lock it's awaiting
(relation, relation extension, transaction, etc.). Also, I'd like to
report when we're waiting for a lwlock, and report either the specific
fixed lwlock for which we are waiting, or else the type of lock (lock
manager lock, buffer content lock, etc.) for locks of which there is
more than one. I'm less sure about this next part, but I think we
might also want to report ourselves as waiting when we are doing an OS
read or an OS write, because it's pretty common for people to think
that a PostgreSQL bug is to blame when in fact it's the operating
system that isn't servicing our I/O requests very quickly.

Could that also cover waiting on network?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Andres Freund (#14)

Re: RFC: replace pg_stat_activity.waiting with something more descriptive

Andres Freund <andres@anarazel.de> writes:

On 2015-06-25 16:26:39 +0530, Amit Kapila wrote:

Won't leaving former contents as it is (until the next thing is being
blocked) could give misleading information. Currently we mark 'waiting'
as false as soon as Heavy Weight Lock is over, so following that way
sounds more appropriate, is there any reason why you want it differently
than what we are doing currently?

But we don't do the same for query, so I don't think that says much. I
think it'd be useful because it gives you a bit more chance to see what
you blocked on last, even if the time the backend was blocked was very
short.

The problem with the query analogy is that it's possible to tell whether
the query is active or not, by looking at the status column. We need to
avoid a situation where you can't tell if the wait status is current or
merely the last thing waited for.

At the moment I'm inclined to think we should put this on the back burner
until we see what Ilya submits. None of the proposals for changing
pg_stat_activity sound terribly clean to me.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

andres@anarazel.de

almost 11 years ago

In reply to: Tom Lane (#20)

#22

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Andres Freund (#21)

#23

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Tom Lane (#22)

#24

robertmhaas@gmail.com

almost 11 years ago

In reply to: Andres Freund (#11)

#25

robertmhaas@gmail.com

almost 11 years ago

In reply to: Ilya Kosmodemiansky (#13)

#26

robertmhaas@gmail.com

almost 11 years ago

In reply to: Peter Eisentraut (#19)

#27

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Robert Haas (#24)

#28

robertmhaas@gmail.com

almost 11 years ago

In reply to: Amit Kapila (#27)

#29

Peter Eisentraut

peter_e@gmx.net

almost 11 years ago

In reply to: Robert Haas (#26)

#30

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Robert Haas (#28)

#31

Fujii Masao

masao.fujii@gmail.com

almost 11 years ago

In reply to: Robert Haas (#26)

#32

Fujii Masao

masao.fujii@gmail.com

almost 11 years ago

In reply to: Amit Kapila (#30)

#33

horikyota.ntt@gmail.com

almost 11 years ago

In reply to: Fujii Masao (#32)

#34

horikyota.ntt@gmail.com

almost 11 years ago

In reply to: Kyotaro Horiguchi (#33)

#35

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Fujii Masao (#32)

#36

aekorotkov@gmail.com

almost 11 years ago

In reply to: Robert Haas (#26)

#37

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Alexander Korotkov (#36)

#38

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Amit Kapila (#37)

#39

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#38)

#40

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Amit Kapila (#39)

#41

Fujii Masao

masao.fujii@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#40)

#42

robertmhaas@gmail.com

almost 11 years ago

In reply to: Fujii Masao (#31)

#43

robertmhaas@gmail.com

almost 11 years ago

In reply to: Kyotaro Horiguchi (#34)

#44

robertmhaas@gmail.com

almost 11 years ago

In reply to: Alexander Korotkov (#36)

#45

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Robert Haas (#44)

#46

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Tom Lane (#45)

#47

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Ildus Kurbangaliev (#46)

#48

robertmhaas@gmail.com

almost 11 years ago

In reply to: Tom Lane (#47)

#49

Peter Geoghegan

pg@bowt.ie

almost 11 years ago

In reply to: Tom Lane (#45)

#50

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Peter Geoghegan (#49)

#51

aekorotkov@gmail.com

almost 11 years ago

In reply to: Tom Lane (#50)

#52

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Alexander Korotkov (#51)

#53

andres@anarazel.de

almost 11 years ago

In reply to: Ildus Kurbangaliev (#52)

#54

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Andres Freund (#53)

#55

horikyota.ntt@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#54)

#56

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Kyotaro Horiguchi (#55)

#57

horikyota.ntt@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#56)

#58

horikyota.ntt@gmail.com

almost 11 years ago

In reply to: Kyotaro Horiguchi (#57)

#59

David Rowley

dgrowleyml@gmail.com

almost 11 years ago

In reply to: Robert Haas (#1)

#60

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Kyotaro Horiguchi (#57)

#61

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Ildus Kurbangaliev (#60)

#62

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#61)

#63

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Amit Kapila (#62)

#64

robertmhaas@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#61)

#65

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#63)

#66

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Robert Haas (#64)

#67

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#66)

#68

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Robert Haas (#64)

#69

alvherre@2ndquadrant.com

almost 11 years ago

In reply to: David Rowley (#59)

#70

robertmhaas@gmail.com

almost 11 years ago

In reply to: Alvaro Herrera (#69)

#71

alvherre@2ndquadrant.com

almost 11 years ago

In reply to: Robert Haas (#70)

#72

robertmhaas@gmail.com

almost 11 years ago

In reply to: Alvaro Herrera (#71)

#73

Jim Nasby

Jim.Nasby@BlueTreble.com

almost 11 years ago

In reply to: Robert Haas (#72)

#74

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#68)

#75

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Heikki Linnakangas (#74)

#76

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Ildus Kurbangaliev (#75)

#77

robertmhaas@gmail.com

almost 11 years ago

In reply to: Heikki Linnakangas (#74)

#78

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Robert Haas (#77)

#79

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

almost 11 years ago

In reply to: Robert Haas (#77)

#80

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Heikki Linnakangas (#79)

#81

robertmhaas@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#80)

#82

robertmhaas@gmail.com

almost 11 years ago

In reply to: Robert Haas (#81)

#83

alvherre@2ndquadrant.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#80)

#84

Josh Berkus

josh@agliodbs.com

almost 11 years ago

In reply to: Robert Haas (#1)

#85

amit.kapila16@gmail.com

almost 11 years ago

In reply to: Robert Haas (#77)

#86

robertmhaas@gmail.com

almost 11 years ago

In reply to: Josh Berkus (#84)

#87

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Robert Haas (#81)

#88

robertmhaas@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#87)

#89

i.kurbangaliev@postgrespro.ru

almost 11 years ago

In reply to: Robert Haas (#88)

#90

aekorotkov@gmail.com

almost 11 years ago

In reply to: Ildus Kurbangaliev (#89)

#91

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Alexander Korotkov (#90)

#92

Bruce Momjian

bruce@momjian.us

almost 11 years ago

In reply to: Jim Nasby (#73)

#93

andres@anarazel.de

over 10 years ago

In reply to: Ildus Kurbangaliev (#80)

#94

robertmhaas@gmail.com

over 10 years ago

In reply to: Andres Freund (#93)

#95

amit.kapila16@gmail.com

over 10 years ago

In reply to: Ildus Kurbangaliev (#89)

#96

amit.kapila16@gmail.com

over 10 years ago

In reply to: Alexander Korotkov (#90)

#97

robertmhaas@gmail.com

over 10 years ago

In reply to: Amit Kapila (#96)

#98

i.kurbangaliev@postgrespro.ru

over 10 years ago

In reply to: Robert Haas (#97)

#99

robertmhaas@gmail.com

over 10 years ago

In reply to: Ildus Kurbangaliev (#98)

#100

root@simply.name

over 10 years ago

In reply to: Amit Kapila (#95)

#101

aekorotkov@gmail.com

over 10 years ago

In reply to: Amit Kapila (#95)

#102

aekorotkov@gmail.com

over 10 years ago

In reply to: Amit Kapila (#96)

#103

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#97)

#104

amit.kapila16@gmail.com

over 10 years ago

In reply to: Alexander Korotkov (#101)

#105

amit.kapila16@gmail.com

over 10 years ago

In reply to: Alexander Korotkov (#102)

#106

aekorotkov@gmail.com

over 10 years ago

In reply to: Amit Kapila (#105)

#107

aekorotkov@gmail.com

over 10 years ago

In reply to: Amit Kapila (#104)

#108

robertmhaas@gmail.com

over 10 years ago

In reply to: Alexander Korotkov (#102)

#109

aekorotkov@gmail.com

over 10 years ago

In reply to: Robert Haas (#108)

#110

robertmhaas@gmail.com

over 10 years ago

In reply to: Alexander Korotkov (#109)

#111

root@simply.name

over 10 years ago

In reply to: Robert Haas (#110)

#112

robertmhaas@gmail.com

over 10 years ago

In reply to: Vladimir Borodin (#111)

#113

root@simply.name

over 10 years ago

In reply to: Robert Haas (#112)

#114

robertmhaas@gmail.com

over 10 years ago

In reply to: Vladimir Borodin (#113)

#115

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#110)

#116

Craig Ringer

craig@2ndquadrant.com

over 10 years ago

In reply to: Amit Kapila (#115)

#117

root@simply.name

over 10 years ago

In reply to: Amit Kapila (#115)

#118

Michael Paquier

michael@paquier.xyz

over 10 years ago

In reply to: Vladimir Borodin (#117)

#119

amit.kapila16@gmail.com

over 10 years ago

In reply to: Michael Paquier (#118)

#120

alvherre@2ndquadrant.com

over 10 years ago

In reply to: Amit Kapila (#119)

#121

robertmhaas@gmail.com

over 10 years ago

In reply to: Alvaro Herrera (#120)

#122

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#121)

#123

robertmhaas@gmail.com

over 10 years ago

In reply to: Amit Kapila (#122)

#124

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#123)

#125

robertmhaas@gmail.com

over 10 years ago

In reply to: Amit Kapila (#124)

#126

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#125)

#127

andres@anarazel.de

over 10 years ago

In reply to: Amit Kapila (#126)

#128

amit.kapila16@gmail.com

over 10 years ago

In reply to: Andres Freund (#127)

#129

robertmhaas@gmail.com

over 10 years ago

In reply to: Andres Freund (#127)

#130

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#129)

#131

amit.kapila16@gmail.com

over 10 years ago

In reply to: Amit Kapila (#130)

#132

aekorotkov@gmail.com

over 10 years ago

In reply to: Amit Kapila (#131)

#133

amit.kapila16@gmail.com

over 10 years ago

In reply to: Alexander Korotkov (#132)

#134

robertmhaas@gmail.com

over 10 years ago

In reply to: Alexander Korotkov (#132)

#135

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#134)

#136

robertmhaas@gmail.com

over 10 years ago

In reply to: Amit Kapila (#135)

#137

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#136)

#138

robertmhaas@gmail.com

over 10 years ago

In reply to: Amit Kapila (#137)

#139

Peter Eisentraut

peter_e@gmx.net

over 10 years ago

In reply to: Amit Kapila (#137)

#140

amit.kapila16@gmail.com

over 10 years ago

In reply to: Robert Haas (#138)

#141

robertmhaas@gmail.com

over 10 years ago

In reply to: Amit Kapila (#140)

#142

amit.kapila16@gmail.com

about 10 years ago

In reply to: Robert Haas (#138)

#143

amit.kapila16@gmail.com

about 10 years ago

In reply to: Peter Eisentraut (#139)

#144

robertmhaas@gmail.com

about 10 years ago

In reply to: Amit Kapila (#143)

#145

amit.kapila16@gmail.com

about 10 years ago

In reply to: Robert Haas (#138)

#146

aekorotkov@gmail.com

about 10 years ago

In reply to: Amit Kapila (#145)

#147

amit.kapila16@gmail.com

about 10 years ago

In reply to: Alexander Korotkov (#146)

#148

thom@linux.com

about 10 years ago

In reply to: Amit Kapila (#145)

#149

aekorotkov@gmail.com

about 10 years ago

In reply to: Amit Kapila (#147)

#150

thom@linux.com

about 10 years ago

In reply to: Thom Brown (#148)

#151

thom@linux.com

about 10 years ago

In reply to: Alexander Korotkov (#149)

#152

amit.kapila16@gmail.com

about 10 years ago

In reply to: Thom Brown (#151)

#153

amit.kapila16@gmail.com

about 10 years ago

In reply to: Alexander Korotkov (#149)

#154

robertmhaas@gmail.com

about 10 years ago

In reply to: Amit Kapila (#153)

#155

thom@linux.com

about 10 years ago

In reply to: Amit Kapila (#153)

#156

amit.kapila16@gmail.com

about 10 years ago

In reply to: Robert Haas (#154)

#157

robertmhaas@gmail.com

about 10 years ago

In reply to: Amit Kapila (#156)

#158

amit.kapila16@gmail.com

about 10 years ago

In reply to: Robert Haas (#157)

#159

robertmhaas@gmail.com

about 10 years ago

In reply to: Amit Kapila (#158)

#160

Thomas Reiss

thomas.reiss@dalibo.com

about 10 years ago

In reply to: Amit Kapila (#158)

#161

thom@linux.com

about 10 years ago

In reply to: Robert Haas (#157)

#162

thom@linux.com

about 10 years ago

In reply to: Thom Brown (#161)

#163

robertmhaas@gmail.com

about 10 years ago

In reply to: Thomas Reiss (#160)

#164

robertmhaas@gmail.com

about 10 years ago

In reply to: Thom Brown (#162)

#165

alvherre@2ndquadrant.com

about 10 years ago

In reply to: Robert Haas (#164)

#166

Michael Paquier

michael@paquier.xyz

about 10 years ago

In reply to: Robert Haas (#163)

#167

amit.kapila16@gmail.com

about 10 years ago

In reply to: Michael Paquier (#166)

#168

robertmhaas@gmail.com

about 10 years ago

In reply to: Michael Paquier (#166)

#169

Michael Paquier

michael@paquier.xyz

about 10 years ago

In reply to: Robert Haas (#168)

#170

robertmhaas@gmail.com

about 10 years ago

In reply to: Michael Paquier (#169)

#171

andres@anarazel.de

about 10 years ago

In reply to: Robert Haas (#170)

#172

aekorotkov@gmail.com

about 10 years ago

In reply to: Andres Freund (#171)

#173

robertmhaas@gmail.com

about 10 years ago

In reply to: Alexander Korotkov (#172)

#174

Neha Khatri

nehakhatri5@gmail.com

almost 10 years ago

In reply to: Amit Kapila (#153)

#175