pg_stat_lwlock wait time view

Started by Haribabu Kommi, over 9 years ago (7 messages)
#1 Haribabu Kommi
kommi.haribabu@gmail.com

There was some discussion earlier about adding a pg_stat_lwlock view in [1].
The main objection I observed to that patch was that it exposes LWLock
information to users who don't understand what these locks are used for.

Currently, LWLock information is already visible to users as part of the
wait_event information in pg_stat_activity, and the LWLocks used in
PostgreSQL are listed in the documentation along with their purpose.

So I feel it may be worth adding this view to report the wait times of
LWLocks. This information can be useful for finding bottlenecks around
LWLocks in production environments. But adding the timing calculations may
cause a performance problem. Is there any need to write this stats
information to a file, given that it only provides wait time information?

Based on the performance impact of the additional timing calculations,
we can decide the view's default behavior. Are there any objections to the
concept?

[1]: /messages/by-id/4FE9A6F5.2080405@uptime.jp

Regards,
Hari Babu
Fujitsu Australia


#2 Robert Haas
robertmhaas@gmail.com
In reply to: Haribabu Kommi (#1)
Re: pg_stat_lwlock wait time view

On Wed, Aug 24, 2016 at 4:23 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

There was some discussion earlier about adding a pg_stat_lwlock view in [1].
The main objection I observed to that patch was that it exposes LWLock
information to users who don't understand what these locks are used for.

Currently, LWLock information is already visible to users as part of the
wait_event information in pg_stat_activity, and the LWLocks used in
PostgreSQL are listed in the documentation along with their purpose.

So I feel it may be worth adding this view to report the wait times of
LWLocks. This information can be useful for finding bottlenecks around
LWLocks in production environments. But adding the timing calculations may
cause a performance problem. Is there any need to write this stats
information to a file, given that it only provides wait time information?

Based on the performance impact of the additional timing calculations,
we can decide the view's default behavior. Are there any objections to the
concept?

There have been some other recent threads on extending the wait event
stuff. If you haven't already read those, you should, because the
issues are closely related. I think that timing LWLock waits will be
quite expensive. I believe that what the Postgres Pro folks want to
do is add up the wait times or maybe keep a history of waits (though
maybe I'm wrong about that), but showing them in pg_stat_activity is
another idea. That's likely to add some synchronization overhead
which might be even greater in this case than for a feature that just
publishes accumulated times, but maybe it's all a drop in the bucket
compared to the cost of calling gettimeofday() in the first place.

Personally, my preferred solution is still to have a background worker
that samples the published wait events and rolls up statistics, but
I'm not sure I've convinced anyone else. It could report the number
of seconds since it detected a wait event other than the current one,
which is not precisely the same thing as tracking the length of the
current wait but it's pretty close. I don't know for sure what's best
here - I think some experimentation and dialog is needed.
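
Just to illustrate the shape of the idea (a standalone toy, not backend code;
all names here are made up), the sampling roll-up could look something like
this:

/*
 * Hypothetical, standalone sketch of the sampling approach: a "worker" polls
 * a shared array of per-backend wait events every SAMPLE_MS milliseconds and
 * charges one interval to whatever event each backend is currently waiting
 * on.  None of these names exist in the backend; this only illustrates the
 * roll-up, not the real shared-memory publication of wait events.
 */
#include <stdio.h>
#include <unistd.h>

#define MAX_BACKENDS    8
#define MAX_WAIT_EVENTS 4
#define SAMPLE_MS       10

/* 0 means "not waiting"; 1..MAX_WAIT_EVENTS-1 are wait event ids */
static volatile int backend_wait_event[MAX_BACKENDS];

/* accumulated (approximate) wait time per event, in milliseconds */
static long wait_time_ms[MAX_WAIT_EVENTS];

static void
sample_once(void)
{
    int i;

    for (i = 0; i < MAX_BACKENDS; i++)
    {
        int ev = backend_wait_event[i];

        if (ev > 0 && ev < MAX_WAIT_EVENTS)
            wait_time_ms[ev] += SAMPLE_MS;  /* charge one whole interval */
    }
}

int
main(void)
{
    int t, ev;

    /* pretend backend 0 waits on event 1 and backend 1 on event 2 for ~100ms */
    backend_wait_event[0] = 1;
    backend_wait_event[1] = 2;

    for (t = 0; t < 10; t++)
    {
        sample_once();
        usleep(SAMPLE_MS * 1000);
    }

    backend_wait_event[0] = 0;
    backend_wait_event[1] = 0;

    for (ev = 1; ev < MAX_WAIT_EVENTS; ev++)
        printf("event %d: ~%ld ms waited\n", ev, wait_time_ms[ev]);
    return 0;
}

The obvious trade-off is that a wait shorter than the sampling interval can
be missed entirely, which is why this only approximates the accumulated
times.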

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#3 Tsunakawa, Takayuki
tsunakawa.takay@jp.fujitsu.com
In reply to: Haribabu Kommi (#1)
Re: pg_stat_lwlock wait time view

From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org]
On Behalf Of Haribabu Kommi

But adding the timing calculations may cause a performance problem. Is there
any need to write this stats information to a file, given that it only
provides wait time information?

Yes, saving workload diagnosis information, like Oracle AWR does, would be nice. I mean not the current accumulated totals, but a series of snapshots taken periodically. It would enable:

* Daily monitoring across database restarts. e.g. The response times of applications degraded after applying a patch. What's the difference between before and after the patch application?

* Hint on troubleshooting a crash failure. e.g. Excessive memory use by PostgreSQL crashed the OS. What was the workload like just before the crash?

The point of discussion may be whether PostgreSQL itself should provide a feature to accumulate performance diagnosis information on persistent storage. pg_statsinfo could take this on, but it wouldn't be as convenient or efficient.

Regards
Takayuki Tsunakawa


#4 Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Robert Haas (#2)
Re: pg_stat_lwlock wait time view

On Thu, Aug 25, 2016 at 6:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 24, 2016 at 4:23 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

Based on the performance impact of the additional timing calculations,
we can decide the view's default behavior. Are there any objections to the
concept?

There have been some other recent threads on extending the wait event
stuff. If you haven't already read those, you should, because the
issues are closely related. I think that timing LWLock waits will be
quite expensive. I believe that what the Postgres Pro folks want to
do is add up the wait times or maybe keep a history of waits (though
maybe I'm wrong about that), but showing them in pg_stat_activity is
another idea. That's likely to add some synchronization overhead
which might be even greater in this case than for a feature that just
publishes accumulated times, but maybe it's all a drop in the bucket
compared to the cost of calling gettimeofday() in the first place.

Yes, I agree this is an issue for cases where the wait time is smaller than
the cost of the logic added to calculate it. Even if we use clock_gettime
with CLOCK_REALTIME_COARSE there will be some overhead; that clock is about
8 times faster than gettimeofday but less accurate. Maybe we can use
clock_gettime instead of gettimeofday in this case, as we may not need a
fine-grained value.
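
For illustration only, a rough standalone comparison of the two calls could
look like the following (Linux-specific because of CLOCK_REALTIME_COARSE;
the ratio will vary between machines, so the figure above is only
indicative):

/*
 * Rough, standalone micro-benchmark comparing gettimeofday() with
 * clock_gettime(CLOCK_REALTIME_COARSE).  Linux-specific; not backend code.
 */
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

#define ITERATIONS 1000000

static double
elapsed_us(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1e6 + (b.tv_nsec - a.tv_nsec) / 1e3;
}

int
main(void)
{
    struct timespec start, end, ts;
    struct timeval tv;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < ITERATIONS; i++)
        gettimeofday(&tv, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);
    printf("gettimeofday:          %.0f us total\n", elapsed_us(start, end));

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < ITERATIONS; i++)
        clock_gettime(CLOCK_REALTIME_COARSE, &ts);
    clock_gettime(CLOCK_MONOTONIC, &end);
    printf("CLOCK_REALTIME_COARSE: %.0f us total\n", elapsed_us(start, end));

    return 0;
}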

Personally, my preferred solution is still to have a background worker
that samples the published wait events and rolls up statistics, but
I'm not sure I've convinced anyone else. It could report the number
of seconds since it detected a wait event other than the current one,
which is not precisely the same thing as tracking the length of the
current wait but it's pretty close. I don't know for sure what's best
here - I think some experimentation and dialog is needed.

Yes, using a background worker can reduce the load of doing all the wait
time calculations in the main backend. I can give it a try by prototyping
both a direct calculation approach and a background worker (maybe
pg_stat_collector) that computes the wait time from stat messages received
from the main backend at wait start and wait end.

I am not sure how accurate the data gathered from a background worker can be
without getting any signal or message from the main backend.

Regards,
Hari Babu
Fujitsu Australia


#5 Satoshi Nagayasu
snaga@uptime.jp
In reply to: Haribabu Kommi (#4)
Re: pg_stat_lwlock wait time view

2016-08-25 13:46 GMT+09:00 Haribabu Kommi <kommi.haribabu@gmail.com>:

On Thu, Aug 25, 2016 at 6:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 24, 2016 at 4:23 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

Based on the performance impact of the additional timing calculations,
we can decide the view's default behavior. Are there any objections to the
concept?

There have been some other recent threads on extending the wait event
stuff. If you haven't already read those, you should, because the
issues are closely related. I think that timing LWLock waits will be
quite expensive. I believe that what the Postgres Pro folks want to
do is add up the wait times or maybe keep a history of waits (though
maybe I'm wrong about that), but showing them in pg_stat_activity is
another idea. That's likely to add some synchronization overhead
which might be even greater in this case than for a feature that just
publishes accumulated times, but maybe it's all a drop in the bucket
compared to the cost of calling gettimeofday() in the first place.

Yes, I agree this is an issue for cases where the wait time is smaller than
the cost of the logic added to calculate it. Even if we use clock_gettime
with CLOCK_REALTIME_COARSE there will be some overhead; that clock is about
8 times faster than gettimeofday but less accurate. Maybe we can use
clock_gettime instead of gettimeofday in this case, as we may not need a
fine-grained value.

Is there any other option (rather than gettimeofday()) to measure elapsed
time with lower overhead?

I've heard about the RDTSC feature (a hardware counter) supported by recent
processors, and have found a few articles [1] [2] on its lower overhead
compared to gettimeofday().

[1]: http://stackoverflow.com/questions/15623343/using-cpu-counters-versus-gettimeofday
[2]: http://stackoverflow.com/questions/6498972/faster-equivalent-of-gettimeofday

I'm not sure how much we could benefit from it so far, because I'm not
familiar with this facility and of course I don't have any numbers. In
addition, I guess it would bring some portability issues. But I'm still
curious to know more about this. Does anyone have experience with it?
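
For reference, reading the counter with the GCC/Clang __rdtsc() intrinsic
looks roughly like this (x86-only, and converting ticks to wall-clock time
needs the TSC frequency, which this sketch does not attempt):

/*
 * Minimal x86-only example of timing a code stretch with the TSC, using the
 * __rdtsc() intrinsic (GCC/Clang, <x86intrin.h>).  Only a sketch: on CPUs
 * without an invariant TSC the result is unreliable, and the tick count
 * still has to be converted to time using the TSC frequency.
 */
#include <stdio.h>
#include <x86intrin.h>

int
main(void)
{
    unsigned long long start, end;
    volatile long sum = 0;
    long i;

    start = __rdtsc();
    for (i = 0; i < 1000000; i++)   /* something to measure */
        sum += i;
    end = __rdtsc();

    printf("elapsed: %llu TSC ticks (sum=%ld)\n", end - start, (long) sum);
    return 0;
}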

Personally, my preferred solution is still to have a background worker
that samples the published wait events and rolls up statistics, but
I'm not sure I've convinced anyone else. It could report the number
of seconds since it detected a wait event other than the current one,
which is not precisely the same thing as tracking the length of the
current wait but it's pretty close. I don't know for sure what's best
here - I think some experimentation and dialog is needed.

Yes, using a background worker can reduce the load of doing all the wait
time calculations in the main backend. I can give it a try by prototyping
both a direct calculation approach and a background worker (maybe
pg_stat_collector) that computes the wait time from stat messages received
from the main backend at wait start and wait end.

I am not sure how accurate the data gathered from a background worker can be
without getting any signal or message from the main backend.

It looks like an accuracy-performance trade-off.
So I think the use cases matter here in order to get a better design.

I guess that's the reason why Ilya is looking for feature requests from DBAs
in another thread [3].

[3]: /messages/by-id/CAG95seUAQVj09KzLwU+z1B-GqdMqerzEkPFR3hn0q88XzMq-PA@mail.gmail.com

Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>


#6 Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Haribabu Kommi (#1)
Re: pg_stat_lwlock wait time view

[Resending to Hackers, as the earlier mail failed to be delivered to the
mailing list]

On Mon, Jan 9, 2017 at 4:13 PM, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:

On Thu, Aug 25, 2016 at 2:46 PM, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:

On Thu, Aug 25, 2016 at 6:57 AM, Robert Haas <robertmhaas@gmail.com>
wrote:

Personally, my preferred solution is still to have a background worker
that samples the published wait events and rolls up statistics, but
I'm not sure I've convinced anyone else. It could report the number
of seconds since it detected a wait event other than the current one,
which is not precisely the same thing as tracking the length of the
current wait but it's pretty close. I don't know for sure what's best
here - I think some experimentation and dialog is needed.

Yes, using a background worker can reduce the load of doing all the wait
time calculations in the main backend. I can give it a try by prototyping
both a direct calculation approach and a background worker (maybe
pg_stat_collector) that computes the wait time from stat messages received
from the main backend at wait start and wait end.

I am not sure how accurate the data gathered from a background worker can be
without getting any signal or message from the main backend.

Apologies for coming back to an old thread.

For a proof of concept, I tried using the "stats collector process" to
calculate the wait times for LWLocks, instead of adding another background
worker.

It creates two hash tables: one to store the "LWLock stats" and another to
store the "backend information", keyed by PID.

Whenever a backend starts waiting for an LWLock, it sends a message to the
stats collector with its PID and the wait_event_info of the lock. When the
stats collector receives the message, it records the start time and adds an
entry for that backend to the hash table. When the backend finishes waiting
for the lock, it sends another message to the stats collector, which looks up
the backend's entry, computes the wait time, and adds it to the corresponding
LWLock entry in the other hash table.

The LWLock wait stats are stored in the stats file for persistence.

There is currently no stats reset logic.
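
To make the bookkeeping concrete, here is a simplified standalone model of
the two hash tables and the start/end messages (all names are invented for
illustration; the real proof of concept uses the stats collector's message
and dynahash machinery):

/*
 * Simplified, standalone model of the collector-side bookkeeping described
 * above: a "wait start" message records the start time keyed by PID, and a
 * "wait end" message computes the elapsed time and accumulates it per wait
 * event.  All names are invented for illustration.
 */
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define MAX_PIDS   16
#define MAX_EVENTS  8

typedef struct BackendWait
{
    int     pid;
    int     wait_event;
    struct timespec start;
    int     in_use;
} BackendWait;              /* stands in for the "backend info" hash entry */

static BackendWait backend_waits[MAX_PIDS];
static double lwlock_wait_us[MAX_EVENTS];   /* the "LWLock stats" entries */

static void
wait_start_msg(int pid, int wait_event)
{
    int i;

    for (i = 0; i < MAX_PIDS; i++)
    {
        if (!backend_waits[i].in_use)
        {
            backend_waits[i].pid = pid;
            backend_waits[i].wait_event = wait_event;
            backend_waits[i].in_use = 1;
            clock_gettime(CLOCK_MONOTONIC, &backend_waits[i].start);
            return;
        }
    }
}

static void
wait_end_msg(int pid)
{
    struct timespec now;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &now);
    for (i = 0; i < MAX_PIDS; i++)
    {
        if (backend_waits[i].in_use && backend_waits[i].pid == pid)
        {
            double us = (now.tv_sec - backend_waits[i].start.tv_sec) * 1e6 +
                        (now.tv_nsec - backend_waits[i].start.tv_nsec) / 1e3;

            lwlock_wait_us[backend_waits[i].wait_event] += us;
            backend_waits[i].in_use = 0;
            return;
        }
    }
}

int
main(void)
{
    wait_start_msg(1234, 3);    /* backend 1234 starts waiting on event 3 */
    usleep(20000);              /* simulate a ~20ms wait */
    wait_end_msg(1234);         /* wait ends; elapsed time is accumulated */

    printf("event 3 total wait: %.0f us\n", lwlock_wait_us[3]);
    return 0;
}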

This patch should be helpful for creating a view that displays the wait
times of all wait events, as discussed in [1].

Comments?

[1]: /messages/by-id/CAASwCXdvQgZ-ox_SyYMF5TAJVH-_rW71vthZynS%3DEMeexN5Giw%40mail.gmail.com

Regards,
Hari Babu
Fujitsu Australia

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Haribabu Kommi (#1)
Re: pg_stat_lwlock wait time view

On Mon, Jan 9, 2017 at 12:13 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:

Whenever a backend starts waiting for an LWLock, it sends a message to the
stats collector with its PID and the wait_event_info of the lock. When the
stats collector receives the message, it records the start time and adds an
entry for that backend to the hash table. When the backend finishes waiting
for the lock, it sends another message to the stats collector, which looks up
the backend's entry, computes the wait time, and adds it to the corresponding
LWLock entry in the other hash table.

I will be extremely surprised if this doesn't have a severe negative
impact on performance when LWLock contention is high (e.g. a pgbench
read-only test using a scale factor that fits in the OS cache but not
shared_buffers).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
