Lock Wait Statistics (next commitfest)
Where I work we make extensive use of PostgreSQL. One of the things
people typically want to know about is lock waits. Out of the box
there is not much in the way of tracking for these, particularly in older
versions. The pg_locks view is fine for stuff happening *now*, but
typically I find I'm being asked about something that happened last
night...
Now for those platforms with dtrace there is a lock wait probe - useful,
but not for those of us on Linux! There is the config option (log_lock_waits,
8.3 onwards) to log lock waits longer than deadlock_timeout, and this is great, but I
wondered if having something like this available as part of the stats
module would be useful.
So here is my initial attempt at this, at this point merely to spark
discussion (attached patch)
I have followed some of the ideas from the function execution stats, but
locks required some special treatment.
- new parameter track_locks (maybe should be track_lock_waits?)
- new locks hash attached to the stats dbentry
- several new pgstat lock C functions
- new view pg_stat_lock_waits
- new attributes for pg_stat_database
This version simply exposes the locktag structure in the view, along
with the corresponding #waits and wait time. This should probably be
reworked to look a little more like pg_locks. I figured this is easily
doable along with the no doubt many changes coming from review comments.
Also I did not do any clever amalgamation of transaction lock waits -
there are probably going to be a lot of those, and keeping the detail is
unlikely to be useful. It would be easy to collect them all together in
one transaction entry.
regards
Mark
Attachment: lockstats-1.patch.gz
On Sun, Jan 25, 2009 at 6:57 PM, Mark Kirkwood<markir@paradise.net.nz> wrote:
So here is my initial attempt at this, at this point merely to spark
discussion (attached patch)
this patch doesn't apply cleanly to head... can you update it, please?
--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157
On Fri, Jul 10, 2009 at 12:23 PM, Jaime
Casanova<jcasanov@systemguards.com.ec> wrote:
On Sun, Jan 25, 2009 at 6:57 PM, Mark Kirkwood<markir@paradise.net.nz> wrote:
So here is my initial attempt at this, at this point merely to spark
discussion (attached patch)
this patch doesn't apply cleanly to head... can you update it, please?
i did it myself, i think this is something we need...
this compiles and seems to work... something i was wondering is that
having the total time of lock waits is not very accurate, because we
can have 9 lock waits of 1 sec each and one waiting for 1
minute... simply summing them all will give a bad statistic, or am i
missing something?
Attachment: lockstats-2.patch (+681, -23)
Jaime Casanova wrote:
i did it myself, i think this is something we need...
this compiles and seems to work... something i was wondering is that
having the total time of lock waits is not very accurate, because we
can have 9 lock waits of 1 sec each and one waiting for 1
minute... simply summing them all will give a bad statistic, or am i
missing something?
Thank you Jaime - looks good. I seem to have been asleep at the wheel
and missed *both* of your emails until now, belated apologies for that
- especially the first one :-(
With respect to the sum of wait times being not very granular, yes -
quite true. I was thinking it is useful to be able to answer the
question 'where is my wait time being spent' - but it hides cases like
the one you mention. What would you like to see? would max and min wait
times be a useful addition, or are you thinking along different lines?
Cheers
Mark
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Jaime Casanova wrote:
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Right - I'll look at adding max (at least) early next week.
Yeah, enabling log_lock_waits is certainly another approach, however you
currently miss out on those that are < deadlock_timeout - and
potentially they could be the source of your problem (i.e millions of
waits all < deadlock_timeout but taken together rather significant).
This shortcoming could be overcome by making the cutoff wait time
decoupled from deadlock_timeout (e.g a new parameter
log_min_lock_wait_time or similar).
I'm thinking that having the lock waits easily analyzable via SQL may
mean that most people won't need to collect and analyze their
logs for this stuff (they can just examine the lock stats view from pgAdmin
or similar).
Cheers
Mark
Mark Kirkwood <markir@paradise.net.nz> writes:
Yeah, enabling log_lock_waits is certainly another approach, however you
currently miss out on those that are < deadlock_timeout - and
potentially they could be the source of your problem (i.e millions of
waits all < deadlock_timeout but taken together rather significant).
This shortcoming could be overcome by making the cutoff wait time
decoupled from deadlock_timeout (e.g a new parameter
log_min_lock_wait_time or similar).
The reason that they're tied together is to keep from creating
unreasonable complexity (and an unreasonable number of extra kernel
calls) in management of the timeout timers. You will find that you
can't just wave your hand and decree that they are now decoupled.
regards, tom lane
Tom Lane wrote:
Mark Kirkwood <markir@paradise.net.nz> writes:
Yeah, enabling log_lock_waits is certainly another approach, however you
currently miss out on those that are < deadlock_timeout - and
potentially they could be the source of your problem (i.e millions of
waits all < deadlock_timeout but taken together rather significant).
This shortcoming could be overcome by making the cutoff wait time
decoupled from deadlock_timeout (e.g a new parameter
log_min_lock_wait_time or similar).
The reason that they're tied together is to keep from creating
unreasonable complexity (and an unreasonable number of extra kernel
calls) in management of the timeout timers. You will find that you
can't just wave your hand and decree that they are now decoupled.
Thanks Tom - I did wonder if there was a deeper reason they were tied
together!
Cheers
Mark
Mark Kirkwood wrote:
Jaime Casanova wrote:
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Right - I'll look at adding max (at least) early next week.
I'm also thinking of taking a look at amalgamating transaction type lock
waits. This seems like a good idea because:
- individually, and viewed at a later date, I don't think the
individual detail is going to be useful
- there will be a lot of them
- I think the statistical data (count, sum elapsed, max elapsed) may be
sufficiently interesting
Cheers
Mark
Mark Kirkwood wrote:
Jaime Casanova wrote:
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Right - I'll look at adding max (at least) early next week.
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
Mark Kirkwood wrote:
Mark Kirkwood wrote:
Jaime Casanova wrote:
On Fri, Jul 17, 2009 at 3:38 AM, Mark Kirkwood<markir@paradise.net.nz> wrote:
With respect to the sum of wait times being not very granular, yes - quite
true. I was thinking it is useful to be able to answer the question 'where
is my wait time being spent' - but it hides cases like the one you mention.
What would you like to see? would max and min wait times be a useful
addition, or are you thinking along different lines?
track number of locks, sum of wait times, max(wait time).
but actually i started to think that the best is just make use of
log_lock_waits send the logs to csvlog and analyze there...
Right - I'll look at adding max (at least) early next week.
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
This version has the individual transaction lock waits amalgamated.
Still TODO: redo pg_stat_lock_waits ...
Mark,
Your last email on this patch, from August 9th, indicates that you've
still got "TODO: redo pg_stat_lock_waits ...". Have you updated this
patch since then?
Thanks!
Stephen
I have this patch, if you're interested.
LWLock Instrumentation Patch
- counts locks and waits in shared and exclusive mode
- for selected locks, measures wait and hold times
- for selected locks, displays a histogram of wait and hold times
- information is printed at backend exit
Configurable by #define's in lwlock.c
Regards,
pierre
Attachment: lwlock_instrument_v4.patch (+444, -90)
Pierre,
Configurable by #define's in lwlock.c
Given that we already have dtrace/systemtap probes around the lwlocks,
is there some way you could use those instead of extra #defines?
--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Stephen Frost wrote:
Mark,
Your last email on this patch, from August 9th, indicates that you've
still got "TODO: redo pg_stat_lock_waits ...". Have you updated this
patch since then?
Stephen,
No - that is still a TODO for me - real life getting in the way :-)
Cheers
Mark
On Sat, Aug 8, 2009 at 7:47 PM, Mark Kirkwood <markir@paradise.net.nz> wrote:
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
This version has the individual transaction lock waits amalgamated.
Still TODO: redo pg_stat_lock_waits ...
it applies with some hunks, compiles fine and seems to work...
i'm still not sure this is what we need, some more comments could be helpful.
what kind of questions are we capable of answering with this, and what
kind of questions are we still missing?
for example, now we know "number of locks that had to wait", "total
time waiting" and "max time waiting for a single lock"... but still we
can have an inaccurate understanding if we have lots of locks waiting
little time and a few waiting a huge amount of time...
something i have been asked when the system starts to slow down is "can we
know if there was lock contention in that period"? for now the only
way to answer that is logging locks
about the patch itself:
you haven't documented it either: what is the pg_stat_lock_waits view
for, and what are those fields it has?
i'll leave this patch as "needs review" for more people to comment on it...
On Mon, Sep 28, 2009 at 12:14 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
On Sat, Aug 8, 2009 at 7:47 PM, Mark Kirkwood <markir@paradise.net.nz> wrote:
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
This version has the individual transaction lock waits amalgamated.
Still TODO: redo pg_stat_lock_waits ...
it applies with some hunks, compiles fine and seems to work...
i'm still not sure this is what we need, some more comments could be helpful.
I'm pretty sure the logic of this patch is not correct.
in pgstat_init_lock_wait(LOCKTAG *locktag)
...
+ l_curr = htabent->l_counts.l_tot_wait_time;
+ INSTR_TIME_SET_CURRENT(l_start);
+ INSTR_TIME_ADD(l_curr, l_start);
+ htabent->l_counts.l_tot_wait_time = l_curr;
in pgstat_end_lock_wait(LOCKTAG *locktag)
...
+ l_start = htabent->l_counts.l_tot_wait_time;
+ INSTR_TIME_SET_CURRENT(l_end);
+ INSTR_TIME_SUBTRACT(l_end, l_start);
+ htabent->l_counts.l_tot_wait_time = l_end;
So l_start = time cumulatively waited previously + time at start of this wait.
l_end - l_start is equal to:
= time at end of this wait - ( time at start of this wait + time
cumulatively waited previously)
= (time at end of this wait - time at start of this wait) - time
cumulatively waited previously
= (duration of this wait) - time waited cumulatively previously.
That minus sign in the last line can't be good, can it?
Also
+ htabent->l_counts.l_tot_wait_time = l_end;
+
+ if (INSTR_TIME_GET_MICROSEC(l_end) >
INSTR_TIME_GET_MICROSEC(htabent->l_counts.l_max_wait_time))
+ htabent->l_counts.l_max_wait_time = l_end;
The total wait time is equal to the max wait time (which are both
equal to l_end)?
One or both of those has to end up being wrong. At this stage, is
l_end supposed to be the last wait time, or the cumulative wait time?
One of the things in the patch review checklist is about compiler
warnings, so I am reporting these:
lock.c: In function `LockAcquire':
lock.c:797: warning: passing arg 1 of `pgstat_init_lock_wait' discards
qualifiers from pointer target type
lock.c:802: warning: passing arg 1 of `pgstat_end_lock_wait' discards
qualifiers from pointer target type
Cheers,
Jeff
Jaime Casanova wrote:
it applies with some hunks, compiles fine and seems to work...
i'm still not sure this is what we need, some more comments could be helpful.
Yeah, that's the big question. Are the current capabilities (logging 'em
for waits > deadlock timeout + dtrace hooks) enough? I'm thinking that
if we had dtrace for linux generally available, then the need for this
patch would be lessened.
what kind of questions are we capable of answering with this, and what
kind of questions are we still missing?
for example, now we know "number of locks that had to wait", "total
time waiting" and "max time waiting for a single lock"... but still we
can have an inaccurate understanding if we have lots of locks waiting
little time and a few waiting a huge amount of time...
something i have been asked when the system starts to slow down is "can we
know if there was lock contention in that period"? for now the only
way to answer that is logging locks
Right - there still may be other aggregates that need to be captured....
it would be great to have some more feedback from the field about this.
In my case, I was interested in seeing if the elapsed time was being
spent waiting for locks or actually executing (in fact it turned out to
be the latter - but was still very useful to be able to rule out locking
issues). However, as you mention, there may be cases where the
question is more about part of the system suffering a disproportionate
number of (or time spent in) lock waits...
about the patch itself:
you haven't documented either. what is the pg_stat_lock_waits view
for? and what are those fieldx it has?
Yeah, those fields are straight from the LOCKTAG structure. I need to
redo the view to be more like pg_locks, and also do the doco. However
I've been a bit hesitant to dive into these last two steps until I see
that there is some real enthusiasm for this patch (or failing that, a
feeling that it is needed...).
i'll leave this patch as "needs review" for more people to comment on it...
Agreed, thanks for the comments.
Mark
Jeff Janes wrote:
The total wait time is equal to the max wait time (which are both
equal to l_end)?
One or both of those has to end up being wrong. At this stage, is
l_end supposed to be the last wait time, or the cumulative wait time?
Hmm - I may well have fat-fingered the arithmetic; thanks, I'll take a look!
One of the things in the patch review checklist is about compiler
warnings, so I am reporting these:
lock.c: In function `LockAcquire':
lock.c:797: warning: passing arg 1 of `pgstat_init_lock_wait' discards
qualifiers from pointer target type
lock.c:802: warning: passing arg 1 of `pgstat_end_lock_wait' discards
qualifiers from pointer target type
Right, will look at too.
Cheers
Mark
On Mon, Sep 28, 2009 at 12:14 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
On Sat, Aug 8, 2009 at 7:47 PM, Mark Kirkwood <markir@paradise.net.nz> wrote:
Patch with max(wait time).
Still TODO
- amalgamate individual transaction lock waits
- redo (rather ugly) temporary pg_stat_lock_waits in a form more like
pg_locks
This version has the individual transaction lock waits amalgamated.
Still TODO: redo pg_stat_lock_waits ...
it applies with some hunks, compiles fine and seems to work...
i'm still not sure this is what we need, some more comments could be helpful.
what kind of questions are we capable of answering with this, and what
kind of questions are we still missing?
for example, now we know "number of locks that had to wait", "total
time waiting" and "max time waiting for a single lock"... but still we
can have an inaccurate understanding if we have lots of locks waiting
little time and a few waiting a huge amount of time...
Aren't the huge ones already loggable from the deadlock detector?
With the max, we can at least put an upper limit on how long the
longest ones could have been. However, is there a way to reset the
max? I tried deleting data/pg_stat_tmp, but that didn't work. With
cumulative values, you can take snapshots and then take the
difference of them, but that won't work with max. If the max can't be
reset except with an initdb, I think that makes it barely usable.
something i have been asked when the system starts to slow down is "can we
know if there was lock contention in that period"? for now the only
way to answer that is logging locks
I was surprised to find that running with track_locks on did not cause
a detectable difference in performance, so you could just routinely do
regularly scheduled snapshots and go back and mine them over the time
that a problem was occurring. I just checked with pgbench over
various levels of concurrency and fsync settings. If potential
slowness wouldn't show up there, I don't know how else to look for it.
Cheers,
Jeff