Additional stats for Relations
Hi,
Currently a "select * from pg_statio_user_tables;" displays only
heap_blks_read, heap_blks_hit stats amongst others for the main relation. It
would be good to have the following stats collected too. I think these stats
can be used to better statistically analyze/understand the block I/O
activity on the relation:
heap_blks_reused: The number of buffers returned by the FSM for use to store
a new tuple in
heap_blks_extend: The number of times file extend was invoked on the
relation
heap_blks_truncate: The total number of blocks that have been truncated due
to vacuum activity e.g.
As an addendum to the truncate stats above, we can also have the additional
following stats:
heap_blks_maxtruncate: The max block of buffers truncated in one go
heap_blks_ntruncate: The number of times truncate was called on this
relation
I can come up with a patch (already have one) for the above. Any
thought/comments?
Regards,
Nikhils
(www.enterprisedb.com)
--
All the world's a stage, and most of us are desperately unrehearsed.
On Fri, Oct 13, 2006 at 06:17:47PM +0530, NikhilS wrote:
Currently a "select * from pg_statio_user_tables;" displays only
heap_blks_read, heap_blks_hit stats amongst others for the main relation. It
would be good to have the following stats collected too. I think these stats
can be used to better statistically analyze/understand the block I/O
activity on the relation:heap_blks_reused: The number of buffers returned by the FSM for use to store
a new tuple in
The description on this is misleading... FSM doesn't return buffers, it
returns pages that have free space on them.
I'm also not sure if this metric is what you actually want, since a
single page can be returned many times from the FSM even between
vacuums. Tracking how many pages for a relation have been put into the
FSM might be more useful...
heap_blks_extend: The number of times file extend was invoked on the
relationheap_blks_truncate: The total number of blocks that have been truncated due
to vacuum activity e.g.As an addendum to the truncate stats above, we can also have the additional
following stats:heap_blks_maxtruncate: The max block of buffers truncated in one go
heap_blks_ntruncate: The number of times truncate was called on this
relationI can come up with a patch (already have one) for the above. Any
thought/comments?
Do you have a use-case for this info? I can see where it might be neat
to know, but I'm not sure how you'd actually use it in the real world.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Hi Jim,
On 10/13/06, Jim C. Nasby <jim@nasby.net> wrote:
On Fri, Oct 13, 2006 at 06:17:47PM +0530, NikhilS wrote:
Currently a "select * from pg_statio_user_tables;" displays only
heap_blks_read, heap_blks_hit stats amongst others for the mainrelation. It
would be good to have the following stats collected too. I think these
stats
can be used to better statistically analyze/understand the block I/O
activity on the relation:heap_blks_reused: The number of buffers returned by the FSM for use to
store
a new tuple in
The description on this is misleading... FSM doesn't return buffers, it
returns pages that have free space on them.
<Nikhils>
FSM returns the block number from which we fetch the buffer. This is similar
to the way we track buffer_read stats in ReadBuffer.
<Nikhils>
I'm also not sure if this metric is what you actually want, since a
single page can be returned many times from the FSM even between
vacuums. Tracking how many pages for a relation have been put into the
FSM might be more useful...
<Nikhils>
Pages might be put into the FSM, but by this metric don't we get the actual
usage of the pages from the FSM? Agreed a single page can be returned
multiple times, but since it serves a new tuple, shouldn't we track it?
<Nikhils>
heap_blks_extend: The number of times file extend was invoked on the
relationheap_blks_truncate: The total number of blocks that have been truncated
due
to vacuum activity e.g.
As an addendum to the truncate stats above, we can also have the
additional
following stats:
heap_blks_maxtruncate: The max block of buffers truncated in one go
heap_blks_ntruncate: The number of times truncate was called on this
relationI can come up with a patch (already have one) for the above. Any
thought/comments?Do you have a use-case for this info? I can see where it might be neat
to know, but I'm not sure how you'd actually use it in the real world.
<Nikhils>
The use-case according to me is that these stats help prove the
effectiveness of autovacuum/vacuum operations. By varying some autovac guc
variables, and doing subsequent (pgbench e.g.) runs, one can find out the
optimum values for these variables using these stats.
<Nikhils>
Regards,
Nikhils
EnterpriseDB http://www.enterprisedb.com
--
All the world's a stage, and most of us are desperately unrehearsed.
On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:
On 10/13/06, Jim C. Nasby <jim@nasby.net> wrote:
I'm also not sure if this metric is what you actually want,
since a
single page can be returned many times from the FSM even
between
vacuums. Tracking how many pages for a relation have been put
into the
FSM might be more useful...<Nikhils>
Pages might be put into the FSM, but by this metric don't we get the
actual usage of the pages from the FSM? Agreed a single page can be
returned multiple times, but since it serves a new tuple, shouldn't we
track it?
<Nikhils>
This makes sense for indexes, but only makes sense for heaps when we
know that the backend will keep re-accessing the block until it is full
- so only of interest in steady-state workloads.
IMHO Jim's proposal makes more sense for general use.
heap_blks_extend: The number of times file extend was
invoked on the
relation
Sounds good
heap_blks_truncate: The total number of blocks that have
been truncated due
to vacuum activity e.g.
Sounds good
As an addendum to the truncate stats above, we can also have
the additional
following stats:
heap_blks_maxtruncate: The max block of buffers truncated in
one go
heap_blks_ntruncate: The number of times truncate was called
on this
relation
Those last 2 sound too complex for normal use and ntruncate is most
likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is
a more interesting metric? We've got last vacuum date, but no indication
of how frequently a vacuum has run.
Do you have a use-case for this info? I can see where it might
be neat
to know, but I'm not sure how you'd actually use it in the
real world.<Nikhils>
The use-case according to me is that these stats help prove the
effectiveness of autovacuum/vacuum operations. By varying some autovac
guc variables, and doing subsequent (pgbench e.g.) runs, one can find
out the optimum values for these variables using these stats.
<Nikhils>
This should be useful for tuning space allocation/deallocation. If we
get this patch in early it should help get feedback on this area.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Hi,
So:
heap_blks_reused (with Jim's semantics), heap_blks_extend,
heap_blks_truncate are the "interesting" stats? Will try to work up a patch
for this.
Regards,
Nikhils
EnterpriseDB http://www.enterprisedb.com
On 10/15/06, Simon Riggs <simon@2ndquadrant.com> wrote:
On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:
On 10/13/06, Jim C. Nasby <jim@nasby.net> wrote:
I'm also not sure if this metric is what you actually want,
since a
single page can be returned many times from the FSM even
between
vacuums. Tracking how many pages for a relation have been put
into the
FSM might be more useful...<Nikhils>
Pages might be put into the FSM, but by this metric don't we get the
actual usage of the pages from the FSM? Agreed a single page can be
returned multiple times, but since it serves a new tuple, shouldn't we
track it?
<Nikhils>This makes sense for indexes, but only makes sense for heaps when we
know that the backend will keep re-accessing the block until it is full
- so only of interest in steady-state workloads.IMHO Jim's proposal makes more sense for general use.
heap_blks_extend: The number of times file extend was
invoked on the
relation
Sounds good
heap_blks_truncate: The total number of blocks that have
been truncated due
to vacuum activity e.g.
Sounds good
As an addendum to the truncate stats above, we can also have
the additional
following stats:
heap_blks_maxtruncate: The max block of buffers truncated in
one go
heap_blks_ntruncate: The number of times truncate was called
on this
relation
Those last 2 sound too complex for normal use and ntruncate is most
likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is
a more interesting metric? We've got last vacuum date, but no indication
of how frequently a vacuum has run.Do you have a use-case for this info? I can see where it might
be neat
to know, but I'm not sure how you'd actually use it in the
real world.<Nikhils>
The use-case according to me is that these stats help prove the
effectiveness of autovacuum/vacuum operations. By varying some autovac
guc variables, and doing subsequent (pgbench e.g.) runs, one can find
out the optimum values for these variables using these stats.
<Nikhils>This should be useful for tuning space allocation/deallocation. If we
get this patch in early it should help get feedback on this area.--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
--
All the world's a stage, and most of us are desperately unrehearsed.
Also how many times a relation has been vacuumed (which puts all the
other numbers in more perspective... good catch Simon). And I think
number of pages that could not be added to the FSM would also be
extremely valuable.
On Wed, Oct 18, 2006 at 11:27:39AM +0530, NikhilS wrote:
Hi,
So:
heap_blks_reused (with Jim's semantics), heap_blks_extend,
heap_blks_truncate are the "interesting" stats? Will try to work up a patch
for this.Regards,
Nikhils
EnterpriseDB http://www.enterprisedb.com
On 10/15/06, Simon Riggs <simon@2ndquadrant.com> wrote:On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:
On 10/13/06, Jim C. Nasby <jim@nasby.net> wrote:
I'm also not sure if this metric is what you actually want,
since a
single page can be returned many times from the FSM even
between
vacuums. Tracking how many pages for a relation have been put
into the
FSM might be more useful...<Nikhils>
Pages might be put into the FSM, but by this metric don't we get the
actual usage of the pages from the FSM? Agreed a single page can be
returned multiple times, but since it serves a new tuple, shouldn't we
track it?
<Nikhils>This makes sense for indexes, but only makes sense for heaps when we
know that the backend will keep re-accessing the block until it is full
- so only of interest in steady-state workloads.IMHO Jim's proposal makes more sense for general use.
heap_blks_extend: The number of times file extend was
invoked on the
relation
Sounds good
heap_blks_truncate: The total number of blocks that have
been truncated due
to vacuum activity e.g.
Sounds good
As an addendum to the truncate stats above, we can also have
the additional
following stats:
heap_blks_maxtruncate: The max block of buffers truncated in
one go
heap_blks_ntruncate: The number of times truncate was called
on this
relation
Those last 2 sound too complex for normal use and ntruncate is most
likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is
a more interesting metric? We've got last vacuum date, but no indication
of how frequently a vacuum has run.Do you have a use-case for this info? I can see where it might
be neat
to know, but I'm not sure how you'd actually use it in the
real world.<Nikhils>
The use-case according to me is that these stats help prove the
effectiveness of autovacuum/vacuum operations. By varying some autovac
guc variables, and doing subsequent (pgbench e.g.) runs, one can find
out the optimum values for these variables using these stats.
<Nikhils>This should be useful for tuning space allocation/deallocation. If we
get this patch in early it should help get feedback on this area.--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com--
All the world's a stage, and most of us are desperately unrehearsed.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Hi Jim,
On 10/18/06, Jim C. Nasby <jim@nasby.net> wrote:
Also how many times a relation has been vacuumed (which puts all the
other numbers in more perspective... good catch Simon). And I think
number of pages that could not be added to the FSM would also be
extremely valuable.
By the above, do you mean the number of pages that could not be added to the
FSM because they had freespace which was less than the threshold for this
particular relation?
Regards,
Nikhils
EnterpriseDB http://www.enterprisedb.com
On Wed, Oct 18, 2006 at 11:27:39AM +0530, NikhilS wrote:
Hi,
So:
heap_blks_reused (with Jim's semantics), heap_blks_extend,
heap_blks_truncate are the "interesting" stats? Will try to work up apatch
for this.
Regards,
Nikhils
EnterpriseDB http://www.enterprisedb.com
On 10/15/06, Simon Riggs <simon@2ndquadrant.com> wrote:On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:
On 10/13/06, Jim C. Nasby <jim@nasby.net> wrote:
I'm also not sure if this metric is what you actually want,
since a
single page can be returned many times from the FSM even
between
vacuums. Tracking how many pages for a relation have been put
into the
FSM might be more useful...<Nikhils>
Pages might be put into the FSM, but by this metric don't we get the
actual usage of the pages from the FSM? Agreed a single page can be
returned multiple times, but since it serves a new tuple, shouldn'twe
track it?
<Nikhils>This makes sense for indexes, but only makes sense for heaps when we
know that the backend will keep re-accessing the block until it is full
- so only of interest in steady-state workloads.IMHO Jim's proposal makes more sense for general use.
heap_blks_extend: The number of times file extend was
invoked on the
relation
Sounds good
heap_blks_truncate: The total number of blocks that have
been truncated due
to vacuum activity e.g.
Sounds good
As an addendum to the truncate stats above, we can also
have
the additional
following stats:
heap_blks_maxtruncate: The max block of buffers truncated
in
one go
heap_blks_ntruncate: The number of times truncate was
called
on this
relation
Those last 2 sound too complex for normal use and ntruncate is most
likely the same as number of vacuums anyway. Hmmm...Perhaps nvacuums is
a more interesting metric? We've got last vacuum date, but noindication
of how frequently a vacuum has run.
Do you have a use-case for this info? I can see where it
might
be neat
to know, but I'm not sure how you'd actually use it in the
real world.<Nikhils>
The use-case according to me is that these stats help prove the
effectiveness of autovacuum/vacuum operations. By varying someautovac
guc variables, and doing subsequent (pgbench e.g.) runs, one can find
out the optimum values for these variables using these stats.
<Nikhils>This should be useful for tuning space allocation/deallocation. If we
get this patch in early it should help get feedback on this area.--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com--
All the world's a stage, and most of us are desperately unrehearsed.--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
--
All the world's a stage, and most of us are desperately unrehearsed.
On Thu, Oct 19, 2006 at 04:10:46PM +0530, NikhilS wrote:
Hi Jim,
On 10/18/06, Jim C. Nasby <jim@nasby.net> wrote:
Also how many times a relation has been vacuumed (which puts all the
other numbers in more perspective... good catch Simon). And I think
number of pages that could not be added to the FSM would also be
extremely valuable.By the above, do you mean the number of pages that could not be added to the
FSM because they had freespace which was less than the threshold for this
particular relation?
Yes... but... :)
We want to ignore pages that have less than the average request size,
because vacuum will never try and put them in the FSM anyway. We only
care about pages that were dropped because MaxFSMPages was less than
DesiredFSMPages (see freespace.c for more info).
It would also be useful to keep track of what relations have been bumped
out of the FSM (or never got recorded, though I'm not sure if that's
possible) because we've run into the MaxFSMRelations limit.
BTW, if we add these counters we'll be up to 7 stats dealing with vacuum
and analyze, and NikhilS has a patch we're finalizing that would add 3
more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe
we should have a separate view for vacuum/FSM statistics?
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
"Jim C. Nasby" <jim@nasby.net> writes:
BTW, if we add these counters we'll be up to 7 stats dealing with vacuum
and analyze, and NikhilS has a patch we're finalizing that would add 3
more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe
we should have a separate view for vacuum/FSM statistics?
I've seen no demonstration of a need for *any* of them, actually,
and am pretty dubious that we want to add so much collection overhead.
At least not without a major redesign of the stats reporting mechanism.
If we just drop in another seven counters, we'll create an immediate 50%
increase in the stats-file I/O volume, even when no vacuuming is
happening at all.
regards, tom lane
On Thu, Oct 19, 2006 at 11:47:53AM -0400, Tom Lane wrote:
"Jim C. Nasby" <jim@nasby.net> writes:
BTW, if we add these counters we'll be up to 7 stats dealing with vacuum
and analyze, and NikhilS has a patch we're finalizing that would add 3
more. Right now there's 4 slated to go into pg_stat_* in 8.2, but maybe
we should have a separate view for vacuum/FSM statistics?I've seen no demonstration of a need for *any* of them, actually,
and am pretty dubious that we want to add so much collection overhead.
At least not without a major redesign of the stats reporting mechanism.
If we just drop in another seven counters, we'll create an immediate 50%
increase in the stats-file I/O volume, even when no vacuuming is
happening at all.
Yeah, for stuff like vacuuming the current stats system may not make any
sense. Almost anything dealing with vacuum can really just be put into a
table, because it doesn't happen all that often.
My concern is that there's enough useful data to collect about vacuuming
and the FSM that it should probably get it's own set of tables/views,
rather than piggy-backing on pg_stat_*. But that ship has pretty much
sailed, so we're probably stuck with at least the last_* stuff in
pg_stat_* for the immediate future.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)