add vacuum starttime columns
Hi Hackers
Some one complains that pg_stat_all_tables does not see the time when
the table starts vacuum/autovacuum, we want to hear from the big boss
[image: image.png]
Attachments:
image.pngimage/png; name=image.pngDownload+2-1
On Monday, December 30, 2024, wenhui qiu <qiuwenhuifx@gmail.com> wrote:
Hi Hackers
Some one complains that pg_stat_all_tables does not see the time when
the table starts vacuum/autovacuum, we want to hear from the big boss[image: image.png]
Your email is quite unclear. Who do you consider to be the big boss? Is it
ok if anyone watching this list answers? What do you want to hear? How
did you get the image to show columns that don’t exist? Why do you want
something other than the last vacuum commit timestamp necessary to make a
time sense of the “since_vacuum” data elements?
David J.
Attachments:
image.pngimage/png; name=image.pngDownload+2-1
The last_(auto)vacuum is useful because it allows
the user to monitor vacuum to ensure that vacuums
are completing on a relation at expected intervals.
I am not sure what value a start time will provide.
Can you provide a reason for this?
Regards,
Sami Imseih
Sorry, I and my friend improve that columns,
I just want to know what think about this feature,big boss is only a term
of respect in China
On Tue, 31 Dec 2024 at 12:31, David G. Johnston <david.g.johnston@gmail.com>
wrote:
Show quoted text
On Monday, December 30, 2024, wenhui qiu <qiuwenhuifx@gmail.com> wrote:
Hi Hackers
Some one complains that pg_stat_all_tables does not see the time
when the table starts vacuum/autovacuum, we want to hear from the big boss[image: image.png]
Your email is quite unclear. Who do you consider to be the big boss? Is
it ok if anyone watching this list answers? What do you want to hear? How
did you get the image to show columns that don’t exist? Why do you want
something other than the last vacuum commit timestamp necessary to make a
time sense of the “since_vacuum” data elements?David J.
Attachments:
image.pngimage/png; name=image.pngDownload+2-1
Hi Sami
Many people have encountered situations where autovacuum or auto analyze on
tables are not triggered in time, leading to suboptimal execution plans and
some performance issues. When analyzing such problems, we often need to
trace back to when autovacuum or auto analyze was triggered for the
corresponding table. However, if the log_autovacuum_min_duration parameter
is not configured (as I’ve encountered in many cases where this parameter
is either not set or has an inappropriate value), we cannot determine the
trigger time and duration of the operation.
Our idea is to directly query the pg_stat_all_tables view to obtain the
start time of (auto)vacuum/(auto)analyze for a table. This would help us
monitor the duration of vacuum/analyze and determine whether it is
necessary to tune and speed up the vacuum/analyze process.
Of course, to observe the duration of vacuum operations, we can configure
the log_autovacuum_min_durationparameter, but if there are many tables in
the database, the vacuum entries in the logs might be quite numerous,
making it difficult to analyze.
Additionally, we are currently considering whether it would be possible to
add a last_(auto)vacuum_start field to the pg_stat_all_tables view. For
tables where a vacuum operation is in progress, the last_(auto)vacuum field
may not be updated, and it may not be possible to estimate the vacuum
duration using just these two fields.
However, this could still indicate whether a vacuum is in progress (if
last_(auto)vacuum_start is more recent than last_(auto)vacuum, it means a
vacuum is ongoing). While it is possible to monitor vacuum activity through
the pg_stat_progress_vacuum view, this view itself does not record
timestamps, so additional views might be needed, which would be less
convenient than querying pg_stat_all_tables directly.
Therefore, we personally believe that adding such fields would be
beneficial for monitoring the execution of (auto)vacuum and (auto)analyze.
Thanks
On Tue, Dec 31, 2024 at 12:40 PM Sami Imseih <samimseih@gmail.com> wrote:
Show quoted text
The last_(auto)vacuum is useful because it allows
the user to monitor vacuum to ensure that vacuums
are completing on a relation at expected intervals.
I am not sure what value a start time will provide.
Can you provide a reason for this?Regards,
Sami Imseih
However, if the log_autovacuum_min_duration parameter is not configured (as I’ve encountered in many cases where this parameter is either not set or has an inappropriate value), we cannot determine the trigger time and duration of the operation.
log_autovacuum_min_duration logs when the vacuum completes
and not when it starts. It does provide the elapsed time in the log
as you say.
Our idea is to directly query the pg_stat_all_tables view to obtain the start time of (auto)vacuum/(auto)analyze for a table. This would help us monitor the duration of vacuum/analyze and determine whether it is necessary to tune and speed up the vacuum/analyze process.
This is the crux of the question. I agree that pg_stat_all_tables is missing
information to allow someone to determine if the vacuums for a table
are becoming progressively longer to complete or to verify the vacuum
is faster after some tuning was performed.
The pg_stat_all_tables.vacuum(autovacuum)_count
does not help answer these questions and one must then enable
extra logging.
but if there are many tables in the database, the vacuum entries in the logs might be quite numerous, making it difficult to analyze.
Agree
However, this could still indicate whether a vacuum is in progress (if last_(auto)vacuum_start is more recent than last_(auto)vacuum, it means a vacuum is ongoing). While it is possible to monitor vacuum activity through the pg_stat_progress_vacuum view, this view itself does not record timestamps, so additional views might be needed, which would be less convenient than querying pg_stat_all_tables directly.
The progress view does a good job of providing the
phase of the vacuum which at least tells you if your
vacuum is not "stuck" on a single operation. It is not
really good at providing information to predict when
the vacuum will complete. It is very difficult to actually
make such a prediction.
IMO, I think this situation could be improved, but I am
not sure if a start_time is the best answer. Let's suppose
you set the start_time at the beginning of the vacuum,
then if the vacuum for some reason terminates early, then
you are left with some inconsistent data.
Thinking about this, it will be good to have cumulative statistics
for elapsed times of a all vacuums on a relation ( since last reset ).
This way, someone who is monitoring pg_stat_all_tables
can divide the total accumulated elapsed time by the vacuum counts.
This can give a good idea of how long vacuum is taking on a table
and with enough sampling over time, one can derive patterns
of vacuum slowing down, etc.
Of course, the time can only be accumulated when the vacuum
completes, which should be good enough.
Regards,
Sami Imseih
On Tue, Dec 31, 2024 at 2:33 AM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
Of course, to observe the duration of vacuum operations, we can configure
the log_autovacuum_min_durationparameter, but if there are many tables in
the database, the vacuum entries in the logs might be quite numerous,
making it difficult to analyze.Using log_autovacuum_min_duration really is the best solution here. It is
not very difficult to write a script to pull the information out. Yes, it
can be a lot of lines if there are a lot of tables, but that's part of why
it's a duration and not a toggle - you can ignore the quick-running ones.
Also, the log files give you historical overview that the pg_stat views
simply cannot provide, in addition to the actual details of what was
vacuumed and why. All that is to say that I am not convinced we need to
tweak the system views when we have the information already available in a
better way.
For what it's worth, log_autovacuum_min_duration is one of the few
parameters that I always recommend be turned on at the highest level (i.e.
log it all). The extra log verbosity is well worth it.
Cheers,
Greg
Also, the log files give you historical overview that the pg_stat views simply cannot provide,
in addition to the actual details of what was vacuumed and why.
Logfiles have the ability to provide more details and they have their place.
However, one must also think about how much logging they want to
enable and this limits how much they can learn about the system.
A user also needs to come up with a custom solution to parse and process
data.
This is why we provide other (auto)vacuum/(auto)analyze metrics in pg_stat
views. logs alone are not very practical to learn about such an important
activity in the cluster.
Being able to quickly answer "how long are vacuum/autovacuums are
taking per table?"
and being able to trend this information without enabling additional logging
is a good user experience in my mind.
For what it's worth, log_autovacuum_min_duration is one of the few parameters that I always
recommend be turned on at the highest level (i.e. log it all). The extra log verbosity is well worth it.
If we can avoid this additional logging, the better IMO.
I would like to see us actually expose more vacuum level metrics in
pg_stats, so maybe
this is the time to think about potentially dedicated views for vacuum/analyze.
Regards,
Sami
worth mentioning here that there is discussion for tracking vacuum history
in this recent thread [1]/messages/by-id/b68ab452-c41f-4d04-893f-eaab84f1855b@vondra.me, and this includes both start_time and end_time
of the vacuum from what I can tell.
postgres=# select * from vacuum_history ;
start_time | end_time | dbid | relid | is_autovacuum | .....
It will also be good to see if there are other opinions for accumulating
the elapsed time for vacuums in pg_stat_all_tables.
[1]: /messages/by-id/b68ab452-c41f-4d04-893f-eaab84f1855b@vondra.me
Regards,
Sami