[PROPOSAL] VACUUM Progress Checker.
Hello Hackers,
Following is a proposal for feature to calculate VACUUM progress.
Use Case : Measuring progress of long running VACUUMs to help DBAs make
informed decision
whether to continue running VACUUM or abort it.
Design:
A shared preload library to store progress information from different
backends running VACUUM, calculate remaining time for each and display
progress in the
in the form a view.
VACUUM needs to be instrumented with a hook to collect progress
information (pages vacuumed/scanned) periodically.
The patch attached implements a new hook to store vacuumed_pages and
scanned_pages count at the end of each page scanned by VACUUM.
This information is stored in a shared memory structure.
In addition to measuring progress this function using hook also calculates
remaining time for VACUUM.
The frequency of collecting progress information can be reduced by
appending delays in between hook function calls.
Also, a GUC parameter
log_vacuum_min_duration can be used.
This will cause VACUUM progress to be calculated only if VACUUM runs more
than specified milliseconds.
A value of zero calculates VACUUM progress for each page processed. -1
disables logging.
Progress calculation :
percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;
remaining_time = elapsed_time * (total_pages_to_be_scanned - scanned_pages)
/ scanned_pages;
Shared memory struct:
typedef struct PgStat_VacuumStats
{
Oid databaseoid;
Oid tableoid;
Int32 vacuumed_pages;
Int32 total_pages;
Int32 scanned_pages;
double elapsed_time;
double remaining_time;
} PgStat_VacuumStats[max_connections];
Reporting :
A view named 'pg_maintenance_progress' can be created using the values in
the struct above.
pg_stat_maintenance can be called from any other backend and will display
progress of
each running VACUUM.
Other uses of hook in VACUUM:
Cost of VACUUM in terms of pages hit , missed and dirtied same as
autovacuum can be collected using this hook.
Autovacuum does it at the end of VACUUM for each table. It can be done
while VACUUM on a table is in progress.
This can be helpful to track manual VACUUMs also not just autovacuum.
Read/Write(I/O) rates can be computed on the lines of autovacuum.
Read rate patterns can be used to help tuning future vacuum on the
table(like shared buffers tuning)
Other resource usages can also be collected using progress checker hook.
Attached patch is POC patch of progress calculation for a single backend.
Also attached is a brief snapshot of the output log.
Hi
2015-06-30 9:37 GMT+02:00 Rahila Syed <rahilasyed90@gmail.com>:
Hello Hackers,
Following is a proposal for feature to calculate VACUUM progress.
interesting idea - I like to see it integrated to core.
Use Case : Measuring progress of long running VACUUMs to help DBAs make
informed decision
whether to continue running VACUUM or abort it.Design:
A shared preload library to store progress information from different
backends running VACUUM, calculate remaining time for each and display
progress in the
in the form a view.
probably similar idea can be used for REINDEX, CREATE INDEX, COPY TO
statements
I though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM started by autovacuum too.
Regards
Pavel
Show quoted text
VACUUM needs to be instrumented with a hook to collect progress
information (pages vacuumed/scanned) periodically.The patch attached implements a new hook to store vacuumed_pages and
scanned_pages count at the end of each page scanned by VACUUM.This information is stored in a shared memory structure.
In addition to measuring progress this function using hook also calculates
remaining time for VACUUM.The frequency of collecting progress information can be reduced by
appending delays in between hook function calls.Also, a GUC parameter
log_vacuum_min_duration can be used.
This will cause VACUUM progress to be calculated only if VACUUM runs more
than specified milliseconds.A value of zero calculates VACUUM progress for each page processed. -1
disables logging.Progress calculation :
percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;
remaining_time = elapsed_time * (total_pages_to_be_scanned -
scanned_pages) / scanned_pages;Shared memory struct:
typedef struct PgStat_VacuumStats
{
Oid databaseoid;
Oid tableoid;
Int32 vacuumed_pages;
Int32 total_pages;
Int32 scanned_pages;
double elapsed_time;
double remaining_time;
} PgStat_VacuumStats[max_connections];
Reporting :
A view named 'pg_maintenance_progress' can be created using the values in
the struct above.pg_stat_maintenance can be called from any other backend and will display
progress ofeach running VACUUM.
Other uses of hook in VACUUM:
Cost of VACUUM in terms of pages hit , missed and dirtied same as
autovacuum can be collected using this hook.Autovacuum does it at the end of VACUUM for each table. It can be done
while VACUUM on a table is in progress.
This can be helpful to track manual VACUUMs also not just autovacuum.Read/Write(I/O) rates can be computed on the lines of autovacuum.
Read rate patterns can be used to help tuning future vacuum on the
table(like shared buffers tuning)
Other resource usages can also be collected using progress checker hook.Attached patch is POC patch of progress calculation for a single backend.
Also attached is a brief snapshot of the output log.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jun 30, 2015 at 7:37 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
Hello Hackers,
Following is a proposal for feature to calculate VACUUM progress.
Use Case : Measuring progress of long running VACUUMs to help DBAs make
informed decision
whether to continue running VACUUM or abort it.
+1
I was thinking recently that it would be very cool to see some
estimation of the progress of VACUUM and CLUSTER in a view similar to
pg_stat_activity, or the ps title.
--
Thomas Munro
http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jun 30, 2015 at 1:07 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
Hello Hackers,
Following is a proposal for feature to calculate VACUUM progress.
Use Case : Measuring progress of long running VACUUMs to help DBAs make
informed decision
whether to continue running VACUUM or abort it.+1
I am excited to know how the progress works in when any of the statement
got blocked during locks. Rather displaying the stats in the LOG, shall we
have this in a pg_stat_vacuum_activity[ New catalog for all auto-vacuum
stats].
Best Regards,
Dinesh
manojadinesh.blogspot.com
Design:
Show quoted text
A shared preload library to store progress information from different
backends running VACUUM, calculate remaining time for each and display
progress in the
in the form a view.VACUUM needs to be instrumented with a hook to collect progress
information (pages vacuumed/scanned) periodically.The patch attached implements a new hook to store vacuumed_pages and
scanned_pages count at the end of each page scanned by VACUUM.This information is stored in a shared memory structure.
In addition to measuring progress this function using hook also calculates
remaining time for VACUUM.The frequency of collecting progress information can be reduced by
appending delays in between hook function calls.Also, a GUC parameter
log_vacuum_min_duration can be used.
This will cause VACUUM progress to be calculated only if VACUUM runs more
than specified milliseconds.A value of zero calculates VACUUM progress for each page processed. -1
disables logging.Progress calculation :
percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;
remaining_time = elapsed_time * (total_pages_to_be_scanned -
scanned_pages) / scanned_pages;Shared memory struct:
typedef struct PgStat_VacuumStats
{
Oid databaseoid;
Oid tableoid;
Int32 vacuumed_pages;
Int32 total_pages;
Int32 scanned_pages;
double elapsed_time;
double remaining_time;
} PgStat_VacuumStats[max_connections];
Reporting :
A view named 'pg_maintenance_progress' can be created using the values in
the struct above.pg_stat_maintenance can be called from any other backend and will display
progress ofeach running VACUUM.
Other uses of hook in VACUUM:
Cost of VACUUM in terms of pages hit , missed and dirtied same as
autovacuum can be collected using this hook.Autovacuum does it at the end of VACUUM for each table. It can be done
while VACUUM on a table is in progress.
This can be helpful to track manual VACUUMs also not just autovacuum.Read/Write(I/O) rates can be computed on the lines of autovacuum.
Read rate patterns can be used to help tuning future vacuum on the
table(like shared buffers tuning)
Other resource usages can also be collected using progress checker hook.Attached patch is POC patch of progress calculation for a single backend.
Also attached is a brief snapshot of the output log.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 30 June 2015 at 08:52, Pavel Stehule <pavel.stehule@gmail.com> wrote:
I though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM started by autovacuum too.
Yes, I suggest just a single column on pg_stat_activity called pct_complete
trace_completion_interval = 5s (default)
Every interval, we report the current % complete for any operation that
supports it. We just show NULL if the current operation has not reported
anything or never will.
We do this for VACUUM first, then we can begin adding other operations as
we work out how (for that operation).
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2015-06-30 PM 04:37, Rahila Syed wrote:
Design:
A shared preload library to store progress information from different
backends running VACUUM, calculate remaining time for each and display
progress in the
in the form a view.[...]
Reporting :
A view named 'pg_maintenance_progress' can be created using the values in
the struct above.pg_stat_maintenance can be called from any other backend and will display
progress of
+1
Just to clarify, the attached patch does not implement the view or the shared
memory initialization part yet, right? I understand your intention to get
comments on proposed hooks and shared memory structure(s) at this point. By
the way, how does a regular send stats to background stats collector approach
compares to the proposed hooks+shmem approach?
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 30 June 2015 at 08:37, Rahila Syed <rahilasyed90@gmail.com> wrote:
Hello Hackers,
Following is a proposal for feature to calculate VACUUM progress.
Use Case : Measuring progress of long running VACUUMs to help DBAs make
informed decision
whether to continue running VACUUM or abort it.Design:
A shared preload library to store progress information from different
backends running VACUUM, calculate remaining time for each and display
progress in the
in the form a view.VACUUM needs to be instrumented with a hook to collect progress
information (pages vacuumed/scanned) periodically.The patch attached implements a new hook to store vacuumed_pages and
scanned_pages count at the end of each page scanned by VACUUM.This information is stored in a shared memory structure.
In addition to measuring progress this function using hook also calculates
remaining time for VACUUM.The frequency of collecting progress information can be reduced by
appending delays in between hook function calls.Also, a GUC parameter
log_vacuum_min_duration can be used.
This will cause VACUUM progress to be calculated only if VACUUM runs more
than specified milliseconds.A value of zero calculates VACUUM progress for each page processed. -1
disables logging.Progress calculation :
percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;
remaining_time = elapsed_time * (total_pages_to_be_scanned -
scanned_pages) / scanned_pages;Shared memory struct:
typedef struct PgStat_VacuumStats
{
Oid databaseoid;
Oid tableoid;
Int32 vacuumed_pages;
Int32 total_pages;
Int32 scanned_pages;
double elapsed_time;
double remaining_time;
} PgStat_VacuumStats[max_connections];
Reporting :
A view named 'pg_maintenance_progress' can be created using the values in
the struct above.pg_stat_maintenance can be called from any other backend and will display
progress ofeach running VACUUM.
Other uses of hook in VACUUM:
Cost of VACUUM in terms of pages hit , missed and dirtied same as
autovacuum can be collected using this hook.Autovacuum does it at the end of VACUUM for each table. It can be done
while VACUUM on a table is in progress.
This can be helpful to track manual VACUUMs also not just autovacuum.Read/Write(I/O) rates can be computed on the lines of autovacuum.
Read rate patterns can be used to help tuning future vacuum on the
table(like shared buffers tuning)
Other resource usages can also be collected using progress checker hook.Attached patch is POC patch of progress calculation for a single backend.
Also attached is a brief snapshot of the output log.
@@ -559,7 +567,9 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
* following blocks.
*/
if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD)
+ {
skipping_all_visible_blocks = true;
+ }
There's no need to add those curly braces, or to subsequent if blocks.
Also, is this patch taking the visibility map into account for its
calculations?
--
Thom
Hello,
There's no need to add those curly braces, or to subsequent if blocks
Yes, those are added by mistake.
Also, is this patch taking the visibility map into account for its calculations?
Yes, it subtracts skippable/all-visible pages from total pages to be scanned.
For each page processed by lazy_scan_heap, if number of all visible pages ahead exceeds the threshold, it is subtracted from
the ‘total pages to be scanned’ count.
The all visible pages are accounted for incrementally during the execution of VACUUM and not before starting the process.
Thank you,
Rahila Syed
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Thom Brown
Sent: Tuesday, June 30, 2015 2:20 PM
To: Rahila Syed
Cc: PostgreSQL-development
Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
On 30 June 2015 at 08:37, Rahila Syed <rahilasyed90@gmail.com<mailto:rahilasyed90@gmail.com>> wrote:
Hello Hackers,
Following is a proposal for feature to calculate VACUUM progress.
Use Case : Measuring progress of long running VACUUMs to help DBAs make informed decision
whether to continue running VACUUM or abort it.
Design:
A shared preload library to store progress information from different backends running VACUUM, calculate remaining time for each and display progress in the
in the form a view.
VACUUM needs to be instrumented with a hook to collect progress information (pages vacuumed/scanned) periodically.
The patch attached implements a new hook to store vacuumed_pages and scanned_pages count at the end of each page scanned by VACUUM.
This information is stored in a shared memory structure.
In addition to measuring progress this function using hook also calculates remaining time for VACUUM.
The frequency of collecting progress information can be reduced by appending delays in between hook function calls.
Also, a GUC parameter
log_vacuum_min_duration can be used.
This will cause VACUUM progress to be calculated only if VACUUM runs more than specified milliseconds.
A value of zero calculates VACUUM progress for each page processed. -1 disables logging.
Progress calculation :
percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;
remaining_time = elapsed_time * (total_pages_to_be_scanned - scanned_pages) / scanned_pages;
Shared memory struct:
typedef struct PgStat_VacuumStats
{
Oid databaseoid;
Oid tableoid;
Int32 vacuumed_pages;
Int32 total_pages;
Int32 scanned_pages;
double elapsed_time;
double remaining_time;
} PgStat_VacuumStats[max_connections];
Reporting :
A view named 'pg_maintenance_progress' can be created using the values in the struct above.
pg_stat_maintenance can be called from any other backend and will display progress of
each running VACUUM.
Other uses of hook in VACUUM:
Cost of VACUUM in terms of pages hit , missed and dirtied same as autovacuum can be collected using this hook.
Autovacuum does it at the end of VACUUM for each table. It can be done while VACUUM on a table is in progress.
This can be helpful to track manual VACUUMs also not just autovacuum.
Read/Write(I/O) rates can be computed on the lines of autovacuum.
Read rate patterns can be used to help tuning future vacuum on the table(like shared buffers tuning)
Other resource usages can also be collected using progress checker hook.
Attached patch is POC patch of progress calculation for a single backend.
Also attached is a brief snapshot of the output log.
@@ -559,7 +567,9 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
* following blocks.
*/
if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD)
+ {
skipping_all_visible_blocks = true;
+ }
There's no need to add those curly braces, or to subsequent if blocks.
Also, is this patch taking the visibility map into account for its calculations?
--
Thom
______________________________________________________________________
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.
Hello,
Thank you for suggestions.
Yes, I suggest just a single column on pg_stat_activity called pct_complete
Reporting remaining time also can be crucial to make decisions regarding
continuing or aborting VACUUM.
The same has been suggested in the thread below,
/messages/by-id/13072.1284826206@sss.pgh.pa.us
trace_completion_interval = 5s (default)
Every interval, we report the current % complete for any operation that
supports it. We just show NULL if the current operation has not reported
anything or never will.
We do this for VACUUM first, then we can begin adding other operations as
we work out how (for that operation).
Thank you for explaining. This design seems good to me except, adding more
than one columns(percent_complete, remaining_time) if required to
pg_stat_activity can be less user intuitive than having a separate view for
VACUUM.
-Rahila Syed
On Tue, Jun 30, 2015 at 2:02 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
Show quoted text
On 30 June 2015 at 08:52, Pavel Stehule <pavel.stehule@gmail.com> wrote:
I though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM started by autovacuum too.Yes, I suggest just a single column on pg_stat_activity called pct_complete
trace_completion_interval = 5s (default)
Every interval, we report the current % complete for any operation that
supports it. We just show NULL if the current operation has not reported
anything or never will.We do this for VACUUM first, then we can begin adding other operations as
we work out how (for that operation).--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
I though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM >started by autovacuum too.
Thank you for suggestion. The design with hooks and a separate view was
mainly to keep most of the code outside core as the feature proposed is
specific to VACUUM command. Also, having a separate view can give more
flexibility in terms of displaying various progress parameters.
FWIW ,there was resistance to include columns in pg_stat_activity earlier
in the following thread,
/messages/by-id/AANLkTi=TcuMA38oGUKX9p5WVPpY+M3L0XUp7=PLT+LCT@mail.gmail.com
Thank you,
Rahila Syed
On Tue, Jun 30, 2015 at 1:22 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:
Show quoted text
Hi
2015-06-30 9:37 GMT+02:00 Rahila Syed <rahilasyed90@gmail.com>:
Hello Hackers,
Following is a proposal for feature to calculate VACUUM progress.
interesting idea - I like to see it integrated to core.
Use Case : Measuring progress of long running VACUUMs to help DBAs make
informed decision
whether to continue running VACUUM or abort it.Design:
A shared preload library to store progress information from different
backends running VACUUM, calculate remaining time for each and display
progress in the
in the form a view.probably similar idea can be used for REINDEX, CREATE INDEX, COPY TO
statementsI though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM started by autovacuum too.Regards
Pavel
VACUUM needs to be instrumented with a hook to collect progress
information (pages vacuumed/scanned) periodically.The patch attached implements a new hook to store vacuumed_pages and
scanned_pages count at the end of each page scanned by VACUUM.This information is stored in a shared memory structure.
In addition to measuring progress this function using hook also
calculates remaining time for VACUUM.The frequency of collecting progress information can be reduced by
appending delays in between hook function calls.Also, a GUC parameter
log_vacuum_min_duration can be used.
This will cause VACUUM progress to be calculated only if VACUUM runs more
than specified milliseconds.A value of zero calculates VACUUM progress for each page processed. -1
disables logging.Progress calculation :
percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;
remaining_time = elapsed_time * (total_pages_to_be_scanned -
scanned_pages) / scanned_pages;Shared memory struct:
typedef struct PgStat_VacuumStats
{
Oid databaseoid;
Oid tableoid;
Int32 vacuumed_pages;
Int32 total_pages;
Int32 scanned_pages;
double elapsed_time;
double remaining_time;
} PgStat_VacuumStats[max_connections];
Reporting :
A view named 'pg_maintenance_progress' can be created using the values
in the struct above.pg_stat_maintenance can be called from any other backend and will display
progress ofeach running VACUUM.
Other uses of hook in VACUUM:
Cost of VACUUM in terms of pages hit , missed and dirtied same as
autovacuum can be collected using this hook.Autovacuum does it at the end of VACUUM for each table. It can be done
while VACUUM on a table is in progress.
This can be helpful to track manual VACUUMs also not just autovacuum.Read/Write(I/O) rates can be computed on the lines of autovacuum.
Read rate patterns can be used to help tuning future vacuum on the
table(like shared buffers tuning)
Other resource usages can also be collected using progress checker hook.Attached patch is POC patch of progress calculation for a single backend.
Also attached is a brief snapshot of the output log.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-07-02 AM 11:41, Rahila Syed wrote:
Hello,
I though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM >started by autovacuum too.Thank you for suggestion. The design with hooks and a separate view was
mainly to keep most of the code outside core as the feature proposed is
specific to VACUUM command. Also, having a separate view can give more
flexibility in terms of displaying various progress parameters.
Unless I am missing something, I guess you can still keep the actual code that
updates counters outside the core if you adopt an approach that Simon
suggests. Whatever the view (existing/new), any related counters would have a
valid (non-NULL) value when read off the view iff hooks are set perhaps
because you have an extension that sets them. I guess that means any operation
that "supports" progress tracking would have an extension with suitable hooks
implemented.
Of course unless I misinterpreted Simon's words.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello,
Thank you for suggestion. The design with hooks and a separate view was
mainly to keep most of the >code outside core as the feature proposed is
specific to VACUUM command. Also, having a separate view >can give more
flexibility in terms of displaying various progress parameters.
FWIW ,there was resistance to include columns in pg_stat_activity earlier
in the following thread,
/messages/by-id/AANLkTi=TcuMA38oGUKX9p5WVPpY+M3L0XUp7=PLT+LCT@...
Perhaps as suggested in the link, the progress could be made available via a
function call which does progress calculation "on demand". Then we do not
need a separate view, or clutter pg_stat_activity, and also has benefit of
calculating progress just when it's needed.
--
View this message in context: http://postgresql.nabble.com/PROPOSAL-VACUUM-Progress-Checker-tp5855849p5856192.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2 July 2015 at 03:00, Rahila Syed <rahilasyed90@gmail.com> wrote:
Yes, I suggest just a single column on pg_stat_activity called
pct_complete
Reporting remaining time also can be crucial to make decisions regarding
continuing or aborting VACUUM.
The same has been suggested in the thread below,/messages/by-id/13072.1284826206@sss.pgh.pa.us
trace_completion_interval = 5s (default)
Every interval, we report the current % complete for any operation that
supports it. We just show NULL if the current operation has not reported
anything or never will.We do this for VACUUM first, then we can begin adding other operations
as we work out how (for that operation).
Thank you for explaining. This design seems good to me except, adding more
than one columns(percent_complete, remaining_time)
It is attractive to have a "remaining_time" column, or a
"predicted_completion_timestamp", but those columns are prediction
calculations rather than actual progress reports. I'm interested in seeing
a report that relates to actual progress made.
Predicted total work required is also interesting, but is much less
trustworthy figure.
I think we'll need to get wider input about the user interface for this
feature.
if required to pg_stat_activity can be less user intuitive than having a
separate view for VACUUM.
I think it is a mistake to do something just for VACUUM.
Monitoring software will look at pg_stat_activity. I don't think we should
invent a separate view for progress statistics because it will cause users
to look in two places rather than just one. Reporting progress is fairly
cheap instrumentation, calculating a prediction of completion time might be
expensive.
Having said that, monitoring systems currently use a polling mechanism to
retrieve status data. They look at information published by the backend. We
don't currently have a mechanism to defer publication of expensive
monitoring information until requested by the monitoring system. If you
have a design for how that might work then say so, otherwise we need to
assume a simple workflow: the backend publishes whatever it chooses,
whenever it chooses and then that is made available via the monitoring
system via views.
Your current design completely misses the time taken to scan indexes, which
is significant.
There might be a justification to put this out of core, but measuring
progress of VACUUM wouldn't be it, IMHO.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Le 2 juil. 2015 7:28 AM, "Simon Riggs" <simon@2ndquadrant.com> a écrit :
On 2 July 2015 at 03:00, Rahila Syed <rahilasyed90@gmail.com> wrote:
Yes, I suggest just a single column on pg_stat_activity called
pct_complete
Reporting remaining time also can be crucial to make decisions regarding
continuing or aborting VACUUM.
The same has been suggested in the thread below,
/messages/by-id/13072.1284826206@sss.pgh.pa.us
trace_completion_interval = 5s (default)
Every interval, we report the current % complete for any operation that
supports it. We just show NULL if the current operation has not reported
anything or never will.
We do this for VACUUM first, then we can begin adding other operations
as we work out how (for that operation).
Thank you for explaining. This design seems good to me except, adding
more than one columns(percent_complete, remaining_time)
It is attractive to have a "remaining_time" column, or a
"predicted_completion_timestamp", but those columns are prediction
calculations rather than actual progress reports. I'm interested in seeing
a report that relates to actual progress made.
Agreed.
Predicted total work required is also interesting, but is much less
trustworthy figure.
And it is something a client app or an extension can compute. No need to
put this in core as long as we have the actual progress.
I think we'll need to get wider input about the user interface for this
feature.
if required to pg_stat_activity can be less user intuitive than having a
separate view for VACUUM.
I think it is a mistake to do something just for VACUUM.
Monitoring software will look at pg_stat_activity. I don't think we
should invent a separate view for progress statistics because it will cause
users to look in two places rather than just one. Reporting progress is
fairly cheap instrumentation, calculating a prediction of completion time
might be expensive.
+1
Having said that, monitoring systems currently use a polling mechanism to
retrieve status data. They look at information published by the backend. We
don't currently have a mechanism to defer publication of expensive
monitoring information until requested by the monitoring system. If you
have a design for how that might work then say so, otherwise we need to
assume a simple workflow: the backend publishes whatever it chooses,
whenever it chooses and then that is made available via the monitoring
system via views.
Your current design completely misses the time taken to scan indexes,
which is significant.
There might be a justification to put this out of core, but measuring
progress of VACUUM wouldn't be it, IMHO.
Show quoted text
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello,
Unless I am missing something, I guess you can still keep the actual code that updates counters outside the core if you adopt an approach that Simon suggests.
Yes. The code to extract progress information from VACUUM and storing in shared memory can be outside core even with pg_stat_activity as a user interface.
Whatever the view (existing/new), any related counters would have a valid (non-NULL) value when read off the view iff hooks are set perhaps because you have an extension that sets them.
I guess that means any operation that "supports" progress tracking would have an extension with suitable hooks implemented.
Do you mean to say , any operation/application that want progress tracking feature will dynamically load the progress checker module which will set the hooks for progress reporting?
If yes , unless I am missing something such dynamic loading cannot happen if we use pg_stat_activity as it gets values from shared memory. Module has to be a shared_preload_library
to allocate a shared memory. So this will mean the module can be loaded only at server restart. Am I missing something?
Thank you,
Rahila Syed
______________________________________________________________________
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
"Syed, Rahila" <Rahila.Syed@nttdata.com> writes:
Hello,
Unless I am missing something, I guess you can still keep the actual code that updates counters outside the core if you adopt an approach that Simon suggests.
Yes. The code to extract progress information from VACUUM and storing in shared memory can be outside core even with pg_stat_activity as a user interface.
Whatever the view (existing/new), any related counters would have a valid (non-NULL) value when read off the view iff hooks are set perhaps because you have an extension that sets them.
I guess that means any operation that "supports" progress tracking would have an extension with suitable hooks implemented.Do you mean to say , any operation/application that want progress tracking feature will dynamically load the progress checker module which will set the hooks for progress reporting?
If yes , unless I am missing something such dynamic loading cannot happen if we use pg_stat_activity as it gets values from shared memory. Module has to be a shared_preload_library
to allocate a shared memory. So this will mean the module can be loadedonly at server restart. Am I missing something?
TBH, I think that designing this as a hook-based solution is adding a
whole lot of complexity for no value. The hard parts of the problem are
collecting the raw data and making the results visible to users, and
both of those require involvement of the core code. Where is the benefit
from pushing some trivial intermediate arithmetic into an external module?
If there's any at all, it's certainly not enough to justify problems such
as you mention here.
So I'd just create a "pgstat_report_percent_done()" type of interface in
pgstat.c and then teach VACUUM to call it directly.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-07-02 PM 11:00, Syed, Rahila wrote:
Hello,
Unless I am missing something, I guess you can still keep the actual code that updates counters outside the core if you adopt an approach that Simon suggests.
Yes. The code to extract progress information from VACUUM and storing in shared memory can be outside core even with pg_stat_activity as a user interface.
Whatever the view (existing/new), any related counters would have a valid (non-NULL) value when read off the view iff hooks are set perhaps because you have an extension that sets them.
I guess that means any operation that "supports" progress tracking would have an extension with suitable hooks implemented.Do you mean to say , any operation/application that want progress tracking feature will dynamically load the progress checker module which will set the hooks for progress reporting?
If yes , unless I am missing something such dynamic loading cannot happen if we use pg_stat_activity as it gets values from shared memory. Module has to be a shared_preload_library
to allocate a shared memory. So this will mean the module can be loaded only at server restart. Am I missing something?
Assuming that set of hooks per command and shared memory structure(s) is a way
to go, I meant to say that hook implementations per command would be in their
separate modules, of course loaded at the server start for shared memory). Of
those, your proposed patch has vacuum_progress, for example. And in context of
my comment above, that means the view would say NULL for commands for which
the module has not been set up in advance. IOW, between showing NULL in the
view and dynamically loading hook functions, we choose the former because I
don't know what the latter means in postgres.
Having said that, Tom's suggestion to export pgstat.c function(s) for
command(s) may be a more appealing way to go.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello,
TBH, I think that designing this as a hook-based solution is adding a whole lot of complexity for no value. The hard parts of the problem are collecting the raw data and making the results visible to users, and both of those require involvement of the core code. Where is the benefit from pushing some trivial >intermediate arithmetic into an external module?
If there's any at all, it's certainly not enough to justify problems such as you mention here.
So I'd just create a "pgstat_report_percent_done()" type of interface in pgstat.c and then teach VACUUM to call it directly.
Thank you for suggestion. I agree that adding code in core will reduce code complexity with no additional overhead.
Going by the consensus, I will update the patch with code to collect and store progress information from vacuum in pgstat.c and
UI using pg_stat_activity view.
Thank you,
Rahila Syed
______________________________________________________________________
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello,
Please find attached updated patch with an interface to calculate command
progress in pgstat.c. This interface currently implements VACUUM progress
tracking .
A column named percent_complete has been added in pg_stat_activity to
report progress.
VACUUM calls the progress calculation interface periodically at an interval
specified by pgstat_track_progress GUC in ms.
Progress calculation can be disabled by setting pgstat_track_progress as
-1.
Remaining_time for VACUUM is not included in current patch to avoid
cluttering pg_stat_activity with too many columns.
But the estimate as seen from previous implementation seems reasonable
enough to be included in progress information , may be as an exclusive view
for vacuum progress information.
GUC parameter 'pgstat_track_progress' is currently PGC_SUSET in line with
'track_activities' GUC parameter. Although IMO, pgstat_track_progress can
be made PGC_USERSET in order to provide more flexibility to any user to
enable/disable progress calculation provided progress is tracked only if
track_activities GUC parameter is enabled.
In this patch, index scans are not taken into account for progress
calculation as of now .
Thank you,
Rahila Syed.
Attachments:
Vacuum_progress_checker_v1.patchapplication/octet-stream; name=Vacuum_progress_checker_v1.patchDownload+79-8
On 2015-07-16 AM 05:18, Rahila Syed wrote:
GUC parameter 'pgstat_track_progress' is currently PGC_SUSET in line with
'track_activities' GUC parameter.
Naming the GUC pgstat* seems a little inconsistent. It could be called,
say, track_maintenance_progress_interval/track_vacuum_progress_interval.
That way, it will look similar to existing track_* parameters:
#track_activities = on
#track_counts = on
#track_io_timing = off
#track_functions = none # none, pl, all
#track_activity_query_size = 1024 # (change requires restart)
Also, adding the new GUC to src/backend/utils/misc/postgresql.conf.sample
might be helpful.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers