ANALYZE command progress checker

Started by vinayakover 9 years ago47 messageshackers

Pokale_Vinayak_q3@lab.ntt.co.jp

over 9 years ago

Hello Hackers,

Following is a proposal for reporting the progress of ANALYZE command:

It seems that the following could be the phases of ANALYZE processing:
1. Collecting sample rows
2. Collecting inherited sample rows
3. Computing heap stats
4. Computing index stats
5. Cleaning up indexes

The first phase is easy if there is no inheritance but in case of
inheritance we need to sample the blocks from multiple heaps.
Here the progress is counted against total number of blocks processed.

I feel this view information may be useful in checking the progress of
long running ANALYZE command.

The attached patch reports the different phases of analyze command.
Added this patch to CF 2017-03.

Opinions?

Note: Collecting inherited sample rows phase is not reported yet in the
patch.

Regards,
Vinayak Pokale
NTT Open Source Software Center

Peter Eisentraut

peter_e@gmx.net

about 9 years ago

In reply to: vinayak (#1)

Re: ANALYZE command progress checker

On 2/28/17 04:24, vinayak wrote:

The view provides the information of analyze command progress details as
follows
postgres=# \d pg_stat_progress_analyze
View "pg_catalog.pg_stat_progress_analyze"

Hmm, do we want a separate "progress" system view for every kind of
command? What if someone comes up with a progress checker for CREATE
INDEX, REINDEX, CLUSTER, etc.?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Fetter

david@fetter.org

about 9 years ago

In reply to: Peter Eisentraut (#2)

Re: ANALYZE command progress checker

On Wed, Mar 01, 2017 at 09:45:40AM -0500, Peter Eisentraut wrote:

On 2/28/17 04:24, vinayak wrote:

The view provides the information of analyze command progress details as
follows
postgres=# \d pg_stat_progress_analyze
View "pg_catalog.pg_stat_progress_analyze"

Hmm, do we want a separate "progress" system view for every kind of
command? What if someone comes up with a progress checker for CREATE
INDEX, REINDEX, CLUSTER, etc.?

Some kind of design for progress seems like a good plan. Some ideas:

- System view(s)

This has the advantage of being shown to work at least to a PoC by
this patch, and is similar to extant system views like
pg_stat_activity in the sense of capturing a moment in time.

- NOTIFY

Event-driven model as opposed to a polling one. This is
attractive on efficiency grounds, less so on reliability ones.

- Something added to the wire protocol

More specialized, limits the information to the session where the
command was issued

- Other things not named here

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 9 years ago

In reply to: David Fetter (#3)

Re: ANALYZE command progress checker

On 2017-03-01 10:20:41 -0800, David Fetter wrote:

On Wed, Mar 01, 2017 at 09:45:40AM -0500, Peter Eisentraut wrote:

On 2/28/17 04:24, vinayak wrote:

The view provides the information of analyze command progress details as
follows
postgres=# \d pg_stat_progress_analyze
View "pg_catalog.pg_stat_progress_analyze"

Hmm, do we want a separate "progress" system view for every kind of
command? What if someone comes up with a progress checker for CREATE
INDEX, REINDEX, CLUSTER, etc.?

I don't think that'd be that bad, otherwise the naming of the fields is
complicated. I guess the alternative (or do both?) would be to to have
a pivoted table, but that'd painful to query. Do you have a better idea?

Some kind of design for progress seems like a good plan. Some ideas:

- System view(s)

This has the advantage of being shown to work at least to a PoC by
this patch, and is similar to extant system views like
pg_stat_activity in the sense of capturing a moment in time.

- NOTIFY

Event-driven model as opposed to a polling one. This is
attractive on efficiency grounds, less so on reliability ones.

- Something added to the wire protocol

More specialized, limits the information to the session where the
command was issued

- Other things not named here

We now have a framework for this [1]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c16dc1aca5e (currently used by vacuum, but
extensible). The question is about presentation. I'm fairly sure that
we shouldn't just add yet another framework, and I doubt that that's
what's proposed by Peter.

[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c16dc1aca5e

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 9 years ago

In reply to: Andres Freund (#4)

Re: ANALYZE command progress checker

On 2017-03-01 10:25:49 -0800, Andres Freund wrote:

We now have a framework for this [1] (currently used by vacuum, but
extensible). The question is about presentation. I'm fairly sure that
we shouldn't just add yet another framework, and I doubt that that's
what's proposed by Peter.

[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c16dc1aca5e

Majority of that is actually in
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6fb6471f6afaf649e52f38269fd8c5c60647669

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Fetter

david@fetter.org

about 9 years ago

In reply to: Andres Freund (#5)

Re: ANALYZE command progress checker

On Wed, Mar 01, 2017 at 10:28:23AM -0800, Andres Freund wrote:

On 2017-03-01 10:25:49 -0800, Andres Freund wrote:

We now have a framework for this [1] (currently used by vacuum, but
extensible). The question is about presentation. I'm fairly sure that
we shouldn't just add yet another framework, and I doubt that that's
what's proposed by Peter.

[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c16dc1aca5e

Majority of that is actually in
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6fb6471f6afaf649e52f38269fd8c5c60647669

If that's even vaguely usable, I'd say we should use it for this.

I notice that that commit has no SGML component. Should it have one?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 9 years ago

In reply to: David Fetter (#6)

Re: ANALYZE command progress checker

On March 1, 2017 11:34:48 AM PST, David Fetter <david@fetter.org> wrote:

I notice that that commit has no SGML component. Should it have one?

Don't think so.

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Steele

david@pgmasters.net

about 9 years ago

In reply to: Andres Freund (#4)

Re: ANALYZE command progress checker

On 3/1/17 1:25 PM, Andres Freund wrote:

On 2017-03-01 10:20:41 -0800, David Fetter wrote:

On Wed, Mar 01, 2017 at 09:45:40AM -0500, Peter Eisentraut wrote:

On 2/28/17 04:24, vinayak wrote:

The view provides the information of analyze command progress details as
follows
postgres=# \d pg_stat_progress_analyze
View "pg_catalog.pg_stat_progress_analyze"

Hmm, do we want a separate "progress" system view for every kind of
command? What if someone comes up with a progress checker for CREATE
INDEX, REINDEX, CLUSTER, etc.?

I don't think that'd be that bad, otherwise the naming of the fields is
complicated. I guess the alternative (or do both?) would be to to have
a pivoted table, but that'd painful to query. Do you have a better idea?

Some kind of design for progress seems like a good plan. Some ideas:

- System view(s)

This has the advantage of being shown to work at least to a PoC by
this patch, and is similar to extant system views like
pg_stat_activity in the sense of capturing a moment in time.

- NOTIFY

Event-driven model as opposed to a polling one. This is
attractive on efficiency grounds, less so on reliability ones.

- Something added to the wire protocol

More specialized, limits the information to the session where the
command was issued

- Other things not named here

We now have a framework for this [1] (currently used by vacuum, but
extensible). The question is about presentation. I'm fairly sure that
we shouldn't just add yet another framework, and I doubt that that's
what's proposed by Peter.

I think the idea of a general progress view is very valuable and there
are a ton of operations it could be used for: full table scans, index
rebuilds, vacuum, copy, etc.

However, I feel that this proposal is not flexible enough and comes too
late in the release cycle to allow development into something that could
be committed.

I propose we move this to the 2017-07 CF so the idea can be more fully
developed.

--
-David
david@pgmasters.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Michael Paquier

michael@paquier.xyz

about 9 years ago

In reply to: David Steele (#8)

Re: ANALYZE command progress checker

On Sat, Mar 4, 2017 at 5:33 AM, David Steele <david@pgmasters.net> wrote:

I think the idea of a general progress view is very valuable and there
are a ton of operations it could be used for: full table scans, index
rebuilds, vacuum, copy, etc.

However, I feel that this proposal is not flexible enough and comes too
late in the release cycle to allow development into something that could
be committed.

Well, each command really has its own requirements in terms of data to
store, so we either finish with a bunch of small tables that anyone
could query and join as they wish or a somewhat unique table that is
bloated with all the information, with a set of views on top of it to
query all the information. For extensibility's sake of each command
(for example imagine that REINDEX could be extended with a
CONCURRENTLY option and multiple phases), I would think that having a
table per command type would not be that bad.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Andres Freund

andres@anarazel.de

about 9 years ago

In reply to: David Steele (#8)

Re: ANALYZE command progress checker

On 2017-03-03 15:33:15 -0500, David Steele wrote:

On 3/1/17 1:25 PM, Andres Freund wrote:

On 2017-03-01 10:20:41 -0800, David Fetter wrote:

On Wed, Mar 01, 2017 at 09:45:40AM -0500, Peter Eisentraut wrote:

On 2/28/17 04:24, vinayak wrote:

The view provides the information of analyze command progress details as
follows
postgres=# \d pg_stat_progress_analyze
View "pg_catalog.pg_stat_progress_analyze"

Hmm, do we want a separate "progress" system view for every kind of
command? What if someone comes up with a progress checker for CREATE
INDEX, REINDEX, CLUSTER, etc.?

I don't think that'd be that bad, otherwise the naming of the fields is
complicated. I guess the alternative (or do both?) would be to to have
a pivoted table, but that'd painful to query. Do you have a better idea?

Some kind of design for progress seems like a good plan. Some ideas:

- System view(s)

This has the advantage of being shown to work at least to a PoC by
this patch, and is similar to extant system views like
pg_stat_activity in the sense of capturing a moment in time.

- NOTIFY

Event-driven model as opposed to a polling one. This is
attractive on efficiency grounds, less so on reliability ones.

- Something added to the wire protocol

More specialized, limits the information to the session where the
command was issued

- Other things not named here

We now have a framework for this [1] (currently used by vacuum, but
extensible). The question is about presentation. I'm fairly sure that
we shouldn't just add yet another framework, and I doubt that that's
what's proposed by Peter.

I think the idea of a general progress view is very valuable and there
are a ton of operations it could be used for: full table scans, index
rebuilds, vacuum, copy, etc.
However, I feel that this proposal is not flexible enough and comes too
late in the release cycle to allow development into something that could
be committed.

I'm not following. As I pointed out, we already have this framework?
This patch is just a short one using that framework?

I propose we move this to the 2017-07 CF so the idea can be more fully
developed.

I don't see that being warranted in this case, we're really not talking
about something complicated:
doc/src/sgml/monitoring.sgml | 135 +++++++++++++++++++++++++++++++++++
src/backend/catalog/system_views.sql | 16 ++++
src/backend/commands/analyze.c | 34 ++++++++
src/backend/utils/adt/pgstatfuncs.c | 2
src/include/commands/progress.h | 13 +++
src/include/pgstat.h | 3
src/test/regress/expected/rules.out | 18 ++++
7 files changed, 219 insertions(+), 2 deletions(-)
excepting tests and docs, this is very little actual code.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

about 9 years ago

In reply to: vinayak (#1)

Re: ANALYZE command progress checker

Hi Vinayak,

On 2017/02/28 18:24, vinayak wrote:

The attached patch reports the different phases of analyze command.
Added this patch to CF 2017-03.

In the updated monitoring.sgml:

+    <row>
+     <entry><literal>computing heap stats</literal></entry>
+     <entry>
+       <command>VACUUM</> is currently computing heap stats.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>computing index stats</literal></entry>
+     <entry>
+       <command>VACUUM</> is currently computing index stats.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>cleaning up indexes</literal></entry>
+     <entry>
+       <command>ANALYZE</> is currently cleaning up indexes.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>

The entries mentioning VACUUM should actually say ANALYZE.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

vinayak

Pokale_Vinayak_q3@lab.ntt.co.jp

about 9 years ago

In reply to: Amit Langote (#11)

Re: ANALYZE command progress checker

On 2017/03/06 17:02, Amit Langote wrote:

Hi Vinayak,

On 2017/02/28 18:24, vinayak wrote:

The attached patch reports the different phases of analyze command.
Added this patch to CF 2017-03.

In the updated monitoring.sgml:

+    <row>
+     <entry><literal>computing heap stats</literal></entry>
+     <entry>
+       <command>VACUUM</> is currently computing heap stats.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>computing index stats</literal></entry>
+     <entry>
+       <command>VACUUM</> is currently computing index stats.
+     </entry>
+    </row>
+    <row>
+     <entry><literal>cleaning up indexes</literal></entry>
+     <entry>
+       <command>ANALYZE</> is currently cleaning up indexes.
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>

The entries mentioning VACUUM should actually say ANALYZE.

Yes. Thank you.
I will fix it.

Regards,
Vinayak Pokale
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

David Steele

david@pgmasters.net

about 9 years ago

In reply to: Andres Freund (#10)

Re: ANALYZE command progress checker

On 3/6/17 1:58 AM, Andres Freund wrote:

On 2017-03-03 15:33:15 -0500, David Steele wrote:

I propose we move this to the 2017-07 CF so the idea can be more fully
developed.

I don't see that being warranted in this case, we're really not talking
about something complicated:

<...>

excepting tests and docs, this is very little actual code.

Fair enough. From my read through it appeared a redesign was
required/requested.

--
-David
david@pgmasters.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Michael Paquier

michael@paquier.xyz

about 9 years ago

In reply to: Andres Freund (#10)

Re: ANALYZE command progress checker

On Mon, Mar 6, 2017 at 3:58 PM, Andres Freund <andres@anarazel.de> wrote:

On 2017-03-03 15:33:15 -0500, David Steele wrote:

On 3/1/17 1:25 PM, Andres Freund wrote:

On 2017-03-01 10:20:41 -0800, David Fetter wrote:

On Wed, Mar 01, 2017 at 09:45:40AM -0500, Peter Eisentraut wrote:

On 2/28/17 04:24, vinayak wrote:

The view provides the information of analyze command progress details as
follows
postgres=# \d pg_stat_progress_analyze
View "pg_catalog.pg_stat_progress_analyze"

Hmm, do we want a separate "progress" system view for every kind of
command? What if someone comes up with a progress checker for CREATE
INDEX, REINDEX, CLUSTER, etc.?

I don't think that'd be that bad, otherwise the naming of the fields is
complicated. I guess the alternative (or do both?) would be to to have
a pivoted table, but that'd painful to query. Do you have a better idea?

Some kind of design for progress seems like a good plan. Some ideas:

- System view(s)

This has the advantage of being shown to work at least to a PoC by
this patch, and is similar to extant system views like
pg_stat_activity in the sense of capturing a moment in time.

- NOTIFY

Event-driven model as opposed to a polling one. This is
attractive on efficiency grounds, less so on reliability ones.

- Something added to the wire protocol

More specialized, limits the information to the session where the
command was issued

- Other things not named here

We now have a framework for this [1] (currently used by vacuum, but
extensible). The question is about presentation. I'm fairly sure that
we shouldn't just add yet another framework, and I doubt that that's
what's proposed by Peter.

I think the idea of a general progress view is very valuable and there
are a ton of operations it could be used for: full table scans, index
rebuilds, vacuum, copy, etc.
However, I feel that this proposal is not flexible enough and comes too
late in the release cycle to allow development into something that could
be committed.

I'm not following. As I pointed out, we already have this framework?
This patch is just a short one using that framework?

I propose we move this to the 2017-07 CF so the idea can be more fully
developed.

I don't see that being warranted in this case, we're really not talking
about something complicated:
doc/src/sgml/monitoring.sgml | 135 +++++++++++++++++++++++++++++++++++
src/backend/catalog/system_views.sql | 16 ++++
src/backend/commands/analyze.c | 34 ++++++++
src/backend/utils/adt/pgstatfuncs.c | 2
src/include/commands/progress.h | 13 +++
src/include/pgstat.h | 3
src/test/regress/expected/rules.out | 18 ++++
7 files changed, 219 insertions(+), 2 deletions(-)
excepting tests and docs, this is very little actual code.

Or 35 lines just for the backend portion, it is hard to something smaller.

@@ -496,7 +499,6 @@ do_analyze_rel(Relation onerel, int options,
VacuumParams *params,
numrows = (*acquirefunc) (onerel, elevel,
rows, targrows,
&totalrows, &totaldeadrows);
-
/*
Useless diff.

+     <entry>
+       <command>ANALYZE</> is currently collecting the sample rows.
+       The sample it reads is taken randomly.Its size depends on
+       the default_statistics_target parameter value.
+     </entry>
This should use a <varname> markup for default_statistics_target.

@@ -203,6 +204,8 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
    if (onerel->rd_rel->relkind == RELKIND_RELATION ||
        onerel->rd_rel->relkind == RELKIND_MATVIEW)
    {
+       pgstat_progress_start_command(PROGRESS_COMMAND_ANALYZE,
+                                               RelationGetRelid(onerel));
It seems to me that the report should begin in do_analyze_rel().
-- 
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Haribabu Kommi

kommi.haribabu@gmail.com

about 9 years ago

In reply to: Michael Paquier (#14)

Re: ANALYZE command progress checker

On Tue, Mar 7, 2017 at 5:01 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

@@ -496,7 +499,6 @@ do_analyze_rel(Relation onerel, int options,
VacuumParams *params,
numrows = (*acquirefunc) (onerel, elevel,
rows, targrows,
&totalrows, &totaldeadrows);
-
/*
Useless diff.

+     <entry>
+       <command>ANALYZE</> is currently collecting the sample rows.
+       The sample it reads is taken randomly.Its size depends on
+       the default_statistics_target parameter value.
+     </entry>
This should use a <varname> markup for default_statistics_target.

@@ -203,6 +204,8 @@ analyze_rel(Oid relid, RangeVar *relation, int options,
if (onerel->rd_rel->relkind == RELKIND_RELATION ||
onerel->rd_rel->relkind == RELKIND_MATVIEW)
{
+       pgstat_progress_start_command(PROGRESS_COMMAND_ANALYZE,
+                                               RelationGetRelid(onerel));
It seems to me that the report should begin in do_analyze_rel().

some more comments,

+ /* Report compute heap stats phase */
+ pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
+ PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS);

The above analyze phase is updated inside a for loop, instead just set it
above once.

+ /* Report compute index stats phase */
+ pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
+ PROGRESS_ANALYZE_PHASE_COMPUTE_INDEX_STATS);

Irrespective of whether the index exists on the table or not, the above
analyze phase
is set. It is better to set the above phase and index cleanup phase only
when there
are indexes on the table.

+ /* Report total number of heap blocks and collectinf sample row phase*/

Typo? collecting?

+ /* Report total number of heap blocks and collectinf sample row phase*/
+ initprog_val[0] = PROGRESS_ANALYZE_PHASE_COLLECT_HEAP_SAMPLE_ROWS;
+ initprog_val[1] = totalblocks;
+ pgstat_progress_update_multi_param(2, initprog_index, initprog_val);

acquire_sample_rows function is called from acquire_inherited_sample_rows
function, so adding the phase in that function will provide wrong info.

+#define PROGRESS_ANALYZE_PHASE_COLLECT_INH_SAMPLE_ROWS 2

why there is no code added for the phase, any specific reason?

+/* Phases of analyze */

Can be written as following for better understanding, and also
similar like vacuum.

/* Phases of analyze (as advertised via PROGRESS_ANALYZE_PHASE) */

Regards,
Hari Babu
Fujitsu Australia

#16

Robert Haas

robertmhaas@gmail.com

about 9 years ago

In reply to: Andres Freund (#4)

Re: ANALYZE command progress checker

On Wed, Mar 1, 2017 at 1:25 PM, Andres Freund <andres@anarazel.de> wrote:

On 2017-03-01 10:20:41 -0800, David Fetter wrote:

On Wed, Mar 01, 2017 at 09:45:40AM -0500, Peter Eisentraut wrote:

On 2/28/17 04:24, vinayak wrote:

The view provides the information of analyze command progress details as
follows
postgres=# \d pg_stat_progress_analyze
View "pg_catalog.pg_stat_progress_analyze"

Hmm, do we want a separate "progress" system view for every kind of
command? What if someone comes up with a progress checker for CREATE
INDEX, REINDEX, CLUSTER, etc.?

I don't think that'd be that bad, otherwise the naming of the fields is
complicated.

+1.

I suppose if it gets really out of control we might have to rethink,
but 2 is not 50, and having appropriate column names is worth a lot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

vinayak

Pokale_Vinayak_q3@lab.ntt.co.jp

about 9 years ago

In reply to: Haribabu Kommi (#15)

Re: ANALYZE command progress checker

Thank you for reviewing the patch.

The attached patch incorporated Michael and Amit comments also.

On 2017/03/07 15:45, Haribabu Kommi wrote:

On Tue, Mar 7, 2017 at 5:01 PM, Michael Paquier
<michael.paquier@gmail.com <mailto:michael.paquier@gmail.com>> wrote:

@@ -496,7 +499,6 @@ do_analyze_rel(Relation onerel, int options,
VacuumParams *params,
numrows = (*acquirefunc) (onerel, elevel,
rows, targrows,
&totalrows, &totaldeadrows);
-
/*
Useless diff.

Fixed.

+     <entry>
+       <command>ANALYZE</> is currently collecting the sample rows.
+       The sample it reads is taken randomly.Its size depends on
+       the default_statistics_target parameter value.
+     </entry>
This should use a <varname> markup for default_statistics_target.

Fixed.

@@ -203,6 +204,8 @@ analyze_rel(Oid relid, RangeVar *relation, int
options,
if (onerel->rd_rel->relkind == RELKIND_RELATION ||
onerel->rd_rel->relkind == RELKIND_MATVIEW)
{
+       pgstat_progress_start_command(PROGRESS_COMMAND_ANALYZE,
+  RelationGetRelid(onerel));
It seems to me that the report should begin in do_analyze_rel().

Fixed.

some more comments,
+/* Report compute heap stats phase */
+pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
+PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS);
The above analyze phase is updated inside a for loop, instead just set
it above once.

Fixed.

+ /* Report compute index stats phase */
+ pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE,
+ PROGRESS_ANALYZE_PHASE_COMPUTE_INDEX_STATS);
Irrespective of whether the index exists on the table or not, the
above analyze phase
is set. It is better to set the above phase and index cleanup phase
only when there
are indexes on the table.

Agreed. Fixed.

+/* Report total number of heap blocks and collectinf sample row phase*/

Typo? collecting?

Fixed.

+/* Report total number of heap blocks and collectinf sample row phase*/
+initprog_val[0] = PROGRESS_ANALYZE_PHASE_COLLECT_HEAP_SAMPLE_ROWS;
+initprog_val[1] = totalblocks;
+pgstat_progress_update_multi_param(2, initprog_index, initprog_val);
acquire_sample_rows function is called from acquire_inherited_sample_rows
function, so adding the phase in that function will provide wrong info.

I agree with you.

+#define PROGRESS_ANALYZE_PHASE_COLLECT_INH_SAMPLE_ROWS2

why there is no code added for the phase, any specific reason?

I am thinking how to report this phase. Do you have any suggestion?

+/* Phases of analyze */

Can be written as following for better understanding, and also
similar like vacuum.

/* Phases of analyze (as advertised via PROGRESS_ANALYZE_PHASE) */

Done.

Regards,
Vinayak Pokale
NTT Open Source Software Center

#18

Jim Nasby

Jim.Nasby@BlueTreble.com

about 9 years ago

In reply to: Michael Paquier (#9)

Re: ANALYZE command progress checker

On 3/6/17 12:49 AM, Michael Paquier wrote:

On Sat, Mar 4, 2017 at 5:33 AM, David Steele <david@pgmasters.net> wrote:

I think the idea of a general progress view is very valuable and there
are a ton of operations it could be used for: full table scans, index
rebuilds, vacuum, copy, etc.

However, I feel that this proposal is not flexible enough and comes too
late in the release cycle to allow development into something that could
be committed.

Well, each command really has its own requirements in terms of data to
store, so we either finish with a bunch of small tables that anyone
could query and join as they wish or a somewhat unique table that is
bloated with all the information, with a set of views on top of it to
query all the information. For extensibility's sake of each command
(for example imagine that REINDEX could be extended with a
CONCURRENTLY option and multiple phases), I would think that having a
table per command type would not be that bad.

Well, the ideal scenario is that someone uses the raw data to come up
with a good way to just provide ye olde 0-100% progress bar. At that
point a single view would do the trick.

Perhaps instead of adding more clutter to \dvS we could just have a SRF
for now. At over 2800 rows currently, you're not going to notice one
more addition to \dfS.
--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Andres Freund

andres@anarazel.de

about 9 years ago

In reply to: Jim Nasby (#18)

Re: ANALYZE command progress checker

Hi,

On 2017-03-10 02:11:18 -0600, Jim Nasby wrote:

Perhaps instead of adding more clutter to \dvS we could just have a SRF for
now.

I don't see that as clutter, it's useful information, and keeping it
discoverable is good, not bad.

At over 2800 rows currently, you're not going to notice one more
addition to \dfS.

I think it's hard to design a good SRF for this. Because the fields for
different types of progress are different / empty, you can't just
trivially return them as rows. You'd have to do some EAV like
'command, field_name1, field_value1, ...' type of thing - not
particularly pretty / easy to use.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Jim Nasby

Jim.Nasby@BlueTreble.com

about 9 years ago

In reply to: Andres Freund (#19)

Re: ANALYZE command progress checker

On 3/10/17 1:06 PM, Andres Freund wrote:

Hi,

On 2017-03-10 02:11:18 -0600, Jim Nasby wrote:

Perhaps instead of adding more clutter to \dvS we could just have a SRF for
now.

I don't see that as clutter, it's useful information, and keeping it
discoverable is good, not bad.

If we keep adding status reporting commands at some point it's going to
get unwieldy. Though, if they were in their own schema...

At over 2800 rows currently, you're not going to notice one more
addition to \dfS.

I think it's hard to design a good SRF for this. Because the fields for
different types of progress are different / empty, you can't just
trivially return them as rows. You'd have to do some EAV like
'command, field_name1, field_value1, ...' type of thing - not
particularly pretty / easy to use.

Oh, I wasn't suggesting a single SRF for everything. Hopefully users
will eventually figure out a good formula to drive a "progress bar" for
each type of monitor, which is what you really want anyway (at least 99%
of the time). If we got there we could have a single view that gave the
% complete for every command that was providing feedback. If someone
wanted details they could hit the individual SRF.
--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

Robert Haas

robertmhaas@gmail.com

about 9 years ago

In reply to: Jim Nasby (#20)

#22

Haribabu Kommi

kommi.haribabu@gmail.com

about 9 years ago

In reply to: vinayak (#17)

#23

Robert Haas