Query progress indication - an implementation

Started by Scara Maccaiover 16 years ago26 messages
#1Scara Maccai
m_lists@yahoo.it
1 attachment(s)

Hi all,

following the link in

http://wiki.postgresql.org/wiki/Query_progress_indication

but mostly:

http://www.postech.ac.kr/~swhwang/progress2.pdf [1]

I'm trying to write an implementation of the "dne" method in postgresql.

I added another column to the pg_stat_get_activity function to report the percentage of work done for the query (of course, any other method could be used... the way the percentage is reported to the user can be easily changed).

I attached a first patch (just to see if anyone is interested, the work is by no means finished).

I guess I did a lot of mistakes, since I don't know anything about postgresql code...

1) the progress indicator can be eliminated at runtime; this could be done with another runtime flag (at the moment is always on)

2) I added a new structure (Progress) to PlanState to keep all the info about execution progress

3) I needed a pointer to the root of the PlanStates, to be able to calculate the total progress of the query tree (I bet this pointer was already available somewhere, but I couldn't find where...)

4) sub-plans are not included yet (well, just to be honest, I don't really know what postgresql means with those... :) )

5) the percentage is updated at most every second (can be easily changed)

6) the methods to adjust upper/lower bounds in [1] are not implemented yet (but that shouldn't be a problem)

7) the "spilled tuples" handling in [1] is not supported yet

8) only hash join, loop join, aggregate, sequence scans are implemented at the moment

9) I added another flag (EXEC_FLAG_DRIVER_BRANCH) in executor.h to signal to the sub-nodes if they are part of a branch that will contain a driver node (for example, inner subtree of a Nested Loops join is not a driver branch). I guess this could be done better at Plan level (instead of PlanState), but this way less code has to be changed

10) at the moment all driver nodes have the same "work_per_tuple=1", but this could be changed (for example, CPU-intensive driver nodes could have a smaller work_per_tuple value)

Well, some (very early) tests on a tpcd db showed it works as expected (well, I only did very few tests...)

Hope someone is interested

Attachments:

patch.txttext/plain; name=patch.txtDownload
Index: src/include/pgstat.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/pgstat.h,v
retrieving revision 1.83
diff -r1.83 pgstat.h
568a569,571
> 
>     /* current percentage of progress */
>     float       st_progress_perc;
646a650,651
> extern void pgstat_report_progress_percentage(double perc);
> 
Index: src/backend/executor/Makefile
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/Makefile,v
retrieving revision 1.29
diff -r1.29 Makefile
25c25
<        nodeWindowAgg.o tstoreReceiver.o spi.o
---
>        nodeWindowAgg.o tstoreReceiver.o spi.o progress.o
Index: src/backend/executor/execProcnode.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/execProcnode.c,v
retrieving revision 1.65
diff -r1.65 execProcnode.c
109a110,111
> #include "executor/progress.h"
> #include "pgstat.h"
111a114
> void ProgressUpdate(PlanState* node, double* tot_operations_expected, double* tot_operations_so_far);
132a136
> 	bool	is_driver_node_candidate = false;
175a180,181
> 			is_driver_node_candidate = true;
> 
261a268
> 			is_driver_node_candidate = true;
314a322,325
> 	/* Set up progress info for this node if requested */
> 	if (result->state->es_progress)
> 		ProgressSetInfo(result, node, eflags, is_driver_node_candidate);
> 
328a340,343
> 	struct timeval t;
> 	double	tot_operations_expected = 0;
> 	double	tot_operations_so_far = 0;
> 
462a478,493
> 	// progress calcs (only if required)
> 	if (node->state->es_progress && node->progress != NULL && node->progress->is_driver_node)
> 	{
> 		node->progress->operations_so_far++;
> 		gettimeofday(&t, NULL);
> 		if (t.tv_sec > node->state->es_progress_last_update.tv_sec)
> 		{
> 			ProgressUpdate(node->state->es_root_planstate, &tot_operations_expected, &tot_operations_so_far);
> 			if (tot_operations_expected != 0)
> 			{
> 				pgstat_report_progress_percentage(tot_operations_so_far*100/tot_operations_expected);
> 				node->state->es_progress_last_update = t;
> 			}
> 		}
> 	}
> 
466a498,524
> void ProgressUpdate(PlanState* node, double* tot_operations_expected, double* tot_operations_so_far)
> {
> 	// TODO here a  switch (nodeTag(node)) is needed in case we want upper/lower limit update
> 	if (node->progress->is_driver_node)
> 	{
> 		*tot_operations_expected += node->progress->lower_bound;
> 		*tot_operations_so_far += node->progress->operations_so_far;
> 	}
> 
> 	/*else ??? */if (node->progress->is_driver_branch)
> 	{
> 		if (outerPlanState(node) != NULL)
> 		{
> 			ProgressUpdate(outerPlanState(node), tot_operations_expected, tot_operations_so_far);
> 		}
> 		if (innerPlanState(node) != NULL)
> 		{
> 			ProgressUpdate(innerPlanState(node), tot_operations_expected, tot_operations_so_far);
> 		}
> 	}
> 
> 
> 
> }
> 
> 
> 
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/execMain.c,v
retrieving revision 1.326
diff -r1.326 execMain.c
829a830,832
> 	// TODO use configuration parameter
> 	estate->es_progress = true;
> 
835c838,841
< 	planstate = ExecInitNode(plan, estate, eflags);
---
> 	planstate = ExecInitNode(plan, estate, eflags | EXEC_FLAG_DRIVER_BRANCH);
> 
> 	estate->es_root_planstate = planstate;
> 
2737c2743
< 	epq->planstate = ExecInitNode(estate->es_plannedstmt->planTree, epqstate, 0);
---
> 	epq->planstate = ExecInitNode(estate->es_plannedstmt->planTree, epqstate, 0 | EXEC_FLAG_DRIVER_BRANCH);
Index: src/backend/executor/nodeHashjoin.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeHashjoin.c,v
retrieving revision 1.101
diff -r1.101 nodeHashjoin.c
400a401
> 	//innerPlanState(hjstate) = ExecInitNode((Plan *) hashNode, estate, eflags & ~EXEC_FLAG_DRIVER_BRANCH);
Index: src/backend/executor/nodeNestloop.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeNestloop.c,v
retrieving revision 1.53
diff -r1.53 nodeNestloop.c
317a318,319
> 
> 	// for progress estimation: the inner branch is not a driver branch
319c321
< 										   eflags | EXEC_FLAG_REWIND);
---
> 										   (eflags | EXEC_FLAG_REWIND) & ~(EXEC_FLAG_DRIVER_BRANCH));
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.189
diff -r1.189 pgstat.c
2215a2216
> 	beentry->st_progress_perc = 0;
2341a2343,2363
> /*
>  * Report current progress percentage
>  */
> void
> pgstat_report_progress_percentage(double perc)
> {
> 	volatile PgBackendStatus *beentry = MyBEEntry;
> 
> 	if (!pgstat_track_activities || !beentry)
> 		return;
> 
> 	/*
> 	 * Update my status entry, following the protocol of bumping
> 	 * st_changecount before and after.  We use a volatile pointer here to
> 	 * ensure the compiler doesn't try to get cute.
> 	 */
> 	beentry->st_changecount++;
> 	beentry->st_progress_perc = perc;
> 	beentry->st_changecount++;
> 	Assert((beentry->st_changecount & 1) == 0);
> }
Index: src/include/executor/executor.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/executor/executor.h,v
retrieving revision 1.155
diff -r1.155 executor.h
51a52
> #define EXEC_FLAG_DRIVER_BRANCH	0x0010	/* for progress update: this is a driver branch  */
Index: src/backend/utils/adt/pgstatfuncs.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/pgstatfuncs.c,v
retrieving revision 1.54
diff -r1.54 pgstatfuncs.c
419c419
< 		tupdesc = CreateTemplateTupleDesc(10, false);
---
> 		tupdesc = CreateTemplateTupleDesc(11, false);
429a430
> 		TupleDescInitEntry(tupdesc, (AttrNumber) 11, "progress_perc", FLOAT4OID, -1, 0);
481,482c482,483
< 		Datum		values[10];
< 		bool		nulls[10];
---
> 		Datum		values[11];
> 		bool		nulls[11];
601a603,604
> 			nulls[10] = false;
>             values[10] = Float4GetDatum(beentry->st_progress_perc);
612a616
> 			nulls[10] = true;
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.544
diff -r1.544 pg_proc.h
2984c2984
< DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,16,1184,1184,1184,869,23}" "{i,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
---
> DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 f f f f t s 1 0 2249 "23" "{23,26,23,26,25,16,1184,1184,1184,869,23,700}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,procpid,usesysid,current_query,waiting,xact_start,query_start,backend_start,client_addr,client_port,progress_perc}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
Index: src/include/executor/progress.h
===================================================================
RCS file: src/include/executor/progress.h
diff -N src/include/executor/progress.h
0a1,43
> /*-------------------------------------------------------------------------
>  *
>  * progress.h
>  *	  definitions for run-time progress information
>  *
>  *
>  * Copyright (c) 2001-2009, PostgreSQL Global Development Group
>  *
>  *-------------------------------------------------------------------------
>  */
> #ifndef PROGRESS_H
> #define PROGRESS_H
> 
> #include "nodes/execnodes.h"
> 
> 
> typedef struct Progress
> {
> 	// information for progress estimation
> 	bool		is_driver_branch;		/* is this node in a branch that will contain a driver node? */
> 	bool		is_driver_node;			/* is this node a driver node? */
> 	double		operations_so_far;		/* rows emitted so far */
> 	double		upper_bound;		/* max number of expected rows */
> 	double		lower_bound;		/* min number of expected rows */
> 	float		work_per_tuple;		/* could be nice to divide CPU vs disk nodes */
> } Progress;
> 
> extern Progress *ProgressAlloc(void);
> 
> /* increment rows processed in case this is a driver node */
> #define ProgressNewRow(node) \
> 	do { \
> 		if (node->ps.state->es_progress && /*this shouldn't be needed...*/ node->ps.progress != 0) { \
> 			node->ps.progress->rows_so_far++; \
> 			ProgressUpdate((PlanState*)&(node->ps));	\
> 		} \
> 	} while(0);
> 
> 
> void ProgressSetInfo(PlanState * planstate, Plan* plan, int eflags, bool is_candidate_driver_node);
> 
> 
> #endif   /* PROGRESS_H */
Index: src/backend/executor/progress.c
===================================================================
RCS file: src/backend/executor/progress.c
diff -N src/backend/executor/progress.c
0a1,44
> /*-------------------------------------------------------------------------
>  *
>  * progress.c
>  *	 functions for progress info of plan execution
>  *
>  *
>  * Copyright (c) 2001-2009, PostgreSQL Global Development Group
>  *
>  * IDENTIFICATION
>  *
>  *-------------------------------------------------------------------------
>  */
> #include "postgres.h"
> 
> #include <unistd.h>
> 
> #include "executor/progress.h"
> #include "pgstat.h"
> #include "executor/executor.h"
> 
> 
> /* Allocate new progress structure */
> Progress *ProgressAlloc(void)
> {
> 	Progress *prog = palloc0(sizeof(Progress));
> 	prog->work_per_tuple = 1;
> 
> 	return prog;
> }
> 
> /* Set up progress info for this node if requested */
> void ProgressSetInfo(PlanState * planstate, Plan* plan, int eflags, bool is_candidate_driver_node)
> {
> 	planstate->progress = ProgressAlloc();
> 	if (eflags & EXEC_FLAG_DRIVER_BRANCH)
> 	{
> 		planstate->progress->is_driver_branch = true;
> 		if (is_candidate_driver_node)
> 		{
> 			planstate->progress->lower_bound = plan->plan_rows;
> 			planstate->progress->is_driver_node = true;
> 		}
> 	}
> }
#2Robert Haas
robertmhaas@gmail.com
In reply to: Scara Maccai (#1)
Re: Query progress indication - an implementation

On Fri, Jun 26, 2009 at 3:37 AM, Scara Maccai<m_lists@yahoo.it> wrote:

Hi all,

following the link in

http://wiki.postgresql.org/wiki/Query_progress_indication

but mostly:

http://www.postech.ac.kr/~swhwang/progress2.pdf [1]

I'm trying to write an implementation of the "dne" method in postgresql.

I added another column to the pg_stat_get_activity function to report the percentage of work done for the query (of course, any other method could be used... the way the percentage is reported to the user can be easily changed).

I attached a first patch (just to see if anyone is interested, the work is by no means finished).

I guess I did a lot of mistakes, since I don't know anything about postgresql code...

1) the progress indicator can be eliminated at runtime; this could be done with another runtime flag (at the moment is always on)

2) I added a new structure (Progress) to PlanState to keep all the info about execution progress

3) I needed a pointer to the root of the PlanStates, to be able to calculate the total progress of the query tree (I bet this pointer was already available somewhere, but I couldn't find where...)

4) sub-plans are not included yet (well, just to be honest, I don't really know what postgresql means with those... :) )

5) the percentage is updated at most every second (can be easily changed)

6) the methods to adjust upper/lower bounds in [1] are not implemented yet (but that shouldn't be a problem)

7) the "spilled tuples" handling in [1] is not supported yet

8) only hash join, loop join, aggregate, sequence scans are implemented at the moment

9) I added another flag (EXEC_FLAG_DRIVER_BRANCH) in executor.h to signal to the sub-nodes if they are part of a branch that will contain a driver node (for example, inner subtree of a Nested Loops join is not a driver branch). I guess this could be done better at Plan level (instead of PlanState), but this way less code has to be changed

10) at the moment all driver nodes have the same "work_per_tuple=1", but this could be changed (for example, CPU-intensive driver nodes could have a smaller work_per_tuple value)

Well, some (very early) tests on a tpcd db showed it works as expected (well, I only did very few tests...)

You might want to take a look at this:
http://wiki.postgresql.org/wiki/Submitting_a_Patch

The project style is not to use C++-style comments, and you should
eliminate all of the unnecessary diff hunks from your patch (like
files that have only comment or whitespace changes). Also, it is
requested that patches be submitted in context diff format and added
to the CommitFest wiki here:

http://wiki.postgresql.org/wiki/CommitFest_2009-First

As to the content of the patch, I think that what you are doing is
comparing the actual number of "operations" with the expected number
of operations. If that's correct, I'm not sure it's really all that
useful, because it will only give you accurate
percentage-of-completion information when the estimates are correct.
But when the estimates are correct, you probably have a pretty good
idea how long the query will take to run anyway. When the estimates
are off, you'll find that the actual number of operations is more than
the expected number of operations, but that won't really tell you how
far you have to go.

The only other use case I can think of for functionality of this type
is some kind of dashboard view on a system with very long-running
queries, where you want to see how far you have yet to go on each one
(maybe to approximate when you can submit the next one) without having
detailed knowledge of how expensive each individual query was project
to be. But that's a pretty narrow use case, and I'm not sure it
really justifies the overhead of instrumenting every query in this
way. For a fraction of the run-time cost, you could include the
estimated total cost of the query in the pg_stat_activity output,
which would let the user do much the same thing presuming that they
have some knowledge of the usual ratio between costs and execution
times.

Greg Stark was (is?) working on a way to get EXPLAIN-ANALYZE type
statistics on running queries; you might want to take a look at some
of that work and see what you think.

http://wiki.postgresql.org/wiki/Greg_Stark%27s_Development_Projects#EXPLAIN_queries_in_progress

...Robert

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Robert Haas (#2)
Re: Query progress indication - an implementation

On Sunday 28 June 2009 20:38:59 Robert Haas wrote:

The only other use case I can think of for functionality of this type
is some kind of dashboard view on a system with very long-running
queries, where you want to see how far you have yet to go on each one
(maybe to approximate when you can submit the next one) without having
detailed knowledge of how expensive each individual query was project
to be. But that's a pretty narrow use case, and I'm not sure it
really justifies the overhead of instrumenting every query in this
way.

Well, progress information is probably only going to be useful for long-
running processes anyway, and then only when the admin is too bored and
wonders what he can usefully cancel. So it seems import to figure out exactly
when to turn this on and when not to without causing too much overhead
elsewhere.

#4Scara Maccai
m_lists@yahoo.it
In reply to: Peter Eisentraut (#3)
Re: Query progress indication - an implementation

You might want to take a look at this:
http://wiki.postgresql.org/wiki/Submitting_a_Patch

I will; I'm sorry it wasn't in the proper format. It was just a proof of concept, I guess I should have talked about it before even sending the patch.

As to the content of the patch, I think that what you are
doing is
comparing the actual number of "operations" with the
expected number
of operations. If that's correct, I'm not sure it's
really all that
useful, because it will only give you accurate
percentage-of-completion information when the estimates are
correct.
But when the estimates are correct, you probably have a
pretty good
idea how long the query will take to run anyway.

That would be a first step. Having an idea of how much a query "progressed" is very important for long-running queries. It's like copying files: even if you have a rough idea of how much time a copy will take, having an interface that tells you the percentage done is quite useful (IMHO).

When
the estimates
are off, you'll find that the actual number of operations
is more than
the expected number of operations, but that won't really
tell you how
far you have to go.

The second phase would be using histograms to help refine the statistics at runtime.

The only other use case I can think of for functionality of
this type
is some kind of dashboard view on a system with very
long-running
queries, where you want to see how far you have yet to go
on each one
(maybe to approximate when you can submit the next one)
without having
detailed knowledge of how expensive each individual query
was project
to be. But that's a pretty narrow use case

I don't think it's that narrow: it is important, for long running queries, to know how far the query processed (the user wants to know how much of a query has yet to be executed). That's why you find so many papers on query progress indicators.
The real problem is that they don't give you a solution :)

Greg Stark was (is?) working on a way to get
EXPLAIN-ANALYZE type
statistics on running queries; you might want to take a
look at some
of that work and see what you think.

http://wiki.postgresql.org/wiki/Greg_Stark%27s_Development_Projects#EXPLAIN_queries_in_progress

That's interesting. I'll take a look!

Thank you very much for your comments.

Is anyone interested in such a progress indicator???

#5Peter Hunsberger
peter.hunsberger@gmail.com
In reply to: Scara Maccai (#4)
Re: Query progress indication - an implementation

On Mon, Jun 29, 2009 at 3:47 AM, Scara Maccai<m_lists@yahoo.it> wrote:

Is anyone interested in such a progress indicator???

I'm relatively new to Postgres and just starting to look at starting
to look at what we might do with it for handling large genomic
datasets. I've used Toad for Oracle to have a look at whats going on
inside long running Oracle queries. Knowing that a particular step is
doing a particular activity is useful for diagnostics as well as being
assured that you actually are making forward progress. IMO any
diagnostics you can provide for a low cost are useful. The more
detail, the better. "Step 1 of 10" is good, "80% complete on step 1
of 10" is better. "80% complete on step 1, 10% complete on 10 steps"
is even better.

--
Peter Hunsberger

#6Robert Haas
robertmhaas@gmail.com
In reply to: Scara Maccai (#4)
Re: Query progress indication - an implementation

On Mon, Jun 29, 2009 at 4:47 AM, Scara Maccai<m_lists@yahoo.it> wrote:

As to the content of the patch, I think that what you are doing is
comparing the actual number of "operations" with the expected number
of operations.  If that's correct, I'm not sure it's really all that
useful, because it will only give you accurate
percentage-of-completion information when the estimates are
correct. But when the estimates are correct, you probably have a
pretty good idea how long the query will take to run anyway.

That would be a first step. Having an idea of how much a query "progressed" is very important for long-running queries. It's like
copying files: even if you have a rough idea of how much time a copy will take, having an interface that tells you the percentage
done is quite useful (IMHO).

I am all in favor of a query progress indicator; the trick is
constructing one that will actually be useful. It's easy to have
estimates that are off by a factor of two or three, though, so I think
you'd frequently have situations when the query completed when the
progress estimater was at 40% or 250%. Those kinds of progress
indicators tend to annoy users, and for good reason. File copying is
the sort of thing where it's pretty easy to estimate percentage of
completion by files and bytes; query execution is not.

So, I'm all in favor of what you're trying to conceptually; I just
don't like your proposed implementation.

...Robert

#7Noname
m_lists@yahoo.it
In reply to: Robert Haas (#6)
Re: Query progress indication - an implementation

It's
easy to have
estimates that are off by a factor of two or three, though,
so I think
you'd frequently have situations when the query completed
when the
progress estimater was at 40% or 250%.

I thought about implementing a "given perfect estimates" indicator at first then, as a second step, using histograms to leverage the indicator precision at run time. Of course, this doesn't mean the user wouldn't see the query completed at 40% or "slowing down" in a lot of cases...

I started this patch after reading the papers in
http://wiki.postgresql.org/wiki/Query_progress_indication
Apparently they were able to predict query execution remaining time (in case of a "perfect estimates" query) with a very simple algorithm.

Given that:
1) The algorithm ("driver node hypothesis") is so easy
2) My project fits in the category of "perfect estimates" queries

I thought "I'll give it a try".

Well: I have no idea how they got their results.

IMHO it's not possible to get max 10% error on query remaing time on most of the tpcd queries using that method, since the "driver nodes" have all the same "importance". I had to introduce a lot of complexity (not in the patch that I posted) to have it "somehow" working, giving the nodes different work per tuple according to the node type (example: in a loop join the time it takes to read a row of the outer relation can't be compared to, say, the time it takes to read a row from a table scan: but the driver node hypothesis says they will take the same time...).

So the code that I have right now works "pretty well" for the 10 queries of my project, but I guess won't work for general queries :(

So, I'm all in favor of what you're trying to conceptually;
I just
don't like your proposed implementation.

What kind of implementation would you propose?

Thank you very much for your comments.

#8Scara Maccai
m_lists@yahoo.it
In reply to: Noname (#7)
Re: Query progress indication - an implementation

IMO
any
diagnostics you can provide for a low cost are
useful.  The more
detail, the better.  "Step 1 of 10" is good, "80%
complete on step 1
of 10" is better.  "80% complete on step 1, 10%
complete on 10 steps"
is even better.

Well, I guess "Step 1 of 10" would be pretty trivial to implement (given the tree plan, how many branches have finished executing).
This doesn't tell you anything about the remaining time though.
That is: "Step 1 of 10" doesn't mean that you are at 10%, nor that you are at 50%, or 99%. In fact, I'm afraid it doesn't tell you anything...
I don't understand how useful such an implementation would be... "being
assured that you actually are making forward progress"... in which cases could you not be making forward progress?

Still, this won't take much time: if it's needed by someone else I guess it could be easily done.

#9Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#6)
Re: Query progress indication - an implementation

All,

Actually, an indicator of even just what step of the query was being
executed would be very useful for checking on stuck queries. If a DBA
checks once that the query is on "bitmapscan on table_x(index_y)", and
it's still on that 15 minutes later, he/she can guess that the query is
thrashing due to HW or bad plan issues and kill it.

If the query is on "sort rowset by col1" then the DBA knows not to kill
it because it's almost done.

So, while an actual % completed indicator would be perfect, a "query
steps completed, current step =" would still be very useful and a large
improvement over what we have now.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#9)
Re: Query progress indication - an implementation

Josh Berkus <josh@agliodbs.com> writes:

So, while an actual % completed indicator would be perfect, a "query
steps completed, current step =" would still be very useful and a large
improvement over what we have now.

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent. You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

regards, tom lane

#11Robert Haas
robertmhaas@gmail.com
In reply to: Noname (#7)
Re: Query progress indication - an implementation

On Mon, Jun 29, 2009 at 11:15 AM, <m_lists@yahoo.it> wrote:

So the code that I have right now works "pretty well" for the 10 queries of my project, but I guess won't work for general queries :(

I think that's probably right.

So, I'm all in favor of what you're trying to conceptually;
I just
don't like your proposed implementation.

What kind of implementation would you propose?

I don't really have a proposed implementation in mind; I think it's a
hard problem. That's why I suggested looking at the
EXPLAIN-ANALYZE-in-progress stuff. By providing a lot more detail, a
human being can take a look at that output and make a judgment about
what's going on. That's not as easy-to-use as what you're trying to
do, but I suspect it's more useful in practice. It might be that
after reading a few hundred of those someone could propose some rules
of thumb to estimate the percentage of completion, which we could then
incorporate back into the system. If not, we lose nothing by
implementing that feature first, since it is independently useful.

Thank you very much for your comments.

No problem.

...Robert

#12Joshua Tolley
eggyknap@gmail.com
In reply to: Tom Lane (#10)
Re: Query progress indication - an implementation

On Mon, Jun 29, 2009 at 02:07:23PM -0400, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

So, while an actual % completed indicator would be perfect, a "query
steps completed, current step =" would still be very useful and a large
improvement over what we have now.

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent. You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

What about showing the outermost node where work has started?

--
Josh / eggyknap
End Point Corp.
www.endpoint.com

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua Tolley (#12)
Re: Query progress indication - an implementation

Joshua Tolley <eggyknap@gmail.com> writes:

On Mon, Jun 29, 2009 at 02:07:23PM -0400, Tom Lane wrote:

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent. You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

What about showing the outermost node where work has started?

That's always the outermost node; what would it tell you?

regards, tom lane

#14Peter Hunsberger
peter.hunsberger@gmail.com
In reply to: Tom Lane (#10)
Re: Query progress indication - an implementation

On Mon, Jun 29, 2009 at 1:07 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

So, while an actual % completed indicator would be perfect, a "query
steps completed, current step =" would still be very useful and a large
improvement over what we have now.

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent.  You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

What you get in Toad for Oracle is the ability to see long running
processes with multiple lines, one per process currently underway. If
I recall correctly, the returned information includes what operation
is underway (eg, physical reads), the % complete, start time, time
remaining and elapsed time. Time remaining has been mostly useless
every time I've had to drill down to this level, but otherwise this
has been relatively useful information.

--
Peter Hunsberger

#15Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#10)
Re: Query progress indication - an implementation

On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

So, while an actual % completed indicator would be perfect, a "query
steps completed, current step =" would still be very useful and a large
improvement over what we have now.

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent. You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

The requirement is not nonsense, even if the detail was slightly off.

We can regard plans as acting in phases with each blocking node
separating the plan. We know which nodes those are, so we can report
that.

For each phase, it may be very hard to say what percentage is truly
complete, but we could at least report how much work has been done and
provide a percentage against planned numbers.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#15)
Re: Query progress indication - an implementation

Simon Riggs <simon@2ndQuadrant.com> writes:

On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote:

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent. You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

The requirement is not nonsense, even if the detail was slightly off.

I was applying the word "nonsense" to the proposed implementation,
not the desire to have query progress indications ...

We can regard plans as acting in phases with each blocking node
separating the plan. We know which nodes those are, so we can report
that.

[ shrug... ] You can regard them that way, but you won't get
particularly helpful results for a large fraction of real queries.
The system is generally set up to prefer "streaming" evaluation
as much as it can. Even in nominally blocking nodes like Sort and Hash,
there are operational modes that look more like streaming, or at least
chunking.

regards, tom lane

#17Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#16)
Re: Query progress indication - an implementation

On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote:

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent.  You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

Right, that was why my proposed interface was to dump out the explain
plan with the number of loops, row counts seen so far, and approximate
percentage progress.

My thinking was that a human could interpret that to understand where
the bottleneck is if, say you're still on the first row for the top
few nodes but all the nodes below a certain sort have run to
completion that the query is busy running the sort...

But a tool like psql or pgadmin would receive that and just display
the top-level percent progress. pgadmin might actually be able to
display its graphical explain with some graphical representation of
the percent progress of each node.

We can actually do *very* well for percent progress for a lot of
nodes. Sequential scans or bitmap scans, for example, can display
their actual percent done in terms of disk blocks.

The gotcha I ran into was what to do with a nested loop join. The safe
thing to do would be to report just the outer child's percentage
directly. But that would perform poorly in the not uncommon case where
there's one expected outer tuple. If we could trust the outer estimate
we could report (outer-percentage + (1/outer-estimate *
inner-percentage)) but that will get weird quickly if the
outer-percentage turns out to be underestimated.

Basically I disagree that imperfect progress reports annoy users. I
think we can do better than reporting 250% done or having a percentage
that goes backward though. It would be quite tolerable (though perhaps
for no logical reason) to have a progress indicator which slows done
as it gets closer to 100% and never seems to make it to 100%.

--
greg
http://mit.edu/~gsstark/resume.pdf

#18Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Greg Stark (#17)
Re: Query progress indication - an implementation

Greg Stark wrote:

Right, that was why my proposed interface was to dump out the explain
plan with the number of loops, row counts seen so far, and approximate
percentage progress.

My thinking was that a human could interpret that to understand where
the bottleneck is if, say you're still on the first row for the top
few nodes but all the nodes below a certain sort have run to
completion that the query is busy running the sort...

+1. Especially if I run it a few times and I can see which counters
are still moving.

Basically I disagree that imperfect progress reports annoy users. I
think we can do better than reporting 250% done or having a percentage
that goes backward though. It would be quite tolerable (though perhaps
for no logical reason) to have a progress indicator which slows done
as it gets closer to 100% and never seems to make it to 100%.

-1. A counter that slowly goes from 99% to 99.5% done is
much worse than a counter that takes the same much time
going from "1000% of estimated rows done" to "2000% of
estimated rows done".

The former just tells me that it lies about how much is done.
The latter tells me that it's processing each row quickly but
that the estimate was way off.

#19Robert Haas
robertmhaas@gmail.com
In reply to: Ron Mayer (#18)
Re: Query progress indication - an implementation

On Mon, Jun 29, 2009 at 8:15 PM, Ron Mayer<rm_pg@cheapcomplexdevices.com> wrote:

Greg Stark wrote:

Right, that was why my proposed interface was to dump out the explain
plan with the number of loops, row counts seen so far, and approximate
percentage progress.

My thinking was that a human could interpret that to understand where
the bottleneck is if, say you're still on the first row for the top
few nodes but all the nodes below a certain sort have run to
completion that the query is busy running the sort...

+1.  Especially if I run it a few times and I can see which counters
are still moving.

+1 from me, too, as I said upthread.

Basically I disagree that imperfect progress reports annoy users. I
think we can do better than reporting 250% done or having a percentage
that goes backward though. It would be quite tolerable (though perhaps
for no logical reason) to have a progress indicator which slows done
as it gets closer to 100% and never seems to make it to 100%.

-1.    A counter that slowly goes from 99% to 99.5% done is
much worse than a counter that takes the same much time
going from "1000% of estimated rows done" to "2000% of
estimated rows done".

The former just tells me that it lies about how much is done.
The latter tells me that it's processing each row quickly but
that the estimate was way off.

I think both of those options are a little wacky. Maybe 800% **of
estimated rows done** is not so bad, since the tag line provides some
context, but what does it mean exactly? Rows for the toplevel plan
node? That doesn't seem like it would always be too useful. I keep
coming back to thinking you need to see the whole tree.

...Robert

#20Dimitri Fontaine
dfontaine@hi-media.com
In reply to: Greg Stark (#17)
Re: Query progress indication - an implementation

Le 30 juin 2009 à 01:34, Greg Stark <gsstark@mit.edu> a écrit :

Basically I disagree that imperfect progress reports annoy users. I
think we can do better than reporting 250% done or having a percentage
that goes backward though. It would be quite tolerable (though perhaps
for no logical reason) to have a progress indicator which slows done
as it gets closer to 100% and never seems to make it to 100%.

I guess bad stats are such an important problem in planning queries
that a 250% progress is doing more good than harm in showing users how
badly they need to review their analyze related settings.

Regards,
--
dim

#21Scara Maccai
m_lists@yahoo.it
In reply to: Dimitri Fontaine (#20)
Re: Query progress indication - an implementation

+1.  Especially if I run it a few times and I can see
which counters
are still moving.

Per-node percentage is easy to do (given the perfect estimates, of course).
The problem comes when you want to give an "overall" percentage.

I wouldn't know where to put that "explain-like" output though: in a column in pg_stat_get_activity??? (and it would be available only if the proper variable was "on" before sending the query)

-1.    A counter that slowly goes from 99% to
99.5% done is
much worse than a counter that takes the same much time
going from "1000% of estimated rows done" to "2000% of
estimated rows done".

It's not just about estimates.
Even with 100% correct estimates, IMHO there's no way to get the perfect amount of work done so far.
And this is even without considering multiple queries running at the same time...

If someone has some time to read those papers let me know what he thinks about them... because I think their methods couldn't give them those results...

#22Simon Riggs
simon@2ndQuadrant.com
In reply to: Dimitri Fontaine (#20)
Re: Query progress indication - an implementation

On Tue, 2009-06-30 at 07:04 +0200, Dimitri Fontaine wrote:

Le 30 juin 2009 à 01:34, Greg Stark <gsstark@mit.edu> a écrit :

Basically I disagree that imperfect progress reports annoy users. I
think we can do better than reporting 250% done or having a percentage
that goes backward though. It would be quite tolerable (though perhaps
for no logical reason) to have a progress indicator which slows done
as it gets closer to 100% and never seems to make it to 100%.

I guess bad stats are such an important problem in planning queries
that a 250% progress is doing more good than harm in showing users how
badly they need to review their analyze related settings.

Yeh, I agree. We can define it as "planned work", rather than actual. So
if the progress bar says 250% and query is still going at least you know
it is doing more work, rather than just being slow at doing the planned
work.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#23Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#16)
Re: Query progress indication - an implementation

On Mon, 2009-06-29 at 18:49 -0400, Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On Mon, 2009-06-29 at 14:07 -0400, Tom Lane wrote:

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent. You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

The requirement is not nonsense, even if the detail was slightly off.

I was applying the word "nonsense" to the proposed implementation,
not the desire to have query progress indications ...

Understood, just trying to limit the blast radius.

We can regard plans as acting in phases with each blocking node
separating the plan. We know which nodes those are, so we can report
that.

[ shrug... ] You can regard them that way, but you won't get
particularly helpful results for a large fraction of real queries.
The system is generally set up to prefer "streaming" evaluation
as much as it can. Even in nominally blocking nodes like Sort and Hash,
there are operational modes that look more like streaming, or at least
chunking.

It's not always useful, though many large queries do have multiple
phases. The concept and the name come from ETL tools and it is of real
practical use in those environments. We can put the phase number on the
EXPLAIN easily, and it is very simple to calculate the total number of
phases and the current phase - e.g. 2 of 5 phases complete.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#24Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#13)
Re: Query progress indication - an implementation

Tom Lane wrote:

Joshua Tolley <eggyknap@gmail.com> writes:

On Mon, Jun 29, 2009 at 02:07:23PM -0400, Tom Lane wrote:

I think this is pretty much nonsense --- most queries run all their plan
nodes concurrently to some extent. You can't usefully say that a query
is "on" some node, nor measure progress by whether some node is "done".

What about showing the outermost node where work has started?

That's always the outermost node; what would it tell you?

[ Repost ]

I think the only resonable solution would be to consider the estimated
cost of each node and then compute what percentage complete each node
is.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#25Greg Stark
gsstark@mit.edu
In reply to: Bruce Momjian (#24)
Re: Query progress indication - an implementation

On Thu, Jul 2, 2009 at 2:32 AM, Bruce Momjian<bruce@momjian.us> wrote:

I think the only resonable solution would be to consider the estimated
cost of each node and then compute what percentage complete each node
is.

Well you can do better for some nodes. A sequential scan for example
can tell you exactly what percentage of the way through its scan it
is. A sort node that's fnished the sort can produce an value based on
both the estimate of the relative costs of the sort vs reading the
results and the actual percentage progress reading the results.

So I think it has to come down to another ExecProcNode method the way
I had it arranged in my patch that actually implemented this.

I was partly waiting for the other patch which multiplexed signals
onto fewer actual unix signals to go through. And for XML explain
plans to go through. Once we have those then I think my patch is
actually nearly there, it just needs some additional tweaking of the
heuristics for more plan types.

Then comes the fun part of figuring out a useful UI for psql and
pgadmin. Personally I'm happy for psql to just print the plan whenever
the user hits siginfo. I think an apt-style curses progress bar would
be unecessarily heavyweight for the lightweight vision I have for
psql. But I know others have more ambitious visions for psql.

--
greg
http://mit.edu/~gsstark/resume.pdf

#26Robert Haas
robertmhaas@gmail.com
In reply to: Scara Maccai (#4)
Re: Query progress indication - an implementation

On Thu, Jul 2, 2009 at 12:48 PM, Euler Taveira de
Oliveira<euler@timbira.com> wrote:

I know that it didn't solve the estimation problem but ... IMHO the
[under|over]estimation should be treated by an external tool (autoexplain?).
So when we enable the query progress and some node reports a difference
between estimated and real more than x%, log the plan. Doing it, we will be
helping DBAs to investigate the bad plans.

Keep in mind that it is frequently the case that the estimates are
substantially off but the plan still works OK. I just put a dirty
hack into one of my apps to improve the selectivity estimates by a
factor of 200, but they're still off by a factor of 5. Even when they
were off by 1000x the bad plan happened only intermittently. You
notice the cases where the estimates are off and it makes for a bad
plan, but there are lots of other cases where the estimates are off
but the plan is still OK.

...Robert