[RFC] CLUSTER VERBOSE

Started by Grzegorz Jaskiewicz almost 19 years ago, 5 messages
#1Grzegorz Jaskiewicz
gj@pointblue.com.pl

Hi folks,

I figured I should start a brand new thread for this one, so here you
go.

I am in need of a verbose CLUSTER, i.e. one that would give me
feedback and progress.
Because CLUSTER is divided into two major operations (data
reordering, index rebuild), I see it this way:

CLUSTER on I: <index name> T: <table name>, data reordering
CLUSTER on I: <index name> T: <table name>, index rebuild

and then:
CLUSTER 10%
CLUSTER 12% , etc

(yeah, I know how hard it is to write a good progress indicator...).
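
To make the proposed output concrete, here is a minimal Python sketch: one line per phase, followed by percentage lines throttled to one per step. The class name `ClusterReporter`, its methods, and the `step_pct` parameter are hypothetical illustrations, not part of PostgreSQL or of any patch.

```python
class ClusterReporter:
    """Collects the messages a hypothetical CLUSTER VERBOSE might emit."""

    def __init__(self, index, table, step_pct=10):
        self.index = index
        self.table = table
        self.step_pct = step_pct      # assumed reporting granularity
        self.messages = []
        self._next_pct = step_pct

    def start_phase(self, phase):
        # Each phase (data reordering, index rebuild) restarts the counter.
        self._next_pct = self.step_pct
        self.messages.append(
            f"CLUSTER on I: {self.index} T: {self.table}, {phase}")

    def update(self, done, total):
        if total <= 0:
            return
        pct = done * 100 // total
        # Emit at most one line per step boundary that has been reached.
        if pct >= self._next_pct:
            reported = pct - pct % self.step_pct
            self.messages.append(f"CLUSTER {reported}%")
            self._next_pct = reported + self.step_pct
```

Calling `start_phase("data reordering")` and then `update(done, total)` once per processed tuple would append a new percentage line only when the next step boundary is crossed, which keeps the output short even for large tables.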

I don't have even the slightest doubt that it can be useful, just like
"VACUUM VERBOSE" is. So no question about it.
I am seeking comments and ideas.
The patch would not be very intrusive: at the moment no one is using
VERBOSE for CLUSTER, because it is not there, and existing behavior
would not change.
I am looking for opinions on what information should be presented.
Perhaps there is also a use for some information it might gather
elsewhere (stats, etc.), but that's not really my point at the moment.

Thanks for all comments.
btw, I would really appreciate not being CCed on this; I have been
subscribed here for years now (since 8.0 times).

ta.

--
Grzegorz Jaskiewicz

C/C++ freelance for hire

#2Dawid Kuroczko
qnex42@gmail.com
In reply to: Grzegorz Jaskiewicz (#1)
Re: [RFC] CLUSTER VERBOSE

On 3/15/07, Grzegorz Jaskiewicz <gj@pointblue.com.pl> wrote:

> I figured I should start a brand new thread for this one, so here you
> go.
>
> I am in need of a verbose CLUSTER, i.e. one that would give me
> feedback and progress.
> Because CLUSTER is divided into two major operations (data
> reordering, index rebuild), I see it this way:
>
> CLUSTER on I: <index name> T: <table name>, data reordering
> CLUSTER on I: <index name> T: <table name>, index rebuild
>
> and then:
> CLUSTER 10%
> CLUSTER 12% , etc

Well, I'm afraid that would be inconsistent with other VERBOSE
commands (VACUUM VERBOSE), which give no progress indication
other than reporting that a specific stage has finished.

I think if you want to add VERBOSE to cluster, it should behave
exactly like all other 'VERBOSE' commands.

And as for progress indication, there have been proposals for a more
or less similar feature, like:
http://archives.postgresql.org/pgsql-hackers/2006-07/msg00719.php

As I recall, the ideas which gained the most traction were
indicating current progress via shared memory (pg_stat_activity)
and a GUC variable which instructs the server to send notices
indicating the progress status. The latter is harder.
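
The two approaches could be sketched roughly as follows, in Python purely for illustration. `ProgressSlot` stands in for a per-backend slot in shared memory that a monitoring session could read, and `notice_every` plays the role of the GUC that switches progress notices on. All names here are hypothetical; nothing below is an existing PostgreSQL API.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ProgressSlot:
    """Stand-in for a per-backend progress slot in shared memory."""
    phase: str = ""
    done: int = 0
    total: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def publish(self, phase, done, total):
        # Writer side: the running command overwrites its own slot.
        with self._lock:
            self.phase, self.done, self.total = phase, done, total

    def snapshot(self):
        # Reader side: a monitoring session takes a consistent copy.
        with self._lock:
            return (self.phase, self.done, self.total)


def run_command(slot, total, notice_every=0, notices=None):
    """Simulate a long command; notice_every=0 means notices are off."""
    for done in range(1, total + 1):
        slot.publish("data reordering", done, total)
        if notice_every and notices is not None and done % notice_every == 0:
            notices.append(f"NOTICE: {done}/{total}")
```

The slot costs the running command only a cheap write and leaves polling to whoever is interested, while the notice path pushes messages to the client whether or not anyone reads them.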

I'm afraid creating such a feature 'just for CLUSTER' is not the greatest
idea -- there are lots of other places where having a progress bar would
be a great benefit. REINDEX, most ALTER TABLEs, CREATE INDEX, even
long running SELECTs, UPDATEs and DELETEs, not to mention VACUUM,
would equally benefit from it. I think you will have a hard time trying
to push a CLUSTER-specific extension when there is a need for a more
generic one.

Regards,
Dawid

#3Heikki Linnakangas
heikki@enterprisedb.com
In reply to: Grzegorz Jaskiewicz (#1)
Re: [RFC] CLUSTER VERBOSE

Grzegorz Jaskiewicz wrote:

> Because CLUSTER is divided into two major operations, (data reordering,
> index rebuild) - I see it this way:
>
> CLUSTER on I: <index name> T: <table name>, data reordering
> CLUSTER on I: <index name> T: <table name>, index rebuild

Something like that would be nice to see how long each step takes, like
vacuum verbose.

> and then:
> CLUSTER 10%
> CLUSTER 12% , etc

We don't have progress indicators for any other commands, and I don't
see why we should add one for cluster in particular. Sure, progress
indicators are nice, but we should rather try to add some kind of a
general progress indicator support that would support SELECTs for
example. I know it's much harder, but also much more useful.

> I am looking for opinions, on what information should be presented.

What would be useful is some kind of a metric of how (de)clustered the
table was before CLUSTER, and the same # of dead vs. live row counts
that vacuum verbose prints.

We don't really have a good metric for clusteredness, as has been
discussed before, so if you can come up with a good one that would be
useful in the planner as well, that would be great.
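
As one possible starting point, and only as an assumption about what such a metric could look like: walk the index in order, note the heap block each entry points to, and take the fraction of adjacent entries whose block number does not go backwards. The function `clusteredness` and its input format are hypothetical, a sketch in Python rather than anything in cluster.c or the planner.

```python
def clusteredness(blocks_in_index_order):
    """Fraction of adjacent index entries whose heap block does not go backwards.

    blocks_in_index_order: heap block numbers, listed in index order.
    Returns 1.0 when a scan in index order never moves to an earlier
    block, and 0.0 when every step moves backwards.
    """
    pairs = list(zip(blocks_in_index_order, blocks_in_index_order[1:]))
    if not pairs:
        # Zero or one entry: trivially in order.
        return 1.0
    in_order = sum(1 for prev, cur in pairs if cur >= prev)
    return in_order / len(pairs)
```

Computing this before and after the rewrite would give CLUSTER VERBOSE a single number to print for each run. Whether such a number is also good enough for the planner is a separate question that this sketch does not answer.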

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4Grzegorz Jaskiewicz
gj@pointblue.com.pl
In reply to: Heikki Linnakangas (#3)
Re: [RFC] CLUSTER VERBOSE

On Mar 16, 2007, at 9:53 AM, Heikki Linnakangas wrote:

> Grzegorz Jaskiewicz wrote:
>
>> Because CLUSTER is divided into two major operations, (data
>> reordering, index rebuild) - I see it this way:
>> CLUSTER on I: <index name> T: <table name>, data reordering
>> CLUSTER on I: <index name> T: <table name>, index rebuild
>
> Something like that would be nice to see how long each step takes,
> like vacuum verbose.

yup.

>> I am looking for opinions, on what information should be presented.
>
> What would be useful is some kind of a metric of how (de)clustered
> the table was before CLUSTER, and the same # of dead vs. live row
> counts that vacuum verbose prints.

Is that information available in cluster.c atm? I am looking for
some hints here. One of the reasons I decided to go with this patch
is to learn something, and cluster seems to touch the very 'bone'
of postgres: the tuple system (just like vacuum) and indices. I would
appreciate any hints.

> We don't really have a good metric for clusteredness, as has been
> discussed before, so if you can come up with a good one that would
> be useful in the planner as well, that would be great.

I really don't know where and how I should calculate such a parameter.
Any hints?

thanks.

--
Grzegorz Jaskiewicz

C/C++ freelance for hire

#5Bruce Momjian
bruce@momjian.us
In reply to: Grzegorz Jaskiewicz (#4)
Re: [RFC] CLUSTER VERBOSE

Added to TODO for CLUSTER:

o %Add VERBOSE option to report tables as they are processed,
like VACUUM VERBOSE


--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +