Add index scan progress to pg_stat_progress_vacuum

Started by Sami Imseih over 4 years ago, 138 messages, hackers
#1 Sami Imseih
samimseih@gmail.com

The current implementation of pg_stat_progress_vacuum does not provide progress on which index is being vacuumed making it difficult for a user to determine if the "vacuuming indexes" phase is making progress. By exposing which index is being scanned as well as the total progress the scan has made for the current cycle, a user can make better estimations on when the vacuum will complete.

The proposed patch adds 4 new columns to pg_stat_progress_vacuum:

1. indrelid - the relid of the index being vacuumed
2. index_blks_total - total number of blocks to be scanned in the current cycle
3. index_blks_scanned - number of blocks scanned in the current cycle
4. leader_pid - the PID of the leader process, which shows whether the pg_stat_progress_vacuum entry belongs to a leader or a vacuum worker (it is empty for the leader's own entry). This patch places an entry for every worker PID (if parallel) as well as the leader PID

Attached is the patch.

Here is sample output of a parallel vacuum for a table with relid = 16638:

postgres=# select * from pg_stat_progress_vacuum ;
-[ RECORD 1 ]------+------------------
pid | 18180
datid | 13732
datname | postgres
relid | 16638
phase | vacuuming indexes
heap_blks_total | 5149825
heap_blks_scanned | 5149825
heap_blks_vacuumed | 3686381
index_vacuum_count | 2
max_dead_tuples | 178956969
num_dead_tuples | 142086544
indrelid | 0 <<-----
index_blks_total | 0 <<-----
index_blks_scanned | 0 <<-----
leader_pid | <<-----
-[ RECORD 2 ]------+------------------
pid | 1543
datid | 13732
datname | postgres
relid | 16638
phase | vacuuming indexes
heap_blks_total | 0
heap_blks_scanned | 0
heap_blks_vacuumed | 0
index_vacuum_count | 0
max_dead_tuples | 0
num_dead_tuples | 0
indrelid | 16646
index_blks_total | 3030305
index_blks_scanned | 2356564
leader_pid | 18180
-[ RECORD 3 ]------+------------------
pid | 1544
datid | 13732
datname | postgres
relid | 16638
phase | vacuuming indexes
heap_blks_total | 0
heap_blks_scanned | 0
heap_blks_vacuumed | 0
index_vacuum_count | 0
max_dead_tuples | 0
num_dead_tuples | 0
indrelid | 16651
index_blks_total | 2685921
index_blks_scanned | 2119179
leader_pid | 18180
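
As a rough illustration of how a monitoring script might read this output, the sketch below computes a per-worker percentage from the two proposed block counters. The rows are copied from the sample above; the helper name `index_scan_pct` is made up for this example, and treating a zero `index_blks_total` as "no index scan reported" is an assumption rather than something the patch defines.

```python
# Rows copied from the sample output above; column names follow the proposed patch.
rows = [
    {"pid": 18180, "leader_pid": None, "indrelid": 0,
     "index_blks_total": 0, "index_blks_scanned": 0},
    {"pid": 1543, "leader_pid": 18180, "indrelid": 16646,
     "index_blks_total": 3030305, "index_blks_scanned": 2356564},
    {"pid": 1544, "leader_pid": 18180, "indrelid": 16651,
     "index_blks_total": 2685921, "index_blks_scanned": 2119179},
]


def index_scan_pct(row):
    """Percent of index blocks scanned in the current cycle.

    Returns None when the entry reports no index scan (assumed here to be
    signalled by index_blks_total == 0, as in the leader's row).
    """
    total = row["index_blks_total"]
    if total == 0:
        return None
    return 100.0 * row["index_blks_scanned"] / total


# Map each backend PID to its index scan progress.
progress = {row["pid"]: index_scan_pct(row) for row in rows}

for pid, pct in progress.items():
    print(pid, "n/a" if pct is None else f"{pct:.1f}%")
```

The percentage is only as good as the counters behind it, so it should be read as an estimate for the current cycle, not for the vacuum as a whole.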

Regards,

Sami Imseih
Database Engineer @ Amazon Web Services

Attachments:

patch.txt (text/plain), +209 -12

#2 Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#1)
Re: Add index scan progress to pg_stat_progress_vacuum

On 12/1/21, 3:02 PM, "Imseih (AWS), Sami" <simseih@amazon.com> wrote:

The current implementation of pg_stat_progress_vacuum does not
provide progress on which index is being vacuumed making it
difficult for a user to determine if the "vacuuming indexes" phase
is making progress. By exposing which index is being scanned as well
as the total progress the scan has made for the current cycle, a
user can make better estimations on when the vacuum will complete.

+1

The proposed patch adds 4 new columns to pg_stat_progress_vacuum:

1. indrelid - the relid of the index being vacuumed
2. index_blks_total - total number of blocks to be scanned in the
current cycle
3. index_blks_scanned - number of blocks scanned in the current
cycle
4. leader_pid - if the pid for the pg_stat_progress_vacuum entry is
a leader or a vacuum worker. This patch places an entry for every
worker pid ( if parallel ) as well as the leader pid

nitpick: Shouldn't index_blks_scanned be index_blks_vacuumed? IMO it
is more analogous to heap_blks_vacuumed.

This will tell us which indexes are currently being vacuumed and the
current progress of those operations, but it doesn't tell us which
indexes have already been vacuumed or which ones are pending vacuum.
I think such information is necessary to truly understand the current
progress of vacuuming indexes, and I can think of a couple of ways we
might provide it:

1. Make the new columns you've proposed return arrays. This isn't
very clean, but it would keep all the information for a given
vacuum operation in a single row. The indrelids column would be
populated with all the indexes that have been vacuumed, need to
be vacuumed, or are presently being vacuumed. The other index-
related columns would then have the associated stats and the
worker PID (which might be the same as the pid column depending
on whether parallel index vacuum was being done). Alternatively,
the index column could have an array of records, each containing
all the information for a given index.
2. Create a new view for just index vacuum progress information.
This would have similar information as 1. There would be an
entry for each index that has been vacuumed, needs to be
vacuumed, or is currently being vacuumed. And there would be an
easy way to join with pg_stat_progress_vacuum (e.g., leader_pid,
which again might be the same as our index vacuum PID depending
on whether we were doing parallel index vacuum). Note that it
would be possible for the PID of these entries to be null before
and after we process the index.
3. Instead of adding columns to pg_stat_progress_vacuum, adjust the
current ones to be more general, and then add new entries for
each of the indexes that have been, need to be, or currently are
being vacuumed. This is the most similar option to your current
proposal, but instead of introducing a column like
index_blks_total, we'd rename heap_blks_total to blks_total and
use that for both the heap and indexes. I think we'd still want
to add a leader_pid column. Again, we have to be prepared for
the PID to be null in this case. Or we could just make the pid
column always refer to the leader, and we could introduce a
worker_pid column. That might create confusion, though.

I wish option #1 was cleaner, because I think it would be really nice
to have all this information in a single row. However, I don't expect
much support for a 3-dimensional view, so I suspect option #2
(creating a separate view for index vacuum progress) is the way to go.
The other benefit of option #2 versus option #3 or your original
proposal is that it cleanly separates the top-level vacuum operations
and the index vacuum operations, which are related at the moment, but
which might not always be tied so closely together.

Nathan

#3 Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#2)
Re: Add index scan progress to pg_stat_progress_vacuum

On 12/15/21, 4:10 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:

On 12/1/21, 3:02 PM, "Imseih (AWS), Sami" <simseih@amazon.com> wrote:

The current implementation of pg_stat_progress_vacuum does not
provide progress on which index is being vacuumed making it
difficult for a user to determine if the "vacuuming indexes" phase
is making progress. By exposing which index is being scanned as well
as the total progress the scan has made for the current cycle, a
user can make better estimations on when the vacuum will complete.

+1

The proposed patch adds 4 new columns to pg_stat_progress_vacuum:

1. indrelid - the relid of the index being vacuumed
2. index_blks_total - total number of blocks to be scanned in the
current cycle
3. index_blks_scanned - number of blocks scanned in the current
cycle
4. leader_pid - if the pid for the pg_stat_progress_vacuum entry is
a leader or a vacuum worker. This patch places an entry for every
worker pid ( if parallel ) as well as the leader pid

nitpick: Shouldn't index_blks_scanned be index_blks_vacuumed? IMO it
is more analogous to heap_blks_vacuumed.

No. What is being tracked is the number of index blocks scanned out of the total index blocks. A block is scanned regardless of whether it ends up being vacuumed.

This will tell us which indexes are currently being vacuumed and the
current progress of those operations, but it doesn't tell us which
indexes have already been vacuumed or which ones are pending vacuum.
I think such information is necessary to truly understand the current
progress of vacuuming indexes, and I can think of a couple of ways we
might provide it:

1. Make the new columns you've proposed return arrays. This isn't
very clean, but it would keep all the information for a given
vacuum operation in a single row. The indrelids column would be
populated with all the indexes that have been vacuumed, need to
be vacuumed, or are presently being vacuumed. The other index-
related columns would then have the associated stats and the
worker PID (which might be the same as the pid column depending
on whether parallel index vacuum was being done). Alternatively,
the index column could have an array of records, each containing
all the information for a given index.
2. Create a new view for just index vacuum progress information.
This would have similar information as 1. There would be an
entry for each index that has been vacuumed, needs to be
vacuumed, or is currently being vacuumed. And there would be an
easy way to join with pg_stat_progress_vacuum (e.g., leader_pid,
which again might be the same as our index vacuum PID depending
on whether we were doing parallel index vacuum). Note that it
would be possible for the PID of these entries to be null before
and after we process the index.
3. Instead of adding columns to pg_stat_progress_vacuum, adjust the
current ones to be more general, and then add new entries for
each of the indexes that have been, need to be, or currently are
being vacuumed. This is the most similar option to your current
proposal, but instead of introducing a column like
index_blks_total, we'd rename heap_blks_total to blks_total and
use that for both the heap and indexes. I think we'd still want
to add a leader_pid column. Again, we have to be prepared for
the PID to be null in this case. Or we could just make the pid
column always refer to the leader, and we could introduce a
worker_pid column. That might create confusion, though.

I wish option #1 was cleaner, because I think it would be really nice
to have all this information in a single row. However, I don't expect
much support for a 3-dimensional view, so I suspect option #2
(creating a separate view for index vacuum progress) is the way to go.
The other benefit of option #2 versus option #3 or your original
proposal is that it cleanly separates the top-level vacuum operations
and the index vacuum operations, which are related at the moment, but
which might not always be tied so closely together.

Option #1 is not clean, as you would need to unnest the arrays to make sense of them. It would be too complex to use.
Option #3: I am reluctant to spend time on this option. It is more valuable to see progress per index than a total.
Option #2 is the one I originally designed, but I backed away from it because it introduced a new view. Thinking about it a bit more, this is the cleaner approach:
1. Having a view called pg_stat_progress_vacuum_worker to join with pg_stat_progress_vacuum is clean.
2. No changes are required to pg_stat_progress_vacuum.
3. I lean towards calling the view "pg_stat_progress_vacuum_worker" instead of "pg_stat_progress_vacuum_index", to allow us to track other work a vacuum worker may do in future releases. As of now, workers only vacuum indexes.
I will rework the patch for option #2.

Nathan

#4 Bruce Momjian
bruce@momjian.us
In reply to: Sami Imseih (#3)
Re: Add index scan progress to pg_stat_progress_vacuum

I had a similar question. And I'm still not clear from the response
what exactly index_blks_total is and whether it addresses it.

I think I agree that a user is likely to want to see the progress in a
way they can understand which means for a single index at a time.

I think what you're describing is that index_blks_total and
index_blks_scanned are the totals across all the indexes? That isn't
clear from the definitions but if that's what you intend then I think
that would work.

(For what it's worth what I was imagining was having a pair of
counters for blocks scanned and max blocks in this index and a second
counter for number of indexes processed and max number of indexes. But
I don't think that's necessarily any better than what you have)

#5 Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#3)
Re: Add index scan progress to pg_stat_progress_vacuum

Here is a V2 attempt at the patch, which adds a new view called pg_stat_progress_vacuum_worker. Scans for index cleanups will also have an entry in the new view.

- Here is the new view, which reports an entry for every worker (or leader) that is doing index vacuum or index cleanup work.
postgres=# select * from pg_stat_progress_vacuum_worker ;
-[ RECORD 1 ]------+------
pid | 29355
leader_pid | 26501
indrelid | 16391
index_blks_total | 68894
index_blks_scanned | 35618

- the view can be joined with pg_stat_progress_vacuum. Sample output below

postgres=# select a.*, b.phase, b.heap_blks_total, b.heap_blks_scanned from pg_stat_progress_vacuum_worker a full outer join pg_stat_progress_vacuum b on a.pid = b.pid ;
pid | leader_pid | indrelid | index_blks_total | index_blks_scanned | phase | heap_blks_total | heap_blks_scanned
-------+------------+----------+------------------+--------------------+---------------------+-----------------+-------------------
26667 | 26667 | 16391 | 9165 | 401 | cleaning up indexes | 20082 | 20082
(1 row)

postgres=# select a.*, b.phase, b.heap_blks_total, b.heap_blks_scanned from pg_stat_progress_vacuum_worker a full outer join pg_stat_progress_vacuum b on a.pid = b.pid ;
-[ RECORD 1 ]------+------------------
pid | 26501
leader_pid | 26501
indrelid | 16393
index_blks_total | 145107
index_blks_scanned | 11060
phase | vacuuming indexes
heap_blks_total | 165375
heap_blks_scanned | 165375
-[ RECORD 2 ]------+------------------
pid | 28982
leader_pid | 26501
indrelid | 16392
index_blks_total | 47616
index_blks_scanned | 11861
phase | vacuuming indexes
heap_blks_total | 0
heap_blks_scanned | 0
-[ RECORD 3 ]------+------------------
pid | 28983
leader_pid | 26501
indrelid | 16391
index_blks_total | 56936
index_blks_scanned | 9138
phase | vacuuming indexes
heap_blks_total | 0
heap_blks_scanned | 0
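
To show how the joined rows could be rolled up into one figure per vacuum, here is a small sketch that groups the three sample rows above by `leader_pid` and sums the block counters. The tuple layout and the `totals` name are just for this illustration. Note that such a sum covers only the indexes currently being processed, so it says nothing about indexes that are pending or already done.

```python
from collections import defaultdict

# (pid, leader_pid, indrelid, index_blks_total, index_blks_scanned),
# copied from the three-row sample output above.
rows = [
    (26501, 26501, 16393, 145107, 11060),
    (28982, 26501, 16392, 47616, 11861),
    (28983, 26501, 16391, 56936, 9138),
]

# leader_pid -> [sum of index_blks_total, sum of index_blks_scanned]
totals = defaultdict(lambda: [0, 0])
for pid, leader_pid, indrelid, blks_total, blks_scanned in rows:
    totals[leader_pid][0] += blks_total
    totals[leader_pid][1] += blks_scanned

for leader_pid, (blks_total, blks_scanned) in totals.items():
    pct = 100.0 * blks_scanned / blks_total
    print(f"leader {leader_pid}: {blks_scanned}/{blks_total} blocks ({pct:.1f}%)")
```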

Attachments:

patch.v2.txt (text/plain), +246 -11

#6 Peter Geoghegan
pg@bowt.ie
In reply to: Nathan Bossart (#2)
Re: Add index scan progress to pg_stat_progress_vacuum

On Wed, Dec 15, 2021 at 2:10 PM Bossart, Nathan <bossartn@amazon.com> wrote:

nitpick: Shouldn't index_blks_scanned be index_blks_vacuumed? IMO it
is more analogous to heap_blks_vacuumed.

+1.

This will tell us which indexes are currently being vacuumed and the
current progress of those operations, but it doesn't tell us which
indexes have already been vacuumed or which ones are pending vacuum.

VACUUM will process a table's indexes in pg_class OID order (outside
of parallel VACUUM, I suppose). See comments about sort order above
RelationGetIndexList().

Anyway, it might be useful to add ordinal numbers to each index, that
line up with this processing/OID order. It would also be reasonable to
display the same number in log_autovacuum* (and VACUUM VERBOSE)
per-index output, to reinforce the idea. Note that we don't
necessarily display a distinct line for each distinct index in this
log output, which is why including the ordinal number there makes
sense.

I wish option #1 was cleaner, because I think it would be really nice
to have all this information in a single row.

I do too. I agree with the specific points you raise in your remarks
about what you've called options #2 and #3, but those options still
seem unappealing to me.

--
Peter Geoghegan

#7 Peter Geoghegan
pg@bowt.ie
In reply to: Sami Imseih (#1)
Re: Add index scan progress to pg_stat_progress_vacuum

On Wed, Dec 1, 2021 at 2:59 PM Imseih (AWS), Sami <simseih@amazon.com> wrote:

The current implementation of pg_stat_progress_vacuum does not provide progress on which index is being vacuumed making it difficult for a user to determine if the "vacuuming indexes" phase is making progress.

I notice that your patch largely assumes that indexes can be treated
like heap relations, in the sense that they're scanned sequentially,
and process each block exactly once (or exactly once per "pass"). But
that isn't quite true. There are a few differences that seem like they
might matter:

* An ambulkdelete() scan of an index cannot take the size of the
relation once, at the start, and ignore any blocks that are added
after the scan begins. And so the code may need to re-establish the
total size of the index multiple times, to make sure no index tuples
are missed -- there may be index tuples that VACUUM needs to process
that appear in later pages due to concurrent page splits. You don't
have the issue with things like IndexBulkDeleteResult.num_pages,
because they report on the index after ambulkdelete/amvacuumcleanup
return (they're not granular progress indicators).

* Some index AMs don't work like nbtree and GiST in that they cannot
do their scan sequentially -- they have to do something like a
logical/keyspace order scan instead, which is *totally* different to
heapam (not just a bit different). There is no telling how many times
each page will be accessed in these other index AMs, and in what
order, even under optimal conditions. We should arguably not even try
to provide any granular progress information here, since it'll
probably be too messy.

I'm not sure what to recommend for your patch, in light of this. Maybe
you should change the names of the new columns to own the squishiness.
For example, instead of using the name index_blks_total, you might
instead use the name index_blks_initial. That might be enough to avoid
user confusion when we scan more blocks than the index initially
contained (within a single ambulkdelete scan).

Note also that we have to do something called backtracking in
btvacuumpage(), which you've ignored -- that's another reasonably
common way that we'll end up scanning a page twice. But that probably
should just be ignored -- it's too narrow a case to be worth caring
about.
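
To make the "squishiness" concrete, the following sketch shows one way a consumer of such counters could keep a progress fraction sane when more blocks are scanned than the index initially contained. It is an illustration only and not what the patch does: `scan_fraction`, `blks_initial`, and `blks_current` are hypothetical names, with `blks_initial` standing in for the suggested index_blks_initial.

```python
def scan_fraction(blks_initial, blks_scanned, blks_current=None):
    """Fraction of an index scan completed, clamped to [0, 1].

    blks_initial: index size in blocks when the scan started.
    blks_scanned: blocks scanned so far; may exceed blks_initial if the
                  index grew through concurrent page splits.
    blks_current: optional re-read of the index size, if available.

    Returns None for an empty index, where a fraction is meaningless.
    """
    # Use the largest size seen so far, so the fraction never exceeds 1.0.
    denominator = max(blks_initial, blks_current or 0, blks_scanned)
    if denominator == 0:
        return None
    return blks_scanned / denominator


print(scan_fraction(100, 50))                   # index did not grow
print(scan_fraction(100, 120))                  # scanned past the initial size
print(scan_fraction(100, 90, blks_current=150)) # size re-established mid-scan
```

The price of clamping is that the fraction can move backwards when the index size is re-read, which is another reason to name the column for what it really measures.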

--
Peter Geoghegan

#8 Justin Pryzby
pryzby@telsasoft.com
In reply to: Sami Imseih (#1)
Re: Add index scan progress to pg_stat_progress_vacuum

This view also doesn't show vacuum progress across a partitioned table.

For comparison:

pg_stat_progress_create_index (added in v12) has:
partitions_total
partitions_done

pg_stat_progress_analyze (added in v13) has:
child_tables_total
child_tables_done

pg_stat_progress_cluster should have something similar.

--
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581

#9 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Geoghegan (#6)
Re: Add index scan progress to pg_stat_progress_vacuum

On Tue, Dec 21, 2021 at 3:37 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Wed, Dec 15, 2021 at 2:10 PM Bossart, Nathan <bossartn@amazon.com> wrote:

nitpick: Shouldn't index_blks_scanned be index_blks_vacuumed? IMO it
is more analogous to heap_blks_vacuumed.

+1.

This will tell us which indexes are currently being vacuumed and the
current progress of those operations, but it doesn't tell us which
indexes have already been vacuumed or which ones are pending vacuum.

VACUUM will process a table's indexes in pg_class OID order (outside
of parallel VACUUM, I suppose). See comments about sort order above
RelationGetIndexList().

Right.

Anyway, it might be useful to add ordinal numbers to each index, that
line up with this processing/OID order. It would also be reasonable to
display the same number in log_autovacuum* (and VACUUM VERBOSE)
per-index output, to reinforce the idea. Note that we don't
necessarily display a distinct line for each distinct index in this
log output, which is why including the ordinal number there makes
sense.

An alternative idea would be to show the number of indexes on the
table and the number of indexes that have been processed in the
leader's entry of pg_stat_progress_vacuum. Even in parallel vacuum
cases, since we have index vacuum status for each index it would not
be hard for the leader process to count how many indexes have been
processed.
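
A minimal sketch of that counting idea, assuming the leader can see a per-index "done" flag for the current cycle. The dictionary below is a stand-in for whatever per-index status the leader actually keeps; the OIDs are borrowed from the earlier sample output and the flag values are invented for illustration.

```python
def index_counts(index_done):
    """Return (indexes_total, indexes_processed) for the current cycle.

    index_done maps an index OID to True once that index has been
    processed in this cycle (a hypothetical per-index status).
    """
    total = len(index_done)
    processed = sum(1 for done in index_done.values() if done)
    return total, processed


# Hypothetical status: one of three indexes has been processed so far.
status = {16391: True, 16392: False, 16393: False}
indexes_total, indexes_processed = index_counts(status)
print(f"{indexes_processed} of {indexes_total} indexes processed")
```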

Regarding the details of the progress of index vacuum, I'm not sure
this progress information can fit for pg_stat_progress_vacuum. As
Peter already mentioned, the behavior quite varies depending on index
AM.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

#10 Andrei Lepikhov
lepihov@gmail.com
In reply to: Peter Geoghegan (#7)
Re: Add index scan progress to pg_stat_progress_vacuum

On 21/12/2021 00:05, Peter Geoghegan wrote:

* Some index AMs don't work like nbtree and GiST in that they cannot
do their scan sequentially -- they have to do something like a
logical/keyspace order scan instead, which is *totally* different to
heapam (not just a bit different). There is no telling how many times
each page will be accessed in these other index AMs, and in what
order, even under optimal conditions. We should arguably not even try
to provide any granular progress information here, since it'll
probably be too messy.

Maybe we could add callbacks into AM interface for
send/receive/representation implementation of progress?
So AM would define a set of parameters to send into stat collector and
show to users.

--
regards,
Andrey Lepikhov
Postgres Professional

#11 Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#8)
Re: Add index scan progress to pg_stat_progress_vacuum

Please send your patches as *.diff or *.patch, so they're processed by the
patch tester. Preferably with commit messages; git format-patch is the usual
tool for this.
http://cfbot.cputube.org/sami-imseih.html

(Occasionally, it's also useful to send a *.txt to avoid the cfbot processing
the wrong thing, in case one sends an unrelated, secondary patch, or sends
fixes to a patch as a "relative patch" which doesn't include the main patch.)

I'm including a patch rebased on 8e1fae193.

Attachments:

0001-Add-index-scan-progress-to-pg_stat_progress_vacuum.patch (text/x-diff; charset=us-ascii), +247 -12

#12 Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#9)
Re: Add index scan progress to pg_stat_progress_vacuum

I do agree that tracking progress by # of blocks scanned is not deterministic for all index types.

Based on this feedback, I went back to the drawing board on this.

Something like below may make more sense.

In pg_stat_progress_vacuum, introduce 2 new columns:

1. total_index_vacuum - total # of indexes to vacuum
2. max_cycle_time - the time in seconds of the longest index cycle.

Introduce another view called pg_stat_progress_vacuum_index_cycle:

postgres=# \d pg_stat_progress_vacuum_index_cycle
View "public.pg_stat_progress_vacuum_worker"
Column | Type | Collation | Nullable | Default
----------------+---------+-----------+----------+---------
pid | integer | | | <<<-- the PID of the vacuum worker ( or leader if it's doing index vacuuming )
leader_pid | bigint | | | <<<-- the leader PID to allow this view to be joined back to pg_stat_progress_vacuum
indrelid | bigint | | | <<<- the index relid of the index being vacuumed
ordinal_position | bigint | | | <<<- the position of the index in the processing order, which gives an idea of how far the current index vacuum cycle has advanced.
dead_tuples_removed | bigint | | | <<<- the number of dead rows removed in the current cycle for the index.

Having this information, one can

1. Determine which index is being vacuumed. For monitoring tools, this can help identify the index that accounts for most of the index vacuuming time.
2. Use the processing order of the current index to determine how many of the total indexes have been completed in the current cycle.
3. Use dead_tuples_removed to see progress on the index vacuum in the current cycle.
4. Use max_cycle_time to get an idea of how long the longest index cycle took for the current vacuum operation.
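
For item 2, the arithmetic a user would do can be sketched as follows. This assumes a 1-based ordinal_position and a serial (non-parallel) index vacuum in which indexes are processed strictly in order; neither assumption is stated in the proposal, and the function name is made up for the example.

```python
def indexes_completed(ordinal_position, total_index_vacuum):
    """Indexes finished so far in the current cycle.

    Assumes ordinal_position is 1-based and that indexes are vacuumed
    one at a time in ordinal order (serial vacuum).
    """
    if not 1 <= ordinal_position <= total_index_vacuum:
        raise ValueError("ordinal_position must be between 1 and total_index_vacuum")
    return ordinal_position - 1


# Hypothetical reading: the 3rd of 5 indexes is currently being vacuumed.
total = 5
completed = indexes_completed(ordinal_position=3, total_index_vacuum=total)
remaining = total - completed - 1  # excludes the index being vacuumed now
print(f"{completed} done, 1 in progress, {remaining} remaining")
```

With parallel vacuum several indexes are in progress at once, so this simple subtraction would no longer hold.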

#13 Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#12)
Re: Add index scan progress to pg_stat_progress_vacuum

Attached is the latest revision of the patch.

In "pg_stat_progress_vacuum", introduce 2 columns:

* total_index_vacuum : The # of indexes that will be vacuumed. Keep in mind that if failsafe mode kicks in midway through the vacuum, Postgres may choose to forgo index scans. This value will be adjusted accordingly.
* max_index_vacuum_cycle_time : The total elapsed time of each index vacuum cycle is calculated, and this value is updated to reflect the longest cycle so far. Until the first cycle completes, this value will be 0. The purpose of this column is to give the user an idea of how long an index vacuum cycle takes to complete.

postgres=# \d pg_stat_progress_vacuum
View "pg_catalog.pg_stat_progress_vacuum"
Column | Type | Collation | Nullable | Default
-----------------------------+---------+-----------+----------+---------
pid | integer | | |
datid | oid | | |
datname | name | | |
relid | oid | | |
phase | text | | |
heap_blks_total | bigint | | |
heap_blks_scanned | bigint | | |
heap_blks_vacuumed | bigint | | |
index_vacuum_count | bigint | | |
max_dead_tuples | bigint | | |
num_dead_tuples | bigint | | |
total_index_vacuum | bigint | | |
max_index_vacuum_cycle_time | bigint | | |

Introduce a new view called "pg_stat_progress_vacuum_index". This view will track the progress of a worker ( or leader PID ) while it's vacuuming an index. It will expose some key columns:

* pid: The PID of the worker process

* leader_pid: The PID of the leader process. This is the column that can be joined with "pg_stat_progress_vacuum". leader_pid and pid can have the same value as a leader can also perform an index vacuum.

* indrelid: The relid of the index currently being vacuumed

* vacuum_cycle_ordinal_position: The processing position of the index being vacuumed. This can be useful to determine how many indexes out of the total indexes ( pg_stat_progress_vacuum.total_index_vacuum ) have been vacuumed

* index_tuples_vacuumed: This is the number of index tuples vacuumed for the index overall. This is useful to show that the vacuum is actually doing work, as the # of tuples keeps increasing.

postgres=# \d pg_stat_progress_vacuum_index
             View "pg_catalog.pg_stat_progress_vacuum_index"
            Column             |  Type   | Collation | Nullable | Default
-------------------------------+---------+-----------+----------+---------
 pid                           | integer |           |          |
 leader_pid                    | bigint  |           |          |
 indrelid                      | bigint  |           |          |
 vacuum_cycle_ordinal_position | bigint  |           |          |
 index_tuples_vacuumed         | bigint  |           |          |

On 12/27/21, 6:12 PM, "Imseih (AWS), Sami" <simseih@amazon.com> wrote:

I do agree that tracking progress by # of blocks scanned is not deterministic for all index types.

Based on this feedback, I went back to the drawing board on this.

Something like below may make more sense.

In pg_stat_progress_vacuum, introduce 2 new columns:

1. total_index_vacuum - total # of indexes to vacuum
2. max_cycle_time - the time in seconds of the longest index cycle.

Introduce another view called pg_stat_progress_vacuum_index_cycle:

postgres=# \d pg_stat_progress_vacuum_index_cycle
          View "public.pg_stat_progress_vacuum_worker"
       Column        |  Type   | Collation | Nullable | Default
---------------------+---------+-----------+----------+---------
 pid                 | integer |           |          | <<<-- the PID of the vacuum worker ( or leader if it's doing index vacuuming )
 leader_pid          | bigint  |           |          | <<<-- the leader PID to allow this view to be joined back to pg_stat_progress_vacuum
 indrelid            | bigint  |           |          | <<<-- the relid of the index being vacuumed
 ordinal_position    | bigint  |           |          | <<<-- the position of the index being vacuumed in the processing order
 dead_tuples_removed | bigint  |           |          | <<<-- the number of dead rows removed in the current cycle for the index

With this information, one can:

1. Determine which index is being vacuumed. For monitoring tools, this can help identify the index that accounts for most of the index vacuuming time.
2. Use the processing position of the current index to determine how many of the total indexes have been completed in the current cycle.
3. Use dead_tuples_removed to track the progress of the index vacuum in the current cycle.
4. Use max_cycle_time to get an idea of how long the longest index cycle took for the current vacuum operation.

Attachments:

0001-Add-index-scan-progress-to-pg_stat_progress_vacuum.patch (application/octet-stream, +137 -30)

#14Justin Pryzby
pryzby@telsasoft.com
In reply to: Sami Imseih (#13)
Re: Add index scan progress to pg_stat_progress_vacuum

http://cfbot.cputube.org/sami-imseih.html
You should run "make check" and update rules.out.

You should also use make check-world - usually something like:
make check-world -j4 >check-world.out 2>&1 ; echo ret $?

indrelid: The relid of the index currently being vacuumed

I think it should be called indexrelid not indrelid, for consistency with
pg_index.

S.param10 vacuum_cycle_ordinal_position,
S.param13 index_rows_vacuumed

These should both say "AS" for consistency.

system_views.sql is using tabs, but should use spaces for consistency.

#include "commands/progress.h"

The postgres convention is to alphabetize the includes.

/* VACCUM operation's longest index scan cycle */

VACCUM => VACUUM

Ultimately you'll also need to update the docs.

#15Sami Imseih
samimseih@gmail.com
In reply to: Justin Pryzby (#14)
Re: Add index scan progress to pg_stat_progress_vacuum

Attaching the latest revision of the patch with the fixes suggested. Also ran make check and make check-world successfully.

Attachments:

0001-Add-index-scan-progress-to-pg_stat_progress_vacuum.patch (application/octet-stream, +146 -32)

#16Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#15)
Re: Add index scan progress to pg_stat_progress_vacuum

On 12/29/21, 8:44 AM, "Imseih (AWS), Sami" <simseih@amazon.com> wrote:

In "pg_stat_progress_vacuum", introduce 2 columns:

* total_index_vacuum : The number of indexes that will be vacuumed. Keep in mind that if failsafe mode kicks in partway through the vacuum, Postgres may choose to forgo index scans, and this value will be adjusted accordingly.
* max_index_vacuum_cycle_time : The elapsed time of each index vacuum cycle is measured, and this value is updated to reflect the longest cycle so far. Until the first cycle completes, this value will be 0. The purpose of this column is to give the user an idea of how long an index vacuum cycle takes to complete.

I think that total_index_vacuum is a good thing to have. I would
expect this to usually just be the number of indexes on the table, but
as you pointed out, this can be different when we are skipping
indexes. My only concern with this new column is the potential for
confusion when compared with the index_vacuum_count value.
index_vacuum_count indicates the number of vacuum cycles completed,
but total_index_vacuum indicates the number of indexes that will be
vacuumed. However, the names sound like they could refer to the same
thing to me. Perhaps we should rename index_vacuum_count to
index_vacuum_cycles/index_vacuum_cycle_count, and the new column
should be something like num_indexes_to_vacuum or index_vacuum_total.

I don't think we need the max_index_vacuum_cycle_time column. While
the idea is to give users a rough estimate for how long an index cycle
will take, I don't think it will help generate any meaningful
estimates for how much longer the vacuum operation will take. IIUC we
won't have any idea how many total index vacuum cycles will be needed.
Even if we did, the current cycle could take much more or much less
time. Also, none of the other progress views seem to provide any
timing information, which I suspect is by design to avoid inaccurate
estimates.

Introduce a new view called "pg_stat_progress_vacuum_index". This view will track the progress of a worker ( or leader PID ) while it's vacuuming an index. It will expose some key columns:

* pid: The PID of the worker process

* leader_pid: The PID of the leader process. This is the column that can be joined with "pg_stat_progress_vacuum". leader_pid and pid can have the same value as a leader can also perform an index vacuum.

* indrelid: The relid of the index currently being vacuumed

* vacuum_cycle_ordinal_position: The processing position of the index being vacuumed. This can be useful to determine how many indexes out of the total indexes ( pg_stat_progress_vacuum.total_index_vacuum ) have been vacuumed

* index_tuples_vacuumed: This is the number of index tuples vacuumed for the index overall. This is useful to show that the vacuum is actually doing work, as the # of tuples keeps increasing.

Should we also provide some information for determining the progress
of the current cycle? Perhaps there should be an
index_tuples_vacuumed_current_cycle column that users can compare with
the num_dead_tuples value in pg_stat_progress_vacuum. However,
perhaps the number of tuples vacuumed in the current cycle can already
be discovered via index_tuples_vacuumed % max_dead_tuples.
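
A minimal sketch of the modulo idea above, for illustration only. The helper name is hypothetical, and the estimate rests on an assumption that is not guaranteed: that every completed cycle removed exactly max_dead_tuples tuples from this index.

```c
#include <stdint.h>

/*
 * Estimate how many tuples have been removed from an index in the
 * current cycle, given the running total for the index.
 *
 * Assumption (may not hold): each earlier cycle removed exactly
 * max_dead_tuples tuples from this index.
 */
static int64_t
tuples_vacuumed_in_current_cycle(int64_t index_tuples_vacuumed,
								 int64_t max_dead_tuples)
{
	if (max_dead_tuples <= 0)
		return 0;				/* avoid division by zero */
	return index_tuples_vacuumed % max_dead_tuples;
}
```

Note that the result is ambiguous when a cycle has removed exactly max_dead_tuples tuples: the remainder is 0 both at the very start of a cycle and at the very end of one, which is one argument for a dedicated per-cycle column.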

+void
+rusage_adjust(const PGRUsage *ru0, PGRUsage *ru1)
+{
+	if (ru1->tv.tv_usec < ru0->tv.tv_usec)
+	{
+		ru1->tv.tv_sec--;
+		ru1->tv.tv_usec += 1000000;
+	}
+	if (ru1->ru.ru_stime.tv_usec < ru0->ru.ru_stime.tv_usec)
+	{
+		ru1->ru.ru_stime.tv_sec--;
+		ru1->ru.ru_stime.tv_usec += 1000000;
+	}
+	if (ru1->ru.ru_utime.tv_usec < ru0->ru.ru_utime.tv_usec)
+	{
+		ru1->ru.ru_utime.tv_sec--;
+		ru1->ru.ru_utime.tv_usec += 1000000;
+	}
+}

I think this function could benefit from a comment. Without going
through it line by line, it is not clear to me exactly what it is
doing.
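
For readers following along, here is one reading of what each branch of the quoted function does, written as a standalone sketch on plain struct timeval values. The names timeval_borrow() and timeval_elapsed_usec() are made up for this illustration and are not part of the patch.

```c
#include <sys/time.h>

/*
 * If the end timestamp's microsecond field is smaller than the start's,
 * borrow one second from end->tv_sec.  After this, subtracting the two
 * timestamps field by field (end - start) cannot produce a negative
 * microsecond component.
 */
static void
timeval_borrow(const struct timeval *start, struct timeval *end)
{
	if (end->tv_usec < start->tv_usec)
	{
		end->tv_sec--;
		end->tv_usec += 1000000;
	}
}

/* Elapsed time between start and end, in microseconds. */
static long long
timeval_elapsed_usec(const struct timeval *start, struct timeval end)
{
	/* end is passed by value, so the caller's copy is left untouched */
	timeval_borrow(start, &end);
	return (long long) (end.tv_sec - start->tv_sec) * 1000000LL
		+ (long long) (end.tv_usec - start->tv_usec);
}
```

The quoted rusage_adjust() appears to apply the same borrow step three times: once for wall-clock time and once each for system and user CPU time.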

I know we're still working on what exactly this stuff should look
like, but I would suggest adding the documentation changes in the near
future.

Nathan

#17Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#16)
Re: Add index scan progress to pg_stat_progress_vacuum

Thanks for the review.

I am hesitant to make column name changes for obvious reasons, as it breaks existing tooling. However, I think there is a really good case to change "index_vacuum_count" as the name is confusing. "index_vacuum_cycles_completed" is the name I suggest if we agree to rename.

For the new column, "num_indexes_to_vacuum" is good with me.

As for max_index_vacuum_cycle_time, besides the points you make, another reason to drop it is that until one cycle completes, this value will remain at 0, so it will not be helpful data for most vacuum cases. Removing it also reduces the complexity of the patch.

#18Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#17)
Re: Add index scan progress to pg_stat_progress_vacuum

On 1/6/22, 6:14 PM, "Imseih (AWS), Sami" <simseih@amazon.com> wrote:

I am hesitant to make column name changes for obvious reasons, as it breaks existing tooling. However, I think there is a really good case to change "index_vacuum_count" as the name is confusing. "index_vacuum_cycles_completed" is the name I suggest if we agree to rename.

For the new column, "num_indexes_to_vacuum" is good with me.

Yeah, I think we can skip renaming index_vacuum_count for now. In any
case, it would probably be good to discuss that in a separate thread.

Nathan

#19Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#18)
Re: Add index scan progress to pg_stat_progress_vacuum

I agree. Renaming "index_vacuum_count" can be taken up in a separate discussion.

I have attached the 3rd revision of the patch, which also includes the documentation changes. Also attached is a rendered HTML version of the docs for review.

"max_index_vacuum_cycle_time" has been removed.
"index_rows_vacuumed" renamed to "index_tuples_removed". "tuples" is more consistent with the terminology used.
"vacuum_cycle_ordinal_position" renamed to "index_ordinal_position".

Attachments:

progress-reporting.html (text/html)
0001-Expose-progress-for-the-vacuuming-indexes-phase-of-a.patch (application/octet-stream, +205 -19)

#20Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#19)
Re: Add index scan progress to pg_stat_progress_vacuum

On 1/10/22, 5:01 PM, "Imseih (AWS), Sami" <simseih@amazon.com> wrote:

I have attached the 3rd revision of the patch, which also includes the documentation changes. Also attached is a rendered HTML version of the docs for review.

"max_index_vacuum_cycle_time" has been removed.
"index_rows_vacuumed" renamed to "index_tuples_removed". "tuples" is more consistent with the terminology used.
"vacuum_cycle_ordinal_position" renamed to "index_ordinal_position".

Thanks for the new version of the patch!

nitpick: I get one whitespace error when applying the patch.

Applying: Expose progress for the "vacuuming indexes" phase of a VACUUM operation.
.git/rebase-apply/patch:44: tab in indent.
Whenever <xref linkend="guc-vacuum-failsafe-age"/> is triggered, index
warning: 1 line adds whitespace errors.

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>num_indexes_to_vacuum</structfield> <type>bigint</type>
+      </para>
+      <para>
+       The number of indexes that will be vacuumed. Only indexes with
+       <literal>pg_index.indisready</literal> set to "true" will be vacuumed.
+       Whenever <xref linkend="guc-vacuum-failsafe-age"/> is triggered, index
+       vacuuming will be bypassed.
+      </para></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>

We may want to avoid exhaustively listing the cases when this value
will be zero. I would suggest saying, "When index cleanup is skipped,
this value will be zero" instead.

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>relid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of the table being vacuumed.
+      </para></entry>
+     </row>

Do we need to include this field? I would expect indexrelid to go
here.

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>leader_pid</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Process ID of the parallel group leader. This field is <literal>NULL</literal>
+       if this process is a parallel group leader or the
+       <literal>vacuuming indexes</literal> phase is not performed in parallel.
+      </para></entry>
+     </row>

Are there cases where the parallel group leader will have an entry in
this view when parallelism is enabled?

+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>index_ordinal_position</structfield> <type>bigint</type>
+      </para>
+      <para>
+       The order in which the index is being vacuumed. Indexes are vacuumed by OID in ascending order.
+      </para></entry>
+     </row>

Should we include the bit about the OID ordering? I suppose that is
unlikely to change in the near future, but I don't know if it is
relevant information. Also, do we need to include the "index_"
prefix? This view is specific for indexes. (I have the same question
for index_tuples_removed.)
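
To make the ordinal position concrete, here is a small sketch under the assumption stated in the quoted documentation (indexes processed in ascending OID order). The Oid typedef is a stand-in and index_ordinal_position() is a hypothetical helper, not a function from the patch.

```c
#include <stdlib.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid type */

/* qsort comparator: ascending OID order, written to avoid overflow */
static int
oid_cmp(const void *a, const void *b)
{
	Oid			oa = *(const Oid *) a;
	Oid			ob = *(const Oid *) b;

	return (oa > ob) - (oa < ob);
}

/*
 * Return the 1-based position of target among the index OIDs when they
 * are processed in ascending OID order, or 0 if target is not present.
 * The oids array is sorted in place.
 */
static int
index_ordinal_position(Oid *oids, size_t n, Oid target)
{
	qsort(oids, n, sizeof(Oid), oid_cmp);
	for (size_t i = 0; i < n; i++)
	{
		if (oids[i] == target)
			return (int) (i + 1);
	}
	return 0;
}
```

With parallel vacuum the indexes are not necessarily finished in this order, so the position says where an index sits in the list rather than how many indexes are already done.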

Should this new table go after the "VACUUM phases" table? It might
make sense to keep the phases table closer to where it is referenced.

+    /* Advertise the number of indexes to vacuum if we are not in failsafe mode */
+    if (!lazy_check_wraparound_failsafe(vacrel))
+        pgstat_progress_update_param(PROGRESS_VACUUM_TOTAL_INDEX_VACUUM, vacrel->nindexes);

Shouldn't this be 0 when INDEX_CLEANUP is off, too?
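
The behavior being asked for could be sketched like this. It is a standalone illustration with hypothetical names: num_indexes_to_advertise() does not exist in the patch, and do_index_vacuuming simply stands for "INDEX_CLEANUP is not off".

```c
#include <stdbool.h>

/*
 * Number of indexes to report as "to be vacuumed".
 *
 * Reports 0 whenever index vacuuming will be skipped, whether because
 * the wraparound failsafe triggered or because index cleanup is
 * disabled; otherwise reports the table's index count.
 */
static int
num_indexes_to_advertise(int nindexes, bool failsafe_active,
						 bool do_index_vacuuming)
{
	if (failsafe_active || !do_index_vacuuming)
		return 0;
	return nindexes;
}
```

Folding both conditions into one check would also match the suggested documentation wording, "When index cleanup is skipped, this value will be zero".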

+#define PROGRESS_VACUUM_CURRENT_INDRELID         7
+#define PROGRESS_VACUUM_LEADER_PID               8
+#define PROGRESS_VACUUM_INDEX_ORDINAL            9
+#define PROGRESS_VACUUM_TOTAL_INDEX_VACUUM       10
+#define PROGRESS_VACUUM_DEAD_TUPLES_VACUUMED     11

nitpick: I would suggest the following names to match the existing
style:

PROGRESS_VACUUM_NUM_INDEXES_TO_VACUUM
PROGRESS_VACUUM_INDEX_LEADER_PID
PROGRESS_VACUUM_INDEX_INDEXRELID
PROGRESS_VACUUM_INDEX_ORDINAL_POSITION
PROGRESS_VACUUM_INDEX_TUPLES_REMOVED

Nathan

#21Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#20)
#22Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#21)
#23Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#19)
#24Nathan Bossart
nathandbossart@gmail.com
In reply to: Masahiko Sawada (#23)
#25Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#24)
#26Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#25)
#27Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#26)
#28Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#26)
#29Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#26)
#30Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#29)
#31Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#30)
#32Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#31)
#33Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#32)
#34Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#33)
#35Peter Geoghegan
pg@bowt.ie
In reply to: Sami Imseih (#34)
#36Nathan Bossart
nathandbossart@gmail.com
In reply to: Peter Geoghegan (#35)
#37Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#36)
#38Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#36)
#39Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#38)
#40Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#39)
#41Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#40)
#42Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#41)
#43Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#42)
#44Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#42)
#45Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#44)
#46Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#45)
#47Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#46)
#48Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#47)
#49Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#48)
#50Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#49)
#51Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#46)
#52Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#51)
#53Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#51)
#54Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#53)
#55Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#54)
#56Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#55)
#57Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#56)
#58Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#54)
#59Andres Freund
andres@anarazel.de
In reply to: Sami Imseih (#56)
#60Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#58)
#61Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#60)
#62Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#61)
#63Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#62)
#64Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#63)
#65Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#64)
#66Andres Freund
andres@anarazel.de
In reply to: Sami Imseih (#65)
#67Sami Imseih
samimseih@gmail.com
In reply to: Andres Freund (#66)
#68Robert Haas
robertmhaas@gmail.com
In reply to: Sami Imseih (#67)
#69Andres Freund
andres@anarazel.de
In reply to: Sami Imseih (#67)
#70Sami Imseih
samimseih@gmail.com
In reply to: Andres Freund (#69)
#71Sami Imseih
samimseih@gmail.com
In reply to: Robert Haas (#68)
#72Robert Haas
robertmhaas@gmail.com
In reply to: Sami Imseih (#71)
#73Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#72)
#74Bruce Momjian
bruce@momjian.us
In reply to: Masahiko Sawada (#73)
#75Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#71)
#76Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#75)
#77Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#76)
#78Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#77)
#79Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#77)
#80Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#79)
#81Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#80)
#82Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#79)
#83Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Robert Haas (#82)
#84Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Masahiko Sawada (#83)
#85Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#51)
#86Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#85)
#87Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#86)
#88Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#87)
#89Ian Lawrence Barwick
barwick@gmail.com
In reply to: Sami Imseih (#88)
#90Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#88)
#91Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#88)
#92Andres Freund
andres@anarazel.de
In reply to: Sami Imseih (#90)
#93Sami Imseih
samimseih@gmail.com
In reply to: Andres Freund (#92)
#94Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#93)
#95Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#94)
#96Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#95)
#97Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#96)
#98Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#97)
#99Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#98)
#100Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#99)
#101Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#100)
#102vignesh C
vignesh21@gmail.com
In reply to: Sami Imseih (#101)
#103Sami Imseih
samimseih@gmail.com
In reply to: vignesh C (#102)
#104Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Sami Imseih (#103)
#105Sami Imseih
samimseih@gmail.com
In reply to: Bertrand Drouvot (#104)
#106Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#105)
#107Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#106)
#108Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#107)
#109Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#108)
#110Andres Freund
andres@anarazel.de
In reply to: Sami Imseih (#109)
#111Sami Imseih
samimseih@gmail.com
In reply to: Andres Freund (#110)
#112Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#111)
#113Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#112)
#114Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#113)
#115Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#114)
#116Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#115)
#117Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#116)
#118Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#117)
#119Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#118)
#120Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#119)
#121Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#120)
#122Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#121)
#123Sami Imseih
samimseih@gmail.com
In reply to: Masahiko Sawada (#122)
#124Michael Paquier
michael@paquier.xyz
In reply to: Sami Imseih (#123)
#125Sami Imseih
samimseih@gmail.com
In reply to: Michael Paquier (#124)
#126Michael Paquier
michael@paquier.xyz
In reply to: Sami Imseih (#125)
#127Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#124)
#128Sami Imseih
samimseih@gmail.com
In reply to: Michael Paquier (#126)
#129Michael Paquier
michael@paquier.xyz
In reply to: Sami Imseih (#128)
#130Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#127)
#131Sami Imseih
samimseih@gmail.com
In reply to: Michael Paquier (#129)
#132Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#130)
#133Michael Paquier
michael@paquier.xyz
In reply to: Sami Imseih (#131)
#134Sami Imseih
samimseih@gmail.com
In reply to: Michael Paquier (#133)
#135Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Sami Imseih (#134)
#136Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#135)
#137Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#136)
#138Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#137)