pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Hi,
pg_basebackup reports the backup progress if --progress option is specified,
and we can monitor it in the client side. I think that it's useful if we can
monitor the progress information also in the server side because, for example,
we can easily check that by using SQL. So I'd like to propose
pg_stat_progress_basebackup view that allows us to monitor the progress
of pg_basebackup, in the server side. Thought?
POC patch is attached.
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
Attachments:
pg_stat_progress_basebackup_v1.patchtext/plain; charset=UTF-8; name=pg_stat_progress_basebackup_v1.patch; x-mac-creator=0; x-mac-type=0Download+264-3
At Wed, 29 Jan 2020 23:16:08 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
Hi,
pg_basebackup reports the backup progress if --progress option is
specified,
and we can monitor it in the client side. I think that it's useful if
we can
monitor the progress information also in the server side because, for
example,
we can easily check that by using SQL. So I'd like to propose
pg_stat_progress_basebackup view that allows us to monitor the
progress
of pg_basebackup, in the server side. Thought?POC patch is attached.
| postgres=# \d pg_stat_progress_basebackup
| View "pg_catalog.pg_stat_progress_basebackup"
| Column | Type | Collation | Nullable | Default
| ---------------------+---------+-----------+----------+---------
| pid | integer | | |
| phase | text | | |
| backup_total | bigint | | |
| backup_streamed | bigint | | |
| tablespace_total | bigint | | |
| tablespace_streamed | bigint | | |
I think the view needs client identity such like host/port pair
besides pid (host/port pair fails identify client in the case of
unix-sockets.). Also elapsed time from session start might be
useful. pg_stat_progress_acuum has datid, datname and relid.
+ if (backup_total > 0 && backup_streamed > backup_total)
+ {
+ backup_total = backup_streamed;
Do we need the condition "backup_total > 0"?
+ int tblspc_streamed = 0;
+
+ pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
+ PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP);
This starts "streaming backup" phase with backup_total = 0. Coudln't
we move to that phase after setting backup total and tablespace total?
That is, just after calling SendBackupHeader().
+ WHEN 3 THEN 'stopping backup'::text
I'm not sure, but the "stop" seems suggesting the backup is terminated
before completion. If it is following the name of the function
pg_stop_backup, I think the name is suggesting to stop "the state for
performing backup", not a backup.
In the first place, the progress is about "backup" so it seems strange
that we have another phase after the "stopping backup" phase. It
might be better that it is "finishing file transfer" or such.
"initializing"
-> "starting file transfer"
-> "transferring files"
-> "finishing file transfer"
-> "transaferring WAL"
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
At Wed, 29 Jan 2020 23:16:08 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
Hi,
pg_basebackup reports the backup progress if --progress option is
specified,
and we can monitor it in the client side. I think that it's useful if
we can
monitor the progress information also in the server side because, for
example,
we can easily check that by using SQL. So I'd like to propose
pg_stat_progress_basebackup view that allows us to monitor the
progress
of pg_basebackup, in the server side. Thought?POC patch is attached.
| postgres=# \d pg_stat_progress_basebackup
| View "pg_catalog.pg_stat_progress_basebackup"
| Column | Type | Collation | Nullable | Default
| ---------------------+---------+-----------+----------+---------
| pid | integer | | |
| phase | text | | |
| backup_total | bigint | | |
| backup_streamed | bigint | | |
| tablespace_total | bigint | | |
| tablespace_streamed | bigint | | |I think the view needs client identity such like host/port pair
besides pid (host/port pair fails identify client in the case of
unix-sockets.).
I don't think this is job of a progress reporting. If those information
is necessary, we can join this view with pg_stat_replication using
pid column as the join key.
Also elapsed time from session start might be
useful.
+1
I think that not only pg_stat_progress_basebackup but also all the other
progress views should report the time when the target command was
started and the time when the phase was last changed. Those times
would be useful to estimate the remaining execution time from the
progress infromation.
pg_stat_progress_acuum has datid, datname and relid.
+ if (backup_total > 0 && backup_streamed > backup_total) + { + backup_total = backup_streamed;Do we need the condition "backup_total > 0"?
I added the condition for the case where --progress option is not supplied
in pg_basebackup, i.e., the case where the total amount of backup is not
estimated at the beginning of the backup. In this case, total_backup is
always 0.
+ int tblspc_streamed = 0; + + pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE, + PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP);This starts "streaming backup" phase with backup_total = 0. Coudln't
we move to that phase after setting backup total and tablespace total?
That is, just after calling SendBackupHeader().
OK, that's a bit less confusing for users. I will change in that way.
+ WHEN 3 THEN 'stopping backup'::text
I'm not sure, but the "stop" seems suggesting the backup is terminated
before completion. If it is following the name of the function
pg_stop_backup, I think the name is suggesting to stop "the state for
performing backup", not a backup.In the first place, the progress is about "backup" so it seems strange
that we have another phase after the "stopping backup" phase. It
might be better that it is "finishing file transfer" or such."initializing"
-> "starting file transfer"
-> "transferring files"
-> "finishing file transfer"
-> "transaferring WAL"
Better name is always welcome! If "stopping back" is confusing,
what about "performing pg_stop_backup"? So
initializing
performing pg_start_backup
streaming database files
performing pg_stop_backup
transfering WAL files
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
On Fri, 31 Jan 2020 at 02:29, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
At Wed, 29 Jan 2020 23:16:08 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
Hi,
pg_basebackup reports the backup progress if --progress option is
specified,
and we can monitor it in the client side. I think that it's useful if
we can
monitor the progress information also in the server side because, for
example,
we can easily check that by using SQL. So I'd like to propose
pg_stat_progress_basebackup view that allows us to monitor the
progress
of pg_basebackup, in the server side. Thought?POC patch is attached.
| postgres=# \d pg_stat_progress_basebackup
| View "pg_catalog.pg_stat_progress_basebackup"
| Column | Type | Collation | Nullable | Default
| ---------------------+---------+-----------+----------+---------
| pid | integer | | |
| phase | text | | |
| backup_total | bigint | | |
| backup_streamed | bigint | | |
| tablespace_total | bigint | | |
| tablespace_streamed | bigint | | |I think the view needs client identity such like host/port pair
besides pid (host/port pair fails identify client in the case of
unix-sockets.).I don't think this is job of a progress reporting. If those information
is necessary, we can join this view with pg_stat_replication using
pid column as the join key.Also elapsed time from session start might be
useful.+1
I think that not only pg_stat_progress_basebackup but also all the other
progress views should report the time when the target command was
started and the time when the phase was last changed. Those times
would be useful to estimate the remaining execution time from the
progress infromation.pg_stat_progress_acuum has datid, datname and relid.
+ if (backup_total > 0 && backup_streamed > backup_total) + { + backup_total = backup_streamed;Do we need the condition "backup_total > 0"?
I added the condition for the case where --progress option is not supplied
in pg_basebackup, i.e., the case where the total amount of backup is not
estimated at the beginning of the backup. In this case, total_backup is
always 0.+ int tblspc_streamed = 0; + + pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE, + PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP);This starts "streaming backup" phase with backup_total = 0. Coudln't
we move to that phase after setting backup total and tablespace total?
That is, just after calling SendBackupHeader().OK, that's a bit less confusing for users. I will change in that way.
+ WHEN 3 THEN 'stopping backup'::text
I'm not sure, but the "stop" seems suggesting the backup is terminated
before completion. If it is following the name of the function
pg_stop_backup, I think the name is suggesting to stop "the state for
performing backup", not a backup.In the first place, the progress is about "backup" so it seems strange
that we have another phase after the "stopping backup" phase. It
might be better that it is "finishing file transfer" or such."initializing"
-> "starting file transfer"
-> "transferring files"
-> "finishing file transfer"
-> "transaferring WAL"Better name is always welcome! If "stopping back" is confusing,
what about "performing pg_stop_backup"? Soinitializing
performing pg_start_backup
streaming database files
performing pg_stop_backup
transfering WAL files
Another idea I came up with is to show steps that take time instead of
pg_start_backup/pg_stop_backup, for better understanding the
situation. That is, "performing checkpoint" and "performing WAL
archive" etc, which engage the most of the time of these functions.
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020/02/02 14:59, Masahiko Sawada wrote:
On Fri, 31 Jan 2020 at 02:29, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
At Wed, 29 Jan 2020 23:16:08 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
Hi,
pg_basebackup reports the backup progress if --progress option is
specified,
and we can monitor it in the client side. I think that it's useful if
we can
monitor the progress information also in the server side because, for
example,
we can easily check that by using SQL. So I'd like to propose
pg_stat_progress_basebackup view that allows us to monitor the
progress
of pg_basebackup, in the server side. Thought?POC patch is attached.
| postgres=# \d pg_stat_progress_basebackup
| View "pg_catalog.pg_stat_progress_basebackup"
| Column | Type | Collation | Nullable | Default
| ---------------------+---------+-----------+----------+---------
| pid | integer | | |
| phase | text | | |
| backup_total | bigint | | |
| backup_streamed | bigint | | |
| tablespace_total | bigint | | |
| tablespace_streamed | bigint | | |I think the view needs client identity such like host/port pair
besides pid (host/port pair fails identify client in the case of
unix-sockets.).I don't think this is job of a progress reporting. If those information
is necessary, we can join this view with pg_stat_replication using
pid column as the join key.Also elapsed time from session start might be
useful.+1
I think that not only pg_stat_progress_basebackup but also all the other
progress views should report the time when the target command was
started and the time when the phase was last changed. Those times
would be useful to estimate the remaining execution time from the
progress infromation.pg_stat_progress_acuum has datid, datname and relid.
+ if (backup_total > 0 && backup_streamed > backup_total) + { + backup_total = backup_streamed;Do we need the condition "backup_total > 0"?
I added the condition for the case where --progress option is not supplied
in pg_basebackup, i.e., the case where the total amount of backup is not
estimated at the beginning of the backup. In this case, total_backup is
always 0.+ int tblspc_streamed = 0; + + pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE, + PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP);This starts "streaming backup" phase with backup_total = 0. Coudln't
we move to that phase after setting backup total and tablespace total?
That is, just after calling SendBackupHeader().OK, that's a bit less confusing for users. I will change in that way.
Fixed. Attached is the updated version of the patch.
I also fixed the regression test failure.
+ WHEN 3 THEN 'stopping backup'::text
I'm not sure, but the "stop" seems suggesting the backup is terminated
before completion. If it is following the name of the function
pg_stop_backup, I think the name is suggesting to stop "the state for
performing backup", not a backup.In the first place, the progress is about "backup" so it seems strange
that we have another phase after the "stopping backup" phase. It
might be better that it is "finishing file transfer" or such."initializing"
-> "starting file transfer"
-> "transferring files"
-> "finishing file transfer"
-> "transaferring WAL"Better name is always welcome! If "stopping back" is confusing,
what about "performing pg_stop_backup"? Soinitializing
performing pg_start_backup
streaming database files
performing pg_stop_backup
transfering WAL filesAnother idea I came up with is to show steps that take time instead of
pg_start_backup/pg_stop_backup, for better understanding the
situation. That is, "performing checkpoint" and "performing WAL
archive" etc, which engage the most of the time of these functions.
Yeah, that's an idea. ISTM that "waiting for WAL archiving" sounds
better than "performing WAL archive". Thought?
I've not applied this change in the patch yet, but if there is no
other idea, I'd like to adopt this.
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
Attachments:
pg_stat_progress_basebackup_v2.patchtext/plain; charset=UTF-8; name=pg_stat_progress_basebackup_v2.patch; x-mac-creator=0; x-mac-type=0Download+263-3
On Mon, Feb 3, 2020 at 1:17 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/02/02 14:59, Masahiko Sawada wrote:
On Fri, 31 Jan 2020 at 02:29, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
+ WHEN 3 THEN 'stopping backup'::text
I'm not sure, but the "stop" seems suggesting the backup is terminated
before completion. If it is following the name of the function
pg_stop_backup, I think the name is suggesting to stop "the state for
performing backup", not a backup.In the first place, the progress is about "backup" so it seems strange
that we have another phase after the "stopping backup" phase. It
might be better that it is "finishing file transfer" or such."initializing"
-> "starting file transfer"
-> "transferring files"
-> "finishing file transfer"
-> "transaferring WAL"Better name is always welcome! If "stopping back" is confusing,
what about "performing pg_stop_backup"? Soinitializing
performing pg_start_backup
streaming database files
performing pg_stop_backup
transfering WAL filesAnother idea I came up with is to show steps that take time instead of
pg_start_backup/pg_stop_backup, for better understanding the
situation. That is, "performing checkpoint" and "performing WAL
archive" etc, which engage the most of the time of these functions.Yeah, that's an idea. ISTM that "waiting for WAL archiving" sounds
better than "performing WAL archive". Thought?
I've not applied this change in the patch yet, but if there is no
other idea, I'd like to adopt this.
If we are trying to "pg_stop_backup" in phase name, maybe we should
avoid "pg_start_backup"? Then maybe:
initializing
starting backup / waiting for [ backup start ] checkpoint to finish
transferring database files
finishing backup / waiting for WAL archiving to finish
transferring WAL files
?
Some comments on documentation changes in v2 patch:
+ Amount of data already streamed.
"already" may be redundant. For example, in pg_start_progress_vacuum,
heap_blks_scanned is described as "...blocks scanned", not "...blocks
already scanned".
+ <entry><structfield>tablespace_total</structfield></entry>
+ <entry><structfield>tablespace_streamed</structfield></entry>
Better to use plural tablespaces_total and tablespaces_streamed for consistency?
+ The WAL sender process is currently performing
+ <function>pg_start_backup</function> and setting up for
+ making a base backup.
How about "taking" instead of "making" in the above sentence?
- <varlistentry>
+ <varlistentry id="protocol-replication-base-backup" xreflabel="BASE_BACKUP">
I don't see any new text in the documentation patch that uses above
xref, so no need to define it?
Thanks,
Amit
On Mon, Feb 3, 2020 at 4:28 PM Amit Langote <amitlangote09@gmail.com> wrote:
If we are trying to "pg_stop_backup" in phase name, maybe we should
avoid "pg_start_backup"? Then maybe:
Sorry, I meant to write:
If we are trying to avoid "pg_stop_backup" in phase name, maybe we
should avoid "pg_start_backup"? Then maybe:
Thanks,
Amit
On 2020/02/03 16:28, Amit Langote wrote:
On Mon, Feb 3, 2020 at 1:17 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/02/02 14:59, Masahiko Sawada wrote:
On Fri, 31 Jan 2020 at 02:29, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
+ WHEN 3 THEN 'stopping backup'::text
I'm not sure, but the "stop" seems suggesting the backup is terminated
before completion. If it is following the name of the function
pg_stop_backup, I think the name is suggesting to stop "the state for
performing backup", not a backup.In the first place, the progress is about "backup" so it seems strange
that we have another phase after the "stopping backup" phase. It
might be better that it is "finishing file transfer" or such."initializing"
-> "starting file transfer"
-> "transferring files"
-> "finishing file transfer"
-> "transaferring WAL"Better name is always welcome! If "stopping back" is confusing,
what about "performing pg_stop_backup"? Soinitializing
performing pg_start_backup
streaming database files
performing pg_stop_backup
transfering WAL filesAnother idea I came up with is to show steps that take time instead of
pg_start_backup/pg_stop_backup, for better understanding the
situation. That is, "performing checkpoint" and "performing WAL
archive" etc, which engage the most of the time of these functions.Yeah, that's an idea. ISTM that "waiting for WAL archiving" sounds
better than "performing WAL archive". Thought?
I've not applied this change in the patch yet, but if there is no
other idea, I'd like to adopt this.If we are trying to "pg_stop_backup" in phase name, maybe we should
avoid "pg_start_backup"? Then maybe:initializing
starting backup / waiting for [ backup start ] checkpoint to finish
transferring database files
finishing backup / waiting for WAL archiving to finish
transferring WAL files
So we now have the following ideas about the phase names for pg_basebackup.
1.
initializing
2.
2-1. starting backup
2-2. starting file transfer
2-3. performing pg_start_backup
2-4. performing checkpoint
2-5. waiting for [ backup start ] checkpoint to finish
3.
3-1. streaming backup
3-2. transferring database files
3-3. streaming database files
3-4. transferring files
4.
4-1. stopping backup
4-2. finishing file transfer
4-3. performing pg_stop_backup
4-4. finishing backup
4-5. waiting for WAL archiving to finish
4-6. performing WAL archive
5.
1. transferring wal
2. transferring WAL files
What conbination of these do you prefer?
Some comments on documentation changes in v2 patch:
+ Amount of data already streamed.
Ok, fixed.
"already" may be redundant. For example, in pg_start_progress_vacuum,
heap_blks_scanned is described as "...blocks scanned", not "...blocks
already scanned".+ <entry><structfield>tablespace_total</structfield></entry> + <entry><structfield>tablespace_streamed</structfield></entry>Better to use plural tablespaces_total and tablespaces_streamed for consistency?
Fixed.
+ The WAL sender process is currently performing + <function>pg_start_backup</function> and setting up for + making a base backup.How about "taking" instead of "making" in the above sentence?
Fixed. Attached is the updated version of the patch.
- <varlistentry> + <varlistentry id="protocol-replication-base-backup" xreflabel="BASE_BACKUP">I don't see any new text in the documentation patch that uses above
xref, so no need to define it?
The following description that I added uses this.
certain commands during command execution. Currently, the only commands
which support progress reporting are <command>ANALYZE</command>,
<command>CLUSTER</command>,
- <command>CREATE INDEX</command>, and <command>VACUUM</command>.
+ <command>CREATE INDEX</command>, <command>VACUUM</command>,
+ and <xref linkend="protocol-replication-base-backup"/> (i.e., replication
+ command that <xref linkend="app-pgbasebackup"/> issues to take
+ a base backup).
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
Attachments:
pg_stat_progress_basebackup_v3.patchtext/plain; charset=UTF-8; name=pg_stat_progress_basebackup_v3.patch; x-mac-creator=0; x-mac-type=0Download+263-3
On Mon, Feb 3, 2020 at 11:04 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
So we now have the following ideas about the phase names for pg_basebackup.
1.
initializing2.
2-1. starting backup
2-2. starting file transfer
2-3. performing pg_start_backup
2-4. performing checkpoint
2-5. waiting for [ backup start ] checkpoint to finish3.
3-1. streaming backup
3-2. transferring database files
3-3. streaming database files
3-4. transferring files4.
4-1. stopping backup
4-2. finishing file transfer
4-3. performing pg_stop_backup
4-4. finishing backup
4-5. waiting for WAL archiving to finish
4-6. performing WAL archive5.
1. transferring wal
2. transferring WAL filesWhat conbination of these do you prefer?
I like:
1. initializing
2-5 waiting for backup start checkpoint to finish
3-3 streaming database files
4-5 waiting for wal archiving to finish
5-1 transferring wal (or streaming wal)
- <varlistentry> + <varlistentry id="protocol-replication-base-backup" xreflabel="BASE_BACKUP">I don't see any new text in the documentation patch that uses above
xref, so no need to define it?The following description that I added uses this.
certain commands during command execution. Currently, the only commands which support progress reporting are <command>ANALYZE</command>, <command>CLUSTER</command>, - <command>CREATE INDEX</command>, and <command>VACUUM</command>. + <command>CREATE INDEX</command>, <command>VACUUM</command>, + and <xref linkend="protocol-replication-base-backup"/> (i.e., replication + command that <xref linkend="app-pgbasebackup"/> issues to take + a base backup).
Sorry, I missed that. I was mistakenly expecting a different value of linkend.
Some comments on v3:
+ <entry>Process ID of a WAL sender process.</entry>
"a" sounds redundant. Maybe:
of this WAL sender process or
of WAL sender process
Reading this:
+ <entry><structfield>backup_total</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry>
+ Total amount of data that will be streamed. If progress reporting
+ is not enabled in <application>pg_basebackup</application>
+ (i.e., <literal>--progress</literal> option is not specified),
+ this is <literal>0</literal>.
I wonder how hard would it be to change basebackup.c to always set
backup_total, irrespective of whether --progress is specified with
pg_basebackup or not? It seems quite misleading to leave it set to 0,
because one may panic that they have lost their data, that is, if they
haven't first read the documentation.
Thanks,
Amit
On Tue, Feb 4, 2020 at 10:34 AM Amit Langote <amitlangote09@gmail.com> wrote:
Reading this:
+ <entry><structfield>backup_total</structfield></entry> + <entry><type>bigint</type></entry> + <entry> + Total amount of data that will be streamed. If progress reporting + is not enabled in <application>pg_basebackup</application> + (i.e., <literal>--progress</literal> option is not specified), + this is <literal>0</literal>.I wonder how hard would it be to change basebackup.c to always set
backup_total, irrespective of whether --progress is specified with
pg_basebackup or not? It seems quite misleading to leave it set to 0,
because one may panic that they have lost their data, that is, if they
haven't first read the documentation.
For example, the attached patch changes basebackup.c to always set
tablespaceinfo.size, irrespective of whether --progress was passed
with BASE_BACKUP command. It passes make check-world, so maybe safe.
Maybe it would be a good idea to add a couple of more comments around
tablespaceinfo struct definition, such as how 'size' is to be
interpreted.
Thanks,
Amit
Attachments:
basebackup-always-set-tablespace-size.patchtext/plain; charset=US-ASCII; name=basebackup-always-set-tablespace-size.patchDownload+7-9
On 2020/02/04 10:34, Amit Langote wrote:
On Mon, Feb 3, 2020 at 11:04 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
So we now have the following ideas about the phase names for pg_basebackup.
1.
initializing2.
2-1. starting backup
2-2. starting file transfer
2-3. performing pg_start_backup
2-4. performing checkpoint
2-5. waiting for [ backup start ] checkpoint to finish3.
3-1. streaming backup
3-2. transferring database files
3-3. streaming database files
3-4. transferring files4.
4-1. stopping backup
4-2. finishing file transfer
4-3. performing pg_stop_backup
4-4. finishing backup
4-5. waiting for WAL archiving to finish
4-6. performing WAL archive5.
1. transferring wal
2. transferring WAL filesWhat conbination of these do you prefer?
I like:
Thanks for reviewing the patch!
1. initializing
2-5 waiting for backup start checkpoint to finish
Can we shorten this to "waiting for checkpoint"? IMO the simpler
phase name is better and "to finish" sounds a bit redundant. Also
in the description of pg_stat_progress_create_index, basically
"waiting for xxx" is used.
3-3 streaming database files
4-5 waiting for wal archiving to finish
Can we shorten this to "waiting for wal archiving" because of
the same reason as above?
5-1 transferring wal (or streaming wal)
IMO "transferring wal" is better because this phase happens only when
"--wal-method=fetch" is specified in pg_basebackup. "streaming wal"
seems to implie "--wal-method=stream", instead.
- <varlistentry> + <varlistentry id="protocol-replication-base-backup" xreflabel="BASE_BACKUP">I don't see any new text in the documentation patch that uses above
xref, so no need to define it?The following description that I added uses this.
certain commands during command execution. Currently, the only commands which support progress reporting are <command>ANALYZE</command>, <command>CLUSTER</command>, - <command>CREATE INDEX</command>, and <command>VACUUM</command>. + <command>CREATE INDEX</command>, <command>VACUUM</command>, + and <xref linkend="protocol-replication-base-backup"/> (i.e., replication + command that <xref linkend="app-pgbasebackup"/> issues to take + a base backup).Sorry, I missed that. I was mistakenly expecting a different value of linkend.
Some comments on v3:
+ <entry>Process ID of a WAL sender process.</entry>
"a" sounds redundant. Maybe:
of this WAL sender process or
of WAL sender process
I borrowed "Process ID of a WAL sender process" from the description
of pg_stat_replication.pid. But if it's better to get rid of "a",
I'm happy to do that!
Reading this:
+ <entry><structfield>backup_total</structfield></entry> + <entry><type>bigint</type></entry> + <entry> + Total amount of data that will be streamed. If progress reporting + is not enabled in <application>pg_basebackup</application> + (i.e., <literal>--progress</literal> option is not specified), + this is <literal>0</literal>.I wonder how hard would it be to change basebackup.c to always set
backup_total, irrespective of whether --progress is specified with
pg_basebackup or not? It seems quite misleading to leave it set to 0,
because one may panic that they have lost their data, that is, if they
haven't first read the documentation.
Yeah, I understand your concern. The pg_basebackup document explains
the risk when --progress is specified, as follows. Since I imagined that
someone may explicitly disable --progress to avoid this risk, I made
the server estimate the total size only when --progress is specified.
But you think that this overhead by --progress is negligibly small?
--------------------
When this is enabled, the backup will start by enumerating the size of
the entire database, and then go back and send the actual contents.
This may make the backup take slightly longer, and in particular it will
take longer before the first data is sent.
--------------------
If we really can always estimate the total size, whether --progress is
specified or not, we should get rid of PROGRESS option from BASE_BACKUP
replication command because it will no longer be necessary, I think.
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
On Wed, Feb 5, 2020 at 3:36 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/02/04 10:34, Amit Langote wrote:
I like:
Thanks for reviewing the patch!
1. initializing
2-5 waiting for backup start checkpoint to finishCan we shorten this to "waiting for checkpoint"? IMO the simpler
phase name is better and "to finish" sounds a bit redundant. Also
in the description of pg_stat_progress_create_index, basically
"waiting for xxx" is used.
"waiting for checkpoint" works for me.
3-3 streaming database files
4-5 waiting for wal archiving to finishCan we shorten this to "waiting for wal archiving" because of
the same reason as above?
Yes.
5-1 transferring wal (or streaming wal)
IMO "transferring wal" is better because this phase happens only when
"--wal-method=fetch" is specified in pg_basebackup. "streaming wal"
seems to implie "--wal-method=stream", instead.
Ah, okay,
Reading this:
+ <entry><structfield>backup_total</structfield></entry> + <entry><type>bigint</type></entry> + <entry> + Total amount of data that will be streamed. If progress reporting + is not enabled in <application>pg_basebackup</application> + (i.e., <literal>--progress</literal> option is not specified), + this is <literal>0</literal>.I wonder how hard would it be to change basebackup.c to always set
backup_total, irrespective of whether --progress is specified with
pg_basebackup or not? It seems quite misleading to leave it set to 0,
because one may panic that they have lost their data, that is, if they
haven't first read the documentation.Yeah, I understand your concern. The pg_basebackup document explains
the risk when --progress is specified, as follows. Since I imagined that
someone may explicitly disable --progress to avoid this risk, I made
the server estimate the total size only when --progress is specified.
But you think that this overhead by --progress is negligibly small?--------------------
When this is enabled, the backup will start by enumerating the size of
the entire database, and then go back and send the actual contents.
This may make the backup take slightly longer, and in particular it will
take longer before the first data is sent.
--------------------
Sorry, I hadn't read this before. So, my proposal would make this a lie.
Still, if "streaming database files" is the longest phase, then not
having even an approximation of how much data is to be streamed over
doesn't much help estimating progress, at least as long as one only
has this view to look at.
That said, the overhead of checking the size before sending any data
may be worse for some people than others, so having the option to
avoid that might be good after all.
Thanks,
Amit
At Wed, 5 Feb 2020 16:29:54 +0900, Amit Langote <amitlangote09@gmail.com> wrote in
On Wed, Feb 5, 2020 at 3:36 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/02/04 10:34, Amit Langote wrote:
I like:
Thanks for reviewing the patch!
1. initializing
2-5 waiting for backup start checkpoint to finishCan we shorten this to "waiting for checkpoint"? IMO the simpler
phase name is better and "to finish" sounds a bit redundant. Also
in the description of pg_stat_progress_create_index, basically
"waiting for xxx" is used."waiting for checkpoint" works for me.
I'm not sure, but doesn't that mean "waiting for a checkpoint to
start"? Sorry in advance if that is not the case.
3-3 streaming database files
4-5 waiting for wal archiving to finishCan we shorten this to "waiting for wal archiving" because of
the same reason as above?Yes.
5-1 transferring wal (or streaming wal)
IMO "transferring wal" is better because this phase happens only when
"--wal-method=fetch" is specified in pg_basebackup. "streaming wal"
seems to implie "--wal-method=stream", instead.Ah, okay,
Reading this:
+ <entry><structfield>backup_total</structfield></entry> + <entry><type>bigint</type></entry> + <entry> + Total amount of data that will be streamed. If progress reporting + is not enabled in <application>pg_basebackup</application> + (i.e., <literal>--progress</literal> option is not specified), + this is <literal>0</literal>.I wonder how hard would it be to change basebackup.c to always set
backup_total, irrespective of whether --progress is specified with
pg_basebackup or not? It seems quite misleading to leave it set to 0,
because one may panic that they have lost their data, that is, if they
haven't first read the documentation.Yeah, I understand your concern. The pg_basebackup document explains
the risk when --progress is specified, as follows. Since I imagined that
someone may explicitly disable --progress to avoid this risk, I made
the server estimate the total size only when --progress is specified.
But you think that this overhead by --progress is negligibly small?--------------------
When this is enabled, the backup will start by enumerating the size of
the entire database, and then go back and send the actual contents.
This may make the backup take slightly longer, and in particular it will
take longer before the first data is sent.
--------------------Sorry, I hadn't read this before. So, my proposal would make this a lie.
Still, if "streaming database files" is the longest phase, then not
having even an approximation of how much data is to be streamed over
doesn't much help estimating progress, at least as long as one only
has this view to look at.That said, the overhead of checking the size before sending any data
may be worse for some people than others, so having the option to
avoid that might be good after all.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Feb 5, 2020 at 5:32 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
At Wed, 5 Feb 2020 16:29:54 +0900, Amit Langote <amitlangote09@gmail.com> wrote in
On Wed, Feb 5, 2020 at 3:36 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
On 2020/02/04 10:34, Amit Langote wrote:
I like:
Thanks for reviewing the patch!
1. initializing
2-5 waiting for backup start checkpoint to finishCan we shorten this to "waiting for checkpoint"? IMO the simpler
phase name is better and "to finish" sounds a bit redundant. Also
in the description of pg_stat_progress_create_index, basically
"waiting for xxx" is used."waiting for checkpoint" works for me.
I'm not sure, but doesn't that mean "waiting for a checkpoint to
start"? Sorry in advance if that is not the case.
No, I really meant "to finish". As Sawada-san said upthread, we
should really use text that describes the activity that usually takes
long. While it takes takes only a moment to actually start the
checkpoint, it might take long for it to finish. As Fujii-san says
though we don't need the noise words "to finish".
Thanks,
Amit
On Wed, Feb 5, 2020 at 18:25 Amit Langote <amitlangote09@gmail.com> wrote:
On Wed, Feb 5, 2020 at 6:15 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:2020年2月5日(水) 17:54 Amit Langote <amitlangote09@gmail.com>:
I'm not sure, but doesn't that mean "waiting for a checkpoint to
start"? Sorry in advance if that is not the case.No, I really meant "to finish". As Sawada-san said upthread, we
should really use text that describes the activity that usually takes
long. While it takes takes only a moment to actually start the
checkpoint, it might take long for it to finish.I meant that the wording might sound as if it implies "to start", but..
Ah, I misunderstood then, sorry.
So, maybe you're saying that "waiting for checkpoint" is ambiguous and
most people will assume it means "...to start". As for me, I assume
it ends with "...to finish".As Fujii-san says
though we don't need the noise words "to finish".Understood, sorry for my noise.
Actually, that's an important point to consider and we should strive
to use words that are unambiguous.
Last two messages weren’t sent to the list.
Thanks,
Amit
Show quoted text
Import Notes
Reply to msg id not found: CA+HiwqF7JofKfB8Neupm9RiMxK_ghpD46F4FA3NjYKbVF9UYcA@mail.gmail.com
At Wed, 5 Feb 2020 18:53:19 +0900, Amit Langote <amitlangote09@gmail.com> wrote in
Last two messages weren’t sent to the list.
Oops! Sorry, I made a mistake sending the mail.
On Wed, Feb 5, 2020 at 18:25 Amit Langote <amitlangote09@gmail.com> wrote:
On Wed, Feb 5, 2020 at 6:15 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:2020年2月5日(水) 17:54 Amit Langote <amitlangote09@gmail.com>:
So, maybe you're saying that "waiting for checkpoint" is ambiguous and
most people will assume it means "...to start". As for me, I assume
it ends with "...to finish".
I'm not sure "most peple will assume" or not, so I said "I'm not
sure". For example, I feel strangeness to use "I'm waiting for Amit"
to express that I'm waiting Amit to leave there. That phrase gives me
such kind of uneasiness.
I thought of "establishing checkpoint" or "running a checkpoint" as
other candidates.
As Fujii-san says
though we don't need the noise words "to finish".Understood, sorry for my noise.
Actually, that's an important point to consider and we should strive
to use words that are unambiguous.
I think it's not ambiguous if knowing what happens during backup so my
concern was not unambiguity, but that it might give feeling of
strangeness that that sentense appears in that context.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Thu, Feb 6, 2020 at 9:51 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
On Wed, Feb 5, 2020 at 18:25 Amit Langote <amitlangote09@gmail.com> wrote:
So, maybe you're saying that "waiting for checkpoint" is ambiguous and
most people will assume it means "...to start". As for me, I assume
it ends with "...to finish".I'm not sure "most peple will assume" or not, so I said "I'm not
sure". For example, I feel strangeness to use "I'm waiting for Amit"
to express that I'm waiting Amit to leave there. That phrase gives me
such kind of uneasiness.I thought of "establishing checkpoint" or "running a checkpoint" as
other candidates.
Okay, I understand. I am fine with "running checkpoint", although I
think "waiting for checkpoint" isn't totally wrong either.
Thanks,
Amit
On Wed, Feb 5, 2020 at 4:29 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Wed, Feb 5, 2020 at 3:36 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
Yeah, I understand your concern. The pg_basebackup document explains
the risk when --progress is specified, as follows. Since I imagined that
someone may explicitly disable --progress to avoid this risk, I made
the server estimate the total size only when --progress is specified.
But you think that this overhead by --progress is negligibly small?--------------------
When this is enabled, the backup will start by enumerating the size of
the entire database, and then go back and send the actual contents.
This may make the backup take slightly longer, and in particular it will
take longer before the first data is sent.
--------------------Sorry, I hadn't read this before. So, my proposal would make this a lie.
Still, if "streaming database files" is the longest phase, then not
having even an approximation of how much data is to be streamed over
doesn't much help estimating progress, at least as long as one only
has this view to look at.That said, the overhead of checking the size before sending any data
may be worse for some people than others, so having the option to
avoid that might be good after all.
By the way, if calculating backup total size can take significantly
long in some cases, that is when requested by specifying --progress,
then it might be a good idea to define a separate phase for that, like
"estimating backup size" or some such. Currently, it's part of
"starting backup", which covers both running the checkpoint and size
estimation which run one after another.
I suspect people might never get stuck on "estimating backup size" as
they might on "running checkpoint", which perhaps only strengthens the
case for *always* calculating the size before sending the backup
header.
Thanks,
Amit
On 2020/02/06 11:07, Amit Langote wrote:
On Thu, Feb 6, 2020 at 9:51 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:On Wed, Feb 5, 2020 at 18:25 Amit Langote <amitlangote09@gmail.com> wrote:
So, maybe you're saying that "waiting for checkpoint" is ambiguous and
most people will assume it means "...to start". As for me, I assume
it ends with "...to finish".I'm not sure "most peple will assume" or not, so I said "I'm not
sure". For example, I feel strangeness to use "I'm waiting for Amit"
to express that I'm waiting Amit to leave there. That phrase gives me
such kind of uneasiness.I thought of "establishing checkpoint" or "running a checkpoint" as
other candidates.Okay, I understand. I am fine with "running checkpoint", although I
think "waiting for checkpoint" isn't totally wrong either.
Yeah, but if "waiting for XXX" sounds a bit confusing to some people,
I'm OK to back to "waiting for XXX to finish" that you originally
proposed.
Attached the updated version of the patch. This patch uses the following
descriptions of the phases.
waiting for checkpoint to finish
estimating backup size
streaming database files
waiting for wal archiving to finish
transferring wal files
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
Attachments:
pg_stat_progress_basebackup_v4.patchtext/plain; charset=UTF-8; name=pg_stat_progress_basebackup_v4.patch; x-mac-creator=0; x-mac-type=0Download+321-5
On 2020/02/06 11:35, Amit Langote wrote:
On Wed, Feb 5, 2020 at 4:29 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Wed, Feb 5, 2020 at 3:36 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
Yeah, I understand your concern. The pg_basebackup document explains
the risk when --progress is specified, as follows. Since I imagined that
someone may explicitly disable --progress to avoid this risk, I made
the server estimate the total size only when --progress is specified.
But you think that this overhead by --progress is negligibly small?--------------------
When this is enabled, the backup will start by enumerating the size of
the entire database, and then go back and send the actual contents.
This may make the backup take slightly longer, and in particular it will
take longer before the first data is sent.
--------------------Sorry, I hadn't read this before. So, my proposal would make this a lie.
Still, if "streaming database files" is the longest phase, then not
having even an approximation of how much data is to be streamed over
doesn't much help estimating progress, at least as long as one only
has this view to look at.That said, the overhead of checking the size before sending any data
may be worse for some people than others, so having the option to
avoid that might be good after all.By the way, if calculating backup total size can take significantly
long in some cases, that is when requested by specifying --progress,
then it might be a good idea to define a separate phase for that, like
"estimating backup size" or some such. Currently, it's part of
"starting backup", which covers both running the checkpoint and size
estimation which run one after another.
OK, I added this phase in the latest patch that I posted upthread.
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters