add checkpoint stats of snapshot and mapping files of pg_logical dir

Started by Bharath Rupireddy over 4 years ago; 45 messages; hackers
#1 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com

Hi,

At times, there can be many snapshot and mapping files under
pg_logical dir that the checkpoint might have to delete/fsync based on
the cutoff LSN which can increase the checkpoint time. The
user/admin/developer might want to get an answer for "what's the total
time spent in deleting these files or what's the total size of the
files deleted or what's the cutoff LSN with which these files were
deleted". How about adding these stats to existing checkpoint stats?
IMO this information will be super useful. Attaching a patch doing
that. The output looks like [1].

Thoughts?

[1]:
2021-10-29 18:48:10.731 UTC [615566] LOG: checkpoint complete: wrote
449 buffers (2.7%); 0 WAL file(s) added, 0 removed, 0 recycled;
write=0.008 s, sync=0.035 s, total=0.081 s; sync files=2,
longest=0.033 s, average=0.018 s; distance=6338 kB, estimate=6338 kB;
snapbuild snapshot file(s) removed=4, size=516 bytes, time=0.001 s,
cutoff LSN=0/14C8E08; logical rewrite mapping file(s)
removed/synced=0, size=0 bytes, time=0.001 s, cutoff LSN=0/14C8E08

Regards,
Bharath Rupireddy.

Attachments:

v1-0001-add-checkpoint-stats-of-snapshot-and-mapping-file.patch (+64 -6)
#2 Cary Huang
cary.huang@highgo.ca
In reply to: Bharath Rupireddy (#1)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested

Hi

The patch applies and tests fine. I don't think there is any harm in having a little extra statistical information about the checkpoint process. In fact, it could be useful in identifying a bottleneck during the checkpoint process, as the stats show exactly the time taken to do the file IO in the pg_logical dir.

best

Cary Huang

#3 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Cary Huang (#2)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Tue, Nov 30, 2021 at 6:24 AM Cary Huang <cary.huang@highgo.ca> wrote:

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested

Hi

The patch applies and tests fine. I don't think there is any harm in having a little extra statistical information about the checkpoint process. In fact, it could be useful in identifying a bottleneck during the checkpoint process, as the stats show exactly the time taken to do the file IO in the pg_logical dir.

Thanks for the review. Here's the CF entry
https://commitfest.postgresql.org/36/3389/; please feel free to add
yourself as a reviewer.

Regards,
Bharath Rupireddy.

#4 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#3)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

+ CheckpointStats.repl_map_cutoff_lsn = cutoff;

Could we set repl_map_cutoff_lsn closer to where it is calculated? Right
now, it's set at the bottom of the function just before the directory is
freed. Is there a strong reason to do it there?

+ "logical rewrite mapping file(s) removed/synced=" UINT64_FORMAT ", size=%zu bytes, time=%ld.%03d s, cutoff LSN=%X/%X",

Can we split out the "removed" versus "synced" files? Otherwise, I think
this is basically just telling us how many files are in the mappings
directory.

+ CheckpointStats.repl_snap_cutoff_lsn = cutoff;

I have the same note for this one as I have for repl_map_cutoff_lsn.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#5 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nathan Bossart (#4)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Sat, Mar 12, 2022 at 1:35 AM Nathan Bossart <nathandbossart@gmail.com> wrote:

+ CheckpointStats.repl_map_cutoff_lsn = cutoff;

Could we set repl_map_cutoff_lsn closer to where it is calculated? Right
now, it's set at the bottom of the function just before the directory is
freed. Is there a strong reason to do it there?

+ "logical rewrite mapping file(s) removed/synced=" UINT64_FORMAT ", size=%zu bytes, time=%ld.%03d s, cutoff LSN=%X/%X",

Can we split out the "removed" versus "synced" files? Otherwise, I think
this is basically just telling us how many files are in the mappings
directory.

+ CheckpointStats.repl_snap_cutoff_lsn = cutoff;

I have the same note for this one as I have for repl_map_cutoff_lsn.

Thanks for reviewing this. I agree with all of the above suggestions
and incorporated them in the v2 patch.

Another thing I added in v2 is to not emit snapshot and mapping files
stats in case of restartpoint as logical decoding isn't supported on
standbys, so it doesn't make sense to emit the stats there and cause
server log to grow unnecessarily. Having said that, I added a note
there to change it whenever logical decoding on standbys is supported.

Please review v2.

Regards,
Bharath Rupireddy.

Attachments:

v2-0001-add-checkpoint-stats-of-snapshot-and-mapping-file.patch (+67 -4)
#6 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#5)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Sun, Mar 13, 2022 at 01:54:10PM +0530, Bharath Rupireddy wrote:

Thanks for reviewing this. I agree with all of the above suggestions
and incorporated them in the v2 patch.

Thanks for the new patch.

Another thing I added in v2 is to not emit snapshot and mapping files
stats in case of restartpoint as logical decoding isn't supported on
standbys, so it doesn't make sense to emit the stats there and cause
server log to grow unnecessarily. Having said that, I added a note
there to change it whenever logical decoding on standbys is supported.

I think we actually do want to include this information for restartpoints
since some files might be left over from the snapshot that was used to
create the standby. Also, presumably these functions could do some work
during recovery on a primary.

Another problem I see is that this patch depends on the return value of the
lstat() calls that we are trying to remove in 0001 from another thread [0].
I think the size of the removed/sync'd files is somewhat useful for
understanding disk space usage, but I suspect the time spent performing
these tasks is more closely related to the number of files. Do you think
reporting the sizes is worth the extra system call?

[0]: /messages/by-id/20220215231123.GA2584239@nathanxps13

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#7 Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#6)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Sun, Mar 13, 2022 at 02:58:58PM -0700, Nathan Bossart wrote:

On Sun, Mar 13, 2022 at 01:54:10PM +0530, Bharath Rupireddy wrote:

Another thing I added in v2 is to not emit snapshot and mapping files
stats in case of restartpoint as logical decoding isn't supported on
standbys, so it doesn't make sense to emit the stats there and cause
server log to grow unnecessarily. Having said that, I added a note
there to change it whenever logical decoding on standbys is supported.

I think we actually do want to include this information for restartpoints
since some files might be left over from the snapshot that was used to
create the standby. Also, presumably these functions could do some work
during recovery on a primary.

Yes, I would agree that consistency makes sense here, and it is not
complicated to add the code to support this code path anyway. There
is a risk that folks working on logical decoding on standbys overlook
this code path, instead.

Another problem I see is that this patch depends on the return value of the
lstat() calls that we are trying to remove in 0001 from another thread [0].
I think the size of the removed/sync'd files is somewhat useful for
understanding disk space usage, but I suspect the time spent performing
these tasks is more closely related to the number of files. Do you think
reporting the sizes is worth the extra system call?

We are not talking about files that are large either, are we?

Another thing I am a bit annoyed with in this patch is the fact that
the size of the ereport() call is doubled. The LOG currently
generated is already bloated, and this does not arrange things.
--
Michael

#8 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#7)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Mon, Mar 14, 2022 at 10:45 AM Michael Paquier <michael@paquier.xyz> wrote:

On Sun, Mar 13, 2022 at 02:58:58PM -0700, Nathan Bossart wrote:

On Sun, Mar 13, 2022 at 01:54:10PM +0530, Bharath Rupireddy wrote:

Another thing I added in v2 is to not emit snapshot and mapping files
stats in case of restartpoint as logical decoding isn't supported on
standbys, so it doesn't make sense to emit the stats there and cause
server log to grow unnecessarily. Having said that, I added a note
there to change it whenever logical decoding on standbys is supported.

I think we actually do want to include this information for restartpoints
since some files might be left over from the snapshot that was used to
create the standby. Also, presumably these functions could do some work
during recovery on a primary.

Yes, I would agree that consistency makes sense here, and it is not
complicated to add the code to support this code path anyway. There
is a risk that folks working on logical decoding on standbys overlook
this code path, instead.

I agree to be consistent and emit the message even in case of restartpoint.

Another problem I see is that this patch depends on the return value of the
lstat() calls that we are trying to remove in 0001 from another thread [0].
I think the size of the removed/sync'd files is somewhat useful for
understanding disk space usage, but I suspect the time spent performing
these tasks is more closely related to the number of files. Do you think
reporting the sizes is worth the extra system call?

We are not talking about files that are large either, are we?

Another thing I am a bit annoyed with in this patch is the fact that
the size of the ereport() call is doubled. The LOG currently
generated is already bloated, and this does not arrange things.

Yes, this is a concern. Also, when there were no logical replication
slots on a plain server or the server removed or cleaned up all the
snapshot/mappings files, why would anyone want to have these messages
with all 0s in the server log?

Here's what I'm thinking:

Leave the existing "checkpoint/restartpoint complete" messages intact,
add the following in LogCheckpointEnd:

    if (CheckpointStats.repl_map_files_rmvd_cnt ||
        CheckpointStats.repl_map_files_syncd_cnt ||
        CheckpointStats.repl_snap_files_rmvd_cnt)
    {
        ereport(LOG,
                (errmsg("snapbuild snapshot file(s) removed=" UINT64_FORMAT
                        ", size=%zu bytes, time=%ld.%03d s, cutoff LSN=%X/%X; "
                        "logical rewrite mapping file(s) removed=" UINT64_FORMAT
                        ", size=%zu bytes, synced=" UINT64_FORMAT
                        ", size=%zu bytes, time=%ld.%03d s, cutoff LSN=%X/%X",
                        CheckpointStats.repl_snap_files_rmvd_cnt,
                        CheckpointStats.repl_snap_files_rmvd_sz,
                        repl_snap_msecs / 1000, (int) (repl_snap_msecs % 1000),
                        LSN_FORMAT_ARGS(CheckpointStats.repl_snap_cutoff_lsn),
                        CheckpointStats.repl_map_files_rmvd_cnt,
                        CheckpointStats.repl_map_files_rmvd_sz,
                        CheckpointStats.repl_map_files_syncd_cnt,
                        CheckpointStats.repl_map_files_syncd_sz,
                        repl_map_msecs / 1000, (int) (repl_map_msecs % 1000),
                        LSN_FORMAT_ARGS(CheckpointStats.repl_map_cutoff_lsn))));
    }

Thoughts?

Regards,
Bharath Rupireddy.

#9 Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#8)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Mon, Mar 14, 2022 at 10:54:56AM +0530, Bharath Rupireddy wrote:

Yes, this is a concern. Also, when there were no logical replication
slots on a plain server or the server removed or cleaned up all the
snapshot/mappings files, why would anyone want to have these messages
with all 0s in the server log?

The default settings don't enable that, so making things conditional
roughly as you are suggesting with two different LOG messages sounds
rather fine.

Here's what I'm thinking:

Leave the existing "checkpoint/restartpoint complete" messages intact,
add the following in LogCheckpointEnd:

FWIW, I also think that it would be good to check if there are cases
where this information is significant enough that its inclusion makes
sense. In the top message of the thread, the logs you showed refer to
operations that represent 1/2ms worth of checkpoint. So, if in most
cases this is going to be very quick, adding it to the logs won't
matter because that's not a performance bottleneck. Perhaps that's
something the patch that works on progress reporting for checkpoint
is able to track?
--
Michael

#10 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#9)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Mon, Mar 14, 2022 at 1:33 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Mar 14, 2022 at 10:54:56AM +0530, Bharath Rupireddy wrote:

Yes, this is a concern. Also, when there were no logical replication
slots on a plain server or the server removed or cleaned up all the
snapshot/mappings files, why would anyone want to have these messages
with all 0s in the server log?

The default settings don't enable that, so making things conditional
roughly as you are suggesting with two different LOG messages sounds
rather fine.

Here's what I'm thinking:

Leave the existing "checkpoint/restartpoint complete" messages intact,
add the following in LogCheckpointEnd:

FWIW, I also think that it would be good to check if there are cases
where this information is significant enough that its inclusion makes
sense. In the top message of the thread, the logs you showed refer to
operations that represent 1/2ms worth of checkpoint. So, if in most
cases this is going to be very quick, adding it to the logs won't
matter because that's not a performance bottleneck.

At times, the snapshot or mapping files can be large in number, and on
some platforms it takes time for the checkpoint to process all of them.
Having the stats about them in server logs can help us better analyze
why checkpoint took a long time and provide a better RCA.

Perhaps that's
something the patch that works on progress reporting for checkpoint
is able to track?

pg_stat_progress_checkpoint will give the info about the on-going
checkpoint but the info in the server logs will help analyze the
issues after they occur.

Regards,
Bharath Rupireddy.

#11 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#9)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Mon, Mar 14, 2022 at 1:33 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Mar 14, 2022 at 10:54:56AM +0530, Bharath Rupireddy wrote:

Yes, this is a concern. Also, when there were no logical replication
slots on a plain server or the server removed or cleaned up all the
snapshot/mappings files, why would anyone want to have these messages
with all 0s in the server log?

The default settings don't enable that, so making things conditional
roughly as you are suggesting with two different LOG messages sounds
rather fine.

Attaching v3 patch which emits logs only when necessary and doesn't
clutter the existing "checkpoint/restartpoint completed" message, see
some sample logs at [1]. Please review it further.

Just another thought - after looking at the sample logs at [1], one can
ask: why can't the checkpoint logs be like [2] or [3]?

[1]:
2022-03-14 11:20:28.381 UTC [3257102] LOG: checkpoint starting:
immediate force wait
2022-03-14 11:20:28.444 UTC [3257102] LOG: checkpoint complete: wrote
0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled;
write=0.001 s, sync=0.001 s, total=0.064 s; sync files=0,
longest=0.000 s, average=0.000 s; distance=0 kB, estimate=35717 kB
2022-03-14 11:20:28.444 UTC [3257102] LOG: snapbuild snapshot file(s)
removed=2, size=280 bytes, time=0.001 s, cutoff LSN=0/40D2F40
2022-03-14 11:20:28.444 UTC [3257102] LOG: logical rewrite mapping
file(s) removed=16, size=1368 bytes, synced=16, size=1368 bytes,
time=0.047 s, cutoff LSN=0/40D2F40

[2]:
2022-03-14 11:20:28.381 UTC [3257102] LOG: checkpoint starting:
immediate force wait
2022-03-14 11:20:28.444 UTC [3257102] LOG: wrote 0 buffers (0.0%); 0
WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s,
total=0.064 s; sync files=0, longest=0.000 s, average=0.000 s;
distance=0 kB, estimate=35717 kB
2022-03-14 11:20:28.444 UTC [3257102] LOG: snapbuild snapshot file(s)
removed=2, size=280 bytes, time=0.001 s, cutoff LSN=0/40D2F40
2022-03-14 11:20:28.444 UTC [3257102] LOG: logical rewrite mapping
file(s) removed=16, size=1368 bytes, synced=16, size=1368 bytes,
time=0.047 s, cutoff LSN=0/40D2F40
2022-03-14 11:20:28.444 UTC [3257102] LOG: checkpoint complete:
immediate force wait

[3]:
2022-03-14 11:20:28.381 UTC [3257102] LOG: checkpoint starting:
immediate force wait
2022-03-14 11:20:28.444 UTC [3257102] LOG: wrote 0 buffers (0.0%); 0
WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s,
total=0.064 s; sync files=0, longest=0.000 s, average=0.000 s;
distance=0 kB, estimate=35717 kB
2022-03-14 11:20:28.444 UTC [3257102] LOG: snapbuild snapshot file(s)
removed=2, size=280 bytes, time=0.001 s, cutoff LSN=0/40D2F40
2022-03-14 11:20:28.444 UTC [3257102] LOG: logical rewrite mapping
file(s) removed=16, size=1368 bytes, synced=16, size=1368 bytes,
time=0.047 s, cutoff LSN=0/40D2F40
2022-03-14 11:20:28.444 UTC [3257102] LOG: checkpoint complete

Regards,
Bharath Rupireddy.

Attachments:

v3-0001-Add-checkpoint-stats-of-snapshot-and-mapping-file.patch (+73 -1)
#12 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#11)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Mon, Mar 14, 2022 at 05:03:59PM +0530, Bharath Rupireddy wrote:

On Mon, Mar 14, 2022 at 1:33 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Mar 14, 2022 at 10:54:56AM +0530, Bharath Rupireddy wrote:

Yes, this is a concern. Also, when there were no logical replication
slots on a plain server or the server removed or cleaned up all the
snapshot/mappings files, why would anyone want to have these messages
with all 0s in the server log?

The default settings don't enable that, so making things conditional
roughly as you are suggesting with two different LOG messages sounds
rather fine.

Attaching v3 patch which emits logs only when necessary and doesn't
clutter the existing "checkpoint/restartpoint completed" message, see
some sample logs at [1]. Please review it further.

I'm okay with not adding these statistics when the number of files removed
and sync'd is zero.

IMO we shouldn't include the size of the files since it will prevent us
from removing lstat() calls. As noted upthread, I suspect the time spent
in these tasks is more closely related to the number of files anyway.

I'm -1 on splitting these new statistics to separate LOGs. In addition to
making it more difficult to discover statistics for a given checkpoint, I
think it actually adds even more bloat to the server log. If we are
concerned about the amount of information in these LOGs, perhaps we should
adjust the format to make it more human-readable.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#13 Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#10)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Mon, Mar 14, 2022 at 03:54:19PM +0530, Bharath Rupireddy wrote:

At times, the snapshot or mapping files can be large in number, and on
some platforms it takes time for the checkpoint to process all of them.
Having the stats about them in server logs can help us better analyze
why checkpoint took a long time and provide a better RCA.

Do you have any numbers to share regarding that? Seeing information
about 1k WAL segments being recycled and/or removed by a checkpoint
makes sense where the operation takes dozens of seconds to complete,
because we can be talking about hundreds of gigs worth of files moved
around. If we are talking about 100~200 files of up to 10~20kB each
for snapshot and mapping files, the information has less value, worth
only a portion of one WAL segment.
--
Michael

#14 Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#13)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Tue, Mar 15, 2022 at 11:04:26AM +0900, Michael Paquier wrote:

On Mon, Mar 14, 2022 at 03:54:19PM +0530, Bharath Rupireddy wrote:

At times, the snapshot or mapping files can be large in number, and on
some platforms it takes time for the checkpoint to process all of them.
Having the stats about them in server logs can help us better analyze
why checkpoint took a long time and provide a better RCA.

Do you have any numbers to share regarding that? Seeing information
about 1k WAL segments being recycled and/or removed by a checkpoint
makes sense where the operation takes dozens of seconds to complete,
because we can be talking about hundreds of gigs worth of files moved
around. If we are talking about 100~200 files of up to 10~20kB each
for snapshot and mapping files, the information has less value, worth
only a portion of one WAL segment.

I don't have specific numbers to share, but as noted elsewhere [0], I
routinely see lengthy checkpoints that spend a lot of time in these cleanup
tasks.

[0]: /messages/by-id/18ED8B1F-7F5B-4ABF-848D-45916C938BC7@amazon.com

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#15 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nathan Bossart (#14)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Wed, Mar 16, 2022 at 1:47 AM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Tue, Mar 15, 2022 at 11:04:26AM +0900, Michael Paquier wrote:

On Mon, Mar 14, 2022 at 03:54:19PM +0530, Bharath Rupireddy wrote:

At times, the snapshot or mapping files can be large in number, and on
some platforms it takes time for the checkpoint to process all of them.
Having the stats about them in server logs can help us better analyze
why checkpoint took a long time and provide a better RCA.

Do you have any numbers to share regarding that? Seeing information
about 1k WAL segments being recycled and/or removed by a checkpoint
makes sense where the operation takes dozens of seconds to complete,
because we can be talking about hundreds of gigs worth of files moved
around. If we are talking about 100~200 files of up to 10~20kB each
for snapshot and mapping files, the information has less value, worth
only a portion of one WAL segment.

I don't have specific numbers to share, but as noted elsewhere [0], I
routinely see lengthy checkpoints that spend a lot of time in these cleanup
tasks.

[0] /messages/by-id/18ED8B1F-7F5B-4ABF-848D-45916C938BC7@amazon.com

I too don't have numbers in hand to share. But I have come across
cases where checkpoints spend a good amount of time cleaning
up/processing snapshots or mapping files.

Regards,
Bharath Rupireddy.

#16 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nathan Bossart (#12)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Mon, Mar 14, 2022 at 10:34 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:

Attaching v3 patch which emits logs only when necessary and doesn't
clutter the existing "checkpoint/restartpoint completed" message, see
some sample logs at [1]. Please review it further.

I'm okay with not adding these statistics when the number of files removed
and sync'd is zero.

Thanks.

IMO we shouldn't include the size of the files since it will prevent us
from removing lstat() calls. As noted upthread, I suspect the time spent
in these tasks is more closely related to the number of files anyway.

Yeah. I will remove them.

I'm -1 on splitting these new statistics to separate LOGs. In addition to
making it more difficult to discover statistics for a given checkpoint, I
think it actually adds even more bloat to the server log. If we are
concerned about the amount of information in these LOGs, perhaps we should
adjust the format to make it more human-readable.

Below are the ways that I can think of. Thoughts?

1)
if (restartpoint)
{
if (snap_files_rmvd > 0 && map_files_rmvd > 0)
existing "restartpoint complete" message + "snapshot files
removed cnt, time, cutoff lsn" + "mapping files removed cnt, time,
cutoff lsn"
else if (snap_files_rmvd > 0)
existing "restartpoint complete" message + "snapshot files
removed cnt, time, cutoff lsn"
else if (map_files_rmvd > 0)
existing "restartpoint complete" message + "mapping files removed
cnt, time, cutoff lsn"
else
existing "restartpoint complete" message
}
else
{
same as above but with "checkpoint complete" instead of "restartpoint complete"
}

2) Use StringInfoData to prepare the message dynamically but this will
create message translation problems.

3) Emit the "snapshot files removed cnt, time, cutoff lsn" and
"mapping files removed cnt, time, cutoff lsn" at LOG level if
(log_checkpoints) in CheckPointSnapBuild and
CheckPointLogicalRewriteHeap respectively.

4) Have separate log messages in LogCheckpointEnd:
if (snap_files_rmvd > 0)
"snapshot files removed cnt, time, cutoff lsn"
if (map_files_rmvd > 0)
"mapping files removed cnt, time, cutoff lsn"

if (restartpoint)
existing "restartpoint complete" message
else
existing "checkpoint complete" message

Regards,
Bharath Rupireddy.

#17 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#16)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Wed, Mar 16, 2022 at 03:02:41PM +0530, Bharath Rupireddy wrote:

On Mon, Mar 14, 2022 at 10:34 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:

I'm -1 on splitting these new statistics to separate LOGs. In addition to
making it more difficult to discover statistics for a given checkpoint, I
think it actually adds even more bloat to the server log. If we are
concerned about the amount of information in these LOGs, perhaps we should
adjust the format to make it more human-readable.

Below are the ways that I can think of. Thoughts?

I think I prefer the first option. If these tasks don't do anything, we
leave the stats out of the message. If they do some work, we add them.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#18 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nathan Bossart (#17)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Thu, Mar 17, 2022 at 12:12 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:

On Wed, Mar 16, 2022 at 03:02:41PM +0530, Bharath Rupireddy wrote:

On Mon, Mar 14, 2022 at 10:34 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:

I'm -1 on splitting these new statistics to separate LOGs. In addition to
making it more difficult to discover statistics for a given checkpoint, I
think it actually adds even more bloat to the server log. If we are
concerned about the amount of information in these LOGs, perhaps we should
adjust the format to make it more human-readable.

Below are the ways that I can think of. Thoughts?

I think I prefer the first option. If these tasks don't do anything, we
leave the stats out of the message. If they do some work, we add them.

Thanks Nathan.

Below are the ways that I can think of. Thoughts?

1)
if (restartpoint)
{
    if (snap_files_rmvd > 0 && map_files_rmvd > 0)
        existing "restartpoint complete" message + "snapshot files
        removed cnt, time, cutoff lsn" + "mapping files removed cnt,
        time, cutoff lsn"
    else if (snap_files_rmvd > 0)
        existing "restartpoint complete" message + "snapshot files
        removed cnt, time, cutoff lsn"
    else if (map_files_rmvd > 0)
        existing "restartpoint complete" message + "mapping files
        removed cnt, time, cutoff lsn"
    else
        existing "restartpoint complete" message
}
else
{
    same as above but with "checkpoint complete" instead of
    "restartpoint complete"
}

2) Use StringInfoData to prepare the message dynamically but this will
create message translation problems.

3) Emit the "snapshot files removed cnt, time, cutoff lsn" and
"mapping files removed cnt, time, cutoff lsn" at LOG level if
(log_checkpoints) in CheckPointSnapBuild and
CheckPointLogicalRewriteHeap respectively.

4) Have separate log messages in LogCheckpointEnd:
if (snap_files_rmvd > 0)
    "snapshot files removed cnt, time, cutoff lsn"
if (map_files_rmvd > 0)
    "mapping files removed cnt, time, cutoff lsn"

if (restartpoint)
    existing "restartpoint complete" message
else
    existing "checkpoint complete" message

FWIW, I'm attaching patches as follows:
v4-0001...patch implements option (4)
v4-0002...txt implements option (3)
v4-0003...txt implements option (1)

Personally, I tend to prefer v4-0001 (option (4)) or v4-0002 (option
(3)) over v4-0003 (option (1)): option (1) is less readable, since most
of the code is duplicated, which creates a larger diff against previous
versions and maintainability problems. Having said that, I will leave
it to the committer to decide.

Regards,
Bharath Rupireddy.

Attachments:

v4-0002-Add-checkpoint-stats-of-snapshot-and-mapping-fi.txt (+55 -1)
v4-0001-Add-checkpoint-stats-of-snapshot-and-mapping-fil.patch (+65 -1)
v4-0003-Add-checkpoint-stats-of-snapshot-and-mapping-file.txt (+290 -39)
#19 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#18)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Thu, Mar 17, 2022 at 06:48:43PM +0530, Bharath Rupireddy wrote:

Personally, I tend to prefer v4-0001 (option (4)) or v4-0002 (option
(3)) over v4-0003 (option (1)): option (1) is less readable, since most
of the code is duplicated, which creates a larger diff against previous
versions and maintainability problems. Having said that, I will leave
it to the committer to decide.

I don't think v4-0003/option 1 needs to be unreadable. Perhaps we can use
a StringInfo to build the message. That might be a net improvement in
readability anyway.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#20 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nathan Bossart (#19)
Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

On Fri, Mar 18, 2022 at 2:15 AM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Thu, Mar 17, 2022 at 06:48:43PM +0530, Bharath Rupireddy wrote:

Personally, I tend to agree with v4-0001 (option (4)) or v4-0002
(option (3)) than v4-0003 (option (1)) as it adds more unreadability
with most of the code duplicated creating more diff with previous
versions and maintainability problems. Having said that, I will leave
it to the committer to decide on that.

I don't think v4-0003/option 1 needs to be unreadable. Perhaps we can use
a StringInfo to build the message. That might be a net improvement in
readability anyway.

Looks like we already use StringInfo in PerformAuthentication for
"connection" related log messages, see [1].

I will give it a try using StringInfo for the "checkpoint/restartpoint
complete" message too.

[1]:
if (Log_connections)
{
    StringInfoData logmsg;

    initStringInfo(&logmsg);
    if (am_walsender)
        appendStringInfo(&logmsg, _("replication connection authorized: user=%s"),
                         port->user_name);
    else
        appendStringInfo(&logmsg, _("connection authorized: user=%s"),
                         port->user_name);
    if (!am_walsender)
        appendStringInfo(&logmsg, _(" database=%s"), port->database_name);
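
For the "checkpoint/restartpoint complete" message, the same pattern
could look roughly like the sketch below. It is standalone rather than
backend code: a plain buffer with snprintf stands in for
StringInfoData/appendStringInfo so it compiles on its own, and
snap_files_rmvd/map_files_rmvd are hypothetical counters.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the completion message into buf, appending the new stats only
 * when the corresponding tasks did any work (mirroring option (1)).
 */
static void
build_ckpt_msg(char *buf, size_t len, int restartpoint,
               int snap_files_rmvd, int map_files_rmvd)
{
    size_t off;

    off = snprintf(buf, len, "%s complete",
                   restartpoint ? "restartpoint" : "checkpoint");
    if (snap_files_rmvd > 0 && off < len)
        off += snprintf(buf + off, len - off,
                        "; snapbuild snapshot file(s) removed=%d",
                        snap_files_rmvd);
    if (map_files_rmvd > 0 && off < len)
        off += snprintf(buf + off, len - off,
                        "; logical rewrite mapping file(s) removed/synced=%d",
                        map_files_rmvd);
}
```

As noted for option (2), with the real StringInfo approach each appended
fragment would need its own _() wrapper so the pieces stay translatable.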

Regards,
Bharath Rupireddy.

#21 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#20)
#22 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#21)
#23 Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#21)
#24 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#23)
#25 Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#24)
#26 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#24)
#27 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#25)
#28 Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#27)
#29 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#28)
#30 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#29)
#31 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#30)
#32 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#31)
#33 Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#32)
#34 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#33)
#35 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#34)
#36 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nathan Bossart (#35)
#37 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#36)
#38 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Nathan Bossart (#37)
#39 Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#38)
#40 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nathan Bossart (#37)
#41 Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#39)
#42 Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#40)
#43 Hamid Akhtar
hamid.akhtar@percona.com
In reply to: Nathan Bossart (#42)
#44 Justin Pryzby
pryzby@telsasoft.com
In reply to: Hamid Akhtar (#43)
#45 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Justin Pryzby (#44)