Fix pg_checksums progress report
Hi,
I found a problem with the pg_checksums.c.
The total_size is calculated by scanning the directory.
The current_size is calculated by scanning the files, but the current_size does not include the size of NewPages.
This may cause pg_checksums progress report to not be 100%.
I have attached a patch that fixes this.
Regards,
Shinya Kato
Attachments:
fix_pg_checksums_progress_report.patchapplication/octet-stream; name=fix_pg_checksums_progress_report.patchDownload
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 0223ee4408..f3b4751e3f 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -229,12 +229,14 @@ scan_file(const char *fn, BlockNumber segmentno)
}
blocks++;
+ current_size += r;
+
/* New pages have no checksum yet */
if (PageIsNew(header))
continue;
csum = pg_checksum_page(buf.data, blockno + segmentno * RELSEG_SIZE);
- current_size += r;
+
if (mode == PG_MODE_CHECK)
{
if (csum != header->pd_checksum)
On 2021/04/02 14:23, Shinya11.Kato@nttdata.com wrote:
Hi,
I found a problem with the pg_checksums.c.
The total_size is calculated by scanning the directory.
The current_size is calculated by scanning the files, but the current_size does not include the size of NewPages.This may cause pg_checksums progress report to not be 100%.
I have attached a patch that fixes this.
Thanks for the report and patch!
I could reproduce this issue and confirmed that the patch fixes it.
Regarding the patch, I think that it's better to add the comment about
why current_size needs to be counted including new pages.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
-----Original Message-----
From: Fujii Masao <masao.fujii@oss.nttdata.com>
Sent: Friday, April 2, 2021 2:39 PM
To: Shinya11.Kato@nttdata.com; pgsql-hackers@postgresql.org
Subject: Re: Fix pg_checksums progress reportOn 2021/04/02 14:23, Shinya11.Kato@nttdata.com wrote:
Hi,
I found a problem with the pg_checksums.c.
The total_size is calculated by scanning the directory.
The current_size is calculated by scanning the files, but the current_size doesnot include the size of NewPages.
This may cause pg_checksums progress report to not be 100%.
I have attached a patch that fixes this.Thanks for the report and patch!
I could reproduce this issue and confirmed that the patch fixes it.
Regarding the patch, I think that it's better to add the comment about why
current_size needs to be counted including new pages.
Thanks for your review.
I added a comment to the patch, and attached the new patch.
Regards,
Shinya Kato
Attachments:
fix_pg_checksums_progress_report_v2.patchapplication/octet-stream; name=fix_pg_checksums_progress_report_v2.patchDownload
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 0223ee4408..2d078d8fbb 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -229,12 +229,17 @@ scan_file(const char *fn, BlockNumber segmentno)
}
blocks++;
+ /*
+ * The current_size is calculated before checking if header is a
+ * new page, because total_size includes the size of new pages.
+ */
+ current_size += r;
+
/* New pages have no checksum yet */
if (PageIsNew(header))
continue;
csum = pg_checksum_page(buf.data, blockno + segmentno * RELSEG_SIZE);
- current_size += r;
if (mode == PG_MODE_CHECK)
{
if (csum != header->pd_checksum)
On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:
I added a comment to the patch, and attached the new patch.
Hmm. This looks to come from 280e5f14 that introduced the progress
reports so this would need a backpatch down to 12. I have not looked
in details and have not looked at the patch yet, though. Fujii-san,
are you planning to take care of that? That was my stuff originally,
so I am fine to look at it. But not now, on a Friday afternoon :)
--
Michael
On 2021/04/02 16:47, Michael Paquier wrote:
On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:
I added a comment to the patch, and attached the new patch.
Thanks for updating the patch!
+ /*
+ * The current_size is calculated before checking if header is a
+ * new page, because total_size includes the size of new pages.
+ */
+ current_size += r;
I'd like to comment more. What about the following?
---------------------------
Since the file size is counted as total_size for progress status information, the sizes of all pages including new ones in the file should be counted as current_size. Otherwise the progress reporting calculated using those counters may not reach 100%.
---------------------------
Hmm. This looks to come from 280e5f14 that introduced the progress
reports so this would need a backpatch down to 12.
Yes.
I have not looked
in details and have not looked at the patch yet, though. Fujii-san,
are you planning to take care of that?
Yes, I will. Thanks for the consideration!
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
-----Original Message-----
From: Fujii Masao <masao.fujii@oss.nttdata.com>
Sent: Friday, April 2, 2021 6:03 PM
To: Michael Paquier <michael@paquier.xyz>; Shinya11.Kato@nttdata.com
Cc: pgsql-hackers@postgresql.org
Subject: Re: Fix pg_checksums progress reportOn 2021/04/02 16:47, Michael Paquier wrote:
On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:
I added a comment to the patch, and attached the new patch.
Thanks for updating the patch!
+ /* + * The current_size is calculated before checking if header is a + * new page, because total_size includes the size of new pages. + */ + current_size += r;I'd like to comment more. What about the following?
---------------------------
Since the file size is counted as total_size for progress status information, the
sizes of all pages including new ones in the file should be counted as
current_size. Otherwise the progress reporting calculated using those counters
may not reach 100%.
---------------------------
Thanks for your review!
I updated the patch, and attached it.
Regards,
Shinya Kato
Attachments:
fix_pg_checksums_progress_report_v3.patchapplication/octet-stream; name=fix_pg_checksums_progress_report_v3.patchDownload
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 0223ee4408..831cf42d3a 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -229,12 +229,19 @@ scan_file(const char *fn, BlockNumber segmentno)
}
blocks++;
+ /*
+ * Since the file size is counted as total_size for progress status
+ * information, the sizes of all pages including new ones in the file
+ * should be counted as current_size. Otherwise the progress reporting
+ * calculated using those counters may not reach 100%.
+ */
+ current_size += r;
+
/* New pages have no checksum yet */
if (PageIsNew(header))
continue;
csum = pg_checksum_page(buf.data, blockno + segmentno * RELSEG_SIZE);
- current_size += r;
if (mode == PG_MODE_CHECK)
{
if (csum != header->pd_checksum)
On Fri, Apr 02, 2021 at 06:03:21PM +0900, Fujii Masao wrote:
On 2021/04/02 16:47, Michael Paquier wrote:
I have not looked
in details and have not looked at the patch yet, though. Fujii-san,
are you planning to take care of that?Yes, I will. Thanks for the consideration!
OK, thanks!
--
Michael
On 2021/04/02 18:19, Shinya11.Kato@nttdata.com wrote:
Thanks for your review!
I updated the patch, and attached it.
Thanks for updating the patch! Pushed.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION