Fix pg_checksums progress report

Started by Shinya11.Kato@nttdata.comabout 5 years ago8 messageshackers
Jump to latest
#1Shinya11.Kato@nttdata.com
Shinya11.Kato@nttdata.com

Hi,

I found a problem with the pg_checksums.c.

The total_size is calculated by scanning the directory.
The current_size is calculated by scanning the files, but the current_size does not include the size of NewPages.

This may cause pg_checksums progress report to not be 100%.
I have attached a patch that fixes this.

Regards,
Shinya Kato

Attachments:

fix_pg_checksums_progress_report.patchapplication/octet-stream; name=fix_pg_checksums_progress_report.patchDownload+3-1
#2Fujii Masao
masao.fujii@gmail.com
In reply to: Shinya11.Kato@nttdata.com (#1)
Re: Fix pg_checksums progress report

On 2021/04/02 14:23, Shinya11.Kato@nttdata.com wrote:

Hi,

I found a problem with the pg_checksums.c.

The total_size is calculated by scanning the directory.
The current_size is calculated by scanning the files, but the current_size does not include the size of NewPages.

This may cause pg_checksums progress report to not be 100%.
I have attached a patch that fixes this.

Thanks for the report and patch!

I could reproduce this issue and confirmed that the patch fixes it.

Regarding the patch, I think that it's better to add the comment about
why current_size needs to be counted including new pages.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#3Shinya11.Kato@nttdata.com
Shinya11.Kato@nttdata.com
In reply to: Fujii Masao (#2)
RE: Fix pg_checksums progress report

-----Original Message-----
From: Fujii Masao <masao.fujii@oss.nttdata.com>
Sent: Friday, April 2, 2021 2:39 PM
To: Shinya11.Kato@nttdata.com; pgsql-hackers@postgresql.org
Subject: Re: Fix pg_checksums progress report

On 2021/04/02 14:23, Shinya11.Kato@nttdata.com wrote:

Hi,

I found a problem with the pg_checksums.c.

The total_size is calculated by scanning the directory.
The current_size is calculated by scanning the files, but the current_size does

not include the size of NewPages.

This may cause pg_checksums progress report to not be 100%.
I have attached a patch that fixes this.

Thanks for the report and patch!

I could reproduce this issue and confirmed that the patch fixes it.

Regarding the patch, I think that it's better to add the comment about why
current_size needs to be counted including new pages.

Thanks for your review.
I added a comment to the patch, and attached the new patch.

Regards,
Shinya Kato

Attachments:

fix_pg_checksums_progress_report_v2.patchapplication/octet-stream; name=fix_pg_checksums_progress_report_v2.patchDownload+6-1
#4Michael Paquier
michael@paquier.xyz
In reply to: Shinya11.Kato@nttdata.com (#3)
Re: Fix pg_checksums progress report

On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:

I added a comment to the patch, and attached the new patch.

Hmm. This looks to come from 280e5f14 that introduced the progress
reports so this would need a backpatch down to 12. I have not looked
in details and have not looked at the patch yet, though. Fujii-san,
are you planning to take care of that? That was my stuff originally,
so I am fine to look at it. But not now, on a Friday afternoon :)
--
Michael

#5Fujii Masao
masao.fujii@gmail.com
In reply to: Michael Paquier (#4)
Re: Fix pg_checksums progress report

On 2021/04/02 16:47, Michael Paquier wrote:

On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:

I added a comment to the patch, and attached the new patch.

Thanks for updating the patch!

+		/*
+		 * The current_size is calculated before checking if header is a
+		 * new page, because total_size includes the size of new pages.
+		 */
+		current_size += r;

I'd like to comment more. What about the following?

---------------------------
Since the file size is counted as total_size for progress status information, the sizes of all pages including new ones in the file should be counted as current_size. Otherwise the progress reporting calculated using those counters may not reach 100%.
---------------------------

Hmm. This looks to come from 280e5f14 that introduced the progress
reports so this would need a backpatch down to 12.

Yes.

I have not looked
in details and have not looked at the patch yet, though. Fujii-san,
are you planning to take care of that?

Yes, I will. Thanks for the consideration!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#6Shinya11.Kato@nttdata.com
Shinya11.Kato@nttdata.com
In reply to: Fujii Masao (#5)
RE: Fix pg_checksums progress report

-----Original Message-----
From: Fujii Masao <masao.fujii@oss.nttdata.com>
Sent: Friday, April 2, 2021 6:03 PM
To: Michael Paquier <michael@paquier.xyz>; Shinya11.Kato@nttdata.com
Cc: pgsql-hackers@postgresql.org
Subject: Re: Fix pg_checksums progress report

On 2021/04/02 16:47, Michael Paquier wrote:

On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:

I added a comment to the patch, and attached the new patch.

Thanks for updating the patch!

+		/*
+		 * The current_size is calculated before checking if header is a
+		 * new page, because total_size includes the size of new
pages.
+		 */
+		current_size += r;

I'd like to comment more. What about the following?

---------------------------
Since the file size is counted as total_size for progress status information, the
sizes of all pages including new ones in the file should be counted as
current_size. Otherwise the progress reporting calculated using those counters
may not reach 100%.
---------------------------

Thanks for your review!
I updated the patch, and attached it.

Regards,
Shinya Kato

Attachments:

fix_pg_checksums_progress_report_v3.patchapplication/octet-stream; name=fix_pg_checksums_progress_report_v3.patchDownload+8-1
#7Michael Paquier
michael@paquier.xyz
In reply to: Fujii Masao (#5)
Re: Fix pg_checksums progress report

On Fri, Apr 02, 2021 at 06:03:21PM +0900, Fujii Masao wrote:

On 2021/04/02 16:47, Michael Paquier wrote:

I have not looked
in details and have not looked at the patch yet, though. Fujii-san,
are you planning to take care of that?

Yes, I will. Thanks for the consideration!

OK, thanks!
--
Michael

#8Fujii Masao
masao.fujii@gmail.com
In reply to: Shinya11.Kato@nttdata.com (#6)
Re: Fix pg_checksums progress report

On 2021/04/02 18:19, Shinya11.Kato@nttdata.com wrote:

Thanks for your review!
I updated the patch, and attached it.

Thanks for updating the patch! Pushed.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION