Thread: Fix pg_checksums progress report

Fix pg_checksums progress report

From
Date:
Hi,

I found a problem with the pg_checksums.c.

The total_size is calculated by scanning the directory.
The current_size is calculated by scanning the files, but the current_size does not include the size of NewPages.

This may cause pg_checksums progress report to not be 100%.
I have attached a patch that fixes this.

Regards,
Shinya Kato

Attachment

Re: Fix pg_checksums progress report

From
Fujii Masao
Date:

On 2021/04/02 14:23, Shinya11.Kato@nttdata.com wrote:
> Hi,
> 
> I found a problem with the pg_checksums.c.
> 
> The total_size is calculated by scanning the directory.
> The current_size is calculated by scanning the files, but the current_size does not include the size of NewPages.
> 
> This may cause pg_checksums progress report to not be 100%.
> I have attached a patch that fixes this.

Thanks for the report and patch!

I could reproduce this issue and confirmed that the patch fixes it.

Regarding the patch, I think that it's better to add the comment about
why current_size needs to be counted including new pages.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



RE: Fix pg_checksums progress report

From
Date:
>-----Original Message-----
>From: Fujii Masao <masao.fujii@oss.nttdata.com>
>Sent: Friday, April 2, 2021 2:39 PM
>To: Shinya11.Kato@nttdata.com; pgsql-hackers@postgresql.org
>Subject: Re: Fix pg_checksums progress report
>
>
>
>On 2021/04/02 14:23, Shinya11.Kato@nttdata.com wrote:
>> Hi,
>>
>> I found a problem with the pg_checksums.c.
>>
>> The total_size is calculated by scanning the directory.
>> The current_size is calculated by scanning the files, but the current_size does
>not include the size of NewPages.
>>
>> This may cause pg_checksums progress report to not be 100%.
>> I have attached a patch that fixes this.
>
>Thanks for the report and patch!
>
>I could reproduce this issue and confirmed that the patch fixes it.
>
>Regarding the patch, I think that it's better to add the comment about why
>current_size needs to be counted including new pages.

Thanks for your review.
I added a comment to the patch, and attached the new patch.

Regards,
Shinya Kato


Attachment

Re: Fix pg_checksums progress report

From
Michael Paquier
Date:
On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:
> I added a comment to the patch, and attached the new patch.

Hmm.  This looks to come from 280e5f14 that introduced the progress
reports so this would need a backpatch down to 12.  I have not looked
in details and have not looked at the patch yet, though.  Fujii-san,
are you planning to take care of that?  That was my stuff originally,
so I am fine to look at it.  But not now, on a Friday afternoon :)
--
Michael

Attachment

Re: Fix pg_checksums progress report

From
Fujii Masao
Date:

On 2021/04/02 16:47, Michael Paquier wrote:
> On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:
>> I added a comment to the patch, and attached the new patch.

Thanks for updating the patch!

+        /*
+         * The current_size is calculated before checking if header is a
+         * new page, because total_size includes the size of new pages.
+         */
+        current_size += r;

I'd like to comment more. What about the following?

---------------------------
Since the file size is counted as total_size for progress status information, the sizes of all pages including new ones
inthe file should be counted as current_size. Otherwise the progress reporting calculated using those counters may not
reach100%.
 
---------------------------


> Hmm.  This looks to come from 280e5f14 that introduced the progress
> reports so this would need a backpatch down to 12.

Yes.


> I have not looked
> in details and have not looked at the patch yet, though.  Fujii-san,
> are you planning to take care of that?

Yes, I will. Thanks for the consideration!

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



RE: Fix pg_checksums progress report

From
Date:
>-----Original Message-----
>From: Fujii Masao <masao.fujii@oss.nttdata.com>
>Sent: Friday, April 2, 2021 6:03 PM
>To: Michael Paquier <michael@paquier.xyz>; Shinya11.Kato@nttdata.com
>Cc: pgsql-hackers@postgresql.org
>Subject: Re: Fix pg_checksums progress report
>
>
>
>On 2021/04/02 16:47, Michael Paquier wrote:
>> On Fri, Apr 02, 2021 at 07:30:32AM +0000, Shinya11.Kato@nttdata.com wrote:
>>> I added a comment to the patch, and attached the new patch.
>
>Thanks for updating the patch!
>
>+        /*
>+         * The current_size is calculated before checking if header is a
>+         * new page, because total_size includes the size of new
>pages.
>+         */
>+        current_size += r;
>
>I'd like to comment more. What about the following?
>
>---------------------------
>Since the file size is counted as total_size for progress status information, the
>sizes of all pages including new ones in the file should be counted as
>current_size. Otherwise the progress reporting calculated using those counters
>may not reach 100%.
>---------------------------

Thanks for your review!
I updated the patch, and attached it.

Regards,
Shinya Kato

Attachment

Re: Fix pg_checksums progress report

From
Michael Paquier
Date:
On Fri, Apr 02, 2021 at 06:03:21PM +0900, Fujii Masao wrote:
> On 2021/04/02 16:47, Michael Paquier wrote:
>> I have not looked
>> in details and have not looked at the patch yet, though.  Fujii-san,
>> are you planning to take care of that?
>
> Yes, I will. Thanks for the consideration!

OK, thanks!
--
Michael

Attachment

Re: Fix pg_checksums progress report

From
Fujii Masao
Date:

On 2021/04/02 18:19, Shinya11.Kato@nttdata.com wrote:
> Thanks for your review!
> I updated the patch, and attached it.

Thanks for updating the patch! Pushed.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION