Thread: [PATCH]A minor improvement to the error-report in SimpleLruWriteAll()



Hi hackers,
When I read the code, I noticed that in SimpleLruWriteAll(), only the last error is 
recorded when the file fails to close. Like the following,
```void
SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
{
        SlruShared      shared = ctl->shared;
        SlruWriteAllData fdata;
        int64           pageno = 0;
        int                     prevbank = SlotGetBankNumber(0);
        bool            ok;
...
        /*
         * Now close any files that were open
         */
        ok = true;
        for (int i = 0; i < fdata.num_files; i++)
        {
                if (CloseTransientFile(fdata.fd[i]) != 0)
                {
                        slru_errcause = SLRU_CLOSE_FAILED;
                        slru_errno = errno;
                        pageno = fdata.segno[i] * SLRU_PAGES_PER_SEGMENT;
                        ok = false;
                }
        }
        if (!ok)
                SlruReportIOError(ctl, pageno, InvalidTransactionId);
```
// Here, SlruReportIOError() is called only once, meaning that the last error message is
 recorded. In my opinion, since failure to close a file is not common, logging an error 
message every time a failure occurs will not result in much log growth, but detailed error 
logging will help in most cases.

So, I changed the code to move the call to SlruReportIOError() inside the while loop.

Attached is the patch, I hope it can help.




Attachment
Hi,
Actually, I still wonder why only the error message
of the last failure to close the file was recorded.
For this unusual situation, it is acceptable to
record all failure information without causing
too much logging.
Was it designed that way on purpose?




At 2024-05-25 17:29:00, "Long Song" <songlong88@126.com> wrote:



Hi hackers,
When I read the code, I noticed that in SimpleLruWriteAll(), only the last error is 
recorded when the file fails to close. Like the following,
```void
SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
{
        SlruShared      shared = ctl->shared;
        SlruWriteAllData fdata;
        int64           pageno = 0;
        int                     prevbank = SlotGetBankNumber(0);
        bool            ok;
...
        /*
         * Now close any files that were open
         */
        ok = true;
        for (int i = 0; i < fdata.num_files; i++)
        {
                if (CloseTransientFile(fdata.fd[i]) != 0)
                {
                        slru_errcause = SLRU_CLOSE_FAILED;
                        slru_errno = errno;
                        pageno = fdata.segno[i] * SLRU_PAGES_PER_SEGMENT;
                        ok = false;
                }
        }
        if (!ok)
                SlruReportIOError(ctl, pageno, InvalidTransactionId);
```
// Here, SlruReportIOError() is called only once, meaning that the last error message is
 recorded. In my opinion, since failure to close a file is not common, logging an error 
message every time a failure occurs will not result in much log growth, but detailed error 
logging will help in most cases.

So, I changed the code to move the call to SlruReportIOError() inside the while loop.

Attached is the patch, I hope it can help.





Re: [PATCH]A minor improvement to the error-report in SimpleLruWriteAll()

From
Kyotaro Horiguchi
Date:
At Tue, 28 May 2024 20:15:59 +0800 (CST), "Long Song" <songlong88@126.com> wrote in 
> 
> Hi,
> Actually, I still wonder why only the error message
> of the last failure to close the file was recorded.
> For this unusual situation, it is acceptable to
> record all failure information without causing
> too much logging.
> Was it designed that way on purpose?

Note that SlruReportIOError() causes a non-local exit. To me, the
point of the loop seems to be that we want to close every single file,
apart from the failed ones. From that perspective, the patch disrupts
that intended behavior by exiting in the middle of the loop. It seems
we didn't want to bother collecting errors for every failed file in
that part.
 
regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Hi Kyotaro,
Thank you for the response.



At 2024-06-04 14:44:09, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote:
>At Tue, 28 May 2024 20:15:59 +0800 (CST), "Long Song" <songlong88@126.com> wrote in
>>
>> Hi,
>> Actually, I still wonder why only the error message
>> of the last failure to close the file was recorded.
>> For this unusual situation, it is acceptable to
>> record all failure information without causing
>> too much logging.
>> Was it designed that way on purpose?
>
>Note that SlruReportIOError() causes a non-local exit. To me, the
>point of the loop seems to be that we want to close every single file,
>apart from the failed ones. From that perspective, the patch disrupts
>that intended behavior by exiting in the middle of the loop. It seems
>we didn't want to bother collecting errors for every failed file in
>that part.

Yeah, thanks for your reminder.
It was my mistake not to notice the ereport() exit in the function.
But is it necessary to record it in a log? If there is a benefit to
logging, I can submit a modified patch and record the necessary
failure information into the log in another way.

>
>regards.
>
>--
>Kyotaro Horiguchi
>NTT Open Source Software Center












--
Best Regards,

Long

Hi,
I modified the patch, kept the old call of SlruReportIOError()
outside the for-loop, and add a call of erport() with LOG level
when file-close failure occurs in the for-loop.

Please check whether this modification is appropriate
and let me know if there is any problem. Thank you.



At 2024-06-04 14:44:09, "Kyotaro Horiguchi" <horikyota.ntt@gmail.com> wrote:
>At Tue, 28 May 2024 20:15:59 +0800 (CST), "Long Song" <songlong88@126.com> wrote in
>>
>> Hi,
>> Actually, I still wonder why only the error message
>> of the last failure to close the file was recorded.
>> For this unusual situation, it is acceptable to
>> record all failure information without causing
>> too much logging.
>> Was it designed that way on purpose?
>
>Note that SlruReportIOError() causes a non-local exit. To me, the
>point of the loop seems to be that we want to close every single file,
>apart from the failed ones. From that perspective, the patch disrupts
>that intended behavior by exiting in the middle of the loop. It seems
>we didn't want to bother collecting errors for every failed file in
>that part.
>
>regards.
>
>--
>Kyotaro Horiguchi
>NTT Open Source Software Center












--
Best Regards,

Long

Attachment