On Tue, Aug 09, 2022 at 09:29:16AM +0530, Bharath Rupireddy wrote:
> On Tue, Aug 9, 2022 at 9:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmmm ... I'll grant that ignoring lstat errors altogether isn't great.
>> But should the replacement behavior be elog-LOG-and-press-on,
>> or elog-ERROR-and-fail-the-surrounding-operation? I'm not in any
>> hurry to believe that the latter is more appropriate without some
>> analysis of what the callers are doing.
>>
>> The bottom line here is that I'm distrustful of behavioral changes
>> introduced to simplify refactoring rather than to solve a live
>> problem.
>
> +1. I agree with Tom that we should not change elog-LOG to elog-ERROR
> and fail the checkpoint operation, because completing the checkpoint
> is more important than whether a single snapshot file (out of
> thousands or even millions of files) is removed at that moment. Also,
> I originally proposed changing elog-ERROR to elog-LOG in
> CheckPointLogicalRewriteHeap for unlink() failures for the same
> reason.

This was my initial instinct as well, but this thread has received
contradictory feedback in the months since.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com