On Fri, Jan 21, 2022 at 11:49:56AM -0800, Andres Freund wrote:
> On 2022-01-20 20:41:16 +0000, Bossart, Nathan wrote:
>> Here's this part.
>
> And pushed to all branches. Thanks.
Thanks!

I spent some time thinking about the right way to proceed here, and I came
up with the attached patches. The first patch just adds error checking for
various lstat() calls in the replication code. If lstat() fails, then it
probably doesn't make sense to try to continue processing the file.
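
For illustration only, here is a minimal standalone sketch of the kind of
check the first patch adds. It is not taken from the patch: the helper name
classify_path() is hypothetical, and fprintf() stands in for the ereport()
call the server code would presumably use. The point is just that the
lstat() return value is tested before st_mode is examined, and that a
failure stops further processing of that path.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the patch).
 *
 * Returns 1 if path is a regular file, 0 if it exists but is something
 * else (directory, symlink, ...), and -1 if lstat() itself failed.
 */
static int
classify_path(const char *path)
{
	struct stat statbuf;

	if (lstat(path, &statbuf) != 0)
	{
		/*
		 * Assumption: in the server this would be an ereport() call;
		 * fprintf() is used here so that the sketch compiles on its own.
		 */
		fprintf(stderr, "could not stat file \"%s\": %s\n",
				path, strerror(errno));
		return -1;
	}

	/* Only look at st_mode once we know lstat() filled in statbuf. */
	return S_ISREG(statbuf.st_mode) ? 1 : 0;
}
```

A caller would skip (or, per the second patch, fail on) any path for which
this returns -1 instead of going on to inspect an uninitialized statbuf.
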
The second patch raises the level of some nearby ereport() calls to ERROR.
If these failures are truly unexpected, and we don't intend to support
use-cases like concurrent manual deletion, then failing might be the right
way to go.
I think it's a shame that such failures could cause checkpointing to
continually fail, but that topic is already being discussed elsewhere [0].

[0] https://postgr.es/m/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB%40amazon.com
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com/