On Fri, Jul 8, 2022 at 10:44 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> > 0002 - I'm not quite happy with this patch, with the change,
> > checkpoint errors out, if it can't remove just a file - the comments
> > there says it all. Is there any strong reason for this change?
>
> Andres noted several concerns upthread. In short, ignoring unexpected
> errors makes them harder to debug and likely masks bugs.
>
> FWIW I agree that it is unfortunate that a relatively non-critical error
> here leads to checkpoint failures, which can cause much worse problems down
> the road. I think this is one of the reasons for moving tasks like this
> out of the checkpointer, as I'm trying to do with the proposed custodian
> process [0].
>
> [0] https://commitfest.postgresql.org/38/3448/
IMHO, we can keep it as-is and if required can be changed as part of
the patch set [0], as this change without [0] can cause a checkpoint
to fail. Furthermore, I would like it if we convert "could not parse
filename" and "could not remove file" ERRORs of
CheckPointLogicalRewriteHeap to LOGs until [0] gets in - others may
have different opinions though.
Just wondering - do we ever have a problem if we can't remove the
snapshot or mapping file?
May be unrelated, RemoveTempXlogFiles doesn't even bother to check if
the temp wal file is removed.
Regards,
Bharath Rupireddy.