On Wed, Jan 25, 2023 at 04:34:21PM +0900, Michael Paquier wrote:
> The loop part is annoying.. I've never been a fan of adding this
> cross-value checks for the archiver modules in the first place, and it
> would make things much simpler in the checkpointer if we need to think
> about that as we want these values to be reloadable. Perhaps this
> could just be an exception where we just give priority on one over the
> other archive_cleanup_command? The startup process has a well-defined
> sequence after a failure, while the checkpointer is designed to remain
> robust.
Yeah, there are some problems here. If we ERROR, we'll just bounce back to
the sigsetjmp() block once a second, and we'll never pick up configuration
reloads, shutdown signals, etc. If we FATAL, we'll just rapidly restart
over and over. Given the dicussion about misconfigured archiving
parameters [0], I doubt folks will be okay with giving priority to one or
the other.
I'm currently thinking that the checkpointer should set a flag and clear
the recovery callbacks when a misconfiguration is detected. Anytime the
checkpointer tries to use the archive-cleanup callback, a WARNING would be
emitted. This is similar to an approach I proposed for archiving
misconfigurations (that we didn't proceed with) [1]. Given the
aformentioned problems, this approach might be more suitable for the
checkpointer than it is for the archiver.
Thoughts?
[0] https://postgr.es/m/9ee5d180-2c32-a1ca-d3d7-63a723f68d9a%40enterprisedb.com
[1] https://postgr.es/m/20220914222736.GA3042279%40nathanxps13
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com