On Fri, Jan 28, 2022 at 03:20:41PM -0500, Robert Haas wrote:
> On Fri, Jan 28, 2022 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> I discussed the two main deficiencies I'm aware of with basic_archive
>> earlier [0]. The first one is the issue with "inconvenient" server crashes
>> (mentioned below).
>
> Seems easy enough to rectify, if it's just a matter of silently-succeed-if-same.

Yes.

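To make "silently-succeed-if-same" concrete, here is a rough standalone sketch of the comparison step. This is not basic_archive code: files_identical() is a hypothetical helper name, and it assumes both paths are regular files readable with plain stdio.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if both files have identical contents,
 * 0 if they differ, and -1 if either file cannot be opened or read.
 */
static int
files_identical(const char *path_a, const char *path_b)
{
    FILE   *fa;
    FILE   *fb;
    char    buf_a[8192];
    char    buf_b[8192];
    int     result = 1;

    assert(path_a != NULL && path_b != NULL);

    fa = fopen(path_a, "rb");
    fb = fopen(path_b, "rb");
    if (fa == NULL || fb == NULL)
        result = -1;

    while (result == 1)
    {
        /* For regular files, a short read only happens at EOF or on error. */
        size_t  na = fread(buf_a, 1, sizeof(buf_a), fa);
        size_t  nb = fread(buf_b, 1, sizeof(buf_b), fb);

        if (ferror(fa) || ferror(fb))
            result = -1;
        else if (na != nb || memcmp(buf_a, buf_b, na) != 0)
            result = 0;
        else if (na == 0)
            break;              /* both at EOF with no differences seen */
    }

    if (fa != NULL)
        fclose(fa);
    if (fb != NULL)
        fclose(fb);
    return result;
}
```

The idea would be that if the destination file already exists and this returns 1, the archive call reports success; any other result is still an error.
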
>> The second is that there is no handling for multiple
>> servers writing to the same location since the temporary file is always
>> named "archtemp". I thought about a few ways to pick a unique file name
>> (or at least one that is _probably_ unique), but that began adding a lot of
>> complexity for something I intended as a test module. I can spend some
>> more time on this if you think it's worth fixing for a contrib module.
>
> Well, my first thought was to wonder whether we even care about that
> scenario, but I guess we probably do, at least a little bit.
>
> How about:
>
> 1. Name temporary files like
> archive_temp.${FINAL_NAME}.${PID}.${SOME_RANDOM_NUMBER}. Create them
> with O_EXCL. If it fails, die.
>
> 2. Try not to leave files like this behind, perhaps installing an
> on_proc_exit callback or similar, but accept that crashes and unlink()
> failures will make it inevitable in some cases.
>
> 3. Document that crashes or other strange failure cases may leave
> archive_temp.* files behind in the archive directory, and recommend
> that users remove them before restarting the database after a crash
> (or, with caution, remove them while the database is running if the
> user is sure that the files are old and unrelated to any archiving
> still in progress).
>
> I'm not arguing that this is exactly the right idea. But I am arguing
> that it shouldn't take a ton of engineering to come up with something
> reasonable here.

This is roughly what I had in mind. I'll give it a try.

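For the record, steps 1 and 2 could look something like the standalone POSIX sketch below. Everything here is an assumption for illustration rather than the eventual patch: create_unique_temp(), cleanup_pending_temp(), and pending_temp are hypothetical names, atexit() stands in for an on_proc_exit callback, and unseeded rand() stands in for whatever random source we end up using.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Path of the temp file to remove at exit; empty string if there is none. */
static char pending_temp[1024];

/* Best-effort cleanup; a crash or unlink() failure still leaves the file. */
static void
cleanup_pending_temp(void)
{
    if (pending_temp[0] != '\0')
    {
        (void) unlink(pending_temp);
        pending_temp[0] = '\0';
    }
}

/*
 * Create "<dir>/archive_temp.<final_name>.<pid>.<random>" exclusively.
 * Returns an open file descriptor, or -1 with errno set.
 */
static int
create_unique_temp(const char *dir, const char *final_name)
{
    static int registered = 0;
    int     n;
    int     fd;

    assert(dir != NULL && final_name != NULL);

    if (!registered)
    {
        atexit(cleanup_pending_temp);   /* stand-in for on_proc_exit */
        registered = 1;
    }

    /* rand() is unseeded here; the PID carries most of the uniqueness. */
    n = snprintf(pending_temp, sizeof(pending_temp),
                 "%s/archive_temp.%s.%ld.%d",
                 dir, final_name, (long) getpid(), rand());
    if (n < 0 || (size_t) n >= sizeof(pending_temp))
    {
        pending_temp[0] = '\0';
        errno = ENAMETOOLONG;
        return -1;
    }

    /* O_EXCL makes open() fail with EEXIST if the name is already taken. */
    fd = open(pending_temp, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        pending_temp[0] = '\0';         /* not our file, so never unlink it */
    return fd;
}
```

On failure the caller would just report the error and give up ("if it fails, die"). After a successful rename into place, the caller would clear pending_temp so that the exit callback has nothing left to remove.
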
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com