Re: thinko in basic_archive.c - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: thinko in basic_archive.c
Date
Msg-id CALj2ACVMf0kv=MnYq3ctB-EfOuhpO+X_cN2CM2LYimUDigq79g@mail.gmail.com
In response to Re: thinko in basic_archive.c  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: thinko in basic_archive.c  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Sat, Oct 15, 2022 at 12:03 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Fri, Oct 14, 2022 at 02:15:19PM +0530, Bharath Rupireddy wrote:
> > Given that temp file name includes WAL file name, epoch to
> > milliseconds scale and MyProcPid, can there be name collisions after a
> > server crash or even when multiple servers with different pids are
> > archiving/copying the same WAL file to the same directory?
>
> While unlikely, I think it's theoretically possible.

Can you please help me understand how name collisions can happen when
the temp file name includes the WAL file name, a timestamp at
millisecond resolution, and the PID? Having the timestamp is enough to
provide a unique temp file name even when PID wraparound occurs, right?
Am I missing something here?
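
To make the question concrete, here is a rough sketch of a temp file
name built from the three components mentioned above. The helper name
build_temp_name(), the "archtemp" prefix, and the field order are
assumptions for illustration only, not necessarily what basic_archive.c
does. With this layout, two names collide only if the WAL file name, the
PID, and the millisecond timestamp all match.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from basic_archive.c): build a temp file name
 * of the form "<dir>/archtemp.<wal_file>.<pid>.<epoch_ms>".
 *
 * Returns 0 on success, -1 if the result did not fit in dst.
 */
static int
build_temp_name(char *dst, size_t dstlen, const char *dir,
                const char *wal_file, int pid, uint64_t epoch_ms)
{
    int         n;

    n = snprintf(dst, dstlen, "%s/archtemp.%s.%d.%" PRIu64,
                 dir, wal_file, pid, epoch_ms);

    /* snprintf reports the length it wanted; treat truncation as failure */
    if (n < 0 || (size_t) n >= dstlen)
        return -1;
    return 0;
}
```

Two calls that differ only in epoch_ms produce different names, so on a
single server a reused PID alone would not cause a collision under this
scheme.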

> > What happens to the left-over temp files after a server crash? Will
> > they be lying around in the archive directory? I understand that we
> > can't remove such files because we can't distinguish left-over files
> > from a crash and the temp files that another server is in the process
> > of copying.
>
> The temporary files are not automatically removed after a crash.  The
> documentation for basic archive has a note about this [0].

Hm, we cannot remove the temp file for all sorts of crashes, but
having on_shmem_exit() or before_shmem_exit() or atexit() or any such
callback remove it would at least cover the exit paths that go through
proc_exit() or exit(). I think the basic_archive module currently
leaves temp files around even when the server is restarted
legitimately while copying to or renaming the temp file, no?

I can quickly find these exit callbacks deleting the files:

    atexit(cleanup_directories_atexit);
    atexit(remove_temp);
    before_shmem_exit(ReplicationSlotShmemExit, 0);
    before_shmem_exit(logicalrep_worker_onexit, (Datum) 0);
    before_shmem_exit(BeforeShmemExit_Files, 0);
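
A minimal sketch, in plain C, of the kind of exit callback suggested
above. All names here (pending_temp, track_temp, untrack_temp,
remove_pending_temp) are hypothetical and not part of basic_archive. It
uses atexit() so that it is self-contained; inside the server,
before_shmem_exit() with a callback taking (int code, Datum arg) would
be the likely equivalent.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical module state: path of the temp file currently in flight. */
static char pending_temp[1024] = "";

/* Exit callback: remove the in-flight temp file, if there is one. */
static void
remove_pending_temp(void)
{
    if (pending_temp[0] != '\0')
    {
        (void) unlink(pending_temp);    /* best effort, ignore errors */
        pending_temp[0] = '\0';
    }
}

/* Remember the temp file before the copy starts. */
static void
track_temp(const char *path)
{
    static int  registered = 0;

    /* register the callback only once per process */
    if (!registered)
    {
        atexit(remove_pending_temp);
        registered = 1;
    }
    snprintf(pending_temp, sizeof(pending_temp), "%s", path);
}

/* Forget the temp file once the rename to the final name has succeeded. */
static void
untrack_temp(void)
{
    pending_temp[0] = '\0';
}
```

The caller would invoke track_temp() before writing the temp file and
untrack_temp() after the rename. This only covers exits that run the
registered callbacks; a SIGKILL, abort(), or _exit() still leaves the
file behind.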

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


