Thread: [patch] pg_copy - a command for reliable WAL archiving
Hello,

As I proposed before in the thread below, I've implemented a simple command for reliable WAL archiving. I would appreciate it if you could review and test the patch.

http://www.postgresql.org/message-id/9C1EB95CA1F34DAB93DF549A51E3E874@maumau

Regards
MauMau
Attachment
At 2014-06-17 23:26:37 +0900, maumau307@gmail.com wrote:
>
> Hello,
>
> As I proposed before in the thread below, I've implemented a simple
> command for reliable WAL archiving. I would appreciate it if you
> could review and test the patch.

Please add your patch to the next CF (i.e. 2014-08), so that it gets review attention at the appropriate time.

-- Abhijit
On 06/17/2014 07:26 AM, MauMau wrote:
> Hello,
>
> As I proposed before in the thread below, I've implemented a simple
> command for reliable WAL archiving. I would appreciate it if you could
> review and test the patch.
>
> http://www.postgresql.org/message-id/9C1EB95CA1F34DAB93DF549A51E3E874@maumau

That first hunk refers to dblink -- I'm guessing it does not belong with this patch.

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
At 2014-06-17 08:02:59 -0700, mail@joeconway.com wrote:
>
> That first hunk refers to dblink -- I'm guessing it does not belong
> with this patch.

Yes, it's a leftover of the dblink memory leak patch that's in this CF.

-- Abhijit
From: "Joe Conway" <mail@joeconway.com> > That first hunk refers to dblink -- I'm guessing it does not belong with > this patch. Ouch, what a careless mistake. The attached one is clean. I'll update the CommitFest entry when I'm back home from work. Regards MauMau
Attachment
I fixed some minor mistakes.

Regards
MauMau
Attachment
On Thu, Aug 14, 2014 at 1:52 PM, MauMau <maumau307@gmail.com> wrote:
> I fixed some minor mistakes.

What's the main purpose of this tool? If it's for WAL archiving, the tool name "pg_copy" sounds too generic. We already have pg_archivecleanup, so maybe "pg_archivecopy" or something is better for consistency?

pg_copy in the patch copies the file to the destination in a straightforward way, i.e., it writes directly to the destination file under its final name. This can cause the problem which some people have reported: when the server crashes while a WAL file is being archived by the cp command, a partially-filled WAL file remains in the archive area. This half-baked archive file can cause various troubles. To address this, the WAL file needs to be copied to a temporary file first and then renamed to its actual name. I think that pg_copy should copy the WAL file in that way.

Currently pg_copy always syncs the archive file, and there is no way to disable that. But I'm sure that not everyone wants to sync the archive file. So I think it's better to add an option to pg_copy specifying whether to sync the file or not.

Some users might want to specify whether to call posix_fadvise or not, because they might need to re-read the archived files just after archiving. For example, network copy of the archived files from the archive area to a remote site for disaster recovery.

Do you recommend using pg_copy for restore_command? If yes, that should also be documented. And in the WAL restore case, the restored WAL files are re-read soon by recovery, so posix_fadvise is not good in that case.

Direct I/O and posix_fadvise are used only for the destination file. But why not the source file? That might be useful especially in the restore_command case.

At last, the big question is, is there really no OS command which provides the same functionality as pg_copy does? If there is, I'd like to avoid duplicate work basically.

Regards,

--
Fujii Masao
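[For illustration, here is a minimal sketch of the copy-to-temporary-file-then-rename approach described above. It is not the patch's actual code; the fixed ".tmp" suffix, buffer size, and error handling are assumptions made for the example.]

    /* Sketch of "copy to <dest>.tmp, fsync, then rename" (not the patch's code). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void
    copy_then_rename(const char *src, const char *dst)
    {
        char    tmp[4096];
        char    buf[65536];
        ssize_t n;
        int     in, out;

        snprintf(tmp, sizeof(tmp), "%s.tmp", dst);

        in = open(src, O_RDONLY);
        /* O_TRUNC discards any half-written leftover from a previous attempt. */
        out = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (in < 0 || out < 0)
        {
            perror("open");
            exit(1);
        }

        while ((n = read(in, buf, sizeof(buf))) > 0)
        {
            if (write(out, buf, n) != n)
            {
                perror("write");
                exit(1);
            }
        }
        if (n < 0)
        {
            perror("read");
            exit(1);
        }

        /* Make the data durable before exposing the file under its final name. */
        if (fsync(out) != 0)
        {
            perror("fsync");
            exit(1);
        }
        close(in);
        close(out);

        /* rename() is atomic, so readers never see a partially written archive file. */
        if (rename(tmp, dst) != 0)
        {
            perror("rename");
            exit(1);
        }
    }

    int
    main(int argc, char **argv)
    {
        if (argc != 3)
        {
            fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
            return 1;
        }
        copy_then_rename(argv[1], argv[2]);
        return 0;
    }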
Fujii Masao <masao.fujii@gmail.com> wrote:

> Direct I/O and posix_fadvise are used only for destination file.
> But why not source file? That might be useful especially for
> restore_command case.

That would prevent people from piping the file through a compression utility. We should support piped I/O for input (and if we want to use it on the recovery side, for the output, too), at least as an option.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
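[A rough sketch of what piped input could look like, assuming a popen()-based source; the "gzip -dc" command line and the use_pipe switch are assumptions for illustration, not anything in the patch. Note that a pipe cannot be opened with O_DIRECT or advised with posix_fadvise, which is the point being made above.]

    /* Sketch: read from a pipe so a compressed archive file can be streamed in. */
    #include <stdio.h>
    #include <stdlib.h>

    static FILE *
    open_source(const char *path, int use_pipe)
    {
        FILE   *f;

        if (use_pipe)
        {
            char    cmd[4096];

            snprintf(cmd, sizeof(cmd), "gzip -dc %s", path);
            f = popen(cmd, "r");    /* read decompressed bytes from the pipe */
        }
        else
            f = fopen(path, "rb");  /* plain file input; eligible for O_DIRECT/fadvise */

        if (f == NULL)
        {
            perror("open_source");
            exit(1);
        }
        return f;
    }

    int
    main(int argc, char **argv)
    {
        FILE   *in = open_source(argv[1], argc > 2);   /* any 2nd arg => treat as gzip'd */
        int     c;

        while ((c = fgetc(in)) != EOF)
            fputc(c, stdout);       /* stream the (decompressed) contents to stdout */
        return 0;
    }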
From: "Fujii Masao" <masao.fujii@gmail.com> > What's the main purpose of this tool? If it's for WAL archiving, the tool > name > "pg_copy" sounds too generic. We already have pg_archivecleanup, so maybe > "pg_archivecopy" or something is better for the consistency? > > pg_copy in the patch copies the file to the destination in a > straightforward way, > i.e., directly copies the file to the dest file with actual name. This can > cause > the problem which some people reported. The problem is that, when the > server > crashes while WAL file is being archived by cp command, its > partially-filled > WAL file remains at the archival area. This half-baked archive file can > cause > various troubles. To address this, WAL file needs to be copied to the > temporary > file at first, then renamed to the actual name. I think that pg_copy > should > copy the WAL file in that way. I intended to make pg_copy a straightforward replacement of cp/copy, which complements the missing sync. Direct I/O and posix_fadvice() feature may be convenient but not essential for this utility. cp/copy doesn't copy to a temporary file, and the problem can be solved easily by mv/move. I wanted to keep pg_copy as generic as cp/copy, so that it can be used by some advanced features in the future, e.g. comprehensive backup/recovery management like RMAN (this example may not be best) when it's integrated into the core. With that said, copying to a temporary file like <dest>.tmp and renaming it to <dest> sounds worthwhile even as a basic copy utility. I want to avoid copying to a temporary file with a fixed name like _copy.tmp, because some advanced utility may want to run multiple instances of pg_copy to copy several files into the same directory simultaneously. However, I'm afraid multiple <dest>.tmp files might continue to occupy disk space after canceling copy or power failure in some use cases, where the copy of the same file won't be retried. That's also the reason why I chose to not use a temporary file like cp/copy. > Currently pg_copy always syncs the archive file, and there is no way to > disable > that. But I'm sure that not everyone want to sync the archive file. So I > think > that it's better to add the option specifying whether to sync the file > or not, into > pg_copy. pg_copy is for copying a file reliably by syncing. If sync is not necessary, people can use cp/copy. > Some users might want to specify whether to call posix_fadvise or not > because > they might need to re-read the archvied files just after the archiving. > For example, network copy of the archived files from the archive area to > remote site for disaster recovery. This sounds reasonable. Do you have an idea on the switch name and the default behavior? > Do you recommend to use pg_copy for restore_command? If yes, it also > should > be documented. And in the WAL restore case, the restored WAL files are > re-read > soon by recovery, so posix_fadvise is not good in that case. > > Direct I/O and posix_fadvise are used only for destination file. But why > not > source file? That might be useful especially for restore_command case. No, I don't think it's necessary to use pg_copy for restore_command. > At last, the big question is, is there really no OS command which provides > the same functionality as pg_copy does? If there is, I'd like to avoid > duplicate > work basically. If there exists such a command available in the standard OS installation, I want to use it. Regards MauMau
On 8/19/14 9:35 AM, MauMau wrote:
> pg_copy is for copying a file reliably by syncing. If sync is not
> necessary, people can use cp/copy.

I'm getting mixed messages from this thread.

I think there could be a fair amount of support for a new tool that can serve as a universal plug-and-play archive_command, with a variety of options, such as fsync yes/no, fadvise yes/no, direct-io[*] yes/no, atomic copy yes/no, allow overwrite yes/no, compression yes/no. That would reduce the need for users to compose adventurous shell commands, and it would set out the various options more clearly.

This is not that. This is cp+fsync with a hardcoded fadvise policy and optional direct-io. That is a valid problem to address, but it is one of many. On the other hand, I fear that the addition of this single-and-a-half-purpose tool would make the overall landscape more complicated than it already is. Since it's in the examples, people will probably use it, even if they don't need to or shouldn't. And not recommending it for the restore_command is also confusing.

Another example of how confusing all of this is: On Windows, the copy command by default doesn't overwrite files, which is what we want (usually). The patch changes those instances of copy to pg_copy, but pg_copy doesn't have that behavior. Should the examples be changed to do a test && pg_copy on Windows (what's the Windows shell syntax for that?), or should pg_copy have an option to not overwrite a file? How do you then avoid inconsistencies with the Unix behavior? Or what if you want fsync but allow overwriting on Windows?

On the technical side, I think if you fsync a new file, you also need to fsync the directory, to make sure the file is certain to be visible after a crash; see the sketch below.

[*] I keep reading "directio" as a typo of "direction", so please consider putting a hyphen or underscore in the option and variable names. ;-)
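[A minimal sketch of the directory-fsync point made above, not part of the patch; the hard-coded "archive" path and WAL file name are placeholders. Opening the directory read-only and fsyncing it is a common Unix idiom, though it is not portable to Windows.]

    /* Sketch: after writing and renaming a new file, fsync its directory so the
     * directory entry itself is durable. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void
    fsync_parent_dir(const char *dirpath)
    {
        int     dirfd = open(dirpath, O_RDONLY);

        if (dirfd < 0)
        {
            perror("open directory");
            exit(1);
        }
        /* Without this, a crash can lose the rename even if the file data was synced. */
        if (fsync(dirfd) != 0)
        {
            perror("fsync directory");
            exit(1);
        }
        close(dirfd);
    }

    int
    main(void)
    {
        /* ... copy to "archive/00000001000000000000002A.tmp", fsync, rename ... */
        fsync_parent_dir("archive");
        return 0;
    }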
On 8/15/14 10:46 AM, Fujii Masao wrote:
> At last, the big question is, is there really no OS command which provides
> the same functionality as pg_copy does? If there is, I'd like to avoid duplicate
> work basically.

If you look hard enough, you can maybe find an OS command that can fsync a file after it was copied. Some versions of dd can do that, and some systems have an fsync program. But it's not clear whether all systems have that, and it probably won't be simple and consistent.
On 08/19/2014 12:37 PM, Peter Eisentraut wrote:
> On 8/19/14 9:35 AM, MauMau wrote:
>> pg_copy is for copying a file reliably by syncing. If sync is not
>> necessary, people can use cp/copy.
>
> I'm getting mixed messages from this thread.
>
> I think there could be a fair amount of support for a new tool that can
> serve as a universal plug-and-play archive_command, with a variety of
> options, such as fsync yes/no, fadvise yes/no, direct-io[*] yes/no,
> atomic copy yes/no, allow overwrite yes/no, compression yes/no. That
> would reduce the need for users to compose adventurous shell commands,
> and it would set out the various options more clearly.

Yes please! Although I'm not sold on the idea of using DirectIO for this. Is there really enough benefit to make it worth the trouble?

> This is not that. This is cp+fsync with a hardcoded fadvise policy and
> optional direct-io. That is a valid problem to address, but it is one
> of many. On the other hand, I fear that the addition of this
> single-and-a-half-purpose tool would make the overall landscape more
> complicated than it already is. Since it's in the examples, people will
> probably use it, even if they don't need to or shouldn't. And not
> recommending it for the restore_command is also confusing.

I'm afraid that I agree with Peter here. pg_copy looks like a nice foundation for the eventual pg_copy utility we need, but it's not there yet.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
MauMau wrote:

> With that said, copying to a temporary file like <dest>.tmp and
> renaming it to <dest> sounds worthwhile even as a basic copy
> utility. I want to avoid copying to a temporary file with a fixed
> name like _copy.tmp, because some advanced utility may want to run
> multiple instances of pg_copy to copy several files into the same
> directory simultaneously. However, I'm afraid multiple <dest>.tmp
> files might continue to occupy disk space after canceling copy or
> power failure in some use cases, where the copy of the same file
> won't be retried. That's also the reason why I chose to not use a
> temporary file like cp/copy.

Is there a way to create a link to a file which only exists as an open file descriptor? If there was, you could create a temp file, open an fd, then delete the file. That would remove the issue with files being leaked due to failures of various kinds.

Also, it's been mentioned that this utility might be useful for restore_command. That sounds good, I guess, but we need to keep the RECOVERYXLOG trick in mind. I remember a case of stalled replay because the restore command was writing to RECOVERYXLOG.gz and ungzipping, and the unlink("RECOVERYXLOG") call failed after a partial copy, and so did the copy from the archive. (Removing the borked RECOVERYXLOG.gz fixed it.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Aug 19, 2014 at 10:42 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Is there a way to create a link to a file which only exists as an open
> file descriptor? If there was, you could create a temp file, open an
> fd, then delete the file. That would remove the issue with files being
> leaked due to failures of various kinds.

Sort of. On recent Linuxen you can create a file with open(O_TMPFILE) then use linkat(2) to create a link for it only once it's fully written.

-- greg
c.f.:

       O_TMPFILE (since Linux 3.11)
              Create an unnamed temporary file. The pathname argument specifies a directory; an
              unnamed inode will be created in that directory's filesystem. Anything written to
              the resulting file will be lost when the last file descriptor is closed, unless the
              file is given a name.

              O_TMPFILE must be specified with one of O_RDWR or O_WRONLY and, optionally, O_EXCL.
              If O_EXCL is not specified, then linkat(2) can be used to link the temporary file
              into the filesystem, making it permanent, using code like the following:

                  char path[PATH_MAX];
                  fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
                                          S_IRUSR | S_IWUSR);

                  /* File I/O on 'fd'... */

                  snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
                  linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
                                          AT_SYMLINK_FOLLOW);

              In this case, the open() mode argument determines the file permission mode, as
              with O_CREAT.

              Specifying O_EXCL in conjunction with O_TMPFILE prevents a temporary file from
              being linked into the filesystem in the above manner. (Note that the meaning of
              O_EXCL in this case is different from the meaning of O_EXCL otherwise.)

              There are two main use cases for O_TMPFILE:

              * Improved tmpfile(3) functionality: race-free creation of temporary files that
                (1) are automatically deleted when closed; (2) can never be reached via any
                pathname; (3) are not subject to symlink attacks; and (4) do not require the
                caller to devise unique names.

              * Creating a file that is initially invisible, which is then populated with data
                and adjusted to have appropriate filesystem attributes (chown(2), chmod(2),
                fsetxattr(2), etc.) before being atomically linked into the filesystem in a
                fully formed state (using linkat(2) as described above).

              O_TMPFILE requires support by the underlying filesystem; only a subset of Linux
              filesystems provide that support. In the initial implementation, support was
              provided in the ext2, ext3, ext4, UDF, Minix, and shmem filesystems. XFS support
              was added in Linux 3.15.
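[A self-contained sketch of that pattern applied to an archive copy might look like the following. It is Linux-only (O_TMPFILE needs Linux >= 3.11 plus filesystem support), it is not part of the pg_copy patch, and the "/archive" directory and WAL segment name are placeholders.]

    /* Linux-only sketch of the O_TMPFILE + linkat(2) pattern for an archive copy. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int
    main(void)
    {
        char    procpath[PATH_MAX];
        int     fd = open("/archive", O_TMPFILE | O_WRONLY, S_IRUSR | S_IWUSR);

        if (fd < 0)
        {
            perror("open(O_TMPFILE)");  /* fails on older kernels/filesystems */
            exit(1);
        }

        /* ... write the WAL data to fd and fsync(fd) here ... */

        /* Give the anonymous file its final name only once it is complete. */
        snprintf(procpath, sizeof(procpath), "/proc/self/fd/%d", fd);
        if (linkat(AT_FDCWD, procpath, AT_FDCWD,
                   "/archive/00000001000000000000002A", AT_SYMLINK_FOLLOW) != 0)
        {
            perror("linkat");
            exit(1);
        }
        close(fd);
        return 0;
    }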
Greg Stark wrote:

>                   char path[PATH_MAX];
>                   fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
>                                           S_IRUSR | S_IWUSR);
>
>                   /* File I/O on 'fd'... */
>
>                   snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
>                   linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
>                                           AT_SYMLINK_FOLLOW);

Hmm, the real trick here is linkat(... "/proc/self/foobar"), not the O_TMPFILE: you can have an open file descriptor to an "invisible" file simply by creating a normal file and unlinking it. I looked at linkat() yesterday but the idea of using /proc/self didn't occur to me. Nasty trick :-( It seems linkat() is quite a bit more portable than O_TMPFILE, fortunately ...

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
> MauMau wrote:
>
> > With that said, copying to a temporary file like <dest>.tmp and
> > renaming it to <dest> sounds worthwhile even as a basic copy
> > utility. I want to avoid copying to a temporary file with a fixed
> > name like _copy.tmp, because some advanced utility may want to run
> > multiple instances of pg_copy to copy several files into the same
> > directory simultaneously. However, I'm afraid multiple <dest>.tmp
> > files might continue to occupy disk space after canceling copy or
> > power failure in some use cases, where the copy of the same file
> > won't be retried. That's also the reason why I chose to not use a
> > temporary file like cp/copy.
>
> Is there a way to create a link to a file which only exists as an open
> file descriptor? If there was, you could create a temp file, open an
> fd, then delete the file. That would remove the issue with files being
> leaked due to failures of various kinds.

Isn't this a solution looking for a problem? We're using tempfiles in dozens of other places and I really don't see why this is the place to stop doing so. Just copy to <dest>.tmp and move it into place. If things crash during that, the command will be repeated shortly afterwards again *anyway*. Let's not get into platform specific games here.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote:
> On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
> > MauMau wrote:
> >
> > > With that said, copying to a temporary file like <dest>.tmp and
> > > renaming it to <dest> sounds worthwhile even as a basic copy
> > > utility. I want to avoid copying to a temporary file with a fixed
> > > name like _copy.tmp, because some advanced utility may want to run
> > > multiple instances of pg_copy to copy several files into the same
> > > directory simultaneously. However, I'm afraid multiple <dest>.tmp
> > > files might continue to occupy disk space after canceling copy or
> > > power failure in some use cases, where the copy of the same file
> > > won't be retried. That's also the reason why I chose to not use a
> > > temporary file like cp/copy.
> >
> > Is there a way to create a link to a file which only exists as an open
> > file descriptor? If there was, you could create a temp file, open an
> > fd, then delete the file. That would remove the issue with files being
> > leaked due to failures of various kinds.
>
> Isn't this a solution looking for a problem? We're using tempfiles in
> dozens of other places and I really don't see why this is the place to
> stop doing so. Just copy to <dest>.tmp and move it into place. If things
> crash during that, the command will be repeated shortly afterwards again
> *anyway*. Let's not get into platform specific games here.

The issue is what happens if there's a crash while the temp file is in the middle of being filled. I agree there are bigger problems to solve in this area though.

Yes, I agree that a fixed name such as _copy.tmp is a really bad choice for several reasons.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Aug 20, 2014 at 2:27 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Hmm, the real trick here is linkat(... "/proc/self/foobar"), not the
> O_TMPFILE: you can have an open file descriptor to an "invisible" file
> simply by creating a normal file and unlinking it. I looked at linkat()
> yesterday but the idea of using /proc/self didn't occur to me. Nasty
> trick :-( It seems linkat() is quite a bit more portable than
> O_TMPFILE, fortunately ...

Supposedly linkat(2) on Linux refuses to create a link to a file that was opened normally and had its last link removed with unlink(2), due to concerns that this would create security holes.

-- greg
On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
> Andres Freund wrote:
> > On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
> > > MauMau wrote:
> > >
> > > > With that said, copying to a temporary file like <dest>.tmp and
> > > > renaming it to <dest> sounds worthwhile even as a basic copy
> > > > utility. I want to avoid copying to a temporary file with a fixed
> > > > name like _copy.tmp, because some advanced utility may want to run
> > > > multiple instances of pg_copy to copy several files into the same
> > > > directory simultaneously. However, I'm afraid multiple <dest>.tmp
> > > > files might continue to occupy disk space after canceling copy or
> > > > power failure in some use cases, where the copy of the same file
> > > > won't be retried. That's also the reason why I chose to not use a
> > > > temporary file like cp/copy.
> > >
> > > Is there a way to create a link to a file which only exists as an open
> > > file descriptor? If there was, you could create a temp file, open an
> > > fd, then delete the file. That would remove the issue with files being
> > > leaked due to failures of various kinds.
> >
> > Isn't this a solution looking for a problem? We're using tempfiles in
> > dozens of other places and I really don't see why this is the place to
> > stop doing so. Just copy to <dest>.tmp and move it into place. If things
> > crash during that, the command will be repeated shortly afterwards again
> > *anyway*. Let's not get into platform specific games here.
>
> The issue is what happens if there's a crash while the temp file is in
> the middle of being filled.

The archive command will be run again a couple of seconds later and remove the half-filled temp file.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Greg Stark wrote:
> On Wed, Aug 20, 2014 at 2:27 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > Hmm, the real trick here is linkat(... "/proc/self/foobar"), not the
> > O_TMPFILE: you can have an open file descriptor to an "invisible" file
> > simply by creating a normal file and unlinking it. I looked at linkat()
> > yesterday but the idea of using /proc/self didn't occur to me. Nasty
> > trick :-( It seems linkat() is quite a bit more portable than
> > O_TMPFILE, fortunately ...
>
> Supposedly linkat(2) on Linux refuses to create a link to a file that
> was opened normally and had its last link removed with unlink(2) due
> to concerns that this would create security holes.

Sigh.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
>> Andres Freund wrote:
>>> Isn't this a solution looking for a problem? We're using tempfiles in
>>> dozens of other places and I really don't see why this is the place to
>>> stop doing so. Just copy to <dest>.tmp and move it into place.

>> The issue is what happens if there's a crash while the temp file is in
>> the middle of being filled.

> The archive command will be run again a couple of seconds later and remove
> the half-filled temp file.

Alternatively, you could use the process PID as part of the temp file name; which is probably a good idea anyway.

			regards, tom lane
On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
> >> Andres Freund wrote:
> >>> Isn't this a solution looking for a problem? We're using tempfiles in
> >>> dozens of other places and I really don't see why this is the place to
> >>> stop doing so. Just copy to <dest>.tmp and move it into place.
>
> >> The issue is what happens if there's a crash while the temp file is in
> >> the middle of being filled.
>
> > The archive command will be run again a couple of seconds later and remove
> > the half-filled temp file.
>
> Alternatively, you could use the process PID as part of the temp file
> name; which is probably a good idea anyway.

I think that's actually worse, because nothing will clean up those unless you explicitly scan for all <whatever>.$pid files, and somehow kill them.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
>> Alternatively, you could use the process PID as part of the temp file
>> name; which is probably a good idea anyway.

> I think that's actually worse, because nothing will clean up those
> unless you explicitly scan for all <whatever>.$pid files, and somehow
> kill them.

True. As long as the copy command is prepared to get rid of a pre-existing target file, using a fixed .tmp extension should be fine.

			regards, tom lane
On Wed, Aug 20, 2014 at 10:36:40AM -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
> >> Alternatively, you could use the process PID as part of the temp file
> >> name; which is probably a good idea anyway.
>
> > I think that's actually worse, because nothing will clean up those
> > unless you explicitly scan for all <whatever>.$pid files, and somehow
> > kill them.
>
> True. As long as the copy command is prepared to get rid of a
> pre-existing target file, using a fixed .tmp extension should be fine.

Well, then we are back to this comment by MauMau:

> With that said, copying to a temporary file like <dest>.tmp and
> renaming it to <dest> sounds worthwhile even as a basic copy utility.
> I want to avoid copying to a temporary file with a fixed name like
> _copy.tmp, because some advanced utility may want to run multiple
> instances of pg_copy to copy several files into the same directory
> simultaneously. However, I'm afraid multiple <dest>.tmp files might
> continue to occupy disk space after canceling copy or power failure in
> some use cases, where the copy of the same file won't be retried.
> That's also the reason why I chose to not use a temporary file like
> cp/copy.

Do we want cases where the same directory is used by multiple pg_copy processes? I can't imagine how that setup would make sense.

I am thinking pg_copy should emit a warning message when it removes an old temp file. This might alert people that something odd is happening if they see the message often.

The pid-extension idea would work, as pg_copy can test all pid-extension files to see if the pid is still active (see the sketch below). However, that assumes that the pid is running on the local machine and not on another machine that has NFS-mounted this directory, so maybe this is a bad idea; but again, we are back to the idea that only one process should be writing into this directory.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
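[A minimal sketch of the liveness test mentioned above, valid only for PIDs on the local machine as noted; the "<dest>.<pid>" naming and the parsing of the PID from the file name are assumptions for illustration.]

    /* Sketch: decide whether a leftover "<dest>.<pid>" temp file can be reclaimed
     * by testing whether its owning process is still alive on this machine. */
    #include <errno.h>
    #include <signal.h>
    #include <stdbool.h>
    #include <stdlib.h>
    #include <sys/types.h>

    static bool
    temp_file_is_orphaned(pid_t owner_pid)
    {
        if (kill(owner_pid, 0) == 0)
            return false;           /* process still exists; leave its file alone */
        return errno == ESRCH;      /* no such process: the temp file is stale */
    }

    int
    main(int argc, char **argv)
    {
        /* e.g. pass the numeric suffix parsed from "00000001000000000000002A.12345" */
        pid_t   pid = (pid_t) atoi(argc > 1 ? argv[1] : "1");

        return temp_file_is_orphaned(pid) ? 0 : 1;
    }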
On 2014-08-20 18:58:05 -0400, Bruce Momjian wrote:
> On Wed, Aug 20, 2014 at 10:36:40AM -0400, Tom Lane wrote:
> > Andres Freund <andres@2ndquadrant.com> writes:
> > > On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
> > >> Alternatively, you could use the process PID as part of the temp file
> > >> name; which is probably a good idea anyway.
> >
> > > I think that's actually worse, because nothing will clean up those
> > > unless you explicitly scan for all <whatever>.$pid files, and somehow
> > > kill them.
> >
> > True. As long as the copy command is prepared to get rid of a
> > pre-existing target file, using a fixed .tmp extension should be fine.
>
> Well, then we are back to this comment by MauMau:
> > With that said, copying to a temporary file like <dest>.tmp and
> > renaming it to <dest> sounds worthwhile even as a basic copy utility.
> > I want to avoid copying to a temporary file with a fixed name like
> > _copy.tmp, because some advanced utility may want to run multiple
> > instances of pg_copy to copy several files into the same directory
> > simultaneously. However, I'm afraid multiple <dest>.tmp files might
> > continue to occupy disk space after canceling copy or power failure in
> > some use cases, where the copy of the same file won't be retried.
> > That's also the reason why I chose to not use a temporary file like
> > cp/copy.
>
> Do we want cases where the same directory is used by multiple pg_copy
> processes? I can't imagine how that setup would make sense.

I don't think anybody is proposing _copy.tmp. We've just argued about the risk of <dest>.tmp. And I argued - and others seem to agree - that the space usage problem isn't really relevant, because archive commands and such are rerun after failure and can then clean up the temp file again.

> I am thinking pg_copy should emit a warning message when it removes an
> old temp file. This might alert people that something odd is happening
> if they see the message often.

I don't really see a point in this. If the archive command or such failed, that will already have been logged. I'd expect this to be implemented by passing O_CREAT | O_TRUNC to open(), nothing else.

> The pid-extension idea would work, as pg_copy can test all pid-extension
> files to see if the pid is still active. However, that assumes that the
> pid is running on the local machine and not on another machine that has
> NFS-mounted this directory, so maybe this is a bad idea; but again, we
> are back to the idea that only one process should be writing into this
> directory.

I don't actually think we should assume that. There very well could be one process running an archive command, using differently prefixed file names or such.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services