Re: where should I stick that backup? - Mailing list pgsql-hackers
From:            Andres Freund
Subject:         Re: where should I stick that backup?
Date:
Msg-id:          20200412191702.ul7ohgv5gus3tsvo@alap3.anarazel.de
In response to:  Re: where should I stick that backup? (Robert Haas <robertmhaas@gmail.com>)
Responses:       Re: where should I stick that backup?
                 Re: where should I stick that backup?
List:            pgsql-hackers
Hi,

On 2020-04-11 16:22:09 -0400, Robert Haas wrote:
> On Fri, Apr 10, 2020 at 3:38 PM Andres Freund <andres@anarazel.de> wrote:
> > Wouldn't there be state like an S3/ssh/https/... connection? And
> > perhaps a 'backup_id' in the backup metadata DB that one would want
> > to update at the end?
>
> Good question. I don't know that there would be but, uh, maybe? It's
> not obvious to me why all of that would need to be done using the same
> connection, but if it is, the idea I proposed isn't going to work very
> nicely.

Well, it depends on what you want to support. If you're only interested
in supporting tarball mode ([1]), *maybe* you can get away without
longer-lived sessions (but I'm doubtful). But if you're also interested
in archiving plain files, then the cost of establishing sessions, and
the latency penalty of having to wait for command completion, would imo
be prohibitive. A lot of solutions for storing backups can achieve
pretty decent throughput, but have very significant latency. That's of
course in addition to the network latency itself.

[1] I don't think we should restrict it that way. That would make it
much more complicated to support incremental backup, pg_rewind,
deduplication, etc.

> More generally, can you think of any ideas for how to structure an API
> here that are easier to use than "write some C code"? Or do you think
> we should tell people to write some C code if they want to
> compress/encrypt/relocate their backup in some non-standard way?
>
> For the record, I'm not against eventually having more than one way to
> do this, maybe a shell-script interface for simpler things and some
> kind of API for more complex needs (e.g. NetBackup integration,
> perhaps). And I did wonder if there was some other way we could do
> this.

I'm doubtful that an API based on string replacement is the way to go.
It's hard for me to see how that's not either going to substantially
restrict the way the "tasks" are done, or yield a very complicated
interface.

I wonder whether the best approach here could be that pg_basebackup
(and perhaps other tools) opens pipes to/from a subcommand, and over
those pipes communicates with the subtask using a textual ([2])
description of tasks. Like:

backup mode=files base_directory=/path/to/data/directory
backup_file name=base/14037/16396.14 size=1073741824
backup_file name=pg_wal/XXXX size=16777216

or

backup mode=tar base_directory /path/to/data/
backup_tar name=dir.tar size=983498875687487

The obvious problem with that proposal is that we don't want to
unnecessarily store the incoming data on the system pg_basebackup is
running on, just for the subcommand to get access to it. More on that
in a second.

A huge advantage of a scheme like this would be that it wouldn't have
to be specific to pg_basebackup. It could just as well work directly on
the server, avoiding an unnecessary loop through the network, and could
e.g. integrate with filesystem snapshots etc., without needing to build
the 'archive target' once with server libraries and once with client
libraries.

One reason I think something like this could be advantageous over a C
API is that it's quite feasible to implement it in a number of
different languages, including shell if really desired, without needing
to provide a C API via an FFI.

It'd also make it quite natural to split compression out of
pg_basebackup's main process, which IME currently makes pg_basebackup's
compression not really feasible to use.
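To make that a bit more concrete, here's a minimal sketch (in C, though
as said above it could be any language) of what the target-command side
of such a line-delimited protocol could look like. The command names
mirror the example above; the acknowledgement format, the grammar
details, error handling, and the actual data transport are all
illustrative assumptions, not part of any real interface:

/*
 * Hypothetical backup target command: reads one command per line on
 * stdin, writes acknowledgements to stdout.  Only a sketch; how the
 * file data actually reaches this process is addressed separately.
 */
#include <stdio.h>
#include <string.h>

int
main(void)
{
    char line[4096];

    while (fgets(line, sizeof(line), stdin))
    {
        /* strip the trailing newline */
        line[strcspn(line, "\n")] = '\0';

        if (strncmp(line, "backup ", 7) == 0)
        {
            /* e.g. "backup mode=files base_directory=/path/to/data" */
            printf("ok %s\n", line);
        }
        else if (strncmp(line, "backup_file ", 12) == 0)
        {
            /*
             * Here the command would receive the file's contents and
             * ship them to S3/ssh/... over its own long-lived session.
             */
            printf("ok %s\n", line);
        }
        else if (strcmp(line, "done") == 0)
        {
            printf("done\n");
            break;
        }
        else
            printf("error unknown command\n");

        fflush(stdout);
    }
    return 0;
}

pg_basebackup (or the server) would spawn something like this with
pipes attached to stdin/stdout and drive it with one command per file
or per tarball.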
There are various ways we could address the issue of how the subcommand
can access the file data. The most flexible would probably be to rely
on exchanging file descriptors between pg_basebackup and the subprocess
(these days all supported platforms have that, I think). Alternatively
we could invoke the subcommand before really starting the backup, ask
how many files it'd like to receive in parallel, and restart the
subcommand with that number of file descriptors open.

If we relied on FDs, here's an example of how a trace between
pg_basebackup (BB) and a backup target command (TC) could look:

BB: required_capabilities fd_send files
BB: provided_capabilities fd_send file_size files tar
TC: required_capabilities fd_send files file_size
BB: backup mode=files base_directory=/path/to/data/directory
BB: backup_file method=fd name=base/14037/16396.1 size=1073741824
BB: backup_file method=fd name=base/14037/16396.2 size=1073741824
BB: backup_file method=fd name=base/14037/16396.3 size=1073741824
TC: fd name=base/14037/16396.1 (contains TC fd 4)
TC: fd name=base/14037/16396.2 (contains TC fd 5)
BB: backup_file method=fd name=base/14037/16396.4 size=1073741824
TC: fd name=base/14037/16396.3 (contains TC fd 6)
BB: backup_file method=fd name=base/14037/16396.5 size=1073741824
TC: fd name=base/14037/16396.4 (contains TC fd 4)
TC: fd name=base/14037/16396.5 (contains TC fd 5)
BB: done
TC: done

backup_file type=fd mode=fd base/14037/16396.4 1073741824

or

backup_features tar
backup_mode tar
base_directory /path/to/data/
backup_tar dir.tar 983498875687487

[2] yes, I already hear json. A line-delimited format would have some
advantages though.

Greetings,

Andres Freund
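As an addendum to the trace above, here's a rough sketch of the
receiving (TC) side of such an fd exchange, using the standard
sendmsg()/recvmsg() SCM_RIGHTS mechanism over a unix domain socket or
socketpair. The function name and the framing (one dummy payload byte
per descriptor) are illustrative assumptions, not part of the proposal:

/*
 * Illustrative only: receive one file descriptor over a unix domain
 * socket using SCM_RIGHTS, roughly what the "fd" messages above would
 * require on the receiving side.  Returns the new descriptor, or -1.
 */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int
recv_fd(int sock)
{
    struct msghdr msg = {0};
    struct cmsghdr *cmsg;
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union
    {
        /* ensure correct alignment for the control message */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    int fd = -1;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg && cmsg->cmsg_level == SOL_SOCKET &&
        cmsg->cmsg_type == SCM_RIGHTS)
        memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));

    return fd;
}

The sending side (pg_basebackup) would use the mirror-image sendmsg()
call with the open file's descriptor in the SCM_RIGHTS control message.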