Re: where should I stick that backup? - Mailing list pgsql-hackers
From | David Steele
Subject | Re: where should I stick that backup?
Msg-id | f6d3048d-99a1-8258-23d1-db8a9fa93506@pgmasters.net
In response to | Re: where should I stick that backup? (Robert Haas <robertmhaas@gmail.com>)
List | pgsql-hackers
On 4/12/20 11:04 AM, Robert Haas wrote:
> On Sun, Apr 12, 2020 at 10:09 AM Magnus Hagander <magnus@hagander.net> wrote:
>> There are certainly cases for it. It might not be that they have to be
>> the same connection, but still the same session, meaning before the
>> first time you perform some step of authentication, get a token, and
>> then use that for all the files. You'd need somewhere to maintain that
>> state, even if it doesn't happen to be a socket. But there are
>> definitely plenty of cases where keeping an open socket can be a huge
>> performance gain -- especially when it comes to not re-negotiating
>> encryption etc.
>
> Hmm, OK.

When we implemented connection-sharing for S3 in pgBackRest it was a
significant performance boost, even for large files since they must be
uploaded in parts. The same goes for files transferred over SSH, though
in this case the overhead is per-file and can be mitigated with
ControlMaster.

We originally (late 2013) implemented everything with command-line tools
during the POC phase. The idea was to get something viable quickly and
then improve as needed. At the time our config file had entries
something like this:

[global:command]
compress=/usr/bin/gzip --stdout %file%
decompress=/usr/bin/gzip -dc %file%
checksum=/usr/bin/shasum %file% | awk '{print $1}'
manifest=/opt/local/bin/gfind %path% -printf '%P\t%y\t%u\t%g\t%m\t%T@\t%i\t%s\t%l\n'
psql=/Library/PostgreSQL/9.3/bin/psql -X %option%

[db]
psql_options=--cluster=9.3/main

[db:command:option]
psql=--port=6001

These appear to be for macOS, but Linux would be similar. This *did*
work, but it was really hard to debug when things went wrong, the
per-file cost was high, and the slight differences between the
command-line tools on different platforms were maddening. For example,
many versions of 'find' would error if a file disappeared while building
the manifest, which is a pretty common occurrence in PostgreSQL (most
newer distros had an option to fix this).
I know that doesn't apply here, but it's an example. Also, debugging was
complicated with so many processes; with any degree of parallelism the
process list got pretty crazy, fsync was not happening, etc. It's been a
long time but I don't have any good memories of the solution that used
all command-line tools.

Once we had a POC that solved our basic problem, i.e. backing up about
50TB of data reasonably efficiently, we immediately started working on a
version that did not rely on command-line tools and we never looked
back. Currently the only command-line tool we use is ssh.

I'm sure it would be possible to create a solution that worked better
than ours, but I'm pretty certain it would still be hard for users to
make it work correctly and to prove that it worked correctly.

>> For compression and encryption, it could perhaps be as simple as "the
>> command has to be a pipe on both input and output" and basically send
>> the response back to pg_basebackup.
>>
>> But that won't help if the target is to relocate things...
>
> Right. And, also, it forces things to be sequential in a way I'm not
> too happy about. Like, if we have some kind of parallel backup, which
> I hope we will, then you can imagine (among other possibilities)
> getting files for each tablespace concurrently, and piping them
> through the output command concurrently. But if we emit the result in
> a tarfile, then it has to be sequential; there's just no other choice.
> I think we should try to come up with something that can work in a
> multi-threaded environment.
>
>> That is one way to go for it -- and in a case like that, I'd suggest
>> the shellscript interface would be an implementation of the other API.
>> A number of times through the years I've bounced ideas around for what
>> to do with archive_command with different people (never quite to the
>> level of "it's time to write a patch"), and it's mostly come down to
>> some sort of shlib api where in turn we'd ship a backwards compatible
>> implementation that would behave like archive_command. I'd envision
>> something similar here.
>
> I agree. Let's imagine that there are a conceptually unlimited number
> of "targets" and "filters". Targets and filters accept data via the
> same API, but a target is expected to dispose of the data, whereas a
> filter is expected to pass it, via that same API, to a subsequent
> filter or target. So filters could include things like "gzip", "lz4",
> and "encrypt-with-rot13", whereas targets would include things like
> "file" (the thing we have today - write my data into some local
> files!), "shell" (which writes my data to a shell command, as
> originally proposed), and maybe eventually things like "netbackup" and
> "s3". Ideally this will all eventually be via a loadable module
> interface so that third-party filters and targets can be fully
> supported, but perhaps we could consider that an optional feature for
> v1. Note that there is quite a bit of work to do here just to
> reorganize the code.
>
> I would expect that we would want to provide a flexible way for a
> target or filter to be passed options from the pg_basebackup command
> line. So one might for example write this:
>
> pg_basebackup --filter='lz4 -9' --filter='encrypt-with-rot13
> rotations=2' --target='shell ssh rhaas@depository pgfile
> create-exclusive - %f.lz4'
>
> The idea is that the first word of the filter or target identifies
> which one should be used, and the rest is just options text in
> whatever form the provider cares to accept them; but with some
> %<character> substitutions allowed, for things like the file name.
> (The aforementioned escaping problems for things like filenames with
> spaces in them still need to be sorted out, but this is just a sketch,
> so while I think it's quite solvable, I am going to refrain from
> proposing a precise solution here.)

This is basically the solution we have landed on after many iterations.
We implement two types of filters, In and InOut. The In filters process
data and produce a result, e.g. SHA1, size, page checksum, etc. The
InOut filters modify data, e.g. compression, encryption. Yeah, the names
could probably be better...

I have attached our filter interface (filter.intern.h) as a concrete
example of how this works.

We call 'targets' storage and have a standard interface for creating
storage drivers. I have also attached our storage interface
(storage.intern.h) as a concrete example of how this works. Note that
for just performing a backup this is overkill, but once you consider
verify this is pretty much the minimum storage interface needed, in our
experience.

Regards,
--
-David
david@pgmasters.net