Re: Btrfs clone WIP patch - Mailing list pgsql-hackers
From | Greg Smith |
---|---|
Subject | Re: Btrfs clone WIP patch |
Date | |
Msg-id | 513B71A7.6080506@2ndQuadrant.com Whole thread Raw |
In response to | Re: Btrfs clone WIP patch (Jonathan Rogers <jrogers@socialserve.com>) |
Responses |
Re: Btrfs clone WIP patch
|
List | pgsql-hackers |
On 3/1/13 1:40 AM, Jonathan Rogers wrote: > I've been thinking about both of these issues and decided to try a > different approach. This patch adds GUC options for two external > commands This is a reasonable approach for a proof of concept patch. I like the idea you're playing with here, as a useful boost for both production deployments and development synchronization jobs. There are some major coding hurdles for this idea to go beyond proof of concept into committed feature though, so I'm just going to dump all the obvious ones I can see on you too. Feel free to ignore those and focus on the fun usability / capability parts for a while first. Just want you to realize what work is in your way, but not visible to you necessarily. With so many GUCs already, adding more of them to support something with a fairly narrow user base will be hard to justify committing. This sort of thing seems like you could load it as an extension instead. I don't think this is worth doing yet. For now the GUC approach is fine, and it keeps your code sample small. I'm going to start with how to continue validating this idea is useful without worrying about those, to get some more testing/benchmarking samples, and for all of that job what you're doing so far is fine. The road toward something to commit is going to take a lot more complicated code eventually though. There's no big conceptual leap involved, it's just where the changes need to go and how much they need to touch to do this accurately in all cases isn't as simple as your demo. Handling unusual situations and race conditions turns out to be almost all the code in what you're replacing, and you'll need similar work in the replacements. > I have just been experimenting with ZFS and it does not seem to have any> capability or interface for cloning ordinaryfiles or directories so the> configuration is not as straightforward. However, I was able to set up a> Postgres clusteras a hierarchy of ZFS file systems in the same pool> with each directory under "base" being a separate file systemand> configure Postgres to call shell scripts which call zfs snapshot and> clone commands to do the cloning and deleting. To make things easier for a reviewer, it's useful to include working examples like this as part of your submission. For this one that would include: -A sample script that created a test pool -Working postgresql.conf changes compatible with ZFS -Test program showing how to exercise the feature I think I can see how to construct such an example for the btrfs version, but having you show that explicitly (preferably with a whole sample session executing it) will also help reviewers. Remember: if you want to get your submission off to a good start, the reviewer should be able to run your sample test, see the code work, and do something fun within a few seconds of compiling it. Make that easy for them, and your reviewer will start with a good impression of you and a positive outlook for the change. Now onto the code nitpicking! = Extension vs. GUC = In addition to not polluting the postgresql.conf.sample, there's another reason this might make for better extension material eventually. Adding new execv calls that are building strings like this is just generally a risky thing. It would be nice from a security perspective if that entire mechanism wasn't even introduced into the server at all unless someone loaded the extension. An extension implementation will end up being more code, both to add a useful hook for replace these calls and for the extension packaging itself. Having a bit more code in contrib/ but less in src & postgresql.conf is probably a net positive trade though. = Diff reduction / code refactoring = Looks like you added a "File Operation Options" entry into guc.c but then not use it here? I would just keep this in an existing category for now, try to reduce the diff length of the proof of concept version as much as possible in the beginning. On the topic of smaller diffs, the similar cut and paste sections of the two entries that both do fork/exec/waitpid should really be refactored into one function. The archiver does something similar for running archive_command, there may be some reuse or refactoring of its code to present this interface. Again, this sort of refactoring is not necessary as a POC patch. But it will probably come up if this moves toward commit candidate. = Recursion and directory navigation = > In either case, the directories are copied recursively while the > Postgres internal copydir function does not recurse. I don't think that > should be a problem since there shouldn't be nested directories in the > first place. copydir takes an option for whether it should recurse or not. The rm side of makes me twitch for a number of reasons. First off, there's just the general scariness of the concept of shelling out to run rm recursively with some text string you build. The worst time I saw a bug in that sort of code destroyed a terabyte, and the last time I saw such a bug was only a week ago. Validation you're doing the right thing is always job #1 in removing files. Specifically here, take a look at src/port/dirmod.c for a) its comments on race conditions and b) how it does error handling. The external rm will need to duplicate all of that. I don't see how you could possibly replace rmtree with something simplier that has the same necessary properties. I think what you're actually going to need is a hook to replace both the unlink and rmdir calls with your alternate implementation idea. The same sort of issue is in your external_copydir. Iterating into subdirectories when it doesn't happen now just isn't safe, even though the one expected case you're hooking won't be any different. You really can't just do that. Would this work instead, and is there any concern about files that start with a "."? "cp * --reflink=auto" Regardless, you need to keep most of the structure to copydir anyway. Error handling, handling cancellation, and fsync calls are all vital things. You probably have to make the forked command copy a single file at a time to get the same interrupt handling behavior. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
pgsql-hackers by date: