Re: Btrfs clone WIP patch - Mailing list pgsql-hackers

From Jonathan Rogers
Subject Re: Btrfs clone WIP patch
Date
Msg-id 51304D5C.2040107@socialserve.com
In response to Re: Btrfs clone WIP patch  (Phil Sorber <phil@omniti.com>)
Responses Re: Btrfs clone WIP patch  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
Phil Sorber wrote:
> On Wed, Feb 13, 2013 at 5:48 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> On 02/13/2013 02:13 PM, Tom Lane wrote:
>>> The big-picture question of course is whether we want to carry and
>>> maintain a filesystem-specific hack.  I don't have a sense that btrfs
>>> is so widely used as to justify this.
>>
>> If this is a valuable hack, it seems like it could work on ZFS as well.
>>  If we could make it for any snapshot-capable filesystem, and not just
>> BTRFS, then it would make more sense.
>
> I was thinking that too, but I think this is a file level clone, not a
> whole filesystem. As far as I can tell, you can't clone individual
> files in ZFS.
>

I've been thinking about both of these issues and decided to try a
different approach. This patch adds GUC options for two external
commands: one to copy a directory and one to delete a directory. This
allows filesystem-specific tools to be used to accomplish the efficient
cloning without Postgres having to know any details.

This works particularly well for Btrfs. On a GNU/Linux system, one can
simply configure the external copy command as "/bin/cp -r
--reflink=auto". Efficient cloning is then done on file systems that
support it, and ordinary copying is done otherwise. The directory
deletion command isn't needed, and no special Postgres setup is required
other than putting the data directory on a Btrfs file system.
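
As a rough illustration of what that copy command does when run by hand,
here is a minimal sketch. It assumes GNU coreutils cp is on the PATH; the
copy_dir function and the scratch paths are made up for the example and
are not part of the patch.

```shell
#!/bin/sh
# Sketch only: exercises "cp -r --reflink=auto" on a throwaway directory.
# Assumes GNU coreutils cp; copy_dir and the paths are hypothetical.
set -eu

copy_dir() {
    # $1 = source directory, $2 = destination (must not exist yet)
    cp -r --reflink=auto "$1" "$2"
}

work=$(mktemp -d)
mkdir "$work/src"
printf 'example\n' > "$work/src/datafile"

copy_dir "$work/src" "$work/dst"

# The result is logically identical whether or not the file system
# was able to share blocks between the two trees.
diff -r "$work/src" "$work/dst" && echo "copy matches source"
```

With --reflink=auto, cp falls back to an ordinary copy when the file
system cannot clone, so the same command line is safe on any file system.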

I have just been experimenting with ZFS, and it does not seem to have any
capability or interface for cloning ordinary files or directories, so the
configuration is not as straightforward. However, I was able to set up a
Postgres cluster as a hierarchy of ZFS file systems in the same pool,
with each directory under "base" being a separate file system, and
configure Postgres to call shell scripts that run the zfs snapshot and
clone commands to do the cloning and deleting.
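
To give an idea of the shape such scripts could take, here is a
hypothetical sketch, not the scripts I actually used. The dataset prefix
tank/pgdata/base, the snapshot naming, and the convention that the
directory name equals the last component of the dataset name are all
assumptions, as is the inheritance of mount points from the parent
dataset. Setting ZFS="echo zfs" gives a dry run that only prints the
commands.

```shell
#!/bin/sh
# Hypothetical sketch: dataset layout and argument convention are assumptions.
set -eu

ZFS=${ZFS:-zfs}                                 # use "echo zfs" for a dry run
BASE_DATASET=${BASE_DATASET:-tank/pgdata/base}  # assumed parent of per-database file systems

clone_db_dir() {
    # $1 = source directory, $2 = destination directory, both directly under "base"
    src=$(basename "$1")
    dst=$(basename "$2")
    snap="$BASE_DATASET/$src@clone_for_$dst"
    $ZFS snapshot "$snap"                    # point-in-time snapshot of the source
    $ZFS clone "$snap" "$BASE_DATASET/$dst"  # writable clone becomes the new directory
}

delete_db_dir() {
    # $1 = directory to remove; destroys the file system backing it
    $ZFS destroy "$BASE_DATASET/$(basename "$1")"
}
```

The sketch leaves the origin snapshot in place after a clone is
destroyed, and it does not handle destroying a file system that still
has dependent clones; a real script would need to deal with both.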

In either case, the directories are copied recursively, whereas the
Postgres internal copydir function does not recurse. I don't think that
should be a problem, since there shouldn't be nested directories in the
first place.
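
For anyone who wants to check that assumption on a given directory, a
small sketch follows. The has_nested_dirs helper is hypothetical, and it
relies on the -mindepth option, which GNU and BSD find provide but POSIX
does not require.

```shell
#!/bin/sh
# Sketch: report whether a directory contains any subdirectories.
# has_nested_dirs is a hypothetical helper; -mindepth is a find extension.

has_nested_dirs() {
    # returns 0 (true) when $1 contains at least one subdirectory,
    # non-zero when it holds only non-directory entries
    [ -n "$(find "$1" -mindepth 1 -type d -print | head -n 1)" ]
}
```

If the helper returns false for a database directory, a recursive copy
and a non-recursive copy of that directory cover the same set of files.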
--
Jonathan Ross Rogers

Attachment
