Re: Re: pg_basebackup bug: base backup is double the size of the database - Mailing list pgsql-admin

From David Johnston
Subject Re: Re: pg_basebackup bug: base backup is double the size of the database
Msg-id CAKFQuwYTpRw7q=AdZ2h6wdsab1BfJ+0oncO_ZHJzbDMWMy7ArA@mail.gmail.com
In response to Re: Re: pg_basebackup bug: base backup is double the size of the database  (Craig James <cjames@emolecules.com>)
List pgsql-admin
On Thursday, January 22, 2015, Craig James <cjames@emolecules.com> wrote:


On Wed, Jan 21, 2015 at 10:02 PM, David G Johnston <david.g.johnston@gmail.com> wrote:
Craig James-2 wrote
> We've encountered a serious bug with pg_basebackup. It seems to be
> following hard links and duplicating all files in the tablespaces rather
> than preserving links.

This entire sentence doesn't make sense to me.  How does one "follow" a
hard-link?  A soft-link yes but a hard-link is an alias to actual data.  I'm
not sure directory hard-linking is even allowed or used so following in that
sense doesn't compute...

See the man page for rsync, the -H option, which explains it better:

       -H, --hard-links
              This  tells  rsync to look for hard-linked files in the transfer
              and link together the corresponding files on the receiving side.
              Without  this  option,  hard-linked  files  in  the transfer are
              treated as though they were separate files.


That makes sense for a full system backup, but a single-cluster backup should not (I think) contain any case where a file and a hard link to that same file both sit within the same source tree.  The -H option should not be needed because the scenario it addresses is not expected to exist.  If it does exist, that means either user error or a use case that hasn't been considered.  Improvements could probably be made here, but a reliable test case describing the specific setup is needed first.
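
As a starting point for such a test case, here is a minimal sketch (not anything pg_basebackup itself does) that lists regular files under a directory that are hard-linked to each other, i.e. that share the same device and inode. The function name find_hard_link_groups and its follow_symlinks parameter are hypothetical, chosen only for illustration. Note that os.walk does not descend into symlinked directories by default, and the entries in pg_tblspc are symlinks, so each tablespace location would probably have to be checked separately or with follow_symlinks=True.

```python
import os
import stat
from collections import defaultdict


def find_hard_link_groups(root, follow_symlinks=False):
    """Return sorted path lists for files under root that share one inode."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow_symlinks):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: inspect the entry itself, not a symlink target
            # Only regular files with a link count above 1 can have a
            # second name somewhere on the same filesystem.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    # Keep only inodes for which at least two names were found inside root.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

An empty result for the data directory and every tablespace location would suggest that the doubling in size comes from something other than hard links within the cluster.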

David J.

 
