Thread: pg_basebackup fails with long tablespace paths
My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
for pg_basebackup fail when run in a sufficiently deeply-nested directory
tree. The cause appears to be that we rely on standard "tar" format
to represent the symlink for a tablespace, and POSIX tar format has a
hard-wired restriction of 99 bytes in a symlink's expansion.

What do we want to do about this? I think a minimum expectation would be
for pg_basebackup to notice and complain when it's trying to create an
unworkably long symlink entry, but it would be far better if we found a
way to cope instead.

One thing we could possibly do without reinventing "tar" is to avoid using
absolute path names if a PGDATA-relative one would do.

regards, tom lane
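The PGDATA-relative idea could be sketched roughly as follows. This is a minimal illustration, not PostgreSQL code: the helper name relative_if_under_pgdata is hypothetical, and it assumes both paths are absolute, already canonicalized, and free of trailing slashes. The thread does not say how such a relative path would be stored in the archive, so the sketch only computes it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if "target" lies strictly inside "pgdata", return a
 * pointer to the PGDATA-relative remainder of "target"; otherwise return
 * "target" unchanged. No memory is allocated; the result points into
 * "target".
 */
const char *
relative_if_under_pgdata(const char *pgdata, const char *target)
{
    size_t len;

    assert(pgdata != NULL && target != NULL);
    len = strlen(pgdata);

    /*
     * Require a whole path-component match, so that "/srv/pg/data" does
     * not match "/srv/pg/database", and require something after the '/'.
     */
    if (len > 0 &&
        strncmp(target, pgdata, len) == 0 &&
        target[len] == '/' &&
        target[len + 1] != '\0')
        return target + len + 1;

    return target;
}
```

A tablespace located outside the data directory still yields the full absolute path, so this shortens only the case the TAP tests hit, where the tablespace lives under the test's own directory tree.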
On 10/20/14 2:59 PM, Tom Lane wrote:
> What do we want to do about this? I think a minimum expectation would be
> for pg_basebackup to notice and complain when it's trying to create an
> unworkably long symlink entry, but it would be far better if we found a
> way to cope instead.

Isn't it the backend that should error out before sending truncated
file names?

src/port/tar.c:

    /* Name 100 */
    sprintf(&h[0], "%.99s", filename);

And then do we need to prevent the creation of tablespaces that can't be
backed up?

> One thing we could possibly do without reinventing "tar" is to avoid
> using absolute path names if a PGDATA-relative one would do.

Maybe we could hack up the tar format to store the symlink target as the
file body, like cpio does. Of course then we'd lose the property of this
actually being tar.
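To illustrate the difference between truncating and erroring out, here is a minimal sketch of a checked copy into a 100-byte header field. It is not the code in src/port/tar.c; the function name tar_set_field_checked and the TAR_NAME_FIELD constant are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define TAR_NAME_FIELD 100      /* size of the name and linkname fields */

/*
 * Hypothetical helper: copy "value" into a 100-byte tar header field.
 * Returns 0 on success, or -1 (leaving the field untouched) when the
 * value needs more than 99 bytes plus a terminating NUL. The "%.99s"
 * format would silently cut such a value short instead.
 */
int
tar_set_field_checked(char *field, const char *value)
{
    size_t len;

    assert(field != NULL && value != NULL);
    len = strlen(value);

    if (len > TAR_NAME_FIELD - 1)
        return -1;              /* caller should report an error */

    memset(field, 0, TAR_NAME_FIELD);
    memcpy(field, value, len);
    return 0;
}
```

The caller would turn the -1 into an error report for the offending file or symlink rather than emitting a header that points somewhere else.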
On Tue, Oct 21, 2014 at 12:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
> for pg_basebackup fail when run in a sufficiently deeply-nested directory
> tree. The cause appears to be that we rely on standard "tar" format
> to represent the symlink for a tablespace, and POSIX tar format has a
> hard-wired restriction of 99 bytes in a symlink's expansion.
>
> What do we want to do about this? I think a minimum expectation would be
> for pg_basebackup to notice and complain when it's trying to create an
> unworkably long symlink entry, but it would be far better if we found a
> way to cope instead.
One way to cope with such a situation could be that during backup we create
a backup symlink file which contains listing of symlinks and then archive
recovery recreates it. Basically this is the solution (patch), I have proposed
for Windows [1].
On 10/20/14 2:59 PM, Tom Lane wrote:
> My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
> for pg_basebackup fail when run in a sufficiently deeply-nested directory
> tree.

As for the test, we can do something like the attached to mark the test
as "TODO".
Attachment
On Tue, Oct 28, 2014 at 8:29 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On 10/20/14 2:59 PM, Tom Lane wrote:
>> My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
>> for pg_basebackup fail when run in a sufficiently deeply-nested directory
>> tree.
>
> As for the test, we can do something like the attached to mark the test
> as "TODO".

What does this actually do? It doesn't appear that it's just
disabling the test.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 10/29/14 10:48 AM, Robert Haas wrote:
> On Tue, Oct 28, 2014 at 8:29 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 10/20/14 2:59 PM, Tom Lane wrote:
>>> My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
>>> for pg_basebackup fail when run in a sufficiently deeply-nested directory
>>> tree.
>>
>> As for the test, we can do something like the attached to mark the test
>> as "TODO".
>
> What does this actually do? It doesn't appear that it's just
> disabling the test.

It still runs the tests, but doesn't count the results in whether the
suite passes.
On 10/20/14 4:51 PM, Peter Eisentraut wrote:
> On 10/20/14 2:59 PM, Tom Lane wrote:
>> What do we want to do about this? I think a minimum expectation would be
>> for pg_basebackup to notice and complain when it's trying to create an
>> unworkably long symlink entry, but it would be far better if we found a
>> way to cope instead.
>
> Isn't it the backend that should error out before sending truncated
> files names?
>
> src/port/tar.c:
>
>     /* Name 100 */
>     sprintf(&h[0], "%.99s", filename);

Here are patches to address that. First, it reports errors when
attempting to create a tar header that would truncate file or symlink
names. Second, it works around the problem in the tests by creating a
symlink from the short-name tempdir that we had set up for the
Unix-socket directory case.

The first patch can be backpatched to 9.3. The tar code before that is
different and would need manual adjustments.

If someone has a too-long tablespace path, I think they can work around
that after this patch by creating a shorter symlink and updating the
pg_tblspc symlinks to point there.
Attachment
On 11/4/14 3:52 PM, Peter Eisentraut wrote:
> Here are patches to address that. First, it reports errors when
> attempting to create a tar header that would truncate file or symlink
> names. Second, it works around the problem in the tests by creating a
> symlink from the short-name tempdir that we had set up for the
> Unix-socket directory case.

I ended up splitting this up differently. I applied the part of the
second patch that works around the length issue in tablespaces. So the
tests now pass in 9.4 and up even in working directories with long
names. This clears up the regression in 9.4.

The remaining, not applied patch is attached. It errors when the file
name is too long and adds tests for that. This could be applied to 9.5
and backpatched, if we so choose. It might become obsolete if
https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
If that patch doesn't get accepted, I might add my patch to a future
commit fest.
Attachment
08.11.2014, 04:03, Peter Eisentraut wrote:
> On 11/4/14 3:52 PM, Peter Eisentraut wrote:
>> > Here are patches to address that. First, it reports errors when
>> > attempting to create a tar header that would truncate file or symlink
>> > names. Second, it works around the problem in the tests by creating a
>> > symlink from the short-name tempdir that we had set up for the
>> > Unix-socket directory case.
> I ended up splitting this up differently. I applied to part of the
> second patch that works around the length issue in tablespaces. So the
> tests now pass in 9.4 and up even in working directories with long
> names. This clears up the regression in 9.4.
>
> The remaining, not applied patch is attached. It errors when the file
> name is too long and adds tests for that. This could be applied to 9.5
> and backpatched, if we so choose. It might become obsolete if
> https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
> If that patch doesn't get accepted, I might add my patch to a future
> commit fest.

I think we should just use the UStar tar format
(http://en.wikipedia.org/wiki/Tar_%28computing%29#UStar_format) and
allow long file names; all actively used tar implementations should be
able to handle them. I'll try to write a patch for that soonish.

Until UStar format is used we should raise an error if a filename is
being truncated by tar instead of creating invalid archives.

Also note that POSIX tar format allows 100-byte file names, as the name
doesn't have to be zero terminated, but we may want to stick to 99 bytes
in old-type tar anyway, as using 100-byte filenames has exposed bugs in
other tar implementations, for example
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=689582 - and
truncating at 100 bytes instead of 99 doesn't help us too much anyway.

/ Oskari
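For readers unfamiliar with how UStar stores longer file names, here is a rough sketch. The format adds a 155-byte "prefix" field next to the 100-byte "name" field, and a path is split at a '/' so that each part fits. This is only an illustration of the format, not a proposed patch; the function name ustar_split and the buffer sizes (one extra byte each, to hold C strings) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define USTAR_NAME_LEN   100
#define USTAR_PREFIX_LEN 155

/*
 * Hypothetical helper: split "path" into a UStar prefix and name.
 * Returns 0 on success, -1 if no '/' yields a prefix of at most 155
 * bytes and a non-empty name of at most 100 bytes.
 */
int
ustar_split(const char *path, char prefix[USTAR_PREFIX_LEN + 1],
            char name[USTAR_NAME_LEN + 1])
{
    size_t      len;
    const char *p;

    assert(path != NULL && prefix != NULL && name != NULL);
    len = strlen(path);

    /* Short paths need no prefix at all. */
    if (len <= USTAR_NAME_LEN)
    {
        prefix[0] = '\0';
        memcpy(name, path, len + 1);
        return 0;
    }

    /* Scan slashes from the right: the name grows, the prefix shrinks. */
    for (p = path + len - 1; p > path; p--)
    {
        size_t plen, nlen;

        if (*p != '/')
            continue;
        plen = (size_t) (p - path);
        nlen = len - plen - 1;
        if (nlen > USTAR_NAME_LEN)
            break;              /* further left only makes the name longer */
        if (nlen >= 1 && plen <= USTAR_PREFIX_LEN)
        {
            memcpy(prefix, path, plen);
            prefix[plen] = '\0';
            memcpy(name, p + 1, nlen + 1);
            return 0;
        }
    }
    return -1;
}
```

The split applies to the member's own path only; the sketch does nothing for the link target field.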
On Tue, Dec 23, 2014 at 4:10 AM, Oskari Saarenmaa <os@ohmu.fi> wrote:
>
> 08.11.2014, 04:03, Peter Eisentraut wrote:
> > On 11/4/14 3:52 PM, Peter Eisentraut wrote:
> >> > Here are patches to address that. First, it reports errors when
> >> > attempting to create a tar header that would truncate file or symlink
> >> > names. Second, it works around the problem in the tests by creating a
> >> > symlink from the short-name tempdir that we had set up for the
> >> > Unix-socket directory case.
> > I ended up splitting this up differently. I applied to part of the
> > second patch that works around the length issue in tablespaces. So the
> > tests now pass in 9.4 and up even in working directories with long
> > names. This clears up the regression in 9.4.
> >
> > The remaining, not applied patch is attached. It errors when the file
> > name is too long and adds tests for that. This could be applied to 9.5
> > and backpatched, if we so choose. It might become obsolete if
> > https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
> > If that patch doesn't get accepted, I might add my patch to a future
> > commit fest.
>
> I think we should just use the UStar tar format
> (http://en.wikipedia.org/wiki/Tar_%28computing%29#UStar_format) and
> allow long file names; all actively used tar implementations should be
> able to handle them. I'll try to write a patch for that soonish.
>
I think even using UStar format won't make it work for Windows where
the standard utilities are not able to understand the symlinks in tar.
There is already a patch [1] in this CF which will handle both cases, so I am
not sure if it is a very good idea to go with a new tar format to handle this
issue.
23.12.2014, 05:00, Amit Kapila wrote:
> On Tue, Dec 23, 2014 at 4:10 AM, Oskari Saarenmaa wrote:
>> 08.11.2014, 04:03, Peter Eisentraut wrote:
>> > It errors when the file
>> > name is too long and adds tests for that. This could be applied to 9.5
>> > and backpatched, if we so choose. It might become obsolete if
>> > https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
>> > If that patch doesn't get accepted, I might add my patch to a future
>> > commit fest.
>>
>> I think we should just use the UStar tar format
>> (http://en.wikipedia.org/wiki/Tar_%28computing%29#UStar_format) and
>> allow long file names; all actively used tar implementations should be
>> able to handle them. I'll try to write a patch for that soonish.
>>
>
> I think even using UStar format won't make it work for Windows where
> the standard utilities are not able to understand the symlinks in tar.
> There is already a patch [1] in this CF which will handle both cases, so
> I am not sure if it is very good idea to go with a new tar format to
> handle this issue.
>
> [1] : https://commitfest.postgresql.org/action/patch_view?id=1512

That patch makes sense for 9.5, but I don't think it's going to be
backpatched to previous releases?

I think we should also apply Peter's patch to master and backbranches to
avoid creating invalid tar files anywhere. And optionally implement and
backpatch long filename support in tar even if 9.5 no longer creates tar
files with long names.

/ Oskari
On 12/22/14 5:40 PM, Oskari Saarenmaa wrote:
> I think we should just use the UStar tar format
> (http://en.wikipedia.org/wiki/Tar_%28computing%29#UStar_format) and
> allow long file names; all actively used tar implementations should be
> able to handle them. I'll try to write a patch for that soonish.

UStar doesn't handle long link targets, only long file names (and then
only up to 255 characters, which doesn't seem satisfactory).

AFAICT, to allow long link targets, the available solutions are either
pax extended headers or GNU-specific long-link extra headers.

When I create a symlink with a long target and call tar on it, GNU tar
by default creates the GNU long-link header and BSD tar by default
creates a pax header. But they are both able to extract either one.

As a demo for how this might look, attached is a wildly incomplete patch
to produce GNU long-link headers.
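For orientation, the GNU long-link mechanism works by emitting an extra header block before the real one, with the member name "././@LongLink" and type flag 'K', whose data blocks carry the full link target. The sketch below fills in such a header. It is an illustration of the GNU format written for this purpose, not the patch attached above; the function name gnu_longlink_header and the fixed mode, owner, and mtime values are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TAR_BLOCK 512

/*
 * Hypothetical helper: fill "h" with a GNU 'K' (long link target) header
 * announcing "target". Returns the number of 512-byte data blocks that
 * must follow, holding the target plus its trailing NUL, zero-padded.
 * The ordinary symlink header comes after those blocks.
 */
size_t
gnu_longlink_header(unsigned char h[TAR_BLOCK], const char *target)
{
    size_t       size;
    unsigned int sum = 0;
    int          i;

    assert(h != NULL && target != NULL);
    size = strlen(target) + 1;                  /* payload includes the NUL */

    memset(h, 0, TAR_BLOCK);
    memcpy(h, "././@LongLink", 13);             /* name,  offset 0   */
    memcpy(h + 100, "0000644", 7);              /* mode,  offset 100 */
    memcpy(h + 108, "0000000", 7);              /* uid,   offset 108 */
    memcpy(h + 116, "0000000", 7);              /* gid,   offset 116 */
    snprintf((char *) h + 124, 12, "%011lo",    /* size,  offset 124 */
             (unsigned long) size);
    memcpy(h + 136, "00000000000", 11);         /* mtime, offset 136 */
    memset(h + 148, ' ', 8);                    /* checksum counts as spaces */
    h[156] = 'K';                               /* typeflag: long link target */
    memcpy(h + 257, "ustar  ", 8);              /* GNU magic: two spaces + NUL */

    /* Checksum: byte sum of the whole header, stored as 6 octal digits. */
    for (i = 0; i < TAR_BLOCK; i++)
        sum += h[i];
    snprintf((char *) h + 148, 7, "%06o", sum);
    h[155] = ' ';

    return (size + TAR_BLOCK - 1) / TAR_BLOCK;
}
```

Long file names use the same scheme with type flag 'L' instead of 'K'.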
Attachment
On 12/22/14 10:00 PM, Amit Kapila wrote:
> There is already a patch [1] in this CF which will handle both cases, so
> I am not sure if it is very good idea to go with a new tar format to
> handle this issue.

I think it would still make sense to have proper symlinks in the
basebackup if possible, for clarity.
On Wed, Dec 24, 2014 at 8:12 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On 12/22/14 10:00 PM, Amit Kapila wrote:
>> There is already a patch [1] in this CF which will handle both cases, so
>> I am not sure if it is very good idea to go with a new tar format to
>> handle this issue.
>
> I think it would still make sense to have proper symlinks in the
> basebackup if possible, for clarity.

I guess I would have assumed it would be more clear to omit the
symlinks if we're expecting the server to put them in. Otherwise, the
server has to remove the existing symlinks and create new ones, which
introduces various possibilities for failure and confusion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 12/27/14 8:02 PM, Robert Haas wrote:
> On Wed, Dec 24, 2014 at 8:12 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 12/22/14 10:00 PM, Amit Kapila wrote:
>>> There is already a patch [1] in this CF which will handle both cases, so
>>> I am not sure if it is very good idea to go with a new tar format to
>>> handle this issue.
>>
>> I think it would still make sense to have proper symlinks in the
>> basebackup if possible, for clarity.
>
> I guess I would have assumed it would be more clear to omit the
> symlinks if we're expecting the server to put them in. Otherwise, the
> server has to remove the existing symlinks and create new ones, which
> introduces various possibilities for failure and confusion.

Currently, when you unpack a tarred basebackup with tablespaces, the
symlinks will tell you whether you have unpacked the tablespace tars at
the right place. Otherwise, how do you know? Secondly, you also have
the option of putting the tablespaces somewhere else by changing the
symlinks. Under the new scheme, the existing symlinks would be
overwritten (or not?). If that is actually correct, then the proposed
fix doesn't really replicate the required functionality on Windows.

One way to address this would be to do away with the symlinks altogether
and have pg_tblspc/12345 be a text file that contains the tablespace
location. Kind of symlinks implemented in user space.
On Wed, Jan 7, 2015 at 3:03 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>
> On 12/27/14 8:02 PM, Robert Haas wrote:
> > On Wed, Dec 24, 2014 at 8:12 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> >> On 12/22/14 10:00 PM, Amit Kapila wrote:
> >>> There is already a patch [1] in this CF which will handle both cases, so
> >>> I am
> >>> not sure if it is very good idea to go with a new tar format to handle this
> >>> issue.
> >>
> >> I think it would still make sense to have proper symlinks in the
> >> basebackup if possible, for clarity.
> >
> > I guess I would have assumed it would be more clear to omit the
> > symlinks if we're expecting the server to put them in. Otherwise, the
> > server has to remove the existing symlinks and create new ones, which
> > introduces various possibilities for failure and confusion.
>
> Currently, when you unpack a tarred basebackup with tablespaces, the
> symlinks will tell you whether you have unpacked the tablespace tars at
> the right place. Otherwise, how do you know?
Via some kind of tablespace map file, which will tell us the exact
location each symlink needs to point to; the same file will be used
to create the symlinks. So after you unpack a tarred basebackup with
tablespaces, there will be no symlinks; when you start the server
(archive recovery) using the base backup, it will create the appropriate
symlinks.
> Secondly, you also have
> the option of putting the tablespaces somewhere else by changing the
> symlinks. Under the new scheme, the existing symlinks would be
> overwritten (or not?). If that is actually correct, then the proposed
> fix doesn't really replicate the required functionality on Windows.
>
> One way to address this would be to do away with the symlinks altogether
> and have pg_tblspc/12345 be a text file that contains the tablespace
> location. Kind of symlinks implemented in user space.
>
I think this is somewhat similar to what the existing patch [1] does, with
the difference that there is just one file for all the tablespace locations
rather than an individual file in each tablespace directory.
[1] : https://commitfest.postgresql.org/action/patch_view?id=1512
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
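To make the map-file idea concrete, here is a minimal sketch of reading one entry from such a file. The thread does not specify a file format, so the layout used here (one line per tablespace: the OID, a single space, then the location running to the end of the line) is purely an assumption, as is the function name parse_tablespace_map_line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one line of an assumed "OID location" map.
 * On success stores the OID and the NUL-terminated location and returns 0.
 * Returns -1 for malformed lines or locations that do not fit in "path".
 */
int
parse_tablespace_map_line(const char *line, unsigned int *oid,
                          char *path, size_t pathlen)
{
    int         consumed = 0;
    const char *start;
    size_t      len;

    assert(line != NULL && oid != NULL && path != NULL);

    /* Read the OID, then insist on exactly one separating space. */
    if (sscanf(line, "%u%n", oid, &consumed) != 1 || line[consumed] != ' ')
        return -1;

    /* Everything up to the line ending is the location; spaces are kept. */
    start = line + consumed + 1;
    len = strcspn(start, "\r\n");
    if (len == 0 || len >= pathlen)
        return -1;

    memcpy(path, start, len);
    path[len] = '\0';
    return 0;
}
```

Because the location is stored as ordinary file content rather than in a tar header field, its length is limited only by the buffer the reader chooses, which is what sidesteps the 99-byte restriction.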
On Tue, Jan 6, 2015 at 4:33 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> Currently, when you unpack a tarred basebackup with tablespaces, the
> symlinks will tell you whether you have unpacked the tablespace tars at
> the right place. Otherwise, how do you know? Secondly, you also have
> the option of putting the tablespaces somewhere else by changing the
> symlinks.

That's a good argument for making the tablespace-map file
human-readable and human-editable, but I don't think it's an argument
for duplicating its contents inaccurately in the filesystem.

> One way to address this would be to do away with the symlinks altogether
> and have pg_tblspc/12345 be a text file that contains the tablespace
> location. Kind of symlinks implemented in user space.

Well, that's just spreading the tablespace-map file out into several
files, and maybe keeping it around after we've restored from backup.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 1/7/15 3:19 PM, Robert Haas wrote:
> On Tue, Jan 6, 2015 at 4:33 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> Currently, when you unpack a tarred basebackup with tablespaces, the
>> symlinks will tell you whether you have unpacked the tablespace tars at
>> the right place. Otherwise, how do you know? Secondly, you also have
>> the option of putting the tablespaces somewhere else by changing the
>> symlinks.
>
> That's a good argument for making the tablespace-map file
> human-readable and human-editable, but I don't think it's an argument
> for duplicating its contents inaccurately in the filesystem.
>
>> One way to address this would be to do away with the symlinks altogether
>> and have pg_tblspc/12345 be a text file that contains the tablespace
>> location. Kind of symlinks implemented in user space.
>
> Well, that's just spreading the tablespace-map file out into several
> files, and maybe keeping it around after we've restored from backup.

I think the key point I'm approaching is that the information should
only ever be in one place, all the time. This is not dissimilar from
why we took the tablespace location out of the system catalogs. Users
might have all kinds of workflows for how they back up, restore, and
move their tablespaces. This works pretty well right now, because the
authoritative configuration information is always in plain view. The
proposal is essentially that we add another location for this
information, because the existing location is incompatible with some
operating system tools. And, when considered by a user, that second
location might or might not collide with or overwrite the first location
at some mysterious times.

So I think the preferable fix is not to add a second location, but to
make the first location compatible with said operating system tools,
possibly in the way I propose above.
On Tue, Jan 13, 2015 at 4:41 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> I think the key point I'm approaching is that the information should
> only ever be in one place, all the time. This is not dissimilar from
> why we took the tablespace location out of the system catalogs. Users
> might have all kinds of workflows for how they back up, restore, and
> move their tablespaces. This works pretty well right now, because the
> authoritative configuration information is always in plain view. The
> proposal is essentially that we add another location for this
> information, because the existing location is incompatible with some
> operating system tools. And, when considered by a user, that second
> location might or might not collide with or overwrite the first location
> at some mysterious times.
>
> So I think the preferable fix is not to add a second location, but to
> make the first location compatible with said operating system tools,
> possibly in the way I propose above.

I see. I'm a little concerned that following symlinks may be cheaper
than whatever system we would come up with for caching the
tablespace-name-to-file-name mappings. But that concern might be
unfounded, and apart from it I have no reason to oppose your proposal,
if you want to do the work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
At 2014-12-24 08:10:46 -0500, peter_e@gmx.net wrote:
>
> As a demo for how this might look, attached is a wildly incomplete
> patch to produce GNU long-link headers.

Hi Peter.

In what way exactly is this patch wildly incomplete? (I ask because it's
been added to the current CF).

-- Abhijit
On Fri, Nov 7, 2014 at 9:03 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On 11/4/14 3:52 PM, Peter Eisentraut wrote:
>> Here are patches to address that. First, it reports errors when
>> attempting to create a tar header that would truncate file or symlink
>> names. Second, it works around the problem in the tests by creating a
>> symlink from the short-name tempdir that we had set up for the
>> Unix-socket directory case.
>
> I ended up splitting this up differently. I applied to part of the
> second patch that works around the length issue in tablespaces. So the
> tests now pass in 9.4 and up even in working directories with long
> names. This clears up the regression in 9.4.
>
> The remaining, not applied patch is attached. It errors when the file
> name is too long and adds tests for that. This could be applied to 9.5
> and backpatched, if we so choose. It might become obsolete if
> https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
> If that patch doesn't get accepted, I might add my patch to a future
> commit fest.

I think we should commit this, where by "this" I mean your patch to
error-check the length of filenames and symlinks instead of truncating
them. I don't know what will become of Amit's patch, but I think this
is a good idea anyway. We should perhaps even consider back-patching
it, because silently eating people's data is generally not cool. It's
possible that there are people out there who know that their filenames
and links are being truncated and don't care, and those people would be
unhappy to see this back-patched. However, it's also possible that
there are people who don't know that this is happening and do care, and
those people would be happy about a back-patch. I don't know which
group is larger.

At the least, I think we should apply it to master; because whatever we
end up doing about Amit's patch, adding error checks for conditions
where we're chewing up somebody's filenames and spitting out what's left
over has got to be a good thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2/2/15 8:58 AM, Robert Haas wrote:
> I think we should commit this, where by "this" I mean your patch to
> error-check the length of filenames and symlinks instead of truncating
> them.

done
On 1/23/15 3:26 AM, Abhijit Menon-Sen wrote:
> At 2014-12-24 08:10:46 -0500, peter_e@gmx.net wrote:
>>
>> As a demo for how this might look, attached is a wildly incomplete
>> patch to produce GNU long-link headers.
>
> Hi Peter.
>
> In what way exactly is this patch wildly incomplete? (I ask because it's
> been added to the current CF).

This patch is not in the commit fest. It's just the most recently
posted patch-like attachment in this thread, which confuses the new CF
app.