Thread: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Ashutosh Sharma
Date:
Hi All,
I tried running pg_basebackup after creating symbolic links for pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source data directory, and found that in the backup location pg_stat_tmp and pg_replslot are corrupt files rather than links or directories, whereas pg_clog and pg_log are skipped. According to the pg_basebackup documentation, symbolic links to any directories other than tablespaces and xlog should be skipped. But this is not true for pg_replslot and pg_stat_tmp. The reason is as follows:
pg_basebackup expects pg_stat_tmp/pg_replslot to be a directory and, irrespective of whether it is empty or not, always includes it as an empty directory in the backup. In my case I created a soft link for pg_stat_tmp/pg_replslot, and pg_basebackup tries to create a tar-format header without changing the file mode as it does for pg_xlog. Also, the link path is not considered as it is for pg_tblspc. This is why a regular file is created in the backup path even though I have a symbolic link in the source path; ideally it should be skipped.
Solution: Skip pg_stat_tmp and pg_replslot if they are symbolic links. Attached is a patch that fixes this issue.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Attachment
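To illustrate the failure mode described in the message above, here is a minimal Python sketch. It is not the code in basebackup.c: `choose_type` and `as_directory_mode` are hypothetical helpers, and the selection rule (symlink header only when a link target is passed, directory header only when the mode says directory, regular file otherwise) is an assumption drawn from the description in this message.

```python
import stat
import tarfile


def choose_type(mode, link_target=None):
    """Hypothetical header-type selection, assumed from the report above."""
    if link_target:
        # A link target was supplied (the pg_tblspc-style case).
        return tarfile.SYMTYPE
    if stat.S_ISDIR(mode):
        return tarfile.DIRTYPE
    # Anything else, including a symlink mode with no target, ends up here.
    return tarfile.REGTYPE


def as_directory_mode(mode):
    """Replace the file-type bits with S_IFDIR and keep the permission bits."""
    return stat.S_IFDIR | stat.S_IMODE(mode)


# Mode as lstat() would report it for a symlinked pg_stat_tmp.
symlink_mode = stat.S_IFLNK | 0o777

buggy = choose_type(symlink_mode)                     # regular-file header
fixed = choose_type(as_directory_mode(symlink_mode))  # directory header
linked = choose_type(symlink_mode, "/some/target")    # symlink header
```

Under these assumptions, the entry only becomes a directory or a link in the archive if the caller either rewrites the mode or passes the link target; doing neither yields a regular-file entry.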
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Andres Freund
Date:
Hi,
On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
> I tried performing pg_basebackup after creating a symbolic link for
> pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directory
That's not supported, and I strongly suspect that you're going to hit
more than just this issue. The only directory you're allowed to symlink
away is pg_xlog.
What were you actually trying to achieve?
Greetings,
Andres Freund
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Ashutosh Sharma
Date:
Hi,
I was just curious to know how pg_basebackup would behave if we create symbolic links for directories other than pg_xlog/pg_tblspc. However, the pg_basebackup documentation clearly states that a symbolic link for any directory other than pg_tblspc and pg_xlog will be skipped. That is not the case for pg_replslot and pg_stat_tmp. Is this not an issue? Should these directories not be skipped? Please let me know your thoughts on this. Thanks.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 14, 2016 at 8:42 PM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
> On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
> > I tried performing pg_basebackup after creating a symbolic link for
> > pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directory
> That's not supported, and I strongly suspect that you're going to hit
> more than just this issue. The only directory you're allowed to symlink
> away is pg_xlog.
> What were you actually trying to achieve?
> Greetings,
> Andres Freund
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Magnus Hagander
Date:
On Thu, Apr 14, 2016 at 8:20 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> Hi,
> I was just curious to know how would "pg_basebackup" behave if we do create a symbolic link for directories other than pg_xlog/pg_tblspc. However it is clearly mentioned in the documentation of pg_basebackup that if a Symbolic link for the directories other than pg_tblspc and pg_xlog is created then it would be skipped. But, that is not the case for pg_replslot and pg_stat_tmp. Is this not an issue? Should these directories not be skipped? Please let me know your thoughts on this. Thanks.
I agree that generating a corrupt tar file is not good. But I think the correct fix is to generate an empty placeholder directory rather than skipping it, so that the backup looks the same as it would if it were a real directory whose contents we just skipped.
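The placeholder approach suggested above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the server's implementation: `SKIP_CONTENTS`, `backup_top_level` and `demo` are hypothetical names, the sketch only looks at the top level of a made-up data directory, and the placeholder mode 0700 is an arbitrary choice.

```python
import io
import os
import tarfile
import tempfile

# Directories whose contents are skipped (names taken from this thread).
SKIP_CONTENTS = {"pg_stat_tmp", "pg_replslot"}


def backup_top_level(datadir):
    """Write top-level entries to an in-memory tar archive.

    Entries in SKIP_CONTENTS always become an empty directory entry,
    whether the source is a real directory or a symbolic link.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(os.listdir(datadir)):
            if name in SKIP_CONTENTS:
                info = tarfile.TarInfo(name)
                info.type = tarfile.DIRTYPE  # placeholder, no contents
                info.mode = 0o700            # assumed permission bits
                tar.addfile(info)
            else:
                tar.add(os.path.join(datadir, name), arcname=name)
    buf.seek(0)
    return buf


def demo():
    """Build a toy data directory with a symlinked pg_stat_tmp and back it up."""
    with tempfile.TemporaryDirectory() as root:
        datadir = os.path.join(root, "data")
        elsewhere = os.path.join(root, "elsewhere")
        os.mkdir(datadir)
        os.mkdir(elsewhere)
        with open(os.path.join(elsewhere, "stats.tmp"), "w") as f:
            f.write("x")
        os.symlink(elsewhere, os.path.join(datadir, "pg_stat_tmp"))
        os.mkdir(os.path.join(datadir, "pg_replslot"))
        os.mkdir(os.path.join(datadir, "base"))
        with tarfile.open(fileobj=backup_top_level(datadir)) as tar:
            return {m.name: m.isdir() for m in tar.getmembers()}
```

With this rule the archive looks the same whether pg_stat_tmp was a directory or a link, which is the property the message above asks for.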
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Alvaro Herrera
Date:
Magnus Hagander wrote:
> On Thu, Apr 14, 2016 at 8:20 PM, Ashutosh Sharma <ashu.coek88@gmail.com>
> wrote:
>
> > I was just curious to know how would "*pg_basebackup*" behave if we do
> > create a symbolic link for directories other than pg_xlog/pg_tblspc.
> > However it is clearly mentioned in the documentation of pg_basebackup that
> > if a Symbolic link for the directories other than pg_tblspc and pg_xlog is
> > created then it would be skipped. But, that is not the case for pg_replslot
> > and pg_stat_tmp. Is this not an issue. Should these directories not be
> > skipped. Please let me know your thoughts on this. Thanks.
>
> I agree that actually generating a corrupt tarfile is not good. But I think
> the correct fix is to actually generate an empty placeholder directory
> rather than skipping it - thereby making the backup look the same as it
> would if it was a correct directory where we just skipped the contents.
Hmm, if your server has replication slots and pg_basebackup doesn't copy
them, is the copy okay? It may be a waste to try to create all the
decoded data if there was any spillage, but perhaps it should at least
backup the state? If this is excessive datadir knowledge in
pg_basebackup, perhaps it should throw an error if some of the critical
subdirs aren't copied?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Magnus Hagander
Date:
On Thu, Apr 14, 2016 at 8:44 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Magnus Hagander wrote:
> > On Thu, Apr 14, 2016 at 8:20 PM, Ashutosh Sharma <ashu.coek88@gmail.com>
> > wrote:
> >
> > > I was just curious to know how would "*pg_basebackup*" behave if we do
> > > create a symbolic link for directories other than pg_xlog/pg_tblspc.
> > > However it is clearly mentioned in the documentation of pg_basebackup that
> > > if a Symbolic link for the directories other than pg_tblspc and pg_xlog is
> > > created then it would be skipped. But, that is not the case for pg_replslot
> > > and pg_stat_tmp. Is this not an issue. Should these directories not be
> > > skipped. Please let me know your thoughts on this. Thanks.
> >
> > I agree that actually generating a corrupt tarfile is not good. But I think
> > the correct fix is to actually generate an empty placeholder directory
> > rather than skipping it - thereby making the backup look the same as it
> > would if it was a correct directory where we just skipped the contents.
> Hmm, if your server has replication slots and pg_basebackup doesn't copy
> them, is the copy okay?
If pg_replslot is a directory and not a symlink, we skip the contents and create an empty directory. So it'd better be ok...
> It may be a waste to try to create all the
> decoded data if there was any spillage, but perhaps it should at least
> backup the state? If this is excessive datadir knowledge in
> pg_basebackup, perhaps it should throw an error if some of the critical
> subdirs aren't copied?
The knowledge isn't actually in pg_basebackup, it's in basebackup.c in backend/replication.
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Robert Haas
Date:
On Thu, Apr 14, 2016 at 11:12 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
>> I tried performing pg_basebackup after creating a symbolic link for
>> pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directory
>
> That's not supported, and I strongly suspect that you're going to hit
> more than just this issue. The only directory you're allowed to symlink
> away is pg_xlog.
I think various tools might choke on such a configuration, but I'm not
entirely sure why we haven't made them all work. Is there some more
fundamental problem?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Andres Freund
Date:
On 2016-04-14 14:55:37 -0400, Robert Haas wrote:
> On Thu, Apr 14, 2016 at 11:12 AM, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
> >> I tried performing pg_basebackup after creating a symbolic link for
> >> pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directory
> >
> > That's not supported, and I strongly suspect that you're going to hit
> > more than just this issue. The only directory you're allowed to symlink
> > away is pg_xlog.
>
> I think various tools might choke on such a configuration, but I'm not
> entirely sure why we haven't made them all work. Is there some more
> fundamental problem?
I don't think there's a fundamental problem, just that it'd require
adding code to numerous places, and that it'd make it harder to reason
about things. For example you have to be a lot more careful about
iterating over the data directory, because loops suddenly are a major
concern. Fsyncing files & directories suddenly needs special care to
fsync the correct directory for a file, lest it's symlinked somewhere.
It's harder to perform checks like the "not at the top of the
filesystem" when every file and directory can be somewhere else.
- Andres
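The loop concern raised above can be illustrated with a short Python sketch. It is an assumption-laden example, not code from PostgreSQL or any backup tool: `walk_following_links` and `demo` are hypothetical names, and identifying directories by `(st_dev, st_ino)` is one common technique that assumes a POSIX-like filesystem.

```python
import os
import tempfile


def walk_following_links(top):
    """Visit directories under top, following symlinks, without looping.

    Each directory is identified by (st_dev, st_ino) after link
    resolution; a directory that was already visited is not entered again.
    """
    seen = set()
    visited = []

    def visit(path):
        st = os.stat(path)  # os.stat follows symbolic links
        key = (st.st_dev, st.st_ino)
        if key in seen:
            return  # already visited, possibly through a link cycle
        seen.add(key)
        visited.append(path)
        for name in sorted(os.listdir(path)):
            child = os.path.join(path, name)
            if os.path.isdir(child):  # true for links to directories too
                visit(child)

    visit(top)
    return visited


def demo():
    """Create top/a/loop -> top and count the directories visited."""
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "a"))
        os.symlink(top, os.path.join(top, "a", "loop"))
        return len(walk_following_links(top))
```

Note that this check also suppresses a second visit to a directory reachable through two different paths, so a real tool would have to decide whether that is the behavior it wants.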
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
David Steele
Date:
On 4/14/16 2:55 PM, Robert Haas wrote:
> On Thu, Apr 14, 2016 at 11:12 AM, Andres Freund <andres@anarazel.de> wrote:
>> On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
>>> I tried performing pg_basebackup after creating a symbolic link for
>>> pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directory
>>
>> That's not supported, and I strongly suspect that you're going to hit
>> more than just this issue. The only directory you're allowed to symlink
>> away is pg_xlog.
>
> I think various tools might choke on such a configuration, but I'm not
> entirely sure why we haven't made them all work. Is there some more
> fundamental problem?
I don't think there's a fundamental problem; it just takes a lot of work
to get it right. Extensive link support has been added for pgBackRest 1.0
and it took a lot of effort, not just for the backup itself (loop
detection, links into multiple hierarchy levels, etc.) but adding options
to the restore to make remapping possible on systems with different disk
layouts.
--
-David
david@pgmasters.net
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
David Steele
Date:
On 4/14/16 3:01 PM, Andres Freund wrote:
> On 2016-04-14 14:55:37 -0400, Robert Haas wrote:
>> On Thu, Apr 14, 2016 at 11:12 AM, Andres Freund <andres@anarazel.de> wrote:
>>> On 2016-04-14 13:43:34 +0530, Ashutosh Sharma wrote:
>>>> I tried performing pg_basebackup after creating a symbolic link for
>>>> pg_replslot, pg_stat_tmp, pg_log and pg_clog in the source directory
>>>
>>> That's not supported, and I strongly suspect that you're going to hit
>>> more than just this issue. The only directory you're allowed to symlink
>>> away is pg_xlog.
>>
>> I think various tools might choke on such a configuration, but I'm not
>> entirely sure why we haven't made them all work. Is there some more
>> fundamental problem?
>
> <...> Fsyncing files & directories suddenly needs special care to
> fsync the correct directory for a file, lest it's symlinked somewhere.
That's a good point. I'll need to make sure pgBackRest is correctly
handling that case on restore.
--
-David
david@pgmasters.net
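The fsync point quoted above can be sketched in Python as follows. This is a hedged illustration for POSIX systems only, and it does not describe how PostgreSQL or pgBackRest handle the case: `real_parent`, `fsync_file_and_parent` and `demo` are hypothetical names. The idea is that when a file is reached through a symlink, the directory holding its actual entry is the parent of the resolved path, not the parent of the path as written.

```python
import os
import tempfile


def real_parent(path):
    """Directory that actually contains the file, after resolving symlinks."""
    return os.path.dirname(os.path.realpath(path))


def fsync_file_and_parent(path):
    """fsync a file, then the directory that really contains it (POSIX only)."""
    fd = os.open(path, os.O_RDONLY)  # follows the symlink to the real file
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    dfd = os.open(real_parent(path), os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)


def demo():
    """Show that the lexical parent and the real parent can differ."""
    with tempfile.TemporaryDirectory() as root:
        data = os.path.join(root, "data")
        elsewhere = os.path.join(root, "elsewhere")
        os.mkdir(data)
        os.mkdir(elsewhere)
        target = os.path.join(elsewhere, "file")
        with open(target, "w") as f:
            f.write("x")
        link = os.path.join(data, "file")
        os.symlink(target, link)
        fsync_file_and_parent(link)
        lexical = os.path.basename(os.path.dirname(link))
        actual = os.path.basename(real_parent(link))
        return lexical, actual
```

In the demo the path as written lives under "data" while the file's directory entry lives under "elsewhere", so syncing only the lexical parent would miss the directory that matters.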
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Ashutosh Sharma
Date:
Hi,
Knowing that pg_basebackup always creates an empty directory for pg_stat_tmp and pg_replslot in the backup location, I also think it would be better to handle these directories so that pg_basebackup generates an empty directory for pg_replslot and pg_stat_tmp if they are symbolic links.
PFA patch for the same.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 14, 2016 at 11:57 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Apr 14, 2016 at 8:20 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > Hi,
> > I was just curious to know how would "pg_basebackup" behave if we do create a symbolic link for directories other than pg_xlog/pg_tblspc. However it is clearly mentioned in the documentation of pg_basebackup that if a Symbolic link for the directories other than pg_tblspc and pg_xlog is created then it would be skipped. But, that is not the case for pg_replslot and pg_stat_tmp. Is this not an issue. Should these directories not be skipped. Please let me know your thoughts on this. Thanks.
> I agree that actually generating a corrupt tarfile is not good. But I think the correct fix is to actually generate an empty placeholder directory rather than skipping it - thereby making the backup look the same as it would if it was a correct directory where we just skipped the contents.
Attachment
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Peter Eisentraut
Date:
On 4/26/16 5:02 AM, Ashutosh Sharma wrote:
> Knowing that pg_basebackup always creates an empty directory for
> pg_stat_tmp and pg_replslot in backup location, even i think it would be
> better to handle these directories in such a way that pg_basebackup
> generates an empty directory for pg_replslot and pg_stat_tmp if they are
> symbolic link.
I just wanted to update you, I have taken this commit fest entry patch
to review because I think it will be addressed as part of "Exclude
additional directories in pg_basebackup", which I'm also reviewing.
Therefore, I'm not actually planning on discussing this patch further.
Please correct me if this assessment does not match your expectations.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
David Steele
Date:
On 9/23/16 2:12 PM, Peter Eisentraut wrote:
> On 4/26/16 5:02 AM, Ashutosh Sharma wrote:
>> Knowing that pg_basebackup always creates an empty directory for
>> pg_stat_tmp and pg_replslot in backup location, even i think it would be
>> better to handle these directories in such a way that pg_basebackup
>> generates an empty directory for pg_replslot and pg_stat_tmp if they are
>> symbolic link.
>
> I just wanted to update you, I have taken this commit fest entry patch
> to review because I think it will be addressed as part of "Exclude
> additional directories in pg_basebackup", which I'm also reviewing.
> Therefore, I'm not actually planning on discussing this patch further.
> Please correct me if this assessment does not match your expectations.
Just to be clear, and as I noted in [1], I pulled the symlink logic from
this patch into the exclusion patch [2]. I think it makes sense to
only commit [2] as long as Ashutosh gets credit for his contribution.
Thanks,
--
-David
david@pgmasters.net
[1] https://www.postgresql.org/message-id/raw/ced3b05f-c1d9-c262-ce63-9744ef7e6de8%40pgmasters.net
[2] https://commitfest.postgresql.org/10/721/
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Ashutosh Sharma
Date:
Hi Peter,
> I just wanted to update you, I have taken this commit fest entry patch
> to review because I think it will be addressed as part of "Exclude
> additional directories in pg_basebackup", which I'm also reviewing.
> Therefore, I'm not actually planning on discussing this patch further.
> Please correct me if this assessment does not match your expectations.
Thanks for the update. I am absolutely OK with it. I feel it would be
a good idea to review "Exclude additional directories in
pg_basebackup", which also addresses the issue I reported.
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Re: pg_basebackup creates a corrupt file for pg_stat_tmp and pg_replslot on a backup location
From
Peter Eisentraut
Date:
On 9/25/16 8:06 AM, Ashutosh Sharma wrote:
> Hi Peter,
>
>> I just wanted to update you, I have taken this commit fest entry patch
>> to review because I think it will be addressed as part of "Exclude
>> additional directories in pg_basebackup", which I'm also reviewing.
>> Therefore, I'm not actually planning on discussing this patch further.
>> Please correct me if this assessment does not match your expectations.
>
> Thanks for the update. I am absolutely OK with it. I feel it would be
> a good idea to review "Exclude additional directories in
> pg_basebackup" which also addresses the issue reported by me.
That has been committed.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services