Thread: Re: [HACKERS] Bug with pg_basebackup and 'shared' tablespace
Hi, Pierre. Maybe you're the winner:p At Thu, 06 Apr 2017 12:34:09 +0200, Pierre Ducroquet <p.psql@pinaraf.info> wrote in <1714428.BHRm6e8A2D@peanuts2> > On Thursday, April 6, 2017 2:00:55 PM CEST Kyotaro HORIGUCHI wrote: > > https://www.postgresql.org/docs/9.6/static/manage-ag-tablespaces.html > > > > | The location must be an existing, empty directory that is owned > > | by the PostgreSQL operating system user. > > > > This explicitly prohibits to share one tablespace directory among > > multiple servers. The code is just missing the case of multiple > > servers with different versions. I think the bug is rather that > > Pg9.6 in the case allowed to create the tablespace. > > > > The current naming rule of tablespace directory was introduced as > > of 9.0 so that pg_upgrade (or pg_migrator at the time) can > > perform in-place migration. It is not intended to share a > > directory among multiple instances with different versions. > > > > That being said, an additional trick in the attached file will > > work for you. > > Thanks for your answer. > Indeed, either PostgreSQL should enforce that empty folder restriction, or > pg_basebackup should lift it and the documentation should reflect this. That being said, it is a different matter if the behavior is preferable. The discussion on the behavior is continued here. https://www.postgresql.org/message-id/20170406.160844.120459562.horiguchi.kyotaro@lab.ntt.co.jp > Right now, there is a conflict between pg_basebackup and the server since they > do not allow the same behaviour. I can submit a patch either way, but I won't > decide what is the right way to do it. > I know tricks will allow to work around that issue, I found them hopefully and > I guess most people affected by this issue would be able to find and use them, > but nevertheless being able to build a server that can no longer be base- > backuped does not seem right. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On Friday, April 7, 2017 3:12:58 AM CEST, Kyotaro HORIGUCHI wrote: > Hi, Pierre. > > Maybe you're the winner:p > > At Thu, 06 Apr 2017 12:34:09 +0200, Pierre Ducroquet > <p.psql@pinaraf.info> wrote in <1714428.BHRm6e8A2D@peanuts2> >> On Thursday, April 6, 2017 2:00:55 PM CEST Kyotaro HORIGUCHI wrote: ... > > That being said, it is a different matter if the behavior is > preferable. The discussion on the behavior is continued here. > > https://www.postgresql.org/message-id/20170406.160844.120459562.horiguchi.kyotaro@lab.ntt.co.jp > >> Right now, there is a conflict between pg_basebackup and the >> server since they >> do not allow the same behaviour. I can submit a patch either >> way, but I won't >> decide what is the right way to do it. >> I know tricks will allow to work around that issue, I found >> them hopefully and >> I guess most people affected by this issue would be able to >> find and use them, >> but nevertheless being able to build a server that can no longer be base- >> backuped does not seem right. > > regards, Hi I didn't have much time to spend on that issue until today, and I found a way to solve it that seems acceptable to me. The biggest drawback will be that if the backup is interrupted, previous tablespaces already copied will stay on disk, but since that issue could already happen, I don't think it's too much a drawback. The patch simply delays the empty folder checking until we start reading the tablespace tarball. The first entry of the tarball being the PG_MAJOR_CATVER folder, that can not be found otherwise, there is no real alternative to that. I will submit this patch in the current commit fest. Regards Pierre -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Hi, I noticed this just now. At Mon, 01 May 2017 19:28:59 +0200, Pierre Ducroquet <p.psql@pinaraf.info> wrote in <05c62730-8670-4da6-b783-52e66fb42232@pinaraf.info> > I didn't have much time to spend on that issue until today, and I > found a way to solve it that seems acceptable to me. > > The biggest drawback will be that if the backup is interrupted, > previous tablespaces already copied will stay on disk, but since that > issue could already happen, I don't think it's too much a drawback. > The patch simply delays the empty folder checking until we start > reading the tablespace tarball. The first entry of the tarball being > the PG_MAJOR_CATVER folder, that can not be found otherwise, there is > no real alternative to that. > > I will submit this patch in the current commit fest. My concern is the behavior different from server, it accepts existing catver directory if it is empty. Anyway, I think it's better that ReceiveAndUnpackTarFile() doesn't accept any existing direcotry. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, May 2, 2017 at 2:28 AM, Pierre Ducroquet <p.psql@pinaraf.info> wrote: > I didn't have much time to spend on that issue until today, and I found a > way to solve it that seems acceptable to me. > > The biggest drawback will be that if the backup is interrupted, previous > tablespaces already copied will stay on disk, but since that issue could > already happen, I don't think it's too much a drawback. Yeah... Perhaps cleanup_directories_atexit() could be made smarter by registering the list of folders to cleanup when they are created. But that would be for another patch. > The patch simply delays the empty folder checking until we start reading the > tablespace tarball. The first entry of the tarball being the PG_MAJOR_CATVER > folder, that can not be found otherwise, there is no real alternative to > that. I think I like this idea. The server allows tablespaces to be created in parallel for different versions in the same path. It seems that pg_basebackup should allow the same for consistency. The current behavior is present since pg_basebackup has been introduced by the way. Perhaps Magnus has some insight to offer on this. > I will submit this patch in the current commit fest. I have not spotted any flaws in the refactored logic. bool basetablespace; + bool firstfile = 1; char *copybuf = NULL; booleans are assigned with true or false. -- Michael
> On 15 May 2017, at 07:26, Michael Paquier <michael.paquier@gmail.com> wrote: > > On Tue, May 2, 2017 at 2:28 AM, Pierre Ducroquet <p.psql@pinaraf.info> wrote: > >> I will submit this patch in the current commit fest. > > I have not spotted any flaws in the refactored logic. This patch no longer applies, could you take a look at it and submit a new version rebased on top of HEAD? cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Wednesday, September 13, 2017 6:01:43 PM CEST you wrote: > > On 15 May 2017, at 07:26, Michael Paquier <michael.paquier@gmail.com> > > wrote:> > > On Tue, May 2, 2017 at 2:28 AM, Pierre Ducroquet <p.psql@pinaraf.info> wrote: > >> I will submit this patch in the current commit fest. > > > > I have not spotted any flaws in the refactored logic. > > This patch no longer applies, could you take a look at it and submit a new > version rebased on top of HEAD? > > cheers ./daniel Hi All my apologies for the schockingly long time with no answer on this topic. Attached to this email is the new version of the patch, checked against HEAD and REL_10_STABLE, with the small change required by Michael (affect true/ false to the boolean instead of 1/0) applied. I will do my best to help review some patches in the current CF. Pierre -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Thu, Sep 14, 2017 at 2:17 AM, Pierre Ducroquet <p.psql@pinaraf.info> wrote: > All my apologies for the schockingly long time with no answer on this topic. No problem. That's the concept called life I suppose. > I will do my best to help review some patches in the current CF. Thanks for the potential help! > Attached to this email is the new version of the patch, checked against HEAD > and REL_10_STABLE, with the small change required by Michael (affect true/ > false to the boolean instead of 1/0) applied. Thanks for the new version. So I have been pondering about this patch, and allowing multiple versions of Postgres to run in the same tablespace is mainly here for upgrade purposes, so I don't think that we should encourage such practices for performance reasons. There is as well --tablespace-mapping which is here to cover a similar need and we know that this solution works, at least people have spent time to make sure it does. Another problem that I have with this patch is that on HEAD, permission checks are done before receiving any data. I think that this is really saner to do any checks like that first, so as you can know if the tablespace is available for write access before writing any data to it. With this patch, you may finish by writing a bunch of data to a tablespace path, but then fail on another tablespace because of permission issues so the backup becomes useless after transferring and writing potentially hundreds of gigabytes. This is no fun to detect that potentially too late, and can impact the responsiveness of instances to get backups and restore from them. All in all, this patch gets a -1 from me in its current shape. If Horiguchi-san or yourself want to process further with this patch, of course feel free to do so, but I don't feel that we are actually trying to solve a new problem here. I am switching the patch to "waiting on author". Here is a nitpick about the patch by the way: + if (firstfile && !basetablespace) + { + /* + * The first file in the tablespace is its main folder, whose name can not be guessed (PG_MAJORVER_CATVER) + * So we must check here that this folder can be created or is empty. + */ + verify_dir_is_empty_or_create(filename, &made_tablespace_dirs, &found_tablespace_dirs); + } + else if (mkdir(filename, S_IRWXU) != 0) This patch has formatting problems. You should try to run a pgindent run before submitting it. The code usually needs to fit within the 72-80 character window. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Tuesday, September 19, 2017 12:52:37 PM CEST you wrote: > On Thu, Sep 14, 2017 at 2:17 AM, Pierre Ducroquet <p.psql@pinaraf.info> wrote: > > All my apologies for the schockingly long time with no answer on this > > topic. > No problem. That's the concept called life I suppose. > > > I will do my best to help review some patches in the current CF. > > Thanks for the potential help! > > > Attached to this email is the new version of the patch, checked against > > HEAD and REL_10_STABLE, with the small change required by Michael (affect > > true/ false to the boolean instead of 1/0) applied. > > Thanks for the new version. > > So I have been pondering about this patch, and allowing multiple > versions of Postgres to run in the same tablespace is mainly here for > upgrade purposes, so I don't think that we should encourage such > practices for performance reasons. There is as well > --tablespace-mapping which is here to cover a similar need and we know > that this solution works, at least people have spent time to make sure > it does. > > Another problem that I have with this patch is that on HEAD, > permission checks are done before receiving any data. I think that > this is really saner to do any checks like that first, so as you can > know if the tablespace is available for write access before writing > any data to it. With this patch, you may finish by writing a bunch of > data to a tablespace path, but then fail on another tablespace because > of permission issues so the backup becomes useless after transferring > and writing potentially hundreds of gigabytes. This is no fun to > detect that potentially too late, and can impact the responsiveness of > instances to get backups and restore from them. > > All in all, this patch gets a -1 from me in its current shape. If > Horiguchi-san or yourself want to process further with this patch, of > course feel free to do so, but I don't feel that we are actually > trying to solve a new problem here. I am switching the patch to > "waiting on author". Hi The point of this patch has never been to 'promote' sharing tablespaces between PostgreSQL versions. This is not a documented feature, and it would be imho a bad idea to promote it because of bugs like this one. But that bug is a problem for people who are migrating between postgresql releases one database at a time on the same server (maybe not a best practice, but a real use case nevertheless). That's how I met this bug, and that's why I wanted to patch it. I know there is a workaround, but to me it seems counter- intuitive that with no warning I can create a postgresql cluster that can not be restored without additional pg_basebackup settings. If it is indeed forbidden to share a tablespace between postgresql releases, why don't we enforce it ? Or at least show a warning when CREATE TABLESPACE encounter a non-empty folder ? Regarding the permission check, I don't really see how this is a real problem with the patch. I have to check on master, but it seems to me that the permission check can still be done before starting writing data on disc. We could just delay the 'empty' folder check, and keep checking the folder permissions. And I will do the pgindent run, thanks for the nitpick :) Pierre -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
El 13/09/17 a las 14:17, Pierre Ducroquet escribió: > + bool firstfile = 1; You are still assigning 1 to a bool (which is not incorrect) instead of true or false as Michael mentioned before. P.D.: I didn't go though all the thread and patch in depth so will not comment further. -- Martín Marqués http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hi, At Tue, 19 Sep 2017 17:07:10 +0200, Pierre Ducroquet <pierre.ducroquet@people-doc.com> wrote in <1678633.OBaBNztJnJ@pierred-pdoc> > On Tuesday, September 19, 2017 12:52:37 PM CEST you wrote: > > On Thu, Sep 14, 2017 at 2:17 AM, Pierre Ducroquet <p.psql@pinaraf.info> > wrote: > > > All my apologies for the schockingly long time with no answer on this > > > topic. > > No problem. That's the concept called life I suppose. > > > > > I will do my best to help review some patches in the current CF. > > > > Thanks for the potential help! > > > > > Attached to this email is the new version of the patch, checked against > > > HEAD and REL_10_STABLE, with the small change required by Michael (affect > > > true/ false to the boolean instead of 1/0) applied. > > > > Thanks for the new version. > > > > So I have been pondering about this patch, and allowing multiple > > versions of Postgres to run in the same tablespace is mainly here for > > upgrade purposes, so I don't think that we should encourage such > > practices for performance reasons. There is as well > > --tablespace-mapping which is here to cover a similar need and we know > > that this solution works, at least people have spent time to make sure > > it does. > > > > Another problem that I have with this patch is that on HEAD, > > permission checks are done before receiving any data. I think that > > this is really saner to do any checks like that first, so as you can > > know if the tablespace is available for write access before writing > > any data to it. With this patch, you may finish by writing a bunch of > > data to a tablespace path, but then fail on another tablespace because > > of permission issues so the backup becomes useless after transferring > > and writing potentially hundreds of gigabytes. This is no fun to > > detect that potentially too late, and can impact the responsiveness of > > instances to get backups and restore from them. > > > > All in all, this patch gets a -1 from me in its current shape. If > > Horiguchi-san or yourself want to process further with this patch, of > > course feel free to do so, but I don't feel that we are actually > > trying to solve a new problem here. I am switching the patch to > > "waiting on author". Hmm. I recall that my opinion on the "problem" was that we should just prohibit sharing rather than allow it. However, if there's who wants it and the behavior is not so bizarre, I don't reject the direction. As long as I saw the patch, it just delays digging of the top directory until the CATVER directory becomes knwon without additional checks. The behavior is identical to what the current version does on tablespace directories so the pointed problems don't seem to be new ones caused by this patch and such situation seems a bit malicious. > Hi > > The point of this patch has never been to 'promote' sharing tablespaces > between PostgreSQL versions. This is not a documented feature, and it would be > imho a bad idea to promote it because of bugs like this one. That *was* a bug, but could be a not-a-bug after we define the behavior reasonably, and may be should be documented. > But that bug is a problem for people who are migrating between postgresql > releases one database at a time on the same server (maybe not a best practice, > but a real use case nevertheless). That's how I met this bug, and that's why I > wanted to patch it. I know there is a workaround, but to me it seems counter- > intuitive that with no warning I can create a postgresql cluster that can not > be restored without additional pg_basebackup settings. > If it is indeed forbidden to share a tablespace between postgresql releases, > why don't we enforce it ? Or at least show a warning when CREATE TABLESPACE > encounter a non-empty folder ? > > Regarding the permission check, I don't really see how this is a real problem > with the patch. I have to check on master, but it seems to me that the > permission check can still be done before starting writing data on disc. We > could just delay the 'empty' folder check, and keep checking the folder > permissions. > > And I will do the pgindent run, thanks for the nitpick :) So I'll wait for the new version, but I have a comment on the patch. It would practically work but I don't like the fact that the patch relies on the specific directory/file ordering in the tar stream. This is not about the CATVER directory but lower directories. Being more strict, it actually makes excessive calls to verify_dir_is_em..() for more lower directories contrarily to expectations. I think that we can take more more robust way to detect the CATVER directory. Just checking if it is a top-level directory will work. That is, making the following change. - if (firstfile && !basetablespace) + /* copybuf must contain at least one '/' here */ + if (!basetablespace && strchr(copybuf, '/')[1] == 0) This condition exactly hits only CATVER directories not being disturbed by file ordering of the tar stream. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 29, 2017 at 2:19 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > It would practically work but I don't like the fact that the > patch relies on the specific directory/file ordering in the tar > stream. This is not about the CATVER directory but lower > directories. Being more strict, it actually makes excessive calls > to verify_dir_is_em..() for more lower directories contrarily to > expectations. > > I think that we can take more more robust way to detect the > CATVER directory. Just checking if it is a top-level directory > will work. That is, making the following change. My tendency about this patch is still that it should be rejected. This is presenting additional handling for no real gain. > > - if (firstfile && !basetablespace) > + /* copybuf must contain at least one '/' here */ > + if (!basetablespace && strchr(copybuf, '/')[1] == 0) > > This condition exactly hits only CATVER directories not being > disturbed by file ordering of the tar stream. Anyway, as a new version is at least needed, I am marking it as returned with feedback. The CF is close to its end. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 29, 2017 at 2:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > My tendency about this patch is still that it should be rejected. This > is presenting additional handling for no real gain. I vehemently disagree. If the server lets you create a tablespace, then everything that happens after that ought to work. On another thread, there is the issue that if you create a tablespace inside $PGDATA, things break. We should either unbreak those things or not allow creating the tablespace in the first place. On this thread, there is the issue that if you create two tablespaces for different PG versions in the same directory, things break. We should either unbreak those things or not allow creating the tablespace in the first place. It is completely awful behavior to let users do things and then punish them later for having done them. Users are not obliged to read the minds of the developers and guess what things the developers consider "reasonable". They should be able to count on the principle that if they do something that we consider wrong, they'll get an error when they try to do it -- not have it superficially appear to work and then explode later. To put that another way, there should be ONE rule about what is or is not allowable in a particular situation, and all commands, utilities, etc. that deal with that situation should handle it in a uniform fashion. Each .c file shouldn't get to make up its own notion of what is or is not supported. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 30/09/17 06:43, Robert Haas wrote: > On Fri, Sep 29, 2017 at 2:06 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> My tendency about this patch is still that it should be rejected. This >> is presenting additional handling for no real gain. > I vehemently disagree. If the server lets you create a tablespace, > then everything that happens after that ought to work. > > On another thread, there is the issue that if you create a tablespace > inside $PGDATA, things break. We should either unbreak those things > or not allow creating the tablespace in the first place. On this > thread, there is the issue that if you create two tablespaces for > different PG versions in the same directory, things break. We should > either unbreak those things or not allow creating the tablespace in > the first place. > > It is completely awful behavior to let users do things and then punish > them later for having done them. Users are not obliged to read the > minds of the developers and guess what things the developers consider > "reasonable". They should be able to count on the principle that if > they do something that we consider wrong, they'll get an error when > they try to do it -- not have it superficially appear to work and then > explode later. > > To put that another way, there should be ONE rule about what is or is > not allowable in a particular situation, and all commands, utilities, > etc. that deal with that situation should handle it in a uniform > fashion. Each .c file shouldn't get to make up its own notion of what > is or is not supported. > +1 It seems simply wrong (and potentially dangerous) to allow users to arrange a system state that cannot be backed up (easily/without surgery etc etc). Also the customer concerned that did exactly that started the conversation about it with me like this (paraphrasing) 'So this pg_basebackup thing is a bit temperamental...'. I'm thinking we do not want to be giving users this impression. regards Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Thanks for the objection with clear reasoning. For clarity, I first proposed to prohibit servers of different versions from sharing same tablespace directory. https://www.postgresql.org/message-id/20170406.160844.120459562.horiguchi.kyotaro@lab.ntt.co.jp And I had -1 that it is just a reverting to the previous behavior (it was exactly the patch intended, though) and persuaded to take the way in this thread there, so I'm here. At Fri, 29 Sep 2017 13:43:22 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoYqOQOn_jwgC9V+w+5HfGH7Te5_FnCk3qvA4HzZ0UG0Pw@mail.gmail.com> > On Fri, Sep 29, 2017 at 2:06 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: > > My tendency about this patch is still that it should be rejected. This > > is presenting additional handling for no real gain. > > I vehemently disagree. If the server lets you create a tablespace, > then everything that happens after that ought to work. > > On another thread, there is the issue that if you create a tablespace > inside $PGDATA, things break. We should either unbreak those things pg_basebackup copies the tablespace twice, or some maintenaince commands give a wrong result, or careless cleanup script can blow away a part of the data. > or not allow creating the tablespace in the first place. On this > thread, there is the issue that if you create two tablespaces for > different PG versions in the same directory, things break. We should Server never accesses out of <tblspace>/CARVER/ directory in the <tblspace> and servers with different versoins can share the <tblspace> directory (I think this is a bug). pg_upgrade will complain if it finds the destination CATVER directory created even though no data will be broken. Just a clarification, not "fixing" the problem, users may get punished by pg_basebackup later. If "fixing" in this way, pg_basebacup will work in the case but in turn pg_upgrade may punish them later. May I assume that we agree on this point? > either unbreak those things or not allow creating the tablespace in > the first place. > > It is completely awful behavior to let users do things and then punish > them later for having done them. Users are not obliged to read the > minds of the developers and guess what things the developers consider > "reasonable". They should be able to count on the principle that if > they do something that we consider wrong, they'll get an error when > they try to do it -- not have it superficially appear to work and then > explode later. > > To put that another way, there should be ONE rule about what is or is > not allowable in a particular situation, and all commands, utilities, > etc. that deal with that situation should handle it in a uniform > fashion. Each .c file shouldn't get to make up its own notion of what > is or is not supported. Anyway currently server and pg_basebackup disagrees on the point. If the "just reverting" patch above is not rejected again, I'll resume working on it. Or other way to go? This is not an issue that ought to take a long time. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers