Thread: adding a new column in IDENTIFY_SYSTEM
Hi I want to propose the addition of a new field in IDENTIFY_SYSTEM: xlogversion, which will carry XLOG_PAGE_MAGIC from primary. The idea of sending that info is to allow us to know if the xlog page version of two different major versions are compatible or not. Currently pg_upgrade requires the primary to be taken down, and then re-run all base backups for each standby which is a lot of work and doesn't sound like "online upgrade". I want to add the field now to make the protocol stable, also because when we connect to start replication we check for the number of fields retrieved from IDENTIFY_SYSTEM so if we add it in 9.2 we will be unable to do this until 9.3 (when both releases agree about the number of fields returned). patch is very simple and doesn't affect anyone nor this will require an initdb so i guess is safe to apply now. comments? -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL
Attachment
Hi, On Wednesday, May 04, 2011 12:39:49 AM Jaime Casanova wrote: > I want to propose the addition of a new field in IDENTIFY_SYSTEM: > xlogversion, which will carry XLOG_PAGE_MAGIC from primary. > The idea of sending that info is to allow us to know if the xlog page > version of two different major versions are compatible or not. > Currently pg_upgrade requires the primary to be taken down, and then > re-run all base backups for each standby which is a lot of work and > doesn't sound like "online upgrade". > I want to add the field now to make the protocol stable, also because > when we connect to start replication we check for the number of fields > retrieved from IDENTIFY_SYSTEM so if we add it in 9.2 we will be > unable to do this until 9.3 (when both releases agree about the number > of fields returned). I can't see xlog replication working between major versions. How should it handle the differing catversion of the system catalogs? You sure can't replicate any of the system catalog changes. Even if that weren't the case I don't like the idea of xlog improvements weighted against the problem of breaking compatibility between major releases. Or are you proposing not invalidating the base backup? I can't really see that working either. You would need to copy over the catalog from the master in the exactly right version to the restartpoint on the standby. Andres
On Tue, May 3, 2011 at 6:32 PM, Andres Freund <andres@anarazel.de> wrote: > > I can't see xlog replication working between major versions. well, probably... but not many years ago we wouldn't see integrated replication in postgresql... still, here we are... it's a difficult problem to solve? surely. we can make it work all the time and for any x major release? probably not. that's why i want the XLOG_PAGE_MAGIC, but you're question about catversion is a very good one -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL
Jaime Casanova <jaime@2ndquadrant.com> writes: > I want to propose the addition of a new field in IDENTIFY_SYSTEM: > xlogversion, which will carry XLOG_PAGE_MAGIC from primary. > The idea of sending that info is to allow us to know if the xlog page > version of two different major versions are compatible or not. > Currently pg_upgrade requires the primary to be taken down, That's *intentional*. The notion of WAL-shipping-replication compatibility between two different major versions is insane on its face. They will not have compatible system catalog contents. You might get perfect replication of the master's catalogs, but the slave wouldn't be able to interpret them. The reason we have XLOG_PAGE_MAGIC is really more the opposite: to prevent people from trying to recover across a minor version update in which we had to break XLOG compatibility. I don't recall right now if that's ever actually happened, but it definitely could. regards, tom lane
On Wed, May 4, 2011 at 3:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jaime Casanova <jaime@2ndquadrant.com> writes: >> I want to propose the addition of a new field in IDENTIFY_SYSTEM: >> xlogversion, which will carry XLOG_PAGE_MAGIC from primary. >> The idea of sending that info is to allow us to know if the xlog page >> version of two different major versions are compatible or not. >> Currently pg_upgrade requires the primary to be taken down, > > That's *intentional*. > > The notion of WAL-shipping-replication compatibility between two > different major versions is insane on its face. They will not have > compatible system catalog contents. You might get perfect replication > of the master's catalogs, but the slave wouldn't be able to interpret > them. That's exactly how hard in place upgrade was to begin with. Considering how valuable this would be, it seems worth it to pursue this. > The reason we have XLOG_PAGE_MAGIC is really more the opposite: to > prevent people from trying to recover across a minor version update in > which we had to break XLOG compatibility. I don't recall right now > if that's ever actually happened, but it definitely could. If that is true, then allowing this patch will allow us to detect that incompatibility when the standby connects to the master, and explain the issue in a useful error message. Otherwise we will just barf on the magic value. Having access to these details might make it possible to upgrade. They could be inferred, but it would be better to have the full data so we can take an informed decision about whether or not it is possible. So even if people don't believe in the rationale behind the patch, would allowing it harm anything at this point? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 4, 2011 at 22:42, Simon Riggs <simon@2ndquadrant.com> wrote: > On Wed, May 4, 2011 at 3:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Jaime Casanova <jaime@2ndquadrant.com> writes: >>> I want to propose the addition of a new field in IDENTIFY_SYSTEM: >>> xlogversion, which will carry XLOG_PAGE_MAGIC from primary. >>> The idea of sending that info is to allow us to know if the xlog page >>> version of two different major versions are compatible or not. >>> Currently pg_upgrade requires the primary to be taken down, >> >> That's *intentional*. >> >> The notion of WAL-shipping-replication compatibility between two >> different major versions is insane on its face. They will not have >> compatible system catalog contents. You might get perfect replication >> of the master's catalogs, but the slave wouldn't be able to interpret >> them. > > That's exactly how hard in place upgrade was to begin with. > > Considering how valuable this would be, it seems worth it to pursue this. > >> The reason we have XLOG_PAGE_MAGIC is really more the opposite: to >> prevent people from trying to recover across a minor version update in >> which we had to break XLOG compatibility. I don't recall right now >> if that's ever actually happened, but it definitely could. > > If that is true, then allowing this patch will allow us to detect that > incompatibility when the standby connects to the master, and explain > the issue in a useful error message. Otherwise we will just barf on > the magic value. > > Having access to these details might make it possible to upgrade. They > could be inferred, but it would be better to have the full data so we > can take an informed decision about whether or not it is possible. > > So even if people don't believe in the rationale behind the patch, > would allowing it harm anything at this point? Adding it for the sake of upgrades seems very far fetched. Adding it for the sake of giving a better error message seems like a very good idea. But in that case, the client side code to actually give a better error message should be included from the start, IMHO. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: >> So even if people don't believe in the rationale behind the patch, >> would allowing it harm anything at this point? > Adding it for the sake of upgrades seems very far fetched. > Adding it for the sake of giving a better error message seems like a > very good idea. But in that case, the client side code to actually > give a better error message should be included from the start, IMHO. What's not apparent to me is how we'll even get to this check; if there's a mismatch, won't the database system identifier comparison fail first in most scenarios? I'm also wondering why send WAL version number and not, say, catalog version number, if there's some idea that we need more tests than the system identifier comparison. Given reasonable answers to these questions, I'd not object to putting in additional error testing. I concur with Magnus that the patch should actually provide those tests, and not just put in an unused field. regards, tom lane
On Thu, May 5, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >>> So even if people don't believe in the rationale behind the patch, >>> would allowing it harm anything at this point? > >> Adding it for the sake of upgrades seems very far fetched. > >> Adding it for the sake of giving a better error message seems like a >> very good idea. But in that case, the client side code to actually >> give a better error message should be included from the start, IMHO. > > What's not apparent to me is how we'll even get to this check; if > there's a mismatch, won't the database system identifier comparison > fail first in most scenarios? > that's why i didn't propose that to begin with... but thinking on that, we can use it to add a message in pg_basebackup, maybe just a warning if we are taking a basebackup from an incompatible system... but for that i will need to add xlog_internal.h and postgres.h to pg_basebackup and use the "#define FRONTEND 1" hack we have in pg_resetxlog > I'm also wondering why send WAL version number and not, say, catalog > version number, if there's some idea that we need more tests than the > system identifier comparison. > well... catversion is not that informative, we change it for a lot of reasons, not only catalog estructure changes... so we can't swear that xlog records will be incompatible just because catversion changes... but yes, we need to know if catalog estructure has changed, maybe we can change XLOG_PAGE_MAGIC when that happens? > Given reasonable answers to these questions, I'd not object to putting > in additional error testing. I concur with Magnus that the patch should > actually provide those tests, and not just put in an unused field. > actually, now is when we can play with that API at will when/if we can make online upgrades work then we will be stuck with whatever we have made. before that we know it won't affect anybody -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL
On Sun, May 15, 2011 at 6:03 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote: > On Thu, May 5, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Magnus Hagander <magnus@hagander.net> writes: >>>> So even if people don't believe in the rationale behind the patch, >>>> would allowing it harm anything at this point? >> >>> Adding it for the sake of upgrades seems very far fetched. >> >>> Adding it for the sake of giving a better error message seems like a >>> very good idea. But in that case, the client side code to actually >>> give a better error message should be included from the start, IMHO. >> >> What's not apparent to me is how we'll even get to this check; if >> there's a mismatch, won't the database system identifier comparison >> fail first in most scenarios? >> > > that's why i didn't propose that to begin with... but thinking on > that, we can use it to add a message in pg_basebackup, maybe just a > warning if we are taking a basebackup from an incompatible system... > > but for that i will need to add xlog_internal.h and postgres.h to > pg_basebackup and use the "#define FRONTEND 1" hack we have in > pg_resetxlog > attached, comments? -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL
Attachment
On Mon, May 16, 2011 at 01:03, Jaime Casanova <jaime@2ndquadrant.com> wrote: > On Thu, May 5, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Magnus Hagander <magnus@hagander.net> writes: >>>> So even if people don't believe in the rationale behind the patch, >>>> would allowing it harm anything at this point? >> >>> Adding it for the sake of upgrades seems very far fetched. >> >>> Adding it for the sake of giving a better error message seems like a >>> very good idea. But in that case, the client side code to actually >>> give a better error message should be included from the start, IMHO. >> >> What's not apparent to me is how we'll even get to this check; if >> there's a mismatch, won't the database system identifier comparison >> fail first in most scenarios? >> > > that's why i didn't propose that to begin with... but thinking on > that, we can use it to add a message in pg_basebackup, maybe just a > warning if we are taking a basebackup from an incompatible system... > > but for that i will need to add xlog_internal.h and postgres.h to > pg_basebackup and use the "#define FRONTEND 1" hack we have in > pg_resetxlog Well, pg_basebackup doesn't need it critically, since it never looks at the contents fo the files anyway. You could use a pg_basebackup for 9.1 to backup a 9.2 database - at least in theory. Granted, it wouldn't hurt to get the message from pg_basebackup *before* you took the backup, which your patch (from the other email) does. But I'm not entirely sure I like that kludge... I think it'd be less of a kludge to move the definition of XLOG_PAGE_MAGIC somewhere that's visible already. Also, this error message: + fprintf(stderr, _("%s: could not identify system: XLOG pages are incompatible.\n"), is clearly wrong - it *could* identify the system, it just didn't like what it saw... Anyway, the more useful point would be to have it in walreceiver, I believe. >> I'm also wondering why send WAL version number and not, say, catalog >> version number, if there's some idea that we need more tests than the >> system identifier comparison. >> > > well... catversion is not that informative, we change it for a lot of > reasons, not only catalog estructure changes... so we can't swear that > xlog records will be incompatible just because catversion changes... From the *replication* perspective we can be pretty certain it breaks. From the base backup perspective, it might well keep on working, since you get the new version of both the base backup and the logs. And what other reasons than catalog structure changes do we actually change catversion? > but yes, we need to know if catalog estructure has changed, maybe we > can change XLOG_PAGE_MAGIC when that happens? Uh, that seems like a seriously bad idea. >> Given reasonable answers to these questions, I'd not object to putting >> in additional error testing. I concur with Magnus that the patch should >> actually provide those tests, and not just put in an unused field. >> > > actually, now is when we can play with that API at will when/if we can > make online upgrades work then we will be stuck with whatever we have > made. before that we know it won't affect anybody True - but there should be at least a POC implenentation of something *using* the API, or we don't know if it's useful. As stated earlier, I'd prefer that POC API to be the walreceiver. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Mon, May 16, 2011 at 2:35 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Mon, May 16, 2011 at 01:03, Jaime Casanova <jaime@2ndquadrant.com> wrote: >> On Thu, May 5, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Magnus Hagander <magnus@hagander.net> writes: >>>>> So even if people don't believe in the rationale behind the patch, >>>>> would allowing it harm anything at this point? >>> >>>> Adding it for the sake of upgrades seems very far fetched. >>> >>>> Adding it for the sake of giving a better error message seems like a >>>> very good idea. But in that case, the client side code to actually >>>> give a better error message should be included from the start, IMHO. >>> >>> What's not apparent to me is how we'll even get to this check; if >>> there's a mismatch, won't the database system identifier comparison >>> fail first in most scenarios? >>> >> >> that's why i didn't propose that to begin with... but thinking on >> that, we can use it to add a message in pg_basebackup, maybe just a >> warning if we are taking a basebackup from an incompatible system... >> >> but for that i will need to add xlog_internal.h and postgres.h to >> pg_basebackup and use the "#define FRONTEND 1" hack we have in >> pg_resetxlog > > Well, pg_basebackup doesn't need it critically, since it never looks > at the contents fo the files anyway. You could use a pg_basebackup for > 9.1 to backup a 9.2 database - at least in theory. > > Granted, it wouldn't hurt to get the message from pg_basebackup > *before* you took the backup, while you could, is also possible that you really think is the right version and that you will waste time until you found out you have the wrong version installed and that your backup won't work > I think it'd be > less of a kludge to move the definition of XLOG_PAGE_MAGIC somewhere > that's visible already. > agree, that also will allow us to avoid have that kludge in pg_resetxlog... > Also, this error message: > + fprintf(stderr, _("%s: could not identify system: XLOG > pages are incompatible.\n"), > > is clearly wrong - it *could* identify the system, it just didn't like > what it saw... > ah! yeah! we can, of course, put better messages! > > Anyway, the more useful point would be to have it in walreceiver, I believe. > you mean a message like this in walreceiver? we can put it but probably it will never get to that... >>> I'm also wondering why send WAL version number and not, say, catalog >>> version number, if there's some idea that we need more tests than the >>> system identifier comparison. >>> >> >> well... catversion is not that informative, we change it for a lot of >> reasons, not only catalog estructure changes... so we can't swear that >> xlog records will be incompatible just because catversion changes... > > From the *replication* perspective we can be pretty certain it breaks. > From the base backup perspective, it might well keep on working, since > you get the new version of both the base backup and the logs. > > And what other reasons than catalog structure changes do we actually > change catversion? > see these commits: 76dd09bbec893c02376e3440a6a86a3b994d804c f5e524d92be609c709825be8995bf77f10880c3b 47082fa875179ae629edb26807ab3f38a775280b -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL
On Tue, May 17, 2011 at 16:38, Jaime Casanova <jaime@2ndquadrant.com> wrote: > On Mon, May 16, 2011 at 2:35 AM, Magnus Hagander <magnus@hagander.net> wrote: >> On Mon, May 16, 2011 at 01:03, Jaime Casanova <jaime@2ndquadrant.com> wrote: >>> On Thu, May 5, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Magnus Hagander <magnus@hagander.net> writes: >>>>>> So even if people don't believe in the rationale behind the patch, >>>>>> would allowing it harm anything at this point? >>>> >>>>> Adding it for the sake of upgrades seems very far fetched. >>>> >>>>> Adding it for the sake of giving a better error message seems like a >>>>> very good idea. But in that case, the client side code to actually >>>>> give a better error message should be included from the start, IMHO. >>>> >>>> What's not apparent to me is how we'll even get to this check; if >>>> there's a mismatch, won't the database system identifier comparison >>>> fail first in most scenarios? >>>> >>> >>> that's why i didn't propose that to begin with... but thinking on >>> that, we can use it to add a message in pg_basebackup, maybe just a >>> warning if we are taking a basebackup from an incompatible system... >>> >>> but for that i will need to add xlog_internal.h and postgres.h to >>> pg_basebackup and use the "#define FRONTEND 1" hack we have in >>> pg_resetxlog >> >> Well, pg_basebackup doesn't need it critically, since it never looks >> at the contents fo the files anyway. You could use a pg_basebackup for >> 9.1 to backup a 9.2 database - at least in theory. >> >> Granted, it wouldn't hurt to get the message from pg_basebackup >> *before* you took the backup, > > while you could, is also possible that you really think is the right > version and that you will waste time until you found out you have the > wrong version installed and that your backup won't work Yes. It might be useful to note it, and then ust make an override flag. My pointm, though, was that doing it for walreceiver is more important and a more logical first step. >> I think it'd be >> less of a kludge to move the definition of XLOG_PAGE_MAGIC somewhere >> that's visible already. >> > > agree, that also will allow us to avoid have that kludge in pg_resetxlog... > >> Also, this error message: >> + fprintf(stderr, _("%s: could not identify system: XLOG >> pages are incompatible.\n"), >> >> is clearly wrong - it *could* identify the system, it just didn't like >> what it saw... >> > > ah! yeah! we can, of course, put better messages! > >> >> Anyway, the more useful point would be to have it in walreceiver, I believe. >> > > you mean a message like this in walreceiver? we can put it but > probably it will never get to that... Yes, I think that's the more important one. And that without that, I don't really see why we should do this for 9.1. >>>> I'm also wondering why send WAL version number and not, say, catalog >>>> version number, if there's some idea that we need more tests than the >>>> system identifier comparison. >>>> >>> >>> well... catversion is not that informative, we change it for a lot of >>> reasons, not only catalog estructure changes... so we can't swear that >>> xlog records will be incompatible just because catversion changes... >> >> From the *replication* perspective we can be pretty certain it breaks. >> From the base backup perspective, it might well keep on working, since >> you get the new version of both the base backup and the logs. >> >> And what other reasons than catalog structure changes do we actually >> change catversion? >> > > see these commits: > 76dd09bbec893c02376e3440a6a86a3b994d804c That's bruce... :-) > f5e524d92be609c709825be8995bf77f10880c3b That does have catalog changes. Not structure, but contents, which would cause similar problems, no? > 47082fa875179ae629edb26807ab3f38a775280b That's a catversion bump because it was forgotten in a *previous* patch that required it. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Fri, May 20, 2011 at 12:50 PM, Magnus Hagander <magnus@hagander.net> wrote: > > Yes. It might be useful to note it, and then ust make an override > flag. My pointm, though, was that doing it for walreceiver is more > important and a more logical first step. > ok, patch attached. -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL
Attachment
On Wed, May 25, 2011 at 8:26 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: > On Fri, May 20, 2011 at 12:50 PM, Magnus Hagander <magnus@hagander.net> wrote: >> >> Yes. It might be useful to note it, and then ust make an override >> flag. My pointm, though, was that doing it for walreceiver is more >> important and a more logical first step. >> > > ok, patch attached. Why is the check of WAL version required for streaming replication? As Tom said, if the version is different between two servers, the check of system identifier fails first. No? + primary_xlp_magic = atoi(PQgetvalue(res, 0, 2)); You wrongly get the third field (i.e., current xlog location) as the WAL version. You should call PQgetvalue(res, 0, 3), instead. > errdetail("Expected 1 tuple with 3 fields, got %d tuples with %d fields.", You need to change the above message. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Tue, May 24, 2011 at 8:52 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > + primary_xlp_magic = atoi(PQgetvalue(res, 0, 2)); > > You wrongly get the third field (i.e., current xlog location) as the > WAL version. > You should call PQgetvalue(res, 0, 3), instead. > >> errdetail("Expected 1 tuple with 3 fields, got %d tuples with %d fields.", > > You need to change the above message. > Fixed. About you comments on the check... if you read the thread, you will find that the whole reason for the field is future improvement, but everyone wanted some use of the field now... so i made a patch to use it in pg_basebackup before the transfer starts and avoid time and bandwith waste but Magnus prefer this in walreceiver... -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte y capacitación de PostgreSQL
Attachment
On Tue, May 24, 2011 at 22:31, Jaime Casanova <jaime@2ndquadrant.com> wrote: > On Tue, May 24, 2011 at 8:52 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > >> + primary_xlp_magic = atoi(PQgetvalue(res, 0, 2)); >> >> You wrongly get the third field (i.e., current xlog location) as the >> WAL version. >> You should call PQgetvalue(res, 0, 3), instead. >> >>> errdetail("Expected 1 tuple with 3 fields, got %d tuples with %d fields.", >> >> You need to change the above message. >> > > Fixed. > > About you comments on the check... if you read the thread, you will > find that the whole reason for the field is future improvement, but > everyone wanted some use of the field now... so i made a patch to use > it in pg_basebackup before the transfer starts and avoid time and > bandwith waste but Magnus prefer this in walreceiver... The idea *at this point* was, I believe, to be able to provide a more useful error message in the case of walreceiver. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/