Re: Set new system identifier using pg_resetxlog - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Set new system identifier using pg_resetxlog |
Msg-id | 20140617165011.GA3115@awork2.anarazel.de |
In response to | Re: Set new system identifier using pg_resetxlog (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: Set new system identifier using pg_resetxlog |
List | pgsql-hackers |
On 2014-06-17 12:07:04 -0400, Robert Haas wrote:
> On Tue, Jun 17, 2014 at 10:33 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> > On 17/06/14 16:18, Robert Haas wrote:
> >> On Fri, Jun 13, 2014 at 8:31 PM, Petr Jelinek <petr@2ndquadrant.com>
> >> wrote:
> >>> attached is a simple patch which makes it possible to change the system
> >>> identifier of the cluster in pg_control. This is useful for
> >>> individualization of the instance that is started on top of data
> >>> directory
> >>> produced by pg_basebackup - something that's helpful for logical
> >>> replication
> >>> setup where you need to easily identify each node (it's used by
> >>> Bidirectional Replication for example).
> >>
> >>
> >> I can clearly understand the utility of being able to reset the system
> >> ID to a new, randomly-generated system ID - but giving the user the
> >> ability to set a particular value of their own choosing seems like a
> >> pretty sharp tool. What is the use case for that?

I've previously hacked this up ad hoc during data recovery when I needed
to make another cluster similar enough that I could replay WAL.

Another use case is to mark a database as independent from its
origin. Imagine a database that gets sharded across several
servers. It's not uncommon to do that by initially basebackup'ing the
database to several nodes and then using them separately from there
on. It's quite useful to actually mark them as being distinct, especially
as several of them right now would end up with the same timeline ID...

> But it seems to me that we might need to have a process discussion
> here, because, while I'm all in favor of incremental feature proposals
> that build towards a larger goal, it currently appears that the larger
> goal toward which you are building is not something that's been
> publicly discussed and debated on this list. And I really think we
> need to have that conversation.
> Obviously, individual patches will
> still need to be debated, but I feel like 2ndQuadrant is trying to
> construct a castle without showing the community the floor plan. I
> believe that there is relatively broad agreement that we would all
> like a castle, but different people may have legitimately different
> ideas about how it should be constructed. If the work arrives as a
> series of disconnected pieces (user-specified system ID, event
> triggers for CREATE, etc.), then everyone outside of 2ndQuadrant has
> to take it on faith that those pieces are going to eventually fit
> together in a way that we'll all be happy with. In some cases, that's
> fine, because the feature is useful on its own merits whether it ends
> up being part of the castle or not.
>

Uh. Right now this patch has been written because it's needed for an
out-of-core replication solution. That's what BDR is at this point. The
patch is unobtrusive, has other use cases than just our internal one, and
doesn't make pg_resetxlog even more dangerous than it already is. I don't
see much problem with considering it on its own cost/benefit?

So this seems to be a concern that's relatively independent of this
patch. Am I seeing that right?

I think one very important point here is that BDR is *not* the proposed
in-core solution. I think a reasonable community perspective - besides
it also being useful on its own - is to view it as a *prototype* for an
in-core solution. And e.g. logical decoding would have looked much worse -
and likely not have been integrated - without already being used
externally for BDR.

I'm not sure how we can ease or even resolve your concerns when talking
about pretty independent and general pieces of functionality like the
DDL event trigger stuff. We needed to actually *write* those to see what
BDR will look like. And the community's feedback heavily influenced what
BDR looks like by accepting some pieces, demanding others, and outright
rejecting the remainder.
I think there are some pieces that need to be considered on their own
merit. Logical decoding is useful on its own. The ability for out-of-core
systems to do DDL replication is another piece (that you referred to
above).
I think the likelihood of success if we were to try to design an in-core
system from the ground up first and then follow through pretty exactly
along those lines is minimal.

So, what I think we can do is to continue trying to build independent,
generally useful bits. Which imo all the stuff that's been integrated
is. Then, somewhat soon I think, we'll have to come up with a proposal
for how the parts that are *not* necessarily useful outside of in-core
logical rep. might look. Which will likely trigger some long, long
discussions that turn that design around a couple of times. Which is
fine. I *don't* think that's going to be a trimmed-down version of
today's BDR.

> But in other cases, like this one, if the premise that the slot name
> should match the system identifier isn't something the community wants
> to accept, then taking a patch that lets people do that is probably a
> bad idea, because at least one person will use it to set the system
> identifier of a system to a value that enables physical replication to
> take place when that is actually totally unsafe, and we don't want to
> enable that for no reason.

It also allows many other dangerous things, many of which are much more
dangerous than changing the system identifier. Resetting an independent
cluster is also not very likely to work - the LSNs would still not match.
But it wouldn't corrupt the copy of the database that's been changed...

> Maybe the slot name should match the
> replication identifier rather than the standby system ID, for example.
> There are conflicting proposals for how replication identifiers should
> work, but one of those proposals limits it to 16 bits.
I actually don't think any of the discussions I was involved in had the
externally visible version of replication identifiers limited to 16
bits? If you are referring to my patch, 16 bits was just the width of the
*internal* name that should basically never be looked at. User-visible
replication identifiers are always identified by an arbitrary string,
whose format is determined by the user of the replication identifier
facility.

*BDR* currently stores the system identifier, the database ID and a name
in there - but that's nothing core needs to concern itself with.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services