Re: Set new system identifier using pg_resetxlog - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Set new system identifier using pg_resetxlog
Msg-id CA+TgmobqTWOXuo2ibxQrDZa3BLwk4B4oh7s7C9hwO+NBnWaJig@mail.gmail.com
In response to Re: Set new system identifier using pg_resetxlog  (Petr Jelinek <petr@2ndquadrant.com>)
Responses Re: Set new system identifier using pg_resetxlog  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Tue, Jun 17, 2014 at 10:33 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> On 17/06/14 16:18, Robert Haas wrote:
>> On Fri, Jun 13, 2014 at 8:31 PM, Petr Jelinek <petr@2ndquadrant.com>
>> wrote:
>>> attached is a simple patch which makes it possible to change the system
>>> identifier of the cluster in pg_control. This is useful for
>>> individualization of the instance that is started on top of data
>>> directory
>>> produced by pg_basebackup - something that's helpful for logical
>>> replication
>>> setup where you need to easily identify each node (it's used by
>>> Bidirectional Replication for example).
>>
>>
>> I can clearly understand the utility of being able to reset the system
>> ID to a new, randomly-generated system ID - but giving the user the
>> ability to set a particular value of their own choosing seems like a
>> pretty sharp tool.  What is the use case for that?
>
> Let's say you want to initialize new logical replication node via
> pg_basebackup and you want your replication slots to be easily identifiable
> so you use your local system id as part of the slot name.
>
> In that case you need to know the future system id of the node because you
> need to register the slot before consistent point to which you replay via
> streaming replication (and you can't replay anymore once you changed the
> system id). Which means you need to generate your system id in advance and
> be able to change it in pg_control later.

Hmm.  I guess that makes sense.

But it seems to me that we might need to have a process discussion
here, because, while I'm all in favor of incremental feature proposals
that build towards a larger goal, it currently appears that the larger
goal toward which you are building is not something that's been
publicly discussed and debated on this list.  And I really think we
need to have that conversation.  Obviously, individual patches will
still need to be debated, but I feel like 2ndQuadrant is trying to
construct a castle without showing the community the floor plan.  I
believe that there is relatively broad agreement that we would all
like a castle, but different people may have legitimately different
ideas about how it should be constructed.  If the work arrives as a
series of disconnected pieces (user-specified system ID, event
triggers for CREATE, etc.), then everyone outside of 2ndQuadrant has
to take it on faith that those pieces are going to eventually fit
together in a way that we'll all be happy with.  In some cases, that's
fine, because the feature is useful on its own merits whether it ends
up being part of the castle or not.

But in other cases, like this one, if the premise that the slot name
should match the system identifier isn't something the community wants
to accept, then taking a patch that lets people do that is probably a
bad idea, because at least one person will use it to set the system
identifier of a system to a value that enables physical replication to
take place when that is actually totally unsafe, and we don't want to
enable that for no reason.  Maybe the slot name should match the
replication identifier rather than the standby system ID, for example.
There are conflicting proposals for how replication identifiers should
work, but one of those proposals limits it to 16 bits.  If we're going
to have multiple identifiers floating around anyway, I'd rather have a
slot called 7 than one called 6024402925054484590.  On the other hand,
maybe there's going to be a new proposal to use the database system
identifier as a replication identifier, which might be a fine idea and
which would demolish that argument.
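
To make the naming comparison above concrete, here is a minimal sketch of the two slot names side by side. It is illustrative only: the helper `slot_name`, the `node_` prefix, and the range checks are hypothetical and not part of any patch in this thread. The only facts taken from the discussion are that one replication identifier proposal is limited to 16 bits and that the example system identifier is 6024402925054484590, which needs 64 bits.

```python
# Sketch only: compare slot names built from a 16-bit replication
# identifier with ones built from a 64-bit system identifier.
# slot_name() and the "node_" prefix are hypothetical.

UINT16_MAX = 2**16 - 1
UINT64_MAX = 2**64 - 1


def slot_name(prefix: str, ident: int, max_ident: int) -> str:
    """Build a slot name from an identifier, rejecting out-of-range values."""
    if not 0 <= ident <= max_ident:
        raise ValueError(f"identifier {ident} does not fit in 0..{max_ident}")
    return f"{prefix}_{ident}"


# A 16-bit replication identifier yields a short, readable name.
short_name = slot_name("node", 7, UINT16_MAX)

# The system identifier from the example does not fit in 16 bits,
# so a name derived from it is necessarily much longer.
system_id = 6024402925054484590
long_name = slot_name("node", system_id, UINT64_MAX)

if __name__ == "__main__":
    print(short_name, len(short_name))
    print(long_name, len(long_name))
```

Either name is usable as an identifier; the difference is only in how easy it is for a person to read and type.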

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company