Re: replication identifier format - Mailing list pgsql-hackers

From Robert Haas
Subject Re: replication identifier format
Msg-id CA+TgmobR7uFxKqUdC7UbQcqP-pQfrQuCUh72cwVzkFrp2dfjGA@mail.gmail.com
In response to Re: replication identifier format  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Mon, Jun 23, 2014 at 11:28 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> Oh, great.  Somehow I missed the fact that that had been addressed.  I
>> had assumed that we still needed global identifiers in which case I
>> think they'd need to be 64+ bits (preferably more like 128).  If they
>> only need to be locally significant that makes things much better.
>
> Well, I was just talking about the 'short ids' here and how they are
> used in crash recovery/shmem et al. Those indeed don't need to be
> coordinated.
> If you ever use logical decoding on a system that receives changes from
> other systems (cascading replication, multimaster) you'll likely want to
> add the *long* form of that identifier to the output in the output
> plugin so the downstream nodes can identify the source. How one
> specific replication solution deals with coordinating this between
> systems is essentially that suite's problem.

OK.

> The external identifier currently is a 'text' column, so essentially
> unlimited. (Well, I just noticed that the table currently doesn't have a
> toast table assigned, so it's only a couple kb right now, but ...)

OK.  I have no clear reason to dislike that.

>> Is there any real reason to add a pg_replication_identifier table, or
>> should we just let individual replication solutions manage the
>> identifiers within their own configuration tables?
>
> I don't think that'd work. During crash recovery the short/internal IDs
> are read from WAL records and need to be unique across *all*
> databases. Since there's no way for different replication solutions or
> even the same to coordinate this across databases (as there's no way to
> add shared relations) it has to be builtin.

That makes sense.
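
To make the short/long split concrete, here is a minimal Python sketch of a
cluster-wide registry that hands out short internal IDs for long external
names. It is an illustration only, not the patch's implementation: the class
name, the method names, and the 16-bit ID range are all assumptions made for
the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ReplicationIdentifierRegistry:
    """Hypothetical cluster-wide map: long external name -> short internal ID."""

    # Assumed width of the short ID; the thread does not state one.
    max_internal_id: int = 0xFFFF
    _by_external: Dict[str, int] = field(default_factory=dict)
    _by_internal: Dict[int, str] = field(default_factory=dict)

    def get_or_create(self, external_id: str) -> int:
        """Return the short ID for external_id, allocating one if needed."""
        existing = self._by_external.get(external_id)
        if existing is not None:
            return existing
        # One shared allocator means two replication solutions, or two
        # databases, can never be handed the same short ID.
        for candidate in range(1, self.max_internal_id + 1):
            if candidate not in self._by_internal:
                self._by_external[external_id] = candidate
                self._by_internal[candidate] = external_id
                return candidate
        raise RuntimeError("no free internal replication identifiers")

    def external_for(self, internal_id: int) -> Optional[str]:
        """Map a short ID (e.g. one read back from WAL) to its long name."""
        return self._by_internal.get(internal_id)


registry = ReplicationIdentifierRegistry()
id_a = registry.get_or_create("solution-x: node-a")
id_b = registry.get_or_create("solution-y: node-b")
```

The point of the sketch is only that allocation has to go through one shared
place; if each solution kept its own table, nothing would stop both from
picking the same short ID.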

> It's also useful so we can have stuff like the
> 'pg_replication_identifier_progress' view which tells you internal_id,
> external_id, remote_lsn, local_lsn. Just showing the internal ID would
> imo be bad.

OK.
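
As a rough picture of what such a progress view would report, the Python
sketch below joins a short-ID-to-name map with per-ID progress and renders
the four columns named above. The function and type names are hypothetical
and the data is made up; the only thing taken from PostgreSQL is the usual
textual LSN form (high and low 32 bits in hex, separated by a slash).

```python
from typing import Dict, List, NamedTuple, Tuple


def format_lsn(lsn: int) -> str:
    """Render a 64-bit LSN as two hex halves separated by a slash."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


class ProgressRow(NamedTuple):
    internal_id: int
    external_id: str
    remote_lsn: str
    local_lsn: str


def progress_rows(
    names: Dict[int, str],
    progress: Dict[int, Tuple[int, int]],
) -> List[ProgressRow]:
    """Build one row per internal ID, attaching the external name."""
    rows = []
    for internal_id, (remote, local) in sorted(progress.items()):
        rows.append(
            ProgressRow(
                internal_id,
                # Without the name lookup, the row would show only a bare number.
                names.get(internal_id, "(unknown)"),
                format_lsn(remote),
                format_lsn(local),
            )
        )
    return rows


# Made-up example data: one identifier with its remote and local positions.
names = {1: "solution-x: node-a"}
progress = {1: (0x16B3748, 0x100000000)}
rows = progress_rows(names, progress)
```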

>> I guess one
>> question is: What happens if there are multiple replication solutions
>> in use on a single server?  How do they coordinate?
>
> What's your concern here? You're wondering how they can make sure the
> identifiers they create are non-overlapping?

Yeah, I was just thinking that might be why you installed a catalog
table for this, but now I see that there are several other reasons
also.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
