Re: Replication identifiers, take 4 - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: Replication identifiers, take 4
Date
Msg-id 54E9D3D2.4080103@2ndquadrant.com
Whole thread Raw
In response to Re: Replication identifiers, take 4  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 22/02/15 09:57, Andres Freund wrote:
> On 2015-02-19 00:49:50 +0100, Petr Jelinek wrote:
>> On 16/02/15 10:46, Andres Freund wrote:
>>> On 2015-02-16 11:34:10 +0200, Heikki Linnakangas wrote:
>>>
>>>> At a quick glance, this basic design seems workable. I would suggest
>>>> expanding the replication IDs to regular 4 byte oids. Two extra bytes is a
>>>> small price to pay, to make it work more like everything else in the system.
>>>
>>> I don't know. Growing from 3 to 5 byte overhead per relevant record (or
>>> even 0 to 5 in case the padding is reused) is rather noticeable. If we
>>> later find it to be a limit (I seriously doubt that), we can still
>>> increase it in a major release without anybody really noticing.
>>>
>>
>> I agree that limiting the overhead is important.
>>
>> But I have related though about this - I wonder if it's worth to try to map
>> this more directly to the CommitTsNodeId.
>
> Maybe. I'd rather go the other way round though;
> replication_identifier.c/h's stuff seems much more generic than
> CommitTsNodeId.
>

Probably not more generic but definitely nicer as the nodes are named 
and yes it has richer API.

>> I mean it is currently using that
>> for the actual storage, but I am thinking of having the Replication
>> Identifiers be the public API and the CommitTsNodeId stuff be just hidden
>> implementation detail. This thought is based on the fact that CommitTsNodeId
>> provides somewhat meaningless number while Replication Identifiers give us
>> nice name for that number. And if we do that then the type should perhaps be
>> same for both?
>
> I'm not sure. Given that this is included in a significant number of
> recordsd I'd really rather not increase the overhead as described
> above. Maybe we can just limit CommitTsNodeId to the same size for now?
>

That would make sense.

I also wonder about the default nodeid infrastructure the committs 
provides. I can't think of use-case which it can be used for and which 
isn't solved by replication identifiers - in the end the main reason I 
added that infra was to make it possible to write something like 
replication identifiers as part of an extension. In any case I don't 
think the default nodeid can be used in parallel with replication 
identifiers since one will overwrite the SLRU record for the other. 
Maybe it's enough if this is documented but I think it might be better 
if this patch removed that default committs nodeid infrastructure. It's 
just few lines of code which nobody should be using yet.

Thinking about this some more and rereading the code I see that you are 
setting TransactionTreeSetCommitTsData during xlog replay, but not 
during transaction commit, that does not seem correct as the local 
records will not have any nodeid/origin.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: hash agg is slower on wide tables?
Next
From: Andres Freund
Date:
Subject: Re: SSL renegotiation