Thread: replication protocol documentation inconsistencies

replication protocol documentation inconsistencies

From
Peter Eisentraut
Date:
Looking at
http://www.postgresql.org/docs/devel/static/protocol-replication.html
under START_REPLICATION it goes

"""
The payload of each CopyData message from server to the client contains
a message of one of the following formats:

If a slot's name is provided via slotname, it will be updated as
replication progresses so that the server knows which WAL segments - and
if hot_standby_feedback is on which transactions - are still needed by
the standby.

XLogData (B)
...

Primary keepalive message (B)
...
"""

That second paragraph was inserted recently and doesn't make sense
there.  It should be moved somewhere else.

More generally, it is weird that the message formats are described
there, even though the rest of the protocol documentation only mentions
the messages by name and then describes the formats later
(http://www.postgresql.org/docs/devel/static/protocol-message-types.html
and
http://www.postgresql.org/docs/devel/static/protocol-message-formats.html).For example, the meaning of the "(B)" code
isn'tdescribed until two
 
sections later.

I think the description of the details of the these messages should also
be moved there.  The CopyBothResponse, which is also used for
replication only, is also listed among the "normal" message formats.



Re: replication protocol documentation inconsistencies

From
Andres Freund
Date:
Hi,

On 2014-05-21 07:29:53 -0400, Peter Eisentraut wrote:
> Looking at
> http://www.postgresql.org/docs/devel/static/protocol-replication.html
> under START_REPLICATION it goes
> 
> """
> The payload of each CopyData message from server to the client contains
> a message of one of the following formats:
> 
> If a slot's name is provided via slotname, it will be updated as
> replication progresses so that the server knows which WAL segments - and
> if hot_standby_feedback is on which transactions - are still needed by
> the standby.
> 
> XLogData (B)
> ...
> 
> Primary keepalive message (B)
> ...
> """
> 
> That second paragraph was inserted recently and doesn't make sense
> there.  It should be moved somewhere else.

Hm. I am not sure why it doesn't make sense there? It's about the SLOT
$slotname parameter to START_REPLICATION?

> More generally, it is weird that the message formats are described
> there, even though the rest of the protocol documentation only mentions
> the messages by name and then describes the formats later
> (http://www.postgresql.org/docs/devel/static/protocol-message-types.html
> and
> http://www.postgresql.org/docs/devel/static/protocol-message-formats.html).
>  For example, the meaning of the "(B)" code isn't described until two
> sections later.
> 

I am not sure that makes sense. These messages cannot be sent as
toplevel messages - they're just describing the contents of the CopyBoth
stream after START_REPLICATION has begun. It seems wierd to add these
'subprotocol' messages to the toplevel protocol description.

> I think the description of the details of the these messages should also
> be moved there.  The CopyBothResponse, which is also used for
> replication only, is also listed among the "normal" message formats.

I think that's different because CopyBoth is a toplevel protocol issue.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: replication protocol documentation inconsistencies

From
Robert Haas
Date:
On Wed, May 28, 2014 at 6:51 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-05-21 07:29:53 -0400, Peter Eisentraut wrote:
>> Looking at
>> http://www.postgresql.org/docs/devel/static/protocol-replication.html
>> under START_REPLICATION it goes
>>
>> """
>> The payload of each CopyData message from server to the client contains
>> a message of one of the following formats:
>>
>> If a slot's name is provided via slotname, it will be updated as
>> replication progresses so that the server knows which WAL segments - and
>> if hot_standby_feedback is on which transactions - are still needed by
>> the standby.
>>
>> XLogData (B)
>> ...
>>
>> Primary keepalive message (B)
>> ...
>> """
>>
>> That second paragraph was inserted recently and doesn't make sense
>> there.  It should be moved somewhere else.
>
> Hm. I am not sure why it doesn't make sense there? It's about the SLOT
> $slotname parameter to START_REPLICATION?

I think it would probably read better if we added that into the first
paragraph about START_REPLICATION, instead of having it down at the
end.  i.e. "Instructs server to start streaming WAL, starting at WAL
position XXX/XXX. If TIMELINE option is specified, streaming starts on
timeline tli; otherwise, the server's current timeline is selected.
The server can reply with an error, e.g. if the requested section of
WAL has already been recycled.  On success, server responds with a
CopyBothResponse message, and then starts to stream WAL to the
frontend.  If a slot's name is provided via slotname, it will be
updated as replication progresses so that the server knows which WAL
segments - and if hot_standby_feedback is on which transactions - are
still needed by the standby."

This bit here:

"The payload of each CopyData message from server to the client
contains a message of one of the following formats:"

...is followed by a colon and needs to immediately proceed the list to
which it refers.

>> More generally, it is weird that the message formats are described
>> there, even though the rest of the protocol documentation only mentions
>> the messages by name and then describes the formats later
>> (http://www.postgresql.org/docs/devel/static/protocol-message-types.html
>> and
>> http://www.postgresql.org/docs/devel/static/protocol-message-formats.html).
>>  For example, the meaning of the "(B)" code isn't described until two
>> sections later.
>
> I am not sure that makes sense. These messages cannot be sent as
> toplevel messages - they're just describing the contents of the CopyBoth
> stream after START_REPLICATION has begun. It seems wierd to add these
> 'subprotocol' messages to the toplevel protocol description.

I see your point, but I think Peter has a good point, too.  It would
be weird to document this mini-protocol mixed in with the main
protocol, and it's so short that adding a separate section for it
hardly makes sense, but it's still strange to have the mini-protocol
being documented before we've explained our conventions for how we
document wire protocol messages.  Forward references are best avoided,
especially implicit ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: replication protocol documentation inconsistencies

From
Alvaro Herrera
Date:
Robert Haas wrote:
> On Wed, May 28, 2014 at 6:51 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-05-21 07:29:53 -0400, Peter Eisentraut wrote:

> >> More generally, it is weird that the message formats are described
> >> there, even though the rest of the protocol documentation only mentions
> >> the messages by name and then describes the formats later
> >> (http://www.postgresql.org/docs/devel/static/protocol-message-types.html
> >> and
> >> http://www.postgresql.org/docs/devel/static/protocol-message-formats.html).
> >>  For example, the meaning of the "(B)" code isn't described until two
> >> sections later.
> >
> > I am not sure that makes sense. These messages cannot be sent as
> > toplevel messages - they're just describing the contents of the CopyBoth
> > stream after START_REPLICATION has begun. It seems wierd to add these
> > 'subprotocol' messages to the toplevel protocol description.
> 
> I see your point, but I think Peter has a good point, too.  It would
> be weird to document this mini-protocol mixed in with the main
> protocol, and it's so short that adding a separate section for it
> hardly makes sense, but it's still strange to have the mini-protocol
> being documented before we've explained our conventions for how we
> document wire protocol messages.  Forward references are best avoided,
> especially implicit ones.

IMHO this mini-protocol and the CopyBoth "subprotocol" should be
documented in full in a separate subsection -- maybe even have its own
index entry which points to one place that documents the whole thing.
We can't expect people to read the whole FE/BE protocol chapter to hunt
for the hidden references to the replication protocol or the CopyBoth
stream.

I would put aside the "subsection too small to make sense" argument.
I don't think that matters much.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services