Thread: Primary keepalive message not appearing in Logical Streaming Replication

Primary keepalive message not appearing in Logical Streaming Replication

From
Virendra Negi
Date:
I have implemented Logical Streaming Replication and things are working fine: I see the XLogData messages appearing and I'm able to parse them.

But I haven't seen any "Primary keepalive message" yet. I tried setting tcp_keepalives_interval and tcp_keepalives_idle, both as client runtime parameters and in postgresql.conf, but there is still no sign of it.

Any information about it?



Re: Primary keepalive message not appearing in Logical Streaming Replication

From
Virendra Negi
Date:
I forgot to mention the output plugin I have been using with logical replication: it's wal2json.




Re: Primary keepalive message not appearing in Logical Streaming Replication

From
Michael Loftis
Date:



Neither tcp_keepalives_interval nor tcp_keepalives_idle is part of the Pg protocol. Both configure the OS TCP stack, and TCP keepalive probes are not visible to applications at all.
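
To illustrate why these settings never show up in the replication stream, here is a minimal sketch of TCP keepalive being switched on directly on a client socket. The helper name `enable_tcp_keepalive` and its default values are made up for illustration, and `TCP_KEEPIDLE` / `TCP_KEEPINTVL` are platform-specific, so the sketch only sets them where Python exposes them.

```python
import socket


def enable_tcp_keepalive(sock, idle_s=60, interval_s=10):
    """Enable OS-level TCP keepalive on a socket (hypothetical helper)."""
    # Ask the kernel to send keepalive probes on an idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Idle time and probe interval are not available on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
```

The probes are generated and answered by the kernels on both ends; nothing is delivered to `recv()`, so an application parsing CopyData messages will never see them.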



--

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler

Re: Primary keepalive message not appearing in Logical Streaming Replication

From
Virendra Negi
Date:
Agreed, but then why does the documentation describe a message specification for it, and ask the client to reply if a particular flag is set? (1 means that the client should reply to this message as soon as possible, to avoid a timeout disconnect. 0 otherwise.)


Primary keepalive message (B)
Byte1('k')

Identifies the message as a sender keepalive.

Int64

The current end of WAL on the server.

Int64

The server's system clock at the time of transmission, as microseconds since midnight on 2000-01-01.

Byte1

1 means that the client should reply to this message as soon as possible, to avoid a timeout disconnect. 0 otherwise.

The receiving process can send replies back to the sender at any time, using one of the following message formats (also in the payload of a CopyData message):
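
For reference, the keepalive layout quoted above can be decoded with a few lines of Python. This is only a sketch: the function names are made up, the payload is assumed to be the body of a CopyData message with the CopyData framing already stripped, and the reply layout follows the Standby status update message ('r') described on the same documentation page.

```python
import struct

# Microseconds between the Unix epoch (1970-01-01) and the PostgreSQL
# epoch (2000-01-01) used for the clock fields.
PG_EPOCH_OFFSET_US = 946_684_800 * 1_000_000


def parse_keepalive(payload: bytes):
    """Decode a Primary keepalive message: 'k', Int64, Int64, Byte1."""
    if len(payload) != 18 or payload[:1] != b"k":
        raise ValueError("not a primary keepalive message")
    # Network byte order: WAL end, server clock (us since 2000-01-01), flag.
    wal_end, server_clock_us, reply_flag = struct.unpack("!QqB", payload[1:])
    return wal_end, server_clock_us, reply_flag == 1


def build_standby_status_update(written, flushed, applied, now_unix_us,
                                request_reply=False):
    """Build a Standby status update ('r') to send back as CopyData."""
    client_clock_us = now_unix_us - PG_EPOCH_OFFSET_US
    return struct.pack("!cQQQqB", b"r", written, flushed, applied,
                       client_clock_us, int(request_reply))
```

When `parse_keepalive` reports that a reply is requested, sending the result of `build_standby_status_update` promptly is what avoids the timeout disconnect mentioned in the flag description.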




Re: Primary keepalive message not appearing in Logical Streaming Replication

From
Virendra Negi
Date:
Oh, I missed the documentation link. Here you go: https://www.postgresql.org/docs/9.5/protocol-replication.html


Re: Primary keepalive message not appearing in Logical Streaming Replication

From
Michael Loftis
Date:



This is unrelated to TCP keepalive. I honestly don't know which knob turns these on, but I am familiar with the configuration variables you quoted earlier, and they are not it. Perhaps someone else can chime in on how to enable the protocol-level keepalive in replication.




Re: Primary keepalive message not appearing in Logical Streaming Replication

From
Tomas Vondra
Date:
On Sun, Sep 15, 2019 at 09:44:14AM -0600, Michael Loftis wrote:
>This is unrelated to TCP keepalive. I honestly don't know where the knob is
>to turn these on but the configuration variables you quoted earlier I am
>familiar with and they are not it. Perhaps someone else can chime in with
>how to enable the protocol level keepalive in replication.
>

Pretty sure it's wal_sender_timeout, which is 60s by default. If you
tune it down, it should send keepalives more often.

See WalSndKeepaliveIfNecessary in [1]:

[1] https://github.com/postgres/postgres/blob/master/src/backend/replication/walsender.c#L3425
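
As a rough illustration of the timing, and assuming my reading of that function is right, the sender asks for a reply once half of wal_sender_timeout has passed without hearing from the client. The sketch below only models that rule; `keepalive_due` is a made-up name, and the real code also tracks whether it is already waiting for a ping response.

```python
from datetime import datetime, timedelta


def keepalive_due(last_reply: datetime, now: datetime,
                  wal_sender_timeout: timedelta) -> bool:
    """Return True once half of wal_sender_timeout has elapsed (simplified)."""
    if wal_sender_timeout <= timedelta(0):
        # A timeout of zero disables the mechanism entirely.
        return False
    return now >= last_reply + wal_sender_timeout / 2
```

With the default of 60s, a client that sends no status updates would expect the first keepalive requesting a reply after roughly 30s of silence; lowering wal_sender_timeout shortens that window.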

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: Primary keepalive message not appearing in Logical Streaming Replication

From
Jeff Janes
Date:
On Sun, Sep 15, 2019 at 11:44 AM Michael Loftis <mloftis@wgops.com> wrote:
>This is unrelated to TCP keepalive. I honestly don't know where the knob is
>to turn these on but the configuration variables you quoted earlier I am
>familiar with and they are not it. Perhaps someone else can chime in with
>how to enable the protocol level keepalive in replication.

Protocol-level keepalives are governed by "wal_sender_timeout".

Cheers,

Jeff