Thread: Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.

On Sat, Dec 31, 2011 at 10:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Send new protocol keepalive messages to standby servers.
> Allows streaming replication users to calculate transfer latency
> and apply delay via internal functions. No external functions yet.

pq_flush_if_writable() needs to be called just after
WalSndKeepalive(). Otherwise,
keepalive packet is not sent for a while.

+static void
+ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime)

walEnd is not used in ProcessWalSndrMessage() at all. Can't we remove it?
If yes, walEnd field in WalSndrMessage is also not used anywhere, so ISTM
we can remove it.

+    elog(DEBUG2, "sendtime %s receipttime %s replication apply delay %d
transfer latency %d",
+                    timestamptz_to_str(sendTime),
+                    timestamptz_to_str(lastMsgReceiptTime),
+                    GetReplicationApplyDelay(),
+                    GetReplicationTransferLatency());

The unit of replication apply delay and transfer latency should be in
log message.

GetReplicationApplyDelay() and GetReplicationTransferLatency() are called
whenever the standby receives the message from the master. Which might
degrade the performance of replication a bit. So we should skip the above elog
when log_message >= DEBUG2?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


On Wed, Jan 11, 2012 at 2:05 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Sat, Dec 31, 2011 at 10:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Send new protocol keepalive messages to standby servers.
>> Allows streaming replication users to calculate transfer latency
>> and apply delay via internal functions. No external functions yet.

Thanks for further review.

> pq_flush_if_writable() needs to be called just after
> WalSndKeepalive(). Otherwise,
> keepalive packet is not sent for a while.

It will get sent though won't it? Maybe not immediately. I guess we
may as well flush though, since we're not doing anything else - by
definition. Will add.

> +static void
> +ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime)
>
> walEnd is not used in ProcessWalSndrMessage() at all. Can't we remove it?
> If yes, walEnd field in WalSndrMessage is also not used anywhere, so ISTM
> we can remove it.

It's there to allow extension of the message processing to be more
complex than it currently is. Changing the protocol is much harder
than changing a function call.

I'd like to keep it since it doesn't have any negative effects.

> +       elog(DEBUG2, "sendtime %s receipttime %s replication apply delay %d
> transfer latency %d",
> +                                       timestamptz_to_str(sendTime),
> +                                       timestamptz_to_str(lastMsgReceiptTime),
> +                                       GetReplicationApplyDelay(),
> +                                       GetReplicationTransferLatency());
>
> The unit of replication apply delay and transfer latency should be in
> log message.

OK, will do.

> GetReplicationApplyDelay() and GetReplicationTransferLatency() are called
> whenever the standby receives the message from the master. Which might
> degrade the performance of replication a bit. So we should skip the above elog
> when log_message >= DEBUG2?

OK, will put in a specific test for you.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


On Thu, Jan 12, 2012 at 12:20 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> +static void
>> +ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime)
>>
>> walEnd is not used in ProcessWalSndrMessage() at all. Can't we remove it?
>> If yes, walEnd field in WalSndrMessage is also not used anywhere, so ISTM
>> we can remove it.
>
> It's there to allow extension of the message processing to be more
> complex than it currently is. Changing the protocol is much harder
> than changing a function call.
>
> I'd like to keep it since it doesn't have any negative effects.

OK. Another problem about walEnd is that WalDataMessageHeader.walEnd is not
the same kind of location as WalSndrMessage.walEnd. The former indicates the
location that WAL has already been flushed (maybe not sent yet), i.e.,
"send request
location". OTOH, the latter indicates the location that WAL has
already been sent.
Is this inconsistency intentional?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


On Thu, Jan 12, 2012 at 3:09 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, Jan 12, 2012 at 12:20 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> +static void
>>> +ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime)
>>>
>>> walEnd is not used in ProcessWalSndrMessage() at all. Can't we remove it?
>>> If yes, walEnd field in WalSndrMessage is also not used anywhere, so ISTM
>>> we can remove it.
>>
>> It's there to allow extension of the message processing to be more
>> complex than it currently is. Changing the protocol is much harder
>> than changing a function call.
>>
>> I'd like to keep it since it doesn't have any negative effects.
>
> OK. Another problem about walEnd is that WalDataMessageHeader.walEnd is not
> the same kind of location as WalSndrMessage.walEnd. The former indicates the
> location that WAL has already been flushed (maybe not sent yet), i.e.,
> "send request
> location". OTOH, the latter indicates the location that WAL has
> already been sent.
> Is this inconsistency intentional?

WalSndrMessage isn't set to anything, its just a definition.

PrimaryKeepaliveMessage is a message type that uses WalSndrMessage.
That message type is only sent when the WalSndr is quiet, so what is
the difference, in that case?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


On Thu, Jan 12, 2012 at 5:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> PrimaryKeepaliveMessage is a message type that uses WalSndrMessage.
> That message type is only sent when the WalSndr is quiet, so what is
> the difference, in that case?

Oh, you are right. There is no difference.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


On Thu, Jan 12, 2012 at 10:37 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, Jan 12, 2012 at 5:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> PrimaryKeepaliveMessage is a message type that uses WalSndrMessage.
>> That message type is only sent when the WalSndr is quiet, so what is
>> the difference, in that case?
>
> Oh, you are right. There is no difference.

Here are the changes we discussed. Further comments before commit?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment
On Fri, Jan 13, 2012 at 4:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, Jan 12, 2012 at 10:37 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Thu, Jan 12, 2012 at 5:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> PrimaryKeepaliveMessage is a message type that uses WalSndrMessage.
>>> That message type is only sent when the WalSndr is quiet, so what is
>>> the difference, in that case?
>>
>> Oh, you are right. There is no difference.
>
> Here are the changes we discussed. Further comments before commit?

Can you add the test for avoiding useless call of GetReplicationApplyDelay()
and GetReplicationTransferLatency()?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center