Thread: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

[HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

From
Masahiko Sawada
Date:
Hi,

I noticed that the logical replication launcher uses
wal_retrieve_retry_interval as a interval of launching logical
replication worker process. This behavior is not documented and I
guess this is no longer consistent with what its name means.

I think that we should either introduce a new GUC parameter (say
logical_replication_retry_interval?) for this or update the
description of wal_retrieve_retry_interval. IMO the former is better.

Thought?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Logical replication launcher useswal_retrieve_retry_interval

From
Petr Jelinek
Date:
On 14/04/17 12:57, Masahiko Sawada wrote:
> Hi,
> 
> I noticed that the logical replication launcher uses
> wal_retrieve_retry_interval as a interval of launching logical
> replication worker process. This behavior is not documented and I
> guess this is no longer consistent with what its name means.
> 

Yes that was done based on reviews (and based on general attitude of not
adding more knobs that are similar in meaning). It is briefly documented
in the replication config section. Same is true for wal_receiver_timeout
btw.

> I think that we should either introduce a new GUC parameter (say
> logical_replication_retry_interval?) for this or update the
> description of wal_retrieve_retry_interval. IMO the former is better.
> 

I am not quite sure adding more GUCs is all that great option. When
writing the patches I was wondering if we should perhaps rename the
wal_receiver_timeout and wal_retrieve_retry_interval to something that
makes more sense for both physical and logical replication though.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

From
Michael Paquier
Date:
On Fri, Apr 14, 2017 at 9:19 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> I am not quite sure adding more GUCs is all that great option. When
> writing the patches I was wondering if we should perhaps rename the
> wal_receiver_timeout and wal_retrieve_retry_interval to something that
> makes more sense for both physical and logical replication though.

It seems to me that you should really have a different GUC,
wal_retrieve_retry_interval has been designed to work in the startup
process, and I think that it should still only behave as originally
designed. And at some point I think that it would make as well sense
to be able to make this parameter settable at worker-level.
-- 
Michael



Re: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

From
Masahiko Sawada
Date:
On Fri, Apr 14, 2017 at 9:19 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> On 14/04/17 12:57, Masahiko Sawada wrote:
>> Hi,
>>
>> I noticed that the logical replication launcher uses
>> wal_retrieve_retry_interval as a interval of launching logical
>> replication worker process. This behavior is not documented and I
>> guess this is no longer consistent with what its name means.
>>
>
> Yes that was done based on reviews (and based on general attitude of not
> adding more knobs that are similar in meaning). It is briefly documented
> in the replication config section. Same is true for wal_receiver_timeout
> btw.

Thank you for the comment!

These two parameters are classed as a standby server parameter in the
document. We might want to fix it.

>> I think that we should either introduce a new GUC parameter (say
>> logical_replication_retry_interval?) for this or update the
>> description of wal_retrieve_retry_interval. IMO the former is better.
>>
>
> I am not quite sure adding more GUCs is all that great option. When
> writing the patches I was wondering if we should perhaps rename the
> wal_receiver_timeout and wal_retrieve_retry_interval to something that
> makes more sense for both physical and logical replication though.
>

It would work but IMO having multiple different behaviors for one GUC
parameter is not good design.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Logical replication launcher useswal_retrieve_retry_interval

From
Petr Jelinek
Date:
On 14/04/17 14:30, Michael Paquier wrote:
> On Fri, Apr 14, 2017 at 9:19 PM, Petr Jelinek
> <petr.jelinek@2ndquadrant.com> wrote:
>> I am not quite sure adding more GUCs is all that great option. When
>> writing the patches I was wondering if we should perhaps rename the
>> wal_receiver_timeout and wal_retrieve_retry_interval to something that
>> makes more sense for both physical and logical replication though.
> 
> It seems to me that you should really have a different GUC,
> wal_retrieve_retry_interval has been designed to work in the startup
> process, and I think that it should still only behave as originally
> designed.

Ah yeah I am actually confusing it with wal_receiver_timeout which
behaves same for wal_receiver and subscription worker. So yeah it makes
sense to have separate GUC (I wonder if we then need yet another one for
tablesync though since both of those will be controlling restarts of
subscription workers after crash).

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

From
Masahiko Sawada
Date:
On Fri, Apr 14, 2017 at 9:59 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> On 14/04/17 14:30, Michael Paquier wrote:
>> On Fri, Apr 14, 2017 at 9:19 PM, Petr Jelinek
>> <petr.jelinek@2ndquadrant.com> wrote:
>>> I am not quite sure adding more GUCs is all that great option. When
>>> writing the patches I was wondering if we should perhaps rename the
>>> wal_receiver_timeout and wal_retrieve_retry_interval to something that
>>> makes more sense for both physical and logical replication though.
>>
>> It seems to me that you should really have a different GUC,
>> wal_retrieve_retry_interval has been designed to work in the startup
>> process, and I think that it should still only behave as originally
>> designed.
>
> Ah yeah I am actually confusing it with wal_receiver_timeout which
> behaves same for wal_receiver and subscription worker. So yeah it makes
> sense to have separate GUC

Attached two patches add new GUCs apply_worker_timeout and
apply_worker_launch_interval which are used instead of
wal_receiver_timeout and wal_retrieve_retry_timeout. These new
parameters are not settable at worker-level so far.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Logical replication launcher useswal_retrieve_retry_interval

From
Peter Eisentraut
Date:
On 4/16/17 22:40, Masahiko Sawada wrote:
> Attached two patches add new GUCs apply_worker_timeout and
> apply_worker_launch_interval which are used instead of
> wal_receiver_timeout and wal_retrieve_retry_timeout. These new
> parameters are not settable at worker-level so far.

Under what circumstances are these needed?  Does anyone ever set these?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Logical replication launcher useswal_retrieve_retry_interval

From
Petr Jelinek
Date:
On 18/04/17 16:24, Peter Eisentraut wrote:
> On 4/16/17 22:40, Masahiko Sawada wrote:
>> Attached two patches add new GUCs apply_worker_timeout and
>> apply_worker_launch_interval which are used instead of
>> wal_receiver_timeout and wal_retrieve_retry_timeout. These new
>> parameters are not settable at worker-level so far.
> 
> Under what circumstances are these needed?  Does anyone ever set these?
> 

Personally I don't see need for apply_worker_timeout, no idea why that
can't use wal_receiver_timeout, the mechanics are exactly same, and it's
IMHO only needed because default tcp keepalive settings are usually too
generous. As for apply_worker_launch_interval, I think we want different
name so that it can be used for tablesync rate limiting as well.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Logical replication launcher useswal_retrieve_retry_interval

From
Peter Eisentraut
Date:
On 4/18/17 12:00, Petr Jelinek wrote:
> As for apply_worker_launch_interval, I think we want different
> name so that it can be used for tablesync rate limiting as well.

But that's a mechanism we don't have yet, so maybe we should design that
when we get there?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services