Re: terminating walsender process due to replication timeout - Mailing list pgsql-general

From	Achilleas Mantzios
Subject	Re: terminating walsender process due to replication timeout
Date	May 24, 2019 06:23:50
Msg-id	8ef8fcb4-e893-58ab-8d48-4a8d802ab5f2@matrix.gatewaynet.com
In response to	Re: terminating walsender process due to replication timeout (AYahorau@ibagroup.eu)
List	pgsql-general

On 23/5/19 5:05 PM, AYahorau@ibagroup.eu wrote:

Hello Everyone!

I can simplify and describe the issue I faced.
I have 2 nodes in the db cluster: master and standby.
I create a simple table on the master node with a command via psql:
CREATE TABLE table1 (a INTEGER);
After this I fill the table with the COPY command from a file that contains 2000000 (2 million) entries.

When I then run, for example, a command such as:
UPDATE table1 SET a='1'
or:
DELETE FROM table1;
I see this entry in the PostgreSQL log: terminating walsender process due to replication timeout.

I suppose this issue is caused by the small value of wal_sender_timeout=1s and the long runtime of the queries (about 15 seconds).

What is the best way to proceed? How can I avoid this? Is there any additional configuration that can help here?

I have set mine to 15min. No problems for over 7 months, knock on wood.
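
A minimal sketch of raising the timeout, assuming superuser access on the master and a PostgreSQL version that supports ALTER SYSTEM (9.4 or later). The 15min value is the one mentioned above; editing postgresql.conf directly and reloading should work as well.

```sql
-- Raise the walsender timeout on the master.
-- The default is 60s; a value of 0 disables the timeout.
ALTER SYSTEM SET wal_sender_timeout = '15min';

-- Reload the configuration so the new value is picked up without a restart.
SELECT pg_reload_conf();

-- Check the value in effect (repeat after a moment if the reload
-- has not been processed by this session yet).
SHOW wal_sender_timeout;
```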

Regards,
Andrei

From: Andrei Yahorau/IBA
To: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>,
Cc: pgsql-general@postgresql.org, rene.romero.b@gmail.com
Date: 17/05/2019 11:04
Subject: Re: terminating walsender process due to replication timeout

Hello.

Thanks for the answer.

Can frequent database operations cause a standby server to fall behind? Is there a way to avoid this situation?
I checked that walsender works well in my test if I set wal_sender_timeout to at least 5 seconds.

Best regards,
Andrei Yahorau

From: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
To: AYahorau@ibagroup.eu,
Cc: rene.romero.b@gmail.com, pgsql-general@postgresql.org
Date: 16/05/2019 10:36
Subject: Re: terminating walsender process due to replication timeout

Hello.

At Wed, 15 May 2019 10:04:12 +0300, AYahorau@ibagroup.eu wrote in <OF99D0D839.6A5BCB70-ON432583FB.0025912E-432583FB.0026D664@iba.by>
> Hello,
> Thank You for the response.
>
> Yes that's possible to monitor replication delay. But my questions were
> not about monitoring network issues.
>
> I use exactly wal_sender_timeout=1s because it allows to detect
> replication problems quickly.

Though I don't have an exact idea of your configuration, it seems to me that your standby is simply getting behind more than one second from the master. If you regard the fact as a problem of replication, the configuration can be said to be finding the problem correctly.

Since the keep-alive packet is sent in-band, it doesn't get to the standby before already-sent-but-not-processed packets.

> So, I need clarification to the following questions:
> Is it possible to use exactly this configuration and be sure that it will
> be work properly.
> What did I do wrong? Should I correct my configuration somehow?
> Is this the same issue as mentioned here:
> https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
> ? If it is so, why I do I face this problem again?

It is not the same "problem". What was mentioned there is fast network making the sender-side loop busy, which prevents keepalive packet from sending.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
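
The replication delay discussed in this message can be inspected on the master with a query along these lines. This is only a sketch, assuming PostgreSQL 10 or later, where pg_stat_replication exposes the *_lsn and *_lag columns; the alias replay_gap_bytes is illustrative.

```sql
-- One row per connected standby, as seen from the master.
SELECT application_name,
       state,
       -- WAL bytes sent to the standby but not yet replayed there
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_gap_bytes,
       write_lag,
       flush_lag,
       replay_lag
FROM pg_stat_replication;
```

If replay_lag regularly exceeds wal_sender_timeout while a bulk UPDATE or DELETE is running, the timeout may simply be too short for that workload.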

-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
