
Re: terminating walsender process due to replication timeout - Mailing list pgsql-general

From	Rene Romero Benavides
Subject	Re: terminating walsender process due to replication timeout
Date	May 14, 2019 17:04:54
Msg-id	CANaGW0-7rLaetm5zBopCtYDwXSg43ZyNWMJHk9rMb+7mAOvJwA@mail.gmail.com
In response to	terminating walsender process due to replication timeout (AYahorau@ibagroup.eu)
Responses	Re: terminating walsender process due to replication timeout
List	pgsql-general


To detect network issues, maybe you could monitor replication delay.
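
As a rough illustration of that suggestion, the sketch below computes replication lag in bytes from two LSN strings, such as the sent_lsn and replay_lsn columns that pg_stat_replication reports on the master in PostgreSQL 10 and later. The helper names and the 16 MB alert threshold are assumptions made for illustration. The query is shown only as a string; a real monitor would run it through a database driver.

```python
# Query to run on the master; shown for reference only, not executed here.
LAG_QUERY = """
SELECT application_name, state, sent_lsn, replay_lsn, replay_lag
FROM pg_stat_replication;
"""


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/3000060' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(sent_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL sent by the master but not yet replayed by the standby."""
    return lsn_to_int(sent_lsn) - lsn_to_int(replay_lsn)


def is_lagging(sent_lsn: str, replay_lsn: str,
               threshold_bytes: int = 16 * 1024 * 1024) -> bool:
    """Hypothetical alert rule: flag the standby above an assumed threshold."""
    return lag_bytes(sent_lsn, replay_lsn) > threshold_bytes
```

A lag that keeps growing, or a standby row that disappears from the view, may point to a network or standby problem without requiring a very short wal_sender_timeout.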

On Mon, May 13, 2019 at 6:42 AM <AYahorau@ibagroup.eu> wrote:

Hello PostgreSQL Community!

I ran into an issue on my Linux machine running PostgreSQL 11.3.
I have a two-node database cluster: a master and a standby.
I ran a large number of long-running queries, which caused the databases to fall out of sync with the following message:
terminating walsender process due to replication timeout

Here is the output in debug mode:
2019-05-13 13:21:33 FET 00000 DEBUG: sending replication keepalive
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 DEBUG: CommitTransaction(1) name: unnamed; blockState: END; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-05-13 13:21:34 FET 00000 LOG: terminating walsender process due to replication timeout
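
To read timing out of a log like the one above, a minimal sketch such as the following could measure how long after the last keepalive the walsender was terminated. It assumes each line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, as in this excerpt; the function names are hypothetical, and the result is only as precise as the one-second resolution of the log.

```python
from datetime import datetime

KEEPALIVE = "sending replication keepalive"
TIMEOUT = "terminating walsender process due to replication timeout"


def parse_ts(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' part of a log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def seconds_from_keepalive_to_timeout(lines):
    """Return seconds between the last keepalive and the first timeout, or None."""
    last_keepalive = None
    for line in lines:
        if KEEPALIVE in line:
            last_keepalive = parse_ts(line)
        elif TIMEOUT in line:
            if last_keepalive is None:
                return None
            return (parse_ts(line) - last_keepalive).total_seconds()
    return None


sample = [
    "2019-05-13 13:21:33 FET 00000 DEBUG: sending replication keepalive",
    "2019-05-13 13:21:34 FET 00000 LOG: terminating walsender process due to replication timeout",
]
gap = seconds_from_keepalive_to_timeout(sample)
```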

The issue is reproducible. I configure a two-node cluster, download demo_small.zip from https://edu.postgrespro.ru/, and run the following command:
psql -U user1 -f demo_small.sql db1
and I get the behaviour described above.

I know that I can increase the wal_sender_timeout value to avoid this behaviour (wal_sender_timeout is currently set to 1 second).
To be honest, I don't want to increase wal_sender_timeout, because I would like to detect network issues quickly.
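
For context, here is a simplified model of how the walsender appears to use wal_sender_timeout, offered as a sketch rather than the server's actual code: it requests a reply from the standby once half of the timeout has passed without one, and terminates the connection once the full timeout has passed. The function names are hypothetical and times are in milliseconds.

```python
def keepalive_due_ms(last_reply_ms: int, wal_sender_timeout_ms: int) -> int:
    """Time at which the master asks the standby for a reply (half the timeout)."""
    return last_reply_ms + wal_sender_timeout_ms // 2


def termination_due_ms(last_reply_ms: int, wal_sender_timeout_ms: int) -> int:
    """Time at which the walsender gives up if no reply has arrived."""
    return last_reply_ms + wal_sender_timeout_ms


def reply_window_ms(wal_sender_timeout_ms: int) -> int:
    """How long the standby has to answer after the keepalive is sent."""
    return (termination_due_ms(0, wal_sender_timeout_ms)
            - keepalive_due_ms(0, wal_sender_timeout_ms))


window_1s = reply_window_ms(1000)        # setting used in this report
window_default = reply_window_ms(60000)  # default of 60 seconds
```

Under this model, a 1 second timeout leaves the standby roughly half a second to answer, so a standby that is briefly busy may be indistinguishable from a network failure.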

After some searching, I found that someone ran into a similar issue (https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com), which was fixed in PostgreSQL 9.4.16.

Is my issue the same as the one described at https://www.postgresql.org/message-id/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com?
Is there any other way to avoid it without increasing wal_sender_timeout?

Thank you in advance.
Regards,
Andrei

Genius is 1% inspiration and 99% perspiration.
Thomas Alva Edison
http://pglearn.blogspot.mx/
