Re: Streaming replication connection break - unexpected EOF on standby connection - Mailing list pgsql-admin

From Ganesh Korde
Subject Re: Streaming replication connection break - unexpected EOF on standby connection
Date
Msg-id CAPNyb0VPd2Jb7t5jevq072PSOrgAKaUoohOGiwBJ9P+NRbmABA@mail.gmail.com
In response to Re: Streaming replication connection break - unexpected EOF on standby connection  (Ganesh Korde <ganeshakorde@gmail.com>)
Responses Re: Streaming replication connection break - unexpected EOF on standby connection  (Fabio Pardi <f.pardi@portavita.eu>)
List pgsql-admin
Hi,

    After analysis, the network team found that packets are being reset by the secondary server. Their logs are below.

782.822280 port7 in <Secondary_server_IP>.35918 -> <Primary_server_IP>.5433: rst 1232664740

782.822310 wan2 out <Secondary_server_IP>.35918 -> <Primary_server_IP>.5433: rst 1232664740

782.822313 port7 in <Secondary_server_IP>.35918 -> <Primary_server_IP>.5433: rst 1232664740

782.822315 wan2 out <Secondary_server_IP>.35918 -> <Primary_server_IP>.5433: rst 1232664740

782.822317 port7 in <Secondary_server_IP>.35918 -> <Primary_server_IP>.5433: rst 1232664740

782.822319 wan2 out <Secondary_server_IP>.35918 -> <Primary_server_IP>.5433: rst 1232664740

782.822345 port7 in <Secondary_server_IP>.35918 -> <Primary_server_IP>.5433: rst 1232664740
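
To make the repetition in the capture above easier to see, here is a minimal Python sketch that tallies the reset lines by interface, direction and sequence number. It assumes every line follows exactly the layout shown above (timestamp, interface, direction, source.port -> destination.port: rst sequence); the names LINE_RE and summarize_resets are made up for this illustration and are not part of any firewall tooling.

```python
import re
from collections import Counter

# Assumed line layout, taken from the capture excerpt above:
#   <timestamp> <interface> <in|out> <src>.<sport> -> <dst>.<dport>: rst <seq>
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+)\s+(?P<iface>\S+)\s+(?P<dir>in|out)\s+"
    r"(?P<src>\S+)\.(?P<sport>\d+)\s+->\s+(?P<dst>\S+)\.(?P<dport>\d+):\s+"
    r"rst\s+(?P<seq>\d+)$"
)


def summarize_resets(lines):
    """Count RST lines per (interface, direction, sequence number)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything in another format
        counts[(match["iface"], match["dir"], match["seq"])] += 1
    return counts
```

Fed the seven lines above, this groups them into two buckets (port7 in, wan2 out) that all carry the same sequence number 1232664740, i.e. the same reset is seen repeatedly on both interfaces within a fraction of a millisecond.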


However, they were not able to find out why the secondary is generating reset packets. There are no devices between these servers that could modify the packets.
Although the two servers are behind different firewalls, the log shows that the packets are being reset at the secondary server and not at the firewall level.

A few points I would like to mention about the application:
1. The connection interruption happens during the daytime, when the transaction rate is somewhat higher. During the day, the average is 5 transactions per second (inserts and processing).
2. We are not using a connection pool, so for each request the app server opens a new connection to the db server and disconnects when processing is done.

We are now clueless as to why the secondary server is resetting the packets.  Any help is highly appreciated.

Thanks & Regards,
Ganesh.




On Thu, Jun 28, 2018 at 3:38 PM Ganesh Korde <ganeshakorde@gmail.com> wrote:
Hi  Johannes,

  Thanks for your reply. We are using a VPN tunnel between these two hosts. I will check the remaining questions you mentioned with the network team and get back to you.

Thanks  & Regards,
Ganesh.

On Wed, Jun 27, 2018 at 6:46 PM Johannes Truschnigg <johannes@truschnigg.info> wrote:
Hi Ganesh,


On Wed, Jun 27, 2018 at 06:37:25PM +0530, Ganesh Korde wrote:
> [...]
> 1. For what reason does "unexpected EOF on standby connection" occur
> on the primary db server?
> 2. After replication disconnection, secondary should immediately connect to
> primary, but it takes some time, what could be the reason for this?

From skimming the log, it seems to me that there is an issue at the
socket/network level, which yields the "connection reset by peer" error
message.

What is the network between these two hosts like? Is it a WAN link; is a VPN
or SSH tunnel involved? Do you have other, long-running TCP sessions between
these peers, and do they experience similar or other problems? Do the hosts'
link-layer stats hint at problems, e. g. packet loss? Do the hosts' kernels
leave a message hinting at L2 connectivity problems in their debug ringbuffers
(`dmesg`) at the time you observe the replication drop out?
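
One possible way to look at the link-layer counters mentioned above is sketched below in Python. It assumes both hosts run Linux and expose per-interface counters in /proc/net/dev with the usual layout of two header lines followed by 16 numeric columns per interface; the function names are invented for this illustration.

```python
def parse_proc_net_dev(text):
    """Extract error and drop counters per interface from /proc/net/dev text.

    Assumes the usual Linux column order: 8 receive fields
    (bytes packets errs drop ...) followed by 8 transmit fields.
    """
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        values = [int(field) for field in fields[:16]]
        stats[name.strip()] = {
            "rx_errs": values[2],
            "rx_drop": values[3],
            "tx_errs": values[10],
            "tx_drop": values[11],
        }
    return stats


def read_link_stats(path="/proc/net/dev"):
    """Read and parse the counters from the running (Linux) host."""
    with open(path) as handle:
        return parse_proc_net_dev(handle.read())
```

Calling read_link_stats() on each host shortly before and after a replication drop and comparing the two results would show whether any of these counters moved in that window; counters that stay flat do not by themselves rule out a network problem.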

--
with best regards:
- Johannes Truschnigg ( johannes@truschnigg.info )

www:   https://johannes.truschnigg.info/
phone: +43 650 2 133337
xmpp:  johannes@truschnigg.info

Please do not bother me with HTML-email or attachments. Thank you.
