Re: Dropped connections with pg_basebackup - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Dropped connections with pg_basebackup
Msg-id 56045DC7.7060002@aklaver.com
In response to Dropped connections with pg_basebackup  (Francisco Reyes <lists@natserv.net>)
List pgsql-general
On 09/24/2015 12:57 PM, Francisco Reyes wrote:
> Have an existing setup of 9.3 servers. Replication has been rock solid,
> but recently the circuits between data centers were upgraded and
> pg_basebackup now seems to fail often when setting up streaming
> replication. What used to take 10+ hours now only took 68 minutes, but
> had to do many retries. Many attempts fail within minutes while others
> go to 90% or higher and then drop. The reason we are doing a sync is
> because we have to swap data centers every so often for compliance. So I
> had to swap master and slave.
>
> Calling pg_basebackup like this:
> pg_basebackup -P -R -X s -h <HostName> -D <Folder> -U replicator
>
> The error we keep having is:
> Sep 23 13:36:32 <HostName> postgres[16804]: [11-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: SSL error: bad write retry
> Sep 23 13:36:32 <HostName> postgres[16804]: [12-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: SSL error: bad write retry

Seems to be an SSL problem, so how is your SSL set up on the servers?
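
To make that question concrete, here is a minimal sketch of one way to dump the SSL-related settings on each server. The function names, the "postgres" role and database, and the choice of settings are my assumptions, not something taken from your setup. I include ssl_renegotiation_limit only because it is an SSL setting in 9.3 that applies to long, high-volume connections; whether it has anything to do with the error above is a guess.

```shell
#!/bin/sh
# Sketch only: list SSL-related settings on a 9.3 server.
# Function names and the "postgres" role/database are assumptions.

# Print the SQL that reports the settings of interest.
ssl_settings_sql() {
    cat <<'EOF'
SHOW ssl;
SHOW ssl_ciphers;
SHOW ssl_renegotiation_limit;
EOF
}

# Run that SQL against one host.
# $1 = host to query; needs psql on the PATH and a role allowed to connect.
check_ssl_settings() {
    ssl_settings_sql | psql -h "$1" -U postgres -d postgres -At
}
```

Running check_ssl_settings <HostName> against both the master and the standby and comparing the output would show whether the two ends are configured the same way.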

> Sep 23 13:36:32 <HostName> postgres[16804]: [13-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator FATAL: connection to client lost
> Sep 23 13:36:32 <HostName> postgres[16972]: [9-1] 2015-09-23 13:36:32
> EDT <IP> [unknown] replicator LOG: could not receive data from client:
> Connection reset by peer
>
> I have been working with the network team and we have even been actively
> monitoring the line, and running ping, as the replication is set up. At
> the point the connection reset by peer error happens, we don't see any
> issue with the network and ping doesn't show an issue at that point in
> time.
>
> The issue also happened on another set of machines and likewise, had to
> retry many times before pg_basebackup would do the initial sync. Once
> the initial sync is set, replication is fine.
>
> I tried both "-X s" (stream) and "-X f" (fetch) and both fail often.
>
> Any ideas what may be going on?
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

