Re: Logical replication fails due to SocketException - Mailing list pgsql-jdbc

From Alex Maltinsky
Subject Re: Logical replication fails due to SocketException
Date
Msg-id CAH+ZVca_hLe5kYSbu1qDvKPszjfhNjX1XP5=_yE3diLJ72uUeA@mail.gmail.com
Whole thread Raw
In response to Re: Logical replication fails due to SocketException  (Dave Cramer <pg@fastcrypt.com>)
List pgsql-jdbc
> Seems to be the common theme. Is there something about windows that drops connections ?

Yeah, I noticed the Windows part too. I'm not aware of anything Windows-specific that resets active connections. Also, I tried tunneling the postgres connection over SSH and got the same exact result while the SSH connection itself was fine.


On Tue, 21 May 2019 at 22:55, Dave Cramer <pg@fastcrypt.com> wrote:


On Tue, 21 May 2019 at 15:49, Alex Maltinsky <alex@bgprotect.com> wrote:
The server runs a dockerized Postgres on Ubuntu. 

The client is on Windows 10.

On Tue, 21 May 2019 at 21:22 Dave Cramer <pg@fastcrypt.com> wrote:
On Tue, 21 May 2019 at 09:58, Alex Maltinsky <alex@bgprotect.com> wrote:
Hi All

I ran into the a problem with Postgres 11 and JDBC driver 42.2.5 which resembles a problem that was posted here before (https://www.postgresql-archive.org/postgresql-Logical-Replication-Stream-fails-with-Database-connection-failed-when-reading-from-copy-td6036639.html), but unfortunately the solution was never posted here.

I have a simple endless loop that follows the official replication example (https://jdbc.postgresql.org/documentation/head/replication.html), and I keep getting socket exceptions like these after fetching approximately 66K rows with remarkable consistency:

    Exception in thread "main" org.postgresql.util.PSQLException: Database connection failed when reading from copy
    at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1037)
    at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
    at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
    at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
    at org.postgresql.core.v3.replication.V3PGReplicationStream.readPending(V3PGReplicationStream.java:78)
    at com.example.main(ReplicationTest.java:48)
    Caused by: java.net.SocketException: socket closed
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140)
    at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:109)
    at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:191)
    at org.postgresql.core.PGStream.receive(PGStream.java:462)
    at org.postgresql.core.PGStream.receive(PGStream.java:446)
    at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1170)
    at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
    ... 5 more

The database log shows "LOG:  could not send data to client: Connection reset by peer"

Wireshark shows that it was the client who suddenly sent a TCP RST to the server and closed the connection.

Parameters: `wal_sender_timeout` is set to 60 seconds, and I'm using a status interval of 10 seconds and  TCP_KEEP_ALIVE is enabled.

The body of the Java loop looks like this:

        while (true) {
            ByteBuffer msg = stream.readPending();
            if (msg == null) {
                TimeUnit.MILLISECONDS.sleep(10L);
                continue;
            }
                     
            LogSequenceNumber lastReceiveLSN = stream.getLastReceiveLSN();
            System.out.println((i++) + " " + lastReceiveLSN);
           
            stream.setAppliedLSN(lastReceiveLSN);
            stream.setFlushedLSN(lastReceiveLSN);
        }


Curiously enough, if I change the loop to the code below, the problem disappears:

        while (true) {
            ByteBuffer msg = stream.readPending();
            if (msg == null) {
                TimeUnit.MILLISECONDS.sleep(10L);
                continue;
            }
           
            int offset = msg.arrayOffset();
            byte[] source = msg.array();
            int length = source.length - offset;
           
            LogSequenceNumber lastReceiveLSN = stream.getLastReceiveLSN();
            System.out.println((i++) + " " + lastReceiveLSN + " " + new String(source, offset, length));
           
            stream.setAppliedLSN(lastReceiveLSN);
            stream.setFlushedLSN(lastReceiveLSN);
        }


Any help would be appreciated

- Alex


What OS are you using ?




Seems to be the common theme. Is there something about windows that drops connections ?



pgsql-jdbc by date:

Previous
From: Dave Cramer
Date:
Subject: Re: Logical replication fails due to SocketException
Next
From: Dave Cramer
Date:
Subject: [pgjdbc/pgjdbc] 8bd906: some fixbugs cleanup (#1486)