Re: Connection terminated by the server causes deadlock in jdbc client side connection - Mailing list pgsql-jdbc

From Dave Cramer
Subject Re: Connection terminated by the server causes deadlock in jdbc client side connection
Date
Msg-id CADK3HHK9CCc=T3xYUrJ6R=cmPDa5jMywMT9eu8-jYANimmbnPQ@mail.gmail.com
Whole thread Raw
In response to Re: Connection terminated by the server causes deadlock in jdbc client side connection  (Dave Cramer <pg@fastcrypt.com>)
List pgsql-jdbc
Hi Leonardo,

So this works fine ? I coded around it as well, but I'd like to see your changes

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 17:17, Dave Cramer <pg@fastcrypt.com> wrote:
Can you send me a patch or better yet a Pull Request on github ?

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 16:06, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

I've now replaced all protoConnection.close() by protoConnection.abort() in all IO Exception handlers for QueryExecutorImpl (v2 and v3 just in case) and that appears to have done the trick. 

With exactly the same load test, I would before lock out all 45 connections in my pool, but now they get released nicely. Once the test finishes, they are all released, when before they would have remained deadlocked.

I really think this is a bug then. Would you kindly indicate further steps?

Thanks,

Leonardo



On 7 October 2015 at 16:27, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
I'm actually giving it a go now with abort(). If this works I'll let you know.

On 7 October 2015 at 16:15, Dave Cramer <pg@fastcrypt.com> wrote:
OK, let me see if I can change the code and give you a driver that will print out and not call close.

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 15:08, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Sadly ExecutorQueryImpl will only print out the exception _after_ closing the connection :)

        catch (IOException e)
        {
            protoConnection.close();
            handler.handleError(new PSQLException(GT.tr("An I/O error occurred while sending to the backend."), PSQLState.CONNECTION_FAILURE, e));
        }


Here is a longer StackTrace in case there is anything else of value. We are using BoneCP and Hibernate. The next line in the stack trace would be our application code.

   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
at org.hibernate.loader.Loader.doQuery(Loader.java:900)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
at org.hibernate.loader.Loader.doList(Loader.java:2526)
at org.hibernate.loader.Loader.doList(Loader.java:2512)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
at org.hibernate.loader.Loader.list(Loader.java:2337)
at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)

Regards,

Leonardo

On 7 October 2015 at 16:01, Dave Cramer <pg@fastcrypt.com> wrote:
Leonardo,

At this point I'm wondering if we get an I/O error if it is feasible to attempt to close it. I am leaning towards no.

What is above this in the stack trace. Do you see any information about the IOException error that caused it

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 14:43, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Thank you very much for your prompt response.
I am using version postgresql-9.3-1101, but I have checked the latest available 9.4 sources and it appears to have the same code (the code example I pasted was from v2, but I see similar code in v3).

Could it be that in this particular scenario we should use ProtocolConnectionImpl.abort() instead of close()?

Thanks,

Leonardo

On 7 October 2015 at 15:23, Dave Cramer <pg@fastcrypt.com> wrote:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo









pgsql-jdbc by date:

Previous
From: bocap
Date:
Subject: Re: JDBC-94: "Multiple resultsets were returned by query" in query end with "; "
Next
From: "Steffen Heil (Mailinglisten)"
Date:
Subject: Re: Connection terminated by the server causes deadlock in jdbc client side connection