Postgres restart during CopyManager.copyIn does not free connection, thread stuck on QueryExecutorImpl.waitOnLock - Mailing list pgsql-jdbc

From Brendan Reekie
Subject Postgres restart during CopyManager.copyIn does not free connection, thread stuck on QueryExecutorImpl.waitOnLock
Date
Msg-id B7E8BC2518B97643AA6F5E49635CAF0B306BD6BF@wtl-exchp-2.sandvine.com
Responses Re: Postgres restart during CopyManager.copyIn does not free connection, thread stuck on QueryExecutorImpl.waitOnLock  (Alexis Meneses <alexis.meneses@gmail.com>)
List pgsql-jdbc

Hi,

I’m using driver 9.3.1100-jdbc3.jar with a 9.3.5 server.

If the connection to the database is lost because Postgres is restarted while a CopyManager.copyIn() call is in progress, the connection is never freed, and the stack trace shows a thread still waiting for the lock to be released:

    java.lang.Object.$$YJP$$wait(Native Method)
    java.lang.Object.wait(Object.java)
    java.lang.Object.wait(Object.java:503)
    org.postgresql.core.v3.QueryExecutorImpl.waitOnLock(QueryExecutorImpl.java:91)
    org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:228)
    org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:560)
    org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403)
    org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:395)

Debugging through the code, it looks like the issue might be in QueryExecutorImpl.cancelCopy(). When cancelCopy() flushes pgStream, the flush throws an IOException, so the loop that calls processCopyResults() (the call that would release the lock) is never reached. The connection stays open and the lock is never freed.

    /**
     * Finishes a copy operation and unlocks connection discarding any exchanged data.
     * @param op the copy operation presumably currently holding lock on this connection
     * @throws SQLException on any additional failure
     */
    public void cancelCopy(CopyOperationImpl op) throws SQLException {
        if(!hasLock(op))
            throw new PSQLException(GT.tr("Tried to cancel an inactive copy operation"), PSQLState.OBJECT_NOT_IN_STATE);

        SQLException error = null;
        int errors = 0;

        try {
            if(op instanceof CopyInImpl) {
                synchronized (this) {
                    if (logger.logDebug()) {
                        logger.debug("FE => CopyFail");
                    }
                    final byte[] msg = Utils.encodeUTF8("Copy cancel requested");
                    pgStream.SendChar('f'); // CopyFail
                    pgStream.SendInteger4(5 + msg.length);
                    pgStream.Send(msg);
                    pgStream.SendChar(0);
                    pgStream.flush();
                    do {
                        try {
                            processCopyResults(op, true); // discard rest of input
                        } catch(SQLException se) { // expected error response to failing copy
                            errors++;
                            if( error != null ) {
                                SQLException e = se, next;
                                while( (next = e.getNextException()) != null )
                                    e = next;
                                e.setNextException(error);
                            }
                            error = se;
                        }
                    } while(hasLock(op));
                }
            } else if (op instanceof CopyOutImpl) {
                protoConnection.sendQueryCancel();
            }

        } catch(IOException ioe) {
            throw new PSQLException(GT.tr("Database connection failed when canceling copy operation"), PSQLState.CONNECTION_FAILURE, ioe);
        }

        if (op instanceof CopyInImpl) {
            if(errors < 1) {
                throw new PSQLException(GT.tr("Missing expected error response to copy cancel request"), PSQLState.COMMUNICATION_ERROR);
            } else if(errors > 1) {
                throw new PSQLException(GT.tr("Got {0} error responses to single copy cancel request", String.valueOf(errors)), PSQLState.COMMUNICATION_ERROR, error);
            }
        }
    }
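
To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It is not driver code: the class CopyLockSketch and every method in it are hypothetical stand-ins that only mimic the shape of cancelCopy() above, where the lock release sits after a flush that can throw. The try/finally variant is one possible remedy shown for comparison, and it is an assumption, not the driver's actual fix.

```java
import java.io.IOException;

// Hypothetical stand-in for the connection lock; none of this is pgjdbc code.
class CopyLockSketch {
    private Object lockedFor;            // current lock holder, or null when free
    private final boolean streamBroken;  // simulates a socket killed by a server restart

    CopyLockSketch(boolean streamBroken) {
        this.streamBroken = streamBroken;
    }

    synchronized void lock(Object holder) {
        lockedFor = holder;
    }

    synchronized void unlock(Object holder) {
        if (lockedFor == holder) {
            lockedFor = null;
            notifyAll(); // wake any thread parked in a wait-for-lock loop
        }
    }

    synchronized boolean isLocked() {
        return lockedFor != null;
    }

    private void flush() throws IOException {
        if (streamBroken) {
            throw new IOException("simulated broken connection");
        }
    }

    // Same shape as the quoted cancelCopy: the release is only reached when flush() succeeds.
    void cancelLeaky(Object op) throws IOException {
        flush();
        unlock(op);
    }

    // One possible remedy (an assumption, not the driver's actual fix): release in finally.
    void cancelWithFinally(Object op) throws IOException {
        try {
            flush();
        } finally {
            unlock(op);
        }
    }

    // Returns true if the lock is still held after a cancel attempt on a broken stream.
    static boolean lockLeaks(boolean useFinally) {
        CopyLockSketch executor = new CopyLockSketch(true);
        Object op = new Object();
        executor.lock(op);
        try {
            if (useFinally) {
                executor.cancelWithFinally(op);
            } else {
                executor.cancelLeaky(op);
            }
        } catch (IOException expected) {
            // The caller sees the I/O failure in both variants.
        }
        return executor.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("leaky cancel, lock still held:   " + lockLeaks(false));
        System.out.println("finally cancel, lock still held: " + lockLeaks(true));
    }
}
```

In this sketch the first variant leaves the lock held after the IOException, so any later caller that waits for the lock without a timeout would block forever, which matches the waitOnLock frames in the stack trace. Whether the real driver should release the lock or close the connection outright when the stream is broken is a separate design question that the sketch does not settle.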

 

I’ve also tried the latest driver, 9.4-1200, and observed the same behaviour. To reproduce it, I use a test program that streams data into copyIn; I set a breakpoint, then restart the Postgres server while the copyIn is in progress.

Has anyone seen this issue before? Is there a workaround for this scenario?

Thanks in advance,

Brendan
