Thread: patch: streaming of bytea parameter values
This patch allows bytea parameters set via setBytes() / setBinaryStream() to be streamed to the backend. With the patch applied, the additional memory overhead for bytea parameters is small and independent of how large the parameter data is. Note that it doesn't touch the ResultSet path, so you'll still need a pile of extra memory if you execute queries that return large bytea values. It passes the driver's testcases against a 7.4 server, and doesn't show any problems after brief testing with our bytea-using application. I've had a report of strange things happening on the server->client path, but don't have any details at the moment. This needs testing under a real workload, and testing against a 7.3 server (the V2 protocol path is currently untested). Any volunteers? -O ? org/postgresql/core/StreamableBytea.java Index: org/postgresql/core/Encoding.java =================================================================== RCS file: /usr/local/cvsroot/pgjdbc/pgjdbc/org/postgresql/core/Encoding.java,v retrieving revision 1.14 diff -u -c -r1.14 Encoding.java *** org/postgresql/core/Encoding.java 16 Feb 2004 11:35:20 -0000 1.14 --- org/postgresql/core/Encoding.java 21 Apr 2004 03:25:55 -0000 *************** *** 15,23 **** --- 15,27 ---- import java.io.InputStream; import java.io.InputStreamReader; import java.io.Reader; + import java.io.OutputStream; + import java.io.OutputStreamWriter; + import java.io.Writer; import java.io.UnsupportedEncodingException; import java.sql.SQLException; import java.util.Hashtable; + import java.util.Vector; import org.postgresql.util.PSQLException; import org.postgresql.util.PSQLState; *************** *** 31,36 **** --- 35,45 ---- */ private static final Hashtable encodings = new Hashtable(); + /* + * Encodings that are "7-bit safe"; see below. 
+ */ + private static final Vector sevenBitEncodings = new Vector(); + static { //Note: this list should match the set of supported server // encodings found in backend/util/mb/encnames.c *************** *** 72,84 **** --- 81,109 ---- encodings.put("LATIN6", new String[0]); encodings.put("LATIN8", new String[0]); encodings.put("LATIN10", new String[0]); + + // + // These are the Java encodings known to be 7-bit safe i.e. you can + // represent characters 32..127 as a single byte with the corresponding value. + // NB: this list needs extending by someone who knows about encodings, + // I've only populated it with those I'm familiar with. + // + // Beware of case sensitivity here. + // + sevenBitEncodings.addElement("ASCII"); + sevenBitEncodings.addElement("us-ascii"); + sevenBitEncodings.addElement("UTF8"); + sevenBitEncodings.addElement("UTF-8"); + sevenBitEncodings.addElement("ISO8859_1"); } private final String encoding; + private final boolean sevenBit; private Encoding(String encoding) { this.encoding = encoding; + this.sevenBit = (encoding != null && sevenBitEncodings.contains(encoding)); } /* *************** *** 136,141 **** --- 161,174 ---- return encoding; } + /** + * @return true if this encoding is "7 bit safe" + */ + public boolean isSevenBitSafe() + { + return sevenBit; + } + /* * Encode a string to an array of bytes. */ *************** *** 212,217 **** --- 245,272 ---- else { return new InputStreamReader(in, encoding); + } + } + catch (UnsupportedEncodingException e) + { + throw new PSQLException("postgresql.res.encoding", PSQLState.DATA_ERROR, e); + } + } + + /* + * Get a Writer that encodes to the given OutputStream. 
+ */ + public Writer getEncodingWriter(OutputStream out) throws SQLException + { + try + { + if (encoding == null) + { + return new OutputStreamWriter(out); + } + else + { + return new OutputStreamWriter(out, encoding); } } catch (UnsupportedEncodingException e) Index: org/postgresql/core/QueryExecutor.java =================================================================== RCS file: /usr/local/cvsroot/pgjdbc/pgjdbc/org/postgresql/core/QueryExecutor.java,v retrieving revision 1.34 diff -u -c -r1.34 QueryExecutor.java *** org/postgresql/core/QueryExecutor.java 10 Apr 2004 13:53:10 -0000 1.34 --- org/postgresql/core/QueryExecutor.java 21 Apr 2004 03:25:55 -0000 *************** *** 14,19 **** --- 14,21 ---- import java.util.Vector; import java.io.IOException; + import java.io.InputStream; + import java.io.Writer; import java.sql.*; import org.postgresql.Driver; import org.postgresql.util.PSQLException; *************** *** 321,326 **** --- 323,391 ---- } } + /** + * Send a streamable bytea parameter encoded as a seven-bit text representation. + */ + private void streamByteaSevenBit(StreamableBytea param, boolean escapeEverything) throws IOException { + pgStream.SendChar('\''); + + InputStream stream = param.getStream(); + byte[] buffer = new byte[] { (byte)'\\', (byte)'\\', 0, 0, 0 }; + for (int remaining = param.getLength(); remaining > 0; --remaining) { + int nextByte = stream.read(); + + if (escapeEverything || + nextByte < 32 || nextByte > 127 || + nextByte == '\\' || nextByte == '\'') { + // Needs escaping. + buffer[2] = (byte)( (int)'0' + ((nextByte >> 6) & 3)); + buffer[3] = (byte)( (int)'0' + ((nextByte >> 3) & 7)); + buffer[4] = (byte)( (int)'0' + (nextByte & 7)); + pgStream.Send(buffer, 5); + } else { + pgStream.SendChar(nextByte); + } + } + + pgStream.SendChar('\''); + param.close(); + } + + /** + * Send a streamable bytea encoded as a text representation with an arbitary encoding. 
+ */ + private void streamBytea(StreamableBytea param, Encoding encoding) throws IOException, SQLException { + if (encoding.isSevenBitSafe()) { + streamByteaSevenBit(param, false); + return; + } + + // NB: we escape everything in this path as I don't like assuming + // that byte values 32..127 will make it through the encoding + // unscathed.. + + Writer writer = encoding.getEncodingWriter(pgStream.pg_output); + + InputStream stream = param.getStream(); + char[] buffer = new char[] { '\\', '\\', 0, 0, 0 }; + + writer.write('\''); + for (int remaining = param.getLength(); remaining > 0; --remaining) { + int nextByte = stream.read(); + + buffer[2] = (char)( '0' + ((nextByte >> 6) & 3)); + buffer[3] = (char)( '0' + ((nextByte >> 3) & 7)); + buffer[4] = (char)( '0' + (nextByte & 7)); + + writer.write(buffer, 0, 5); + } + + writer.write('\''); + writer.flush(); // But do not close()! + + param.close(); + } + /* * Send a query to the backend. */ *************** *** 337,357 **** int j = 0; int l_msgSize = 4; Encoding l_encoding = connection.getEncoding(); pgStream.SendChar('Q'); for (int i = 0 ; i < m_binds.length ; ++i) { l_parts[j] = l_encoding.encode(m_sqlFrags[i]); l_msgSize += l_parts[j].length; j++; ! l_parts[j] = l_encoding.encode(m_binds[i].toString()); ! l_msgSize += l_parts[j].length; j++; } l_parts[j] = l_encoding.encode(m_sqlFrags[m_binds.length]); l_msgSize += l_parts[j].length; pgStream.SendInteger(l_msgSize+1,4); for (int k = 0; k < l_parts.length; k++) { ! pgStream.Send(l_parts[k]); } pgStream.SendChar(0); pgStream.flush(); --- 402,440 ---- int j = 0; int l_msgSize = 4; Encoding l_encoding = connection.getEncoding(); + pgStream.SendChar('Q'); + + // Compute and send message size. for (int i = 0 ; i < m_binds.length ; ++i) { l_parts[j] = l_encoding.encode(m_sqlFrags[i]); l_msgSize += l_parts[j].length; j++; ! if (m_binds[i] instanceof StreamableBytea) { ! l_parts[j] = null; ! // Magic constants: ! // 2 bytes for the leading and trailing single quotes ! 
// every input byte becomes '\\nnn' with one output byte per character => 5 bytes. ! l_msgSize += 2 + ((StreamableBytea)m_binds[i]).getLength() * 5; ! } else { ! l_parts[j] = l_encoding.encode(m_binds[i].toString()); ! l_msgSize += l_parts[j].length; ! } j++; } l_parts[j] = l_encoding.encode(m_sqlFrags[m_binds.length]); l_msgSize += l_parts[j].length; pgStream.SendInteger(l_msgSize+1,4); + + // Now the message itself. for (int k = 0; k < l_parts.length; k++) { ! if (l_parts[k] == null) { ! StreamableBytea param = (StreamableBytea)m_binds[k/2]; ! streamByteaSevenBit(param, true); // Escape everything to get a predictable output length. ! } else { ! pgStream.Send(l_parts[k]); ! } } pgStream.SendChar(0); pgStream.flush(); *************** *** 378,384 **** for (int i = 0 ; i < m_binds.length ; ++i) { pgStream.Send(connection.getEncoding().encode(m_sqlFrags[i])); ! pgStream.Send(connection.getEncoding().encode(m_binds[i].toString())); } pgStream.Send(connection.getEncoding().encode(m_sqlFrags[m_binds.length])); --- 461,470 ---- for (int i = 0 ; i < m_binds.length ; ++i) { pgStream.Send(connection.getEncoding().encode(m_sqlFrags[i])); ! if (m_binds[i] instanceof StreamableBytea) ! streamBytea((StreamableBytea)m_binds[i], connection.getEncoding()); ! else ! 
pgStream.Send(connection.getEncoding().encode(m_binds[i].toString())); } pgStream.Send(connection.getEncoding().encode(m_sqlFrags[m_binds.length])); Index: org/postgresql/jdbc1/AbstractJdbc1Statement.java =================================================================== RCS file: /usr/local/cvsroot/pgjdbc/pgjdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java,v retrieving revision 1.52 diff -u -c -r1.52 AbstractJdbc1Statement.java *** org/postgresql/jdbc1/AbstractJdbc1Statement.java 29 Mar 2004 19:17:11 -0000 1.52 --- org/postgresql/jdbc1/AbstractJdbc1Statement.java 21 Apr 2004 03:25:56 -0000 *************** *** 5,10 **** --- 5,11 ---- import org.postgresql.core.BaseStatement; import org.postgresql.core.Field; import org.postgresql.core.QueryExecutor; + import org.postgresql.core.StreamableBytea; import org.postgresql.largeobject.LargeObject; import org.postgresql.largeobject.LargeObjectManager; import org.postgresql.util.PGbytea; *************** *** 12,17 **** --- 13,19 ---- import org.postgresql.util.PSQLException; import org.postgresql.util.PSQLState; import java.io.IOException; + import java.io.ByteArrayInputStream; import java.io.InputStream; import java.io.InputStreamReader; import java.io.OutputStream; *************** *** 1225,1231 **** } else { ! setString(parameterIndex, PGbytea.toPGString(x), PG_BYTEA); } } else --- 1227,1241 ---- } else { ! final byte[] clonedParameter = (byte[])x.clone(); ! final ByteArrayInputStream bais = new ByteArrayInputStream(clonedParameter); ! StreamableBytea param = new StreamableBytea() { ! public int getLength() { return clonedParameter.length; } ! public InputStream getStream() { return bais; } ! public void close() { try { bais.close(); } catch (IOException ioe) {} } ! }; ! ! bind(parameterIndex, param, PG_BYTEA); } } else *************** *** 1514,1520 **** * @param x the parameter value * @exception SQLException if a database access error occurs */ ! 
public void setBinaryStream(int parameterIndex, InputStream x, int length) throws SQLException { checkClosed(); if (connection.haveMinimumCompatibleVersion("7.2")) --- 1524,1530 ---- * @param x the parameter value * @exception SQLException if a database access error occurs */ ! public void setBinaryStream(int parameterIndex, final InputStream x, final int length) throws SQLException { checkClosed(); if (connection.haveMinimumCompatibleVersion("7.2")) *************** *** 1529,1568 **** //As the spec/javadoc for this method indicate this is to be used for //large binary values (i.e. LONGVARBINARY) PG doesn't have a separate //long binary datatype, but with toast the bytea datatype is capable of ! //handling very large values. Thus the implementation ends up calling ! //setBytes() since there is no current way to stream the value to the server ! byte[] l_bytes = new byte[length]; ! int l_bytesRead = 0; ! try ! { ! while (true) ! { ! int n = x.read(l_bytes, l_bytesRead, length - l_bytesRead); ! if (n == -1) ! break; ! ! l_bytesRead += n; ! ! if (l_bytesRead == length) ! break; ! ! } ! } ! catch (IOException l_ioe) ! { ! throw new PSQLException("postgresql.unusual", PSQLState.UNEXPECTED_ERROR, l_ioe); ! } ! if (l_bytesRead == length) ! { ! setBytes(parameterIndex, l_bytes); ! } ! else ! { ! //the stream contained less data than they said ! byte[] l_bytes2 = new byte[l_bytesRead]; ! System.arraycopy(l_bytes, 0, l_bytes2, 0, l_bytesRead); ! setBytes(parameterIndex, l_bytes2); ! } } else { --- 1539,1553 ---- //As the spec/javadoc for this method indicate this is to be used for //large binary values (i.e. LONGVARBINARY) PG doesn't have a separate //long binary datatype, but with toast the bytea datatype is capable of ! //handling very large values. ! ! StreamableBytea param = new StreamableBytea() { ! public int getLength() { return length; } ! public InputStream getStream() { return x; } ! public void close() { /* no-op; is this correct? */ } ! }; ! ! 
bind(parameterIndex, param, PG_BYTEA); } else { *** /dev/null Mon Feb 2 17:15:26 2004 --- org/postgresql/core/StreamableBytea.java Wed Apr 21 14:37:37 2004 *************** *** 0 **** --- 1,25 ---- + package org.postgresql.core; + + import java.io.InputStream; + + /** + * Wrapper for bytea representations that can stream themselves to a + * binary form on demand. + */ + public interface StreamableBytea { + /** + * @return the total length of the stream returned by {@link #getStream()}. + */ + int getLength(); + + /** + * @return a stream that the raw bytea data can be read from. + */ + InputStream getStream(); + + /** + * Free any underlying resources associated with this parameter. + * getStream() / getLength() should not be called after close() is called. + */ + void close(); + }
Oliver Jowett wrote: > This patch allows bytea parameters set via setBytes() / > setBinaryStream() to be streamed to the backend. With the patch applied, > the additional memory overhead for bytea parameters is small and > independent of how large the parameter data is. > > Note that it doesn't touch the ResultSet path, so you'll still need a > pile of extra memory if you execute queries that return large bytea values. > > It passes the driver's testcases against a 7.4 server, and doesn't show > any problems after brief testing with our bytea-using application. I've > had a report of strange things happening on the server->client path, but > don't have any details at the moment. > > This needs testing under a real workload, and testing against a 7.3 > server (the V2 protocol path is currently untested). Any volunteers? I just tried this against a 7.3.6 server and saw no problems (passes the driver testsuite, boots our application). -O
On Wed, 21 Apr 2004, Oliver Jowett wrote: > This patch allows bytea parameters set via setBytes() / > setBinaryStream() to be streamed to the backend. With the patch applied, > the additional memory overhead for bytea parameters is small and > independent of how large the parameter data is. Have you given any consideration to extending this to handle streaming large text fields via setAsciiStream(), setUnicodeStream(), or setObject(int, Object, Types.LONGVARCHAR) ? Also updateable ResultSets have similar updateXXX to PreparedStatement's setXXX methods that could benefit from a streaming interface. There is also the implied assumption that V3 protocol implies a unicode client encoding which implies a seven bit safe streaming option which would be better spelled out in a comment or called in a way that used the encoding variable to derive this fact. Kris Jurka
Kris Jurka wrote: > > On Wed, 21 Apr 2004, Oliver Jowett wrote: > > >>This patch allows bytea parameters set via setBytes() / >>setBinaryStream() to be streamed to the backend. With the patch applied, >>the additional memory overhead for bytea parameters is small and >>independent of how large the parameter data is. > > > Have you given any consideration to extending this to handle streaming > large text fields via setAsciiStream(), setUnicodeStream(), or > setObject(int, Object, Types.LONGVARCHAR) ? Also updateable ResultSets > have similar updateXXX to PreparedStatement's setXXX methods that could > benefit from a streaming interface. I think setObject() should use the changes when given a byte[], as it appears to delegate to setBytes() in that case.. although I find the setObject logic hard to follow at times (you are in a maze of twisty little case statements, all alike) updateBytes() delegates to setObject(). updateBinaryStream() won't stream, at a glance, but should be easy to change so it delegates to setBinaryStream(). From a quick look at the other streaming parameter types, it's not clear that we can stream them. setUnicodeStream and setCharacterStream can't be streamed under v3 as there's no simple relationship between the length parameter provided and the number of UTF8-encoded bytes we'd send to the backend. setAsciiStream could be streamed if we assume 7-bit ASCII (1:1 mapping to UTF8) but the JDBC javadoc doesn't say exactly what it means by "ASCII".. > There is also the implied assumption that V3 protocol implies a unicode > client encoding which implies a seven bit safe streaming option which > would be better spelled out in a comment or called in a way that used the > encoding variable to derive this fact. Fair comment. In my local tree, this patch has been superseded by changes to use the v3 extended query protocol and stream the data directly as a binary parameter. I thought it might make a good interim solution for the official driver though. -O
On Wed, 21 Apr 2004, Oliver Jowett wrote: > This patch allows bytea parameters set via setBytes() / > setBinaryStream() to be streamed to the backend. With the patch applied, > the additional memory overhead for bytea parameters is small and > independent of how large the parameter data is. Taking another look, this patch makes the assumption that the length parameter to setBinaryStream is correct. It does not check that the calls to the input stream's read method do not return -1 indicating the exhaustion of the stream, so doing setBinaryStream("col", new ByteArrayInputStream(new byte[0]), 1000); will actually put 1000 values of -1 into col. Fixing this is difficult because you cannot simply throw an Exception if you hit the end of stream before the specified length because you can't just bail out when streaming data as that will leave the PGStream in a confused state. You've got to get it back to a known state and do so without the query succeeding. Another problem along this line is that the InputStream given could throw an IOException when being read and the same unknown state problem would result. Kris Jurka
On Sat, 22 May 2004, Oliver Jowett wrote: > > Fixing this is difficult > > because you cannot simply throw an Exception if you hit the end of stream > > before the specified length because you can't just bail out when streaming > > data as that will leave the PGStream in a confused state. You've got to > > get it back to a known state and do so without the query succeeding. > > > > Another problem along this line is that the InputStream given could throw > > an IOException when being read and the same unknown state problem would > > result. > > My V3 patches handle these cases by closing the connection entirely & > throwing an exception (i.e. providing a short stream or throwing > IOException is a fatal error). This seemed better than silently padding > with bogus data. > > I don't see that either case is a "normal" case we should be supporting > -- we just need to be sure we clean up properly. > I don't consider abandoning the connection "cleaning up properly." With the V3 protocol we have a reasonable recovery method because we will detect this error at Bind time and simply not issue the Execute. In the current code's case it seems the only way to do it is to deliberately introduce a syntax error to make the query fail. Another issue that just came to me is what is the desired behavior for ByteArrayInputStream bais = new ByteArrayInputStream(new byte[1000])); PreparedStatement ps = conn.prepareStatement("INSERT INTO t VALUES (?)"); ps.setBinaryStream(1, bais, 100); ps.execute(); ps.execute(); ps.execute(); Should each inserted row have the same values? Should the second row get the second 100 bytes and the third row get the third 100? Kris Jurka
Kris Jurka wrote: > > On Sat, 22 May 2004, Oliver Jowett wrote: >>My V3 patches handle these cases by closing the connection entirely & >>throwing an exception (i.e. providing a short stream or throwing >>IOException is a fatal error). This seemed better than silently padding >>with bogus data. >> >>I don't see that either case is a "normal" case we should be supporting >>-- we just need to be sure we clean up properly. >> > > > I don't consider abandoning the connection "cleaning up properly." I don't see what isn't cleaned up by closing the connection. It seems the simplest and safest way of handling this sort of error. > With > the V3 protocol we have a reasonable recovery method because we will > detect this error at Bind time and simply not issue the Execute. In the > current code's case it seems the only way to do it is to deliberately > introduce a syntax error to make the query fail. We'd have to insert an artificial syntax error (another Parse after the Bind that failed) anyway to force transaction rollback. I'm reluctant to allow the txn to continue as a) it's inconsistent with how other errors are handled and b) the distinction between Bind and Execute is a bit fuzzy. There is some portal setup work run on Bind, and I'm not sure if we can assume that a Bind immediately followed by a ClosePortal does not modify the DB in any way. Really, this is an *application error*, so I'm happy with anything that preserves data integrity in the DB, even if it toasts the application's current connection. If the application doesn't want this behaviour, it shouldn't give the driver bad streams! > Another issue that just came to me is what is the desired behavior for > > ByteArrayInputStream bais = new ByteArrayInputStream(new byte[1000])); > PreparedStatement ps = conn.prepareStatement("INSERT INTO t VALUES (?)"); > ps.setBinaryStream(1, bais, 100); > ps.execute(); > ps.execute(); > ps.execute(); > > Should each inserted row have the same values? 
Should the second row get > the second 100 bytes and the third row get the third 100? I'm sure that there is something somewhere in the spec that says that the client has to keep the stream in an appropriate state until execution (but now I can't find it anywhere!). So I'd expect in this case to get bytes 0-99, 100-199, 200-299 inserted into the three rows respectively. More fun is what happens if you do the above with addBatch() / executeBatch(); should that be equivalent to treating addBatch() as a "query execution"? What if we want to delay reading the data until the real batch execution? -O
On Sun, 23 May 2004, Oliver Jowett wrote: > Kris Jurka wrote: > > > > I don't consider abandoning the connection "cleaning up properly." > > I don't see what isn't cleaned up by closing the connection. It seems > the simplest and safest way of handling this sort of error. Yes it is the simplest and safest way of doing this, but it is far from the user friendliest way of doing things. They want to just Connection.rollback() and continue. In the past we have rejected patches adding copy support because it had the potential of completely hosing a connection. > > > With > > the V3 protocol we have a reasonable recovery method because we will > > detect this error at Bind time and simply not issue the Execute. In the > > current code's case it seems the only way to do it is to deliberately > > introduce a syntax error to make the query fail. > > We'd have to insert an artificial syntax error (another Parse after the > Bind that failed) anyway to force transaction rollback. > > I'm reluctant to allow the txn to continue as a) it's inconsistent with > how other errors are handled and b) the distinction between Bind and > Execute is a bit fuzzy. There is some portal setup work run on Bind, and > I'm not sure if we can assume that a Bind immediately followed by a > ClosePortal does not modify the DB in any way. If you don't issue an Execute and the query proceeds I'd consider that a bug in the server. This is more of a general problem with the JDBC driver in that if it throws an Exception itself it does not require a rollback while server generated errors do. Perhaps this should be added to the drivers query issuing/commit logic to refuse to proceed if the driver has thrown an error. 
> > Another issue that just came to me is what is the desired behavior for > > > > ByteArrayInputStream bais = new ByteArrayInputStream(new byte[1000])); > > PreparedStatement ps = conn.prepareStatement("INSERT INTO t VALUES (?)"); > > ps.setBinaryStream(1, bais, 100); > > ps.execute(); > > ps.execute(); > > ps.execute(); > > > > Should each inserted row have the same values? Should the second row get > > the second 100 bytes and the third row get the third 100? > > I'm sure that there is something somewhere in the spec that says that > the client has to keep the stream in an appropriate state until > execution (but now I can't find it anywhere!). So I'd expect in this > case to get bytes 0-99, 100-199, 200-299 inserted into the three rows > respectively. I would agree with this statement, but this would invalidate the wrapping of a byte[] in a ByteArrayInputStream because that seems to require a different behavior. > > More fun is what happens if you do the above with addBatch() / > executeBatch(); should that be equivalent to treating addBatch() as a > "query execution"? What if we want to delay reading the data until the > real batch execution? > Yes, the javadocs pretty much suck in determining the corner cases so I don't know what to tell you. Should we try to set up environments for other databases and see what they do? Without some investigation our current plan of providing the safest behavior is something that can kill any potential performance gains. Kris Jurka
Kris Jurka wrote: > On Sun, 23 May 2004, Oliver Jowett wrote: >>Kris Jurka wrote: >> >>>I don't consider abandoning the connection "cleaning up properly." >> >>I don't see what isn't cleaned up by closing the connection. It seems >>the simplest and safest way of handling this sort of error. > > Yes it is the simplest and safest way of doing this, but it is far from > the user friendliest way of doing things. They want to just > Connection.rollback() and continue. In the past we have rejected patches > adding copy support because it had the potential of completely hosing a > connection. I thought the issue with V2 copy support was more that the copy stream was mostly freeform, and as such you were at the mercy of the application to correctly terminate it. If the protocol stream state got hosed, you couldn't easily detect it. This is a bit different in that we can always detect it. You never end up with a hosed connection that the driver will try to continue using. Being user-friendly when dealing with application errors is nice behaviour to have, but it *is* a separate issue.. > This is more of a general problem with the JDBC driver > in that if it throws an Exception itself it does not require a rollback > while server generated errors do. Perhaps this should be added to the > drivers query issuing/commit logic to refuse to proceed if the driver has > thrown an error. If you're talking about all the exceptions that can ever be thrown by the driver, it sounds like unnecessary complexity to me. It makes sense for exceptions thrown during query execution though. Are there currently any cases where the driver can throw an exception from query execution without invalidating the current transaction? Also: multiple-statement queries and batch updates. Currently we always stop execution on the first error. I'd like to keep that property. The batch update javadoc requires that the continue-after-error behaviour be consistent for a particular DBMS+driver. 
> I would agree with this statement, but this would invalidate the wrapping > of a byte[] in a ByteArrayInputStream because that seems to require a > different behavior. Oops, good point. My v3 changes will need a bugfix. Anyway, I'm unlikely to have time to revisit this particular patch. I'll take another look at the error path in my v3 work. -O
On Sun, 23 May 2004, Oliver Jowett wrote: > Kris Jurka wrote: > > On Sun, 23 May 2004, Oliver Jowett wrote: > >> > >>I don't see what isn't cleaned up by closing the connection. It seems > >>the simplest and safest way of handling this sort of error. > > > This is a bit different in that we can always detect it. You never end > up with a hosed connection that the driver will try to continue using. > > Being user-friendly when dealing with application errors is nice > behaviour to have, but it *is* a separate issue.. Separate from what, correctness? We both agree that a short stream or IOException is an error, but I'm not advocating the JDBC equivalent of System.exit() as a viable solution. What makes it OK for this particular error to destroy a Connection while no other application level error can? > > > This is more of a general problem with the JDBC driver > > in that if it throws an Exception itself it does not require a rollback > > while server generated errors do. Perhaps this should be added to the > > drivers query issuing/commit logic to refuse to proceed if the driver has > > thrown an error. > > If you're talking about all the exceptions that can ever be thrown by > the driver, it sounds like unnecessary complexity to me, > > It makes sense for exceptions thrown during query execution though, Are > there currently any cases where the driver can throw an exception from > query execution without invalidating the current transaction? > I am unsure what you mean by "query execution." From a user's perspective the moment they say PreparedStatement.execute() query execution has begun and in that case the Exception that comes to mind is the check to make sure all parameters of a PreparedStatement have been set. See QueryExecute.sendQueryV3. 
From a driver perspective query execution begins when it issues pgStream.SendChar('Q') or similar and I believe the only Exception that should be thrown there in the current code is an IOException on the stream which is fatal to the connection in any case. Kris Jurka
Kris Jurka wrote: > > On Sun, 23 May 2004, Oliver Jowett wrote: >>It makes sense for exceptions thrown during query execution though, Are >>there currently any cases where the driver can throw an exception from >>query execution without invalidating the current transaction? >> > > I am unsure what you mean by "query execution." From a user's perspective > the moment they say PreparedStatement.execute() query execution has begun This is what I meant by query execution -- the user calling one of the execute...() methods. > and in that case the Exception that comes to mind is the check to make > sure all parameters of a PreparedStatement have been set. Ok. If that's the only driver-generated case, it's quite consistent already (not entirely, but close!). -O
Kris Jurka wrote: >>Being user-friendly when dealing with application errors is nice >>behaviour to have, but it *is* a separate issue.. > > > Separate from what, correctness? Yes. > We both agree that a short stream or > IOException is an error, but I'm not advocating the JDBC equivalent of > System.exit() as a viable solution. What makes it OK for this > particular error to destroy a Connection while no other application level > error can? The implementation of SQLState (or connectionErrorOccurred) across different drivers is quite patchy. When writing portable applications (or even driver-specific applications, to a lesser extent) you have to assume that any SQLException you don't specifically recognize means your connection is toast. I don't see that tearing down the connection for a certain class of hard-to-handle errors makes this situation any worse. -O
On Mon, 24 May 2004, Oliver Jowett wrote: > The implementation of SQLState (or connectionErrorOccurred) across > different drivers is quite patchy. When writing portable applications > (or even driver-specific applications, to a lesser extent) you have to > assume that any SQLException you don't specifically recognize means your > connection is toast. > I think a reasonable coding method would be to catch an SQLException try to rollback and if that throws an Exception then you give up the Connection. Kris Jurka
We previously had a discussion about how to deal with this code, I think before this is committed I'd like to tag the code. Just so we can have a cvs version that folks can checkout while we sort out the issues with this. Dave On Mon, 2004-05-24 at 02:47, Kris Jurka wrote: > On Mon, 24 May 2004, Oliver Jowett wrote: > > > The implementation of SQLState (or connectionErrorOccurred) across > > different drivers is quite patchy. When writing portable applications > > (or even driver-specific applications, to a lesser extent) you have to > > assume that any SQLException you don't specifically recognize means your > > connection is toast. > > > > I think a reasonable coding method would be to catch an SQLException try > to rollback and if that throws an Exception then you give up the > Connection. > > Kris Jurka > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > !DSPAM:40b19bce197952303717472! > > -- Dave Cramer 519 939 0336 ICQ # 14675561
On Tue, 25 May 2004, Dave Cramer wrote: > We previously had a discussion about how to deal with this code, I think > before this is committed I'd like to tag the code. Just so we can have a > cvs version that folks can checkout while we sort out the issues with > this. > Well, I wasn't planning on applying this patch until we sort out the issues. A special tag for people to checkout isn't very likely to help anyone because people can barely find the cvs repository now, much less a specific tag within it. Kris Jurka