Re: ResultSet.getBytes() of bytea speedup batch - Mailing list pgsql-jdbc

From Heikki Linnakangas
Subject Re: ResultSet.getBytes() of bytea speedup batch
Date
Msg-id 4560BF23.9070003@enterprisedb.com
Whole thread Raw
In response to ResultSet.getBytes() of bytea speedup batch  (Mikko Tiihonen <mikko.tiihonen@iki.fi>)
List pgsql-jdbc
Mikko Tiihonen wrote:
> The attached patch gives 20-400% speedup depending on the bytea column
> size. The speedup comes from switching to the binary transfer mode
> instead of the text transfer mode for bytea columns.

Impressive results!

> The patch also contains the test code which I used to test the
> performance (ByteaBenchTest.java). The benchmark results are:
>
> cvs:
> column size:    transfer speed:
>     512b speed:   3,8MB/s
>    2048b speed:  10,0MB/s
>    4096b speed:  13,1MB/s
>   16384b speed:  22,0MB/s
>   65536b speed:  22,4MB/s
> 1048576b speed:  21,2MB/s
>
> patch:
> column size:    transfer speed:
>     512b speed:   4,6MB/s -  21% speedup
>    2048b speed:  16,8MB/s -  68% speedup
>    4096b speed:  27,9MB/s - 112% speedup
>   16384b speed:  79,2MB/s - 260% speedup
>   65536b speed: 111,1MB/s - 395% speedup (optimal for my L2 cache?)
> 1048576b speed:  74,8MB/s - 252% speedup

I'm getting similar results, though the drop after 65536b is even steeper:

cvs:
     [junit]     512b speed:   1.8MB/s memory:   2.5MB
     [junit]    2048b speed:   5.0MB/s memory:   2.0MB
     [junit]    4096b speed:   7.5MB/s memory:   2.0MB
     [junit]   16384b speed:  11.8MB/s memory:   2.4MB
     [junit]   65536b speed:  12.6MB/s memory:   2.4MB
     [junit] 1048576b speed:   5.3MB/s memory:   5.8MB

patch:
     [junit]     512b speed:   2.1MB/s memory:   2.0MB
     [junit]    2048b speed:   7.4MB/s memory:   2.2MB
     [junit]    4096b speed:  13.0MB/s memory:   2.1MB
     [junit]   16384b speed:  35.9MB/s memory:   2.3MB
     [junit]   65536b speed:  56.7MB/s memory:   2.0MB
     [junit] 1048576b speed:  17.2MB/s memory:   2.9MB
     [junit] ------------- ---------------- ---------------

Ran on Sun JRE 1.5.0_06, on Intel Core Duo. I sure would like to find an
explanation for the drop.

> Still open questions:
> - is there any better way to obtain the result set field information in
> QueryExecutor?
>   * could the result set type information be obtained during the
> prepare?

It could. If the driver issued a Sync message after Describe, and waited
for the response before sending the Bind, the driver could use the
information in the RowDesription response. But that would require an
extra round-trip to the server, so it's probably not worth it in most
queries.

I wish we could set a default format for each data type in a separate
message. Like:

SetFormat(oid = 17, format="b")

The server would then send all bytea fields as binary unless otherwise
stated in the Bind message. It would require a backend-change, of course.

> - in which conditions should a prepared statement drop the cached fields
>   * do the result set fields ever change?
>   * any other conditions?
> - if caching of Field information is allowed (at least in some
> circumstances), could the driver stop requesting a Describe from the
> backend?

I can't imagine how the fields could change. Seems safe to me.

> @@ -126,6 +134,8 @@
>      private byte[] encodedStatementName;
>      private PhantomReference cleanupRef;
>      private int[] preparedTypes;
> +    private static final byte[] EMPTY_STATEMENTNAME = new byte[] { 0 };
> +    private Field[] fields;
>
>      final static SimpleParameterList NO_PARAMETERS = new SimpleParameterList(0);
>  }

This is just some leftover crud, right? Couldn't find a reference to
EMPTY_STATEMENTNAME anywhere.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


pgsql-jdbc by date:

Previous
From: Mikko Tiihonen
Date:
Subject: binary tuple receiving patch v2
Next
From: Roland Walter
Date:
Subject: Re: Use Driver to Create database?