Re: BLOB performance test FYI - Mailing list pgsql-jdbc

From: Barry Lind
Subject: Re: BLOB performance test FYI
Date:
Msg-id: 3CBE24D4.40704@xythos.com
In response to: Re: BLOB performance test FYI ("Joe Shevland" <jshevland@j-elite.com>)
List: pgsql-jdbc

Anders Bengtsson wrote:
>
> Some of these problems aren't really about the datatypes themselves, but
> how the driver handles them. Instead of streaming large values into the
> backend, the client reads them entirely into memory before sending them.
> The LO part of the driver seems to take a different approach, streaming
> the data directly.
> It would take some major reworking of the JDBC driver to fix this. I've
> looked at it, and concluded that it is possible, but that it was too big
> for me to fix.
>
> /Anders

I would classify the problems more as limitations of the protocol
between the server and all clients (not just JDBC, but ODBC and the
rest as well).  One of the things I have been advocating for a while is
support for bind variables, so that you can send the SQL statement
separately from the data values.  That would allow passing typed values
that don't all need to be converted to strings, as happens today.  It
requires either changing the FE/BE protocol or using the FastPath
functionality to implement this on top of the existing FE/BE protocol.
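
To make the idea concrete, here is roughly what that separation looks
like at the JDBC API level (a sketch only; the table and file names are
hypothetical).  The PreparedStatement interface already expresses the
SQL and the values separately; the limitation is that today the driver
still reads the stream fully into memory and escapes it into the
statement string, rather than sending it as typed data:

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BindVariableSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost/test", "user", "password");

        // The SQL text and the data value are handed to the driver
        // separately here.  With the current driver the stream is still
        // read into memory and spliced into the statement as an escaped
        // string; a bind-variable protocol could send it as typed data.
        File file = new File("bigfile.bin");
        InputStream in = new FileInputStream(file);
        PreparedStatement ps = conn.prepareStatement(
            "insert into foo (foo_id, foo_blob) values (?, ?)");
        ps.setInt(1, 1);
        ps.setBinaryStream(2, in, (int) file.length());
        ps.executeUpdate();

        ps.close();
        in.close();
        conn.close();
    }
}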

thanks,
--Barry


PS.  Below is a mail note I sent last year to one of the kernel hackers
who was working on improving TOAST (and thus bytea) functionality in an
effort to get a datatype better suited for JDBC blobs.




I don't know how much work you are planning to put into this, but I
think what you are doing is great and I hope you continue.  Is your goal
here to provide better BLOB/CLOB functionality than currently exists in
either TOAST or LargeObjects?

If so, I would like to share with you my thoughts on what I think is
needed to provide good BLOB/CLOB support in postgres.  I am approaching
this from the perspective of what I need in the server to provide
decent BLOB support through the JDBC API.

The JDBC API basically assumes that when you select a blob you fetch
only a pointer/object reference, not any actual data.  Later you use
that pointer in function calls to fetch part or all of the data (or to
append to or update it, in the write case).
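
In code, that pattern looks like the following with the standard
java.sql.Blob interface (a sketch; the table and data are made up):

import java.sql.Blob;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BlobLocatorSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost/test", "user", "password");
        Statement st = conn.createStatement();

        // Selecting the column is only supposed to fetch a locator ...
        ResultSet rs = st.executeQuery(
            "select foo_blob from foo where foo_id = 1");
        while (rs.next()) {
            Blob blob = rs.getBlob(1);
            // ... the data itself is pulled over on demand, in pieces.
            // JDBC blob positions are 1-based.
            byte[] chunk = blob.getBytes(1, 1024);
            System.out.println(chunk.length + " bytes fetched");
        }

        rs.close();
        st.close();
        conn.close();
    }
}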

This is basically how the old LargeObjects work, but they have many
problems in their implementation that make them unsuitable for use in
the JDBC driver.

Thus what I would eventually like to see in the backend is the following:

Two new datatypes (blob and clob).  These would basically be 'wrappers'
around the bytea and text datatypes.  They would differ in that the
output function on each would return an object reference instead of the
value of the column.

New functions could then be created that take these object references
and allow you to read, update, or append data in the blob.

So you could do something like:

create table foo (foo_id int, foo_blob blob);

insert into foo values (1, '');

select foo_blob from foo where foo_id = 1;

(would return something like:   foo:foo_blob:rowoid  -- something that
can be used to identify the toasted value)

select blob_append('foo:foo_blob:rowoid', 'the stuff to append');
-- there should also be a blob_update() that allows you to update
-- some range of bytes/characters

select blob_read('foo:foo_blob:rowoid', start_position, length);

(would return the substring of the blob requested)
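
If functions like these existed, the driver's java.sql.Blob
implementation could be a thin wrapper that holds only the locator
string and issues blob_read()/blob_append() calls on demand.  A rough
sketch (blob_read(), blob_append(), and the locator format are the
hypothetical pieces proposed above):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical driver-side wrapper: it holds only the locator string
// and fetches or appends data through the proposed server functions.
public class PgBlobSketch {
    private final Connection conn;
    private final String locator;  // e.g. "foo:foo_blob:rowoid"

    public PgBlobSketch(Connection conn, String locator) {
        this.conn = conn;
        this.locator = locator;
    }

    public byte[] getBytes(long pos, int length) throws SQLException {
        PreparedStatement ps =
            conn.prepareStatement("select blob_read(?, ?, ?)");
        ps.setString(1, locator);
        ps.setLong(2, pos);
        ps.setInt(3, length);
        ResultSet rs = ps.executeQuery();
        rs.next();
        byte[] data = rs.getBytes(1);
        rs.close();
        ps.close();
        return data;
    }

    public void append(byte[] data) throws SQLException {
        PreparedStatement ps =
            conn.prepareStatement("select blob_append(?, ?)");
        ps.setString(1, locator);
        ps.setBytes(2, data);
        ps.execute();
        ps.close();
    }
}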


I just wanted to share my ideas with you as I knew you were working in
this area of the code.

thanks,
--Barry


