Thread: Array passing

Array passing

From
"John Lister"
Date:
Hi, i've been playing with arrays using the 604 version JDBC 4 driver and i have a question. They seem to be converted to strings which are passed back and forth to the server.
 
Is it possible to configure the driver to pass the arrays in native form? The overhead of converting to strings is quite big in my case (both in the size of the string and the time taken) so would like to eliminate it if possible...
 
Which leads onto, does the postgresql protocol support native arrays? i'm trying to find docs about the datatypes but would hope it did...
 
Thanks
 
JOHN
--
 
Got needs? Get Goblin'! - http://www.pricegoblin.co.uk/

Re: Array passing

From
Dave Cramer
Date:


On Tue, Apr 7, 2009 at 10:24 AM, John Lister <john.lister-ps@kickstone.com> wrote:
Hi, i've been playing with arrays using the 604 version JDBC 4 driver and i have a question. They seem to be converted to strings which are passed back and forth to the server.
 
Is it possible to configure the driver to pass the arrays in native form? The overhead of converting to strings is quite big in my case (both in the size of the string and the time taken) so would like to eliminate it if possible...
 
Which leads onto, does the postgresql protocol support native arrays? i'm trying to find docs about the datatypes but would hope it did...
 
Yes, it does, but there are platform specific binary representations of native types. For instance you can have floating point or integer datetimes.

There was some work done on supporting binary protocol at one point, not sure where it stands now.

Dave

Re: Array passing

From
John Lister
Date:
Looking at the spec/docs although you can specify the return type in the
protocol there is no way to do it.. The only way i can see is to use a
stored procedure and specifying the return type there using
registerOutParameter(). However looking in the code
(SimpleParameterList) although the parameter is set to OUT, the return
type is ignored...

Is there any reason for this?

Would setting the return type like this cause any problems?

This would seem to solve my problem...

Thanks

Dave Cramer wrote:
>
>
> On Tue, Apr 7, 2009 at 10:24 AM, John Lister
> <john.lister-ps@kickstone.com <mailto:john.lister-ps@kickstone.com>>
> wrote:
>
>     Hi, i've been playing with arrays using the 604 version JDBC 4
>     driver and i have a question. They seem to be converted to strings
>     which are passed back and forth to the server.
>
>     Is it possible to configure the driver to pass the arrays in
>     native form? The overhead of converting to strings is quite big in
>     my case (both in the size of the string and the time taken) so
>     would like to eliminate it if possible...
>
>     Which leads onto, does the postgresql protocol support native
>     arrays? i'm trying to find docs about the datatypes but would hope
>     it did...
>
>
> Yes, it does, but there are platform specific binary representations
> of native types. For instance you can have floating point or integer
> datetimes.
>
> There was some work done on supporting binary protocol at one point,
> not sure where it stands now.
>
> Dave

Re: Array passing

From
Kris Jurka
Date:

On Tue, 7 Apr 2009, John Lister wrote:

> Looking at the spec/docs although you can specify the return type in the
> protocol there is no way to do it.. The only way i can see is to use a
> stored procedure and specifying the return type there using
> registerOutParameter(). However looking in the code
> (SimpleParameterList) although the parameter is set to OUT, the return
> type is ignored...
>

A parameter being in-vs-out is only relevent to function arguments being
in or out, they have nothing to do with the format (text/binary) of the
data, or even the direction data is traveling (from server/client).

> Would setting the return type like this cause any problems?
>

If you could rig this up to set a flag indicating it should be binary,
that wouldn't magically make everyone else respect it and do all the
necessary conversions.  You'd end up sending or receiving garbage data.

The place to start is really with the existing binary transfer patches:

http://archives.postgresql.org/pgsql-jdbc/2008-11/msg00000.php

I've only looked at the patches to receive data in binary form, but there
are also patches to send data in binary as well.  Perhaps you can just
apply them and all will be well, if not they're a good starting point and
should at least demonstrate the scope of the problem (it's not just
setting a flag).

Kris Jurka

Re: Array passing

From
John Lister
Date:
Cheers for the pointer to the patches, i'll have a look.... I may be
over simplifying things (and i haven't looked at probably nearly enough
of the jdbc code) but surely the driver knows from the server what types
it is returning and can thus convert to whatever the user wants in the
same way it converts from string.



Kris Jurka wrote:
>
>
> On Tue, 7 Apr 2009, John Lister wrote:
>
>> Looking at the spec/docs although you can specify the return type in
>> the protocol there is no way to do it.. The only way i can see is to
>> use a stored procedure and specifying the return type there using
>> registerOutParameter(). However looking in the code
>> (SimpleParameterList) although the parameter is set to OUT, the
>> return type is ignored...
>>
>
> A parameter being in-vs-out is only relevent to function arguments
> being in or out, they have nothing to do with the format (text/binary)
> of the data, or even the direction data is traveling (from
> server/client).
>
>> Would setting the return type like this cause any problems?
>>
>
> If you could rig this up to set a flag indicating it should be binary,
> that wouldn't magically make everyone else respect it and do all the
> necessary conversions.  You'd end up sending or receiving garbage data.
>
> The place to start is really with the existing binary transfer patches:
>
> http://archives.postgresql.org/pgsql-jdbc/2008-11/msg00000.php
>
> I've only looked at the patches to receive data in binary form, but
> there are also patches to send data in binary as well.  Perhaps you
> can just apply them and all will be well, if not they're a good
> starting point and should at least demonstrate the scope of the
> problem (it's not just setting a flag).
>
> Kris Jurka

Re: Array passing

From
Dave Cramer
Date:


On Wed, Apr 8, 2009 at 3:33 AM, John Lister <john.lister-ps@kickstone.com> wrote:
Cheers for the pointer to the patches, i'll have a look.... I may be over simplifying things (and i haven't looked at probably nearly enough of the jdbc code) but surely the driver knows from the server what types it is returning and can thus convert to whatever the user wants in the same way it converts from string.

 
Yes, it does know what types it is returning but it also has to do some more work to determine the native representation. As I mentioned there is at least one problem that I am aware of and that is different server types for datetime.

Dave

Re: Array passing

From
"John Lister"
Date:
I've applied the patch, but there seems to be a couple of issues.

To do binary transfer, it needs to do a describe before the bind so as to
get the returned field types. This seems to only be done when
ForceBinaryTransfer is set.

Secondly when it does the describe, sendParse trashes the fields details by
setting them to null before the main query is executed.

Did this patch use to work, i can't see how in its current form.

I'll continue investigating unless there are any hints...

Thanks


Re: Array passing

From
Maciek Sakrejda
Date:
Maybe this is obvious, but have you tried applying the patch to CVS as
of the date of that mailing list post? That should (hopefully) give you
a good idea whether (and to what extent) it did work at the time.

-Maciek


-----Original Message-----
From: John Lister <john.lister-ps@kickstone.com>
To: Kris Jurka <books@ejurka.com>
Cc: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] Array passing
Date: Wed, 8 Apr 2009 14:54:11 +0100

I've applied the patch, but there seems to be a couple of issues.

To do binary transfer, it needs to do a describe before the bind so as to
get the returned field types. This seems to only be done when
ForceBinaryTransfer is set.

Secondly when it does the describe, sendParse trashes the fields details by
setting them to null before the main query is executed.

Did this patch use to work, i can't see how in its current form.

I'll continue investigating unless there are any hints...

Thanks





Re: Array passing

From
"John Lister"
Date:
Sounds obvious, but not tried it.. I'll give that a go :)

> Maybe this is obvious, but have you tried applying the patch to CVS as
> of the date of that mailing list post? That should (hopefully) give you
> a good idea whether (and to what extent) it did work at the time.


Re: Array passing

From
Kris Jurka
Date:
John Lister wrote:
> I've applied the patch, but there seems to be a couple of issues.
>
> To do binary transfer, it needs to do a describe before the bind so as
> to get the returned field types. This seems to only be done when
> ForceBinaryTransfer is set.
>

My memory of what the patch does is a little fuzzy and I've only looked
at the binary receive patch, but here's how I understood it.

The current driver execution does not bother retrieving the describe
results before executing a query.  This means we don't know what types
will be returned beforehand, so we don't know if we can support
retrieving them in binary and must retrieve them as text.  This is done
to avoid an extra network roundtrip for query execution.  What the patch
does is to look at a prepared statement and see if we've already
prepared it.  If we have, then we have the previous describe results and
know what fields we can handle as binary.  So without
ForceBinaryTransfer, binary will only kick in on the second execution
(or perhaps prepareThreshold executions, I forget).  ForceBinaryTransfer
is really a hack to test this which prepares a query twice to make this
work, so don't be fooled by the first preparation and wait for the
second one.

> Secondly when it does the describe, sendParse trashes the fields details
> by setting them to null before the main query is executed.
>

I'm not sure what you're referring to specifically.  sendParse certainly
doesn't adjust any of the params that are passed in, and it doesn't
touch anything if the query is already prepared.

> Did this patch use to work, i can't see how in its current form.
>

It did work and I have no reason to believe that's changed.  Perhaps it
just isn't working as you're expecting.  Perhaps you're trying it with
arrays which it might not support?  I'd try getting it working with a
plain int or two first.

Kris Jurka


Re: Array passing

From
"John Lister"
Date:
----- Original Message -----
From: "Kris Jurka" <books@ejurka.com>
To: "John Lister" <john.lister-ps@kickstone.com>
Cc: <pgsql-jdbc@postgresql.org>
Sent: Wednesday, April 08, 2009 5:02 PM
Subject: Re: [JDBC] Array passing


> John Lister wrote:
>> I've applied the patch, but there seems to be a couple of issues.
>>
>> To do binary transfer, it needs to do a describe before the bind so as to
>> get the returned field types. This seems to only be done when
>> ForceBinaryTransfer is set.
>>
>
> My memory of what the patch does is a little fuzzy and I've only looked at
> the binary receive patch, but here's how I understood it.
>
> The current driver execution does not bother retrieving the describe
> results before executing a query.  This means we don't know what types
> will be returned beforehand, so we don't know if we can support retrieving
> them in binary and must retrieve them as text.  This is done to avoid an
> extra network roundtrip for query execution.  What the patch does is to
> look at a prepared statement and see if we've already prepared it.  If we
> have, then we have the previous describe results and know what fields we
> can handle as binary.  So without ForceBinaryTransfer, binary will only
> kick in on the second execution (or perhaps prepareThreshold executions, I
> forget).  ForceBinaryTransfer is really a hack to test this which prepares
> a query twice to make this work, so don't be fooled by the first
> preparation and wait for the second one.

Ah that makes more sense, i was expecting it to always do a describe before
the bind. Apologies...
I'll have a look what happens on multiple calls...

Personally I think the extra network traffic is worth it.

>> Secondly when it does the describe, sendParse trashes the fields details
>> by setting them to null before the main query is executed.
>>
>
> I'm not sure what you're referring to specifically.  sendParse certainly
> doesn't adjust any of the params that are passed in, and it doesn't touch
> anything if the query is already prepared.
>
>> Did this patch use to work, i can't see how in its current form.
>>
>
> It did work and I have no reason to believe that's changed.  Perhaps it
> just isn't working as you're expecting.  Perhaps you're trying it with
> arrays which it might not support?  I'd try getting it working with a
> plain int or two first.

I was misunderstanding it - see above..

Arrays aren't implemented in the patch, only simple types. I'm adding
support now..



Re: Array passing

From
Oliver Jowett
Date:
John Lister wrote:

> Personally I think the extra network traffic is worth it.

Maybe for your use case, but are you prepared to make that decision for
every user of the driver?

(also, it's the added latency that's really the problem, for simple
queries you could end up doubling the execution time)

-O

Re: Array passing

From
John Lister
Date:
Oliver Jowett wrote:
> John Lister wrote:
>
>> Personally I think the extra network traffic is worth it.
>
> Maybe for your use case, but are you prepared to make that decision
> for every user of the driver?
>
> (also, it's the added latency that's really the problem, for simple
> queries you could end up doubling the execution time)
>

I agree and wasn't suggesting it be applied to everyone/query. I was
planning to add an option (as is already the case in the original patch).
Additionally it probably only makes sense to do this for
PreparedStatements as normal Statements are likely to be simple one off
queries. Also given the jdbc docs states they are pre-compiled you'd
expect prepared statements to suffer some slight overhead during the
preparation in return for better execution time.

Thoughts