Thread: PostgreSQL gaps wrt to java, and jdbc

PostgreSQL gaps wrt to java, and jdbc

From
Dave Cramer
Date:
I have been actively maintaining the driver off and on since 1999 or so. Recently we have had a flurry of activity and one of the things I noticed was the surprise that PostgreSQL didn't support X or did Y in some unexpected way. Sometimes we are a bit too complacent, and accept things the way they are.

I am wondering what could the server do better that would help JDBC?

Obviously streaming a column is one. Possibly rowid's. Anything else ?


Dave Cramer

Re: PostgreSQL gaps wrt to java, and jdbc

From
"Markus KARG"
Date:

Binary represenation of XML and JSON instead of converting it to a String BEFORE transmission. Both JSON and XML essentially are graphs, and can be inflated with whitespace on the driver side AFTER transmission. Same in the other direction (deflate by transmitting binary object graph instead of whitespace-inflated String representation).

 

 

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Dave Cramer
Sent: Montag, 6. Juli 2015 14:15
To: List
Subject: [JDBC] PostgreSQL gaps wrt to java, and jdbc

 

I have been actively maintaining the driver off and on since 1999 or so. Recently we have had a flurry of activity and one of the things I noticed was the surprise that PostgreSQL didn't support X or did Y in some unexpected way. Sometimes we are a bit too complacent, and accept things the way they are.

 

I am wondering what could the server do better that would help JDBC?

 

Obviously streaming a column is one. Possibly rowid's. Anything else ?

 


Dave Cramer

Re: PostgreSQL gaps wrt to java, and jdbc

From
Dave Cramer
Date:
Looking at the backend code. One possibility is to just use compression to send it over?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 6 July 2015 at 12:35, Markus KARG <markus@headcrashing.eu> wrote:

Binary represenation of XML and JSON instead of converting it to a String BEFORE transmission. Both JSON and XML essentially are graphs, and can be inflated with whitespace on the driver side AFTER transmission. Same in the other direction (deflate by transmitting binary object graph instead of whitespace-inflated String representation).

 

 

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Dave Cramer
Sent: Montag, 6. Juli 2015 14:15
To: List
Subject: [JDBC] PostgreSQL gaps wrt to java, and jdbc

 

I have been actively maintaining the driver off and on since 1999 or so. Recently we have had a flurry of activity and one of the things I noticed was the surprise that PostgreSQL didn't support X or did Y in some unexpected way. Sometimes we are a bit too complacent, and accept things the way they are.

 

I am wondering what could the server do better that would help JDBC?

 

Obviously streaming a column is one. Possibly rowid's. Anything else ?

 


Dave Cramer


Re: PostgreSQL gaps wrt to java, and jdbc

From
Kevin Wooten
Date:
By “streaming a column” I assume you mean large bytea columns? If so, I say… hell yes.

Also,..

== Binary type coercion ==
      The biggest impediment I’ve run into is the fact that PG will not attempt coercion on binary types.  If you
specifya binary type in the protocol, that’s the final word.  A better system would be to have PG perform the same
coercionit does for text types and then send the results back with coerced types, in binary; finally falling back to
textwhen binary cannot be produced.  This would solve almost every issue we have had to work around in the “ng” driver. 

== Unbound binary results ==
      Being able to send a query, without the parse & bind step, that returns binary results.

== Better cursor support ==
      Providing capabilities that better match JDBC.  PG is pretty close but a few cases make cursors not work as
expecteda lot of the time. 

I could probably come up with a bunch more but these are the big ones that jump out at me.



I have a couple solutions (with no idea how to implement them)...

For "binary type coercion” & “unbound binary results” a simple setting flag telling PG that we basically know how to
handleall the binary types would suffice.  In this flag was set PG could just start returning stuff in the fashion we
wantand it would helpfully “just work”. 

For streaming it seems a sub mode, similar to the copy protocol, that doesn’t force us out of the transaction but can
sendus multiple packets would really help.  Although I know there are many factors at work on the server that I
definitelydon’t understand. 


> On Jul 6, 2015, at 5:14 AM, Dave Cramer <davecramer@gmail.com> wrote:
>
> I have been actively maintaining the driver off and on since 1999 or so. Recently we have had a flurry of activity
andone of the things I noticed was the surprise that PostgreSQL didn't support X or did Y in some unexpected way.
Sometimeswe are a bit too complacent, and accept things the way they are. 
>
> I am wondering what could the server do better that would help JDBC?
>
> Obviously streaming a column is one. Possibly rowid's. Anything else ?
>
>
> Dave Cramer



Re: PostgreSQL gaps wrt to java, and jdbc

From
Heikki Linnakangas
Date:
On 07/07/2015 09:31 PM, Kevin Wooten wrote:
> == Binary type coercion ==
> The biggest impediment I’ve run into is the fact that PG will not
> attempt coercion on binary types. If you specify a binary type in the
> protocol, that’s the final word. A better system would be to have PG
> perform the same coercion it does for text types and then send the
> results back with coerced types, in binary; finally falling back to
> text when binary cannot be produced. This would solve almost every
> issue we have had to work around in the “ng” driver.

Can you elaborate? I'm confused, because the server can produce binary
output for any datatype that has binary output functions. That covers
all built-in datatypes and all extension datatypes people use in practice.

What I've imagined to be a problem, though, is that you have to either
understand the binary representation of all datatypes in the client, or
you have to request binary for only those datatypes you can handle. And
to know which datatype a result set's column has, you have to Describe
it before fetching the results, which adds an extra round-trip. So it
would be handy if the driver could provide the server a list of
datatypes that should be sent as binary, as a one-time operation at the
beginning of the session for example, rather than indicating for each
column in each query separately.

- Heikki



Re: PostgreSQL gaps wrt to java, and jdbc

From
Dave Cramer
Date:




On 7 July 2015 at 15:46, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 07/07/2015 09:31 PM, Kevin Wooten wrote:
== Binary type coercion ==
The biggest impediment I’ve run into is the fact that PG will not
attempt coercion on binary types. If you specify a binary type in the
protocol, that’s the final word. A better system would be to have PG
perform the same coercion it does for text types and then send the
results back with coerced types, in binary; finally falling back to
text when binary cannot be produced. This would solve almost every
issue we have had to work around in the “ng” driver.

Can you elaborate? I'm confused, because the server can produce binary output for any datatype that has binary output functions. That covers all built-in datatypes and all extension datatypes people use in practice.

What I've imagined to be a problem, though, is that you have to either understand the binary representation of all datatypes in the client, or you have to request binary for only those datatypes you can handle. And to know which datatype a result set's column has, you have to Describe it before fetching the results, which adds an extra round-trip. So it would be handy if the driver could provide the server a list of datatypes that should be sent as binary, as a one-time operation at the beginning of the session for example, rather than indicating for each column in each query separately.


Is there a way to get json and or xml out as binary ?


Dave Cramer

Re: PostgreSQL gaps wrt to java, and jdbc

From
Heikki Linnakangas
Date:
On 07/07/2015 10:51 PM, Dave Cramer wrote:
> On 7 July 2015 at 15:46, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>> On 07/07/2015 09:31 PM, Kevin Wooten wrote:
>>
>>> == Binary type coercion ==
>>> The biggest impediment I’ve run into is the fact that PG will not
>>> attempt coercion on binary types. If you specify a binary type in the
>>> protocol, that’s the final word. A better system would be to have PG
>>> perform the same coercion it does for text types and then send the
>>> results back with coerced types, in binary; finally falling back to
>>> text when binary cannot be produced. This would solve almost every
>>> issue we have had to work around in the “ng” driver.
>>>
>>
>> Can you elaborate? I'm confused, because the server can produce binary
>> output for any datatype that has binary output functions. That covers all
>> built-in datatypes and all extension datatypes people use in practice.
>>
>> What I've imagined to be a problem, though, is that you have to either
>> understand the binary representation of all datatypes in the client, or you
>> have to request binary for only those datatypes you can handle. And to know
>> which datatype a result set's column has, you have to Describe it before
>> fetching the results, which adds an extra round-trip. So it would be handy
>> if the driver could provide the server a list of datatypes that should be
>> sent as binary, as a one-time operation at the beginning of the session for
>> example, rather than indicating for each column in each query separately.
>>
> Is there a way to get json and or xml out as binary ?

Well, as far as the protocol is considered, yes. However, the "binary
representation" of json and xml datatypes is not too exciting; here's
the binary send function for json for example:

/*
  * Binary send.
  */
Datum
json_send(PG_FUNCTION_ARGS)
{
         text       *t = PG_GETARG_TEXT_PP(0);
         StringInfoData buf;

         pq_begintypsend(&buf);
         pq_sendtext(&buf, VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t));
         PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}

I.e. it just sends the string with a bytea header. There is no benefit
to using that instead of the regular text representation. That dummy
representation is used just so that e.g. binary-format pg_dump works.

- Heikki



Re: PostgreSQL gaps wrt to java, and jdbc

From
Kevin Wooten
Date:
Although what you suggest might be useful.  For the ng driver we’ve handled that.  At startup we download every type
andmap them based on the send,recv,in,out. So we just specify them for each query. 

As far as “coercion" goes maybe a better term would be “deduction”.  There are numerous cases where it would be great
ifwe could send the server a “don’t care” type and just a binary format.   Instead we have to specify a type and format
together.Doing this causes issues in edge cases (e.g. strings coming back padded because we specified varchar but the
columnwas text). 

Actually an easier solution specific to our implementation would be to simply tell the driver to always return results
inbinary when it’s available. It would cover most cases including removing the parse/bind step for many queries. 

> On Jul 7, 2015, at 12:46 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 07/07/2015 09:31 PM, Kevin Wooten wrote:
>> == Binary type coercion ==
>> The biggest impediment I’ve run into is the fact that PG will not
>> attempt coercion on binary types. If you specify a binary type in the
>> protocol, that’s the final word. A better system would be to have PG
>> perform the same coercion it does for text types and then send the
>> results back with coerced types, in binary; finally falling back to
>> text when binary cannot be produced. This would solve almost every
>> issue we have had to work around in the “ng” driver.
>
> Can you elaborate? I'm confused, because the server can produce binary output for any datatype that has binary output
functions.That covers all built-in datatypes and all extension datatypes people use in practice. 
>
> What I've imagined to be a problem, though, is that you have to either understand the binary representation of all
datatypesin the client, or you have to request binary for only those datatypes you can handle. And to know which
datatypea result set's column has, you have to Describe it before fetching the results, which adds an extra round-trip.
Soit would be handy if the driver could provide the server a list of datatypes that should be sent as binary, as a
one-timeoperation at the beginning of the session for example, rather than indicating for each column in each query
separately.
>
> - Heikki
>



Re: PostgreSQL gaps wrt to java, and jdbc

From
Kevin Wooten
Date:

On Jul 7, 2015, at 1:02 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 07/07/2015 10:51 PM, Dave Cramer wrote:
On 7 July 2015 at 15:46, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 07/07/2015 09:31 PM, Kevin Wooten wrote:

== Binary type coercion ==
The biggest impediment I’ve run into is the fact that PG will not
attempt coercion on binary types. If you specify a binary type in the
protocol, that’s the final word. A better system would be to have PG
perform the same coercion it does for text types and then send the
results back with coerced types, in binary; finally falling back to
text when binary cannot be produced. This would solve almost every
issue we have had to work around in the “ng” driver.


Can you elaborate? I'm confused, because the server can produce binary
output for any datatype that has binary output functions. That covers all
built-in datatypes and all extension datatypes people use in practice.

What I've imagined to be a problem, though, is that you have to either
understand the binary representation of all datatypes in the client, or you
have to request binary for only those datatypes you can handle. And to know
which datatype a result set's column has, you have to Describe it before
fetching the results, which adds an extra round-trip. So it would be handy
if the driver could provide the server a list of datatypes that should be
sent as binary, as a one-time operation at the beginning of the session for
example, rather than indicating for each column in each query separately.

Is there a way to get json and or xml out as binary ?

Well, as far as the protocol is considered, yes. However, the "binary representation" of json and xml datatypes is not too exciting; here's the binary send function for json for example:

/*
* Binary send.
*/
Datum
json_send(PG_FUNCTION_ARGS)
{
       text       *t = PG_GETARG_TEXT_PP(0);
       StringInfoData buf;

       pq_begintypsend(&buf);
       pq_sendtext(&buf, VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t));
       PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}

I.e. it just sends the string with a bytea header. There is no benefit to using that instead of the regular text representation. That dummy representation is used just so that e.g. binary-format pg_dump works.

- Heikki

This confused me when we went to implement this as a binary type.  Isn’t it stored in a binary format? Or is that just an intermediate format that’s cached around inside the server?


Re: PostgreSQL gaps wrt to java, and jdbc

From
Dave Cramer
Date:


Dave Cramer

On 7 July 2015 at 16:02, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 07/07/2015 10:51 PM, Dave Cramer wrote:
On 7 July 2015 at 15:46, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 07/07/2015 09:31 PM, Kevin Wooten wrote:

== Binary type coercion ==
The biggest impediment I’ve run into is the fact that PG will not
attempt coercion on binary types. If you specify a binary type in the
protocol, that’s the final word. A better system would be to have PG
perform the same coercion it does for text types and then send the
results back with coerced types, in binary; finally falling back to
text when binary cannot be produced. This would solve almost every
issue we have had to work around in the “ng” driver.


Can you elaborate? I'm confused, because the server can produce binary
output for any datatype that has binary output functions. That covers all
built-in datatypes and all extension datatypes people use in practice.

What I've imagined to be a problem, though, is that you have to either
understand the binary representation of all datatypes in the client, or you
have to request binary for only those datatypes you can handle. And to know
which datatype a result set's column has, you have to Describe it before
fetching the results, which adds an extra round-trip. So it would be handy
if the driver could provide the server a list of datatypes that should be
sent as binary, as a one-time operation at the beginning of the session for
example, rather than indicating for each column in each query separately.

Is there a way to get json and or xml out as binary ?

Well, as far as the protocol is considered, yes. However, the "binary representation" of json and xml datatypes is not too exciting; here's the binary send function for json for example:

/*
 * Binary send.
 */
Datum
json_send(PG_FUNCTION_ARGS)
{
        text       *t = PG_GETARG_TEXT_PP(0);
        StringInfoData buf;

        pq_begintypsend(&buf);
        pq_sendtext(&buf, VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t));
        PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}

I.e. it just sends the string with a bytea header. There is no benefit to using that instead of the regular text representation. That dummy representation is used just so that e.g. binary-format pg_dump works.

- Heikki

Surely this could be compressed ? 

Re: PostgreSQL gaps wrt to java, and jdbc

From
Heikki Linnakangas
Date:
On 07/07/2015 11:04 PM, Kevin Wooten wrote:
> As far as “coercion" goes maybe a better term would be “deduction”.
> There are numerous cases where it would be great if we could send the
> server a “don’t care” type and just a binary format.   Instead we
> have to specify a type and format together. Doing this causes issues
> in edge cases (e.g. strings coming back padded because we specified
> varchar but the column was text).

Oh, are you talking about query parameters, sent from the client to the
server? The type OID and format code are not actually sent in the same
message - you give the OID in the Parse message and the format code in
Bind. You don't have to specify the OID in the Parse if you don't want
to, and the server will deduce the most sensible one from the query
context. After the Parse, you can find out what the server deduced by
sending a ParameterDescription message, and then form the appropriate
binary representation for the deduced type, and send it with Bind. Am I
missing something?

- Heikki



Re: PostgreSQL gaps wrt to java, and jdbc

From
Dave Cramer
Date:



On 7 July 2015 at 16:24, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 07/07/2015 11:04 PM, Kevin Wooten wrote:
As far as “coercion" goes maybe a better term would be “deduction”.
There are numerous cases where it would be great if we could send the
server a “don’t care” type and just a binary format.   Instead we
have to specify a type and format together. Doing this causes issues
in edge cases (e.g. strings coming back padded because we specified
varchar but the column was text).

Oh, are you talking about query parameters, sent from the client to the server? The type OID and format code are not actually sent in the same message - you give the OID in the Parse message and the format code in Bind. You don't have to specify the OID in the Parse if you don't want to, and the server will deduce the most sensible one from the query context. After the Parse, you can find out what the server deduced by sending a ParameterDescription message, and then form the appropriate binary representation for the deduced type, and send it with Bind. Am I missing something?


I think Kevin was hoping the server would just "figure it out" without all the back and forth. 

Dave Cramer

Re: PostgreSQL gaps wrt to java, and jdbc

From
Heikki Linnakangas
Date:
On 07/07/2015 11:08 PM, Kevin Wooten wrote:
>
>> On Jul 7, 2015, at 1:02 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> On 07/07/2015 10:51 PM, Dave Cramer wrote:
>>> Is there a way to get json and or xml out as binary ?
>>
>> Well, as far as the protocol is considered, yes. However, the
>> "binary representation" of json and xml datatypes is not too
>> exciting; here's the binary send function for json for example:
>>
>> /*
>> * Binary send.
>> */
>> Datum
>> json_send(PG_FUNCTION_ARGS)
>> {
>>         text       *t = PG_GETARG_TEXT_PP(0);
>>         StringInfoData buf;
>>
>>         pq_begintypsend(&buf);
>>         pq_sendtext(&buf, VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t));
>>         PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
>> }
>>
>> I.e. it just sends the string with a bytea header. There is no
>> benefit to using that instead of the regular text representation.
>> That dummy representation is used just so that e.g. binary-format
>> pg_dump works.
>
> This confused me when we went to implement this as a binary type.
> Isn’t it stored in a binary format? Or is that just an intermediate
> format that’s cached around inside the server?

Json is stored as text. It's really nothing more than a text field with
a validation step in the input function to check that it's valid json.
It preserves formatting and all.

Jsonb is stored in a binary format, but funnily enough its "binary
representation" in the protocol is mostly text:

> /*
>  * jsonb type send function
>  *
>  * Just send jsonb as a version number, then a string of text
>  */
> Datum
> jsonb_send(PG_FUNCTION_ARGS)
> {
>     Jsonb       *jb = PG_GETARG_JSONB(0);
>     StringInfoData buf;
>     StringInfo    jtext = makeStringInfo();
>     int            version = 1;
>
>     (void) JsonbToCString(jtext, &jb->root, VARSIZE(jb));
>
>     pq_begintypsend(&buf);
>     pq_sendint(&buf, version, 1);
>     pq_sendtext(&buf, jtext->data, jtext->len);
>     pfree(jtext->data);
>     pfree(jtext);
>
>     PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
> }

There's a version number there, so perhaps we'll replace that with
something more binary in the future ;-)

- Heikki



Re: PostgreSQL gaps wrt to java, and jdbc

From
Kevin Wooten
Date:
> On Jul 7, 2015, at 1:30 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 07/07/2015 11:08 PM, Kevin Wooten wrote:
>>
>>> On Jul 7, 2015, at 1:02 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>>
>>> On 07/07/2015 10:51 PM, Dave Cramer wrote:
>>>> Is there a way to get json and or xml out as binary ?
>>>
>>> Well, as far as the protocol is considered, yes. However, the
>>> "binary representation" of json and xml datatypes is not too
>>> exciting; here's the binary send function for json for example:
>>>
>>> /*
>>> * Binary send.
>>> */
>>> Datum
>>> json_send(PG_FUNCTION_ARGS)
>>> {
>>>        text       *t = PG_GETARG_TEXT_PP(0);
>>>        StringInfoData buf;
>>>
>>>        pq_begintypsend(&buf);
>>>        pq_sendtext(&buf, VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t));
>>>        PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
>>> }
>>>
>>> I.e. it just sends the string with a bytea header. There is no
>>> benefit to using that instead of the regular text representation.
>>> That dummy representation is used just so that e.g. binary-format
>>> pg_dump works.
>>
>> This confused me when we went to implement this as a binary type.
>> Isn’t it stored in a binary format? Or is that just an intermediate
>> format that’s cached around inside the server?
>
> Json is stored as text. It's really nothing more than a text field with a validation step in the input function to
checkthat it's valid json. It preserves formatting and all. 
>
> Jsonb is stored in a binary format, but funnily enough its "binary representation" in the protocol is mostly text:
>
>> /*
>> * jsonb type send function
>> *
>> * Just send jsonb as a version number, then a string of text
>> */
>> Datum
>> jsonb_send(PG_FUNCTION_ARGS)
>> {
>>     Jsonb       *jb = PG_GETARG_JSONB(0);
>>     StringInfoData buf;
>>     StringInfo    jtext = makeStringInfo();
>>     int            version = 1;
>>
>>     (void) JsonbToCString(jtext, &jb->root, VARSIZE(jb));
>>
>>     pq_begintypsend(&buf);
>>     pq_sendint(&buf, version, 1);
>>     pq_sendtext(&buf, jtext->data, jtext->len);
>>     pfree(jtext->data);
>>     pfree(jtext);
>>
>>     PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
>> }
>
> There's a version number there, so perhaps we'll replace that with something more binary in the future ;-)
>
> - Heikki
>

Apologies I wasn’t paying enough attention to see the first example wasn’t JSONB.  Seems crazy to add another format
indirectionwith the “version” field. Seems the field itself would have solved whatever problems would have been
encounteredwith just spitting out the binary type in the first place. 




Re: PostgreSQL gaps wrt to java, and jdbc

From
Heikki Linnakangas
Date:
On 07/07/2015 11:26 PM, Dave Cramer wrote:
> On 7 July 2015 at 16:24, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>> On 07/07/2015 11:04 PM, Kevin Wooten wrote:
>>
>>> As far as “coercion" goes maybe a better term would be “deduction”.
>>> There are numerous cases where it would be great if we could send the
>>> server a “don’t care” type and just a binary format.   Instead we
>>> have to specify a type and format together. Doing this causes issues
>>> in edge cases (e.g. strings coming back padded because we specified
>>> varchar but the column was text).
>>
>> Oh, are you talking about query parameters, sent from the client to the
>> server? The type OID and format code are not actually sent in the same
>> message - you give the OID in the Parse message and the format code in
>> Bind. You don't have to specify the OID in the Parse if you don't want to,
>> and the server will deduce the most sensible one from the query context.
>> After the Parse, you can find out what the server deduced by sending a
>> ParameterDescription message, and then form the appropriate binary
>> representation for the deduced type, and send it with Bind. Am I missing
>> something?
>>
> I think Kevin was hoping the server would just "figure it out" without all
> the back and forth.

Well, if you use the binary representation, you need to somehow tell the
server which datatype's binary representation you're using. Perhaps you
could tell the server "I'm sending this in the binary representation of
int4, but I still want you to deduce the actual type as if it was
unknown", but that raises the question: if you know it's an integer, why
do you want the server to do the deduction?

- Heikki



Re: PostgreSQL gaps wrt to java, and jdbc

From
Kevin Wooten
Date:

On Jul 7, 2015, at 1:26 PM, Dave Cramer <davecramer@gmail.com> wrote:




On 7 July 2015 at 16:24, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 07/07/2015 11:04 PM, Kevin Wooten wrote:
As far as “coercion" goes maybe a better term would be “deduction”.
There are numerous cases where it would be great if we could send the
server a “don’t care” type and just a binary format.   Instead we
have to specify a type and format together. Doing this causes issues
in edge cases (e.g. strings coming back padded because we specified
varchar but the column was text).

Oh, are you talking about query parameters, sent from the client to the server? The type OID and format code are not actually sent in the same message - you give the OID in the Parse message and the format code in Bind. You don't have to specify the OID in the Parse if you don't want to, and the server will deduce the most sensible one from the query context. After the Parse, you can find out what the server deduced by sending a ParameterDescription message, and then form the appropriate binary representation for the deduced type, and send it with Bind. Am I missing something?


I think Kevin was hoping the server would just "figure it out" without all the back and forth. 

Dave Cramer


Well I want both.

I want it to work as Heikki describes.  We currently do exactly what you are saying but when we leave out the types for parameters that are in turn returned in the results that’s where the edge cases begin.  It was a while ago that I spent literally weeks trying to figure out how to get these right, ultimately I had to settle on sending all the parameter types (I think).  I really need to peruse the final code again to refresh my memory and get some concrete examples that I was having. What we have now works well but it’s not perfect.

Also, yes, I want the server to just “figure it out” and send my binary types. Done.


Re: PostgreSQL gaps wrt to java, and jdbc

From
"Markus KARG"
Date:

No, unfortunately not, as there is a big difference between compression and binary transmission. See the following example:

 

<Some-Element xyz="345678,901">

               <Some-Other-Element abc="54321,876">

                              <Some-Value my-attribute="123456,789"/>

               </Some-Other-Element>

</Some-Element>

 

With compression, you can certainly get rid of the whitespace, and if the compression algorithms is better you even will have refs instead of element names, but that's it, mostly. What about the numbers? Still transfered untouched, as unique hence uncompressable. So lots of integers and decimals screws compression.

 

Binary transmission on the other hand will only need four bytes per integer. That makes twelve bytes for all the above attributes.

 

To sum up, compressed transmission should be an option that is performed on a lower level (like gzip transmission with http), but is *additive* to binary transmission. The latter should be used even in case no compressed transmission is in place at all.

 

-Markus

 

 

 

From: davecramer@gmail.com [mailto:davecramer@gmail.com] On Behalf Of Dave Cramer
Sent: Dienstag, 7. Juli 2015 19:54
To: Markus KARG
Cc: List
Subject: Re: [JDBC] PostgreSQL gaps wrt to java, and jdbc

 

Looking at the backend code. One possibility is to just use compression to send it over?


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

 

On 6 July 2015 at 12:35, Markus KARG <markus@headcrashing.eu> wrote:

Binary represenation of XML and JSON instead of converting it to a String BEFORE transmission. Both JSON and XML essentially are graphs, and can be inflated with whitespace on the driver side AFTER transmission. Same in the other direction (deflate by transmitting binary object graph instead of whitespace-inflated String representation).

 

 

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Dave Cramer
Sent: Montag, 6. Juli 2015 14:15
To: List
Subject: [JDBC] PostgreSQL gaps wrt to java, and jdbc

 

I have been actively maintaining the driver off and on since 1999 or so. Recently we have had a flurry of activity and one of the things I noticed was the surprise that PostgreSQL didn't support X or did Y in some unexpected way. Sometimes we are a bit too complacent, and accept things the way they are.

 

I am wondering what could the server do better that would help JDBC?

 

Obviously streaming a column is one. Possibly rowid's. Anything else ?

 


Dave Cramer

 

Re: PostgreSQL gaps wrt to java, and jdbc

From
Dave Cramer
Date:



On 7 July 2015 at 16:46, Markus KARG <markus@headcrashing.eu> wrote:

No, unfortunately not, as there is a big difference between compression and binary transmission. See the following example:

 

<Some-Element xyz="345678,901">

               <Some-Other-Element abc="54321,876">

                              <Some-Value my-attribute="123456,789"/>

               </Some-Other-Element>

</Some-Element>

 

With compression, you can certainly get rid of the whitespace, and if the compression algorithms is better you even will have refs instead of element names, but that's it, mostly. What about the numbers? Still transfered untouched, as unique hence uncompressable. So lots of integers and decimals screws compression.

 

Binary transmission on the other hand will only need four bytes per integer. That makes twelve bytes for all the above attributes.


You would require oids (or some other way to decode) for numbers, boolean, strings, etc for each binary representation, no ?


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

 

 

From: davecramer@gmail.com [mailto:davecramer@gmail.com] On Behalf Of Dave Cramer
Sent: Dienstag, 7. Juli 2015 19:54
To: Markus KARG
Cc: List
Subject: Re: [JDBC] PostgreSQL gaps wrt to java, and jdbc

 

Looking at the backend code. One possibility is to just use compression to send it over?


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

 

On 6 July 2015 at 12:35, Markus KARG <markus@headcrashing.eu> wrote:

Binary represenation of XML and JSON instead of converting it to a String BEFORE transmission. Both JSON and XML essentially are graphs, and can be inflated with whitespace on the driver side AFTER transmission. Same in the other direction (deflate by transmitting binary object graph instead of whitespace-inflated String representation).

 

 

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Dave Cramer
Sent: Montag, 6. Juli 2015 14:15
To: List
Subject: [JDBC] PostgreSQL gaps wrt to java, and jdbc

 

I have been actively maintaining the driver off and on since 1999 or so. Recently we have had a flurry of activity and one of the things I noticed was the surprise that PostgreSQL didn't support X or did Y in some unexpected way. Sometimes we are a bit too complacent, and accept things the way they are.

 

I am wondering what could the server do better that would help JDBC?

 

Obviously streaming a column is one. Possibly rowid's. Anything else ?

 


Dave Cramer

 


Re: PostgreSQL gaps wrt to java, and jdbc

From
Kevin Wooten
Date:

On Jul 7, 2015, at 1:49 PM, Dave Cramer <pg@fastcrypt.com> wrote:




On 7 July 2015 at 16:46, Markus KARG <markus@headcrashing.eu> wrote:

No, unfortunately not, as there is a big difference between compression and binary transmission. See the following example:

 

<Some-Element xyz="345678,901">

               <Some-Other-Element abc="54321,876">

                              <Some-Value my-attribute="123456,789"/>

               </Some-Other-Element>

</Some-Element>

 

With compression, you can certainly get rid of the whitespace, and if the compression algorithms is better you even will have refs instead of element names, but that's it, mostly. What about the numbers? Still transfered untouched, as unique hence uncompressable. So lots of integers and decimals screws compression.

 

Binary transmission on the other hand will only need four bytes per integer. That makes twelve bytes for all the above attributes.


You would require oids (or some other way to decode) for numbers, boolean, strings, etc for each binary representation, no ?


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca



My assumption has been that the true binary format is something dictated by Javascript engine (I assume it would be V8 since Pl/V8 exists).   So it seems we could just glean it from there and need no other transmitted information besides that “standard”.

 

 

From: davecramer@gmail.com [mailto:davecramer@gmail.com] On Behalf Of Dave Cramer
Sent: Dienstag, 7. Juli 2015 19:54
To: Markus KARG
Cc: List
Subject: Re: [JDBC] PostgreSQL gaps wrt to java, and jdbc

 

Looking at the backend code. One possibility is to just use compression to send it over?


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

 

On 6 July 2015 at 12:35, Markus KARG <markus@headcrashing.eu> wrote:

Binary represenation of XML and JSON instead of converting it to a String BEFORE transmission. Both JSON and XML essentially are graphs, and can be inflated with whitespace on the driver side AFTER transmission. Same in the other direction (deflate by transmitting binary object graph instead of whitespace-inflated String representation).

 

 

From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Dave Cramer
Sent: Montag, 6. Juli 2015 14:15
To: List
Subject: [JDBC] PostgreSQL gaps wrt to java, and jdbc

 

I have been actively maintaining the driver off and on since 1999 or so. Recently we have had a flurry of activity and one of the things I noticed was the surprise that PostgreSQL didn't support X or did Y in some unexpected way. Sometimes we are a bit too complacent, and accept things the way they are.

 

I am wondering what could the server do better that would help JDBC?

 

Obviously streaming a column is one. Possibly rowid's. Anything else ?

 


Dave Cramer

 



Re: PostgreSQL gaps wrt to java, and jdbc

From
Dave Cramer
Date:

On 7 July 2015 at 16:56, Kevin Wooten <kdubb@me.com> wrote:

On Jul 7, 2015, at 1:49 PM, Dave Cramer <pg@fastcrypt.com> wrote:




On 7 July 2015 at 16:46, Markus KARG <markus@headcrashing.eu> wrote:

No, unfortunately not, as there is a big difference between compression and binary transmission. See the following example:

 

<Some-Element xyz="345678,901">

               <Some-Other-Element abc="54321,876">

                              <Some-Value my-attribute="123456,789"/>

               </Some-Other-Element>

</Some-Element>

 

With compression, you can certainly get rid of the whitespace, and if the compression algorithms is better you even will have refs instead of element names, but that's it, mostly. What about the numbers? Still transfered untouched, as unique hence uncompressable. So lots of integers and decimals screws compression.

 

Binary transmission on the other hand will only need four bytes per integer. That makes twelve bytes for all the above attributes.


You would require oids (or some other way to decode) for numbers, boolean, strings, etc for each binary representation, no ?


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca



My assumption has been that the true binary format is something dictated by Javascript engine (I assume it would be V8 since Pl/V8 exists).   So it seems we could just glean it from there and need no other transmitted information besides that “standard”.

Not having followed JS too closely are there other binary formats besides V8 ?



Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

Re: PostgreSQL gaps wrt to java, and jdbc

From
Steven Schlansker
Date:
On Jul 7, 2015, at 1:59 PM, Dave Cramer <pg@fastcrypt.com> wrote:

> My assumption has been that the true binary format is something dictated by Javascript engine (I assume it would be
V8since Pl/V8 exists).   So it seems we could just glean it from there and need no other transmitted information
besidesthat “standard”. 
>
> Not having followed JS too closely are there other binary formats besides V8 ?

http://wiki.fasterxml.com/SmileFormatSpec