Thread: Re: binary protocol was Performance problem with timestamps in result sets

Re: binary protocol was Performance problem with timestamps in result sets

From: "mikael-aronsson"

How about the actual transport cost difference between the text and binary
protocols? It may not be a big difference, though, and many times the text
representation can be smaller than the binary one.

I have no idea about endianness, but as the clients work fine across
different platforms I would assume that the byte order in the protocol is
fixed (but you should not assume things, so maybe I am hanging myself again
here).
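
On the byte-order question: the PostgreSQL protocol documentation states
that binary representations of integers use network byte order (most
significant byte first), so a client does not depend on the server's
platform. A minimal Java sketch of reading such a value with an nio buffer
follows; the class and method names are made up for illustration and are
not taken from the JDBC driver.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class NetworkOrderDemo {
    // Decode a 4-byte integer that arrives most significant byte first.
    // BIG_ENDIAN is already ByteBuffer's default; it is spelled out for clarity.
    static int readInt32(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = {0x00, 0x00, 0x01, 0x02}; // 0x00000102 = 258
        System.out.println(readInt32(wire));    // 258
    }
}
```

Because the buffer fixes the byte order explicitly, the same code gives the
same result on any client platform.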

I do not think using the binary protocol would gain much, though, as Java
is not very good at converting binary data back to native values unless it
is serialized or you start to mess around with nio buffers, so in the end I
do not think there would be much difference in performance.

Mikael

> ----- Original Message -----
> From: "Dave Cramer" <pg@fastcrypt.com>
> To: "List" <pgsql-jdbc@postgresql.org>
> Sent: Thursday, March 09, 2006 1:11 PM
> Subject: [JDBC] binary protocol was Performance problem with timestamps in
> result sets
>
>
>> As Oliver points out, the timestamp is not a 64-bit integer, or even a
>> floating-point number. It is a textual representation of the timestamp
>> which needs to be parsed. I looked at the parsing and I was unable to
>> see anything that could be significantly optimized.
>>
>> So the option of going to the binary protocol exists. There are a number
>> of challenges with this. As Oliver points out, this is an all-or-nothing
>> proposition. In other words, you can't ask for just timestamps to be
>> returned in binary. The entire row comes back as binary. Additionally,
>> there are two possible representations of timestamps in PostgreSQL. One
>> is a 64-bit integer, the other is floating point. Added to this, there
>> may be endian issues (do we know the answer to this question?)
>>
>> A significant performance improvement exists for dates, times, and
>> timestamps; however, the advantages for the rest of the types are
>> questionable given the above assertions.
>>
>> Comments ?
>>
>> Dave


Re: binary protocol was Performance problem with timestamps in result sets

From: "Thomas Dudziak"

On 3/9/06, mikael-aronsson <mikael-aronsson@telia.com> wrote:
> How about the actual transport cost difference between the text and binary
> protocols? It may not be a big difference, though, and many times the text
> representation can be smaller than the binary one.

Really? I would have thought it's vice versa. E.g. a float is usually 4
(or 8) bytes in binary, but can be a lot longer in text depending on
the format.
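
Both positions can hold depending on the value. As a rough Java sketch
(the class and helper names are hypothetical, and Java's own toString
output stands in for the server's text format, which may differ), compare
the number of text bytes with the fixed 4-byte binary width:

```java
import java.nio.charset.StandardCharsets;

final class TextVsBinarySize {
    // Number of bytes needed to send the given text form as ASCII.
    static int textBytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        // A 32-bit int or float always takes 4 bytes in binary.
        System.out.println(textBytes(Integer.toString(7)));                 // 1
        System.out.println(textBytes(Integer.toString(Integer.MIN_VALUE))); // 11
        System.out.println(textBytes(Float.toString(0.1f)));                // 3
        System.out.println(textBytes(Float.toString(Float.MAX_VALUE)));
    }
}
```

Small values are shorter as text, while values near the limits of the type
are considerably longer than their binary form.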

> I do not think using the binary protocol would gain much, though, as Java
> is not very good at converting binary data back to native values unless it
> is serialized or you start to mess around with nio buffers, so in the end I
> do not think there would be much difference in performance.

Personally, I would be interested in whether a binary protocol
implementation in the JDBC driver would bring benefits or not for the
other, simpler types (int, string, ...). If they don't suffer, then it
might actually be worthwhile to investigate a binary implementation. That
is said, of course, from a pure user perspective; I have no insight
whatsoever into the inner workings of the JDBC driver.

cheers,
Tom

Re: binary protocol was Performance problem with timestamps

From: Markus Schaber

Hi, Mikael,

mikael-aronsson wrote:

> I do not think using the binary protocol would gain much, though, as Java
> is not very good at converting binary data back to native values unless it
> is serialized or you start to mess around with nio buffers,

I think that parsing complicated text representations is not faster than
multiplying fixed-length bunches of byte values together.
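
A minimal Java sketch of that kind of decoding, for illustration only. It
assumes the server uses the 64-bit integer timestamp representation
(microseconds since 2000-01-01 00:00:00), sent most significant byte
first, and it ignores time zone handling. The class and method names are
hypothetical and are not part of the JDBC driver.

```java
import java.time.Instant;

final class BinaryTimestampDemo {
    // Seconds between 1970-01-01 and 2000-01-01 (both UTC).
    static final long PG_EPOCH_SECONDS = 946_684_800L;

    // Combine 8 bytes, most significant first, into a signed 64-bit value.
    static long readInt64(byte[] b) {
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (b[i] & 0xFF);
        }
        return v;
    }

    // Convert microseconds since 2000-01-01 into an Instant.
    // floorDiv/floorMod keep the nanosecond part non-negative before 2000.
    static Instant toInstant(long micros) {
        long secs = Math.floorDiv(micros, 1_000_000L);
        long us = Math.floorMod(micros, 1_000_000L);
        return Instant.ofEpochSecond(PG_EPOCH_SECONDS + secs, us * 1_000L);
    }

    public static void main(String[] args) {
        byte[] wire = new byte[8]; // all zero: the 2000-01-01 epoch itself
        System.out.println(toInstant(readInt64(wire))); // 2000-01-01T00:00:00Z
    }
}
```

The whole decode is eight shifts and one division, with no string scanning.
A server using the floating-point representation would need a different
decode path.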


HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: binary protocol was Performance problem with timestamps in result sets

From: Marc Herbert

"Thomas Dudziak" <tomdzk@gmail.com> writes:

> On 3/9/06, mikael-aronsson <mikael-aronsson@telia.com> wrote:
>> How about the actual transport cost difference between the text and binary
>> protocols? It may not be a big difference, though, and many times the text
>> representation can be smaller than the binary one.
>
> Really? I would have thought it's vice versa. E.g. a float is usually 4
> (or 8) bytes in binary, but can be a lot longer in text depending on
> the format.


To represent binary IEEE 754 floats (4 bytes) without loss, the
maximum number of base-10 digits required is 9. For IEEE 754 doubles
(8 bytes) it is 17. I do not know what the "average" required number
of digits is.
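
The 9 and 17 digit figures can be checked with a small Java sketch. It
rounds the exact binary value to the given number of significant decimal
digits and parses the text back. The class and method names are
hypothetical, and only finite values are handled.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class RoundTripDigits {
    // Exact decimal expansion of the binary value, rounded to `digits`
    // significant digits. new BigDecimal(double) is exact for finite input.
    static String toDecimal(double exactValue, int digits) {
        return new BigDecimal(exactValue).round(new MathContext(digits)).toString();
    }

    // True if 9 significant digits are enough to recover the float.
    static boolean floatSurvives(float f) {
        return Float.parseFloat(toDecimal(f, 9)) == f;
    }

    // True if 17 significant digits are enough to recover the double.
    static boolean doubleSurvives(double d) {
        return Double.parseDouble(toDecimal(d, 17)) == d;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal(0.1f, 9) + " -> " + floatSurvives(0.1f));
        System.out.println(toDecimal(0.1, 17) + " -> " + doubleSurvives(0.1));
    }
}
```

Many values round-trip with far fewer digits (0.5 needs only one), so 9
and 17 are upper bounds rather than typical lengths.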


Of course, using one single-byte character per base-10 digit is a waste of
space... you could gzip or BCD-encode the string :-)

References:
- "What Every Computer Scientist Should Know About Floating Point
Arithmetic" 1991 - David Goldberg
- paragraph "Conversions" at:
<http://www2.hursley.ibm.com/decimal/>

Re: binary protocol was Performance problem with timestamps in result sets

From: Marc Herbert

Marc Herbert <Marc.Herbert@continuent.com> writes:

> To represent binary IEEE 754 floats (4 bytes) without loss, the
> maximum number of base-10 digits required is 9. For IEEE 754 doubles
> (8 bytes) it is 17. I do not know what the "average" required number
> of digits is.

Sorry, I forgot to specify: that's just for the fraction. You need to
add representations for the sign and the exponent.