Thread: Status of binary protocol usage?

Status of binary protocol usage?

From
aaime74
Date:
Hi,
I'm wondering what the status of the binary protocol usage patches is.
I've seen 7 of them posted to the mailing list up to January 2007, and
then nothing (the patches are here:
http://mokki.dyndns.org/~mtiihone/postgresql/binarytransfer/).

I'm asking because I suspect the text protocol is costing me quite a lot
of performance. I'm getting geometries out of a PostGIS database. Well,
guess what, the following query:

SELECT revision, gid, encode(AsBinary(force_2d(the_geom), 'XDR'),'base64')
FROM world

happens to be 25% faster than:

SELECT revision, gid, AsBinary(force_2d(the_geom), 'XDR') FROM world

even though in the former case the backend has to do more work for the
base64 encoding, and the client has to do the base64 decoding. This sounds
counterintuitive, but a profiler tells me that quite a bit of time is spent
in the PGBytea.toBytes(byte[]s) method, which is used only if the transfer
occurs in text mode.
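
One possible explanation, stated as an assumption rather than a measured
result: in text mode a server of that era writes bytea in its "escape"
format, where every non-printable byte becomes a four-character \ooo octal
sequence, while base64 needs about 1.33 characters per byte. The sketch
below is not the driver's PGBytea code; the class ByteaTextDecoding and its
method names are hypothetical and only illustrate the two decoding paths
the client has to choose between.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical illustration, not the JDBC driver's implementation.
final class ByteaTextDecoding {

    // Decodes the bytea "escape" text form: "\\" is a literal backslash,
    // "\ooo" is one byte written as three octal digits, and any other
    // character stands for itself. Assumes well-formed input.
    public static byte[] decodeEscape(byte[] s) {
        byte[] out = new byte[s.length]; // decoded data is never longer than the text
        int n = 0;
        for (int i = 0; i < s.length; i++) {
            byte b = s[i];
            if (b != '\\') {
                out[n++] = b;
            } else if (s[i + 1] == '\\') {
                out[n++] = '\\';
                i += 1;
            } else {
                int v = (s[i + 1] - '0') * 64 + (s[i + 2] - '0') * 8 + (s[i + 3] - '0');
                out[n++] = (byte) v;
                i += 3;
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Decodes the result of encode(..., 'base64') on the client side.
    // The MIME decoder is used because the server may insert line breaks.
    public static byte[] decodeBase64(String s) {
        return Base64.getMimeDecoder().decode(s);
    }

    public static void main(String[] args) {
        byte[] escaped = "a\\\\b\\001\\377".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.toString(decodeEscape(escaped)));
        System.out.println(Arrays.toString(decodeBase64("AQID")));
    }
}
```

For data that is mostly non-printable, such as WKB geometries, the escaped
text can be roughly three times larger than the base64 text, which may
account for part of the difference independently of how fast the decoding
loop itself is.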

I have had other experiences where Postgres turned out to be quite a bit
slower than other databases when fetching large amounts of data (not
necessarily geometries). I always had the gut feeling that the text protocol
was to blame, but never asked... well, now I do :)
Any chance we'll see the binary protocol used by default?

--
View this message in context: http://www.nabble.com/Status-of-binary-protocol-usage--tf3972236.html#a11275147
Sent from the PostgreSQL - jdbc mailing list archive at Nabble.com.


Re: Status of binary protocol usage?

From
Tom Lane
Date:
aaime74 <andrea.aime@gmail.com> writes:
> A profiler informs me that quite a bit of time is spent in the
> PGBytea.toBytes(byte[]s) method, which is used only if the transfer
> occurs in text mode.

That hardly seems like a killer argument for switching to binary
(which has got a boatload of disadvantages of its own).  Surely a
bit of code-optimization work can fix that.

            regards, tom lane

Re: Status of binary protocol usage?

From
aaime74
Date:


Tom Lane-2 wrote:
>
> aaime74 <andrea.aime@gmail.com> writes:
>> A profiler informs me that quite a bit of time is spent in the
>> PGBytea.toBytes(byte[]s) method, which is used only if the transfer
>> occurs in text mode.
>
> That hardly seems like a killer argument for switching to binary
> (which has got a boatload of disadvantages of its own).  Surely a
> bit of code-optimization work can fix that.
>

Hum, interesting. What would be the boatload of disadvantages? Is there
any reference to those?
Do you have any idea why doing more processing (Base64 encoding/decoding)
leads to significantly better performance? :)

Cheers
Andrea


Re: Status of binary protocol usage?

From
Tom Lane
Date:
aaime74 <andrea.aime@gmail.com> writes:
> Hum, interesting. What would be the boatload of disadvantages?

Portability and cross-machine compatibility, or lack of same.

It's tough enough trying to make things work for relatively primitive
datatypes like float.  I can hardly imagine that anyone would want to
support binary representations across platforms for geometric types.

            regards, tom lane

Re: Status of binary protocol usage?

From
Mikko Tiihonen
Date:
tom lane wrote:
> aaime74 <andrea ( dot ) aime ( at ) gmail ( dot ) com> writes:
> > Hum, interesting. What would be the boatload of disadvantages?
>
> Portability and cross-machine compatibility, or lack of same.

I can understand that future postgres versions may change the binary
protocol, but I hope this is done so that the change can be deduced from
the server version or, alternatively, handled like the float/long timestamp
setting, where the server informs the client about which variant it uses.

But I cannot understand why the protocol would ever be allowed to
vary with the server's 32/64-bitness or endianness. And, at least to my
current understanding, it doesn't. If those are taken care of
on the server side, then there shouldn't be any problems, because the
Java side is by definition already cross-machine compatible and portable.
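
As a rough illustration of that point, here is a minimal sketch of reading
two binary values on the Java side. It assumes the server sends int4 as a
4-byte and float8 as an 8-byte value in network (big-endian) byte order,
which is what the paragraph above describes; the class name
BinaryWireValues and its methods are hypothetical and not part of the
driver or of the proposed patches.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: decodes fixed-width values from binary wire bytes.
final class BinaryWireValues {

    // Reads a 4-byte big-endian signed integer.
    public static int readInt4(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    // Reads an 8-byte big-endian IEEE 754 double.
    public static double readFloat8(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getDouble();
    }

    public static void main(String[] args) {
        byte[] int4 = {0x00, 0x00, 0x01, 0x02};                 // 0x00000102
        byte[] float8 = {0x3F, (byte) 0xF8, 0, 0, 0, 0, 0, 0};  // 0x3FF8000000000000
        System.out.println(readInt4(int4));     // prints 258
        System.out.println(readFloat8(float8)); // prints 1.5
    }
}
```

The byte order is set explicitly, so the result does not depend on the
hardware the JVM runs on.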

But if you really think that the binary protocol varies so much with the
server, then why is it documented (even as much as it is), and why is the
documentation warning about the possible compatibility problems?

> It's tough enough trying to make things work for relatively primitive
> datatypes like float.  I can hardly imagine that anyone would want to
> support binary representations across platforms for geometric types.

I think that is why the postgres binary protocol allows requesting use
of binary types for each input and output parameter separately.
In addition the jdbc driver (with my proposed patches) allows
controlling binary/text format of each type separately to work around any
future changes in the protocol.
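
A minimal sketch of what such per-type control could look like. It relies
on the protocol's per-column format codes (0 for text, 1 for binary) and on
the built-in type OIDs 17 (bytea), 23 (int4) and 25 (text). The class
FormatCodeChooser and its methods are hypothetical and are not taken from
the proposed patches.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of choosing text or binary per column type.
final class FormatCodeChooser {
    public static final short TEXT = 0;   // protocol format code for text
    public static final short BINARY = 1; // protocol format code for binary

    // Type OIDs for which binary transfer has been switched on.
    private final Set<Integer> binaryOids = new HashSet<>();

    public void enableBinary(int typeOid) {
        binaryOids.add(typeOid);
    }

    public void disableBinary(int typeOid) {
        binaryOids.remove(typeOid);
    }

    // Returns one format code per column; types not enabled fall back to text.
    public short[] formatCodes(int[] columnTypeOids) {
        short[] codes = new short[columnTypeOids.length];
        for (int i = 0; i < columnTypeOids.length; i++) {
            codes[i] = binaryOids.contains(columnTypeOids[i]) ? BINARY : TEXT;
        }
        return codes;
    }

    public static void main(String[] args) {
        FormatCodeChooser chooser = new FormatCodeChooser();
        chooser.enableBinary(17); // bytea only
        short[] codes = chooser.formatCodes(new int[] {23, 17, 25});
        System.out.println(codes[0] + " " + codes[1] + " " + codes[2]); // prints 0 1 0
    }
}
```

Defaulting to text for any type that has not been explicitly enabled means
a type whose binary form changes can simply be switched back.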

With the proposed binary protocol changes, one just needs to add a few
lines of code to the send or receive path (whichever is the bottleneck)
whenever a new datatype turns out to be performance critical for an
application.

Anyway, I seem to have vanished for half a year but I could try to pick up
things from where I left them. Kris had some good ideas on how to hack
the test framework in order to ensure that the binary patches would not
cause any regressions.

-Mikko