Thread: Status of binary protocol usage?
Hi,

I'm wondering what the status of the binary protocol usage patches is. I've seen 7 of them posted to the mailing list up to January 2007, and then nothing (the patches are here: http://mokki.dyndns.org/~mtiihone/postgresql/binarytransfer/).

I'm asking because I suspect the text protocol is biting me with slow performance quite a lot. I'm getting geometries out of a PostGIS database. Well, guess what, the following query:

    SELECT revision, gid, encode(AsBinary(force_2d(the_geom), 'XDR'),'base64') FROM world

happens to be 25% faster than:

    SELECT revision, gid, AsBinary(force_2d(the_geom), 'XDR') FROM world

even though in the former case the backend has to do more work for the base64 encoding, and the client has to do the base64 decoding.

This sounds counterintuitive, but a profiler informs me that quite a bit of time is spent in the PGBytea.toBytes(byte[]s) method, which is used only if the transfer occurs in text mode.

I have had other experiences where Postgres turned out to be quite a bit slower than other databases when fetching large amounts of data (not necessarily geometries). I always had the gut feeling the text protocol was to blame, but never asked... well, now I do :)

Any chance we'll see the binary protocol used by default?

--
View this message in context: http://www.nabble.com/Status-of-binary-protocol-usage--tf3972236.html#a11275147
Sent from the PostgreSQL - jdbc mailing list archive at Nabble.com.
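One possible factor behind the timing difference, which nobody in this thread confirms, is the size and shape of the text form of bytea. The sketch below assumes the "escape" text format, where a backslash is sent as `\\` and each non-printable byte as `\ooo` (three octal digits), so a byte can take up to four characters on the wire, while base64 always takes four characters per three bytes. The class and method names are hypothetical and this is not the driver's actual PGBytea code.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch, not the JDBC driver's implementation.
class ByteaTextSketch {

    // Produce the assumed "escape" text form: \\ for backslash,
    // \ooo (octal) for bytes outside printable ASCII, others verbatim.
    static byte[] escapeEncode(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : raw) {
            int v = b & 0xFF;
            if (v == '\\') {
                out.write('\\');
                out.write('\\');
            } else if (v < 0x20 || v > 0x7E) {
                out.write('\\');
                out.write('0' + (v >> 6));
                out.write('0' + ((v >> 3) & 7));
                out.write('0' + (v & 7));
            } else {
                out.write(v);
            }
        }
        return out.toByteArray();
    }

    // Reverse of escapeEncode: every input character has to be inspected,
    // and each \ooo group is turned back into one byte.
    static byte[] escapeDecode(byte[] text) {
        byte[] out = new byte[text.length];
        int n = 0;
        for (int i = 0; i < text.length; i++) {
            byte c = text[i];
            if (c != '\\') {
                out[n++] = c;
            } else if (text[i + 1] == '\\') {
                out[n++] = '\\';
                i += 1;
            } else {
                int v = (text[i + 1] - '0') * 64
                      + (text[i + 2] - '0') * 8
                      + (text[i + 3] - '0');
                out[n++] = (byte) v;
                i += 3;
            }
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        byte[] raw = {0, 1, 'a', '\\', (byte) 0xFF};
        byte[] escaped = escapeEncode(raw);
        byte[] base64 = Base64.getEncoder().encode(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("escape length: " + escaped.length);
        System.out.println("base64 length: " + base64.length);
        System.out.println("round trip ok: " + Arrays.equals(escapeDecode(escaped), raw));
    }
}
```

Geometry data in binary form consists mostly of non-printable bytes, so under this assumption the escaped form would be considerably larger than the base64 form of the same value. Whether that accounts for the measured 25% would need profiling on the real data.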
aaime74 <andrea.aime@gmail.com> writes:
> A profiler informs me that quite a bit of time is spent in the
> PGBytea.toBytes(byte[]s) method, which is used only if the transfer
> occurs in text mode.

That hardly seems like a killer argument for switching to binary (which has got a boatload of disadvantages of its own). Surely a bit of code-optimization work can fix that.

regards, tom lane
Tom Lane-2 wrote:
>
> aaime74 <andrea.aime@gmail.com> writes:
>> A profiler informs me that quite a bit of time is spent in the
>> PGBytea.toBytes(byte[]s) method, which is used only if the transfer
>> occurs in text mode.
>
> That hardly seems like a killer argument for switching to binary
> (which has got a boatload of disadvantages of its own). Surely a
> bit of code-optimization work can fix that.
>

Hum, interesting. What would be the boatload of disadvantages? Is there any reference to those?

Do you have any idea why doing more processing (Base64 encoding/decoding) leads to significantly better performance? :)

Cheers
Andrea
aaime74 <andrea.aime@gmail.com> writes:
> Hum, interesting. What would be the boatload of disadvantages?

Portability and cross-machine compatibility, or lack of same. It's tough enough trying to make things work for relatively primitive datatypes like float. I can hardly imagine that anyone would want to support binary representations across platforms for geometric types.

regards, tom lane
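For the simple fixed-width types, the portability question comes down to whether the wire form has a fixed byte order and layout. The sketch below assumes float8 is sent as an 8-byte IEEE 754 value and int4 as a 4-byte integer, both in network (big-endian) byte order. Under that assumption, a Java client decodes them the same way on any host. The class and method names are hypothetical, and the sketch says nothing about geometric types.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: decoding fixed-width values that are assumed to
// arrive in network byte order, independent of the client's hardware.
class NetworkOrderSketch {

    // Stand-in for what a server would put on the wire for a float8.
    static byte[] encodeFloat8(double value) {
        return ByteBuffer.allocate(8).order(ByteOrder.BIG_ENDIAN).putDouble(value).array();
    }

    // Decode 8 big-endian bytes as an IEEE 754 double.
    static double decodeFloat8(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getDouble();
    }

    // Decode 4 big-endian bytes as a signed 32-bit integer.
    static int decodeInt4(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = encodeFloat8(1.5);
        System.out.println(decodeFloat8(wire));
        System.out.println(decodeInt4(new byte[] {0, 0, 1, 0}));
    }
}
```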
tom lane wrote:
> aaime74 <andrea.aime@gmail.com> writes:
> > Hum, interesting. What would be the boatload of disadvantages?
>
> Portability and cross-machine compatibility, or lack of same.

I can understand that future postgres versions can change the binary protocol, but I hope it is done so that the change can be deduced from the version or, alternatively, handled similarly to the float/long timestamp case, so that the server informs the client about its usage of new features.

But I cannot understand why the protocol would ever be allowed to vary with the server's 32/64-bitness or endianness. And at least to my current understanding it doesn't. And if those are taken care of on the server side then there shouldn't be any problems, because the Java side is by definition already cross-machine compatible and portable.

But if you really think that the binary protocol varies so much with the server, then why is it documented (even as much as it is) and why is the documentation warning about the possible compatibility problems?

> It's tough enough trying to make things work for relatively primitive
> datatypes like float. I can hardly imagine that anyone would want to
> support binary representations across platforms for geometric types.

I think that is why the postgres binary protocol allows requesting use of binary types for each input and output parameter separately. In addition, the jdbc driver (with my proposed patches) allows controlling the binary/text format of each type separately, to work around any future changes in the protocol.

With the proposed binary protocol changes one just needs to add a few lines of code to the send or receive path (whichever is the bottleneck) whenever a new datatype that is performance critical for an application is found.

Anyway, I seem to have vanished for half a year, but I could try to pick things up from where I left them. Kris had some good ideas on how to hack the test framework in order to ensure that the binary patches would not cause any regressions.
-Mikko
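
The per-type control described above can be pictured as follows. In the v3 frontend/backend protocol, each result column can be given a format code, 0 for text or 1 for binary. The sketch picks a code per column from a set of type OIDs for which binary transfer is enabled. The class, the method, and the idea of a configurable OID set are assumptions made for illustration, not the contents of the proposed patches. The OIDs shown are those of the built-in int4, text, and bytea types.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of per-type format selection; not taken from the patches.
class FormatCodeSketch {
    static final int OID_BYTEA = 17;
    static final int OID_INT4 = 23;
    static final int OID_TEXT = 25;

    static final short TEXT_FORMAT = 0;
    static final short BINARY_FORMAT = 1;

    // Request binary only for column types that are explicitly enabled;
    // every other column stays in text format.
    static short[] resultFormatCodes(int[] columnTypeOids, Set<Integer> binaryOids) {
        short[] codes = new short[columnTypeOids.length];
        for (int i = 0; i < columnTypeOids.length; i++) {
            codes[i] = binaryOids.contains(columnTypeOids[i]) ? BINARY_FORMAT : TEXT_FORMAT;
        }
        return codes;
    }

    public static void main(String[] args) {
        // Enable binary transfer for bytea only, the bottleneck in this thread.
        Set<Integer> binaryOids = new HashSet<>(Arrays.asList(OID_BYTEA));
        int[] columns = {OID_INT4, OID_TEXT, OID_BYTEA};
        System.out.println(Arrays.toString(resultFormatCodes(columns, binaryOids)));
    }
}
```

Keeping text as the default and opting in per type means a type whose binary form changes in a later server version can be switched back to text without touching the others.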