Thread: Incoming/Sent traffic data
Hello, I'm writing a paper for my university about the advantages and disadvantages of ORM versus direct access for different languages. One of the tests measures the traffic between the application and the database (there is a proxy between them). This test gave me some weird results.
Below, code and result:
Python
http://pastebin.com/LfQDtF5X
Java
http://pastebin.com/ZfNpqP0H
The PostgreSQL log:
Python LOG: comando: select * from pessoa
Java LOG: executar <unnamed>: select * from pessoa
(translation from Portuguese to English)
comando -> command
executar -> execute
The difference in received data size is HUGE. Do you guys have any idea what I could be doing wrong? I heard something about a binary protocol; could it help?
On 05/12/2011 10:45 AM, Israel Ben Guilherme Fonseca wrote:
> The difference in received data size is HUGE. Do you guys have any
> idea what I could be doing wrong? I heard something about a binary
> protocol; could it help?
The total size of the transferred data in your tests appears to be tiny. As a result, you are measuring the connection overhead of the drivers more than anything else. The JDBC driver will be issuing a lot more SET statements and the like during connection setup than psycopg2 does. To see exactly what the differences are, enable full statement logging in the server so you can see everything that's issued.
Try testing with a meaningful amount of data, or start measuring only after you've established a connection.
--
Craig Ringer
> The PostgreSQL log:
>
> Python LOG: comando: select * from pessoa
> Java LOG: executar <unnamed>: select * from pessoa
>
> (translation from Portuguese to English)
> comando -> command
> executar -> execute
Based on these log messages, it looks like this particular invocation in Python is using the simple query protocol [1], whereas the JDBC one is using the extended protocol [2] (with an unnamed statement and unnamed portal). As far as I can tell, the JDBC driver only uses the simple protocol for COPY. The extended query protocol is a little chattier, but I wouldn't expect a *huge* difference there.
In any case, for what you're doing, I would strongly recommend looking at a tool like Wireshark or tcpdump to get more accurate results and more insight into what happens on the wire. E.g., I'm rather surprised that the Java bytes written is 5 times (!) lower than the Python version. Make sure you know what you're actually measuring.
[1]: http://developer.postgresql.org/pgdocs/postgres/protocol-flow.html#AEN91249
[2]: http://developer.postgresql.org/pgdocs/postgres/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY
---
Maciek Sakrejda | System Architect | Truviso
1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com
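To put a rough number on "a little chattier", here is a minimal sketch that frames the same statement once as a simple Query message and once as a Parse/Bind/Execute/Sync sequence with an unnamed statement and portal, following the message layouts in the protocol documentation. The helper names are made up for illustration, and the exact set of messages a real driver sends is an assumption (a driver may also send Describe, for instance), so treat the totals as a lower bound for the extended case rather than as what JDBC actually puts on the wire.

```python
import struct


def _msg(type_byte: bytes, payload: bytes) -> bytes:
    """Frame a frontend message: type byte, int32 length (counts itself), payload."""
    return type_byte + struct.pack("!I", len(payload) + 4) + payload


def _cstr(text: str) -> bytes:
    """Encode a null-terminated protocol string."""
    return text.encode("utf-8") + b"\x00"


def simple_query(sql: str) -> bytes:
    """Simple query protocol: a single Query ('Q') message."""
    return _msg(b"Q", _cstr(sql))


def extended_query(sql: str) -> bytes:
    """Extended protocol with unnamed statement and portal, no parameters."""
    parse = _msg(b"P", _cstr("") + _cstr(sql) + struct.pack("!H", 0))
    bind = _msg(b"B", _cstr("") + _cstr("") + struct.pack("!HHH", 0, 0, 0))
    execute = _msg(b"E", _cstr("") + struct.pack("!I", 0))  # 0 = no row limit
    sync = _msg(b"S", b"")
    return parse + bind + execute + sync


SQL = "select * from pessoa"
print("simple:  ", len(simple_query(SQL)), "bytes")
print("extended:", len(extended_query(SQL)), "bytes")
```

For this statement the request grows by a few dozen bytes, which supports the point above: the protocol choice alone does not explain a several-fold difference in received data.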
Thanks for the answers,
Craig, I'm already measuring the traffic only after the initial setup. It's clean data, only for the operation itself. That's why I have those lines with the comment 'Wait for ENTER KEY to clear the setup traffic'. About the log, I'm using log_statement = "all" in the postgres config; is there any other specific option?
Maciek, I'll take a look at these tools. My current proxy implementation just uses sockets to relay the data between the app (that's why the connections use port 4444) and the database; that's how I check the size of the data transferred. It has a GUI to clear the traffic counters, so I can measure only the operation, not the setup.
Israel
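For readers who want to reproduce this kind of measurement, a byte-counting relay can be sketched as follows. This is not the proxy from the thread (its source is only in the linked archive); the class and function names, the default database port 5432, and the single-connection design are assumptions made for illustration.

```python
import socket
import threading


class ByteCounter:
    """Thread-safe tally of bytes relayed in each direction."""

    def __init__(self):
        self._lock = threading.Lock()
        self.sent = 0       # application -> database
        self.received = 0   # database -> application

    def add(self, direction, n):
        with self._lock:
            setattr(self, direction, getattr(self, direction) + n)

    def reset(self):
        """Clear the tallies, e.g. once connection setup has finished."""
        with self._lock:
            self.sent = 0
            self.received = 0


def pump(src, dst, counter, direction):
    """Copy bytes from src to dst until src closes, counting each chunk."""
    while True:
        try:
            chunk = src.recv(4096)
            if not chunk:
                break
            dst.sendall(chunk)
        except OSError:
            break  # one side was closed while we were relaying
        counter.add(direction, len(chunk))


def relay_one_connection(counter, listen_port=4444,
                         db_host="127.0.0.1", db_port=5432):
    """Accept a single client on listen_port and relay it to the database."""
    with socket.create_server(("127.0.0.1", listen_port)) as server:
        app, _ = server.accept()
    with app, socket.create_connection((db_host, db_port)) as db:
        back = threading.Thread(
            target=pump, args=(db, app, counter, "received"), daemon=True
        )
        back.start()
        pump(app, db, counter, "sent")
```

Calling `counter.reset()` after the driver has connected plays the role of the "clear" button described above. Note that a relay like this counts payload bytes only, whereas a packet capture also counts TCP/IP headers, so the two tools will not agree exactly.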
> About the log, I'm using log_statement = "all" in the postgres config; is there any
> other specific option?
You could try setting log_min_messages = debug1 (I don't think anything below that--i.e., debug2 through debug5--is useful for your case, but you can give it a shot to see what's there).
> My current proxy implementation just uses sockets to relay the data between the app (that's
> why the connections use port 4444) and the database; that's how I check the size of the data
> transferred. It has a GUI to clear the traffic counters, so I can measure only the operation,
> not the setup.
Nice. I didn't mean to imply you were entirely unscientific about this, just that you should be careful regarding assumptions as to what is sent where when. A tool like Wireshark is (relatively) easy to pick up and gives you tremendous insight into what's going on on the wire with no application changes required (and it even has a PostgreSQL protocol plugin by default so you don't need to stare at raw bytes).
---
Maciek Sakrejda | System Architect | Truviso
I just gave Wireshark a try. It's indeed nice, but I couldn't figure out a good filter to isolate my app's traffic (I'm on localhost), so I captured on the 'lo' interface, ran the test, picked a random packet that had just been captured, took its source port (the application's port, which is random), and filtered on the packets that use it.
I did a quick test with both languages and it still looks like the Java traffic is somewhat bigger. (I couldn't isolate the setup traffic.)
Total traffic (same test as before, but with the select in a loop of 20 iterations):
28 KB Python
39 KB Java
The strange thing is, I compared the results of my proxy and Wireshark for the Java test, and they match. For the Python test, there was a difference of 10 KB between the proxy and Wireshark results. I don't know why. I'll have to explore this a bit more.
Do you have experience with this, Maciek? Could I send email directly to you regarding the Wireshark questions?
On Thu, May 12, 2011 at 10:27 AM, Israel Ben Guilherme Fonseca <israel.bgf@gmail.com> wrote:
> Do you have experience with this, Maciek? Could I send email directly to you regarding the Wireshark questions?
I can't speak for everyone else on the list, but I have no trouble ignoring a thread that I'm not interested in, so I much prefer that people don't take things to private communication, since I learn a lot by reading these discussions and they are then available in searches when I have a similar problem.
Samuel Gendler wrote:
> Ben Guilherme Fonseca wrote:
>> Do you have experience with this, Maciek? Could I send email directly to you
>> regarding the Wireshark questions?
> I can't speak for everyone else on the list, but I have no trouble ignoring a
> thread that I'm not interested in, so I much prefer that people don't take
> things to private communication, since I learn a lot by reading these
> discussions and they are then available in searches when I have a similar problem.
+1
--
Lew
> +1
Alright, then. I'm no Wireshark expert (or a TCP expert, for that matter), but I'll try to help on-list unless others complain.
Israel,
What I *typically* do is listen on all interfaces, and set the filter to pgsql (which looks only for PostgreSQL protocol messages). When running on a different port, you'll need to specify that explicitly or the pgsql filter will ignore the TCP conversation. E.g., a filter like
tcp.dstport == 4444 or tcp.srcport == 4444 and pgsql
should work.
From there, it's a matter of getting moderately familiar with the wire protocol [1] and inspecting the messages Wireshark shows you in the packet details / packet bytes panes.
[1]: http://developer.postgresql.org/pgdocs/postgres/protocol.html
---
Maciek Sakrejda | System Architect | Truviso
+1 too. I just asked because it's common to get shot at on mailing lists after going a little off-topic.
Well, after tons of tests using my brand new Wireshark skills (thanks Maciek), I got a very strange result (even stranger than before):
I created a new database for the tests: one 'Person' table, two columns (id, name), more than 7000 rows.
The traffic difference was:
Java 220861 bytes
Python 29014 bytes
A difference of about 8x. Then I ran the same test with my proxy implementation and got similar results. So... that's strange. Very strange.
It's very likely that I'm doing something wrong, but I've done and redone all the tests so many times by now that I'm starting to accept this.
My ONLY lead for an explanation is this postgres log (debug5; DEPURAÇÃO -> DEBUG, análise de -> parse, ligação de ... para -> bind ... to, desconexão -> disconnection):
JAVA
simpletests DEPURAÇÃO: análise de <unnamed>: select * from "Person"
simpletests DEPURAÇÃO: StartTransactionCommand
simpletests DEPURAÇÃO: StartTransaction
simpletests DEPURAÇÃO: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
simpletests DEPURAÇÃO: ligação de <unnamed> para <unnamed>
simpletests LOG: executar <unnamed>: select * from "Person"
simpletests DEPURAÇÃO: CommitTransactionCommand
simpletests DEPURAÇÃO: CommitTransaction
simpletests DEPURAÇÃO: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
simpletests DEPURAÇÃO: shmem_exit(0): 6 callbacks to make
simpletests DEPURAÇÃO: proc_exit(0): 4 callbacks to make
simpletests LOG: desconexão: tempo da sessão: 0:00:12.172 usuário=postgres banco de dados=simpletests máquina=localhost port=56401
simpletests DEPURAÇÃO: exit(0)
simpletests DEPURAÇÃO: shmem_exit(-1): 0 callbacks to make
simpletests DEPURAÇÃO: proc_exit(-1): 0 callbacks to make
PYTHON
simpletests DEPURAÇÃO: StartTransactionCommand
simpletests DEPURAÇÃO: StartTransaction
simpletests DEPURAÇÃO: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
simpletests LOG: comando: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED
simpletests DEPURAÇÃO: ProcessUtility
simpletests DEPURAÇÃO: CommitTransactionCommand
simpletests DEPURAÇÃO: StartTransactionCommand
simpletests DEPURAÇÃO: ProcessUtility
simpletests DEPURAÇÃO: CommitTransactionCommand
simpletests DEPURAÇÃO: StartTransactionCommand
simpletests LOG: comando: select * from "Person"
simpletests DEPURAÇÃO: CommitTransactionCommand
The Python log looks somewhat cleaner, and Java got this:
simpletests DEPURAÇÃO: shmem_exit(0): 6 callbacks to make
simpletests DEPURAÇÃO: proc_exit(0): 4 callbacks to make
10 extra callbacks? I don't have any idea what this is about, but maybe it means something.
I hosted all the files and source code for this test case on Google Code. If you are using Ubuntu, it will probably take just 10 minutes to run everything and see this with your own eyes (and hopefully someone will say "you idiot, you did THAT <code> wrong").
http://orm-native-comparative.googlecode.com/files/tests.zip
Thanks in advance,
Israel
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
On 13 May 2011 16:38, Israel Ben Guilherme Fonseca <israel.bgf@gmail.com> wrote:
>> +1 too. I just asked because it's common to get shot at on mailing lists
>> after going a little off-topic.
>
> Well, after tons of tests using my brand new Wireshark skills (thanks
> Maciek), I got a very strange result (even stranger than before):
>
> I created a new database for the tests: one 'Person' table, two columns (id,
> name), more than 7000 rows.
>
> The traffic difference was:
>
> Java 220861 bytes
> Python 29014 bytes
Well, your next step should be to compare the two Wireshark captures and see what's being done differently in the two cases.
You can run the JDBC driver with loglevel=2 and it'll tell you what it is doing, too. But really, if what you care about is what's on the network, then Wireshark is the right tool for that job.
I would be suspicious of your Python measurements, FWIW. 29kB for 7k rows implies you're only receiving ~4 bytes per row, which seems far too low.
Oliver
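The bytes-per-row argument can be checked with a few lines of arithmetic. The sketch below assumes exactly 7000 rows (the thread only says "more than 7000") and uses the DataRow layout from the protocol documentation: a type byte, an int32 length, an int16 column count, and an int32 length prefix per column, before any actual column data. The function names are illustrative.

```python
ROWS = 7000            # assumption: the thread only says "more than 7000"
JAVA_BYTES = 220861
PYTHON_BYTES = 29014


def bytes_per_row(total_bytes: int, rows: int = ROWS) -> float:
    """Average number of captured bytes available per returned row."""
    return total_bytes / rows


def min_datarow_size(n_columns: int) -> int:
    """Framing overhead of one DataRow message, excluding the column data."""
    return 1 + 4 + 2 + 4 * n_columns


print(f"Python: {bytes_per_row(PYTHON_BYTES):.1f} bytes/row")
print(f"Java:   {bytes_per_row(JAVA_BYTES):.1f} bytes/row")
print(f"DataRow framing for 2 columns: {min_datarow_size(2)} bytes")
```

With two columns, the framing alone needs more bytes per row than the Python capture contains in total, so the Python figure cannot be an uncompressed plain-text result set of that size.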
Well, I'm suspicious too. :)
The logs that I pasted in the previous mail were at debug5, but you mean the client driver itself? I'll take a look.
About Wireshark itself: in the archive that I linked, I included two screenshots of the measurement. At a rough glance, you can easily tell that the packets are somewhat different.
The Java one has a lot of /D/D/D/D
Maybe it's about this:
http://developer.postgresql.org/pgdocs/postgres/protocol-error-fields.html
On 13 May 2011 23:53, Israel Ben Guilherme Fonseca <israel.bgf@gmail.com> wrote:
> The Java one has a lot of /D/D/D/D
>
> Maybe it's about this:
>
> http://developer.postgresql.org/pgdocs/postgres/protocol-error-fields.html
D is just a DataRow message, you get one of those per row returned. Wireshark is telling you that there are many DataRow messages within a single TCP segment. Drill down into the packet contents and take a look.
Oliver
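What Wireshark does when it shows D/D/D/D can be approximated by walking the backend byte stream message by message. The sketch below assumes an already reassembled, unencrypted server-to-client stream (a message may span TCP segments, and an SSL-encrypted stream would parse as garbage); `count_messages` and `data_row` are hypothetical helpers, with `data_row` only there to build sample input.

```python
import struct
from collections import Counter


def count_messages(stream: bytes) -> Counter:
    """Count backend messages by type byte ('D' = DataRow, 'C' = CommandComplete, ...)."""
    counts = Counter()
    offset = 0
    while offset + 5 <= len(stream):
        msg_type = chr(stream[offset])
        (length,) = struct.unpack_from("!I", stream, offset + 1)
        if length < 4:
            break  # malformed input: the length field always includes itself
        counts[msg_type] += 1
        offset += 1 + length  # type byte + length field + payload
    return counts


def data_row(*values: bytes) -> bytes:
    """Build one DataRow message from raw column values (sample data only)."""
    body = struct.pack("!H", len(values))
    for value in values:
        body += struct.pack("!i", len(value)) + value
    return b"D" + struct.pack("!I", len(body) + 4) + body


segment = data_row(b"1", b"Ana") + data_row(b"2", b"Bob") + data_row(b"3", b"Caio")
print(count_messages(segment))
```

On a result set of about 7000 rows, the 'D' count should be close to the row count; a count near zero means the rows were not received in plain text.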
>> The Java one has a lot of /D/D/D/D
>>
>> Maybe it's about this:
>>
>> http://developer.postgresql.org/pgdocs/postgres/protocol-error-fields.html
>
> D is just a DataRow message, you get one of those per row returned.
> Wireshark is telling you that there are many DataRow messages within a
> single TCP segment.
Right. If you're *not* getting those in Python, you're not getting the data, so something is seriously wrong there. I don't know whether Python has some hidden laziness that you're running afoul of or something w.r.t. retrieving results, but that's the first place I'd look.
---
Maciek Sakrejda | System Architect | Truviso
Maciek Sakrejda <msakrejda@truviso.com> wrote:
>> D is just a DataRow message, you get one of those per row
>> returned.
> Right. If you're *not* getting those in Python, you're not getting
> the data, so something is seriously wrong there.
Perhaps python doesn't ship all the rows back on execute, but waits for the rows to be requested? If it isn't already doing it, try changing the script to read all the rows in the result set.
-Kevin
Well, I finally figured it out, thanks to you guys.
After checking the packet contents, I noticed that for the Java test I can easily read them (traffic data and select statements). That wasn't true for the Python test; the content showed up as a bizarre sequence of characters.
So as you guys argued, it was indeed SSL-encrypted. I checked the docs, learned how to disable it, and got the same results as the Java driver.
221222 bytes (almost the same)
But here is the question: does SSL compress the data too? I enabled SSL with a self-signed certificate for the Java test, and it was indeed encrypted, but the traffic was still "big", unlike the Python version.
Any ideas?
(I'm asking this question on the psycopg2 mailing list too.)
Thank you guys again for the help.
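The message does not say which setting was used to turn SSL off. One documented option is libpq's `sslmode` connection parameter, which psycopg2 passes through in the connection string. The sketch below is an assumption about how that could look: `build_dsn` and `connect_without_ssl` are illustrative names, and the database and user names are taken from the log excerpts in this thread.

```python
def build_dsn(dbname, user, host="localhost", port=5432, sslmode="disable"):
    """Build a libpq-style connection string; sslmode=disable turns SSL off."""
    return (
        f"dbname={dbname} user={user} host={host} "
        f"port={port} sslmode={sslmode}"
    )


def connect_without_ssl(dbname, user, **kwargs):
    """Open a psycopg2 connection with SSL disabled (needs psycopg2 installed)."""
    import psycopg2  # third-party; imported here so the helper above works without it

    return psycopg2.connect(build_dsn(dbname, user, **kwargs))


# Port 4444 is the proxy port used in the thread's tests.
print(build_dsn("simpletests", "postgres", port=4444))
```

With SSL off on both sides, the two drivers can be compared on plain-text traffic, which is the like-for-like measurement the earlier replies were asking for.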
On Fri, 2011-05-13 at 13:40 -0300, Israel Ben Guilherme Fonseca wrote:
> Well, I finally figured it out, thanks to you guys.
>
> After checking the packet contents, I noticed that for the Java test
> I can easily read them (traffic data and select statements).
> That wasn't true for the Python test; the content showed up as a bizarre
> sequence of characters.
>
> So as you guys argued, it was indeed SSL-encrypted. I checked the docs,
> learned how to disable it, and got the same results as the Java
> driver.
>
> 221222 bytes (almost the same)
>
> But here is the question: does SSL compress the data too?
http://httpd.apache.org/docs/2.0/ssl/ssl_faq.html#comp
PostgreSQL and libpq use OpenSSL, so depending on the version of SSL used, you may get the compression part automatically.
JDBC's SSL support may not (yet) include it.
> I enabled SSL with a self-signed certificate for the Java test, and it was
> indeed encrypted, but the traffic was still "big", unlike the
> Python version.
>
> Any ideas?
>
> (I'm asking this question on the psycopg2 mailing list too.)
>
> Thank you guys again for the help.
--
-------
Hannu Krosing
PostgreSQL Infinite Scalability and Performance Consultant
PG Admin Book: http://www.2ndQuadrant.com/books/
Israel Ben Guilherme Fonseca <israel.bgf@gmail.com> wrote:
> But here is the question: does SSL compress the data too?
It can, if that is negotiated properly:
http://httpd.apache.org/docs/2.0/ssl/ssl_faq.html#comp
This makes sense, because any attempt to compress binary encrypted data will not buy much.
> I enabled SSL with a self-signed certificate for the Java test, and it was
> indeed encrypted, but the traffic was still "big", unlike
> the Python version.
>
> Any ideas?
Apparently the JDBC driver isn't attempting to negotiate compression. I don't know how hard that would be to change.
-Kevin
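The point about compressing before encrypting can be illustrated with zlib from the standard library. This is a stand-in, not what OpenSSL does internally: the made-up rows imitate a repetitive (id, name) result set, and random bytes from `os.urandom` stand in for already-encrypted data, which looks like noise to a compressor.

```python
import os
import zlib


def sample_rows(n: int) -> bytes:
    """Made-up, repetitive (id, name) rows resembling a small result set."""
    return b"".join(f"{i}\tPerson number {i}\n".encode() for i in range(n))


def compressed_size(data: bytes) -> int:
    """Size of the data after DEFLATE compression at zlib's default level."""
    return len(zlib.compress(data))


plain = sample_rows(7000)
noise = os.urandom(len(plain))  # stand-in for ciphertext

print("plain rows:      ", len(plain), "->", compressed_size(plain))
print("random 'cipher': ", len(noise), "->", compressed_size(noise))
```

Repetitive row data shrinks substantially, while the random buffer does not shrink at all. That is consistent with a several-fold difference between a compressed SSL session and an uncompressed one, and it is why compression has to happen inside the SSL layer, before encryption.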
Interesting,
I don't know about the Java implementation, but psycopg2 is built on top of libpq, so that's why the SSL compression is working.
--
I didn't measure the CPU cost of compressing/encrypting the data, but 8 times less traffic looks like a good performance boost since (I think) the network is usually the bottleneck for web systems.
Is there any plan to support it? And if not, is there a place to file this as a feature request (or is this thread automatically a feature request)?
Thanks again for all the help,
Israel
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Israel Ben Guilherme Fonseca <israel.bgf@gmail.com> wrote:
>> But here is the question: does SSL compress the data too?
> It can, if that is negotiated properly:
> http://httpd.apache.org/docs/2.0/ssl/ssl_faq.html#comp
> Apparently the JDBC driver isn't attempting to negotiate
> compression. I don't know how hard that would be to change.
What surprises me here is not so much that JDBC doesn't enable SSL compression as that the python case does. I remember people complaining that libpq doesn't enable SSL compression, which is unsurprising given what it says on that FAQ page. Is the python test not using a libpq-based client library?
regards, tom lane
> I don't know about the Java implementation, but psycopg2 is built on top of libpq,
> so that's why the SSL compression is working.
The jdbc driver does not depend on libpq--it is a standalone wire protocol implementation based on Java Sockets.
---
Maciek Sakrejda | System Architect | Truviso
Tom ->
http://initd.org/psycopg/features/
"psycopg is written mostly in C and wraps the libpq library with the result of being both fast and secure."
So it's always using the libpq library.
Maciek ->
A much more portable solution indeed.
Maciek, another question about Wireshark.
Is there any way to control it from Java code? Something like:
// Start Wireshark normally with some filter
// Java code starts here
Shell.run("clear") // clear the data received until now
// Some process here
double v = Shell.run("get bytes")
Shell.run("clear") // clear the data received until now
---
With that I could automate my tests using Wireshark instead of my old proxy. Do you have any idea how to achieve that?
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Is the python test not using a libpq-based client library?
It must be. I've not built psycopg2 for 9.0.4 [*] but here goes:
johann@asuka:postgresql/9.0.3% cd lib/python2.6/site-packages/psycopg2/64
johann@asuka:psycopg2/64% ldd _psycopg.so
libpython2.6.so.1.0 => /usr/lib/64/libpython2.6.so.1.0
libpq.so.5 => /opt/myrkraverk/postgresql/9.0.3/lib/amd64/libpq.so.5
[...]
[*] I keep each release and dependent software in its own prefix directory, even though it may not be necessary for things like psycopg2.
--
Johann Oskarsson
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Blog: http://my.opera.com/myrkraverk/blog/
> Maciek ->
> A much more portable solution indeed.
Sure, but there are trade-offs--you have to reinvent the wheel (or find existing reimplementations of the different pieces of the wheel).
> Maciek, another question about Wireshark.
> Is there any way to control it from Java code?
You don't want to control a GUI program from another program if you can help it. Wireshark relies on libpcap, a library for listening to (and interpreting) network traffic. Jpcap [1] is a Java wrapper for libpcap. You *should* be able to do everything you need through Jpcap. It doesn't understand the PostgreSQL wire protocol like Wireshark does, but if you need to do that, it might be easier to write your own wrapper rather than try to control Wireshark programmatically.
[1]: http://netresearch.ics.uci.edu/kfujii/Jpcap/doc/
Thanks,
---
Maciek Sakrejda | System Architect | Truviso
I implemented a language-agnostic way of testing these traffic issues with tshark and packet generation time. Thanks for all the help.
--
So, for the time being, there's no way of compressing the traffic, right?
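The tshark-based setup is not shown in the thread, so the following is only a guess at one way to script such a measurement: read a saved capture with tshark, print the length of each matching frame, and add the lengths up. The function names and the default filter are assumptions; `-r`, `-Y`, `-T fields` and `-e frame.len` are standard tshark options, but older releases used `-R` instead of `-Y` for display filters.

```python
import subprocess


def sum_frame_lengths(tshark_output: str) -> int:
    """Add up one frame length per line, ignoring blank lines."""
    return sum(int(line) for line in tshark_output.splitlines() if line.strip())


def capture_total_bytes(pcap_path: str,
                        display_filter: str = "tcp.port == 4444") -> int:
    """Total bytes on the wire for frames in pcap_path matching the filter."""
    result = subprocess.run(
        ["tshark", "-r", pcap_path, "-Y", display_filter,
         "-T", "fields", "-e", "frame.len"],
        capture_output=True, text=True, check=True,
    )
    return sum_frame_lengths(result.stdout)


# Parsing can be exercised without tshark installed:
print(sum_frame_lengths("66\n1514\n\n74\n"))
```

Because this works on a capture file rather than on a running GUI, the same script can measure a Java, Python, or any other client, and frame lengths include TCP/IP headers, unlike the byte counts from a socket-level proxy.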