Hi,
The recent discussion about pipelining in the JDBC driver prompted me to look
at what it would take to support pipelining in libpq.
I have a proof-of-concept patch working. The results are even more promising
than I expected.
While it's true that many applications and frameworks won't easily benefit, it
amazes me that this hasn't been explored before.
I developed a simple test application that creates a table with a single
auto-increment primary key column, then runs 4 simple queries x times each:
"INSERT INTO test() VALUES ()"
"SELECT * FROM test LIMIT 1"
"SELECT * FROM test"
"DELETE FROM test"
The parameters to testPipelinedSeries are (number of times to execute each
query, maximum number of queued queries).
Results against local server:
testPipelinedSeries(10,1) took 0.020884
testPipelinedSeries(10,3) took 0.020630, speedup 1.01
testPipelinedSeries(10,10) took 0.006265, speedup 3.33
testPipelinedSeries(100,1) took 0.042731
testPipelinedSeries(100,3) took 0.043035, speedup 0.99
testPipelinedSeries(100,10) took 0.037222, speedup 1.15
testPipelinedSeries(100,25) took 0.031223, speedup 1.37
testPipelinedSeries(100,50) took 0.032482, speedup 1.32
testPipelinedSeries(100,100) took 0.031356, speedup 1.36
Results against remote server through ssh tunnel (30-40ms RTT):
testPipelinedSeries(10,1) took 3.2461736
testPipelinedSeries(10,3) took 1.1008443, speedup 2.44
testPipelinedSeries(10,10) took 0.342399, speedup 7.19
testPipelinedSeries(100,1) took 26.25882588
testPipelinedSeries(100,3) took 8.8509234, speedup 3.04
testPipelinedSeries(100,10) took 3.2866285, speedup 9.03
testPipelinedSeries(100,25) took 2.1472847, speedup 17.57
testPipelinedSeries(100,50) took 1.957510, speedup 27.03
testPipelinedSeries(100,100) took 0.690682, speedup 37.47
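
As a rough way to reason about the remote numbers, here is a small
back-of-the-envelope model. It is only a sketch and not part of the patch:
round_trips and estimated_seconds are hypothetical helpers, and the model
assumes that all 4 * x statements form one stream, that they are sent in full
batches of at most "maximum number of queued queries", and that each batch
costs exactly one network round trip with no server or transfer time.

```c
#include <assert.h>

/* Hypothetical model, not part of the patch: number of network round
 * trips needed to run `queries` statements when up to `depth` of them
 * may be queued at once, assuming each full batch costs one round trip. */
long round_trips(long queries, long depth)
{
    assert(queries >= 0 && depth > 0);
    return (queries + depth - 1) / depth;   /* ceil(queries / depth) */
}

/* Estimated wall time in seconds if network latency is the only cost. */
double estimated_seconds(long queries, long depth, double rtt_seconds)
{
    return (double) round_trips(queries, depth) * rtt_seconds;
}
```

Under these assumptions, 400 statements at a queue depth of 100 would need 4
round trips instead of 400. The model ignores everything except latency, so it
should be read as an upper bound on the speedup rather than a prediction.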
I plan to write documentation, add regression tests, and do general cleanup
before asking for feedback on the patch itself. Any suggestions about
performance testing or API design would be welcome. I haven't played with
changing the sync logic yet, but I'm guessing that an API to allow manual sync
instead of a sync per PQsendQuery will be needed. That could make things
tricky with multi-statement queries though, because currently the only way to
detect when results change from one query to the next is a ReadyForQuery
message.
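
To illustrate the boundary problem, here is a minimal sketch that walks a
buffer of backend protocol messages and counts CommandComplete ('C') and
ReadyForQuery ('Z') messages. It is not libpq code: count_boundaries and
boundary_counts are hypothetical names, and the sketch assumes the buffer
holds only complete version 3 protocol messages (a 1-byte type followed by a
4-byte big-endian length that includes itself).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not part of libpq. */
typedef struct {
    int command_completes;  /* 'C' messages: one per finished statement */
    int ready_for_query;    /* 'Z' messages: one per simple Query or Sync */
} boundary_counts;

/* Walk the messages in buf and count 'C' and 'Z' messages.
 * Returns 0 on success, -1 if a message is truncated or malformed. */
int count_boundaries(const unsigned char *buf, size_t len,
                     boundary_counts *out)
{
    size_t pos = 0;

    assert(out != NULL);
    out->command_completes = 0;
    out->ready_for_query = 0;

    while (pos < len) {
        unsigned char type;
        uint32_t mlen;

        if (len - pos < 5)
            return -1;                      /* truncated header */
        type = buf[pos];
        mlen = ((uint32_t) buf[pos + 1] << 24) |
               ((uint32_t) buf[pos + 2] << 16) |
               ((uint32_t) buf[pos + 3] << 8) |
                (uint32_t) buf[pos + 4];
        if (mlen < 4 || mlen > len - pos - 1)
            return -1;                      /* bad or truncated length */

        if (type == 'C')
            out->command_completes++;
        else if (type == 'Z')
            out->ready_for_query++;

        pos += 1 + (size_t) mlen;           /* type byte + length + payload */
    }
    return 0;
}
```

For a single simple Query containing two statements, the server sends two
CommandComplete messages followed by one ReadyForQuery, so the two counts
differ. A client that syncs less often than once per query would need some
other way to tell which results belong to which queued query.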
Matt Newell