Thread: performance question: protocol v2 vs v3
Hello, I have conducted tests that seem to indicate that using protocol v2 gives slightly better insert performance as using protocol v3. - Database product name: PostgreSQL - Database product version: 9.3.4 - Database driver name: PostgreSQL Native Driver - Database driver version: PostgreSQL 9.3 JDBC4 (build 1101) - JDBC major version: 4 - JDBC minor version: 0 - java.runtime.version: 1.8.0_05-b13 - os.name: Linux - os.arch: amd64 - os.version: 2.6.38.7-desktop-1mnb2 - tomcat 8.0.8 Most PG config default except: shared_buffers = 512MB vacuum_cost_delay = 50 vacuum_cost_page_hit = 1 vacuum_cost_page_miss = 10 vacuum_cost_page_dirty = 20 vacuum_cost_limit = 1000 track_counts = on autovacuum = on autovacuum_vacuum_cost_delay = 100 synchronous_commit = off random_page_cost = 2 effective_cache_size = 1536MB Hardware is a random desktop computer of 2011. Test is a mono threaded insertion of 10,000 entries into each of two tables (insert #1 into table a, insert #1 into table b, insert #2 into table a, etc): Table "public.a" Column | Type | Modifiers --------------------+--------------------------+------------------------------------------------------------- uid | integer | not null default nextval('a_uid_seq'::regclass) cola | character varying(15) | colb | character varying(25) | colc | text | cold | character varying(25) | cole | text | colf | character varying(25) | colg | character varying(128) | colh | integer | coli | timestamp with time zone | not null colj | timestamp with time zone | Indexes: "a_pkey1" PRIMARY KEY, btree (uid) "idx_a_coli" btree (coli) "idx_a_coli_colc" btree (coli, colc) "idx_a_coli_cold_colg" btree (coli, cold, colg) Referenced by: TABLE "b" CONSTRAINT "fk_a_uid" FOREIGN KEY (a_uid) REFERENCES a(uid) ON DELETE CASCADE Table "public.b" Column | Type | Modifiers ------------------------+----------------------+----------- a_uid | integer | not null cola | text | colb | text | colc | integer | cold | text | not null cole | character varying(3) | colf | integer | colg | integer | colh | integer | coli | integer | colj | text | Indexes: "b_pkey2" PRIMARY KEY, btree (a_uid) Foreign-key constraints: "fk_a_uid" FOREIGN KEY (a_uid) REFERENCES a(uid) ON DELETE CASCADE Using a prepared statement on an autocommit=true connection: ps = conn.prepareStatement( "INSERT INTO a( ... ) VALUES ( ?, ... )", ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY); ps.setString( 1, ... ) ... synchronized( ps.getConnection() ) { ps.executeUpdate() } Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=2, clock time is: #1: 00:58.519 #2: 00:56.678 #3: 00:56.222 Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=3, clock time is: #1: 01:01.404 #2: 00:59.331 #3: 01:00.091 I know this is not massive. However, I was about to switch from protocol v2 to protocol v3 wholly, but now, I'm wondering if anyone can give any insight on this. Also, is there any known downsides in sticking to protocol v2 - since it's very old now. Thanks -- Guillaume Cottenceau
The V3 protocol does involve a few more steps so yes, it might be slower.
As for downside to V2.
It may get deprecated completely.
It is vulnerable to SQL injection.
It cannot use binary transfer as everything is sent over the wire as text
If you really want to speed up inserts use copy.
On 25 November 2014 at 05:05, Guillaume Cottenceau <gc@mnc.ch> wrote:
Hello,
I have conducted tests that seem to indicate that using protocol
v2 gives slightly better insert performance as using protocol v3.
- Database product name: PostgreSQL
- Database product version: 9.3.4
- Database driver name: PostgreSQL Native Driver
- Database driver version: PostgreSQL 9.3 JDBC4 (build 1101)
- JDBC major version: 4
- JDBC minor version: 0
- java.runtime.version: 1.8.0_05-b13
- os.name: Linux
- os.arch: amd64
- os.version: 2.6.38.7-desktop-1mnb2
- tomcat 8.0.8
Most PG config default except:
shared_buffers = 512MB
vacuum_cost_delay = 50
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 1000
track_counts = on
autovacuum = on
autovacuum_vacuum_cost_delay = 100
synchronous_commit = off
random_page_cost = 2
effective_cache_size = 1536MB
Hardware is a random desktop computer of 2011.
Test is a mono threaded insertion of 10,000 entries into each of
two tables (insert #1 into table a, insert #1 into table b,
insert #2 into table a, etc):
Table "public.a"
Column | Type | Modifiers
--------------------+--------------------------+-------------------------------------------------------------
uid | integer | not null default nextval('a_uid_seq'::regclass)
cola | character varying(15) |
colb | character varying(25) |
colc | text |
cold | character varying(25) |
cole | text |
colf | character varying(25) |
colg | character varying(128) |
colh | integer |
coli | timestamp with time zone | not null
colj | timestamp with time zone |
Indexes:
"a_pkey1" PRIMARY KEY, btree (uid)
"idx_a_coli" btree (coli)
"idx_a_coli_colc" btree (coli, colc)
"idx_a_coli_cold_colg" btree (coli, cold, colg)
Referenced by:
TABLE "b" CONSTRAINT "fk_a_uid" FOREIGN KEY (a_uid) REFERENCES a(uid) ON DELETE CASCADE
Table "public.b"
Column | Type | Modifiers
------------------------+----------------------+-----------
a_uid | integer | not null
cola | text |
colb | text |
colc | integer |
cold | text | not null
cole | character varying(3) |
colf | integer |
colg | integer |
colh | integer |
coli | integer |
colj | text |
Indexes:
"b_pkey2" PRIMARY KEY, btree (a_uid)
Foreign-key constraints:
"fk_a_uid" FOREIGN KEY (a_uid) REFERENCES a(uid) ON DELETE CASCADE
Using a prepared statement on an autocommit=true connection:
ps = conn.prepareStatement( "INSERT INTO a( ... ) VALUES ( ?, ... )", ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY );
ps.setString( 1, ... )
...
synchronized( ps.getConnection() ) {
ps.executeUpdate()
}
Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=2,
clock time is:
#1: 00:58.519
#2: 00:56.678
#3: 00:56.222
Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=3,
clock time is:
#1: 01:01.404
#2: 00:59.331
#3: 01:00.091
I know this is not massive. However, I was about to switch from
protocol v2 to protocol v3 wholly, but now, I'm wondering if
anyone can give any insight on this. Also, is there any known
downsides in sticking to protocol v2 - since it's very old now.
Thanks
--
Guillaume Cottenceau
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc
Guillaume Cottenceau wrote: > I have conducted tests that seem to indicate that using protocol > v2 gives slightly better insert performance as using protocol v3. [...] > Using a prepared statement on an autocommit=true connection: [...] > Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=2, > clock time is: > > #1: 00:58.519 > #2: 00:56.678 > #3: 00:56.222 > > Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=3, > clock time is: > > #1: 01:01.404 > #2: 00:59.331 > #3: 01:00.091 > > I know this is not massive. However, I was about to switch from > protocol v2 to protocol v3 wholly, but now, I'm wondering if > anyone can give any insight on this. Also, is there any known > downsides in sticking to protocol v2 - since it's very old now. An explanation is maybe that prepared statements are handled using the extended query protocol, which only exists in v3. With the extended query protocol, each statement is first parsed, then parameters are bound to the statement and then the statement is executed, leading to three client-server round trips. With protocol v2, the JDBC driver converts the statements to simple statements that contain the parameters, so there will be fewer client-server round trips. Do you you call conn.prepareStatement for each statement, or do you reuse the prepared statements? In the latter case, you might see better performance with protocol v3. Yours, Laurenz Albe
Albe Laurenz <laurenz.albe 'at' wien.gv.at> writes: > Do you you call conn.prepareStatement for each statement, or do > you reuse the prepared statements? In the latter case, you might see > better performance with protocol v3. I recreate one each time. Makes sense, thanks. -- Guillaume Cottenceau
On 11/25/2014 06:45 PM, Albe Laurenz wrote: > An explanation is maybe that prepared statements are handled > using the extended query protocol, which only exists in v3. > With the extended query protocol, each statement is first parsed, > then parameters are bound to the statement and then the statement > is executed, leading to three client-server round trips. While Parse/Bind/Execute are separate steps, it is not correct to say that this means three round trips. Take a look at the conversation on the wire in Wireshark. You'll see that generally, the Parse, Bind, Describe, Execute and Sync messages are all actually sent in the same packet. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer wrote: > On 11/25/2014 06:45 PM, Albe Laurenz wrote: > > An explanation is maybe that prepared statements are handled > > using the extended query protocol, which only exists in v3. > > With the extended query protocol, each statement is first parsed, > > then parameters are bound to the statement and then the statement > > is executed, leading to three client-server round trips. > While Parse/Bind/Execute are separate steps, it is not correct to say > that this means three round trips. > > Take a look at the conversation on the wire in Wireshark. You'll see > that generally, the Parse, Bind, Describe, Execute and Sync messages are > all actually sent in the same packet. I didn't know that, thanks for the clarification. Yours, Laurenz Albe
On 12/03/2014 07:41 PM, Albe Laurenz wrote: > I didn't know that, thanks for the clarification. Neither did I until I did some performance analysis work recently. It's not always strictly true - in particular, if you're using binary parameters, PgJDBC does a parse and describe, then waits for a response before sending a bind. So there's an extra round trip there. Same if you're doing a batch returning generated keys. In general though, the v3 protocol doesn't add roundtrips. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services