Thread: performance question: protocol v2 vs v3

performance question: protocol v2 vs v3

From
Guillaume Cottenceau
Date:
Hello,

I have conducted tests that seem to indicate that using protocol
v2 gives slightly better insert performance as using protocol v3.

- Database product name: PostgreSQL
- Database product version: 9.3.4
- Database driver name: PostgreSQL Native Driver
- Database driver version: PostgreSQL 9.3 JDBC4 (build 1101)
- JDBC major version: 4
- JDBC minor version: 0
- java.runtime.version: 1.8.0_05-b13
- os.name: Linux
- os.arch: amd64
- os.version: 2.6.38.7-desktop-1mnb2
- tomcat 8.0.8

Most PG config default except:

    shared_buffers = 512MB
    vacuum_cost_delay = 50
    vacuum_cost_page_hit = 1
    vacuum_cost_page_miss = 10
    vacuum_cost_page_dirty = 20
    vacuum_cost_limit = 1000
    track_counts = on
    autovacuum = on
    autovacuum_vacuum_cost_delay = 100
    synchronous_commit = off
    random_page_cost = 2
    effective_cache_size = 1536MB

Hardware is a random desktop computer of 2011.

Test is a mono threaded insertion of 10,000 entries into each of
two tables (insert #1 into table a, insert #1 into table b,
insert #2 into table a, etc):

                                        Table "public.a"
       Column       |           Type           |                          Modifiers
--------------------+--------------------------+-------------------------------------------------------------
 uid                | integer                  | not null default nextval('a_uid_seq'::regclass)
 cola               | character varying(15)    |
 colb               | character varying(25)    |
 colc               | text                     |
 cold               | character varying(25)    |
 cole               | text                     |
 colf               | character varying(25)    |
 colg               | character varying(128)   |
 colh               | integer                  |
 coli               | timestamp with time zone | not null
 colj               | timestamp with time zone |
Indexes:
    "a_pkey1" PRIMARY KEY, btree (uid)
    "idx_a_coli" btree (coli)
    "idx_a_coli_colc" btree (coli, colc)
    "idx_a_coli_cold_colg" btree (coli, cold, colg)
Referenced by:
    TABLE "b" CONSTRAINT "fk_a_uid" FOREIGN KEY (a_uid) REFERENCES a(uid) ON DELETE CASCADE

             Table "public.b"
         Column         |         Type         | Modifiers
------------------------+----------------------+-----------
 a_uid                  | integer              | not null
 cola                   | text                 |
 colb                   | text                 |
 colc                   | integer              |
 cold                   | text                 | not null
 cole                   | character varying(3) |
 colf                   | integer              |
 colg                   | integer              |
 colh                   | integer              |
 coli                   | integer              |
 colj                   | text                 |
Indexes:
    "b_pkey2" PRIMARY KEY, btree (a_uid)
Foreign-key constraints:
    "fk_a_uid" FOREIGN KEY (a_uid) REFERENCES a(uid) ON DELETE CASCADE

Using a prepared statement on an autocommit=true connection:

    ps = conn.prepareStatement( "INSERT INTO a( ... ) VALUES ( ?, ... )", ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_READ_ONLY); 
    ps.setString( 1, ... )
    ...
    synchronized( ps.getConnection() ) {
        ps.executeUpdate()
    }

Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=2,
clock time is:

#1: 00:58.519
#2: 00:56.678
#3: 00:56.222

Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=3,
clock time is:

#1: 01:01.404
#2: 00:59.331
#3: 01:00.091

I know this is not massive. However, I was about to switch from
protocol v2 to protocol v3 wholly, but now, I'm wondering if
anyone can give any insight on this. Also, is there any known
downsides in sticking to protocol v2 - since it's very old now.

Thanks

--
Guillaume Cottenceau


Re: performance question: protocol v2 vs v3

From
Dave Cramer
Date:
The V3 protocol does involve a few more steps so yes, it might be slower.

As for downside to V2.

It may get deprecated completely.
It is vulnerable to SQL injection.
It cannot use binary transfer as everything is sent over the wire as text


If you really want to speed up inserts use copy.


Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 25 November 2014 at 05:05, Guillaume Cottenceau <gc@mnc.ch> wrote:
Hello,

I have conducted tests that seem to indicate that using protocol
v2 gives slightly better insert performance as using protocol v3.

- Database product name: PostgreSQL
- Database product version: 9.3.4
- Database driver name: PostgreSQL Native Driver
- Database driver version: PostgreSQL 9.3 JDBC4 (build 1101)
- JDBC major version: 4
- JDBC minor version: 0
- java.runtime.version: 1.8.0_05-b13
- os.name: Linux
- os.arch: amd64
- os.version: 2.6.38.7-desktop-1mnb2
- tomcat 8.0.8

Most PG config default except:

    shared_buffers = 512MB
    vacuum_cost_delay = 50
    vacuum_cost_page_hit = 1
    vacuum_cost_page_miss = 10
    vacuum_cost_page_dirty = 20
    vacuum_cost_limit = 1000
    track_counts = on
    autovacuum = on
    autovacuum_vacuum_cost_delay = 100
    synchronous_commit = off
    random_page_cost = 2
    effective_cache_size = 1536MB

Hardware is a random desktop computer of 2011.

Test is a mono threaded insertion of 10,000 entries into each of
two tables (insert #1 into table a, insert #1 into table b,
insert #2 into table a, etc):

                                        Table "public.a"
       Column       |           Type           |                          Modifiers
--------------------+--------------------------+-------------------------------------------------------------
 uid                | integer                  | not null default nextval('a_uid_seq'::regclass)
 cola               | character varying(15)    |
 colb               | character varying(25)    |
 colc               | text                     |
 cold               | character varying(25)    |
 cole               | text                     |
 colf               | character varying(25)    |
 colg               | character varying(128)   |
 colh               | integer                  |
 coli               | timestamp with time zone | not null
 colj               | timestamp with time zone |
Indexes:
    "a_pkey1" PRIMARY KEY, btree (uid)
    "idx_a_coli" btree (coli)
    "idx_a_coli_colc" btree (coli, colc)
    "idx_a_coli_cold_colg" btree (coli, cold, colg)
Referenced by:
    TABLE "b" CONSTRAINT "fk_a_uid" FOREIGN KEY (a_uid) REFERENCES a(uid) ON DELETE CASCADE

             Table "public.b"
         Column         |         Type         | Modifiers
------------------------+----------------------+-----------
 a_uid                  | integer              | not null
 cola                   | text                 |
 colb                   | text                 |
 colc                   | integer              |
 cold                   | text                 | not null
 cole                   | character varying(3) |
 colf                   | integer              |
 colg                   | integer              |
 colh                   | integer              |
 coli                   | integer              |
 colj                   | text                 |
Indexes:
    "b_pkey2" PRIMARY KEY, btree (a_uid)
Foreign-key constraints:
    "fk_a_uid" FOREIGN KEY (a_uid) REFERENCES a(uid) ON DELETE CASCADE

Using a prepared statement on an autocommit=true connection:

    ps = conn.prepareStatement( "INSERT INTO a( ... ) VALUES ( ?, ... )", ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY );
    ps.setString( 1, ... )
    ...
    synchronized( ps.getConnection() ) {
        ps.executeUpdate()
    }

Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=2,
clock time is:

#1: 00:58.519
#2: 00:56.678
#3: 00:56.222

Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=3,
clock time is:

#1: 01:01.404
#2: 00:59.331
#3: 01:00.091

I know this is not massive. However, I was about to switch from
protocol v2 to protocol v3 wholly, but now, I'm wondering if
anyone can give any insight on this. Also, is there any known
downsides in sticking to protocol v2 - since it's very old now.

Thanks

--
Guillaume Cottenceau


--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: performance question: protocol v2 vs v3

From
Albe Laurenz
Date:
Guillaume Cottenceau wrote:
> I have conducted tests that seem to indicate that using protocol
> v2 gives slightly better insert performance as using protocol v3.

[...]

> Using a prepared statement on an autocommit=true connection:

[...]

> Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=2,
> clock time is:
> 
> #1: 00:58.519
> #2: 00:56.678
> #3: 00:56.222
> 
> Using jdbc:postgresql:dbname?charSet=UTF8&protocolVersion=3,
> clock time is:
> 
> #1: 01:01.404
> #2: 00:59.331
> #3: 01:00.091
> 
> I know this is not massive. However, I was about to switch from
> protocol v2 to protocol v3 wholly, but now, I'm wondering if
> anyone can give any insight on this. Also, is there any known
> downsides in sticking to protocol v2 - since it's very old now.

An explanation is maybe that prepared statements are handled
using the extended query protocol, which only exists in v3.
With the extended query protocol, each statement is first parsed,
then parameters are bound to the statement and then the statement
is executed, leading to three client-server round trips.

With protocol v2, the JDBC driver converts the statements
to simple statements that contain the parameters, so there will
be fewer client-server round trips.

Do you you call conn.prepareStatement for each statement, or do
you reuse the prepared statements?  In the latter case, you might see
better performance with protocol v3.

Yours,
Laurenz Albe

Re: performance question: protocol v2 vs v3

From
Guillaume Cottenceau
Date:
Albe Laurenz <laurenz.albe 'at' wien.gv.at> writes:

> Do you you call conn.prepareStatement for each statement, or do
> you reuse the prepared statements?  In the latter case, you might see
> better performance with protocol v3.

I recreate one each time. Makes sense, thanks.

--
Guillaume Cottenceau


Re: performance question: protocol v2 vs v3

From
Craig Ringer
Date:
On 11/25/2014 06:45 PM, Albe Laurenz wrote:

> An explanation is maybe that prepared statements are handled
> using the extended query protocol, which only exists in v3.
> With the extended query protocol, each statement is first parsed,
> then parameters are bound to the statement and then the statement
> is executed, leading to three client-server round trips.

While Parse/Bind/Execute are separate steps, it is not correct to say
that this means three round trips.

Take a look at the conversation on the wire in Wireshark. You'll see
that generally, the Parse, Bind, Describe, Execute and Sync messages are
all actually sent in the same packet.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: performance question: protocol v2 vs v3

From
Albe Laurenz
Date:
Craig Ringer wrote:
> On 11/25/2014 06:45 PM, Albe Laurenz wrote:
> > An explanation is maybe that prepared statements are handled
> > using the extended query protocol, which only exists in v3.
> > With the extended query protocol, each statement is first parsed,
> > then parameters are bound to the statement and then the statement
> > is executed, leading to three client-server round trips.

> While Parse/Bind/Execute are separate steps, it is not correct to say
> that this means three round trips.
> 
> Take a look at the conversation on the wire in Wireshark. You'll see
> that generally, the Parse, Bind, Describe, Execute and Sync messages are
> all actually sent in the same packet.

I didn't know that, thanks for the clarification.

Yours,
Laurenz Albe

Re: performance question: protocol v2 vs v3

From
Craig Ringer
Date:
On 12/03/2014 07:41 PM, Albe Laurenz wrote:
> I didn't know that, thanks for the clarification.

Neither did I until I did some performance analysis work recently.

It's not always strictly true - in particular, if you're using binary
parameters, PgJDBC does a parse and describe, then waits for a response
before sending a bind. So there's an extra round trip there. Same if
you're doing a batch returning generated keys.

In general though, the v3 protocol doesn't add roundtrips.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services