Thread: How to improve cockroach performance with pgbench?

From: Fabien COELHO

Hello,

I've been playing with CockroachDB, a distributed database system which is
more or less compatible with Postgres because it implements the same
network protocol. Because of this compatibility, I have used pgbench to
set up and run some tests on various AWS VMs (5 identical VMs, going up to
a total of 80 vCPUs in the system).

The general behavior and ease of use is great. Data are shared between 
nodes, adding a new node makes the system automatically replicate and 
balance the data, wow. Also, the provided web interface is quite nice and 
gives hints about what is happening. They implement an automatic retry
feature so that when a transaction fails it is retried without the client
needing to know about it.

All this is impressive, but performance-wise I ran into a few issues and/or
questions:

  - Loading data with a COPY (pgbench -i) is pretty slow, typically 3
    seconds per scale whereas on a basic postgres I would get 0.3 seconds
    per scale. Should I expect better performance, or is this the expected
    performance that can be achieved because of the automatic (automagic)
    replication performed by cockroach? Would it be better if I generated
    data from several connections (hmmm, pgbench does not know how to do
    that, but the tool could be improved if it is worth it)?

  - I'm at a loss to find the right number of client connections to
    "maximise" tps under reasonable latency. Some of my tests suggest that
    maybe 4 clients per core is the best option. For a standard postgres,
    a typical client count would be larger, typically around 8-10 per
    core.
    Is this choice reasonable for cockroach?

  - The overall performance is a little bit disappointing. Ok, this is
    a distributed system which does automatic partitioning and replication
    on serializable transactions, so obviously this quality of service must
    cost something, but I'm typically running around 10 tps per core (with
    the pgbench default transaction), so a pretty high latency, and even
    if it scales somewhat, this seems quite low.
    What am I doing wrong? What should I check?

  - Another strange thing is that the steady state at full speed is quite
    unstable: looking at instantaneous performance, the tps varies a lot,
    e.g. between 0 and 4500 tps, more or less uniformly, i.e. the standard
    deviation is large, say 1000 tps stddev for a 2000 tps average
    performance.
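
As a rough sanity check on the figures above, here is a minimal Python
sketch. It assumes that the "4 clients per core" and "10 tps per core"
figures come from the same kind of run, and that the clients issue
transactions back to back (no think time or rate limit), so that Little's
law applies: mean latency = clients / throughput. The function names are
only illustrative.

```python
def mean_latency_ms(clients: float, tps: float) -> float:
    """Little's law for a closed loop: latency = clients / throughput."""
    return 1000.0 * clients / tps


def coefficient_of_variation(stddev: float, mean: float) -> float:
    """Relative spread of the instantaneous tps around its average."""
    return stddev / mean


# Per-core figures quoted above (assumed to belong to the same run).
print(mean_latency_ms(clients=4, tps=10))    # 400.0 ms per transaction
# Steady-state figures quoted above: 1000 tps stddev, 2000 tps average.
print(coefficient_of_variation(1000, 2000))  # 0.5
```

Under these assumptions the mean latency would be around 400 ms per
transaction, and the instantaneous throughput varies by about half of its
average.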

Basically, any advice about cockroach configuration and running pgbench 
against it is welcome!
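
One way to look for the client count discussed above is to sweep the
number of clients per core and compare tps and latency across runs. The
sketch below only builds and prints the pgbench command lines; it does not
run them. The host name, database name, user, thread count and run length
are placeholders chosen for illustration, not values from my setup; 26257
is CockroachDB's default SQL port, and -n skips the VACUUM step.

```python
import shlex


def pgbench_sweep(host, cores, clients_per_core=(1, 2, 4, 8), seconds=60,
                  threads=8, port=26257, user="root", db="bench"):
    """Build one pgbench command line per clients-per-core ratio."""
    cmds = []
    for ratio in clients_per_core:
        clients = ratio * cores
        cmds.append([
            "pgbench", "-n",                     # no VACUUM before the run
            "-h", host, "-p", str(port), "-U", user,
            "-c", str(clients),                  # concurrent client connections
            "-j", str(min(threads, clients)),    # pgbench worker threads
            "-T", str(seconds),                  # run duration in seconds
            "-P", "1",                           # progress report every second
            db,
        ])
    return cmds


# 80 vCPUs in total, as in the setup described above; the host is hypothetical.
for cmd in pgbench_sweep("node1.example", cores=80):
    print(shlex.join(cmd))
```

The per-second progress output (-P 1) also gives the instantaneous tps
needed to look at the variability mentioned above.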

Thanks in advance,

-- 
Fabien.