Thread: Benchmarking PostgreSQL?

Benchmarking PostgreSQL?

From: Ivan Voras

I'm conducting some benchmarking (mostly for fun and learning), and one
part of it is benchmarking PostgreSQL (7.4.1, on FreeBSD 4.9 and 5.2).
I'm using pgbench from the contrib directory, but I'm puzzled by the
results. I have little experience in benchmarking, but I don't think I
should be getting the scattered results I have.

- the computer is a P3 @ 933 MHz with 1 GB RAM
- I'm running pgbench with 35 clients and 50 transactions/client
- benchmark results vary by about +/- 6 TPS; they fall between 32 TPS
and 44 TPS
- the results seem to get worse over time and then suddenly jump back
to the maximum (saw-tooth like). Sometimes there is even a (very
noticeable!) pattern of more-or-less regular alternation between the
minimum and maximum values (e.g. the first measurement yields 32, the
second yields 44, the third again 32 or 31, etc.)
- running vacuumdb -z -f on the database does not influence the results
in predictable ways
- waiting (sleeping) between pgbench runs (either regular or random)
does not influence it in predictable ways
- the same patterns appear on both operating systems (FreeBSD 4.9 and
5.2) and across reinstalls (the numbers are of course somewhat different)
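
To put numbers on the run-to-run scatter and the alternation described in the list above, something like the following could be applied to a series of pgbench results. This is only a sketch: the TPS values are made up to resemble the pattern described (they are not measurements from this thread), and `lag1_autocorrelation` is a helper name chosen for the example.

```python
from statistics import mean, stdev


def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a series.

    Values close to -1 suggest a regular high/low alternation between
    consecutive runs; values close to +1 suggest a slow drift.
    """
    m = mean(values)
    denom = sum((v - m) ** 2 for v in values)
    if denom == 0:
        return 0.0  # constant series: no variation to correlate
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    return num / denom


# Hypothetical TPS readings, invented for illustration only.
tps_runs = [32.0, 44.0, 32.0, 43.0, 31.0, 44.0]

print(f"mean  = {mean(tps_runs):.1f} TPS")
print(f"stdev = {stdev(tps_runs):.1f} TPS")
print(f"lag-1 autocorrelation = {lag1_autocorrelation(tps_runs):.2f}")
```

A strongly negative lag-1 value would back up the impression of a regular minimum/maximum alternation rather than plain random noise.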

postgresql.conf contains only these active lines:
  max_connections = 40
  shared_buffers = 10000
  sort_mem = 8192
  vacuum_mem = 32768

I've used these settings as they are expected to be used under a regular
workload when the server goes into production. I've upped vacuum_mem as
it seems to shorten vacuum times dramatically (I assume the memory is
allocated only when needed, as the postgresql process is only 88 MB in
size, with 80 MB of shared buffers).
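
The memory figures above can be sanity-checked with a little arithmetic. The sketch below assumes that shared_buffers is counted in 8 kB buffers (the default block size) and that sort_mem and vacuum_mem are given in kilobytes; neither unit is stated in the post, so both are assumptions of the example.

```python
# Assumptions (not stated in the post): shared_buffers is a count of
# 8 kB buffers, while sort_mem and vacuum_mem are in kilobytes.
BLOCK_SIZE_BYTES = 8192

settings = {
    "shared_buffers": 10000,  # number of buffers
    "sort_mem": 8192,         # kB
    "vacuum_mem": 32768,      # kB
}


def to_mib(nbytes):
    """Convert a byte count to mebibytes."""
    return nbytes / (1024 * 1024)


shared_buffers_mib = to_mib(settings["shared_buffers"] * BLOCK_SIZE_BYTES)
sort_mem_mib = to_mib(settings["sort_mem"] * 1024)
vacuum_mem_mib = to_mib(settings["vacuum_mem"] * 1024)

print(f"shared_buffers ~ {shared_buffers_mib:.1f} MiB")
print(f"sort_mem       ~ {sort_mem_mib:.1f} MiB")
print(f"vacuum_mem     ~ {vacuum_mem_mib:.1f} MiB")
```

Under these assumptions the shared buffers come to roughly 78 MiB, which is in line with the "80 MB" quoted above.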

What I'm really asking here is: are these results normal? If not, can
they be improved, or is there a better off-the-shelf PostgreSQL
benchmark tool?



Re: Benchmarking PostgreSQL?

From: Tom Lane

Ivan Voras <ivoras@geri.cc.fer.hr> writes:
> I'm conducting some benchmarking (mostly for fun and learning), and one
> part of it is benchmarking PostgreSQL (7.4.1, on FreeBSD 4.9 and 5.2).
> I'm using pgbench from the contrib directory, but I'm puzzled by the
> results.

It is notoriously hard to get reproducible results from pgbench.
However...

> - I'm running pgbench with 35 clients and 50 transactions/client

(1) what scale factor did you use to size the database?  One of the
gotchas is that you need to use a scale factor at least as large as the
number of clients you are testing.  The scale factor is equal to the
number of rows in the "branches" table, and since every transaction
wants to update some row of branches, you end up mostly measuring the
effects of update contention if the scale factor is less than about
the number of clients.  scale 1 is particularly deadly, it means all
the transactions get serialized :-(

(2) 50 xacts/client is too small to get anything reproducible; you'll
mostly be measuring startup transients.  I usually use 1000 xacts/client.
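
Both points can be turned into a small pre-flight check. The sketch below is an illustration, not anything from pgbench itself: `shared_branch_probability` is a rough contention estimate that assumes every client picks a "branches" row uniformly and independently, and `pgbench_commands` builds an initialization and a run command using the -i, -s, -c and -t switches. The database name "bench" is hypothetical, and refusing a scale factor below the client count is a choice made for this example.

```python
def shared_branch_probability(clients, scale):
    """Chance that the branch row picked by one client is also picked by
    at least one other concurrent client (uniform, independent picks)."""
    if clients < 2:
        return 0.0
    return 1.0 - (1.0 - 1.0 / scale) ** (clients - 1)


def pgbench_commands(dbname, clients, transactions, scale):
    """Build the initialization and run command lines as argument lists."""
    if scale < clients:
        # Design choice of this sketch: reject setups where update
        # contention on "branches" would dominate the measurement.
        raise ValueError("scale factor should be at least the client count")
    init = ["pgbench", "-i", "-s", str(scale), dbname]
    run = ["pgbench", "-c", str(clients), "-t", str(transactions), dbname]
    return init, run


# "bench" is a hypothetical database name.
init_cmd, run_cmd = pgbench_commands("bench", clients=35, transactions=1000, scale=40)
print(" ".join(init_cmd))
print(" ".join(run_cmd))
print(f"overlap chance at scale 40: {shared_branch_probability(35, 40):.2f}")
print(f"overlap chance at scale 1:  {shared_branch_probability(35, 1):.2f}")
```

The argument lists could be handed to `subprocess.run` to drive repeated runs; with scale 1 the estimate is 1.0, which matches the remark that all transactions get serialized.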

            regards, tom lane

Re: Benchmarking PostgreSQL?

From: Ivan Voras

Tom Lane wrote:

> It is notoriously hard to get reproducible results from pgbench.
> However...
>
>
>>- I'm running pgbench with 35 clients and 50 transactions/client
>
>
> (1) what scale factor did you use to size the database?  One of the
> gotchas is that you need to use a scale factor at least as large as the

I forgot to mention that - I read the pgbench README, and the scale
factor was set to 40.

> (2) 50 xacts/client is too small to get anything reproducible; you'll
> mostly be measuring startup transients.  I usually use 1000 xacts/client.

I was using 100 and 50, hoping that the larger value would help
reproducibility and that the smaller would do just what you said:
measure startup time. What I also forgot to mention is that the numbers
I was talking about were obtained with the '-C' pgbench switch. Without
it the results vary from about 60 to 145 (same 'alternating' effects, etc).

Thanks, I will try 1000 transactions!

There's another thing I'm puzzled about: I deliberately used the -C
switch, intending to measure connection time, but with it the numbers
displayed by pgbench for 'tps with' and 'tps without connection time'
are the same to the 6th decimal place. Without -C, both numbers more
than double and differ by about 2-3 tps. (I was expecting that with -C
the 'tps with c.t.' would be much lower than 'tps without c.t.').
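
Purely as an illustration of how two such figures could relate: the sketch below assumes (this is not confirmed anywhere in the thread) that a tool takes one timestamp at startup, one after any connections opened up front, and one at the end, and divides the transaction count by the two resulting intervals. The function name and all timings are hypothetical.

```python
def tps_figures(total_xacts, t_start, t_connected, t_end):
    """Return (tps including connection time, tps excluding it).

    Assumed model: 'including' divides by the whole elapsed time,
    'excluding' divides by the time after the up-front connections.
    """
    tps_including = total_xacts / (t_end - t_start)
    tps_excluding = total_xacts / (t_end - t_connected)
    return tps_including, tps_excluding


total = 35 * 50  # clients * transactions per client, as in the first post

# Hypothetical timings (seconds): connections opened once, up front.
upfront = tps_figures(total, t_start=0.0, t_connected=0.5, t_end=12.5)

# Hypothetical timings: nothing is connected up front, so the second
# timestamp coincides with the first and the two figures are identical.
per_xact = tps_figures(total, t_start=0.0, t_connected=0.0, t_end=45.0)

print("up-front connections:  incl=%.2f excl=%.2f" % upfront)
print("per-transaction model: incl=%.2f excl=%.2f" % per_xact)
```

Under this assumed model, reconnecting for every transaction would fold the connection cost into both figures equally, which would be one possible reading of the identical numbers; whether pgbench actually works this way would need checking against its source.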

(the README is here:
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/contrib/pgbench/README.pgbench)