Re: What constitutes "reproducible" numbers from pgbench? - Mailing list pgsql-general

From Andy Colson
Subject Re: What constitutes "reproducible" numbers from pgbench?
Msg-id 55368C2E.6020700@squeakycode.net
In response to What constitutes "reproducible" numbers from pgbench?  (<Holger.Friedrich-Fa-Trivadis@it.nrw.de>)
Responses Re: What constitutes "reproducible" numbers from pgbench?
List pgsql-general
On 4/21/2015 9:21 AM, Holger.Friedrich-Fa-Trivadis@it.nrw.de wrote:
> Hello list,
> Exactly what constitutes „reproducible“ values from pgbench?  I keep
> getting a range between 340 tps and 440 tps or something like that using
> the same command line on the same machine.  Is that reproducible enough?
> The docs state that one should verify that the numbers are reproducible,
> so I repeat any test run ten times before believing the results.  I’ve
> tried increasing the test duration (-T) from one minute to five minutes,
> then turning off autovacuum (in postgresql.conf) as recommended by the
> docs, but the range of results is not getting any narrower.  So what
> does “reproducible” mean as applied to pgbench?
> Obviously I could be doing something wrong, such as missing some vital
> configuration option…
> Thanks in advance for any insights.
> Cheers,
> Holger Friedrich

I think it's common to get different timings, and I think that's OK because
things are changing between runs (files, caches, indexes, etc).

If you run three to five short runs, they should all be within the same
range (say 340 to 440).  If you are planning hardware, you might take
the worst case and purchase based on that.  If you are planning
schedules you might use the average case.  If you are bragging on the
newsgroups use the best case :-).
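
As a rough sketch of that worst/average/best split, assuming you have
already copied the tps figure out of each run by hand. The numbers below
are made up for illustration and are not measurements from anyone's
machine; summarize_tps is just a hypothetical helper name.

```python
import statistics

def summarize_tps(tps_runs):
    """Return worst, average, and best tps, plus the spread relative to the mean."""
    worst = min(tps_runs)
    best = max(tps_runs)
    average = statistics.mean(tps_runs)
    # Spread as a percentage of the mean: a quick feel for run-to-run noise.
    spread_pct = (best - worst) / average * 100
    return {"worst": worst, "average": average, "best": best, "spread_pct": spread_pct}

# Hypothetical tps values from five runs of the same command line.
runs = [340.0, 400.0, 440.0, 380.0, 390.0]
summary = summarize_tps(runs)
print(summary)
```

Use "worst" for hardware sizing and "average" for scheduling; the spread
tells you how much two results have to differ before the difference
means anything.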

If you want more realistic numbers, keep autovacuum enabled and run for 24
hours.  In the real world, you are going to vacuum, so benchmark it too.

If you are playing with postgresql.conf settings, then three runs of a few
minutes each will give you an average, and you can compare different
settings based on that.
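
A minimal sketch of that comparison, assuming you saved the text output
of each run. It only relies on pgbench printing a line that starts with
"tps = "; the rest of that line differs between pgbench versions, so the
sample strings here are illustrative, and the tps values in them are
invented. parse_tps and compare_settings are hypothetical helper names.

```python
import re
import statistics

# Matches the start of a pgbench result line such as "tps = 385.2 (...)".
TPS_RE = re.compile(r"^tps = ([0-9.]+)", re.MULTILINE)

def parse_tps(pgbench_output):
    """Return the first tps figure found in pgbench's text output."""
    match = TPS_RE.search(pgbench_output)
    if match is None:
        raise ValueError("no 'tps = ' line found in output")
    return float(match.group(1))

def compare_settings(baseline_outputs, candidate_outputs):
    """Average each group of runs and return (baseline, candidate, change in %)."""
    baseline = statistics.mean([parse_tps(o) for o in baseline_outputs])
    candidate = statistics.mean([parse_tps(o) for o in candidate_outputs])
    change_pct = (candidate - baseline) / baseline * 100
    return baseline, candidate, change_pct

# Invented outputs: three runs per configuration.
old_conf = ["tps = 380.0 (sample)", "tps = 390.0 (sample)", "tps = 400.0 (sample)"]
new_conf = ["tps = 420.0 (sample)", "tps = 430.0 (sample)", "tps = 440.0 (sample)"]
baseline, candidate, change_pct = compare_settings(old_conf, new_conf)
print(baseline, candidate, round(change_pct, 1))
```

If the change between two settings is smaller than the spread you see
between repeated runs of a single setting, I would not read much into it.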

As Qingqing said, a read-only test should be more stable, because you
are comparing apples to apples.  A read-write test is changing under the
hood so expect some differences.
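
For illustration, here is one way to build the two command lines so the
only difference between them is the read-only switch. -T (duration in
seconds), -c (clients) and -S (the built-in select-only script) are
standard pgbench options; the database name "bench" and the helper
pgbench_command are assumptions made up for this sketch.

```python
def pgbench_command(dbname, seconds, clients=1, read_only=False):
    """Build a pgbench argument list; pass it to subprocess.run() to execute."""
    cmd = ["pgbench", "-T", str(seconds), "-c", str(clients)]
    if read_only:
        # -S runs the built-in select-only script instead of the
        # default read-write transaction.
        cmd.append("-S")
    cmd.append(dbname)
    return cmd

# Same duration and client count, differing only in -S.
read_write = pgbench_command("bench", 300)
read_only = pgbench_command("bench", 300, read_only=True)
print(" ".join(read_write))
print(" ".join(read_only))
```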

Also, depending on whether your test data is small or large, you are
benchmarking different things (lock speed, CPU speed, disk I/O, etc).
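
To make that concrete: per the pgbench documentation, each unit of scale
factor (-s at initialization) creates 1 row in pgbench_branches, 10 in
pgbench_tellers and 100,000 in pgbench_accounts, and the docs advise
keeping the scale factor at least as large as the client count, or you
mostly measure update contention. The sketch below just does that
arithmetic; the function names are hypothetical.

```python
# Rows created per unit of scale factor, as listed in the pgbench docs.
ROWS_PER_SCALE_UNIT = {
    "pgbench_branches": 1,
    "pgbench_tellers": 10,
    "pgbench_accounts": 100_000,
}

def table_rows(scale):
    """Row counts for the standard pgbench tables at a given scale factor."""
    return {table: rows * scale for table, rows in ROWS_PER_SCALE_UNIT.items()}

def mostly_measuring_contention(scale, clients):
    """True when there are more clients than branch rows to update."""
    return clients > scale

print(table_rows(100))
print(mostly_measuring_contention(scale=1, clients=8))
```

With few branch rows and many clients you are mostly testing lock speed;
whether a larger data set ends up CPU-bound or disk-bound depends on how
it compares to your memory, which this sketch does not try to estimate.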

pgbench is good for a first test, but it's going to behave differently from
your real-world workload.

-Andy

