Thread: What constitutes "reproducible" numbers from pgbench?
Hello list,
Exactly what constitutes „reproducible“ values from pgbench? I keep getting a range between 340 tps and 440 tps or something like that using the same command line on the same machine. Is that reproducible enough?
The docs state that one should verify that the numbers are reproducible, so I repeat any test run ten times before believing the results. I’ve tried increasing the test duration (-T) from one minute to five minutes, then turning off autovacuum (in postgresql.conf) as recommended by the docs, but the range of results is not getting any narrower. So what does “reproducible” mean as applied to pgbench?
Obviously I could be doing something wrong, such as missing some vital configuration option…
Thanks in advance for any insights.
Cheers,
Holger Friedrich
On Tue, Apr 21, 2015 at 7:21 AM, <Holger.Friedrich-Fa-Trivadis@it.nrw.de> wrote:
> Hello list,
>
> Exactly what constitutes „reproducible“ values from pgbench? I keep getting
> a range between 340 tps and 440 tps or something like that using the same
> command line on the same machine. Is that reproducible enough?
>
Nope, it is not. Is PostgreSQL the only resource-consuming (IO, memory, CPU, etc.) program running there? "Reproducible" means that the tps numbers you get should be close, within several percent, if nothing has changed between your runs. You can try a select-only (-S) pgbench run first.
Regards,
Qingqing
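To set the "within several percent" criterion next to the 340 to 440 tps range from the original post, here is a rough sketch in Python. The function name relative_spread is made up for this illustration, and measuring the spread against the midpoint of the range is an assumption; nobody in the thread gives an exact formula.

```python
def relative_spread(low_tps: float, high_tps: float) -> float:
    """Return the width of a tps range as a percentage of its midpoint.

    Assumption: "within several percent" is read here as
    (high - low) / midpoint. The thread does not define a formula.
    """
    if low_tps <= 0 or high_tps < low_tps:
        raise ValueError("expected 0 < low_tps <= high_tps")
    midpoint = (low_tps + high_tps) / 2
    return (high_tps - low_tps) / midpoint * 100


if __name__ == "__main__":
    # 340 and 440 tps are the bounds reported in the original post.
    print(f"{relative_spread(340, 440):.1f}% of the midpoint")
```

With the reported bounds the spread is roughly 26 percent of the midpoint, which is well outside "several percent" under this reading.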
On 4/21/2015 9:21 AM, Holger.Friedrich-Fa-Trivadis@it.nrw.de wrote:
> Hello list,
> Exactly what constitutes „reproducible“ values from pgbench? I keep
> getting a range between 340 tps and 440 tps or something like that using
> the same command line on the same machine. Is that reproducible enough?
> The docs state that one should verify that the numbers are reproducible,
> so I repeat any test run ten times before believing the results. I’ve
> tried increasing the test duration (-T) from one minute to five minutes,
> then turning off autovacuum (in postgresql.conf) as recommended by the
> docs, but the range of results is not getting any narrower. So what
> does “reproducible” mean as applied to pgbench?
> Obviously I could be doing something wrong, such as missing some vital
> configuration option…
> Thanks in advance for any insights.
> Cheers,
> Holger Friedrich
I think it's common to get different timings. I think it's ok because things are changing (files, caches, indexes, etc). If you run three to five short runs, they should all be within the same range (say 340 to 440).
If you are planning hardware, you might take the worst case and purchase based on that. If you are planning schedules you might use the average case. If you are bragging on the newsgroups use the best case :-).
If you want something more realistic, then keep vacuum enabled and run for 24 hours. In the real world, you are going to vacuum, so benchmark it too.
If you are playing with postgresql.conf settings, then three runs of a few minutes each will give you an average, and you can compare different settings based on that.
As Qingqing said, a read-only test should be more stable, because you are comparing apples to apples. A read-write test is changing under the hood, so expect some differences.
Also, if your test data is small, or large, you are benchmarking different things (lock speed, cpu speed, disk io, etc).
pgbench is good for a first test, but it's going to act differently from your real-world workload.
-Andy
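The "three runs, then take the average" suggestion could be scripted roughly as below. This is a minimal sketch, not anything posted in the thread: parse_tps and summarize are hypothetical helper names, the regular expression assumes pgbench reports throughput on a line starting with "tps = ", and the sample output text is invented for illustration.

```python
import re
import statistics

# Assumption: pgbench prints throughput on a line that starts with "tps = ".
TPS_LINE = re.compile(r"^tps = ([0-9]+(?:\.[0-9]+)?)", re.MULTILINE)


def parse_tps(pgbench_output: str) -> float:
    """Extract the first tps figure from the text output of one pgbench run."""
    match = TPS_LINE.search(pgbench_output)
    if match is None:
        raise ValueError("no 'tps = ...' line found")
    return float(match.group(1))


def summarize(tps_values):
    """Return min, max, mean, sample stdev and coefficient of variation (%)."""
    if len(tps_values) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    mean = statistics.mean(tps_values)
    stdev = statistics.stdev(tps_values)
    return {
        "min": min(tps_values),
        "max": max(tps_values),
        "mean": mean,
        "stdev": stdev,
        "cv_percent": stdev / mean * 100,
    }


if __name__ == "__main__":
    # Invented sample text, shaped like the tail of a pgbench report.
    sample = "tps = 385.5 (including connections establishing)\n"
    print(parse_tps(sample))
    # Invented tps values standing in for three runs with one configuration.
    print(summarize([352.0, 401.0, 389.0]))
```

Reporting the minimum, mean and maximum together matches the worst-case, average-case and best-case uses described above, and the coefficient of variation gives a single number for how noisy a configuration is.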
On Tuesday, April 21, 2015 7:43 PM, Andy Colson wrote:
> On 4/21/2015 9:21 AM, Holger.Friedrich-Fa-Trivadis@it.nrw.de wrote:
>> Exactly what constitutes "reproducible" values from pgbench? I keep
>> getting a range between 340 tps and 440 tps or something like that
> I think it's common to get different timings. I think it's ok because things are changing (files, caches, indexes, etc).
As I found out, our test server is a virtual machine, so while I should be "alone" on that virtual machine, of course I have no idea what else might be going on on the physical server the virtual machine is running on. That would explain the somewhat wide variations.
Qingqing Zhou wrote that the range between 340 tps and 440 tps I keep getting is not ok and numbers should be the same within several per cent. Of course, if other things are going on on the physical server, I can't always expect a close match.
Since someone asked, the point of the exercise is to see if and how various configurations in postgresql.conf are affecting performance.
Cheers,
Holger Friedrich
On Thu, 23 Apr 2015 11:07:05 +0200 <Holger.Friedrich-Fa-Trivadis@it.nrw.de> wrote:
> On Tuesday, April 21, 2015 7:43 PM, Andy Colson wrote:
> > On 4/21/2015 9:21 AM, Holger.Friedrich-Fa-Trivadis@it.nrw.de wrote:
> >> Exactly what constitutes "reproducible" values from pgbench? I keep
> >> getting a range between 340 tps and 440 tps or something like that
> > I think it's common to get different timings. I think it's ok because things are changing (files, caches, indexes, etc).
>
> As I found out, our test server is a virtual machine, so while I should be "alone" on that virtual machine, of course I have no idea what else might be going on on the physical server the virtual machine is running on. That would explain the somewhat wide variations.
>
> Qingqing Zhou wrote that the range between 340 tps and 440 tps I keep getting is not ok and numbers should be the same within several per cent. Of course, if other things are going on on the physical server, I can't always expect a close match.
>
> Since someone asked, the point of the exercise is to see if and how various configurations in postgresql.conf are affecting performance.
You're going to have difficulty doing that sort of tuning and testing on a VM. Even when there's nothing else going on, VMs tend to have a wider range of behaviors than native installs (since things like cron jobs can run both on the host and the guest OS, as well as other reasons, I'm sure).
Whether such an endeavour is worthwhile depends on your reason for doing it. If your production environment will also be a VM of similar configuration to this one, then I would proceed with the tests, simply tracking the +/- variance and keeping it in mind, since you'll likely see the same variance in production.
If you're doing it for your own general learning, then it might still be worth it, but it's hardly an ideal setup for that kind of thing.
--
PT <wmoran@potentialtech.com>
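One way to "track the +/- variance and keep it in mind" when comparing two postgresql.conf settings is sketched below in Python. The helper names are hypothetical, the tps lists are invented, and the overlap check is a rough heuristic chosen for this illustration rather than a proper statistical test or anything proposed in the thread.

```python
import statistics


def mean_and_spread(tps_values):
    """Return (mean, sample standard deviation) for a list of tps results."""
    return statistics.mean(tps_values), statistics.stdev(tps_values)


def difference_exceeds_noise(baseline, candidate):
    """Return True if the mean +/- stdev intervals of the two sets do not overlap.

    Heuristic only: two intervals [m1 - s1, m1 + s1] and [m2 - s2, m2 + s2]
    are disjoint exactly when |m1 - m2| > s1 + s2.
    """
    base_mean, base_sd = mean_and_spread(baseline)
    cand_mean, cand_sd = mean_and_spread(candidate)
    return abs(cand_mean - base_mean) > base_sd + cand_sd


if __name__ == "__main__":
    # Invented tps results for two hypothetical configurations.
    baseline_runs = [352.0, 401.0, 389.0, 436.0, 344.0]
    candidate_runs = [371.0, 418.0, 395.0, 442.0, 360.0]
    print(difference_exceeds_noise(baseline_runs, candidate_runs))
```

If the check returns False, the run-to-run variation on the VM is larger than the effect of the setting, so more runs or a quieter machine would be needed before drawing a conclusion.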
On 4/23/2015 4:07 AM, Holger.Friedrich-Fa-Trivadis@it.nrw.de wrote:
> On Tuesday, April 21, 2015 7:43 PM, Andy Colson wrote:
>> On 4/21/2015 9:21 AM, Holger.Friedrich-Fa-Trivadis@it.nrw.de wrote:
>>> Exactly what constitutes "reproducible" values from pgbench? I keep
>>> getting a range between 340 tps and 440 tps or something like that
>> I think it's common to get different timings. I think it's ok because things are changing (files, caches, indexes, etc).
>
> Qingqing Zhou wrote that the range between 340 tps and 440 tps I keep getting is not ok and numbers should be the same within several per cent. Of course, if other things are going on on the physical server, I can't always expect a close match.
>
I disagree. Having a test that is reproducible within a few percent is a great result. But any result is informative. Your tests tell you an upper and lower bound on performance. They tell you to expect a little variance in your workload. They probably tell you a little about how your VM host is caching writes to disk.
You are feeling the pulse of your hardware. Each hardware setup has its own pulse, and understanding it will help you understand how it'll handle a load.
-Andy