Re: pgbench randomness initialization - Mailing list pgsql-hackers

From: Andres Freund
Subject: Re: pgbench randomness initialization
Date: 2016-04-07
Msg-id: 20160407103821.mj3bepccxiakug2b@alap3.anarazel.de
In response to: Re: pgbench randomness initialization (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses: Re: pgbench randomness initialization (Fabien COELHO <coelho@cri.ensmp.fr>),
           Re: pgbench randomness initialization (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On 2016-04-07 12:25:58 +0200, Fabien COELHO wrote:
> 
> >> (2) runs which really vary from one to the next, so as
> >>     to get an idea of how much performance varies, i.e. how
> >>     stable it is.
> >
> >I don't think this POV makes all that much sense. If you do something
> >non-comparable, then the results aren't, uh, comparable. Which also
> >means there's a lower chance of reproducing observed problems.
> 
> That also means that you are likely not to hit them if you always do the
> very same run...

If you run the test for longer... Or explicitly iterate over IVs (the
per-run random seeds). At the very least we need to make pgbench output
the IV used, to have some chance of repeating tests.
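To make that concrete, here's a rough sketch (made-up code, not what
pgbench actually does) of seeding once per run and reporting the seed so
the run can be repeated:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int
    main(void)
    {
        /* Hypothetical per-run IV: derived from the clock by default. */
        unsigned int seed = (unsigned int) time(NULL);

        srandom(seed);

        /* Report the seed so a later run can reproduce this one. */
        printf("random seed: %u\n", seed);
        printf("first value: %ld\n", (long) random());
        return 0;
    }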


> >Uh, and what's the benefit of that variability? pgbench isn't a reality
> >simulation tool, it's a benchmarking tool. And benchmarks with intrinsic
> >variability are bad benchmarks.
> 
> From a statistical perspective, one run does not mean anything. If you do
> the exact same run over and over again, then all mathematical results about
> (slow) convergence towards the average are lost. This is like trying to
> survey a population by asking the same person the questions over and
> over: the result will be biased.

That comparison pretty much invalidates any point you're making; it's
that bad.
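For reference, the convergence result the quoted paragraph appeals to,
sketched roughly in LaTeX under the assumption of n independent,
identically distributed runs X_1, ..., X_n:

    \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i,
    \qquad \mathrm{SE}(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}

The 1/sqrt(n) shrinkage only holds for independent draws; n repetitions
of one fixed seed are n copies of the same draw, so averaging them does
not reduce that error term.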


> Now when you develop, which is the use case you probably have in mind, you
> want to compare two pg versions and check for the performance impact, so
> having the exact same run seems like a quick proxy for that.

It's not about "quickly" checking for something. If you look at the
results in the thread mentioned in the OP, the order of operations
drastically and *PERSISTENTLY* changes the observations, costing *days*
of lost work.


> However, from a statistical perspective this is just heresy: you may make a
> change which improves one given run at the expense of all possible others
> and you would not know it. Say, for instance, that there are two different
> behaviors depending on something: then you will only check against one of
> them.

Meh. That assumes that we're doing a huge number of pgbench runs, but
usually people do maybe a handful. Tops. If you're trying to defend
against scenarios like that, you need to design your tests so that you'll
encounter such problems by running longer.


> So I have no mathematical doubt that changing the seed is the right default
> setting, thus I think that the current behavior is fine. However, I'm okay
> if someone wants to control the randomness for some reason (accepting "less
> sure" results, but getting them quickly), so it could be allowed somehow.

There might be some statistics arguments, but I think they pretty much
ignore reality.
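
As a sketch of what "allowed somehow" might look like (hypothetical
option name and code, not an actual pgbench patch): default to a varying
seed, let the user pin it, and always report it either way:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    /* Hypothetical "--seed=N" handling: pin the run if given,
     * otherwise fall back to a time-based seed. */
    static unsigned int
    choose_seed(int argc, char **argv)
    {
        for (int i = 1; i < argc; i++)
        {
            if (strncmp(argv[i], "--seed=", 7) == 0)
                return (unsigned int) strtoul(argv[i] + 7, NULL, 10);
        }
        return (unsigned int) time(NULL);
    }

    int
    main(int argc, char **argv)
    {
        unsigned int seed = choose_seed(argc, argv);

        srandom(seed);
        fprintf(stderr, "using seed %u (pass --seed=%u to repeat)\n",
                seed, seed);
        /* ... the benchmark proper would run here ... */
        return 0;
    }

Running it twice with the same --seed=N yields identical random streams;
omitting the option gives a varying run that can still be repeated later.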

Greetings,

Andres Freund


