Re: gaussian distribution pgbench - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: gaussian distribution pgbench
Msg-id alpine.DEB.2.10.1403150738110.13791@sto
In response to Re: gaussian distribution pgbench  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Hello Heikki,

> A couple of comments:
>
> * There should be an explicit "\setrandom ... uniform" option too, even 
> though you get that implicitly if you don't specify the distribution

Indeed. I agree. I suggested it, but it got lost.

> * What exactly does the "threshold" mean? The docs informally explain that 
> "the larger the threshold, the more frequent values close to the middle of the 
> interval are drawn", but that's pretty vague.

There are explanations and computations as comments in the code. If it is 
about the documentation, I'm not sure that a very precise mathematical 
definition would help many people; it might rather hinder 
understanding, so the doc focuses on an intuitive explanation instead.

> * Does min and max really make sense for gaussian and exponential 
> distributions? For gaussian, I would expect mean and standard deviation as 
> the parameters, not min/max/threshold.

Yes... and no:-) The aim is to draw an integer primary key from a table, 
so it must be in a specified range. This is approximated by drawing a 
double value with the expected distribution (gaussian or exponential) and 
projecting it carefully onto integers. If it is out of range, there is a loop 
and another value is drawn. The minimal threshold constraint (2.0) ensures 
that the probability of looping is low.
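
To make this concrete, here is a rough Python sketch of the loop for the 
gaussian case. It is not the patch's code: the function name and its 
parameters are made up for illustration, and it assumes that the threshold 
is the number of standard deviations at which the bell curve is cut before 
being stretched over [min, max].

```python
import random


def gaussian_int(lo, hi, threshold, rng=random):
    """Draw an integer in [lo, hi] with a truncated gaussian shape.

    Hypothetical helper for illustration only. Assumption: 'threshold'
    is the cut-off, in standard deviations, applied to a standard normal
    draw before it is scaled onto the integer range.
    """
    if threshold < 2.0:
        raise ValueError("threshold must be at least 2.0")
    if lo > hi:
        raise ValueError("empty range")

    # Rejection loop: redraw until the double falls inside the cut-off.
    while True:
        z = rng.gauss(0.0, 1.0)
        if -threshold <= z < threshold:
            break

    # Map [-threshold, threshold) onto [0, 1), then onto the integers.
    u = (z + threshold) / (2.0 * threshold)
    value = lo + int(u * (hi - lo + 1))

    # Guard against floating-point rounding pushing u up to exactly 1.0.
    return min(value, hi)


if __name__ == "__main__":
    rng = random.Random(42)
    print([gaussian_int(1, 100, 2.0, rng) for _ in range(10)])
```

Under that assumption, a threshold of 2.0 rejects roughly 4.6% of the 
draws (the mass of a standard normal beyond two standard deviations), and 
a larger threshold rejects fewer while concentrating more values near the 
middle of the range.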

> * How about setting the variable as a float instead of integer? Would seem 
> more natural to me. At least as an option.

Which variable? The values set by setrandom are mostly used for primary 
keys. We really want integers in a range.

-- 
Fabien.
