Re: gaussian distribution pgbench - Mailing list pgsql-hackers

From KONDO Mitsumasa
Subject Re: gaussian distribution pgbench
Date
Msg-id 53268AA0.2040909@lab.ntt.co.jp
In response to Re: gaussian distribution pgbench  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: gaussian distribution pgbench
List pgsql-hackers
(2014/03/15 15:53), Fabien COELHO wrote:
>
> Hello Heikki,
>
>> A couple of comments:
>>
>> * There should be an explicit "\setrandom ... uniform" option too, even though
>> you get that implicitly if you don't specify the distribution
>
> Indeed. I agree. I suggested it, but it got lost.
>
>> * What exactly does the "threshold" mean? The docs informally explain that "the
>> larger the threshold, the more frequent values close to the middle of the
>> interval are drawn", but that's pretty vague.
>
> There are explanations and computations as comments in the code. If it is about
> the documentation, I'm not sure that a very precise mathematical definition will
> help a lot of people, and might rather hinder understanding, so the doc focuses
> on an intuitive explanation instead.
>
>> * Does min and max really make sense for gaussian and exponential
>> distributions? For gaussian, I would expect mean and standard deviation as the
>> parameters, not min/max/threshold.
>
> Yes... and no:-) The aim is to draw an integer primary key from a table, so it
> must be in a specified range. This is approximated by drawing a double value with
> the expected distribution (gaussian or exponential) and project it carefully onto
> integers. If it is out of range, there is a loop and another value is drawn. The
> minimal threshold constraint (2.0) ensures that the probability of looping is low.
>
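For illustration, the draw-and-retry scheme described above can be sketched roughly as follows. This is only a minimal Python sketch, not the code from the patch: the names gaussian_int and retry_probability are hypothetical, and the sketch assumes the threshold is the cutoff, in standard deviations, beyond which a draw is rejected.

```python
import math
import random


def gaussian_int(lo, hi, threshold, rng=random):
    """Draw an integer in [lo, hi] with a roughly gaussian shape.

    Hypothetical helper: assumes `threshold` is the cutoff in standard
    deviations; draws outside [-threshold, threshold) are retried.
    """
    if threshold < 2.0:
        raise ValueError("threshold must be at least 2.0")
    while True:
        z = rng.gauss(0.0, 1.0)          # standard normal double
        if -threshold <= z < threshold:  # in range: keep it
            break                        # otherwise loop and draw again
    frac = (z + threshold) / (2.0 * threshold)  # map onto [0, 1)
    # Project onto the integer range; min() guards against a rounding
    # edge case where frac evaluates to exactly 1.0.
    return min(hi, lo + int(frac * (hi - lo + 1)))


def retry_probability(threshold):
    """Probability that one standard normal draw falls outside the cutoff."""
    return math.erfc(threshold / math.sqrt(2.0))
```

Under these assumptions, a threshold of 2.0 rejects about 4.6% of draws, and a larger threshold rejects fewer while concentrating the values more tightly around the middle of the range.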
>> * How about setting the variable as a float instead of integer? Would seem more
>> natural to me. At least as an option.
>
> Which variable? The values set by setrandom are mostly used for primary keys. We
> really want integers in a range.
Oh, I see. He was talking about the documentation.

+       Moreover, set gaussian or exponential with threshold interger value,
+       we can get gaussian or exponential random in integer value between
+       <replaceable>min</> and <replaceable>max</> bounds inclusive.

Correctly, it should read:
+       Moreover, set gaussian or exponential with threshold double value,
+       we can get gaussian or exponential random in integer value between
+       <replaceable>min</> and <replaceable>max</> bounds inclusive.


I am also going to revise the documentation so that it is easier for users to understand.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


