Re: gaussian distribution pgbench -- splits v4 - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: gaussian distribution pgbench -- splits v4
Date
Msg-id alpine.DEB.2.10.1407231828340.3626@sto
Whole thread Raw
In response to Re: gaussian distribution pgbench -- part 1/2  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: gaussian distribution pgbench -- splits v4
Re: gaussian distribution pgbench -- splits v4
List pgsql-hackers
Hello Robert,

> Some review comments:

Thanks a lot for your return.

Please find attached two new parts of the patch (A for setrandom 
extension, B for pgbench embedded test case extension).

> 1. I suggest that getExponentialrand and getGaussianrand be renamed to
> getExponentialRand and getGaussianRand.

Done.

It was named like that because "getrand" was used for the uniform case.


> 2. I suggest that the code be changed so that the branch currently
> labeled as /* uniform with extra argument */ become a hard error
> instead of a warning.
>
> 3. Similarly, I suggest that the use of gaussian or uniform be an
> error when argc < 6 OR argc > 6.  I also suggest that the
> parenthesized distribution type be dropped from the error message in
> all cases.

I wish to agree, but my interpretation of the previous code is that they 
were ignored before, so ISTM that we are stuck with keeping the same 
unfortunate behavior.

> 4. This question mark seems like it should be a period:
>
> +  * value fails the test? To be on the safe side, let us try over.

Indeed.

> 5. With regards to the following paragraph:
>
>      <para>
> +      The default random distribution is uniform, that is all values in the
> +      range are drawn with equal probability. The gaussian and exponential
> +      options allow to change this default. The mandatory
> +      <replaceable>threshold</> double value controls the actual distribution
> +      with gaussian or exponential.
> +     </para>
>
> This paragraph needs a bit of copy-editing.  Here's an attempt: "By
> default, all values in the range are drawn with equal probability.
> The <literal>gaussian</> and <literal>exponential</> options modify
> this behavior; each requires a mandatory threshold which determines
> the precise shape of the distribution."  The following paragraph
> should be changed to begin with "For a Gaussian distribution" and the
> one after "For an exponential distribution".

Ok. I've kept "uniform" in the first sentence, because this is both
an option name and it is significant in term of probabilities.

> 6. Overall, I think the documentation here looks much better now, but
> I suggest adding one or two example to the Gaussian section.  Like
> this: for example, if threshold is 2.0, 68% of the values will fall in
> the middle third of the interval; with a threshold of 3.0, 99.7% of
> the values will fall in the middle third of the interval.  These
> numbers are fabricated, and the middle third of the interval might not
> be the best part to talk about, but you get the idea (I hope).

Done with threshold value 4.0 so I have a "middle quarter" and a "middle 
half".

-- 
Fabien.

pgsql-hackers by date:

Previous
From: "MauMau"
Date:
Subject: Re: [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
Next
From: Robert Haas
Date:
Subject: Re: Making joins involving ctid work for the benefit of UPSERT