Re: gaussian distribution pgbench - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: gaussian distribution pgbench |
Date | |
Msg-id | 53D610DF.5030106@vmware.com |
In response to | Re: gaussian distribution pgbench (Fabien COELHO <coelho@cri.ensmp.fr>) |
List | pgsql-hackers |
On 07/17/2014 11:13 PM, Fabien COELHO wrote:
>
>>> However, ISTM that it is not the purpose of pgbench documentation to be a
>>> primer about what is an exponential or gaussian distribution, so the idea
>>> would yet be to have a relatively compact explanation, and that the
>>> interested but clueless reader would document h..self from wikipedia or a
>>> text book or a friend or a math teacher (who could be a friend as well:-).
>>
>> Well, I think it's a balance. I agree that the pgbench documentation
>> shouldn't try to substitute for a text book or a math teacher, but I
>> also think that you shouldn't necessarily need to refer to a text book
>> or a math teacher in order to figure out how to use pgbench. Saying
>> "it's complicated, so we don't have to explain it" would be a cop out;
>> we need to *make* it simple. And if there's no way to do that, then
>> IMHO we should reject the patch in favor of some future patch that
>> implements something that will be easy for users to understand.
>>
>>>>> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=10
>>>>> starting vacuum...end.
>>>>> transaction type: Exponential distribution TPC-B (sort of)
>>>>> scaling factor: 1
>>>>> exponential threshold: 10.00000
>>>>>
>>>>> decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
>>>>> highest/lowest percent of the range: 9.5% 0.0%
>>>>
>>>> I don't have a clue what that means. None.
>>>
>>> Maybe we could add in front of the decile/percent
>>>
>>> "distribution of increasing account key values selected by pgbench:"
>>
>> I still wouldn't know what that meant. And it misses the point
>> anyway: if the documentation is good, this will be unnecessary. If
>> the documentation is bad, a printout that tries to illustrate it by
>> example is not an acceptable substitute.
>
> The decile description is quite classic when discussing statistics.

IMHO we should include a diagram for each distribution. A diagram would
be much easier to understand than a decile or verbal explanation.

The only problem is that the build infrastructure doesn't currently
support including images in the docs. That's been discussed before, and
I think we even used to have a couple of images there a long time ago.
Now would be a good time to bite the bullet and add the support.

We got fairly close to a consensus on how to do it in this thread:
www.postgresql.org/message-id/flat/20120712181636.GC11063@momjian.us.
The biggest problem was choosing an editor that has a fairly stable file
format, so that we don't get huge diffs every time someone moves a line
in a diagram. One work-around for that is to use graphviz and/or gnuplot
as the source format, instead of a graphical editor.

- Heikki
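
For readers puzzling over the "decile percents" line in the quoted pgbench output, here is a minimal sketch of where such numbers could come from. It assumes the key is drawn from an exponential density truncated to the key range, with the threshold acting as the rate parameter over a range scaled to [0, 1). The function names `interval_share` and `decile_percents` are hypothetical and are not part of pgbench.

```python
import math


def interval_share(threshold, lo, hi):
    """Probability mass on [lo, hi) of the key range, scaled to [0, 1).

    Assumption: truncated exponential density with rate `threshold`,
    normalised so that the whole range [0, 1) has mass 1.
    """
    norm = 1.0 - math.exp(-threshold)
    return (math.exp(-threshold * lo) - math.exp(-threshold * hi)) / norm


def decile_percents(threshold):
    """Percentage of draws that fall into each tenth of the key range."""
    return [
        100.0 * interval_share(threshold, i / 10, (i + 1) / 10)
        for i in range(10)
    ]


if __name__ == "__main__":
    # Share of draws per decile for threshold 10.
    print(" ".join(f"{p:.1f}%" for p in decile_percents(10.0)))
    # Share of draws landing in the first 1% of the key range.
    print(f"{100.0 * interval_share(10.0, 0.0, 0.01):.1f}%")
```

Under that assumption, a threshold of 10 puts roughly 63% of the draws into the first tenth of the key range, and each following decile receives about a factor of e less. A plot of the same function would show this at a glance.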