"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> How so? The entries in the histogram are equidistant by definition.
>
>> Huh? They have equal number of values between them, they're not equidistant in
>> the scalar space. So the area of each bar should be the same but the width
>> would be proportional to the width of the bucket and the height inversely
>> proportional.
>
> You're presuming there exists a linear scalar space to reference the
> results to. I'm unconvinced that that's a good assumption to make.
> Consider for instance a column of URLs; practically all the entries will
> start with http or ftp or one of a small number of other possibilities.
> If you try to represent this with some hypothetical linear x-axis you
> are going to produce an unusable display.

I was planning to use the first and last histogram values as the frame of
reference. That could still produce some weird graphs, but those are
precisely the cases where users might want to look at the graph and go, "hm,
that looks weird". In most sane cases I expect it'll make a nice pretty
graph, though that remains to be seen. All I have so far are synthetic
integer-domain graphs, which look just like what my synthetic distribution
should look like.
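
A minimal sketch of the equal-area layout discussed above, assuming a numeric
column whose histogram bounds are already available as a sorted list. The
helper name histogram_bars and the normalization to a unit-wide x-axis are
illustrative assumptions, not anything that exists in PostgreSQL. The first
and last bounds define the frame of reference, each bar's width is
proportional to its bucket's width, and its height is inversely proportional,
so every bar has the same area.

```python
def histogram_bars(bounds):
    """Lay out equi-depth histogram buckets as equal-area bars.

    `bounds` is a sorted list of numeric bucket boundaries (hypothetical
    input, e.g. parsed from a histogram_bounds array). Returns a list of
    (x, width, height) tuples on an x-axis normalized to [0, 1].
    """
    if len(bounds) < 2:
        raise ValueError("need at least two histogram bounds")

    # The first and last histogram values are the frame of reference.
    lo, hi = bounds[0], bounds[-1]
    span = hi - lo
    if span <= 0:
        raise ValueError("histogram bounds span no range")

    nbuckets = len(bounds) - 1
    # Each bucket holds the same fraction of the rows, so each bar
    # gets the same area.
    area = 1.0 / nbuckets

    bars = []
    for left, right in zip(bounds, bounds[1:]):
        width = (right - left) / span
        # A zero-width bucket (repeated bound) cannot be drawn as a
        # finite bar; flag it with an infinite height.
        height = area / width if width > 0 else float("inf")
        bars.append(((left - lo) / span, width, height))
    return bars
```

Narrow buckets come out as tall, thin bars and wide buckets as short, flat
ones, which is what makes a skewed distribution stand out. Non-numeric
columns such as the URL example would need some mapping onto a scalar axis
first, which this sketch does not attempt.
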
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com