Re: New function normal_rand_array function to contrib/tablefunc. - Mailing list pgsql-hackers
From | Andy Fan |
---|---|
Subject | Re: New function normal_rand_array function to contrib/tablefunc. |
Date | |
Msg-id | 87v814wmz0.fsf@163.com Whole thread Raw |
In response to | New function normal_rand_array function to contrib/tablefunc. (Andy Fan <zhihuifan1213@163.com>) |
Responses |
Re: New function normal_rand_array function to contrib/tablefunc.
|
List | pgsql-hackers |
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > My suggestion would be to mirror the signatures of the core random() > functions more closely, and have this: > > 1). rand_array(numvals int, minlen int, maxlen int) > returns setof float8[] > > 2). rand_array(numvals int, minlen int, maxlen int, > minval int, maxval int) > returns setof int[] > > 3). rand_array(numvals int, minlen int, maxlen int, > minval bigint, maxval bigint) > returns setof bigint[] > > 4). rand_array(numvals int, minlen int, maxlen int, > minval numeric, maxval numeric) > returns setof numeric[] this is indeed a more clean and correct APIs, I will use the above ones in the next version. Thanks for the suggestion. It is just not clear to me how verbose the document should to be, and where the document should be, tablefunc.sgml, the comment above the function or the places just the code? In many cases you provides above or below are just implementation details, not the API declaimed purpose. > Something else that's not obvious is that this patch is relying on the > core random functions, which means that it's using the same PRNG state > as those functions. That's probably OK, but it should be documented, > because it's different from tablefunc's normal_rand() function, which > uses pg_global_prng_state. My above question applies to this comment. > In particular, what this means is that > calling setseed() will affect the output of these new functions, but > not normal_rand(). Incidentally, that makes writing tests much simpler > -- just call setseed() first and the output will be repeatable. Good to know this user case. for example, should this be documented? >> In some cases the function returns an empty array. Is it also expected? >> > > Perhaps it would be useful to have separate minimum and maximum array > length arguments, rather than a mean array length argument. I'm not sure which one is better, but main user case of this function for testing pupose, so it I think minimum and maximum array length is good for me too. > > Actually, I find the current behaviour somewhat counterintuitive. Only > after reading the source code did I realise what it's actually doing, > which is this: > > Row 1: array of random length in range [0, meanarraylen] > Row 2: array of length 2*meanarraylen - length of previous array > Row 3: array of random length in range [0, meanarraylen] > Row 4: array of length 2*meanarraylen - length of previous array > ... > > That's far from obvious (it's certainly not documented) and I don't > think it's a particularly good way of achieving a specified mean array > length, because only half the lengths are random. I'm not sure how does this matter in real user case. > One thing that's definitely needed is more documentation. It should > have its own new subsection, like the other tablefunc functions. is the documentaion for the '2*meanarraylen - lastarraylen'? and What is new subsection, do you mean anything wrong in 'tablefunc.sgml', I did have some issue to run 'make html', but the error exists before my patch, so I change the document carefully without testing it. do you know how to fix the below error in 'make html'? $/usr/bin/xsltproc --nonet --path . --path . --stringparam pg.version '18devel' stylesheet.xsl postgres-full.xml I/O error : Attempt to load network entity http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl warning: failed to load external entity "http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl" compilation error: file stylesheet.xsl line 6 element import xsl:import : unable to load http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl I/O error : Attempt to load network entity http://docbook.sourceforge.net/release/xsl/current/common/entities.ent stylesheet-html-common.xsl:4: warning: failed to load external entity "http://docbook.sourceforge.net/release/xsl/current/common/entities.ent" %common.entities; ^ stylesheet-html-common.xsl:124: parser error : Entity 'primary' not defined translate(substring(&primary;, 1, 1), > Something else that confused me is why this function is called > normal_rand_array(). The underlying random functions that it's calling > return random values from a uniform distribution, not a normal > distribution. Arrays of normally distributed random values might also > be useful, but that's not what this patch is doing. OK, you are right, your new names should be better. > Also, the function accepts float8 minval and maxval arguments, and > then simply ignores them and returns random float8 values in the range > [0,1), which is highly counterintuitive. This is a obvious bug and it only exists in float8 case IIUC, will fix it in the next version. -- Best Regards Andy Fan
pgsql-hackers by date: