Thread: replace oidrand() with random_sample()
The "random" regression test uses a function called oidrand(), which takes two parameters, an OID x and an integer y, and returns "true" with probability 1/y (the OID argument is ignored). This can be useful -- for example, it can be used to select a random sampling of the rows in a table (which is what the "random" regression test uses it for). This patch removes that function, because it was old and messy. The old function had the following problems: - it was undocumented - it was poorly named - it was designed to workaround an optimizer bug that no longer exists (the OID argument is to ensure that the optimizer won't optimize away calls to the function; AFAIK marking the function as 'volatile' suffices nowadays) - it used a different random-number generation technique than the other PSRNG-related functions in the backend do (it called random() like they do, but it had its own logic for setting a set and deciding when to reseed the RNG). This patch implements a new version of the function, called "random_sample" (if you think that's not a good name, please suggest an improvement). It corrects the problems mentioned above. I also documented the setseed() function: there were some SGML docs for it, but they were commented out (Peter E. removed them revision 1.41 of func.sgml, without explaining why). Finally, I removed the oidsrand() and userfntest() functions: the former was only for use with oidrand(), and the later has no value AFAICS (I checked with Bruce, and he agreed with me). Initdb required. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Attachment
Neil Conway <neilc@samurai.com> writes: > The "random" regression test uses a function called oidrand(), which > takes two parameters, an OID x and an integer y, and returns "true" with > probability 1/y (the OID argument is ignored). This can be useful -- for > example, it can be used to select a random sampling of the rows in a > table (which is what the "random" regression test uses it for). Do we actually need a separate function at all? Seems like random() < 1.0/y would accomplish the same result. I agree that oidrand() is crufty and no longer useful, but I had hesitated to rip it out, for fear that somebody somewhere might still be using it. It's not like it's costing us any maintenance effort to leave it there. regards, tom lane
On Wed, 2003-01-15 at 20:21, Tom Lane wrote: > Do we actually need a separate function at all? Seems like > random() < 1.0/y > would accomplish the same result. It does, although it's arguably less obvious. But I wouldn't mind if we just got rid of it outright. What does everyone else think -- is it worth keeping? > I agree that oidrand() is crufty and no longer useful, but I had > hesitated to rip it out, for fear that somebody somewhere might still > be using it. It's not like it's costing us any maintenance effort > to leave it there. Keep in mind that it was undocumented and strangely named -- while there might be some people using it, I'd wager not many. Whether we leave it in or not, I'd vote for eventually getting rid of the current version. So if we keep it, we could add random_sample() and keep oidrand() as a wrapper over it (that emits an elog(WARNING) to let people know it's been deprecated). And if we want to get rid of it, we could probably just add the elog() for 7.4 and then remove it outright in 7.5. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Neil Conway <neilc@samurai.com> writes: >> I agree that oidrand() is crufty and no longer useful, but I had >> hesitated to rip it out, for fear that somebody somewhere might still >> be using it. It's not like it's costing us any maintenance effort >> to leave it there. > Keep in mind that it was undocumented and strangely named -- while there > might be some people using it, I'd wager not many. True enough. > Whether we leave it in or not, I'd vote for eventually getting rid of > the current version. So if we keep it, we could add random_sample() and > keep oidrand() as a wrapper over it (that emits an elog(WARNING) to let > people know it's been deprecated). And if we want to get rid of it, we > could probably just add the elog() for 7.4 and then remove it outright > in 7.5. Well, now you're adding a whole lot of maintenance effort to support some quite-hypothetical users of a function we all agree is junk ;-) I think we should either leave it be, or rip it out and be done with it. I don't actually much care which (does anyone out there have an opinion?). But a phased-obsolescence plan is far more work than it deserves. regards, tom lane
Tom Lane wrote: > > Whether we leave it in or not, I'd vote for eventually getting rid of > > the current version. So if we keep it, we could add random_sample() and > > keep oidrand() as a wrapper over it (that emits an elog(WARNING) to let > > people know it's been deprecated). And if we want to get rid of it, we > > could probably just add the elog() for 7.4 and then remove it outright > > in 7.5. > > Well, now you're adding a whole lot of maintenance effort to support > some quite-hypothetical users of a function we all agree is junk ;-) > > I think we should either leave it be, or rip it out and be done with it. > I don't actually much care which (does anyone out there have an > opinion?). But a phased-obsolescence plan is far more work than it > deserves. Yea, just remove it and we can re-add it later if people ask for it. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Wed, 2003-01-15 at 23:26, Tom Lane wrote: > I think we should either leave it be, or rip it out and be done with it. > I don't actually much care which (does anyone out there have an > opinion?). But a phased-obsolescence plan is far more work than it > deserves. Heh, fair enough. I'll send in a revised patch that removes it tomorrow. (BTW, this *particular* case may be a bad situation to worry about phased-obsolescence, but in general I think it's worth paying more attention to reducing backward-compatibility headaches for users...) Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
> > Well, now you're adding a whole lot of maintenance effort to support > > some quite-hypothetical users of a function we all agree is junk ;-) > > > > I think we should either leave it be, or rip it out and be done with it. > > I don't actually much care which (does anyone out there have an > > opinion?). But a phased-obsolescence plan is far more work than it > > deserves. > > Yea, just remove it and we can re-add it later if people ask for it. C'mon guys! We really have to leave it! It's of no consequence to us to leave it there, and backwards compatibility is really, really important! (Says guy who just spend 3 hours trying to upgrade development server to 7.3!) Chris
Neil Conway <neilc@samurai.com> writes: > (BTW, this *particular* case may be a bad situation to worry about > phased-obsolescence, but in general I think it's worth paying more > attention to reducing backward-compatibility headaches for users...) Indeed, we've been beat about the head and shoulders on that point often enough. But you can easily bring a development effort to a standstill by spending all your time on such matters rather than productive new work. I think the secret is to understand where it's worth spending effort on backwards-compatibility issues, and where not. I can't claim to have a real good track record on making the right choices... regards, tom lane
On Wed, 2003-01-15 at 23:45, Christopher Kings-Lynne wrote: > C'mon guys! We really have to leave it! It's of no consequence to us to > leave it there It is, though. Continuing to maintain old, crufty code that is no longer used *does* have a finite cost. Consider the security holes during 7.2.x: the majority of those were found in code that was written years ago, and only occaisonally looked at since (e.g. the geometry types). > backwards compatibility is really, really important! Agreed, but given that this feature (a) was completely undocumented (b) strangely named You would have to try *pretty hard* just to know that it existed. And even if you decided to use it in an application, it won't take much more than grepping for 'oidrand' and replacing it with 'random() < 1.0/y' in order to upgrade to 7.4. (That's in contrast to things like tightening the rules for varchar input, which aren't trivial to workaround -- in *that* case, I can see the need for special arrangements for backward-compatibility...) Cheers, Neil P.S. Is there a reason for keeping nullvalue() and nonnullvalue() in misc.c? -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Neil Conway <neilc@samurai.com> writes: > You would have to try *pretty hard* just to know that it existed. And > even if you decided to use it in an application, it won't take much more > than grepping for 'oidrand' and replacing it with 'random() < 1.0/y' in > order to upgrade to 7.4. Or you could do CREATE FUNCTION oidrand(oid, int) RETURNS bool AS 'SELECT random() < 1.0/$2' LANGUAGE sql; regards, tom lane
On Wed, 2003-01-15 at 23:27, Bruce Momjian wrote: > Yea, just remove it and we can re-add it later if people ask for it. Ok, this patch removes oidrand(), oidsrand(), and userfntest(), and improves the SGML docs a little bit (un-commenting the setseed() documentation). Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Attachment
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches I will try to apply it within the next 48 hours. --------------------------------------------------------------------------- Neil Conway wrote: > On Wed, 2003-01-15 at 23:27, Bruce Momjian wrote: > > Yea, just remove it and we can re-add it later if people ask for it. > > Ok, this patch removes oidrand(), oidsrand(), and userfntest(), and > improves the SGML docs a little bit (un-commenting the setseed() > documentation). > > Cheers, > > Neil > -- > Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC > > [ Attachment, skipping... ] -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Patch applied. Thanks. --------------------------------------------------------------------------- Neil Conway wrote: > On Wed, 2003-01-15 at 23:27, Bruce Momjian wrote: > > Yea, just remove it and we can re-add it later if people ask for it. > > Ok, this patch removes oidrand(), oidsrand(), and userfntest(), and > improves the SGML docs a little bit (un-commenting the setseed() > documentation). > > Cheers, > > Neil > -- > Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC > > [ Attachment, skipping... ] -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073