Thread: More fun with random selects

More fun with random selects

From
Darren Greer
Date:
Hello all.

I have a question about getting a fixed number of random rows from a
table.  I am already using oidrand(oid, #) to get a random percentage of rows
from a table.  What I want to do now is get a random set of rows, where I pick
the number of rows.

For example, let's say I want a random 125 rows from an existing table.  Is
there an easy way I can do that where I actually am getting a decent sample
from the table?

As always, any help is appreciated.

Darren

--
Darren Greer
System Administrator -
Applications Development Manager
Websight Solutions Inc.
Phone: (414) 790-9327 | Fax: (414) 790-5952


Re: [SQL] More fun with random selects

From
Tom Lane
Date:
Darren Greer <dgreer@websightsolutions.com> writes:
> For example, let's say I want a random 125 rows from an existing table.  Is
> there an easy way I can do that where I actually am getting a decent sample
> from the table?

I don't think there's any way to do that with a simple SQL command.

It's possible to draw exactly N items from a set fully randomly if you
are willing to write a little code.  The algorithm looks like this:

    itemsRemaining = size of set (# rows in table);
    itemsStillNeeded = N (# rows wanted, 125 in your example);
    foreach (item in set)
    {
        generate random value X that is 'true' with probability
            itemsStillNeeded / itemsRemaining;
        if (X)
        {
            emit current item as a selected item;
            decrement itemsStillNeeded;
            if (itemsStillNeeded == 0)
                done;
        }
        decrement itemsRemaining;
    }
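
As a rough illustration, the loop above could be sketched in Python as follows. This is only a sketch under assumptions: the rows are already available as a sequence of known length, and the function name sample_exactly and its rng parameter are made up for this example, not part of any PostgreSQL interface.

```python
import random

def sample_exactly(items, n, rng=random):
    """Pick exactly n items (or all of them, if fewer exist) in one pass.

    `items` must have a known length; `rng` is anything with a .random()
    method returning a float in [0, 1), such as the random module itself.
    """
    items_remaining = len(items)
    items_still_needed = min(n, items_remaining)
    selected = []
    for item in items:
        if items_still_needed == 0:
            break  # "done": everything we wanted has been emitted
        # X is true with probability items_still_needed / items_remaining.
        if rng.random() < items_still_needed / items_remaining:
            selected.append(item)
            items_still_needed -= 1
        items_remaining -= 1
    return selected

if __name__ == "__main__":
    rows = list(range(1000))          # stand-in for the table's rows
    picked = sample_exactly(rows, 125)
    print(len(picked))                # always 125 for a 1000-row input
```

Note that once itemsStillNeeded equals itemsRemaining the probability becomes 1, so every remaining item is taken and the loop cannot come up short. The selected items come out in their original order.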
 

The random choice is typically done with a random number generator
that generates output G between say 0 and M; you make X 'true' if
G <= M * itemsStillNeeded / itemsRemaining.
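
A small sketch of that comparison using only integer arithmetic, in Python. The names M and should_select are hypothetical, and two details are assumptions of this sketch rather than part of the description above: G is drawn from the half-open range 0 .. M-1, and the test uses a strict "<" with both sides multiplied by itemsRemaining to avoid the division. For a large M this differs from the "<=" form by at most one value of G.

```python
import random

M = 2**31  # assumed generator range for this sketch: G is in 0 .. M-1

def should_select(items_still_needed, items_remaining, rng=random):
    """Return True with probability close to needed / remaining."""
    g = rng.randrange(M)
    # Same test as g < M * needed / remaining, rearranged so that
    # no division (and no rounding) is involved.
    return g * items_remaining < M * items_still_needed
```

With this form the test is always true when itemsStillNeeded equals itemsRemaining, and never true when itemsStillNeeded is 0.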
        regards, tom lane