Thread: random

random

From
Jelle Ouwerkerk
Date:
Hi

What would be the best way to select a random row from a result set?

Possibilities:

1) o get the total number of rows using count()  o generate a random number between 1 and the total  o select the n'th
rowusing OFFSET
 

2) o get the total number of rows using count()  o generate a random number between 1 and the total  o fetch n times

Option 1 uses 'OFFSET' which is a database-specific keyword, which is
probably bad. 

Also, is there a way to randomize the order of a result set?

Thanks in advance

Jelle Ouwerkerk
Software Developer 
Openface Internet Inc.
Montreal, Quebec, Canada
http://www.openface.ca



Re: random

From
Tom Lane
Date:
Jelle Ouwerkerk <jelle@openface.ca> writes:
> Also, is there a way to randomize the order of a result set?

There's alwaysSELECT * FROM foo ORDER BY random();
        regards, tom lane


Re: random

From
Bruce Momjian
Date:
> Jelle Ouwerkerk <jelle@openface.ca> writes:
> > Also, is there a way to randomize the order of a result set?
> 
> There's always
>     SELECT * FROM foo ORDER BY random();
> 

How does that work?
test=> select random();      random       ------------------- 0.896045367650709(1 row)

However:test=> select * from pg_class order by random();

does return some output.  Is it random, and if so, how?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: random

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Jelle Ouwerkerk <jelle@openface.ca> writes:
> Also, is there a way to randomize the order of a result set?
>> 
>> There's always
>> SELECT * FROM foo ORDER BY random();

> However:
>     test=> select * from pg_class order by random();
> does return some output.  Is it random, and if so, how?

Sure.  The function is marked noncachable, so it gets evaluated
afresh at each row --- but only once per row, as the sort step doesn't
recompute any values.  Should work just peachy, and it seems to do
so in a couple quick tests.
        regards, tom lane


Re: random

From
Stephan Szabo
Date:
On Mon, 5 Mar 2001, Bruce Momjian wrote:

> > Jelle Ouwerkerk <jelle@openface.ca> writes:
> > > Also, is there a way to randomize the order of a result set?
> > 
> > There's always
> >     SELECT * FROM foo ORDER BY random();
> > 
> 
> How does that work?
> 
>     test=> select random();
>           random       
>     -------------------
>      0.896045367650709
>     (1 row)
> 
> However:
>     
>     test=> select * from pg_class order by random();
> 
> does return some output.  Is it random, and if so, how?

As a guess...
I'd assume that if random() is not marked as cachable, it
would call random() once for each output row after any
where clauses are done so it'd get different random
numbers for each row that it'd use for the sorting.



Re: random

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Jelle Ouwerkerk <jelle@openface.ca> writes:
> > Also, is there a way to randomize the order of a result set?
> >> 
> >> There's always
> >> SELECT * FROM foo ORDER BY random();
> 
> > However:
> >     test=> select * from pg_class order by random();
> > does return some output.  Is it random, and if so, how?
> 
> Sure.  The function is marked noncachable, so it gets evaluated
> afresh at each row --- but only once per row, as the sort step doesn't
> recompute any values.  Should work just peachy, and it seems to do
> so in a couple quick tests.

But random returns a random value from 0-1, right?  How does that work
in ORDER BY?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: random

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> But random returns a random value from 0-1, right?  How does that work
> in ORDER BY?

What's the problem?  Each row gets a different random value, then we
sort.
        regards, tom lane


Re: random

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > But random returns a random value from 0-1, right?  How does that work
> > in ORDER BY?
> 
> What's the problem?  Each row gets a different random value, then we
> sort.

Oh, I see.  Nifty.  I am used to seeing a column name or number in ORDER
BY.  Why doesn't the parser see this as a column number?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: random

From
Tomek Zielonka
Date:
On Mon, Mar 05, 2001 at 02:12:15PM -0500, Jelle Ouwerkerk wrote:
> Hi
> 
> What would be the best way to select a random row from a result set?
> 
> Possibilities:
> 
> 1) o get the total number of rows using count()
>    o generate a random number between 1 and the total
>    o select the n'th row using OFFSET
> 
> 2) o get the total number of rows using count()
>    o generate a random number between 1 and the total
>    o fetch n times

Here's my approach. It allows to get random row from the table, not from result
set in general. Maybe you will be able to modify it to fit your needs.

The table, which we want to get random row from, is not shown here. Let's
assume that it has primary key on integer type column. In the table rnd we keep
references to this table in value column.

Other columns are needed by our selection mechanism. We may want 'values' to be
duplicated (different weights), so there is this 'id' column which uniquely
identifies rnd's row.

CREATE SEQUENCE rnd_seq;

CREATE TABLE rnd (   id      INT4             NOT NULL DEFAULT NEXTVAL('rnd_seq'),   r       DOUBLE PRECISION NOT NULL
DEFAULTrandom(),   value   INT4,   PRIMARY KEY (id)
 
);

/* My idea is to index this table with random values.* Then it is sufficent to take first row in this order to get a
randomone* and of course we have to modify its r (random) fields not to get it again,* and again, ... ** I hope you
understandmy explanation in weak english */
 
CREATE INDEX rnd_r_idx ON rnd (r);

/* This function does it. SELECT it issues is very fast, 'cause it uses an* index */

CREATE FUNCTION get_rnd() RETURNS INT4 AS '   DECLARE       rowid  INT4;       val  INT4;   BEGIN       SELECT id,
valueINTO rowid, val           FROM rnd           ORDER BY r           LIMIT 1;                  IF NOT FOUND THEN
RETURNNULL; END IF;              UPDATE rnd SET r = random() WHERE id = rowid;
 
       RETURN val;   END;
' LANGUAGE 'plpgsql';

/* This function only fills the table with test data */

CREATE FUNCTION fill() RETURNS INT4 AS '   DECLARE       i     INT4;   BEGIN       i := 5555;       WHILE i <> 0 LOOP
       INSERT INTO rnd (value) VALUES (i);           i := i - 1;       END LOOP;       RETURN 0;   END;
 
' LANGUAGE 'plpgsql';

VACUUM ANALYZE rnd;



What do you think? Is it a good idea?

greetings,
Tom

-- 
.signature: Too many levels of symbolic links


Re: random

From
Tomek Zielonka
Date:
On Wed, Mar 07, 2001 at 12:50:36PM +0100, Tomek Zielonka wrote:
> On Mon, Mar 05, 2001 at 02:12:15PM -0500, Jelle Ouwerkerk wrote:
> > Hi
> > 
> > What would be the best way to select a random row from a result set?
> > 
> 
> Here's my approach. It allows to get random row from the table, not from result
> set in general. Maybe you will be able to modify it to fit your needs.

[cut]

> What do you think? Is it a good idea?

I tested it myself and unfortunately it doesn't work as should. I have to
rethink it.

greetings,
Tom

-- 
.signature: Too many levels of symbolic links