Thread: random
Hi What would be the best way to select a random row from a result set? Possibilities: 1) o get the total number of rows using count() o generate a random number between 1 and the total o select the n'th rowusing OFFSET 2) o get the total number of rows using count() o generate a random number between 1 and the total o fetch n times Option 1 uses 'OFFSET' which is a database-specific keyword, which is probably bad. Also, is there a way to randomize the order of a result set? Thanks in advance Jelle Ouwerkerk Software Developer Openface Internet Inc. Montreal, Quebec, Canada http://www.openface.ca
Jelle Ouwerkerk <jelle@openface.ca> writes: > Also, is there a way to randomize the order of a result set? There's alwaysSELECT * FROM foo ORDER BY random(); regards, tom lane
> Jelle Ouwerkerk <jelle@openface.ca> writes: > > Also, is there a way to randomize the order of a result set? > > There's always > SELECT * FROM foo ORDER BY random(); > How does that work? test=> select random(); random ------------------- 0.896045367650709(1 row) However:test=> select * from pg_class order by random(); does return some output. Is it random, and if so, how? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: >> Jelle Ouwerkerk <jelle@openface.ca> writes: > Also, is there a way to randomize the order of a result set? >> >> There's always >> SELECT * FROM foo ORDER BY random(); > However: > test=> select * from pg_class order by random(); > does return some output. Is it random, and if so, how? Sure. The function is marked noncachable, so it gets evaluated afresh at each row --- but only once per row, as the sort step doesn't recompute any values. Should work just peachy, and it seems to do so in a couple quick tests. regards, tom lane
On Mon, 5 Mar 2001, Bruce Momjian wrote: > > Jelle Ouwerkerk <jelle@openface.ca> writes: > > > Also, is there a way to randomize the order of a result set? > > > > There's always > > SELECT * FROM foo ORDER BY random(); > > > > How does that work? > > test=> select random(); > random > ------------------- > 0.896045367650709 > (1 row) > > However: > > test=> select * from pg_class order by random(); > > does return some output. Is it random, and if so, how? As a guess... I'd assume that if random() is not marked as cachable, it would call random() once for each output row after any where clauses are done so it'd get different random numbers for each row that it'd use for the sorting.
> Bruce Momjian <pgman@candle.pha.pa.us> writes: > >> Jelle Ouwerkerk <jelle@openface.ca> writes: > > Also, is there a way to randomize the order of a result set? > >> > >> There's always > >> SELECT * FROM foo ORDER BY random(); > > > However: > > test=> select * from pg_class order by random(); > > does return some output. Is it random, and if so, how? > > Sure. The function is marked noncachable, so it gets evaluated > afresh at each row --- but only once per row, as the sort step doesn't > recompute any values. Should work just peachy, and it seems to do > so in a couple quick tests. But random returns a random value from 0-1, right? How does that work in ORDER BY? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > But random returns a random value from 0-1, right? How does that work > in ORDER BY? What's the problem? Each row gets a different random value, then we sort. regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes: > > But random returns a random value from 0-1, right? How does that work > > in ORDER BY? > > What's the problem? Each row gets a different random value, then we > sort. Oh, I see. Nifty. I am used to seeing a column name or number in ORDER BY. Why doesn't the parser see this as a column number? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Mon, Mar 05, 2001 at 02:12:15PM -0500, Jelle Ouwerkerk wrote: > Hi > > What would be the best way to select a random row from a result set? > > Possibilities: > > 1) o get the total number of rows using count() > o generate a random number between 1 and the total > o select the n'th row using OFFSET > > 2) o get the total number of rows using count() > o generate a random number between 1 and the total > o fetch n times Here's my approach. It allows to get random row from the table, not from result set in general. Maybe you will be able to modify it to fit your needs. The table, which we want to get random row from, is not shown here. Let's assume that it has primary key on integer type column. In the table rnd we keep references to this table in value column. Other columns are needed by our selection mechanism. We may want 'values' to be duplicated (different weights), so there is this 'id' column which uniquely identifies rnd's row. CREATE SEQUENCE rnd_seq; CREATE TABLE rnd ( id INT4 NOT NULL DEFAULT NEXTVAL('rnd_seq'), r DOUBLE PRECISION NOT NULL DEFAULTrandom(), value INT4, PRIMARY KEY (id) ); /* My idea is to index this table with random values.* Then it is sufficent to take first row in this order to get a randomone* and of course we have to modify its r (random) fields not to get it again,* and again, ... ** I hope you understandmy explanation in weak english */ CREATE INDEX rnd_r_idx ON rnd (r); /* This function does it. SELECT it issues is very fast, 'cause it uses an* index */ CREATE FUNCTION get_rnd() RETURNS INT4 AS ' DECLARE rowid INT4; val INT4; BEGIN SELECT id, valueINTO rowid, val FROM rnd ORDER BY r LIMIT 1; IF NOT FOUND THEN RETURNNULL; END IF; UPDATE rnd SET r = random() WHERE id = rowid; RETURN val; END; ' LANGUAGE 'plpgsql'; /* This function only fills the table with test data */ CREATE FUNCTION fill() RETURNS INT4 AS ' DECLARE i INT4; BEGIN i := 5555; WHILE i <> 0 LOOP INSERT INTO rnd (value) VALUES (i); i := i - 1; END LOOP; RETURN 0; END; ' LANGUAGE 'plpgsql'; VACUUM ANALYZE rnd; What do you think? Is it a good idea? greetings, Tom -- .signature: Too many levels of symbolic links
On Wed, Mar 07, 2001 at 12:50:36PM +0100, Tomek Zielonka wrote: > On Mon, Mar 05, 2001 at 02:12:15PM -0500, Jelle Ouwerkerk wrote: > > Hi > > > > What would be the best way to select a random row from a result set? > > > > Here's my approach. It allows to get random row from the table, not from result > set in general. Maybe you will be able to modify it to fit your needs. [cut] > What do you think? Is it a good idea? I tested it myself and unfortunately it doesn't work as should. I have to rethink it. greetings, Tom -- .signature: Too many levels of symbolic links