Re: random - Mailing list pgsql-sql

From Tomek Zielonka
Subject Re: random
Date
Msg-id 20010307125036.A4207@mult.i.pl
Whole thread Raw
In response to random  (Jelle Ouwerkerk <jelle@openface.ca>)
Responses Re: random  (Tomek Zielonka <tomek-lists@mult.i.pl>)
List pgsql-sql
On Mon, Mar 05, 2001 at 02:12:15PM -0500, Jelle Ouwerkerk wrote:
> Hi
> 
> What would be the best way to select a random row from a result set?
> 
> Possibilities:
> 
> 1) o get the total number of rows using count()
>    o generate a random number between 1 and the total
>    o select the n'th row using OFFSET
> 
> 2) o get the total number of rows using count()
>    o generate a random number between 1 and the total
>    o fetch n times

Here's my approach. It allows to get random row from the table, not from result
set in general. Maybe you will be able to modify it to fit your needs.

The table, which we want to get random row from, is not shown here. Let's
assume that it has primary key on integer type column. In the table rnd we keep
references to this table in value column.

Other columns are needed by our selection mechanism. We may want 'values' to be
duplicated (different weights), so there is this 'id' column which uniquely
identifies rnd's row.

CREATE SEQUENCE rnd_seq;

CREATE TABLE rnd (   id      INT4             NOT NULL DEFAULT NEXTVAL('rnd_seq'),   r       DOUBLE PRECISION NOT NULL
DEFAULTrandom(),   value   INT4,   PRIMARY KEY (id)
 
);

/* My idea is to index this table with random values.* Then it is sufficent to take first row in this order to get a
randomone* and of course we have to modify its r (random) fields not to get it again,* and again, ... ** I hope you
understandmy explanation in weak english */
 
CREATE INDEX rnd_r_idx ON rnd (r);

/* This function does it. SELECT it issues is very fast, 'cause it uses an* index */

CREATE FUNCTION get_rnd() RETURNS INT4 AS '   DECLARE       rowid  INT4;       val  INT4;   BEGIN       SELECT id,
valueINTO rowid, val           FROM rnd           ORDER BY r           LIMIT 1;                  IF NOT FOUND THEN
RETURNNULL; END IF;              UPDATE rnd SET r = random() WHERE id = rowid;
 
       RETURN val;   END;
' LANGUAGE 'plpgsql';

/* This function only fills the table with test data */

CREATE FUNCTION fill() RETURNS INT4 AS '   DECLARE       i     INT4;   BEGIN       i := 5555;       WHILE i <> 0 LOOP
       INSERT INTO rnd (value) VALUES (i);           i := i - 1;       END LOOP;       RETURN 0;   END;
 
' LANGUAGE 'plpgsql';

VACUUM ANALYZE rnd;



What do you think? Is it a good idea?

greetings,
Tom

-- 
.signature: Too many levels of symbolic links


pgsql-sql by date:

Previous
From: "Grigoriy G. Vovk"
Date:
Subject: Re: Quick question MySQL --> PgSQL
Next
From: Tomek Zielonka
Date:
Subject: Re: random