Selecting a random row - Mailing list pgsql-general

From Kari Lavikka
Subject Selecting a random row
Date
Msg-id Pine.HPX.4.51.0411041329040.3138@purple.bdb.fi
Whole thread Raw
Responses Re: Selecting a random row
Re: Selecting a random row
Re: Selecting a random row
List pgsql-general
Hi!

I have to select a random row from a table where primary key isn't
continuous (some rows have been deleted). Postgres just seems to do
something strange with my method.

--
-- Use the order by desc limit 1 -trick to get maximum value
--
CREATE OR REPLACE FUNCTION max_uid() RETURNS int4 AS
  'SELECT uid FROM users WHERE status = ''a'' ORDER BY uid DESC LIMIT 1'
  LANGUAGE 'sql';


--
-- Choose a random point between 0 and max_uid and select the first
-- value from the bigger part
--
CREATE OR REPLACE FUNCTION random_uid() RETURNS int4 AS
  'SELECT uid FROM users u WHERE u.status = ''a'' AND uid >=
   cast(cast(max_uid() - 1 AS FLOAT) * random() AS INTEGER) ORDER BY uid
   ASC LIMIT 1'
  LANGUAGE 'sql';

--
-- testing and looks good
--
galleria=> SELECT max_uid();
 max_uid
---------
  126263

--
-- testing...
--
galleria=> SELECT random_uid(), random_uid(), random_uid(), random_uid(), random_uid();
 random_uid | random_uid | random_uid | random_uid | random_uid
------------+------------+------------+------------+------------
        322 |        601 |        266 |        427 |        369

... but what is this? Values seem to vary from 0 to ~1000.
Not from 0 to 126263!!

How about doing some manual work...

--
-- Testing split point selection
--
galleria=> SELECT cast(cast(max_uid() - 1 AS FLOAT) * random() AS INTEGER);
 int4
-------
 43279

--
-- And inserting split point manually
--
galleria=> SELECT uid FROM users u WHERE u.status = 'a' AND uid >= 43279
           ORDER BY uid ASC LIMIT 1;
  uid
-------
 43284

Works just fine!


Is there any explanation for this strange behavior or are there better
ways to select a random row?

I'm using PG 8.0 b2. Plan for the query is:

 Limit  (cost=0.00..5.19 rows=1 width=4)
   ->  Index Scan using users_pkey on users u  (cost=0.00..69145.26 rows=13329 width=4)
         Filter: ((status = 'a'::bpchar) AND (uid >= ((((max_uid() - 1))::double precision * random()))::integer))

    |\__/|
    ( oo )    Kari Lavikka - tuner@bdb.fi
__ooO(  )Ooo_______ _____ ___ _ _  _   _    _      _                  _
      ""

pgsql-general by date:

Previous
From: Tatsuo Ishii
Date:
Subject: visiting Seoul, Korea
Next
From: Holger Klawitter
Date:
Subject: Re: Selecting a random row