Re: Obtaining random rows from a result set - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: Obtaining random rows from a result set
Date
Msg-id 20070831135416.GA23673@svana.org
In response to Obtaining random rows from a result set  (Alban Hertroys <alban@magproductions.nl>)
Responses Re: Obtaining random rows from a result set  (Alban Hertroys <alban@magproductions.nl>)
List pgsql-general
On Fri, Aug 31, 2007 at 02:42:18PM +0200, Alban Hertroys wrote:
> Examples:
> * random(maxrows) would return random rows from the resultset.
> * median() would return the rows in the middle of the result set (this
> would require ordering to be meaningful).

It would be possible to write an aggregate that returns a single random
value from a set. The algorithm is something like:

n = 1
v = null
for each row:
  if random() < 1/n:    # keep this row with probability 1/n (real division)
    v = value of row
  n = n + 1

return v
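
As a rough illustration only, here is the same loop as plain Python rather
than as a PostgreSQL aggregate. The name pick_one and the optional rng
parameter are made up for this sketch; an actual aggregate would keep n and
v in its transition state instead.

```python
import random

def pick_one(rows, rng=None):
    """Return one uniformly random item from an iterable in a single pass.

    Hypothetical helper: 'pick_one' and 'rng' are names invented for this
    sketch, not part of PostgreSQL or of the original post.
    """
    rng = rng or random.Random()
    chosen = None          # stays None if there are no rows
    n = 1
    for value in rows:
        # Replace the current pick with probability 1/n.
        # For n == 1 this always succeeds, since random() is in [0, 1).
        if rng.random() < 1.0 / n:
            chosen = value
        n += 1
    return chosen

if __name__ == "__main__":
    print(pick_one(["a", "b", "c", "d"]))
```

Every row still has to be visited once, which is why this amounts to a
sequential scan; the upside is that the row count need not be known up front.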

It does require a seqscan though. If you're asking for 5 random rows
you probably mean 5 random but distinct rows, which is different to
just running the above 5 times in parallel: independent runs can each
end up picking the same row.
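
One way to get k distinct rows in a single pass is reservoir sampling
(usually called Algorithm R), which is a different technique from running
the single-value loop several times. The sketch below is plain Python, not
an aggregate, and pick_k and rng are names invented for the example.

```python
import random

def pick_k(rows, k, rng=None):
    """Return up to k items from an iterable, sampled without replacement.

    Reservoir sampling (Algorithm R). 'pick_k' and 'rng' are hypothetical
    names for this sketch. "Distinct" here means distinct rows (positions);
    equal values in the input can still both be selected.
    """
    rng = rng or random.Random()
    reservoir = []
    for n, value in enumerate(rows, start=1):
        if n <= k:
            # The first k rows fill the reservoir directly.
            reservoir.append(value)
        else:
            # Row n replaces a random slot with probability k/n.
            j = rng.randrange(n)   # uniform integer in [0, n)
            if j < k:
                reservoir[j] = value
    return reservoir

if __name__ == "__main__":
    print(pick_k(range(100), 5))
```

If the input has fewer than k rows, all of them come back. As with the
single-value version, every row is read exactly once.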

I don't know if there's a similar method for median...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
