Re: Obtaining random rows from a result set - Mailing list pgsql-general

From Kaloyan Iliev
Subject Re: Obtaining random rows from a result set
Date
Msg-id 46D818F8.9080006@digsys.bg
Whole thread Raw
In response to Obtaining random rows from a result set  (Alban Hertroys <alban@magproductions.nl>)
Responses Re: Obtaining random rows from a result set  (Erik Jones <erik@myemma.com>)
List pgsql-general
Hi,
Why not generate a random number in your application and then:

SELECT *
FROM table_x
WHERE condition = true
OFFSET generated_random_number
LIMIT xx

Kaloyan Iliev

Alban Hertroys wrote:

>Hello,
>
>I've recently been busy improving a query that yields a fixed number of
>random records matching certain conditions. I have tried all the usual
>approaches, and although they do work, they're all limited in some way
>and don't translate really well to what you "want". They're kludges, IMHO.
>
>The methods I've tried are explained quite well on
>http://people.planetpostgresql.org/greg/index.php?/archives/40-Getting-random-rows-from-a-database-table.html
>
>All these methods involve calculating a random number for every record
>in the result set at some point in time, which is really not what I'm
>trying to model. I think the database should provide some means to get
>those records, so...
>
>Dear Santa,
>
>I'd like my database to have functionality analogue to how LIMIT works,
>but for other - non-sequential - algorithms.
>
>I was thinking along the lines of:
>
>    SELECT *
>      FROM table
>     WHERE condition = true
>     RANDOM 5;
>
>Which would (up to) return 5 random rows from the result set, just as
>LIMIT 5 returns (up to) the first 5 records in the result set.
>
>
>Or maybe even with a custom function, so that you could get non-linear
>distributions:
>
>    SELECT *
>      FROM table
>     WHERE condition = true
>     LIMIT 5 USING my_func();
>
>    Where my_func() could be a user definable function accepting a number
>that should be (an estimate of?) the number of results being returned so
>that it can provide pointers to which rows in the resultset will be
>returned from the query.
>
>Examples:
>* random(maxrows) would return random rows from the resultset.
>* median() would return the rows in the middle of the result set (this
>would require ordering to be meaningful).
>
>What do people think, is this feasable? Desirable? Necessary?
>
>If I'd have time I'd volunteer for at least looking into this, but I'm
>working on three projects simultaneously already. Alas...
>
>Regards,
>Alban Hertroys.
>
>
>



pgsql-general by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: Re: Obtaining random rows from a result set
Next
From: Tom Lane
Date:
Subject: Re: Out of memory error, FreeBSD 6.2, PostgreSQL 8.2.4