Re: Random() - Mailing list pgsql-sql

From Rajesh Kumar Mallah
Subject Re: Random()
Date
Msg-id a97c77030611181045tde17ec0t323354998fc43619@mail.gmail.com
Whole thread Raw
In response to Re: Random()  ("A. Kretschmer" <andreas.kretschmer@schollglas.com>)
List pgsql-sql


On 11/18/06, A. Kretschmer <andreas.kretschmer@schollglas.com> wrote:
am  Sat, dem 18.11.2006, um 23:02:33 +0530 mailte Rajesh Kumar Mallah folgendes:
> >select <bigint_column> from <your_table> order by random() limit 1;
>
> This query will tend to get slower as the table grows because of the
> sorting.

Right.


>
> it possible to get a row from a random offset
> how about
>
> select <bigint_column> from <your_table>  limit 1 offset  20000*random() ;

For instance, you have a table with only 500 rows and random() returns
0.999? In other words: in think, this is a bad idea, unless you use a
constant for the offset-calculation thats not larger than the number of
rows in this table. For this, you need this value, but select count(1)
from table force a seqscan -> slow.

but order by random() would also forces a seqscan , looks like the random offset method performs
better

explain analyze select count(*) from general.profile_master;
+--------------------------------------------------------------------------------------------------------------------------------+
|                                                           QUERY PLAN                                                           |
+--------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=139214.26..139214.27 rows=1 width=0) (actual time=3071.178..3071.179 rows=1 loops=1)                          |
|   ->  Seq Scan on profile_master  (cost=0.00..137703.21 rows=604421 width=0) (actual time=0.032..2686.842 rows=601240 loops=1) |
| Total runtime: 3071.268 ms                                                                                                     |
+--------------------------------------------------------------------------------------------------------------------------------+
(3 rows)

tradein_clients=> explain analyze SELECT profile_id from general.profile_master limit 1 offset 601240*random();
+--------------------------------------------------------------------------------------------------------------------------------+
|                                                           QUERY PLAN                                                           |
+--------------------------------------------------------------------------------------------------------------------------------+
| Limit  (cost=13770.30..13770.53 rows=1 width=4) (actual time=1614.146..1614.147 rows=1 loops=1)                                |
|   ->  Seq Scan on profile_master  (cost=0.00..137703.21 rows=604421 width=4) (actual time=0.036..1375.742 rows=429779 loops=1) |
| Total runtime: 1614.187 ms                                                                                                     |
+--------------------------------------------------------------------------------------------------------------------------------+
(3 rows)

tradein_clients=> explain analyze SELECT profile_id from general.profile_master order by random() limit 1;
+--------------------------------------------------------------------------------------------------------------------------------------+
|                                                              QUERY PLAN                                                              |
+--------------------------------------------------------------------------------------------------------------------------------------+
| Limit  (cost=207079.39..207079.40 rows=1 width=4) (actual time=11715.694..11715.695 rows=1 loops=1)                                  |
|   ->  Sort  (cost=207079.39..208590.45 rows=604421 width=4) (actual time=11715.691..11715.691 rows=1 loops=1)                        |
|         Sort Key: random()                                                                                                           |
|         ->  Seq Scan on profile_master  (cost=0.00..139214.26 rows=604421 width=4) (actual time=0.036..4605.259 rows=601241 loops=1) |
| Total runtime: 11716.039 ms                                                                                                          |
+--------------------------------------------------------------------------------------------------------------------------------------+
(5 rows)







Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47215,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: transactional shared variable in postgresql
Next
From: "Rajesh Kumar Mallah"
Date:
Subject: Re: transactional shared variable in postgresql