Thread: Best way to select a random row from a derived table

Best way to select a random row from a derived table

From
Ryan Murphy
Date:
Hello hackers and postgressers,

I am aware of 2 ways to select a random row from a table:

    1) select * from table_name order by random() limit 1;
        -- terribly inefficient

    2) select * from table_name tablesample system_rows(1) limit 1;
        -- only works on tables, not views or subqueries

Is there an option that is reasonably efficient and can be used on views and subqueries?

Thanks!
Ryan

Re: Best way to select a random row from a derived table

From
Fabien COELHO
Date:
> I am aware of 2 ways to select a random row from a table:
>
>    1) select * from table_name order by random() limit 1;
>        -- terribly inefficient
>
>    2) select * from table_name tablesample system_rows(1) limit 1;
>        -- only works on tables, not views or subqueries
>
> Is there an option that is reasonably efficient and can be used on views
> and subqueries?

My 0.02€: I'd say this is not possible. In order to choose a item from a 
set randomly, you need to consider somehow the size of the set. Solution 
(2) can be done because the size of the table is known to TABLESAMPLE. 
Solution (1) does it by actually generating the set, hence the cost, so 
that its size is implicitely known as well. I cannot see a way out of this 
conundrum for a general query for which the size is not known without 
executing it. I'd like to be proven false, though:-)

-- 
Fabien.

Re: Best way to select a random row from a derived table

From
Fabien COELHO
Date:
> I am aware of 2 ways to select a random row from a table:
>
>    1) select * from table_name order by random() limit 1;
>        -- terribly inefficient
>
>    2) select * from table_name tablesample system_rows(1) limit 1;
>        -- only works on tables, not views or subqueries
>
> Is there an option that is reasonably efficient and can be used on views
> and subqueries?

My 0.02€: I'd say this is not possible. In order to choose a item from a 
set randomly, you need to consider somehow the size of the set. Solution 
(2) can be done because the size of the table is known to TABLESAMPLE. 
Solution (1) does it by actually generating the set, hence the cost, so 
that its size is implicitely known as well. I cannot see a way out of this 
conundrum for a general query for which the size is not known without 
executing it. I'd like to be proven false, though:-)

-- 
Fabien.

Re: Best way to select a random row from a derived table

From
Condor
Date:
On 28-01-2018 08:39, Ryan Murphy wrote:
> Hello hackers and postgressers,
> 
> I am aware of 2 ways to select a random row from a table:
> 
>     1) select * from table_name order by random() limit 1;
>         -- terribly inefficient
> 
>     2) select * from table_name tablesample system_rows(1) limit 1;
>         -- only works on tables, not views or subqueries
> 
> Is there an option that is reasonably efficient and can be used on
> views and subqueries?
> 
> Thanks!
> Ryan


I do it with:

SELECT * FROM table_name OFFSET RANDOM() * LIMIT 1;

Regards,
HC


Re: Best way to select a random row from a derived table

From
Condor
Date:
On 28-01-2018 08:39, Ryan Murphy wrote:
> Hello hackers and postgressers,
> 
> I am aware of 2 ways to select a random row from a table:
> 
>     1) select * from table_name order by random() limit 1;
>         -- terribly inefficient
> 
>     2) select * from table_name tablesample system_rows(1) limit 1;
>         -- only works on tables, not views or subqueries
> 
> Is there an option that is reasonably efficient and can be used on
> views and subqueries?
> 
> Thanks!
> Ryan


I do it with:

SELECT * FROM table_name OFFSET RANDOM() * LIMIT 1;

Regards,
HC