Thread: select where id=random()*something returns two results

select where id=random()*something returns two results

From
Ulrich Meis
Date:
========================================================================
====
POSTGRESQL BUG REPORT TEMPLATE
========================================================================
====


Your name        :    Ulrich Meis
Your email address    :    u.meis ( at ) gmx ( dot ) de


System Configuration
---------------------
Architecture (example: Intel Pentium) : AMD XP 1ghz

Operating System (example: Linux 2.0.26 ELF) : Linux (6month old) Gentoo

PostgreSQL version (example: PostgreSQL-7.3): PostgreSQL 7.3.4

Compiler used (example: gcc 2.95.2) : gnu gcc 3.2.2


Please enter a FULL description of your problem:
------------------------------------------------
A "select * from table where primkey=..." sometimes returns two results.
Having a table of quotes created as can be seen in the next section,
I've seen the following in psql:

select * from quotes where id=1+round(random()* cast ((select max(id)
from quotes) as double precision));
 id  |                           quote                           |
author
-----+-----------------------------------------------------------+------
-----------
 187 | Vergib Deinen Feinden, aber vergiss niemals ihre Namen.   | John
F. Kennedy
 377 | Die Wirklichkeit ist nicht so oder so, sondern so und so. | Harry
Mulisch
(2 rows)

I'm not really into databases, but this sounds wrong. Most of the time,
I actually get 0 results.
This should be impossible as well, because I filled the table up without
touching the serial and without
deleting a single row.

Please describe a way to repeat the problem.   Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------
create table quotes (id serial,quote text,author text);

Fill in some quotes...

select * from quotes where id=1+round(random()* cast ((select max(id)
from quotes) as double precision));

If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
No idea.

Re: select where id=random()*something returns two results

From
Kris Jurka
Date:
On Fri, 19 Sep 2003, Ulrich Meis wrote:

> select * from quotes where id=1+round(random()* cast ((select max(id)
> from quotes) as double precision));
>  id  |                           quote                           |
> author
> -----+-----------------------------------------------------------+------
> -----------
>  187 | Vergib Deinen Feinden, aber vergiss niemals ihre Namen.   | John
> F. Kennedy
>  377 | Die Wirklichkeit ist nicht so oder so, sondern so und so. | Harry
> Mulisch
> (2 rows)
>
> I'm not really into databases, but this sounds wrong. Most of the time,
> I actually get 0 results.

The problem is that random() is evaluated on each line giving a different
result.

Kris Jurka

Re: select where id=random()*something returns two results

From
Rod Taylor
Date:
> select * from quotes where id=3D1+round(random()* cast ((select max(id)
> from quotes) as double precision));
>  id  |                           quote                           |
> author=20=20=20=20=20=20
> -----+-----------------------------------------------------------+------
> -----------
>  187 | Vergib Deinen Feinden, aber vergiss niemals ihre Namen.   | John
> F. Kennedy
>  377 | Die Wirklichkeit ist nicht so oder so, sondern so und so. | Harry
> Mulisch
> (2 rows)
>=20
> I'm not really into databases, but this sounds wrong. Most of the time,
> I actually get 0 results.

Random is calculated per call (in this case per comparison). So, the
value you compare against for 187 is not the same as 377.

UPDATE table SET column =3D random(); will show the effect.

If you wrap randon() in a subselect, it will cause it to be evaluated
once:

SELECT * from quotes where id =3D 1+round((SELECT random()) * cast(....).



However, a much faster query for your purposes would be:

SELECT * FROM quotes ORDER BY random() LIMIT 1;

Re: select where id=random()*something returns two results

From
Ulrich Meis
Date:
Sorry for buggering you, I get the point :)

And thanks for the order by limit 1 hint. That will do.

Ulrich Meis

Re: select where id=random()*something returns two results

From
Ian Grant
Date:
> Sorry for buggering you, I get the point :)

I think you mean 'bugging.' Buggering is something quite different.

Re: select where id=random()*something returns two results

From
Jean-Luc Lachance
Date:
Rod,

If the table has 100,000 tupples your query is generating 100,000 new
tupples...
Try:

select * from quotes where id = (
  select int8( 1 + random() * (
    select id from quotes order by id desc limit 1)));

JLL

Rod Taylor wrote:
>
> > select * from quotes where id=1+round(random()* cast ((select max(id)
> > from quotes) as double precision));
> >  id  |                           quote                           |
> > author
> > -----+-----------------------------------------------------------+------
> > -----------
> >  187 | Vergib Deinen Feinden, aber vergiss niemals ihre Namen.   | John
> > F. Kennedy
> >  377 | Die Wirklichkeit ist nicht so oder so, sondern so und so. | Harry
> > Mulisch
> > (2 rows)
> >
> > I'm not really into databases, but this sounds wrong. Most of the time,
> > I actually get 0 results.
>
> Random is calculated per call (in this case per comparison). So, the
> value you compare against for 187 is not the same as 377.
>
> UPDATE table SET column = random(); will show the effect.
>
> If you wrap randon() in a subselect, it will cause it to be evaluated
> once:
>
> SELECT * from quotes where id = 1+round((SELECT random()) * cast(....).
>
> However, a much faster query for your purposes would be:
>
> SELECT * FROM quotes ORDER BY random() LIMIT 1;
>
>   ------------------------------------------------------------------------
>                        Name: signature.asc
>    signature.asc       Type: application/pgp-signature
>                 Description: This is a digitally signed message part

Re: select where id=random()*something returns two results

From
Ulrich Meis
Date:
> -----Original Message-----
> From: pgsql-bugs-owner@postgresql.org [mailto:pgsql-bugs-
> owner@postgresql.org] On Behalf Of Ian Grant
> Sent: Friday, September 19, 2003 1:02 PM
> To: Ulrich Meis
> Cc: Ian.Grant@cl.cam.ac.uk; pgsql-bugs@postgresql.org
> Subject: Re: [BUGS] select where id=random()*something returns two
results
>
> > Sorry for buggering you, I get the point :)
>
> I think you mean 'bugging.' Buggering is something quite different.

Sorry again, surely I meant bugging :-)

Re: select where id=random()*something returns two results

From
Ulrich Meis
Date:
> -----Original Message-----
> From: pgsql-bugs-owner@postgresql.org [mailto:pgsql-bugs-
> owner@postgresql.org] On Behalf Of Jean-Luc Lachance
> Sent: Friday, September 19, 2003 4:44 PM
> To: Rod Taylor
> Cc: Ulrich Meis; pgsql-bugs@postgresql.org
> Subject: Re: [BUGS] select where id=random()*something returns two
results
>
> Rod,
>
> If the table has 100,000 tupples your query is generating 100,000 new
> tupples...
> Try:
>
> select * from quotes where id = (
>   select int8( 1 + random() * (
>     select id from quotes order by id desc limit 1)));
>

How about

select * from quotes where id=1+int8((select random())*(select max(id)
from quotes));

It works, but is it more or less efficient?

Re: select where id=random()*something returns two results

From
Rod Taylor
Date:
On Sun, 2003-09-21 at 08:21, Ulrich Meis wrote:
> > -----Original Message-----
> > From: pgsql-bugs-owner@postgresql.org [mailto:pgsql-bugs-
> > owner@postgresql.org] On Behalf Of Jean-Luc Lachance
> > Sent: Friday, September 19, 2003 4:44 PM
> > To: Rod Taylor
> > Cc: Ulrich Meis; pgsql-bugs@postgresql.org
> > Subject: Re: [BUGS] select where id=3Drandom()*something returns two
> results
> >=20
> > Rod,
> >=20
> > If the table has 100,000 tupples your query is generating 100,000 new
> > tupples...
> > Try:
> >=20
> > select * from quotes where id =3D (
> >   select int8( 1 + random() * (
> >     select id from quotes order by id desc limit 1)));
> >=20
>=20
> How about
>=20
> select * from quotes where id=3D1+int8((select random())*(select max(id)
> from quotes));
>=20
> It works, but is it more or less efficient?

Run EXPLAIN ANALYZE on them both and you tell me which is more
efficient.

Efficiency of a query tends to change with the data that it is being
executed on.