Re: random record from small set - Mailing list pgsql-general

From Jan Poslusny
Subject Re: random record from small set
Date
Msg-id 4212200A.1010606@gingerall.cz
Whole thread Raw
In response to random record from small set  (Jeff Davis <jdavis-pgsql@empires.org>)
Responses Re: random record from small set
List pgsql-general
And what about another data representation like

create table r1 (
   i int,
   chance_from numeric,
   chance_to numeric
)
, you can select one random row in one select, for instance
select * from r1 where chance_from <= $rnd and chance_to > $rnd;

I see these advantages
- Only one select.
- Indices can improve performance if r1 has many rows.

and disadvantage
- Tricky update


Jeff Davis wrote:

>I am trying to retrieve a random record (according to a chance
>attribute) from a small set of records, each with a "chance" attribute.
>This may eventually be somwhat of a performance concern, so I'd like to
>make sure I'm doing this right.
>
>Here's what I have so far:
>
>create table r1 (
>   i int,
>   chance numeric
>)
>create or replace function randrec() returns int as $$
>    $res = spi_exec_query('select i,chance from r1');
>    $r = rand;
>    $accum = 0;
>    $i = 0;
>    while($accum < $r) {
>        $accum += $res->{rows}[$i++]->{chance}
>    }
>    return $res->{rows}[$i-1]->{i};
>$$ language plperl;
>
>test=# select * from r1;
> i | chance
>---+--------
> 1 |   0.25
> 2 |   0.20
> 3 |   0.15
> 4 |   0.10
> 5 |   0.30
>
>
>That seems to work, in that out of 10k times, I got the following
>numbers of each:
>1 2479
>2 1959
>3 1522
>4 950
>5 3090
>
>But I have a few questions:
>* Am I right to use NUMERIC for the chance attribute?
>* Does perl's arithmetic leave me with the chance that those numeric
>values don't add up to 1.00 (and in this case that could mean an
>infinite loop)?
>* In my design I'll need a constraint trigger making sure that the
>numbers add up to 1.00. Will that be a performance problem for
>operations on the table that don't modify the chance attribute?
>* Is there a better way?
>* Does spi_exec_query pull the entire result set into memory at once? Is
>there a point at which performance could be a serious problem if there
>are a large number of items to select among?
>
>Regards,
>    Jeff Davis
>
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
>

pgsql-general by date:

Previous
From: Ruben Oliveira
Date:
Subject: problem with dots in order by
Next
From: Jan Poslusny
Date:
Subject: Re: random record from small set