Tim,
Hmm... this might just work, because I could actually filter on myrandfunc() <
.2 and then do a LIMIT on it for 10% or whatnot. That would almost
guarantee the exact number of rows.
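The filter-then-LIMIT idea above can be sketched outside PostgreSQL; here is a
minimal illustration using Python's sqlite3, with SQLite's built-in RANDOM()
standing in for the hypothetical myrandfunc() (table and column names are made
up for the example):

```python
import sqlite3

# Illustrative stand-in for the proposal: filter to ~20% of rows at random,
# then LIMIT to exactly 10% (100 of 1000). Over-sampling first makes it
# overwhelmingly likely that the LIMIT is actually reached.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany("INSERT INTO mytable (value) VALUES (?)",
                 [(i * 0.5,) for i in range(1000)])

# SQLite's RANDOM() returns a signed 64-bit integer, so we fold it into [0, 1)
# to play the role of myrandfunc().
sample = conn.execute(
    """SELECT * FROM mytable
       WHERE (ABS(RANDOM()) % 1000000) / 1000000.0 < 0.2
       LIMIT 100"""
).fetchall()
print(len(sample))
```

With 1000 rows and a 0.2 filter, about 200 rows pass on average, so the LIMIT
of 100 is hit except with vanishingly small probability; the rows before the
LIMIT are a uniform sample, though the cutoff slightly favors earlier rows in
scan order.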
-----------------
Nathan Barnett
-----Original Message-----
From: keitt@ulysses.nceas.ucsb.edu
[mailto:keitt@ulysses.nceas.ucsb.edu]On Behalf Of Timothy H. Keitt
Sent: Monday, July 24, 2000 3:41 PM
To: Nathan Barnett
Subject: Re: [GENERAL] Statistical Analysis
You would need to add a pseudorandom number function to postgresql. If
your function returns numbers on [0, 1), then you could do:
select * from mytable where myrandfunc() < 0.1;
and get back (asymptotically) 10% of the rows. If you want exactly n
randomly chosen rows, it's a bit more expensive computationally.
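One standard way to get exactly n rows, at the cost Tim alludes to, is to sort
the whole table by a fresh random key and take the first n. A sketch via
sqlite3 (the table name is illustrative, and SQLite's RANDOM() stands in for
the pseudorandom function you would add to PostgreSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO mytable (id) VALUES (?)",
                 [(i,) for i in range(1000)])

# ORDER BY RANDOM() assigns every row an independent random sort key, so the
# first n rows after sorting are a uniform sample of exactly n rows -- but it
# costs a full O(N log N) sort of the table, unlike the cheap WHERE filter.
rows = conn.execute(
    "SELECT id FROM mytable ORDER BY RANDOM() LIMIT 100"
).fetchall()
print(len(rows))
```

This trades the approximate-but-cheap WHERE filter for an exact sample size;
for large tables the sort cost is why the filter approach is attractive.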
Another more involved approach would be to implement random cursors.
This would be great for bootstrapping analysis.
Tim
Nathan Barnett wrote:
>
> I am having to perform a large data analysis query fairly frequently, and
> the execution time is not acceptable, so I was looking at doing a
> statistical sample of the data to get fairly accurate results. Is there a
> way to perform a query on a set number of random rows instead of the whole
> dataset? I have looked through the documentation for a function that would
> do this, but I have not seen any. If this is a RTFM type question, then
> feel free to tell me so and point me in the right direction, because I
> just haven't been able to find any info on it.
>
> Thanks ahead of time.
>
> ---------------
> Nathan Barnett
--
Timothy H. Keitt
National Center for Ecological Analysis and Synthesis
735 State Street, Suite 300, Santa Barbara, CA 93101
Phone: 805-892-2519, FAX: 805-892-2510
http://www.nceas.ucsb.edu/~keitt/