Selecting "sample" data from large tables. - Mailing list pgsql-sql

From Joseph Turner
Subject Selecting "sample" data from large tables.
Date
Msg-id 200406031131.24535.joseph.turner@oakleynetworks.com
Responses Re: Selecting "sample" data from large tables.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql

I have a table with a decent number of rows (let's say for example a
billion rows).  I am trying to construct a graph that displays the
distribution of that data.  However, I don't want to read in the
complete data set (as reading a billion rows would take a while).  Can
anyone think of a way to do this in PostgreSQL?  I've been looking
online and most of the stuff I've found has been for other databases.
As far as I can tell ANSI SQL doesn't provide for this scenario.

I could potentially write a function to do this, however I'd prefer
not to.  But if that's what I'm going to be stuck doing, I'd like to
know sooner rather than later.  Here's the description of the table:

create table score
(
  pageId Integer NOT NULL,
  ruleId Integer NOT NULL,
  score Double precision NULL,
  rowAddedDate BigInt NULL,
  primary key (pageId, ruleId)
);

I also have an index on the row added date, which is just the number of
milliseconds since the epoch (Jan 1, 1970; Java-style timestamps).
I'd be willing to accept that the row added date values are random
enough to serve as the basis for a random sample.
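
As a rough illustration of the idea above (a sketch under stated assumptions, not a tested solution): if the low-order digits of rowAddedDate really are close to random, a modulo filter on that column would keep roughly one row in a thousand. The sampling rate of 1 in 1000 and the alias s are illustrative choices, not part of the original schema. A second variant uses PostgreSQL's random() function instead of relying on the timestamp.

```sql
-- Sketch only: keep rows whose millisecond timestamp is divisible by 1000.
-- Assumes the low-order digits of rowAddedDate are effectively random;
-- rows with a NULL rowAddedDate are never selected.
SELECT s.score
FROM score s
WHERE s.rowAddedDate % 1000 = 0;

-- Alternative sketch: random() returns a value in [0, 1), so this keeps
-- about 0.1% of the rows regardless of how the timestamps are distributed.
SELECT s.score
FROM score s
WHERE random() < 0.001;
```

Both queries would most likely still scan the whole table, since a plain index on rowAddedDate does not help with a modulo or random() filter; they reduce the number of rows returned, not the number of rows read. PostgreSQL 9.5 and later also offer a TABLESAMPLE clause on SELECT, which can avoid reading every row.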

Thanks in advance,
 -- Joe T.



