Re: Statistical Analysis - Mailing list pgsql-general

From Tom Lane
Subject Re: Statistical Analysis
Date
Msg-id 2089.964484308@sss.pgh.pa.us
Whole thread Raw
In response to Re: Statistical Analysis  ("Timothy H. Keitt" <keitt@nceas.ucsb.edu>)
Responses Re: Statistical Analysis  (Steve Heaven <steve@thornet.co.uk>)
List pgsql-general
"Timothy H. Keitt" <keitt@nceas.ucsb.edu> writes:
> Oops.  Just tried that; the random() call only get evaluated once.

Sigh, forgot to mention the solution.  There's an undocumented function:

 * bool oidrand (oid o, int4 X)-
 *      takes in an oid and a int4 X, and will return 'true'
 *    about 1/X of the time.

typically used like this:

-- select roughly 1/10 of the tuples
SELECT * FROM onek WHERE oidrand(onek.oid, 10);

This doesn't get collapsed by the overly aggressive constant-qual
recognizer because it takes a table column as input.  (The function
doesn't actually *use* the OID, mind you, but the planner doesn't
know that.  What a kluge... but it gets the job done.)

Note that this isn't necessarily going to fix your performance problem,
since a scan of the whole input table is still going to be required.
But if the expensive processing was somewhere downstream of that basic
scan, it should help.

            regards, tom lane

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Statistical Analysis
Next
From: Joseph Shraibman
Date:
Subject: Re: 4 billion record limit?