Re: plan problem - Mailing list pgsql-performance

From Ken Geis
Subject Re: plan problem
Msg-id 4073C3DF.2070307@speakeasy.org
In response to Re: plan problem  (Richard Huxton <dev@archonet.com>)
Responses Re: plan problem  (Richard Huxton <dev@archonet.com>)
Re: plan problem  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Richard Huxton wrote:
> On Tuesday 06 April 2004 21:25, Ken Geis wrote:
>
>>I am trying to find an efficient way to draw a random sample from a
>>complex query.  I also want it to be easy to use within my application.
>>
>>So I've defined a view that encapsulates the query.  The id in the
>>"driving" table is exposed, and I run a query like:
>>
>>select * from stats_record_view
>>  where id in (select id from driver_stats
>>                order by random()
>>                limit 30000);
>
>
> How about a join?
>
> SELECT s.*
> FROM
> stats_record_view s
> JOIN
> (SELECT id FROM driver_stats ORDER BY random() LIMIT 30000) AS r
> ON s.id = r.id;

Yes, I tried this too after I sent the first mail, and this was somewhat
better.  I ended up adding a random column to the driving table, putting
an index on it, and exposing that column in the view.  Now I can say

SELECT * FROM stats_record_view WHERE random < 0.093;

For my application, it's OK if the same sample is picked time after time
and it may change if data is added.
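
A minimal sketch of that setup, assuming PostgreSQL and a column literally named "random" as in the query above. The index name is hypothetical, and the view definition is not shown in this thread, so the step that exposes the column is left as a comment:

```sql
-- Add a column to the driving table to hold a fixed random value per row.
ALTER TABLE driver_stats ADD COLUMN random double precision;

-- random() is volatile, so each existing row gets its own value in [0, 1).
UPDATE driver_stats SET random = random();

-- Assumption: rows added later should get a value too.
ALTER TABLE driver_stats ALTER COLUMN random SET DEFAULT random();

-- Index the column so a range predicate on it can use an index scan.
-- (driver_stats_random_idx is a made-up name.)
CREATE INDEX driver_stats_random_idx ON driver_stats (random);
ANALYZE driver_stats;

-- stats_record_view then has to be recreated so that it selects
-- driver_stats.random alongside the id it already exposes.

-- One way to choose a threshold for a sample of about 30000 rows:
SELECT 30000.0 / count(*) AS threshold FROM driver_stats;
```

Because the values are uniform, a predicate like random < t selects roughly a fraction t of the rows, which is why the same threshold keeps returning the same sample until rows are added.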

...
> Also worth checking the various list archives - this has come up in the past,
> but some time ago.

There are some messages in the archives about how to get a random
sample.  I know how to do that, and that's not why I posted my message.
Are you saying that the planner behavior I spoke of is in the
archives?  I wouldn't know what to search for to find that thread.  Does
anyone think the planner issue is worth addressing?  Can someone
help me figure out what code I should look at?


Ken Geis


