Re: Select random lines of a table using a probability distribution - Mailing list pgsql-sql

From	ktm@rice.edu
Subject	Re: Select random lines of a table using a probability distribution
Date	July 13, 2011 13:58:17
Msg-id	20110713135810.GA1874@staff-mud-56-27.rice.edu
In response to	Select random lines of a table using a probability distribution ("Jira, Marcel" <Marcel.Jira@wu.ac.at>)
List	pgsql-sql

On Wed, Jul 13, 2011 at 03:27:10PM +0200, Jira, Marcel wrote:
> Hi!
> 
> Let's consider I have a table like this
> 
> id    qualification    gender    age    income
> 
> I'd like to select (for example 100) lines of this table by random, but the random mechanism has to follow a certain probability distribution.
> 
> I want to use this procedure to construct a test group for another selection.
> 
> Example:
> 
> I filter all lines having the qualification "plumber".
> I get 50 different ids consisting of 40 males, 10 females and a certain age distribution.
> 
> I also get some information concerning the income of the plumbers.
> 
> Now I want to know if the income is more influenced by the gender and age distribution or by the qualification "plumber".
> 
> Therefore I would like to select a test group (of 50 or more) without any plumbers. This test group has to follow the same age and gender distribution.
> 
> Then I would be able to compare this group's income statistics with the plumbers' income statistics.
> 
> Is this possible (and doable with reasonable effort) in PostgreSQL?
> 
> Thank you in advance.
> 
> Best regards,
> 
> Marcel Jira
> 

You may want to take a look at PL/R, which makes the R system available to
PostgreSQL as a procedural language.

Regards,
Ken
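
For illustration only, and not part of Ken's reply: the matched test group described in the question can be built by stratified sampling. Count the plumbers in each (gender, age band) stratum, then draw the same number of non-plumbers at random from each stratum. The sketch below is in plain Python rather than PL/R or SQL. It assumes the rows have already been fetched as dictionaries keyed by the column names from the question. The 10-year age bands and the names `age_band`, `stratum`, and `matched_sample` are assumptions introduced here.

```python
import random
from collections import Counter, defaultdict


def age_band(age, width=10):
    """Bucket an age into a band, e.g. 30 for ages 30-39 (band width is an assumption)."""
    return (age // width) * width


def stratum(row):
    """Key that the two groups must agree on: gender plus age band."""
    return (row["gender"], age_band(row["age"]))


def matched_sample(rows, qualification, seed=None):
    """Draw a control group with the same gender/age-band counts as the
    rows having `qualification`, using only rows without that qualification."""
    rng = random.Random(seed)

    # How many rows of the target group fall in each stratum.
    wanted = Counter(
        stratum(r) for r in rows if r["qualification"] == qualification
    )

    # Candidate control rows, grouped by stratum.
    pool = defaultdict(list)
    for r in rows:
        if r["qualification"] != qualification:
            pool[stratum(r)].append(r)

    sample = []
    for key, n in wanted.items():
        candidates = pool[key]
        if len(candidates) < n:
            raise ValueError(f"not enough candidates in stratum {key}")
        # Sampling without replacement inside each stratum.
        sample.extend(rng.sample(candidates, n))
    return sample
```

Calling `matched_sample(rows, "plumber")` returns as many non-plumbers as there are plumbers, with identical counts per stratum, so the two groups' incomes can then be compared. For a larger test group, multiply each stratum count by a constant factor before sampling.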
