Re: Gsoc2012 Idea --- Social Network database schema - Mailing list pgsql-hackers

From Qi Huang
Subject Re: Gsoc2012 Idea --- Social Network database schema
Date
Msg-id BAY159-W61A45C1C6B5AC25ECE9B08A3400@phx.gbl
In response to Re: Gsoc2012 Idea --- Social Network database schema  (Neil Conway <neil.conway@gmail.com>)
Responses Re: Gsoc2012 Idea --- Social Network database schema  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
> Date: Tue, 20 Mar 2012 14:12:45 -0700
> Subject: Re: [HACKERS] Gsoc2012 Idea --- Social Network database schema
> From: neil.conway@gmail.com
> To: huangqiyx@hotmail.com
> CC: daniel@heroku.com; josh@agliodbs.com; pgsql-hackers@postgresql.org
>
> 2012/3/19 Qi Huang <huangqiyx@hotmail.com>:
> >> I actually tried to find out, personally...not sure if I was searching
> >> wrongly, but searching for TABLESAMPLE did not yield a cornucopia of
> >> useful conversations at the right time in history (~2007), even when
> >> the search is given a broad date-horizon (all), so I, too, am
> >> uninformed as to the specific objections.
> >>
> >> http://www.postgresql.org/search/?m=1&q=TABLESAMPLE&l=&d=-1&s=d
> >
> > I sent a mail to Neil Conway asking him about this. I hope he can give a
> > good answer.
>
> I never tried to get TABLESAMPLE support into the main PostgreSQL tree
> -- I just developed the original code as an exercise for the purposes
> of the talk. Implementing TABLESAMPLE would probably be a reasonable
> GSoc project.
>
> My memory of the details is fuzzy, but one thing to check is whether
> the approach taken by my patch (randomly choose heap pages and then
> return all the live tuples in a chosen page) actually meets the
> standard's requirements -- obviously it is not true that each heap
> page has the same number of live tuples, so you aren't getting a truly
> random sample.
>
> Neil
>


Thanks so much, Neil. 
I think I understand the situation now. Neil's implementation was written for the purposes of the talk, so it was rushed and may not be up to the standards of the Postgres community. Neil also mentioned that the PRNG state handling in the patch is buggy, and there may be other bugs as well. So for the GSoC project, I could study the details of Neil's implementation, fix the bugs, bring the code up to community standards, and test it.
Are there any comments on this?
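
To make the sampling concern Neil raised above more concrete, here is a minimal Python sketch of a toy model. It is not taken from the patch. The names (make_heap, page_sample, row_sample, co_occurrence) are hypothetical, and it assumes each heap page is chosen independently with a fixed probability, which may not match what the patch actually does. In this toy model, tuples that share a page are always returned together, so the result is not a simple random sample of rows, whereas row-level sampling picks each tuple independently.

```python
import random


def make_heap(tuples_per_page):
    """Build a toy heap: a list of pages, each a list of unique tuple ids."""
    heap, next_id = [], 0
    for count in tuples_per_page:
        heap.append(list(range(next_id, next_id + count)))
        next_id += count
    return heap


def page_sample(pages, fraction, rng):
    """Choose each page with probability `fraction`; return all its live tuples."""
    sample = []
    for page in pages:
        if rng.random() < fraction:
            sample.extend(page)
    return sample


def row_sample(pages, fraction, rng):
    """Choose each live tuple independently with probability `fraction`."""
    return [t for page in pages for t in page if rng.random() < fraction]


def co_occurrence(sampler, pages, a, b, fraction, trials, seed=0):
    """Estimate how often tuples `a` and `b` land in the same sample."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        chosen = set(sampler(pages, fraction, rng))
        if a in chosen and b in chosen:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Pages with very different numbers of live tuples (assumed values).
    heap = make_heap([1, 5, 20])
    # Tuples 6 and 7 both live on the third page.
    print("page-level:", co_occurrence(page_sample, heap, 6, 7, 0.5, 4000))
    print("row-level: ", co_occurrence(row_sample, heap, 6, 7, 0.5, 4000))
```

With a sampling fraction of 0.5, the page-level estimate should come out at roughly 0.5 and the row-level estimate at roughly 0.25. The size of a page-level sample also swings widely depending on whether the 20-tuple page is picked.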



Best Regards and Thanks
Huang Qi Victor
Computer Science of National University of Singapore
