Thread: GSoC project: K-medoids clustering in Madlib

GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Hi again!

I am "viod.len@gmail.com", but now writing with my "true" email address.

Note that I've sent this mail to both MADlib and PostgreSQL mailing lists, in order to synchronize the efforts. I've also sent it to everyone that was CC'ed in previous mails.
Where should I send mails regarding this project from now on? Sending on both mailing lists seems like a quite bad idea.

As I had lost hope of getting an answer from MADlib, I have recontacted Atri, as he was willing to mentor the MADlib projects.

Here is his answer:

Le 15 avr. 2013 17:46, "Atri Sharma" <atri.jiit@gmail.com> a écrit :
On Mon, Apr 15, 2013 at 9:12 PM, viod <viod.len@gmail.com> wrote:
> Hello all!
>
>
>>
>> Do you have any interest in data analytics? I proposed a couple of
>> ideas in that field.If you are interested, we could talk over them.
>>
>> Regards,
>>
>> Atri
>
>
> I am pretty much interested in the ideas you posted on the mailing list
> (particularly in implementing the K-medoids algorithm). I've asked MADlib if
> they could mentor me, but unfortunately  their org has not been accepted in
> GSoC, and they haven't answered since then.


No issues, I can try to help you out.

> Do you think I could still do it with PostgreSQL? I would really like to do
> this project, as an initiation to classification algorithms.
>
> Still, I don't really understand how this would be used by PostgreSQL?

MADLib is the de facto library for in database analytics in PostgreSQL.

Download,install MADLib, run a few programs, and think more about the
implementation and discuss here.

Regards,

Atri

And here is Rahul's message:

Hi Maxence, 

Welcome aboard on MADlib development. We would happy to help you out in adding K-mediods to the MADlib suite. 

Do you have any idea of stuff I could do to get familiar with the code?
I've already been through the doc to search for functions I already knew,
and found a little bit, I'll go and read the code by the end of the week.
You could start off my looking the the Linear Regression code to understand the workflow. Another document that would be useful to review is the design doc (found here). Chapter 1 gives an overview of the Abstraction layer that is used by all modules. 
(there are some bibtex errors that I debugging, but it's still readable)

Linear regression does not use the iterative constructs and is easier to understand. You could then look at how k-means is implemented since k-mediods would interplay with it in the final product. 
I believe once we go through the k-mediods implementation, extending the backprop code would be easy. So we could definitely look at that when we get to it. 

Sometimes we get busy enough to not be able to give a quick response on the devel@madlib list but keep posting questions there (or ping when you don't hear back) and we will be able to support you in this endeavor. 

Best, 
Rahul

By the way, my "ping" message didn't intend to look aggressive -- just in case. My phone simply ate my question mark.

I'm also adding the presentation I had sent on MADlib's mailing list, so that PostgreSQL's guys can also get a better idea of who I am:

I'm Maxence Ahlouche, and have now been studying IT for almost three years. I've spent the first two years of my studies in the French equivalent of an HND, a very technical training. After having obtained my diploma, I've integrated an engineering school, as I wanted to learn more theorical stuff, and understand better the tools I use every day.
My current training is actually called IT and Applied Mathematics (and I currently have some difficulties in mathematics, as all the other students, except for one, have done a very maths-intensive "preparatory course" before coming here). Still, I'm really interested into what I learn, and am very curious about many things.

At first, I wanted to apply for a PostgreSQL project, and, while lurking on their mailing list, I found a reference to the aforementioned project about K-medoids algorithm. I found this project in perfect fit with my centers of interest: a teacher made me love databases and want to learn more about their internals, and machine learning is a domain that's been attracting me for a while now.

As to my skills, I've learnt lots of programming languages (not exhaustive list: C, C++, Java, a bit of Matlab and Fortran, Bash, PHP, C#, VBA, Python, Caml...). I know how to learn by myself and quickly. During my courses, I've done a (very little) bit of classification: we had to determine the zone in which a pixel belongs via their maximum likelihood. This made me want to learn more about this domain.

That being said, I thank you all for your investment :)

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

Re: GSoC project: K-medoids clustering in Madlib

From
Akansha Singh
Date:
Hi, MADLib guys, Any Updates..? On my Part I am trying to understand the modules placed in Github .I a trying to get hands on it. http://madlib.net/ https://github.com/madlib/madlib/

Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Hi all!

I've had a bit of fun with the k-means clustering, and have made a small script to visualize the result of the classification.
However, I couldn't guess how to assign a cluster to a point from the output of the algorithm, could someone give me an indication, please?

My script is written in python3, and uses py-postgresql (http://python.projects.pgfoundry.org/) as PostgreSQL interface. It also requires Pillow (a PIL fork) which you can find here : https://pypi.python.org/pypi/Pillow/2.0.0.

Before your first use, you may want to change the settings (on top of the file) to connect to your PostgreSQL server.
The script will create a table in your database, populate it with random groups of points, and then call the k-means algorithm on it. Finally, it will generate a PNG image, displaying the points and the centroids.

For a first run, use something like this:
./k-means_test.py --regen -o clustered_data.png

You can call "./k-means_test.py -h" for a list of available options.

In attachment are my script and an example of its output.

By the way, I'll have a lot of work next week, as I have several exams coming and a big project to do (about empirical orthogonal functions), so I'll probably be inactive for a few days! Then I'll be on holidays, so I will be able to focus on  MADlib and GSoC :)

Regards,
Maxence


2013/4/19 Iyer, Rahul <Rahul.Iyer@emc.com>
Hi Akansha, 

I am confused about the question - MADlib is open-source and available from Github. If you're having trouble in fork/clone or have a specific question about a module, we would be glad to help you. Please be specific about your question. 

- Rahul
---------------------------------------------------------
Rahul Iyer
Senior Software Engineer | Predictive Analytics
rahul.iyer@emc.com

On Apr 19, 2013, at 3:13 AM, Akansha Singh wrote:

Hi, MADLib guys, Any Updates..? On my Part I am trying to understand the modules placed in Github .I a trying to get hands on it. http://madlib.net/ https://github.com/madlib/madlib/




--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac
Attachment

Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Oops, forgot to attach the output!


2013/4/20 Maxence AHLOUCHE <maxence.ahlouche@gmail.com>
Hi all!

I've had a bit of fun with the k-means clustering, and have made a small script to visualize the result of the classification.
However, I couldn't guess how to assign a cluster to a point from the output of the algorithm, could someone give me an indication, please?

My script is written in python3, and uses py-postgresql (http://python.projects.pgfoundry.org/) as PostgreSQL interface. It also requires Pillow (a PIL fork) which you can find here : https://pypi.python.org/pypi/Pillow/2.0.0.

Before your first use, you may want to change the settings (on top of the file) to connect to your PostgreSQL server.
The script will create a table in your database, populate it with random groups of points, and then call the k-means algorithm on it. Finally, it will generate a PNG image, displaying the points and the centroids.

For a first run, use something like this:
./k-means_test.py --regen -o clustered_data.png

You can call "./k-means_test.py -h" for a list of available options.

In attachment are my script and an example of its output.

By the way, I'll have a lot of work next week, as I have several exams coming and a big project to do (about empirical orthogonal functions), so I'll probably be inactive for a few days! Then I'll be on holidays, so I will be able to focus on  MADlib and GSoC :)

Regards,
Maxence


2013/4/19 Iyer, Rahul <Rahul.Iyer@emc.com>

Hi Akansha, 

I am confused about the question - MADlib is open-source and available from Github. If you're having trouble in fork/clone or have a specific question about a module, we would be glad to help you. Please be specific about your question. 

- Rahul
---------------------------------------------------------
Rahul Iyer
Senior Software Engineer | Predictive Analytics
rahul.iyer@emc.com

On Apr 19, 2013, at 3:13 AM, Akansha Singh wrote:

Hi, MADLib guys, Any Updates..? On my Part I am trying to understand the modules placed in Github .I a trying to get hands on it. http://madlib.net/ https://github.com/madlib/madlib/




--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac



--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac
Attachment

Re: GSoC project: K-medoids clustering in Madlib

From
Atri Sharma
Date:


Sent from my iPad

On 20-Apr-2013, at 16:11, Maxence AHLOUCHE <maxence.ahlouche@gmail.com> wrote:

Oops, forgot to attach the output!


2013/4/20 Maxence AHLOUCHE <maxence.ahlouche@gmail.com>
Hi all!

I've had a bit of fun with the k-means clustering, and have made a small script to visualize the result of the classification.
However, I couldn't guess how to assign a cluster to a point from the output of the algorithm, could someone give me an indication, please?

My script is written in python3, and uses py-postgresql (http://python.projects.pgfoundry.org/) as PostgreSQL interface. It also requires Pillow (a PIL fork) which you can find here : https://pypi.python.org/pypi/Pillow/2.0.0.

Before your first use, you may want to change the settings (on top of the file) to connect to your PostgreSQL server.
The script will create a table in your database, populate it with random groups of points, and then call the k-means algorithm on it. Finally, it will generate a PNG image, displaying the points and the centroids.

For a first run, use something like this:
./k-means_test.py --regen -o clustered_data.png

You can call "./k-means_test.py -h" for a list of available options.

In attachment are my script and an example of its output.

By the way, I'll have a lot of work next week, as I have several exams coming and a big project to do (about empirical orthogonal functions), so I'll probably be inactive for a few days! Then I'll be on holidays, so I will be able to focus on  MADlib and GSoC :)

Regards,
Maxence




Very interesting! The results look encouraging,although this is on Python :)

Good work!

Regards,

Atri

Re: GSoC project: K-medoids clustering in Madlib

From
Atri Sharma
Date:
On Sat, Apr 20, 2013 at 7:31 PM, Maxence AHLOUCHE
<maxence.ahlouche@gmail.com> wrote:
>
>
>
> 2013/4/20 Atri Sharma <atri.jiit@gmail.com>
>>
>> Very interesting! The results look encouraging,although this is on Python
>> :)
>>
>> Good work!
>>
>> Regards,
>>
>> Atri
>
>
> Thanks :)
> But do you know if there is a way to know the cluster that a point has been
> assigned to? Can the objective function have something to do with it? I
> haven't understood why it was returned yet!

I didnt get your question. Could you please elaborate a bit more?

Regards,

Atri


Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Sure!

The k-means algorithms tries to group the points, but how can we know to which group a point has been assigned?
What I mean is that, on the output, I would like to color the points with the same color as the centroid they "depend" on.

And another question, which I thought could be related to the first one, is why does the algorithms returns the objective function? What's its use?

Thanks ffor spending time for my questions :)



2013/4/20 Atri Sharma <atri.jiit@gmail.com>
On Sat, Apr 20, 2013 at 7:31 PM, Maxence AHLOUCHE
<maxence.ahlouche@gmail.com> wrote:
>
>
>
> 2013/4/20 Atri Sharma <atri.jiit@gmail.com>
>>
>> Very interesting! The results look encouraging,although this is on Python
>> :)
>>
>> Good work!
>>
>> Regards,
>>
>> Atri
>
>
> Thanks :)
> But do you know if there is a way to know the cluster that a point has been
> assigned to? Can the objective function have something to do with it? I
> haven't understood why it was returned yet!

I didnt get your question. Could you please elaborate a bit more?

Regards,

Atri



--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

Re: GSoC project: K-medoids clustering in Madlib

From
Atri Sharma
Date:
On Sat, Apr 20, 2013 at 8:11 PM, Maxence AHLOUCHE
<maxence.ahlouche@gmail.com> wrote:
> Sure!
>
> The k-means algorithms tries to group the points, but how can we know to
> which group a point has been assigned?
> What I mean is that, on the output, I would like to color the points with
> the same color as the centroid they "depend" on.
>
> And another question, which I thought could be related to the first one, is
> why does the algorithms returns the objective function? What's its use?
>
> Thanks ffor spending time for my questions :)


No problem

You can probably maintain a data structure for this purpose. A simple
Vector would suffice, I think. You will need to empty the Vectors in
each iteration of the algorithm, until the algorithm doesnt finish.
Then, the vectors shall contain the final memberships.

So, for each Vector, you designate the current centroid and put the
points assigned to that centroid's groups in that Vector. Then, if
another iteration of your algorithm shall run, you can empty the
vectors and reassign the centroids.

Atri
--
Regards,

Atri
l'apprenant


Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Hello!

I've pretty much improved my visualizer:
  • It is now able to color the points according to their assigned cluster. It has occured to me last night that it was the main inconvenient of the k-means algorithm: a point is assigned to the nearest centroid's cluster, which is easy to compute.
  • It's now able to load data from a file, if it has been previously generated, so it is no longer mandatory to specify the number of clusters wished when using "old" data.
  • It displays the original clustering and the k-means++ clustering for comparison purposes, and it is very easy to add new clusterings.
  • It is available on GitHub! https://github.com/viodlen/clustering_visualizer

Still, there is room for improvements:

  • It only tests with 2-dimensional spaces. This won't evolve, as it would get difficult to visualize.
  • It only uses the euclidean distance. This can be fixed, but would be heavy to implement, and probably ugly (a hashmap to match the python's distance function with the MADlib's one).
  • For now, the points are reparted in the clusters folowing a gaussian law (not sure of my vocabulary here). This can be easily fixed, and will probably be in a future version.

In attachment is an output example. It shows the poor results of the k-means algorithm on contiguous clusters.

It also shows the interest of the k-means++ algorithm: the small light-blue cluster (in the original clustering) is correctly identified as a complete cluster by the k-means++ algorithm, when it was  only a part of a bigger cluster with the simple k-means algorithm.


The characteristics of the k-means algorithm make it easy to calculate the clusters only from the points and the centroids, but this won't be true for the k-medoids algorithm. So, sadly, if I implement the latter, it won't be possible to keep the same function signature: some more data will have to be returned, along with the centroids list.


I've also wondered if it would be useful to implement the clustering algorithms for non-float vectors (for example, strings), provided the user gives a distance function for this type?

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac
Attachment

Re: GSoC project: K-medoids clustering in Madlib

From
Atri Sharma
Date:


Sent from my iPad

On 21-Apr-2013, at 23:16, Maxence AHLOUCHE <maxence.ahlouche@gmail.com> wrote:

Hello!

I've pretty much improved my visualizer:
  • It is now able to color the points according to their assigned cluster. It has occured to me last night that it was the main inconvenient of the k-means algorithm: a point is assigned to the nearest centroid's cluster, which is easy to compute.
  • It's now able to load data from a file, if it has been previously generated, so it is no longer mandatory to specify the number of clusters wished when using "old" data.
  • It displays the original clustering and the k-means++ clustering for comparison purposes, and it is very easy to add new clusterings.
  • It is available on GitHub! https://github.com/viodlen/clustering_visualizer

Still, there is room for improvements:

  • It only tests with 2-dimensional spaces. This won't evolve, as it would get difficult to visualize.
  • It only uses the euclidean distance. This can be fixed, but would be heavy to implement, and probably ugly (a hashmap to match the python's distance function with the MADlib's one).
  • For now, the points are reparted in the clusters folowing a gaussian law (not sure of my vocabulary here). This can be easily fixed, and will probably be in a future version.

In attachment is an output example. It shows the poor results of the k-means algorithm on contiguous clusters.

It also shows the interest of the k-means++ algorithm: the small light-blue cluster (in the original clustering) is correctly identified as a complete cluster by the k-means++ algorithm, when it was  only a part of a bigger cluster with the simple k-means algorithm.


The characteristics of the k-means algorithm make it easy to calculate the clusters only from the points and the centroids, but this won't be true for the k-medoids algorithm. So, sadly, if I implement the latter, it won't be possible to keep the same function signature: some more data will have to be returned, along with the centroids list.


I've also wondered if it would be useful to implement the clustering algorithms for non-float vectors (for example, strings), provided the user gives a distance function for this type?



Interesting! Good work!

Could you draw up a summary, giving your findings about the performance of different algorithms,and which one should be implemented,or both(k means++ vs k medoids).

Regards,

Atri

Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
2013/4/21 Atri Sharma <atri.jiit@gmail.com>

Interesting! Good work!

Could you draw up a summary, giving your findings about the performance of different algorithms,and which one should be implemented,or both(k means++ vs k medoids).

Regards,

Atri

From the few articles I've already read, I've found that K-medoids clustering usually goes faster on standard datasets such as the ones I generate). But I'll look for more detailed information during the week, and report what I'll have found here!
By the way, have you got any idea of other forms of datasets that could be useful to test?




2013/4/21 <hellerstein@cs.berkeley.edu>
Very cool!
 
May I suggest generating a visualization in a web toolkit?  Perhaps the new vega library would be simplest (http://trifacta.github.io/vega/) or the more popular but lower-level D3.js?
 
More generally, a project to connect MADlib outputs to vega vis specifications seems like it would be enormously useful!

Joe


I'll give it a look during my holidays, in a week! It would indeed be nice if one just had to open a webpage to test my work!
Considering your other idea, aren't MADlib outputs PostgreSQL/GreenPlum outputs? If so, only a database connector is required, which probably already exists (I may be wrong, I had never heard of D3.js or Vega before, and I don't know well the MADlib project yet).

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac


--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

Re: [MADlib-devel] GSoC project: K-medoids clustering in Madlib

From
Atri Sharma
Date:
On 4/21/13, hellerstein@cs.berkeley.edu <hellerstein@cs.berkeley.edu> wrote:
> Very cool!
>
>
> May I suggest generating a visualization in a web toolkit?  Perhaps the new
> vega library would be simplest (http://trifacta.github.io/vega/) or the more
> popular but lower-level D3.js?
>
>
> More generally, a project to connect MADlib outputs to vega vis
> specifications seems like it would be enormously useful!

Yes, the idea seems awesome. Generating these kind of results in a web
toolkit should serve multiple purposes.

Regards,

Atri


Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Hello!

I've submitted my proposal on the GSoC website.
I've been inactive for the last few days, because I was setting up a server in order to make all my tests. As it is the first time I do this, I've met a bunch of (usually stupid) problems, but it's now almost entirely configured! I hope I'll soon be able to provide a web visualization for the k-means algorithm, but it will probably be a simple png at first. It will allow you to test my work without having to download or install anything.

Regards,
Maxence

Re: GSoC project: K-medoids clustering in Madlib

From
Atri Sharma
Date:

Sent from my iPad

On 02-May-2013, at 17:34, Maxence AHLOUCHE <maxence.ahlouche@gmail.com> wrote:

> Hello!
>
> I've submitted my proposal on the GSoC website.
> I've been inactive for the last few days, because I was setting up a server in order to make all my tests. As it is
thefirst time I do this, I've met a bunch of (usually stupid) problems, but it's now almost entirely configured! I hope
I'llsoon be able to provide a web visualization for the k-means algorithm, but it will probably be a simple png at
first.It will allow you to test my work without having to download or install anything. 
>
>

Great.All the best and looking forward to the web visualisation.

Regards,

Atri

Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Hi!

I'm pasting here the comment Heikki Linnakangas left on my GSoC proposal (available here: http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1):

I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.

I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.
So, would anyone from MADlib be interested in co-mentoring this project? I think Atri Sharma was willing to mentor this project, on the PostgreSQL side.

Thanks in advance!

Regards,
Maxence

Re: GSoC project: K-medoids clustering in Madlib

From
Atri Sharma
Date:


Sent from my iPad

On 08-May-2013, at 16:27, Maxence AHLOUCHE <maxence.ahlouche@gmail.com> wrote:

Hi!

I'm pasting here the comment Heikki Linnakangas left on my GSoC proposal (available here: http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1):

I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.

I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.
So, would anyone from MADlib be interested in co-mentoring this project? I think Atri Sharma was willing to mentor this project, on the PostgreSQL side.

Thanks in advance!

Regards,
Maxence

I am still available as a co mentor.Rahul, would you be willing to be the mentor,please?

Regards,

Atri

Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Akansha Singh said:
HI, I would like to extend my help to mentor if possible.

Sure, you should contact the GSoC responsibles in PostgreSQL! I'm not in charge for deciding who will mentor this project.

Regards,
Maxence

Re: GSoC project: K-medoids clustering in Madlib

From
Akansha Singh
Date:
Hi All, Can i be allowed to mentor the project , i am very keen to extend my hand and give surety about my sincerity .
----- Original Message -----
From: atri.jiit@gmail.com
To: maxence.ahlouche@gmail.com
Cc: Sujit.Philip@emc.com, devel@madlib.net, Rahul.Iyer@emc.com, pgsql-students@postgresql.org, hlinnakangas@vmware.com
Sent: Wednesday, May 8, 2013 5:52:11 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: [pgsql-students] GSoC project: K-medoids clustering in Madlib



Sent from my iPad

On 08-May-2013, at 16:27, Maxence AHLOUCHE <maxence.ahlouche@gmail.com> wrote:

Hi!

I'm pasting here the comment Heikki Linnakangas left on my GSoC proposal (available here: http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1):

I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.

I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.
So, would anyone from MADlib be interested in co-mentoring this project? I think Atri Sharma was willing to mentor this project, on the PostgreSQL side.

Thanks in advance!

Regards,
Maxence

I am still available as a co mentor.Rahul, would you be willing to be the mentor,please?

Regards,

Atri

Re: GSoC project: K-medoids clustering in Madlib

From
Tomas Vondra
Date:
On 8.5.2013 18:13, Akansha Singh wrote:
> Hi All, Can i be allowed to mentor the project , i am very keen to
> extend my hand and give surety about my sincerity .

I'm a bit confused. AFAIK you've submitted two madlib-related proposals
on your own. I don't think it's a good idea to work on a GSoC project
and mentor another one at the same time.

Tomas


Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:



2013/5/8 Atri Sharma <atri.jiit@gmail.com>

On 08-May-2013, at 16:27, Maxence AHLOUCHE <maxence.ahlouche@gmail.com> wrote:

Hi!

I'm pasting here the comment Heikki Linnakangas left on my GSoC proposal (available here: http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1):
I don't think we have anyone from the madlib project signed up as a mentor currently. Is there someone in the madlib project that would be willing to mentor this? We'll need to get a mentor assigned for this ASAP, before we can even consider this.
So, would anyone from MADlib be interested in co-mentoring this project? I think Atri Sharma was willing to mentor this project, on the PostgreSQL side.

I am still available as a co mentor.Rahul, would you be willing to be the mentor,please?

Ping, MADlib?


--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
Hello!

Akansha Singh said:
Hi, I will be obliged if given a chance to mentor.

On the MADlib side so? Just put a comment here then: http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1 , so that Heikki can be notified!

Regards,
Maxence

--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac

Re: GSoC project: K-medoids clustering in Madlib

From
Atri Sharma
Date:
On Tue, May 14, 2013 at 11:34 AM, Maxence AHLOUCHE
<maxence.ahlouche@gmail.com> wrote:
> Hello!
>
> Akansha Singh said:
>>
>> Hi, I will be obliged if given a chance to mentor.
>
>
> On the MADlib side so? Just put a comment here then:
> http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/viod/1 ,
> so that Heikki can be notified!
>
> Regards,
> Maxence
>
> --
> Maxence Ahlouche
> 06 06 66 97 00
> 93 avenue Paul DOUMER
> 24100 Bergerac

You cannot be a mentor and a student at the same time. I think a
senior member of the community clearly expressed this to Ms. Akansha
earlier.

Maxence, we need a member of the MADLib community to be the mentor. I
would suggest you to contact them for the same.

Regards,

Atri

Regards,

Atri

--
Regards,

Atri
l'apprenant


Re: GSoC project: K-medoids clustering in Madlib

From
Heikki Linnakangas
Date:
On 14.05.2013 09:42, Atri Sharma wrote:
> Maxence, we need a member of the MADLib community to be the mentor. I
> would suggest you to contact them for the same.

Well, Philip Sujit and Rahul Iyer are CC'd on this thread. Looking at
the mailing list archives and commit history, I believe they are the two
most active people working on Madlib. I would expect Philip or Rahul to
mentor, or for them to point at someone else who knows the code and the
community well enough to mentor.

Philip, Rahul, is either one of you interested in mentoring any of the
proposed GSoC projects, under the PostgreSQL umbrella? If you are, we
need to get you signed up in the next couple of days, so that you can
take part in reviewing and ranking the proposals.

- Heikki


Re: GSoC project: K-medoids clustering in Madlib

From
Maxence AHLOUCHE
Date:
2013/5/24 Josh Berkus <josh@agliodbs.com>

Well, the problem is that you needed to do it over a week ago.  Google
has pretty strict deadlines for GSOC.  As such, we had to reject all
proposals for work on Madlib this year.

Next year, we will get one or more of your team signed up early in the
GSOC process.

Well, sad :(
I'm gonna try to work on my project this summer anyway, but as I have another job, I won't be able to follow the deadlines I had planned.

Maybe next year then, if my university considers GSoC a valid placement!

Regards,
Maxence


--
Maxence Ahlouche
06 06 66 97 00
93 avenue Paul DOUMER
24100 Bergerac