GSoC project : K-medoids clustering in Madlib - Mailing list pgsql-hackers

From	viod
Subject	GSoC project : K-medoids clustering in Madlib
Date	March 26, 2013 19:48:50
Msg-id	CAATbgJwGQHxvM9fxeKn5GtsAuyWHDZULGyzbOzAmGD2bX8-qHw@mail.gmail.com
Responses	Re: GSoC project : K-medoids clustering in Madlib
List	pgsql-hackers

Hello!

I'm an IT student, and I would like to apply for the 2013 GSoC.

I've been following this mailing list for a while now, and I saw a GSoC suggestion that particularly interested me: implementing K-medoids clustering in MADlib, as it is supposed to be more efficient than the K-means algorithm.

I didn't know about these algorithms before, but I have read up on them, and they look quite interesting to me, all the more so because I am currently taking classes on the subject (though unfortunately very simplified ones).

I've got a few questions:

Won't this be quite a short project? I can't estimate how long it would take me to implement this algorithm in a way that would be usable by PostgreSQL, but 3 months seems long for this task, doesn't it?

Someone on the IRC channel (can't remember who, sorry) told me it was used in the KNN index. I guess this is used by pg_trgm, but are there other modules using it currently?

And could you please give me some links explaining the internals of this index? I've been through several articles presenting it, but none was very satisfying.

Thanks a lot in advance!
