Home > mailing lists

Topic: Agglomerative clustering - Mailing list pgsql-students

From	Akansha Singh
Subject	Topic: Agglomerative clustering
Date	April 29, 2013 12:11:53
Msg-id	2701ba0a-ce70-45db-9cc2-bf098c019780@default Whole thread
List	pgsql-students

Tree view

hi
i have another clustering algorithm.. Please give your suggestions for this one whether it can be implemented? I will
beobliged.

Agglomerative Clustering
In agglomorative clustering, each data point is initially placed in a cluster by itself. At each step, two clusters
withthe highest similarity score are merged. The output of agglomerative clustering is a tree, where the leaves are the
dataitems. At any intermediate step, the clusters so far are different trees. When two clusters are merged by the
algorithm,the respective trees are merged by making the roots of these trees the left and right children of a new node
(sothat the total number of trees decreases by 1). In the end, the process yields a single tree. See here for an
example.

This algorithm closely resembles Kruskal's algorithm for Minimum Spanning Tree construction. In Kruskal's algorithm, at
eachstep the smallest edge connecting two components is added till there is only one component left. Each new edge
addedmerges two components. This can be seen as clustering where the similarity score between two components is the
negativethe length of the smallest edge between any two points in the cluster.

Many different clustering algorithms can be obtained by varying the similarity measures between clusters. Some examples
ofsimilarity measures are the following. Here, A and B are two clusters, and sim(A,B) is the similarity measure between
them.The similarity measure sim(u,v) between input data items u and v is specified as input. Typically, the similarity
isa number between 0 and 1, with larger values implying greater similarity.

Regards
Akansha SIngh

pgsql-students by date:

From: Christoph Berg
Date: 29 April 2013, 08:15:04
Subject: Re: [GSoC 2013] [Debian]

From: Thom Brown
Date: 01 May 2013, 08:31:02
Subject: Re: GSoC proposals deadline

Topic: Agglomerative clustering - Mailing list pgsql-students

Previous

Next