| SOM Toolbox | Online documentation | http://www.cis.hut.fi/projects/somtoolbox/ |
sC = som_cllinkage(sM,varargin)
SOM_CLLINKAGE Make a hierarchical linkage of the SOM map units.
sC = som_cllinkage(sM, [[argID,] value, ...])
sC = som_cllinkage(sM);
sC = som_cllinkage(D,'complete');
sC = som_cllinkage(sM,'single','ignore',find(~som_hits(sM,D)));
sC = som_cllinkage(sM,pdist(sM.codebook,'mahal'));
som_clplot(sC);
Input and output arguments ([]'s are optional):
sM (struct) map or data struct to be clustered
(matrix) size dlen x dim, a data set: the matrix must not
contain any NaN's!
[argID, (string) See below. The values which are unambiguous can
value] (varies) be given without the preceeding argID.
sC (struct) a clustering struct with e.g. the following fields
(for more information see SOMCL_STRUCT)
.base (vector) if base partitioning is given, this is a newly
coded version of it so that the cluster indices
go from 1 to the number of clusters.
.tree (matrix) size clen-1 x 3, the linkage info
Z(i,1) and Z(i,2) hold the indeces of clusters
combined on level i (starting from bottom). The new
cluster has index dlen+i. The initial cluster
index of each unit is its linear index in the
original data matrix. Z(i,3) is the distance
between the combined clusters. See LINKAGE
function in the Statistics Toolbox.
Here are the valid argument IDs and corresponding values. The values
which are unambiguous (marked with '*') can be given without the
preceeding argID.
'topol' *(struct) topology struct
'connect' *(string) 'neighbors' or 'any' (default), whether the
connections should be allowed only between
neighbors or between any vectors
(matrix) size dlen x dlen indicating the connections
between vectors
'linkage' *(string) the linkage criteria to use: 'single' (the
default), 'average', 'complete', 'centroid', or 'ward'
'dist' (matrix) size dlen x dlen, pairwise distance matrix to
be used instead of euclidian distances
(vector) as the output of PDIST function
(scalar) distance norm to use (default is euclidian = 2)
'mask' (vector) size dim x 1, the search mask used to
weight distance calculation. By default
sM.mask or a vector of ones is used.
'base' (vector) giving the base partitioning of the data:
base(i) = j denotes that vector i belongs to
base cluster j, and base(i) = NaN that vector
i does not belong to any cluster, but should be
ignored. At the beginning of the clustering, the
vector of each cluster are averaged, and these
averaged vectors are then clustered using
hierarchical clustering.
'ignore' (vector) units to be ignored (in addition to those listed
in base argument)
'tracking' (scalar) 1 or 0: whether to show tracking bar or not (default = 0)
Note that if 'connect'='neighbors' and some vector are ignored (as denoted
by NaNs in the base vector), there may be areas on the map which will
never be connected: connections across the ignored map units simply do not
exist. In such a case, the neighborhood is gradually increased until
the areas can be connected.
See also KMEANS_CLUSTERS, LINKAGE, PDIST, DENDROGRAM.