net.sourceforge.cilib.util
Class ClusteringUtils

java.lang.Object
  extended by net.sourceforge.cilib.util.ClusteringUtils

public final class ClusteringUtils
extends Object

A class that simplifies clustering when making use of a ClusteringProblem, a ClusterableDataSet and a ClusteringFitnessFunction.
This class is not dependent on a ClusteringFitnessFunction, but the ClusteringFitnessFunctions use this class extensively.


Method Summary
 void arrangeClustersAndCentroids(Vector centroids)
          Arranges the centroids, assigns the patterns to clusters and removes the empty clusters, in that specific order.
 double calculateDistance(int x, int y)
          A central point where the cached distance between the two given patterns can be retrieved.
 double calculateDistance(Vector lhs, Vector rhs)
          A central point where distances can be calculated.
static ClusteringUtils get()
          Return the current instance of this class.
 ArrayList<Vector> getArrangedCentroids()
          Get the structure that represents the split-up centroids after the non-associated centroids have been removed.
 ArrayList<Hashtable<Integer,ClusterableDataSet.Pattern>> getArrangedClusters()
          Get the structure that represents the separate clusters after the empty clusters have been removed.
 ClusterableDataSet getClusterableDataSet()
          Get the ClusterableDataSet used throughout the current clustering.
 ClusteringProblem getClusteringProblem()
          Get the ClusteringProblem used throughout the current clustering.
 Vector getDataSetMean()
          Get the mean Vector that has been cached by the clusterableDataSet.
 double getDataSetVariance()
          Get the variance (scalar) that has been cached by the clusterableDataSet.
 int getNumberOfPatternsInDataSet()
          Get the number of patterns in the clusterableDataSet.
 ArrayList<Vector> getOriginalCentroids()
          Get the structure that represents the split-up centroids before the non-associated centroids were removed.
 ArrayList<Hashtable<Integer,ClusterableDataSet.Pattern>> getOriginalClusters()
          Get the structure that represents the separate clusters before the empty clusters were removed.
 ArrayList<ClusterableDataSet.Pattern> getPatternsInDataSet()
          Get the patterns in the clusterableDataSet.
 void setClusterableDataSet(ClusterableDataSet cds)
          This class only deals with ClusterableDataSets.
 void setClusteringProblem(ClusteringProblem cp)
          This class only deals with ClusteringProblems.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

get

public static ClusteringUtils get()
Return the current instance of this class.

Returns:
the current instance of this class.

setClusteringProblem

public void setClusteringProblem(ClusteringProblem cp)
This class only deals with ClusteringProblems. Other problems are not allowed.

Parameters:
cp - the ClusteringProblem used throughout the current clustering

getClusteringProblem

public ClusteringProblem getClusteringProblem()
Get the ClusteringProblem used throughout the current clustering.

Returns:
the clusteringProblem

setClusterableDataSet

public void setClusterableDataSet(ClusterableDataSet cds)
This class only deals with ClusterableDataSets. Other datasets/dataset builders are not allowed.

Parameters:
cds - the ClusterableDataSet used throughout the current clustering

getClusterableDataSet

public ClusterableDataSet getClusterableDataSet()
Get the ClusterableDataSet used throughout the current clustering.

Returns:
the clusterableDataSet

calculateDistance

public double calculateDistance(Vector lhs,
                                Vector rhs)
A central point where distances can be calculated. This is to prevent various different distance measures from being used in the same clustering algorithm/problem.

Parameters:
lhs - the first Vector
rhs - the second Vector
Returns:
the distance between the two given vectors, calculated using the current distance measure
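
The idea of a single, shared distance measure can be sketched as follows. This is a minimal illustration and not the CIlib implementation: the class DistanceHub, its setMeasure method, the use of double[] in place of Vector and the Euclidean default are all assumptions made for the example.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical stand-in for a central distance point (not the CIlib class).
final class DistanceHub {
    private static final DistanceHub INSTANCE = new DistanceHub();

    // Assumption: Euclidean distance is the default measure.
    private ToDoubleBiFunction<double[], double[]> measure = DistanceHub::euclidean;

    private DistanceHub() {
    }

    // Every caller receives the same instance, so every caller uses the same measure.
    static DistanceHub get() {
        return INSTANCE;
    }

    // Swapping the measure here changes it for the whole clustering run.
    void setMeasure(ToDoubleBiFunction<double[], double[]> newMeasure) {
        this.measure = newMeasure;
    }

    double calculateDistance(double[] lhs, double[] rhs) {
        if (lhs.length != rhs.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        return measure.applyAsDouble(lhs, rhs);
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

Because all distance calculations pass through one method, a fitness function and a clustering algorithm cannot silently disagree about which measure is in use.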

calculateDistance

public double calculateDistance(int x,
                                int y)
A central point where the cached distance between the two given patterns can be retrieved.

Parameters:
x - index of the first pattern
y - index of the second pattern
Returns:
the cached distance between the two given patterns
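
A cache of pattern-to-pattern distances could look roughly like the sketch below. It assumes that the distances are precomputed once for a fixed data set and stored in a symmetric matrix; the class name DistanceCacheSketch, the double[][] representation of patterns and the Euclidean measure are hypothetical choices, and the documentation does not say how ClusterableDataSet actually stores its cache.

```java
// Hypothetical sketch of an index-based distance cache (not the CIlib implementation).
final class DistanceCacheSketch {
    private final double[][] cache;

    // Precompute every pairwise distance once; patterns are assumed not to change afterwards.
    DistanceCacheSketch(double[][] patterns) {
        int n = patterns.length;
        cache = new double[n][n];
        for (int x = 0; x < n; x++) {
            for (int y = x + 1; y < n; y++) {
                double d = euclidean(patterns[x], patterns[y]);
                cache[x][y] = d;
                cache[y][x] = d; // a distance is symmetric, so store it in both cells
            }
        }
    }

    // Lookup by pattern index; no distance is recomputed here.
    double calculateDistance(int x, int y) {
        return cache[x][y];
    }

    private static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

The trade-off is memory for speed: the matrix grows quadratically with the number of patterns, but each lookup is a constant-time array access.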

arrangeClustersAndCentroids

public void arrangeClustersAndCentroids(Vector centroids)
Arranges the clusters and centroids in three steps. The three methods called in this method must be called in this specific order:
  1. Arrange the centroids (split them up to be manageable)
  2. Arrange the clusters (assign patterns to their closest centroids) (depends on Step 1)
  3. Remove the empty clusters and their associated centroids from the arranged lists, thereby finalizing the arranging of clusters (depends on both Steps 1 & 2)

Parameters:
centroids - the Vector that represents the centroids
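
The three steps can be sketched as follows. This is an illustration under stated assumptions, not the CIlib source: it assumes that the centroids argument is a flat concatenation of equally sized centroids, it uses double[] in place of Vector and a Map in place of Hashtable, and the names ArrangeSketch, originalCentroids and arrangedCentroids are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the three arranging steps (not the CIlib implementation).
final class ArrangeSketch {
    final List<double[]> originalCentroids = new ArrayList<>();
    final List<Map<Integer, double[]>> originalClusters = new ArrayList<>();
    final List<double[]> arrangedCentroids;
    final List<Map<Integer, double[]>> arrangedClusters;

    ArrangeSketch(double[] flatCentroids, int dimension, double[][] patterns) {
        if (dimension <= 0 || flatCentroids.length == 0 || flatCentroids.length % dimension != 0) {
            throw new IllegalArgumentException("centroids must be a non-empty multiple of the dimension");
        }
        // Step 1: split the flat vector into one array per centroid.
        for (int i = 0; i < flatCentroids.length; i += dimension) {
            originalCentroids.add(Arrays.copyOfRange(flatCentroids, i, i + dimension));
            originalClusters.add(new LinkedHashMap<>());
        }
        // Step 2: assign every pattern (keyed by its index) to its closest centroid.
        for (int p = 0; p < patterns.length; p++) {
            int best = 0;
            double bestDistance = Double.MAX_VALUE;
            for (int c = 0; c < originalCentroids.size(); c++) {
                double d = distance(patterns[p], originalCentroids.get(c));
                if (d < bestDistance) {
                    bestDistance = d;
                    best = c;
                }
            }
            originalClusters.get(best).put(p, patterns[p]);
        }
        // Step 3: copy the lists, then drop empty clusters together with their centroids.
        arrangedCentroids = new ArrayList<>(originalCentroids);
        arrangedClusters = new ArrayList<>(originalClusters);
        // Iterate backwards so that a removal does not shift the indices still to be visited.
        for (int c = arrangedClusters.size() - 1; c >= 0; c--) {
            if (arrangedClusters.get(c).isEmpty()) {
                arrangedClusters.remove(c);
                arrangedCentroids.remove(c);
            }
        }
    }

    private static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

The order matters because Step 2 needs the split centroids from Step 1, and Step 3 can only tell which clusters are empty once every pattern has been assigned.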

getPatternsInDataSet

public ArrayList<ClusterableDataSet.Pattern> getPatternsInDataSet()
Get the patterns in the clusterableDataSet.

Returns:
the patterns in the clusterableDataSet

getNumberOfPatternsInDataSet

public int getNumberOfPatternsInDataSet()
Get the number of patterns in the clusterableDataSet.

Returns:
the number of patterns in the clusterableDataSet.

getOriginalCentroids

public ArrayList<Vector> getOriginalCentroids()
Get the structure that represents the split-up centroids before the non-associated centroids were removed.

Returns:
an ArrayList of Vectors that may contain centroids that are not associated with any patterns

getArrangedCentroids

public ArrayList<Vector> getArrangedCentroids()
Get the structure that represents the split-up centroids after the non-associated centroids have been removed.

Returns:
an ArrayList of Vectors that contains only centroids that are associated with at least one pattern

getOriginalClusters

public ArrayList<Hashtable<Integer,ClusterableDataSet.Pattern>> getOriginalClusters()
Get the structure that represents the separate clusters before the empty clusters were removed.

Returns:
an ArrayList of Hashtables that may contain empty clusters

getArrangedClusters

public ArrayList<Hashtable<Integer,ClusterableDataSet.Pattern>> getArrangedClusters()
Get the structure that represents the separate clusters after the empty clusters have been removed.

Returns:
an ArrayList of Hashtables that does NOT contain empty clusters

getDataSetMean

public Vector getDataSetMean()
Get the mean Vector that has been cached by the clusterableDataSet.

Returns:
a Vector that represents the mean of all the patterns inside the clusterableDataSet

getDataSetVariance

public double getDataSetVariance()
Get the variance (scalar) that has been cached by the clusterableDataSet.

Returns:
a double that represents the variance of all the patterns inside the clusterableDataSet
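
One plausible way to obtain a mean Vector and a scalar variance for a data set is sketched below. The documentation does not define the variance, so the formula used here (the average squared Euclidean distance between each pattern and the mean) is an assumption, as are the class name DataSetStatsSketch and the use of double[] in place of Vector.

```java
// Hypothetical sketch of a data set mean and scalar variance (not the CIlib implementation).
final class DataSetStatsSketch {

    // Component-wise mean of all patterns; assumes at least one pattern of equal dimension.
    static double[] mean(double[][] patterns) {
        double[] center = new double[patterns[0].length];
        for (double[] pattern : patterns) {
            for (int i = 0; i < center.length; i++) {
                center[i] += pattern[i];
            }
        }
        for (int i = 0; i < center.length; i++) {
            center[i] /= patterns.length;
        }
        return center;
    }

    // Assumed definition: average squared Euclidean distance from each pattern to the mean.
    static double variance(double[][] patterns) {
        double[] center = mean(patterns);
        double sum = 0.0;
        for (double[] pattern : patterns) {
            for (int i = 0; i < center.length; i++) {
                double diff = pattern[i] - center[i];
                sum += diff * diff;
            }
        }
        return sum / patterns.length;
    }
}
```

Both values depend only on the patterns, which is why it makes sense for the data set to compute them once and cache them.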


Copyright © 2009 CIRG. All Rights Reserved.