java.lang.Object
  net.sourceforge.cilib.problem.OptimisationProblemAdapter
      net.sourceforge.cilib.problem.ClusteringProblem
public class ClusteringProblem
This class sets up and configures a problem that is capable of clustering the data in a dataset, more specifically the data contained in an AssociatedPairDataSetBuilder. Clustering is an optimisation problem: the process of optimising a clustering is driven by a fitness function that determines the fitness of a specific clustering. This class therefore wraps a FunctionOptimisationProblem (called the innerProblem) that may be either a FunctionMinimisationProblem or a FunctionMaximisationProblem. The FunctionOptimisationProblem in turn makes use of a function that determines the fitness of the problem being optimised. Because we are clustering data in a dataset, this function should be a ClusteringFitnessFunction.
The following methods should be called (usually from XML, in this order) to correctly configure a clustering problem:
1. setDomain(String)
2. setInnerProblem(FunctionOptimisationProblem)
3. FunctionOptimisationProblem.setFunction(Function) on the innerProblem
4. setDataSetBuilder(DataSetBuilder)
5. DataSetBuilder.addDataSet(DataSet) on the dataset builder
Note that the domain of the dataset (that is, of this clustering problem), the number of clusters and the fitness function used to optimise the clustering are all dependent on one another. The domain of the dataset is duplicated a number of times and then used as the domain of the clustering fitness function; the number of clusters determines how many times the domain string is duplicated. See regenerateDomain() for more detail. The duplication is needed because the centroids of a clustering are represented by a single Entity, such as a Particle or an Individual (both of which have an internal representation of a Vector), and these entities are initialised using the domain of the FunctionOptimisationProblem, which is effectively the domain of the ClusteringFitnessFunction.
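The domain duplication described above can be illustrated with a small stand-alone sketch. It does not use CIlib: the class name DomainDuplicationSketch, the methods duplicateDomain and dimension, and the comma-separated domain string format are all assumptions made for illustration. The actual logic lives in regenerateDomain() and may differ.

```java
/**
 * Stand-alone illustration, not CIlib code. Assumes (for this sketch only) that a
 * domain is a comma-separated list of per-dimension components, e.g. "R(0,10),R(0,10)".
 */
final class DomainDuplicationSketch {

    /** Repeats the dataset domain once per cluster so that one vector can hold every centroid. */
    static String duplicateDomain(String datasetDomain, int numberOfClusters) {
        if (numberOfClusters < 1) {
            throw new IllegalArgumentException("numberOfClusters must be at least 1");
        }
        StringBuilder combined = new StringBuilder();
        for (int cluster = 0; cluster < numberOfClusters; cluster++) {
            if (cluster > 0) {
                combined.append(',');
            }
            combined.append(datasetDomain);
        }
        return combined.toString();
    }

    /** Counts the top-level components of a domain string, ignoring commas inside parentheses. */
    static int dimension(String domain) {
        int depth = 0;
        int components = 1;
        for (char c : domain.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                components++;
            }
        }
        return components;
    }

    public static void main(String[] args) {
        String datasetDomain = "R(0,10),R(0,10)"; // hypothetical two-dimensional dataset
        String fitnessDomain = duplicateDomain(datasetDomain, 3);
        System.out.println(fitnessDomain);
        System.out.println(dimension(fitnessDomain));
    }
}
```

Under these assumptions, a two-dimensional dataset clustered into three clusters gives the fitness function a six-dimensional domain, which is why changing the dataset domain, the number of clusters or the inner problem each requires the domain to be regenerated.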
This class also provides a central point for specifying the distanceMeasure that should be used for calculating distances throughout the entire clustering process.
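The idea of a single, centrally configured distance measure can be sketched as follows. This is a stand-alone illustration rather than CIlib code: the Measure interface, the Problem class, the Euclidean default and the array-based calculateDistance helper are hypothetical stand-ins for DistanceMeasure, this class and ClusteringUtils.calculateDistance(Vector, Vector).

```java
/** Stand-alone illustration, not CIlib code; every type here is a hypothetical stand-in. */
final class DistanceMeasureSketch {

    /** Plays the role of a pluggable distance measure. */
    interface Measure {
        double distance(double[] a, double[] b);
    }

    static final Measure EUCLIDEAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    };

    static final Measure MANHATTAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    /** Plays the role of the problem that owns the one configured measure. */
    static final class Problem {
        private Measure measure = EUCLIDEAN; // default assumed for this sketch only

        void setDistanceMeasure(Measure measure) {
            this.measure = measure;
        }

        Measure getDistanceMeasure() {
            return measure;
        }
    }

    /** Central helper: every distance calculation asks the problem for its measure. */
    static double calculateDistance(Problem problem, double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        return problem.getDistanceMeasure().distance(a, b);
    }

    public static void main(String[] args) {
        Problem problem = new Problem();
        double[] origin = {0.0, 0.0};
        double[] point = {3.0, 4.0};
        System.out.println(calculateDistance(problem, origin, point)); // prints 5.0
        problem.setDistanceMeasure(MANHATTAN);
        System.out.println(calculateDistance(problem, origin, point)); // prints 7.0
    }
}
```

Because callers never pick a measure themselves, swapping the measure on the problem changes every distance calculation in the clustering at once.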
See Also:
regenerateDomain(), Serialized Form
Field Summary |
---|
Fields inherited from class net.sourceforge.cilib.problem.OptimisationProblemAdapter |
---|
dataSetBuilder, fitnessEvaluations |
Constructor Summary | |
---|---|
ClusteringProblem() | |
ClusteringProblem(ClusteringProblem rhs) | |
Method Summary | |
---|---|
protected Fitness | calculateFitness(Type solution): We are actually optimising the innerProblem, so use it to calculate the fitness. |
DomainRegistry | getBehaviouralDomain(): Return the actual domain of the problem's dataset. |
ClusteringProblem | getClone(): Create a cloned copy of the current object and return it. |
DistanceMeasure | getDistanceMeasure(): This method is called from ClusteringUtils.calculateDistance(Vector, Vector), which is the central point for distance calculations during a clustering. |
DomainRegistry | getDomain(): Return the domain as used by the configured fitness function. |
DomainRegistry | getDomainRegistry(): Return the actual domain of the problem's dataset. |
int | getNumberOfClusters(): Return the number of clusters used throughout this clustering problem. |
void | setDataSetBuilder(DataSetBuilder dsb): Use the DataSetManager singleton to parse and/or retrieve the given DataSetBuilder. |
void | setDistanceMeasure(DistanceMeasure dm): Set the DistanceMeasure that will be used for all distance calculations throughout a clustering. |
void | setDomain(String representation): Sets the domain of the dataset being clustered. |
void | setDomainRegistry(DomainRegistry dr): Set the actual domain of the problem's dataset. |
void | setInnerProblem(FunctionOptimisationProblem fop): Sets the problem that will be used to optimise the clustering. |
void | setNumberOfClusters(int noc): The expert uses this method to set the number of clusters that should be used to optimise this clustering. |
Methods inherited from class net.sourceforge.cilib.problem.OptimisationProblemAdapter |
---|
accept, changeEnvironment, getChangeStrategy, getDataSetBuilder, getFitness, getFitnessEvaluations, setChangeStrategy |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public ClusteringProblem()
public ClusteringProblem(ClusteringProblem rhs)
Method Detail |
---|
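public ClusteringProblem getClone()
Description copied from interface: OptimisationProblem
Create a cloned copy of the current object and return it.
Specified by:
getClone in interface OptimisationProblem
getClone in interface Problem
getClone in interface Cloneable
Overrides:
getClone in class OptimisationProblemAdapter
See Also:
Object.clone()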
public void setInnerProblem(FunctionOptimisationProblem fop)
Sets the problem that will be used to optimise the clustering. This is a FunctionOptimisationProblem that optimises (minimises or maximises) the fitness as calculated by a ClusteringFitnessFunction. Once the problem is set (changed), the domain of the ClusteringFitnessFunction is automatically regenerated.
Parameters:
fop - a FunctionOptimisationProblem that should take a ClusteringFitnessFunction that drives the optimisation process.
See Also:
regenerateDomain()
public int getNumberOfClusters()
Return the number of clusters used throughout this clustering problem.
Returns:
numberOfClusters
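public void setNumberOfClusters(int noc)
The expert uses this method to set the number of clusters that should be used to optimise this clustering. The domain of the ClusteringFitnessFunction is then automatically regenerated.
Parameters:
noc - the user-specified number of clusters that should be used to optimise this clustering
See Also:
regenerateDomain()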
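public DomainRegistry getDomainRegistry()
Return the actual domain of the problem's dataset.
Returns:
the domainRegistry of this clustering problem
public void setDomainRegistry(DomainRegistry dr)
Set the actual domain of the problem's dataset. The domain of the ClusteringFitnessFunction is then automatically regenerated.
Parameters:
dr - the domainRegistry of this clustering problem
See Also:
regenerateDomain()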
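public DomainRegistry getBehaviouralDomain()
Return the actual domain of the problem's dataset.
Returns:
the domainRegistry of this clustering problem
public void setDomain(String representation)
Sets the domain of the dataset being clustered. The domain of the ClusteringFitnessFunction is then automatically regenerated.
Parameters:
representation - a String representing the domain of the dataset being clustered
See Also:
regenerateDomain()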
public DomainRegistry getDomain()
Return the domain as used by the configured fitness function.
Returns:
the innerProblem's function's domain registry
public void setDataSetBuilder(DataSetBuilder dsb)
Use the DataSetManager singleton to parse and/or retrieve the given DataSetBuilder. Then use the ClusteringUtils per-thread singleton to set the DataSetBuilder as the current dataset for this clustering.
Specified by:
setDataSetBuilder in interface OptimisationProblem
Overrides:
setDataSetBuilder in class OptimisationProblemAdapter
Parameters:
dsb - the DataSetBuilder that represents the dataset that should be clustered
Throws:
IllegalArgumentException - when the given DataSetBuilder is not an AssociatedPairDataSetBuilder. This is only temporary, because I didn't want to change the more generic DataSetBuilder too much.
public void setDistanceMeasure(DistanceMeasure dm)
Set the DistanceMeasure that will be used for all distance calculations throughout a clustering.
Parameters:
dm - the desired DistanceMeasure
public DistanceMeasure getDistanceMeasure()
This method is called from ClusteringUtils.calculateDistance(Vector, Vector), which is the central point for distance calculations during a clustering.
Returns:
the distanceMeasure
protected Fitness calculateFitness(Type solution)
We are actually optimising the innerProblem, so use it to calculate the fitness.
Specified by:
calculateFitness in class OptimisationProblemAdapter
Parameters:
solution - The Type representing the candidate solution.
See Also:
OptimisationProblemAdapter.getFitness(Type, boolean)
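The delegation described for calculateFitness can be sketched in isolation. This is a stand-alone illustration, not CIlib code: InnerProblem, WrappingProblem and the array-based signatures are hypothetical stand-ins, and the public getFitness here simply forwards to calculateFitness, which may not match what OptimisationProblemAdapter.getFitness(Type, boolean) actually does.

```java
/** Stand-alone illustration, not CIlib code; every type here is a hypothetical stand-in. */
final class InnerProblemDelegationSketch {

    /** Stand-in for the wrapped inner problem that owns the fitness function. */
    interface InnerProblem {
        double fitnessOf(double[] candidate);
    }

    /** Stand-in for the outer problem that wraps the inner problem. */
    static class WrappingProblem {
        private InnerProblem innerProblem;

        void setInnerProblem(InnerProblem innerProblem) {
            this.innerProblem = innerProblem;
        }

        /** Public entry point; in this sketch it only forwards to calculateFitness. */
        final double getFitness(double[] candidate) {
            return calculateFitness(candidate);
        }

        /** The outer problem has no fitness logic of its own: it asks the inner problem. */
        protected double calculateFitness(double[] candidate) {
            if (innerProblem == null) {
                throw new IllegalStateException("setInnerProblem must be called first");
            }
            return innerProblem.fitnessOf(candidate);
        }
    }

    public static void main(String[] args) {
        WrappingProblem problem = new WrappingProblem();
        // Hypothetical fitness function: the sum of squares of the candidate's components.
        problem.setInnerProblem(candidate -> {
            double sum = 0.0;
            for (double x : candidate) {
                sum += x * x;
            }
            return sum;
        });
        System.out.println(problem.getFitness(new double[] {1.0, 2.0})); // prints 5.0
    }
}
```

The null check in the sketch mirrors why the inner problem has to be configured before any fitness is requested.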