java.lang.Object
  de.lmu.ifi.dbs.elki.logging.AbstractLoggable
    de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
      de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<V,Clustering<EMModel<V>>>
        de.lmu.ifi.dbs.elki.algorithm.clustering.EM<V>
Type Parameters:
V - a type of RealVector as a suitable datatype for this algorithm

public class EM<V extends RealVector<V,?>>
Provides the EM algorithm (clustering by expectation maximization).
Initialization is implemented as random initialization of means (uniformly distributed within the attribute ranges of the given database), with initial covariance matrices that have zero covariance and variance 1 (that is, identity matrices).

Reference:
A. P. Dempster, N. M. Laird, D. B. Rubin:
Maximum Likelihood from Incomplete Data via the EM algorithm.
In Journal of the Royal Statistical Society, Series B, 39(1), 1977, pp. 1-31
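
The initialization described above can be sketched as follows. This is a minimal illustration on plain `double` arrays, not the ELKI implementation; the class `EmInitSketch` and its method names are hypothetical, and the real class works on `Database<V>` and `Matrix` objects instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of ELKI: sketches the initialization described above. */
final class EmInitSketch {

  /** Draws k means uniformly within the per-attribute [min, max] range of the data. */
  static List<double[]> randomMeans(double[][] data, int k, Random random) {
    int dim = data[0].length;
    double[] min = new double[dim];
    double[] max = new double[dim];
    Arrays.fill(min, Double.POSITIVE_INFINITY);
    Arrays.fill(max, Double.NEGATIVE_INFINITY);
    // Determine the attribute ranges of the data set.
    for (double[] point : data) {
      for (int d = 0; d < dim; d++) {
        min[d] = Math.min(min[d], point[d]);
        max[d] = Math.max(max[d], point[d]);
      }
    }
    // Draw each coordinate of each mean uniformly within its range.
    List<double[]> means = new ArrayList<>(k);
    for (int i = 0; i < k; i++) {
      double[] mean = new double[dim];
      for (int d = 0; d < dim; d++) {
        mean[d] = min[d] + random.nextDouble() * (max[d] - min[d]);
      }
      means.add(mean);
    }
    return means;
  }

  /** Identity matrix: variance 1 on the diagonal, zero covariance elsewhere. */
  static double[][] initialCovariance(int dim) {
    double[][] matrix = new double[dim][dim];
    for (int d = 0; d < dim; d++) {
      matrix[d][d] = 1.0;
    }
    return matrix;
  }
}
```

Because the means are drawn at random, different seeds can lead to different clusterings; passing a seeded `Random` keeps a run reproducible.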
Field Summary | |
---|---|
private double | delta - Holds the value of DELTA_PARAM. |
static OptionID | DELTA_ID - OptionID for DELTA_PARAM |
private DoubleParameter | DELTA_PARAM - Parameter to specify the termination criterion for maximization of E(M): E(M) - E(M') < em.delta, must be a double equal to or greater than 0. |
private int | k - Holds the value of K_PARAM. |
static OptionID | K_ID - OptionID for K_PARAM |
private IntParameter | K_PARAM - Parameter to specify the number of clusters to find, must be an integer greater than 0. |
private static double | MIN_LOGLIKELIHOOD |
private Clustering<EMModel<V>> | result - Keeps the result. |
private static double | SINGULARITY_CHEAT - Small value added to the diagonal of a matrix to avoid singularity before computing the inverse. |
Fields inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable |
---|
optionHandler |
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable |
---|
debug, logger |
Constructor Summary | |
---|---|
EM() - Provides the EM algorithm (clustering by expectation maximization), adding the parameters K_PARAM and DELTA_PARAM to the option handler in addition to the parameters of the superclass. |
Method Summary | |
---|---|
protected void | assignProbabilitiesToInstances(Database<V> database, List<Double> normDistrFactor, List<V> means, List<Matrix> invCovMatr, List<Double> clusterWeights) - Assigns the current probability values to the instances in the database. |
protected double | expectationOfMixture(Database<V> database) - The expectation value of the current mixture of distributions. |
Description | getDescription() - Returns a description of the algorithm. |
Clustering<EMModel<V>> | getResult() - Retrieve the result. |
protected List<V> | initialMeans(Database<V> database) - Creates k random points distributed uniformly within the attribute ranges of the given database. |
protected Clustering<EMModel<V>> | runInTime(Database<V> database) - Performs the EM clustering algorithm on the given database. |
List<String> | setParameters(List<String> args) - Calls the super method and additionally sets the values of the parameters K_PARAM and DELTA_PARAM. |
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm |
---|
isTime, isVerbose, run, setTime, setVerbose |
Methods inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable |
---|
addOption, addParameterizable, addParameterizable, checkGlobalParameterConstraints, collectOptions, getAttributeSettings, getParameters, rememberParametersExcept, removeOption, removeParameterizable, shortDescription |
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable |
---|
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm |
---|
run |
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.Algorithm |
---|
setTime, setVerbose |
Methods inherited from interface de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable |
---|
checkGlobalParameterConstraints, collectOptions, getParameters, shortDescription |
Field Detail |
---|
private static final double SINGULARITY_CHEAT
Small value added to the diagonal of a matrix to avoid singularity before computing the inverse.
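
The idea behind SINGULARITY_CHEAT can be illustrated with a small sketch. The constant `EPSILON` below is a hypothetical value chosen for illustration (the actual value is not shown on this page), and `SingularitySketch` is not an ELKI class; it only shows how a tiny increment on the diagonal turns a singular matrix into an invertible one.

```java
/** Hypothetical illustration, not part of ELKI. */
final class SingularitySketch {

  /** Assumed value for illustration only; the real constant is not documented here. */
  static final double EPSILON = 1e-9;

  /** Returns a copy of the square matrix with EPSILON added to each diagonal entry. */
  static double[][] regularize(double[][] matrix) {
    int n = matrix.length;
    double[][] copy = new double[n][];
    for (int r = 0; r < n; r++) {
      copy[r] = matrix[r].clone();
      copy[r][r] += EPSILON;
    }
    return copy;
  }

  /** Determinant of a 2x2 matrix; zero means the matrix has no inverse. */
  static double determinant2x2(double[][] m) {
    return m[0][0] * m[1][1] - m[0][1] * m[1][0];
  }
}
```

For the singular matrix {{1, 1}, {1, 1}} the determinant is 0, while the regularized copy has a small positive determinant and can therefore be inverted.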
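public static final OptionID K_ID
OptionID for K_PARAM

private final IntParameter K_PARAM
Parameter to specify the number of clusters to find, must be an integer greater than 0.
Key: -em.k

private int k
Holds the value of K_PARAM.

public static final OptionID DELTA_ID
OptionID for DELTA_PARAM

private static final double MIN_LOGLIKELIHOOD

private final DoubleParameter DELTA_PARAM
Parameter to specify the termination criterion for maximization of E(M): E(M) - E(M') < em.delta, must be a double equal to or greater than 0.
Default value: 0.0
Key: -em.delta

private double delta
Holds the value of DELTA_PARAM.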
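
The termination criterion controlled by delta can be sketched as a loop that stops once the expectation changes by less than delta between two iterations. This is an assumed reading of E(M) - E(M') < em.delta that compares the absolute difference; the class `EmTerminationSketch`, the `step` callback and the `maxIterations` guard are hypothetical and not part of ELKI.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical illustration, not part of ELKI. */
final class EmTerminationSketch {

  /**
   * Calls step (one EM iteration returning the new expectation value) until the
   * expectation changes by less than delta, or until maxIterations is reached.
   * Returns the number of iterations performed.
   */
  static int iterateUntilConverged(DoubleSupplier step, double initialExpectation,
      double delta, int maxIterations) {
    double previous = initialExpectation;
    int iterations = 0;
    while (iterations < maxIterations) {
      double current = step.getAsDouble();
      iterations++;
      // Assumed criterion: stop when |E(M') - E(M)| < delta.
      if (Math.abs(current - previous) < delta) {
        break;
      }
      previous = current;
    }
    return iterations;
  }
}
```

In this sketch a delta of 0.0 never satisfies the strict comparison, so the iteration guard is what ends the loop in that case.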
private Clustering<EMModel<V>> result
Keeps the result.
Constructor Detail |
---|
public EM()
Provides the EM algorithm (clustering by expectation maximization), adding the parameters K_PARAM and DELTA_PARAM to the option handler in addition to the parameters of the superclass.
Method Detail |
---|
protected Clustering<EMModel<V>> runInTime(Database<V> database) throws IllegalStateException
Performs the EM clustering algorithm on the given database.
runInTime in class AbstractAlgorithm<V extends RealVector<V,?>,Clustering<EMModel<V extends RealVector<V,?>>>>
Parameters:
database - the database to run the algorithm on
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(List<String>) method has not been called).

protected void assignProbabilitiesToInstances(Database<V> database, List<Double> normDistrFactor, List<V> means, List<Matrix> invCovMatr, List<Double> clusterWeights)
Assigns the current probability values to the instances in the database.
Parameters:
database - the database used for assignment to instances
normDistrFactor - normalization factor for the density function, based on the current covariance matrix
means - the current means
invCovMatr - the inverse covariance matrices
clusterWeights - the weights of the current clusters

protected double expectationOfMixture(Database<V> database)
The expectation value of the current mixture of distributions.
Parameters:
database - the database where the prior probability of each instance is associated
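
The roles of the parameters of assignProbabilitiesToInstances and of expectationOfMixture can be sketched on plain arrays. This is a minimal illustration under stated assumptions, not the ELKI code: each cluster is taken to be a Gaussian whose density is the normalization factor times exp(-0.5 * (x - mean)^T * invCov * (x - mean)), and the expectation is assumed to be the sum of log P(x) over all instances. The class `EmAssignmentSketch` and its method names are hypothetical.

```java
/** Hypothetical illustration, not part of ELKI. */
final class EmAssignmentSketch {

  /** Gaussian density of x: normFactor * exp(-0.5 * (x - mean)^T * invCov * (x - mean)). */
  static double density(double[] x, double normFactor, double[] mean, double[][] invCov) {
    int dim = x.length;
    double[] diff = new double[dim];
    for (int d = 0; d < dim; d++) {
      diff[d] = x[d] - mean[d];
    }
    double quadraticForm = 0.0;
    for (int r = 0; r < dim; r++) {
      for (int c = 0; c < dim; c++) {
        quadraticForm += diff[r] * invCov[r][c] * diff[c];
      }
    }
    return normFactor * Math.exp(-0.5 * quadraticForm);
  }

  /** P(x): sum over clusters of clusterWeight * density. */
  static double mixtureDensity(double[] x, double[] normFactors, double[][] means,
      double[][][] invCovs, double[] weights) {
    double sum = 0.0;
    for (int i = 0; i < weights.length; i++) {
      sum += weights[i] * density(x, normFactors[i], means[i], invCovs[i]);
    }
    return sum;
  }

  /** P(C_i | x) for every cluster i; the entries sum to 1 when P(x) is positive. */
  static double[] posteriors(double[] x, double[] normFactors, double[][] means,
      double[][][] invCovs, double[] weights) {
    double total = mixtureDensity(x, normFactors, means, invCovs, weights);
    double[] result = new double[weights.length];
    for (int i = 0; i < weights.length; i++) {
      result[i] = weights[i] * density(x, normFactors[i], means[i], invCovs[i]) / total;
    }
    return result;
  }

  /** Assumed form of the expectation: sum of log P(x) over all instances. */
  static double expectationOfMixture(double[][] data, double[] normFactors, double[][] means,
      double[][][] invCovs, double[] weights) {
    double sum = 0.0;
    for (double[] x : data) {
      sum += Math.log(mixtureDensity(x, normFactors, means, invCovs, weights));
    }
    return sum;
  }
}
```

A point halfway between two equally weighted clusters with identical covariance receives a posterior of 0.5 for each, while a point located at one mean is assigned almost entirely to that cluster.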
protected List<V> initialMeans(Database<V> database)
Creates k random points distributed uniformly within the attribute ranges of the given database.
Parameters:
database - the database must contain enough points to ascertain the range of attribute values; fewer than two points would make no sense. The content of the database is not otherwise touched.
Returns:
k random points distributed uniformly within the attribute ranges of the given database

public Description getDescription()
Description copied from interface: Algorithm
Returns a description of the algorithm.
getDescription in interface Algorithm<V extends RealVector<V,?>,Clustering<EMModel<V extends RealVector<V,?>>>>
public Clustering<EMModel<V>> getResult()
Description copied from interface: ClusteringAlgorithm
Retrieve the result.
getResult in interface Algorithm<V extends RealVector<V,?>,Clustering<EMModel<V extends RealVector<V,?>>>>
getResult in interface ClusteringAlgorithm<Clustering<EMModel<V extends RealVector<V,?>>>,V extends RealVector<V,?>>
public List<String> setParameters(List<String> args) throws ParameterException
Calls the super method and additionally sets the values of the parameters K_PARAM and DELTA_PARAM.
setParameters in interface Parameterizable
setParameters in class AbstractAlgorithm<V extends RealVector<V,?>,Clustering<EMModel<V extends RealVector<V,?>>>>
Parameters:
args - the parameters according to which the attributes are set
Throws:
ParameterException - in case of wrong parameter setting
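
An argument list for setParameters might be assembled as in the following sketch. The keys -em.k and -em.delta and the value constraints come from the field documentation above; that each key is followed by its value as a separate list element is an assumption, and `EmArgsSketch` is a hypothetical helper, not an ELKI class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of ELKI. */
final class EmArgsSketch {

  /** Builds an argument list, assuming each key is followed by its value. */
  static List<String> buildArgs(int k, double delta) {
    // Constraints as documented: k > 0 and delta >= 0.
    if (k <= 0) {
      throw new IllegalArgumentException("k must be an integer greater than 0");
    }
    if (!(delta >= 0.0)) {
      throw new IllegalArgumentException("delta must be equal to or greater than 0");
    }
    List<String> args = new ArrayList<>();
    args.add("-em.k");
    args.add(Integer.toString(k));
    args.add("-em.delta");
    args.add(Double.toString(delta));
    return args;
  }
}
```

The resulting list would then be handed to setParameters(List<String>) on an EM instance; checking the constraints up front gives an early error before a ParameterException can occur.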