Type Parameters:
V - vector type to analyze
M - model type to produce

@Title(value="EM-Clustering: Clustering by Expectation Maximization")
@Description(value="Cluster data via Gaussian mixture modeling and the EM algorithm")
@Reference(authors="A. P. Dempster, N. M. Laird, D. B. Rubin", title="Maximum Likelihood from Incomplete Data via the EM algorithm", booktitle="Journal of the Royal Statistical Society, Series B, 39(1), 1977, pp. 1-31", url="http://www.jstor.org/stable/2984875")
@Alias(value={"de.lmu.ifi.dbs.elki.algorithm.clustering.EM","EM"})
public class EM<V extends NumberVector,M extends MeanModel>
extends AbstractAlgorithm<Clustering<M>>
implements ClusteringAlgorithm<Clustering<M>>
Reference: A. P. Dempster, N. M. Laird, D. B. Rubin:
Maximum Likelihood from Incomplete Data via the EM algorithm.
In Journal of the Royal Statistical Society, Series B, 39(1), 1977, pp. 1-31
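For orientation, the two alternating steps of EM for a Gaussian mixture can be written as below. This is the standard textbook formulation, not a transcription of this class: the symbols (n instances x_i, k components with weights w_j, means \mu_j, covariances \Sigma_j, and responsibilities \gamma_{ij}) are notation chosen here for illustration.

```latex
\begin{aligned}
\text{E-step:}\quad
  & \gamma_{ij} = \frac{w_j\,\mathcal{N}(x_i \mid \mu_j, \Sigma_j)}
                       {\sum_{l=1}^{k} w_l\,\mathcal{N}(x_i \mid \mu_l, \Sigma_l)} \\[4pt]
\text{M-step:}\quad
  & w_j = \frac{1}{n}\sum_{i=1}^{n}\gamma_{ij}, \qquad
    \mu_j = \frac{\sum_{i=1}^{n}\gamma_{ij}\,x_i}{\sum_{i=1}^{n}\gamma_{ij}}, \qquad
    \Sigma_j = \frac{\sum_{i=1}^{n}\gamma_{ij}\,(x_i-\mu_j)(x_i-\mu_j)^{\top}}
                    {\sum_{i=1}^{n}\gamma_{ij}} \\[4pt]
\text{Log-likelihood:}\quad
  & \log L = \sum_{i=1}^{n}\log\sum_{j=1}^{k} w_j\,\mathcal{N}(x_i \mid \mu_j, \Sigma_j)
\end{aligned}
```

Each iteration does not decrease the log-likelihood, which is why a threshold on its change is a common stopping rule.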
Modifier and Type | Class and Description |
---|---|
static class | EM.Parameterizer<V extends NumberVector,M extends MeanModel>: Parameterization class. |
Modifier and Type | Field and Description |
---|---|
private double | delta: Delta parameter |
private int | k: Number of clusters |
private static Logging | LOG: The logger for this class. |
private int | maxiter: Maximum number of iterations to allow |
private EMClusterModelFactory<V,M> | mfactory: Factory for producing the initial cluster model. |
private static double | MIN_LOGLIKELIHOOD |
private boolean | soft: Retain soft assignments. |
static SimpleTypeInformation<double[]> | SOFT_TYPE: Soft assignment result type. |
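The soft assignment type above is a plain `double[]`. A minimal sketch of how such a vector might be read, assuming it holds one membership probability per cluster (this page does not spell out the layout), is shown below. The class and method names are hypothetical and are not part of ELKI.

```java
/** Illustrative only: interpreting a soft assignment vector. Not ELKI code. */
final class SoftAssignmentSketch {
  /**
   * Returns the index of the largest entry, i.e. the "hard" cluster
   * for an instance whose soft assignment is probs. Ties go to the
   * lowest index.
   */
  static int hardAssignment(double[] probs) {
    int best = 0;
    for (int j = 1; j < probs.length; j++) {
      if (probs[j] > probs[best]) {
        best = j;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    // Hypothetical soft assignment over k = 3 clusters.
    double[] probs = { 0.1, 0.7, 0.2 };
    System.out.println("hard cluster index: " + hardAssignment(probs));
  }
}
```

Keeping the full vector is useful when an instance lies between clusters, since the argmax alone discards how close the runner-up was.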
Constructor and Description |
---|
EM(int k, double delta, EMClusterModelFactory<V,M> mfactory, int maxiter, boolean soft): Constructor. |
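To make the roles of `k`, `delta` and `maxiter` concrete, here is a minimal one-dimensional EM loop in plain Java. It is a sketch under stated assumptions, not ELKI's implementation: it assumes `delta` is a threshold on the change in log-likelihood between iterations (this page only calls it the "Delta parameter"), takes `k` from the number of initial means, and adds a small variance floor as a guard of its own. All names are hypothetical.

```java
import java.util.Arrays;

/** Minimal 1D Gaussian-mixture EM loop. Illustrative only, not ELKI code. */
final class EmLoopSketch {
  /** Density of the normal distribution N(mean, var) at x. */
  static double gauss(double x, double mean, double var) {
    double d = x - mean;
    return Math.exp(-0.5 * d * d / var) / Math.sqrt(2 * Math.PI * var);
  }

  /**
   * Runs EM starting from the given means, variances and weights
   * (k = means.length) and updates them in place.
   *
   * @return the number of iterations performed
   */
  static int fit(double[] x, double[] means, double[] vars, double[] weights,
      double delta, int maxiter) {
    final int n = x.length, k = means.length;
    double[][] resp = new double[n][k];
    double oldLogLik = Double.NEGATIVE_INFINITY;
    int it = 0;
    while (it < maxiter) {
      it++;
      // E-step: responsibilities and log-likelihood under current parameters.
      // Note: no guard against sum == 0 (all densities underflowing).
      double logLik = 0;
      for (int i = 0; i < n; i++) {
        double sum = 0;
        for (int j = 0; j < k; j++) {
          resp[i][j] = weights[j] * gauss(x[i], means[j], vars[j]);
          sum += resp[i][j];
        }
        for (int j = 0; j < k; j++) {
          resp[i][j] /= sum;
        }
        logLik += Math.log(sum);
      }
      // M-step: re-estimate weight, mean and variance of each component.
      for (int j = 0; j < k; j++) {
        double wsum = 0, m = 0;
        for (int i = 0; i < n; i++) {
          wsum += resp[i][j];
          m += resp[i][j] * x[i];
        }
        m /= wsum;
        double v = 0;
        for (int i = 0; i < n; i++) {
          double d = x[i] - m;
          v += resp[i][j] * d * d;
        }
        means[j] = m;
        vars[j] = Math.max(v / wsum, 1e-6); // variance floor (assumption)
        weights[j] = wsum / n;
      }
      // Assumed stopping rule: log-likelihood changed by at most delta.
      if (Math.abs(logLik - oldLogLik) <= delta) {
        break;
      }
      oldLogLik = logLik;
    }
    return it;
  }

  public static void main(String[] args) {
    double[] x = { 0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1 };
    double[] means = { 0.5, 4.5 }, vars = { 1, 1 }, weights = { 0.5, 0.5 };
    int it = fit(x, means, vars, weights, 1e-6, 100);
    System.out.println(it + " iterations, means = " + Arrays.toString(means));
  }
}
```

A smaller `delta` trades more iterations for a tighter fit, while `maxiter` bounds the run time when the likelihood keeps improving slowly.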
Modifier and Type | Method and Description |
---|---|
static double | assignProbabilitiesToInstances(Relation<? extends NumberVector> relation, List<? extends EMClusterModel<?>> models, WritableDataStore<double[]> probClusterIGivenX): Assigns the current probability values to the instances in the database and computes the expectation value of the current mixture of distributions. |
TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger(): Get the (STATIC) logger for this class. |
boolean | isSoft() |
static void | recomputeCovarianceMatrices(Relation<? extends NumberVector> relation, WritableDataStore<double[]> probClusterIGivenX, List<? extends EMClusterModel<?>> models): Recompute the covariance matrices. |
Clustering<M> | run(Database database, Relation<V> relation): Performs the EM clustering algorithm on the given database. |
void | setSoft(boolean soft) |
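The two static methods in the table correspond to the expectation and maximization steps. The sketch below shows one common way to carry out each for a single instance and a single cluster in plain Java: normalizing per-cluster log values with the log-sum-exp trick, and computing a probability-weighted mean and covariance matrix. It is an illustration under assumptions, not ELKI's code. The helper names and the array-based data layout are hypothetical, and the page does not say whether ELKI works in log space.

```java
import java.util.Arrays;

/** Illustrative helpers for the E- and M-steps. Not ELKI code. */
final class EmStepsSketch {
  /**
   * E-step for one instance. On input, logp[j] holds
   * log(w_j) + log N(x | mu_j, Sigma_j). On return, logp[j] holds the
   * probability of cluster j given x. Assumes at least one finite entry.
   *
   * @return log(sum_j exp(logp[j])), the log-likelihood of the instance
   */
  static double normalizeLog(double[] logp) {
    double max = Double.NEGATIVE_INFINITY;
    for (double v : logp) {
      max = Math.max(max, v);
    }
    double sum = 0;
    for (int j = 0; j < logp.length; j++) {
      logp[j] = Math.exp(logp[j] - max); // shift by max to avoid underflow
      sum += logp[j];
    }
    for (int j = 0; j < logp.length; j++) {
      logp[j] /= sum;
    }
    return max + Math.log(sum);
  }

  /**
   * M-step for one cluster. w[i] is the probability that instance i
   * belongs to the cluster. Writes the weighted mean into mean and
   * returns the weighted covariance matrix.
   */
  static double[][] weightedCovariance(double[][] data, double[] w, double[] mean) {
    final int n = data.length, d = mean.length;
    double wsum = 0;
    Arrays.fill(mean, 0);
    for (int i = 0; i < n; i++) {
      wsum += w[i];
      for (int a = 0; a < d; a++) {
        mean[a] += w[i] * data[i][a];
      }
    }
    for (int a = 0; a < d; a++) {
      mean[a] /= wsum;
    }
    double[][] cov = new double[d][d];
    for (int i = 0; i < n; i++) {
      for (int a = 0; a < d; a++) {
        for (int b = 0; b < d; b++) {
          cov[a][b] += w[i] * (data[i][a] - mean[a]) * (data[i][b] - mean[b]);
        }
      }
    }
    for (double[] row : cov) {
      for (int b = 0; b < d; b++) {
        row[b] /= wsum;
      }
    }
    return cov;
  }

  public static void main(String[] args) {
    double[] logp = { Math.log(0.2), Math.log(0.6) };
    double ll = normalizeLog(logp);
    System.out.println(Arrays.toString(logp) + " logLik=" + ll);
    double[] mean = new double[2];
    double[][] cov = weightedCovariance(new double[][] { { 0, 0 }, { 2, 2 } },
        new double[] { 1, 3 }, mean);
    System.out.println(Arrays.toString(mean) + " " + Arrays.deepToString(cov));
  }
}
```

Working in log space keeps the normalization stable when every density is tiny, which is typical in high dimensions or far from all cluster centers.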
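Methods inherited from class AbstractAlgorithm: makeParameterDistanceFunction, run
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface ClusteringAlgorithm: run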
private static final Logging LOG
private int k
private double delta
private EMClusterModelFactory<V,M> mfactory
private int maxiter
private boolean soft
private static final double MIN_LOGLIKELIHOOD
public static final SimpleTypeInformation<double[]> SOFT_TYPE
public EM(int k, double delta, EMClusterModelFactory<V,M> mfactory, int maxiter, boolean soft)
Parameters:
k - k parameter
delta - delta parameter
mfactory - EM cluster model factory
maxiter - Maximum number of iterations
soft - Include soft assignments

public Clustering<M> run(Database database, Relation<V> relation)
Parameters:
database - Database
relation - Relation

public static void recomputeCovarianceMatrices(Relation<? extends NumberVector> relation, WritableDataStore<double[]> probClusterIGivenX, List<? extends EMClusterModel<?>> models)
Parameters:
relation - Vector data
probClusterIGivenX - Object probabilities
models - Cluster models to update

public static double assignProbabilitiesToInstances(Relation<? extends NumberVector> relation, List<? extends EMClusterModel<?>> models, WritableDataStore<double[]> probClusterIGivenX)
Parameters:
relation - the database used for assignment to instances
models - Cluster models
probClusterIGivenX - Output storage for cluster probabilities

public TypeInformation[] getInputTypeRestriction()
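Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<Clustering<M extends MeanModel>>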

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.
Specified by: getLogger in class AbstractAlgorithm<Clustering<M extends MeanModel>>

public boolean isSoft()
public void setSoft(boolean soft)
Parameters:
soft - whether to retain soft assignments

Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München.