public abstract class AbstractKMeansQualityMeasure<O extends NumberVector> extends java.lang.Object implements KMeansQualityMeasure<O>
References:
The use of information-theoretic criteria for evaluating k-means was popularized by X-means:
D. Pelleg, A. Moore
X-means: Extending K-means with Efficient Estimation on the Number of Clusters
Proc. 17th Int. Conf. on Machine Learning (ICML 2000)
A different version of logLikelihood is derived in:
Q. Zhao, M. Xu, P. Fränti
Knee Point Detection on Bayesian Information Criterion
20th IEEE International Conference on Tools with Artificial Intelligence
Constructor and Description |
---|
AbstractKMeansQualityMeasure() |
Modifier and Type | Method and Description |
---|---|
static <V extends NumberVector> double | logLikelihood(Relation<V> relation, Clustering<? extends MeanModel> clustering, NumberVectorDistanceFunction<? super V> distanceFunction): Computes the log likelihood of an entire clustering. |
static int | numberOfFreeParameters(Relation<? extends NumberVector> relation, Clustering<? extends MeanModel> clustering): Computes the number of free parameters. |
static int | numPoints(Clustering<? extends MeanModel> clustering): Computes the number of points in a given set of clusters (which may be less than the complete data set for X-means). |
static <V extends NumberVector> double | varianceOfCluster(Cluster<? extends MeanModel> cluster, NumberVectorDistanceFunction<? super V> distanceFunction, Relation<V> relation): Variance contribution of a single cluster. |
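Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface KMeansQualityMeasure: isBetter, quality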
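
To make the static helpers listed in the method summary more concrete, the following is a minimal, self-contained sketch of the quantities they compute, written against plain `double` arrays instead of ELKI's `Relation`, `Clustering`, and distance function types. The class `KMeansCriterionSketch` and its method signatures are hypothetical and not part of ELKI. The sketch assumes squared Euclidean distance, and it assumes the log likelihood and parameter count as formulated in the X-means paper (identical spherical Gaussians with one pooled variance); ELKI's actual implementation may differ in detail.

```java
/** Illustrative stand-in on plain arrays; hypothetical, not part of ELKI. */
final class KMeansCriterionSketch {
  /** Counts the points in the given clusters; clusters[c] holds the points of cluster c. */
  static int numPoints(double[][][] clusters) {
    int n = 0;
    for (double[][] cluster : clusters) {
      n += cluster.length;
    }
    return n;
  }

  /** Sum of squared Euclidean distances from each point of one cluster to its mean. */
  static double varianceContribution(double[][] cluster, double[] mean) {
    double sum = 0.;
    for (double[] point : cluster) {
      for (int d = 0; d < mean.length; d++) {
        double diff = point[d] - mean[d];
        sum += diff * diff;
      }
    }
    return sum;
  }

  /** Assumed count: (k - 1) cluster priors, k * dim mean coordinates, 1 shared variance. */
  static int numberOfFreeParameters(int k, int dim) {
    return (k - 1) + k * dim + 1;
  }

  /** Log likelihood of a clustering under identical spherical Gaussians (assumed formula). */
  static double logLikelihood(double[][][] clusters, double[][] means) {
    final int k = clusters.length;
    final int dim = means[0].length;
    final int r = numPoints(clusters);
    if (r <= k) {
      throw new IllegalArgumentException("Need more points than clusters.");
    }
    double ssq = 0.;
    for (int c = 0; c < k; c++) {
      ssq += varianceContribution(clusters[c], means[c]);
    }
    final double variance = ssq / (r - k); // pooled variance estimate
    double ll = 0.;
    for (double[][] cluster : clusters) {
      final int rn = cluster.length;
      if (rn == 0) {
        continue; // choice made in this sketch: skip empty clusters to avoid 0 * log(0)
      }
      ll += -0.5 * rn * Math.log(2 * Math.PI) // Gaussian normalization
          - 0.5 * rn * dim * Math.log(variance) // variance term
          - 0.5 * (rn - k) // exponent term at the estimated variance
          + rn * Math.log(rn) - rn * Math.log(r); // cluster prior rn / r
    }
    return ll;
  }

  public static void main(String[] args) {
    // Two one-dimensional clusters with means 1 and 11.
    double[][][] clusters = { { { 0 }, { 2 } }, { { 10 }, { 12 } } };
    double[][] means = { { 1 }, { 11 } };
    System.out.println("points: " + numPoints(clusters));
    System.out.println("free parameters: " + numberOfFreeParameters(clusters.length, 1));
    System.out.println("log likelihood: " + logLikelihood(clusters, means));
  }
}
```

Because `numPoints` counts only the clusters passed in, the same functions can score a subset of the data, which is why the point count may be smaller than the complete data set in X-means.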

public static int numPoints(Clustering<? extends MeanModel> clustering)
Computes the number of points in a given set of clusters (which may be less than the complete data set for X-means).
Parameters:
clustering - Clustering to analyze

public static <V extends NumberVector> double varianceOfCluster(Cluster<? extends MeanModel> cluster, NumberVectorDistanceFunction<? super V> distanceFunction, Relation<V> relation)
Variance contribution of a single cluster. If possible, this information is reused from the clustering process (when a KMeansModel is returned).
Type Parameters:
V - Vector type
Parameters:
cluster - Cluster to access
distanceFunction - Distance function
relation - Data relation

@Reference(authors="D. Pelleg, A. Moore", title="X-means: Extending K-means with Efficient Estimation on the Number of Clusters", booktitle="Proc. 17th Int. Conf. on Machine Learning (ICML 2000)", url="http://www.pelleg.org/shared/hp/download/xmeans.ps", bibkey="DBLP:conf/icml/PellegM00")
public static <V extends NumberVector> double logLikelihood(Relation<V> relation, Clustering<? extends MeanModel> clustering, NumberVectorDistanceFunction<? super V> distanceFunction)
Computes the log likelihood of an entire clustering. Version as used in the X-means publication.
Type Parameters:
V - Vector type
Parameters:
relation - Data relation
clustering - Clustering
distanceFunction - Distance function

public static int numberOfFreeParameters(Relation<? extends NumberVector> relation, Clustering<? extends MeanModel> clustering)
Computes the number of free parameters.
Parameters:
relation - Data relation (for dimensionality)
clustering - Set of clusters

Copyright © 2019 ELKI Development Team.