V - Vector type
M - Model type
@Reference(authors="D. Pelleg, A. Moore", title="X-means: Extending K-means with Efficient Estimation on the Number of Clusters", booktitle="Proc. 17th Int. Conf. on Machine Learning (ICML 2000)", url="http://www.pelleg.org/shared/hp/download/xmeans.ps", bibkey="DBLP:conf/icml/PellegM00")
public class XMeans<V extends NumberVector,M extends MeanModel> extends AbstractKMeans<V,M>
Note: this implementation does not currently use a k-d tree for acceleration. Also note that kmax is not a hard threshold: the algorithm can return up to 2*kmax clusters!
Reference:
D. Pelleg, A. Moore
X-means: Extending K-means with Efficient Estimation on the Number of Clusters
Proc. 17th Int. Conf. on Machine Learning (ICML 2000)
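The note on kmax can be illustrated with a small counting sketch. It assumes (this is not confirmed by the page above) that the bound is only checked between rounds of splitting and that, in the worst case, every cluster is split in every round. The class and method names are hypothetical and not part of ELKI.

```java
/** Hypothetical illustration, not part of ELKI. */
final class XMeansClusterCountSketch {
  /**
   * Worst-case number of clusters if every cluster is split in each round.
   * Assumption: the kMax bound is tested only before a round starts.
   */
  static int worstCaseClusters(int kMin, int kMax) {
    if (kMin < 1 || kMax < kMin) {
      throw new IllegalArgumentException("need 1 <= kMin <= kMax");
    }
    int k = kMin;
    while (k < kMax) {
      k *= 2; // every existing cluster is replaced by two children
    }
    return k;
  }
}
```

Under these assumptions, starting with kMin = 2 and kMax = 5 gives 2, then 4, then 8 clusters, which exceeds kMax but stays below 2*kMax.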
Modifier and Type | Class and Description |
---|---|
static class | XMeans.Parameterizer<V extends NumberVector,M extends MeanModel> - Parameterization class. |

Nested classes inherited from class AbstractKMeans: AbstractKMeans.Instance
Modifier and Type | Field and Description |
---|---|
(package private) KMeansQualityMeasure<V> | informationCriterion - Information criterion to choose the better split. |
private KMeans<V,M> | innerKMeans - Inner k-means algorithm. |
private int | k - Effective number of clusters, minimum and maximum. |
private int | k_max - Effective number of clusters, minimum and maximum. |
private int | k_min - Effective number of clusters, minimum and maximum. |
private static java.lang.String | KEY - Key for statistics logging. |
private static Logging | LOG - The logger for this class. |
(package private) RandomFactory | rnd - Random factory. |
(package private) PredefinedInitialMeans | splitInitializer - Initializer for k-means. |
Inherited fields: initializer, maxiter, distanceFunction, ALGORITHM_ID, INIT_ID, K_ID, MAXITER_ID, SEED_ID, VARSTAT_ID, DISTANCE_FUNCTION_ID
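The role of an information criterion in choosing "the better split" can be sketched as follows. This is a simplified, self-contained BIC in the spirit of the Pelleg and Moore paper, assuming 1-D data and a spherical Gaussian model with one shared variance. It is not ELKI's `KMeansQualityMeasure` implementation, and all names below are hypothetical.

```java
/** Hypothetical illustration of a BIC-based split decision, not part of ELKI. */
final class BicSplitSketch {
  /** Sum of squared deviations of each value from its own cluster mean. */
  static double sumOfSquares(double[][] clusters) {
    double ss = 0;
    for (double[] c : clusters) {
      double mean = 0;
      for (double x : c) {
        mean += x;
      }
      mean /= c.length;
      for (double x : c) {
        ss += (x - mean) * (x - mean);
      }
    }
    return ss;
  }

  /**
   * BIC of a 1-D clustering; larger is better.
   * Assumptions: more points than clusters, and non-zero within-cluster variance.
   */
  static double bic(double[][] clusters) {
    int k = clusters.length, d = 1, n = 0;
    for (double[] c : clusters) {
      n += c.length;
    }
    // Unbiased estimate of the shared variance.
    double variance = sumOfSquares(clusters) / (n - k);
    double logLik = -0.5 * n * Math.log(2 * Math.PI)
        - 0.5 * n * d * Math.log(variance)
        - 0.5 * (n - k);
    for (double[] c : clusters) {
      logLik += c.length * Math.log((double) c.length / n); // cluster prior term
    }
    int params = (k - 1) + k * d + 1; // priors, means, shared variance
    return logLik - 0.5 * params * Math.log(n);
  }

  /** Keep the two children only if they score better than the parent. */
  static boolean shouldSplit(double[] parent, double[] left, double[] right) {
    return bic(new double[][] { left, right }) > bic(new double[][] { parent });
  }
}
```

With this sketch, two well-separated groups score better as two clusters, while evenly spread values do not justify the extra parameters of a split.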
Constructor and Description |
---|
XMeans(NumberVectorDistanceFunction<? super V> distanceFunction, int k_min, int k_max, int maxiter, KMeans<V,M> innerKMeans, KMeansInitialization initializer, KMeansQualityMeasure<V> informationCriterion, RandomFactory random) - Constructor. |
Modifier and Type | Method and Description |
---|---|
TypeInformation[] | getInputTypeRestriction() - Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger() - Get the (STATIC) logger for this class. |
Clustering<M> | run(Database database, Relation<V> relation) - Run the algorithm on a database and relation. |
protected double[][] | splitCentroid(Cluster<? extends MeanModel> parentCluster, Relation<V> relation) - Split an existing centroid into two initial centers. |
protected java.util.List<Cluster<M>> | splitCluster(Cluster<M> parentCluster, Database database, Relation<V> relation) - Conditionally splits the clusters based on the information criterion. |
Inherited methods: incrementalUpdateMean, initialMeans, means, minusEquals, nearestMeans, plusEquals, plusMinusEquals, setDistanceFunction, setInitializer, setK, getDistanceFunction, run
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
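Splitting an existing centroid into two initial centers, as `splitCentroid` is described above, can be sketched in a self-contained way. The version below assumes one common approach: move two children in opposite directions along a random unit vector, by a caller-supplied radius. Whether ELKI chooses the direction and distance this way is not stated on this page; the class name and the `radius` parameter are hypothetical.

```java
import java.util.Random;

/** Hypothetical illustration of a centroid split, not part of ELKI. */
final class SplitCentroidSketch {
  /**
   * Returns two centers placed symmetrically around the parent centroid,
   * each at distance {@code radius} along a random direction.
   */
  static double[][] splitCentroid(double[] centroid, double radius, Random rnd) {
    int dim = centroid.length;
    double[] dir = new double[dim];
    double norm = 0;
    while (norm == 0) { // redraw in the unlikely case of a zero vector
      for (int i = 0; i < dim; i++) {
        dir[i] = rnd.nextGaussian();
        norm += dir[i] * dir[i];
      }
    }
    norm = Math.sqrt(norm);
    double[][] centers = new double[2][dim];
    for (int i = 0; i < dim; i++) {
      double offset = radius * dir[i] / norm;
      centers[0][i] = centroid[i] + offset;
      centers[1][i] = centroid[i] - offset;
    }
    return centers;
  }
}
```

The two returned rows could then serve as predefined initial means for a 2-means run on the parent cluster's points, which is the role the `splitInitializer` field suggests.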
private static final Logging LOG
private static final java.lang.String KEY
private KMeans<V extends NumberVector,M extends MeanModel> innerKMeans
private int k
private int k_min
private int k_max
PredefinedInitialMeans splitInitializer
KMeansQualityMeasure<V extends NumberVector> informationCriterion
RandomFactory rnd
public XMeans(NumberVectorDistanceFunction<? super V> distanceFunction, int k_min, int k_max, int maxiter, KMeans<V,M> innerKMeans, KMeansInitialization initializer, KMeansQualityMeasure<V> informationCriterion, RandomFactory random)
distanceFunction - Distance function
k_min - k_min parameter - minimum number of result clusters
k_max - k_max parameter - maximum number of result clusters
maxiter - Maximum number of iterations each.
innerKMeans - K-Means variant to use inside.
informationCriterion - The information criterion used for the splitting step
random - Random factory
public Clustering<M> run(Database database, Relation<V> relation)
database - Database to process
relation - Data relation
protected java.util.List<Cluster<M>> splitCluster(Cluster<M> parentCluster, Database database, Relation<V> relation)
parentCluster - Cluster to split
database - Database
relation - Data relation
protected double[][] splitCentroid(Cluster<? extends MeanModel> parentCluster, Relation<V> relation)
parentCluster - Existing cluster
relation - Data relation
public TypeInformation[] getInputTypeRestriction()
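Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.
Specified by: getInputTypeRestriction in interface Algorithm
Overrides: getInputTypeRestriction in class AbstractKMeans<V extends NumberVector,M extends MeanModel>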
protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.
Specified by: getLogger in class AbstractAlgorithm<Clustering<M extends MeanModel>>
Copyright © 2019 ELKI Development Team. License information.