Type Parameters:
V - Vector type
M - Model type

@Reference(authors="D. Pelleg, A. Moore", booktitle="X-means: Extending K-means with Efficient Estimation on the Number of Clusters", title="Proceedings of the 17th International Conference on Machine Learning (ICML 2000)", url="http://www.pelleg.org/shared/hp/download/xmeans.ps")
public class XMeans<V extends NumberVector,M extends MeanModel> extends AbstractKMeans<V,M>

Reference:
D. Pelleg, A. Moore:
X-means: Extending K-means with Efficient Estimation on the Number of Clusters
In: Proceedings of the 17th International Conference on Machine Learning (ICML 2000)
Modifier and Type | Class | Description |
---|---|---|
static class | XMeans.Parameterizer<V extends NumberVector,M extends MeanModel> | Parameterization class. |
Modifier and Type | Field | Description |
---|---|---|
(package private) KMeansQualityMeasure<V> | informationCriterion | Information criterion to choose the better split. |
private KMeans<V,M> | innerKMeans | Inner k-means algorithm. |
private int | k | Effective number of clusters. |
private int | k_max | Maximum number of clusters. |
private int | k_min | Minimum number of clusters. |
private static String | KEY | Key for statistics logging. |
private static Logging | LOG | The logger for this class. |
(package private) RandomFactory | rnd | Random factory. |
(package private) PredefinedInitialMeans | splitInitializer | Initializer for k-means. |
Inherited fields (one line per declaring type):
initializer, maxiter
distanceFunction
INIT_ID, K_ID, MAXITER_ID, SEED_ID
DISTANCE_FUNCTION_ID
Constructor | Description |
---|---|
XMeans(NumberVectorDistanceFunction<? super V> distanceFunction, int k_min, int k_max, int maxiter, KMeans<V,M> innerKMeans, KMeansInitialization<? super V> initializer, PredefinedInitialMeans splitInitializer, KMeansQualityMeasure<V> informationCriterion, RandomFactory random) | Constructor. |
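
The constructor takes k_min and k_max as bounds on the number of result clusters. As an illustration only, the following sketch shows one way such bounds could drive an outer loop: start from an initial clustering (assumed here to already contain k_min clusters), offer every cluster to a splitter, and stop when nothing splits or the budget is used up. The class `XMeansLoopSketch`, the method `grow`, and the `splitter` callback are hypothetical names, not ELKI API, and enforcing k_max strictly is an assumption of this sketch rather than documented behavior of XMeans.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of an outer loop bounded by a maximum cluster count; not ELKI code. */
final class XMeansLoopSketch {
  /**
   * Offers every cluster to the splitter, round after round, until no cluster
   * splits or the number of clusters reaches kMax.
   * The splitter returns either the cluster itself (no split) or its children.
   */
  static <C> List<C> grow(List<C> initial, int kMax, Function<C, List<C>> splitter) {
    List<C> clusters = new ArrayList<>(initial);
    while (clusters.size() < kMax) {
      List<C> next = new ArrayList<>();
      for (int i = 0; i < clusters.size(); i++) {
        C current = clusters.get(i);
        int unprocessed = clusters.size() - i - 1;
        List<C> parts = splitter.apply(current);
        // Accept the split only if the total would still respect kMax.
        if (next.size() + parts.size() + unprocessed <= kMax) {
          next.addAll(parts);
        } else {
          next.add(current);
        }
      }
      if (next.size() == clusters.size()) {
        break; // no cluster was split in this round
      }
      clusters = next;
    }
    return clusters;
  }

  public static void main(String[] args) {
    // Toy "clusters" are just sizes; a cluster of size n splits into two halves.
    Function<Integer, List<Integer>> halve = n -> {
      if (n >= 2) {
        return Arrays.asList(n / 2, n - n / 2);
      }
      return Collections.singletonList(n);
    };
    System.out.println(grow(Collections.singletonList(8), 3, halve)); // prints [2, 2, 4]
  }
}
```

In the toy run, the second round can afford only one more split, so the last cluster of size 4 is kept whole.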
Modifier and Type | Method | Description |
---|---|---|
TypeInformation[] | getInputTypeRestriction() | Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger() | Get the (static) logger for this class. |
Clustering<M> | run(Database database, Relation<V> relation) | Run the algorithm on a database and relation. |
protected List<? extends NumberVector> | splitCentroid(Cluster<? extends MeanModel> parentCluster, Relation<V> relation) | Split an existing centroid into two initial centers. |
protected List<Cluster<M>> | splitCluster(Cluster<M> parentCluster, Database database, Relation<V> relation) | Conditionally splits a cluster based on the information criterion. |
Inherited methods (one line per declaring type):
assignToNearestCluster, incrementalUpdateMean, logVarstat, macQueenIterate, means, medians, setDistanceFunction, setK
getDistanceFunction
makeParameterDistanceFunction, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
getDistanceFunction
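
splitCluster is summarized as conditionally splitting a cluster based on the information criterion. The sketch below illustrates that idea on one-dimensional data: run 2-means on the parent cluster, score both the parent and the two children, and keep the children only if they score better. Everything here is an assumption made for illustration: the class `SplitDecisionSketch`, the seeding at half the cluster radius, and the BIC formula, which follows the spherical-Gaussian form described in the cited paper. The actual class delegates to innerKMeans and a KMeansQualityMeasure, whose details may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical 1-D sketch of a BIC-guarded cluster split; not ELKI code. */
final class SplitDecisionSketch {
  static double mean(List<Double> pts) {
    double sum = 0;
    for (double p : pts) sum += p;
    return sum / pts.size();
  }

  /** BIC of a partition under one shared variance; higher is better. */
  static double bic(List<List<Double>> clusters) {
    int k = clusters.size(), n = 0;
    double total = 0; // sum of squared deviations from each cluster mean
    for (List<Double> c : clusters) {
      double m = mean(c);
      n += c.size();
      for (double p : c) total += (p - m) * (p - m);
    }
    double variance = total / (n - k);
    double logLik = -0.5 * n * Math.log(2 * Math.PI) - 0.5 * n * Math.log(variance) - 0.5 * (n - k);
    for (List<Double> c : clusters) logLik += c.size() * Math.log((double) c.size() / n);
    int params = (k - 1) + k + 1; // mixing weights, 1-D centers, shared variance
    return logLik - 0.5 * params * Math.log(n);
  }

  /** Lloyd iterations with two centers seeded symmetrically around the parent mean. */
  static List<List<Double>> twoMeans(List<Double> pts, int maxiter) {
    double m = mean(pts), radius = 0;
    for (double p : pts) radius = Math.max(radius, Math.abs(p - m));
    double lo = m - 0.5 * radius, hi = m + 0.5 * radius; // assumed offset: half the radius
    List<Double> a = new ArrayList<>(), b = new ArrayList<>();
    for (int it = 0; it < maxiter; it++) {
      a = new ArrayList<>();
      b = new ArrayList<>();
      for (double p : pts) (Math.abs(p - lo) <= Math.abs(p - hi) ? a : b).add(p);
      if (a.isEmpty() || b.isEmpty()) break;
      double newLo = mean(a), newHi = mean(b);
      if (newLo == lo && newHi == hi) break; // converged
      lo = newLo;
      hi = newHi;
    }
    return Arrays.asList(a, b);
  }

  /** Returns the two children if they score better than the parent, else the parent alone. */
  static List<List<Double>> splitCluster(List<Double> parent, int maxiter) {
    List<List<Double>> parentOnly = new ArrayList<>();
    parentOnly.add(parent);
    if (parent.size() < 4) return parentOnly; // too few points to score a split
    List<List<Double>> children = twoMeans(parent, maxiter);
    if (children.get(0).isEmpty() || children.get(1).isEmpty()) return parentOnly;
    return bic(children) > bic(parentOnly) ? children : parentOnly;
  }

  public static void main(String[] args) {
    List<Double> even = Arrays.asList(0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0);
    List<Double> separated = Arrays.asList(0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3);
    System.out.println(splitCluster(even, 20).size());      // prints 1
    System.out.println(splitCluster(separated, 20).size()); // prints 2
  }
}
```

Evenly spread points gain too little likelihood to pay for the extra parameters, so the parent is kept; two well-separated groups are split.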
private static final Logging LOG
private static final String KEY
private KMeans<V extends NumberVector,M extends MeanModel> innerKMeans
private int k
private int k_min
private int k_max
PredefinedInitialMeans splitInitializer
KMeansQualityMeasure<V extends NumberVector> informationCriterion
RandomFactory rnd
public XMeans(NumberVectorDistanceFunction<? super V> distanceFunction, int k_min, int k_max, int maxiter, KMeans<V,M> innerKMeans, KMeansInitialization<? super V> initializer, PredefinedInitialMeans splitInitializer, KMeansQualityMeasure<V> informationCriterion, RandomFactory random)
Parameters:
distanceFunction - Distance function
k_min - Minimum number of result clusters
k_max - Maximum number of result clusters
maxiter - Maximum number of iterations for each k-means run
innerKMeans - K-means variant to use inside
informationCriterion - The information criterion used for the splitting step
random - Random factory

public Clustering<M> run(Database database, Relation<V> relation)
Parameters:
database - Database to process
relation - Data relation

protected List<Cluster<M>> splitCluster(Cluster<M> parentCluster, Database database, Relation<V> relation)
Parameters:
parentCluster - Cluster to split
database - Database
relation - Data relation

protected List<? extends NumberVector> splitCentroid(Cluster<? extends MeanModel> parentCluster, Relation<V> relation)
Parameters:
parentCluster - Existing cluster
relation - Data relation

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Specified by: getInputTypeRestriction in interface Algorithm
Overrides: getInputTypeRestriction in class AbstractKMeans<V extends NumberVector,M extends MeanModel>

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Specified by: getLogger in class AbstractAlgorithm<Clustering<M extends MeanModel>>
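
splitCentroid is documented as splitting an existing centroid into two initial centers. One common way to do this, sketched below under stated assumptions, is to move away from the parent centroid in opposite directions along a random unit vector, with the step length tied to the cluster radius. The class `SplitCentroidSketch`, the `fraction` parameter, the Euclidean radius, and the use of plain `double[]` arrays and `java.util.Random` are all hypothetical choices for this sketch; the real method works on a Cluster and a Relation and may place the centers differently.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of splitting one centroid into two seeds; not ELKI code. */
final class SplitCentroidSketch {
  /**
   * Returns two centers placed symmetrically around the centroid, along a
   * random direction, at the given fraction of the cluster radius.
   */
  static double[][] splitCentroid(double[] centroid, double[][] points, double fraction, Random rnd) {
    int dim = centroid.length;
    // Cluster radius: largest Euclidean distance from the centroid to a member.
    double radius = 0;
    for (double[] p : points) {
      double d2 = 0;
      for (int d = 0; d < dim; d++) d2 += (p[d] - centroid[d]) * (p[d] - centroid[d]);
      radius = Math.max(radius, Math.sqrt(d2));
    }
    // Random direction; redraw in the unlikely case of a zero vector.
    double[] dir = new double[dim];
    double norm;
    do {
      norm = 0;
      for (int d = 0; d < dim; d++) {
        dir[d] = rnd.nextGaussian();
        norm += dir[d] * dir[d];
      }
    } while (norm == 0);
    norm = Math.sqrt(norm);
    // Step the same distance in both directions from the parent centroid.
    double[][] centers = new double[2][dim];
    for (int d = 0; d < dim; d++) {
      double offset = dir[d] / norm * fraction * radius;
      centers[0][d] = centroid[d] - offset;
      centers[1][d] = centroid[d] + offset;
    }
    return centers;
  }

  public static void main(String[] args) {
    double[] centroid = {1.0, 3.0};
    double[][] points = {{0.0, 2.0}, {2.0, 2.0}, {1.0, 5.0}};
    double[][] seeds = splitCentroid(centroid, points, 0.5, new Random(42L));
    System.out.println(Arrays.toString(seeds[0]) + " " + Arrays.toString(seeds[1]));
  }
}
```

Because the two seeds are mirror images, their midpoint is always the parent centroid, which keeps the subsequent 2-means run local to the parent cluster.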

Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München.