XMeans (ELKI: Environment for DeveLoping KDD-Applications Supported by Index-Structures)

java.lang.Object
- de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<R>
  - de.lmu.ifi.dbs.elki.algorithm.AbstractNumberVectorDistanceBasedAlgorithm<V,Clustering<M>>
    - de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.AbstractKMeans<V,M>
      - de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.XMeans<V,M>

Type Parameters:

V - Vector type

M - Model type

All Implemented Interfaces:

Algorithm, ClusteringAlgorithm<Clustering<M>>, KMeans<V,M>, DistanceBasedAlgorithm<V>
```
@Reference(authors="D. Pelleg, A. Moore",
           title="X-means: Extending K-means with Efficient Estimation on the Number of Clusters",
           booktitle="Proc. 17th Int. Conf. on Machine Learning (ICML 2000)",
           url="http://www.pelleg.org/shared/hp/download/xmeans.ps",
           bibkey="DBLP:conf/icml/PellegM00")
public class XMeans<V extends NumberVector,M extends MeanModel>
extends AbstractKMeans<V,M>
```
X-means: Extending K-means with Efficient Estimation on the Number of Clusters.
Note: this implementation does not currently use a k-d-tree for acceleration. Also note that k_max is not a hard threshold: the algorithm can return up to 2*k_max clusters.
Reference:
D. Pelleg, A. Moore
X-means: Extending K-means with Efficient Estimation on the Number of Clusters
Proc. 17th Int. Conf. on Machine Learning (ICML 2000)
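
The note that k_max is not a hard threshold can be illustrated with a little arithmetic. The sketch below is not ELKI code; it assumes (hypothetically) that the cluster count is only compared against k_max between splitting rounds, and that in the worst case every cluster splits in two during a round. The class and method names are made up for this illustration.

```java
class KMaxBoundSketch {
  /**
   * Worst-case number of clusters, assuming every cluster splits in every
   * round and k_max is only checked between rounds (assumption of this sketch).
   */
  static int worstCaseClusters(int kMin, int kMax) {
    if (kMin < 1) {
      throw new IllegalArgumentException("kMin must be at least 1");
    }
    int k = kMin;
    while (k < kMax) {
      k *= 2; // every cluster is replaced by its two children
    }
    return k;
  }

  public static void main(String[] args) {
    System.out.println(worstCaseClusters(3, 4));  // prints 6
    System.out.println(worstCaseClusters(2, 10)); // prints 16
  }
}
```

Under these assumptions the result can exceed k_max but stays below 2*k_max, because the last round starts with fewer than k_max clusters.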

Since:

0.7.0

Author:

Tibor Goldschwendt, Erich Schubert

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`XMeans.Parameterizer<V extends NumberVector,M extends MeanModel>` Parameterization class.

Nested classes/interfaces inherited from class de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.AbstractKMeans
AbstractKMeans.Instance

Field Summary

Fields
Modifier and Type	Field and Description
`(package private) KMeansQualityMeasure<V>`	`informationCriterion` Information criterion to choose the better split.
`private KMeans<V,M>`	`innerKMeans` Inner k-means algorithm.
`private int`	`k` Effective number of clusters.
`private int`	`k_max` Maximum number of clusters.
`private int`	`k_min` Minimum number of clusters.
`private static java.lang.String`	`KEY` Key for statistics logging.
`private static Logging`	`LOG` The logger for this class.
`(package private) RandomFactory`	`rnd` Random factory.
`(package private) PredefinedInitialMeans`	`splitInitializer` Initializer for k-means.

Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.AbstractKMeans
initializer, maxiter

Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractNumberVectorDistanceBasedAlgorithm
distanceFunction

Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
ALGORITHM_ID

Fields inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.KMeans
INIT_ID, K_ID, MAXITER_ID, SEED_ID, VARSTAT_ID

Fields inherited from interface de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm
DISTANCE_FUNCTION_ID

Constructor Summary

Constructors
Constructor and Description
`XMeans(NumberVectorDistanceFunction<? super V> distanceFunction, int k_min, int k_max, int maxiter, KMeans<V,M> innerKMeans, KMeansInitialization initializer, KMeansQualityMeasure<V> informationCriterion, RandomFactory random)` Constructor.

Method Summary

Modifier and Type	Method and Description
`TypeInformation[]`	`getInputTypeRestriction()` Get the input type restriction used for negotiating the data query.
`protected Logging`	`getLogger()` Get the (STATIC) logger for this class.
`Clustering<M>`	`run(Database database, Relation<V> relation)` Run the algorithm on a database and relation.
`protected double[][]`	`splitCentroid(Cluster<? extends MeanModel> parentCluster, Relation<V> relation)` Split an existing centroid into two initial centers.
`protected java.util.List<Cluster<M>>`	`splitCluster(Cluster<M> parentCluster, Database database, Relation<V> relation)` Conditionally splits the clusters based on the information criterion.

Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.AbstractKMeans
incrementalUpdateMean, initialMeans, means, minusEquals, nearestMeans, plusEquals, plusMinusEquals, setDistanceFunction, setInitializer, setK

Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractNumberVectorDistanceBasedAlgorithm
getDistanceFunction

Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
run

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm
run

Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm
getDistanceFunction

- Field Detail
  - LOG
```
private static final Logging LOG
```
    The logger for this class.
  - KEY
```
private static final java.lang.String KEY
```
    Key for statistics logging.
  - innerKMeans
```
private KMeans<V extends NumberVector,M extends MeanModel> innerKMeans
```
    Inner k-means algorithm.
  - k
```
private int k
```
    Effective number of clusters.
  - k_min
```
private int k_min
```
    Minimum number of clusters.
  - k_max
```
private int k_max
```
    Maximum number of clusters.
  - splitInitializer
```
PredefinedInitialMeans splitInitializer
```
    Initializer for k-means.
  - informationCriterion
```
KMeansQualityMeasure<V extends NumberVector> informationCriterion
```
    Information criterion to choose the better split.
  - rnd
```
RandomFactory rnd
```
    Random factory.
- Constructor Detail
  - XMeans
```
public XMeans(NumberVectorDistanceFunction<? super V> distanceFunction,
              int k_min,
              int k_max,
              int maxiter,
              KMeans<V,M> innerKMeans,
              KMeansInitialization initializer,
              KMeansQualityMeasure<V> informationCriterion,
              RandomFactory random)
```
    Constructor.
    
    Parameters:
    
    distanceFunction - Distance function
    
    k_min - Minimum number of result clusters
    
    k_max - Maximum number of result clusters (not a hard limit, see the class description)
    
    maxiter - Maximum number of iterations for each k-means run
    
    innerKMeans - K-means variant to use inside
    
    initializer - Initialization method for the k-means centers
    
    informationCriterion - Information criterion used for the splitting step
    
    random - Random factory
- Method Detail
  - run
```
public Clustering<M> run(Database database,
                         Relation<V> relation)
```
    Run the algorithm on a database and relation.
    
    Parameters:
    
    database - Database to process
    
    relation - Data relation
    
    Returns:
    
    Clustering result.
  - splitCluster
```
protected java.util.List<Cluster<M>> splitCluster(Cluster<M> parentCluster,
                                                  Database database,
                                                  Relation<V> relation)
```
    Conditionally splits the clusters based on the information criterion.
    
    Parameters:
    
    parentCluster - Cluster to split
    
    database - Database
    
    relation - Data relation
    
    Returns:
    
    The parent cluster if the split decreases clustering quality, or the child clusters if the split improves it.
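
    The keep-or-split decision described above can be sketched in plain Java without any ELKI classes. The sketch assumes the Bayesian information criterion in the form given in the X-means paper (one shared spherical variance, higher score is better); the criterion actually applied by this class is whichever KMeansQualityMeasure is passed to the constructor. The class name, method names, and the idea of passing the two child partitions in directly (instead of running the inner k-means) are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SplitDecisionSketch {
  /** BIC of a partition, assuming one shared spherical Gaussian variance. */
  static double bic(List<double[][]> clusters) {
    int k = clusters.size();
    int dim = clusters.get(0)[0].length;
    int total = 0;
    double sse = 0.;
    for (double[][] c : clusters) {
      total += c.length;
      double[] mean = new double[dim];
      for (double[] p : c) {
        for (int d = 0; d < dim; d++) {
          mean[d] += p[d] / c.length;
        }
      }
      for (double[] p : c) {
        for (int d = 0; d < dim; d++) {
          sse += (p[d] - mean[d]) * (p[d] - mean[d]);
        }
      }
    }
    // Sketch assumes total > k and a nonzero sum of squared errors.
    double variance = sse / (total - k);
    double logLik = 0.;
    for (double[][] c : clusters) {
      int n = c.length;
      logLik += -0.5 * n * Math.log(2 * Math.PI)
          - 0.5 * n * dim * Math.log(variance)
          - 0.5 * (n - k)
          + n * Math.log(n) - n * Math.log(total);
    }
    // Free parameters: k-1 cluster weights, k centers, one variance.
    int freeParams = (k - 1) + dim * k + 1;
    return logLik - 0.5 * freeParams * Math.log(total);
  }

  /** True if the two children score better than the unsplit parent. */
  static boolean shouldSplit(double[][] parent, double[][] childA, double[][] childB) {
    double parentScore = bic(Collections.singletonList(parent));
    double childScore = bic(Arrays.asList(childA, childB));
    return childScore > parentScore;
  }

  public static void main(String[] args) {
    double[][] parent = { { 0 }, { 1 }, { 2 }, { 100 }, { 101 }, { 102 } };
    double[][] a = { { 0 }, { 1 }, { 2 } };
    double[][] b = { { 100 }, { 101 }, { 102 } };
    System.out.println(shouldSplit(parent, a, b)); // prints true
  }
}
```

    Two well-separated groups are split, while four evenly spaced points such as 0, 1, 2, 3 are kept as one cluster under this criterion, because the gain in likelihood does not outweigh the penalty for the extra parameters.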
  - splitCentroid
```
protected double[][] splitCentroid(Cluster<? extends MeanModel> parentCluster,
                                   Relation<V> relation)
```
    Split an existing centroid into two initial centers.
    
    Parameters:
    
    parentCluster - Existing cluster
    
    relation - Data relation
    
    Returns:
    
    Array containing the two new centroids
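
    One way to split a centroid into two initial centers, in the spirit of the X-means paper, is to move away from the parent centroid in opposite directions along a random vector whose length depends on the size of the cluster. The sketch below is a self-contained illustration and not the ELKI implementation: the class name, the use of the maximum point distance as the cluster radius, and the `fraction` parameter are all assumptions.

```java
import java.util.Random;

class SplitCentroidSketch {
  /**
   * Returns two centers placed symmetrically around the centroid of the
   * given points, at distance fraction * radius along a random direction.
   */
  static double[][] splitCentroid(double[][] points, Random random, double fraction) {
    int dim = points[0].length;
    // Centroid of the parent cluster.
    double[] centroid = new double[dim];
    for (double[] p : points) {
      for (int d = 0; d < dim; d++) {
        centroid[d] += p[d] / points.length;
      }
    }
    // Radius: largest Euclidean distance of a point from the centroid.
    double radius = 0.;
    for (double[] p : points) {
      double sq = 0.;
      for (int d = 0; d < dim; d++) {
        sq += (p[d] - centroid[d]) * (p[d] - centroid[d]);
      }
      radius = Math.max(radius, Math.sqrt(sq));
    }
    // Random direction; Gaussian components give a uniform direction.
    // A zero-length vector has probability zero and is not handled here.
    double[] dir = new double[dim];
    double norm = 0.;
    for (int d = 0; d < dim; d++) {
      dir[d] = random.nextGaussian();
      norm += dir[d] * dir[d];
    }
    double scale = fraction * radius / Math.sqrt(norm);
    double[] first = new double[dim], second = new double[dim];
    for (int d = 0; d < dim; d++) {
      first[d] = centroid[d] - dir[d] * scale;
      second[d] = centroid[d] + dir[d] * scale;
    }
    return new double[][] { first, second };
  }

  public static void main(String[] args) {
    double[][] pts = { { 0, 0 }, { 2, 0 }, { 0, 2 }, { 2, 2 } };
    double[][] centers = splitCentroid(pts, new Random(42L), 0.5);
    System.out.println(centers[0][0] + ", " + centers[0][1]);
    System.out.println(centers[1][0] + ", " + centers[1][1]);
  }
}
```

    The two centers always average back to the parent centroid, so the inner k-means run starts from a symmetric configuration inside the parent's region.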
  - getInputTypeRestriction
```
public TypeInformation[] getInputTypeRestriction()
```
    Description copied from class: AbstractAlgorithm
    
    Get the input type restriction used for negotiating the data query.
    
    Specified by:
    
    getInputTypeRestriction in interface Algorithm
    
    Overrides:
    
    getInputTypeRestriction in class AbstractKMeans<V extends NumberVector,M extends MeanModel>
    
    Returns:
    
    Type restriction
  - getLogger
```
protected Logging getLogger()
```
    Description copied from class: AbstractAlgorithm
    
    Get the (STATIC) logger for this class.
    
    Specified by:
    
    getLogger in class AbstractAlgorithm<Clustering<M extends MeanModel>>
    
    Returns:
    
    the static logger
