
V - vector datatype

@Reference(authors="C. Elkan", title="Using the triangle inequality to accelerate k-means", booktitle="Proc. 20th International Conference on Machine Learning, ICML 2003", url="http://www.aaai.org/Library/ICML/2003/icml03-022.php")
public class KMeansElkan<V extends NumberVector> extends AbstractKMeans<V,KMeansModel>

See KMeansHamerly for a close variant that only uses O(n*2)
additional memory for bounds.

Reference:
C. Elkan
Using the triangle inequality to accelerate k-means
Proc. 20th International Conference on Machine Learning, ICML 2003
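
The pruning idea named in the reference title can be sketched in a few lines. This is a minimal illustration on plain arrays, not ELKI code: the class `TriangleInequalitySketch` and its methods are hypothetical names, and Euclidean distance is assumed. It relies on the fact that if d(c, c') >= 2 d(x, c), then d(x, c') >= d(x, c), so the distance from x to c' need not be computed.

```java
/** Hypothetical helper, not part of ELKI: illustrates the pruning test only. */
final class TriangleInequalitySketch {
  /** Euclidean distance between two points of equal dimensionality. */
  static double distance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /**
   * Find the nearest center of x, skipping candidates that the triangle
   * inequality rules out.
   *
   * @return array of {index of nearest center, number of point-to-center
   *         distances actually computed}
   */
  static int[] nearest(double[] x, double[][] centers) {
    int best = 0;
    double bestDist = distance(x, centers[0]);
    int computed = 1;
    for (int j = 1; j < centers.length; j++) {
      // If d(c_best, c_j) >= 2 d(x, c_best), then c_j cannot be closer to x.
      if (distance(centers[best], centers[j]) >= 2 * bestDist) {
        continue;
      }
      double d = distance(x, centers[j]);
      computed++;
      if (d < bestDist) {
        best = j;
        bestDist = d;
      }
    }
    return new int[] { best, computed };
  }
}
```

For brevity the sketch recomputes the center-to-center distance on every test; an accelerated implementation would compute these distances once per iteration and reuse them for all points, which is presumably the role of the `cdist` and `sep` arrays in the method signatures of this class.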
| Modifier and Type | Class and Description |
|---|---|
| static class | KMeansElkan.Parameterizer<V extends NumberVector> Parameterization class. |

| Modifier and Type | Field and Description |
|---|---|
| private static String | KEY Key for statistics logging. |
| private static Logging | LOG The logger for this class. |

Inherited fields:
initializer, k, maxiter
distanceFunction
INIT_ID, K_ID, MAXITER_ID, SEED_ID
DISTANCE_FUNCTION_ID

| Constructor and Description |
|---|
| KMeansElkan(NumberVectorDistanceFunction<? super V> distanceFunction, int k, int maxiter, KMeansInitialization<? super V> initializer) Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| private int | assignToNearestCluster(Relation<V> relation, List<Vector> means, List<Vector> sums, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, double[] sep, double[][] cdist, WritableDoubleDataStore upper, WritableDataStore<double[]> lower) Reassign objects, but only if their bounds indicate it is necessary to do so. |
| protected Logging | getLogger() Get the (STATIC) logger for this class. |
| private int | initialAssignToNearestCluster(Relation<V> relation, List<Vector> means, List<Vector> sums, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, WritableDoubleDataStore upper, WritableDataStore<double[]> lower) Reassign objects, but only if their bounds indicate it is necessary to do so. |
| private double | maxMoved(List<Vector> means, List<Vector> newmeans, double[] dists) Maximum distance moved. |
| private void | recomputeSeperation(List<Vector> means, double[] sep, double[][] cdist) Recompute the separation of cluster means. |
| Clustering<KMeansModel> | run(Database database, Relation<V> relation) Run the clustering algorithm. |
| private void | updateBounds(Relation<V> relation, WritableIntegerDataStore assignment, WritableDoubleDataStore upper, WritableDataStore<double[]> lower, double[] move) Update the bounds for k-means. |

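
The bookkeeping performed by recomputeSeperation, maxMoved and updateBounds can be illustrated on plain arrays. The sketch below follows the update rules from Elkan's paper and is not taken from the ELKI source: the class name `ElkanBoundsSketch`, the array-based signatures, and the use of Euclidean distance are assumptions made for illustration. In particular, whether ELKI stores full or halved center-to-center distances in `cdist` is not stated on this page; the sketch stores full distances.

```java
/** Hypothetical array-based sketch, not the ELKI implementation. */
final class ElkanBoundsSketch {
  /** Euclidean distance between two points of equal dimensionality. */
  static double distance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /**
   * Fill cdist with center-to-center distances, and set sep[i] to half the
   * distance from center i to its nearest other center.
   */
  static void recomputeSeparation(double[][] means, double[] sep, double[][] cdist) {
    for (int i = 0; i < means.length; i++) {
      sep[i] = Double.POSITIVE_INFINITY;
    }
    for (int i = 0; i < means.length; i++) {
      for (int j = 0; j < i; j++) {
        double d = distance(means[i], means[j]);
        cdist[i][j] = cdist[j][i] = d;
        sep[i] = Math.min(sep[i], 0.5 * d);
        sep[j] = Math.min(sep[j], 0.5 * d);
      }
    }
  }

  /** Store the movement of each mean in dists and return the largest one. */
  static double maxMoved(double[][] means, double[][] newmeans, double[] dists) {
    double max = 0;
    for (int i = 0; i < means.length; i++) {
      dists[i] = distance(means[i], newmeans[i]);
      max = Math.max(max, dists[i]);
    }
    return max;
  }

  /**
   * After the means moved: the upper bound of each object grows by the
   * movement of its assigned center, and each lower bound shrinks by the
   * movement of the corresponding center (never below zero).
   */
  static void updateBounds(int[] assignment, double[] upper, double[][] lower, double[] move) {
    for (int p = 0; p < assignment.length; p++) {
      upper[p] += move[assignment[p]];
      for (int j = 0; j < move.length; j++) {
        lower[p][j] = Math.max(0, lower[p][j] - move[j]);
      }
    }
  }
}
```

With these bounds, an object whose upper bound does not exceed the separation of its assigned center cannot change cluster, so no distance needs to be computed for it in that iteration. The per-object array of k lower bounds is the main memory cost of this scheme.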
Inherited methods:
assignToNearestCluster, getInputTypeRestriction, incrementalUpdateMean, logVarstat, macQueenIterate, means, medians, setDistanceFunction, setK, updateAssignment
getDistanceFunction
makeParameterDistanceFunction, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
getDistanceFunction

private static final Logging LOG
The logger for this class.

private static final String KEY
Key for statistics logging.

public KMeansElkan(NumberVectorDistanceFunction<? super V> distanceFunction, int k, int maxiter, KMeansInitialization<? super V> initializer)
Constructor.
distanceFunction - distance function
k - k parameter
maxiter - Maxiter parameter
initializer - Initialization method

public Clustering<KMeansModel> run(Database database, Relation<V> relation)
Run the clustering algorithm.
Description copied from: KMeans
database - Database to run on.
relation - Relation to process.

private void recomputeSeperation(List<Vector> means, double[] sep, double[][] cdist)
Recompute the separation of cluster means.
means - Means
sep - Output array of separation
cdist - Center-to-Center distances

private int initialAssignToNearestCluster(Relation<V> relation, List<Vector> means, List<Vector> sums, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, WritableDoubleDataStore upper, WritableDataStore<double[]> lower)
Reassign objects, but only if their bounds indicate it is necessary to do so.
relation - Data
means - Current means
sums - New means
clusters - Current clusters
assignment - Cluster assignment
upper - Upper bounds
lower - Lower bounds

private int assignToNearestCluster(Relation<V> relation, List<Vector> means, List<Vector> sums, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, double[] sep, double[][] cdist, WritableDoubleDataStore upper, WritableDataStore<double[]> lower)
Reassign objects, but only if their bounds indicate it is necessary to do so.
relation - Data
means - Current means
sums - New means
clusters - Current clusters
assignment - Cluster assignment
sep - Separation of means
cdist - Center-to-center distances
upper - Upper bounds
lower - Lower bounds

private double maxMoved(List<Vector> means, List<Vector> newmeans, double[] dists)
Maximum distance moved.
means - Old means
newmeans - New means
dists - Distances moved

private void updateBounds(Relation<V> relation, WritableIntegerDataStore assignment, WritableDoubleDataStore upper, WritableDataStore<double[]> lower, double[] move)
Update the bounds for k-means.
relation - Relation
assignment - Cluster assignment
upper - Upper bounds
lower - Lower bounds
move - Movement of centers

protected Logging getLogger()
Get the (STATIC) logger for this class.
Description copied from class: AbstractAlgorithm
Specified by: getLogger in class AbstractAlgorithm<Clustering<KMeansModel>>

Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München.