Type Parameters:
V - vector datatype

@Reference(authors="G. Hamerly", title="Making k-means even faster", booktitle="Proc. 2010 SIAM International Conference on Data Mining", url="http://dx.doi.org/10.1137/1.9781611972801.12")
public class KMeansHamerly<V extends NumberVector> extends AbstractKMeans<V,KMeansModel>
Reference:
G. Hamerly
Making k-means even faster
Proc. 2010 SIAM International Conference on Data Mining
Modifier and Type | Class and Description |
---|---|
static class | KMeansHamerly.Parameterizer<V extends NumberVector>: Parameterization class. |
Modifier and Type | Field and Description |
---|---|
private static String | KEY: Key for statistics logging. |
private static Logging | LOG: The logger for this class. |
initializer, k, maxiter
distanceFunction
INIT_ID, K_ID, MAXITER_ID, SEED_ID
DISTANCE_FUNCTION_ID
Constructor and Description |
---|
KMeansHamerly(NumberVectorDistanceFunction<? super V> distanceFunction, int k, int maxiter, KMeansInitialization<? super V> initializer): Constructor. |
Modifier and Type | Method and Description |
---|---|
private int | assignToNearestCluster(Relation<V> relation, List<Vector> means, List<Vector> sums, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, double[] sep, WritableDoubleDataStore upper, WritableDoubleDataStore lower): Reassign objects, but only if their bounds indicate it is necessary to do so. |
protected Logging | getLogger(): Get the (STATIC) logger for this class. |
private int | initialAssignToNearestCluster(Relation<V> relation, List<Vector> means, List<Vector> sums, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, WritableDoubleDataStore upper, WritableDoubleDataStore lower): Reassign objects, but only if their bounds indicate it is necessary to do so. |
private double | maxMoved(List<Vector> means, List<Vector> newmeans, double[] dists): Maximum distance moved. |
private void | recomputeSeperation(List<Vector> means, double[] sep): Recompute the separation of cluster means. |
Clustering<KMeansModel> | run(Database database, Relation<V> relation): Run the clustering algorithm. |
private void | updateBounds(Relation<V> relation, WritableIntegerDataStore assignment, WritableDoubleDataStore upper, WritableDoubleDataStore lower, double[] move, double delta): Update the bounds for k-means. |
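
The separation and bound-guarded reassignment steps summarized in the method table can be illustrated on plain arrays. The following is a minimal sketch of the pruning idea from the cited Hamerly paper, not ELKI's code. The class name `HamerlyAssignSketch`, the use of `double[][]` and `double[]` in place of `Relation<V>`, `List<Vector>` and the data stores, plain Euclidean distance, and the convention that `sep[i]` holds half the distance to the nearest other mean are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Illustrative sketch only; hypothetical class, not part of ELKI. */
final class HamerlyAssignSketch {
  /** Euclidean distance between two points of equal dimension. */
  static double dist(double[] a, double[] b) {
    double s = 0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      s += diff * diff;
    }
    return Math.sqrt(s);
  }

  /** Assumed convention: sep[i] = half the distance from mean i to its nearest other mean. */
  static void recomputeSeparation(double[][] means, double[] sep) {
    Arrays.fill(sep, Double.POSITIVE_INFINITY);
    for (int i = 0; i < means.length; i++) {
      for (int j = i + 1; j < means.length; j++) {
        double half = 0.5 * dist(means[i], means[j]);
        sep[i] = Math.min(sep[i], half);
        sep[j] = Math.min(sep[j], half);
      }
    }
  }

  /** Reassigns points only when the bounds cannot rule out a change; returns the number of changes. */
  static int assign(double[][] data, double[][] means, int[] assignment,
      double[] sep, double[] upper, double[] lower) {
    int changed = 0;
    for (int p = 0; p < data.length; p++) {
      int cur = assignment[p];
      double bound = Math.max(sep[cur], lower[p]);
      if (upper[p] <= bound) {
        continue; // The bounds prove the current mean is still nearest.
      }
      upper[p] = dist(data[p], means[cur]); // Tighten the upper bound first.
      if (upper[p] <= bound) {
        continue;
      }
      // Full scan: track the nearest and second-nearest means.
      int best = cur;
      double d1 = Double.POSITIVE_INFINITY, d2 = Double.POSITIVE_INFINITY;
      for (int c = 0; c < means.length; c++) {
        double d = dist(data[p], means[c]);
        if (d < d1) {
          d2 = d1;
          d1 = d;
          best = c;
        } else if (d < d2) {
          d2 = d;
        }
      }
      upper[p] = d1;
      lower[p] = d2;
      if (best != cur) {
        assignment[p] = best;
        changed++;
      }
    }
    return changed;
  }

  public static void main(String[] args) {
    double[][] data = {{0, 0}, {1, 0}, {10, 0}, {11, 0}};
    double[][] means = {{0.5, 0}, {10.5, 0}};
    int[] assignment = new int[data.length];
    double[] sep = new double[means.length];
    double[] upper = new double[data.length], lower = new double[data.length];
    Arrays.fill(upper, Double.POSITIVE_INFINITY); // Force an exact check on the first pass.
    recomputeSeparation(means, sep);
    System.out.println("changed: " + assign(data, means, assignment, sep, upper, lower));
  }
}
```

The skip is safe by the triangle inequality: if a point is no farther from its mean than half the distance to the nearest other mean, or no farther than its lower bound on every other mean, no other mean can be closer.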
assignToNearestCluster, getInputTypeRestriction, incrementalUpdateMean, logVarstat, macQueenIterate, means, medians, setDistanceFunction, setK, updateAssignment
getDistanceFunction
makeParameterDistanceFunction, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
getDistanceFunction
private static final Logging LOG
private static final String KEY
public KMeansHamerly(NumberVectorDistanceFunction<? super V> distanceFunction, int k, int maxiter, KMeansInitialization<? super V> initializer)
Parameters:
distanceFunction - distance function
k - k parameter
maxiter - Maxiter parameter
initializer - Initialization method

public Clustering<KMeansModel> run(Database database, Relation<V> relation)
KMeans
Parameters:
database - Database to run on.
relation - Relation to process.

private void recomputeSeperation(List<Vector> means, double[] sep)
Parameters:
means - Means
sep - Output array

private int initialAssignToNearestCluster(Relation<V> relation, List<Vector> means, List<Vector> sums, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, WritableDoubleDataStore upper, WritableDoubleDataStore lower)
Parameters:
relation - Data
means - Current means
sums - Running sums of the new means
clusters - Current clusters
assignment - Cluster assignment
upper - Upper bounds
lower - Lower bounds

private int assignToNearestCluster(Relation<V> relation, List<Vector> means, List<Vector> sums, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, double[] sep, WritableDoubleDataStore upper, WritableDoubleDataStore lower)
Parameters:
relation - Data
means - Current means
sums - New means as running sums
clusters - Current clusters
assignment - Cluster assignment
sep - Separation of means
upper - Upper bounds
lower - Lower bounds

private double maxMoved(List<Vector> means, List<Vector> newmeans, double[] dists)
Parameters:
means - Old means
newmeans - New means
dists - Distances moved

private void updateBounds(Relation<V> relation, WritableIntegerDataStore assignment, WritableDoubleDataStore upper, WritableDoubleDataStore lower, double[] move, double delta)
Parameters:
relation - Relation
assignment - Cluster assignment
upper - Upper bounds
lower - Lower bounds
move - Movement of centers
delta - Maximum center movement.

protected Logging getLogger()
AbstractAlgorithm
getLogger in class AbstractAlgorithm<Clustering<KMeansModel>>
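
The parameter lists of `maxMoved` and `updateBounds` suggest how the bounds are kept valid after the means move. Below is a minimal sketch of that step as described in the cited Hamerly paper, not ELKI's code. The class name `HamerlyBoundsSketch`, the plain arrays standing in for `List<Vector>` and the data stores, and the use of Euclidean distance are assumptions made for illustration.

```java
/** Illustrative sketch only; hypothetical class, not part of ELKI. */
final class HamerlyBoundsSketch {
  /** Stores how far each mean moved in dists and returns the largest movement. */
  static double maxMoved(double[][] means, double[][] newmeans, double[] dists) {
    double max = 0;
    for (int i = 0; i < means.length; i++) {
      double s = 0;
      for (int d = 0; d < means[i].length; d++) {
        double diff = means[i][d] - newmeans[i][d];
        s += diff * diff;
      }
      dists[i] = Math.sqrt(s);
      max = Math.max(max, dists[i]);
    }
    return max;
  }

  /**
   * Loosens the bounds so they remain valid for the new means: the distance to the
   * assigned mean can grow by at most that mean's movement, and the distance to any
   * other mean can shrink by at most the maximum movement delta.
   */
  static void updateBounds(int[] assignment, double[] upper, double[] lower,
      double[] move, double delta) {
    for (int p = 0; p < assignment.length; p++) {
      upper[p] += move[assignment[p]];
      lower[p] -= delta;
    }
  }

  public static void main(String[] args) {
    double[][] means = {{0, 0}, {10, 0}};
    double[][] newmeans = {{3, 4}, {10, 1}};
    double[] move = new double[means.length];
    double delta = maxMoved(means, newmeans, move);
    int[] assignment = {0, 1};
    double[] upper = {1, 2}, lower = {6, 7};
    updateBounds(assignment, upper, lower, move, delta);
    System.out.println("delta = " + delta + ", upper[0] = " + upper[0] + ", lower[0] = " + lower[0]);
  }
}
```

Subtracting the single maximum movement from every lower bound is conservative but needs only one value per iteration, which keeps the per-point bookkeeping to two numbers.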
Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München. License information.