V - Vector type
D - Distance type
M - Cluster model type

public abstract class AbstractKMeans<V extends NumberVector<?>,D extends Distance<D>,M extends MeanModel<V>>
extends AbstractPrimitiveDistanceBasedAlgorithm<NumberVector<?>,D,Clustering<M>>
implements KMeans<V,D,M>, ClusteringAlgorithm<Clustering<M>>
Modifier and Type | Class and Description |
---|---|
static class | AbstractKMeans.Parameterizer<V extends NumberVector<?>,D extends Distance<D>>: Parameterization class. |
Modifier and Type | Field and Description |
---|---|
protected KMeansInitialization<V> | initializer: Method to choose initial means. |
protected int | k: Holds the value of KMeans.K_ID. |
protected int | maxiter: Holds the value of KMeans.MAXITER_ID. |
distanceFunction
INIT_ID, K_ID, MAXITER_ID, SEED_ID
DISTANCE_FUNCTION_ID
Constructor and Description |
---|
AbstractKMeans(PrimitiveDistanceFunction<? super NumberVector<?>,D> distanceFunction, int k, int maxiter, KMeansInitialization<V> initializer): Constructor. |
Modifier and Type | Method and Description |
---|---|
protected boolean | assignToNearestCluster(Relation<V> relation, List<? extends NumberVector<?>> means, List<? extends ModifiableDBIDs> clusters, WritableIntegerDataStore assignment): Returns a list of clusters. |
TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
protected void | incrementalUpdateMean(Vector mean, V vec, int newsize, double op): Compute an incremental update for the mean. |
protected boolean | macQueenIterate(Relation<V> relation, List<Vector> means, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment): Perform a MacQueen style iteration. |
protected List<Vector> | means(List<? extends ModifiableDBIDs> clusters, List<? extends NumberVector<?>> means, Relation<V> database): Returns the mean vectors of the given clusters in the given database. |
protected List<NumberVector<?>> | medians(List<? extends ModifiableDBIDs> clusters, List<? extends NumberVector<?>> medians, Relation<V> database): Returns the median vectors of the given clusters in the given database. |
void | setDistanceFunction(PrimitiveDistanceFunction<? super NumberVector<?>,D> distanceFunction): Set the distance function to use. |
void | setK(int k): Set the value of k. |
protected boolean | updateAssignment(DBIDIter iditer, List<? extends ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, int newA) |
private boolean | updateMeanAndAssignment(List<ModifiableDBIDs> clusters, List<Vector> means, int minIndex, V fv, DBIDIter iditer, WritableIntegerDataStore assignment): Try to update the cluster assignment. |
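The `means` and `medians` helpers listed above recompute one centroid per cluster. The following is a minimal sketch of that idea using plain arrays, not ELKI's implementation. The class name `CentroidSketch`, the `int[]` index lists standing in for `ModifiableDBIDs`, the `double[][]` standing in for `Relation<V>`, the fallback of keeping the previous centroid for an empty cluster, and the averaging of the two middle values in even-sized clusters are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: plain arrays stand in for ELKI's relation, DBID and vector types. */
final class CentroidSketch {
  /** Arithmetic mean per cluster; an empty cluster keeps its previous mean (assumption). */
  static List<double[]> means(List<int[]> clusters, List<double[]> previous, double[][] data) {
    List<double[]> result = new ArrayList<>(clusters.size());
    for (int c = 0; c < clusters.size(); c++) {
      int[] members = clusters.get(c);
      if (members.length == 0) {
        result.add(previous.get(c).clone());
        continue;
      }
      double[] mean = new double[data[members[0]].length];
      for (int id : members) {
        for (int d = 0; d < mean.length; d++) {
          mean[d] += data[id][d];
        }
      }
      for (int d = 0; d < mean.length; d++) {
        mean[d] /= members.length;
      }
      result.add(mean);
    }
    return result;
  }

  /** Per-dimension median per cluster; an empty cluster keeps its previous median (assumption). */
  static List<double[]> medians(List<int[]> clusters, List<double[]> previous, double[][] data) {
    List<double[]> result = new ArrayList<>(clusters.size());
    for (int c = 0; c < clusters.size(); c++) {
      int[] members = clusters.get(c);
      if (members.length == 0) {
        result.add(previous.get(c).clone());
        continue;
      }
      int dim = data[members[0]].length;
      double[] median = new double[dim];
      double[] column = new double[members.length];
      for (int d = 0; d < dim; d++) {
        for (int i = 0; i < members.length; i++) {
          column[i] = data[members[i]][d];
        }
        Arrays.sort(column);
        int mid = column.length / 2;
        // Even-sized clusters average the two middle values (a choice of this sketch).
        median[d] = (column.length % 2 == 1) ? column[mid] : 0.5 * (column[mid - 1] + column[mid]);
      }
      result.add(median);
    }
    return result;
  }
}
```

Passing the previous centroids mirrors the second parameter of `means` and `medians` in the table, which gives the helper something to return when a cluster has no members.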
getDistanceFunction
getLogger, makeParameterDistanceFunction, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
getDistanceFunction
protected int k
Holds the value of KMeans.K_ID.

protected int maxiter
Holds the value of KMeans.MAXITER_ID.

protected KMeansInitialization<V extends NumberVector<?>> initializer
Method to choose initial means.
public AbstractKMeans(PrimitiveDistanceFunction<? super NumberVector<?>,D> distanceFunction, int k, int maxiter, KMeansInitialization<V> initializer)
distanceFunction - distance function
k - k parameter
maxiter - Maxiter parameter
initializer - Function to generate the initial means

protected boolean assignToNearestCluster(Relation<V> relation, List<? extends NumberVector<?>> means, List<? extends ModifiableDBIDs> clusters, WritableIntegerDataStore assignment)
relation - the database to cluster
means - a list of k means
clusters - cluster assignment
assignment - Current cluster assignment

protected boolean updateAssignment(DBIDIter iditer, List<? extends ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, int newA)
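
The assignment step behind `assignToNearestCluster` can be sketched as follows. This is an illustration with plain arrays, not ELKI's code: the class name `AssignmentSketch`, the squared Euclidean distance, and the reading of the boolean return value as "at least one assignment changed" (as documented for `updateMeanAndAssignment`) are assumptions.

```java
/** Illustrative only: assigns each point to its nearest mean using plain arrays. */
final class AssignmentSketch {
  /** Squared Euclidean distance; the real class takes a configurable distance function. */
  static double squaredEuclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return sum;
  }

  /**
   * Writes the index of the nearest mean for every point into {@code assignment}.
   *
   * @return true if at least one entry of {@code assignment} changed
   */
  static boolean assignToNearest(double[][] data, double[][] means, int[] assignment) {
    boolean changed = false;
    for (int i = 0; i < data.length; i++) {
      int best = 0;
      double bestDist = Double.POSITIVE_INFINITY;
      for (int c = 0; c < means.length; c++) {
        double dist = squaredEuclidean(data[i], means[c]);
        if (dist < bestDist) { // ties keep the lower cluster index
          bestDist = dist;
          best = c;
        }
      }
      if (assignment[i] != best) {
        assignment[i] = best;
        changed = true;
      }
    }
    return changed;
  }
}
```

A caller would typically repeat the assignment and the centroid update until this method returns false or the iteration limit `maxiter` is reached.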
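public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<Clustering<M extends MeanModel<V>>>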
protected List<Vector> means(List<? extends ModifiableDBIDs> clusters, List<? extends NumberVector<?>> means, Relation<V> database)
clusters - the clusters to compute the means
means - the recent means
database - the database containing the vectors

protected List<NumberVector<?>> medians(List<? extends ModifiableDBIDs> clusters, List<? extends NumberVector<?>> medians, Relation<V> database)
clusters - the clusters to compute the medians for
medians - the recent medians
database - the database containing the vectors

protected void incrementalUpdateMean(Vector mean, V vec, int newsize, double op)
mean - Mean to update
vec - Object vector
newsize - (New) size of cluster
op - Cluster size change / Weight change

protected boolean macQueenIterate(Relation<V> relation, List<Vector> means, List<ModifiableDBIDs> clusters, WritableIntegerDataStore assignment)
relation - Relation
means - Means
clusters - Clusters
assignment - Current cluster assignment

private boolean updateMeanAndAssignment(List<ModifiableDBIDs> clusters, List<Vector> means, int minIndex, V fv, DBIDIter iditer, WritableIntegerDataStore assignment)
clusters - Current clusters
means - Means to update
minIndex - Cluster to assign to
fv - Vector
iditer - Object ID
assignment - Current cluster assignment
Returns: true when assignment changed

public void setK(int k)
Description copied from interface: KMeans
Set the value of k.
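
The parameters of `incrementalUpdateMean` documented above (`mean`, `vec`, `newsize`, `op`) suggest an in-place running-mean update. A minimal sketch under that reading, assuming `op` is +1 when a vector joins the cluster and -1 when it leaves, and that `newsize` is the cluster size after the change. The class name `IncrementalMeanSketch` and the use of `double[]` in place of `Vector` and `V` are illustrative choices.

```java
/** Illustrative only: in-place running-mean update on plain arrays. */
final class IncrementalMeanSketch {
  /**
   * Applies mean += (op / newsize) * (vec - mean).
   *
   * With op = +1 this gives (n * mean + vec) / (n + 1) for newsize = n + 1;
   * with op = -1 it gives (n * mean - vec) / (n - 1) for newsize = n - 1.
   */
  static void incrementalUpdateMean(double[] mean, double[] vec, int newsize, double op) {
    if (newsize == 0) {
      return; // cluster became empty: leave the mean untouched
    }
    double factor = op / newsize;
    for (int d = 0; d < mean.length; d++) {
      mean[d] += factor * (vec[d] - mean[d]);
    }
  }
}
```

For example, a cluster holding (0, 0) and (2, 2) has mean (1, 1); adding (4, 4) with `newsize = 3` and `op = +1` moves the mean to (2, 2), and then removing (0, 0) with `newsize = 2` and `op = -1` moves it to (3, 3), without revisiting the other members.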
public void setDistanceFunction(PrimitiveDistanceFunction<? super NumberVector<?>,D> distanceFunction)
Description copied from interface: KMeans
Set the distance function to use.
Specified by: setDistanceFunction in interface KMeans<V extends NumberVector<?>,D extends Distance<D>,M extends MeanModel<V>>
distanceFunction - Distance function.
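
To show how `macQueenIterate`, `updateMeanAndAssignment`, and `incrementalUpdateMean` might fit together, here is a hedged sketch of one MacQueen-style pass: each point is reassigned to its nearest mean, and the affected means are updated immediately rather than after the full pass. It uses plain arrays and squared Euclidean distance. The class name `MacQueenSketch`, the explicit `sizes` array, and the convention that -1 marks an unassigned point are assumptions of this sketch, not ELKI's API.

```java
/** Illustrative only: one MacQueen-style pass over plain arrays. */
final class MacQueenSketch {
  /** Index of the mean closest to x under squared Euclidean distance. */
  static int nearest(double[] x, double[][] means) {
    int best = 0;
    double bestDist = Double.POSITIVE_INFINITY;
    for (int c = 0; c < means.length; c++) {
      double dist = 0.0;
      for (int d = 0; d < x.length; d++) {
        double diff = x[d] - means[c][d];
        dist += diff * diff;
      }
      if (dist < bestDist) {
        bestDist = dist;
        best = c;
      }
    }
    return best;
  }

  /** In-place update mean += (op / newsize) * (vec - mean); no-op for an empty cluster. */
  static void updateMean(double[] mean, double[] vec, int newsize, double op) {
    if (newsize == 0) {
      return;
    }
    double factor = op / newsize;
    for (int d = 0; d < mean.length; d++) {
      mean[d] += factor * (vec[d] - mean[d]);
    }
  }

  /**
   * One pass over the data. Means and sizes are modified as soon as a point moves.
   *
   * @return true if at least one point changed its cluster
   */
  static boolean macQueenIterate(double[][] data, double[][] means, int[] sizes, int[] assignment) {
    boolean changed = false;
    for (int i = 0; i < data.length; i++) {
      int best = nearest(data[i], means);
      int old = assignment[i];
      if (old == best) {
        continue;
      }
      // Add the point to its new cluster and update that mean right away.
      sizes[best]++;
      updateMean(means[best], data[i], sizes[best], +1.0);
      // Remove it from its previous cluster, if it had one.
      if (old >= 0) {
        sizes[old]--;
        updateMean(means[old], data[i], sizes[old], -1.0);
      }
      assignment[i] = best;
      changed = true;
    }
    return changed;
  }
}
```

Because means move during the pass, later points in the same pass are compared against already-updated means, which is the main difference from a batch update that recomputes all means only after every point has been assigned.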