
Type Parameters:
V - Vector type

@Reference(authors="L. Kaufman, P. J. Rousseeuw", title="Clustering Large Data Sets (with discussion)", booktitle="Pattern Recognition in Practice II")
public class CLARA<V> extends KMedoidsPAM<V>
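Clustering Large Applications (CLARA): a k-medoids clustering method for large data sets, built on PAM (see KMedoidsPAM) and based on sampling.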
Reference:
L. Kaufman, P. J. Rousseeuw
Clustering Large Data Sets (with discussion)
in: Pattern Recognition in Practice II
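
The sampling idea can be illustrated with a minimal, self-contained sketch. It is not ELKI's implementation: the class `ClaraSketch`, its methods, the 1-D data and the simplified swap search (accept any swap that lowers the cost) are all assumptions made for illustration. The sketch draws `numsamples` random samples, searches for medoids on each sample only, scores those medoids against the full data set, and keeps the best set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; not ELKI's CLARA implementation.
final class ClaraSketch {
  /** Sum of distances from each listed point to its nearest medoid (1-D data). */
  static double cost(double[] data, List<Integer> ids, int[] medoids) {
    double sum = 0;
    for (int id : ids) {
      double best = Double.POSITIVE_INFINITY;
      for (int m : medoids) {
        best = Math.min(best, Math.abs(data[id] - data[m]));
      }
      sum += best;
    }
    return sum;
  }

  static boolean contains(int[] arr, int v) {
    for (int a : arr) {
      if (a == v) return true;
    }
    return false;
  }

  /** Simplified swap search on the sample: accept any swap that lowers the sample cost. */
  static int[] medoidsOnSample(double[] data, List<Integer> sample, int k, int maxiter) {
    int[] medoids = new int[k];
    for (int i = 0; i < k; i++) {
      medoids[i] = sample.get(i); // the sample is shuffled, so this is a random start
    }
    double best = cost(data, sample, medoids);
    for (int iter = 0; iter < maxiter; iter++) {
      boolean improved = false;
      for (int i = 0; i < k; i++) {
        for (int cand : sample) {
          if (contains(medoids, cand)) continue;
          int old = medoids[i];
          medoids[i] = cand;
          double c = cost(data, sample, medoids);
          if (c < best) {
            best = c;
            improved = true;
          } else {
            medoids[i] = old; // undo a swap that did not help
          }
        }
      }
      if (!improved) break;
    }
    return medoids;
  }

  /** Repeats the sample-based search and keeps the medoids with the lowest full-data cost. */
  static int[] clara(double[] data, int k, int maxiter, int numsamples, int sampleSize, Random rnd) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < data.length; i++) ids.add(i);
    int[] bestMedoids = null;
    double bestCost = Double.POSITIVE_INFINITY;
    for (int s = 0; s < numsamples; s++) {
      Collections.shuffle(ids, rnd);
      List<Integer> sample = new ArrayList<>(ids.subList(0, sampleSize));
      int[] medoids = medoidsOnSample(data, sample, k, maxiter);
      double c = cost(data, ids, medoids); // evaluate on all objects, not just the sample
      if (c < bestCost) {
        bestCost = c;
        bestMedoids = medoids;
      }
    }
    return bestMedoids;
  }
}
```

The sketch assumes `sampleSize` is at least `k`. Because each sample is much smaller than the data set, the expensive swap search stays cheap, while the full-data cost check guards against an unrepresentative sample.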
| Modifier and Type | Class and Description |
|---|---|
| static class | CLARA.Parameterizer<V> Parameterization class. |
| Modifier and Type | Field and Description |
|---|---|
| private static Logging | LOG Class logger. |
| (package private) int | numsamples Number of samples to draw (i.e. iterations). |
| (package private) RandomFactory | random Random factory for initialization. |
| (package private) double | sampling Sampling rate. |
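
The constructor documentation describes `sampling` as a rate that may be absolute or relative. How the two cases are told apart is not stated on this page, so the following is only one plausible reading: values up to 1 are treated as a fraction of the data set, larger values as an absolute object count. The helper `sampleSize` and the class `SamplingRateSketch` are hypothetical names, not part of ELKI.

```java
// Hypothetical helper; the threshold of 1.0 is an assumption, not ELKI's documented rule.
final class SamplingRateSketch {
  /**
   * Converts a sampling rate into a sample size.
   * Assumption: 0 < rate <= 1 means "fraction of total", rate > 1 means "absolute count".
   */
  static int sampleSize(double rate, int total) {
    if (rate <= 0) {
      return 0; // nothing to sample
    }
    double raw = rate <= 1.0 ? rate * total : rate;
    // Never ask for more objects than exist.
    return (int) Math.min(total, Math.round(raw));
  }
}
```

Under this reading, a rate of exactly 1 means the whole data set rather than a single object, which is the ambiguity any such convention has to resolve one way or the other.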
Inherited fields:
initializer, k, maxiter
DISTANCE_FUNCTION_ID

| Constructor and Description |
|---|
| CLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, int numsamples, double sampling, RandomFactory random) Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| protected double | assignRemainingToNearestCluster(ArrayDBIDs means, DBIDs ids, DBIDs rids, WritableIntegerDataStore assignment, DistanceQuery<V> distQ) Assigns the objects that were not part of the sample to their nearest cluster. |
| Clustering<MedoidModel> | run(Database database, Relation<V> relation) Run k-medoids |
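
The parameter list of `assignRemainingToNearestCluster` (medoids, all ids, the already assigned sample, an assignment store, a distance query) suggests a step like the one sketched below. This is a plain-Java illustration, not ELKI code: arrays and a `Set` stand in for `ArrayDBIDs`, `DBIDs` and `WritableIntegerDataStore`, absolute difference stands in for the distance query, and the meaning of the `double` return value (here, the summed distance of the newly assigned objects) is an assumption.

```java
import java.util.Set;

// Illustrative stand-in; names and the return-value semantics are assumptions.
final class AssignRemainingSketch {
  /**
   * Assigns every object outside the sample to its nearest medoid.
   *
   * @param data       1-D object values, indexed by object id
   * @param medoids    object ids of the medoids
   * @param sampled    ids that were already assigned while clustering the sample
   * @param assignment output: cluster index per object id
   * @return summed distance of the newly assigned objects to their medoids
   */
  static double assignRemaining(double[] data, int[] medoids, Set<Integer> sampled, int[] assignment) {
    double sum = 0;
    for (int id = 0; id < data.length; id++) {
      if (sampled.contains(id)) {
        continue; // keep the assignment made during the sample run
      }
      int best = -1;
      double bestDist = Double.POSITIVE_INFINITY;
      for (int c = 0; c < medoids.length; c++) {
        double d = Math.abs(data[id] - data[medoids[c]]);
        if (d < bestDist) {
          bestDist = d;
          best = c;
        }
      }
      assignment[id] = best;
      sum += bestDist;
    }
    return sum;
  }
}
```

Skipping the sampled ids avoids recomputing distances for objects whose cluster is already known from the sample run.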
Inherited methods:
assignToNearestCluster, getInputTypeRestriction, getLogger, runPAMOptimization
getDistanceFunction
makeParameterDistanceFunction, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run

private static final Logging LOG
double sampling
int numsamples
RandomFactory random
public CLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, int numsamples, double sampling, RandomFactory random)
Parameters:
distanceFunction - Distance function to use
k - Number of clusters to produce
maxiter - Maximum number of iterations
initializer - Initialization function
numsamples - Number of samples (sampling iterations)
sampling - Sampling rate (absolute or relative)
random - Random generator

public Clustering<MedoidModel> run(Database database, Relation<V> relation)
Overrides:
run in class KMedoidsPAM<V>
Parameters:
database - Database
relation - relation to use

protected double assignRemainingToNearestCluster(ArrayDBIDs means, DBIDs ids, DBIDs rids, WritableIntegerDataStore assignment, DistanceQuery<V> distQ)
Parameters:
means - Object centroids
ids - Object ids
rids - Sample that was already assigned
assignment - cluster assignment
distQ - distance query

Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München.