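Type Parameters:
V - Vector type

@Reference(authors="L. Kaufman, P. J. Rousseeuw", title="Clustering Large Data Sets (with discussion)", booktitle="Pattern Recognition in Practice II")
public class CLARA<V> extends KMedoidsPAM<V>

CLARA (Clustering Large Applications) is a clustering method for large data sets that builds on PAM, partitioning around medoids (KMedoidsPAM), and is based on sampling.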
Reference:
L. Kaufman, P. J. Rousseeuw
Clustering Large Data Sets (with discussion)
in: Pattern Recognition in Practice II
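
The sampling idea can be illustrated with a minimal, self-contained sketch. It does not use ELKI and is not the code of this class: the class `ClaraSketch`, its methods, the one-dimensional data, and the naive swap search standing in for PAM are all assumptions made for illustration. Each round draws a sample, optimizes medoids on that sample only, scores the medoids against the full data set, and keeps the cheapest result.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only; names and structure are hypothetical, not ELKI's API.
final class ClaraSketch {
  /** Sum of distances from every point to its nearest medoid. */
  static double cost(double[] data, double[] medoids) {
    double sum = 0;
    for (double x : data) {
      double best = Double.POSITIVE_INFINITY;
      for (double m : medoids) {
        best = Math.min(best, Math.abs(x - m));
      }
      sum += best;
    }
    return sum;
  }

  /** Naive swap search on a small sample, standing in for PAM. */
  static double[] pam(double[] sample, int k, int maxiter) {
    double[] medoids = Arrays.copyOf(sample, k); // simplistic initialization
    double best = cost(sample, medoids);
    for (int iter = 0; iter < maxiter; iter++) {
      boolean improved = false;
      for (int i = 0; i < k; i++) {
        for (double candidate : sample) {
          double old = medoids[i];
          medoids[i] = candidate;
          double c = cost(sample, medoids);
          if (c < best) {
            best = c; // keep the swap
            improved = true;
          } else {
            medoids[i] = old; // undo the swap
          }
        }
      }
      if (!improved) {
        break;
      }
    }
    return medoids;
  }

  /** Draws n distinct elements with a partial Fisher-Yates shuffle. */
  static double[] drawSample(double[] data, int n, Random rnd) {
    double[] copy = data.clone();
    for (int i = 0; i < n; i++) {
      int j = i + rnd.nextInt(copy.length - i);
      double t = copy[i];
      copy[i] = copy[j];
      copy[j] = t;
    }
    return Arrays.copyOf(copy, n);
  }

  /** Sampling loop: optimize on each sample, evaluate on all data, keep the best. */
  static double[] clara(double[] data, int k, int maxiter, int numsamples, int sampleSize, Random rnd) {
    if (sampleSize < k || sampleSize > data.length) {
      throw new IllegalArgumentException("need k <= sampleSize <= data.length");
    }
    double[] bestMedoids = null;
    double bestCost = Double.POSITIVE_INFINITY;
    for (int s = 0; s < numsamples; s++) {
      double[] medoids = pam(drawSample(data, sampleSize, rnd), k, maxiter);
      double c = cost(data, medoids); // score against the full data set
      if (c < bestCost) {
        bestCost = c;
        bestMedoids = medoids;
      }
    }
    return bestMedoids;
  }
}
```

Because the expensive optimization only ever sees `sampleSize` objects, the cost per round is governed by the sample size rather than by the size of the data set; only the final scoring pass touches every object.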
Modifier and Type | Class and Description |
---|---|
static class | CLARA.Parameterizer<V> - Parameterization class. |
Modifier and Type | Field and Description |
---|---|
private static Logging | LOG - Class logger. |
(package private) int | numsamples - Number of samples to draw (i.e. iterations). |
(package private) RandomFactory | random - Random factory for initialization. |
(package private) double | sampling - Sampling rate. |
Fields inherited from class KMedoidsPAM: initializer, k, maxiter
Other inherited fields: DISTANCE_FUNCTION_ID
Constructor and Description |
---|
CLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, int numsamples, double sampling, RandomFactory random) - Constructor. |
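
The sampling parameter is documented as a rate that may be absolute or relative. One plausible reading, sketched below as an assumption and not taken from this class, treats values up to 1 as a fraction of the data set and larger values as an absolute number of objects. The helper `SamplingRate.sampleSize` is hypothetical.

```java
// Illustrative sketch only; the threshold of 1.0 and the clamping are assumptions.
final class SamplingRate {
  /** Converts a sampling rate into a sample size for a data set of the given size. */
  static int sampleSize(double sampling, int datasetSize) {
    if (sampling <= 0 || datasetSize <= 0) {
      throw new IllegalArgumentException("sampling and datasetSize must be positive");
    }
    // Assumed rule: values <= 1 are relative, values > 1 are absolute counts.
    double raw = sampling <= 1.0 ? sampling * datasetSize : sampling;
    int size = (int) Math.round(raw);
    // Never draw fewer than one object or more than the data set holds.
    return Math.max(1, Math.min(size, datasetSize));
  }
}
```

Under this reading, `sampleSize(0.25, 1000)` and `sampleSize(250, 1000)` describe the same sample, which is why a single `double` parameter can cover both styles.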
Modifier and Type | Method and Description |
---|---|
protected double | assignRemainingToNearestCluster(ArrayDBIDs means, DBIDs ids, DBIDs rids, WritableIntegerDataStore assignment, DistanceQuery<V> distQ) - Assigns the objects outside the already-assigned sample to their nearest cluster. |
Clustering<MedoidModel> | run(Database database, Relation<V> relation) - Run k-medoids. |
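
The step of assigning the objects that were not part of the sample can be pictured with the following stand-alone sketch. It uses plain arrays instead of ELKI's DBIDs, data stores, and distance queries, and the returned value (the sum of distances of the newly assigned objects) is an assumption; the page above only states that the method returns a double.

```java
// Illustrative sketch only; class, method, and parameter names are hypothetical.
final class AssignRemainingSketch {
  /**
   * Assigns every point that is not in the sample to its nearest medoid.
   *
   * @param points     one-dimensional data
   * @param medoids    indices into points, one per cluster
   * @param inSample   true for points that were already assigned
   * @param assignment cluster index per point, updated in place
   * @return sum of distances of the newly assigned points (assumed semantics)
   */
  static double assignRemaining(double[] points, int[] medoids, boolean[] inSample, int[] assignment) {
    double sum = 0;
    for (int i = 0; i < points.length; i++) {
      if (inSample[i]) {
        continue; // already assigned while clustering the sample
      }
      int best = -1;
      double bestDist = Double.POSITIVE_INFINITY;
      for (int c = 0; c < medoids.length; c++) {
        double d = Math.abs(points[i] - points[medoids[c]]);
        if (d < bestDist) {
          bestDist = d;
          best = c;
        }
      }
      assignment[i] = best;
      sum += bestDist;
    }
    return sum;
  }
}
```

Skipping the sampled objects avoids recomputing distances that the optimization on the sample has already produced.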
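Methods inherited from class KMedoidsPAM: assignToNearestCluster, getInputTypeRestriction, getLogger, runPAMOptimization
Methods inherited from other superclasses and interfaces: getDistanceFunction, makeParameterDistanceFunction, run
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait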
private static final Logging LOG
double sampling
int numsamples
RandomFactory random
public CLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, int numsamples, double sampling, RandomFactory random)
Parameters:
distanceFunction - Distance function to use
k - Number of clusters to produce
maxiter - Maximum number of iterations
initializer - Initialization function
numsamples - Number of samples (sampling iterations)
sampling - Sampling rate (absolute or relative)
random - Random generator

public Clustering<MedoidModel> run(Database database, Relation<V> relation)
Description copied from class: KMedoidsPAM
Run k-medoids
Overrides:
run in class KMedoidsPAM<V>
Parameters:
database - Database
relation - relation to use

protected double assignRemainingToNearestCluster(ArrayDBIDs means, DBIDs ids, DBIDs rids, WritableIntegerDataStore assignment, DistanceQuery<V> distQ)
Parameters:
means - Object centroids
ids - Object ids
rids - Sample that was already assigned
assignment - cluster assignment
distQ - distance query

Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München. License information.