Type parameters: V - Data type

@Reference(authors="Erich Schubert, Peter J. Rousseeuw", title="Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms", booktitle="preprint, to appear", url="https://arxiv.org/abs/1810.05691", bibkey="DBLP:journals/corr/abs-1810-05691")
public class FastCLARA<V> extends KMedoidsFastPAM<V>

A CLARA variant that uses the KMedoidsFastPAM improvements to increase scalability in the number of clusters. This variant also defaults to twice the sample size, to improve quality.
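
The general CLARA scheme that this class builds on can be sketched as follows: draw several random samples, search for medoids on each sample only, score the resulting medoids against the full data set, and keep the best result. This is a minimal, self-contained illustration on 1-D points, not ELKI's implementation. The class `ClaraSketch`, its method names, the explicit `sampleSize` argument, and the naive swap search standing in for FastPAM are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of the CLARA scheme on 1-D data; not ELKI code. */
final class ClaraSketch {
  /** Sum of distances from every point to its nearest medoid (medoids are indices into data). */
  static double cost(double[] data, int[] medoids) {
    double sum = 0;
    for (double x : data) {
      double best = Double.POSITIVE_INFINITY;
      for (int m : medoids) {
        best = Math.min(best, Math.abs(x - data[m]));
      }
      sum += best;
    }
    return sum;
  }

  /** Naive swap search on the sample only (assumed stand-in for FastPAM). Requires sample.size() >= k. */
  static int[] kMedoidsOnSample(double[] data, List<Integer> sample, int k, int maxiter) {
    double[] sub = new double[sample.size()];
    for (int i = 0; i < sub.length; i++) {
      sub[i] = data[sample.get(i)];
    }
    int[] med = new int[k];
    for (int i = 0; i < k; i++) {
      med[i] = i; // initial medoids: the first k sampled points
    }
    double best = cost(sub, med);
    for (int iter = 0; iter < maxiter; iter++) {
      boolean improved = false;
      for (int mi = 0; mi < k; mi++) {
        for (int c = 0; c < sub.length; c++) {
          if (isMedoid(med, c)) {
            continue;
          }
          int old = med[mi];
          med[mi] = c; // tentatively swap medoid mi with candidate c
          double cst = cost(sub, med);
          if (cst < best) {
            best = cst;
            improved = true;
          } else {
            med[mi] = old; // no improvement: undo the swap
          }
        }
      }
      if (!improved) {
        break;
      }
    }
    int[] result = new int[k];
    for (int i = 0; i < k; i++) {
      result[i] = sample.get(med[i]); // map sample positions back to data indices
    }
    return result;
  }

  static boolean isMedoid(int[] med, int c) {
    for (int m : med) {
      if (m == c) {
        return true;
      }
    }
    return false;
  }

  /** Outer loop: sample, cluster the sample, evaluate on all data, keep the best medoids. */
  static int[] clara(double[] data, int k, int maxiter, int numsamples, int sampleSize, boolean keepmed, long seed) {
    Random rnd = new Random(seed);
    List<Integer> all = new ArrayList<>();
    for (int i = 0; i < data.length; i++) {
      all.add(i);
    }
    int[] bestMedoids = null;
    double bestCost = Double.POSITIVE_INFINITY;
    for (int s = 0; s < numsamples; s++) {
      Collections.shuffle(all, rnd);
      List<Integer> sample = new ArrayList<>(all.subList(0, Math.min(sampleSize, all.size())));
      if (keepmed && bestMedoids != null) {
        for (int m : bestMedoids) {
          if (!sample.contains(m)) {
            sample.add(m); // keep the previous best medoids in the new sample
          }
        }
      }
      int[] med = kMedoidsOnSample(data, sample, k, maxiter);
      double c = cost(data, med); // quality is always measured on the full data set
      if (c < bestCost) {
        bestCost = c;
        bestMedoids = med;
      }
    }
    return bestMedoids;
  }
}
```

The sketch evaluates each candidate solution on the full data set, so only the medoid search itself is restricted to the sample.
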
TODO: use a triangular distance matrix rather than a hash-map-based cache, for slightly better performance and lower memory use.
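
The TODO above refers to a common layout for symmetric distance caches: store only the entries below the diagonal in one flat array. A minimal sketch of that indexing, assuming objects are addressed by dense integer offsets 0..n-1 and that the distance of an object to itself is 0. The class `TriangularDistanceCache`, its nested `PairDistance` interface, and its methods are hypothetical and not part of ELKI.

```java
import java.util.Arrays;

/** Hypothetical lazily filled lower-triangular distance cache; not ELKI code. */
final class TriangularDistanceCache {
  /** Assumed callback computing the distance between objects i and j. */
  interface PairDistance {
    double distance(int i, int j);
  }

  private final double[] cache; // n*(n-1)/2 entries; NaN marks "not computed yet"
  private final PairDistance dist;

  TriangularDistanceCache(int n, PairDistance dist) {
    // Compute the size in long arithmetic; fails loudly if it does not fit an int.
    this.cache = new double[Math.toIntExact((long) n * (n - 1) / 2)];
    Arrays.fill(cache, Double.NaN);
    this.dist = dist;
  }

  /** Flat position of the pair (i, j), requires i > j >= 0. */
  static int index(int i, int j) {
    return (int) ((long) i * (i - 1) / 2) + j;
  }

  /** Cached symmetric distance lookup. */
  double get(int i, int j) {
    if (i == j) {
      return 0.; // assumption: d(x, x) = 0
    }
    int idx = i > j ? index(i, j) : index(j, i);
    double d = cache[idx];
    if (Double.isNaN(d)) {
      d = dist.distance(i, j);
      cache[idx] = d;
    }
    return d;
  }
}
```

Compared with a hash map keyed on object pairs, this layout avoids boxing and hashing, at the price of allocating all n(n-1)/2 slots up front.
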
Reference:
Erich Schubert, Peter J. Rousseeuw
Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms
preprint, to appear

| Modifier and Type | Class and Description |
|---|---|
| static class | FastCLARA.Parameterizer<V>: Parameterization class. |

Nested classes inherited from KMedoidsFastPAM: KMedoidsFastPAM.Instance

| Modifier and Type | Field and Description |
|---|---|
| (package private) boolean | keepmed: Keep the previous medoids in the sample (see page 145). |
| private static Logging | LOG: Class logger. |
| (package private) int | numsamples: Number of samples to draw (i.e. iterations). |
| (package private) RandomFactory | random: Random factory for initialization. |
| (package private) double | sampling: Sampling rate. |

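
The `sampling` field is documented as a rate that may be absolute or relative. One plausible reading, shown here purely as an assumption that this page does not confirm, is that values up to 1 are treated as a fraction of the data set and larger values as an absolute number of objects, capped at the data set size. The class `SampleSizeSketch` and its method are hypothetical.

```java
/** Hypothetical helper showing one way to interpret a "relative or absolute" sampling value. */
final class SampleSizeSketch {
  /**
   * @param sampling value up to 1 = fraction of n (assumed), above 1 = absolute size (assumed)
   * @param n number of objects in the data set
   * @return sample size, never larger than n and never negative
   */
  static int sampleSize(double sampling, int n) {
    double size = sampling <= 1. ? sampling * n : sampling;
    return (int) Math.max(0., Math.min(n, size));
  }
}
```

Under this reading, `sampleSize(0.5, 200)` and `sampleSize(100, 200)` would both select 100 objects.
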
Inherited fields: fasttol; initializer, k, maxiter; ALGORITHM_ID; DISTANCE_FUNCTION_ID

| Constructor and Description |
|---|
| FastCLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, double fasttol, int numsamples, double sampling, boolean keepmed, RandomFactory random): Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| Clustering<MedoidModel> | run(Database database, Relation<V> relation): Run k-medoids. |

Inherited methods (one group per declaring type):
getLogger, run
getInputTypeRestriction, initialMedoids
getDistanceFunction
run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run

Field detail:
private static final Logging LOG
double sampling
int numsamples
boolean keepmed
RandomFactory random

Constructor detail:
public FastCLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, double fasttol, int numsamples, double sampling, boolean keepmed, RandomFactory random)
Parameters:
distanceFunction - Distance function to use
k - Number of clusters to produce
maxiter - Maximum number of iterations
initializer - Initialization function
numsamples - Number of samples (sampling iterations)
sampling - Sampling rate (absolute or relative)
keepmed - Keep the previous medoids in the next sample
random - Random generator

Method detail:
public Clustering<MedoidModel> run(Database database, Relation<V> relation)
Description copied from class: KMedoidsPAM
Overrides: run in class KMedoidsPAM<V>
Parameters:
database - Database
relation - relation to use

Copyright © 2019 ELKI Development Team.