V - Data type
@Reference(authors="Erich Schubert, Peter J. Rousseeuw", title="Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms", booktitle="preprint, to appear", url="https://arxiv.org/abs/1810.05691", bibkey="DBLP:journals/corr/abs-1810-05691")
public class FastCLARA<V> extends KMedoidsFastPAM<V>
Clustering Large Applications (CLARA) with the KMedoidsFastPAM
improvements, to increase scalability in the number of clusters. This variant
also defaults to twice the sample size, to improve quality.
TODO: use a triangular distance matrix, rather than a hash-map based cache, for a bit better performance and less memory.
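The TODO above contrasts a hash-map based cache with a triangular distance matrix. A rough sketch of the latter is given here for illustration only: the class `TriangularDistanceMatrix` and its methods are hypothetical names and not part of ELKI, and plain `double[]` points with Euclidean distance are assumed.

```java
/** Hypothetical helper (not an ELKI class): pairwise distances in a flat lower-triangular array. */
final class TriangularDistanceMatrix {
  private final double[] dist;
  private final int size;

  TriangularDistanceMatrix(double[][] points) {
    this.size = points.length;
    // Only pairs with i > j are stored: size * (size - 1) / 2 entries.
    // Assumption: the sample is small enough that this does not overflow an int.
    this.dist = new double[size * (size - 1) / 2];
    for (int i = 1; i < size; i++) {
      for (int j = 0; j < i; j++) {
        dist[offset(i, j)] = euclidean(points[i], points[j]);
      }
    }
  }

  /** Offset of the pair (i, j) with i > j; row i starts after i * (i - 1) / 2 earlier entries. */
  static int offset(int i, int j) {
    return i * (i - 1) / 2 + j;
  }

  /** Symmetric lookup; the diagonal is implicitly zero and not stored. */
  double get(int i, int j) {
    if (i == j) {
      return 0.0;
    }
    return i > j ? dist[offset(i, j)] : dist[offset(j, i)];
  }

  int storedEntries() {
    return dist.length;
  }

  static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }
}
```

Compared with a hash map keyed by object pairs, this layout needs no boxing or hashing and uses one `double` per unordered pair, at the cost of computing all pairwise distances of the sample up front.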
Reference:
Erich Schubert, Peter J. Rousseeuw
Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms
preprint, to appear
Modifier and Type | Class and Description |
---|---|
static class | FastCLARA.Parameterizer<V> - Parameterization class. |
Inherited nested classes:
KMedoidsFastPAM.Instance
Modifier and Type | Field and Description |
---|---|
(package private) boolean | keepmed - Keep the previous medoids in the sample (see page 145). |
private static Logging | LOG - Class logger. |
(package private) int | numsamples - Number of samples to draw (i.e. iterations). |
(package private) RandomFactory | random - Random factory for initialization. |
(package private) double | sampling - Sampling rate. |
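The constructor documentation describes `sampling` as "absolute or relative". One plausible way to turn such a value into a sample size is sketched below. The class `SampleSize`, the method `resolve`, and the rule that values up to 1 are fractions of the data set while larger values are absolute counts are all assumptions made for illustration; this page does not state the exact threshold.

```java
/** Hypothetical helper (not an ELKI class) for interpreting a sampling rate. */
final class SampleSize {
  /**
   * Assumed rule: sampling <= 1 is a fraction of n, sampling > 1 is an absolute count.
   * The result is truncated to an int and never exceeds n.
   */
  static int resolve(double sampling, int n) {
    double requested = sampling <= 1.0 ? sampling * n : sampling;
    return (int) Math.min(n, requested);
  }
}
```

Under this assumed rule, `resolve(0.5, 1000)` yields 500, `resolve(200, 1000)` yields 200, and a request larger than the data set is capped at its size.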
Inherited fields:
fasttol
initializer, k, maxiter
ALGORITHM_ID
DISTANCE_FUNCTION_ID
Constructor and Description |
---|
FastCLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, double fasttol, int numsamples, double sampling, boolean keepmed, RandomFactory random) - Constructor. |
Modifier and Type | Method and Description |
---|---|
Clustering<MedoidModel> | run(Database database, Relation<V> relation) - Run k-medoids. |
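To make the roles of `numsamples`, `sampling` and `keepmed` more concrete, the general CLARA idea can be sketched as follows: draw several samples, run a medoid search on each sample only, score the resulting medoids on the full data set, and keep the best. This is a simplified illustration under stated assumptions, not the ELKI implementation: all names are hypothetical, the data is one-dimensional with absolute difference as distance, and the inner search is a naive swap loop rather than FastPAM.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative CLARA-style loop on 1-D data; hypothetical names, not ELKI code. */
final class ClaraSketch {
  /** Returns the indices of the best k medoids found over numsamples samples. */
  static int[] clara(double[] data, int k, int numsamples, int sampleSize, boolean keepmed, Random rnd) {
    if (numsamples < 1 || k < 1 || k > sampleSize || sampleSize > data.length) {
      throw new IllegalArgumentException("need numsamples >= 1 and 1 <= k <= sampleSize <= data.length");
    }
    int[] best = null;
    double bestCost = Double.POSITIVE_INFINITY;
    for (int s = 0; s < numsamples; s++) {
      // With keepmed, the best medoids so far are forced into the next sample.
      List<Integer> sample = drawSample(data.length, sampleSize, keepmed ? best : null, rnd);
      int[] medoids = swapSearch(data, sample, k);
      double cost = totalCost(data, medoids); // quality is judged on the full data set
      if (cost < bestCost) {
        bestCost = cost;
        best = medoids;
      }
    }
    return best;
  }

  /** Random sample of distinct indices, starting with the kept medoids (if any). */
  static List<Integer> drawSample(int n, int sampleSize, int[] keep, Random rnd) {
    List<Integer> all = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      all.add(i);
    }
    Collections.shuffle(all, rnd);
    List<Integer> sample = new ArrayList<>();
    if (keep != null) {
      for (int m : keep) {
        sample.add(m);
      }
    }
    for (int i = 0; i < n && sample.size() < sampleSize; i++) {
      if (!sample.contains(all.get(i))) {
        sample.add(all.get(i));
      }
    }
    return sample;
  }

  /** Naive swap search on the sample: accept any swap that lowers the sample cost. */
  static int[] swapSearch(double[] data, List<Integer> sample, int k) {
    int[] medoids = new int[k];
    for (int i = 0; i < k; i++) {
      medoids[i] = sample.get(i);
    }
    double cost = sampleCost(data, sample, medoids);
    boolean improved = true;
    while (improved) {
      improved = false;
      for (int m = 0; m < k; m++) {
        for (int cand : sample) {
          if (isMedoid(medoids, cand)) {
            continue;
          }
          int old = medoids[m];
          medoids[m] = cand;
          double c = sampleCost(data, sample, medoids);
          if (c < cost) {
            cost = c;
            improved = true;
          } else {
            medoids[m] = old; // undo a swap that did not help
          }
        }
      }
    }
    return medoids;
  }

  static boolean isMedoid(int[] medoids, int p) {
    for (int m : medoids) {
      if (m == p) {
        return true;
      }
    }
    return false;
  }

  static double nearest(double[] data, int p, int[] medoids) {
    double best = Double.POSITIVE_INFINITY;
    for (int m : medoids) {
      best = Math.min(best, Math.abs(data[p] - data[m]));
    }
    return best;
  }

  static double sampleCost(double[] data, List<Integer> sample, int[] medoids) {
    double sum = 0.0;
    for (int p : sample) {
      sum += nearest(data, p, medoids);
    }
    return sum;
  }

  static double totalCost(double[] data, int[] medoids) {
    double sum = 0.0;
    for (int p = 0; p < data.length; p++) {
      sum += nearest(data, p, medoids);
    }
    return sum;
  }
}
```

For example, `ClaraSketch.clara(new double[] {0, 1, 2, 10, 11, 12}, 2, 3, 6, true, new Random(42))` uses the whole data set as the sample and returns the indices 1 and 4 (in some order), the central points of the two groups. With a smaller sample size the result depends on the random draws.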
Inherited methods:
getLogger, run
getInputTypeRestriction, initialMedoids
getDistanceFunction
run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
private static final Logging LOG
double sampling
int numsamples
boolean keepmed
RandomFactory random
public FastCLARA(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, double fasttol, int numsamples, double sampling, boolean keepmed, RandomFactory random)
Parameters:
distanceFunction - Distance function to use
k - Number of clusters to produce
maxiter - Maximum number of iterations
initializer - Initialization function
numsamples - Number of samples (sampling iterations)
sampling - Sampling rate (absolute or relative)
keepmed - Keep the previous medoids in the next sample
random - Random generator
public Clustering<MedoidModel> run(Database database, Relation<V> relation)
Description copied from class: KMedoidsPAM
Overrides: run in class KMedoidsPAM<V>
Parameters:
database - Database
relation - relation to use
Copyright © 2019 ELKI Development Team.