V
- vector datatype@Priority(value=102) @Reference(authors="Erich Schubert, Peter J. Rousseeuw", title="Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms", booktitle="preprint, to appear", url="https://arxiv.org/abs/1810.05691", bibkey="DBLP:journals/corr/abs-1810-05691") public class KMedoidsFastPAM<V> extends KMedoidsFastPAM1<V>
KMedoidsFastPAM1
, but in addition
it tries to perform multiple swaps in each iteration (FastPAM2), which can
reduce the total number of iterations needed substantially for large k, if
some areas of the data are largely independent.
There is a tolerance parameter, which controls how many additional swaps are performed. When set to 0, it will only execute an additional swap if it appears to be independent (i.e., the improvements resulting from the swap have not decreased when the first swap was executed). We suggest to rather leave it at the default of 1, which means to perform any additional swap that gives an improvement. We could not observe a tendency to find worse results when doing these additional swaps, but a reduced runtime.
Because of the speed benefits, we also suggest to use a linear-time initialization, such as the k-means++ initialization or the proposed LAB (linear approximative BUILD, the third component of FastPAM) initialization, and try multiple times if the runtime permits.
Reference:
Erich Schubert, Peter J. Rousseeuw
Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS
Algorithms
preprint, to appear
Modifier and Type | Class and Description |
---|---|
protected static class |
KMedoidsFastPAM.Instance
Instance for a single dataset.
|
static class |
KMedoidsFastPAM.Parameterizer<V>
Parameterization class.
|
Modifier and Type | Field and Description |
---|---|
protected double |
fasttol
Tolerance for fast swapping behavior (may perform worse swaps).
|
private static java.lang.String |
KEY
Key for statistics logging.
|
private static Logging |
LOG
The logger for this class.
|
initializer, k, maxiter
ALGORITHM_ID
DISTANCE_FUNCTION_ID
Constructor and Description |
---|
KMedoidsFastPAM(DistanceFunction<? super V> distanceFunction,
int k,
int maxiter,
KMedoidsInitialization<V> initializer)
Constructor.
|
KMedoidsFastPAM(DistanceFunction<? super V> distanceFunction,
int k,
int maxiter,
KMedoidsInitialization<V> initializer,
double fasttol)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
protected void |
run(DistanceQuery<V> distQ,
DBIDs ids,
ArrayModifiableDBIDs medoids,
WritableIntegerDataStore assignment)
Run the main algorithm.
|
getInputTypeRestriction, initialMedoids, run
getDistanceFunction
run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
private static final Logging LOG
private static final java.lang.String KEY
protected double fasttol
public KMedoidsFastPAM(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer)
distanceFunction
- distance functionk
- k parametermaxiter
- Maxiter parameterinitializer
- Function to generate the initial meanspublic KMedoidsFastPAM(DistanceFunction<? super V> distanceFunction, int k, int maxiter, KMedoidsInitialization<V> initializer, double fasttol)
distanceFunction
- distance functionk
- k parametermaxiter
- Maxiter parameterinitializer
- Function to generate the initial meansfasttol
- Tolerance for fast swappingprotected void run(DistanceQuery<V> distQ, DBIDs ids, ArrayModifiableDBIDs medoids, WritableIntegerDataStore assignment)
KMedoidsPAM
run
in class KMedoidsFastPAM1<V>
distQ
- Distance queryids
- IDs to processmedoids
- Current medoidsassignment
- Cluster assignment outputprotected Logging getLogger()
AbstractAlgorithm
getLogger
in class KMedoidsFastPAM1<V>
Copyright © 2019 ELKI Development Team. License information.