V
- Vector type@Reference(authors="R. T. Ng, J. Han", title="CLARANS: a method for clustering objects for spatial data mining", booktitle="IEEE Transactions on Knowledge and Data Engineering 14(5)", url="https://doi.org/10.1109/TKDE.2002.1033770", bibkey="DBLP:journals/tkde/NgH02") public class CLARANS<V> extends AbstractDistanceBasedAlgorithm<V,Clustering<MedoidModel>> implements ClusteringAlgorithm<Clustering<MedoidModel>>
KMedoidsPAM
)
and CLARA and also based on sampling.
This implementation tries to balance memory and computation time. By caching the distances to the two nearest medoids, we usually only need O(n) instead of O(nk) distance computations for one iteration, at the cost of needing O(2n) memory to store them.
The implementation is fairly ugly, because we have three solutions (the best found so far, the current solution, and a neighbor candidate); and for each point in each solution we need the best and second best assignments. But with Java 11, we may be able to switch to value types that would clean this code significantly, without the overhead of O(n) objects.
Reference:
R. T. Ng, J. Han
CLARANS: a method for clustering objects for spatial data mining
IEEE Transactions on Knowledge and Data Engineering 14(5)
Modifier and Type | Class and Description |
---|---|
protected static class |
CLARANS.Assignment
Assignment state.
|
static class |
CLARANS.Parameterizer<V>
Parameterization class.
|
Modifier and Type | Field and Description |
---|---|
(package private) int |
k
Number of clusters to find.
|
private static Logging |
LOG
Class logger.
|
(package private) double |
maxneighbor
Sampling rate.
|
(package private) int |
numlocal
Number of samples to draw (i.e. restarts).
|
(package private) RandomFactory |
random
Random factory for initialization.
|
ALGORITHM_ID
DISTANCE_FUNCTION_ID
Constructor and Description |
---|
CLARANS(DistanceFunction<? super V> distanceFunction,
int k,
int numlocal,
double maxneighbor,
RandomFactory random)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
TypeInformation[] |
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
Clustering<MedoidModel> |
run(Database database,
Relation<V> relation) |
getDistanceFunction
run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
private static final Logging LOG
int k
int numlocal
double maxneighbor
RandomFactory random
public CLARANS(DistanceFunction<? super V> distanceFunction, int k, int numlocal, double maxneighbor, RandomFactory random)
distanceFunction
- Distance function to usek
- Number of clusters to producenumlocal
- Number of samples (restarts)maxneighbor
- Neighbor sampling rate (absolute or relative)random
- Random generatorpublic Clustering<MedoidModel> run(Database database, Relation<V> relation)
public TypeInformation[] getInputTypeRestriction()
AbstractAlgorithm
getInputTypeRestriction
in interface Algorithm
getInputTypeRestriction
in class AbstractAlgorithm<Clustering<MedoidModel>>
protected Logging getLogger()
AbstractAlgorithm
getLogger
in class AbstractAlgorithm<Clustering<MedoidModel>>
Copyright © 2019 ELKI Development Team. License information.