O
- Object type@Title(value="HDBSCAN: Hierarchical Density-Based Spatial Clustering of Applications with Noise") @Description(value="Density-Based Clustering Based on Hierarchical Density Estimates") @Reference(authors="R. J. G. B. Campello, D. Moulavi, J. Sander", title="Density-Based Clustering Based on Hierarchical Density Estimates", booktitle="Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD)", url="https://doi.org/10.1007/978-3-642-37456-2_14", bibkey="DBLP:conf/pakdd/CampelloMS13") public class HDBSCANLinearMemory<O> extends AbstractHDBSCAN<O,PointerDensityHierarchyRepresentationResult> implements HierarchicalClusteringAlgorithm
By not building a distance matrix, we can reduce memory usage to linear memory only; but at the cost of roughly double the runtime (unless using indexes) as we first need to compute all kNN distances (for core sizes), then recompute distances when building the spanning tree.
This implementation follows the HDBSCAN publication more closely than
SLINKHDBSCANLinearMemory
, by computing the minimum spanning tree
using Prim's algorithm (instead of SLINK; although the two are remarkably
similar). In order to produce the preferred internal format of hierarchical
clusterings (the compact pointer representation introduced in SLINK
)
we have to perform a postprocessing conversion.
This implementation does not include the cluster extraction discussed as Step 4, which is provided in a separate step. For this reason, we also do not include self-edges.
Reference:
R. J. G. B. Campello, D. Moulavi, J. Sander
Density-Based Clustering Based on Hierarchical Density Estimates
Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD)
Modifier and Type | Class and Description |
---|---|
static class |
HDBSCANLinearMemory.Parameterizer<O>
Parameterization class
|
AbstractHDBSCAN.HDBSCANAdapter, AbstractHDBSCAN.HeapMSTCollector
Modifier and Type | Field and Description |
---|---|
private static Logging |
LOG
Class logger.
|
minPts
ALGORITHM_ID
DISTANCE_FUNCTION_ID
Constructor and Description |
---|
HDBSCANLinearMemory(DistanceFunction<? super O> distanceFunction,
int minPts)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
TypeInformation[] |
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
PointerDensityHierarchyRepresentationResult |
run(Database db,
Relation<O> relation)
Run the algorithm
|
computeCoreDists, convertToPointerRepresentation
getDistanceFunction
run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
private static final Logging LOG
public HDBSCANLinearMemory(DistanceFunction<? super O> distanceFunction, int minPts)
distanceFunction
- Distance functionminPts
- Minimum number of points for densitypublic PointerDensityHierarchyRepresentationResult run(Database db, Relation<O> relation)
db
- Databaserelation
- Relationpublic TypeInformation[] getInputTypeRestriction()
AbstractAlgorithm
getInputTypeRestriction
in interface Algorithm
getInputTypeRestriction
in class AbstractHDBSCAN<O,PointerDensityHierarchyRepresentationResult>
protected Logging getLogger()
AbstractAlgorithm
getLogger
in class AbstractAlgorithm<PointerDensityHierarchyRepresentationResult>
Copyright © 2019 ELKI Development Team. License information.