Package | Description |
---|---|
de.lmu.ifi.dbs.elki.algorithm |
Algorithms suitable as a task for the
KDDTask
main routine. |
de.lmu.ifi.dbs.elki.algorithm.clustering |
Clustering algorithms
Clustering algorithms are supposed to implement the
Algorithm -Interface. |
de.lmu.ifi.dbs.elki.algorithm.clustering.affinitypropagation |
Affinity Propagation (AP) clustering.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.biclustering |
Biclustering algorithms
|
de.lmu.ifi.dbs.elki.algorithm.clustering.correlation |
Correlation clustering algorithms
|
de.lmu.ifi.dbs.elki.algorithm.clustering.gdbscan |
Generalized DBSCAN
Generalized DBSCAN is an abstraction of the original DBSCAN idea,
that allows the use of arbitrary "neighborhood" and "core point" predicates.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.gdbscan.parallel |
Parallel versions of Generalized DBSCAN.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.hierarchical |
Hierarchical agglomerative clustering (HAC).
|
de.lmu.ifi.dbs.elki.algorithm.clustering.hierarchical.birch |
BIRCH clustering.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.hierarchical.extraction |
Extraction of partitional clusterings from hierarchical results.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.hierarchical.linkage |
Linkages for hierarchical clustering.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans |
K-means clustering and variations
|
de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.initialization |
Initialization strategies for k-means.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans.quality |
Quality measures for k-Means results.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.optics |
OPTICS family of clustering algorithms.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.subspace |
Axis-parallel subspace clustering algorithms
The clustering algorithms in this package are either projected
clustering algorithms or subspace clustering algorithms, according to the
classical but somewhat obsolete classification schema of clustering
algorithms for axis-parallel subspaces.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.uncertain |
Clustering algorithms for uncertain data.
|
de.lmu.ifi.dbs.elki.algorithm.itemsetmining |
Algorithms for frequent itemset mining such as APRIORI.
|
de.lmu.ifi.dbs.elki.algorithm.itemsetmining.associationrules |
Association rule mining.
|
de.lmu.ifi.dbs.elki.algorithm.itemsetmining.associationrules.interest |
Association rule interestingness measures.
|
de.lmu.ifi.dbs.elki.algorithm.outlier |
Outlier detection algorithms
|
de.lmu.ifi.dbs.elki.algorithm.outlier.anglebased |
Angle-based outlier detection algorithms.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.clustering |
Clustering based outlier detection.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.distance |
Distance-based outlier detection algorithms, such as DBOutlier and kNN.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.distance.parallel |
Parallel implementations of distance-based outlier detectors.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.intrinsic |
Outlier detection algorithms based on intrinsic dimensionality.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.lof |
LOF family of outlier detection algorithms
|
de.lmu.ifi.dbs.elki.algorithm.outlier.lof.parallel |
Parallelized variants of LOF.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.meta |
Meta outlier detection algorithms: external scores, score rescaling
|
de.lmu.ifi.dbs.elki.algorithm.outlier.spatial |
Spatial outlier detection algorithms
|
de.lmu.ifi.dbs.elki.algorithm.outlier.subspace |
Subspace outlier detection methods
Methods that detect outliers in subspaces (projections) of the data set.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.svm |
Support-Vector-Machines for outlier detection.
|
de.lmu.ifi.dbs.elki.algorithm.projection |
Data projections (see also preprocessing filters for basic projections).
|
de.lmu.ifi.dbs.elki.algorithm.statistics |
Statistical analysis algorithms.
|
de.lmu.ifi.dbs.elki.algorithm.timeseries |
Algorithms for change point detection in time series.
|
de.lmu.ifi.dbs.elki.application |
Base classes for standalone applications.
|
de.lmu.ifi.dbs.elki.application.experiments |
Packaged experiments to make them easy to reproduce.
|
de.lmu.ifi.dbs.elki.application.greedyensemble |
Greedy ensembles for outlier detection.
|
de.lmu.ifi.dbs.elki.data.projection.random |
Random projection families
|
de.lmu.ifi.dbs.elki.database.ids.integer |
Integer-based DBID implementation.
Do not use it directly; always use
DBIDUtil. |
de.lmu.ifi.dbs.elki.datasource.filter.transform |
Data space transformations
|
de.lmu.ifi.dbs.elki.distance.distancefunction |
Distance functions for use within ELKI.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.colorhistogram |
Distance functions for color histograms
|
de.lmu.ifi.dbs.elki.distance.distancefunction.geo |
Geographic (earth) distance functions
|
de.lmu.ifi.dbs.elki.distance.distancefunction.histogram |
Distance functions for one-dimensional histograms.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.probabilistic |
Distance from probability theory, mostly divergences such as K-L-divergence,
J-divergence, F-divergence, χ²-divergence, etc.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.set |
Distance functions for binary and set type data.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.strings |
Distance functions for strings
|
de.lmu.ifi.dbs.elki.distance.distancefunction.timeseries |
Distance functions designed for time series
Note that some regular distance functions (e.g., Euclidean) are also used on
time series.
|
de.lmu.ifi.dbs.elki.distance.similarityfunction |
Similarity functions
|
de.lmu.ifi.dbs.elki.distance.similarityfunction.cluster |
Similarity measures for comparing clusters.
|
de.lmu.ifi.dbs.elki.evaluation.clustering |
Evaluation of clustering results
|
de.lmu.ifi.dbs.elki.evaluation.clustering.internal |
Internal evaluation measures for clusterings.
|
de.lmu.ifi.dbs.elki.evaluation.clustering.pairsegments |
Pair-segment analysis of multiple clusterings
|
de.lmu.ifi.dbs.elki.evaluation.outlier |
Evaluate an outlier score using a misclassification based cost model
|
de.lmu.ifi.dbs.elki.evaluation.scores |
Evaluation of rankings and scorings
|
de.lmu.ifi.dbs.elki.index.lsh.hashfamilies |
Hash function families for LSH
|
de.lmu.ifi.dbs.elki.index.lsh.hashfunctions |
Hash functions for LSH
|
de.lmu.ifi.dbs.elki.index.preprocessed.fastoptics |
Preprocessed index used by the FastOPTICS algorithm.
|
de.lmu.ifi.dbs.elki.index.preprocessed.knn |
Indexes providing KNN and rKNN data.
|
de.lmu.ifi.dbs.elki.index.preprocessed.preference |
Indexes storing preference vectors
|
de.lmu.ifi.dbs.elki.index.projected |
Projected indexes for data
|
de.lmu.ifi.dbs.elki.index.tree.metrical.covertree |
Cover-tree variations.
|
de.lmu.ifi.dbs.elki.index.tree.metrical.mtreevariants.mtree | |
de.lmu.ifi.dbs.elki.index.tree.metrical.mtreevariants.strategies.insert |
Insertion (choose path) strategies of nodes in an M-Tree (and variants)
|
de.lmu.ifi.dbs.elki.index.tree.metrical.mtreevariants.strategies.split |
Splitting strategies of nodes in an M-Tree (and variants)
|
de.lmu.ifi.dbs.elki.index.tree.metrical.mtreevariants.strategies.split.distribution |
Entry distribution strategies of nodes in an M-Tree (and variants).
|
de.lmu.ifi.dbs.elki.index.tree.spatial.kd |
K-d-tree and variants
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.query |
Queries on the R-Tree family of indexes: kNN and range queries
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar | |
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.bulk |
Packages for bulk-loading R*-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.insert |
Insertion strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.overflow |
Overflow treatment strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.reinsert |
Reinsertion strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.split |
Splitting strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.vafile |
Vector Approximation File
|
de.lmu.ifi.dbs.elki.math |
Mathematical operations and utilities used throughout the framework
|
de.lmu.ifi.dbs.elki.math.geodesy |
Functions for computing on the sphere / earth.
|
de.lmu.ifi.dbs.elki.math.geometry |
Algorithms from computational geometry
|
de.lmu.ifi.dbs.elki.math.linearalgebra |
The linear algebra package provides classes and computational methods for
operations on matrices and vectors.
|
de.lmu.ifi.dbs.elki.math.linearalgebra.pca |
Principal Component Analysis (PCA) and Eigenvector processing
|
de.lmu.ifi.dbs.elki.math.spacefillingcurves |
Space filling curves
|
de.lmu.ifi.dbs.elki.math.statistics.dependence |
Statistical measures of dependence, such as correlation
|
de.lmu.ifi.dbs.elki.math.statistics.distribution |
Standard distributions, with random generation functionalities
|
de.lmu.ifi.dbs.elki.math.statistics.distribution.estimator |
Estimators for statistical distributions.
|
de.lmu.ifi.dbs.elki.math.statistics.distribution.estimator.meta |
Meta estimators: estimators that do not actually estimate themselves, but instead use other estimators, e.g. on a trimmed data set, or as an ensemble.
|
de.lmu.ifi.dbs.elki.math.statistics.intrinsicdimensionality |
Methods for estimating the intrinsic dimensionality.
|
de.lmu.ifi.dbs.elki.math.statistics.kernelfunctions |
Kernel functions from statistics.
|
de.lmu.ifi.dbs.elki.math.statistics.tests |
Statistical tests
|
de.lmu.ifi.dbs.elki.result |
Result types, representation and handling
|
de.lmu.ifi.dbs.elki.utilities.datastructures.arrays |
Utilities for arrays: advanced sorting for primitive arrays
|
de.lmu.ifi.dbs.elki.utilities.datastructures.unionfind |
Union-find data structures.
|
de.lmu.ifi.dbs.elki.utilities.random |
Random number generation.
|
de.lmu.ifi.dbs.elki.utilities.scaling.outlier |
Scaling of outlier scores that requires a statistical analysis of the
occurring values
|
de.lmu.ifi.dbs.elki.visualization.parallel3d |
3DPC: 3D parallel coordinate plot visualization for ELKI.
|
de.lmu.ifi.dbs.elki.visualization.parallel3d.layout |
Layouting algorithms for 3D parallel coordinate plots.
|
de.lmu.ifi.dbs.elki.visualization.projector |
Projectors are responsible for finding appropriate projections for data
relations
|
de.lmu.ifi.dbs.elki.visualization.visualizers.pairsegments |
Visualizers for inspecting cluster differences using pair counting segments
|
de.lmu.ifi.dbs.elki.visualization.visualizers.scatterplot.density |
Visualizers for data set density in a scatterplot projection
|
de.lmu.ifi.dbs.elki.visualization.visualizers.scatterplot.outlier |
Visualizers for outlier scores based on 2D projections
|
tutorial.clustering |
Classes from the tutorial on implementing a custom k-means variation
|
tutorial.outlier |
Tutorials on implementing outlier detection methods in ELKI.
|
Package | Description |
---|---|
de.lmu.ifi.dbs.elki.algorithm.outlier.lof.parallel |
Parallelized variants of LOF.
|
Modifier and Type | Class and Description |
---|---|
class |
DependencyDerivator<V extends NumberVector>
Dependency derivator computes quantitatively linear dependencies among
attributes of a given dataset based on a linear correlation PCA.
|
Modifier and Type | Class and Description |
---|---|
class |
CanopyPreClustering<O>
Canopy pre-clustering is a simple preprocessing step for clustering.
|
class |
GriDBSCAN<V extends NumberVector>
Using Grid for Accelerating Density-Based Clustering.
|
class |
Leader<O>
Leader clustering algorithm.
|
class |
NaiveMeanShiftClustering<V extends NumberVector>
Mean-shift based clustering algorithm.
|
class |
SNNClustering<O>
Shared nearest neighbor clustering.
|
Modifier and Type | Class and Description |
---|---|
class |
AffinityPropagationClusteringAlgorithm<O>
Cluster analysis by affinity propagation.
|
Modifier and Type | Class and Description |
---|---|
class |
ChengAndChurch<V extends NumberVector>
Cheng and Church biclustering.
|
Modifier and Type | Class and Description |
---|---|
class |
CASH<V extends NumberVector>
The CASH algorithm is a subspace clustering algorithm based on the Hough
transform.
|
class |
COPAC<V extends NumberVector>
COPAC is an algorithm to partition a database according to the correlation
dimension of its objects and to then perform an arbitrary clustering
algorithm over the partitions.
|
class |
ERiC<V extends NumberVector>
Performs correlation clustering on the data partitioned according to local
correlation dimensionality and builds a hierarchy of correlation clusters
that allows multiple inheritance from the clustering result.
|
class |
FourC<V extends NumberVector>
4C identifies local subgroups of data objects sharing a uniform correlation.
|
class |
HiCO<V extends NumberVector>
Implementation of the HiCO algorithm, an algorithm for detecting hierarchies
of correlation clusters.
|
class |
LMCLUS
Linear manifold clustering in high dimensional spaces by stochastic search.
|
class |
ORCLUS<V extends NumberVector>
ORCLUS: Arbitrarily ORiented projected CLUSter generation.
|
Modifier and Type | Class and Description |
---|---|
class |
COPACNeighborPredicate<V extends NumberVector>
COPAC neighborhood predicate.
|
class |
EpsilonNeighborPredicate<O>
The default DBSCAN and OPTICS neighbor predicate, using an
epsilon-neighborhood.
|
class |
ERiCNeighborPredicate<V extends NumberVector>
ERiC neighborhood predicate.
|
class |
FourCCorePredicate
The 4C core point predicate.
|
class |
FourCNeighborPredicate<V extends NumberVector>
4C identifies local subgroups of data objects sharing a uniform correlation.
|
class |
GeneralizedDBSCAN
Generalized DBSCAN, density-based clustering with noise.
|
class |
LSDBC<O extends NumberVector>
Locally Scaled Density Based Clustering.
|
class |
MinPtsCorePredicate
The DBSCAN default core point predicate -- having at least
MinPtsCorePredicate.minpts
neighbors. |
class |
PreDeConCorePredicate
The PreDeCon core point predicate -- having at least minpts neighbors, and a
maximum preference dimensionality of lambda.
|
class |
PreDeConNeighborPredicate<V extends NumberVector>
Neighborhood predicate used by PreDeCon.
|
class |
SimilarityNeighborPredicate<O>
The DBSCAN neighbor predicate for a
SimilarityFunction , using all
neighbors with a minimum similarity. |
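
The predicate classes above plug into Generalized DBSCAN as a "neighborhood" predicate and a "core point" predicate. The following is a minimal plain-Java sketch of that idea, not the ELKI API: the class `GeneralizedDbscanSketch`, its method names, and the use of `IntFunction` and `Predicate` as stand-ins for the two predicates are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.Predicate;

/** Plain-Java sketch of the Generalized DBSCAN idea; hypothetical names, not the ELKI API. */
final class GeneralizedDbscanSketch {
  static final int UNVISITED = -2, NOISE = -1;

  /**
   * @param n number of objects, identified by index 0..n-1
   * @param neighbors "neighborhood" predicate: index to neighbor indices (including itself)
   * @param isCore "core point" predicate, evaluated on a neighborhood
   * @return cluster label per object, NOISE (-1) for noise
   */
  static int[] cluster(int n, IntFunction<List<Integer>> neighbors, Predicate<List<Integer>> isCore) {
    int[] label = new int[n];
    Arrays.fill(label, UNVISITED);
    int next = 0;
    for (int i = 0; i < n; i++) {
      if (label[i] != UNVISITED) continue;
      List<Integer> nb = neighbors.apply(i);
      if (!isCore.test(nb)) { label[i] = NOISE; continue; }
      int c = next++;
      label[i] = c;
      Deque<Integer> queue = new ArrayDeque<>(nb);
      while (!queue.isEmpty()) {
        int j = queue.poll();
        if (label[j] == NOISE) label[j] = c; // former noise becomes a border point
        if (label[j] != UNVISITED) continue;
        label[j] = c;
        List<Integer> nbj = neighbors.apply(j);
        if (isCore.test(nbj)) queue.addAll(nbj); // only core points expand the cluster
      }
    }
    return label;
  }

  /** Epsilon-neighborhood on 1-D data, analogous in spirit to an epsilon neighbor predicate. */
  static IntFunction<List<Integer>> epsilonNeighbors(double[] x, double eps) {
    return i -> {
      List<Integer> out = new ArrayList<>();
      for (int j = 0; j < x.length; j++) {
        if (Math.abs(x[i] - x[j]) <= eps) out.add(j);
      }
      return out;
    };
  }

  public static void main(String[] args) {
    double[] x = {0, 1, 2, 10, 11, 12, 20};
    // "At least 2 neighbors" plays the role of a minpts-style core predicate.
    int[] labels = cluster(x.length, epsilonNeighbors(x, 1.5), nb -> nb.size() >= 2);
    System.out.println(Arrays.toString(labels));
  }
}
```

Swapping either lambda for a different rule (for example a similarity threshold instead of an epsilon range) changes the clustering behavior without touching the expansion loop, which is the point of the abstraction.
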
Modifier and Type | Class and Description |
---|---|
class |
ParallelGeneralizedDBSCAN
Parallel version of DBSCAN clustering.
|
Modifier and Type | Class and Description |
---|---|
class |
AbstractHDBSCAN<O,R extends Result>
Abstract base class for HDBSCAN variations.
|
class |
AnderbergHierarchicalClustering<O>
This is a modification of the classic AGNES algorithm for hierarchical
clustering using a nearest-neighbor heuristic for acceleration.
|
class |
CLINK<O>
CLINK algorithm for complete linkage.
|
class |
HDBSCANLinearMemory<O>
Linear memory implementation of HDBSCAN clustering.
|
class |
MiniMaxAnderberg<O>
This is a modification of the classic MiniMax algorithm for hierarchical
clustering using a nearest-neighbor heuristic for acceleration.
|
class |
SLINK<O>
Implementation of the efficient Single-Link Algorithm SLINK of R.
|
class |
SLINKHDBSCANLinearMemory<O>
Linear memory implementation of HDBSCAN clustering based on SLINK.
|
Modifier and Type | Class and Description |
---|---|
class |
AverageInterclusterDistance
Average intercluster distance.
|
class |
AverageIntraclusterDistance
Average intracluster distance.
|
class |
CentroidEuclideanDistance
Centroid Euclidean distance.
|
class |
CentroidManhattanDistance
Centroid Manhattan Distance
Reference:
Data Clustering for Very Large Datasets Plus Applications
T. |
class |
DiameterCriterion
Average Diameter (D) criterion.
|
class |
RadiusCriterion
Average Radius (R) criterion.
|
class |
VarianceIncreaseDistance
Variance increase distance.
|
Modifier and Type | Class and Description |
---|---|
class |
ClustersWithNoiseExtraction
Extraction of a given number of clusters with a minimum size, and noise.
|
class |
HDBSCANHierarchyExtraction
Extraction of simplified cluster hierarchies, as proposed in HDBSCAN.
|
class |
SimplifiedHierarchyExtraction
Extraction of simplified cluster hierarchies, as proposed in HDBSCAN.
|
Modifier and Type | Class and Description |
---|---|
class |
CentroidLinkage
Centroid linkage — Unweighted Pair-Group Method using Centroids
(UPGMC).
|
class |
FlexibleBetaLinkage
Flexible-beta linkage as proposed by Lance and Williams.
|
class |
GroupAverageLinkage
Group-average linkage clustering method (UPGMA).
|
interface |
Linkage
Abstract interface for implementing a new linkage method into hierarchical
clustering.
|
class |
MedianLinkage
Median-linkage — weighted pair group method using centroids (WPGMC).
|
class |
SingleLinkage
Single-linkage ("minimum") clustering method.
|
class |
WeightedAverageLinkage
Weighted average linkage clustering method (WPGMA).
|
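
Several of the linkages listed above can be written as special cases of the Lance-Williams recurrence, d(k, i∪j) = αi·d(k,i) + αj·d(k,j) + β·d(i,j) + γ·|d(k,i) − d(k,j)|. The sketch below shows that recurrence in plain Java with the standard textbook coefficients; the class and method names are hypothetical, and the signature of ELKI's actual `Linkage` interface may differ.

```java
/** Plain-Java sketch of Lance-Williams linkage updates; hypothetical names, not the ELKI Linkage interface. */
final class LanceWilliamsSketch {
  /** General recurrence for the distance from cluster k to the merged cluster (i union j). */
  static double update(double ai, double aj, double beta, double gamma,
      double dki, double dkj, double dij) {
    return ai * dki + aj * dkj + beta * dij + gamma * Math.abs(dki - dkj);
  }

  /** Single linkage: equivalent to min(d(k,i), d(k,j)). */
  static double single(double dki, double dkj, double dij) {
    return update(0.5, 0.5, 0.0, -0.5, dki, dkj, dij);
  }

  /** Group-average linkage (UPGMA): weights proportional to cluster sizes ni and nj. */
  static double groupAverage(int ni, int nj, double dki, double dkj, double dij) {
    double n = ni + nj;
    return update(ni / n, nj / n, 0.0, 0.0, dki, dkj, dij);
  }

  /** Weighted-average linkage (WPGMA): both clusters weighted equally, regardless of size. */
  static double weightedAverage(double dki, double dkj, double dij) {
    return update(0.5, 0.5, 0.0, 0.0, dki, dkj, dij);
  }

  /** Flexible-beta linkage: alpha_i = alpha_j = (1 - beta) / 2. */
  static double flexibleBeta(double beta, double dki, double dkj, double dij) {
    double a = (1.0 - beta) / 2.0;
    return update(a, a, beta, 0.0, dki, dkj, dij);
  }

  public static void main(String[] args) {
    System.out.println(single(3, 5, 2));
    System.out.println(groupAverage(1, 3, 4, 8, 2));
  }
}
```

Centroid (UPGMC) and median (WPGMC) linkage fit the same recurrence with a negative β, but they are usually applied to squared Euclidean distances, so they are left out of this sketch.
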
Modifier and Type | Class and Description |
---|---|
class |
CLARANS<V>
CLARANS, a method for clustering objects for spatial data mining,
is inspired by PAM (partitioning around medoids,
KMedoidsPAM)
and CLARA, and is likewise based on sampling. |
class |
FastCLARA<V>
Clustering Large Applications (CLARA) with the
KMedoidsFastPAM
improvements, to increase scalability in the number of clusters. |
class |
FastCLARANS<V>
A faster variation of CLARANS, that can explore O(k) as many swaps at a
similar cost by considering all medoids for each candidate non-medoid.
|
class |
KMeansBisecting<V extends NumberVector,M extends MeanModel>
The bisecting k-means algorithm works by starting with an initial
partitioning into two clusters, then repeatedly splitting the largest
cluster to get additional clusters.
|
class |
KMeansCompare<V extends NumberVector>
Compare-Means: Accelerated k-means by exploiting the triangle inequality and
pairwise distances of means to prune candidate means.
|
class |
KMeansElkan<V extends NumberVector>
Elkan's fast k-means by exploiting the triangle inequality.
|
class |
KMeansExponion<V extends NumberVector>
Newling's exponion k-means algorithm, exploiting the triangle inequality.
|
class |
KMeansHamerly<V extends NumberVector>
Hamerly's fast k-means by exploiting the triangle inequality.
|
class |
KMeansMacQueen<V extends NumberVector>
The original k-means algorithm, using MacQueen style incremental updates;
making this effectively an "online" (streaming) algorithm.
|
class |
KMeansMinusMinus<V extends NumberVector>
k-means--: A Unified Approach to Clustering and Outlier Detection.
|
class |
KMeansSimplifiedElkan<V extends NumberVector>
Simplified version of Elkan's k-means by exploiting the triangle inequality.
|
class |
KMeansSort<V extends NumberVector>
Sort-Means: Accelerated k-means by exploiting the triangle inequality and
pairwise distances of means to prune candidate means (with sorting).
|
class |
KMediansLloyd<V extends NumberVector>
k-medians clustering algorithm, but using Lloyd-style bulk iterations instead
of the more complicated approach suggested by Kaufman and Rousseeuw (see
KMedoidsPAM instead). |
class |
KMedoidsFastPAM<V>
FastPAM: An improved version of PAM, that is usually O(k) times faster.
|
class |
KMedoidsFastPAM1<V>
FastPAM1: A version of PAM that is O(k) times faster, i.e., now in O((n-k)²).
|
class |
KMedoidsPAMReynolds<V>
The Partitioning Around Medoids (PAM) algorithm with some additional
optimizations proposed by Reynolds et al.
|
class |
XMeans<V extends NumberVector,M extends MeanModel>
X-means: Extending K-means with Efficient Estimation on the Number of
Clusters.
|
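
To illustrate the "MacQueen style incremental updates" mentioned for KMeansMacQueen above, here is a minimal single-pass sketch on one-dimensional data. It is an assumption-laden simplification rather than the ELKI implementation: the class `MacQueenSketch` is hypothetical, the first k points are used as seeds, and no further iterations are performed.

```java
import java.util.Arrays;

/** Plain-Java sketch of MacQueen-style incremental mean updates; not the ELKI implementation. */
final class MacQueenSketch {
  /**
   * Single pass over 1-D data: the first k points seed the means, each later
   * point is assigned to its nearest mean, and that mean is updated immediately.
   */
  static double[] run(double[] data, int k) {
    if (k < 1 || data.length < k) {
      throw new IllegalArgumentException("need 1 <= k <= number of points");
    }
    double[] means = Arrays.copyOf(data, k);
    int[] counts = new int[k];
    Arrays.fill(counts, 1); // each seed point already belongs to its own cluster
    for (int i = k; i < data.length; i++) {
      int best = 0;
      for (int c = 1; c < k; c++) {
        if (Math.abs(data[i] - means[c]) < Math.abs(data[i] - means[best])) best = c;
      }
      counts[best]++;
      // Incremental mean update: newMean = oldMean + (x - oldMean) / n
      means[best] += (data[i] - means[best]) / counts[best];
    }
    return means;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(run(new double[] {0, 10, 1, 11, 2, 12}, 2)));
  }
}
```

Because every point updates a mean as soon as it is seen, the same loop can consume a stream, which is why the description above calls the method effectively "online".
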
Modifier and Type | Class and Description |
---|---|
class |
FirstKInitialMeans<O>
Initialize K-means by using the first k objects as initial means.
|
class |
KMeansPlusPlusInitialMeans<O>
K-Means++ initialization for k-means.
|
class |
LABInitialMeans<O>
Linear approximative BUILD (LAB) initialization for FastPAM (and k-means).
|
class |
ParkInitialMeans<O>
Initialization method proposed by Park and Jun.
|
class |
RandomNormalGeneratedInitialMeans
Initialize k-means by generating random vectors (normally distributed
with \(N(\mu,\sigma)\) in each dimension).
|
class |
RandomUniformGeneratedInitialMeans
Initialize k-means by generating random vectors (uniform, within the value
range of the data set).
|
class |
SampleKMeansInitialization<V extends NumberVector>
Initialize k-means by running k-means on a sample of the data set only.
|
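
As an illustration of one of the initialization strategies listed above, the following sketch shows the usual k-means++ seeding rule: the first center is drawn uniformly, and each further center is drawn with probability proportional to its squared distance to the nearest center chosen so far. The class `KMeansPlusPlusSketch` and its methods are hypothetical and do not reflect the API of KMeansPlusPlusInitialMeans.

```java
import java.util.Random;

/** Plain-Java sketch of k-means++ seeding; hypothetical names, not the ELKI API. */
final class KMeansPlusPlusSketch {
  static double sqDist(double[] a, double[] b) {
    double s = 0;
    for (int d = 0; d < a.length; d++) {
      double t = a[d] - b[d];
      s += t * t;
    }
    return s;
  }

  /** Returns the indices of k points chosen as initial centers. */
  static int[] chooseCenters(double[][] data, int k, Random rnd) {
    int n = data.length;
    if (k < 1 || k > n) throw new IllegalArgumentException("need 1 <= k <= n");
    int[] centers = new int[k];
    double[] d2 = new double[n]; // squared distance to the nearest chosen center
    centers[0] = rnd.nextInt(n);
    for (int i = 0; i < n; i++) d2[i] = sqDist(data[i], data[centers[0]]);
    for (int c = 1; c < k; c++) {
      double total = 0;
      for (double v : d2) total += v;
      double r = rnd.nextDouble() * total;
      int pick = -1;
      for (int i = 0; i < n; i++) {
        if (d2[i] <= 0) continue; // already a center, or a duplicate of one
        pick = i; // keep the last candidate as a fallback against rounding
        r -= d2[i];
        if (r < 0) break;
      }
      if (pick < 0) throw new IllegalArgumentException("fewer than k distinct points");
      centers[c] = pick;
      for (int i = 0; i < n; i++) d2[i] = Math.min(d2[i], sqDist(data[i], data[pick]));
    }
    return centers;
  }

  public static void main(String[] args) {
    double[][] data = {{0, 0}, {0, 1}, {5, 5}, {9, 9}};
    int[] centers = chooseCenters(data, 3, new Random(42));
    System.out.println(java.util.Arrays.toString(centers));
  }
}
```

A point that is already a center has weight zero, so the same point is never selected twice; which points are selected depends on the random seed.
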
Modifier and Type | Class and Description |
---|---|
class |
BayesianInformationCriterion
Bayesian Information Criterion (BIC), also known as Schwarz criterion (SBC,
SBIC) for the use with evaluating k-means results.
|
class |
BayesianInformationCriterionZhao
Different version of the BIC criterion.
|
Modifier and Type | Method and Description |
---|---|
static <V extends NumberVector> |
AbstractKMeansQualityMeasure.logLikelihood(Relation<V> relation,
Clustering<? extends MeanModel> clustering,
NumberVectorDistanceFunction<? super V> distanceFunction)
Computes log likelihood of an entire clustering.
|
Modifier and Type | Class and Description |
---|---|
class |
AbstractOPTICS<O>
The OPTICS algorithm for density-based hierarchical clustering.
|
class |
DeLiClu<V extends NumberVector>
DeLiClu: Density-Based Hierarchical Clustering.
A hierarchical algorithm to find density-connected sets in a database,
closely related to OPTICS but exploiting the structure of an R-tree for
acceleration.
|
class |
FastOPTICS<V extends NumberVector>
FastOPTICS algorithm (Fast approximation of OPTICS)
Note that this is not FOPTICS as in "Fuzzy OPTICS"!
|
class |
OPTICSHeap<O>
The OPTICS algorithm for density-based hierarchical clustering.
|
class |
OPTICSList<O>
The OPTICS algorithm for density-based hierarchical clustering.
|
Modifier and Type | Class and Description |
---|---|
class |
CLIQUE
Implementation of the CLIQUE algorithm, a grid-based algorithm to identify
dense clusters in subspaces of maximum dimensionality.
|
class |
DiSH<V extends NumberVector>
Algorithm for detecting subspace hierarchies.
|
class |
DOC<V extends NumberVector>
DOC is a sampling based subspace clustering algorithm.
|
class |
FastDOC<V extends NumberVector>
The heuristic variant of the DOC algorithm, FastDOC
Reference:
C.
|
class |
HiSC<V extends NumberVector>
Implementation of the HiSC algorithm, an algorithm for detecting hierarchies
of subspace clusters.
|
class |
P3C<V extends NumberVector>
P3C: A Robust Projected Clustering Algorithm.
|
class |
PreDeCon<V extends NumberVector>
PreDeCon computes clusters of subspace preference weighted connected points.
|
class |
PROCLUS<V extends NumberVector>
The PROCLUS algorithm, an algorithm to find subspace clusters in high
dimensional spaces.
|
class |
SUBCLU<V extends NumberVector>
Implementation of the SUBCLU algorithm, an algorithm to detect arbitrarily
shaped and positioned clusters in subspaces.
|
Modifier and Type | Class and Description |
---|---|
class |
CenterOfMassMetaClustering<C extends Clustering<?>>
Center-of-mass meta clustering reduces uncertain objects to their center of
mass, then runs a vector-oriented clustering algorithm on this data set.
|
class |
CKMeans
Run k-means on the centers of each uncertain object.
|
class |
FDBSCAN
FDBSCAN is an adaptation of DBSCAN for fuzzy (uncertain) objects.
|
class |
FDBSCANNeighborPredicate
Density-based Clustering of Applications with Noise and Fuzzy objects
(FDBSCAN) is an algorithm to find sets in a fuzzy database that are
density-connected with minimum probability.
|
class |
RepresentativeUncertainClustering
Representative clustering of uncertain data.
|
class |
UKMeans
Uncertain K-Means clustering, using the average deviation from the center.
|
Modifier and Type | Class and Description |
---|---|
class |
APRIORI
The APRIORI algorithm for Mining Association Rules.
|
class |
Eclat
Eclat is a depth-first discovery algorithm for mining frequent itemsets.
|
class |
FPGrowth
FP-Growth is an algorithm for mining the frequent itemsets by using a
compressed representation of the database called
FPGrowth.FPTree . |
Modifier and Type | Class and Description |
---|---|
class |
AssociationRuleGeneration
Association rule generation from frequent itemsets.
This algorithm calls a specified frequent itemset algorithm
and calculates all association rules with an interest value between
the specified boundaries from the obtained frequent itemsets.
Reference:
M.
|
Modifier and Type | Class and Description |
---|---|
class |
AddedValue
Added value (AV) interestingness measure:
\( \text{confidence}(X \rightarrow Y) - \text{support}(Y) = P(Y|X)-P(Y) \).
|
class |
CertaintyFactor
Certainty factor (CF; Loevinger) interestingness measure.
\( \tfrac{\text{confidence}(X \rightarrow Y) -
\text{support}(Y)}{\text{support}(\neg Y)} \).
|
class |
Confidence
Confidence interestingness measure,
\( \tfrac{\text{support}(X \cup Y)}{\text{support}(X)}
= \tfrac{P(X \cap Y)}{P(X)}=P(Y|X) \).
|
class |
Conviction
Conviction interestingness measure:
\(\frac{P(X) P(\neg Y)}{P(X\cap\neg Y)}\).
|
class |
Cosine
Cosine interestingness measure,
\(\tfrac{\text{support}(A\cup B)}{\sqrt{\text{support}(A)\text{support}(B)}}
=\tfrac{P(A\cap B)}{\sqrt{P(A)P(B)}}\).
|
class |
JMeasure
J-Measure interestingness measure.
|
class |
Klosgen
Klösgen interestingness measure.
|
class |
Leverage
Leverage interestingness measure.
|
class |
Lift
Lift interestingness measure.
|
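
The formulas given above for added value, certainty factor, confidence, conviction and cosine can be evaluated directly from three relative supports. The sketch below does this in plain Java; the class `InterestMeasuresSketch` and its parameter names are hypothetical and are not the ELKI interestingness-measure API. It uses P(X ∩ ¬Y) = P(X) − P(X ∩ Y) for conviction.

```java
/** Plain-Java sketch of the interestingness formulas listed above; hypothetical names, not the ELKI API. */
final class InterestMeasuresSketch {
  /** confidence(X -> Y) = support(X u Y) / support(X) = P(Y|X). */
  static double confidence(double suppX, double suppXY) {
    return suppXY / suppX;
  }

  /** Added value: confidence(X -> Y) - support(Y). */
  static double addedValue(double suppX, double suppY, double suppXY) {
    return confidence(suppX, suppXY) - suppY;
  }

  /** Certainty factor: (confidence(X -> Y) - support(Y)) / support(not Y). */
  static double certaintyFactor(double suppX, double suppY, double suppXY) {
    return (confidence(suppX, suppXY) - suppY) / (1.0 - suppY);
  }

  /** Conviction: P(X) P(not Y) / P(X and not Y), with P(X and not Y) = P(X) - P(X and Y). */
  static double conviction(double suppX, double suppY, double suppXY) {
    return suppX * (1.0 - suppY) / (suppX - suppXY);
  }

  /** Cosine: support(X u Y) / sqrt(support(X) support(Y)). */
  static double cosine(double suppX, double suppY, double suppXY) {
    return suppXY / Math.sqrt(suppX * suppY);
  }

  public static void main(String[] args) {
    // Relative supports: X in 50% of transactions, Y in 40%, both together in 25%.
    double sx = 0.5, sy = 0.4, sxy = 0.25;
    System.out.println("confidence = " + confidence(sx, sxy));
    System.out.println("conviction = " + conviction(sx, sy, sxy));
  }
}
```

Note that conviction is undefined (division by zero) when the rule never fails, that is, when support(X ∪ Y) equals support(X); a real implementation has to decide how to report that case.
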
Modifier and Type | Class and Description |
---|---|
class |
COP<V extends NumberVector>
Correlation outlier probability: Outlier Detection in Arbitrarily Oriented
Subspaces
Reference:
Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek
Outlier Detection in Arbitrarily Oriented Subspaces Proc. |
class |
DWOF<O>
Algorithm to compute dynamic-window outlier factors in a database based on a
specified parameter k, which specifies the number of the neighbors to be
considered during the calculation of the DWOF score.
|
class |
GaussianUniformMixture<V extends NumberVector>
Outlier detection algorithm using a mixture model approach.
|
class |
OPTICSOF<O>
OPTICS-OF outlier detection algorithm, an algorithm to find Local Outliers in
a database based on ideas from
OPTICSTypeAlgorithm clustering. |
class |
SimpleCOP<V extends NumberVector>
Algorithm to compute local correlation outlier probability.
|
Modifier and Type | Class and Description |
---|---|
class |
ABOD<V extends NumberVector>
Angle-Based Outlier Detection / Angle-Based Outlier Factor.
|
class |
FastABOD<V extends NumberVector>
Fast-ABOD (approximate ABOF) version of
Angle-Based Outlier Detection / Angle-Based Outlier Factor.
|
class |
LBABOD<V extends NumberVector>
LB-ABOD (lower-bound) version of
Angle-Based Outlier Detection / Angle-Based Outlier Factor.
|
Modifier and Type | Class and Description |
---|---|
class |
CBLOF<O extends NumberVector>
Cluster-based local outlier factor (CBLOF).
|
class |
SilhouetteOutlierDetection<O>
Outlier detection by using the Silhouette Coefficients.
|
Modifier and Type | Class and Description |
---|---|
class |
AbstractDBOutlier<O>
Simple distance based outlier detection algorithms.
|
class |
DBOutlierDetection<O>
Simple distance-based outlier detection algorithm.
|
class |
DBOutlierScore<O>
Compute percentage of neighbors in the given neighborhood with size d.
|
class |
HilOut<O extends NumberVector>
Fast Outlier Detection in High Dimensional Spaces
Outlier Detection using Hilbert space filling curves
Reference:
F.
|
class |
KNNDD<O>
Nearest Neighbor Data Description.
|
class |
KNNOutlier<O>
Outlier Detection based on the distance of an object to its k-th nearest
neighbor.
|
class |
KNNWeightOutlier<O>
Outlier Detection based on the accumulated distances of a point to its k
nearest neighbors.
|
class |
LocalIsolationCoefficient<O>
The Local Isolation Coefficient is the sum of the kNN distance and the
average distance to its k nearest neighbors.
|
class |
ODIN<O>
Outlier detection based on the in-degree of the kNN graph.
|
class |
ReferenceBasedOutlierDetection
Reference-Based Outlier Detection algorithm, an algorithm that computes kNN
distances approximately, using reference points.
|
class |
SOS<O>
Stochastic Outlier Selection.
|
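
The two kNN-based scores described above (distance to the k-th nearest neighbor, and accumulated distance to the k nearest neighbors) can be sketched on one-dimensional data as follows. This is an illustration under stated assumptions, not the ELKI implementation: the class `KnnOutlierSketch` is hypothetical, it uses a brute-force scan instead of an index, and it does not count the query point as its own neighbor, which implementations may handle differently.

```java
import java.util.Arrays;

/** Plain-Java sketch of kNN-based outlier scores; hypothetical names, not the ELKI API. */
final class KnnOutlierSketch {
  /** Ascending distances from point i to all other points (the point itself is excluded). */
  static double[] sortedDistances(double[] x, int i) {
    double[] d = new double[x.length - 1];
    for (int j = 0, p = 0; j < x.length; j++) {
      if (j != i) d[p++] = Math.abs(x[i] - x[j]);
    }
    Arrays.sort(d);
    return d;
  }

  /** Score in the style of KNNOutlier: distance to the k-th nearest neighbor. */
  static double knnScore(double[] x, int i, int k) {
    return sortedDistances(x, i)[k - 1];
  }

  /** Score in the style of KNNWeightOutlier: sum of distances to the k nearest neighbors. */
  static double knnWeightScore(double[] x, int i, int k) {
    double[] d = sortedDistances(x, i);
    double sum = 0;
    for (int r = 0; r < k; r++) sum += d[r];
    return sum;
  }

  public static void main(String[] args) {
    double[] x = {0, 1, 2, 3, 10};
    for (int i = 0; i < x.length; i++) {
      System.out.println(x[i] + ": kNN=" + knnScore(x, i, 2) + " weight=" + knnWeightScore(x, i, 2));
    }
  }
}
```

With k = 2, the isolated value 10 receives the largest score under both definitions, while the values inside the dense range 0..3 receive small scores.
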
Modifier and Type | Method and Description |
---|---|
protected static double |
SOS.estimateInitialBeta(DBIDRef ignore,
DoubleDBIDListIter it,
double perplexity)
Estimate beta from the distances in a row.
|
Modifier and Type | Class and Description |
---|---|
class |
ParallelKNNOutlier<O>
Parallel implementation of KNN Outlier detection.
|
class |
ParallelKNNWeightOutlier<O>
Parallel implementation of KNN Weight Outlier detection.
|
Modifier and Type | Class and Description |
---|---|
class |
IDOS<O>
Intrinsic Dimensional Outlier Detection in High-Dimensional Data.
|
class |
IntrinsicDimensionalityOutlier<O>
Use intrinsic dimensionality for outlier detection.
|
class |
ISOS<O>
Intrinsic Stochastic Outlier Selection.
|
Modifier and Type | Class and Description |
---|---|
class |
ALOCI<O extends NumberVector>
Fast Outlier Detection Using the "approximate Local Correlation Integral".
|
class |
COF<O>
Connectivity-based Outlier Factor (COF).
|
class |
FlexibleLOF<O>
Flexible variant of the "Local Outlier Factor" algorithm.
|
class |
INFLO<O>
Influence Outliers using Symmetric Relationship (INFLO) using two-way search,
is an outlier detection method based on LOF; but also using the reverse kNN.
|
class |
KDEOS<O>
Generalized Outlier Detection with Flexible Kernel Density Estimates.
|
class |
LDF<O extends NumberVector>
Outlier Detection with Kernel Density Functions.
|
class |
LDOF<O>
Computes the LDOF (Local Distance-Based Outlier Factor) for all objects of a
Database.
|
class |
LOCI<O>
Fast Outlier Detection Using the "Local Correlation Integral".
|
class |
LOF<O>
Algorithm to compute density-based local outlier factors in a database based
on a specified parameter
-lof.k . |
class |
LoOP<O>
LoOP: Local Outlier Probabilities
Distance/density based algorithm similar to LOF to detect outliers, but with
statistical methods to achieve better result stability.
|
class |
SimplifiedLOF<O>
A simplified version of the original LOF algorithm, which does not use the
reachability distance, yielding less stable results on inliers.
|
class |
VarianceOfVolume<O extends SpatialComparable>
Variance of Volume for outlier detection.
|
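
Many classes in the table above build on the Local Outlier Factor, which compares the local reachability density of a point with that of its neighbors. The following brute-force sketch on one-dimensional data follows the textbook definition (k-distance, reachability distance, local reachability density, LOF). It is not the ELKI `LOF` class: the name `LofSketch` is hypothetical, ties at the k-distance are ignored, and duplicate points (which would give a zero reachability sum) are assumed not to occur.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Plain-Java sketch of the textbook LOF computation; hypothetical names, not the ELKI API. */
final class LofSketch {
  /** Computes LOF scores for 1-D data; requires 1 <= k <= x.length - 1 and no duplicates. */
  static double[] lof(double[] x, int k) {
    int n = x.length;
    int[][] nn = new int[n][k]; // k nearest neighbors of each point (self excluded)
    double[] kdist = new double[n]; // distance to the k-th nearest neighbor
    for (int i = 0; i < n; i++) {
      final int q = i;
      Integer[] idx = new Integer[n - 1];
      for (int j = 0, p = 0; j < n; j++) {
        if (j != i) idx[p++] = j;
      }
      Arrays.sort(idx, Comparator.comparingDouble((Integer j) -> Math.abs(x[q] - x[j])));
      for (int r = 0; r < k; r++) nn[i][r] = idx[r];
      kdist[i] = Math.abs(x[i] - x[nn[i][k - 1]]);
    }
    // Local reachability density: inverse of the mean reachability distance,
    // where reach-dist(i, o) = max(k-distance(o), d(i, o)).
    double[] lrd = new double[n];
    for (int i = 0; i < n; i++) {
      double sum = 0;
      for (int o : nn[i]) sum += Math.max(kdist[o], Math.abs(x[i] - x[o]));
      lrd[i] = k / sum;
    }
    // LOF: mean ratio of the neighbors' density to the point's own density.
    double[] lof = new double[n];
    for (int i = 0; i < n; i++) {
      double sum = 0;
      for (int o : nn[i]) sum += lrd[o];
      lof[i] = sum / (k * lrd[i]);
    }
    return lof;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(lof(new double[] {0, 1, 2, 3, 4, 20}, 2)));
  }
}
```

Scores near 1 indicate a density similar to the neighborhood, and values well above 1 indicate a point in a much sparser region than its neighbors; in the example the isolated value 20 gets by far the largest score.
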
Modifier and Type | Class and Description |
---|---|
class |
ParallelLOF<O>
Parallel implementation of Local Outlier Factor using processors.
|
class |
ParallelSimplifiedLOF<O>
Parallel implementation of Simplified-LOF Outlier detection using processors.
|
Modifier and Type | Class and Description |
---|---|
class |
FeatureBagging
A simple ensemble method called "Feature bagging" for outlier detection.
|
class |
HiCS<V extends NumberVector>
Algorithm to compute High Contrast Subspaces for Density-Based Outlier
Ranking.
|
Modifier and Type | Class and Description |
---|---|
class |
CTLuGLSBackwardSearchAlgorithm<V extends NumberVector>
GLS-Backward Search is a statistical approach to detecting spatial outliers.
|
class |
CTLuMeanMultipleAttributes<N,O extends NumberVector>
Mean Approach is used to discover spatial outliers with multiple attributes.
|
class |
CTLuMedianAlgorithm<N>
Median Algorithm of C.
|
class |
CTLuMedianMultipleAttributes<N,O extends NumberVector>
Median Approach is used to discover spatial outliers with multiple
attributes.
|
class |
CTLuMoranScatterplotOutlier<N>
Moran scatterplot outliers, based on the standardized deviation from the
local and global means.
|
class |
CTLuRandomWalkEC<P>
Spatial outlier detection based on random walks.
|
class |
CTLuScatterplotOutlier<N>
Scatterplot-outlier is a spatial outlier detection method that performs a
linear regression of object attributes and their neighbors' average value.
|
class |
CTLuZTestOutlier<N>
Detect outliers by comparing their attribute value to the mean and standard
deviation of their neighborhood.
|
class |
SLOM<N,O>
SLOM: a new measure for local spatial outliers
Reference:
S.
|
class |
SOF<N,O>
The Spatial Outlier Factor (SOF) is a spatial
LOF variation. |
class |
TrimmedMeanApproach<N>
A Trimmed Mean Approach to Finding Spatial Outliers.
|
Modifier and Type | Class and Description |
---|---|
class |
AbstractAggarwalYuOutlier<V extends NumberVector>
Abstract base class for the sparse-grid-cell based outlier detection of
Aggarwal and Yu.
|
class |
AggarwalYuEvolutionary<V extends NumberVector>
Evolutionary variant (EAFOD) of the high-dimensional outlier detection
algorithm by Aggarwal and Yu.
|
class |
AggarwalYuNaive<V extends NumberVector>
Brute-force variant of the high-dimensional outlier detection algorithm by
Aggarwal and Yu.
|
class |
OutRankS1
OutRank: ranking outliers in high dimensional data.
|
class |
OUTRES
Adaptive outlierness for subspace outlier ranking (OUTRES).
|
class |
SOD<V extends NumberVector>
Subspace Outlier Degree.
|
Modifier and Type | Class and Description |
---|---|
class |
LibSVMOneClassOutlierDetection<V extends NumberVector>
Outlier-detection using one-class support vector machines.
|
Modifier and Type | Class and Description |
---|---|
class |
BarnesHutTSNE<O>
tSNE using Barnes-Hut-Approximation.
|
class |
GaussianAffinityMatrixBuilder<O>
Compute the affinity matrix for SNE and tSNE using a Gaussian distribution
with a constant sigma.
|
class |
IntrinsicNearestNeighborAffinityMatrixBuilder<O>
Build sparse affinity matrix using the nearest neighbors only, adjusting for
intrinsic dimensionality.
|
class |
NearestNeighborAffinityMatrixBuilder<O>
Build sparse affinity matrix using the nearest neighbors only.
|
class |
PerplexityAffinityMatrixBuilder<O>
Compute the affinity matrix for SNE and tSNE.
|
class |
SNE<O>
Stochastic Neighbor Embedding is a projection technique designed for
visualization that tries to preserve the nearest neighbor structure.
|
class |
TSNE<O>
t-Stochastic Neighbor Embedding is a projection technique designed for
visualization that tries to preserve the nearest neighbor structure.
|
Modifier and Type | Class and Description |
---|---|
class |
HopkinsStatisticClusteringTendency
The Hopkins Statistic of Clustering Tendency measures the probability that a
data set is generated by a uniform data distribution.
|
Modifier and Type | Class and Description |
---|---|
class |
SigniTrendChangeDetection
Signi-Trend detection algorithm applies to a single time-series.
|
Modifier and Type | Field and Description |
---|---|
static java.lang.String |
AbstractApplication.REFERENCE
Information for citation and version.
|
Modifier and Type | Class and Description |
---|---|
class |
VisualizeGeodesicDistances
Visualization function for Cross-track, Along-track, and minimum distance
function.
|
Modifier and Type | Class and Description |
---|---|
class |
ComputeKNNOutlierScores<O extends NumberVector>
Application that runs a series of kNN-based algorithms on a data set, for
building an ensemble in a second step.
|
class |
GreedyEnsembleExperiment
Class to load an outlier detection summary file, as produced by
ComputeKNNOutlierScores, and compute a naive ensemble for it. |
class |
VisualizePairwiseGainMatrix
Class to load an outlier detection summary file, as produced by
ComputeKNNOutlierScores, and compute a matrix with the pairwise
gains. |
Modifier and Type | Class and Description |
---|---|
class |
AchlioptasRandomProjectionFamily
Random projections as suggested by Dimitris Achlioptas.
|
class |
CauchyRandomProjectionFamily
Random projections using Cauchy distributions (1-stable).
|
class |
GaussianRandomProjectionFamily
Random projections using Gaussian distributions (2-stable).
|
class |
RandomSubsetProjectionFamily
Random projection family based on selecting random features.
|
class |
SimplifiedRandomHyperplaneProjectionFamily
Random hyperplane projection family.
|
Modifier and Type | Class and Description |
---|---|
(package private) class |
IntegerDBIDArrayQuickSort
Class to sort an integer DBID array, using a modified quicksort.
|
Modifier and Type | Class and Description |
---|---|
class |
LinearDiscriminantAnalysisFilter<V extends NumberVector>
Linear Discriminant Analysis (LDA) / Fisher's linear discriminant.
|
class |
PerturbationFilter<V extends NumberVector>
A filter to perturb the values by adding micro-noise.
|
Modifier and Type | Class and Description |
---|---|
class |
CanberraDistanceFunction
Canberra distance function, a variation of Manhattan distance.
|
class |
ClarkDistanceFunction
Clark distance function for vector spaces.
|
class |
MahalanobisDistanceFunction
Mahalanobis quadratic form distance for feature vectors.
|
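As an illustration of the Canberra distance described above, here is a minimal Java sketch. It is not the ELKI implementation; the class name `CanberraSketch` is hypothetical, and treating a 0/0 term as 0 is an assumption made for this example.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class CanberraSketch {
  /** Canberra distance: sum over i of |x_i - y_i| / (|x_i| + |y_i|). */
  static double canberra(double[] x, double[] y) {
    if (x.length != y.length) {
      throw new IllegalArgumentException("Dimensionality mismatch");
    }
    double sum = 0.;
    for (int i = 0; i < x.length; i++) {
      final double denom = Math.abs(x[i]) + Math.abs(y[i]);
      // Assumption: a term where both values are 0 contributes 0.
      if (denom > 0.) {
        sum += Math.abs(x[i] - y[i]) / denom;
      }
    }
    return sum;
  }
}
```

Because each term is scaled by the magnitude of its coordinates, small absolute differences near zero weigh more than in the plain Manhattan distance.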
Modifier and Type | Class and Description |
---|---|
class |
HistogramIntersectionDistanceFunction
Intersection distance for color histograms.
|
class |
HSBHistogramQuadraticDistanceFunction
Distance function for HSB color histograms based on a quadratic form and
color similarity.
|
class |
RGBHistogramQuadraticDistanceFunction
Distance function for RGB color histograms based on a quadratic form and
color similarity.
|
Modifier and Type | Class and Description |
---|---|
class |
DimensionSelectingLatLngDistanceFunction
Distance function for 2D vectors in Latitude, Longitude form.
|
class |
LatLngDistanceFunction
Distance function for 2D vectors in Latitude, Longitude form.
|
class |
LngLatDistanceFunction
Distance function for 2D vectors in Longitude, Latitude form.
|
Modifier and Type | Class and Description |
---|---|
class |
HistogramMatchDistanceFunction
Distance function based on histogram matching, i.e., Manhattan distance on
the cumulative density function.
|
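The description above (Manhattan distance on the cumulative density function) can be sketched as follows. This is an illustrative reading of that sentence, not the ELKI code; the class and method names are hypothetical.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class HistogramMatchSketch {
  /** Sum of absolute differences between the running (cumulative) sums. */
  static double distance(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("Dimensionality mismatch");
    }
    double cumA = 0., cumB = 0., sum = 0.;
    for (int i = 0; i < a.length; i++) {
      cumA += a[i];
      cumB += b[i];
      sum += Math.abs(cumA - cumB);
    }
    return sum;
  }
}
```

Working on cumulative sums means that mass shifted to a neighboring bin costs less than mass shifted to a distant bin, which a bin-by-bin Manhattan distance cannot express.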
Modifier and Type | Class and Description |
---|---|
class |
ChiSquaredDistanceFunction
χ² distance function, symmetric version.
|
class |
KullbackLeiblerDivergenceAsymmetricDistanceFunction
Kullback-Leibler divergence, also known as relative entropy,
information deviation, or just KL-distance (albeit asymmetric).
|
class |
KullbackLeiblerDivergenceReverseAsymmetricDistanceFunction
Kullback-Leibler divergence, also known as relative entropy, information
deviation or just KL-distance (albeit asymmetric).
|
class |
SqrtJensenShannonDivergenceDistanceFunction
The square root of Jensen-Shannon divergence is a metric.
|
class |
TriangularDiscriminationDistanceFunction
Triangular Discrimination has relatively tight upper and lower bounds to the
Jensen-Shannon divergence, but is much less expensive.
|
class |
TriangularDistanceFunction
Triangular Distance has relatively tight upper and lower bounds to the
(square root of the) Jensen-Shannon divergence, but is much less expensive.
|
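To make the asymmetry of the Kullback-Leibler divergence and the symmetric Jensen-Shannon construction concrete, here is a small Java sketch using the textbook definitions with natural logarithms. The class name `DivergenceSketch` is hypothetical, and ELKI may use a different normalization or argument order.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class DivergenceSketch {
  /** KL(p || q) = sum of p_i * ln(p_i / q_i); terms with p_i = 0 contribute 0. */
  static double kl(double[] p, double[] q) {
    if (p.length != q.length) {
      throw new IllegalArgumentException("Dimensionality mismatch");
    }
    double sum = 0.;
    for (int i = 0; i < p.length; i++) {
      if (p[i] > 0.) {
        // Evaluates to infinity when q_i = 0 but p_i > 0.
        sum += p[i] * Math.log(p[i] / q[i]);
      }
    }
    return sum;
  }

  /** Square root of the Jensen-Shannon divergence, with m = (p + q) / 2. */
  static double sqrtJensenShannon(double[] p, double[] q) {
    if (p.length != q.length) {
      throw new IllegalArgumentException("Dimensionality mismatch");
    }
    final double[] m = new double[p.length];
    for (int i = 0; i < p.length; i++) {
      m[i] = .5 * (p[i] + q[i]);
    }
    final double js = .5 * kl(p, m) + .5 * kl(q, m);
    // Guard against tiny negative values caused by rounding.
    return Math.sqrt(Math.max(0., js));
  }
}
```

Swapping the arguments of `kl` generally changes the result, while `sqrtJensenShannon` is symmetric by construction.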
Modifier and Type | Class and Description |
---|---|
class |
HammingDistanceFunction
Computes the Hamming distance of arbitrary vectors, i.e., counts the number
of positions in which they differ.
|
class |
JaccardSimilarityDistanceFunction
A flexible extension of Jaccard similarity to non-binary vectors.
|
Modifier and Type | Class and Description |
---|---|
class |
LevenshteinDistanceFunction
Classic Levenshtein distance on strings.
|
class |
NormalizedLevenshteinDistanceFunction
Levenshtein distance on strings, normalized by string length.
|
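For reference, the classic Levenshtein distance can be computed with the standard two-row dynamic program sketched below. This is a generic textbook version, not the ELKI code; the class name is hypothetical, and the length normalization of the second class is deliberately not shown because its exact form is not stated here.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class LevenshteinSketch {
  /** Minimum number of single-character insertions, deletions and substitutions. */
  static int levenshtein(String s, String t) {
    int[] prev = new int[t.length() + 1];
    int[] curr = new int[t.length() + 1];
    // Distance from the empty prefix of s to each prefix of t.
    for (int j = 0; j <= t.length(); j++) {
      prev[j] = j;
    }
    for (int i = 1; i <= s.length(); i++) {
      curr[0] = i;
      for (int j = 1; j <= t.length(); j++) {
        final int subst = prev[j - 1] + (s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1);
        final int indel = Math.min(prev[j], curr[j - 1]) + 1;
        curr[j] = Math.min(subst, indel);
      }
      // Reuse the two rows instead of allocating a full matrix.
      final int[] tmp = prev;
      prev = curr;
      curr = tmp;
    }
    return prev[t.length()];
  }
}
```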
Modifier and Type | Class and Description |
---|---|
class |
DerivativeDTWDistanceFunction
Derivative Dynamic Time Warping distance for numerical vectors.
|
class |
DTWDistanceFunction
Dynamic Time Warping distance (DTW) for numerical vectors.
|
class |
EDRDistanceFunction
Edit Distance on Real Sequence distance for numerical vectors.
|
class |
ERPDistanceFunction
Edit Distance With Real Penalty distance for numerical vectors.
|
class |
LCSSDistanceFunction
Longest Common Subsequence distance for numerical vectors.
|
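The basic Dynamic Time Warping recurrence behind the classes above can be sketched as follows. This is a simplified textbook variant: it assumes the absolute difference as local cost and uses no warping band, both of which are assumptions of this example rather than statements about ELKI. The class name is hypothetical.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the ELKI implementation. */
final class DtwSketch {
  /** Unconstrained DTW with |x_i - y_j| as local cost (assumption). */
  static double dtw(double[] x, double[] y) {
    final int m = y.length;
    double[] prev = new double[m + 1];
    double[] curr = new double[m + 1];
    Arrays.fill(prev, Double.POSITIVE_INFINITY);
    prev[0] = 0.; // Aligning two empty prefixes costs nothing.
    for (int i = 1; i <= x.length; i++) {
      curr[0] = Double.POSITIVE_INFINITY;
      for (int j = 1; j <= m; j++) {
        final double cost = Math.abs(x[i - 1] - y[j - 1]);
        // Cheapest predecessor: diagonal match, or a repeat of x_i or y_j.
        curr[j] = cost + Math.min(prev[j - 1], Math.min(prev[j], curr[j - 1]));
      }
      final double[] tmp = prev;
      prev = curr;
      curr = tmp;
    }
    return prev[m];
  }
}
```

Unlike a point-by-point distance, the warping path may repeat elements, so a series and a locally stretched copy of it can have distance 0.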
Modifier and Type | Class and Description |
---|---|
class |
Kulczynski1SimilarityFunction
Kulczynski similarity 1.
|
class |
Kulczynski2SimilarityFunction
Kulczynski similarity 2.
|
Modifier and Type | Class and Description |
---|---|
class |
ClusteringAdjustedRandIndexSimilarityFunction
Measure the similarity of clusters via the Adjusted Rand Index.
|
class |
ClusteringBCubedF1SimilarityFunction
Measure the similarity of clusters via the BCubed F1 Index.
|
class |
ClusteringFowlkesMallowsSimilarityFunction
Measure the similarity of clusters via the Fowlkes-Mallows Index.
|
class |
ClusteringRandIndexSimilarityFunction
Measure the similarity of clusters via the Rand Index.
|
class |
ClusterJaccardSimilarityFunction
Measure the similarity of clusters via the Jaccard coefficient.
|
Modifier and Type | Class and Description |
---|---|
class |
BCubed
BCubed measures.
|
class |
EditDistance
Edit distance measures.
|
class |
Entropy
Entropy based measures.
|
class |
SetMatchingPurity
Set matching purity measures.
|
Modifier and Type | Method and Description |
---|---|
double |
PairCounting.adjustedRandIndex()
Computes the adjusted Rand index (ARI).
|
double |
SetMatchingPurity.f1Measure()
Get the set matching F1-Measure
|
double |
SetMatchingPurity.fMeasureFirst()
Get Van Rijsbergen’s F measure (asymmetric) for the first clustering
|
double |
SetMatchingPurity.fMeasureSecond()
Get Van Rijsbergen’s F measure (asymmetric) for the second clustering
|
double |
PairCounting.fowlkesMallows()
Computes the pair-counting Fowlkes-Mallows index (flat only, non-hierarchical!)
|
double |
PairCounting.jaccard()
Computes the Jaccard index
|
long |
PairCounting.mirkin()
Computes the Mirkin index, aka Equivalence Mismatch Distance.
|
double |
Entropy.normalizedVariationOfInformation()
Get the normalized variation of information (normalized, 0 = equal):
NVI = 1 - NMI_Joint
|
double |
SetMatchingPurity.purity()
Get the set matching purity (first:second clustering)
(normalized, 1 = equal)
|
double |
PairCounting.randIndex()
Computes the Rand index (RI).
|
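The pair-counting measures listed above can all be derived from four pair counts. The Java sketch below uses the standard textbook formulas; the class name and the parameter names (`both`, `firstOnly`, `secondOnly`, `neither`) are hypothetical and do not reflect the fields of `PairCounting`.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class PairCountingSketch {
  // Pairs clustered together in both clusterings, in only one of them, or in neither.
  final long both, firstOnly, secondOnly, neither;

  PairCountingSketch(long both, long firstOnly, long secondOnly, long neither) {
    this.both = both;
    this.firstOnly = firstOnly;
    this.secondOnly = secondOnly;
    this.neither = neither;
  }

  /** Rand index: fraction of pairs on which both clusterings agree. */
  double randIndex() {
    return (both + neither) / (double) (both + firstOnly + secondOnly + neither);
  }

  /** Jaccard index: like the Rand index, but ignoring pairs separated in both. */
  double jaccard() {
    return both / (double) (both + firstOnly + secondOnly);
  }

  /** Fowlkes-Mallows: geometric mean of pair precision and pair recall. */
  double fowlkesMallows() {
    return both / Math.sqrt((double) (both + firstOnly) * (both + secondOnly));
  }

  /** Mirkin index: twice the number of pairs on which the clusterings disagree. */
  long mirkin() {
    return 2 * (firstOnly + secondOnly);
  }
}
```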
Modifier and Type | Class and Description |
---|---|
class |
EvaluateCIndex<O>
Compute the C-index of a data set.
|
class |
EvaluateConcordantPairs<O>
Compute the Gamma Criterion of a data set.
|
class |
EvaluateDaviesBouldin
Compute the Davies-Bouldin index of a data set.
|
class |
EvaluateDBCV<O>
Compute the Density-Based Clustering Validation Index.
|
class |
EvaluatePBMIndex
Compute the PBM index of a clustering
|
class |
EvaluateSilhouette<O>
Compute the silhouette of a data set.
|
class |
EvaluateVarianceRatioCriteria<O>
Compute the Variance Ratio Criteria of a data set, also known as
Calinski-Harabasz index.
|
Modifier and Type | Method and Description |
---|---|
double |
EvaluateConcordantPairs.computeTau(long c,
long d,
double m,
long wd,
long bd)
Compute the Tau correlation measure
|
Modifier and Type | Class and Description |
---|---|
class |
ClusterPairSegmentAnalysis
Evaluate clustering results by building segments for their pairs: shared
pairs and differences.
|
class |
Segments
Creates segments of two or more clusterings.
|
Modifier and Type | Class and Description |
---|---|
class |
OutlierSmROCCurve
Smooth ROC curves are a variation of classic ROC curves that takes the scores
into account.
|
Modifier and Type | Class and Description |
---|---|
class |
DCGEvaluation
Discounted Cumulative Gain.
|
class |
NDCGEvaluation
Normalized Discounted Cumulative Gain.
|
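A minimal Java sketch of discounted cumulative gain and its normalized form follows. It assumes binary relevance and the common log2(rank + 1) discount; these choices, like the class name `DcgSketch`, are assumptions of this example and not taken from the ELKI classes.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class DcgSketch {
  private static double log2(double x) {
    return Math.log(x) / Math.log(2.);
  }

  /** DCG of a ranking; relevant[i] says whether the item at rank i + 1 is relevant. */
  static double dcg(boolean[] relevant) {
    double sum = 0.;
    for (int i = 0; i < relevant.length; i++) {
      if (relevant[i]) {
        sum += 1. / log2(i + 2.); // rank = i + 1, discount = log2(rank + 1)
      }
    }
    return sum;
  }

  /** NDCG: DCG divided by the DCG of the ideal ranking (all relevant items first). */
  static double ndcg(boolean[] relevant) {
    int positives = 0;
    for (boolean r : relevant) {
      if (r) {
        positives++;
      }
    }
    double ideal = 0.;
    for (int i = 0; i < positives; i++) {
      ideal += 1. / log2(i + 2.);
    }
    return ideal > 0. ? dcg(relevant) / ideal : 0.;
  }
}
```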
Modifier and Type | Class and Description |
---|---|
class |
EuclideanHashFunctionFamily
2-stable hash function family for Euclidean distances.
|
class |
ManhattanHashFunctionFamily
1-stable hash function family for Manhattan distances.
|
Modifier and Type | Class and Description |
---|---|
class |
CosineLocalitySensitiveHashFunction
Random projection family to use with sparse vectors.
|
class |
MultipleProjectionsLocalitySensitiveHashFunction
LSH hash function for vector space data.
|
Modifier and Type | Class and Description |
---|---|
class |
RandomProjectedNeighborsAndDensities<V extends NumberVector>
Random Projections used for computing neighbors and density estimates.
|
Modifier and Type | Class and Description |
---|---|
class |
NaiveProjectedKNNPreprocessor<O extends NumberVector>
Compute the approximate k nearest neighbors using 1 dimensional projections.
|
class |
NNDescent<O>
NN-descent (also known as KNNGraph) is an approximate nearest neighbor search
algorithm beginning with a random sample, then iteratively refining this
sample.
|
class |
RandomSampleKNNPreprocessor<O>
Class that computes the kNN only on a random sample.
|
class |
SpacefillingKNNPreprocessor<O extends NumberVector>
Compute the nearest neighbors approximately using space filling curves.
|
class |
SpacefillingMaterializeKNNPreprocessor<O extends NumberVector>
Compute the nearest neighbors approximately using space filling curves.
|
Modifier and Type | Class and Description |
---|---|
class |
HiSCPreferenceVectorIndex<V extends NumberVector>
Preprocessor for HiSC preference vector assignment to objects of a certain
database.
|
Modifier and Type | Class and Description |
---|---|
class |
PINN<O extends NumberVector>
Projection-Indexed nearest-neighbors (PINN) is an index to retrieve the
nearest neighbors in high dimensional spaces by using a random projection
based index.
|
Modifier and Type | Class and Description |
---|---|
class |
CoverTree<O>
Cover tree data structure (in-memory).
|
Modifier and Type | Class and Description |
---|---|
class |
MTree<O>
MTree is a metrical index structure based on the concepts of the M-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
MinimumEnlargementInsert<N extends AbstractMTreeNode<?,N,E>,E extends MTreeEntry>
Minimum enlargement insert - default insertion strategy for the M-tree.
|
Modifier and Type | Class and Description |
---|---|
class |
MLBDistSplit<E extends MTreeEntry,N extends AbstractMTreeNode<?,N,E>>
Encapsulates the required methods for a split of a node in an M-Tree.
|
class |
MMRadSplit<E extends MTreeEntry,N extends AbstractMTreeNode<?,N,E>>
Encapsulates the required methods for a split of a node in an M-Tree.
|
class |
MRadSplit<E extends MTreeEntry,N extends AbstractMTreeNode<?,N,E>>
Encapsulates the required methods for a split of a node in an M-Tree.
|
class |
MSTSplit<E extends MTreeEntry,N extends AbstractMTreeNode<?,N,E>>
Splitting algorithm using the minimum spanning tree (MST), as proposed by the
Slim-Tree variant.
|
class |
RandomSplit<E extends MTreeEntry,N extends AbstractMTreeNode<?,N,E>>
Encapsulates the required methods for a split of a node in an M-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
BalancedDistribution
Balanced entry distribution strategy of the M-tree.
|
class |
GeneralizedHyperplaneDistribution
Generalized hyperplane entry distribution strategy of the M-tree.
|
Modifier and Type | Class and Description |
---|---|
class |
MinimalisticMemoryKDTree<O extends NumberVector>
Simple implementation of a static in-memory K-D-tree.
|
class |
SmallMemoryKDTree<O extends NumberVector>
Simple implementation of a static in-memory K-D-tree.
|
Modifier and Type | Class and Description |
---|---|
class |
EuclideanRStarTreeKNNQuery<O extends NumberVector>
Instance of a KNN query for a particular spatial index.
|
class |
EuclideanRStarTreeRangeQuery<O extends NumberVector>
Instance of a range query for a particular spatial index.
|
class |
RStarTreeKNNQuery<O extends SpatialComparable>
Instance of a KNN query for a particular spatial index.
|
class |
RStarTreeRangeQuery<O extends SpatialComparable>
Instance of a range query for a particular spatial index.
|
Modifier and Type | Class and Description |
---|---|
class |
RStarTree
RStarTree is a spatial index structure based on the concepts of the R*-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
OneDimSortBulkSplit
Simple bulk loading strategy by sorting the data along the first dimension.
|
class |
SortTileRecursiveBulkSplit
Sort-Tile-Recursive aims at tiling the data space with a grid-like structure
for partitioning the dataset into the required number of buckets.
|
class |
SpatialSortBulkSplit
Bulk loading by spatially sorting the objects, then partitioning the sorted
list appropriately.
|
Modifier and Type | Class and Description |
---|---|
class |
ApproximativeLeastOverlapInsertionStrategy
The choose subtree method proposed by the R*-Tree with slightly better
performance for large leaf sizes (linear approximation).
|
class |
CombinedInsertionStrategy
Use two different insertion strategies for directory and leaf nodes.
|
class |
LeastEnlargementInsertionStrategy
The default R-Tree insertion strategy: find rectangle with least volume
enlargement.
|
class |
LeastEnlargementWithAreaInsertionStrategy
A slight modification of the default R-Tree insertion strategy: find
rectangle with least volume enlargement, but choose least area on ties.
|
class |
LeastOverlapInsertionStrategy
The choose subtree method proposed by the R*-Tree for leaf nodes.
|
Modifier and Type | Class and Description |
---|---|
class |
LimitedReinsertOverflowTreatment
Limited reinsertions, as proposed by the R*-Tree: For each real insert, allow
reinsertions to happen only once per level.
|
Modifier and Type | Class and Description |
---|---|
class |
CloseReinsert
Reinsert objects on page overflow, starting with close objects first (even
when they will likely be inserted into the same page again!)
|
class |
FarReinsert
Reinsert objects on page overflow, starting with farther objects first (even
when they will likely be inserted into the same page again!)
|
Modifier and Type | Class and Description |
---|---|
class |
AngTanLinearSplit
Linear-time complexity split proposed by Ang and Tan.
|
class |
GreeneSplit
Quadratic-time complexity split as used by Diane Greene for the R-Tree.
|
class |
RTreeLinearSplit
Linear-time complexity greedy split as used by the original R-Tree.
|
class |
RTreeQuadraticSplit
Quadratic-time complexity greedy split as used by the original R-Tree.
|
class |
TopologicalSplitter
Encapsulates the required parameters for a topological split of an R*-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
DAFile
Dimension approximation file, a one-dimensional part of the
PartialVAFile. |
class |
PartialVAFile<V extends NumberVector>
PartialVAFile.
|
class |
VAFile<V extends NumberVector>
Vector-approximation file (VAFile)
|
Modifier and Type | Method and Description |
---|---|
static double |
Mean.highPrecision(double... data)
Static helper function, with extra precision
|
Modifier and Type | Class and Description |
---|---|
class |
SphereUtil
Class with utility functions for distance computations on the sphere.
|
Modifier and Type | Method and Description |
---|---|
static double |
SphereUtil.ellipsoidVincentyFormulaRad(double f,
double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points.
|
static double |
SphereUtil.haversineFormulaRad(double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points using the
Haversine formula
Complexity: 5 trigonometric functions, 1-2 sqrt.
|
static double |
SphereUtil.latlngMinDistDeg(double plat,
double plng,
double rminlat,
double rminlng,
double rmaxlat,
double rmaxlng)
Point to rectangle minimum distance.
|
static double |
SphereUtil.latlngMinDistRad(double plat,
double plng,
double rminlat,
double rminlng,
double rmaxlat,
double rmaxlng)
Point to rectangle minimum distance.
|
static double |
SphereUtil.latlngMinDistRadFull(double plat,
double plng,
double rminlat,
double rminlng,
double rmaxlat,
double rmaxlng)
Point to rectangle minimum distance.
|
static double |
SphereUtil.sphericalVincentyFormulaRad(double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points.
|
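The Haversine computation mentioned above can be sketched in a few lines of Java. This is a textbook formulation on the unit sphere with all angles in radians, not the ELKI source; the class name is hypothetical. It uses five trigonometric calls and two square roots, in line with the complexity note in the table.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class HaversineSketch {
  /** Great-circle distance in radians; multiply by the sphere radius for a length. */
  static double haversineRad(double lat1, double lon1, double lat2, double lon2) {
    final double slat = Math.sin((lat1 - lat2) * .5);
    final double slon = Math.sin((lon1 - lon2) * .5);
    final double a = slat * slat + slon * slon * Math.cos(lat1) * Math.cos(lat2);
    // Clamp 1 - a at 0 to protect against rounding for antipodal points.
    return 2. * Math.atan2(Math.sqrt(a), Math.sqrt(Math.max(0., 1. - a)));
  }
}
```

For input in degrees, convert with `Math.toRadians` first; the result is an angle, so a distance on Earth additionally needs an assumed radius.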
Modifier and Type | Class and Description |
---|---|
class |
GrahamScanConvexHull2D
Classes to compute the convex hull of a set of points in 2D, using the
classic Graham scan.
|
class |
PrimsMinimumSpanningTree
Prim's algorithm for finding the minimum spanning tree.
|
class |
SweepHullDelaunay2D
Compute the Convex Hull and/or Delaunay Triangulation, using the sweep-hull
approach of David Sinclair.
|
Modifier and Type | Method and Description |
---|---|
static double |
VMath.mahalanobisDistance(double[][] B,
double[] a,
double[] c)
Matrix multiplication, (a-c)^T * B * (a-c)
Note: it may (or may not) be more efficient to materialize (a-c), then use
transposeTimesTimes(a_minus_c, B, a_minus_c) instead. |
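The quadratic form in the description of `VMath.mahalanobisDistance` can be written out as below. This sketch materializes the difference vector first and evaluates the expression exactly as stated in the description; the class and method names are hypothetical, and it makes no claim about how ELKI implements or post-processes the value.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class QuadraticFormSketch {
  /** Evaluates (a-c)^T * B * (a-c) for a square matrix B. */
  static double quadraticForm(double[][] B, double[] a, double[] c) {
    final int n = a.length;
    if (c.length != n || B.length != n) {
      throw new IllegalArgumentException("Dimensionality mismatch");
    }
    final double[] d = new double[n];
    for (int i = 0; i < n; i++) {
      d[i] = a[i] - c[i];
    }
    double sum = 0.;
    for (int i = 0; i < n; i++) {
      double row = 0.;
      for (int j = 0; j < n; j++) {
        row += B[i][j] * d[j];
      }
      sum += d[i] * row;
    }
    return sum;
  }
}
```

With the identity matrix for `B`, the result is the squared Euclidean distance between `a` and `c`.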
Modifier and Type | Class and Description |
---|---|
class |
AutotuningPCA
Performs a self-tuning local PCA based on the covariance matrices of given
objects.
|
class |
WeightedCovarianceMatrixBuilder
CovarianceMatrixBuilder with weights. |
Modifier and Type | Class and Description |
---|---|
class |
BinarySplitSpatialSorter
Spatially sort the data set by repetitive binary splitting, circulating
through the dimensions.
|
class |
HilbertSpatialSorter
Sort objects along the Hilbert space-filling curve by mapping them to their
Hilbert numbers and sorting them.
|
class |
PeanoSpatialSorter
Bulk-load an R-tree index by presorting the objects with their position on
the Peano curve.
|
Modifier and Type | Class and Description |
---|---|
class |
DistanceCorrelationDependenceMeasure
Distance correlation.
|
class |
HoeffdingsDDependenceMeasure
Calculate Hoeffding's D as a measure of dependence.
|
class |
HSMDependenceMeasure
Compute the "interestingness" of dimension connections using the Hough
transformation.
|
class |
MCEDependenceMeasure
Compute a mutual information based dependence measure using a nested means
discretization, originally proposed for ordering axes in parallel coordinate
plots.
|
class |
SlopeDependenceMeasure
Arrange dimensions based on the entropy of the slope spectrum.
|
class |
SlopeInversionDependenceMeasure
Arrange dimensions based on the entropy of the slope spectrum.
|
Modifier and Type | Class and Description |
---|---|
class |
HaltonUniformDistribution
Halton sequences are a pseudo-uniform distribution.
|
class |
SkewGeneralizedNormalDistribution
Generalized normal distribution by adding a skew term, similar to lognormal
distributions.
|
Modifier and Type | Method and Description |
---|---|
static double |
NormalDistribution.cdf(double x,
double mu,
double sigma)
Cumulative distribution function (CDF) of a normal distribution.
|
protected static double |
GammaDistribution.chisquaredProbitApproximation(double p,
double nu,
double g)
Approximate probit for chi squared distribution
Based on first half of algorithm AS 91
|
private static double |
PoissonDistribution.devianceTerm(double x,
double np)
Evaluate the deviance term of the saddle point approximation.
|
static double |
GammaDistribution.digamma(double x)
Compute the Psi / Digamma function
|
static double |
NormalDistribution.erfc(double x)
Complementary error function for Gaussian distributions = Normal
distributions.
|
static double |
NormalDistribution.erfcinv(double y)
Inverse complementary error function.
|
static double |
PoissonDistribution.pmf(double x,
int n,
double p)
Poisson probability mass function (PMF) for integer values.
|
static double |
ChiSquaredDistribution.quantile(double x,
double dof)
Return the quantile function for this distribution
|
static double |
GammaDistribution.quantile(double p,
double k,
double theta)
Compute probit (inverse cdf) for Gamma distributions.
|
static double |
NormalDistribution.standardNormalCDF(double x)
Cumulative distribution function (CDF) of a normal distribution.
|
private static double |
PoissonDistribution.stirlingError(double n)
Calculates the Stirling Error
stirlerr(n) = ln(n!) - ln(sqrt(2*pi*n)*(n/e)^n)
|
private static double |
PoissonDistribution.stirlingError(int n)
Calculates the Stirling Error
stirlerr(n) = ln(n!) - ln(sqrt(2*pi*n)*(n/e)^n)
|
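As a worked illustration of the Stirling error, the sketch below assumes the usual definition: the difference between ln(n!) and the logarithm of Stirling's approximation sqrt(2*pi*n)*(n/e)^n. It evaluates that definition directly for small integers; the class name is hypothetical, and production code would typically use a series expansion or lookup table instead of this naive sum.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class StirlingErrorSketch {
  /** ln(n!) minus the logarithm of Stirling's approximation, for n >= 1. */
  static double stirlingError(int n) {
    if (n < 1) {
      throw new IllegalArgumentException("n must be at least 1");
    }
    double logFactorial = 0.;
    for (int i = 2; i <= n; i++) {
      logFactorial += Math.log(i);
    }
    // ln(sqrt(2*pi*n) * (n/e)^n) = 0.5*ln(2*pi*n) + n*ln(n) - n
    final double logStirling = .5 * Math.log(2. * Math.PI * n) + n * Math.log(n) - n;
    return logFactorial - logStirling;
  }
}
```

The error shrinks roughly like 1/(12n), which is why it can serve as a small correction term in saddle point approximations.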
Modifier and Type | Class and Description |
---|---|
class |
CauchyMADEstimator
Estimate Cauchy distribution parameters using Median and MAD.
|
class |
EMGOlivierNorbergEstimator
Naive distribution estimation using mean and sample variance.
|
class |
ExponentialLMMEstimator
Estimate the parameters of an Exponential Distribution, using the methods of
L-Moments (LMM).
|
class |
ExponentialMADEstimator
Estimate Exponential distribution parameters using Median and MAD.
|
class |
ExponentialMedianEstimator
Estimate Exponential distribution parameters using the median.
|
class |
GammaChoiWetteEstimator
Estimate distribution parameters using the method by Choi and Wette.
|
class |
GammaLMMEstimator
Estimate the parameters of a Gamma Distribution, using the methods of
L-Moments (LMM).
|
class |
GammaMOMEstimator
Simple parameter estimation for the Gamma distribution.
|
class |
GeneralizedExtremeValueLMMEstimator
Estimate the parameters of a Generalized Extreme Value Distribution, using
the methods of L-Moments (LMM).
|
class |
GeneralizedLogisticAlternateLMMEstimator
Estimate the parameters of a Generalized Logistic Distribution, using the
methods of L-Moments (LMM).
|
class |
GeneralizedParetoLMMEstimator
Estimate the parameters of a Generalized Pareto Distribution (GPD), using the
methods of L-Moments (LMM).
|
class |
GumbelLMMEstimator
Estimate the parameters of a Gumbel Distribution, using the methods of
L-Moments (LMM).
|
class |
GumbelMADEstimator
Parameter estimation via median and median absolute deviation from median
(MAD).
|
class |
LaplaceMADEstimator
Estimate Laplace distribution parameters using Median and MAD.
|
class |
LaplaceMLEEstimator
Estimate Laplace distribution parameters using Median and mean deviation from
median.
|
class |
LogisticLMMEstimator
Estimate the parameters of a Logistic Distribution, using the methods of
L-Moments (LMM).
|
class |
LogisticMADEstimator
Estimate Logistic distribution parameters using Median and MAD.
|
class |
LogLogisticMADEstimator
Estimate Log-Logistic distribution parameters using Median and MAD.
|
class |
LogNormalBilkovaLMMEstimator
Alternate estimate of the parameters of a log Normal Distribution, using the
methods of L-Moments (LMM) for the Generalized Normal Distribution.
|
class |
LogNormalLMMEstimator
Estimate the parameters of a log Normal Distribution, using the methods of
L-Moments (LMM) for the Generalized Normal Distribution.
|
class |
LogNormalLogMADEstimator
Estimator using Medians.
|
class |
NormalLMMEstimator
Estimate the parameters of a normal distribution using the method of
L-Moments (LMM).
|
class |
NormalMADEstimator
Estimator using Medians.
|
class |
RayleighMADEstimator
Estimate the parameters of a RayleighDistribution using the MAD.
|
class |
SkewGNormalLMMEstimator
Estimate the parameters of a skew Normal Distribution (Hosking's Generalized
Normal Distribution), using the methods of L-Moments (LMM).
|
class |
UniformMADEstimator
Estimate Uniform distribution parameters using Median and MAD.
|
class |
WeibullLogMADEstimator
Parameter estimation via median and median absolute deviation from median
(MAD).
|
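Several estimators above rely on the median and the median absolute deviation from the median (MAD). The Java sketch below shows the idea for a normal distribution, using the well-known consistency constant 1.4826 (approximately 1 / 0.6745) to turn the MAD into a standard deviation estimate. The class and method names are hypothetical; this is not the ELKI estimator code.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the ELKI implementation. */
final class MadSketch {
  /** Median of the data; the input array is left unmodified. */
  static double median(double[] data) {
    if (data.length == 0) {
      throw new IllegalArgumentException("Empty data");
    }
    final double[] sorted = data.clone();
    Arrays.sort(sorted);
    final int mid = sorted.length >>> 1;
    return (sorted.length & 1) == 1 ? sorted[mid] : .5 * (sorted[mid - 1] + sorted[mid]);
  }

  /** Median absolute deviation from the median. */
  static double mad(double[] data) {
    final double med = median(data);
    final double[] dev = new double[data.length];
    for (int i = 0; i < data.length; i++) {
      dev[i] = Math.abs(data[i] - med);
    }
    return median(dev);
  }

  /** Robust estimate of {mean, standard deviation} for normally distributed data. */
  static double[] estimateNormal(double[] data) {
    return new double[] { median(data), 1.4826 * mad(data) };
  }
}
```

Because both statistics are medians, a single extreme value barely changes the estimate, unlike the sample mean and variance.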
Modifier and Type | Class and Description |
---|---|
class |
WinsorizingEstimator<D extends Distribution>
Winsorizing or Georgization estimator.
|
Modifier and Type | Class and Description |
---|---|
class |
AggregatedHillEstimator
Estimator using the weighted average of multiple hill estimators.
|
class |
ALIDEstimator
ALID estimator of the intrinsic dimensionality (maximum likelihood estimator
for ID using auxiliary distances).
|
class |
GEDEstimator
Generalized Expansion Dimension for estimating the intrinsic dimensionality.
|
class |
HillEstimator
Hill estimator of the intrinsic dimensionality (maximum likelihood estimator
for ID).
|
class |
MOMEstimator
Method of moments estimator, using the first moment (i.e. average).
|
class |
RVEstimator
Regularly Varying Functions estimator of the intrinsic dimensionality
|
Modifier and Type | Field and Description |
---|---|
static double |
UniformKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: (9/2)^(1/5)
|
static double |
TriweightKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: (9450/143)^(1/5)
|
static double |
BiweightKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: 35^(1/5)
|
static double |
GaussianKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: (1./(4*pi))^(1/10)
|
static double |
EpanechnikovKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: 15^(1/5)
|
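The canonical bandwidth expressions listed above can be evaluated numerically as follows. This sketch only transcribes the formulas from the table into Java; the class name `CanonicalBandwidthSketch` is hypothetical and the constants are not read from the ELKI classes.

```java
/** Illustrative sketch only; transcribes the formulas from the table above. */
final class CanonicalBandwidthSketch {
  static final double UNIFORM = Math.pow(9. / 2., 1. / 5.);
  static final double TRIWEIGHT = Math.pow(9450. / 143., 1. / 5.);
  static final double BIWEIGHT = Math.pow(35., 1. / 5.);
  static final double GAUSSIAN = Math.pow(1. / (4. * Math.PI), 1. / 10.);
  static final double EPANECHNIKOV = Math.pow(15., 1. / 5.);

  public static void main(String[] args) {
    System.out.printf("uniform=%.4f triweight=%.4f biweight=%.4f gaussian=%.4f epanechnikov=%.4f%n",
        UNIFORM, TRIWEIGHT, BIWEIGHT, GAUSSIAN, EPANECHNIKOV);
  }
}
```

The values come out at roughly 1.351 (uniform), 2.312 (triweight), 2.036 (biweight), 0.776 (Gaussian) and 1.719 (Epanechnikov).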
Modifier and Type | Method and Description |
---|---|
double |
KernelDensityFunction.canonicalBandwidth()
Get the canonical bandwidth for this kernel.
|
Modifier and Type | Class and Description |
---|---|
class |
AndersonDarlingTest
Perform Anderson-Darling test for a Gaussian distribution.
|
Modifier and Type | Method and Description |
---|---|
static double |
AndersonDarlingTest.removeBiasNormalDistribution(double A2,
int n)
Remove bias from the Anderson-Darling statistic if the mean and standard
deviation were estimated from the data, and a normal distribution was
assumed.
|
Modifier and Type | Class and Description |
---|---|
class |
KMLOutputHandler
Class to handle KML output.
|
Modifier and Type | Class and Description |
---|---|
class |
IntegerArrayQuickSort
Class to sort an int array, using a modified quicksort.
|
Modifier and Type | Class and Description |
---|---|
class |
WeightedQuickUnionInteger
Union-find algorithm for primitive integers, with optimizations.
|
class |
WeightedQuickUnionRangeDBIDs
Union-find algorithm for
DBIDRange only, with optimizations. |
class |
WeightedQuickUnionStaticDBIDs
Union-find algorithm for
StaticDBIDs, with optimizations. |
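The union-find classes above are described as weighted quick union with optimizations. A generic sketch of that data structure for primitive integers is shown below, using union by size and path halving. Which optimizations ELKI applies is not stated here, so both choices, as well as the class and method names, are assumptions of this example.

```java
/** Illustrative sketch only; not the ELKI implementation. */
final class WeightedQuickUnionSketch {
  private final int[] parent;
  private final int[] size;

  WeightedQuickUnionSketch(int n) {
    parent = new int[n];
    size = new int[n];
    for (int i = 0; i < n; i++) {
      parent[i] = i; // Every element starts as its own root.
      size[i] = 1;
    }
  }

  /** Find the root of x, halving the path on the way up. */
  int find(int x) {
    while (x != parent[x]) {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
    return x;
  }

  /** Merge the components of a and b; returns the root of the merged component. */
  int union(int a, int b) {
    int ra = find(a), rb = find(b);
    if (ra == rb) {
      return ra;
    }
    if (size[ra] < size[rb]) { // Attach the smaller tree below the larger one.
      final int tmp = ra;
      ra = rb;
      rb = tmp;
    }
    parent[rb] = ra;
    size[ra] += size[rb];
    return ra;
  }

  boolean isConnected(int a, int b) {
    return find(a) == find(b);
  }
}
```

Attaching the smaller tree to the larger one keeps the trees shallow, so `find` stays cheap even after many merges.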
Modifier and Type | Class and Description |
---|---|
class |
Xoroshiro128NonThreadsafeRandom
Replacement for Java's
Random class, using a different
random number generation strategy. |
class |
XorShift1024NonThreadsafeRandom
Replacement for Java's
Random class, using a different
random number generation strategy. |
class |
XorShift64NonThreadsafeRandom
Replacement for Java's
Random class, using a different
random number generation strategy. |
Modifier and Type | Method and Description |
---|---|
int |
XorShift64NonThreadsafeRandom.nextInt(int n)
Returns a pseudorandom, uniformly distributed
int value between 0
(inclusive) and the specified value (exclusive), drawn from this random
number generator's sequence. |
int |
XorShift1024NonThreadsafeRandom.nextInt(int n)
Returns a pseudorandom, uniformly distributed
int value between 0
(inclusive) and the specified value (exclusive), drawn from this random
number generator's sequence. |
int |
Xoroshiro128NonThreadsafeRandom.nextInt(int n)
Returns a pseudorandom, uniformly distributed
int value between 0
(inclusive) and the specified value (exclusive), drawn from this random
number generator's sequence. |
int |
FastNonThreadsafeRandom.nextInt(int n)
Returns a pseudorandom, uniformly distributed
int value between 0
(inclusive) and the specified value (exclusive), drawn from this random
number generator's sequence. |
int |
FastNonThreadsafeRandom.nextIntRefined(int n)
Returns a pseudorandom, uniformly distributed
int value between 0
(inclusive) and the specified value (exclusive), drawn from this random
number generator's sequence. |
Modifier and Type | Class and Description |
---|---|
class |
HeDESNormalizationOutlierScaling
Normalization used by HeDES. |
class |
MinusLogGammaScaling
Scaling that can map arbitrary values to a probability in the range of [0:1],
by assuming a Gamma distribution on the data and evaluating the Gamma CDF.
|
class |
MinusLogStandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
class |
MixtureModelOutlierScaling
Tries to fit a mixture model (exponential for inliers and Gaussian for
outliers) to the outlier score distribution.
|
class |
MultiplicativeInverseScaling
Scaling function to invert values by computing 1/x, but in a variation that
maps the values to the [0:1] interval and avoids division by 0.
|
class |
OutlierGammaScaling
Scaling that can map arbitrary values to a probability in the range of [0:1]
by assuming a Gamma distribution on the values.
|
class |
OutlierMinusLogScaling
Scaling function to invert values by computing -log(x)
Useful for example for scaling
ABOD, but see
MinusLogStandardDeviationScaling and MinusLogGammaScaling for
more advanced scalings for this algorithm. |
class |
SigmoidOutlierScaling
Tries to fit a sigmoid to the outlier scores and uses it to convert the values
to probability estimates in the range of 0.0 to 1.0.
|
class |
SqrtStandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
class |
StandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
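
The idea of inverting scores with -log(x) and mapping them into [0:1] can be sketched as follows. This is an assumption-laden illustration, not ELKI's `OutlierMinusLogScaling`: the class name `MinusLogScalingSketch`, the choice to normalize by the largest observed -log(x / max), and the handling of non-positive scores are all hypothetical.

```java
// Hypothetical illustration class; not part of ELKI.
final class MinusLogScalingSketch {
  /**
   * Maps raw scores (where SMALL values indicate outliers) to [0, 1],
   * so that the smallest positive score maps to 1 and the largest to 0.
   */
  static double[] scale(double[] scores) {
    double max = 0.0;
    double min = Double.POSITIVE_INFINITY;
    for (double s : scores) {
      if (s > 0) { // only positive scores have a finite logarithm
        max = Math.max(max, s);
        min = Math.min(min, s);
      }
    }
    // Largest value of -log(s / max), reached at the smallest positive score.
    double norm = -Math.log(min / max);
    double[] out = new double[scores.length];
    for (int i = 0; i < scores.length; i++) {
      double s = scores[i];
      if (!(s > 0)) {
        // Assumption: zero, negative, or NaN scores count as maximally outlying.
        out[i] = 1.0;
      } else if (norm > 0) {
        out[i] = -Math.log(s / max) / norm;
      } else {
        out[i] = 0.0; // all positive scores are identical
      }
    }
    return out;
  }

  public static void main(String[] args) {
    double[] scaled = scale(new double[] { 1.0, 0.1, 0.01 });
    System.out.println(java.util.Arrays.toString(scaled));
  }
}
```

Dividing the score by the maximum before taking the logarithm keeps the result non-negative, and dividing by the largest such value bounds it by 1.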
Modifier and Type | Class and Description |
---|---|
class |
OpenGL3DParallelCoordinates<O extends NumberVector>
Simple JOGL2-based parallel coordinates visualization.
|
class |
Parallel3DRenderer<O extends NumberVector>
Renderer for 3D parallel plots.
|
Modifier and Type | Class and Description |
---|---|
class |
CompactCircularMSTLayout3DPC
Simple circular layout based on the minimum spanning tree.
|
class |
MultidimensionalScalingMSTLayout3DPC
Lays out the axes using multidimensional scaling.
|
class |
SimpleCircularMSTLayout3DPC
Simple circular layout based on the minimum spanning tree.
|
Modifier and Type | Class and Description |
---|---|
class |
ParallelPlotProjector<V extends SpatialComparable>
ParallelPlotProjector is responsible for producing a parallel axes
visualization.
|
Modifier and Type | Class and Description |
---|---|
class |
CircleSegmentsVisualizer
Visualizer to draw circle segments of clusterings and enable interactive
selection of segments.
|
Modifier and Type | Method and Description |
---|---|
private double[] |
DensityEstimationOverlay.Instance.initializeBandwidth(double[][] data) |
Modifier and Type | Class and Description |
---|---|
class |
BubbleVisualization
Generates an SVG element containing bubbles.
|
class |
COPVectorVisualization
Visualize error vectors as produced by COP.
|
Modifier and Type | Class and Description |
---|---|
class |
NaiveAgglomerativeHierarchicalClustering3<O>
This tutorial will step you through implementing a well-known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.
|
class |
NaiveAgglomerativeHierarchicalClustering4<O>
This tutorial will step you through implementing a well-known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.
|
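
For orientation, naive agglomerative hierarchical clustering can be sketched in a few lines. The version below is a standalone illustration under stated assumptions (one-dimensional data, single linkage, stopping at `k` clusters); the class name `NaiveSingleLinkSketch` is hypothetical and the code is not taken from the ELKI tutorial classes listed above.

```java
// Hypothetical illustration class; not part of ELKI.
final class NaiveSingleLinkSketch {
  /**
   * Repeatedly merges the two closest clusters (single linkage) until only
   * k clusters remain. Returns one cluster label per input point.
   */
  static int[] cluster(double[] xs, int k) {
    int n = xs.length;
    if (k < 1) {
      throw new IllegalArgumentException("k must be at least 1");
    }
    int[] label = new int[n];
    for (int i = 0; i < n; i++) {
      label[i] = i; // every point starts in its own cluster
    }
    for (int clusters = n; clusters > k; clusters--) {
      double best = Double.POSITIVE_INFINITY;
      int a = -1, b = -1;
      // Single linkage: the closest pair of points in different clusters.
      for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
          if (label[i] == label[j]) {
            continue;
          }
          double d = Math.abs(xs[i] - xs[j]);
          if (d < best) {
            best = d;
            a = label[i];
            b = label[j];
          }
        }
      }
      // Merge cluster b into cluster a.
      for (int i = 0; i < n; i++) {
        if (label[i] == b) {
          label[i] = a;
        }
      }
    }
    return label;
  }

  public static void main(String[] args) {
    int[] labels = cluster(new double[] { 0, 1, 10, 11, 50 }, 2);
    System.out.println(java.util.Arrays.toString(labels));
  }
}
```

Each merge rescans all pairs, so this runs in cubic time overall; it is meant to show the merge loop, not to be efficient.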
Modifier and Type | Class and Description |
---|---|
class |
ODIN<O>
Outlier detection based on the in-degree of the kNN graph.
|
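
The in-degree idea can be sketched with a brute-force kNN graph: every point links to its `k` nearest neighbors, and points that few others link to are outlier candidates. The class name `KnnInDegreeSketch`, the squared Euclidean distance, and the index-order tie-breaking are assumptions for illustration; this is not the ELKI `ODIN` implementation.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration class; not part of ELKI.
final class KnnInDegreeSketch {
  /** Counts, for each point, how many other points have it among their k nearest neighbors. */
  static int[] inDegree(double[][] pts, int k) {
    int n = pts.length;
    int[] indeg = new int[n];
    for (int i = 0; i < n; i++) {
      final int q = i;
      Integer[] order = new Integer[n];
      for (int j = 0; j < n; j++) {
        order[j] = j;
      }
      // Stable sort by distance to the query point; ties keep index order.
      Arrays.sort(order, Comparator.comparingDouble((Integer j) -> sqDist(pts[q], pts[j])));
      int taken = 0;
      for (int idx = 0; idx < n && taken < k; idx++) {
        int j = order[idx];
        if (j == q) {
          continue; // a point is not its own neighbor
        }
        indeg[j]++; // edge q -> j in the kNN graph
        taken++;
      }
    }
    return indeg;
  }

  /** Squared Euclidean distance; sufficient for ranking neighbors. */
  static double sqDist(double[] a, double[] b) {
    double sum = 0.0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return sum;
  }

  public static void main(String[] args) {
    double[][] pts = { { 0 }, { 1 }, { 2 }, { 10 } };
    // The isolated point at 10 is nobody's nearest neighbor.
    System.out.println(Arrays.toString(inDegree(pts, 1)));
  }
}
```

A low in-degree suggests an outlier; how the count is turned into a score or threshold is left open here.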
Copyright © 2019 ELKI Development Team. License information.