Package | Description |
---|---|
de.lmu.ifi.dbs.elki.algorithm |
Algorithms suitable as a task for the
KDDTask main routine. |
de.lmu.ifi.dbs.elki.algorithm.clustering |
Clustering algorithms.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.affinitypropagation |
Affinity Propagation (AP) clustering.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.biclustering |
Biclustering algorithms.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.correlation |
Correlation clustering algorithms
|
de.lmu.ifi.dbs.elki.algorithm.clustering.gdbscan |
Generalized DBSCAN.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.hierarchical | |
de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans |
K-means clustering and variations.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.subspace |
Axis-parallel subspace clustering algorithms.
The algorithms in this package are either projected clustering algorithms or
subspace clustering algorithms, according to the classical but somewhat obsolete classification scheme
of clustering algorithms for axis-parallel subspaces.
|
de.lmu.ifi.dbs.elki.algorithm.outlier |
Outlier detection algorithms
|
de.lmu.ifi.dbs.elki.algorithm.outlier.lof |
LOF family of outlier detection algorithms.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.meta |
Meta outlier detection algorithms: external scores, score rescaling.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.spatial |
Spatial outlier detection algorithms
|
de.lmu.ifi.dbs.elki.algorithm.outlier.subspace |
Subspace outlier detection methods.
|
de.lmu.ifi.dbs.elki.application.greedyensemble |
Greedy ensembles for outlier detection.
|
de.lmu.ifi.dbs.elki.application.internal |
Internal utilities for development.
|
de.lmu.ifi.dbs.elki.database.ids.integer |
Integer-based DBID implementation.
Do not use directly; always use
DBIDUtil. |
de.lmu.ifi.dbs.elki.datasource.filter.transform |
Data space transformations.
|
de.lmu.ifi.dbs.elki.distance.distancefunction |
Distance functions for use within ELKI.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.colorhistogram |
Distance functions for color histograms.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.geo |
Geographic (earth) distance functions.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.probabilistic |
Distance from probability theory, mostly divergences such as K-L-divergence, J-divergence.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.strings |
Distance functions for strings.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.timeseries |
Distance functions designed for time series.
|
de.lmu.ifi.dbs.elki.distance.similarityfunction |
Similarity functions.
|
de.lmu.ifi.dbs.elki.evaluation.clustering |
Evaluation of clustering results.
|
de.lmu.ifi.dbs.elki.evaluation.clustering.pairsegments |
Pair-segment analysis of multiple clusterings.
|
de.lmu.ifi.dbs.elki.evaluation.outlier |
Evaluate an outlier score using a misclassification based cost model.
|
de.lmu.ifi.dbs.elki.index.lsh.hashfamilies |
Hash function families for LSH.
|
de.lmu.ifi.dbs.elki.index.lsh.hashfunctions |
Hash functions for LSH
|
de.lmu.ifi.dbs.elki.index.preprocessed.knn |
Indexes providing KNN and rKNN data.
|
de.lmu.ifi.dbs.elki.index.projected |
Projected indexes for data.
|
de.lmu.ifi.dbs.elki.index.tree.metrical.mtreevariants.mtree | |
de.lmu.ifi.dbs.elki.index.tree.metrical.mtreevariants.strategies.insert |
Insertion (choose path) strategies of nodes in an M-Tree (and variants).
|
de.lmu.ifi.dbs.elki.index.tree.metrical.mtreevariants.strategies.split |
Splitting strategies of nodes in an M-Tree (and variants).
|
de.lmu.ifi.dbs.elki.index.tree.spatial.kd |
K-d-tree and variants.
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.query |
Queries on the R-Tree family of indexes: kNN and range queries.
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar | |
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.bulk |
Packages for bulk-loading R*-Trees.
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.insert |
Insertion strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.overflow |
Overflow treatment strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.reinsert |
Reinsertion strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.split |
Splitting strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.vafile |
Vector Approximation File
|
de.lmu.ifi.dbs.elki.math |
Mathematical operations and utilities used throughout the framework.
|
de.lmu.ifi.dbs.elki.math.dimensionsimilarity |
Functions to compute the similarity of dimensions (or the interestingness of the combination).
|
de.lmu.ifi.dbs.elki.math.geodesy | |
de.lmu.ifi.dbs.elki.math.geometry |
Algorithms from computational geometry.
|
de.lmu.ifi.dbs.elki.math.linearalgebra.pca |
Principal Component Analysis (PCA) and Eigenvector processing.
|
de.lmu.ifi.dbs.elki.math.linearalgebra.randomprojections |
Random projection families.
|
de.lmu.ifi.dbs.elki.math.spacefillingcurves |
Space filling curves.
|
de.lmu.ifi.dbs.elki.math.statistics |
Statistical tests and methods.
|
de.lmu.ifi.dbs.elki.math.statistics.distribution |
Standard distributions, with random generation functionalities.
|
de.lmu.ifi.dbs.elki.math.statistics.distribution.estimator |
Estimators for statistical distributions.
|
de.lmu.ifi.dbs.elki.math.statistics.distribution.estimator.meta |
Meta estimators: estimators that do not actually estimate themselves, but instead use other estimators, e.g. on a trimmed data set, or as an ensemble.
|
de.lmu.ifi.dbs.elki.math.statistics.kernelfunctions |
Kernel functions from statistics.
|
de.lmu.ifi.dbs.elki.result |
Result types, representation and handling
|
de.lmu.ifi.dbs.elki.utilities.datastructures.arrays |
Utilities for arrays: advanced sorting for primitive arrays.
|
de.lmu.ifi.dbs.elki.utilities.documentation |
Documentation utilities: Annotations for Title, Description, Reference
|
de.lmu.ifi.dbs.elki.utilities.scaling.outlier |
Scalings of outlier scores that require a statistical analysis of the occurring values.
|
de.lmu.ifi.dbs.elki.visualization.visualizers.pairsegments |
Visualizers for inspecting cluster differences using pair counting segments.
|
de.lmu.ifi.dbs.elki.visualization.visualizers.scatterplot.density |
Visualizers for data set density in a scatterplot projection.
|
de.lmu.ifi.dbs.elki.visualization.visualizers.scatterplot.outlier |
Visualizers for outlier scores based on 2D projections.
|
tutorial.clustering |
Classes from the tutorial on implementing a custom k-means variation.
|
tutorial.outlier |
Modifier and Type | Class and Description |
---|---|
class |
APRIORI
Provides the APRIORI algorithm for Mining Association Rules.
|
class |
DependencyDerivator<V extends NumberVector<?>,D extends Distance<D>>
Dependency derivator computes quantitatively linear dependencies among
attributes of a given dataset based on a linear correlation PCA.
|
Modifier and Type | Class and Description |
---|---|
class |
CanopyPreClustering<O,D extends Distance<D>>
Canopy pre-clustering is a simple preprocessing step for clustering.
|
class |
DBSCAN<O,D extends Distance<D>>
DBSCAN provides the DBSCAN algorithm, an algorithm to find density-connected
sets in a database.
|
class |
DeLiClu<NV extends NumberVector<?>,D extends Distance<D>>
DeLiClu provides the DeLiClu algorithm, a hierarchical algorithm to find
density-connected sets in a database.
|
class |
EM<V extends NumberVector<?>>
Provides the EM algorithm (clustering by expectation maximization).
|
class |
NaiveMeanShiftClustering<V extends NumberVector<?>,D extends NumberDistance<D,?>>
Mean-shift based clustering algorithm.
|
class |
OPTICS<O,D extends Distance<D>>
OPTICS provides the OPTICS algorithm.
|
class |
SNNClustering<O>
Shared nearest neighbor clustering.
|
Modifier and Type | Class and Description |
---|---|
class |
AffinityPropagationClusteringAlgorithm<O>
Cluster analysis by affinity propagation.
|
Modifier and Type | Class and Description |
---|---|
class |
ChengAndChurch<V extends NumberVector<?>>
Perform Cheng and Church biclustering.
|
Modifier and Type | Class and Description |
---|---|
class |
CASH<V extends NumberVector<?>>
Provides the CASH algorithm, a subspace clustering algorithm based on the
Hough transform.
|
class |
COPAC<V extends NumberVector<?>,D extends Distance<D>>
Provides the COPAC algorithm, an algorithm to partition a database according
to the correlation dimension of its objects and to then perform an arbitrary
clustering algorithm over the partitions.
|
class |
ERiC<V extends NumberVector<?>>
Performs correlation clustering on the data partitioned according to local
correlation dimensionality and builds a hierarchy of correlation clusters
that allows multiple inheritance from the clustering result.
|
class |
FourC<V extends NumberVector<?>>
4C identifies local subgroups of data objects sharing a uniform correlation.
|
class |
HiCO<V extends NumberVector<?>>
Implementation of the HiCO algorithm, an algorithm for detecting hierarchies
of correlation clusters.
|
class |
LMCLUS
Linear manifold clustering in high dimensional spaces by stochastic search.
|
class |
ORCLUS<V extends NumberVector<?>>
ORCLUS provides the ORCLUS algorithm, an algorithm to find clusters in high
dimensional spaces.
|
Modifier and Type | Class and Description |
---|---|
class |
EpsilonNeighborPredicate<O,D extends Distance<D>>
The default DBSCAN and OPTICS neighbor predicate, using an
epsilon-neighborhood.
|
class |
GeneralizedDBSCAN
Generalized DBSCAN, density-based clustering with noise.
|
class |
MinPtsCorePredicate
The DBSCAN default core point predicate: having at least
MinPtsCorePredicate.minpts
neighbors. |
Modifier and Type | Class and Description |
---|---|
class |
CentroidLinkageMethod
Centroid linkage clustering method, aka UPGMC: Unweighted Pair-Group Method
using Centroids.
|
class |
GroupAverageLinkageMethod
Group-average linkage clustering method.
|
interface |
LinkageMethod
Abstract interface for implementing a new linkage method into hierarchical
clustering.
|
class |
MedianLinkageMethod
Median-linkage clustering method: Weighted pair group method using centroids
(WPGMC).
|
class |
NaiveAgglomerativeHierarchicalClustering<O,D extends NumberDistance<D,?>>
This tutorial will step you through implementing a well known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.
|
class |
SingleLinkageMethod
Single-linkage clustering method.
|
class |
SLINK<O,D extends Distance<D>>
Implementation of the efficient Single-Link Algorithm SLINK of R.
|
class |
WardLinkageMethod
Ward's linkage clustering method.
|
class |
WeightedAverageLinkageMethod
Weighted average linkage clustering method.
|
Modifier and Type | Class and Description |
---|---|
class |
KMeansBisecting<V extends NumberVector<?>,D extends Distance<?>,M extends MeanModel<V>>
The bisecting k-means algorithm works by starting with an initial
partitioning into two clusters, then repeated splitting of the largest
cluster to get additional clusters.
|
class |
KMeansLloyd<V extends NumberVector<?>,D extends Distance<D>>
Provides the k-means algorithm, using Lloyd-style bulk iterations.
|
class |
KMeansMacQueen<V extends NumberVector<?>,D extends Distance<D>>
Provides the k-means algorithm, using MacQueen style incremental updates.
|
class |
KMeansPlusPlusInitialMeans<V,D extends NumberDistance<D,?>>
K-Means++ initialization for k-means.
|
class |
KMediansLloyd<V extends NumberVector<?>,D extends Distance<D>>
Provides the k-medians clustering algorithm, using Lloyd-style bulk
iterations.
|
class |
KMedoidsPAM<V,D extends NumberDistance<D,?>>
Provides the k-medoids clustering algorithm, using the
"Partitioning Around Medoids" approach.
|
class |
PAMInitialMeans<V,D extends NumberDistance<D,?>>
PAM initialization for k-means (and of course, PAM).
|
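
The Lloyd-style bulk iteration named above alternates a full assignment pass over all points with a full recomputation of the means. The following is a minimal sketch of that scheme on plain `double[][]` data with squared Euclidean distance. The class `LloydSketch` and its method are hypothetical names chosen for illustration; they are not part of the ELKI API, and the handling of empty clusters is an assumption.

```java
import java.util.Arrays;

/** Illustrative only: hypothetical class, not ELKI's KMeansLloyd. */
final class LloydSketch {
  /**
   * Lloyd-style bulk iterations: reassign all points, then recompute all means.
   * Stops when no assignment changes or after maxIter rounds.
   */
  static double[][] lloyd(double[][] data, double[][] initialMeans, int maxIter) {
    final int k = initialMeans.length, dim = data[0].length;
    double[][] means = new double[k][];
    for (int c = 0; c < k; c++) {
      means[c] = initialMeans[c].clone();
    }
    int[] assignment = new int[data.length];
    Arrays.fill(assignment, -1);
    for (int iter = 0; iter < maxIter; iter++) {
      // Assignment step: each point goes to its nearest mean.
      boolean changed = false;
      for (int p = 0; p < data.length; p++) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < k; c++) {
          double d = 0.;
          for (int j = 0; j < dim; j++) {
            double diff = data[p][j] - means[c][j];
            d += diff * diff;
          }
          if (d < bestDist) {
            bestDist = d;
            best = c;
          }
        }
        if (assignment[p] != best) {
          assignment[p] = best;
          changed = true;
        }
      }
      if (!changed) {
        break;
      }
      // Update step: recompute every mean from its assigned points.
      double[][] sums = new double[k][dim];
      int[] counts = new int[k];
      for (int p = 0; p < data.length; p++) {
        counts[assignment[p]]++;
        for (int j = 0; j < dim; j++) {
          sums[assignment[p]][j] += data[p][j];
        }
      }
      for (int c = 0; c < k; c++) {
        if (counts[c] == 0) {
          continue; // assumption: an empty cluster keeps its previous mean
        }
        for (int j = 0; j < dim; j++) {
          means[c][j] = sums[c][j] / counts[c];
        }
      }
    }
    return means;
  }
}
```

A MacQueen-style variant would instead update the affected means immediately after each single reassignment.
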
Modifier and Type | Class and Description |
---|---|
class |
CLIQUE<V extends NumberVector<?>>
Implementation of the CLIQUE algorithm, a grid-based algorithm to identify
dense clusters in subspaces of maximum dimensionality.
|
class |
DiSH<V extends NumberVector<?>>
Algorithm for detecting subspace hierarchies.
|
class |
DOC<V extends NumberVector<?>>
Provides the DOC algorithm, and its heuristic variant, FastDOC.
|
class |
HiSC<V extends NumberVector<?>>
Implementation of the HiSC algorithm, an algorithm for detecting hierarchies
of subspace clusters.
|
class |
P3C<V extends NumberVector<?>>
P3C: A Robust Projected Clustering Algorithm.
|
class |
PreDeCon<V extends NumberVector<?>>
PreDeCon computes clusters of subspace preference weighted connected points.
|
class |
PROCLUS<V extends NumberVector<?>>
Provides the PROCLUS algorithm, an algorithm to find subspace clusters in
high dimensional spaces.
|
class |
SUBCLU<V extends NumberVector<?>>
Implementation of the SUBCLU algorithm, an algorithm to detect arbitrarily
shaped and positioned clusters in subspaces.
|
Modifier and Type | Class and Description |
---|---|
class |
ABOD<V extends NumberVector<?>>
Angle-Based Outlier Detection / Angle-Based Outlier Factor.
|
class |
AbstractAggarwalYuOutlier<V extends NumberVector<?>>
Abstract base class for the sparse-grid-cell based outlier detection of
Aggarwal and Yu.
|
class |
AggarwalYuEvolutionary<V extends NumberVector<?>>
EAFOD provides the evolutionary outlier detection algorithm, an algorithm to
detect outliers for high dimensional data.
|
class |
AggarwalYuNaive<V extends NumberVector<?>>
BruteForce provides a naive brute force algorithm in which all k-subsets of
dimensions are examined and the sparsity coefficient is calculated to find
outliers.
|
class |
COP<V extends NumberVector<?>,D extends NumberDistance<D,?>>
Correlation outlier probability: outlier detection in arbitrarily oriented
subspaces.
Reference:
Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek: Outlier Detection in Arbitrarily Oriented Subspaces. In: Proc. |
class |
DBOutlierDetection<O,D extends Distance<D>>
Simple distance-based outlier detection algorithm.
|
class |
DBOutlierScore<O,D extends Distance<D>>
Compute percentage of neighbors in the given neighborhood with size d.
|
class |
DWOF<O,D extends NumberDistance<D,?>>
Algorithm to compute dynamic-window outlier factors in a database based on a
specified parameter
DWOF.Parameterizer.K_ID (-dwof.k). |
class |
FastABOD<V extends NumberVector<?>>
Angle-Based Outlier Detection / Angle-Based Outlier Factor.
|
class |
GaussianUniformMixture<V extends NumberVector<?>>
Outlier detection algorithm using a mixture model approach.
|
class |
HilOut<O extends NumberVector<?>>
Fast Outlier Detection in High Dimensional Spaces
Outlier Detection using Hilbert space filling curves
Reference:
F.
|
class |
KNNOutlier<O,D extends NumberDistance<D,?>>
Outlier Detection based on the distance of an object to its k nearest
neighbor.
|
class |
KNNWeightOutlier<O,D extends NumberDistance<D,?>>
Outlier Detection based on the accumulated distances of a point to its k
nearest neighbors.
|
class |
LBABOD<V extends NumberVector<?>>
Angle-Based Outlier Detection / Angle-Based Outlier Factor.
|
class |
ODIN<O,D extends Distance<D>>
Outlier detection based on the in-degree of the kNN graph.
|
class |
OPTICSOF<O,D extends NumberDistance<D,?>>
OPTICSOF provides the Optics-of algorithm, an algorithm to find Local
Outliers in a database.
|
class |
ReferenceBasedOutlierDetection<V extends NumberVector<?>,D extends NumberDistance<D,?>>
Provides the Reference-Based Outlier Detection algorithm, an algorithm that
computes kNN distances approximately, using reference points.
|
class |
SimpleCOP<V extends NumberVector<?>,D extends NumberDistance<D,?>>
Algorithm to compute local correlation outlier probability.
|
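
The two kNN-based scores listed above differ only in how they aggregate neighbor distances: one uses the distance to the k-th nearest neighbor, the other the accumulated distance to all k nearest neighbors. A brute-force sketch of both follows. `KnnOutlierSketch` is a hypothetical name, Euclidean distance is assumed, and the query point is not counted as its own neighbor here, which may differ from how ELKI counts k.

```java
import java.util.Arrays;

/** Illustrative only: hypothetical class, not ELKI's KNNOutlier or KNNWeightOutlier. */
final class KnnOutlierSketch {
  /**
   * Brute-force kNN outlier scores.
   *
   * @param accumulate false: distance to the k-th nearest neighbor;
   *                   true: sum of distances to the k nearest neighbors
   */
  static double[] scores(double[][] data, int k, boolean accumulate) {
    final int n = data.length;
    if (k < 1 || k > n - 1) {
      throw new IllegalArgumentException("k must be in [1, n-1]");
    }
    double[] out = new double[n];
    for (int i = 0; i < n; i++) {
      // Distances from point i to every other point (the point itself is excluded).
      double[] dists = new double[n - 1];
      int idx = 0;
      for (int j = 0; j < n; j++) {
        if (j == i) {
          continue;
        }
        double sum = 0.;
        for (int d = 0; d < data[i].length; d++) {
          double diff = data[i][d] - data[j][d];
          sum += diff * diff;
        }
        dists[idx++] = Math.sqrt(sum);
      }
      Arrays.sort(dists);
      if (accumulate) {
        double acc = 0.;
        for (int m = 0; m < k; m++) {
          acc += dists[m];
        }
        out[i] = acc;
      } else {
        out[i] = dists[k - 1];
      }
    }
    return out;
  }
}
```

This costs O(n² log n) time; an index that answers kNN queries would avoid the full distance scan per point.
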
Modifier and Type | Class and Description |
---|---|
class |
ALOCI<O extends NumberVector<?>,D extends NumberDistance<D,?>>
Fast Outlier Detection Using the "approximate Local Correlation Integral".
|
class |
FlexibleLOF<O,D extends NumberDistance<D,?>>
Flexible variant of the "Local Outlier Factor" algorithm.
|
class |
INFLO<O,D extends NumberDistance<D,?>>
INFLO provides the Mining Algorithms (Two-way Search Method) for Influence
Outliers using Symmetric Relationship
Reference:
Jin, W., Tung, A., Han, J., and Wang, W. (2006): Ranking outliers using symmetric neighborhood relationship. In: Proc. |
class |
LDF<O extends NumberVector<?>,D extends NumberDistance<D,?>>
Outlier Detection with Kernel Density Functions.
|
class |
LDOF<O,D extends NumberDistance<D,?>>
Computes the LDOF (Local Distance-Based Outlier Factor) for all objects of a
Database.
|
class |
LOCI<O,D extends NumberDistance<D,?>>
Fast Outlier Detection Using the "Local Correlation Integral".
|
class |
LOF<O,D extends NumberDistance<D,?>>
Algorithm to compute density-based local outlier factors in a database based
on a specified parameter
LOF.Parameterizer.K_ID (-lof.k). |
class |
LoOP<O,D extends NumberDistance<D,?>>
LoOP: Local Outlier Probabilities
Distance/density based algorithm similar to LOF to detect outliers, but with
statistical methods to achieve better result stability.
|
class |
SimplifiedLOF<O,D extends NumberDistance<D,?>>
A simplified version of the original LOF algorithm, which does not use the
reachability distance, yielding less stable results on inliers.
|
Modifier and Type | Class and Description |
---|---|
class |
FeatureBagging
A simple ensemble method called "Feature bagging" for outlier detection.
|
class |
HiCS<V extends NumberVector<?>>
Algorithm to compute High Contrast Subspaces for Density-Based Outlier
Ranking.
|
Modifier and Type | Class and Description |
---|---|
class |
CTLuGLSBackwardSearchAlgorithm<V extends NumberVector<?>,D extends NumberDistance<D,?>>
GLS-Backward Search is a statistical approach to detecting spatial outliers.
|
class |
CTLuMeanMultipleAttributes<N,O extends NumberVector<?>>
Mean Approach is used to discover spatial outliers with multiple attributes.
|
class |
CTLuMedianAlgorithm<N>
Median Algorithm of C.
|
class |
CTLuMedianMultipleAttributes<N,O extends NumberVector<?>>
Median Approach is used to discover spatial outliers with multiple
attributes.
|
class |
CTLuMoranScatterplotOutlier<N>
Moran scatterplot outliers, based on the standardized deviation from the
local and global means.
|
class |
CTLuRandomWalkEC<N,D extends NumberDistance<D,?>>
Spatial outlier detection based on random walks.
|
class |
CTLuScatterplotOutlier<N>
Scatterplot-outlier is a spatial outlier detection method that performs a
linear regression of object attributes and their neighbors' average value.
|
class |
CTLuZTestOutlier<N>
Detect outliers by comparing their attribute value to the mean and standard
deviation of their neighborhood.
|
class |
SLOM<N,O,D extends NumberDistance<D,?>>
SLOM: a new measure for local spatial outliers.
Reference:
Sanjay Chawla and Pei Sun: SLOM: a new measure for local spatial outliers. In: Knowledge and Information Systems 9(4), 412-429, 2006.
This implementation works around some corner cases in SLOM, in particular when an object has no neighbors or only a single neighbor, which would otherwise result in divisions by zero (although the results will still not be very useful then). |
class |
SOF<N,O,D extends NumberDistance<D,?>>
The Spatial Outlier Factor (SOF) is a spatial
LOF variation. |
class |
TrimmedMeanApproach<N>
A Trimmed Mean Approach to Finding Spatial Outliers.
|
Modifier and Type | Class and Description |
---|---|
class |
OutRankS1
OutRank: ranking outliers in high dimensional data.
|
class |
OUTRES<V extends NumberVector<?>>
Adaptive outlierness for subspace outlier ranking (OUTRES).
|
class |
SOD<V extends NumberVector<?>,D extends NumberDistance<D,?>>
Subspace Outlier Degree.
|
Modifier and Type | Class and Description |
---|---|
class |
ComputeKNNOutlierScores<O extends NumberVector<?>,D extends NumberDistance<D,?>>
Application that runs a series of kNN-based algorithms on a data set, for
building an ensemble in a second step.
|
class |
GreedyEnsembleExperiment
Class to load an outlier detection summary file, as produced by
ComputeKNNOutlierScores, and compute a naive ensemble for it. |
class |
VisualizePairwiseGainMatrix
Class to load an outlier detection summary file, as produced by
ComputeKNNOutlierScores, and compute a matrix with the pairwise
gains. |
Modifier and Type | Method and Description |
---|---|
private static List<Pair<Reference,List<Class<?>>>> |
DocumentReferences.sortedReferences() |
Modifier and Type | Method and Description |
---|---|
private static Document |
DocumentReferences.documentReferences(List<Pair<Reference,List<Class<?>>>> refs) |
private static void |
DocumentReferences.documentReferencesWiki(List<Pair<Reference,List<Class<?>>>> refs,
PrintStream refstreamW) |
private static void |
DocumentReferences.inspectClass(Class<?> cls,
List<Pair<Reference,List<Class<?>>>> refs,
Map<Reference,List<Class<?>>> map) |
Modifier and Type | Class and Description |
---|---|
(package private) class |
IntegerDBIDArrayQuickSort
Class to sort an integer DBID array, using a modified quicksort.
|
Modifier and Type | Class and Description |
---|---|
class |
LinearDiscriminantAnalysisFilter<V extends NumberVector<?>>
Linear Discriminant Analysis (LDA) / Fisher's linear discriminant.
|
Modifier and Type | Class and Description |
---|---|
class |
BrayCurtisDistanceFunction
Bray-Curtis distance function / Sørensen–Dice coefficient for continuous
spaces.
|
class |
CanberraDistanceFunction
Canberra distance function, a variation of Manhattan distance.
|
class |
ClarkDistanceFunction
Clark distance function for vector spaces.
|
class |
Kulczynski1DistanceFunction
Kulczynski similarity 1, in distance form.
|
class |
LorentzianDistanceFunction
Lorentzian distance function for vector spaces.
|
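
Two of the distances listed above have short textbook definitions: Canberra sums the per-dimension ratios |x_i - y_i| / (|x_i| + |y_i|), and Bray-Curtis divides the summed absolute differences by the summed magnitudes. The sketch below follows those definitions. `VectorDistanceSketch` is a hypothetical class, and treating a 0/0 term as zero is an assumption rather than something stated in this listing.

```java
/** Illustrative only: hypothetical class following the textbook definitions. */
final class VectorDistanceSketch {
  /** Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|). */
  static double canberra(double[] x, double[] y) {
    double sum = 0.;
    for (int i = 0; i < x.length; i++) {
      double denom = Math.abs(x[i]) + Math.abs(y[i]);
      if (denom > 0.) { // assumption: a 0/0 term contributes nothing
        sum += Math.abs(x[i] - y[i]) / denom;
      }
    }
    return sum;
  }

  /** Bray-Curtis distance: sum |x_i - y_i| divided by sum (|x_i| + |y_i|). */
  static double brayCurtis(double[] x, double[] y) {
    double num = 0., den = 0.;
    for (int i = 0; i < x.length; i++) {
      num += Math.abs(x[i] - y[i]);
      den += Math.abs(x[i]) + Math.abs(y[i]);
    }
    return den > 0. ? num / den : 0.;
  }
}
```

For `x = (1, 2)` and `y = (3, 2)`, only the first dimension differs: Canberra gives 2/4 and Bray-Curtis gives 2/8.
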
Modifier and Type | Method and Description |
---|---|
(package private) static void |
BrayCurtisDistanceFunction.secondReference()
Dummy method, just to attach a second reference.
|
(package private) static void |
BrayCurtisDistanceFunction.thirdReference()
Dummy method, just to attach a third reference.
|
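
The dummy methods above exist only to carry an extra citation. Before Java 8 an annotation could not be repeated on the same element, so an otherwise empty method is one way to attach a second or third reference to a class; that this was the motivation here is an assumption. The sketch below shows the general pattern with a hypothetical `@Cite` annotation and a reflective collector. None of these names are ELKI's, and the annotation fields are placeholders.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical citation annotation, retained at runtime so reflection can read it. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Cite {
  String authors();

  String title();
}

/** Hypothetical class carrying one citation on the type and one on a dummy method. */
@Cite(authors = "First Author", title = "Primary source (placeholder)")
final class ExampleDistance {
  /** Dummy method, present only to carry a second citation. */
  @Cite(authors = "Second Author", title = "Second source (placeholder)")
  static void secondReference() {
    // Intentionally empty.
  }
}

/** Collects citation titles from a class and from its declared methods. */
final class CiteCollector {
  static List<String> titles(Class<?> cls) {
    List<String> out = new ArrayList<>();
    Cite onClass = cls.getAnnotation(Cite.class);
    if (onClass != null) {
      out.add(onClass.title());
    }
    for (Method m : cls.getDeclaredMethods()) {
      Cite onMethod = m.getAnnotation(Cite.class);
      if (onMethod != null) {
        out.add(onMethod.title());
      }
    }
    return out;
  }
}
```

A documentation generator can walk all classes this way and group the classes by the references they cite.
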
Modifier and Type | Class and Description |
---|---|
class |
HistogramIntersectionDistanceFunction
Intersection distance for color histograms.
|
class |
HSBHistogramQuadraticDistanceFunction
Distance function for HSB color histograms based on a quadratic form and
color similarity.
|
class |
RGBHistogramQuadraticDistanceFunction
Distance function for RGB color histograms based on a quadratic form and
color similarity.
|
Modifier and Type | Method and Description |
---|---|
double |
LngLatDistanceFunction.doubleMinDist(SpatialComparable mbr1,
SpatialComparable mbr2) |
double |
LatLngDistanceFunction.doubleMinDist(SpatialComparable mbr1,
SpatialComparable mbr2) |
double |
DimensionSelectingLatLngDistanceFunction.doubleMinDist(SpatialComparable mbr1,
SpatialComparable mbr2) |
Modifier and Type | Class and Description |
---|---|
class |
ChiSquaredDistanceFunction
Chi-Squared distance function, symmetric version.
|
class |
JeffreyDivergenceDistanceFunction
Provides the Jeffrey Divergence Distance for FeatureVectors.
|
class |
KullbackLeiblerDivergenceAsymmetricDistanceFunction
Kullback-Leibler (asymmetric!)
|
class |
KullbackLeiblerDivergenceReverseAsymmetricDistanceFunction
Kullback-Leibler (asymmetric!)
|
class |
SqrtJensenShannonDivergenceDistanceFunction
The square root of Jensen-Shannon divergence is metric.
|
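
The asymmetry flagged for the Kullback-Leibler entries, and the symmetry of the Jensen-Shannon construction, are easy to see in code. The sketch below uses the standard definitions KL(p‖q) = Σ p_i ln(p_i / q_i) and JS(p, q) = ½ KL(p‖m) + ½ KL(q‖m) with m = (p + q) / 2. `DivergenceSketch` is a hypothetical class, the inputs are assumed to be probability vectors of equal length, and natural logarithms are an assumption.

```java
/** Illustrative only: hypothetical class using the standard textbook definitions. */
final class DivergenceSketch {
  /** Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute zero. */
  static double kl(double[] p, double[] q) {
    double sum = 0.;
    for (int i = 0; i < p.length; i++) {
      if (p[i] > 0.) {
        sum += p[i] * Math.log(p[i] / q[i]);
      }
    }
    return sum;
  }

  /** Square root of the Jensen-Shannon divergence. */
  static double sqrtJensenShannon(double[] p, double[] q) {
    double[] m = new double[p.length];
    for (int i = 0; i < p.length; i++) {
      m[i] = .5 * (p[i] + q[i]);
    }
    double js = .5 * kl(p, m) + .5 * kl(q, m);
    return Math.sqrt(Math.max(0., js)); // guard against tiny negative rounding errors
  }
}
```

Swapping the arguments of `kl` generally changes the result, while `sqrtJensenShannon` gives the same value in either order.
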
Modifier and Type | Class and Description |
---|---|
class |
LevenshteinDistanceFunction
Classic Levenshtein distance on strings.
|
class |
NormalizedLevenshteinDistanceFunction
Levenshtein distance on strings, normalized by string length.
|
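
The classic Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions, and is usually computed by dynamic programming. A minimal two-row sketch follows. `LevenshteinSketch` is a hypothetical class, and dividing by the length of the longer string is only one common normalization, assumed here for illustration; the listing does not say which length the ELKI class uses.

```java
/** Illustrative only: hypothetical class, not ELKI's implementation. */
final class LevenshteinSketch {
  /** Classic Levenshtein distance using two rows of the DP table. */
  static int distance(String a, String b) {
    int[] prev = new int[b.length() + 1];
    int[] cur = new int[b.length() + 1];
    for (int j = 0; j <= b.length(); j++) {
      prev[j] = j; // transforming the empty prefix of a needs j insertions
    }
    for (int i = 1; i <= a.length(); i++) {
      cur[0] = i; // transforming to the empty prefix of b needs i deletions
      for (int j = 1; j <= b.length(); j++) {
        int substitute = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
        int delete = prev[j] + 1;
        int insert = cur[j - 1] + 1;
        cur[j] = Math.min(substitute, Math.min(delete, insert));
      }
      int[] tmp = prev;
      prev = cur;
      cur = tmp;
    }
    return prev[b.length()];
  }

  /** Assumed normalization: divide by the length of the longer string. */
  static double normalized(String a, String b) {
    int max = Math.max(a.length(), b.length());
    return max == 0 ? 0. : distance(a, b) / (double) max;
  }
}
```
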
Modifier and Type | Class and Description |
---|---|
class |
DTWDistanceFunction
Provides the Dynamic Time Warping distance for FeatureVectors.
|
class |
EDRDistanceFunction
Provides the Edit Distance on Real Sequence distance for FeatureVectors.
|
class |
ERPDistanceFunction
Provides the Edit Distance With Real Penalty distance for FeatureVectors.
|
class |
LCSSDistanceFunction
Provides the Longest Common Subsequence distance for FeatureVectors.
|
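
Dynamic Time Warping aligns two sequences by allowing elements to be repeated, and is computed with a dynamic program similar to edit distance. The sketch below is a minimal unconstrained version on `double[]` sequences. `DtwSketch` is a hypothetical class, and both the absolute-difference local cost and the absence of a warping window are assumptions that may differ from the ELKI class.

```java
import java.util.Arrays;

/** Illustrative only: hypothetical class, not ELKI's DTWDistanceFunction. */
final class DtwSketch {
  /** Unconstrained DTW with |a_i - b_j| as the local cost. */
  static double dtw(double[] a, double[] b) {
    final int n = a.length, m = b.length;
    double[][] d = new double[n + 1][m + 1];
    for (double[] row : d) {
      Arrays.fill(row, Double.POSITIVE_INFINITY);
    }
    d[0][0] = 0.;
    for (int i = 1; i <= n; i++) {
      for (int j = 1; j <= m; j++) {
        double cost = Math.abs(a[i - 1] - b[j - 1]);
        // Extend the cheapest of: repeat b_j, repeat a_i, or advance both.
        double best = Math.min(d[i - 1][j - 1], Math.min(d[i - 1][j], d[i][j - 1]));
        d[i][j] = cost + best;
      }
    }
    return d[n][m];
  }
}
```

A sequence and a copy of it with one repeated element have distance zero under this measure, which is the warping behavior that a plain element-wise distance lacks.
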
Modifier and Type | Class and Description |
---|---|
class |
JaccardPrimitiveSimilarityFunction<O extends FeatureVector<?>>
A flexible extension of Jaccard similarity to non-binary vectors.
|
class |
Kulczynski1SimilarityFunction
Kulczynski similarity 1.
|
class |
Kulczynski2SimilarityFunction
Kulczynski similarity 2.
|
Modifier and Type | Class and Description |
---|---|
class |
BCubed
BCubed measures.
|
class |
EditDistance
Edit distance measures.
|
class |
Entropy
Entropy based measures.
|
class |
SetMatchingPurity
Set matching purity measures.
|
Modifier and Type | Method and Description |
---|---|
double |
SetMatchingPurity.f1Measure()
Get the set matching F1-Measure.
Reference:
Steinbach, M. and Karypis, G. and Kumar, V. and others: A comparison of document clustering techniques. KDD workshop on text mining, 2000. |
double |
PairCounting.fowlkesMallows()
Computes the pair-counting Fowlkes-Mallows (flat only, non-hierarchical!)
|
double |
Entropy.normalizedVariationOfInformation()
Get the normalized variation of information (normalized, 0 = equal)
NVI = 1 - NMI_Joint
Vinh, N.X. and Epps, J. and Bailey, J.
|
double |
SetMatchingPurity.purity()
Get the set matching purity (first:second clustering) (normalized, 1 =
equal)
|
double |
PairCounting.randIndex()
Computes the Rand index (RI).
|
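
The pair-counting measures above are all derived from the same four counts: pairs of objects that are together in both clusterings, in only the first, in only the second, or in neither. The sketch below computes the Rand index and the Fowlkes-Mallows index from two flat label arrays. `PairCountingSketch` is a hypothetical class, the quadratic loop over pairs is chosen for clarity, and returning 0 when a clustering has no co-clustered pair is an assumption.

```java
/** Illustrative only: hypothetical class for flat, non-overlapping clusterings. */
final class PairCountingSketch {
  /** Returns {both, firstOnly, secondOnly, neither} pair counts. */
  static long[] counts(int[] first, int[] second) {
    long both = 0, firstOnly = 0, secondOnly = 0, neither = 0;
    for (int i = 0; i < first.length; i++) {
      for (int j = i + 1; j < first.length; j++) {
        boolean s1 = first[i] == first[j];
        boolean s2 = second[i] == second[j];
        if (s1 && s2) {
          both++;
        } else if (s1) {
          firstOnly++;
        } else if (s2) {
          secondOnly++;
        } else {
          neither++;
        }
      }
    }
    return new long[] {both, firstOnly, secondOnly, neither};
  }

  /** Rand index: fraction of pairs on which both clusterings agree. */
  static double randIndex(int[] first, int[] second) {
    long[] c = counts(first, second);
    long total = c[0] + c[1] + c[2] + c[3];
    return total == 0 ? 1. : (c[0] + c[3]) / (double) total;
  }

  /** Fowlkes-Mallows index: geometric mean of pair precision and pair recall. */
  static double fowlkesMallows(int[] first, int[] second) {
    long[] c = counts(first, second);
    double denom = Math.sqrt((double) (c[0] + c[1]) * (double) (c[0] + c[2]));
    return denom > 0. ? c[0] / denom : 0.; // assumption: 0 if no pairs are co-clustered
  }
}
```
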
Modifier and Type | Class and Description |
---|---|
class |
Segments
Creates segments of two or more clusterings.
|
Modifier and Type | Class and Description |
---|---|
class |
OutlierSmROCCurve
Smooth ROC curves are a variation of classic ROC curves that takes the scores
into account.
|
Modifier and Type | Class and Description |
---|---|
class |
EuclideanHashFunctionFamily
2-stable hash function family for Euclidean distances.
|
class |
ManhattanHashFunctionFamily
1-stable hash function family for Manhattan distances.
|
Modifier and Type | Class and Description |
---|---|
class |
MultipleProjectionsLocalitySensitiveHashFunction
LSH hash function for vector space data.
|
Modifier and Type | Class and Description |
---|---|
class |
RandomSampleKNNPreprocessor<O,D extends Distance<D>>
Class that computes the kNN only on a random sample.
|
Modifier and Type | Class and Description |
---|---|
class |
PINN<O extends NumberVector<?>>
Projection-Indexed nearest-neighbors (PINN) is an index to retrieve the
nearest neighbors in high dimensional spaces by using a random projection
based index.
|
Modifier and Type | Class and Description |
---|---|
class |
MTree<O,D extends NumberDistance<D,?>>
MTree is a metrical index structure based on the concepts of the M-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
MinimumEnlargementInsert<O,D extends NumberDistance<D,?>,N extends AbstractMTreeNode<O,D,N,E>,E extends MTreeEntry>
Default insertion strategy for the M-tree.
|
Modifier and Type | Class and Description |
---|---|
class |
MLBDistSplit<O,D extends NumberDistance<D,?>,N extends AbstractMTreeNode<O,D,N,E>,E extends MTreeEntry>
Encapsulates the required methods for a split of a node in an M-Tree.
|
class |
MMRadSplit<O,D extends NumberDistance<D,?>,N extends AbstractMTreeNode<O,D,N,E>,E extends MTreeEntry>
Encapsulates the required methods for a split of a node in an M-Tree.
|
class |
MRadSplit<O,D extends NumberDistance<D,?>,N extends AbstractMTreeNode<O,D,N,E>,E extends MTreeEntry>
Encapsulates the required methods for a split of a node in an M-Tree.
|
class |
RandomSplit<O,D extends NumberDistance<D,?>,N extends AbstractMTreeNode<O,D,N,E>,E extends MTreeEntry>
Encapsulates the required methods for a split of a node in an M-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
MinimalisticMemoryKDTree<O extends NumberVector<?>>
Simple implementation of a static in-memory K-D-tree.
|
Modifier and Type | Class and Description |
---|---|
class |
DoubleDistanceRStarTreeKNNQuery<O extends SpatialComparable>
Instance of a KNN query for a particular spatial index.
|
class |
DoubleDistanceRStarTreeRangeQuery<O extends SpatialComparable>
Instance of a range query for a particular spatial index.
|
class |
GenericRStarTreeKNNQuery<O extends SpatialComparable,D extends Distance<D>>
Instance of a KNN query for a particular spatial index.
|
class |
GenericRStarTreeRangeQuery<O extends SpatialComparable,D extends Distance<D>>
Instance of a range query for a particular spatial index.
|
Modifier and Type | Class and Description |
---|---|
class |
RStarTree
RStarTree is a spatial index structure based on the concepts of the R*-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
OneDimSortBulkSplit
Simple bulk loading strategy by sorting the data along the first dimension.
|
class |
SortTileRecursiveBulkSplit
Sort-Tile-Recursive aims at tiling the data space with a grid-like structure
for partitioning the dataset into the required number of buckets.
|
class |
SpatialSortBulkSplit
Bulk loading by spatially sorting the objects, then partitioning the sorted
list appropriately.
|
Modifier and Type | Class and Description |
---|---|
class |
ApproximativeLeastOverlapInsertionStrategy
The choose subtree method proposed by the R*-Tree with slightly better
performance for large leaf sizes (linear approximation).
|
class |
CombinedInsertionStrategy
Use two different insertion strategies for directory and leaf nodes.
|
class |
LeastEnlargementInsertionStrategy
The default R-Tree insertion strategy: find rectangle with least volume
enlargement.
|
class |
LeastEnlargementWithAreaInsertionStrategy
A slight modification of the default R-Tree insertion strategy: find
rectangle with least volume enlargement, but choose least area on ties.
|
class |
LeastOverlapInsertionStrategy
The choose subtree method proposed by the R*-Tree for leaf nodes.
|
Modifier and Type | Class and Description |
---|---|
class |
LimitedReinsertOverflowTreatment
Limited reinsertions, as proposed by the R*-Tree: For each real insert, allow
reinsertions to happen only once per level.
|
Modifier and Type | Class and Description |
---|---|
class |
CloseReinsert
Reinsert objects on page overflow, starting with close objects first (even
when they will likely be inserted into the same page again!)
|
class |
FarReinsert
Reinsert objects on page overflow, starting with farther objects first (even
when they will likely be inserted into the same page again!)
|
Modifier and Type | Class and Description |
---|---|
class |
AngTanLinearSplit
Linear-time complexity split proposed by Ang and Tan.
|
class |
GreeneSplit
Quadratic-time complexity split as used by Diane Greene for the R-Tree.
|
class |
RTreeLinearSplit
Linear-time complexity greedy split as used by the original R-Tree.
|
class |
RTreeQuadraticSplit
Quadratic-time complexity greedy split as used by the original R-Tree.
|
class |
TopologicalSplitter
Encapsulates the required parameters for a topological split of an R*-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
DAFile
Dimension approximation file, a one-dimensional part of the
PartialVAFile. |
class |
PartialVAFile<V extends NumberVector<?>>
PartialVAFile.
|
class |
VAFile<V extends NumberVector<?>>
Vector-approximation file (VAFile)
Reference:
Weber, R. and Blott, S.
|
Modifier and Type | Class and Description |
---|---|
class |
Mean
Compute the mean using a numerically stable online algorithm.
|
class |
MeanVariance
Do some simple statistics (mean, variance) using a numerically stable online
algorithm.
|
class |
StatisticalMoments
Track various statistical moments, including mean, variance, skewness and
kurtosis.
|
Modifier and Type | Method and Description |
---|---|
void |
MeanVariance.put(double val,
double weight)
Add data with a given weight.
|
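
A numerically stable online algorithm updates the running mean and the sum of squared deviations with each new value, instead of accumulating raw sums and sums of squares. The sketch below shows a weighted incremental update of this kind. `OnlineMeanVariance` and its method names are hypothetical; they only mirror the idea of a `put(val, weight)` operation, are not the ELKI classes, and the requirement of positive weights is an assumption.

```java
/** Illustrative only: hypothetical class, not ELKI's Mean or MeanVariance. */
final class OnlineMeanVariance {
  private double wsum = 0.; // total weight seen so far
  private double mean = 0.; // running weighted mean
  private double m2 = 0.; // running weighted sum of squared deviations

  /** Add a value with unit weight. */
  void put(double val) {
    put(val, 1.);
  }

  /** Add a value with the given positive weight. */
  void put(double val, double weight) {
    if (!(weight > 0.)) {
      throw new IllegalArgumentException("weight must be positive");
    }
    final double nwsum = wsum + weight;
    final double delta = val - mean;
    final double rval = delta * weight / nwsum;
    mean += rval;
    // Uses the old total weight, so the first value adds no deviation.
    m2 += wsum * delta * rval;
    wsum = nwsum;
  }

  double getMean() {
    return mean;
  }

  /** Population ("naive") variance: m2 / total weight. */
  double getNaiveVariance() {
    return m2 / wsum;
  }

  /** Sample variance: m2 / (total weight - 1); meaningful for unit weights. */
  double getSampleVariance() {
    return m2 / (wsum - 1.);
  }
}
```

Putting the values 1, 2, 3, 4 with unit weight yields a mean of 2.5 and a summed squared deviation of 5, without ever forming the large intermediate sums that cause cancellation in the naive formula.
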
Modifier and Type | Class and Description |
---|---|
class |
HiCSDimensionSimilarity
Use the statistical tests as used by HiCS to arrange dimensions.
|
class |
HSMDimensionSimilarity
Compute the similarity of dimensions by using a Hough transformation.
|
class |
MCEDimensionSimilarity
Compute dimension similarity by using a nested means discretization.
|
class |
SlopeDimensionSimilarity
Arrange dimensions based on the entropy of the slope spectrum.
|
class |
SlopeInversionDimensionSimilarity
Arrange dimensions based on the entropy of the slope spectrum.
|
class |
SURFINGDimensionSimilarity
Compute the similarity of dimensions using the SURFING score.
|
Modifier and Type | Method and Description |
---|---|
void |
SURFINGDimensionSimilarity.computeDimensionSimilarites(Database database,
Relation<? extends NumberVector<?>> relation,
DBIDs subset,
DimensionSimilarityMatrix matrix) |
Modifier and Type | Class and Description |
---|---|
class |
SphereUtil
Class with utility functions for distance computations on the sphere.
|
Modifier and Type | Method and Description |
---|---|
static double |
SphereUtil.ellipsoidVincentyFormulaDeg(double f,
double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points.
|
static double |
SphereUtil.ellipsoidVincentyFormulaRad(double f,
double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points.
|
static double |
SphereUtil.haversineFormulaDeg(double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points using the
Haversine formula.
Complexity: 5 trigonometric functions, 2 sqrt.
|
static double |
SphereUtil.haversineFormulaRad(double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points using the
Haversine formula.
Complexity: 5 trigonometric functions, 2 sqrt.
|
static double |
SphereUtil.latlngMinDistDeg(double plat,
double plng,
double rminlat,
double rminlng,
double rmaxlat,
double rmaxlng)
Point to rectangle minimum distance.
|
static double |
SphereUtil.latlngMinDistRad(double plat,
double plng,
double rminlat,
double rminlng,
double rmaxlat,
double rmaxlng)
Point to rectangle minimum distance.
|
static double |
SphereUtil.latlngMinDistRadFull(double plat,
double plng,
double rminlat,
double rminlng,
double rmaxlat,
double rmaxlng)
Point to rectangle minimum distance.
|
static double |
SphereUtil.sphericalVincentyFormulaDeg(double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points.
|
static double |
SphereUtil.sphericalVincentyFormulaRad(double lat1,
double lon1,
double lat2,
double lon2)
Compute the approximate great-circle distance of two points.
|
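As an illustration of the Haversine entries above, here is a minimal sketch of the formula on a unit sphere. It is written to match the stated cost (5 trigonometric calls, 2 square roots), but it is not the SphereUtil source; the class and method names are hypothetical, and returning the angle in radians rather than a scaled distance is an assumption of this sketch.

```java
/** Hypothetical illustration class; not the ELKI SphereUtil source. */
final class HaversineSketch {
  /**
   * Great-circle angle between two points given in radians.
   * Uses sin, sin, cos, cos, atan2 (5 trigonometric calls) and 2 sqrt.
   */
  static double haversineRad(double lat1, double lon1, double lat2, double lon2) {
    final double slat = Math.sin((lat1 - lat2) * 0.5);
    final double slon = Math.sin((lon1 - lon2) * 0.5);
    double a = slat * slat + slon * slon * Math.cos(lat1) * Math.cos(lat2);
    a = Math.min(1.0, a); // guard against rounding slightly above 1
    return 2.0 * Math.atan2(Math.sqrt(a), Math.sqrt(1.0 - a));
  }

  /** Same computation for coordinates given in degrees. */
  static double haversineDeg(double lat1, double lon1, double lat2, double lon2) {
    return haversineRad(Math.toRadians(lat1), Math.toRadians(lon1),
        Math.toRadians(lat2), Math.toRadians(lon2));
  }
}
```

To obtain a distance, multiply the returned angle by the sphere radius, for example roughly 6371 km for a mean Earth radius.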
Modifier and Type | Class and Description |
---|---|
class |
GrahamScanConvexHull2D
Classes to compute the convex hull of a set of points in 2D, using the
classic Graham scan.
|
class |
PrimsMinimumSpanningTree
Prim's algorithm for finding the minimum spanning tree.
|
class |
SweepHullDelaunay2D
Compute the Convex Hull and/or Delaunay Triangulation, using the sweep-hull
approach of David Sinclair.
|
Modifier and Type | Class and Description |
---|---|
class |
PCAFilteredAutotuningRunner<V extends NumberVector<?>>
Performs a self-tuning local PCA based on the covariance matrices of given
objects.
|
class |
RANSACCovarianceMatrixBuilder<V extends NumberVector<?>>
RANSAC based approach to a more robust covariance matrix computation.
|
class |
WeightedCovarianceMatrixBuilder<V extends NumberVector<?>>
CovarianceMatrixBuilder with weights. |
Modifier and Type | Method and Description |
---|---|
Matrix |
RANSACCovarianceMatrixBuilder.processIds(DBIDs ids,
Relation<? extends V> relation) |
Modifier and Type | Class and Description |
---|---|
class |
AchlioptasRandomProjectionFamily
Random projections as suggested by Dimitris Achlioptas.
|
class |
CauchyRandomProjectionFamily
Random projections using Cauchy distributions (1-stable).
|
class |
GaussianRandomProjectionFamily
Random projections using Gaussian distributions (2-stable).
|
class |
RandomSubsetProjectionFamily
Random projection family based on selecting random features.
|
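The random projection families above differ mainly in how the projection matrix is sampled. As a rough sketch, the sparse scheme commonly attributed to Achlioptas draws each entry as sqrt(3) times +1, 0 or -1 with probabilities 1/6, 2/3 and 1/6. The code below assumes this fixed sparsity; the class and method names are hypothetical, and it does not reproduce the ELKI class or its parameters.

```java
import java.util.Random;

/** Hypothetical illustration class; not the ELKI AchlioptasRandomProjectionFamily. */
final class AchlioptasSketch {
  /** Sample an outDim x inDim matrix with entries sqrt(3) * {+1, 0, -1}. */
  static double[][] randomMatrix(int outDim, int inDim, Random rnd) {
    final double scale = Math.sqrt(3.0);
    final double[][] m = new double[outDim][inDim];
    for (int i = 0; i < outDim; i++) {
      for (int j = 0; j < inDim; j++) {
        final int r = rnd.nextInt(6); // 0: +scale (1/6), 1: -scale (1/6), else 0 (2/3)
        m[i][j] = (r == 0) ? scale : (r == 1) ? -scale : 0.0;
      }
    }
    return m;
  }

  /** Project a vector, scaling by 1/sqrt(outDim) to roughly preserve norms in expectation. */
  static double[] project(double[][] m, double[] v) {
    final double[] out = new double[m.length];
    final double norm = 1.0 / Math.sqrt(m.length);
    for (int i = 0; i < m.length; i++) {
      double sum = 0.0;
      for (int j = 0; j < v.length; j++) {
        sum += m[i][j] * v[j];
      }
      out[i] = sum * norm;
    }
    return out;
  }
}
```

Because two thirds of the entries are zero, most multiplications can be skipped in an optimized implementation; this sketch keeps the dense loop for clarity.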
Modifier and Type | Class and Description |
---|---|
class |
BinarySplitSpatialSorter
Spatially sort the data set by repetitive binary splitting, circulating
through the dimensions.
|
class |
HilbertSpatialSorter
Sort objects along the Hilbert space-filling curve by mapping them to their
Hilbert numbers and sorting them.
|
class |
PeanoSpatialSorter
Bulk-load an R-tree index by presorting the objects with their position on
the Peano curve.
|
Modifier and Type | Class and Description |
---|---|
class |
ProbabilityWeightedMoments
Estimate the L-Moments of a sample.
|
Modifier and Type | Class and Description |
---|---|
class |
HaltonUniformDistribution
Halton sequences are a pseudo-uniform distribution.
|
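For readers unfamiliar with Halton sequences: each coordinate is the radical inverse (van der Corput sequence) of the sample index in a fixed base, usually a distinct prime per dimension. A minimal sketch follows; the class and method names are hypothetical and this is not the HaltonUniformDistribution source.

```java
/** Hypothetical illustration class; not the ELKI HaltonUniformDistribution. */
final class HaltonSketch {
  /**
   * Radical inverse of index in the given base: the base-b digits of the
   * index are mirrored around the radix point, yielding a value in [0, 1).
   */
  static double radicalInverse(long index, int base) {
    double result = 0.0;
    double f = 1.0 / base; // weight of the current digit
    while (index > 0) {
      result += f * (index % base);
      index /= base;
      f /= base;
    }
    return result;
  }

  /** One Halton point: the same index, one base per dimension. */
  static double[] point(long index, int[] bases) {
    final double[] p = new double[bases.length];
    for (int d = 0; d < bases.length; d++) {
      p[d] = radicalInverse(index, bases[d]);
    }
    return p;
  }
}
```

In base 2 the indices 1, 2, 3, 4 map to 1/2, 1/4, 3/4, 1/8, which shows how successive values keep filling the largest remaining gaps.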
Modifier and Type | Method and Description |
---|---|
protected static double |
GammaDistribution.chisquaredProbitApproximation(double p,
double nu,
double g)
Approximate probit for chi squared distribution
Based on first half of algorithm AS 91
Reference:
Algorithm AS 91: The percentage points of the $\chi^2$ distribution
D.J. |
private static double |
PoissonDistribution.devianceTerm(double x,
double np)
Evaluate the deviance term of the saddle point approximation.
|
static double |
GammaDistribution.digamma(double x)
Compute the Psi / Digamma function
Reference:
J.
|
static double |
PoissonDistribution.pmf(double x,
int n,
double p)
Poisson probability mass function (PMF) for integer values.
|
static double |
ChiSquaredDistribution.quantile(double x,
double dof)
Return the quantile function for this distribution
Reference:
Algorithm AS 91: The percentage points of the $\chi^2$ distribution
D.J. |
static double |
GammaDistribution.quantile(double p,
double k,
double theta)
Compute probit (inverse cdf) for Gamma distributions.
|
private static double |
PoissonDistribution.stirlingError(double n)
Calculates the Stirling Error
stirlerr(n) = ln(n!)
|
private static double |
PoissonDistribution.stirlingError(int n)
Calculates the Stirling Error
stirlerr(n) = ln(n!)
|
Modifier and Type | Class and Description |
---|---|
class |
CauchyMADEstimator
Estimate Cauchy distribution parameters using Median and MAD.
|
class |
EMGOlivierNorbergEstimator
Naive distribution estimation using mean and sample variance.
|
class |
ExponentialLMMEstimator
Estimate the parameters of an Exponential Distribution, using the methods of
L-Moments (LMM).
|
class |
ExponentialMADEstimator
Estimate Exponential distribution parameters using Median and MAD.
|
class |
ExponentialMedianEstimator
Estimate Exponential distribution parameters using the Median.
|
class |
GammaChoiWetteEstimator
Estimate distribution parameters using the method by Choi and Wette.
|
class |
GammaLMMEstimator
Estimate the parameters of a Gamma Distribution, using the methods of
L-Moments (LMM).
|
class |
GammaMADEstimator
Robust parameter estimation for the Gamma distribution.
|
class |
GammaMOMEstimator
Simple parameter estimation for the Gamma distribution.
|
class |
GeneralizedExtremeValueLMMEstimator
Estimate the parameters of a Generalized Extreme Value Distribution, using
the methods of L-Moments (LMM).
|
class |
GeneralizedLogisticAlternateLMMEstimator
Estimate the parameters of a Generalized Logistic Distribution, using the
methods of L-Moments (LMM).
|
class |
GumbelLMMEstimator
Estimate the parameters of a Gumbel Distribution, using the methods of
L-Moments (LMM).
|
class |
GumbelMADEstimator
Parameter estimation via median and median absolute deviation from median
(MAD).
|
class |
LaplaceMADEstimator
Estimate Laplace distribution parameters using Median and MAD.
|
class |
LaplaceMLEEstimator
Estimate Laplace distribution parameters using Median and mean deviation from
median.
|
class |
LogGammaChoiWetteEstimator
Estimate distribution parameters using the method by Choi and Wette.
|
class |
LogisticLMMEstimator
Estimate the parameters of a Logistic Distribution, using the methods of
L-Moments (LMM).
|
class |
LogisticMADEstimator
Estimate Logistic distribution parameters using Median and MAD.
|
class |
LogLogisticMADEstimator
Estimate Log-Logistic distribution parameters using Median and MAD.
|
class |
LogNormalBilkovaLMMEstimator
Alternate estimate of the parameters of a log Normal Distribution, using the
methods of L-Moments (LMM) for the Generalized Normal Distribution.
|
class |
LogNormalLMMEstimator
Estimate the parameters of a log Normal Distribution, using the methods of
L-Moments (LMM) for the Generalized Normal Distribution.
|
class |
LogNormalLogMADEstimator
Estimator using Medians.
|
class |
NormalLMMEstimator
Estimate the parameters of a normal distribution using the method of
L-Moments (LMM).
|
class |
NormalMADEstimator
Estimator using Medians.
|
class |
RayleighMADEstimator
Estimate the parameters of a RayleighDistribution using the MAD.
|
class |
SkewGNormalLMMEstimator
Estimate the parameters of a skew Normal Distribution (Hosking's Generalized
Normal Distribution), using the methods of L-Moments (LMM).
|
class |
UniformMADEstimator
Estimate Uniform distribution parameters using Median and MAD.
|
class |
WeibullLogMADEstimator
Parameter estimation via median and median absolute deviation from median
(MAD).
|
Modifier and Type | Class and Description |
---|---|
class |
WinsorisingEstimator<D extends Distribution>
Winsorising or Georgization estimator.
|
Modifier and Type | Field and Description |
---|---|
static double |
BiweightKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: 35^(1/5)
|
static double |
TriweightKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: (9450/143)^(1/5)
|
static double |
UniformKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: (9/2)^(1/5)
|
static double |
EpanechnikovKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: 15^(1/5)
|
static double |
GaussianKernelDensityFunction.CANONICAL_BANDWIDTH
Canonical bandwidth: (1./(4*pi))^(1/10)
|
Modifier and Type | Method and Description |
---|---|
double |
KernelDensityFunction.canonicalBandwidth()
Get the canonical bandwidth for this kernel.
|
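The canonical bandwidth expressions listed above are plain arithmetic and can be evaluated directly. The sketch below only reproduces the formulas as documented in the table; the holder class and its constant names are hypothetical and are not the ELKI fields.

```java
/** Hypothetical holder class evaluating the documented formulas. */
final class CanonicalBandwidthSketch {
  static final double UNIFORM = Math.pow(9. / 2., 0.2);         // (9/2)^(1/5)
  static final double EPANECHNIKOV = Math.pow(15., 0.2);        // 15^(1/5)
  static final double BIWEIGHT = Math.pow(35., 0.2);            // 35^(1/5)
  static final double TRIWEIGHT = Math.pow(9450. / 143., 0.2);  // (9450/143)^(1/5)
  static final double GAUSSIAN = Math.pow(1. / (4. * Math.PI), 0.1); // (1/(4*pi))^(1/10)

  public static void main(String[] args) {
    System.out.println("uniform      " + UNIFORM);
    System.out.println("epanechnikov " + EPANECHNIKOV);
    System.out.println("biweight     " + BIWEIGHT);
    System.out.println("triweight    " + TRIWEIGHT);
    System.out.println("gaussian     " + GAUSSIAN);
  }
}
```

The resulting values are approximately 1.351, 1.719, 2.036, 2.312 and 0.776 respectively.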
Modifier and Type | Class and Description |
---|---|
class |
KMLOutputHandler
Class to handle KML output.
|
Modifier and Type | Class and Description |
---|---|
class |
IntegerArrayQuickSort
Class to sort an int array, using a modified quicksort.
|
Modifier and Type | Method and Description |
---|---|
static Reference |
DocumentationUtil.getReference(Class<?> c)
Get the reference annotation of a class, or
null . |
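Looking up an annotation on a class, as getReference is described to do, can be sketched with standard Java reflection. The example below defines its own stand-in annotation so that it is self-contained; `Citation`, `AnnotatedExample` and `CitationLookup` are hypothetical names with placeholder values, and the real ELKI Reference annotation and DocumentationUtil may differ in their members and behavior.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical stand-in annotation; must be retained at runtime to be readable. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Citation {
  String authors();

  String title();
}

/** Example class carrying the annotation (placeholder values only). */
@Citation(authors = "A. Author", title = "A placeholder title")
final class AnnotatedExample {
}

/** Hypothetical lookup helper mirroring the documented contract. */
final class CitationLookup {
  /** Return the annotation of the class, or null if it has none. */
  static Citation getCitation(Class<?> c) {
    return c.getAnnotation(Citation.class);
  }
}
```

A caller would check the result for null before reading `authors()` or `title()`, since unannotated classes such as `String.class` yield no annotation.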
Modifier and Type | Class and Description |
---|---|
class |
COPOutlierScaling
CDF based outlier score scaling.
|
class |
HeDESNormalizationOutlierScaling
Normalization used by HeDES
|
class |
MinusLogGammaScaling
Scaling that can map arbitrary values to a probability in the range of [0:1],
by assuming a Gamma distribution on the data and evaluating the Gamma CDF.
|
class |
MinusLogStandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
class |
MixtureModelOutlierScalingFunction
Tries to fit a mixture model (exponential for inliers and Gaussian for
outliers) to the outlier score distribution.
|
class |
MultiplicativeInverseScaling
Scaling function to invert values, essentially by computing 1/x, but in a variation
that maps the values to the [0:1] interval and avoids division by 0.
|
class |
OutlierGammaScaling
Scaling that can map arbitrary values to a probability in the range of [0:1]
by assuming a Gamma distribution on the values.
|
class |
OutlierMinusLogScaling
Scaling function to invert values by computing -1 * Math.log(x).
Useful, for example, for scaling
ABOD , but see
MinusLogStandardDeviationScaling and MinusLogGammaScaling for
more advanced scalings for this algorithm. |
class |
SigmoidOutlierScalingFunction
Tries to fit a sigmoid to the outlier scores and use it to convert the values
to probability estimates in the range of 0.0 to 1.0
|
class |
SqrtStandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
class |
StandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
Modifier and Type | Method and Description |
---|---|
static void |
COPOutlierScaling.secondReference()
Secondary reference.
|
Modifier and Type | Class and Description |
---|---|
class |
CircleSegmentsVisualizer
Visualizer to draw circle segments of clusterings and enable interactive
selection of segments.
|
Modifier and Type | Method and Description |
---|---|
private double[] |
DensityEstimationOverlay.Instance.initializeBandwidth(double[][] data) |
Modifier and Type | Class and Description |
---|---|
class |
BubbleVisualization
Generates an SVG element containing bubbles.
|
class |
COPVectorVisualization
Visualize error vectors as produced by COP.
|
Modifier and Type | Class and Description |
---|---|
class |
NaiveAgglomerativeHierarchicalClustering3<O,D extends NumberDistance<D,?>>
This tutorial will step you through implementing a well known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.
|
class |
NaiveAgglomerativeHierarchicalClustering4<O,D extends NumberDistance<D,?>>
This tutorial will step you through implementing a well known clustering
algorithm, agglomerative hierarchical clustering, in multiple steps.
|
Modifier and Type | Class and Description |
---|---|
class |
ODIN<O,D extends Distance<D>>
Outlier detection based on the in-degree of the kNN graph.
|