
Type Parameters:
V - a type of NumberVector as a suitable data object for this algorithm
D - the distance type processed

@Title(value="An Efficient Reference-based Approach to Outlier Detection in Large Datasets")
@Description(value="Computes kNN distances approximately, using reference points with various reference point strategies.")
@Reference(authors="Y. Pei, O.R. Zaiane, Y. Gao", title="An Efficient Reference-based Approach to Outlier Detection in Large Datasets", booktitle="Proc. 6th IEEE Int. Conf. on Data Mining (ICDM \'06), Hong Kong, China, 2006", url="http://dx.doi.org/10.1109/ICDM.2006.17")
public class ReferenceBasedOutlierDetection<V extends NumberVector<?>,D extends NumberDistance<D,?>>
extends AbstractAlgorithm<OutlierResult>
implements OutlierAlgorithm
Provides the Reference-Based Outlier Detection algorithm, which computes kNN distances approximately, using reference points.
Reference:
Y. Pei, O. R. Zaiane, Y. Gao: An Efficient Reference-Based Approach to
Outlier Detection in Large Datasets. In: Proc. IEEE Int. Conf. on Data
Mining (ICDM'06), Hong Kong, China, 2006.
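The description above is terse, so the following plain-Java sketch illustrates the general idea under stated assumptions. It is not the ELKI implementation. It assumes that an object's density with respect to one reference point is the inverse of the mean gap between its distance to that reference point and the k closest such distances of other objects, that the minimum density over all reference points is kept, and that the score is one minus that density divided by the largest density in the data set. The class name `ReferenceScoreSketch`, its method names, the Euclidean distance, and the handling of infinite densities are all hypothetical choices made for this illustration.

```java
import java.util.Arrays;

// Illustrative sketch only; names and details are assumptions, not ELKI code.
final class ReferenceScoreSketch {
  /** Euclidean distance between two points of equal dimensionality. */
  static double distance(double[] a, double[] b) {
    double sum = 0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }

  /** Density of every object as seen from one reference point (needs 1 <= k < n). */
  static double[] densities(double[][] data, double[] ref, int k) {
    int n = data.length;
    double[] toRef = new double[n];
    for (int i = 0; i < n; i++) {
      toRef[i] = distance(data[i], ref);
    }
    double[] dens = new double[n];
    for (int i = 0; i < n; i++) {
      // Gaps between this object's reference distance and all others.
      double[] gaps = new double[n - 1];
      for (int j = 0, g = 0; j < n; j++) {
        if (j != i) {
          gaps[g++] = Math.abs(toRef[i] - toRef[j]);
        }
      }
      Arrays.sort(gaps);
      double sum = 0;
      for (int j = 0; j < k; j++) {
        sum += gaps[j];
      }
      dens[i] = k / sum; // inverse mean gap; infinite when all k gaps are zero
    }
    return dens;
  }

  /** Scores in [0, 1]; larger values indicate stronger outliers. */
  static double[] scores(double[][] data, double[][] refs, int k) {
    int n = data.length;
    double[] minDens = new double[n];
    Arrays.fill(minDens, Double.POSITIVE_INFINITY);
    for (double[] ref : refs) {
      double[] dens = densities(data, ref, k);
      for (int i = 0; i < n; i++) {
        minDens[i] = Math.min(minDens[i], dens[i]);
      }
    }
    // Assumption of this sketch: infinite densities are excluded from the
    // normalization and mapped to score 0.
    double max = 0;
    for (double d : minDens) {
      if (!Double.isInfinite(d)) {
        max = Math.max(max, d);
      }
    }
    double[] score = new double[n];
    for (int i = 0; i < n; i++) {
      score[i] = Double.isInfinite(minDens[i]) ? 0.0 : 1.0 - minDens[i] / max;
    }
    return score;
  }
}
```

For one-dimensional data such as 0, 1, 2, 3, 100 with a single reference point at 0 and k = 2, the isolated value 100 receives the highest score in this sketch, while the values in the middle of the dense group receive a score of 0.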
| Modifier and Type | Class and Description |
|---|---|
| static class | ReferenceBasedOutlierDetection.Parameterizer<V extends NumberVector<?>,D extends NumberDistance<D,?>>: Parameterization class. |
| Modifier and Type | Field and Description |
|---|---|
| private DistanceFunction<V,D> | distanceFunction: Distance function to use. |
| private int | k: Holds the value of K_ID. |
| static OptionID | K_ID: Parameter to specify the number of nearest neighbors of an object, to be considered for computing its REFOD_SCORE, must be an integer greater than 1. |
| private static Logging | LOG: The logger for this class. |
| private ReferencePointsHeuristic<V> | refp: Stores the reference point strategy. |
| static OptionID | REFP_ID: Parameter for the reference points heuristic. |
| Constructor and Description |
|---|
| ReferenceBasedOutlierDetection(int k, DistanceFunction<V,D> distanceFunction, ReferencePointsHeuristic<V> refp): Constructor with parameters. |
| Modifier and Type | Method and Description |
|---|---|
| protected double | computeDensity(DistanceDBIDList<D> referenceDists, int index): Computes the density of an object. |
| protected DistanceDBIDList<D> | computeDistanceVector(V refPoint, Relation<V> database, DistanceQuery<V,D> distFunc): Computes for each object the distance to one reference point. |
| TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
| protected Logging | getLogger(): Get the (STATIC) logger for this class. |
| OutlierResult | run(Database database, Relation<V> relation): Run the algorithm on the given relation. |
Methods inherited from class AbstractAlgorithm:
makeParameterDistanceFunction, run

Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface OutlierAlgorithm:
run

private static final Logging LOG

public static final OptionID REFP_ID

public static final OptionID K_ID

private int k
Holds the value of K_ID.

private ReferencePointsHeuristic<V extends NumberVector<?>> refp

private DistanceFunction<V extends NumberVector<?>,D extends NumberDistance<D,?>> distanceFunction
public ReferenceBasedOutlierDetection(int k, DistanceFunction<V,D> distanceFunction, ReferencePointsHeuristic<V> refp)
Parameters:
k - k Parameter
distanceFunction - distance function
refp - Reference points heuristic

public OutlierResult run(Database database, Relation<V> relation)
Parameters:
database - Database
relation - Relation to process

protected DistanceDBIDList<D> computeDistanceVector(V refPoint, Relation<V> database, DistanceQuery<V,D> distFunc)
Parameters:
refPoint - Reference Point Feature Vector
database - database to work on
distFunc - Distance function to use

protected double computeDensity(DistanceDBIDList<D> referenceDists, int index)
Parameters:
referenceDists - vector of the reference distances
index - index of the current object

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Specified by:
getInputTypeRestriction in interface Algorithm
getInputTypeRestriction in class AbstractAlgorithm<OutlierResult>

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Specified by:
getLogger in class AbstractAlgorithm<OutlierResult>
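
The computeDensity signature takes a list of reference distances and the index of the current object. One plausible reading, offered here as an assumption and not as a description of the actual ELKI code, is that the list is sorted by distance, so the k entries closest in value to the entry at index can be found by widening a window to the left and right of that position. The sketch below uses a plain sorted double array in place of DistanceDBIDList; the class `SortedWindowDensity` and its method are hypothetical names for this illustration.

```java
// Illustrative sketch only; a sorted double[] stands in for DistanceDBIDList.
final class SortedWindowDensity {
  /**
   * Density of the entry at position index in an ascending array of
   * distances to one reference point, based on its k closest entries.
   * Assumes the array holds at least two entries.
   */
  static double density(double[] sorted, int index, int k) {
    double x = sorted[index];
    int left = index - 1;   // next candidate below the current entry
    int right = index + 1;  // next candidate above the current entry
    double sum = 0;
    int count = 0;
    while (count < k && (left >= 0 || right < sorted.length)) {
      boolean takeLeft;
      if (left < 0) {
        takeLeft = false;
      } else if (right >= sorted.length) {
        takeLeft = true;
      } else {
        // Pick whichever side is closer in distance value.
        takeLeft = (x - sorted[left]) <= (sorted[right] - x);
      }
      if (takeLeft) {
        sum += x - sorted[left--];
      } else {
        sum += sorted[right++] - x;
      }
      count++;
    }
    // Inverse of the mean gap; infinite if all selected gaps are zero.
    return count / sum;
  }
}
```

Because the array is sorted, each call inspects at most k neighbors instead of scanning every object, which is where the approach saves work compared with an exact kNN search.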