V - a type of NumberVector as a suitable data object for this algorithm
D - the distance type processed

@Title(value="An Efficient Reference-based Approach to Outlier Detection in Large Datasets")
@Description(value="Computes kNN distances approximately, using reference points with various reference point strategies.")
@Reference(authors="Y. Pei, O.R. Zaiane, Y. Gao", title="An Efficient Reference-based Approach to Outlier Detection in Large Datasets", booktitle="Proc. 19th IEEE Int. Conf. on Data Engineering (ICDE \'03), Bangalore, India, 2003", url="http://dx.doi.org/10.1109/ICDM.2006.17")
public class ReferenceBasedOutlierDetection<V extends NumberVector<?,?>,D extends NumberDistance<D,?>>
extends AbstractAlgorithm<OutlierResult>
implements OutlierAlgorithm
Provides the Reference-Based Outlier Detection algorithm, which approximates kNN distances using reference points.
Reference:
Y. Pei, O. R. Zaiane, Y. Gao: An Efficient Reference-Based Approach to
Outlier Detection in Large Datasets. In: Proc. IEEE Int. Conf. on Data
Mining (ICDM'06), Hong Kong, China, 2006.
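The approximation can be illustrated with a minimal sketch, assuming the density definition from the referenced paper: every object is reduced to its distance to one reference point, the distances are sorted, and an object's k nearest neighbors are then searched only among adjacent entries of that sorted list. The names `ReferenceDensitySketch`, `sortedDistances`, and `density`, as well as the use of plain `double[]` points with Euclidean distance, are assumptions made for illustration. This is not the ELKI implementation, which operates on `DistanceResultPair` lists and keeps track of object identifiers.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and data layout are assumptions, not ELKI API. */
final class ReferenceDensitySketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Distances of all points to one reference point, sorted ascending. */
    static double[] sortedDistances(double[][] points, double[] refPoint) {
        double[] dists = new double[points.length];
        for (int i = 0; i < points.length; i++) {
            dists[i] = distance(points[i], refPoint);
        }
        Arrays.sort(dists);
        return dists;
    }

    /**
     * Density of the entry at position index in the sorted distance list:
     * the inverse of the mean gap to its k closest entries. The closest
     * entries are found by widening a window to the left and to the right.
     */
    static double density(double[] sorted, int index, int k) {
        int left = index - 1;
        int right = index + 1;
        double gapSum = 0.0;
        int found = 0;
        while (found < k && (left >= 0 || right < sorted.length)) {
            double leftGap = left >= 0
                ? sorted[index] - sorted[left] : Double.POSITIVE_INFINITY;
            double rightGap = right < sorted.length
                ? sorted[right] - sorted[index] : Double.POSITIVE_INFINITY;
            if (leftGap <= rightGap) {
                gapSum += leftGap;
                left--;
            } else {
                gapSum += rightGap;
                right++;
            }
            found++;
        }
        if (found == 0) {
            return Double.NaN; // a single object has no neighbors
        }
        // Duplicate distances give gapSum == 0 and thus an infinite density.
        return found / gapSum;
    }
}
```

Because only a one-dimensional list is scanned, each density costs O(k) after a single O(n log n) sort per reference point, which is where the speed-up over exact kNN search would come from.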
Modifier and Type | Class and Description |
---|---|
static class | ReferenceBasedOutlierDetection.Parameterizer<V extends NumberVector<?,?>,D extends NumberDistance<D,?>>: Parameterization class. |
Modifier and Type | Field and Description |
---|---|
private DistanceFunction<V,D> | distanceFunction: Distance function to use. |
private int | k: Holds the value of K_ID. |
static OptionID | K_ID: Parameter to specify the number of nearest neighbors of an object, to be considered for computing its REFOD_SCORE, must be an integer greater than 1. |
private static Logging | logger: The logger for this class. |
private ReferencePointsHeuristic<V> | refp: Stores the reference point strategy. |
static OptionID | REFP_ID: Parameter for the reference points heuristic. |
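To make the idea of a pluggable reference point strategy concrete, the following is a minimal sketch of what such a heuristic could look like. The interface `ReferencePointStrategySketch` and the class `BoundingBoxCorners` are hypothetical names introduced here for illustration; they are not the `ReferencePointsHeuristic` API, whose actual methods are not shown on this page. The example strategy simply returns the minimum and maximum corner of the data's bounding box.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical strategy interface; not the ELKI ReferencePointsHeuristic API. */
interface ReferencePointStrategySketch {
    /** Derives reference points from the data set. */
    List<double[]> referencePoints(List<double[]> data);
}

/** Example strategy: the two extreme corners of the bounding box. */
final class BoundingBoxCorners implements ReferencePointStrategySketch {
    @Override
    public List<double[]> referencePoints(List<double[]> data) {
        int dim = data.get(0).length;
        double[] min = new double[dim];
        double[] max = new double[dim];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        // Track the per-dimension minimum and maximum over all points.
        for (double[] point : data) {
            for (int d = 0; d < dim; d++) {
                min[d] = Math.min(min[d], point[d]);
                max[d] = Math.max(max[d], point[d]);
            }
        }
        List<double[]> result = new ArrayList<>();
        result.add(min);
        result.add(max);
        return result;
    }
}
```

Keeping the strategy behind an interface means the outlier detection code only needs a collection of reference points and does not depend on how they were chosen.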
Constructor and Description |
---|
ReferenceBasedOutlierDetection(int k, DistanceFunction<V,D> distanceFunction, ReferencePointsHeuristic<V> refp): Constructor with parameters. |
Modifier and Type | Method and Description |
---|---|
protected double | computeDensity(List<DistanceResultPair<D>> referenceDists, int index): Computes the density of an object. |
protected List<DistanceResultPair<D>> | computeDistanceVector(V refPoint, Relation<V> database, DistanceQuery<V,D> distFunc): Computes for each object the distance to one reference point. |
TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger(): Get the (STATIC) logger for this class. |
OutlierResult | run(Relation<V> relation): Run the algorithm on the given relation. |
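How per-reference-point densities could be combined into a final outlier score can be sketched as follows, assuming the scoring scheme from the referenced paper: each object keeps its minimum density over all reference points, and the score is one minus that value normalized by the largest such minimum. The class `ReferenceScoreSketch`, the method `scores`, and the `densities[r][i]` array layout are assumptions for illustration and do not reproduce the body of `run`.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and data layout are assumptions, not ELKI API. */
final class ReferenceScoreSketch {

    /**
     * Combines densities into outlier scores.
     *
     * @param densities densities[r][i] is the density of object i
     *                  with respect to reference point r
     * @return one score per object; higher means more outlying
     */
    static double[] scores(double[][] densities) {
        int n = densities[0].length;
        double[] minDensity = new double[n];
        Arrays.fill(minDensity, Double.POSITIVE_INFINITY);
        // Keep the smallest density seen for each object.
        for (double[] perReference : densities) {
            for (int i = 0; i < n; i++) {
                minDensity[i] = Math.min(minDensity[i], perReference[i]);
            }
        }
        // Normalize by the largest minimum density in the data set.
        double maxDensity = 0.0;
        for (double d : minDensity) {
            maxDensity = Math.max(maxDensity, d);
        }
        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            // If maxDensity is 0 the division yields NaN; a real
            // implementation would need to handle that case.
            result[i] = 1.0 - minDensity[i] / maxDensity;
        }
        return result;
    }
}
```

With this normalization the densest object receives a score of 0, and objects in sparse regions approach 1.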
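Methods inherited from class AbstractAlgorithm: makeParameterDistanceFunction, run
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from an implemented interface: run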
private static final Logging logger
public static final OptionID REFP_ID
public static final OptionID K_ID
private int k
Holds the value of K_ID.
private ReferencePointsHeuristic<V> refp
private DistanceFunction<V,D> distanceFunction
public ReferenceBasedOutlierDetection(int k, DistanceFunction<V,D> distanceFunction, ReferencePointsHeuristic<V> refp)
k - k Parameter
distanceFunction - distance function
refp - Reference points heuristic
public OutlierResult run(Relation<V> relation)
relation - Relation to process
protected List<DistanceResultPair<D>> computeDistanceVector(V refPoint, Relation<V> database, DistanceQuery<V,D> distFunc)
refPoint - Reference Point Feature Vector
database - database to work on
distFunc - Distance function to use
protected double computeDensity(List<DistanceResultPair<D>> referenceDists, int index)
referenceDists - vector of the reference distances
index - index of the current object
public TypeInformation[] getInputTypeRestriction()
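Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<OutlierResult>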
protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.
Specified by: getLogger in class AbstractAlgorithm<OutlierResult>